Applies a function over elements with automatic decision between sequential and parallel execution based on input size and operation complexity. Avoids the overhead of parallel processing for small datasets where it would be counterproductive.
Usage
smart_map(
x,
fn,
...,
complexity = "medium",
threshold = NULL,
workers = NULL,
progress = NULL,
.type = "list"
)Arguments
- x
A list or vector to iterate over.
- fn
Function to apply to each element.
- ...
Additional arguments passed to `fn`.
- complexity
Character. Operation complexity level that determines the threshold automatically:
`"very_low"`: ~1-5ms/op (threshold=2000) - instant calculations
`"low"`: ~5-20ms/op (threshold=1000) - raster extraction
`"medium"`: ~20-100ms/op (threshold=200) - spatial intersections
`"high"`: ~100-500ms/op (threshold=50) - WFS queries
`"very_high"`: ~500ms-5s/op (threshold=20) - API calls, downloads
Default is `"medium"`. Ignored if `threshold` is explicitly set.
- threshold
Integer or NULL. Explicit minimum number of elements to trigger parallel execution. If NULL (default), uses value from `complexity`.
- workers
Integer. Number of parallel workers. Default is `min(4, parallel::detectCores() - 1)`.
- progress
Logical. Show progress bar? Default TRUE for n > 50.
- .type
Character. Return type: "list" (default), "dbl", "chr", "lgl", "int". Determines the output format and uses appropriate map variant.
Details
The function implements a three-tier strategy:
If n <= threshold OR furrr not available: use purrr (or base lapply)
If n > threshold AND furrr available: use furrr with parallel plan
Falls back gracefully to base R if neither purrr nor furrr available
The `complexity` parameter is based on benchmarks showing that furrr has a setup overhead of ~15-18 seconds. Parallelization is only beneficial when: `n * time_per_op * (1 - 1/workers) > overhead`
Examples
if (FALSE) { # \dontrun{
# Using complexity parameter (recommended)
results <- smart_map(parcels, extract_raster, complexity = "low")
results <- smart_map(parcels, query_wfs, complexity = "high")
results <- smart_map(urls, download_file, complexity = "very_high")
# Using explicit threshold
results <- smart_map(1:500, complex_function, threshold = 100)
# Return numeric vector
values <- smart_map(parcels_list, extract_value, .type = "dbl")
} # }