Skip to contents

Applies a function over elements with automatic decision between sequential and parallel execution based on input size and operation complexity. Avoids the overhead of parallel processing for small datasets where it would be counterproductive.

Usage

smart_map(
  x,
  fn,
  ...,
  complexity = "medium",
  threshold = NULL,
  workers = NULL,
  progress = NULL,
  .type = "list"
)

Arguments

x

A list or vector to iterate over.

fn

Function to apply to each element.

...

Additional arguments passed to `fn`.

complexity

Character. Operation complexity level that determines the threshold automatically:

  • `"very_low"`: ~1-5ms/op (threshold=2000) - instant calculations

  • `"low"`: ~5-20ms/op (threshold=1000) - raster extraction

  • `"medium"`: ~20-100ms/op (threshold=200) - spatial intersections

  • `"high"`: ~100-500ms/op (threshold=50) - WFS queries

  • `"very_high"`: ~500ms-5s/op (threshold=20) - API calls, downloads

Default is `"medium"`. Ignored if `threshold` is explicitly set.

threshold

Integer or NULL. Explicit minimum number of elements to trigger parallel execution. If NULL (default), uses value from `complexity`.

workers

Integer. Number of parallel workers. Default is `min(4, parallel::detectCores() - 1)`.

progress

Logical. Show progress bar? Default TRUE for n > 50.

.type

Character. Return type: "list" (default), "dbl", "chr", "lgl", "int". Determines the output format and uses appropriate map variant.

Value

A list or vector (depending on `.type`) of results.

Details

The function implements a three-tier strategy:

  1. If n <= threshold OR furrr not available: use purrr (or base lapply)

  2. If n > threshold AND furrr available: use furrr with parallel plan

  3. Falls back gracefully to base R if neither purrr nor furrr available

The `complexity` parameter is based on benchmarks showing that furrr has a setup overhead of ~15-18 seconds. Parallelization is only beneficial when: `n * time_per_op * (1 - 1/workers) > overhead`

Examples

if (FALSE) { # \dontrun{
# Using complexity parameter (recommended)
results <- smart_map(parcels, extract_raster, complexity = "low")
results <- smart_map(parcels, query_wfs, complexity = "high")
results <- smart_map(urls, download_file, complexity = "very_high")

# Using explicit threshold
results <- smart_map(1:500, complex_function, threshold = 100)

# Return numeric vector
values <- smart_map(parcels_list, extract_value, .type = "dbl")
} # }