Design: experiment batch runner
Status: DRAFT for alignment. Run many experiments headless (explicit scenarios + parameter sweeps + Monte Carlo), collect per-run data and a metrics table.
1. Why / what's already there
The builder authors configs/experiments/*.yaml, and
vdsim_lab.Experiment.from_config(name) runs ONE of them. The other pieces
exist, but they are fragmented and do not run on the authored configs:
- python/sweep_runner.py — cartesian sweep over a C++ binary, dotted-path params.
- apps/doe/ — metrics.py (peak_yaw_rate, ss_yaw, …), scenarios.py, a DoE harness.
- examples/monte_carlo.py (#127) — stochastic sampling.
Gap: a campaign runner that expands explicit + swept + MC runs of the authored experiment configs, runs them in parallel headless, and reduces each to metrics.
2. Campaign spec (one YAML)
name: fdr_vs_surface
runs:
  - scenario: yongin_lap          # configs/experiments/*.yaml, run as-is
  - sweep:                        # base + grid -> cartesian product
      base: yongin_lap
      grid:
        vehicle.final_drive_ratio: [4.0, 5.0, 6.0]
        road.surface: [minor_road, belgian_pave]
        maneuver.v: [25, 30]
  - monte_carlo:                  # base + stochastic samples
      base: skidpad
      n: 200
      vary:
        vehicle.mass: { dist: normal, mean: 1500, std: 50 }
        mu: { dist: uniform, lo: 0.7, hi: 1.0 }
metrics: [lap_time, peak_ay, understeer_K, max_Fz, dist]
output: results/fdr_vs_surface/   # per-run CSV + summary.csv + resolved/
parallel: 8
duration: 40
Overrides use dotted paths on the experiment config (vehicle.* / tire.* /
road.* / maneuver.* / mu / level). For example, vehicle.X loads the vehicle
preset, overrides field X in memory, then runs.
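
A minimal sketch of how a dotted-path override could be applied, assuming the experiment config (with its vehicle preset already loaded) is available as a nested dict. The helper name apply_overrides and the sample config values are hypothetical, not part of vdsim_lab:

```python
import copy


def apply_overrides(config: dict, overrides: dict) -> dict:
    """Return a deep copy of config with each dotted-path override applied."""
    resolved = copy.deepcopy(config)  # never mutate the authored config
    for path, value in overrides.items():
        *parents, leaf = path.split(".")
        node = resolved
        for key in parents:
            # Walk down, creating intermediate sections if they are missing.
            node = node.setdefault(key, {})
        node[leaf] = value
    return resolved


# Hypothetical config; real field names come from the authored YAML.
base = {"vehicle": {"mass": 1500, "final_drive_ratio": 4.0}, "mu": 0.9}
resolved = apply_overrides(base, {"vehicle.final_drive_ratio": 5.0, "mu": 0.8})
```

Working on a copy keeps the authored config untouched, so the resolved dict can be written out per run for reproducibility.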
3. Execution
- Expand runs -> a flat list of (run_id, resolved_config, params): sweep = itertools.product of the grid; monte_carlo = N seeded samples.
- Run each with vdsim_lab.Experiment.from_config(cfg, overrides=...), headless, in a multiprocessing.Pool(parallel); runs are independent (embarrassingly parallel). Each worker writes results/<name>/<run_id>.csv (Result.to_csv) + the resolved config to resolved/<run_id>.yaml (reproducibility).
- Reduce each Result to the requested metrics (registry name -> fn(Result)).
- Aggregate -> summary.csv: one row per run = {run_id, params…, metrics…}. Failures are captured (error logged, row marked failed) and don't kill the batch.
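
The expand step and the failure handling above could look roughly like the sketch below. It is an illustration under assumptions: the function names, the run_0000 id format, and the single campaign-level seed are hypothetical, and config resolution is left out, so each entry is (run_id, base, params) rather than the full (run_id, resolved_config, params). run_fn stands in for whatever runs one experiment and returns its metrics.

```python
import itertools
import random


def expand_sweep(base, grid):
    """Cartesian product of the grid values: one (base, params) pair per point."""
    keys = list(grid)
    combos = itertools.product(*(grid[k] for k in keys))
    return [(base, dict(zip(keys, combo))) for combo in combos]


def draw(spec, rng):
    """Draw one value from a 'vary' entry (only the two dists shown in the spec)."""
    if spec["dist"] == "normal":
        return rng.gauss(spec["mean"], spec["std"])
    if spec["dist"] == "uniform":
        return rng.uniform(spec["lo"], spec["hi"])
    raise ValueError(f"unknown dist: {spec['dist']!r}")


def expand_monte_carlo(base, n, vary, seed):
    """n seeded samples; the same seed reproduces the same parameter sets."""
    rng = random.Random(seed)
    return [(base, {k: draw(s, rng) for k, s in vary.items()}) for _ in range(n)]


def expand_campaign(campaign, seed=0):
    """Flatten campaign['runs'] into [(run_id, base, params), ...]."""
    flat = []
    for entry in campaign["runs"]:
        if "scenario" in entry:
            flat.append((entry["scenario"], {}))  # run as-is, no overrides
        elif "sweep" in entry:
            flat += expand_sweep(entry["sweep"]["base"], entry["sweep"]["grid"])
        elif "monte_carlo" in entry:
            mc = entry["monte_carlo"]
            flat += expand_monte_carlo(mc["base"], mc["n"], mc["vary"], seed)
        else:
            raise ValueError(f"unknown run entry: {entry!r}")
    return [(f"run_{i:04d}", base, params) for i, (base, params) in enumerate(flat)]


def run_safely(run, run_fn):
    """Run one expanded entry; a failure becomes a 'failed' row, not a crash."""
    run_id, base, params = run
    row = {"run_id": run_id, **params}
    try:
        row.update(run_fn(base, params))  # run_fn returns {metric_name: value}
        row["status"] = "ok"
    except Exception as exc:  # deliberately broad: isolate any per-run failure
        row["status"] = "failed"
        row["error"] = repr(exc)
    return row


campaign = {
    "runs": [
        {"scenario": "yongin_lap"},
        {"sweep": {"base": "yongin_lap", "grid": {
            "vehicle.final_drive_ratio": [4.0, 5.0, 6.0],
            "road.surface": ["minor_road", "belgian_pave"],
            "maneuver.v": [25, 30],
        }}},
        {"monte_carlo": {"base": "skidpad", "n": 200, "vary": {
            "vehicle.mass": {"dist": "normal", "mean": 1500, "std": 50},
            "mu": {"dist": "uniform", "lo": 0.7, "hi": 1.0},
        }}},
    ]
}
runs = expand_campaign(campaign)
```

For the example campaign in section 2 this yields 1 explicit run, 3 x 2 x 2 = 12 sweep runs, and 200 Monte Carlo runs. Each row returned by run_safely maps directly onto one line of summary.csv.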
4. Metrics
A name->function registry reusing/extending apps/doe/metrics.py on the Result:
lap_time (closed-loop return-to-start), peak_ay, understeer_K, max_Fz,
dist, rms_slip, vmax, min_mu_margin, … Users add their own.
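
A sketch of what such a registry might look like. The decorator name, the METRICS dict, and the metric bodies are illustrative assumptions and are not taken from apps/doe/metrics.py. The Result is stood in for by a plain dict of column lists, since the real Result interface is not described here.

```python
from typing import Callable, Dict, Sequence

# name -> fn(result); filled in by the @metric decorator below.
METRICS: Dict[str, Callable] = {}


def metric(name: str):
    """Register a reduction function under a campaign-facing metric name."""
    def register(fn):
        METRICS[name] = fn
        return fn
    return register


@metric("peak_ay")
def peak_ay(result):
    # Largest absolute lateral acceleration over the run (assumed column 'ay').
    return max(abs(a) for a in result["ay"])


@metric("vmax")
def vmax(result):
    # Highest speed reached (assumed column 'v').
    return max(result["v"])


def reduce_result(result, names: Sequence[str]) -> dict:
    """Reduce one result to the metrics requested in the campaign spec."""
    return {name: METRICS[name](result) for name in names}


# Stand-in result; the real Result object would replace this dict.
row = reduce_result({"ay": [0.1, -9.2, 3.0], "v": [10, 25, 20]}, ["peak_ay", "vmax"])
```

With this shape, users add their own metric by decorating one more function; an unknown name in the campaign's metrics list surfaces as a KeyError.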
5. CLI
python tools/vdsim_batch.py run campaign.yaml # run the campaign
python tools/vdsim_batch.py run campaign.yaml --dry # list the expanded runs
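
The two invocations above could be parsed with argparse along these lines. This is a sketch of the command-line surface only: the parser layout and the attribute names command and campaign are assumptions, and nothing is run here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="vdsim_batch.py")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="run a campaign")
    run.add_argument("campaign", help="path to the campaign YAML")
    run.add_argument("--dry", action="store_true",
                     help="list the expanded runs without executing them")
    return parser


# Equivalent to: python tools/vdsim_batch.py run campaign.yaml --dry
args = build_parser().parse_args(["run", "campaign.yaml", "--dry"])
```

Keeping run as a subcommand leaves room for further subcommands later without changing the existing invocations.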
6. Open decisions
- Run path = vdsim_lab Python + multiprocessing (reuses authored configs; sim core is C++; perf fine) — agree? (vs the C++ sweep_runner binary path.)
- Output: per-run CSV + summary.csv (metrics table) + resolved config per run. Add parquet later? CSV first — agree?
- Parallelism: multiprocessing.Pool(parallel), default = cpu_count. OK?
- Monte Carlo folded into the same spec (reuse #127's sampling), or keep examples/monte_carlo.py separate and only do explicit+sweep here?
- CLI-first, with a builder "Batch" tab later. Agree?
- Need resume / caching (skip runs whose output exists) in v1, or later?
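
To make the parallelism and resume questions concrete, here is one possible shape, offered as a sketch rather than a decision. The helper names are hypothetical. It assumes resume means skipping any run whose <run_id>.csv already exists, and that the worker is a picklable top-level function, which multiprocessing requires.

```python
import os
import tempfile
from multiprocessing import Pool
from pathlib import Path


def pending_runs(run_ids, out_dir):
    """Keep only run ids that have no <run_id>.csv in out_dir yet."""
    out = Path(out_dir)
    return [rid for rid in run_ids if not (out / f"{rid}.csv").exists()]


def pool_size(parallel=None):
    """Use the campaign's 'parallel' value, else fall back to the CPU count."""
    return parallel or os.cpu_count() or 1


def run_all(worker, run_ids, out_dir, parallel=None):
    """Run the not-yet-done ids in a process pool; worker(run_id) does one run."""
    todo = pending_runs(run_ids, out_dir)
    with Pool(pool_size(parallel)) as pool:
        return pool.map(worker, todo)


# Demo of the resume filter only: run_0000 already has output, run_0001 does not.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "run_0000.csv").touch()
    todo = pending_runs(["run_0000", "run_0001"], d)
```

An existence check alone would also skip a partially written CSV left by a crashed run. Writing to a temporary name and renaming on success would be one way to guard against that.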