Problem
managed_microsimulation (and the underlying resolve_managed_dataset_reference) accepts only a managed dataset name from the release manifest or a remote URI (hf://, gs://). There is no way to run a simulation on a local dataset file that the caller built themselves, such as a downstream pipeline's per-year Stage-output H5 that is not part of any release manifest.
Passing a local path:
managed_microsimulation(dataset="/tmp/build/2026.h5", allow_unmanaged=True)
raises ValueError: Unknown dataset '/tmp/build/2026.h5' for country 'us'. Known datasets: [...]. allow_unmanaged=True only relaxes URIs (those containing ://), not local paths, and a file:// URI fails downstream with FileNotFoundError.
Impact
Local build-and-score pipelines — for example projecting the certified base to future years and then scoring reforms on the resulting local H5s — cannot go through the managed wrapper at all. They have to construct policyengine_us.Microsimulation directly, which bypasses the provenance recording and runtime-model pinning that the managed path exists to enforce.
Proposed fix
Accept a local filesystem path in resolve_managed_dataset_reference when allow_unmanaged=True (the same explicit opt-in already required for unmanaged URIs). materialize_dataset_source already passes non-URI paths through unchanged, so the simulation constructs normally and the provenance bundle is recorded (managed_by=policyengine.py). Passing a local path without allow_unmanaged=True should raise an actionable error instead of the generic "Unknown dataset" message.
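A minimal sketch of the proposed resolution order, to make the intended behavior concrete. It is not the actual resolve_managed_dataset_reference: the function name, its signature, the known_datasets mapping, and the heuristic for spotting a local path are all assumptions made for illustration.

```python
import os


def looks_like_local_path(dataset: str) -> bool:
    """Hypothetical heuristic: no URI scheme, and shaped like a file path."""
    if "://" in dataset:
        return False
    return (
        os.path.isabs(dataset)
        or dataset.startswith(("./", "../", "~"))
        or dataset.endswith(".h5")
    )


def resolve_reference_sketch(
    dataset: str,
    known_datasets: dict,
    allow_unmanaged: bool = False,
) -> str:
    """Illustrative stand-in for the resolver, not the real implementation."""
    # 1. Managed names from the release manifest always resolve.
    if dataset in known_datasets:
        return known_datasets[dataset]

    # 2. Unmanaged URIs already require the explicit opt-in.
    if "://" in dataset:
        if allow_unmanaged:
            return dataset
        raise ValueError(
            f"Unmanaged URI '{dataset}'. Pass allow_unmanaged=True to use it."
        )

    # 3. Proposed: local paths are accepted under the same opt-in, and
    #    rejected with an actionable message otherwise.
    if looks_like_local_path(dataset):
        if allow_unmanaged:
            return os.path.expanduser(dataset)
        raise ValueError(
            f"'{dataset}' looks like a local file path. Pass "
            "allow_unmanaged=True to run on an unmanaged local dataset."
        )

    # 4. Anything else keeps the existing generic error.
    raise ValueError(
        f"Unknown dataset '{dataset}'. Known datasets: {sorted(known_datasets)}"
    )


# Hypothetical manifest entry, used only to exercise the sketch.
KNOWN = {"example_dataset": "hf://example-org/example-repo/example.h5"}
```

With this ordering, resolve_reference_sketch("/tmp/build/2026.h5", KNOWN, allow_unmanaged=True) returns the path unchanged, while the same call without the opt-in raises a ValueError that names allow_unmanaged=True. Whether the real resolver should also check that the file exists is left open here.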