[Observability] Finalize run manifests on early exits and unhandled pipeline failures

## Finding
Training run manifests are initialized as `running`, but several early exits and unhandled failures can leave them in that state instead of finalizing them as failed with an error reason.

## Evidence
- `train.py:1350-1374` initializes a run manifest with `status: "running"`.
- `train.py:2356-2363` writes that manifest before the main pipeline branches.
- Eval-only validation can call `sys.exit(1)` for a missing/invalid model or missing test dataset at `train.py:2377-2409` before `_finalize_manifest` runs.
- Full training can call `sys.exit(1)` for a missing dataset or missing `ETHERSCAN_API_KEY` at `train.py:2477-2504` before `_finalize_manifest` runs.
- Only the preflight-failure path explicitly calls `_finalize_manifest(..., "failed", error)` (`train.py:2431-2434`, `train.py:2556-2559`); there is no top-level try/except/finally around collection, training, or evaluation.
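
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not code from `train.py`: the `main` function, its `dataset_exists` parameter, and the manifest layout are hypothetical stand-ins that only mimic the "write `running`, then `sys.exit(1)`" ordering.

```python
import json
import sys
import tempfile
from pathlib import Path


def main(manifest_path: Path, dataset_exists: bool) -> None:
    # Hypothetical stand-in: the manifest is written as "running" up front.
    manifest_path.write_text(json.dumps({"status": "running"}))
    if not dataset_exists:
        # Early exit: nothing below this line runs, so the manifest is never finalized.
        sys.exit(1)
    manifest_path.write_text(json.dumps({"status": "completed"}))


def status_after_run(dataset_exists: bool) -> str:
    """Run main() and return the status left in the manifest on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "manifest.json"
        try:
            main(path, dataset_exists)
        except SystemExit:
            pass  # In a real run the process would terminate here.
        return json.loads(path.read_text())["status"]
```

With this sketch, `status_after_run(False)` leaves the manifest at `running`, which is the stale state this issue describes.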

## Impact
A failed or misconfigured run can leave behind a manifest that says it is still running. This misleads experiment tracking, makes fleet/CI triage harder, and obscures the actual failure reason from the primary run artifact.

## Recommended fix
Wrap the pipeline execution in `main()` in a manifest-aware failure handler. Either replace the early `sys.exit` branches with exceptions or finalize the manifest before exiting, and ensure that unhandled collection, training, and evaluation exceptions update `status`, `finished_at`, the run duration, and structured error information.
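
One possible shape for such a handler is sketched below. It is an illustration under assumptions, not the project's implementation: `finalize_manifest` is a hypothetical stand-in for `_finalize_manifest` (whose real signature is not shown in this issue), and `PipelineConfigError`, `run_with_manifest`, and the `duration_seconds` and `error` field names are invented for the example. Note that `sys.exit` raises `SystemExit`, which derives from `BaseException`, so an `except Exception` clause alone would not catch the early-exit branches.

```python
import json
import time
import traceback
from datetime import datetime, timezone
from pathlib import Path


class PipelineConfigError(RuntimeError):
    """Hypothetical exception to raise instead of calling sys.exit(1)."""


def finalize_manifest(path, status, error=None, started=None):
    """Hypothetical stand-in for _finalize_manifest; the real signature may differ."""
    manifest = json.loads(Path(path).read_text())
    manifest["status"] = status
    manifest["finished_at"] = datetime.now(timezone.utc).isoformat()
    if started is not None:
        manifest["duration_seconds"] = round(time.monotonic() - started, 3)
    if error is not None:
        manifest["error"] = error
    Path(path).write_text(json.dumps(manifest, indent=2))


def run_with_manifest(path, pipeline):
    """Run pipeline() and finalize the manifest on every exit path."""
    started = time.monotonic()
    try:
        pipeline()
    except SystemExit as exc:
        # Covers any remaining sys.exit() branches; a zero or empty code is a clean exit.
        if exc.code in (0, None):
            finalize_manifest(path, "completed", started=started)
        else:
            error = {"type": "SystemExit", "message": f"exit code {exc.code}"}
            finalize_manifest(path, "failed", error, started)
        raise
    except Exception as exc:
        error = {
            "type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
        }
        finalize_manifest(path, "failed", error, started)
        raise
    else:
        finalize_manifest(path, "completed", started=started)
```

The handler re-raises after finalizing so that the process exit code and traceback are unchanged. `KeyboardInterrupt` is deliberately left unhandled in this sketch; whether an interrupted run should be marked as failed is a separate decision.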

## Acceptance criteria
- Missing model, missing test dataset, missing source dataset, missing API key, preflight failure, and unexpected exceptions all produce finalized manifests with `status: "failed"` and an error object.
- Successful dataset-only, eval-only, train, and skip-eval runs still finalize as completed.
- Tests or smoke commands verify manifest status for at least one early-exit and one exception path.
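
The last criterion could be covered by tests along the following lines. This is a sketch under assumptions: `run_and_read_manifest` is a hypothetical harness that stands in for invoking the real pipeline, and the step functions only imitate an early-exit branch and an unexpected exception. A real test would exercise `train.py` itself.

```python
import json
import sys
import tempfile
from pathlib import Path


def run_and_read_manifest(step):
    """Hypothetical harness: run one step under a finalizing wrapper, return the manifest."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "manifest.json"
        path.write_text(json.dumps({"status": "running"}))
        manifest = {"status": "completed"}
        try:
            step()
        except SystemExit as exc:
            if exc.code not in (0, None):
                error = {"type": "SystemExit", "message": f"exit code {exc.code}"}
                manifest = {"status": "failed", "error": error}
        except Exception as exc:
            error = {"type": type(exc).__name__, "message": str(exc)}
            manifest = {"status": "failed", "error": error}
        path.write_text(json.dumps(manifest))
        return json.loads(path.read_text())


def early_exit_step():
    sys.exit(1)  # Imitates a missing-dataset or missing-API-key branch.


def crashing_step():
    raise ValueError("unexpected failure during training")


def test_early_exit_is_finalized_as_failed():
    manifest = run_and_read_manifest(early_exit_step)
    assert manifest["status"] == "failed"
    assert manifest["error"]["type"] == "SystemExit"


def test_unhandled_exception_is_finalized_as_failed():
    manifest = run_and_read_manifest(crashing_step)
    assert manifest["status"] == "failed"
    assert manifest["error"]["type"] == "ValueError"


def test_successful_run_is_finalized_as_completed():
    manifest = run_and_read_manifest(lambda: None)
    assert manifest["status"] == "completed"
    assert "error" not in manifest
```

The functions follow the `test_*` naming convention, so a runner such as pytest should collect them without extra configuration.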

[Observability] Finalize run manifests on early exits and unhandled pipeline failures #111
