Why
JSON metrics are useful for the dashboard, but Prometheus-compatible output and percentiles make the proxy easier to monitor over time.
Scope
- Add /metrics/prometheus.
- Track latency p50/p95/p99 per model.
- Include request counts, failures, 429s, retries, and circuit-breaker state.
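As a rough illustration of the scope above, the following sketch renders per-model stats in the Prometheus text exposition format. It is not the proxy's actual implementation: the `ModelStats` class, the `proxy_*` metric names, the nearest-rank percentile method, and the two-state (closed/open) circuit breaker are all assumptions made for the example.

```python
import math


class ModelStats:
    """In-memory stats for one model (hypothetical structure)."""

    def __init__(self):
        self.latencies = []        # request latencies in seconds
        self.requests = 0
        self.failures = 0
        self.rate_limited = 0      # upstream 429 responses
        self.retries = 0
        self.circuit_open = False  # assumption: two states, closed/open


def percentile(samples, q):
    """Nearest-rank percentile for q in (0, 1]; 0.0 when there are no samples."""
    if not samples:
        return 0.0
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def escape_label(value):
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


# (metric name, ModelStats attribute, help text); names are illustrative only.
COUNTERS = [
    ("proxy_requests_total", "requests", "Requests handled."),
    ("proxy_failures_total", "failures", "Requests that failed."),
    ("proxy_rate_limited_total", "rate_limited", "Upstream 429 responses."),
    ("proxy_retries_total", "retries", "Retry attempts."),
]


def render_prometheus(stats_by_model):
    """Render stats as Prometheus text using only `model` and `quantile` labels."""
    models = sorted(stats_by_model)
    lines = []

    for name, attr, help_text in COUNTERS:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} counter")
        for model in models:
            value = getattr(stats_by_model[model], attr)
            lines.append(f'{name}{{model="{escape_label(model)}"}} {value}')

    name = "proxy_circuit_breaker_open"
    lines.append(f"# HELP {name} 1 if the circuit breaker is open, else 0.")
    lines.append(f"# TYPE {name} gauge")
    for model in models:
        value = int(stats_by_model[model].circuit_open)
        lines.append(f'{name}{{model="{escape_label(model)}"}} {value}')

    name = "proxy_request_latency_seconds"
    lines.append(f"# HELP {name} Request latency in seconds.")
    lines.append(f"# TYPE {name} summary")
    for model in models:
        stats = stats_by_model[model]
        label = escape_label(model)
        for q in (0.5, 0.95, 0.99):
            value = percentile(stats.latencies, q)
            lines.append(f'{name}{{model="{label}",quantile="{q}"}} {value}')
        lines.append(f'{name}_sum{{model="{label}"}} {sum(stats.latencies)}')
        lines.append(f'{name}_count{{model="{label}"}} {len(stats.latencies)}')

    return "\n".join(lines) + "\n"
```

The sketch keeps every latency sample in a list, which is fine for illustration but grows without bound. A real implementation would likely use a bounded window or fixed histogram buckets instead.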
Acceptance criteria
- Prometheus endpoint does not expose prompts, responses, keys, paths, or session IDs.
- Tests validate stable metric names and basic values.
- README documents local scraping examples.
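One possible way to test the first two criteria is to parse the endpoint output and compare metric and label names against fixed allowlists. The sketch below assumes hypothetical `proxy_*` metric names and an allowlist of `model` and `quantile` labels. None of these names come from the project itself.

```python
import re

# Assumed allowlists; the real metric and label names may differ.
ALLOWED_LABELS = {"model", "quantile"}
EXPECTED_METRICS = {
    "proxy_requests_total",
    "proxy_failures_total",
    "proxy_rate_limited_total",
    "proxy_retries_total",
    "proxy_circuit_breaker_open",
    "proxy_request_latency_seconds",
    "proxy_request_latency_seconds_sum",
    "proxy_request_latency_seconds_count",
}

# metric_name{label="value",...} value  (timestamps are not handled here)
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_PAIR = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (name, labels, value) for each sample line, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            raise ValueError(f"unparseable line: {line!r}")
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_PAIR.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def check_exposition(text):
    """Return a list of problems: unknown metric names or disallowed labels."""
    problems = []
    for name, labels, _value in parse_samples(text):
        if name not in EXPECTED_METRICS:
            problems.append(f"unexpected metric name: {name}")
        extra = set(labels) - ALLOWED_LABELS
        if extra:
            problems.append(f"{name} has disallowed labels: {sorted(extra)}")
    return problems
```

A test could then fetch `/metrics/prometheus` and assert that `check_exposition(body) == []`. An allowlist on label names does not by itself show that label values are safe, so a fuller test might also send a request containing a known marker string and assert that the marker never appears in the output.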