Add Prometheus metrics and latency percentiles #8

Description

@xodapi

Why

JSON metrics are useful for the dashboard, but Prometheus-compatible output and percentiles make the proxy easier to monitor over time.

Scope

  • Add /metrics/prometheus.
  • Track latency p50/p95/p99 per model.
  • Include request counts, failures, 429s, retries, and circuit-breaker state.
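A minimal sketch of what the scope above could look like, using only the Python standard library. Everything named here is an assumption for illustration, not the proxy's actual code: the `ProxyMetrics` class, the `proxy_*` metric names, the `model` label, and the choice of nearest-rank percentiles over an in-memory list. The output follows the Prometheus text exposition format, with latency exposed as a summary (`quantile` label plus `_sum` and `_count`).

```python
import math
from collections import defaultdict

# Hypothetical metric names; the issue does not specify them.
QUANTILES = (0.5, 0.95, 0.99)
COUNTERS = (
    "proxy_requests_total",
    "proxy_failures_total",
    "proxy_rate_limited_total",  # HTTP 429 responses
    "proxy_retries_total",
)
LATENCY = "proxy_request_latency_seconds"


def nearest_rank(sorted_values, q):
    """Nearest-rank percentile of a non-empty sorted list, for 0 < q <= 1."""
    rank = max(1, math.ceil(q * len(sorted_values)))
    return sorted_values[rank - 1]


def escape_label(value):
    """Escape a label value as the text exposition format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


class ProxyMetrics:
    """In-memory per-model metrics (illustrative, unbounded storage)."""

    def __init__(self):
        self.latencies = defaultdict(list)  # model -> latencies in seconds
        self.counts = defaultdict(lambda: dict.fromkeys(COUNTERS, 0))
        self.breaker_open = {}  # model -> bool

    def observe(self, model, seconds, failed=False, rate_limited=False, retries=0):
        self.latencies[model].append(seconds)
        counts = self.counts[model]
        counts["proxy_requests_total"] += 1
        counts["proxy_failures_total"] += int(failed)
        counts["proxy_rate_limited_total"] += int(rate_limited)
        counts["proxy_retries_total"] += retries

    def set_breaker(self, model, is_open):
        self.breaker_open[model] = is_open

    def render(self):
        """Return the metrics as Prometheus text exposition output."""
        lines = []
        for name in COUNTERS:
            lines.append(f"# TYPE {name} counter")
            for model in sorted(self.counts):
                label = escape_label(model)
                lines.append(f'{name}{{model="{label}"}} {self.counts[model][name]}')
        lines.append("# TYPE proxy_circuit_breaker_open gauge")
        for model in sorted(self.breaker_open):
            label = escape_label(model)
            state = int(self.breaker_open[model])
            lines.append(f'proxy_circuit_breaker_open{{model="{label}"}} {state}')
        lines.append(f"# TYPE {LATENCY} summary")
        for model in sorted(self.latencies):
            values = sorted(self.latencies[model])
            label = escape_label(model)
            for q in QUANTILES:
                value = nearest_rank(values, q)
                lines.append(f'{LATENCY}{{model="{label}",quantile="{q}"}} {value}')
            lines.append(f'{LATENCY}_sum{{model="{label}"}} {sum(values)}')
            lines.append(f'{LATENCY}_count{{model="{label}"}} {len(values)}')
        return "\n".join(lines) + "\n"
```

The handler for /metrics/prometheus would return `render()` as plain text. Keeping every latency in a list grows without bound, so a real implementation would likely use a bounded window or histogram buckets instead; the list keeps the percentile logic easy to follow here.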

Acceptance criteria

  • Prometheus endpoint does not expose prompts, responses, keys, paths, or session IDs.
  • Tests validate stable metric names and basic values.
  • README documents local scraping examples.
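One way the first two criteria could be checked in a test, sketched with the standard library only. The expected metric names, the label allowlist, and the `check_exposition` helper are all hypothetical; the real names would come from whatever the implementation settles on. The idea is to parse the endpoint's text output and fail on any metric name outside a fixed set or any label outside an allowlist, so a stray `session_id` or `path` label is caught.

```python
import re

# Assumed names and labels, for illustration only.
EXPECTED_NAMES = {
    "proxy_requests_total",
    "proxy_failures_total",
    "proxy_rate_limited_total",
    "proxy_retries_total",
    "proxy_circuit_breaker_open",
    "proxy_request_latency_seconds",
    "proxy_request_latency_seconds_sum",
    "proxy_request_latency_seconds_count",
}
ALLOWED_LABELS = {"model", "quantile"}

# Sample line: name, optional {labels}, value. Timestamps are not handled.
SAMPLE_RE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$")
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Parse exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            raise ValueError(f"unparseable sample line: {line!r}")
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def check_exposition(text):
    """Return a list of problems; an empty list means the payload passed."""
    problems = []
    for name, labels, _value in parse_samples(text):
        if name not in EXPECTED_NAMES:
            problems.append(f"unexpected metric name: {name}")
        for label in sorted(set(labels) - ALLOWED_LABELS):
            problems.append(f"disallowed label {label} on {name}")
    return problems
```

A test would fetch /metrics/prometheus and assert that `check_exposition(body) == []`. This only inspects label names, so it would not catch sensitive text placed in a label value; drawing `model` values from a fixed, known set would cover that gap.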
