Skip to content

P1: Display rate limit countdowns and parse error responses for 429 status codes #14

Description

@xodapi

Problem

When a model (like �ig-pickle) encounters a HTTP 429 error (Rate limit exceeded), the proxy's dashboard does not display a countdown for the remaining rate limit. Instead, fields like 'Сброс' and 'Ждать' show — and the model stays in a 'limited' state indefinitely until either it is rotated out of the memory buffer or a new successful request is sent.

Solution

  1. Add DEFAULT_RETRY_AFTER configuration parameter (default to 60 seconds) to act as a cooldown fallback when 429 responses don't include limit headers.
  2. In metrics.js, parse the response body of 429 errors to try and extract retry timeouts (e.g. 'try again in 45s', 'try again in 5 minutes').
  3. If no retry time can be parsed from headers or body, use the default fallback (e.g., 60 seconds).
  4. Update the dashboard to properly show the countdown for the model card and active limits table, transitioning the model to a yellow 'Check' state once the limit expires.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions