Problem
When a model is in a `limited` or `error` state, the proxy currently relies on user requests to verify whether the model is healthy again. In addition, the developer has to keep checking the dashboard manually to see when a model is back online.
Solution
- Background Model Probing:
  - Add a configuration parameter `PROBE_INTERVAL` (default: 30 seconds) in `config.js`.
  - Implement `probeModel(model, config, metrics, logger)`, which fires a lightweight chat request (`max_tokens: 1`) to test the model.
  - Ignore probe requests in `UsageStore` so they don't pollute the user's token/cost stats.
  - Run a background polling loop in `proxy.js` to probe models that are in a `limited`, `error`, `retry`, or `untested` state, or that haven't been used for more than 5 minutes.
- Dashboard Web Notifications:
  - In `dashboard.js`, request permission for desktop notifications (`Notification.requestPermission`).
  - Track previous model states and trigger a desktop web notification when a model is back online (`available` status) or when it hits a new rate limit (`limited` status).
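The two decisions described above (which models the polling loop probes, and which state changes raise a notification) can be sketched as pure functions. This is a minimal illustration, not the code in this PR: `shouldProbe`, `notificationFor`, `IDLE_THRESHOLD_MS`, and the `{ status, lastUsedAt }` model shape are hypothetical names, and skipping the notification on the first observation of a model is an assumption.

```javascript
// Hypothetical sketch: names and the model record shape are assumptions,
// not the actual identifiers used in proxy.js or dashboard.js.

// States that always qualify a model for a background probe.
const PROBE_STATES = new Set(["limited", "error", "retry", "untested"]);

// "Haven't been used for more than 5 minutes", expressed in milliseconds.
const IDLE_THRESHOLD_MS = 5 * 60 * 1000;

/**
 * Decide whether the background loop should probe a model on this tick.
 * @param {{status: string, lastUsedAt: number}} model - lastUsedAt in ms since epoch
 * @param {number} now - current time in ms since epoch
 * @returns {boolean}
 */
function shouldProbe(model, now) {
  if (PROBE_STATES.has(model.status)) return true;
  // Healthy models are probed only once they have been idle long enough.
  return now - model.lastUsedAt > IDLE_THRESHOLD_MS;
}

/**
 * Map a state transition to a notification message, or null for no notification.
 * Assumption: the first observation of a model (previous === undefined) stays
 * silent, because "back online" implies a known earlier state.
 * @param {string} modelName
 * @param {string|undefined} previous - status seen on the last refresh
 * @param {string} current - status seen now
 * @returns {string|null}
 */
function notificationFor(modelName, previous, current) {
  if (previous === undefined || previous === current) return null;
  if (current === "available") return `${modelName} is back online`;
  if (current === "limited") return `${modelName} hit a rate limit`;
  return null;
}
```

In the proxy, a timer firing every `PROBE_INTERVAL` seconds could filter the model list with `shouldProbe` and pass each match to `probeModel`. In the dashboard, a non-null result from `notificationFor` could be handed to `new Notification(message)` once `Notification.permission` is `"granted"`.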