## Goal

Decide per-model whether to dispatch to upstream `/chat/completions` or `/responses`. Codex-family models on the Copilot upstream are `/responses`-only: calling `/chat/completions` with `gpt-5.3-codex` or `gpt-5.1-codex-max` will fail. Conversely, `gpt-4o` and models with the `claude-` or `gemini-` prefix work only on `/chat/completions`.
## Current state

`src/routes/models/route.ts:16–24` re-emits every upstream model verbatim, with no metadata. No `mode` field is tracked anywhere in the codebase.
## Tasks

- Add a model-mode classifier. Three reasonable options (pick one):
  - Static map in `src/lib/model-modes.ts` keyed by model-id prefix (`gpt-5.3-codex`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5-codex`, `o1-pro`, `o3-pro` → `responses`; everything else → `chat`)
  - Heuristic: substring match on `codex`, plus the `-pro` suffix for o-series models
  - Capability-based: read `capabilities.supports.tools` / `capabilities.type` from the upstream `/models` response
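The static-map option could look something like the following sketch. The module path `src/lib/model-modes.ts` comes from this issue, but the function name `modelMode` and the exact prefix list are assumptions to be confirmed against upstream behavior:

```typescript
// Hypothetical sketch for src/lib/model-modes.ts (option 1, static prefix map).
// The prefix list is an assumption taken from this issue's examples.
export type ModelMode = "responses" | "chat";

// Prefix match means variants like gpt-5.3-codex-mini would also classify
// as responses-only; verify that is the desired behavior.
const RESPONSES_ONLY_PREFIXES = [
  "gpt-5.3-codex",
  "gpt-5.1-codex-max",
  "gpt-5.1-codex",
  "gpt-5-codex",
  "o1-pro",
  "o3-pro",
];

export function modelMode(modelId: string): ModelMode {
  return RESPONSES_ONLY_PREFIXES.some((p) => modelId.startsWith(p))
    ? "responses"
    : "chat";
}
```

Defaulting unknown models to `chat` keeps the proxy permissive when the upstream adds new chat-capable models, at the cost of a late upstream failure if a new `responses`-only model appears before the map is updated.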
- Wire the classifier into:
  - `/v1/chat/completions`: if the model is `responses`-only, return a structured 400 with a helpful error pointing to `/v1/responses`
  - `/v1/responses`: if the model is `chat`-only, internally translate to chat-completions and bridge the response back (or return 400, depending on policy; recommended: translate when feasible, else 400)
  - `/v1/messages` (Anthropic): use the classifier to decide whether to dispatch via chat-completions or Responses; the Anthropic→Responses adapter needs this (see the separate issue)
- Surface `mode` in the `/v1/models` response so clients and dashboards can introspect it
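The `/v1/chat/completions` gate might be sketched as a pure helper that the route handler calls before dispatching upstream. The function name `rejectIfResponsesOnly`, the error `code`, and the message wording are all assumptions; only the OpenAI-style error envelope shape (`message`/`type`/`param`/`code`) follows an established convention:

```typescript
// Hypothetical gate for the /v1/chat/completions route. Returns null when the
// request may proceed, or a { status, body } pair for the route to serialize.
type ModelMode = "responses" | "chat";

export function rejectIfResponsesOnly(model: string, mode: ModelMode) {
  if (mode !== "responses") return null; // chat models pass through untouched
  return {
    status: 400,
    body: {
      error: {
        // Name the endpoint the caller should use, per the acceptance criteria.
        message:
          `Model ${model} is only available via /v1/responses; ` +
          `retry this request against that endpoint.`,
        type: "invalid_request_error",
        param: "model",
        code: "model_requires_responses_api",
      },
    },
  };
}
```

Keeping the gate a pure function makes the 400 path unit-testable without spinning up the HTTP layer; the `/v1/responses` and `/v1/messages` routes can share the same classifier output with their own branching.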
## Acceptance criteria

- `POST /v1/chat/completions` with `model: "gpt-5.3-codex"` returns a 400 whose message names `/v1/responses`
- `POST /v1/responses` with `model: "gpt-4o"` either translates to chat-completions or returns 400
Part of #1.
## Reference

- litellm: `model_prices_and_context_window.json` keys `github_copilot/gpt-5.3-codex` (`mode: "responses"`) and `github_copilot/gpt-5.1-codex-max` (`mode: "responses"`)
- litellm gate: `ProviderConfigManager.github_copilot_supports_responses_api(model)` (PR #19650)