Model-to-endpoint routing: chat-completions vs responses #14

@HXYerror

Description

Part of #1.

Goal

Some Copilot upstream models are /responses-only (the codex family, o-pro); some are /chat/completions-only (legacy gpt-3.5/4); most reasoning models work on both, but with different fidelity. We need a single source of truth that decides which endpoint a given model id goes to, so that:

  • A client hitting /v1/chat/completions with model=gpt-5.3-codex either gets a clear 400 or is transparently bridged to /responses.
  • A client hitting /v1/messages (Anthropic shape) is dispatched correctly under the hood.
  • /v1/models can optionally annotate the endpoint each model speaks (example below).
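
For the third bullet, the annotation could be purely additive, so clients parsing /v1/models see no breaking shape change. A minimal sketch, assuming we attach a single extra field (the name endpoint is a placeholder, not something this issue has settled on):

```ts
// Hypothetical annotated /v1/models entry. The only change is one added
// field; every existing upstream field is passed through as today.
const annotatedModel = {
  id: "gpt-5.3-codex",
  object: "model",
  // ...all upstream fields re-emitted unchanged...
  endpoint: "responses", // 'chat' | 'responses' | 'both'
};
```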

Current state

src/routes/models/route.ts:16–24 re-emits every upstream model with no awareness of capability. There is no model→endpoint table anywhere in the repo.

Tasks

  • Add src/lib/model-capabilities.ts that, given a model id from state.models, returns (see the first sketch after this list):
    • endpoint: 'chat' | 'responses' | 'both'
    • reasoning: boolean
    • vision: boolean
  • Seed the table from upstream /models response fields where possible (Copilot's /models returns capabilities.type — verify and use it instead of hardcoding model names where feasible)
  • For known-Responses-only families, hardcode a fallback list as a guard: gpt-5-codex, gpt-5.1-codex, gpt-5.1-codex-mini, gpt-5.1-codex-max, gpt-5.3-codex, o1-pro, o3-pro
  • In the /v1/chat/completions handler, when endpoint === 'responses' (Responses-only model):
    • Decide the policy: return a 400 with a clear error message, OR bridge transparently (translate the chat-completions request into a Responses request, dispatch it to upstream /responses, and translate the response back). Recommendation: a config flag --bridge-codex, defaulting to off in v1. Both branches are sketched below, after this list.
  • In the /v1/responses handler, when endpoint === 'chat' (chat-only model), return a 400 with "model 'X' does not support /responses on Copilot upstream"
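
A minimal sketch of what src/lib/model-capabilities.ts could look like. Note it takes the whole model entry rather than only the id, since seeding from capabilities.type needs the upstream object; the UpstreamModel field names are assumptions until the real /models payload is verified:

```ts
// src/lib/model-capabilities.ts: a sketch, not a final implementation.
// The UpstreamModel shape is an assumption; Copilot's /models payload
// (capabilities.type etc.) still needs to be verified per the task above.

export type Endpoint = "chat" | "responses" | "both";

export interface ModelCapabilities {
  endpoint: Endpoint;
  reasoning: boolean;
  vision: boolean;
}

// Minimal view of an upstream model entry; field names are assumptions.
export interface UpstreamModel {
  id: string;
  capabilities?: {
    type?: string; // e.g. "chat"; verify against the real /models response
    supports?: { reasoning?: boolean; vision?: boolean };
  };
}

// Guard list: families known to be Responses-only upstream (from this issue).
const RESPONSES_ONLY = new Set([
  "gpt-5-codex",
  "gpt-5.1-codex",
  "gpt-5.1-codex-mini",
  "gpt-5.1-codex-max",
  "gpt-5.3-codex",
  "o1-pro",
  "o3-pro",
]);

export function getModelCapabilities(model: UpstreamModel): ModelCapabilities {
  // Hardcoded guard wins: these ids must never be routed to /chat/completions.
  if (RESPONSES_ONLY.has(model.id)) {
    return { endpoint: "responses", reasoning: true, vision: false };
  }

  // Otherwise lean on the upstream signal where it exists, defaulting to
  // "both" so unknown models keep working exactly as they do today.
  const endpoint: Endpoint =
    model.capabilities?.type === "chat" ? "chat" : "both";

  return {
    endpoint,
    reasoning: model.capabilities?.supports?.reasoning ?? false,
    vision: model.capabilities?.supports?.vision ?? false,
  };
}
```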
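And a sketch of the dispatch decision both handlers would share, assuming the module above plus a hypothetical bridgeCodex config flag (--bridge-codex); the helper names and Config shape are illustrative only:

```ts
// Sketch of the dispatch decisions; not a final implementation. Assumes
// the model-capabilities module sketched above.
import {
  getModelCapabilities,
  type UpstreamModel,
} from "../lib/model-capabilities";

interface Config {
  bridgeCodex: boolean; // --bridge-codex, defaulting to off in v1
}

type RouteDecision =
  | { dispatch: "chat" | "responses" }
  | { status: 400; error: string };

// /v1/chat/completions: Responses-only models either bridge or 400,
// depending on the flag.
export function routeChatCompletions(
  model: UpstreamModel,
  config: Config,
): RouteDecision {
  const caps = getModelCapabilities(model);
  if (caps.endpoint === "responses") {
    if (config.bridgeCodex) {
      // Bridge: the caller translates the chat-completions request into a
      // Responses request, dispatches to upstream /responses, and
      // translates the reply back.
      return { dispatch: "responses" };
    }
    return {
      status: 400,
      error:
        `model '${model.id}' only supports /v1/responses on the Copilot upstream; ` +
        `call /v1/responses directly or enable --bridge-codex`,
    };
  }
  return { dispatch: "chat" };
}

// /v1/responses: chat-only models get the clear 400 named in the task above.
export function routeResponses(model: UpstreamModel): RouteDecision {
  const caps = getModelCapabilities(model);
  if (caps.endpoint === "chat") {
    return {
      status: 400,
      error: `model '${model.id}' does not support /responses on Copilot upstream`,
    };
  }
  return { dispatch: "responses" };
}
```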

Acceptance criteria

  • gpt-5.3-codex via /v1/responses works
  • gpt-5.3-codex via /v1/chat/completions returns a clear, actionable 400 (or transparently bridges, depending on flag)
  • gpt-4o via /v1/responses returns 400, not a confusing 5xx
  • Model list still appears in /v1/models with no breaking shape change
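
These criteria map onto a few smoke checks. A sketch, assuming the proxy listens on http://localhost:4141 (the base URL and the minimal request bodies are assumptions) and a fetch-capable runtime (Node 18+, ESM for top-level await):

```ts
// Hypothetical smoke test for the acceptance criteria above.
const BASE = "http://localhost:4141";

async function expectStatus(path: string, model: string, expected: number) {
  // Send the minimal body shape each endpoint expects.
  const body =
    path === "/v1/responses"
      ? { model, input: "ping" }
      : { model, messages: [{ role: "user", content: "ping" }] };
  const res = await fetch(`${BASE}${path}`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  console.log(path, model, res.status, res.status === expected ? "OK" : "FAIL");
}

await expectStatus("/v1/responses", "gpt-5.3-codex", 200);
await expectStatus("/v1/chat/completions", "gpt-5.3-codex", 400); // bridge flag off
await expectStatus("/v1/responses", "gpt-4o", 400); // clear 400, not a 5xx
```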

File pointers

  • src/routes/models/route.ts:16–24
  • src/services/copilot/get-models.ts
  • litellm reference: model_prices_and_context_window.json mode field; ProviderConfigManager.get_provider_responses_api_config (PR #19650)
