Model-to-endpoint routing: chat-completions vs responses #14

@HXYerror

Description

Part of #1.

Goal

Some Copilot upstream models are /responses-only (the codex family, o-pro); some are /chat/completions-only (legacy gpt-3.5/4); most reasoning models work on both, but with different fidelity. We need a single source of truth that decides which endpoint a given model id goes to, so that:

  • A client hitting /v1/chat/completions with model=gpt-5.3-codex either gets a clear 400 or is transparently bridged to /responses.
  • A client hitting /v1/messages (Anthropic shape) is dispatched correctly under the hood.
  • /v1/models can optionally annotate the endpoint each model speaks (example below).
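
For the third bullet, the annotation could be purely additive, so clients parsing /v1/models see no breaking shape change. A minimal sketch, assuming we attach a single extra field (the name endpoint is a placeholder, not something this issue has settled on):

```ts
// Hypothetical annotated /v1/models entry. The only change is one added
// field; every existing upstream field is passed through as today.
const annotatedModel = {
  id: "gpt-5.3-codex",
  object: "model",
  // ...all upstream fields re-emitted unchanged...
  endpoint: "responses", // 'chat' | 'responses' | 'both'
};
```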

Current state

src/routes/models/route.ts:16–24 re-emits every upstream model with no awareness of capability. There is no model→endpoint table anywhere in the repo.

Tasks

  • Add src/lib/model-capabilities.ts that, given a model id from state.models, returns (see the first sketch after this list):
    • endpoint: 'chat' | 'responses' | 'both'
    • reasoning: boolean
    • vision: boolean
  • Seed the table from upstream /models response fields where possible (Copilot's /models returns capabilities.type — verify and use it instead of hardcoding model names where feasible)
  • For known-Responses-only families, hardcode a fallback list as a guard: gpt-5-codex, gpt-5.1-codex, gpt-5.1-codex-mini, gpt-5.1-codex-max, gpt-5.3-codex, o1-pro, o3-pro
  • In the /v1/chat/completions handler, when endpoint === 'responses' (Responses-only model):
    • Decide the policy: return a 400 with a clear error message, OR bridge transparently (translate the chat-completions request into a Responses request, dispatch it to upstream /responses, and translate the response back). Recommendation: a config flag --bridge-codex, defaulting to off in v1. Both branches are sketched below, after this list.
  • In the /v1/responses handler, when endpoint === 'chat' (chat-only model), return a 400 with "model 'X' does not support /responses on Copilot upstream"
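
A minimal sketch of what src/lib/model-capabilities.ts could look like. Note it takes the whole model entry rather than only the id, since seeding from capabilities.type needs the upstream object; the UpstreamModel field names are assumptions until the real /models payload is verified:

```ts
// src/lib/model-capabilities.ts: a sketch, not a final implementation.
// The UpstreamModel shape is an assumption; Copilot's /models payload
// (capabilities.type etc.) still needs to be verified per the task above.

export type Endpoint = "chat" | "responses" | "both";

export interface ModelCapabilities {
  endpoint: Endpoint;
  reasoning: boolean;
  vision: boolean;
}

// Minimal view of an upstream model entry; field names are assumptions.
export interface UpstreamModel {
  id: string;
  capabilities?: {
    type?: string; // e.g. "chat"; verify against the real /models response
    supports?: { reasoning?: boolean; vision?: boolean };
  };
}

// Guard list: families known to be Responses-only upstream (from this issue).
const RESPONSES_ONLY = new Set([
  "gpt-5-codex",
  "gpt-5.1-codex",
  "gpt-5.1-codex-mini",
  "gpt-5.1-codex-max",
  "gpt-5.3-codex",
  "o1-pro",
  "o3-pro",
]);

export function getModelCapabilities(model: UpstreamModel): ModelCapabilities {
  // Hardcoded guard wins: these ids must never be routed to /chat/completions.
  if (RESPONSES_ONLY.has(model.id)) {
    return { endpoint: "responses", reasoning: true, vision: false };
  }

  // Otherwise lean on the upstream signal where it exists, defaulting to
  // "both" so unknown models keep working exactly as they do today.
  const endpoint: Endpoint =
    model.capabilities?.type === "chat" ? "chat" : "both";

  return {
    endpoint,
    reasoning: model.capabilities?.supports?.reasoning ?? false,
    vision: model.capabilities?.supports?.vision ?? false,
  };
}
```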
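And a sketch of the dispatch decision both handlers would share, assuming the module above plus a hypothetical bridgeCodex config flag (--bridge-codex); the helper names and Config shape are illustrative only:

```ts
// Sketch of the dispatch decisions; not a final implementation. Assumes
// the model-capabilities module sketched above.
import {
  getModelCapabilities,
  type UpstreamModel,
} from "../lib/model-capabilities";

interface Config {
  bridgeCodex: boolean; // --bridge-codex, defaulting to off in v1
}

type RouteDecision =
  | { dispatch: "chat" | "responses" }
  | { status: 400; error: string };

// /v1/chat/completions: Responses-only models either bridge or 400,
// depending on the flag.
export function routeChatCompletions(
  model: UpstreamModel,
  config: Config,
): RouteDecision {
  const caps = getModelCapabilities(model);
  if (caps.endpoint === "responses") {
    if (config.bridgeCodex) {
      // Bridge: the caller translates the chat-completions request into a
      // Responses request, dispatches to upstream /responses, and
      // translates the reply back.
      return { dispatch: "responses" };
    }
    return {
      status: 400,
      error:
        `model '${model.id}' only supports /v1/responses on the Copilot upstream; ` +
        `call /v1/responses directly or enable --bridge-codex`,
    };
  }
  return { dispatch: "chat" };
}

// /v1/responses: chat-only models get the clear 400 named in the task above.
export function routeResponses(model: UpstreamModel): RouteDecision {
  const caps = getModelCapabilities(model);
  if (caps.endpoint === "chat") {
    return {
      status: 400,
      error: `model '${model.id}' does not support /responses on Copilot upstream`,
    };
  }
  return { dispatch: "responses" };
}
```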

Acceptance criteria

  • gpt-5.3-codex via /v1/responses works
  • gpt-5.3-codex via /v1/chat/completions returns a clear, actionable 400 (or transparently bridges, depending on flag)
  • gpt-4o via /v1/responses returns 400, not a confusing 5xx
  • Model list still appears in /v1/models with no breaking shape change
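
These criteria map onto a few smoke checks. A sketch, assuming the proxy listens on http://localhost:4141 (the base URL and the minimal request bodies are assumptions) and a fetch-capable runtime (Node 18+, ESM for top-level await):

```ts
// Hypothetical smoke test for the acceptance criteria above.
const BASE = "http://localhost:4141";

async function expectStatus(path: string, model: string, expected: number) {
  // Send the minimal body shape each endpoint expects.
  const body =
    path === "/v1/responses"
      ? { model, input: "ping" }
      : { model, messages: [{ role: "user", content: "ping" }] };
  const res = await fetch(`${BASE}${path}`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  console.log(path, model, res.status, res.status === expected ? "OK" : "FAIL");
}

await expectStatus("/v1/responses", "gpt-5.3-codex", 200);
await expectStatus("/v1/chat/completions", "gpt-5.3-codex", 400); // bridge flag off
await expectStatus("/v1/responses", "gpt-4o", 400); // clear 400, not a 5xx
```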

File pointers

  • src/routes/models/route.ts:16–24
  • src/services/copilot/get-models.ts
  • litellm reference: model_prices_and_context_window.json mode field; ProviderConfigManager.get_provider_responses_api_config (PR #19650)
