Background
GitHub Copilot's upstream now exposes both `/chat/completions` and `/responses`. Codex-family models (`gpt-5-codex`, `gpt-5.1-codex`, `gpt-5.1-codex-max`, `gpt-5.3-codex`) and o-pro models (`o1-pro`, `o3-pro`) are `/responses`-only on the upstream — they cannot be reached via `/chat/completions` at all. Other reasoning models (`gpt-5`, `gpt-5.1`, `gpt-5.2`, `o1`, `o3`, `o4-mini`) work on `/chat/completions` but in a degraded mode: there is no way to round-trip `reasoning.encrypted_content`, so multi-turn reasoning quality and cost both regress.
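The degraded mode can be made concrete with a sketch: on `/responses`, prior-turn reasoning items carry `encrypted_content` and can simply be echoed back as next-turn input, whereas `/chat/completions` has no field to return them in. The item shapes below are simplified assumptions for illustration, not copilot-api or upstream types.

```typescript
// Assumed, simplified shapes of Responses API conversation items.
type ReasoningItem = { type: "reasoning"; encrypted_content: string };
type MessageItem = {
  type: "message";
  role: "user" | "assistant";
  content: string;
};
type ResponsesItem = ReasoningItem | MessageItem;

// Build the next-turn `input` array for POST /responses, echoing prior
// reasoning items so the upstream can reuse them instead of re-thinking.
// On /chat/completions there is simply no slot for the reasoning item.
function buildNextTurnInput(
  priorOutput: ResponsesItem[],
  userMessage: string,
): ResponsesItem[] {
  return [
    ...priorOutput, // includes reasoning items with encrypted_content
    { type: "message", role: "user", content: userMessage },
  ];
}

const prior: ResponsesItem[] = [
  { type: "reasoning", encrypted_content: "gAAAA..." },
  { type: "message", role: "assistant", content: "Done." },
];
const input = buildNextTurnInput(prior, "Now refactor it.");
```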
copilot-api 0.7.0 is a strict Chat Completions–only proxy. A full audit confirms:

- No `/v1/responses` Hono route (src/server.ts:17–31)
- No upstream `${copilotBaseUrl}/responses` call site (src/services/copilot/)
- Zero matches for `reasoning`, `reasoning_content`, `reasoning_effort`, `encrypted_content`, `output_text`, `response_id` across the entire repo
- The Anthropic adapter explicitly suppresses `thinking` block output (src/routes/messages/non-stream-translation.ts:305)
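For the last point, here is a minimal sketch of what emitting `thinking` blocks instead of suppressing them could look like. The function and type names are hypothetical, not the adapter's real ones.

```typescript
// Hypothetical translation step: map a reasoning summary from the upstream
// into an Anthropic-style `thinking` content block rather than dropping it.
type AnthropicBlock =
  | { type: "thinking"; thinking: string }
  | { type: "text"; text: string };

function translateOutput(
  reasoningSummary: string | undefined,
  text: string,
): AnthropicBlock[] {
  const blocks: AnthropicBlock[] = [];
  // Today the adapter suppresses this channel; the fix would emit it.
  if (reasoningSummary) {
    blocks.push({ type: "thinking", thinking: reasoningSummary });
  }
  blocks.push({ type: "text", text });
  return blocks;
}
```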
Impact on users
| Use case | Today |
| --- | --- |
| `gpt-5-codex` / `5.1-codex` / `5.3-codex` / `-codex-max` | ❌ unreachable |
| `o1-pro`, `o3-pro` | ❌ unreachable |
| `gpt-5` / o-series multi-turn reasoning | ⚠️ `encrypted_content` lost, re-thinks every turn |
| Codex CLI (`local_shell`, `apply_patch`, `previous_response_id`) | ❌ impossible |
| Claude Code → Copilot reasoning models with thinking visibility | ❌ adapter drops it |
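The unreachable rows come down to upstream endpoint selection. A minimal routing sketch, with the model lists taken from the issue text and the function name assumed:

```typescript
// Models that the upstream serves only via /responses (per the issue text).
const RESPONSES_ONLY = new Set([
  "gpt-5-codex",
  "gpt-5.1-codex",
  "gpt-5.1-codex-max",
  "gpt-5.3-codex",
  "o1-pro",
  "o3-pro",
]);

// Pick the upstream path for a given model; everything else stays on
// /chat/completions for backward compatibility.
function upstreamPath(model: string): "/responses" | "/chat/completions" {
  return RESPONSES_ONLY.has(model) ? "/responses" : "/chat/completions";
}
```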
Sub-issues (suggested implementation order)

- #2 `/v1/responses` route scaffolding
- #15 `encrypted_content` for multi-turn reasoning
- #16 `reasoning_effort` passthrough in chat-completions
- #17 `/v1/messages` → Responses API adapter (Claude Code on Codex)
- #18 `thinking` blocks from reasoning channel
- #20 `Copilot-Vision-Request` header for image inputs in Responses path
- #21 `previous_response_id` stateful chain support
Dependency graph
```
#2 (route) ──┐
             ├─► #14 (routing) ─► #17 (anthropic→responses) ─► #18 (thinking) ─► #22 (tests)
#3 (client) ─┤         │                                           │
             │         └─► #19 (streaming) ────────────────────────┘
             ├─► #15 (encrypted_content)
             ├─► #20 (vision header)
             └─► #21 (previous_response_id)
```
#16 (chat-completions reasoning_effort) — independent, can land anytime
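For the streaming leg (#19), one plausible shape is translating Responses API SSE events into Chat Completions-style chunks. The event and field names below follow the public OpenAI Responses API; the chunk type is a simplified assumption, not copilot-api's.

```typescript
// Two representative Responses API streaming events (public event names).
type ResponsesEvent =
  | { type: "response.output_text.delta"; delta: string }
  | { type: "response.completed" };

// Simplified Chat Completions streaming chunk.
type ChatChunk = {
  choices: { delta: { content?: string }; finish_reason: string | null }[];
};

// Translate one Responses event into the chunk a /chat/completions
// client expects: text deltas become content deltas, completion
// becomes a finish_reason of "stop".
function toChatChunk(event: ResponsesEvent): ChatChunk {
  if (event.type === "response.output_text.delta") {
    return {
      choices: [{ delta: { content: event.delta }, finish_reason: null }],
    };
  }
  return { choices: [{ delta: {}, finish_reason: "stop" }] };
}
```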
References
- `litellm/llms/github_copilot/responses/transformation.py` (`GithubCopilotResponsesAPIConfig(OpenAIResponsesAPIConfig)`)
- `litellm/llms/anthropic/experimental_pass_through/responses_adapters/transformation.py`