74 changes: 74 additions & 0 deletions docs/prd/native-anthropic-passthrough.md
@@ -0,0 +1,74 @@
# Native Anthropic Pass-Through for Claude Models

## Status
Approved

## Overview
Route Anthropic `/v1/messages` requests for Claude models directly to the GitHub Copilot upstream's native Anthropic endpoint, bypassing the existing OpenAI translation layer. This preserves thinking blocks (including their `signature` field), `top_k`, `cache_control`, and richer usage stats, none of which survive the current translation round trip.

## Motivation
GitHub Copilot's upstream (`api.enterprise.githubcopilot.com`) natively speaks the Anthropic Messages API for all Claude 4.5+ models. The current code path translates the Anthropic request to OpenAI format, sends it, and translates the response back, losing:
- `thinking` blocks (completely dropped)
- `signature` field on thinking blocks (required for multi-turn reasoning)
- `cache_creation_input_tokens` in usage
- `top_k` parameter
- `cache_control` on system/user blocks

The fix: detect Claude models by `vendor === "Anthropic"` in the `/models` response and forward their requests to the upstream `/v1/messages` endpoint without translation.

## Requirements

1. **`create-messages-native.ts`** — Service client that POSTs Anthropic payloads directly to `${copilotBaseUrl}/v1/messages` with correct headers (`anthropic-version`, `anthropic-beta`).
2. **Route dispatch** — `handler.ts` checks `isNativeAnthropicModel(model)` and branches to native path for Claude, translation path for everything else.
3. **`native-models.ts`** — `isNativeAnthropicModel(modelId)` checks `state.models` vendor field; falls back to `claude-` prefix heuristic before models load.
4. **Type fixes** — `anthropic-types.ts`: `signature?` on `AnthropicThinkingBlock`; union `thinking` type for adaptive (opus-4.7+); `output_config`; `AnthropicImageBlock` URL source; `AnthropicToolResultBlock.content` widened.
5. **Adaptive thinking upgrade** — `create-messages-native.ts` auto-upgrades `{ type: "enabled" }` → `{ type: "adaptive" }` + `output_config.effort` for `claude-opus-4.7+` models.
6. **SSE proxy** — Streaming responses from native path forwarded verbatim to client (no re-translation needed).

## Acceptance Criteria

- Claude models (`vendor === "Anthropic"`) route to native path; non-Claude models route to translation path.
- Thinking blocks, including their `signature` field, are returned to the client in both streaming and non-streaming modes.
- Multi-turn conversations with thinking blocks (echoing `signature`) work correctly.
- `claude-opus-4.7+` with `{ type: "enabled" }` thinking auto-upgrades to adaptive format; no HTTP 400.
- All existing tests pass; new tests cover native vs. translation dispatch.

## Technical Approach

### Model detection
The `/models` endpoint response, cached in `state.models.data`, reports `vendor: "Anthropic"` for all Claude models. `isNativeAnthropicModel()` checks this field first and falls back to a `startsWith("claude-")` heuristic when the model list has not loaded yet.
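
The detection rule can be sketched as follows. This is a minimal illustration, not the contents of `native-models.ts`: the `ModelEntry` and `ModelsState` shapes are assumptions, the state is passed as a parameter (the real function takes only `modelId` and reads shared state), and falling back to the prefix check when a model id is missing from a loaded list is also an assumption.

```typescript
// Hypothetical minimal shape of the cached `/models` response; the real
// `state.models` type may carry more fields.
interface ModelEntry {
  id: string
  vendor: string
}

interface ModelsState {
  models?: { data: Array<ModelEntry> }
}

// Sketch: trust the vendor field when the model is listed, otherwise fall
// back to the `claude-` prefix heuristic (for example before models load).
function isNativeAnthropicModel(modelId: string, state: ModelsState): boolean {
  const entry = state.models?.data.find((m) => m.id === modelId)
  if (entry) {
    return entry.vendor === "Anthropic"
  }
  return modelId.startsWith("claude-")
}

// Illustrative model ids and vendors only.
const loaded: ModelsState = {
  models: {
    data: [
      { id: "claude-sonnet-4.5", vendor: "Anthropic" },
      { id: "gpt-4o", vendor: "OpenAI" },
    ],
  },
}
const notLoadedYet: ModelsState = {}
```

With `loaded`, the vendor field decides the route; with `notLoadedYet`, only the id prefix is available, so any id starting with `claude-` takes the native path.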

### Headers for native path
```
anthropic-version: 2023-06-01
anthropic-beta: interleaved-thinking-2025-05-14,prompt-caching-2024-07-31
```
Plus all standard Copilot headers (auth, editor-version, etc.).
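
A minimal sketch of how these headers might be merged with the standard Copilot ones. The helper name `buildNativeHeaders` and the `copilotHeaders` parameter are hypothetical; only the two `anthropic-*` values come from the list above.

```typescript
// Beta flags from the header list above, joined into one comma-separated value.
const ANTHROPIC_BETAS = [
  "interleaved-thinking-2025-05-14",
  "prompt-caching-2024-07-31",
]

// Hypothetical helper: `copilotHeaders` stands in for the standard Copilot
// headers (auth, editor-version, etc.) that are built elsewhere.
function buildNativeHeaders(
  copilotHeaders: Record<string, string>,
): Record<string, string> {
  return {
    ...copilotHeaders,
    // Spread first so the Anthropic-specific values always win on conflict.
    "anthropic-version": "2023-06-01",
    "anthropic-beta": ANTHROPIC_BETAS.join(","),
  }
}

const exampleHeaders = buildNativeHeaders({ authorization: "Bearer <token>" })
```

Spreading the Copilot headers first means a stray `anthropic-version` in the base set cannot override the pinned value.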

### Streaming proxy
The native upstream already emits Anthropic-format SSE events. Forward each `rawEvent.data` verbatim and parse it only to log the event's `type`; no translation is needed.

### Adaptive thinking (opus-4.7+)
If model matches `/^claude-opus-4[.-](\d+)/` with minor ≥ 7, auto-upgrade `{ type: "enabled", budget_tokens: N }` → `{ type: "adaptive" }` + `output_config: { effort: "medium" }`.
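
The upgrade rule above can be sketched as a pure function. This is an illustration under assumptions, not the actual `buildUpstreamPayload()`: the names `PayloadSlice`, `needsAdaptiveThinking`, and `upgradeThinking` are hypothetical, and keeping a caller-supplied `effort` instead of overwriting it with `"medium"` is an assumption.

```typescript
type ThinkingConfig =
  | { type: "enabled"; budget_tokens?: number }
  | { type: "adaptive" }

// Hypothetical slice of the request payload; only the fields the rule touches.
interface PayloadSlice {
  model: string
  thinking?: ThinkingConfig
  output_config?: { effort?: "low" | "medium" | "high" }
}

// Pattern from the text above: captures the minor version after `claude-opus-4`.
const OPUS_4_MINOR = /^claude-opus-4[.-](\d+)/

function needsAdaptiveThinking(model: string): boolean {
  const match = OPUS_4_MINOR.exec(model)
  return match !== null && Number(match[1]) >= 7
}

// Returns a new payload; the input object is never mutated.
function upgradeThinking(payload: PayloadSlice): PayloadSlice {
  if (payload.thinking?.type !== "enabled") return payload
  if (!needsAdaptiveThinking(payload.model)) return payload
  return {
    ...payload,
    // `budget_tokens` is dropped: the adaptive form carries no budget.
    thinking: { type: "adaptive" },
    // Assumption: respect an effort the caller already set, else "medium".
    output_config: {
      ...payload.output_config,
      effort: payload.output_config?.effort ?? "medium",
    },
  }
}
```

One thing worth checking against real model ids: the pattern as written treats any digits after `claude-opus-4-` as the minor version, so a date-style suffix would also satisfy the `>= 7` test.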

## File Changes

**New:**
- `src/services/copilot/create-messages-native.ts`
- `src/services/copilot/native-models.ts`

**Modified:**
- `src/routes/messages/anthropic-types.ts` — type fixes
- `src/routes/messages/handler.ts` — dispatch logic
- `src/routes/messages/non-stream-translation.ts` — remove stale comment; fix image source narrowing

## Testing Strategy
- Unit: `isNativeAnthropicModel()` with populated vs empty `state.models`
- Unit: `buildUpstreamPayload()` adaptive thinking upgrade
- Integration: handler routes Claude models to native, GPT models to translation
- Existing translation tests must still pass

## Out of Scope
- Persistent caching of native responses
- URL image sources (rejected by upstream; type kept for fidelity)
- Responses API (#1 epic)
39 changes: 30 additions & 9 deletions src/routes/messages/anthropic-types.ts
@@ -18,9 +18,16 @@ export interface AnthropicMessagesPayload {
type: "auto" | "any" | "tool" | "none"
name?: string
}
thinking?: {
type: "enabled"
budget_tokens?: number
/**
* Thinking config.
* - Legacy (claude-3.7 / claude-4.5): `{ type: "enabled", budget_tokens: N }`
* - New adaptive (claude-opus-4.7+): `{ type: "adaptive" }` paired with
* `output_config.effort` in the request body.
*/
thinking?: { type: "enabled"; budget_tokens?: number } | { type: "adaptive" }
/** Used together with `thinking: { type: "adaptive" }` on opus-4.7+. */
output_config?: {
effort?: "low" | "medium" | "high"
}
service_tier?: "auto" | "standard_only"
}
@@ -32,17 +39,24 @@ export interface AnthropicTextBlock {

export interface AnthropicImageBlock {
type: "image"
source: {
type: "base64"
media_type: "image/jpeg" | "image/png" | "image/gif" | "image/webp"
data: string
}
source:
| {
type: "base64"
media_type: "image/jpeg" | "image/png" | "image/gif" | "image/webp"
data: string
}
| {
/** URL images are rejected by Copilot upstream — kept for type fidelity only. */
type: "url"
url: string
}
}

export interface AnthropicToolResultBlock {
type: "tool_result"
tool_use_id: string
content: string
/** May be a plain string or an array of content blocks. */
content: string | Array<AnthropicTextBlock | AnthropicImageBlock>
is_error?: boolean
}

@@ -56,6 +70,12 @@ export interface AnthropicToolUseBlock {
export interface AnthropicThinkingBlock {
type: "thinking"
thinking: string
/**
* Opaque signature returned by the upstream for extended thinking blocks.
* Must be echoed back in subsequent turns to enable multi-turn reasoning.
* Present on native pass-through responses; absent on translated responses.
*/
signature?: string
}

export type AnthropicUserContentBlock =
@@ -106,6 +126,7 @@ export interface AnthropicResponse {
output_tokens: number
cache_creation_input_tokens?: number
cache_read_input_tokens?: number
/** Present on native pass-through responses. */
service_tier?: "standard" | "priority" | "batch"
}
}
78 changes: 74 additions & 4 deletions src/routes/messages/handler.ts
@@ -11,9 +11,12 @@ import {
type ChatCompletionChunk,
type ChatCompletionResponse,
} from "~/services/copilot/create-chat-completions"
import { createMessagesNative } from "~/services/copilot/create-messages-native"
import { isNativeAnthropicModel } from "~/services/copilot/native-models"

import {
type AnthropicMessagesPayload,
type AnthropicStreamEventData,
type AnthropicStreamState,
} from "./anthropic-types"
import {
@@ -28,16 +31,83 @@ export async function handleCompletion(c: Context) {
const anthropicPayload = await c.req.json<AnthropicMessagesPayload>()
consola.debug("Anthropic request payload:", JSON.stringify(anthropicPayload))

if (state.manualApprove) {
await awaitApproval()
}

// Route to native Anthropic pass-through for Claude models to preserve
// thinking blocks (with signature), top_k, cache_control, and richer usage.
if (isNativeAnthropicModel(anthropicPayload.model)) {
return handleNative(c, anthropicPayload)
}

return handleTranslated(c, anthropicPayload)
}

// ---------------------------------------------------------------------------
// Native Anthropic pass-through (Claude 4.5+ models)
// ---------------------------------------------------------------------------

async function handleNative(
c: Context,
payload: AnthropicMessagesPayload,
): Promise<Response> {
consola.debug("Using native Anthropic pass-through for", payload.model)

const response = await createMessagesNative(payload)

if (!payload.stream) {
// Non-streaming: upstream already returned a complete Anthropic response
consola.debug(
"Native non-streaming response:",
JSON.stringify(response).slice(0, 400),
)
return c.json(response)
}

// Streaming: proxy the SSE events directly to the client
consola.debug("Native streaming response — proxying SSE events")
return streamSSE(c, async (stream) => {
for await (const rawEvent of response as AsyncIterable<{
data?: string
event?: string
}>) {
if (!rawEvent.data) continue

// Forward verbatim — never block on parse failure
await stream.writeSSE({
event: rawEvent.event,
data: rawEvent.data,
})

// Parse only for debug logging
try {
const parsed = JSON.parse(rawEvent.data) as AnthropicStreamEventData
consola.debug("Native SSE event:", parsed.type)
} catch {
consola.warn(
"Could not parse native SSE chunk for logging:",
rawEvent.data.slice(0, 200),
)
}
}
})
}

// ---------------------------------------------------------------------------
// Translation path (non-Claude models via /chat/completions)
// ---------------------------------------------------------------------------

async function handleTranslated(
c: Context,
anthropicPayload: AnthropicMessagesPayload,
): Promise<Response> {
const openAIPayload = translateToOpenAI(anthropicPayload)
consola.debug(
"Translated OpenAI request payload:",
JSON.stringify(openAIPayload),
)

if (state.manualApprove) {
await awaitApproval()
}

const response = await createChatCompletions(openAIPayload)

if (isNonStreaming(response)) {
26 changes: 19 additions & 7 deletions src/routes/messages/non-stream-translation.ts
@@ -1,3 +1,5 @@
import consola from "consola"

import {
type ChatCompletionResponse,
type ChatCompletionsPayload,
@@ -213,12 +215,20 @@ function mapContent(
break
}
case "image": {
contentParts.push({
type: "image_url",
image_url: {
url: `data:${block.source.media_type};base64,${block.source.data}`,
},
})
if (block.source.type === "base64") {
contentParts.push({
type: "image_url",
image_url: {
url: `data:${block.source.media_type};base64,${block.source.data}`,
},
})
} else {
// URL images are rejected by Copilot upstream: skip the block and log a warning
// (type kept for fidelity when round-tripping through native path)
consola.warn(
"URL image source not supported in translation path — skipping",
)
}

break
}
@@ -302,7 +312,9 @@ export function translateToAnthropic(
}
}

// Note: GitHub Copilot doesn't generate thinking blocks, so we don't include them in responses
// Note: the translation path goes through /chat/completions, which does not
// return thinking blocks. For thinking block support use the native
// Anthropic pass-through path (create-messages-native.ts).

return {
id: response.id,