Streaming reasoning events translation #10

@HXYerror

Description

Part of #1. Depends on #4, #9.

Goal

Translate the SSE event stream of upstream `/responses` (and the streamed `reasoning_content` chunks of upstream `/chat/completions`) into:

  • OpenAI-shape SSE on `/v1/responses` (mostly straight passthrough)
  • Anthropic-shape SSE on `/v1/messages` (`message_start` → `content_block_start` (thinking) → `thinking_delta` × N → `signature_delta` → `content_block_stop` → next content_block_start (text) → ...)

Background

Upstream Responses-API SSE events include:

  • `response.created`, `response.in_progress`, `response.completed`, `response.failed`
  • `response.output_item.added` / `.done`
  • `response.content_part.added` / `.done`
  • `response.output_text.delta` / `.done`
  • `response.reasoning.delta` / `.done`
  • `response.reasoning_summary_text.delta` / `.done`
  • `response.function_call_arguments.delta` / `.done`

These need to map to Anthropic's:

  • `message_start`, `content_block_start`, `content_block_delta` (with `text_delta` | `thinking_delta` | `signature_delta` | `input_json_delta`), `content_block_stop`, `message_delta`, `message_stop`, `ping`
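The delta-level part of this mapping can be sketched as a small lookup. This is illustrative only (the `deltaTypeFor` helper and its shape are assumptions, not code from the repo); lifecycle events like `response.created` intentionally fall through, since they map to `message_*` / `content_block_*` events rather than deltas:

```typescript
// Anthropic content_block_delta subtypes named in this issue.
type AnthropicDeltaType =
  | "text_delta"
  | "thinking_delta"
  | "signature_delta"
  | "input_json_delta";

// Hypothetical helper: which Anthropic delta type an upstream
// Responses-API delta event should produce. Returns null for
// lifecycle events, which are handled separately.
function deltaTypeFor(upstreamEvent: string): AnthropicDeltaType | null {
  switch (upstreamEvent) {
    case "response.output_text.delta":
      return "text_delta";
    case "response.reasoning.delta":
    case "response.reasoning_summary_text.delta":
      return "thinking_delta";
    case "response.function_call_arguments.delta":
      return "input_json_delta";
    default:
      return null;
  }
}
```

Note `signature_delta` has no direct upstream delta event: it is synthesized once, on reasoning-item completion, from the item's `encrypted_content`.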

Current state

`src/routes/messages/stream-translation.ts` only handles the chat-completions stream shape (single `choices[0].delta`). `thinking_delta` and `signature_delta` types exist in `anthropic-types.ts:145–146` but have no producer.
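For the chat-mode case, the extension amounts to fanning one `choices[0].delta` chunk out into up to two Anthropic deltas. A minimal sketch, assuming the upstream field is named `reasoning_content` as described in this issue (the helper name and payload shapes are illustrative, not the repo's actual code):

```typescript
// Fields of interest on a chat-completions choices[0].delta chunk.
interface ChatDelta {
  content?: string;
  reasoning_content?: string;
}

type AnthDelta =
  | { type: "thinking_delta"; thinking: string }
  | { type: "text_delta"; text: string };

// A single upstream chunk may carry reasoning text, answer text, or both;
// reasoning is emitted first so the thinking block precedes the text block.
function anthropicDeltasFrom(delta: ChatDelta): AnthDelta[] {
  const out: AnthDelta[] = [];
  if (delta.reasoning_content) {
    out.push({ type: "thinking_delta", thinking: delta.reasoning_content });
  }
  if (delta.content) {
    out.push({ type: "text_delta", text: delta.content });
  }
  return out;
}
```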

Tasks

  • On `/v1/responses` streaming: pass the upstream SSE through with minimal transformation (only add headers / handle disconnects)
  • On `/v1/messages` streaming when target model is `responses`-mode (Model-to-endpoint routing (chat vs responses) #5):
    • Emit Anthropic `message_start`
    • For each upstream `response.output_item.added` of `type: reasoning` → emit Anthropic `content_block_start` (thinking)
    • Map `response.reasoning_summary_text.delta` → `thinking_delta`
    • On reasoning item completion, emit a `signature_delta` carrying the item's `encrypted_content`, then `content_block_stop`
    • For `message` items → `content_block_start` (text), `response.output_text.delta` → `text_delta`
    • For `function_call` items → `content_block_start` (tool_use), `response.function_call_arguments.delta` → `input_json_delta`
    • On `response.completed` → `message_delta` (with stop_reason and usage) + `message_stop`
  • On `/v1/messages` streaming when target model is `chat`-mode and upstream returns `reasoning_content` deltas (Reasoning types & reasoning_effort passthrough in chat-completions #7): emit `thinking` block in addition to `text` block
  • Periodic `ping` events to keep long reasoning sessions alive
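The per-reasoning-item sequence from the tasks above (`content_block_start` (thinking) → `thinking_delta` × N → `signature_delta` → `content_block_stop`) can be sketched as a generator. Payload shapes follow Anthropic's SSE format as described in this issue; the function name and parameters are assumptions for illustration:

```typescript
// Loose event shape; real code would use discriminated union types.
interface SseEvent {
  type: string;
  [k: string]: unknown;
}

// Emit the Anthropic event sequence for one completed reasoning item:
// block start, one thinking_delta per upstream reasoning delta, a single
// signature_delta carrying the item's encrypted_content, then block stop.
function* reasoningItemEvents(
  index: number,
  deltas: string[],
  encryptedContent: string,
): Generator<SseEvent> {
  yield {
    type: "content_block_start",
    index,
    content_block: { type: "thinking", thinking: "" },
  };
  for (const chunk of deltas) {
    yield {
      type: "content_block_delta",
      index,
      delta: { type: "thinking_delta", thinking: chunk },
    };
  }
  yield {
    type: "content_block_delta",
    index,
    delta: { type: "signature_delta", signature: encryptedContent },
  };
  yield { type: "content_block_stop", index };
}
```

`message` and `function_call` items would follow the same start/delta/stop pattern with `text` and `tool_use` blocks, which keeps the interleaved ordering (reasoning → tool_use → reasoning → text) a matter of processing upstream `output_item` events in arrival order.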

Acceptance criteria

  • `/v1/messages` streaming with `gpt-5.3-codex` shows incremental `thinking` text in Claude Code's UI
  • Tool calls inside reasoning context preserve correct ordering (reasoning → tool_use → reasoning → text)
  • Stream cancellation propagates upstream

File pointers

  • `src/routes/messages/stream-translation.ts` (extend)
  • New: `src/routes/responses/stream-translation.ts` (passthrough/normalization)

Metadata

Labels

  • `anthropic` (Anthropic Messages API compatibility)
  • `reasoning` (Reasoning / thinking / encrypted_content)
  • `responses-api` (OpenAI /v1/responses API support)
