feat: add OpenAI Responses API endpoint (/responses) #1
Conversation
Pull request overview
Adds OpenAI-compatible Responses API support by proxying GitHub Copilot’s native /responses endpoint and exposing it at both /responses and /v1/responses.
Changes:
- Introduces a new Copilot service (`createResponses`) to call `${copilotBaseUrl}/responses`, including type definitions and header logic (X-Initiator + vision).
- Adds a new Hono route + handler for `/responses` that applies rate-limiting/manual approval, auto-fills `max_output_tokens`, and supports SSE streaming.
- Mounts the new routes in `src/server.ts` for both unversioned and `/v1` prefixes.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/services/copilot/create-responses.ts | New Copilot proxy client for /responses, including payload/response type definitions and streaming support. |
| src/routes/responses/handler.ts | Implements request handling (rate limit, manual approval, token limit defaulting, streaming/non-streaming responses). |
| src/routes/responses/route.ts | Adds Hono router wiring for the new endpoint with error forwarding. |
| src/server.ts | Mounts /responses and /v1/responses routes alongside existing endpoints. |
| bun.lock | Lockfile updated (adds configVersion). |
```ts
await checkRateLimit(state)

let payload = await c.req.json<ResponsesPayload>()
consola.debug("Request payload:", JSON.stringify(payload).slice(-400))
```
Verbose debug logging prints portions of the raw request payload, which can include user prompts, tool arguments, or image/file URLs. Consider redacting sensitive fields (e.g., `input`, `instructions`, `metadata`) or logging only high-level metadata (model, stream flag, token limits) to avoid accidental disclosure when `--verbose` is enabled.
| consola.debug("Request payload:", JSON.stringify(payload).slice(-400)) | |
| consola.debug("Request metadata:", { | |
| model: payload.model, | |
| stream: payload.stream, | |
| max_output_tokens: payload.max_output_tokens, | |
| }) |
```ts
for await (const chunk of response) {
  consola.debug("Streaming chunk:", JSON.stringify(chunk))
  if (!chunk.data) continue
  await stream.writeSSE({
    event: chunk.event,
    data: chunk.data,
  })
}
```
The streaming path logs every SSE chunk (`consola.debug("Streaming chunk", ...)`). For long responses this can generate very large logs and add significant overhead in verbose mode. Consider sampling, truncating, or logging only event types/byte sizes instead of the full chunk payload.
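One way to keep verbose mode usable, sketched below: log a truncated preview of the first few chunks, then only the event type and payload size afterwards. The cap and preview length are illustrative, not part of the PR.

```ts
// Illustrative constants, not from the PR.
const MAX_PREVIEW_CHUNKS = 5
const PREVIEW_LENGTH = 200

let previewed = 0
for await (const chunk of response) {
  if (previewed < MAX_PREVIEW_CHUNKS) {
    // Truncated preview of the first few chunks only.
    consola.debug(
      "Streaming chunk:",
      JSON.stringify(chunk).slice(0, PREVIEW_LENGTH),
    )
    previewed++
  } else {
    // After that, log just the event type and payload size.
    consola.debug("Streaming chunk:", chunk.event, `${chunk.data?.length ?? 0} bytes`)
  }
  if (!chunk.data) continue
  await stream.writeSSE({
    event: chunk.event,
    data: chunk.data,
  })
}
```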
```ts
export const createResponses = async (payload: ResponsesPayload) => {
  if (!state.copilotToken) throw new Error("Copilot token not found")

  const enableVision =
    Array.isArray(payload.input)
    && payload.input.some(
      (x) =>
        Array.isArray(x.content)
        && x.content.some((part) => part.type === "input_image"),
    )

  const isAgentCall =
    Array.isArray(payload.input)
    && payload.input.some((msg) => ["assistant", "tool"].includes(msg.role))

  const headers: Record<string, string> = {
    ...copilotHeaders(state, enableVision),
    "X-Initiator": isAgentCall ? "agent" : "user",
  }
```
The new `/responses` service adds logic for `X-Initiator` and vision header selection, but there are no tests covering these behaviors (unlike `create-chat-completions`, which has `tests/create-chat-completions.test.ts`). Add unit tests for `createResponses` verifying that `X-Initiator` switches to `agent` when the input contains assistant/tool messages, and that `copilot-vision-request` is set when the input includes an `input_image` part; a test sketch follows the suggested refactor below.
Suggested change:

```diff
-export const createResponses = async (payload: ResponsesPayload) => {
-  if (!state.copilotToken) throw new Error("Copilot token not found")
-
-  const enableVision =
-    Array.isArray(payload.input)
-    && payload.input.some(
-      (x) =>
-        Array.isArray(x.content)
-        && x.content.some((part) => part.type === "input_image"),
-    )
-
-  const isAgentCall =
-    Array.isArray(payload.input)
-    && payload.input.some((msg) => ["assistant", "tool"].includes(msg.role))
-
-  const headers: Record<string, string> = {
-    ...copilotHeaders(state, enableVision),
-    "X-Initiator": isAgentCall ? "agent" : "user",
-  }
+export const responsesPayloadHasImageInput = (payload: ResponsesPayload) =>
+  Array.isArray(payload.input)
+  && payload.input.some(
+    (message) =>
+      Array.isArray(message.content)
+      && message.content.some((part) => part.type === "input_image"),
+  )
+
+export const responsesPayloadIsAgentCall = (payload: ResponsesPayload) =>
+  Array.isArray(payload.input)
+  && payload.input.some((message) => ["assistant", "tool"].includes(message.role))
+
+export const createResponsesHeaders = (payload: ResponsesPayload) => {
+  const enableVision = responsesPayloadHasImageInput(payload)
+  const isAgentCall = responsesPayloadIsAgentCall(payload)
+  return {
+    ...copilotHeaders(state, enableVision),
+    "X-Initiator": isAgentCall ? "agent" : "user",
+  }
+}
+
+export const createResponses = async (payload: ResponsesPayload) => {
+  if (!state.copilotToken) throw new Error("Copilot token not found")
+  const headers = createResponsesHeaders(payload)
```
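If the refactor above is adopted, the header predicates become testable without network calls. A minimal sketch, assuming Bun's test runner (as the lockfile suggests) and that `ResponsesPayload` is exported from the service; the file name, payload field shapes, and casts are assumptions, not verified against the PR:

```ts
// tests/create-responses.test.ts — sketch only; assumes the extracted
// helpers from the suggested refactor above.
import { expect, test } from "bun:test"

import {
  responsesPayloadHasImageInput,
  responsesPayloadIsAgentCall,
  type ResponsesPayload,
} from "../src/services/copilot/create-responses"

test("treats assistant/tool messages as an agent call", () => {
  const payload = {
    input: [
      { role: "user", content: "hi" },
      { role: "assistant", content: "hello" },
    ],
  } as unknown as ResponsesPayload // field names assumed, not verified
  expect(responsesPayloadIsAgentCall(payload)).toBe(true)
})

test("plain user input is not an agent call", () => {
  const payload = {
    input: [{ role: "user", content: "hi" }],
  } as unknown as ResponsesPayload
  expect(responsesPayloadIsAgentCall(payload)).toBe(false)
})

test("detects input_image parts for the vision header", () => {
  const payload = {
    input: [
      {
        role: "user",
        content: [{ type: "input_image", image_url: "https://example.com/a.png" }],
      },
    ],
  } as unknown as ResponsesPayload
  expect(responsesPayloadHasImageInput(payload)).toBe(true)
})
```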
Adds `POST /responses` and `POST /v1/responses` endpoints that proxy GitHub Copilot's native `/responses` endpoint, implementing the OpenAI Responses API alongside the existing Chat Completions API.

New files

- `src/services/copilot/create-responses.ts` — fetches `${copilotBaseUrl}/responses`; full TypeScript types for `ResponsesPayload`, `ResponseObject`, `OutputItem`, etc.; sets `X-Initiator` and vision headers using the same logic as the completions service
- `src/routes/responses/handler.ts` — auto-fills `max_output_tokens` from model capabilities, respects `manualApprove`/rate-limiting, streams or returns the response
- `src/routes/responses/route.ts` — Hono router with error forwarding

Server

`/responses` and `/v1/responses` are mounted in `src/server.ts`, mirroring the existing dual-mount pattern for completions.

Usage

Response shape: `{ "object": "response", "output": [...] }` with SSE events for streaming.