F1.B — Bidirectional model alias rewriting (request + response + SSE) #25

@HXYerror

Part of #23. Depends on F1.A.

Background

src/routes/chat-completions/route.ts and src/routes/messages/route.ts forward the request's model field upstream unmodified, and streaming responses in src/services/copilot/create-chat-completions.ts echo the upstream model name back to the client. To present stable client-facing aliases (e.g. sonnet → claude-sonnet-4), we need bidirectional rewriting.

⚠️ Reviewer caveat

The devil's-advocate review flagged this as the lowest-value, highest-fragility piece of the admin plane. Hide-only filtering is sufficient for the operator-control use case; bidirectional rewriting is purely cosmetic. Consider scoping this issue to hide-only in v0.8 and revisiting aliasing in a later release if real demand surfaces. If we do ship rewriting, the SSE chunk handling below is mandatory: a partial implementation will intermittently leak upstream names into client UIs.

Goal

Translate alias → upstream on ingress; if enabled === alias-mode, translate upstream → alias on egress for both JSON and SSE responses.
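A minimal sketch of the two directions, under stated assumptions: the issue names resolveAlias and applyAliasToResponse but does not give their signatures, so the AliasSnapshot shape, the extra snapshot parameter, and the aliasMode and strict field names below are all hypothetical. The strict branch follows the strict-mode task listed under Tasks.

```typescript
// Hypothetical snapshot shape; the real config-store format is not specified in this issue.
interface AliasSnapshot {
  aliasMode: boolean // assumed flag: rewrite upstream -> alias on egress
  strict: boolean // assumed flag: reject models that have no alias entry
  aliases: Record<string, string> // alias -> upstream model id
}

type Resolved =
  | { ok: true; model: string }
  | { ok: false; status: 400; body: { error: "unknown_model" } }

// Ingress: translate a client-facing alias to the upstream model id.
function resolveAlias(input: string, snap: AliasSnapshot): Resolved {
  const known = Object.prototype.hasOwnProperty.call(snap.aliases, input)
  const upstream = known ? snap.aliases[input] : undefined
  if (typeof upstream === "string") return { ok: true, model: upstream }
  if (snap.strict) {
    return { ok: false, status: 400, body: { error: "unknown_model" } }
  }
  // Non-strict: fall through and forward the name verbatim.
  return { ok: true, model: input }
}

// Egress: translate an upstream model id back to its alias.
// Only the top-level model field is replaced; every other field is copied as-is.
function applyAliasToResponse<T extends { model?: unknown }>(
  upstream: T,
  snap: AliasSnapshot,
): T {
  const model = upstream.model
  if (!snap.aliasMode || typeof model !== "string") return upstream
  for (const alias of Object.keys(snap.aliases)) {
    if (snap.aliases[alias] === model) return { ...upstream, model: alias }
  }
  return upstream
}
```

The reverse lookup here is a linear scan, which is fine for a handful of aliases. If two aliases point at the same upstream id, the first key wins, so a real implementation would probably want to reject that configuration or precompute a reverse map when the snapshot is loaded.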

Tasks

  • resolveAlias(input) and applyAliasToResponse(upstream) helpers backed by config-store snapshot
  • SSE rewrite at parsed-event layer ONLY (per backend review B2): use fetch-event-stream's events() to get one parsed JSON object per event; rewrite the top-level model field; never regex on raw bytes (tool-call argument JSON routinely contains "model":"…" substrings that would corrupt content)
  • Special-case the [DONE] sentinel
  • Apply at chat-completions route (request body, non-stream JSON, SSE chunks)
  • Apply at Anthropic messages route (request body, response body, streaming message_delta events)
  • Strict mode flag in config: if alias not found AND strict, return 400 {error: "unknown_model"}; else fall through to upstream verbatim
  • Assertion: alias rewriting touches the model field ONLY, never message bodies (regression guard for the X-Initiator: agent|user derivation in src/services/copilot/create-chat-completions.ts:21-29, per backend review #13, "Tests for Responses path & reasoning fidelity")
  • Tests: pass-through when no alias, ingress rewrite, SSE per-event rewrite, Anthropic shape, tool-call args containing "model":"…" substring NOT corrupted, multi-line data: frames
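The parsed-event rewrite described in the tasks above could look roughly like the sketch below. It is a pure function over one event's data payload, assuming that payload arrives as a string (the function name rewriteEventData and the toAlias callback are hypothetical). It would be called once per message produced by the SSE parser, before the chunk is re-serialized to the client.

```typescript
const DONE_SENTINEL = "[DONE]"

// Rewrite the top-level model field of a single SSE data payload.
// toAlias maps an upstream model id to its client-facing alias and
// returns the input unchanged when no alias applies.
function rewriteEventData(
  data: string,
  toAlias: (upstream: string) => string,
): string {
  // The [DONE] sentinel is not JSON; pass it through untouched.
  if (data.trim() === DONE_SENTINEL) return data

  let parsed: unknown
  try {
    parsed = JSON.parse(data)
  } catch {
    // Not JSON (e.g. a keep-alive payload): forward verbatim rather than guess.
    return data
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return data
  }

  const obj = parsed as Record<string, unknown>
  const model = obj["model"]
  if (typeof model !== "string") return data

  const alias = toAlias(model)
  if (alias === model) return data // nothing to rewrite; keep original bytes

  // Only the top-level key is replaced. Nested strings, including tool-call
  // arguments that happen to contain "model":"…", are never inspected.
  return JSON.stringify({ ...obj, model: alias })
}
```

Because tool-call arguments travel as a JSON-encoded string value inside the chunk, a parse and re-stringify round trip preserves that string's contents exactly, which a regex over raw bytes cannot guarantee. A multi-line data: frame that the parser has joined with newlines still parses as JSON, since newlines between tokens are plain whitespace; the rewritten output is emitted on a single line.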

Acceptance criteria

  • Client sending "model":"sonnet" reaches upstream as claude-sonnet-4; response stream shows only sonnet
  • Tool call with arguments JSON {"model": "gpt-4"} is preserved byte-exact in the streamed delta
  • p99 latency regression < 5% in SSE microbench
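For the latency criterion, one way to turn "p99 regression < 5%" into a check is sketched below. The issue does not say how p99 is computed or how the microbench collects samples, so the nearest-rank percentile and the helper names percentile, regressionPct and withinBudget are assumptions for illustration only.

```typescript
// Nearest-rank percentile over latency samples (any consistent unit, e.g. ms).
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples")
  const sorted = samples.slice().sort((a, b) => a - b)
  // Integer-friendly form of ceil(p/100 * n), clamped to a valid 1-based rank.
  const rank = Math.min(
    sorted.length,
    Math.max(1, Math.ceil((p * sorted.length) / 100)),
  )
  return sorted[rank - 1] as number
}

// Relative change of the candidate p99 against the baseline p99, in percent.
function regressionPct(
  baseline: readonly number[],
  candidate: readonly number[],
): number {
  const base = percentile(baseline, 99)
  const next = percentile(candidate, 99)
  return ((next - base) / base) * 100
}

// True when the candidate's p99 regression stays under the budget (default 5%).
function withinBudget(
  baseline: readonly number[],
  candidate: readonly number[],
  budgetPct = 5,
): boolean {
  return regressionPct(baseline, candidate) < budgetPct
}
```

The baseline samples would come from the SSE microbench with rewriting disabled and the candidate samples from the same run with rewriting enabled. With few samples p99 is dominated by a single outlier, so the benchmark needs enough iterations for the comparison to mean anything.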

File pointers

  • New: src/lib/alias.ts, tests/alias.test.ts
  • Touch: src/routes/chat-completions/handler.ts, src/routes/messages/handler.ts, src/services/copilot/create-chat-completions.ts

Dependencies

Depends on F1.A. Blocks F1.C.
