3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,9 @@
- Record-mode live proxying for the Veo surface (`record.providers.veo`) — submit and poll forwarded 1:1, eager fixture capture of the Files-API uri on `done:true`; captured operations replay later (#278)
- Native xAI Grok Imagine async video lifecycle mock — `POST /v1/videos/generations` submit (JSON-only; multipart rejected with 400), `GET /v1/videos/{request_id}` poll through `pending → done | failed | expired` with synthesized `progress`, `grokVideo` progression, `cost_in_usd_ticks` units, and a Sora-safe `/v1/videos/{id}` dispatch that leaves the OpenAI video surface unchanged (#278)
- Record-mode live proxying for the Grok surface (`record.providers.grok`) — submit and poll forwarded 1:1, eager fixture capture of url/duration/cost on `done`, `failed` persisted, `expired` passed through; captured jobs replay later (#278)
- Optional `blocks` array on the combined `content` + `toolCalls` fixture shape lets a fixture express ordered text/tool-call blocks (`{type:"text",text}` | `{type:"toolCall",name,arguments,id?}`); when present it takes precedence over `{content, toolCalls}` for stream order, enabling tool-first and interleaved ordering. Legacy `{content, toolCalls}` fixtures are unchanged (#274)
- All five providers stream combined responses in fixture block order: Anthropic, OpenAI Responses, and Gemini are fully observable; Ollama is best-effort (clients may reassemble positionally); OpenAI chat-completions emits in order but is degenerate (`delta.content`/`delta.tool_calls` are separate channels the client merges) (#274)
- Recorder captures block order and persists `blocks` only when the recorded upstream stream was genuinely tool-first or interleaved; text-first streams keep the legacy `{content, toolCalls}` shape so golden recordings round-trip byte-identically (#274)
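
The recorder rule in the last entry can be sketched as a single pass over the upstream deltas. This is a minimal illustration, not the recorder's actual code: the `StreamEvent` shape and the `shouldPersistBlocks` name are hypothetical, and the sketch assumes a combined response (one that contains at least one content delta).

```typescript
// Hypothetical, simplified event shape; the recorder's real internal types
// are not shown in this diff.
type StreamEvent =
  | { kind: "content"; text: string }
  | { kind: "toolCall"; name: string; argumentsDelta: string };

// Returns true when any content delta arrives after a tool-call delta.
// For a combined response this covers both cases named above: tool-first
// (a tool call precedes the first content delta) and interleaved
// (content resumes after a tool call).
function shouldPersistBlocks(events: StreamEvent[]): boolean {
  let sawToolCall = false;
  for (const event of events) {
    if (event.kind === "toolCall") {
      sawToolCall = true;
    } else if (sawToolCall) {
      return true; // content after a tool call: order must be recorded
    }
  }
  return false; // text-then-tools (or text only): keep the legacy shape
}
```

Under this sketch an ordinary text-then-tools stream never sets the flag before its content ends, which is why such recordings would keep the legacy `{content, toolCalls}` shape.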

## [1.34.0] - 2026-06-24

11 changes: 11 additions & 0 deletions docs/chat-completions/index.html
@@ -227,6 +227,17 @@ <h3>Streaming (stream: true)</h3>
<code>ChatCompletionChunk</code> type with <code>delta</code> instead of
<code>message</code>.
</p>

<h3>Ordered blocks (tool-first)</h3>
<p>
A combined <code>content</code> + <code>toolCalls</code> fixture accepts an optional
<code>blocks</code> array to control stream order &mdash; see
<a href="/fixtures#ordered-blocks">Ordered blocks</a>. On chat-completions this is
<strong>degenerate</strong>: <code>delta.content</code> and <code>delta.tool_calls</code>
are separate channels the client merges, so the mock emits chunks in block order (the wire
order is assertable) but tool-first is not positionally observable to clients. Use
Anthropic, the Responses API, or Gemini for fully observable tool-first ordering.
</p>
</main>
<aside class="page-toc" id="page-toc"></aside>
</div>
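
To illustrate why tool-first is described as degenerate on chat-completions, here is a sketch of how a typical client might accumulate streamed deltas. The `Delta` and `MergedMessage` types are a simplified, hypothetical subset of the chunk shape, and `mergeDeltas` is an illustrative name rather than any SDK's API.

```typescript
// Simplified, hypothetical subset of a chat-completions streaming delta.
interface Delta {
  content?: string;
  tool_calls?: { index: number; function?: { name?: string; arguments?: string } }[];
}

interface MergedMessage {
  content: string;
  toolCalls: { name: string; arguments: string }[];
}

// Accumulates deltas the way many clients do: text goes into one field,
// tool calls into another, keyed by their own index. Nothing records where
// a tool call sat relative to the text.
function mergeDeltas(deltas: Delta[]): MergedMessage {
  const merged: MergedMessage = { content: "", toolCalls: [] };
  for (const delta of deltas) {
    if (delta.content) merged.content += delta.content;
    for (const tc of delta.tool_calls ?? []) {
      let slot = merged.toolCalls[tc.index];
      if (slot === undefined) {
        slot = { name: "", arguments: "" };
        merged.toolCalls[tc.index] = slot;
      }
      slot.name += tc.function?.name ?? "";
      slot.arguments += tc.function?.arguments ?? "";
    }
  }
  return merged;
}

const toolDelta: Delta = {
  tool_calls: [{ index: 0, function: { name: "get_weather", arguments: '{"city":"SF"}' } }],
};
const textDelta: Delta = { content: "Here is the weather." };

// Both wire orders merge to the same message.
const toolFirst = mergeDeltas([toolDelta, textDelta]);
const textFirst = mergeDeltas([textDelta, toolDelta]);
```

Because `toolFirst` and `textFirst` end up structurally identical, only a test that inspects the raw chunk sequence can assert the order on this surface.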
10 changes: 10 additions & 0 deletions docs/claude-messages/index.html
@@ -146,6 +146,16 @@ <h2>Request Translation</h2>
arrays (including content block arrays) to OpenAI-style messages so the same fixtures work
across all providers.
</p>

<h2>Ordered blocks (tool-first)</h2>
<p>
A combined <code>content</code> + <code>toolCalls</code> fixture accepts an optional
<code>blocks</code> array to control stream order &mdash; see
<a href="/fixtures#ordered-blocks">Ordered blocks</a>. Claude Messages has
<strong>full</strong> support: typed <code>text</code> / <code>tool_use</code> content
blocks stream at incrementing indices in array order, so tool-first and interleaved
ordering are natively observable to clients.
</p>
</main>
<aside class="page-toc" id="page-toc"></aside>
</div>
122 changes: 120 additions & 2 deletions docs/fixtures/index.html
@@ -312,8 +312,12 @@ <h2>Response Types</h2>
</tr>
<tr>
<td>Content + Tool Calls</td>
- <td>content, toolCalls[], reasoning?, finishReason?</td>
- <td>Text and tool calls in a single response</td>
<td>content, toolCalls[], blocks?, reasoning?, finishReason?</td>
<td>
Text and tool calls in a single response. Add an optional
<code>blocks</code> array to control stream order (e.g. tool-first) &mdash; see
<a href="#ordered-blocks">Ordered blocks</a> below.
</td>
</tr>
<tr>
<td>Error</td>
@@ -362,6 +366,120 @@ <h2>Response Types</h2>
</p>
</div>

<h2 id="ordered-blocks">Ordered blocks (tool-first &amp; interleaved streaming)</h2>
<p>
By default a <strong>Content + Tool Calls</strong> response streams its text first, then
its tool calls. To control that order &mdash; for example to emit a tool call
<em>before</em> any text (&ldquo;tool-first&rdquo;), or to interleave text and tool calls
&mdash; add an optional <code>blocks</code> array. Each entry is one of:
</p>
<ul>
<li><code>{ "type": "text", "text": "..." }</code> &mdash; a text segment</li>
<li>
<code>{ "type": "toolCall", "name": "...", "arguments": ..., "id": "..." }</code>
&mdash; a tool call (<code>id</code> is optional; <code>arguments</code> accepts a string
or an object; objects are auto-stringified the same way as <code>toolCalls[].arguments</code>)
</li>
</ul>
<p>
When <code>blocks</code> is present it <strong>takes precedence</strong> over the
<code>content</code> and <code>toolCalls</code> fields for stream ordering: the blocks are
streamed in array order. (Keep <code>content</code> and <code>toolCalls</code> populated
as well &mdash; they remain the canonical aggregate for replay and for consumers that do
not read <code>blocks</code>.) When <code>blocks</code> is <strong>absent</strong>, legacy
<code>{ content, toolCalls }</code> fixtures stream exactly as before &mdash; text-first,
byte-identical to prior releases. The field is purely additive.
</p>
<div class="code-block">
<div class="code-block-header">tool-first.json <span class="lang-tag">json</span></div>
<pre><code>{
<span class="prop">"content"</span>: <span class="str">"Here is the weather."</span>,
<span class="prop">"toolCalls"</span>: [
{ <span class="prop">"name"</span>: <span class="str">"get_weather"</span>, <span class="prop">"arguments"</span>: { <span class="prop">"city"</span>: <span class="str">"SF"</span> } }
],
<span class="prop">"blocks"</span>: [
{ <span class="prop">"type"</span>: <span class="str">"toolCall"</span>, <span class="prop">"name"</span>: <span class="str">"get_weather"</span>, <span class="prop">"arguments"</span>: { <span class="prop">"city"</span>: <span class="str">"SF"</span> }, <span class="prop">"id"</span>: <span class="str">"call_1"</span> },
{ <span class="prop">"type"</span>: <span class="str">"text"</span>, <span class="prop">"text"</span>: <span class="str">"Here is the weather."</span> }
]
}</code></pre>
</div>
<p>
The example above streams the <code>get_weather</code> tool call <em>before</em> the text.
For an interleaved stream, list blocks in the desired order, e.g.
<code>[toolCall, text, toolCall]</code>.
</p>

<h3>Per-provider observability</h3>
<p>
How faithfully &ldquo;tool-first&rdquo; / interleaved order is observable depends on each
provider's wire protocol. The mock always emits chunks in block order; what a client can
<em>reconstruct</em> from those chunks varies:
</p>
<table class="endpoint-table">
<thead>
<tr>
<th>Provider</th>
<th>Block-order support</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td>Anthropic (Claude Messages)</td>
<td>Full</td>
<td>
Typed <code>text</code> / <code>tool_use</code> content blocks at incrementing
indices &mdash; tool-first and interleaved are natively observable.
</td>
</tr>
<tr>
<td>OpenAI Responses API</td>
<td>Full</td>
<td>
Ordered <code>output</code> items (message vs <code>function_call</code>) carry
<code>output_index</code> &mdash; SDKs honor the order, so a tool call can precede
the message.
</td>
</tr>
<tr>
<td>Gemini</td>
<td>Full</td>
<td>
Ordered parts/candidate chunks carry <code>functionCall</code> and text in array
order, so tool-first and interleaved ordering are observable.
</td>
</tr>
<tr>
<td>Ollama</td>
<td>Partial</td>
<td>
A <code>tool_calls</code> chunk can be emitted before content on the wire, but some
clients reassemble positionally, so tool-first ordering is best-effort.
</td>
</tr>
<tr>
<td>OpenAI chat-completions</td>
<td>Degenerate</td>
<td>
<code>delta.content</code> and <code>delta.tool_calls</code> are separate channels
the client merges. The mock emits chunks in block order (and the wire order is
assertable), but the merge is <em>not</em> positionally interleaved, so tool-first
is not semantically observable to clients on this channel.
</td>
</tr>
</tbody>
</table>
<div class="info-box">
<p>
<strong>Recording:</strong> In record mode the recorder only persists a
<code>blocks</code> array when the recorded upstream stream was
<em>genuinely</em> tool-first or interleaved (a tool-call delta arrives before the first
content delta, or content arrives after a tool-call delta). Ordinary text-then-tools
streams are saved in the legacy <code>{ content, toolCalls }</code> shape with no
<code>blocks</code> key, so existing golden recordings round-trip byte-identically.
</p>
</div>

<div class="info-box">
<p>
<strong>JSON auto-stringify:</strong> In fixture files and programmatic API,
10 changes: 10 additions & 0 deletions docs/gemini/index.html
@@ -183,6 +183,16 @@ <h2>Vertex AI</h2>
The same fixtures work for both Gemini AI Studio and Vertex AI endpoints. See the
<a href="/vertex-ai">Vertex AI</a> page for configuration details.
</p>

<h2>Ordered blocks (tool-first)</h2>
<p>
A combined <code>content</code> + <code>toolCalls</code> fixture accepts an optional
<code>blocks</code> array to control stream order &mdash; see
<a href="/fixtures#ordered-blocks">Ordered blocks</a>. Gemini has
<strong>full</strong> support: ordered parts/candidate chunks carry
<code>functionCall</code> and text in array order, so tool-first and interleaved ordering
are observable to clients.
</p>
</main>
<aside class="page-toc" id="page-toc"></aside>
</div>
10 changes: 10 additions & 0 deletions docs/ollama/index.html
@@ -255,6 +255,16 @@ <h2>Request Translation</h2>
<code>options.num_predict</code> to <code>max_tokens</code>, so the same fixtures work
across all providers.
</p>

<h2>Ordered blocks (tool-first)</h2>
<p>
A combined <code>content</code> + <code>toolCalls</code> fixture accepts an optional
<code>blocks</code> array to control stream order &mdash; see
<a href="/fixtures#ordered-blocks">Ordered blocks</a>. Ollama support is
<strong>partial</strong>: a <code>tool_calls</code> chunk can be emitted before content on
the NDJSON wire, but some clients reassemble positionally, so tool-first is best-effort on
this provider.
</p>
</main>
<aside class="page-toc" id="page-toc"></aside>
</div>
10 changes: 10 additions & 0 deletions docs/responses-api/index.html
@@ -160,6 +160,16 @@ <h2>SSE Event Sequence</h2>
<a href="/websocket">WebSocket APIs</a> page for WebSocket-specific details.
</p>
</div>

<h2>Ordered blocks (tool-first)</h2>
<p>
A combined <code>content</code> + <code>toolCalls</code> fixture accepts an optional
<code>blocks</code> array to control stream order &mdash; see
<a href="/fixtures#ordered-blocks">Ordered blocks</a>. The Responses API has
<strong>full</strong> support: <code>output</code> items (message vs
<code>function_call</code>) are assigned <code>output_index</code> in array order, so a
tool call can precede the message and SDKs honor the ordering.
</p>
</main>
<aside class="page-toc" id="page-toc"></aside>
</div>
37 changes: 37 additions & 0 deletions src/__tests__/async-fixture-response.test.ts
@@ -225,6 +225,43 @@ describe("async fixture response (function responses)", () => {
expect(res.status).toBe(500);
});

it("stringifies object arguments on a factory-returned toolCall block", async () => {
mock = new LLMock({ port: 0 });
mock.on(
{ userMessage: "blocks-fn" },
() =>
({
content: "Here you go.",
toolCalls: [{ name: "get_weather", arguments: { city: "NYC" } }],
blocks: [
// OBJECT arguments — must be auto-stringified like toolCalls[].arguments,
// otherwise resolveFixtureBlocks throws (FixtureBlock requires string args).
// eslint-disable-next-line @typescript-eslint/no-explicit-any
{ type: "toolCall", name: "get_weather", arguments: { city: "NYC" } } as any,
],
// eslint-disable-next-line @typescript-eslint/no-explicit-any
}) as any,
);
await mock.start();

const res = await fetch(`${mock.url}/v1/chat/completions`, {
method: "POST",
headers: { "Content-Type": "application/json", Authorization: "Bearer test" },
body: JSON.stringify({
model: "gpt-4o",
messages: [{ role: "user", content: "blocks-fn" }],
stream: true,
}),
});

expect(res.status).toBe(200);
const chunks = parseSSEChunks(await res.text());
const args = chunks
.map((c) => c.choices?.[0]?.delta?.tool_calls?.[0]?.function?.arguments ?? "")
.join("");
expect(args).toBe('{"city":"NYC"}');
});
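
The test above depends on two behaviours: `blocks` taking precedence over `{content, toolCalls}`, and object `arguments` being stringified before validation. A minimal sketch of that normalization follows. The names `RawFixture`, `stringifyArgs`, and `effectiveBlocks` are hypothetical, and the text-first fallback is an assumption based on the documented legacy ordering, not the library's actual code.

```typescript
type RawArgs = string | Record<string, unknown>;

type Block =
  | { type: "text"; text: string }
  | { type: "toolCall"; name: string; arguments: string; id?: string };

type RawBlock =
  | { type: "text"; text: string }
  | { type: "toolCall"; name: string; arguments: RawArgs; id?: string };

// Hypothetical pre-normalization fixture shape.
interface RawFixture {
  content: string;
  toolCalls: { name: string; arguments: RawArgs; id?: string }[];
  blocks?: RawBlock[];
}

// Strings pass through untouched; objects are serialized with JSON.stringify.
const stringifyArgs = (args: RawArgs): string =>
  typeof args === "string" ? args : JSON.stringify(args);

function effectiveBlocks(fixture: RawFixture): Block[] {
  if (fixture.blocks !== undefined) {
    // blocks present: array order wins over content/toolCalls.
    return fixture.blocks.map((b) =>
      b.type === "text" ? b : { ...b, arguments: stringifyArgs(b.arguments) },
    );
  }
  // Legacy shape: text first, then the tool calls in their listed order.
  return [
    { type: "text", text: fixture.content },
    ...fixture.toolCalls.map((tc) => ({
      ...tc,
      type: "toolCall" as const,
      arguments: stringifyArgs(tc.arguments),
    })),
  ];
}
```

Normalizing before validation keeps a single string-only `arguments` contract downstream, which matches the comment in the test about `FixtureBlock` requiring string args.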

it("works with async factory and streaming", async () => {
mock = new LLMock({ port: 0 });
mock.on({ userMessage: "async-stream" }, async () => {
66 changes: 64 additions & 2 deletions src/__tests__/content-with-toolcalls.test.ts
@@ -1,7 +1,12 @@
import { describe, it, expect, afterEach } from "vitest";
- import { isContentWithToolCallsResponse, isTextResponse, isToolCallResponse } from "../helpers.js";
import {
isContentWithToolCallsResponse,
isTextResponse,
isToolCallResponse,
resolveFixtureBlocks,
} from "../helpers.js";
import { LLMock } from "../llmock.js";
- import type { SSEChunk } from "../types.js";
import type { FixtureBlock, SSEChunk } from "../types.js";

describe("isContentWithToolCallsResponse", () => {
it("returns true when both content and toolCalls are present", () => {
@@ -39,6 +44,63 @@ describe("isContentWithToolCallsResponse", () => {
});
});

describe("resolveFixtureBlocks", () => {
it("passes a valid mixed blocks array through in order", () => {
const blocks: FixtureBlock[] = [
{ type: "toolCall", name: "get_weather", arguments: '{"city":"NYC"}' },
{ type: "text", text: "Here you go" },
{ type: "toolCall", name: "get_time", arguments: "{}", id: "call_1" },
];
const result = resolveFixtureBlocks(blocks);
// Same reference, same order — passthrough, not reconstruction.
expect(result).toBe(blocks);
expect(result.map((b) => b.type)).toEqual(["toolCall", "text", "toolCall"]);
});

it("accepts a text block with a string text field", () => {
const blocks: FixtureBlock[] = [{ type: "text", text: "hi" }];
expect(resolveFixtureBlocks(blocks)).toEqual(blocks);
});

it("accepts a toolCall block without an optional id", () => {
const blocks: FixtureBlock[] = [{ type: "toolCall", name: "f", arguments: "{}" }];
expect(resolveFixtureBlocks(blocks)).toEqual(blocks);
});

it("rejects a non-array argument", () => {
expect(() => resolveFixtureBlocks({} as unknown as FixtureBlock[])).toThrow(
/expected an array/,
);
});

it("rejects a text block with a non-string text field", () => {
const blocks = [{ type: "text", text: 42 }] as unknown as FixtureBlock[];
expect(() => resolveFixtureBlocks(blocks)).toThrow(/index 0.*string "text"/);
});

it("rejects a toolCall block missing arguments", () => {
const blocks = [{ type: "toolCall", name: "f" }] as unknown as FixtureBlock[];
expect(() => resolveFixtureBlocks(blocks)).toThrow(/index 0.*"name" and "arguments"/);
});

it("rejects a toolCall block with a non-string id", () => {
const blocks = [
{ type: "toolCall", name: "f", arguments: "{}", id: 1 },
] as unknown as FixtureBlock[];
expect(() => resolveFixtureBlocks(blocks)).toThrow(/index 0.*"id" must be a string/);
});

it("rejects a block with an unknown type", () => {
const blocks = [{ type: "image" }] as unknown as FixtureBlock[];
expect(() => resolveFixtureBlocks(blocks)).toThrow(/unknown type/);
});

it("rejects a null entry", () => {
const blocks = [null] as unknown as FixtureBlock[];
expect(() => resolveFixtureBlocks(blocks)).toThrow(/index 0.*expected an object/);
});
});
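
Taken together, the cases above pin down a contract: pass a valid array through by reference, and fail with the offending index otherwise. The sketch below is one validator that would satisfy those assertions. It is not the implementation in `../helpers.js`: `validateBlocks` is a hypothetical name, and the exact error strings are assumptions chosen only to match the regexes in the tests.

```typescript
type FixtureBlock =
  | { type: "text"; text: string }
  | { type: "toolCall"; name: string; arguments: string; id?: string };

// Hypothetical stand-in for resolveFixtureBlocks; messages are illustrative.
function validateBlocks(blocks: unknown): FixtureBlock[] {
  if (!Array.isArray(blocks)) {
    throw new Error("blocks: expected an array");
  }
  blocks.forEach((block: unknown, i: number) => {
    if (typeof block !== "object" || block === null) {
      throw new Error(`blocks: index ${i}: expected an object`);
    }
    const b = block as Record<string, unknown>;
    if (b.type === "text") {
      if (typeof b.text !== "string") {
        throw new Error(`blocks: index ${i}: text block requires a string "text"`);
      }
    } else if (b.type === "toolCall") {
      if (typeof b.name !== "string" || typeof b.arguments !== "string") {
        throw new Error(`blocks: index ${i}: toolCall block requires string "name" and "arguments"`);
      }
      if (b.id !== undefined && typeof b.id !== "string") {
        throw new Error(`blocks: index ${i}: "id" must be a string`);
      }
    } else {
      throw new Error(`blocks: index ${i}: unknown type ${JSON.stringify(b.type)}`);
    }
  });
  // Passthrough: the same array reference is returned, not a rebuilt copy.
  return blocks as FixtureBlock[];
}
```

Returning the input reference rather than a copy is what lets the first test use `toBe` instead of `toEqual`.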

function parseSSEChunks(body: string): SSEChunk[] {
return body
.split("\n\n")