Commit 58cf64d

Add provider model and token limit overrides to ProviderConfig (#966)
* Add token limit and model override fields to ProviderConfig

  Adds the following optional fields to ProviderConfig across all SDKs:
  - modelId: well-known model ID for agent config + token limit lookup
  - wireModel: model name sent to the provider API for inference
  - maxPromptTokens: prompt token cap (triggers compaction)
  - maxOutputTokens: response token cap

  Both modelId and wireModel default to the session's configured model when unset.

* Add tests for new ProviderConfig fields

  Extends existing provider-forwarding/serialization tests across all 4 SDKs to cover modelId, wireModel, maxPromptTokens, and maxOutputTokens.

* Add E2E tests for ProviderConfig model and token limit overrides

  Adds two end-to-end tests per SDK (Node, Python, Go, .NET) that exercise the new ProviderConfig fields against the replaying CAPI proxy:
  - should_forward_provider_wire_model_and_max_output_tokens: verifies wireModel overrides the wire request model and maxOutputTokens is forwarded as max_tokens.
  - should_use_provider_model_id_as_wire_model: verifies modelId acts as the wire model when wireModel is unspecified and SessionConfig.Model is omitted.

  Also adds MaxTokens to the Go and .NET ChatCompletionRequest harness types so the assertion is observable, and ships two shared snapshot YAMLs under test/snapshots/session_config/.

* Drop unverifiable max_tokens assertion from BYOK wire model E2E test

  The OpenAI BYOK provider code path in the CLI does not echo the configured maxOutputTokens as max_tokens on the wire request body (it's used internally for token budgeting and only appears on Anthropic-style requests). The new wire model E2E test asserted on max_tokens in the captured chat completion request, which always returned undefined/nil and failed across all four SDKs.

  Rename the test to `should forward provider wire model` and drop the wire-side max_tokens assertion. The test still sets maxOutputTokens to confirm the SDK serializes the field without errors; per-SDK unit tests already cover ProviderConfig serialization in detail.

  Also drop the now-unused MaxTokens field from the Go and .NET harness ChatCompletionRequest types.

* Rename MaxPromptTokens to MaxInputTokens and clarify ProviderConfig docs

  Renames the SDK-facing ProviderConfig field across all four languages while preserving the wire JSON key as maxPromptTokens:
  - .NET: MaxPromptTokens -> MaxInputTokens (JsonPropertyName unchanged)
  - Go: MaxPromptTokens -> MaxInputTokens (json tag unchanged)
  - Python: max_prompt_tokens -> max_input_tokens (wire conversion in _convert_provider_to_wire_format unchanged)
  - Node: maxPromptTokens -> maxInputTokens; adds a small toWireProviderConfig helper in client.ts that remaps the field before sending session.create / session.resume.

  Also rewrites the doc comments for modelId, wireModel, maxInputTokens, and maxOutputTokens to make the priority order clear: WireModel falls back to ModelId falls back to SessionConfig.Model, and ModelId drives both runtime configuration lookup and the wire model when WireModel is unset.

* Fix Windows teardown flake in ClientE2ETests

  Remove 'using var' from three ClientE2ETests that also call ForceStopAsync in their finally block. The double-disposal (using Dispose → DisposeAsync → ForceStopAsync, plus the explicit ForceStopAsync) races on Windows when the CLI process/pipes are mid-teardown, causing OperationCanceledException to bubble up and fail an otherwise-passing test.

  Matches the existing pattern in SessionFsE2ETests where ForceStopAsync is wrapped in try-catch to swallow teardown-only exceptions.

* Revert "Fix Windows teardown flake in ClientE2ETests"

  This reverts commit 6c794c4.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 1832d0a commit 58cf64d

16 files changed

Lines changed: 492 additions & 3 deletions

dotnet/src/Types.cs

Lines changed: 34 additions & 0 deletions
@@ -1528,6 +1528,40 @@ public class ProviderConfig
     /// </summary>
     [JsonPropertyName("headers")]
     public IDictionary<string, string>? Headers { get; set; }
+
+    /// <summary>
+    /// Well-known model name used by the runtime to look up agent configuration
+    /// (tools, prompts, reasoning behavior) and default token limits. Also used
+    /// as the wire model when <see cref="WireModel"/> is not set.
+    /// Falls back to <see cref="SessionConfig.Model"/>.
+    /// </summary>
+    [JsonPropertyName("modelId")]
+    public string? ModelId { get; set; }
+
+    /// <summary>
+    /// Model name sent to the provider API for inference. Use this when the
+    /// provider's model name (e.g. an Azure deployment name or a custom
+    /// fine-tune name) differs from <see cref="ModelId"/>.
+    /// Falls back to <see cref="ModelId"/>, then <see cref="SessionConfig.Model"/>.
+    /// </summary>
+    [JsonPropertyName("wireModel")]
+    public string? WireModel { get; set; }
+
+    /// <summary>
+    /// Overrides the resolved model's default max prompt tokens. The runtime
+    /// triggers conversation compaction before sending a request when the
+    /// prompt (system message, history, tool definitions, user message) would
+    /// exceed this limit.
+    /// </summary>
+    [JsonPropertyName("maxPromptTokens")]
+    public int? MaxInputTokens { get; set; }
+
+    /// <summary>
+    /// Overrides the resolved model's default max output tokens. When hit, the
+    /// model stops generating and returns a truncated response.
+    /// </summary>
+    [JsonPropertyName("maxOutputTokens")]
+    public int? MaxOutputTokens { get; set; }
 }
 
 /// <summary>
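
The fallback order described in these doc comments can be sketched as follows, in TypeScript to match the Node SDK touched by this commit. This is a hypothetical illustration only: `resolveModels`, `ModelSources`, and `ResolvedModels` are names invented here, the real resolution happens inside the runtime (which this diff does not show), and the sketch assumes an unset field is `undefined`.

```typescript
// Hypothetical inputs: the three places a model name can come from.
type ModelSources = {
    sessionModel?: string; // SessionConfig.Model
    modelId?: string; // ProviderConfig.ModelId
    wireModel?: string; // ProviderConfig.WireModel
};

// Hypothetical outputs: the two roles a model name plays.
type ResolvedModels = {
    configModel: string | undefined; // agent configuration + default token limits
    wireModel: string | undefined; // name sent to the provider API
};

// Assumed rule, taken from the doc comments: ModelId falls back to
// SessionConfig.Model; WireModel falls back to ModelId, then SessionConfig.Model.
function resolveModels(src: ModelSources): ResolvedModels {
    const configModel = src.modelId ?? src.sessionModel;
    const wireModel = src.wireModel ?? configModel;
    return { configModel, wireModel };
}
```

Under these assumptions, a session model of `claude-sonnet-4.5` combined with a wire model of `test-wire-model` keeps `claude-sonnet-4.5` for configuration lookup while `test-wire-model` goes on the wire, and a lone `modelId` fills both roles.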

dotnet/test/E2E/SessionConfigE2ETests.cs

Lines changed: 56 additions & 0 deletions
@@ -238,6 +238,62 @@ public async Task Should_Forward_Custom_Provider_Headers_On_Resume()
         await session2.DisposeAsync();
     }
 
+    [Fact]
+    public async Task Should_Forward_Provider_Wire_Model()
+    {
+        // Verifies that ProviderConfig.WireModel overrides the model name sent to
+        // the provider API, while SessionConfig.Model still drives runtime
+        // configuration lookup (capabilities, prompts, reasoning behavior).
+        // MaxOutputTokens is also set here to confirm the SDK accepts it without
+        // serialization errors; the CLI does not echo it as `max_tokens` on the
+        // OpenAI-style wire request, so we don't assert on it directly (see unit
+        // tests for serialization coverage).
+        var session = await CreateSessionAsync(new SessionConfig
+        {
+            Model = "claude-sonnet-4.5",
+            Provider = new ProviderConfig
+            {
+                Type = "openai",
+                BaseUrl = Ctx.ProxyUrl,
+                ApiKey = "test-provider-key",
+                WireModel = "test-wire-model",
+                MaxOutputTokens = 1024,
+            },
+        });
+
+        await session.SendAndWaitAsync(new MessageOptions { Prompt = "What is 1+1?" });
+
+        var exchange = Assert.Single(await Ctx.GetExchangesAsync());
+        Assert.Equal("test-wire-model", exchange.Request.Model);
+
+        await session.DisposeAsync();
+    }
+
+    [Fact]
+    public async Task Should_Use_Provider_Model_Id_As_Wire_Model()
+    {
+        // ProviderConfig.ModelId drives both the runtime resolved model AND the wire model
+        // when WireModel is not specified. Here SessionConfig.Model is intentionally omitted
+        // so that ModelId is the only model source.
+        var session = await CreateSessionAsync(new SessionConfig
+        {
+            Provider = new ProviderConfig
+            {
+                Type = "openai",
+                BaseUrl = Ctx.ProxyUrl,
+                ApiKey = "test-provider-key",
+                ModelId = "claude-sonnet-4.5",
+            },
+        });
+
+        await session.SendAndWaitAsync(new MessageOptions { Prompt = "What is 1+1?" });
+
+        var exchange = Assert.Single(await Ctx.GetExchangesAsync());
+        Assert.Equal("claude-sonnet-4.5", exchange.Request.Model);
+
+        await session.DisposeAsync();
+    }
+
     [Fact]
     public async Task Should_Use_WorkingDirectory_For_Tool_Execution()
     {

dotnet/test/Unit/SerializationTests.cs

Lines changed: 13 additions & 1 deletion
@@ -20,19 +20,31 @@ public void ProviderConfig_CanSerializeHeaders_WithSdkOptions()
         var original = new ProviderConfig
         {
             BaseUrl = "https://example.com/provider",
-            Headers = new Dictionary<string, string> { ["Authorization"] = "Bearer provider-token" }
+            Headers = new Dictionary<string, string> { ["Authorization"] = "Bearer provider-token" },
+            ModelId = "gpt-4o",
+            WireModel = "my-finetune-v3",
+            MaxInputTokens = 100_000,
+            MaxOutputTokens = 4096
         };
 
         var json = JsonSerializer.Serialize(original, options);
         using var document = JsonDocument.Parse(json);
         var root = document.RootElement;
         Assert.Equal("https://example.com/provider", root.GetProperty("baseUrl").GetString());
         Assert.Equal("Bearer provider-token", root.GetProperty("headers").GetProperty("Authorization").GetString());
+        Assert.Equal("gpt-4o", root.GetProperty("modelId").GetString());
+        Assert.Equal("my-finetune-v3", root.GetProperty("wireModel").GetString());
+        Assert.Equal(100_000, root.GetProperty("maxPromptTokens").GetInt32());
+        Assert.Equal(4096, root.GetProperty("maxOutputTokens").GetInt32());
 
         var deserialized = JsonSerializer.Deserialize<ProviderConfig>(json, options);
         Assert.NotNull(deserialized);
         Assert.Equal("https://example.com/provider", deserialized.BaseUrl);
         Assert.Equal("Bearer provider-token", deserialized.Headers!["Authorization"]);
+        Assert.Equal("gpt-4o", deserialized.ModelId);
+        Assert.Equal("my-finetune-v3", deserialized.WireModel);
+        Assert.Equal(100_000, deserialized.MaxInputTokens);
+        Assert.Equal(4096, deserialized.MaxOutputTokens);
     }
 
     [Fact]

go/internal/e2e/session_config_e2e_test.go

Lines changed: 79 additions & 0 deletions
@@ -323,6 +323,85 @@ func TestSessionConfigExtrasE2E(t *testing.T) {
 		}
 	})
 
+	t.Run("should forward provider wire model", func(t *testing.T) {
+		// Verifies that ProviderConfig.WireModel overrides the model name sent to
+		// the provider API, while SessionConfig.Model still drives runtime
+		// configuration lookup (capabilities, prompts, reasoning behavior).
+		// MaxOutputTokens is also set here to confirm the SDK accepts it without
+		// serialization errors; the CLI does not echo it as `max_tokens` on the
+		// OpenAI-style wire request, so we don't assert on it directly (see unit
+		// tests for serialization coverage).
+		ctx.ConfigureForTest(t)
+
+		maxOutputTokens := 1024
+		session, err := client.CreateSession(t.Context(), &copilot.SessionConfig{
+			OnPermissionRequest: copilot.PermissionHandler.ApproveAll,
+			Model:               "claude-sonnet-4.5",
+			Provider: &copilot.ProviderConfig{
+				Type:            "openai",
+				BaseURL:         ctx.ProxyURL,
+				APIKey:          "test-provider-key",
+				WireModel:       "test-wire-model",
+				MaxOutputTokens: maxOutputTokens,
+			},
+		})
+		if err != nil {
+			t.Fatalf("CreateSession failed: %v", err)
+		}
+
+		_, err = session.SendAndWait(t.Context(), copilot.MessageOptions{Prompt: "What is 1+1?"})
+		if err != nil {
+			t.Fatalf("SendAndWait failed: %v", err)
+		}
+
+		exchanges, err := ctx.GetExchanges()
+		if err != nil {
+			t.Fatalf("GetExchanges failed: %v", err)
+		}
+		if len(exchanges) != 1 {
+			t.Fatalf("Expected exactly 1 exchange, got %d", len(exchanges))
+		}
+		if exchanges[0].Request.Model != "test-wire-model" {
+			t.Errorf("Expected request model to be 'test-wire-model', got %q", exchanges[0].Request.Model)
+		}
+	})
+
+	t.Run("should use provider model id as wire model", func(t *testing.T) {
+		// ProviderConfig.ModelID drives both the runtime resolved model AND the wire
+		// model when WireModel is not specified. SessionConfig.Model is intentionally
+		// omitted so that ModelID is the only model source.
+		ctx.ConfigureForTest(t)
+
+		session, err := client.CreateSession(t.Context(), &copilot.SessionConfig{
+			OnPermissionRequest: copilot.PermissionHandler.ApproveAll,
+			Provider: &copilot.ProviderConfig{
+				Type:    "openai",
+				BaseURL: ctx.ProxyURL,
+				APIKey:  "test-provider-key",
+				ModelID: "claude-sonnet-4.5",
+			},
+		})
+		if err != nil {
+			t.Fatalf("CreateSession failed: %v", err)
+		}
+
+		_, err = session.SendAndWait(t.Context(), copilot.MessageOptions{Prompt: "What is 1+1?"})
+		if err != nil {
+			t.Fatalf("SendAndWait failed: %v", err)
+		}
+
+		exchanges, err := ctx.GetExchanges()
+		if err != nil {
+			t.Fatalf("GetExchanges failed: %v", err)
+		}
+		if len(exchanges) != 1 {
+			t.Fatalf("Expected exactly 1 exchange, got %d", len(exchanges))
+		}
+		if exchanges[0].Request.Model != "claude-sonnet-4.5" {
+			t.Errorf("Expected request model to be 'claude-sonnet-4.5', got %q", exchanges[0].Request.Model)
+		}
+	})
+
 	t.Run("should use workingDirectory for tool execution", func(t *testing.T) {
 		ctx.ConfigureForTest(t)

go/types.go

Lines changed: 19 additions & 0 deletions
@@ -859,6 +859,25 @@ type ProviderConfig struct {
 	Azure *AzureProviderOptions `json:"azure,omitempty"`
 	// Headers are custom HTTP headers included in outbound provider requests.
 	Headers map[string]string `json:"headers,omitempty"`
+	// ModelID is the well-known model name used by the runtime to look up
+	// agent configuration (tools, prompts, reasoning behavior) and default
+	// token limits. Also used as the wire model when WireModel is not set.
+	// Falls back to SessionConfig.Model.
+	ModelID string `json:"modelId,omitempty"`
+	// WireModel is the model name sent to the provider API for inference. Use
+	// this when the provider's model name (e.g. an Azure deployment name or a
+	// custom fine-tune name) differs from ModelID.
+	// Falls back to ModelID, then SessionConfig.Model.
+	WireModel string `json:"wireModel,omitempty"`
+	// MaxInputTokens overrides the resolved model's default max prompt tokens.
+	// The runtime triggers conversation compaction before sending a request
+	// when the prompt (system message, history, tool definitions, user
+	// message) would exceed this limit.
+	MaxInputTokens int `json:"maxPromptTokens,omitempty"`
+	// MaxOutputTokens overrides the resolved model's default max output
+	// tokens. When hit, the model stops generating and returns a truncated
+	// response.
+	MaxOutputTokens int `json:"maxOutputTokens,omitempty"`
 }
 
 // AzureProviderOptions contains Azure-specific provider configuration
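
The compaction condition in the MaxInputTokens comment can be read as a simple threshold check. The sketch below is a TypeScript illustration under stated assumptions, not the runtime's code: `PromptTokenCounts`, `effectiveInputLimit`, and `needsCompaction` are hypothetical names, the per-part token counts are assumed to be available already, and "would exceed" is taken to mean strictly greater than the limit.

```typescript
// Hypothetical token counts for each part of the prompt named in the doc
// comment; how the runtime actually counts tokens is not shown in this diff.
type PromptTokenCounts = {
    systemMessage: number;
    history: number;
    toolDefinitions: number;
    userMessage: number;
};

// The limit in force: the override when set, otherwise the model's default.
function effectiveInputLimit(modelDefault: number, maxInputTokens?: number): number {
    return maxInputTokens ?? modelDefault;
}

// True when the assembled prompt would exceed the limit, i.e. when compaction
// would be needed before the request is sent (assumed strict comparison).
function needsCompaction(
    counts: PromptTokenCounts,
    modelDefault: number,
    maxInputTokens?: number,
): boolean {
    const total =
        counts.systemMessage + counts.history + counts.toolDefinitions + counts.userMessage;
    return total > effectiveInputLimit(modelDefault, maxInputTokens);
}
```

With a 120,000-token prompt and an assumed model default of 200,000, no compaction is needed; setting the override to 100,000 flips the result.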

go/types_test.go

Lines changed: 65 additions & 0 deletions
@@ -151,3 +151,68 @@ func TestSessionSendRequest_JSONIncludesRequestHeaders(t *testing.T) {
 		t.Fatalf("expected Authorization header, got %v", headers["Authorization"])
 	}
 }
+
+func TestProviderConfig_JSONIncludesAllFields(t *testing.T) {
+	cfg := ProviderConfig{
+		BaseURL:         "https://example.com/provider",
+		APIKey:          "test-key",
+		Headers:         map[string]string{"Authorization": "Bearer provider-token"},
+		ModelID:         "gpt-4o",
+		WireModel:       "my-finetune-v3",
+		MaxInputTokens:  100000,
+		MaxOutputTokens: 4096,
+	}
+
+	data, err := json.Marshal(cfg)
+	if err != nil {
+		t.Fatalf("failed to marshal ProviderConfig: %v", err)
+	}
+
+	var decoded map[string]any
+	if err := json.Unmarshal(data, &decoded); err != nil {
+		t.Fatalf("failed to unmarshal ProviderConfig: %v", err)
+	}
+
+	if decoded["baseUrl"] != "https://example.com/provider" {
+		t.Errorf("expected baseUrl to round-trip, got %v", decoded["baseUrl"])
+	}
+	if decoded["modelId"] != "gpt-4o" {
+		t.Errorf("expected modelId 'gpt-4o', got %v", decoded["modelId"])
+	}
+	if decoded["wireModel"] != "my-finetune-v3" {
+		t.Errorf("expected wireModel 'my-finetune-v3', got %v", decoded["wireModel"])
+	}
+	if decoded["maxPromptTokens"] != float64(100000) {
+		t.Errorf("expected maxPromptTokens 100000, got %v", decoded["maxPromptTokens"])
+	}
+	if decoded["maxOutputTokens"] != float64(4096) {
+		t.Errorf("expected maxOutputTokens 4096, got %v", decoded["maxOutputTokens"])
+	}
+	headers, ok := decoded["headers"].(map[string]any)
+	if !ok {
+		t.Fatalf("expected headers object, got %T", decoded["headers"])
+	}
+	if headers["Authorization"] != "Bearer provider-token" {
+		t.Errorf("expected Authorization header, got %v", headers["Authorization"])
+	}
+}
+
+func TestProviderConfig_JSONOmitsUnsetTokenFields(t *testing.T) {
+	cfg := ProviderConfig{BaseURL: "https://example.com/provider"}
+
+	data, err := json.Marshal(cfg)
+	if err != nil {
+		t.Fatalf("failed to marshal ProviderConfig: %v", err)
+	}
+
+	var decoded map[string]any
+	if err := json.Unmarshal(data, &decoded); err != nil {
+		t.Fatalf("failed to unmarshal ProviderConfig: %v", err)
+	}
+
+	for _, field := range []string{"modelId", "wireModel", "maxPromptTokens", "maxOutputTokens", "headers"} {
+		if _, present := decoded[field]; present {
+			t.Errorf("expected %q to be omitted when unset, got %v", field, decoded[field])
+		}
+	}
+}
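
For comparison, the omit-when-unset behavior tested above has a rough TypeScript analogue: `JSON.stringify` drops properties whose value is `undefined`. The sketch below illustrates that, with `wireKeys` as a hypothetical helper invented for this example. One difference worth noting: Go's `omitempty` also drops an explicit `0` on the `int` fields, so `0` cannot be distinguished from unset there, whereas `JSON.stringify` keeps a numeric `0`.

```typescript
// Hypothetical helper: returns the sorted keys that survive a JSON round trip.
function wireKeys(config: Record<string, unknown>): string[] {
    const decoded = JSON.parse(JSON.stringify(config)) as Record<string, unknown>;
    return Object.keys(decoded).sort();
}

// Unset (undefined) fields are omitted from the serialized object.
const unsetKeys = wireKeys({
    baseUrl: "https://example.com/provider",
    modelId: undefined,
    wireModel: undefined,
    maxPromptTokens: undefined,
    maxOutputTokens: undefined,
});

// An explicit 0 is kept by JSON.stringify, unlike Go's omitempty on an int.
const zeroKeys = wireKeys({
    baseUrl: "https://example.com/provider",
    maxOutputTokens: 0,
});
```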

nodejs/src/client.ts

Lines changed: 14 additions & 2 deletions
@@ -42,6 +42,7 @@ import type {
     GetAuthStatusResponse,
     GetStatusResponse,
     ModelInfo,
+    ProviderConfig,
     ResumeSessionConfig,
     SectionTransformFn,
     SessionConfig,
@@ -64,6 +65,17 @@ import type {
 } from "./types.js";
 import { defaultJoinSessionPermissionHandler } from "./types.js";
 
+/**
+ * Convert a {@link ProviderConfig} to its JSON-RPC wire shape, remapping
+ * camelCase SDK property names to the wire keys expected by the runtime
+ * (e.g. `maxInputTokens` → `maxPromptTokens`).
+ */
+function toWireProviderConfig(provider: ProviderConfig): Record<string, unknown> {
+    const { maxInputTokens, ...rest } = provider;
+    if (maxInputTokens === undefined) return rest;
+    return { ...rest, maxPromptTokens: maxInputTokens };
+}
+
 /**
  * Minimum protocol version this SDK can communicate with.
  * Servers reporting a version below this are rejected.
@@ -788,7 +800,7 @@ export class CopilotClient {
             systemMessage: wireSystemMessage,
             availableTools: config.availableTools,
             excludedTools: config.excludedTools,
-            provider: config.provider,
+            provider: config.provider ? toWireProviderConfig(config.provider) : undefined,
             modelCapabilities: config.modelCapabilities,
             requestPermission: true,
             requestUserInput: !!config.onUserInputRequest,
@@ -931,7 +943,7 @@ export class CopilotClient {
                 name: cmd.name,
                 description: cmd.description,
             })),
-            provider: config.provider,
+            provider: config.provider ? toWireProviderConfig(config.provider) : undefined,
             modelCapabilities: config.modelCapabilities,
             requestPermission:
                 config.onPermissionRequest !== defaultJoinSessionPermissionHandler,
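
To make the remapping concrete, here is a self-contained sketch of what the helper does to a sample config. `ProviderConfigSketch` and `toWire` are stand-ins written for this example: the type lists only a handful of fields (the `baseUrl` key is assumed from the wire key asserted in the Go and .NET tests), and `toWire` repeats the helper's logic so the snippet does not depend on the SDK. The values are borrowed from the unit tests in this commit.

```typescript
// Minimal stand-in for the SDK's ProviderConfig; only the fields used here.
type ProviderConfigSketch = {
    baseUrl?: string;
    modelId?: string;
    wireModel?: string;
    maxInputTokens?: number;
    maxOutputTokens?: number;
};

// Same logic as toWireProviderConfig, repeated so the example stands alone:
// rename maxInputTokens to the wire key maxPromptTokens, pass the rest through.
function toWire(provider: ProviderConfigSketch): Record<string, unknown> {
    const { maxInputTokens, ...rest } = provider;
    if (maxInputTokens === undefined) return rest;
    return { ...rest, maxPromptTokens: maxInputTokens };
}

// All fields set: maxInputTokens is renamed, everything else is unchanged.
const wire = toWire({
    baseUrl: "https://example.com/provider",
    modelId: "gpt-4o",
    wireModel: "my-finetune-v3",
    maxInputTokens: 100000,
    maxOutputTokens: 4096,
});

// No token limit set: neither the SDK key nor the wire key appears.
const untouched = toWire({ baseUrl: "https://example.com/provider" });
```

Keeping the rename in one function means `session.create` and `session.resume` cannot drift apart on the wire key, which is presumably why both call sites go through the same helper.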

nodejs/src/types.ts

Lines changed: 30 additions & 0 deletions
@@ -1503,6 +1503,36 @@ export interface ProviderConfig {
      * Custom HTTP headers to include in outbound provider requests.
      */
     headers?: Record<string, string>;
+
+    /**
+     * Well-known model name used by the runtime to look up agent configuration
+     * (tools, prompts, reasoning behavior) and default token limits. Also used
+     * as the wire model when {@link wireModel} is not set.
+     * Falls back to {@link SessionConfig.model}.
+     */
+    modelId?: string;
+
+    /**
+     * Model name sent to the provider API for inference. Use this when the
+     * provider's model name (e.g. an Azure deployment name or a custom
+     * fine-tune name) differs from {@link modelId}.
+     * Falls back to {@link modelId}, then {@link SessionConfig.model}.
+     */
+    wireModel?: string;
+
+    /**
+     * Overrides the resolved model's default max prompt tokens. The runtime
+     * triggers conversation compaction before sending a request when the
+     * prompt (system message, history, tool definitions, user message) would
+     * exceed this limit.
+     */
+    maxInputTokens?: number;
+
+    /**
+     * Overrides the resolved model's default max output tokens. When hit, the
+     * model stops generating and returns a truncated response.
+     */
+    maxOutputTokens?: number;
 }
 
 /**
