Commit 04d8008

tylerslaton and claude committed
fix(showcase/langgraph-python): unbreak shared-state pills, auth sign-out, gen-ui-agent progression, multimodal D5
Four independent showcase production bugs Alem reported, plus the D5 multimodal harness regression they unblocked.

Shared-state-read-write: "Greet me" ("Say hi and introduce yourself.") and "Plan a weekend" ("Suggest a weekend plan based on my interests.") were matching the bare `hi` and `plan` catch-alls in feature-parity.json and returning the generic showcase-assistant blurb / 5-step content plan instead of shared-state-aware responses. Added pill-specific fixtures in shared-state.json (mirrored into d5-all.json) so the longer userMessage substrings win first-match-wins ahead of feature-parity.

Auth sign-out: signing out unmounted CopilotKit entirely and bounced the user back to the SignInCard, so the demo never showcased the runtime returning 401, which is its whole point. The QA contract in qa/auth.md spelled out the intended UX. Restored it: CopilotKit stays mounted after the first sign-in, the AuthBanner flips to an amber "Signed out — the agent will reject your messages" state with a re-Sign-in button, and CopilotKit's `onError` callback drives a `data-testid="auth-demo-error"` surface that displays the runtime's 401 the moment the user sends an unauthenticated message. Updated the e2e spec to match (the old "SignInCard re-mounts after sign-out" test pinned the regression).

Gen-ui-agent: the aimock fixture short-circuited the 7-step progression spelled out in `gen_ui_agent.py`'s SYSTEM_PROMPT to a single set_steps call with all three steps already `completed`, so the InlineAgentStateCard rendered the final 3/3 state instantly with no sequential pending → in_progress → completed animation. Regenerated as a 7-leg toolCallId chain per pill (8 fixtures × 3 pills): seed leg keyed on userMessage with NO `hasToolResult` gate (matching PR CopilotKit#4770's pattern; `hasToolResult: false` would block the seed from firing on the second pill in a multi-pill session), then six toolCallId-keyed transitions, then a final narration. Fixture order: toolCallId legs FIRST so the most specific match wins.

Multimodal D5: the sample-attachment buttons auto-send via `agent.addMessage + copilotkit.runAgent` (restored in PR CopilotKit#4761), but the D5 harness still typed `input` + pressed Enter via the runner after `preFill`, sending a second user message that competed with the in-flight image upload. The v1 LangGraph runtime SSE stream got tangled (browser DevTools showed `statusCode: pending` indefinitely) and the assistant message never rendered. Added `skipSend?: boolean` to ConversationTurn (distinct from `skipFill`, which still presses Enter once the textarea has content) and switched d5-multimodal.ts to `skipSend: true` with `responseTimeoutMs: 60_000` so the runner waits on the assistant response without poking the chat further. Bumped the PDF auto-prompt fixture in feature-parity.json to include the word "document" so the existing `buildModalityAssertion("document")` check still lands.

D5 result: 37 → 39 of 40 features passing. Only `tool-rendering-reasoning-chain` remains and is a separate agent/runtime bug (Tokyo Responses-API `reasoning` message survives into the next turn's conversation history, runtime returns `RUN_ERROR: "message role is not supported"`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
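
The ordering rules in the message above (longer substrings ahead of bare catch-alls, toolCallId legs ahead of the userMessage-keyed seed) can be illustrated with a minimal sketch. This is not aimock's implementation: the `Fixture` and `IncomingRequest` shapes, the case-sensitive substring test, and the `matches` / `firstMatch` helpers are all assumptions made for illustration.

```typescript
// Hypothetical fixture shape, loosely modeled on the JSON touched by this commit.
interface Fixture {
  match: { userMessage?: string; toolCallId?: string };
  response: { content: string };
}

// Hypothetical view of a request: the latest user message and, if the
// request carries a tool result, the id of the tool call it answers.
interface IncomingRequest {
  lastUserMessage: string;
  lastToolCallId?: string;
}

// Assumed semantics: every key present in `match` must be satisfied.
// userMessage is a case-sensitive substring test; toolCallId is exact.
function matches(fixture: Fixture, req: IncomingRequest): boolean {
  const { userMessage, toolCallId } = fixture.match;
  if (userMessage !== undefined && !req.lastUserMessage.includes(userMessage)) {
    return false;
  }
  if (toolCallId !== undefined && req.lastToolCallId !== toolCallId) {
    return false;
  }
  return true;
}

// First-match-wins: the earliest fixture in load order that matches is used.
function firstMatch(fixtures: Fixture[], req: IncomingRequest): Fixture | undefined {
  return fixtures.find((fixture) => matches(fixture, req));
}

// Usage: the pill-specific fixture must be loaded before the bare catch-all.
const greetFixture: Fixture = {
  match: { userMessage: "Say hi and introduce yourself" },
  response: { content: "shared-state greeting" },
};
const bareHiFixture: Fixture = {
  match: { userMessage: "hi" },
  response: { content: "generic blurb" },
};
const greetRequest: IncomingRequest = { lastUserMessage: "Say hi and introduce yourself." };
console.log(firstMatch([greetFixture, bareHiFixture], greetRequest)?.response.content);
console.log(firstMatch([bareHiFixture, greetFixture], greetRequest)?.response.content);
```

Under these assumptions the same reasoning explains why toolCallId legs go first: a seed keyed only on userMessage, with no `hasToolResult` gate, would also match the follow-up requests of the same turn, so the more specific toolCallId fixtures have to be tried before it.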
1 parent d2840ff commit 04d8008

12 files changed

Lines changed: 927 additions & 72 deletions


showcase/aimock/d5-all.json

Lines changed: 276 additions & 12 deletions
Large diffs are not rendered by default.

showcase/aimock/feature-parity.json

Lines changed: 1 addition & 1 deletion
@@ -566,7 +566,7 @@
         "userMessage": "can you tell me what is in this demo pdf I just attached"
       },
       "response": {
-        "content": "The attached PDF is the CopilotKit Quickstart guide. It walks through three steps: install `@copilotkit/react-core` and `@copilotkit/react-ui`, wrap your app in a `<CopilotKit runtimeUrl=\"/api/copilotkit\">` provider, and drop a `<CopilotChat />` wherever you want the assistant to appear."
+        "content": "The attached PDF document is the CopilotKit Quickstart guide. It walks through three steps: install `@copilotkit/react-core` and `@copilotkit/react-ui`, wrap your app in a `<CopilotKit runtimeUrl=\"/api/copilotkit\">` provider, and drop a `<CopilotChat />` wherever you want the assistant to appear."
       }
     },
     {

showcase/harness/fixtures/d5/gen-ui-agent.json

Lines changed: 292 additions & 11 deletions
Large diffs are not rendered by default.

showcase/harness/fixtures/d5/shared-state.json

Lines changed: 19 additions & 1 deletion
@@ -1,6 +1,24 @@
 {
-  "_comment": "D5 multi-turn fixture for /demos/shared-state-read-write. Uses hasToolResult matching: aimock checks for role:tool messages in the request to disambiguate multi-turn fixtures sharing the same userMessage. The 'favorite color' fixture has a unique userMessage and needs no hasToolResult. Fixtures ordered by hasToolResult false→true for readability.",
+  "_comment": "D5 multi-turn fixture for /demos/shared-state-read-write. Uses hasToolResult matching: aimock checks for role:tool messages in the request to disambiguate multi-turn fixtures sharing the same userMessage. The 'favorite color' fixture has a unique userMessage and needs no hasToolResult. Fixtures ordered by hasToolResult false→true for readability. The 'Say hi and introduce yourself' and 'Suggest a weekend plan' fixtures exist to outrun the bare 'hi' and 'plan' substring matchers in feature-parity.json — without them, the Greet me and Plan a weekend pills mismatch generic catch-alls. Mirrored verbatim into showcase/aimock/d5-all.json so re-bundling stays consistent.",
   "fixtures": [
+    {
+      "_comment": "Greet me pill (msg: 'Say hi and introduce yourself.'). Substring is unique enough to win over feature-parity's bare 'hi' matcher because d5-all.json loads before feature-parity.json. Response references the shared-state contract to demonstrate what the demo is about.",
+      "match": {
+        "userMessage": "Say hi and introduce yourself"
+      },
+      "response": {
+        "content": "Hi — I'm your shared-state co-pilot. Your Preferences panel (name, tone, language, interests) is fed to me on every turn, and I jot notes back into the Agent Scratch Pad via set_notes so the UI re-renders. Try setting your name or asking me to remember something."
+      }
+    },
+    {
+      "_comment": "Plan a weekend pill (msg: 'Suggest a weekend plan based on my interests.'). Outranks feature-parity's bare 'plan' matcher for the same reason. Response references the interests panel so the demo's intent is clear even when interests is empty.",
+      "match": {
+        "userMessage": "weekend plan based on my interests"
+      },
+      "response": {
+        "content": "A weekend tailored to your interests panel: if you haven't picked any yet, try Cooking + Travel for a market-and-day-trip combo, or Tech + Books for a maker session and a long reading afternoon. Add interests in the Preferences panel and re-ask for a more specific plan."
+      }
+    },
     {
       "match": {
         "userMessage": "remember that my favorite color is blue",

showcase/harness/src/probes/helpers/conversation-runner.ts

Lines changed: 23 additions & 1 deletion
@@ -170,6 +170,24 @@ export interface ConversationTurn {
    * label for logging).
    */
   skipFill?: boolean;
+  /**
+   * When true, the runner skips BOTH `page.fill()` AND the Enter press
+   * for this turn — `preFill` is expected to have already issued the
+   * user message via some other path (e.g. clicking a sample-attachment
+   * button that dispatches `agent.addMessage` + `copilotkit.runAgent`).
+   *
+   * Distinct from `skipFill` (which still presses Enter once the
+   * textarea has content): a `skipFill` turn assumes preFill populated
+   * the textarea but didn't submit it, while a `skipSend` turn assumes
+   * preFill handled the whole submission and the textarea was never
+   * touched. Used by the multimodal-sample flow where the sample
+   * button auto-sends via the agent surface and the chat textarea
+   * stays empty for the entire turn.
+   *
+   * `skipSend` takes precedence over `skipFill` when both are set.
+   * `input` is ignored (used only for log labels).
+   */
+  skipSend?: boolean;
   /**
    * Optional callback executed BEFORE the runner fills the chat input
    * and presses Enter for this turn. Use this to click attachment
@@ -360,7 +378,11 @@ export async function runConversation(
       }
     }
 
-    if (turn.skipFill) {
+    if (turn.skipSend) {
+      console.debug(
+        `[conversation-runner] turn ${turnNum}/${total} — skipSend=true, preFill handled submission; not touching textarea`,
+      );
+    } else if (turn.skipFill) {
       console.debug(
         `[conversation-runner] turn ${turnNum}/${total} — skipFill=true, waiting for textarea content then pressing Enter`,
       );
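
The precedence between the two flags can be summarised as a small pure function. This is a sketch only: `TurnFlags`, `InputAction`, and `resolveInputAction` are hypothetical names that do not appear in conversation-runner.ts, and the real runner performs Playwright actions rather than returning a label.

```typescript
// Hypothetical subset of ConversationTurn, limited to the two flags discussed here.
interface TurnFlags {
  skipFill?: boolean;
  skipSend?: boolean;
}

// What the runner does to the chat textarea for a given turn.
type InputAction = "none" | "enter-only" | "fill-and-enter";

// Mirrors the if / else-if order in the hunk: skipSend is checked first,
// so it takes precedence when both flags are set.
function resolveInputAction(turn: TurnFlags): InputAction {
  if (turn.skipSend) {
    // preFill already submitted the message; leave the textarea alone.
    return "none";
  }
  if (turn.skipFill) {
    // preFill populated the textarea; the runner only presses Enter.
    return "enter-only";
  }
  // Default path: the runner fills `input` and presses Enter.
  return "fill-and-enter";
}

console.log(resolveInputAction({ skipSend: true, skipFill: true }));
```

Keeping `skipSend` as a separate flag, rather than changing what `skipFill` means, leaves existing `skipFill` turns behaving exactly as before.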

showcase/harness/src/probes/scripts/d5-multimodal.ts

Lines changed: 17 additions & 2 deletions
@@ -120,15 +120,30 @@ export async function preTurnAttachPdf(page: Page): Promise<void> {
 }
 
 export function buildTurns(_ctx: D5BuildContext): ConversationTurn[] {
+  // `skipSend: true` because the sample buttons auto-send the user
+  // message via `agent.addMessage` + `copilotkit.runAgent` (see
+  // showcase/integrations/langgraph-python/src/app/demos/multimodal/
+  // sample-attachment-buttons.tsx). Without `skipSend` the runner
+  // would ALSO type `input` and press Enter, sending a second user
+  // message that competes with the in-flight image upload — the v1
+  // LangGraph runtime stream gets tangled and neither response makes
+  // it back to the UI. `input` is kept as a descriptive label for
+  // logs only. Timeout bumped to 60s to cover image-attachment payload
+  // round-trip latency (the e2e Playwright spec uses 60-90s for the
+  // same path — see tests/e2e/multimodal.spec.ts).
   return [
     {
-      input: "describe the sample image",
+      input: "image-sample-button (auto-sent)",
       preFill: preTurnAttachImage,
+      skipSend: true,
+      responseTimeoutMs: 60_000,
       assertions: buildModalityAssertion("image"),
     },
     {
-      input: "summarize the sample document",
+      input: "pdf-sample-button (auto-sent)",
       preFill: preTurnAttachPdf,
+      skipSend: true,
+      responseTimeoutMs: 60_000,
       assertions: buildModalityAssertion("document"),
     },
   ];

showcase/integrations/langgraph-python/src/app/demos/auth/auth-banner.tsx

Lines changed: 42 additions & 14 deletions
@@ -3,33 +3,61 @@
 import { Button } from "@/components/ui/button";
 
 interface AuthBannerProps {
+  authenticated: boolean;
   onSignOut: () => void;
+  onSignIn: () => void;
 }
 
 /**
- * Status strip rendered above <CopilotChat /> while the user is signed in.
+ * Status strip rendered above <CopilotChat /> in both authenticated and
+ * post-sign-out states. The post-sign-out (amber) variant exists so the demo
+ * actually showcases what its name promises — the runtime rejecting an
+ * unauthenticated request — instead of bouncing the user back to the gate
+ * page where the rejection never happens.
+ *
  * Pure presentational — owns no state itself. Testids are stable contract
  * for QA + Playwright specs.
  */
-export function AuthBanner({ onSignOut }: AuthBannerProps) {
+export function AuthBanner({
+  authenticated,
+  onSignOut,
+  onSignIn,
+}: AuthBannerProps) {
+  const classes = authenticated
+    ? "border-emerald-300 bg-emerald-50 text-emerald-900"
+    : "border-amber-300 bg-amber-50 text-amber-900";
+
   return (
     <div
       data-testid="auth-banner"
-      data-authenticated="true"
-      className="flex items-center justify-between gap-3 rounded-md border border-emerald-300 bg-emerald-50 px-4 py-2 text-sm text-emerald-900"
+      data-authenticated={authenticated ? "true" : "false"}
+      className={`flex items-center justify-between gap-3 rounded-md border px-4 py-2 text-sm ${classes}`}
     >
       <span data-testid="auth-status" className="font-medium">
-        Signed in as demo user
+        {authenticated
+          ? "✓ Signed in as demo user"
+          : "⚠ Signed out — the agent will reject your messages until you sign in."}
       </span>
-      <Button
-        type="button"
-        data-testid="auth-sign-out-button"
-        size="sm"
-        variant="outline"
-        onClick={onSignOut}
-      >
-        Sign out
-      </Button>
+      {authenticated ? (
+        <Button
+          type="button"
+          data-testid="auth-sign-out-button"
+          size="sm"
+          variant="outline"
+          onClick={onSignOut}
+        >
+          Sign out
+        </Button>
+      ) : (
+        <Button
+          type="button"
+          data-testid="auth-authenticate-button"
+          size="sm"
+          onClick={onSignIn}
+        >
+          Sign in
+        </Button>
+      )}
     </div>
   );
 }

showcase/integrations/langgraph-python/src/app/demos/auth/page.tsx

Lines changed: 62 additions & 10 deletions
@@ -7,27 +7,53 @@
 //
 // UX shape: the demo defaults to UNAUTHENTICATED on first paint so visitors
 // land on a clear sign-in card. We don't render `<CopilotKit>` until the user
-// has a token — that sidesteps the transport 401 that would otherwise crash
-// `<CopilotChat>` during its initial `/info` handshake. Once signed in,
-// `<CopilotKit>` mounts with the bearer header attached, the chat boots
-// cleanly, and signing out unmounts the whole tree.
+// has signed in at least once — that sidesteps the transport 401 that would
+// otherwise crash `<CopilotChat>` during its initial `/info` handshake.
+// After the user signs in once, `<CopilotKit>` stays mounted across the
+// sign-out → sign-in cycle so the post-sign-out state can actually
+// demonstrate the runtime rejecting unauthenticated requests in the chat
+// surface (the whole point of the demo).
 
-import { useMemo } from "react";
-import { CopilotKit, CopilotChat } from "@copilotkit/react-core/v2";
+import { useEffect, useMemo, useState } from "react";
+import {
+  CopilotKit,
+  CopilotChat,
+  type CopilotKitCoreErrorCode,
+} from "@copilotkit/react-core/v2";
 import { AuthBanner } from "./auth-banner";
 import { SignInCard } from "./sign-in-card";
 import { useDemoAuth } from "./use-demo-auth";
+import { DEMO_TOKEN } from "./demo-token";
+
+interface AuthDemoErrorState {
+  message: string;
+  code: CopilotKitCoreErrorCode | string;
+}
 
 export default function AuthDemoPage() {
-  const { isAuthenticated, authorizationHeader, signIn, signOut } =
-    useDemoAuth();
+  const {
+    isAuthenticated,
+    authorizationHeader,
+    hasEverSignedIn,
+    signIn,
+    signOut,
+  } = useDemoAuth();
 
   const headers = useMemo<Record<string, string>>(
     () => (authorizationHeader ? { Authorization: authorizationHeader } : {}),
     [authorizationHeader],
   );
 
-  if (!isAuthenticated) {
+  const [authError, setAuthError] = useState<AuthDemoErrorState | null>(null);
+
+  // Clear stale errors as soon as the user re-authenticates. Without this
+  // the amber error surface would persist after sign-in even though the
+  // failure is no longer relevant.
+  useEffect(() => {
+    if (isAuthenticated) setAuthError(null);
+  }, [isAuthenticated]);
+
+  if (!hasEverSignedIn) {
     return (
       <div className="flex h-screen flex-col">
         <SignInCard onSignIn={signIn} />
@@ -44,12 +70,38 @@ export default function AuthDemoPage() {
       agent="auth-demo"
       headers={headers}
       useSingleEndpoint={false}
+      onError={(event) => {
+        setAuthError({
+          message: event.error?.message ?? String(event.error),
+          code: event.code,
+        });
+      }}
     >
       <div className="flex h-screen flex-col gap-3 p-6">
-        <AuthBanner onSignOut={signOut} />
+        <AuthBanner
+          authenticated={isAuthenticated}
+          onSignOut={signOut}
+          onSignIn={() => signIn(DEMO_TOKEN)}
+        />
         <header>
           <h1 className="text-lg font-semibold">Authentication</h1>
         </header>
+        {authError && !isAuthenticated && (
+          <div
+            data-testid="auth-demo-error"
+            className="rounded-md border border-amber-300 bg-amber-50 px-3 py-2 text-sm text-amber-900"
+          >
+            <strong className="font-semibold">
+              Runtime rejected the request:
+            </strong>{" "}
+            <span data-testid="auth-demo-error-message">
+              {authError.message}
+            </span>{" "}
+            <code className="ml-1 rounded bg-amber-100 px-1 py-0.5 font-mono text-xs">
+              {authError.code}
+            </code>
+          </div>
+        )}
         <div className="flex-1 overflow-hidden rounded-md border border-neutral-200">
           <CopilotChat agentId="auth-demo" className="h-full" />
         </div>

showcase/integrations/langgraph-python/src/app/demos/auth/use-demo-auth.ts

Lines changed: 19 additions & 1 deletion
@@ -11,6 +11,15 @@ export interface DemoAuthHandle {
   token: string | null;
   /** The full `Bearer <token>` value when authenticated, otherwise null. */
   authorizationHeader: string | null;
+  /**
+   * Whether the user has signed in at least once during the current page
+   * session. Used by the page to decide between the first-paint SignInCard
+   * (never signed in) and the persistent chat-with-amber-banner state
+   * (signed in, then signed out) — the latter is the only state that
+   * actually showcases the runtime rejecting unauthenticated requests.
+   * Resets on full page reload by design.
+   */
+  hasEverSignedIn: boolean;
   /** Sign in with the provided token. */
   signIn: (token: string) => void;
   /** Clear the stored token. */
@@ -29,6 +38,7 @@ export interface DemoAuthHandle {
  */
 export function useDemoAuth(): DemoAuthHandle {
   const [token, setToken] = useState<string | null>(null);
+  const [hasEverSignedIn, setHasEverSignedIn] = useState(false);
 
   // Hydrate from localStorage after mount. Reading on initial render would
   // mismatch SSR (where window is undefined); deferring to useEffect keeps
@@ -37,7 +47,10 @@ export function useDemoAuth(): DemoAuthHandle {
     if (typeof window === "undefined") return;
     try {
       const stored = window.localStorage.getItem(STORAGE_KEY);
-      if (stored) setToken(stored);
+      if (stored) {
+        setToken(stored);
+        setHasEverSignedIn(true);
+      }
     } catch {
       // localStorage unavailable (privacy mode, etc.) — fall back to
       // in-memory only.
@@ -46,6 +59,7 @@ export function useDemoAuth(): DemoAuthHandle {
 
   const signIn = useCallback((nextToken: string) => {
     setToken(nextToken);
+    setHasEverSignedIn(true);
     try {
       window.localStorage.setItem(STORAGE_KEY, nextToken);
     } catch {
@@ -55,6 +69,9 @@ export function useDemoAuth(): DemoAuthHandle {
 
   const signOut = useCallback(() => {
     setToken(null);
+    // hasEverSignedIn intentionally stays true so the page keeps showing
+    // the chat surface (with amber banner) after sign-out. That is the
+    // state that demonstrates the runtime returning 401.
     try {
       window.localStorage.removeItem(STORAGE_KEY);
     } catch {
@@ -70,6 +87,7 @@ export function useDemoAuth(): DemoAuthHandle {
     isAuthenticated: token !== null,
     token,
     authorizationHeader: token ? `Bearer ${token}` : null,
+    hasEverSignedIn,
     signIn,
     signOut,
   };
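
The three page states this hook now distinguishes can be modeled as a small state machine. The sketch below is React-free and only illustrative: `AuthState`, `AuthEvent`, `View`, `reduceAuth`, and `selectView` are hypothetical names, and the localStorage hydration path is left out.

```typescript
// Hypothetical plain-data model of what useDemoAuth tracks.
interface AuthState {
  token: string | null;
  hasEverSignedIn: boolean;
}

type AuthEvent = { type: "signIn"; token: string } | { type: "signOut" };

// The three surfaces the page can show.
type View = "sign-in-card" | "chat-signed-in" | "chat-signed-out";

const initialAuthState: AuthState = { token: null, hasEverSignedIn: false };

function reduceAuth(state: AuthState, event: AuthEvent): AuthState {
  switch (event.type) {
    case "signIn":
      return { token: event.token, hasEverSignedIn: true };
    case "signOut":
      // Only the token is cleared; hasEverSignedIn keeps its value so the
      // chat surface stays mounted and can show the runtime's rejection.
      return { ...state, token: null };
  }
}

function selectView(state: AuthState): View {
  // First paint, never signed in: gate page, CopilotKit not mounted.
  if (!state.hasEverSignedIn) return "sign-in-card";
  // Afterwards the chat stays mounted; only the banner variant changes.
  return state.token !== null ? "chat-signed-in" : "chat-signed-out";
}

// Usage: walk through sign-in followed by sign-out.
let demoState = initialAuthState;
demoState = reduceAuth(demoState, { type: "signIn", token: "demo-token" });
demoState = reduceAuth(demoState, { type: "signOut" });
console.log(selectView(demoState));
```

Because nothing in this model resets `hasEverSignedIn`, the sign-in card can only reappear from a fresh initial state, which matches the "resets on full page reload" note in the hook's doc comment.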
