[pull] main from CopilotKit:main#37
Open
pull[bot] wants to merge 8302 commits into
e4eae1a to fac1c7c
…eads
Restores an archived thread via the existing generic PATCH /threads/:id
update path with { archived: false } — the same mechanism example apps
already use for restore — so no new runtime route is required. Mirrors
archiveThread across the core thread store and the v2 useThreads hook.
…ells Takes d6:strands and d6:strands-typescript from 32/35 to 34/35. - shared-state-read: the turn-2 fixture leg wrongly pinned turnIndex:0, so the aimock matcher skipped it on turn 2 -> 404 -> turn-2 sse-missing. Drop turnIndex to mirror the langgraph-python gold-standard fixture. - multimodal: sample.png/pdf/wav shipped as git-LFS pointers, so deploy/test environments without 'git lfs pull' served the ~130-byte pointer text as the upload -> the run never started (runsFinished=0). Ship them as regular binaries via a per-integration .gitattributes lfs-unset + real bytes, mirroring langgraph-python's convention. Remaining red (gen-ui-declarative) is a Strands A2UI-dynamic run-completion bug (reproduces on real-LLM staging too): the surface paints but generate_a2ui never completes, so the run hangs 'Running'. Tracked separately.
…ery demo (#5671) ## What Brings the google-adk showcase **A2UI** demos to D6 parity with the langgraph-python / AWS Strands gold standard. - **Auto-inject**: `declarative-gen-ui` switches from backend-owned (`get_a2ui_tool`, `injectA2UITool: false`) to runtime auto-injection (`injectA2UITool: true`, plain agent). The `ag-ui-adk` 0.7.0 adapter auto-injects `generate_a2ui` via `plan_a2ui_injection`, matching how langgraph-python and Strands wire the same demo. First time ADK's adapter auto-injection is exercised in the showcase. - **Recovery un-red**: removes the "known-failing on purpose" mark from the `a2ui-recovery` heal e2e and the OSS-374 inner-render-disambiguation notes (spec, fixture `_meta`, QA doc, agent docstring, route comment). That premise was stale: on `ag-ui-adk` 0.7.0 the adapter forwards the run conversation into the inner `render_a2ui` call, so each pill's last user turn is its own prompt and aimock selects the right per-pill fixture. Confirmed against the aimock journal. `a2ui-recovery` stays backend-owned (the only path that surfaces the recovery loop) and remains ADK-only (OSS-375 tracks langgraph-python parity). ## Verification Canonical D6 driver (`showcase test google-adk --d6 --isolate`), per pill: - `declarative-gen-ui`: all 4 pills pass under auto-inject (sales-dashboard metric>=4 + pie + bar; team-performance data-table + bar; at-risk status-badge>=3 + metric>=3; top-account info-row + pie) - `a2ui-fixed-schema`: pass - `beautiful-chat` (5 sub-pills): pass - `a2ui-recovery` Playwright spec: 3/3 (page-load, heal, exhaust) - aimock journal confirms `generate_a2ui` was adapter-injected (the agent declares no tools) then drove the inner `render_a2ui`. ## Scope and honest status This is **A2UI D6 parity**, not full-slug D6. 
The google-adk D6 aggregate is still red (33/39): six **non-A2UI** cells fail and are untouched by this PR: `shared-state-write`, `tool-rendering-custom-catchall`, `subagents`, `tool-rendering-reasoning-chain`, `gen-ui-interrupt`, `gen-ui-headless-complete`. They are pre-existing (D6 is not CI-gated for google-adk) and out of scope here.
…reads (#5624)

## Summary

Adds `unarchiveThread(id)` to the v2 thread store (`@copilotkit/core`) and the `useThreads` hook (`@copilotkit/react-core/v2`), restoring an archived thread via the existing generic `PATCH /threads/:id { archived: false }` update path, with no new runtime endpoint. Mirrors `archiveThread` across the store and hook.

This is the durable, architecture-independent piece extracted from the threads-drawer effort. The drawer UI itself is being restarted as a framework-agnostic **CopilotDrawer** (Lit web component + React/Angular wrappers) under a separate spec; this hook method stands on its own and is needed regardless.

## Testing

TDD. New core store test (`PATCH … { archived: false }`) and `useThreads` hook test; full suites green (core, react-core).
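A minimal sketch of the request shape described above, assuming hypothetical helper names (`buildArchivedPatch`, `archiveThreadRequest`, `unarchiveThreadRequest`) and a hypothetical `ThreadPatchRequest` type; the real store code is not shown in this PR text. It only illustrates that archive and unarchive can share one generic PATCH path and differ in the `archived` flag.

```ts
// Hypothetical request descriptor; not the actual @copilotkit/core type.
interface ThreadPatchRequest {
  method: "PATCH";
  path: string;
  body: { archived: boolean };
}

// Build the generic thread-update request for the archived flag.
function buildArchivedPatch(threadId: string, archived: boolean): ThreadPatchRequest {
  return {
    method: "PATCH",
    path: `/threads/${encodeURIComponent(threadId)}`,
    body: { archived },
  };
}

// Archive and unarchive are the same request with opposite flags.
const archiveThreadRequest = (id: string): ThreadPatchRequest => buildArchivedPatch(id, true);
const unarchiveThreadRequest = (id: string): ThreadPatchRequest => buildArchivedPatch(id, false);
```

Keeping both operations on one builder is one way to guarantee the "mirrors `archiveThread`" property: there is no second code path that could drift.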
…5635) (#5637)

Fixes #5635.

## What

Headers set directly on an `HttpAgent` registered via `agents__unsafe_dev_only` were silently replaced by the provider headers. Per-agent auth headers (like an `Authorization` for a self-hosted backend) got dropped, causing 401s.

## Why

`AgentRegistry.applyHeadersToAgent` did `agent.headers = { ...core.headers }`, a full overwrite. The run handler and the react-core `useAgent` hook did the same. So an agent built with its own headers lost them on registration, on every `setHeaders`, and before each request.

## Fix

Merge instead of replace. The registry captures each agent's own headers once (in a WeakMap, before the first apply) and rebuilds `{ ...ownHeaders, ...coreHeaders }`. Core wins on key conflicts, which keeps the existing "provider headers are authoritative" and logout/clear behavior. All header application now routes through one method, `CopilotKitCore.applyHeadersToAgent`, so runs never clobber per-agent headers.

Vue and Angular benefit too: they dispatch runs through `core.runAgent` / `connectAgent`, so the merge is re-applied before every request.

## Tests

- core: 3 new cases in `core-headers.test.ts` (preserve, merge, retain-across-setHeaders); existing overwrite and clear tests still pass.
- react-core: new `use-agent-provider-headers.e2e.test.tsx` with a real provider and an HttpAgent that has its own headers.

Verified locally: format, lint, full core + react-core suites, and both builds.
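The merge-instead-of-replace idea can be sketched roughly as follows. This is an illustration only: `HeaderMap`, `AgentLike`, and `HeaderMerger` are hypothetical names, not the actual `AgentRegistry` or `CopilotKitCore` code. It assumes the agent's own headers are snapshotted once, before the first apply, and that core headers win on key conflicts.

```ts
// Hypothetical minimal shapes for illustration.
type HeaderMap = Record<string, string>;
interface AgentLike {
  headers: HeaderMap;
}

class HeaderMerger {
  // Snapshot of each agent's own headers, taken before the first apply.
  private readonly ownHeaders = new WeakMap<AgentLike, HeaderMap>();

  apply(agent: AgentLike, coreHeaders: HeaderMap): void {
    if (!this.ownHeaders.has(agent)) {
      this.ownHeaders.set(agent, { ...agent.headers });
    }
    const own = this.ownHeaders.get(agent) ?? {};
    // Later spread wins, so core headers override own headers on conflict.
    agent.headers = { ...own, ...coreHeaders };
  }
}
```

Snapshotting matters because `agent.headers` is overwritten on every apply; without the captured copy, a second apply would merge against the already-merged result and stale core headers could never be dropped.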
- @ag-ui/aws-strands 0.2.2 -> 0.2.3 (strands-typescript) - ag_ui_strands 0.2.1 -> 0.2.2 (strands) These releases carry the A2UI-dynamic (declarative-gen-ui) run-completion fix: the auto-injected generate_a2ui now completes after the A2UI surface paints, so the run emits RUN_FINISHED instead of hanging 'Running'. Should green the gen-ui-declarative D6 cell on both integrations (-> 35/35) and resolve the real-LLM staging hang. Pending local D6 re-verify.
Fail loud on a dropped #oss-alerts page: the failure-alert cross-post no longer
swallows a 200/ok:false Slack response, so a dropped page-the-humans alert reds
the renderer job instead of vanishing on a green run. The thread reply stays
warn-only. Both posts capture the response via a shared slack_alert_posted_ok
predicate, mirrored byte-identically across the live workflow and the dry-run
helper.
Debt cleanup: drop a dead failed_count var, correct a misleading gha_url comment,
and validate the decoded blob run_id against ^[0-9a-f]{6}$ in the render step so
a malformed run_id can't reach Slack or the run name.
Tests: predicate edge cases (non-JSON, malformed, missing/null ok), an anti-drift
parity guard asserting the predicate is identical in both files, and call-site
tests locking the #oss-alerts fail-loud vs thread warn-only exit semantics.
…es at the model-call boundary Flatten AG-UI attachment content into native pydantic-ai content types in the OUTGOING request only, via a WrapperModel-scoped flatten rather than a history_processor (so the flatten never persists into ctx.state.message_history or leaks into UI state). Normalize mime types, gate on supported content types, and degrade unsupported types at the single emission choke point. Fixes the _map_user_prompt assert_never crash on raw AG-UI multimodal content.
… interrupt + headless Brings google-adk to 39/39 D6 (reproduced across two independent full-matrix runs, zero regressions). Four changes: - entrypoint.sh: remove ADK_DISABLE_PROGRESSIVE_SSE_STREAMING=1. That flag's non-progressive aggregation path ended ADK's agentic loop after the first tool round (no post-tool LLM re-invoke), which broke every demo needing a second turn: the subagents chain (research -> writing -> critique), tool-rendering-reasoning-chain (AAPL -> MSFT), shared-state-read-write's confirmation, and the custom-catchall narration. The partial-event abort it guarded against is already handled in-callback by stop_on_terminal_text. - manifest.yaml: un-skip-list tool-rendering-reasoning-chain (now passes with the loop restored). - headless_complete_agent.py: add AGUIToolset() so the frontend highlight_note tool is injected and routed to the browser. Removing the flag unmasked this pre-existing gap — turn 3 dispatched highlight_note server-side and the backend registry rejected it. langgraph-python auto-injects frontend tools; ADK needs AGUIToolset() in the agent's tools list. - aimock/d6/google-adk/gen-ui-interrupt.json: order each pill's narration leg (toolCallId) before its emit leg and drop the thread-global hasToolResult gate, so the alice pill no longer 503s after the sales pill leaves a tool result in the thread.
## Summary Hardening + pre-existing-debt cleanup of the showcase promote-notify Slack renderer (follow-up to #5657, scoped to deferred review items). **Fail loud on a dropped page-the-humans alert.** The #oss-alerts failure cross-post used to pipe the Slack API response to `/dev/null`, so a `200`/`{"ok":false,"error":"channel_not_found"}` silently dropped the alert that fires when a promote *failed* — on a green job nobody would notice. Both posts now capture the response via a shared `slack_alert_posted_ok` predicate; the #oss-alerts page fails the renderer job loud on a dropped delivery, while the informational thread reply stays warn-only. The predicate is mirrored byte-identically across the live workflow and the dry-run helper. **Debt cleanup.** Removed a dead `failed_count` var, corrected a misleading `gha_url` comment, and added `^[0-9a-f]{6}$` validation on the decoded blob `run_id` in the render step so a malformed run_id can't reach Slack or the run name. ## Test plan - [x] `promote-notify.bats` 11/11 — predicate edge cases (non-JSON, malformed, missing/`null` ok), anti-drift parity guard (predicate identical in both files), and call-site tests locking the #oss-alerts fail-loud vs thread warn-only exit semantics - [x] red→green proven: re-adding `|| true` to the #oss-alerts call-site flips the call-site test red - [x] shellcheck clean on the dry-run helper
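For readers less familiar with the failure mode: Slack's Web API can answer HTTP 200 with a JSON body whose `ok` field is `false`, so the status code alone does not prove delivery. The sketch below is a TypeScript analogue of the two checks described above, not the shell predicate itself; `slackAlertPostedOk` and `isValidRunId` are illustrative names, and the real `slack_alert_posted_ok` lives in the workflow and the dry-run helper.

```ts
// True only when the body parses as a JSON object whose `ok` is exactly true.
// Non-JSON, malformed, missing, or null `ok` all count as a dropped post.
function slackAlertPostedOk(responseBody: string): boolean {
  try {
    const parsed: unknown = JSON.parse(responseBody);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { ok?: unknown }).ok === true
    );
  } catch {
    return false;
  }
}

// Same pattern the PR quotes for the decoded blob run_id.
const RUN_ID_PATTERN = /^[0-9a-f]{6}$/;

function isValidRunId(runId: string): boolean {
  return RUN_ID_PATTERN.test(runId);
}
```

A fail-loud call site would exit non-zero when `slackAlertPostedOk` returns `false`, while a warn-only call site would log and continue.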
… (kill probe false-red flaps) (#5649) ## Summary Makes the showcase harness probe's turn-done signal **reliable**, killing the dominant class of dashboard false-red flaps without ever hiding a real failure. `waitForTurnComplete` previously relied on a fragile SSE fetch-counter conjunct that false-reds healthy demos whenever the page-side fetch wrapper missed the runtime URL/transport. This change makes the **`data-copilot-running` DOM attribute** (driven directly by the agent run lifecycle, `RUN_STARTED`→true / `RUN_FINISHED`→false, transport-independent) the **PRIMARY** done-signal, with the SSE counter demoted to a **headless-only fallback** (headless demos never render `CopilotChatView`, so the attribute is absent). Design (all three preserved — no false-green, no false-red, hangs still red): - **Primary signal** = the `data-copilot-running` true→false **transition** with a **stayed-stopped quiescence window** (a stop must persist on the same run-start count for `settleMs`; a new sub-run resets it) — so it cannot complete on an intermediate stop in a multi-step turn. - **SSE counter** = headless fallback only; never an OR-trigger when the DOM signal is present. - **`done-signal-missing` backstop** (gated on `attrPresent===true` + `runningNow!==true`) reds a genuine painted-but-never-finished DOM turn before the hard timeout; headless turns use their full timeout for their only signal. ## How it was reviewed A full 4-round `cr-loop` (7 unbiased agents/round + confirmation rounds + a Procedure-3 promotion audit) caught and fixed **5 distinct correctness defects** in the implementation before merge: - **F1** — SSE OR-trigger could complete a multi-step turn early on an intermediate stop (false-GREEN), in both the loop and the post-loop classifier. - **F2** — the run-start baseline was captured *after* the message send, killing the primary signal on fast turns (false-RED). 
- **F3** — non-atomic double `surfaceReady` read per poll (latent hazard + wasted round-trip). - **F4** — the surface-mount (`completeOnMount`) path had no quiescence window (false-GREEN on intermediate stop + false-RED on a still-running gen-UI turn). - **F5** — the early backstop false-redded slow-but-healthy **headless** turns (now gated on the DOM signal). Bidirectional red-green tests for F1–F5 plus a systematic `{DOM, headless} × {completes, lagging-recovers, genuine-hang} × {text, surface}` completion/backstop matrix. Full harness unit suite: **3173 passed / 18 skipped / 0 failed**; `tsc --noEmit` clean; lint 0 errors; build clean. ## Known follow-ups (NOT in this PR — pre-existing / non-blocking) - **Theoretical edge (not reachable on real or realistically-streamed turns):** if a run completed within a single synchronous microtask (zero-duration), the page-side MutationObserver could miss the true edge while `attrPresent===true` → false-red. Real LLM turns and aimock realistic-streaming hold the attribute true across many event-loop ticks, so the observer reliably latches it. A naive "re-add SSE fallback for DOM-present" fix would reintroduce F1's multi-step false-green, so it's intentionally not done here. - **Recommended quick follow-up (latency only, no wrong verdict):** capture `baselineBannerText` pre-`sendTurnMessage` (mirroring the run-start/count baselines) so a fast-erroring cold-start turn fast-fails (#5142) instead of burning the full timeout. - **Pre-existing sse-interceptor capture/counter internals** (none load-bearing for the new done-signal; verified STAY_IN_C by the Procedure-3 audit): page-side counter soft-nav/multi-capture reset, `__hk_fetchWrapped` pattern reuse + hardcoded fallback, g/y-flag stateful RegExp, TextDecoder end-of-stream flush, bare-catch reader-error swallow, framenav payload discard/TOCTOU, CDP-wallTime-vs-Date.now TTFT, addInitScript/close-listener re-registration accumulation. 
## Test plan - [x] `pnpm test` (harness) — 3173 passed / 18 skipped / 0 failed - [x] `tsc --noEmit` exit 0, lint 0 errors, build exit 0 - [ ] Verify on staging that auth / prebuilt-sidebar / claude-sdk-tools (and other previously-flapping cells) stop false-redding while genuinely-broken cells stay red Please review the replay/primary-signal approach. Not auto-merging.
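The stayed-stopped quiescence window can be sketched as a pure function over polled samples. This is a simplified illustration under stated assumptions, not the harness's `waitForTurnComplete`: `RunSample` and `turnCompletedAt` are hypothetical names, the samples stand in for successive reads of `data-copilot-running` plus a run-start counter, and the headless fallback and `done-signal-missing` backstop are left out.

```ts
// One poll of the page: the attribute value and how many runs have started so far.
interface RunSample {
  atMs: number;
  running: boolean;
  runStartCount: number;
}

// Returns the sample time at which the turn counts as done, or null if it never does.
// Done means: a true -> false transition was seen, and the stop persisted on the
// same run-start count for at least settleMs.
function turnCompletedAt(samples: RunSample[], settleMs: number): number | null {
  let sawRunning = false;
  let stoppedSince: number | null = null;
  let stoppedCount = -1;

  for (const s of samples) {
    if (s.running) {
      sawRunning = true;
      stoppedSince = null; // a new sub-run cancels any pending window
      continue;
    }
    if (!sawRunning) continue; // stopped, but no run has been observed yet
    if (stoppedSince === null || s.runStartCount !== stoppedCount) {
      // First stopped sample, or a run started and ended between polls: restart the window.
      stoppedSince = s.atMs;
      stoppedCount = s.runStartCount;
    }
    if (s.atMs - stoppedSince >= settleMs) return s.atMs;
  }
  return null;
}
```

Resetting on a changed run-start count is what keeps an intermediate stop in a multi-step turn from completing the wait early, even when the poll never observes the sub-run in its running state.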
…03 flap) (#5661) ## Summary - The pydantic-ai `generate_a2ui` declarative D6 turn was missing an aimock fixture, producing HTTP 503 `no_fixture_match` on staging (pydantic-ai 503 vs ms-agent-dotnet 200 for the same turn) — a source of dashboard flapping. - Adds the canonical mirror fixtures (outer `generate_a2ui` + matching inner `_design_a2ui_surface`) to `showcase/aimock/d6/pydantic-ai/gen-ui-declarative.json`. These are deterministic canonical mirrors matching the langgraph-python convention — **not** a non-deterministic real-LLM recording — preserving the mandatory LGP 1:1 parity. ## Red-green proof - **RED:** exact failing request (`POST /v1/responses`, gpt-4.1, "Show me my sales dashboard for this quarter.", tools=[`generate_a2ui`], header `x-aimock-context: pydantic-ai`, strict) against the pre-fix fixture set → **HTTP 503 `no_fixture_match`** (reproduces staging exactly; confirmed live on staging too). - **GREEN:** same request against the new set → **HTTP 200** SSE emitting the `generate_a2ui` tool call; the inner `_design_a2ui_surface` turn also returns 200 with the dashboard surface. - Independently re-verified. `validate-on-load` clean (no fixture shadowing); existing pydantic-ai D6 turns (KPI/pie/bar/status) still match identically — no regression. ## Notes - No credentials in the committed fixture — the OpenAI key was never even resolved (canonical mirror, not a recording). Credential scan of the diff + full blob: zero matches. ## Test plan - [ ] CI green - [ ] After deploy, confirm the `generate_a2ui` declarative D6 cell flips red→green on staging
…es (fix assert_never) (#5675) ## Summary The pydantic-ai showcase integration crashed with `assert_never` in pydantic-ai's `_map_user_prompt` whenever AG-UI multimodal `InputContent` (images / documents / binary, data: and url: sources) reached the model — AG-UI content types were never normalized to the native pydantic-ai types (`str` / `ImageUrl` / `BinaryContent` / `DocumentUrl`) the mapper requires. ## What changed - **`_MultimodalFlattenModel(WrapperModel)`** normalizes content at the **model-call boundary** (overriding `request` / `request_stream` / `count_tokens`) — deliberately NOT a `history_processor`, because that hook persists its return into `message_history` and would leak flattened content back to the UI. - **Supported-type gating + degrade centralized at the single native-type emission choke point** (instead of scattered per-content-branch): unsupported image subtypes (HEIC/SVG/TIFF/BMP), audio/video, and non-fetchable url attachments degrade to a text placeholder rather than emitting a native type the OpenAI Responses vision API rejects (which would fail the turn). - **Mime normalized once** — strips RFC-2045 params/whitespace, lowercases, and aliases the common non-canonical `image/jpg` → `image/jpeg` before the allow-list (png/jpeg/gif/webp) test, so real JPEGs are no longer silently dropped. - Identity-based (`is`) no-op detection replaces fragile structural `==`. ## Why it matters Unblocks multimodal turns in the pydantic-ai showcase integration; eliminates the `assert_never` crash and stops valid `image/jpg` JPEGs and url-borne documents from being silently mishandled. 
## Testing - **43 unit tests** (clean pinned venv, pydantic-ai 1.0.18), red-green proven per gap: `image/jpg` forwarded as a supported JPEG; url-media / non-PDF-doc degrade instead of emitting an unconditional `DocumentUrl`; parameterized mime forwarded; state-leak guard (`request_stream` forwards flattened, not raw); `count_tokens` override; `assert_never` provably unreachable (all flatten paths return a native type or raise). - 9 rounds of code review (7 agents/round) to a clean confirmation round (zero blocking findings). ## Test plan - [ ] CI green
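The mime normalization step described above (strip parameters and whitespace, lowercase, alias `image/jpg`, then test the allow-list) can be sketched as follows. The actual change is Python inside the pydantic-ai integration; this TypeScript version is only an illustration, and `normalizeMime`, `isSupportedImage`, and the constant names are hypothetical.

```ts
// Allow-list quoted in the PR description: png / jpeg / gif / webp.
const SUPPORTED_IMAGE_TYPES = new Set(["image/png", "image/jpeg", "image/gif", "image/webp"]);

// Common non-canonical alias mentioned in the PR description.
const MIME_ALIASES: Record<string, string> = { "image/jpg": "image/jpeg" };

// Strip parameters (everything after ';'), trim, lowercase, then apply aliases.
function normalizeMime(raw: string): string {
  const [base = ""] = raw.split(";");
  const cleaned = base.trim().toLowerCase();
  return MIME_ALIASES[cleaned] ?? cleaned;
}

// Gate applied after normalization; unsupported types would degrade to a text placeholder.
function isSupportedImage(raw: string): boolean {
  return SUPPORTED_IMAGE_TYPES.has(normalizeMime(raw));
}
```

Normalizing once, before the allow-list test, is what lets a value like `image/jpg; charset=binary` pass as a JPEG instead of being dropped.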
…ost-tool re-invoke loop (#5676) ## What Fixes the 6 remaining red D6 cells for the **google-adk** showcase integration, bringing it to **39/39 on the D6 matrix** (under aimock replay, reproduced across two independent full-matrix runs, no in-matrix regression). See the scope/caveats section before reading this as "fully done." ## Root cause + changes **1. `entrypoint.sh` — remove `ADK_DISABLE_PROGRESSIVE_SSE_STREAMING=1` (the big one).** A/B tested in-container: with the flag set, ADK's non-progressive aggregation path ends the agentic loop after the first tool round — no post-tool LLM re-invoke. That broke every demo needing a second turn after a tool result: - `subagents` (research → writing → critique never chained past research) - `tool-rendering-reasoning-chain` (AAPL → MSFT) - `shared-state-read-write` (the post-`set_notes` confirmation text) - `tool-rendering-custom-catchall` (the post-tool narration) The flag was added to dodge an intermittent "last event is partial" abort on tool-rendering; the in-callback `stop_on_terminal_text` guard (shared by every agent) is intended to cover that. See caveat 1 — that guard's sufficiency is verified under aimock only, not against real Gemini. **2. `manifest.yaml` — un-skip-list `tool-rendering-reasoning-chain`.** It was marked `not_supported` (vacuous-green) only because of the loop gap; it passes now. **3. `headless_complete_agent.py` — add `AGUIToolset()`.** Removing the flag unmasked a pre-existing gap: the frontend `highlight_note` tool was never injected/routed, so turn 3 dispatched it server-side and the backend registry rejected it (`Tool 'highlight_note' not found`). langgraph-python auto-injects frontend tools; ADK needs `AGUIToolset()` in the agent's `tools` list (every other frontend-tool google-adk agent has it). **4. 
`aimock/d6/google-adk/gen-ui-interrupt.json` — re-key.** The alice pill 503'd: its emit leg was gated `hasToolResult:false`, but the earlier sales pill leaves a tool result in the thread (thread-global). Re-ordered each pill's narration (`toolCallId`) leg before its emit leg and dropped the `hasToolResult` gate, mirroring the reasoning-chain fixture pattern. ## Verification `showcase test google-adk --d6 --isolate` (full per-pill matrix), run twice on separate isolated stacks: - Both: `passed=39, failed=0, total=39, state=green`. - The 6 previously-red cells all pass: shared-state-write, tool-rendering-custom-catchall, subagents, tool-rendering-reasoning-chain, gen-ui-interrupt (sales + alice), gen-ui-headless-complete. - No regression: all previously-passing cells (incl. all `tool-rendering*` and A2UI: gen-ui-declarative, gen-ui-a2ui-fixed, beautiful-chat) stay green. ## Scope and caveats (read before merging) 1. **The flag removal is verified under aimock only, NOT against real Gemini.** `ADK_DISABLE_PROGRESSIVE_SSE_STREAMING=1` was originally added for a *real-Gemini intermittent* "last event is partial" abort. D6 is deterministic aimock replay and cannot reproduce that intermittent condition, so the "0 partial-aborts" result does not prove the `stop_on_terminal_text` guard is sufficient against real Gemini. There is a real (unmeasured) risk this re-introduces the partial-abort on real-LLM tool-rendering runs. Recommend a real-Gemini smoke of the tool-rendering demos before relying on this in a real-LLM context. 2. **"39/39" is the in-matrix set, not every feature.** Three features remain `not_supported` and are excluded from the matrix (not fixed): `interrupt-headless`, `reasoning-default-render`, `agentic-chat-reasoning`. 3. **Local verification only.** D6 is not CI-gated for google-adk; evidence is two isolate runs on one machine, not an independent CI signal.
Wire the strands-typescript showcase integration for staging deployment, mirroring how the Python strands integration is deployed. - manifest: flip deployed: true so the shell lists it in the integration menu - railway-envs.ts: add showcase-strands-typescript SSOT entry (staging-only for now: prod instance not yet provisioned, so it omits the prod env and is gateIgnore'd until promoted dual-env); regenerate railway-envs.generated.json - showcase_build.yml + showcase_build_check.yml: add the strands-typescript build matrix entry, change-detection filter, and dispatch option (railway_id is the new Railway service id) - golden fixture + image-ref-gate inventory tests updated for the new service Railway staging service showcase-strands-typescript provisioned (showcase-strands-typescript-staging.up.railway.app, health /api/health, OpenAI-via-aimock env). Prod is added later via the promote pipeline.
…32→34/35) (#5673) ## Summary Greens two of the three failing D6 (e2e-full) cells for **both** the `strands` (Python) and `strands-typescript` integrations: **32/35 → 34/35** each. ## Fixes - **shared-state-read** — the turn-2 fixture leg wrongly pinned `turnIndex: 0`, so the aimock matcher skipped it on turn 2 ("candidate fixture skipped by sequence/turn state") → 404 → turn-2 `sse-missing`. Dropped `turnIndex` to mirror the langgraph-python gold-standard fixture (whose turn-2 leg omits it). - **multimodal** — `sample.png/pdf/wav` were committed as **git-LFS pointers**, so deploy/test environments without `git lfs pull` served the ~130-byte pointer text as the uploaded file → the agent run never started (`runsFinished=0`). Now shipped as regular binaries via a per-integration `.gitattributes` lfs-unset + real bytes, mirroring langgraph-python's existing convention ("must stay as regular binaries ... so deploy environments without `git lfs pull` serve the actual files"). Both fixes verified locally: `showcase/bin/showcase test {strands,strands-typescript} --d6 --direct --rebuild --isolate` → 34/35 each, shared-state-read + multimodal green. ## Known remaining red (out of scope here) **gen-ui-declarative** (A2UI dynamic) stays red on both. The A2UI surface **paints correctly** (real dashboard data), but `generate_a2ui` never completes, so the run hangs "Running" (`runsFinished=0`). This reproduces on **real-LLM staging** too, so it is not a fixture/aimock artifact — it is a Strands A2UI-dynamic run-completion issue. The fixtures and frontend are byte-identical to langgraph-python's (which passes), pointing at the Strands adapter's auto-inject completion path. Tracked separately for an adapter-level fix.
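A guard against the multimodal failure mode could look roughly like this. It is a sketch, not part of this PR: `looksLikeLfsPointer` is a hypothetical helper, and it relies only on the fact that a Git LFS pointer file is a small text file whose first line starts with `version https://git-lfs.github.com/spec/v1`.

```ts
// First line of every Git LFS pointer file.
const LFS_POINTER_PREFIX = "version https://git-lfs.github.com/spec/v1";

// True when the bytes start with the LFS pointer header instead of real file content.
function looksLikeLfsPointer(bytes: Uint8Array): boolean {
  if (bytes.length < LFS_POINTER_PREFIX.length) return false;
  for (let i = 0; i < LFS_POINTER_PREFIX.length; i++) {
    if (bytes[i] !== LFS_POINTER_PREFIX.charCodeAt(i)) return false;
  }
  return true;
}
```

A fixture test that runs this check over `sample.png`, `sample.pdf`, and `sample.wav` would turn an un-pulled pointer into an explicit failure rather than a run that never starts.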
Follow-up to #5720 (merged). That PR added the **A2UI Error Recovery** demo (`a2ui-recovery`) across google-adk + langgraph-{python,fastapi,typescript} + strands{,-typescript}, but the recovery flow was only exercised by the manual on-demand workflow (`/test-aimock`). It was NOT covered by the per-PR d5/d6 fleet harness, so a regression in heal/exhaust would not turn any automatic CI cell red. ## What this does Adds a `d5-a2ui-recovery` probe and wires it into the harness so `a2ui-recovery` runs on every PR like `gen-ui-declarative`: - **New probe** `showcase/harness/src/probes/scripts/d5-a2ui-recovery.ts` drives both pills in one session and asserts the stable end-states: - **HEAL**: the healed surface paints (>= 2 newly-mounted `declarative-metric` tiles) and NO "Couldn't generate the UI" card. - **EXHAUST**: the hard-failure "Couldn't generate the UI" card appears and NO surface paints. - Assertions are delta-based against a pre-send baseline so the two mutually-exclusive negatives stay correct across the shared two-turn session. The transient "Retrying..." label is not asserted (timing-flaky). - **Per-slug prompts**: the recovery prompts are unique per integration slug (the inner `render_a2ui` calls carry no `x-aimock-context`, so identical prompts would collide in the shared aimock matcher). The probe sends each slug's exact `suggestions.ts` message as typed input, which is byte-identical to the pill dispatch and matches the same fixture. Heal is sent exactly once (its fixture stages invalid->valid via `sequenceIndex` 0->1). - **Wiring**: register `a2ui-recovery` in `d5-registry`, map it in `d5-feature-mapping` (`"a2ui-recovery": ["a2ui-recovery"]`), add its representative fixture in `d5-representatives`, and mirror the mapping in the dashboard `CATALOG_TO_D5_KEY` (kept in lock-step by the drift test). 
## Verification Local fleet harness, both recovery paths green (heal + exhaust): - `showcase test langgraph-python:a2ui-recovery --d6 --isolate` -> green (backend-owned `get_a2ui_tools` path) - `showcase test strands:a2ui-recovery --d6 --isolate` -> green (auto-inject middleware path) Harness unit tests pass (`d5-representatives`, `d5-mapping-drift`, `starter-mapping-drift`); harness typecheck clean. ## Note (separate, pre-existing) The on-demand workflow `test_e2e-showcase-on-demand.yml` is currently broken for all slugs: it installs `@copilotkit/aimock@^1.16.4` then invokes `aimock --fixtures ...`, but the resolved aimock CLI no longer accepts `--fixtures` (`Error: Unknown option '--fixtures'`, only `-c/--config`). aimock dies at the "Start aimock" step before any test runs. This is unrelated to the recovery demo and is tracked separately.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…26-03-17), no gpt-4o default Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ted memory in oracle showcase
Ports two fixes from the canonical oracle-cookbook demo into the oracle-agent-memory
showcase agent (server.py was byte-identical to the demo's pre-fix version):
1. Dangling tool-calls: booking conversationally calls the book_flight HITL tool,
which interrupts and emits an assistant tool_call awaiting the UI's Confirm/Cancel.
If the traveler sends another chat message instead, the unanswered tool_call made
the next turn 400 ("tool_call_ids did not have response messages"). _repair_dangling_tool_calls
synthesizes a "not completed" tool result for any dangling call before the history
reaches the graph. (Inverse of the duplicate-tool-block issue the history-replace
already handled — documented in docs/known-issues/agentspec-multiturn-toolcall-correlation.md.)
2. HTML-escaped persistence: the agentspec exporter HTML-escapes streamed deltas
(& < > -> &amp; &lt; &gt;); the SSE generator persisted them raw, so assistant
replies were stored in Oracle Agent Memory as e.g. "fares &lt; $700".
_clean_assistant_text html.unescapes the assembled text before persisting.
Both are reproduced + verified in oracle-cookbook (unit tests + in-browser); the ported
functions are byte-identical to the verified demo code.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
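Both fixes above can be sketched in a few lines. The real functions (_repair_dangling_tool_calls and _clean_assistant_text) are Python in server.py and are not shown here; the TypeScript below is only an illustration with hypothetical names and a hypothetical message shape. The unescape helper covers just the three entities named above, whereas Python's html.unescape handles many more.

```ts
// Hypothetical chat-history shapes for illustration.
interface ToolCall {
  id: string;
  name: string;
}
type ChatMessage =
  | { role: "user" | "system"; content: string }
  | { role: "assistant"; content: string; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; content: string };

// Insert a synthetic "not completed" result after any assistant tool call
// that has no matching tool message anywhere in the history.
function repairDanglingToolCalls(history: ChatMessage[]): ChatMessage[] {
  const answered = new Set<string>();
  for (const m of history) {
    if (m.role === "tool") answered.add(m.toolCallId);
  }
  const repaired: ChatMessage[] = [];
  for (const m of history) {
    repaired.push(m);
    if (m.role !== "assistant" || !m.toolCalls) continue;
    for (const call of m.toolCalls) {
      if (!answered.has(call.id)) {
        repaired.push({ role: "tool", toolCallId: call.id, content: "not completed" });
      }
    }
  }
  return repaired;
}

// Undo the three basic escapes; &amp; goes last so "&amp;lt;" becomes "&lt;", not "<".
function unescapeBasicEntities(text: string): string {
  return text.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
}
```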
…derer (#5719)

## Problem

There is currently no way to intercept an A2UI action before it reaches the agent when using `createA2UIMessageRenderer`. Every dispatched action is forwarded to the agent unconditionally:

1. `A2UIMessageRendererOptions` exposed only `{ theme, catalog?, loadingComponent?, recovery? }` with no `onAction` (`packages/react-core/src/v2/a2ui/A2UIMessageRenderer.tsx`).
2. `ReactSurfaceHost` mounted `<A2UIProvider onAction={handleAction}>` and `handleAction` always called `copilotkit.runAgent({ agent })` unconditionally.
3. Per-component interception fails: `web_core@0.9.0`'s `GenericBinder` classifies any prop whose Zod schema is a union containing `{ event }` as `ACTION` and replaces the raw value with a zero-arg dispatching closure (so `props.action` is a function, `props.action.event` is undefined).
4. `functionCall` actions are dropped, not a workaround: `surface-model.js` `dispatchAction` only emits when the payload has `event`, and `createCatalog` provides no way to register client functions.

Raised in Discord by a user with a custom catalog + custom Button who needs a `navigate` event handled client-side. DevRel confirmed `onAction` was not exposed.

## Change

Add an optional `onAction` interceptor to `createA2UIMessageRenderer`:

```ts
onAction?: (
  action: A2UIUserAction,
  forward: (action?: A2UIUserAction) => Promise<void>,
) => void | A2UIUserAction | null | Promise<void | A2UIUserAction | null>;
```

- Return `null` to handle the action client-side and stop forwarding (agent is not run).
- Return an `A2UIUserAction` to forward the (possibly modified) action.
- Return `undefined` / `void` to forward unchanged.

The interceptor is threaded from `createA2UIMessageRenderer` options through `ReactSurfaceHost` into `handleAction`. The forwarding logic is extracted into a `runA2UIAction` helper so the three behaviors are unit-testable. The original finally-block property cleanup is preserved, and default behavior is byte-identical when `onAction` is not supplied. New `A2UIActionInterceptor` type is exported from the v2 entrypoint.

## Tests

Added to `packages/react-core/src/v2/__tests__/A2UIMessageRenderer.test.tsx`:

- `onAction` returning `null` does not run the agent.
- `onAction` returning a modified action forwards the modified payload.
- No `onAction` forwards the original message unchanged.
- `onAction` returning `undefined` forwards unchanged.

## Verify

- `nx test react-core` (1374 passed)
- `nx run-many -t lint build --projects=react-core` (clean)
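The three return-value behaviors can be illustrated with a small decision function. This is a sketch, not the `runA2UIAction` helper: `UserAction` is a hypothetical stand-in for `A2UIUserAction` (whose real fields are not shown here), the `forward` callback and Promise returns from the PR's signature are left out, and `resolveForwardedAction` and `navigateInterceptor` are illustrative names.

```ts
// Hypothetical minimal action shape; the real A2UIUserAction lives in react-core.
interface UserAction {
  event: string;
  payload?: Record<string, unknown>;
}

// Simplified interceptor result: null = handled, undefined = pass through.
type InterceptResult = UserAction | null | undefined;

// Decide what, if anything, should be forwarded to the agent.
function resolveForwardedAction(original: UserAction, result: InterceptResult): UserAction | null {
  if (result === null) return null; // handled client-side, agent is not run
  if (result === undefined) return original; // forward unchanged
  return result; // forward the (possibly modified) action
}

// Example interceptor: swallow `navigate` events, let everything else through.
function navigateInterceptor(action: UserAction): InterceptResult {
  if (action.event === "navigate") {
    // Client-side routing would happen here.
    return null;
  }
  return undefined;
}
```

Treating `null` and `undefined` differently is the core of the contract: `null` is an explicit "I handled it", while `undefined` keeps existing handlers that return nothing on the default forwarding path.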
## What this adds A new **"OpenBox Governance"** recipe in the cookbook (`showcase/shell-docs`), documenting how to add **OpenBox runtime governance** — guardrails, OPA/Rego policy, redaction, human-in-the-loop approvals, and halt — to a **CopilotKit + LangGraph** agent, with decisions rendered as generative UI. Follows the existing cookbook recipe pattern (Oracle / Arcade / Daytona): - `src/content/docs/cookbook/openbox-governed-copilotkit.mdx` — the recipe (how it works, the stack, prerequisites, run + **provisioning** steps, the four-verdict governance matrix, the key pieces in code, going further). - `meta.json` — registers the recipe in cookbook nav. - `index.mdx` — overview `<Card>`. - `src/lib/sidebar-icon.tsx` — `custom/openbox` sidebar icon. - `public/logos/openbox.png` — brand mark (Git LFS). The recipe documents the companion showcase **at full parity**: the provisioning step (`npm run openbox:admin:setup`), the Allow / Constrain / Approval / Block / Halt matrix using the demo's real suggestion prompts, and "key pieces in code" snippets pulled verbatim from the final showcase source (the LLM-driven governed engine, the OpenBox-middleware-first agent, and the runtime + approval routes). ## Red → green (TDD) The cookbook-nav test in `src/lib/__tests__/docs-render.test.ts` is the red/green hook — it hard-asserts the recipe count, titles, slugs, and URLs (6 entries incl. `["OpenBox Governance", "cookbook/openbox-governed-copilotkit"]`). ## Verification - `docs-render` test suite: **15/15** (nav asserts 6 entries). - `next build`: ✅ (all cookbook routes, no MDX/link errors). - The run-steps, four-verdict matrix, and code snippets were verified against the **final** demo source (companion PR #5685), so the docs and the runnable code stay in sync. 
## Companion demo PR

The runnable showcase this recipe documents: **#5685**

> **Ships standalone.** This recipe no longer links the hosted live demo; that link now lives on the companion demo PR (#5685), so the cookbook recipe can merge independently of the demo deployment.
…howcase lands

The OpenBox Governance recipe (#5686) merged ahead of its companion showcase (#5685), so the "Get the code" link pointed at github.com/.../tree/main/examples/showcases/openbox-governed-copilotkit, which 404s while that code is not yet on main.

Replace the broken link with a plain "Full source to follow" note (no hyperlink, so nothing 404s) that still describes what the showcase will contain. The live upstream reference-repo link is kept. Swap the link back in once the showcase merges to main.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Fr5HVeDzDyC4S6DjyhAFWZ
…howcase lands (#5767)

## What does this PR do?

The **OpenBox Governance** cookbook recipe (#5686) merged to `main` ahead of its companion showcase (#5685), so the recipe's **"Get the code"** section linked to:

`https://github.com/CopilotKit/CopilotKit/tree/main/examples/showcases/openbox-governed-copilotkit`

…which **404s** today because that code isn't on `main` yet.

This swaps the broken link for a plain-text **"Full source to follow"** note (no hyperlink, so nothing 404s) that still describes what the showcase will contain. The live **upstream reference-repo** link directly below it is kept.

```diff
- Full source: [`examples/showcases/openbox-governed-copilotkit`](https://github.com/.../tree/main/examples/showcases/openbox-governed-copilotkit) — `agent/` … and `frontend/` …
+ Full source to follow — the runnable showcase (the `agent/` LangGraph service with OpenBox middleware and the `frontend/` CopilotKit V2 chat with the wrapped runtime and approval route) will be published under `examples/showcases/openbox-governed-copilotkit`.
```

**Follow-up:** once the showcase merges to `main`, restore the direct `tree/main/...` link.

_Note:_ the recipe's run-instructions still `cd` into that path (expected, since the whole "run it yourself" flow assumes the code), so those steps only work once the showcase lands. Left as-is since they read as setup instructions, not a clickable link.

## Related PRs and Issues

- Docs recipe (merged): #5686
- Companion showcase (pending): #5685

## Checklist

- [x] I have read the [Contribution Guide](https://github.com/copilotkit/copilotkit/blob/master/CONTRIBUTING.md)
- [x] If the PR changes or adds functionality, I have updated the relevant documentation
- [x] "Allow edits by maintainers" is checked

🤖 Generated with [Claude Code](https://claude.com/claude-code)

https://claude.ai/code/session_01Fr5HVeDzDyC4S6DjyhAFWZ

---

_Generated by [Claude Code](https://claude.ai/code/session_01Fr5HVeDzDyC4S6DjyhAFWZ)_
Address review feedback on the OpenBox Governance recipe:

- Replace the ASCII flow with a real inline-SVG architecture diagram
- Condense the wall-of-text provisioning warning to a few lines
- Convert the governance-matrix table into per-prompt accordions
- Highlight the key lines across the code samples to guide the reader
- Move the coding-agent prompt to the top in a collapsed accordion
## What does this PR do?

Follow-up polish on the **OpenBox Governance** cookbook recipe (shipped in #5686), addressing review feedback from the launch. All changes are in the single recipe page; no functional/code changes.

| Feedback | Change |
| --- | --- |
| "Make this an actual diagram — hard to read as ASCII" | Replaced the ASCII flow under **How it works** with a real, color-coded **inline-SVG architecture diagram** (same self-contained pattern as the Oracle recipe, no external assets). |
| "Huge, scary warning — condense it" | Cut the wall-of-text **provisioning** warning down to a few lines while keeping the essentials (idempotent, which keys it needs, how to verify). |
| "Prompts are unreadable in the chart layout → accordions" | Converted the **Try it** governance-matrix table into one **accordion per prompt** (verdict-labeled), so the long prompts are on-demand instead of crammed into table cells. |
| "Highlight the important stuff — guide the user. Applies to all code." | Added line highlighting across the bash + TypeScript samples so the key lines (the runtime wrap, `selfGovernedToolNames`, middleware-first, the provisioning command, the approval schema) stand out. |
| "Move this to the top top and make it an accordion" | Moved the **coding-agent prompt** from the bottom to just under the intro, collapsed in an accordion. |

## Verification

Rendered locally against `shell-docs` (`next dev`) and confirmed in-browser: the SVG diagram renders, all five accordions expand correctly, the warning is condensed, and code highlighting shows on both bash and TS blocks (no leaked `[!code]` markers).

## Related

- Recipe (merged): #5686
- Source-link hotfix (merged): #5767
- Companion showcase (pending): #5685. Once it lands, restore the direct `tree/main/...` source link in **Get the code**.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
## Problem

Docs OG image generation was producing unreliable social-preview output for docs URLs. The route depended on old static OG assets and Inter-era styling, and the preview did not match the current CopilotKit docs theme or logo.

## Why

The OG route should render a consistent branded card from page frontmatter for every docs slug. It also needs local render assets for request-time reliability: `next/og` does not inherit the app layout font, and image inputs need to be available as bytes when the route renders.

## Fix

- Reworked `showcase/shell-docs/src/app/og/[...slug]/route.tsx` to render a branded 1200x630 card with a tighter layout, CopilotKit theme colors, Plus Jakarta Sans, and per-page title/description/section labels.
- Kept `next/font/google` for normal docs pages, and added upstream Plus Jakarta Sans static TTFs only for the OG renderer. `SOURCE.md` records the upstream URLs and SHA-256 hashes. The Google Fonts variable TTF was tested but the bundled `next/og` renderer crashes while parsing its `fvar` table.
- Added the official CopilotKit full lockup as a real PNG asset. It is covered by the repo-level `*.png filter=lfs` rule, and the route encodes the PNG bytes to a data URI only at render time for `ImageResponse`.
- Removed the hardcoded runtime/frontend/agent pills and the yellow gradient stop from the card.
- Updated the focused OG route test to assert the card dimensions and bundled Plus Jakarta fonts.

Validation: focused OG test, direct `ImageResponse` render with the upstream fonts, lint, typecheck, build, and live local OG route checks passed. Full shell-docs test has unrelated existing failures in public LFS PNG assets and one docs-render nav expectation.
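The Fix list mentions encoding the logo's PNG bytes to a data URI at render time. A minimal sketch of that encoding step follows; `bytesToDataUri` is a hypothetical helper (the route's actual code is not shown here), and the eight-byte PNG signature stands in for a real logo file.

```typescript
// Hypothetical helper for illustration; not the route's real implementation.
function bytesToDataUri(bytes: Uint8Array, mimeType: string): string {
  // Build a binary string one byte at a time, then base64-encode it.
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// Placeholder input: the standard PNG file signature, not an actual image.
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const logoSrc = bytesToDataUri(pngSignature, "image/png");
// logoSrc === "data:image/png;base64,iVBORw0KGgo="
```

Encoding on demand keeps the committed asset a normal PNG while giving the renderer an image source that needs no network fetch.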
Runnable companion to the #5521 cookbook recipe: a portable Oracle Agent Spec agent on LangGraph over AG-UI, long-term memory on Oracle AI Database, and a CopilotKit V2 frontend (generative UI + HITL booking). Lives beside `daytona-runcode`.

- `agent/`: Python (uv) Agent Spec agent + FastAPI AG-UI server
- `frontend/`: Next.js CopilotKit V2 chat (flight cards, recall chip, HITL booking)
- `db/`: Oracle AI Database (Free image) + cookbook-user init
- `docker-compose.yml` (local Oracle) + per-service `railway.json`

✅ Live cross-session recall verified working on the hosted demo (a preference taught in one thread is recalled in a fresh thread).

### Pre-merge follow-ups: resolved ✅

- [x] **Workspace deps**: the frontend builds standalone (`npm ci` + `next build` ✓ inside the monorepo). It ships its own `package-lock.json` and is intentionally excluded from the pnpm workspace (the `daytona-runcode` / `shell-dashboard` convention), so the published prerelease `@copilotkit/*` resolve and no `workspace:*`/`catalog:` is needed.
- [x] **Promote SSOT**: the live demo is hosted independently (not part of CopilotKit's Railway promote fleet), so no `showcase/scripts/railway-envs.ts` entry is required.
- [x] **`db/build-and-push.sh`**: default image namespace genericized to an `OWNER` placeholder (still `IMAGE`-overridable).
- [x] **Oracle license**: `db/Dockerfile` and `build-and-push.sh` both document that the built image must stay in a PRIVATE registry (re-publishing the Oracle base image violates the license).

> The Vercel preview deploys and `fork-pr-monitor` fail because this is a cross-fork PR (forks can't access deploy secrets), not because of code issues. `config-allowlist` is fixed.

Docs: #5521.
…agent-memory showcase (#5762)

## ⚠️ Draft: stacked on #5563

This builds on #5563 (the `oracle-agent-memory` showcase). Since the showcase isn't on `main` yet, **this PR's diff currently includes #5563's files**; once #5563 merges, it auto-reduces to just the checkpointer change below. Keeping it as a **draft** until then; opening early to stage the work.

## What this adds

An **optional, flag-gated** durable LangGraph checkpointer for the showcase agent, using Oracle's [`langgraph-oracledb`](https://github.com/oracle/langchain-oracle/tree/main/libs/langgraph-oracledb) `AsyncOracleSaver`, so per-thread LangGraph graph state persists in **Oracle**, **complementing** (not replacing) `oracleagentmemory`. That rounds out the "whole stack on Oracle" story: durable *memory* **and** durable *graph checkpoints*.

**Default-safe:** gated behind `LANGGRAPH_CHECKPOINTER` (default `memory` → behavior unchanged; `oracle` is opt-in), with graceful fallback to in-memory on any Oracle error, so CI and the default run path never touch a DB.

## The actual delta (just these agent files)

- `agent/concierge/checkpointer.py` *(new)*: flag-gated resolver that builds a dedicated async Oracle pool + `AsyncOracleSaver`, runs `await setup()`, and degrades to `MemorySaver` on failure.
- `agent/concierge/server.py`: an import-time monkeypatch injects the resolved checkpointer past `ag_ui_agentspec`'s hardcoded `MemorySaver`; the FastAPI lifespan inits/closes it.
- `agent/pyproject.toml`: adds `langgraph-oracledb>=1.0.1`, bumps `oracledb>=2.2.0`, adds dev `pytest`/`pytest-asyncio`.
- `agent/.env.example`: documents `LANGGRAPH_CHECKPOINTER`.
- `agent/tests/test_oracle_checkpointer.py` *(new)*: a durability round-trip test (skipped unless `LANGGRAPH_CHECKPOINTER=oracle`).
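The flag-gated resolver with graceful fallback described above follows a common pattern. The sketch below shows that pattern only: the showcase's real resolver is Python (`checkpointer.py`), and `Checkpointer`, `resolveCheckpointer`, and `buildOracleSaver` are hypothetical names, written in TypeScript purely for illustration.

```typescript
// Hypothetical types and names; the actual implementation is Python and not shown here.
type Checkpointer = { kind: "memory" | "oracle" };

async function resolveCheckpointer(
  env: Record<string, string | undefined>,
  buildOracleSaver: () => Promise<Checkpointer>,
): Promise<Checkpointer> {
  // Default is "memory", so the default run path never touches a database.
  const mode = env["LANGGRAPH_CHECKPOINTER"] ?? "memory";
  if (mode !== "oracle") return { kind: "memory" };
  try {
    // Opt-in path: build the durable saver (pool creation and setup would happen here).
    return await buildOracleSaver();
  } catch (err) {
    // Any Oracle error degrades to the in-memory checkpointer instead of failing startup.
    console.warn("Oracle checkpointer unavailable; falling back to in-memory", err);
    return { kind: "memory" };
  }
}
```

Passing the builder in as a parameter keeps the fallback logic testable without a live database, which matches the stated goal that CI never needs one.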
Mirrors [jerelvelarde/oracle-cookbook#4](jerelvelarde/oracle-cookbook#4), where it's reviewed + verified — the durability round-trip passes against a live Oracle DB, and 5/5 e2e pass with the flag off (no regression on the default path). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…turn scripted replies The 6 OpenAI drop-in integration examples enabled for the CLI's keyless AIMock mock mode shipped fixtures that did not cover the prompts each starter's own UI suggests, so a keyless first-run user clicking the demo suggestions mostly hit the generic catch-all. Re-key/extend each example's fixtures/default.json (substring, most-specific-first, catch-all last) so the surfaced suggestions return scripted replies. ENT-1003.
…ite re-call) The chart-chain fixtures looped under multi-step tool execution: langgraph-python emitted a 2nd tool call (pieChart/barChart/dashboard) with no terminating toolCallId result fixture, so it fell through to the still-matching userMessage fixture and re-called query_data. langgraph-js had its toolCallId result fixtures ordered after the userMessage fixtures (first array match wins -> re-call). Add the missing terminating result fixtures (langgraph-python) and reorder so all toolCallId fixtures precede userMessage fixtures (langgraph-js). Verified termination via the 2-3 step multi-turn repro; mastra fixtures already terminated.
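The ordering problem described here can be reproduced with a toy first-match matcher. This is a sketch under assumptions: the `ChainFixture` shape, the `firstMatch` function, and the id `call_query_data_001` are hypothetical and do not reflect AIMock's real schema or code.

```typescript
// Hypothetical fixture shape for illustration; AIMock's real schema may differ.
type ChainFixture = {
  match: { userMessage?: string; toolCallId?: string };
  reply: string;
};
type MockRequest = { lastUserMessage: string; lastToolCallId?: string };

// First array match wins, as the commit message describes.
function firstMatch(fixtures: ChainFixture[], req: MockRequest): ChainFixture | undefined {
  return fixtures.find(({ match }) => {
    if (match.toolCallId !== undefined) return match.toolCallId === req.lastToolCallId;
    if (match.userMessage !== undefined) return req.lastUserMessage.includes(match.userMessage);
    return true; // an empty match acts as a catch-all
  });
}

const callTool: ChainFixture = { match: { userMessage: "pie chart" }, reply: "call query_data" };
const finish: ChainFixture = { match: { toolCallId: "call_query_data_001" }, reply: "final text" };

// Second step of the turn: the tool result is back, but the user message still matches.
const step2: MockRequest = {
  lastUserMessage: "Show a pie chart",
  lastToolCallId: "call_query_data_001",
};

const looping = firstMatch([callTool, finish], step2); // userMessage fixture wins: re-call
const terminating = firstMatch([finish, callTool], step2); // toolCallId fixture wins: turn ends
```

With the result fixture listed first, the second step resolves to the terminating reply instead of re-issuing the tool call.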
The langgraph-js Toggle Theme chip returned a text response claiming it toggled the theme, but emitted no tool call -- so the theme never changed, while the same chip in langgraph-python emits the toggleTheme frontend tool call and works. Identical chip, different outcome by template. Swap the text response for the toggleTheme tool call, mirroring langgraph-python (frontend tool, no terminating result fixture needed -> no loop). Verified aimock now returns the toggleTheme tool call for the chip.
…nate HITL turn) Same class as the toggle-theme fix: langgraph-js's 'schedule a meeting' chip returned text, so the meeting picker never rendered, while langgraph-python emits the scheduleTime tool call. scheduleTime is a registered langgraph-js frontend tool (reasonForScheduling/meetingDuration) — swap text -> tool call, mirroring langgraph-python. Because scheduleTime is Human-in-the-Loop (the user's pick returns as a tool result and re-invokes the LLM), add a terminating call_schedule_time_001 result fixture in BOTH templates so the turn doesn't re-match the user message and loop (langgraph-python lacked it too — latent). Verified both terminate at 2 steps (tool call -> terminating text).
…turn scripted replies (#5728)

## Why

The 6 OpenAI drop-in integration examples enabled for the CLI's **keyless AIMock mock mode** (`langgraph-python`, `langgraph-js`, `mastra`, `llamaindex`, `agno`, `pydantic-ai`) shipped `fixtures/default.json` files that **did not cover the prompts each starter's own UI suggests**. AIMock matches `userMessage` as a **case-sensitive substring, first-match-wins**, so a keyless first-run user who clicked the demo's suggestion chips mostly fell through to the generic catch-all ("I only have scripted replies…") instead of getting a scripted demo reply.

Two root issues found (audit at `origin/main`):

- **`langgraph-python` was mis-keyed**: fixtures keyed on suggestion *titles* (`"Pie Chart"`, `"Toggle Theme"`, `"Task Manager"`) while the UI sends long *message* strings that don't contain those substrings → all 9 suggestions missed.
- The other 5 shipped only a generic `Hello` (+ a `weather` fixture in langgraph-js/mastra) → 0–1 of each starter's chips covered.

This PR re-keys/extends each example's fixtures so the surfaced suggestions return scripted replies. Ordering preserved: most-specific-first, `{}` catch-all last (unchanged text).

> Tracking: **ENT-1003** (CopilotKit/Intelligence). Sibling to ENT-989 (fixtures for *not-yet-enabled* frameworks). This PR covers the *already-enabled* 6.
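The mis-keying issue follows directly from case-sensitive substring matching. A minimal sketch, assuming a hypothetical `Fixture` shape, a made-up chip message, and a placeholder catch-all reply (none of these are taken from the real fixtures or from AIMock's code):

```typescript
// Hypothetical fixture shape; only the matching rule (case-sensitive substring,
// first match wins, empty match as catch-all) comes from the PR description.
type Fixture = { match: { userMessage?: string }; reply: string };

function pickReply(fixtures: Fixture[], message: string): string | undefined {
  const hit = fixtures.find(
    (f) => f.match.userMessage === undefined || message.includes(f.match.userMessage),
  );
  return hit?.reply;
}

const catchAll: Fixture = { match: {}, reply: "(catch-all reply)" };

// Made-up example of the long message a suggestion chip might send.
const chipMessage = "Please show me the revenue distribution as a pie chart.";

// Keyed on the chip's title: "Pie Chart" is not a substring of the message.
const keyedOnTitle: Fixture[] = [
  { match: { userMessage: "Pie Chart" }, reply: "scripted pie chart reply" },
  catchAll,
];

// Keyed on text the message actually contains.
const keyedOnMessage: Fixture[] = [
  { match: { userMessage: "pie chart" }, reply: "scripted pie chart reply" },
  catchAll,
];

const missed = pickReply(keyedOnTitle, chipMessage); // falls through to the catch-all
const covered = pickReply(keyedOnMessage, chipMessage); // returns the scripted reply
```

The same rule explains why the catch-all has to stay last: placed earlier, it would win every match.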
## What changed (per template)

| Template | Suggestions now covered | Tool-call replies | Text replies |
|---|---|---|---|
| langgraph-python | 9/9 (3 re-keyed) | pie/bar chart (`query_data`→component), `scheduleTime`, `toggleTheme`, `manage_todos` | Search Flights, Excalidraw, Calculator, Sales-Dashboard A2UI step |
| langgraph-js | 9/9 | `query_data`, `search_flights`, `generate_a2ui`, `manage_todos` (schemas verified vs agent) | scheduleTime, Excalidraw, generateSandboxedUi, toggleTheme |
| mastra | 6/6 | `get-weather`, `setThemeColor`, `go_to_moon` (HITL) | 3 proverb chips (proverbs are agent shared-state, not a tool) |
| llamaindex | 3/3 | n/a | theme / proverb / weather |
| agno | 4/4 | n/a | weather / theme / stock / proverb |
| pydantic-ai | n/a (UI surfaces no chips) | n/a | best-effort free-typer fixtures: "what can you do" / "proverb" / "weather" |

## Honest caveats (best-effort; draft)

- **Tool-call shapes only where evidenced.** Where a suggestion drives A2UI streaming, an MCP app (Excalidraw), or a frontend-only tool (`generateSandboxedUi`, `scheduleTime` in some templates) whose call shape isn't defined in the template, I used an **on-topic text reply** rather than fabricating a tool envelope. Those replies beat the catch-all but won't trigger the live generative UI under mock mode; a follow-up could record the real shapes.
- **`pydantic-ai` surfaces no suggestion chips** in its UI, so there's nothing to key on; the real fix is a small UI change (add `suggestions` to `CopilotSidebar`), which is out of scope for a fixtures-only PR. The added fixtures are a fallback for free-typing users.
- **Not verified end-to-end here** (authored against `origin/main`, not run live). Each template's `docker-compose.test.yml` AIMock smoke should stay green; note its `@chat` test only sends `"Hello"` and asserts a non-empty reply, so it does **not** validate suggestion-chip coverage. Extending it to assert a non-catch-all reply for a real suggestion would close that blind spot (also noted in ENT-1003).

## Test plan

- [ ] Per-template `docker-compose.test.yml` AIMock smoke still green.
- [ ] Manual: scaffold/run each keyless, click each suggestion chip, confirm a scripted reply (not the catch-all).
### What

The Angular `CopilotChatInput.handleKeyDown` submits the message when Enter is pressed without Shift, but it never checks whether an IME composition is in progress. When typing CJK text (Japanese, Chinese, Korean), the Enter that confirms an IME candidate also fires a `keydown`, so the half-composed text gets sent instead of the candidate being committed.

The React and Vue bindings of the same v2 `CopilotChatInput` already guard against this; Angular was the one binding still missing it:

- `packages/react-core/src/v2/components/chat/CopilotChatInput.tsx`: `handleKeyDown` returns early on `e.nativeEvent.isComposing || e.keyCode === 229`
- `packages/vue/src/v2/components/chat/CopilotChatInput.vue`: `handleKeydown` returns early on `isComposing.value || event.isComposing || event.keyCode === 229`, and has a test asserting it does not submit while composing

### Change

Add the same early return to the Angular `handleKeyDown`, using the native `KeyboardEvent` (`event.isComposing || event.keyCode === 229`), which matches the Vue binding's idiom.

### Notes

When composition is not active `isComposing` is `false`, so Enter submits exactly as before and Shift+Enter still inserts a newline. The most visible case is Safari with a Japanese IME, where the confirming Enter reports `key === "Enter"` with `isComposing === true`; the `keyCode === 229` arm mirrors the sibling guards for browsers that report the composing key that way.

I verified the handler logic in isolation (Enter while composing no longer submits; plain Enter and Shift+Enter are unchanged). I did not run the full Angular suite locally.
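The guard can be isolated as a pure predicate, which is roughly how the handler logic can be checked without a browser. This is a sketch, not the Angular component's code: `KeyLike` and `shouldSubmit` are hypothetical names, and a real handler would receive a `KeyboardEvent`.

```typescript
// Minimal structural type so the sketch runs without a DOM.
type KeyLike = { key: string; shiftKey: boolean; isComposing: boolean; keyCode: number };

function shouldSubmit(event: KeyLike): boolean {
  // IME composition in progress: this Enter confirms a candidate, not the message.
  if (event.isComposing || event.keyCode === 229) return false;
  // Plain Enter submits; Shift+Enter is left alone so it can insert a newline.
  return event.key === "Enter" && !event.shiftKey;
}

const plainEnter = shouldSubmit({ key: "Enter", shiftKey: false, isComposing: false, keyCode: 13 });
const shiftEnter = shouldSubmit({ key: "Enter", shiftKey: true, isComposing: false, keyCode: 13 });
const composingEnter = shouldSubmit({ key: "Enter", shiftKey: false, isComposing: true, keyCode: 13 });
const legacyComposing = shouldSubmit({ key: "Enter", shiftKey: false, isComposing: false, keyCode: 229 });
```

Only `plainEnter` is `true`; the other three cases leave the input untouched.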
…uery

CopilotChat message wrappers used viewport-keyed `cpk:sm:px-0`, collapsing horizontal padding to 0 at any viewport >=640px. The message column is `max-w-3xl` (768px) centered; the design assumes the chat fills the viewport, so at >=640px the column has side gutters and inner padding can drop to 0. But when the chat lives in a sub-viewport-width pane (e.g. the threads drawer rail beside the chat, ~580px on an 820px iPad-portrait viewport), `sm:px-0` still fires on viewport width while the 768px column overflows the narrow pane and sits flush against both edges. The input wrapper looked fine because it is visually inset by its own pill, so only message text appeared broken.

Make the padding container-relative instead of viewport-relative:

- add `cpk:@container` (container-type: inline-size) to the chat root, and
- switch the message/input/suggestion wrappers from `cpk:sm:px-0` to the container variant `cpk:@3xl:px-0`.

Padding now tracks the chat's own width and drops to 0 only once the container is at least as wide as the column's own max-width, so the column has real gutters; in any narrower pane the `px-4` inner padding is retained. React, Angular, and Vue kept in lockstep.

Note on the breakpoint: Tailwind v4 container-query breakpoints differ from viewport breakpoints (`@sm` = 24rem/384px, not 640px). A mechanical `sm:` -> `@sm:` swap would still collapse the ~580px repro pane. `@3xl` (48rem/768px) is used because it exactly matches the column's `max-w-3xl`, which is the width at which side gutters first appear.

Verified: full-width desktop chat unchanged (container >=768px -> px-0); 580px pane retains 16px padding; render-prop layouts without a container ancestor degrade safely to `px-4`; sidebar/popup `data-*` padding overrides are unaffected.

ENT-1020

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uery (ENT-1020) (#5778)

## What & why

Fixes **ENT-1020**. In a side-by-side layout (the threads drawer rail next to the chat), at iPad-portrait / tablet widths the chat **message text sat flush against both pane edges** (no horizontal padding) while the input stayed correctly inset. Surfaced while manually testing the Angular `CopilotDrawer` (#5746), but it is **not a drawer bug**: it reproduces in any layout that puts `CopilotChat` in a pane narrower than the viewport.

### Root cause

The message column is `max-w-3xl` (768px) centered; its wrappers used `cpk:px-4 cpk:sm:px-0`: 16px below 640px, then **0 at viewport ≥640px**. There was no `container-type` on the chat root, so the `sm:` variant keyed on the **viewport**, not the chat's own width. In a sub-viewport pane (~580px chat on an 820px viewport) `sm:px-0` still fired (viewport ≥640) while the 768px column overflowed the pane edge-to-edge → flush text.

## The fix (robust / container-relative)

- Add `cpk:@container` (`container-type: inline-size`) to the chat root (React ×2 render paths, Angular, Vue).
- Switch every message / input / suggestion wrapper from viewport `cpk:sm:px-0` to the **container** variant `cpk:@3xl:px-0`.

Padding now tracks the chat's **own** width and drops to 0 only once the container is at least as wide as the column's `max-w-3xl`, i.e. once the column has real side gutters. In any narrower pane the `px-4` inner padding is retained. **React, Angular, and Vue kept in lockstep.**

### Why `@3xl`, not the `@sm` the ticket suggested

Tailwind v4 **container-query** breakpoints are a *different scale* from viewport breakpoints: `@sm` = **24rem/384px** (viewport `sm` = 640px). A mechanical `sm:` → `@sm:` swap would still collapse the ~580px repro pane (580 ≥ 384). `@3xl` = **48rem/768px**, which exactly matches the column's `max-w-3xl` (the width at which gutters first appear), so it is the semantically correct breakpoint.
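The breakpoint arithmetic above can be checked with a small model of the two padding rules. This is a sketch that only mirrors the widths quoted in this description (assuming a 16px root font size); the function names are hypothetical and no Tailwind code is involved.

```typescript
// Widths quoted in the PR text, assuming 1rem = 16px.
const VIEWPORT_SM_PX = 640; // viewport `sm`
const CONTAINER_SM_PX = 384; // container `@sm` (24rem)
const CONTAINER_3XL_PX = 768; // container `@3xl` (48rem), same as `max-w-3xl`
const PX_4 = 16; // `px-4` inner padding

// Old rule: `px-4 sm:px-0`, keyed on the viewport width.
function viewportKeyedPadding(viewportWidth: number): number {
  return viewportWidth >= VIEWPORT_SM_PX ? 0 : PX_4;
}

// New rule: `px-4 @<breakpoint>:px-0`, keyed on the chat container's own width.
function containerKeyedPadding(containerWidth: number, breakpointPx: number): number {
  return containerWidth >= breakpointPx ? 0 : PX_4;
}

// The repro: a ~580px chat pane on an 820px viewport.
const oldPadding = viewportKeyedPadding(820); // 0: text sits flush
const naiveSwap = containerKeyedPadding(580, CONTAINER_SM_PX); // 0: still flush
const fixedPadding = containerKeyedPadding(580, CONTAINER_3XL_PX); // 16: padding retained
const widePane = containerKeyedPadding(900, CONTAINER_3XL_PX); // 0: column has gutters
```

The `naiveSwap` case is the reason a mechanical `sm:` to `@sm:` rename would not have fixed the repro.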
> Vue was not named in the ticket scope but shares the identical `sm:px-0` pattern; left unfixed it would reproduce the bug there, so it is included for true framework lockstep. web-components only *hosts* the chat (no `sm:px-0`), so it is correctly untouched.

## Testing

**Browser behavior (real built CSS, exact DOM: `copilotKitChat`/`@container` root → `cpk:max-w-3xl cpk:mx-auto` → `cpk:px-4 cpk:@3xl:px-0` message wrapper), `getComputedStyle().paddingLeft`:**

| Scenario | Result | Expectation |
|---|---|---|
| **580px narrow pane** (the repro) | `padding-left: 16px` | ✅ `px-4` retained, text no longer flush |
| **900px wide pane** (full desktop) | `padding-left: 0px` | ✅ `px-0`, column has gutters, behavior preserved |
| **No `@container` ancestor** (render-prop path) | `padding-left: 16px` | ✅ graceful `px-4`, no flush |

**Generated CSS confirmed** (Tailwind v4.1.18): root emits `container-type: inline-size`; wrapper emits `@container (min-width: 48rem) { padding-inline: 0 }` (verified in both react-core and angular builds; a true container query, not a media query).

**Unit/component tests (pass):**

- react-core: full chat suite, **645 tests / 39 files** green (incl. `CopilotChatCssClasses`).
- angular: `copilot-chat-view` + `copilot-chat-input` specs, **10** green.
- vue: `CopilotChatView.connectingGate` + `CopilotChatSuggestionView.slots.e2e`, **30** green.
- pre-commit `test-and-check-packages` (test + publint + attw across all 4 affected projects) passed.

No tests assert these class strings and no snapshots capture them, so nothing needed regenerating.

## Acceptance criteria

- [x] Messages retain horizontal padding when the chat is in a narrow (<~768px) pane at viewports ≥640px.
- [x] Full-width chat behavior preserved (container ≥768px → `px-0`), verified in React, Angular, Vue.
- [x] No regression to input / disclaimer / suggestion alignment with the message column (all share the same `@3xl` switch).
Closes ENT-1020 🤖 Generated with [Claude Code](https://claude.com/claude-code)
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )