[pull] main from CopilotKit:main#37

Open
pull[bot] wants to merge 8302 commits into MervinPraison:main from CopilotKit:main

@pull pull Bot commented Jan 7, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull Bot locked and limited conversation to collaborators Jan 7, 2026
@pull pull Bot added the ⤵️ pull label Jan 7, 2026
BenTaylorDev and others added 25 commits June 24, 2026 08:44
…eads

Restores an archived thread via the existing generic PATCH /threads/:id
update path with { archived: false } — the same mechanism example apps
already use for restore — so no new runtime route is required. Mirrors
archiveThread across the core thread store and the v2 useThreads hook.
…ells

Takes d6:strands and d6:strands-typescript from 32/35 to 34/35.

- shared-state-read: the turn-2 fixture leg wrongly pinned turnIndex:0, so the
  aimock matcher skipped it on turn 2 -> 404 -> turn-2 sse-missing. Drop
  turnIndex to mirror the langgraph-python gold-standard fixture.
- multimodal: sample.png/pdf/wav shipped as git-LFS pointers, so deploy/test
  environments without 'git lfs pull' served the ~130-byte pointer text as the
  upload -> the run never started (runsFinished=0). Ship them as regular
  binaries via a per-integration .gitattributes lfs-unset + real bytes,
  mirroring langgraph-python's convention.

Remaining red (gen-ui-declarative) is a Strands A2UI-dynamic run-completion bug
(reproduces on real-LLM staging too): the surface paints but generate_a2ui
never completes, so the run hangs 'Running'. Tracked separately.
…ery demo (#5671)

## What

Brings the google-adk showcase **A2UI** demos to D6 parity with the
langgraph-python / AWS Strands gold standard.

- **Auto-inject**: `declarative-gen-ui` switches from backend-owned
(`get_a2ui_tool`, `injectA2UITool: false`) to runtime auto-injection
(`injectA2UITool: true`, plain agent). The `ag-ui-adk` 0.7.0 adapter
auto-injects `generate_a2ui` via `plan_a2ui_injection`, matching how
langgraph-python and Strands wire the same demo. First time ADK's
adapter auto-injection is exercised in the showcase.
- **Recovery un-red**: removes the "known-failing on purpose" mark from
the `a2ui-recovery` heal e2e and the OSS-374 inner-render-disambiguation
notes (spec, fixture `_meta`, QA doc, agent docstring, route comment).
That premise was stale: on `ag-ui-adk` 0.7.0 the adapter forwards the
run conversation into the inner `render_a2ui` call, so each pill's last
user turn is its own prompt and aimock selects the right per-pill
fixture. Confirmed against the aimock journal.

`a2ui-recovery` stays backend-owned (the only path that surfaces the
recovery loop) and remains ADK-only (OSS-375 tracks langgraph-python
parity).

## Verification

Canonical D6 driver (`showcase test google-adk --d6 --isolate`), per
pill:

- `declarative-gen-ui`: all 4 pills pass under auto-inject
(sales-dashboard metric>=4 + pie + bar; team-performance data-table +
bar; at-risk status-badge>=3 + metric>=3; top-account info-row + pie)
- `a2ui-fixed-schema`: pass
- `beautiful-chat` (5 sub-pills): pass
- `a2ui-recovery` Playwright spec: 3/3 (page-load, heal, exhaust)
- aimock journal confirms `generate_a2ui` was adapter-injected (the
agent declares no tools) then drove the inner `render_a2ui`.

## Scope and honest status

This is **A2UI D6 parity**, not full-slug D6. The google-adk D6
aggregate is still red (33/39): six **non-A2UI** cells fail and are
untouched by this PR: `shared-state-write`,
`tool-rendering-custom-catchall`, `subagents`,
`tool-rendering-reasoning-chain`, `gen-ui-interrupt`,
`gen-ui-headless-complete`. They are pre-existing (D6 is not CI-gated
for google-adk) and out of scope here.
…reads (#5624)

## Summary

Adds `unarchiveThread(id)` to the v2 thread store (`@copilotkit/core`)
and the `useThreads` hook (`@copilotkit/react-core/v2`), restoring an
archived thread via the existing generic `PATCH /threads/:id { archived:
false }` update path — no new runtime endpoint. Mirrors `archiveThread`
across the store and hook.
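
A minimal sketch of the restore path described above. The helper names (`buildThreadPatch`, `archiveThreadRequest`, `unarchiveThreadRequest`) and the `baseUrl` parameter are hypothetical and are not the actual `@copilotkit/core` internals; the sketch only illustrates that archive and unarchive can share one `PATCH /threads/:id` request and differ in the `archived` flag.

```ts
// Hypothetical request shape for illustration; not the real store internals.
interface ThreadPatchRequest {
  method: "PATCH";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the generic thread update request. `baseUrl` is an assumption.
function buildThreadPatch(
  baseUrl: string,
  threadId: string,
  patch: { archived: boolean },
): ThreadPatchRequest {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    method: "PATCH",
    url: `${root}/threads/${encodeURIComponent(threadId)}`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(patch),
  };
}

// Archive and unarchive share one code path and differ only in the flag.
const archiveThreadRequest = (baseUrl: string, id: string) =>
  buildThreadPatch(baseUrl, id, { archived: true });
const unarchiveThreadRequest = (baseUrl: string, id: string) =>
  buildThreadPatch(baseUrl, id, { archived: false });
```

Because both directions go through the same builder, no additional runtime route is needed for restore in this sketch.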

This is the durable, architecture-independent piece extracted from the
threads-drawer effort. The drawer UI itself is being restarted as a
framework-agnostic **CopilotDrawer** (Lit web component + React/Angular
wrappers) under a separate spec; this hook method stands on its own and
is needed regardless.

## Testing

TDD. New core store test (`PATCH … { archived: false }`) and
`useThreads` hook test; full suites green (core, react-core).
…5635) (#5637)

Fixes #5635.

## What

Headers set directly on an `HttpAgent` registered via
`agents__unsafe_dev_only` were silently replaced by the provider
headers. Per-agent auth headers (like an `Authorization` for a
self-hosted backend) got dropped, causing 401s.

## Why

`AgentRegistry.applyHeadersToAgent` did `agent.headers = {
...core.headers }`, a full overwrite. The run handler and the react-core
`useAgent` hook did the same. So an agent built with its own headers
lost them on registration, on every `setHeaders`, and before each
request.

## Fix

Merge instead of replace. The registry captures each agent's own headers
once (in a WeakMap, before the first apply) and rebuilds `{
...ownHeaders, ...coreHeaders }`. Core wins on key conflicts, which
keeps the existing "provider headers are authoritative" and logout/clear
behavior. All header application now routes through one method,
`CopilotKitCore.applyHeadersToAgent`, so runs never clobber per-agent
headers.

Vue and Angular benefit too: they dispatch runs through `core.runAgent`
/ `connectAgent`, so the merge is re-applied before every request.
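
The merge described in the Fix section can be sketched as follows. This is an illustration under stated assumptions, not the actual `AgentRegistry` code: `HeaderMergingRegistry`, `AgentLike`, and `HeaderMap` are hypothetical names, and the real registry may track agents differently.

```ts
type HeaderMap = Record<string, string>;

// Hypothetical stand-in for an HttpAgent; only the headers field matters here.
interface AgentLike {
  headers: HeaderMap;
}

class HeaderMergingRegistry {
  // Own headers are captured once per agent, before the first apply, so a
  // later merge never mistakes provider headers for agent-owned ones.
  private readonly ownHeaders = new WeakMap<AgentLike, HeaderMap>();
  private coreHeaders: HeaderMap = {};

  setHeaders(headers: HeaderMap, agents: AgentLike[]): void {
    this.coreHeaders = { ...headers };
    for (const agent of agents) this.applyHeadersToAgent(agent);
  }

  applyHeadersToAgent(agent: AgentLike): void {
    let own = this.ownHeaders.get(agent);
    if (own === undefined) {
      own = { ...agent.headers };
      this.ownHeaders.set(agent, own);
    }
    // Core (provider) headers win on key conflicts.
    agent.headers = { ...own, ...this.coreHeaders };
  }
}
```

Rebuilding from the captured snapshot on every apply, instead of merging into the current `agent.headers`, is what lets a later clear of the provider headers remove them without leaving stale keys behind.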

## Tests

- core: 3 new cases in `core-headers.test.ts` (preserve, merge,
retain-across-setHeaders); existing overwrite and clear tests still
pass.
- react-core: new `use-agent-provider-headers.e2e.test.tsx` with a real
provider and an HttpAgent that has its own headers.

Verified locally: format, lint, full core + react-core suites, and both
builds.
- @ag-ui/aws-strands 0.2.2 -> 0.2.3 (strands-typescript)
- ag_ui_strands 0.2.1 -> 0.2.2 (strands)

These releases carry the A2UI-dynamic (declarative-gen-ui) run-completion fix:
the auto-injected generate_a2ui now completes after the A2UI surface paints,
so the run emits RUN_FINISHED instead of hanging 'Running'. Should green the
gen-ui-declarative D6 cell on both integrations (-> 35/35) and resolve the
real-LLM staging hang. Pending local D6 re-verify.
Fail loud on a dropped #oss-alerts page: the failure-alert cross-post no longer
swallows a 200/ok:false Slack response, so a dropped page-the-humans alert reds
the renderer job instead of vanishing on a green run. The thread reply stays
warn-only. Both posts capture the response via a shared slack_alert_posted_ok
predicate, mirrored byte-identically across the live workflow and the dry-run
helper.

Debt cleanup: drop a dead failed_count var, correct a misleading gha_url comment,
and validate the decoded blob run_id against ^[0-9a-f]{6}$ in the render step so
a malformed run_id can't reach Slack or the run name.

Tests: predicate edge cases (non-JSON, malformed, missing/null ok), an anti-drift
parity guard asserting the predicate is identical in both files, and call-site
tests locking the #oss-alerts fail-loud vs thread warn-only exit semantics.
…es at the model-call boundary

Flatten AG-UI attachment content into native pydantic-ai content types in
the OUTGOING request only, via a WrapperModel-scoped flatten rather than a
history_processor (so the flattened content never persists into ctx.state.message_history
or leaks into UI state). Normalize mime types, gate on supported content types,
and degrade unsupported types at the single emission choke point. Fixes the
_map_user_prompt assert_never crash on raw AG-UI multimodal content.
… interrupt + headless

Brings google-adk to 39/39 D6 (reproduced across two independent full-matrix
runs, zero regressions). Four changes:

- entrypoint.sh: remove ADK_DISABLE_PROGRESSIVE_SSE_STREAMING=1. That flag's
  non-progressive aggregation path ended ADK's agentic loop after the first
  tool round (no post-tool LLM re-invoke), which broke every demo needing a
  second turn: the subagents chain (research -> writing -> critique),
  tool-rendering-reasoning-chain (AAPL -> MSFT), shared-state-read-write's
  confirmation, and the custom-catchall narration. The partial-event abort it
  guarded against is already handled in-callback by stop_on_terminal_text.
- manifest.yaml: un-skip-list tool-rendering-reasoning-chain (now passes with
  the loop restored).
- headless_complete_agent.py: add AGUIToolset() so the frontend highlight_note
  tool is injected and routed to the browser. Removing the flag unmasked this
  pre-existing gap — turn 3 dispatched highlight_note server-side and the
  backend registry rejected it. langgraph-python auto-injects frontend tools;
  ADK needs AGUIToolset() in the agent's tools list.
- aimock/d6/google-adk/gen-ui-interrupt.json: order each pill's narration leg
  (toolCallId) before its emit leg and drop the thread-global hasToolResult
  gate, so the alice pill no longer 503s after the sales pill leaves a tool
  result in the thread.
## Summary

Hardening + pre-existing-debt cleanup of the showcase promote-notify
Slack renderer (follow-up to #5657, scoped to deferred review items).

**Fail loud on a dropped page-the-humans alert.** The #oss-alerts
failure cross-post used to pipe the Slack API response to `/dev/null`,
so a `200`/`{"ok":false,"error":"channel_not_found"}` silently dropped
the alert that fires when a promote *failed* — on a green job nobody
would notice. Both posts now capture the response via a shared
`slack_alert_posted_ok` predicate; the #oss-alerts page fails the
renderer job loud on a dropped delivery, while the informational thread
reply stays warn-only. The predicate is mirrored byte-identically across
the live workflow and the dry-run helper.

**Debt cleanup.** Removed a dead `failed_count` var, corrected a
misleading `gha_url` comment, and added `^[0-9a-f]{6}$` validation on
the decoded blob `run_id` in the render step so a malformed run_id can't
reach Slack or the run name.
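
The real `slack_alert_posted_ok` predicate lives in the workflow's shell code. The TypeScript below is only a sketch of the same logic for illustration: the function names are hypothetical, and it assumes the Slack response body is available as a string.

```ts
// True only when the body parses as JSON and carries ok === true.
// A 200 response with {"ok":false,...} therefore counts as a dropped post.
function slackAlertPostedOk(responseBody: string): boolean {
  try {
    const parsed = JSON.parse(responseBody) as unknown;
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { ok?: unknown }).ok === true
    );
  } catch {
    return false; // non-JSON or malformed body
  }
}

// Same pattern the render step uses for the decoded blob run_id.
const RUN_ID_PATTERN = /^[0-9a-f]{6}$/;

function isValidRunId(runId: string): boolean {
  return RUN_ID_PATTERN.test(runId);
}
```

In this sketch the #oss-alerts call site would exit non-zero when `slackAlertPostedOk` returns false, while the thread-reply call site would only log a warning.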

## Test plan
- [x] `promote-notify.bats` 11/11 — predicate edge cases (non-JSON,
malformed, missing/`null` ok), anti-drift parity guard (predicate
identical in both files), and call-site tests locking the #oss-alerts
fail-loud vs thread warn-only exit semantics
- [x] red→green proven: re-adding `|| true` to the #oss-alerts call-site
flips the call-site test red
- [x] shellcheck clean on the dry-run helper
… (kill probe false-red flaps) (#5649)

## Summary

Makes the showcase harness probe's turn-done signal **reliable**,
killing the dominant class of dashboard false-red flaps without ever
hiding a real failure.

`waitForTurnComplete` previously relied on a fragile SSE fetch-counter
conjunct that false-reds healthy demos whenever the page-side fetch
wrapper missed the runtime URL/transport. This change makes the
**`data-copilot-running` DOM attribute** (driven directly by the agent
run lifecycle, `RUN_STARTED`→true / `RUN_FINISHED`→false,
transport-independent) the **PRIMARY** done-signal, with the SSE counter
demoted to a **headless-only fallback** (headless demos never render
`CopilotChatView`, so the attribute is absent).

Design (all three preserved — no false-green, no false-red, hangs still
red):
- **Primary signal** = the `data-copilot-running` true→false
**transition** with a **stayed-stopped quiescence window** (a stop must
persist on the same run-start count for `settleMs`; a new sub-run resets
it) — so it cannot complete on an intermediate stop in a multi-step
turn.
- **SSE counter** = headless fallback only; never an OR-trigger when the
DOM signal is present.
- **`done-signal-missing` backstop** (gated on `attrPresent===true` +
`runningNow!==true`) reds a genuine painted-but-never-finished DOM turn
before the hard timeout; headless turns use their full timeout for their
only signal.
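
The stayed-stopped quiescence window for the primary signal can be sketched as a pure function over polled samples. This is an illustration under assumptions, not the harness's `waitForTurnComplete`: `RunSample` and `findTurnDone` are hypothetical names, and the sample array stands in for polling `data-copilot-running` and the run-start count.

```ts
interface RunSample {
  atMs: number;          // sample timestamp
  running: boolean;      // value of data-copilot-running at that time
  runStartCount: number; // number of run starts observed so far
}

// Returns the timestamp at which the turn counts as done, or null if the
// samples never show a stop that persisted for settleMs.
function findTurnDone(samples: RunSample[], settleMs: number): number | null {
  let sawRunning = false;
  let stoppedAt: number | null = null;
  let stoppedCount = -1;
  for (const s of samples) {
    if (s.running) {
      sawRunning = true;
      stoppedAt = null; // a new sub-run resets the window
      continue;
    }
    if (!sawRunning) continue; // require a true -> false transition
    if (stoppedAt === null || s.runStartCount !== stoppedCount) {
      // First stopped sample, or a sub-run started and ended between polls.
      stoppedAt = s.atMs;
      stoppedCount = s.runStartCount;
    }
    if (s.atMs - stoppedAt >= settleMs) return s.atMs;
  }
  return null;
}
```

Keying the window on the run-start count as well as the attribute value means a short sub-run that begins and ends between two polls still restarts the window in this sketch.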

## How it was reviewed

A full 4-round `cr-loop` (7 unbiased agents/round + confirmation rounds
+ a Procedure-3 promotion audit) caught and fixed **5 distinct
correctness defects** in the implementation before merge:
- **F1** — SSE OR-trigger could complete a multi-step turn early on an
intermediate stop (false-GREEN), in both the loop and the post-loop
classifier.
- **F2** — the run-start baseline was captured *after* the message send,
killing the primary signal on fast turns (false-RED).
- **F3** — non-atomic double `surfaceReady` read per poll (latent hazard
+ wasted round-trip).
- **F4** — the surface-mount (`completeOnMount`) path had no quiescence
window (false-GREEN on intermediate stop + false-RED on a still-running
gen-UI turn).
- **F5** — the early backstop false-redded slow-but-healthy **headless**
turns (now gated on the DOM signal).

Bidirectional red-green tests for F1–F5 plus a systematic `{DOM,
headless} × {completes, lagging-recovers, genuine-hang} × {text,
surface}` completion/backstop matrix. Full harness unit suite: **3173
passed / 18 skipped / 0 failed**; `tsc --noEmit` clean; lint 0 errors;
build clean.

## Known follow-ups (NOT in this PR — pre-existing / non-blocking)

- **Theoretical edge (not reachable on real or realistically-streamed
turns):** if a run completed within a single synchronous microtask
(zero-duration), the page-side MutationObserver could miss the true edge
while `attrPresent===true` → false-red. Real LLM turns and aimock
realistic-streaming hold the attribute true across many event-loop
ticks, so the observer reliably latches it. A naive "re-add SSE fallback
for DOM-present" fix would reintroduce F1's multi-step false-green, so
it's intentionally not done here.
- **Recommended quick follow-up (latency only, no wrong verdict):**
capture `baselineBannerText` pre-`sendTurnMessage` (mirroring the
run-start/count baselines) so a fast-erroring cold-start turn fast-fails
(#5142) instead of burning the full timeout.
- **Pre-existing sse-interceptor capture/counter internals** (none
load-bearing for the new done-signal; verified STAY_IN_C by the
Procedure-3 audit): page-side counter soft-nav/multi-capture reset,
`__hk_fetchWrapped` pattern reuse + hardcoded fallback, g/y-flag
stateful RegExp, TextDecoder end-of-stream flush, bare-catch
reader-error swallow, framenav payload discard/TOCTOU,
CDP-wallTime-vs-Date.now TTFT, addInitScript/close-listener
re-registration accumulation.

## Test plan

- [x] `pnpm test` (harness) — 3173 passed / 18 skipped / 0 failed
- [x] `tsc --noEmit` exit 0, lint 0 errors, build exit 0
- [ ] Verify on staging that auth / prebuilt-sidebar / claude-sdk-tools
(and other previously-flapping cells) stop false-redding while
genuinely-broken cells stay red

Please review the replay/primary-signal approach. Not auto-merging.
…03 flap) (#5661)

## Summary
- The pydantic-ai `generate_a2ui` declarative D6 turn was missing an
aimock fixture, producing HTTP 503 `no_fixture_match` on staging
(pydantic-ai 503 vs ms-agent-dotnet 200 for the same turn) — a source of
dashboard flapping.
- Adds the canonical mirror fixtures (outer `generate_a2ui` + matching
inner `_design_a2ui_surface`) to
`showcase/aimock/d6/pydantic-ai/gen-ui-declarative.json`. These are
deterministic canonical mirrors matching the langgraph-python convention
— **not** a non-deterministic real-LLM recording — preserving the
mandatory LGP 1:1 parity.

## Red-green proof
- **RED:** exact failing request (`POST /v1/responses`, gpt-4.1, "Show
me my sales dashboard for this quarter.", tools=[`generate_a2ui`],
header `x-aimock-context: pydantic-ai`, strict) against the pre-fix
fixture set → **HTTP 503 `no_fixture_match`** (reproduces staging
exactly; confirmed live on staging too).
- **GREEN:** same request against the new set → **HTTP 200** SSE
emitting the `generate_a2ui` tool call; the inner `_design_a2ui_surface`
turn also returns 200 with the dashboard surface.
- Independently re-verified. `validate-on-load` clean (no fixture
shadowing); existing pydantic-ai D6 turns (KPI/pie/bar/status) still
match identically — no regression.

## Notes
- No credentials in the committed fixture — the OpenAI key was never
even resolved (canonical mirror, not a recording). Credential scan of
the diff + full blob: zero matches.

## Test plan
- [ ] CI green
- [ ] After deploy, confirm the `generate_a2ui` declarative D6 cell
flips red→green on staging
…es (fix assert_never) (#5675)

## Summary
The pydantic-ai showcase integration crashed with `assert_never` in
pydantic-ai's `_map_user_prompt` whenever AG-UI multimodal
`InputContent` (images / documents / binary, data: and url: sources)
reached the model — AG-UI content types were never normalized to the
native pydantic-ai types (`str` / `ImageUrl` / `BinaryContent` /
`DocumentUrl`) the mapper requires.

## What changed
- **`_MultimodalFlattenModel(WrapperModel)`** normalizes content at the
**model-call boundary** (overriding `request` / `request_stream` /
`count_tokens`) — deliberately NOT a `history_processor`, because that
hook persists its return into `message_history` and would leak flattened
content back to the UI.
- **Supported-type gating + degrade centralized at the single
native-type emission choke point** (instead of scattered
per-content-branch): unsupported image subtypes (HEIC/SVG/TIFF/BMP),
audio/video, and non-fetchable url attachments degrade to a text
placeholder rather than emitting a native type the OpenAI Responses
vision API rejects (which would fail the turn).
- **Mime normalized once** — strips RFC-2045 params/whitespace,
lowercases, and aliases the common non-canonical `image/jpg` →
`image/jpeg` before the allow-list (png/jpeg/gif/webp) test, so real
JPEGs are no longer silently dropped.
- Identity-based (`is`) no-op detection replaces fragile structural
`==`.
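
The mime normalization and supported-type gate can be sketched as below. The actual change is Python inside `_MultimodalFlattenModel`; this TypeScript version is only an illustration of the described steps, and `normalizeMime`, `flattenImage`, `FlattenedPart`, and the placeholder wording are all hypothetical.

```ts
const SUPPORTED_IMAGE_TYPES = new Set([
  "image/png",
  "image/jpeg",
  "image/gif",
  "image/webp",
]);
const MIME_ALIASES = new Map<string, string>([["image/jpg", "image/jpeg"]]);

// Strip parameters and whitespace, lowercase, then apply known aliases.
function normalizeMime(raw: string): string {
  const semi = raw.indexOf(";");
  const base = (semi === -1 ? raw : raw.slice(0, semi)).trim().toLowerCase();
  return MIME_ALIASES.get(base) ?? base;
}

type FlattenedPart =
  | { kind: "image"; mediaType: string }
  | { kind: "text"; text: string };

// Single choke point: emit a native image part only for allow-listed types,
// otherwise degrade to a text placeholder (wording is an assumption).
function flattenImage(rawMime: string): FlattenedPart {
  const mediaType = normalizeMime(rawMime);
  if (SUPPORTED_IMAGE_TYPES.has(mediaType)) {
    return { kind: "image", mediaType };
  }
  return { kind: "text", text: `[unsupported attachment: ${mediaType}]` };
}
```

Normalizing once before the allow-list test is what keeps `image/jpg` and parameterized types such as `image/png; q=1` from being dropped in this sketch.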

## Why it matters
Unblocks multimodal turns in the pydantic-ai showcase integration;
eliminates the `assert_never` crash and stops valid `image/jpg` JPEGs
and url-borne documents from being silently mishandled.

## Testing
- **43 unit tests** (clean pinned venv, pydantic-ai 1.0.18), red-green
proven per gap: `image/jpg` forwarded as a supported JPEG; url-media /
non-PDF-doc degrade instead of emitting an unconditional `DocumentUrl`;
parameterized mime forwarded; state-leak guard (`request_stream`
forwards flattened, not raw); `count_tokens` override; `assert_never`
provably unreachable (all flatten paths return a native type or raise).
- 9 rounds of code review (7 agents/round) to a clean confirmation round
(zero blocking findings).

## Test plan
- [ ] CI green
…ost-tool re-invoke loop (#5676)

## What

Fixes the 6 remaining red D6 cells for the **google-adk** showcase
integration, bringing it to **39/39 on the D6 matrix** (under aimock
replay, reproduced across two independent full-matrix runs, no in-matrix
regression). See the scope/caveats section before reading this as "fully
done."

## Root cause + changes

**1. `entrypoint.sh` — remove `ADK_DISABLE_PROGRESSIVE_SSE_STREAMING=1`
(the big one).**
A/B tested in-container: with the flag set, ADK's non-progressive
aggregation path ends the agentic loop after the first tool round — no
post-tool LLM re-invoke. That broke every demo needing a second turn
after a tool result:
- `subagents` (research → writing → critique never chained past
research)
- `tool-rendering-reasoning-chain` (AAPL → MSFT)
- `shared-state-read-write` (the post-`set_notes` confirmation text)
- `tool-rendering-custom-catchall` (the post-tool narration)

The flag was added to dodge an intermittent "last event is partial"
abort on tool-rendering; the in-callback `stop_on_terminal_text` guard
(shared by every agent) is intended to cover that. See caveat 1 — that
guard's sufficiency is verified under aimock only, not against real
Gemini.

**2. `manifest.yaml` — un-skip-list `tool-rendering-reasoning-chain`.**
It was marked `not_supported` (vacuous-green) only because of the loop
gap; it passes now.

**3. `headless_complete_agent.py` — add `AGUIToolset()`.** Removing the
flag unmasked a pre-existing gap: the frontend `highlight_note` tool was
never injected/routed, so turn 3 dispatched it server-side and the
backend registry rejected it (`Tool 'highlight_note' not found`).
langgraph-python auto-injects frontend tools; ADK needs `AGUIToolset()`
in the agent's `tools` list (every other frontend-tool google-adk agent
has it).

**4. `aimock/d6/google-adk/gen-ui-interrupt.json` — re-key.** The alice
pill 503'd: its emit leg was gated `hasToolResult:false`, but the
earlier sales pill leaves a tool result in the thread (thread-global).
Re-ordered each pill's narration (`toolCallId`) leg before its emit leg
and dropped the `hasToolResult` gate, mirroring the reasoning-chain
fixture pattern.

## Verification

`showcase test google-adk --d6 --isolate` (full per-pill matrix), run
twice on separate isolated stacks:
- Both: `passed=39, failed=0, total=39, state=green`.
- The 6 previously-red cells all pass: shared-state-write,
tool-rendering-custom-catchall, subagents,
tool-rendering-reasoning-chain, gen-ui-interrupt (sales + alice),
gen-ui-headless-complete.
- No regression: all previously-passing cells (incl. all
`tool-rendering*` and A2UI: gen-ui-declarative, gen-ui-a2ui-fixed,
beautiful-chat) stay green.

## Scope and caveats (read before merging)

1. **The flag removal is verified under aimock only, NOT against real
Gemini.** `ADK_DISABLE_PROGRESSIVE_SSE_STREAMING=1` was originally added
for a *real-Gemini intermittent* "last event is partial" abort. D6 is
deterministic aimock replay and cannot reproduce that intermittent
condition, so the "0 partial-aborts" result does not prove the
`stop_on_terminal_text` guard is sufficient against real Gemini. There
is a real (unmeasured) risk this re-introduces the partial-abort on
real-LLM tool-rendering runs. Recommend a real-Gemini smoke of the
tool-rendering demos before relying on this in a real-LLM context.
2. **"39/39" is the in-matrix set, not every feature.** Three features
remain `not_supported` and are excluded from the matrix (not fixed):
`interrupt-headless`, `reasoning-default-render`,
`agentic-chat-reasoning`.
3. **Local verification only.** D6 is not CI-gated for google-adk;
evidence is two isolate runs on one machine, not an independent CI
signal.
Wire the strands-typescript showcase integration for staging deployment,
mirroring how the Python strands integration is deployed.

- manifest: flip deployed: true so the shell lists it in the integration menu
- railway-envs.ts: add showcase-strands-typescript SSOT entry (staging-only
  for now: prod instance not yet provisioned, so it omits the prod env and is
  gateIgnore'd until promoted dual-env); regenerate railway-envs.generated.json
- showcase_build.yml + showcase_build_check.yml: add the strands-typescript
  build matrix entry, change-detection filter, and dispatch option (railway_id
  is the new Railway service id)
- golden fixture + image-ref-gate inventory tests updated for the new service

Railway staging service showcase-strands-typescript provisioned
(showcase-strands-typescript-staging.up.railway.app, health /api/health,
OpenAI-via-aimock env). Prod is added later via the promote pipeline.
…32→34/35) (#5673)

## Summary

Greens two of the three failing D6 (e2e-full) cells for **both** the
`strands` (Python) and `strands-typescript` integrations: **32/35 →
34/35** each.

## Fixes

- **shared-state-read** — the turn-2 fixture leg wrongly pinned
`turnIndex: 0`, so the aimock matcher skipped it on turn 2 ("candidate
fixture skipped by sequence/turn state") → 404 → turn-2 `sse-missing`.
Dropped `turnIndex` to mirror the langgraph-python gold-standard fixture
(whose turn-2 leg omits it).
- **multimodal** — `sample.png/pdf/wav` were committed as **git-LFS
pointers**, so deploy/test environments without `git lfs pull` served
the ~130-byte pointer text as the uploaded file → the agent run never
started (`runsFinished=0`). Now shipped as regular binaries via a
per-integration `.gitattributes` lfs-unset + real bytes, mirroring
langgraph-python's existing convention ("must stay as regular binaries
... so deploy environments without `git lfs pull` serve the actual
files").
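
A guard against the multimodal failure mode above could look like the following sketch. It assumes the standard Git LFS pointer format, whose first line is `version https://git-lfs.github.com/spec/v1`; the function name and the size cutoff are illustrative assumptions, not part of this change.

```ts
const LFS_POINTER_PREFIX = "version https://git-lfs.github.com/spec/v1";

// Heuristic check that a supposed media file is really a Git LFS pointer.
// Pointer files are tiny ASCII text; the 1024-byte cutoff is an assumed bound.
function looksLikeGitLfsPointer(bytes: Uint8Array): boolean {
  if (bytes.length > 1024 || bytes.length < LFS_POINTER_PREFIX.length) {
    return false;
  }
  for (let i = 0; i < LFS_POINTER_PREFIX.length; i++) {
    if (bytes[i] !== LFS_POINTER_PREFIX.charCodeAt(i)) return false;
  }
  return true;
}
```

A fixture-loading test could call this on `sample.png`, `sample.pdf`, and `sample.wav` and fail fast when any of them is still a pointer, instead of letting the run silently never start.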

Both fixes verified locally: `showcase/bin/showcase test
{strands,strands-typescript} --d6 --direct --rebuild --isolate` → 34/35
each, shared-state-read + multimodal green.

## Known remaining red (out of scope here)

**gen-ui-declarative** (A2UI dynamic) stays red on both. The A2UI
surface **paints correctly** (real dashboard data), but `generate_a2ui`
never completes, so the run hangs "Running" (`runsFinished=0`). This
reproduces on **real-LLM staging** too, so it is not a fixture/aimock
artifact — it is a Strands A2UI-dynamic run-completion issue. The
fixtures and frontend are byte-identical to langgraph-python's (which
passes), pointing at the Strands adapter's auto-inject completion path.
Tracked separately for an adapter-level fix.
ranst91 and others added 30 commits June 29, 2026 18:54
Follow-up to #5720 (merged). That PR added the **A2UI Error Recovery**
demo (`a2ui-recovery`) across google-adk +
langgraph-{python,fastapi,typescript} + strands{,-typescript}, but the
recovery flow was only exercised by the manual on-demand workflow
(`/test-aimock`). It was NOT covered by the per-PR d5/d6 fleet harness,
so a regression in heal/exhaust would not turn any automatic CI cell
red.

## What this does

Adds a `d5-a2ui-recovery` probe and wires it into the harness so
`a2ui-recovery` runs on every PR like `gen-ui-declarative`:

- **New probe**
`showcase/harness/src/probes/scripts/d5-a2ui-recovery.ts` drives both
pills in one session and asserts the stable end-states:
- **HEAL**: the healed surface paints (>= 2 newly-mounted
`declarative-metric` tiles) and NO "Couldn't generate the UI" card.
- **EXHAUST**: the hard-failure "Couldn't generate the UI" card appears
and NO surface paints.
- Assertions are delta-based against a pre-send baseline so the two
mutually-exclusive negatives stay correct across the shared two-turn
session. The transient "Retrying..." label is not asserted
(timing-flaky).
- **Per-slug prompts**: the recovery prompts are unique per integration
slug (the inner `render_a2ui` calls carry no `x-aimock-context`, so
identical prompts would collide in the shared aimock matcher). The probe
sends each slug's exact `suggestions.ts` message as typed input, which
is byte-identical to the pill dispatch and matches the same fixture.
Heal is sent exactly once (its fixture stages invalid->valid via
`sequenceIndex` 0->1).
- **Wiring**: register `a2ui-recovery` in `d5-registry`, map it in
`d5-feature-mapping` (`"a2ui-recovery": ["a2ui-recovery"]`), add its
representative fixture in `d5-representatives`, and mirror the mapping
in the dashboard `CATALOG_TO_D5_KEY` (kept in lock-step by the drift
test).
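
The delta-based assertions can be sketched as a pure check over element counts taken before and after each pill is sent. `SurfaceCounts` and `recoveryFailures` are hypothetical names for illustration, and the real probe reads these counts from the page instead of receiving them as arguments.

```ts
interface SurfaceCounts {
  metricTiles: number; // mounted declarative-metric tiles
  errorCards: number;  // "Couldn't generate the UI" cards
}

// Compare against the pre-send baseline so earlier turns in the same
// session cannot satisfy or break the current turn's assertions.
function recoveryFailures(
  expected: "heal" | "exhaust",
  baseline: SurfaceCounts,
  after: SurfaceCounts,
): string[] {
  const newTiles = after.metricTiles - baseline.metricTiles;
  const newErrors = after.errorCards - baseline.errorCards;
  const failures: string[] = [];
  if (expected === "heal") {
    if (newTiles < 2) failures.push(`expected >= 2 new tiles, got ${newTiles}`);
    if (newErrors > 0) failures.push(`unexpected error card (${newErrors})`);
  } else {
    if (newErrors < 1) failures.push("expected a hard-failure card");
    if (newTiles > 0) failures.push(`unexpected surface paint (${newTiles} tiles)`);
  }
  return failures;
}
```

With absolute counts, the exhaust turn would wrongly fail whenever the heal turn had already painted tiles in the shared session; the deltas avoid that.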

## Verification

Local fleet harness, both recovery paths green (heal + exhaust):

- `showcase test langgraph-python:a2ui-recovery --d6 --isolate` -> green
(backend-owned `get_a2ui_tools` path)
- `showcase test strands:a2ui-recovery --d6 --isolate` -> green
(auto-inject middleware path)

Harness unit tests pass (`d5-representatives`, `d5-mapping-drift`,
`starter-mapping-drift`); harness typecheck clean.

## Note (separate, pre-existing)

The on-demand workflow `test_e2e-showcase-on-demand.yml` is currently
broken for all slugs: it installs `@copilotkit/aimock@^1.16.4` then
invokes `aimock --fixtures ...`, but the resolved aimock CLI no longer
accepts `--fixtures` (`Error: Unknown option '--fixtures'`, only
`-c/--config`). aimock dies at the "Start aimock" step before any test
runs. This is unrelated to the recovery demo and is tracked separately.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…26-03-17), no gpt-4o default

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ted memory in oracle showcase

Ports two fixes from the canonical oracle-cookbook demo into the oracle-agent-memory
showcase agent (server.py was byte-identical to the demo's pre-fix version):

1. Dangling tool-calls: booking conversationally calls the book_flight HITL tool,
   which interrupts and emits an assistant tool_call awaiting the UI's Confirm/Cancel.
   If the traveler sends another chat message instead, the unanswered tool_call made
   the next turn 400 ("tool_call_ids did not have response messages"). _repair_dangling_tool_calls
   synthesizes a "not completed" tool result for any dangling call before the history
   reaches the graph. (Inverse of the duplicate-tool-block issue the history-replace
   already handled — documented in docs/known-issues/agentspec-multiturn-toolcall-correlation.md.)

2. HTML-escaped persistence: the agentspec exporter HTML-escapes streamed deltas
   (& < > -> &amp; &lt; &gt;); the SSE generator persisted them raw, so assistant
   replies were stored in Oracle Agent Memory as e.g. "fares &lt; $700".
   _clean_assistant_text html.unescapes the assembled text before persisting.

Both are reproduced + verified in oracle-cookbook (unit tests + in-browser); the ported
functions are byte-identical to the verified demo code.
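
For illustration only, the two fixes can be sketched in TypeScript as below. The actual functions are Python in server.py; the message shape, the function names, and the "not completed" wording used here are assumptions, and the unescape covers only the three entities named above.

```ts
type ChatMessage =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string; toolCalls?: { id: string; name: string }[] }
  | { role: "tool"; toolCallId: string; content: string };

// Insert a synthetic tool result after any assistant tool call that no
// later tool message answers, so the next turn sees a complete history.
function repairDanglingToolCalls(history: ChatMessage[]): ChatMessage[] {
  const answered = new Set<string>();
  for (const m of history) {
    if (m.role === "tool") answered.add(m.toolCallId);
  }
  const repaired: ChatMessage[] = [];
  for (const m of history) {
    repaired.push(m);
    if (m.role !== "assistant" || !m.toolCalls) continue;
    for (const call of m.toolCalls) {
      if (!answered.has(call.id)) {
        repaired.push({ role: "tool", toolCallId: call.id, content: "not completed" });
      }
    }
  }
  return repaired;
}

// Undo the exporter's escaping before persisting. "&amp;" is handled last
// so "&amp;lt;" becomes "&lt;" and is not unescaped twice.
function cleanAssistantText(text: string): string {
  return text.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
}
```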

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…derer (#5719)

## Problem

There is currently no way to intercept an A2UI action before it reaches
the agent when using `createA2UIMessageRenderer`. Every dispatched
action is forwarded to the agent unconditionally:

1. `A2UIMessageRendererOptions` exposed only `{ theme, catalog?,
loadingComponent?, recovery? }` with no `onAction`
(`packages/react-core/src/v2/a2ui/A2UIMessageRenderer.tsx`).
2. `ReactSurfaceHost` mounted `<A2UIProvider onAction={handleAction}>`
and `handleAction` always called `copilotkit.runAgent({ agent })`
unconditionally.
3. Per-component interception fails: `web_core@0.9.0`'s `GenericBinder`
classifies any prop whose Zod schema is a union containing `{ event }`
as `ACTION` and replaces the raw value with a zero-arg dispatching
closure (so `props.action` is a function, `props.action.event` is
undefined).
4. `functionCall` actions are not a workaround either, because they are dropped:
`surface-model.js` `dispatchAction` only emits when the payload has
`event`, and `createCatalog` provides no way to register client
functions.

Raised in Discord by a user with a custom catalog + custom Button who
needs a `navigate` event handled client-side. DevRel confirmed
`onAction` was not exposed.

## Change

Add an optional `onAction` interceptor to `createA2UIMessageRenderer`:

```ts
onAction?: (
  action: A2UIUserAction,
  forward: (action?: A2UIUserAction) => Promise<void>,
) => void | A2UIUserAction | null | Promise<void | A2UIUserAction | null>;
```

- Return `null` to handle the action client-side and stop forwarding
(agent is not run).
- Return an `A2UIUserAction` to forward the (possibly modified) action.
- Return `undefined` / `void` to forward unchanged.

The interceptor is threaded from `createA2UIMessageRenderer` options
through `ReactSurfaceHost` into `handleAction`. The forwarding logic is
extracted into a `runA2UIAction` helper so the three behaviors are
unit-testable. The original finally-block property cleanup is preserved,
and default behavior is byte-identical when `onAction` is not supplied.
New `A2UIActionInterceptor` type is exported from the v2 entrypoint.
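
The three return behaviors can be sketched as below. This is not the actual `runA2UIAction` helper: the `A2UIUserAction` shape, the `forwardToAgent` parameter, and the rule that calling `forward` inside the interceptor suppresses a second forward are all assumptions made for illustration.

```ts
// Hypothetical action shape; the real A2UIUserAction type may differ.
interface A2UIUserAction {
  name: string;
  context?: Record<string, unknown>;
}

type A2UIActionInterceptor = (
  action: A2UIUserAction,
  forward: (action?: A2UIUserAction) => Promise<void>,
) => void | A2UIUserAction | null | Promise<void | A2UIUserAction | null>;

async function runA2UIActionSketch(
  action: A2UIUserAction,
  forwardToAgent: (action: A2UIUserAction) => Promise<void>,
  onAction?: A2UIActionInterceptor,
): Promise<void> {
  if (!onAction) return forwardToAgent(action); // default: always forward

  let forwarded = false;
  const forward = async (next?: A2UIUserAction): Promise<void> => {
    forwarded = true;
    await forwardToAgent(next ?? action);
  };

  const result = (await onAction(action, forward)) as
    | A2UIUserAction
    | null
    | undefined;
  // null: handled client-side. forwarded: the interceptor already sent it
  // (assumed rule, to avoid running the agent twice).
  if (result === null || forwarded) return;
  await forwardToAgent(result ?? action); // modified action, or unchanged
}
```

For the Discord case described above, an interceptor along the lines of `(action) => (action.name === "navigate" ? null : undefined)` would keep `navigate` on the client and forward everything else unchanged.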

## Tests

Added to
`packages/react-core/src/v2/__tests__/A2UIMessageRenderer.test.tsx`:
- `onAction` returning `null` does not run the agent.
- `onAction` returning a modified action forwards the modified payload.
- No `onAction` forwards the original message unchanged.
- `onAction` returning `undefined` forwards unchanged.

## Verify

- `nx test react-core` (1374 passed)
- `nx run-many -t lint build --projects=react-core` (clean)
## What this adds

A new **"OpenBox Governance"** recipe in the cookbook
(`showcase/shell-docs`), documenting how to add **OpenBox runtime
governance** — guardrails, OPA/Rego policy, redaction, human-in-the-loop
approvals, and halt — to a **CopilotKit + LangGraph** agent, with
decisions rendered as generative UI.

Follows the existing cookbook recipe pattern (Oracle / Arcade /
Daytona):

- `src/content/docs/cookbook/openbox-governed-copilotkit.mdx` — the
recipe (how it works, the stack, prerequisites, run + **provisioning**
steps, the four-verdict governance matrix, the key pieces in code, going
further).
- `meta.json` — registers the recipe in cookbook nav.
- `index.mdx` — overview `<Card>`.
- `src/lib/sidebar-icon.tsx` — `custom/openbox` sidebar icon.
- `public/logos/openbox.png` — brand mark (Git LFS).

The recipe documents the companion showcase **at full parity**: the
provisioning step (`npm run openbox:admin:setup`), the Allow / Constrain
/ Approval / Block / Halt matrix using the demo's real suggestion
prompts, and "key pieces in code" snippets pulled verbatim from the
final showcase source (the LLM-driven governed engine, the
OpenBox-middleware-first agent, and the runtime + approval routes).

## Red → green (TDD)

The cookbook-nav test in `src/lib/__tests__/docs-render.test.ts` is the
red/green hook — it hard-asserts the recipe count, titles, slugs, and
URLs (6 entries incl. `["OpenBox Governance",
"cookbook/openbox-governed-copilotkit"]`).

## Verification

- `docs-render` test suite: **15/15** (nav asserts 6 entries).
- `next build`: ✅ (all cookbook routes, no MDX/link errors).
- The run-steps, four-verdict matrix, and code snippets were verified
against the **final** demo source (companion PR #5685), so the docs and
the runnable code stay in sync.

## Companion demo PR

The runnable showcase this recipe documents:
**#5685**

> **Ships standalone.** This recipe no longer links the hosted live demo
— that link now lives on the companion demo PR (#5685), so the cookbook
recipe can merge independently of the demo deployment.
…howcase lands

The OpenBox Governance recipe (#5686) merged ahead of its companion
showcase (#5685), so the "Get the code" link pointed at
github.com/.../tree/main/examples/showcases/openbox-governed-copilotkit,
which 404s while that code is not yet on main.

Replace the broken link with a plain "Full source to follow" note (no
hyperlink, so nothing 404s) that still describes what the showcase will
contain. The live upstream reference-repo link is kept. Swap the link
back in once the showcase merges to main.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Fr5HVeDzDyC4S6DjyhAFWZ
…howcase lands (#5767)

## What does this PR do?

The **OpenBox Governance** cookbook recipe (#5686) merged to `main`
ahead of its companion showcase (#5685), so the recipe's **"Get the
code"** section linked to:


`https://github.com/CopilotKit/CopilotKit/tree/main/examples/showcases/openbox-governed-copilotkit`

…which **404s** today because that code isn't on `main` yet.

This swaps the broken link for a plain-text **"Full source to follow"**
note (no hyperlink, so nothing 404s) that still describes what the
showcase will contain. The live **upstream reference-repo** link
directly below it is kept.

```diff
- Full source: [`examples/showcases/openbox-governed-copilotkit`](https://github.com/.../tree/main/examples/showcases/openbox-governed-copilotkit) — `agent/` … and `frontend/` …
+ Full source to follow — the runnable showcase (the `agent/` LangGraph service with OpenBox middleware and the `frontend/` CopilotKit V2 chat with the wrapped runtime and approval route) will be published under `examples/showcases/openbox-governed-copilotkit`.
```

**Follow-up:** once the showcase merges to `main`, restore the direct
`tree/main/...` link.

_Note:_ the recipe's run-instructions still `cd` into that path
(expected — the whole "run it yourself" flow assumes the code), so those
steps only work once the showcase lands. Left as-is since they read as
setup instructions, not a clickable link.

## Related PRs and Issues

- Docs recipe (merged): #5686
- Companion showcase (pending): #5685

## Checklist

- [x] I have read the [Contribution
Guide](https://github.com/copilotkit/copilotkit/blob/master/CONTRIBUTING.md)
- [x] If the PR changes or adds functionality, I have updated the
relevant documentation
- [x] "Allow edits by maintainers" is checked

🤖 Generated with [Claude Code](https://claude.com/claude-code)

https://claude.ai/code/session_01Fr5HVeDzDyC4S6DjyhAFWZ

---
_Generated by [Claude
Code](https://claude.ai/code/session_01Fr5HVeDzDyC4S6DjyhAFWZ)_
Address review feedback on the OpenBox Governance recipe:
- Replace the ASCII flow with a real inline-SVG architecture diagram
- Condense the wall-of-text provisioning warning to a few lines
- Convert the governance-matrix table into per-prompt accordions
- Highlight the key lines across the code samples to guide the reader
- Move the coding-agent prompt to the top in a collapsed accordion
## What does this PR do?

Follow-up polish on the **OpenBox Governance** cookbook recipe (shipped
in #5686), addressing review feedback from the launch. All changes are
in the single recipe page; no functional/code changes.

| Feedback | Change |
| --- | --- |
| "Make this an actual diagram; hard to read as ASCII" | Replaced the ASCII flow under **How it works** with a real, color-coded **inline-SVG architecture diagram** (same self-contained pattern as the Oracle recipe, no external assets). |
| "Huge, scary warning; condense it" | Cut the wall-of-text **provisioning** warning down to a few lines while keeping the essentials (idempotent, which keys it needs, how to verify). |
| "Prompts are unreadable in the chart layout → accordions" | Converted the **Try it** governance-matrix table into one **accordion per prompt** (verdict-labeled), so the long prompts are on-demand instead of crammed into table cells. |
| "Highlight the important stuff; guide the user. Applies to all code." | Added line highlighting across the bash + TypeScript samples so the key lines (the runtime wrap, `selfGovernedToolNames`, middleware-first, the provisioning command, the approval schema) stand out. |
| "Move this to the top top and make it an accordion" | Moved the **coding-agent prompt** from the bottom to just under the intro, collapsed in an accordion. |

## Verification

Rendered locally against `shell-docs` (`next dev`) and confirmed
in-browser: the SVG diagram renders, all five accordions expand
correctly, the warning is condensed, and code highlighting shows on both
bash and TS blocks (no leaked `[!code]` markers).

## Related

- Recipe (merged): #5686
- Source-link hotfix (merged): #5767
- Companion showcase (pending): #5685 — once it lands, restore the
direct `tree/main/...` source link in **Get the code**.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
## Problem

Docs OG image generation was producing unreliable social-preview output
for docs URLs. The route depended on old static OG assets and Inter-era
styling, and the preview did not match the current CopilotKit docs theme
or logo.

## Why

The OG route should render a consistent branded card from page
frontmatter for every docs slug. It also needs local render assets for
request-time reliability: `next/og` does not inherit the app layout
font, and image inputs need to be available as bytes when the route
renders.

## Fix

- Reworked `showcase/shell-docs/src/app/og/[...slug]/route.tsx` to
render a branded 1200x630 card with a tighter layout, CopilotKit theme
colors, Plus Jakarta Sans, and per-page title/description/section
labels.
- Kept `next/font/google` for normal docs pages, and added upstream Plus
Jakarta Sans static TTFs only for the OG renderer. `SOURCE.md` records
the upstream URLs and SHA-256 hashes. The Google Fonts variable TTF was
tested but the bundled `next/og` renderer crashes while parsing its
`fvar` table.
- Added the official CopilotKit full lockup as a real PNG asset. It is
covered by the repo-level `*.png filter=lfs` rule, and the route encodes
the PNG bytes to a data URI only at render time for `ImageResponse`.
- Removed the hardcoded runtime/frontend/agent pills and the yellow
gradient stop from the card.
- Updated the focused OG route test to assert the card dimensions and
bundled Plus Jakarta fonts.

Validation: the focused OG test, a direct `ImageResponse` render with the
upstream fonts, lint, typecheck, build, and live local OG route checks
all passed. The full shell-docs test suite has unrelated pre-existing
failures in public LFS PNG assets and one docs-render nav expectation.
Runnable companion to the #5521 cookbook recipe — a portable Oracle
Agent Spec agent on LangGraph over AG-UI, long-term memory on Oracle AI
Database, and a CopilotKit V2 frontend (generative UI + HITL booking).
Lives beside `daytona-runcode`.

- `agent/` — Python (uv) Agent Spec agent + FastAPI AG-UI server
- `frontend/` — Next.js CopilotKit V2 chat (flight cards, recall chip,
HITL booking)
- `db/` — Oracle AI Database (Free image) + cookbook-user init
- `docker-compose.yml` (local Oracle) + per-service `railway.json`

✅ Live cross-session recall verified working on the hosted demo (a
preference taught in one thread is recalled in a fresh thread).

### Pre-merge follow-ups — resolved ✅
- [x] **Workspace deps** — the frontend builds standalone (`npm ci` +
`next build` ✓ inside the monorepo). It ships its own
`package-lock.json` and is intentionally excluded from the pnpm
workspace (the `daytona-runcode` / `shell-dashboard` convention), so the
published prerelease `@copilotkit/*` resolve and no
`workspace:*`/`catalog:` is needed.
- [x] **Promote SSOT** — the live demo is hosted independently (not part
of CopilotKit's Railway promote fleet), so no
`showcase/scripts/railway-envs.ts` entry is required.
- [x] **`db/build-and-push.sh`** — default image namespace genericized
to an `OWNER` placeholder (still `IMAGE`-overridable).
- [x] **Oracle license** — `db/Dockerfile` and `build-and-push.sh` both
document that the built image must stay in a PRIVATE registry
(re-publishing the Oracle base image violates the license).

> The Vercel preview deploys and `fork-pr-monitor` fail because this is
a cross-fork PR (forks can't access deploy secrets) — not code issues.
`config-allowlist` is fixed.

Docs: #5521.
…agent-memory showcase (#5762)

## ⚠️ Draft — stacked on #5563

This builds on #5563 (the `oracle-agent-memory` showcase). Since the
showcase isn't on `main` yet, **this PR's diff currently includes
#5563's files** — once #5563 merges, it auto-reduces to just the
checkpointer change below. Keeping it as a **draft** until then; opening
early to stage the work.

## What this adds
An **optional, flag-gated** durable LangGraph checkpointer for the
showcase agent, using Oracle's
[`langgraph-oracledb`](https://github.com/oracle/langchain-oracle/tree/main/libs/langgraph-oracledb)
`AsyncOracleSaver` — so per-thread LangGraph graph state persists in
**Oracle**, **complementing** (not replacing) `oracleagentmemory`. That
rounds out the "whole stack on Oracle" story: durable *memory* **and**
durable *graph checkpoints*.

**Default-safe:** gated behind `LANGGRAPH_CHECKPOINTER` (default
`memory` → behavior unchanged; `oracle` is opt-in), with graceful
fallback to in-memory on any Oracle error — so CI and the default run
path never touch a DB.
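
The gating and fallback described above can be sketched as follows. The showcase agent itself is Python; this TypeScript version only illustrates the control flow. `Saver`, `resolveCheckpointer`, and `makeOracleSaver` are hypothetical names, and no Oracle or LangGraph API is called.

```typescript
// Hypothetical saver type; stands in for MemorySaver / AsyncOracleSaver.
type Saver = { kind: "memory" | "oracle" };

// Picks the checkpointer from the flag value and degrades to in-memory
// if building the Oracle-backed saver fails for any reason.
async function resolveCheckpointer(
  flag: string | undefined,
  makeOracleSaver: () => Promise<Saver>,
): Promise<Saver> {
  const mode = flag ?? "memory"; // unset flag means the default path
  if (mode !== "oracle") return { kind: "memory" }; // never touches a DB
  try {
    // In the real agent this step would build the pool and run setup().
    return await makeOracleSaver();
  } catch {
    return { kind: "memory" }; // graceful fallback on any Oracle error
  }
}
```

Passing the factory in as a parameter keeps the default path free of any database import, which is one way to make the "CI never touches a DB" property easy to check.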

## The actual delta (just these agent files)
- `agent/concierge/checkpointer.py` *(new)* — flag-gated resolver:
builds a dedicated async Oracle pool + `AsyncOracleSaver`, runs `await
setup()`, degrades to `MemorySaver` on failure.
- `agent/concierge/server.py` — an import-time monkeypatch injects the
resolved checkpointer past `ag_ui_agentspec`'s hardcoded `MemorySaver`;
the FastAPI lifespan inits/closes it.
- `agent/pyproject.toml` — adds `langgraph-oracledb>=1.0.1`, bumps
`oracledb>=2.2.0`, adds dev `pytest`/`pytest-asyncio`.
- `agent/.env.example` — documents `LANGGRAPH_CHECKPOINTER`.
- `agent/tests/test_oracle_checkpointer.py` *(new)* — a durability
round-trip test (skipped unless `LANGGRAPH_CHECKPOINTER=oracle`).

Mirrors jerelvelarde/oracle-cookbook#4,
where it's reviewed + verified — the durability round-trip passes
against a live Oracle DB, and 5/5 e2e pass with the flag off (no
regression on the default path).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…turn scripted replies

The 6 OpenAI drop-in integration examples enabled for the CLI's keyless
AIMock mock mode shipped fixtures that did not cover the prompts each
starter's own UI suggests, so a keyless first-run user clicking the demo
suggestions mostly hit the generic catch-all. Re-key/extend each
example's fixtures/default.json (substring, most-specific-first,
catch-all last) so the surfaced suggestions return scripted replies.

ENT-1003.
…ite re-call)

The chart-chain fixtures looped under multi-step tool execution: langgraph-python
emitted a 2nd tool call (pieChart/barChart/dashboard) with no terminating
toolCallId result fixture, so it fell through to the still-matching userMessage
fixture and re-called query_data. langgraph-js had its toolCallId result fixtures
ordered after the userMessage fixtures (first array match wins -> re-call).

Add the missing terminating result fixtures (langgraph-python) and reorder so all
toolCallId fixtures precede userMessage fixtures (langgraph-js). Verified termination
via the 2-3 step multi-turn repro; mastra fixtures already terminated.
The langgraph-js Toggle Theme chip returned a text response claiming it toggled
the theme, but emitted no tool call -- so the theme never changed, while the same
chip in langgraph-python emits the toggleTheme frontend tool call and works.
Identical chip, different outcome by template. Swap the text response for the
toggleTheme tool call, mirroring langgraph-python (frontend tool, no terminating
result fixture needed -> no loop). Verified aimock now returns the toggleTheme
tool call for the chip.
…nate HITL turn)

Same class as the toggle-theme fix: langgraph-js's 'schedule a meeting' chip
returned text, so the meeting picker never rendered, while langgraph-python emits
the scheduleTime tool call. scheduleTime is a registered langgraph-js frontend
tool (reasonForScheduling/meetingDuration) — swap text -> tool call, mirroring
langgraph-python. Because scheduleTime is Human-in-the-Loop (the user's pick
returns as a tool result and re-invokes the LLM), add a terminating
call_schedule_time_001 result fixture in BOTH templates so the turn doesn't
re-match the user message and loop (langgraph-python lacked it too — latent).
Verified both terminate at 2 steps (tool call -> terminating text).
…turn scripted replies (#5728)

## Why

The 6 OpenAI drop-in integration examples enabled for the CLI's
**keyless AIMock mock mode** (`langgraph-python`, `langgraph-js`,
`mastra`, `llamaindex`, `agno`, `pydantic-ai`) shipped
`fixtures/default.json` files that **did not cover the prompts each
starter's own UI suggests**. AIMock matches `userMessage` as a
**case-sensitive substring, first-match-wins**, so a keyless first-run
user who clicked the demo's suggestion chips mostly fell through to the
generic catch-all ("I only have scripted replies…") instead of getting a
scripted demo reply.

Two root issues found (audit at `origin/main`):
- **`langgraph-python` was mis-keyed** — fixtures keyed on suggestion
*titles* (`"Pie Chart"`, `"Toggle Theme"`, `"Task Manager"`) while the
UI sends long *message* strings that don't contain those substrings →
all 9 suggestions missed.
- The other 5 shipped only a generic `Hello` (+ a `weather` fixture in
langgraph-js/mastra) → 0–1 of each starter's chips covered.

This PR re-keys/extends each example's fixtures so the surfaced
suggestions return scripted replies. Ordering preserved:
most-specific-first, `{}` catch-all last (unchanged text).
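
The matching rule described above (case-sensitive substring, first match wins, `{}` catch-all last) can be sketched as follows. This is an illustration only, not AIMock's implementation: the `Fixture` shape, `matchFixture`, and the fixture strings are assumptions made for the example.

```typescript
// Hypothetical fixture shape for illustration; not AIMock's real schema.
type Fixture = { match: { userMessage?: string }; reply: string };

// Returns the first fixture whose userMessage is a case-sensitive substring
// of the prompt. A fixture with an empty match ({}) matches every prompt.
function matchFixture(fixtures: Fixture[], prompt: string): Fixture | undefined {
  return fixtures.find(
    (f) =>
      f.match.userMessage === undefined ||
      prompt.includes(f.match.userMessage),
  );
}

// Ordered most-specific-first, catch-all last.
const fixtures: Fixture[] = [
  { match: { userMessage: "pie chart of revenue" }, reply: "scripted: pie chart" },
  { match: { userMessage: "chart" }, reply: "scripted: generic chart" },
  { match: {}, reply: "catch-all" },
];
```

Under this rule a fixture keyed on a title such as `"Pie Chart"` never matches a lowercase message, which is the mis-keying the audit describes.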

> Tracking: **ENT-1003** (CopilotKit/Intelligence). Sibling to ENT-989
(fixtures for *not-yet-enabled* frameworks). This PR covers the
*already-enabled* 6.

## What changed (per template)

| Template | Suggestions now covered | Tool-call replies | Text replies |
|---|---|---|---|
| langgraph-python | 9/9 (3 re-keyed) | pie/bar chart (`query_data`→component), `scheduleTime`, `toggleTheme`, `manage_todos` | Search Flights, Excalidraw, Calculator, Sales-Dashboard A2UI step |
| langgraph-js | 9/9 | `query_data`, `search_flights`, `generate_a2ui`, `manage_todos` (schemas verified vs agent) | scheduleTime, Excalidraw, generateSandboxedUi, toggleTheme |
| mastra | 6/6 | `get-weather`, `setThemeColor`, `go_to_moon` (HITL) | 3 proverb chips (proverbs are agent shared-state, not a tool) |
| llamaindex | 3/3 | none | theme / proverb / weather |
| agno | 4/4 | none | weather / theme / stock / proverb |
| pydantic-ai | n/a (UI surfaces no chips) | none | best-effort free-typer fixtures: "what can you do" / "proverb" / "weather" |

## Honest caveats (best-effort; draft)

- **Tool-call shapes only where evidenced.** Where a suggestion drives
A2UI streaming, an MCP app (Excalidraw), or a frontend-only tool
(`generateSandboxedUi`, `scheduleTime` in some templates) whose call
shape isn't defined in the template, I used an **on-topic text reply**
rather than fabricating a tool envelope. Those replies beat the
catch-all but won't trigger the live generative UI under mock mode — a
follow-up could record the real shapes.
- **`pydantic-ai` surfaces no suggestion chips** in its UI, so there's
nothing to key on; the real fix is a small UI change (add `suggestions`
to `CopilotSidebar`), which is out of scope for a fixtures-only PR. The
added fixtures are a fallback for free-typing users.
- **Not verified end-to-end here** (authored against `origin/main`, not
run live). Each template's `docker-compose.test.yml` AIMock smoke should
stay green; note its `@chat` test only sends `"Hello"` and asserts a
non-empty reply, so it does **not** validate suggestion-chip coverage —
extending it to assert a non-catch-all reply for a real suggestion would
close that blind spot (also noted in ENT-1003).

## Test plan
- [ ] Per-template `docker-compose.test.yml` AIMock smoke still green.
- [ ] Manual: scaffold/run each keyless, click each suggestion chip,
confirm a scripted reply (not the catch-all).
### What

The Angular `CopilotChatInput.handleKeyDown` submits the message when
Enter is pressed without Shift, but it never checks whether an IME
composition is in progress. When typing CJK text (Japanese, Chinese,
Korean), the Enter that confirms an IME candidate also fires a
`keydown`, so the half-composed text gets sent instead of the candidate
being committed.

The React and Vue bindings of the same v2 `CopilotChatInput` already
guard against this; Angular was the one binding still missing it:

- `packages/react-core/src/v2/components/chat/CopilotChatInput.tsx` —
`handleKeyDown` returns early on `e.nativeEvent.isComposing || e.keyCode
=== 229`
- `packages/vue/src/v2/components/chat/CopilotChatInput.vue` —
`handleKeydown` returns early on `isComposing.value || event.isComposing
|| event.keyCode === 229`, and has a test asserting it does not submit
while composing

### Change

Add the same early return to the Angular `handleKeyDown`, using the
native `KeyboardEvent` (`event.isComposing || event.keyCode === 229`),
which matches the Vue binding's idiom.
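
A minimal sketch of the guard, assuming a handler that receives a native `KeyboardEvent`. `KeyLike` and `shouldSubmit` are hypothetical names used so the example runs without a DOM; this is not the Angular component's code.

```typescript
// Structural subset of KeyboardEvent; a real event satisfies this type.
type KeyLike = {
  key: string;
  shiftKey: boolean;
  isComposing: boolean;
  keyCode: number;
};

// Returns true only when the keydown should submit the message.
function shouldSubmit(event: KeyLike): boolean {
  // An IME composition is in progress: this Enter confirms a candidate,
  // so it must not submit. keyCode 229 covers browsers that report the
  // composing key that way.
  if (event.isComposing || event.keyCode === 229) return false;
  // Plain Enter submits; Shift+Enter is left to insert a newline.
  return event.key === "Enter" && !event.shiftKey;
}
```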

### Notes

When composition is not active `isComposing` is `false`, so Enter
submits exactly as before and Shift+Enter still inserts a newline. The
most visible case is Safari with a Japanese IME, where the confirming
Enter reports `key === "Enter"` with `isComposing === true`; the
`keyCode === 229` arm mirrors the sibling guards for browsers that
report the composing key that way.

I verified the handler logic in isolation (Enter while composing no
longer submits; plain Enter and Shift+Enter are unchanged). I did not
run the full Angular suite locally.
…uery

CopilotChat message wrappers used viewport-keyed `cpk:sm:px-0`, collapsing
horizontal padding to 0 at any viewport >=640px. The message column is
`max-w-3xl` (768px) centered; the design assumes the chat fills the viewport,
so at >=640px the column has side gutters and inner padding can drop to 0.

But when the chat lives in a sub-viewport-width pane (e.g. the threads drawer
rail beside the chat, ~580px on an 820px iPad-portrait viewport), `sm:px-0`
still fires on viewport width while the 768px column overflows the narrow
pane and sits flush against both edges. The input wrapper looked fine because
it is visually inset by its own pill, so only message text appeared broken.

Make the padding container-relative instead of viewport-relative:
- add `cpk:@container` (container-type: inline-size) to the chat root, and
- switch the message/input/suggestion wrappers from `cpk:sm:px-0` to the
  container variant `cpk:@3xl:px-0`.

Padding now tracks the chat's own width and drops to 0 only once the container
is at least as wide as the column's own max-width, so the column has real
gutters; in any narrower pane the `px-4` inner padding is retained. React,
Angular, and Vue kept in lockstep.

Note on the breakpoint: Tailwind v4 container-query breakpoints differ from
viewport breakpoints (`@sm` = 24rem/384px, not 640px). A mechanical
`sm:` -> `@sm:` swap would still collapse the ~580px repro pane. `@3xl`
(48rem/768px) is used because it exactly matches the column's `max-w-3xl`,
which is the width at which side gutters first appear.

Verified: full-width desktop chat unchanged (container >=768px -> px-0);
580px pane retains 16px padding; render-prop layouts without a container
ancestor degrade safely to `px-4`; sidebar/popup `data-*` padding overrides
are unaffected.

ENT-1020

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uery (ENT-1020) (#5778)

## What & why

Fixes **ENT-1020**. In a side-by-side layout (the threads drawer rail
next to the chat), at iPad-portrait / tablet widths the chat **message
text sat flush against both pane edges** — no horizontal padding — while
the input stayed correctly inset. Surfaced while manually testing the
Angular `CopilotDrawer` (#5746), but it is **not a drawer bug**: it
reproduces in any layout that puts `CopilotChat` in a pane narrower than
the viewport.

### Root cause
The message column is `max-w-3xl` (768px) centered; its wrappers used
`cpk:px-4 cpk:sm:px-0` — 16px below 640px, then **0 at viewport
≥640px**. There was no `container-type` on the chat root, so the `sm:`
variant keyed on the **viewport**, not the chat's own width. In a
sub-viewport pane (~580px chat on an 820px viewport) `sm:px-0` still
fired (viewport ≥640) while the 768px column overflowed the pane
edge-to-edge → flush text.

## The fix (robust / container-relative)
- Add `cpk:@container` (`container-type: inline-size`) to the chat root
(React ×2 render paths, Angular, Vue).
- Switch every message / input / suggestion wrapper from viewport
`cpk:sm:px-0` to the **container** variant `cpk:@3xl:px-0`.

Padding now tracks the chat's **own** width and drops to 0 only once the
container is at least as wide as the column's `max-w-3xl`, i.e. once the
column has real side gutters. In any narrower pane the `px-4` inner
padding is retained. **React, Angular, and Vue kept in lockstep.**

### Why `@3xl`, not the `@sm` the ticket suggested
Tailwind v4 **container-query** breakpoints are a *different scale* from
viewport breakpoints: `@sm` = **24rem/384px** (viewport `sm` = 640px). A
mechanical `sm:` → `@sm:` swap would still collapse the ~580px repro
pane (580 ≥ 384). `@3xl` = **48rem/768px**, which exactly matches the
column's `max-w-3xl` — the width at which gutters first appear — so it
is the semantically correct breakpoint.
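
The breakpoint arithmetic can be checked with a small model. This sketch assumes the browser default of 16px per rem and only models when `px-0` takes over from `px-4`; `paddingPx` and the constant names are hypothetical and nothing here calls Tailwind.

```typescript
const REM = 16; // px per rem, assuming the default root font size
const VIEWPORT_SM = 40 * REM; // 640px, viewport `sm`
const CONTAINER_SM = 24 * REM; // 384px, container `@sm`
const CONTAINER_3XL = 48 * REM; // 768px, container `@3xl` (same as max-w-3xl)

type Rule = "sm:px-0" | "@sm:px-0" | "@3xl:px-0";

// Horizontal padding in px for a wrapper styled `px-4 <rule>`.
// Viewport variants key on the viewport width; container variants key on
// the width of the nearest container ancestor.
function paddingPx(rule: Rule, viewport: number, container: number): number {
  const collapses =
    rule === "sm:px-0"
      ? viewport >= VIEWPORT_SM
      : rule === "@sm:px-0"
        ? container >= CONTAINER_SM
        : container >= CONTAINER_3XL;
  return collapses ? 0 : 16;
}

// The repro: a 580px chat pane on an 820px viewport.
const original = paddingPx("sm:px-0", 820, 580); // old viewport-keyed rule
const mechanical = paddingPx("@sm:px-0", 820, 580); // mechanical swap
const fixed = paddingPx("@3xl:px-0", 820, 580); // the rule used in this PR
```

In this model both the original rule and the mechanical `@sm` swap collapse the padding in the repro pane, while `@3xl` keeps it until the container reaches 768px.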

> Vue was not named in the ticket scope but shares the identical
`sm:px-0` pattern; left unfixed it would reproduce the bug there, so it
is included for true framework lockstep. web-components only *hosts* the
chat (no `sm:px-0`), so it is correctly untouched.

## Testing

**Browser behavior — real built CSS, exact DOM
(`copilotKitChat`/`@container` root → `cpk:max-w-3xl cpk:mx-auto` →
`cpk:px-4 cpk:@3xl:px-0` message wrapper),
`getComputedStyle().paddingLeft`:**

| Scenario | Result | Expectation |
|---|---|---|
| **580px narrow pane** (the repro) | `padding-left: 16px` | ✅ `px-4` retained, text no longer flush |
| **900px wide pane** (full desktop) | `padding-left: 0px` | ✅ `px-0`, column has gutters, behavior preserved |
| **No `@container` ancestor** (render-prop path) | `padding-left: 16px` | ✅ graceful `px-4`, no flush |

**Generated CSS confirmed** (Tailwind v4.1.18): root emits
`container-type: inline-size`; wrapper emits `@container (min-width:
48rem) { padding-inline: 0 }` (verified in both react-core and angular
builds — a true container query, not a media query).

**Unit/component tests (pass):**
- react-core — full chat suite: **645 tests / 39 files** green (incl.
`CopilotChatCssClasses`).
- angular — `copilot-chat-view` + `copilot-chat-input` specs: **10**
green.
- vue — `CopilotChatView.connectingGate` +
`CopilotChatSuggestionView.slots.e2e`: **30** green.
- pre-commit `test-and-check-packages` (test + publint + attw across all
4 affected projects) passed.

No tests assert these class strings and no snapshots capture them, so
nothing needed regenerating.

## Acceptance criteria
- [x] Messages retain horizontal padding when the chat is in a narrow
(<~768px) pane at viewports ≥640px.
- [x] Full-width chat behavior preserved (container ≥768px → `px-0`),
verified in React, Angular, Vue.
- [x] No regression to input / disclaimer / suggestion alignment with
the message column (all share the same `@3xl` switch).

Closes ENT-1020

🤖 Generated with [Claude Code](https://claude.com/claude-code)