# LangGraph-Python Column Wave 1: Discovered Bugs & Descoped Cells

Wave 1 of the langgraph-python column completeness effort surfaced the following issues while authoring QA checklists, E2E specs, and ops probes. Each is tracked for follow-up separately from Wave 1's merge.

## How to read this

- **Descoped cell**: the Wave 1 "green the column" declaration explicitly excludes this cell. The dashboard will show amber/red for it until the underlying cause is addressed.
- **Follow-up**: the issue doesn't block Wave 1 completion; filed here for later.

Entries are grouped by area (docs, backend-agent, probe plumbing, frontend / CSS, test infra). Cross-references use the `W8-*` tag as it appears in `docs/superpowers/plans/langgraph-python-column-wave1-bugs-scratch.md` and in inline `// See W8-*` comments inside Playwright specs under `showcase/packages/langgraph-python/tests/e2e/`.

## Bugs

### B1: `probe-docs.ts` does not read `packages/*/docs-links.json` overrides (W8-1)

- **Symptom:** `scripts/probe-docs.ts` only validates URLs in `shared/feature-registry.json`. Per-integration overrides in `packages/*/docs-links.json` are invisible to the probe, so `showcase/shell/src/data/docs-status.json` can report `notfound` for a URL that actually resolves 200, and conversely a broken override would not show red.
- **Evidence:** Post-Task-1.4 probe aggregate is `ok=0 notfound=60 error=0 missing=16` even though every langgraph-python cell except `chat-customization-css` renders ✓/✓ on the dashboard. The dashboard flips to green via `showcase/shell-dashboard/src/components/cell-pieces.tsx:36-57`, which trusts the override. Example: `og_docs_url` `https://docs.copilotkit.ai/langgraph/prebuilt-components` in `packages/langgraph-python/docs-links.json` is 200-verified but shows `notfound` in the probe output.
- **Suspected cause:** the scope of `probe-docs.ts` predates the `docs-links.json` override pattern; it reads only `REGISTRY_PATH` and never walks `packages/*/docs-links.json`.
- **Suggested owner:** showcase ops.
- **Next step:** either (a) extend `probe-docs.ts` to walk `packages/*/docs-links.json` and emit per-integration docs-status rows, or (b) teach `cell-pieces.tsx` to defer to probe state whenever a URL exists.
- **Descoped cell(s):** none; the dashboard is already green via the override. Affects probe accuracy column-wide but not visible cell state.

### B2: Every `/features/` URL in feature-registry soft-404s (W8-3)

- **Symptom:** Every `https://docs.copilotkit.ai/features/` entry in `shared/feature-registry.json` returns the Next.js catch-all `[[...slug]]` page. This affects integrations that don't ship a `docs-links.json` override.
- **Evidence:** Curl of any `/features/` URL returns 200 with `x-matched-path: /[[...slug]]` or `/integrations/[[...slug]]`. The probe output's `notfound=60` aggregate is almost entirely these fallback URLs. See `docs/superpowers/plans/langgraph-python-docs-audit.md` surprise #3.
- **Suspected cause:** registry URLs were written against an older docs IA (`/features/`) that no longer exists.
- **Suggested owner:** docs IA.
- **Next step:** short-term, ensure every integration has a `docs-links.json` override. Long-term, update feature-registry URLs to point at integration-specific pages or drop the feature-level fallbacks.
- **Descoped cell(s):** none for langgraph-python (overrides cover every cell). Other integration columns may still render red until each ships its own override.

### B3: `chat-customization-css` has no dedicated docs page (W8-2)

- **Symptom:** langgraph-python ships a `chat-customization-css` demo but no dedicated CSS-customization page exists under docs.copilotkit.ai or shell-docs. The cell renders the "missing" state for og.
- **Evidence:**
  - `packages/langgraph-python/docs-links.json` entry for `chat-customization-css` has `og_docs_url: null` and `shell_docs_path: "/custom-look-and-feel/css"`.
  - `https://docs.copilotkit.ai/langgraph/custom-look-and-feel/css` soft-404s (catch-all `[[...slug]]`).
  - `https://docs.copilotkit.ai/custom-look-and-feel/css` also soft-404s.
  - No `integrations/langgraph/custom-look-and-feel/css.mdx` exists under `showcase/shell-docs/src/content/docs/` (a non-scoped `custom-look-and-feel/css.mdx` does exist, which shell resolution matches).
- **Suspected cause:** docs page was never authored.
- **Suggested owner:** docs.
- **Next step:** author `langgraph/custom-look-and-feel/css` (matching the `/slots` sibling) and the corresponding shell-docs mdx under `integrations/langgraph/custom-look-and-feel/css.mdx`. Then un-null `og_docs_url` in `packages/langgraph-python/docs-links.json`.
- **Descoped cell(s):** `chat-customization-css` docs-og.

### B4: `reasoning_agent` non-responsive on Railway (W8-3 E2E)

- **Symptom:** `/demos/agentic-chat-reasoning` on `showcase-langgraph-python-production.up.railway.app` loads fine, but any typed prompt produces no `[data-testid="reasoning-block"]` and no `[data-role="assistant"]` bubble within 60s.
- **Evidence:**
  - Three consecutive E2E runs all time out at 60s on the reasoning-block locator.
  - Traces under `showcase/packages/langgraph-python/test-results/agentic-chat-reasoning-*`.
  - Same Railway host handles `frontend-tools` (5/5) and `frontend-tools-async` (2/3 LLM-dependent), so the deployment is up; the `reasoning_agent` graph specifically is non-responsive.
  - Mitigation already landed in `showcase/packages/langgraph-python/tests/e2e/agentic-chat-reasoning.spec.ts` (three `test.skip`s with TODO).
- **Suspected cause:** `deepagents.create_deep_agent` / `init_chat_model` path in `src/agents/reasoning_agent.py` may be missing a Python dep or an OpenAI Responses-API permission on Railway, or the agent name mapping in `src/app/api/copilotkit/route.ts:76-77` (`agentic-chat-reasoning` → `reasoning_agent`) fails at the runtime layer.
- **Suggested owner:** showcase-langgraph-python deploy.
- **Next step:** tail Railway logs while hitting `/api/copilotkit` POST with an `agentic-chat-reasoning` agent run; confirm whether `reasoning_agent.graph` actually imports.
- **Descoped cell(s):** `agentic-chat-reasoning` E2E (reasoning-stream assertions skipped; page-load/submit-pipeline still live).

### B5: `request_user_approval` does not fire on Railway within 60s (W8-5)

- **Symptom:** `/demos/hitl-in-app` on Railway loads fine; suggestion pills and the 3 ticket cards render. A typed prompt explicitly naming the tool and a ticket (e.g. "Use request_user_approval to ask me to approve a $50 refund on ticket #12345.") does not cause the agent to invoke the `useFrontendTool` handler. No `[data-testid="approval-dialog-overlay"]` portal appears; all three flows time out at 60s with two Playwright retries each.
- **Evidence:** traces under `showcase/packages/langgraph-python/test-results/hitl-in-app-*`. Mitigation in `tests/e2e/hitl-in-app.spec.ts`: three approval flows marked `test.skip` with TODO; page-load / ticket-card / suggestion-pill assertions remain live.
- **Suspected cause:** the deployed `hitl_in_app_agent` graph may be missing the `request_user_approval` tool binding; or the agent-name mapping in `src/app/api/copilotkit/route.ts` does not route to a graph that receives frontend-tool registration; or the system prompt does not prime the model to call the tool for the typed prompt.
- **Suggested owner:** showcase-langgraph-python agent authoring / deploy.
- **Next step:** verify the HITL-in-app agent graph definition against the deployed image and confirm `useFrontendTool(request_user_approval)` is registered on the session by the time the user prompt is sent.
- **Descoped cell(s):** `hitl-in-app` E2E (approval flows skipped).

### B6: `useInterrupt` / `schedule_meeting` does not fire on Railway within 60s (W8-6)

- **Symptom:** `/demos/gen-ui-interrupt` on Railway loads fine; suggestion pills render. Typed prompts naming the backend tool (e.g. "Use schedule_meeting to book an intro call …") do not trigger the `interrupt_agent` graph's `interrupt()` within 60s; no inline `[data-testid="time-picker-card"]` renders; both pick-a-slot and cancel flows time out.
- **Evidence:** traces under `showcase/packages/langgraph-python/test-results/gen-ui-interrupt-*`. Mitigation in `tests/e2e/gen-ui-interrupt.spec.ts`: two interrupt flows marked `test.skip` with TODO.
- **Suspected cause:** likely same cluster as B4 / B5. Either the `interrupt_agent` graph (shared with `interrupt-headless`) is not reaching its `interrupt()` on Railway, the `useInterrupt({ renderInChat: true })` primitive is not subscribing, or the `schedule_meeting` tool binding is stripped from the deployed graph.
- **Suggested owner:** showcase-langgraph-python agent authoring / deploy.
- **Next step:** hit `/api/copilotkit` with an `interrupt_agent` run while tailing Railway logs; confirm whether `schedule_meeting` is actually invoked and whether a LangGraph `interrupt()` is emitted on the SSE stream.
- **Descoped cell(s):** `gen-ui-interrupt` E2E (interrupt flows skipped).

### B7: `readonly-state-agent-context` LLM round-trip stalls past 60s on Railway (W8-READONLY-1)

- **Symptom:** `/demos/readonly-state-agent-context` on Railway loads, but the LLM round-trip for the "Who am I?" suggestion and the equivalent typed prompt stalls past 60s. There is no deterministic frontend tool side-effect to race against (the page simply expects an assistant bubble).
- **Evidence:** `showcase/packages/langgraph-python/tests/e2e/readonly-state-agent-context.spec.ts` marks both the suggestion flow and the typed-prompt flow `test.skip` with an inline "See W8-READONLY-1" pointer at `readonly-state-agent-context.spec.ts:76,96`. Scratch file does not mention this entry (**scratch not updated**).
- **Suspected cause:** Railway round-trip flakiness; the absence of a frontend tool side-effect in the demo makes it impossible to distinguish a slow LLM from a dead graph.
- **Suggested owner:** showcase-langgraph-python agent authoring / deploy. In parallel, demo authoring could add a `data-testid="assistant-message"` marker on the assistant bubble to give the spec a deterministic structural signal.
- **Next step:** either fix the deployed agent's response latency or add the assistant-message testid so the spec can assert a structural signal without waiting on LLM text.
- **Descoped cell(s):** `readonly-state-agent-context` E2E (LLM round-trip assertions skipped).

### B8: `open-gen-ui` iframe mount regularly exceeds 120s (W8-OGUI-1)

- **Symptom:** `/demos/open-gen-ui` iframe mount exceeds the 120s per-test budget because the LLM has to author full HTML/CSS/JS before the iframe can paint. No reliable post-mount signal.
- **Evidence:** `showcase/packages/langgraph-python/tests/e2e/open-gen-ui.spec.ts` marks both the Quicksort suggestion path and the neural-network path `test.skip` with "See W8-OGUI-1" at `open-gen-ui.spec.ts:64,90`. Scratch file does not mention this entry (**scratch not updated**).
- **Suspected cause:** demo is inherently LLM-authoring-bound. The iframe content is fully generated per request; there is no short-circuit signal (no testid on mount, iframe is srcdoc-loaded and opaque to the host).
- **Suggested owner:** showcase-langgraph-python demo authoring.
- **Next step:** emit a `data-testid="ogui-iframe"` on mount (short-circuits the LLM wait), or narrow the prompt to reduce authoring latency on Railway.
- **Descoped cell(s):** `open-gen-ui` E2E (iframe-mount assertions skipped).

### B9: `open-gen-ui-advanced` sandbox iframe round-trip unverifiable (W8-OGUI-2)

- **Symptom:** `/demos/open-gen-ui-advanced` mounts a `sandbox="allow-scripts"`-only iframe; the round-trip to the host (e.g. the `notifyHost` console log) cannot be asserted via Playwright's `contentFrame()` because `allow-scripts`-only iframes restrict cross-frame interaction.
- **Evidence:** `showcase/packages/langgraph-python/tests/e2e/open-gen-ui-advanced.spec.ts` marks the Ping mount and the `notifyHost` round-trip `test.skip` with "See W8-OGUI-2" at `open-gen-ui-advanced.spec.ts:63,92`. Scratch file does not mention this entry (**scratch not updated**).
- **Suspected cause:** shares B8's LLM-authoring latency; additionally, the `allow-scripts` sandbox attribute by design prevents host-side introspection.
- **Suggested owner:** showcase-langgraph-python demo authoring.
- **Next step:** emit a post-mount testid or a host-visible console-log fixture the spec can assert against without crossing the sandbox boundary.
- **Descoped cell(s):** `open-gen-ui-advanced` E2E (sandbox-attribute and round-trip assertions skipped).

### B10: `declarative-gen-ui` `generate_a2ui` secondary LLM stalls for KPI/StatusReport prompts (W8-7)

- **Symptom:** `/demos/declarative-gen-ui` KPI-dashboard and StatusReport pill flows regularly exceed 60s on Railway when the secondary LLM stage (which authors the a2ui JSON) stalls.
- **Evidence:** `showcase/packages/langgraph-python/tests/e2e/declarative-gen-ui.spec.ts` marks the KPI test and the StatusReport test `test.skip` with "See W8-7" at `declarative-gen-ui.spec.ts:118,140`. Scratch file does not mention this entry (**scratch not updated**).
- **Suspected cause:** secondary LLM call in the `a2ui_dynamic` agent graph is slow/flaky on Railway. KPI is the slowest of the 4 pills.
- **Suggested owner:** showcase-langgraph-python agent authoring.
- **Next step:** measure secondary-LLM latency distribution on Railway; consider prompt shrinking or model swap for the secondary stage.
- **Descoped cell(s):** `declarative-gen-ui` E2E (KPI + StatusReport flows skipped; ProductCard and VideoCard pills remain live).
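
B10's next step asks for a latency distribution rather than a single timing. As a rough sketch of how collected samples might be summarized against the 60s render budget, the helper below computes nearest-rank percentiles and the fraction of runs over budget. The function name, field names, and sample values are hypothetical and are not part of the showcase tooling; real samples would have to come from timed Railway runs.

```typescript
// Hypothetical helper: summarize secondary-LLM latencies (seconds) against a budget.
interface LatencySummary {
  count: number;
  p50: number;
  p90: number;
  max: number;
  overBudgetRatio: number; // fraction of samples strictly above the budget
}

// Nearest-rank percentile on an ascending-sorted array, with p in (0, 100].
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function summarizeLatencies(samplesSec: number[], budgetSec: number): LatencySummary {
  if (samplesSec.length === 0) {
    throw new Error("need at least one latency sample");
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = samplesSec.slice().sort((a, b) => a - b);
  const over = sorted.filter((s) => s > budgetSec).length;
  return {
    count: sorted.length,
    p50: percentile(sorted, 50),
    p90: percentile(sorted, 90),
    max: sorted[sorted.length - 1],
    overBudgetRatio: over / sorted.length,
  };
}

// Illustrative values only, not measured data.
const kpiSamplesSec = [18, 22, 25, 31, 40, 47, 55, 62, 71, 95];
console.log(summarizeLatencies(kpiSamplesSec, 60));
```

Comparing `p90` and `overBudgetRatio` per pill would show whether KPI alone is the outlier or whether the whole secondary stage sits close to the budget, which is the distinction that matters for choosing between prompt shrinking and a model swap.
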
### B11: `a2ui-fixed-schema` `display_flight` secondary LLM occasionally stalls (W8-8)

- **Symptom:** `/demos/a2ui-fixed-schema` `display_flight` flow occasionally stalls the secondary LLM stage past its 60s render budget.
- **Evidence:** `showcase/packages/langgraph-python/tests/e2e/a2ui-fixed-schema.spec.ts:31` carries the inline comment "W8-8: on Railway, `display_flight` occasionally stalls the secondary LLM stage; render budget is 60s." The spec still runs against the 60s budget: not skipped, but flaky. Scratch file does not mention this entry (**scratch not updated**).
- **Suspected cause:** same secondary-LLM latency cluster as B10.
- **Suggested owner:** showcase-langgraph-python agent authoring.
- **Next step:** bundle with B10 investigation; possibly raise the render budget to 90s or switch the secondary stage model.
- **Descoped cell(s):** none; the test still runs, and the flake is documented, not skipped.

### B12: `mcp-apps` Excalidraw MCP iframe fails to paint within 90s (W8-9)

- **Symptom:** The end-to-end MCP round-trip (agent → `create_view` → server-side resource fetch → activity event → iframe render) on `/demos/mcp-apps` regularly sits above 90s and intermittently fails to paint an iframe at all when the Excalidraw MCP server is slow.
- **Evidence:** `showcase/packages/langgraph-python/tests/e2e/mcp-apps.spec.ts` marks the flowchart flow and the explicit `create_view`-prompt flow `test.skip` with "See W8-9" at `mcp-apps.spec.ts:60,80`. Scratch file does not mention this entry (**scratch not updated**).
- **Suspected cause:** MCP Apps middleware latency or Excalidraw MCP upstream slowness.
- **Suggested owner:** showcase-langgraph-python deploy + MCP infrastructure.
- **Next step:** confirm whether the Excalidraw MCP server latency is the dominant factor; consider pre-warming or a cached-resource fallback.
- **Descoped cell(s):** `mcp-apps` E2E (round-trip flows skipped; presence + sandbox-contract assertions live).
### B13: `query_notes` occasionally does not fire without explicit keyword verb (W8-4)

- **Symptom:** `/demos/frontend-tools-async` `query_notes` tool fires reliably when the user prompt contains an explicit "search my notes" verb phrase, but the "Find project-planning notes" suggestion pill and the typed variant "Find my notes about project planning." occasionally do not trigger the tool within 45s; the agent answers in-context without firing.
- **Evidence:** during e2e authoring, the pill-click variant and the typed-prompt variant both timed out waiting on `[data-testid="notes-card"]` at 45s. The "Search my notes for 'auth'." typed variant and the zero-match "xyzzy-nonsense-keyword" variant succeeded reliably. Mitigation already landed in `showcase/packages/langgraph-python/tests/e2e/frontend-tools-async.spec.ts`: the pill test substitutes an explicit typed "Search my notes for 'auth'." prompt; the terminal assertion accepts either `notes-list` or the empty-state copy.
- **Suspected cause:** `frontend_tools_async` graph's system prompt does not consistently bias the model towards `query_notes` for "find … notes" phrasing.
- **Suggested owner:** showcase-langgraph-python agent authoring.
- **Next step:** harden the system prompt to always prefer `query_notes` when the prompt contains "notes", or update the suggestion pill copy to begin with "Search my notes for …" verbatim.
- **Descoped cell(s):** none; the test still runs after the pill→typed substitution, and the flake is documented, not skipped.

### B14: `chat-customization-css` theme.css loses cascade on Railway

- **Symptom:** On Railway the `chat-customization-css` demo intermittently loses the custom dashed-border and theme cascade: the `theme.css` overrides for `--copilot-kit-*` variables don't win over the default stylesheet load order.
- **Evidence:** Memory-only from this session's dashboard walk (user note). Not captured in `tests/e2e/chat-customization-css.spec.ts` comments; the spec asserts `theme.css` CSS variables on the `.chat-css-demo-scope` wrapper but the reported Railway flake is about the dashed-border visual, not the computed variables. Scratch file does not mention this entry (**scratch not updated**).
- **Suspected cause:** stylesheet load order on Railway's Next.js production build differs from local: `theme.css` is imported but not guaranteed to load after the default CopilotKit stylesheet under certain chunk-splitting conditions.
- **Suggested owner:** showcase-langgraph-python demo authoring.
- **Next step:** reproduce on Railway with a deterministic trigger; confirm import order in the production bundle; if needed, hoist the `theme.css` import or add a `@layer` wrapper to force cascade.
- **Descoped cell(s):** potentially `chat-customization-css` if the flake repros during Wave 1's final dashboard walk. Tracked but not pre-descoped.

### B15: v2 `CopilotChatInput` Enter-key submit is flaky on slow networks

- **Symptom:** On slow networks the Enter-key submit path in v2 `CopilotChatInput` intermittently drops the keystroke; tests using `page.keyboard.press("Enter")` after `fill()` flake. Workaround used across Wave 1 specs: click `[data-testid="copilot-send-button"]` instead.
- **Evidence:** every Wave 1 spec (`showcase/packages/langgraph-python/tests/e2e/*.spec.ts`) uses the `[data-testid="copilot-send-button"]` locator rather than Enter. No dedicated comment in-spec explains why, but the workaround is uniform. Memory-only from this session. Scratch file does not mention this entry (**scratch not updated**).
- **Suspected cause:** race between the controlled-input state update and the submit handler in v2 `CopilotChatInput` when Enter fires during an in-flight network tick.
- **Suggested owner:** v2 chat-input component (packages/).
- **Next step:** file an issue against the v2 chat-input package with a minimal repro; confirm whether the Enter handler awaits the latest controlled value.
- **Descoped cell(s):** none; the workaround is trivial.

### B16: `agentic-chat` suite fails against Railway: `background-container` testid absent

- **Symptom:** The `agentic-chat.spec.ts` suite asserts `[data-testid="background-container"]`, but on the deployed Railway demo that testid is not emitted; the deployed demo has drifted from source.
- **Evidence:** `showcase/packages/langgraph-python/tests/e2e/agentic-chat.spec.ts:13,20,89` all use `page.locator('[data-testid="background-container"]')`. The source under `src/app/demos/agentic-chat/page.tsx` does render the testid, but the Railway image appears to be from before a recent edit. Memory-only from this session. Scratch file does not mention this entry (**scratch not updated**).
- **Suspected cause:** Railway build is stale relative to the source tree; redeploy needed, or the deployed branch diverges from the worktree.
- **Suggested owner:** showcase-langgraph-python deploy.
- **Next step:** redeploy Railway from current HEAD; re-run the `agentic-chat.spec.ts` suite and confirm all assertions pass.
- **Descoped cell(s):** `agentic-chat` E2E remains pending a redeploy; tracked but not pre-descoped pending the Wave 1 post-merge dashboard walk.

### B17: `chat-slots` manifest `highlight` list omits two components

- **Symptom:** `packages/langgraph-python/manifest.yaml` `chat-slots` entry lists only `custom-welcome-screen.tsx` under `highlight:`. The demo actually uses three custom slot components: `custom-assistant-message.tsx` and `custom-disclaimer.tsx` are missing from the highlight list.
- **Evidence:**
  - `showcase/packages/langgraph-python/manifest.yaml:268-276` (`chat-slots` entry highlight list).
  - `showcase/packages/langgraph-python/src/app/demos/chat-slots/` contains `custom-assistant-message.tsx`, `custom-disclaimer.tsx`, `custom-welcome-screen.tsx`, and `page.tsx`.
  - Does not affect the dashboard (highlight list is not dashboard-consumed for this column). Minor hygiene only.
- **Suspected cause:** original manifest author added the first slot component and later additions were not back-filled.
- **Suggested owner:** showcase-langgraph-python demo authoring.
- **Next step:** add the two missing files to the `highlight:` array.
- **Descoped cell(s):** none.

## Summary

- **Total W8 / Wave 1 bug entries:** 17 (B1–B17).
- **Descoped cells from Wave 1 completeness:** 7: `chat-customization-css` (docs-og, via B3), `agentic-chat-reasoning` (E2E, via B4), `hitl-in-app` (E2E, via B5), `gen-ui-interrupt` (E2E, via B6), `readonly-state-agent-context` (E2E, via B7), `open-gen-ui` (E2E, via B8), `open-gen-ui-advanced` (E2E, via B9), plus partial descoping of `declarative-gen-ui` E2E (2 of 4 pills, via B10) and `mcp-apps` E2E (round-trip flows only, via B12).
- **Follow-up-only (no cell impact):** 8: B1, B2, B11, B13, B14, B15, B16, B17.

Entries B7–B12 and B14–B17 were captured in-code (Playwright spec comments, manifest, and session memory) but were not synced back to `docs/superpowers/plans/langgraph-python-column-wave1-bugs-scratch.md` during Wave 1. The scratch file currently covers only W8-1, W8-2, W8-3 (docs), W8-3 (E2E), W8-4, W8-5, and W8-6.
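
The sync gap between spec comments and the scratch file could be caught mechanically. A minimal sketch, assuming spec sources and the scratch file have already been read into strings: it extracts `W8-*` tags of the shapes used in this document and reports those that appear in specs but not in the scratch text. The function names, the tag regex, and the sample strings are illustrative assumptions; no such check exists in the repo as far as this document states.

```typescript
// Tag shapes seen in this document: W8-1 ... W8-9, W8-READONLY-1, W8-OGUI-2.
const TAG_PATTERN = /W8-(?:[A-Z]+-)?\d+/g;

// Returns the distinct tags found in a text, in first-seen order.
function extractTags(text: string): string[] {
  const matches: string[] = text.match(TAG_PATTERN) || [];
  return matches.filter((tag, i) => matches.indexOf(tag) === i);
}

// Maps each tag missing from the scratch text to the spec files that cite it.
function findUnsyncedTags(
  specSources: Record<string, string>,
  scratchText: string
): Record<string, string[]> {
  const known = extractTags(scratchText);
  const unsynced: Record<string, string[]> = {};
  for (const file of Object.keys(specSources)) {
    for (const tag of extractTags(specSources[file])) {
      if (known.indexOf(tag) !== -1) continue;
      (unsynced[tag] = unsynced[tag] || []).push(file);
    }
  }
  return unsynced;
}

// Stand-in strings for illustration; real input would be file contents.
const specs: Record<string, string> = {
  "open-gen-ui.spec.ts": "// See W8-OGUI-1",
  "hitl-in-app.spec.ts": "// See W8-5",
  "readonly-state-agent-context.spec.ts": "// See W8-READONLY-1",
};
const scratch = "W8-1 W8-2 W8-3 W8-4 W8-5 W8-6";
console.log(findUnsyncedTags(specs, scratch));
```

Entries with no `W8-*` tag at all (B14 through B17) would not be caught by a tag scan like this and would still need a manual sync.
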