Wave 1 of the langgraph-python column completeness effort surfaced the following issues while authoring QA checklists, E2E specs, and ops probes. Each is tracked for follow-up separately from Wave 1's merge.
- Descoped cell: the Wave 1 "green the column" declaration explicitly excludes this cell. The dashboard will show amber/red for it until the underlying cause is addressed.
- Follow-up: the issue doesn't block Wave 1 completion; filed here for later.
Entries are grouped by area (docs, backend-agent, probe plumbing, frontend
/ CSS, test infra). Cross-references use the `W8-*` tag as it appears in
`docs/superpowers/plans/langgraph-python-column-wave1-bugs-scratch.md`
and in inline `// See W8-*` comments inside Playwright specs under
`showcase/packages/langgraph-python/tests/e2e/`.
- Symptom: `scripts/probe-docs.ts` only validates URLs in `shared/feature-registry.json`. Per-integration overrides in `packages/<slug>/docs-links.json` are invisible to the probe, so `showcase/shell/src/data/docs-status.json` can report `notfound` for a URL that actually resolves 200, and conversely a broken override would not show red.
- Evidence: Post-Task-1.4 probe aggregate is `ok=0 notfound=60 error=0 missing=16` even though every langgraph-python cell except `chat-customization-css` renders ✓/✓ on the dashboard. The dashboard flips to green via `showcase/shell-dashboard/src/components/cell-pieces.tsx:36-57`, which trusts the override. Example: `og_docs_url` `https://docs.copilotkit.ai/langgraph/prebuilt-components` in `packages/langgraph-python/docs-links.json` is 200-verified but shows `notfound` in the probe output.
- Suspected cause: `probe-docs.ts` scope predates the `docs-links.json` override pattern; it reads only `REGISTRY_PATH` and never walks `packages/*/docs-links.json`.
- Suggested owner: showcase ops.
- Next step: either (a) extend `probe-docs.ts` to walk `packages/*/docs-links.json` and emit per-integration docs-status rows, or (b) teach `cell-pieces.tsx` to defer to probe state whenever a URL exists.
- Descoped cell(s): none; the dashboard is already green via the override. Affects probe accuracy column-wide but not visible cell state.
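
Option (a) could look roughly like the sketch below: the probe builds its target list from both the registry and the per-integration overrides, so an override URL is probed in its own right. The data shapes here are assumptions, since neither JSON schema is shown in this log; only the `og_docs_url` and `shell_docs_path` field names come from the evidence above, and `collectProbeTargets` is a hypothetical helper, not an existing function in `probe-docs.ts`.

```typescript
// Assumed shape of one entry in packages/<slug>/docs-links.json.
interface DocsLinkOverride {
  og_docs_url: string | null;
  shell_docs_path?: string | null;
}

interface ProbeTarget {
  integration: string | null; // null marks a registry-level fallback URL
  featureId: string;
  url: string;
}

// registry:  featureId -> fallback docs URL (assumed view of feature-registry.json)
// overrides: slug -> featureId -> override   (assumed view of docs-links.json files)
function collectProbeTargets(
  registry: Record<string, string>,
  overrides: Record<string, Record<string, DocsLinkOverride>>,
): ProbeTarget[] {
  const targets: ProbeTarget[] = [];
  for (const [featureId, url] of Object.entries(registry)) {
    targets.push({ integration: null, featureId, url });
  }
  for (const [slug, links] of Object.entries(overrides)) {
    for (const [featureId, link] of Object.entries(links)) {
      // A null og_docs_url means "no page yet"; there is nothing to probe.
      if (link.og_docs_url !== null) {
        targets.push({ integration: slug, featureId, url: link.og_docs_url });
      }
    }
  }
  return targets;
}
```

Each returned target carries its integration slug, so the probe could emit one docs-status row per integration rather than one per feature.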
- Symptom: Every `https://docs.copilotkit.ai/features/<id>` entry in `shared/feature-registry.json` returns the Next.js catch-all `[[...slug]]` page. This affects integrations that don't ship a `docs-links.json` override.
- Evidence: Curl of any `/features/<id>` URL returns 200 with `x-matched-path: /[[...slug]]` or `/integrations/[[...slug]]`. Probe output's `notfound=60` aggregate is almost entirely these fallback URLs. See `docs/superpowers/plans/langgraph-python-docs-audit.md` surprise #3.
- Suspected cause: registry URLs were written against an older docs IA (`/features/<id>`) that no longer exists.
- Suggested owner: docs IA.
- Next step: short-term, ensure every integration has a `docs-links.json` override. Long-term, update feature-registry URLs to point at integration-specific pages or drop the feature-level fallbacks.
- Descoped cell(s): none for langgraph-python (overrides cover every cell). Other integration columns may still render red until each ships its own override.
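
The soft-404 signal described in the evidence (HTTP 200 plus an `x-matched-path` header naming a catch-all route) can be turned into a small classifier. This is a sketch only: the status names mirror the `ok` / `notfound` / `error` buckets in the probe aggregate, but `classifyDocsResponse` is a hypothetical helper and the treatment of non-2xx codes is an assumption, not the probe's documented behaviour.

```typescript
type DocsStatus = "ok" | "notfound" | "error";

// status:      HTTP status code of the docs response
// matchedPath: value of the x-matched-path response header, or null if absent
function classifyDocsResponse(status: number, matchedPath: string | null): DocsStatus {
  if (status === 404) return "notfound";
  if (status < 200 || status >= 300) return "error"; // assumption: other non-2xx is an error
  // A 2xx served by a catch-all route is a soft 404: the page rendered,
  // but no dedicated document exists at that path.
  if (matchedPath !== null && matchedPath.includes("[[...slug]]")) return "notfound";
  return "ok";
}
```

Passing the header value in as a plain string keeps the check independent of whichever HTTP client the probe uses.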
- Symptom: langgraph-python ships a `chat-customization-css` demo but no dedicated CSS-customization page exists under docs.copilotkit.ai or shell-docs. The cell renders the "missing" state for og.
- Evidence:
  - `packages/langgraph-python/docs-links.json` entry for `chat-customization-css` has `og_docs_url: null` and `shell_docs_path: "/custom-look-and-feel/css"`.
  - `https://docs.copilotkit.ai/langgraph/custom-look-and-feel/css` soft-404s (catch-all `[[...slug]]`).
  - `https://docs.copilotkit.ai/custom-look-and-feel/css` also soft-404s.
  - No `integrations/langgraph/custom-look-and-feel/css.mdx` exists under `showcase/shell-docs/src/content/docs/` (a non-scoped `custom-look-and-feel/css.mdx` does exist, which shell resolution matches).
- Suspected cause: docs page was never authored.
- Suggested owner: docs.
- Next step: author `langgraph/custom-look-and-feel/css` (matching the `/slots` sibling) and the corresponding shell-docs mdx under `integrations/langgraph/custom-look-and-feel/css.mdx`. Then un-null `og_docs_url` in `packages/langgraph-python/docs-links.json`.
- Descoped cell(s): `chat-customization-css` docs-og.
- Symptom: `/demos/agentic-chat-reasoning` on `showcase-langgraph-python-production.up.railway.app` loads fine, but any typed prompt produces no `[data-testid="reasoning-block"]` and no `[data-role="assistant"]` bubble within 60s.
- Evidence:
  - Three consecutive E2E runs all time out at 60s on the reasoning-block locator.
  - Traces under `showcase/packages/langgraph-python/test-results/agentic-chat-reasoning-*`.
  - Same Railway host handles `frontend-tools` (5/5) and `frontend-tools-async` (2/3 LLM-dependent), so the deployment is up; the `reasoning_agent` graph specifically is non-responsive.
  - Mitigation already landed in `showcase/packages/langgraph-python/tests/e2e/agentic-chat-reasoning.spec.ts` (three `test.skip`s with TODO).
- Suspected cause: the `deepagents.create_deep_agent` / `init_chat_model` path in `src/agents/reasoning_agent.py` may be missing a Python dep or an OpenAI Responses-API permission on Railway, or the agent name mapping in `src/app/api/copilotkit/route.ts:76-77` (`agentic-chat-reasoning` → `reasoning_agent`) fails at the runtime layer.
- Suggested owner: showcase-langgraph-python deploy.
- Next step: tail Railway logs while hitting `/api/copilotkit` POST with an `agentic-chat-reasoning` agent run; confirm whether `reasoning_agent.graph` actually imports.
- Descoped cell(s): `agentic-chat-reasoning` E2E (reasoning-stream assertions skipped; page-load/submit-pipeline still live).
- Symptom: `/demos/hitl-in-app` on Railway loads fine; suggestion pills and the 3 ticket cards render. A typed prompt explicitly naming the tool and a ticket (e.g. "Use request_user_approval to ask me to approve a $50 refund on ticket #12345.") does not cause the agent to invoke the `useFrontendTool` handler. No `[data-testid="approval-dialog-overlay"]` portal appears; all three flows time out at 60s with two Playwright retries each.
- Evidence: traces under `showcase/packages/langgraph-python/test-results/hitl-in-app-*`. Mitigation in `tests/e2e/hitl-in-app.spec.ts`: three approval flows marked `test.skip` with TODO; page-load / ticket-card / suggestion-pill assertions remain live.
- Suspected cause: deployed `hitl_in_app_agent` graph may be missing the `request_user_approval` tool binding; or the agent-name mapping in `src/app/api/copilotkit/route.ts` does not route to a graph that receives frontend-tool registration; or the system prompt does not prime the model to call the tool for the typed prompt.
- Suggested owner: showcase-langgraph-python agent authoring / deploy.
- Next step: verify the HITL-in-app agent graph definition against the deployed image and confirm `useFrontendTool(request_user_approval)` is registered on the session by the time the user prompt is sent.
- Descoped cell(s): `hitl-in-app` E2E (approval flows skipped).
- Symptom: `/demos/gen-ui-interrupt` on Railway loads fine; suggestion pills render. Typed prompts naming the backend tool (e.g. "Use schedule_meeting to book an intro call …") do not trigger the `interrupt_agent` graph's `interrupt()` within 60s; no inline `[data-testid="time-picker-card"]` renders; both pick-a-slot and cancel flows time out.
- Evidence: traces under `showcase/packages/langgraph-python/test-results/gen-ui-interrupt-*`. Mitigation in `tests/e2e/gen-ui-interrupt.spec.ts`: two interrupt flows marked `test.skip` with TODO.
- Suspected cause: likely same cluster as B4 / B5. Either the `interrupt_agent` graph (shared with `interrupt-headless`) is not reaching its `interrupt()` on Railway, the `useInterrupt({ renderInChat: true })` primitive is not subscribing, or the `schedule_meeting` tool binding is stripped from the deployed graph.
- Suggested owner: showcase-langgraph-python agent authoring / deploy.
- Next step: hit `/api/copilotkit` with an `interrupt_agent` run while tailing Railway logs; confirm whether `schedule_meeting` is actually invoked and whether a LangGraph `interrupt()` is emitted on the SSE stream.
- Descoped cell(s): `gen-ui-interrupt` E2E (interrupt flows skipped).
- Symptom: `/demos/readonly-state-agent-context` on Railway loads, but the LLM round-trip for the "Who am I?" suggestion and the equivalent typed prompt stalls past 60s. There is no deterministic frontend tool side-effect to race against (the page simply expects an assistant bubble).
- Evidence: `showcase/packages/langgraph-python/tests/e2e/readonly-state-agent-context.spec.ts` marks both the suggestion flow and the typed-prompt flow `test.skip` with an inline "See W8-READONLY-1" pointer at `readonly-state-agent-context.spec.ts:76,96`. The scratch file does not mention this entry (scratch not updated).
- Suspected cause: Railway round-trip flakiness; with no frontend tool side-effect in the demo, a slow LLM cannot be distinguished from a dead graph.
- Suggested owner: showcase-langgraph-python agent authoring / deploy. Parallel: demo authoring could add a `data-testid="assistant-message"` marker on the assistant bubble to give the spec a deterministic structural signal.
- Next step: either fix the deployed agent's response latency or add the assistant-message testid so the spec can assert a structural signal without waiting on LLM text.
- Descoped cell(s): `readonly-state-agent-context` E2E (LLM round-trip assertions skipped).
- Symptom: `/demos/open-gen-ui` iframe mount exceeds the 120s per-test budget because the LLM has to author full HTML/CSS/JS before the iframe can paint. No reliable post-mount signal.
- Evidence: `showcase/packages/langgraph-python/tests/e2e/open-gen-ui.spec.ts` marks both the Quicksort suggestion path and the neural-network path `test.skip` with "See W8-OGUI-1" at `open-gen-ui.spec.ts:64,90`. The scratch file does not mention this entry (scratch not updated).
- Suspected cause: demo is inherently LLM-authoring-bound. The iframe content is fully generated per request; there is no short-circuit signal (no testid on mount, iframe is srcdoc-loaded and opaque to the host).
- Suggested owner: showcase-langgraph-python demo authoring.
- Next step: emit a `data-testid="ogui-iframe"` on mount (short-circuits the LLM wait), or narrow the prompt to reduce authoring latency on Railway.
- Descoped cell(s): `open-gen-ui` E2E (iframe-mount assertions skipped).
- Symptom: `/demos/open-gen-ui-advanced` mounts a `sandbox="allow-scripts"`-only iframe; the round-trip to the host (e.g. the `notifyHost` console log) cannot be asserted via Playwright's `contentFrame()` because `allow-scripts`-only iframes restrict cross-frame interaction.
- Evidence: `showcase/packages/langgraph-python/tests/e2e/open-gen-ui-advanced.spec.ts` marks the Ping mount and the `notifyHost` round-trip `test.skip` with "See W8-OGUI-2" at `open-gen-ui-advanced.spec.ts:63,92`. The scratch file does not mention this entry (scratch not updated).
- Suspected cause: shares B8's LLM-authoring latency; additionally the `allow-scripts` sandbox attribute by design prevents host-side introspection.
- Suggested owner: showcase-langgraph-python demo authoring.
- Next step: emit a post-mount testid or a host-visible console-log fixture the spec can assert against without crossing the sandbox boundary.
- Descoped cell(s): `open-gen-ui-advanced` E2E (sandbox-attribute and round-trip assertions skipped).
- Symptom: `/demos/declarative-gen-ui` KPI-dashboard and StatusReport pill flows regularly exceed 60s on Railway when the secondary LLM stage (which authors the a2ui JSON) stalls.
- Evidence: `showcase/packages/langgraph-python/tests/e2e/declarative-gen-ui.spec.ts` marks the KPI test and the StatusReport test `test.skip` with "See W8-7" at `declarative-gen-ui.spec.ts:118,140`. The scratch file does not mention this entry (scratch not updated).
- Suspected cause: secondary LLM call in the `a2ui_dynamic` agent graph is slow/flaky on Railway. KPI is the slowest of the 4 pills.
- Suggested owner: showcase-langgraph-python agent authoring.
- Next step: measure secondary-LLM latency distribution on Railway; consider prompt shrinking or model swap for the secondary stage.
- Descoped cell(s): `declarative-gen-ui` E2E (KPI + StatusReport flows skipped; ProductCard and VideoCard pills remain live).
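
For the "measure latency distribution" next step, a nearest-rank percentile summary is probably enough to decide whether the 60s budget is realistic. The sketch below assumes latencies have already been collected as millisecond samples (how they are collected on Railway is not covered here); `percentile` and `summarizeLatency` are hypothetical helper names.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("rank out of range");
  return value;
}

// Summarize a run against a render budget (e.g. the 60s budget used by the specs).
function summarizeLatency(samplesMs: number[], budgetMs: number) {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    overBudget: samplesMs.filter((ms) => ms > budgetMs).length,
  };
}
```

If p95 sits well above the budget, raising the timeout alone is unlikely to stabilise the spec, which would point towards the prompt-shrinking or model-swap options instead.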
- Symptom: `/demos/a2ui-fixed-schema` `display_flight` flow occasionally stalls the secondary LLM stage past its 60s render budget.
- Evidence: `showcase/packages/langgraph-python/tests/e2e/a2ui-fixed-schema.spec.ts:31` has the inline comment "W8-8: on Railway, `display_flight` occasionally stalls the secondary LLM stage; render budget is 60s." The spec still runs against the 60s budget: not skipped, but flaky. The scratch file does not mention this entry (scratch not updated).
- Suspected cause: same secondary-LLM latency cluster as B10.
- Suggested owner: showcase-langgraph-python agent authoring.
- Next step: bundle with B10 investigation; possibly raise the render budget to 90s or switch the secondary stage model.
- Descoped cell(s): none; the test still runs, and the flake is documented, not skipped.
- Symptom: The end-to-end MCP round-trip (agent → `create_view` → server-side resource fetch → activity event → iframe render) on `/demos/mcp-apps` regularly sits above 90s and intermittently fails to paint an iframe at all when the Excalidraw MCP server is slow.
- Evidence: `showcase/packages/langgraph-python/tests/e2e/mcp-apps.spec.ts` marks the flowchart flow and the explicit `create_view`-prompt flow `test.skip` with "See W8-9" at `mcp-apps.spec.ts:60,80`. The scratch file does not mention this entry (scratch not updated).
- Suspected cause: MCP Apps middleware latency or Excalidraw MCP upstream slowness.
- Suggested owner: showcase-langgraph-python deploy + MCP infrastructure.
- Next step: confirm whether the Excalidraw MCP server latency is the dominant factor; consider pre-warming or a cached-resource fallback.
- Descoped cell(s): `mcp-apps` E2E (round-trip flows skipped; presence + sandbox-contract assertions live).
- Symptom: `/demos/frontend-tools-async` `query_notes` tool fires reliably when the user prompt contains an explicit "search my notes" verb phrase, but the "Find project-planning notes" suggestion pill and the typed variant "Find my notes about project planning." occasionally do not trigger the tool within 45s; the agent answers in-context without firing.
- Evidence: during e2e authoring, the pill-click variant and the typed-prompt variant both timed out waiting on `[data-testid="notes-card"]` at 45s. The "Search my notes for 'auth'." typed variant and the zero-match "xyzzy-nonsense-keyword" variant succeeded reliably. Mitigation already landed in `showcase/packages/langgraph-python/tests/e2e/frontend-tools-async.spec.ts`: the pill test substitutes an explicit typed "Search my notes for 'auth'." prompt, and the terminal assertion accepts either `notes-list` or the empty-state copy.
- Suspected cause: `frontend_tools_async` graph's system prompt does not consistently bias the model towards `query_notes` for "find … notes" phrasing.
- Suggested owner: showcase-langgraph-python agent authoring.
- Next step: harden the system prompt to always prefer `query_notes` when the prompt contains "notes", or update the suggestion pill copy to begin with "Search my notes for …" verbatim.
- Descoped cell(s): none; the test still runs after the pill→typed substitution, and the flake is documented, not skipped.
- Symptom: On Railway the `chat-customization-css` demo intermittently loses the custom dashed-border and theme cascade; the `theme.css` overrides for `--copilot-kit-*` variables don't win over the default stylesheet load order.
- Evidence: Memory-only from this session's dashboard walk (user note). Not captured in `tests/e2e/chat-customization-css.spec.ts` comments; the spec asserts `theme.css` CSS variables on the `.chat-css-demo-scope` wrapper but the reported Railway flake is about the dashed-border visual, not the computed variables. The scratch file does not mention this entry (scratch not updated).
- Suspected cause: stylesheet load order on Railway's Next.js production build differs from local; `theme.css` is imported but not guaranteed to load after the default CopilotKit stylesheet under certain chunk-splitting conditions.
- Suggested owner: showcase-langgraph-python demo authoring.
- Next step: reproduce on Railway with a deterministic trigger; confirm import order in the production bundle; if needed, hoist the `theme.css` import or add a `@layer` wrapper to force the cascade.
- Descoped cell(s): potentially `chat-customization-css` if the flake repros during Wave 1's final dashboard walk. Track but not pre-descoped.
- Symptom: On slow networks the Enter-key submit path in v2 `CopilotChatInput` intermittently drops the keystroke; tests using `page.keyboard.press("Enter")` after `fill()` flake. Workaround used across Wave 1 specs: click `[data-testid="copilot-send-button"]` instead.
- Evidence: every Wave 1 spec (`showcase/packages/langgraph-python/tests/e2e/*.spec.ts`) uses the `[data-testid="copilot-send-button"]` locator rather than Enter. No dedicated comment in-spec explains why, but the workaround is uniform. Memory-only from this session. The scratch file does not mention this entry (scratch not updated).
- Suspected cause: race between the controlled-input state update and the submit handler in v2 `CopilotChatInput` when Enter fires during an in-flight network tick.
- Suggested owner: v2 chat-input component (packages/).
- Next step: file an issue against the v2 chat-input package with a minimal repro; confirm whether the Enter handler awaits the latest controlled value.
- Descoped cell(s): none; the workaround is trivial.
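
Since no in-spec comment explains the send-button workaround, one option is to centralise it in a shared helper that carries the explanation. The sketch below is an assumption about how that could look, not code taken from the specs: `PageLike` and `LocatorLike` are minimal structural stand-ins for the Playwright `Page` and `Locator` methods used (a real `Page` should satisfy them), `submitPrompt` is a hypothetical name, and the input selector is passed in because the specs' actual input locator is not recorded in this log.

```typescript
// Structural stand-ins for the subset of Playwright's API used here.
interface LocatorLike {
  fill(value: string): Promise<void>;
  click(): Promise<void>;
}
interface PageLike {
  locator(selector: string): LocatorLike;
}

const SEND_BUTTON = '[data-testid="copilot-send-button"]';

// Type a prompt and submit it by clicking the send button.
// Enter is deliberately avoided: the v2 chat input intermittently drops
// the keystroke on slow networks (see the symptom above).
async function submitPrompt(
  page: PageLike,
  inputSelector: string,
  prompt: string,
): Promise<void> {
  await page.locator(inputSelector).fill(prompt);
  await page.locator(SEND_BUTTON).click();
}
```

A spec would then call something like `await submitPrompt(page, "textarea", "Who am I?")`, where `"textarea"` is a placeholder selector.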
- Symptom: The `agentic-chat.spec.ts` suite asserts `[data-testid="background-container"]`, but on the deployed Railway demo that testid is not emitted; the deployed demo has drifted from source.
- Evidence: `showcase/packages/langgraph-python/tests/e2e/agentic-chat.spec.ts:13,20,89` all use `page.locator('[data-testid="background-container"]')`. The source under `src/app/demos/agentic-chat/page.tsx` does render the testid, but the Railway image appears to be from before a recent edit. Memory-only from this session. The scratch file does not mention this entry (scratch not updated).
- Suspected cause: Railway build is stale relative to the source tree; redeploy needed, or the deployed branch diverges from the worktree.
- Suggested owner: showcase-langgraph-python deploy.
- Next step: redeploy Railway from current HEAD; re-run the `agentic-chat.spec.ts` suite and confirm all assertions pass.
- Descoped cell(s): `agentic-chat` E2E remains pending a redeploy. Track but not pre-descoped, pending the Wave 1 post-merge dashboard walk.
- Symptom: `packages/langgraph-python/manifest.yaml` `chat-slots` entry lists only `custom-welcome-screen.tsx` under `highlight:`. The demo actually uses three custom slot components: `custom-assistant-message.tsx` and `custom-disclaimer.tsx` are missing from the highlight list.
- Evidence:
  - `showcase/packages/langgraph-python/manifest.yaml:268-276` (`chat-slots` entry highlight list).
  - `showcase/packages/langgraph-python/src/app/demos/chat-slots/` contains `custom-assistant-message.tsx`, `custom-disclaimer.tsx`, `custom-welcome-screen.tsx`, and `page.tsx`.
  - Does not affect the dashboard (highlight list is not dashboard-consumed for this column). Minor hygiene only.
- Suspected cause: original manifest author added the first slot component and later additions were not back-filled.
- Suggested owner: showcase-langgraph-python demo authoring.
- Next step: add the two missing files to the `highlight:` array.
- Descoped cell(s): none.
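
To keep this kind of drift from recurring, a small check could compare each demo's `highlight:` list with the files in its directory. The sketch below assumes highlight entries are bare file names and that `page.tsx` is intentionally left out of the list; neither is confirmed by the manifest excerpt here, and `missingHighlights` is a hypothetical helper.

```typescript
// Return demo files that are absent from the manifest highlight list.
// `exclude` holds files assumed not to belong in the list (here: page.tsx).
function missingHighlights(
  highlight: string[],
  demoFiles: string[],
  exclude: string[] = ["page.tsx"],
): string[] {
  const listed = new Set(highlight);
  return demoFiles
    .filter((file) => !listed.has(file) && exclude.indexOf(file) === -1)
    .sort();
}

// File names taken from the chat-slots evidence above.
const missing = missingHighlights(
  ["custom-welcome-screen.tsx"],
  [
    "custom-assistant-message.tsx",
    "custom-disclaimer.tsx",
    "custom-welcome-screen.tsx",
    "page.tsx",
  ],
);
// missing is ["custom-assistant-message.tsx", "custom-disclaimer.tsx"]
```

Run against the `chat-slots` data, it reports the same two files named in the symptom.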
- Total W8 / Wave 1 bug entries: 17 (B1–B17).
- Descoped cells from Wave 1 completeness (7): `chat-customization-css` (docs-og, via B3), `agentic-chat-reasoning` (E2E, via B4), `hitl-in-app` (E2E, via B5), `gen-ui-interrupt` (E2E, via B6), `readonly-state-agent-context` (E2E, via B7), `open-gen-ui` (E2E, via B8), `open-gen-ui-advanced` (E2E, via B9), plus partial descoping of `declarative-gen-ui` E2E (2 of 4 pills, via B10) and `mcp-apps` E2E (round-trip flows only, via B12).
- Follow-up-only, no cell impact (8): B1, B2, B11, B13, B14, B15, B16, B17.
Entries B7–B12 and B14–B17 were captured in-code (Playwright spec
comments, manifest, and session memory) but were not synced back to
`docs/superpowers/plans/langgraph-python-column-wave1-bugs-scratch.md`
during Wave 1. The scratch file currently covers only W8-1, W8-2, W8-3
(docs), W8-3 (E2E), W8-4, W8-5, and W8-6.