Problem Statement
templates/commands/clarify.md and templates/commands/checklist.md each run their own interactive questioning loop and instruct the agent to render every question as a Markdown table with A–E options, asking the user to reply with a letter:
/clarify — up to 5 questions, | Option | Description | table, with a "Recommended" row prefixed by reasoning.
/checklist — its own independent 3-question loop (up to 5) using | Option | Candidate | Why It Matters |.
The same templates ship to every supported agent. Inside Claude Code specifically, this causes friction:
UX inconsistency. Claude Code exposes a native AskUserQuestion tool that renders structured pickers with labels, descriptions, keyboard navigation, and multi-select support. Every other interactive flow in that host uses it; spec-kit is the one that asks users to type A/B/C.
Ambiguous parsing. Free-text replies like 2, "the second one", or "recommended" have to be interpreted by the model. Most of the time it works, but when the "Recommended" row reorders options or a user paraphrases, it occasionally misroutes.
Lost structure. Option labels, descriptions, the Recommended — <reasoning> callout, and the optional free-form short-answer escape hatch all have to be re-encoded in prose each question. A structured tool call would carry them natively.
Other agents without a native structured-question tool should keep the current Markdown-table behavior — no regression.
A note on scope. No command internally invokes /clarify — /specify, /plan, and /tasks emit [NEEDS CLARIFICATION] markers and delegate to /speckit.clarify via handoffs, which only suggest the next command to the user. Fixing /clarify in isolation would still leave /checklist's parallel questioning loop untouched, which is why both commands belong in the same fix.
Proposed Solution
Apply the transform at skill-generation time in ClaudeIntegration, matching the pattern already used for argument-hint, user-invocable, and disable-model-invocation injection in src/specify_cli/integrations/claude/__init__.py:
Mark the questioning sections as transformable in templates/commands/clarify.md and templates/commands/checklist.md. Keep the current numbered-list instructions as the default body; add explicit begin/end markers (e.g., an HTML-comment fence like <!-- speckit:question-render:begin --> ... <!-- speckit:question-render:end -->) around the question-rendering block in each template so a post-processor can target it surgically without touching the surrounding logic (question selection, 5-question cap, spec-file writes, validation pass, etc.).
Extend ClaudeIntegration.setup() (or a helper it calls) so that when generating speckit-clarify/SKILL.md and speckit-checklist/SKILL.md, the fenced block is replaced with instructions that tell Claude Code to invoke AskUserQuestion with a structured payload:
question: the clarification/checklist question text
options[]: {label, description}, with the recommended option placed first and its description prefixed Recommended — <reasoning>.
multiSelect: false
A final synthetic "Provide my own short answer (≤10 words)" option preserves the current free-form escape hatch.
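As a concrete illustration, the replacement instructions would have Claude Code assemble a payload along these lines (a hypothetical sketch: the question text and option wording are invented, and the exact tool-invocation mechanics belong to the host):

```python
# Hypothetical AskUserQuestion payload for one /clarify question.
# The recommended option is placed first with its description prefixed
# "Recommended — <reasoning>"; the last option is the free-form escape hatch.
payload = {
    "question": "How should the system handle expired user sessions?",
    "options": [
        {
            "label": "Force re-authentication",
            "description": "Recommended — matches the security posture already in the spec",
        },
        {
            "label": "Silent token refresh",
            "description": "Extends the session without user interaction",
        },
        {
            "label": "Provide my own short answer (≤10 words)",
            "description": "Free-form escape hatch for an unlisted answer",
        },
    ],
    "multiSelect": False,
}
```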
Preserve all downstream behavior. For /clarify: the 5-question cap, category coverage map, incremental spec-file writes after each accepted answer, and the post-questioning validation pass. For /checklist: the 3-initial / up-to-5-total question cap and the Q1–Q5 labeling. Only the rendering of each question changes in either command.
Normalize the two table schemas. /clarify uses Option | Description and /checklist uses Option | Candidate | Why It Matters. The post-processor should map both to AskUserQuestion's {label, description} shape — Candidate becomes the option label, and Why It Matters goes into description. Document this mapping inline in the integration code so the two templates can diverge further in the future without silently breaking the transform.
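A minimal sketch of that normalization, assuming each row's cells arrive stripped, left to right (the function name, dict layout, and /clarify's cell indices are illustrative assumptions; only the /checklist mapping is pinned down by the proposal):

```python
# Positional mapping from each command's table schema to AskUserQuestion's
# {label, description} option shape. Header names map straight across for
# /clarify; /checklist's extra column shifts the indices (Candidate becomes
# the label, Why It Matters becomes the description, per the proposal).
SCHEMA = {
    "clarify": {"label": 0, "description": 1},    # Option | Description
    "checklist": {"label": 1, "description": 2},  # Option | Candidate | Why It Matters
}

def option_from_row(cells: list[str], command: str) -> dict[str, str]:
    # cells: the cell contents of one Markdown-table row, left to right.
    idx = SCHEMA[command]
    return {
        "label": cells[idx["label"]].strip(),
        "description": cells[idx["description"]].strip(),
    }
```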
No changes for other integrations. MarkdownIntegration, TomlIntegration, and other SkillsIntegration subclasses continue shipping the templates unchanged — the fence is plain Markdown that renders correctly as-is. Snapshot tests should assert that every non-Claude integration's generated output is byte-identical to main.
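The fence-targeting transform itself could be as small as one regex substitution at skill-generation time. This is a sketch under stated assumptions: the marker names follow the HTML-comment example above, and the helper name and instruction wording are invented, not the actual implementation.

```python
import re

# Matches everything between the begin/end markers, inclusive.
QUESTION_RENDER_FENCE = re.compile(
    r"<!-- speckit:question-render:begin -->.*?<!-- speckit:question-render:end -->",
    re.DOTALL,
)

# Replacement body shipped only in Claude-specific skill output.
ASK_USER_QUESTION_INSTRUCTIONS = (
    "Present each question by invoking the AskUserQuestion tool with "
    "{question, options[], multiSelect: false}; place the recommended option "
    "first and append the free-form escape-hatch option last."
)

def render_for_claude(skill_md: str) -> str:
    # Swap the fenced Markdown-table instructions for AskUserQuestion
    # instructions; everything outside the fence stays byte-for-byte intact.
    # (A callable avoids backslash-escape surprises in the replacement text.)
    return QUESTION_RENDER_FENCE.sub(
        lambda _match: ASK_USER_QUESTION_INSTRUCTIONS, skill_md
    )
```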
Out of scope for this issue
Changing question-selection logic, caps, or taxonomy in either command.
/implement's "proceed anyway? (yes/no)" and /analyze's remediation-approval prompt. These are binary confirms — the same fencing pattern would cover them trivially once it exists, but the UX win is small and they can land in a follow-up PR to keep this one focused.
/specify, /plan, /tasks, /constitution, /taskstoissues. The first three emit [NEEDS CLARIFICATION] markers rather than asking interactively; /constitution is mostly free-form; /taskstoissues doesn't prompt.
Any changes to the extension-hooks system (.specify/extensions.yml).
Why this shape
Matches spec-kit's existing architecture (see AGENTS.md): per-agent behavior lives in src/specify_cli/integrations/<key>/, templates stay agent-agnostic, and post-processing on skill generation is already an established pattern.
Zero risk to non-Claude integrations — the fence is invisible to them.
Preserves the "Recommended option + reasoning" UX in /clarify and the Candidate / Why It Matters semantics in /checklist; both move from table rows into structured AskUserQuestion option fields.
No new runtime abstraction, no new CLI flag, no schema migration.
Alternatives Considered
Do nothing — keep the Markdown table rendering. The current flow works on every agent and is low-maintenance. Rejected because inside Claude Code, it's a visibly worse UX than every other interactive flow in the host (structured pickers via AskUserQuestion), and the ambiguous-parsing failure mode is already observable when users paraphrase answers or the "Recommended" row shifts option ordering.
Fix /clarify only, leave /checklist for later. Smallest diff, fastest to land. Rejected because /checklist has a parallel, independent questioning loop with essentially the same UX problem, and the fencing + post-processor infrastructure is the same shape for both — splitting them means building the mechanism twice or leaving an obvious gap between the two PRs.
Runtime abstraction — add an ask_user() helper in the CLI layer that integrations plug into. More flexible long-term. Rejected because spec-kit commands are executed by the agent, not by the Python CLI; there's no runtime point at which the CLI could intercept a question to re-render it. The question-rendering instructions live in the template text the agent reads, so the only place to transform them is at template/skill generation time, which is exactly where ClaudeIntegration already operates.
Fork the templates per-agent (clarify.claude.md, clarify.default.md, etc.). Rejected because it duplicates a ~200-line template body to change one rendering block, and it fights spec-kit's "one template per command, per-agent behavior in integration subpackages" architecture described in AGENTS.md.
Make the template itself conditional — something like "if your host supports a structured-question tool, use it; otherwise render a table." Rejected because it pushes host-capability detection onto every agent, most of which would interpret it inconsistently, and it makes the template body harder to read for the common case.
Add a user-facing CLI flag (e.g. specify init --claude-native-questions). Rejected as overkill — this is a rendering choice with no semantic impact on the generated spec, so it shouldn't need user configuration. The right default for Claude Code is "use Claude Code's native tool"; anything else is a regression.
The proposed solution is the only approach that (a) fits spec-kit's existing integration-layer pattern, (b) keeps the template single-sourced, (c) has zero risk to other agents, and (d) doesn't introduce new user-visible configuration.
Component
Agent integrations (command files, workflows)
AI Agent (if applicable)
Claude Code
Use Cases
User running /speckit.clarify on a freshly-specified feature inside Claude Code.
Today: sees an ASCII table with a "Recommended" row, reads A–E descriptions, types A (or "yes"), repeats up to 5 times.
After: sees a native Claude Code picker with the recommended option pre-highlighted, selects with arrow keys / enter, no free-text disambiguation.
User running /speckit.checklist to generate a UI-consistency or test-coverage checklist.
Today: answers 3 (up to 5) clarifying questions rendered as an Option | Candidate | Why It Matters table, each requiring a letter reply.
After: the same 3–5 questions rendered as native pickers, with Why It Matters shown as the option description on hover/selection.
User pastes a one-word free-form answer (e.g., "OAuth2") instead of picking a listed option.
Today: works, but relies on the model parsing "OAuth2" against the (≤5 words) constraint described in prose.
After: the AskUserQuestion payload includes a dedicated "Provide my own short answer (≤10 words)" option that, when selected, prompts for a free-text response — the escape hatch is preserved but is now a first-class, unambiguous path.
User running the same command on a non-Claude agent (Copilot, Gemini, Windsurf, Cursor, etc.).
Today: sees the Markdown table.
After: still sees the Markdown table, byte-identical. No regression, no divergence in behavior across agents beyond the rendering layer.
Maintainer reviewing a PR that touches /clarify or /checklist logic.
Today: edits one template, same behavior everywhere.
After: edits one template; the fenced rendering block is transformed at skill-generation time for Claude only, so question-selection logic, caps, taxonomy, and spec-file writes remain single-sourced and agent-agnostic.
Acceptance Criteria
templates/commands/clarify.md contains a fenced rendering block (begin/end markers) around the question-presentation instructions in the sequential questioning loop. The block's default contents are the current Markdown-table instructions, unchanged in wording.
templates/commands/checklist.md contains an equivalent fenced rendering block around its step-2 clarification-question rendering. Default contents unchanged.
ClaudeIntegration.setup() in src/specify_cli/integrations/claude/__init__.py post-processes the generated speckit-clarify/SKILL.md and speckit-checklist/SKILL.md files, replacing the fenced block with instructions that direct Claude Code to invoke AskUserQuestion with a structured payload (question, options[], multiSelect: false).
The post-processor maps /clarify's Option | Description rows and /checklist's Option | Candidate | Why It Matters rows to AskUserQuestion's {label, description} shape, with the mapping documented inline in src/specify_cli/integrations/claude/__init__.py.
The recommended option (from /clarify's "Recommended" row) is placed first in the options[] array and its description is prefixed Recommended — <reasoning>.
A final synthetic "Provide my own short answer (≤10 words)" option is appended to preserve the free-form escape hatch.
All downstream behavior is preserved: the 5-question cap (/clarify), 3-initial/5-total cap (/checklist), category coverage map, incremental spec-file writes after each accepted answer, the post-questioning validation pass, and the Q1–Q5 labeling in /checklist.
Snapshot tests under tests/ assert that every non-Claude integration's generated output for clarify and checklist is byte-identical to main. No regression anywhere except the Claude-specific skill files. Relevant integration subpackages: src/specify_cli/integrations/.
Snapshot tests assert that the Claude-specific output contains AskUserQuestion invocation instructions and no longer contains the | Option | Description | table for /clarify (and the corresponding | Option | Candidate | Why It Matters | table for /checklist).
Manual end-to-end verification through /speckit.clarify and /speckit.checklist inside Claude Code confirms: questions render as native pickers, the recommended option is visibly pre-highlighted, free-form answers still work, and the resulting spec.md / checklist file is equivalent to what the current flow produces.
No new user-facing CLI flags, no schema migration, no changes to .specify/extensions.yml, and no changes to any other command template under templates/commands/.
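As a sketch of how the snapshot criteria above might be asserted (helper names are invented and the real tests under tests/ would read the generated files rather than take bytes and strings directly):

```python
def assert_snapshot_unchanged(generated: bytes, snapshot: bytes) -> None:
    # Non-Claude integration output must be byte-identical to the committed
    # snapshot; in a real test both arguments would come from Path.read_bytes().
    assert generated == snapshot, "non-Claude integration output diverged from main"

def assert_claude_skill_transformed(skill_md: str) -> None:
    # Claude-specific output gains AskUserQuestion instructions and must no
    # longer contain either command's Markdown option table.
    assert "AskUserQuestion" in skill_md
    assert "| Option | Description |" not in skill_md
    assert "| Option | Candidate | Why It Matters |" not in skill_md
```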
Additional Context
Claude Code's AskUserQuestion tool is the native structured-question mechanism used across Claude Code's own features and by user-authored skills. It accepts a question string, an options[] array of {label, description}, and a multiSelect boolean. See the Claude Code documentation for host context.
spec-kit's integration architecture is documented in AGENTS.md. Each agent has a subpackage under src/specify_cli/integrations/<key>/ with setup-time hooks; ClaudeIntegration already post-processes generated skill files to inject user-invocable, disable-model-invocation, and argument-hint — the proposed change extends the same pattern to the question-rendering block.
templates/commands/clarify.md step 4 is the exact block that would be fenced (Sequential questioning loop → "For multiple-choice questions" → table render).
templates/commands/checklist.md step 2 ("Clarify intent (dynamic)") contains the parallel questioning loop and the Option | Candidate | Why It Matters table.
No other command template under templates/commands/ runs a structured multi-option picker loop; the remaining interactive prompts are binary yes/no confirms in /implement and /analyze, which are out of scope for this issue.
No command internally invokes /clarify — /specify, /plan, and /tasks use [NEEDS CLARIFICATION] markers and handoffs that only surface a suggested next command.
Environment where the issue is visible: Claude Code (CLI, desktop app, VS Code / JetBrains extensions) running spec-kit skills installed via specify init with the claude integration selected. Not reproducible on other agents — by design.
No screenshots attached — the rendering difference is textual (Markdown table vs. native picker) and I'd rather not embed images of an unmerged design. Happy to add side-by-side captures before/after if a maintainer wants them for the PR.