Matrix Coder workflows (ralph/autopilot/ultrawork/ultraqa) are persona text only — no programmatic loop, iteration cap, or cross-turn state

**Epic:** #76  
**Affected phases:** #115 (Phase 3 — Workflow/loop skills)

## What was claimed

Phase 3 (#115) stated:

> A user can run an end-to-end autonomous coding loop (e.g. autopilot or Ralph) that self-corrects via verify and is observable on Switch UI.

The Phase 3 scope listed concrete loop semantics:
- **Ralph** — repeat \`executor → verify\` until verification passes, bounded, with stop criteria
- **autopilot** — full chain \`plan → executor → test → review → verify\` end-to-end
- **ultrawork** — parallel high-throughput fan-out of roles across files/topics via \`delegate_task\`
- **ultraqa** — \`test → verify → fix\` cycle until suite is green or 5-cycle cap
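
The four claimed semantics can be written down as data, which makes it easier to see what a programmatic driver would need to honour. The following is a minimal sketch, not existing code: `WorkflowSpec`, its fields, and the `WORKFLOWS` table are hypothetical names, and only the stage names and the 5-cycle cap come from the scope list above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class WorkflowSpec:
    """Hypothetical description of one workflow's loop semantics."""
    stages: Tuple[str, ...]      # roles dispatched in order within one pass
    repeat: bool                 # True if the stage sequence loops
    max_cycles: Optional[int]    # iteration cap; None where the scope gives no number
    parallel: bool = False       # True if roles fan out concurrently


# Stage names and the ultraqa cap are copied from the Phase 3 scope list.
WORKFLOWS = {
    # "bounded", but the scope does not state a number for Ralph.
    "ralph": WorkflowSpec(("executor", "verify"), repeat=True, max_cycles=None),
    # One end-to-end pass through the full chain.
    "autopilot": WorkflowSpec(
        ("plan", "executor", "test", "review", "verify"), repeat=False, max_cycles=1
    ),
    # The scope names no fixed role sequence for the fan-out, so stages is empty.
    "ultrawork": WorkflowSpec((), repeat=False, max_cycles=1, parallel=True),
    "ultraqa": WorkflowSpec(("test", "verify", "fix"), repeat=True, max_cycles=5),
}
```

A table like this is one possible input for a loop driver; none of it is enforced by the persona files today.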

## What actually exists

All four workflow \`.md\` persona files are present and well-written:
- \`plugins/matrix_coder/personas/workflows/ralph.md\`
- \`plugins/matrix_coder/personas/workflows/autopilot.md\`
- \`plugins/matrix_coder/personas/workflows/ultrawork.md\`
- \`plugins/matrix_coder/personas/workflows/ultraqa.md\`

When a user sends \`matrix ralph: fix auth tests\`, \`core/harness.py\` loads \`ralph.md\` via \`registry.load_persona()\`, composes it with the base contracts, and injects the resulting text into the current turn via the \`pre_llm_call\` hook. The agent then reads procedural instructions and may attempt to self-direct.
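
As a rough illustration of that single-turn path, here is a minimal sketch under stated assumptions. `load_persona` below is a stand-in for `registry.load_persona()`, whose real signature is not shown in this issue; `parse_trigger`, `BASE_CONTRACTS`, and the `PERSONAS` dict are hypothetical, and the persona text is a placeholder rather than the contents of `ralph.md`.

```python
from typing import Dict, Optional, Tuple

# Placeholder for the base contracts the harness composes with the persona.
BASE_CONTRACTS = "BASE CONTRACTS: follow repository conventions."

# Stand-in for the persona files on disk; the real loader reads the .md files.
PERSONAS: Dict[str, str] = {
    "ralph": "# Ralph\nRepeat executor -> verify until verification passes.",
}


def load_persona(name: str) -> Optional[str]:
    """Hypothetical stand-in for registry.load_persona()."""
    return PERSONAS.get(name)


def parse_trigger(message: str) -> Optional[Tuple[str, str]]:
    """Split 'matrix <workflow>: <goal>' into (workflow, goal), or None."""
    prefix, sep, goal = message.partition(":")
    parts = prefix.split()
    if not sep or len(parts) != 2 or parts[0] != "matrix":
        return None
    return parts[1], goal.strip()


def pre_llm_call(message: str) -> str:
    """Return this turn's prompt, with the persona prepended when triggered."""
    trigger = parse_trigger(message)
    if trigger is None:
        return message
    workflow, goal = trigger
    persona = load_persona(workflow)
    if persona is None:
        return message
    # Text injection only: nothing here dispatches roles or counts iterations.
    return f"{BASE_CONTRACTS}\n\n{persona}\n\nGoal: {goal}"
```

The function returns a string and nothing else, which is the point: whatever looping happens afterwards is up to the model.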

**Nothing more happens programmatically.** Specifically:

- No Python code in \`core/\` implements any loop, gate-progression, or iteration
- \`delegate_task\` appears only inside persona text as an instruction to the agent — no code calls it
- There is no iteration counter, no 5-cycle cap, no stop-criteria check
- There is no cross-turn state: if the agent produces a response and the next user message is unrelated, the workflow context is gone
- The \`_inject_persona\` hook clears the injected persona after every non-trigger turn, so a multi-turn loop would need to re-trigger on each turn or persist state elsewhere
- The only test for workflows is \`test_real_loader_workflow_composes\`, which asserts \`"Ralph" in out\` (persona text was injected), not that any loop structure ran
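
To make the cross-turn gap in the bullets above concrete, the sketch below contrasts a turn-scoped hook with a session-keyed store. Both classes are hypothetical models written for this issue: `TurnScopedHook` only mimics the clearing behaviour described above, and `SessionStore` is one possible shape for persisted state, not an existing API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TurnScopedHook:
    """Models the described behaviour: the persona lasts for one trigger turn."""
    active_persona: Optional[str] = None

    def on_turn(self, message: str) -> Optional[str]:
        if message.startswith("matrix ralph:"):
            self.active_persona = "ralph"
        else:
            self.active_persona = None  # cleared on every non-trigger turn
        return self.active_persona


@dataclass
class WorkflowState:
    workflow: str
    goal: str
    iteration: int = 0


@dataclass
class SessionStore:
    """Hypothetical session-keyed store; state survives until finished."""
    active: Dict[str, WorkflowState] = field(default_factory=dict)

    def start(self, session_id: str, workflow: str, goal: str) -> WorkflowState:
        state = WorkflowState(workflow, goal)
        self.active[session_id] = state
        return state

    def get(self, session_id: str) -> Optional[WorkflowState]:
        return self.active.get(session_id)

    def finish(self, session_id: str) -> None:
        self.active.pop(session_id, None)
```

With the hook, an unrelated follow-up message drops the workflow; with the store, the iteration count is still available on the next turn until `finish` is called.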

The existing code is correct for single-turn persona injection. It is not an implementation of the loop/orchestration semantics Phase 3 described.

## What "done" looks like

A complete Phase 3 implementation requires at minimum:

1. **Loop driver in \`harness.py\`** — a \`run_workflow(workflow, goal, session_id)\` function that:
   - Dispatches the appropriate sequence of specialist roles in order (or in parallel for ultrawork)
   - Tracks iteration count and enforces the configured cap (default 5)
   - Checks a stop condition (e.g. \`verify\` returns pass) between iterations
   - Returns a \`SpecialistResult\` summarising the final outcome

2. **Cross-turn state** — workflow sessions need to survive across turns. Options: store active workflow state in \`HermesBridge\` (keyed by session_id), or require the full loop to complete within a single invocation via \`delegate_task\`.

3. **\`delegate_task\` integration** — \`ultrawork\` fan-out requires actual calls to \`delegate_task\` from Python, not just instructing the agent to do so in persona text.

4. **Kanban child cards per iteration** — Phase 2 (#114) specified child cards per specialist dispatch; currently only a single parent card is created per invocation. Loop iterations should open child cards.

5. **Tests** — test that the loop iterates, that the cap fires, that a passing \`verify\` step halts the loop, and that a blocked gate surfaces correctly.
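
The loop driver, cap, stop condition, and fan-out from the list above could look roughly like the following. This is a minimal sketch under assumptions, not a proposed final design: the `SpecialistResult` fields are guessed, `stages` and `dispatch` stand in for however the real `run_workflow` would resolve a workflow name and call specialists, `delegate` is a placeholder for a Python-side `delegate_task` call, and `make_flaky_verify` is a test double. Kanban child cards are not modelled.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class SpecialistResult:
    """Assumed shape; the real SpecialistResult fields are not shown here."""
    passed: bool
    iterations: int
    reason: str


# dispatch(role, goal, session_id) -> True when that role's step succeeds.
Dispatch = Callable[[str, str, str], bool]


def run_workflow(stages: Sequence[str], goal: str, session_id: str,
                 dispatch: Dispatch, max_iterations: int = 5,
                 stop_role: str = "verify") -> SpecialistResult:
    """Repeat the stage sequence until stop_role passes or the cap fires."""
    for iteration in range(1, max_iterations + 1):
        for role in stages:
            ok = dispatch(role, goal, session_id)
            # Stop condition: halt as soon as the stop role reports a pass.
            if role == stop_role and ok:
                return SpecialistResult(True, iteration, f"{stop_role} passed")
    return SpecialistResult(False, max_iterations, "iteration cap reached")


def fan_out(goals: Sequence[str], delegate: Callable[[str], bool],
            max_workers: int = 4) -> List[bool]:
    """Run one delegate call per goal concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(delegate, goals))


def make_flaky_verify(pass_on: int) -> Dispatch:
    """Test double: verify fails until its pass_on-th call; other roles pass."""
    calls = {"verify": 0}

    def dispatch(role: str, goal: str, session_id: str) -> bool:
        if role != "verify":
            return True
        calls["verify"] += 1
        return calls["verify"] >= pass_on

    return dispatch
```

With `make_flaky_verify(3)`, a Ralph-style run over `("executor", "verify")` halts on the third iteration; with a verify step that never passes, the default cap of 5 ends the loop. Those two cases map directly onto the "loop iterates", "cap fires", and "passing verify halts the loop" tests listed in item 5.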

## Impact

Users invoking \`matrix ralph\`, \`matrix autopilot\`, \`matrix ultrawork\`, or \`matrix ultraqa\` get the workflow persona injected for that turn, and the agent may attempt to self-direct. Nothing guarantees gate-by-gate progression, an iteration cap, an automatic stop on success, or multi-turn coordination. The behaviour depends entirely on the model following the persona instructions within a single context window.
