Proposal: fix agy receiver prefix-drift with atomic persistence + drift-flag (near-term); adopt A2A Handshake v2 (#71) as long-term solution
Summary
Issue #108 documents a concrete bug in the agy receiver's last_stdout prefix-strip extractor: after a receiver restart, the persisted last_stdout drifts from what the agy CLI's server-side conversation actually contains. This produces [no reply produced by agy] on individual turns and, on subsequent turns, replies that reference context the receiver never displayed.
This issue proposes a two-horizon fix:
Near-term: Atomic last_stdout persistence + prefix_drifted flag + warning transcript record → makes the receiver honest about when it lost track.
Long-term: Adopt the A2A Bidirectional Session Handshake v2 protocol (#71) so Hermes can detect session freshness at dispatch time and reconcile state before a turn lands in empty air.
The two are complementary: the near-term fix is a tactical patch that works within the current protocol; the long-term fix eliminates the entire class of bugs by making the orchestrator and executor share a common session model.
Near-term fix: atomic persistence + drift-flag
Root cause (from #108's transcript)
For v2611-agy-session:
Turn 1: last_stdout="PONG — Handshake Acknowledged".
14:09:24: the receiver is SIGTERM-restarted (idle teardown). The on-disk last_stdout is frozen at Turn 1.
14:09:52: Turn 2 arrives at the fresh receiver, which resumes the agy conversation via --conversation <uuid>. agy produces cumulative stdout: PONG handshake + <reply-to-sky>. The new receiver tries to prefix-strip using the persisted last_stdout (Turn 1 only), which should be a prefix of the new output. The extractor uses new_stdout[len(prior_stdout):], however, and if the persisted prefix is off by even a newline, the "last non-empty line" fallback kicks in. In this case the fallback returned the empty string "", producing [no reply produced by agy].
15:32:39: Turn 3 arrives. By now last_stdout has been updated (by Turn 2's empty result) to nothing useful. The prefix-strip succeeds, but only because the prior and new stdout happen to overlap. The extractor outputs "42". The "sky" reply is permanently invisible to Hermes; it lives only in agy's server-side conversation.
15:33:44: Turn 4. agy's server-side conversation has the full history (PONG, sky, 42, BLUE question). It replies referencing the sky question. Hermes sees this reply but never saw the sky answer, so the user reads it as a cross-session bleed, even though it is a correct within-session recall whose display gap is the extractor's fault.
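The failure mode in the timeline can be reproduced in a few lines. The sketch below is a minimal illustration, not the receiver's actual code: `extract_reply` is a hypothetical stand-in for the extractor (the real `_extract_reply` body is not shown in this issue), and the stdout strings are made up. It shows how a single stray newline in the persisted prefix defeats the `startswith` check and how the "last non-empty line" fallback then truncates a multi-line reply. It does not reproduce the exact empty-string result seen in #108.

```python
def extract_reply(prior_stdout: str, new_stdout: str) -> str:
    """Hypothetical stand-in for the receiver's prefix-strip extractor."""
    if new_stdout.startswith(prior_stdout):
        # Happy path: everything after the known prefix is the new reply.
        return new_stdout[len(prior_stdout):].strip()
    # Mismatch: fall back to the last non-empty line of the cumulative output.
    lines = [line for line in new_stdout.splitlines() if line.strip()]
    return lines[-1].strip() if lines else ""


# Illustrative data only: the persisted copy carries one extra trailing newline.
persisted = "PONG handshake\n\n"
cumulative = "PONG handshake\nThe sky is blue because of\nRayleigh scattering.\n"

# The prefix check fails, so only the last line of the two-line reply survives.
print(repr(extract_reply(persisted, cumulative)))  # 'Rayleigh scattering.'
```

With an exact prefix (`"PONG handshake\n"`), the same call returns both lines of the reply, which is why the drift is invisible until the persisted state and agy's output diverge.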
Proposed changes to templates/agy_receiver.py
1. Atomic last_stdout persistence
Problem: last_stdout is updated after run_agy_turn returns the extracted reply. If the receiver crashes between updating the in-memory store and flushing to disk (a2a-agy-sessions.json), the persisted last_stdout is stale and the next receiver starts behind. The key window is in poll_inbox.
Fix: Flush last_stdout to disk before agy runs, using the expected cumulative output (that is, take the lock, read the current last_stdout, and flush the predicted resume state). Alternatively, flush atomically after every successful agy return. This flush is non-negotiable and happens inside the per-contextId lock, using atomic_write (write to .tmp, then os.rename).
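A minimal sketch of the "flush atomically after every successful agy return" option follows. The function names, signatures, and the file-level lock are assumptions for illustration; only the write-to-temp-then-rename idea and the a2a-agy-sessions.json file come from the proposal. The sketch uses `os.replace` rather than `os.rename` because it also overwrites an existing target on Windows.

```python
import json
import os
import tempfile
import threading

# Assumption: a single lock guards the shared sessions file. The receiver's
# per-contextId lock is assumed to be held by the caller already.
_file_lock = threading.Lock()


def atomic_write(path: str, data: dict) -> None:
    """Write JSON to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def persist_last_stdout(path: str, context_id: str, new_stdout: str) -> None:
    """Read-modify-write one session record and flush it atomically."""
    with _file_lock:
        sessions = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                sessions = json.load(fh)
        sessions.setdefault(context_id, {})["last_stdout"] = new_stdout
        atomic_write(path, sessions)
```

Calling `persist_last_stdout(path, context_id, cumulative_stdout)` right after each successful agy return would narrow the crash window to the time between agy returning and the rename completing.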
2. prefix_drifted flag in a2a-agy-sessions.json
Problem: When the persisted last_stdout does not match the output agy produces on a resume turn (the first turn after a restart, or any turn after a crash), the extractor enters the "prefix does not match" branch (docstring line 62), which silently falls back to "last non-empty line". This fallback is fragile: multi-line markdown can be misread, and empty results mask the failure.
Fix: Add a prefix_drifted: bool field to the session record in a2a-agy-sessions.json:
    {
      "v2611-agy-session": {
        "conversation_id": "ae6ce7ce-...",
        "last_stdout": "...",
        "prefix_drifted": true,
        "drifted_at": 1780493888,
        "updated_at": 1780493888
      }
    }
When the prefix-strip detects a mismatch (not new_stdout.startswith(prior_stdout)):
Set prefix_drifted: true with a timestamp.
Emit a warning transcript record as a synthetic sys_warn message in the reply: a hermes.drifted_state entry in the JSON-RPC response, not part of the conversation text. Hermes (or the dashboard) can render this as an orange warning badge.
Attempt the "last non-empty line" fallback as before, but now Hermes knows the result is unreliable.
When the prefix-strip succeeds on a subsequent turn (no mismatch, meaning the drift self-healed or the extractor caught up):
Set prefix_drifted: false.
Emit a sys_info message as a hermes.drift_recovered entry.
This visibility is the single most important improvement: it converts a silent corruption into a first-class observable event.
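The flag transitions described above could look roughly like this. It is a sketch under stated assumptions: the function name and the shape of the returned event dict are hypothetical, while the record fields (last_stdout, prefix_drifted, drifted_at, updated_at) and the hermes.drifted_state / hermes.drift_recovered names come from the proposal.

```python
import time
from typing import Optional, Tuple


def extract_with_drift_flag(record: dict, new_stdout: str,
                            now: Optional[int] = None) -> Tuple[str, Optional[dict]]:
    """Return (reply, event) and update the session record in place."""
    now = int(time.time()) if now is None else now
    prior = record.get("last_stdout", "")
    was_drifted = record.get("prefix_drifted", False)
    event = None

    if new_stdout.startswith(prior):
        # Prefix matches: strip it and clear any earlier drift flag.
        reply = new_stdout[len(prior):].strip()
        record["prefix_drifted"] = False
        if was_drifted:
            # Hypothetical event shape for the sys_info record.
            event = {"type": "hermes.drift_recovered", "level": "sys_info", "at": now}
    else:
        # Prefix mismatch: keep the old fallback, but mark the result as unreliable.
        lines = [ln for ln in new_stdout.splitlines() if ln.strip()]
        reply = lines[-1].strip() if lines else ""
        record["prefix_drifted"] = True
        record["drifted_at"] = now
        # Hypothetical event shape for the sys_warn record.
        event = {"type": "hermes.drifted_state", "level": "sys_warn", "at": now}

    record["last_stdout"] = new_stdout
    record["updated_at"] = now
    return reply, event
```

The caller would attach a non-None event to the JSON-RPC response alongside the reply, and persist the updated record in the same locked, atomic flush as last_stdout.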
3. Return [incomplete reply — drift detected] instead of [no reply produced by agy]
When prefix_drifted is true AND the fallback-produced-reply is empty:
Return [drift detected — persisted last_stdout does not match agy's cumulative output] instead of [no reply produced by agy].
This tells the user/reviewer immediately that the receiver lost track, rather than suggesting agy failed to reply.
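As a small sketch of that rule, the placeholder choice can be isolated in one function. `finalize_reply` and the constant names are hypothetical; the two message strings are the ones quoted in this proposal (the `\u2014` escape is the dash used in the proposed drift message).

```python
DRIFT_PLACEHOLDER = (
    "[drift detected \u2014 persisted last_stdout does not match "
    "agy's cumulative output]"
)
NO_REPLY_PLACEHOLDER = "[no reply produced by agy]"


def finalize_reply(reply: str, prefix_drifted: bool) -> str:
    """Pick the text shown to Hermes when the extractor may have come up empty."""
    if reply.strip():
        return reply  # a real reply always wins
    # Empty result: blame drift only when the flag says the prefix was lost.
    return DRIFT_PLACEHOLDER if prefix_drifted else NO_REPLY_PLACEHOLDER
```

Keeping `[no reply produced by agy]` for the non-drifted case preserves the existing signal for turns where agy genuinely printed nothing.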
Long-term fix: adopt #71's A2A Handshake v2 protocol
The prefix-drift bug is fundamentally a session state reconciliation failure. The receiver thinks it knows what agy said last; agy knows what it said last; there is no protocol path to reconcile the two. #71's protocol closes this gap completely:
What #71 offers that prevents this bug class
SESSION ANNOUNCE session_fresh
A freshly-started receiver announces session_fresh: true. Hermes knows: "this receiver just booted; it may not have the prior turn's state." Hermes can re-send the last turn as context, or route the task to a receiver with a warm session.
SESSION ANNOUNCE mcp_health
Would detect whether the agy CLI's conversation store is accessible (via Keychain, last_conversations.json). If not, Hermes knows the receiver can't do multi-turn.
TASK DISPATCH reply_schema: structured
Hermes instructs agy to emit JSON-formatted replies with explicit context_fresh: true/false — structured enough that Hermes can detect whether the reply references prior context or is a fresh answer.
TASK RESULT status: error | partial
TASK DISPATCH continuation_from
Hermes explicitly tells agy which prior task this continues from. If the receiver was restarted and lost context, agy's server-side conversation can still resolve it — the dispatch tells the receiver which context ID to resume, and the receiver can confirm it has the right conversation UUID.
Adoption path
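Phase 1 (near-term patch): implement the atomic persistence + drift-flag fix above. This works today with no protocol changes.
Phase 2 (inline hints, per #71 Phase 1): embed session_fresh awareness in the Hermes gateway's existing fleet dispatch logic by checking the receiver's drift-flag record before dispatching high-stakes tasks and flagging prefix_drifted sessions for human review.
Later phases: the receiver emits SESSION ANNOUNCE on startup, including session_fresh. Hermes parses it, updates its peer profile cache, and uses it for routing decisions. Results then move to TASK RESULT envelopes, where the extractor's warning flags become first-class fields in the structured result.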
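The Phase 2 check could be as small as the sketch below. Everything about the interface is an assumption: the function name, the `high_stakes` parameter, and the idea that the gateway can read a2a-agy-sessions.json directly are illustrative only. Whether a drifted session should block dispatch or merely warn is still an open question in this proposal.

```python
import json
from pathlib import Path


def should_hold_dispatch(sessions_path: str, context_id: str, high_stakes: bool) -> bool:
    """Return True when a high-stakes task targets a session flagged as drifted."""
    path = Path(sessions_path)
    if not path.exists():
        return False  # no record yet, so nothing is known to have drifted
    sessions = json.loads(path.read_text(encoding="utf-8"))
    record = sessions.get(context_id, {})
    return bool(high_stakes and record.get("prefix_drifted", False))
```

A gateway using this would route held tasks to human review instead of dispatching them, and let low-stakes tasks through with the warning attached.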
Open questions
Should prefix_drifted: true block further dispatches to that contextId until a reconciliation turn (re-send the last expected prompt) succeeds? Or is the warning sufficient?
The atomic-write flush adds one disk write per agy turn. Since agy can take 10-30 s per turn, one write should be negligible, but confirm the overhead is acceptable.
For the prefix_drifted warning record: should it be emitted as a separate JSON-RPC message to Hermes (so the dashboard can render an inline badge), or just logged in the receiver's own log? I lean toward a separate message, since visibility is the whole point.
References
plugins/a2a_fleet/templates/agy_receiver.py (prefix-strip extractor in docstring lines 44-64; run_agy_turn and _extract_reply downstream)
plugins/a2a_fleet/context_store.py (in-memory LRU store; disk persistence via a2a-agy-sessions.json)
plugins/a2a_fleet/dashboard/plugin_api.py (correctly scopes buckets by (repo_path, contextId); the UI is not the source of confusion)
plugins/a2a_fleet/dashboard/manifest.json
v2611-agy-session thread in sidebar with Turn 4 reply visible but Turn 2's sky-answer missing: visually confirms the drift gap