agy receiver: prefix-strip extractor drifts after receiver restart; intermittent '[no reply produced by agy]' and stale-context replies #108

Description

@Interstellar-code

Agy receiver: prefix-strip extractor drifts after receiver restart, producing empty [no reply produced by agy] for one turn and stale-context replies thereafter

Summary

On v2611-agy-session we observed the receiver return
[no reply produced by agy] for one turn (Turn 2) even though the underlying
agy conversation did receive the prompt and did generate a reply. A later turn
(Turn 4) on the same contextId then replied with text that referenced an
earlier turn on that contextId. The user read this as cross-session
context bleed, but it was a correct within-session recall: agy had retained
the conversation server-side, while the receiver had lost its own in-memory
prefix anchor and never displayed the Turn 2 reply.

The combined effect is:

  1. The receiver occasionally displays [no reply produced by agy] even when
    agy produced a non-empty response.
  2. After a receiver restart, the receiver's persisted last_stdout is the
    previous receiver-process's view, not what agy will re-emit on the next
    resume. Prefix-strip fails and we fall back to "last non-empty line",
    which itself can be empty.
  3. Once agy resumes the conversation, it has its own server-side history
    that the receiver no longer faithfully mirrors in last_stdout. Future
    turns therefore look like they have richer context than what the
    receiver actually saw.

Environment

  • agy CLI v1.0.4 (macOS, signed in via Keychain)
  • agy_receiver (this plugin's templates/agy_receiver.py, v0.8.11)
  • repo_path (cwd for agy): /Users/rohits/.hermes/hermes-agent
  • Receivers restarted multiple times during the test (idle-timeout 1800s)

Repro transcript (from a2a-agy-transcript.jsonl and a2a-agy-sessions.json)

v2611-agy-session:

| ts (local) | dir | text |
| --- | --- | --- |
| 14:07:20 | hermes -> agy | PONG |
| 14:07:20 | agy -> hermes (ack) | Message received; executing in repo via Antigravity CLI. Reply will follow. [queued] |
| 14:07:39 | agy -> hermes | PONG — Handshake Acknowledged ... |
| 14:09:24 | (receiver log) | signal 15 received; shutting down (receiver restart) |
| 14:09:29 | (receiver log) | agy_receiver listening on http://127.0.0.1:9330 (receiver up) |
| 14:09:52 | hermes -> agy | PONG again. What color is the sky? |
| 14:09:58 | agy -> hermes | [no reply produced by agy] |
| ... | | 1h22m later, receiver self-teardown, manual redeploy ... |
| 15:32:39 | hermes -> agy | Continuing our chat. What is 7 * 6? Just the number. |
| 15:32:45 | agy -> hermes | 42 |
| 15:33:44 | hermes -> agy | Continuing. My favorite color was BLUE earlier in this thread. Do you remember? |
| 15:33:48 | agy -> hermes | Based on this session's history, the previous question asked "What color is the sky?" ... |

Persisted last_stdout for v2611-agy-session in a2a-agy-sessions.json
(after Turn 4) is:

**PONG — Handshake Acknowledged**\n
<table omitted>\n
...\n
42\n
Based on this session's history, the previous question asked ...\n

Note that the "What color is the sky?" reply from Turn 2 is missing from
last_stdout — yet it shows up in the Turn 4 reply. That means agy
generated a reply for Turn 2 (consistent with the conversation being
server-side stateful) but the receiver's extractor discarded it.

Root cause

run_agy_turn in templates/agy_receiver.py:

  • On a resume turn, the new stdout is expected to be the prior persisted
    last_stdout verbatim, followed by the new reply, so the extractor
    strips new_stdout[:len(prior_stdout)].
  • After a receiver restart, the new receiver reads last_stdout from
    a2a-agy-sessions.json and resumes the same agy conversation. But:
    • If the prior receiver process crashed before persisting, the persisted
      last_stdout is stale (or shorter than what agy will re-emit on resume).
    • The "last non-empty line" fallback (lines 62-63 of the file header) is
      fragile: it returns the last \n-delimited non-empty line, but a
      multi-line agy reply (e.g. a markdown table) splits across many lines
      and the chosen line may be a header or a stray cell, not the reply.
    • If agy itself returned empty for that turn (timeout / no-tool-needed
      response), the fallback is empty and we post
      [no reply produced by agy], masking the real outcome.

Once the persisted last_stdout diverges from what agy will emit on the
next resume, the divergence compounds on every subsequent turn. The
receiver appears to "work" (agy replies, hermes displays) but the
receiver's model of the conversation is no longer authoritative.
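
To make the failure mode concrete, here is a minimal sketch of a prefix-strip extractor with a "last non-empty line" fallback. It is an illustration of the behaviour described above, not the code in templates/agy_receiver.py: the function name extract_reply, the NO_REPLY constant, and the exact match check (str.startswith) are assumptions.

```python
from typing import Tuple

# Placeholder the receiver posts when nothing could be extracted.
NO_REPLY = "[no reply produced by agy]"


def extract_reply(prior_stdout: str, new_stdout: str) -> Tuple[str, bool]:
    """Return (reply, prefix_matched) for one resume turn.

    Hypothetical sketch: the real run_agy_turn may differ in detail.
    """
    if new_stdout.startswith(prior_stdout):
        # Happy path: agy re-emitted the persisted transcript verbatim,
        # so everything after it is the new reply.
        reply = new_stdout[len(prior_stdout):].strip()
        return (reply or NO_REPLY, True)

    # Drift path: the persisted prefix no longer matches what agy emits.
    # Falling back to the last non-empty line loses multi-line replies.
    lines = [ln.strip() for ln in new_stdout.splitlines() if ln.strip()]
    return (lines[-1] if lines else NO_REPLY, False)
```

With a matching prefix the whole new reply is recovered. With a stale prefix, a reply such as a three-line markdown table is reduced to its final row, and an all-blank stdout becomes the [no reply produced by agy] placeholder, with no way for the caller to tell those cases apart unless it inspects the prefix_matched flag.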

Suggested fix directions

  • Persist last_stdout to disk before the next turn is allowed to
    start, and atomically with the conversation_id. Currently it is updated
    in run_agy_turn only after agy returns; a crash in between leaves
    the on-disk last_stdout behind the live agy conversation.
  • For the "prefix does not match" branch, do not silently fall back to
    the last non-empty line. Either:
    • Re-fetch the conversation transcript from agy's own state
      (~/.gemini/antigravity-cli/cache/last_conversations.json is
      cwd-scoped and may not have the full history; agy v1.0.4 has no
      agy history <uuid> command), OR
    • Mark the contextId as prefix_drifted: true, post a warning
      transcript record, and continue with a defensive per-turn extractor
      (e.g. require the prompt to appear in stdout and extract the reply
      that follows).
  • Consider a --raw-reply style agy flag (if/when available) that emits
    only the new assistant message without re-echoing the transcript. That
    eliminates prefix-strip entirely.
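
The first two directions could be sketched roughly as follows. This is a sketch under stated assumptions, not a patch: persist_session and extract_after_prompt are hypothetical helper names, the JSON layout (one object per contextId holding conversation_id and last_stdout) is a guess at the shape of a2a-agy-sessions.json, and the extractor assumes agy echoes the prompt in stdout, as the suggestion above requires.

```python
import json
import os
import tempfile
from typing import Optional


def persist_session(path: str, context_id: str,
                    conversation_id: str, last_stdout: str) -> None:
    """Write conversation_id and last_stdout together, atomically.

    Assumed file layout: {context_id: {"conversation_id", "last_stdout"}}.
    """
    sessions = {}
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as fh:
            sessions = json.load(fh)
    sessions[context_id] = {
        "conversation_id": conversation_id,
        "last_stdout": last_stdout,
    }

    # Write to a temp file in the same directory, then rename over the
    # target, so a crash never leaves a half-written sessions file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(sessions, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def extract_after_prompt(prompt: str, stdout: str) -> Optional[str]:
    """Defensive per-turn extractor for a prefix-drifted context.

    Returns the text after the last occurrence of the prompt, or None
    when the prompt is absent or nothing follows it.
    """
    idx = stdout.rfind(prompt)
    if idx == -1:
        return None
    reply = stdout[idx + len(prompt):].strip()
    return reply or None
```

A caller would invoke persist_session after each agy return and before accepting the next turn, and would switch to extract_after_prompt once a contextId has been marked prefix_drifted. Returning None rather than an empty string lets the receiver distinguish "could not locate the reply" from a real reply and post a warning record instead of the generic placeholder.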

Severity

Medium. Functionality continues to work — agy still replies correctly
within its own server-side conversation — but the receiver's view of the
conversation drifts from the truth, and individual turns can be lost or
misattributed. This is observable in the Hermes Switch UI as empty bubbles
and as "I remember what you said" replies whose origin is ambiguous
without checking the underlying transcript.

Related

  • Hermes Switch UI dashboard (plugins/a2a_fleet/dashboard/plugin_api.py)
    correctly groups by (repo_path, contextId) and is NOT conflating
    sessions. The conflation users perceive is in the receiver's per-context
    state, not in the UI.

Steps to reproduce

  1. deploy_agy_receiver
  2. fleet_send --peer agy --text "PONG" --context-id repro-drift-001
    (or via direct JSON-RPC POST to :9330)
  3. Restart the receiver: kill <pid> (simulates idle teardown)
  4. fleet_send --peer agy --text "What color is the sky?" --context-id repro-drift-001
    → expect a non-empty reply; observe [no reply produced by agy]
  5. fleet_send --peer agy --text "What is 2+2?" --context-id repro-drift-001
    → observe "4" in transcript and a non-empty last_stdout that
    re-includes everything from turn 1's persisted last_stdout but is
    missing turn 2's reply.

Metadata

  • Assignees: none
  • Labels: bug (Something isn't working)
  • Projects: none
  • Milestone: none
  • Relationships: none
  • Development: no branches or pull requests