
A2A session continuity silently lost when inbound contextId is misplaced (params.contextId instead of params.message.contextId) #148

@Interstellar-code

Summary

When an inbound A2A JSON-RPC request to a fleet receiver (verified on
codex_receiver.py, plausible on cc_receiver.py / oc_receiver.py /
agy_receiver.py — same envelope shape) places contextId at the
wrong nesting level — i.e. params.contextId instead of the
A2A-spec-correct params.message.contextId — the receiver
silently mints a fresh anon-<uuid4> contextId for that turn.
The HTTP response echoes back the original caller-supplied
contextId, so the caller believes the session was threaded, while
the underlying executor (Codex / Claude Code / OpenCode / Antigravity)
actually starts a brand-new conversation.

In effect: the session-continuity wire protocol is asymmetric and the
failure mode is silent. A2A's contextId is a coroutine, not a
retry key; treating it as either will produce this class of bug.

Family

This is the same class as #108 (agy extractor drift → [no reply produced by agy]), #82 (stale session_id not cleared → dead-session
loop), and #97 (codex CLI arg drift → rc=1). All three are receivers
that appear to thread context but don't — the gap between
metadata echo and actual executor state. Per #71 / #146, this whole
bug class disappears once Hermes owns the session model rather than
delegating it to receivers; that work is tracked separately. Until
then, this issue proposes the smallest durable hardening for the
senders that DO exist.

Where the receiver is correct

plugins/a2a_fleet/templates/codex_receiver.py:1103-1104:

message = params.get("message") or {}
context_id = message.get("contextId") or f"anon-{uuid.uuid4()}"

This is correct per the A2A spec — contextId belongs inside the
message object, not at the params root. The same convention is
used at line 52 for outbound replies to Hermes:

"params": {"message": {"role": "agent",
                       "parts": [{"text": "<result>"}],
                       "contextId": "<same contextId>"}}

And in plugins/a2a_fleet/client.py:31-43 (_send_message_payload):

message: Dict[str, Any] = {"role": "user",
                           "parts": [{"text": text}]}
if context_id is not None:
    message["contextId"] = context_id
return {... "method": "SendMessage",
        "params": {"message": message}}

So the official sender is correct. The bug lives in external
senders: anyone using curl, scripts, tests, or a wrapper that
mints the JSON-RPC envelope by hand.

Real-world trigger

In our 0.17 rebase planning (procedure doc at
/Users/rohits/hermes/research/Plans/hermes-0.17-rebase-procedure.md),
I dispatched a review task to a Codex receiver via direct JSON-RPC
POST (not via fleet_send) using the natural reading of the A2A
spec text — params.contextId at the root. The receiver:

  1. ACKed with HTTP 200 and echoed my contextId back as the
    contextId field in the response envelope.
  2. Internally minted anon-<uuid> and ran a brand-new codex exec
    (no codex exec resume), so the executor started from zero.
  3. Returned a structured reply that looked like a follow-up to my
    prior turn but actually was generated without the prior context.

I reported the dispatch as "session continuity verified" on the basis
of the echoed contextId. Rohit caught it: the second Codex turn
behaved as a new session, not a continuation. The receiver metadata
echo was theater.

Environment

  • codex_receiver v0.8.14 (current main)
  • codex-cli 0.142.0
  • repo_path: /Users/rohits/.hermes/hermes-agent
  • Sender: raw curl / python3 -c "import requests..." / hand-rolled
    Python wrappers that bypass plugins/a2a_fleet/client.py

Repro (verified)

  1. Deploy a fresh codex receiver:

    deploy_codex_receiver(repo_path="/Users/rohits/.hermes/hermes-agent",
                          bind_port=9320)
    
  2. Send PONG with correct nesting (params.message.contextId) and
    confirm the thread.started event in a2a-codex-transcript.jsonl:

    {"jsonrpc":"2.0","id":"r1","method":"SendMessage",
     "params":{"message":{"role":"user",
                          "parts":[{"text":"PONG"}],
                          "contextId":"session-test-001"}}}
    

    ✓ Codex returns PONG with first-turn thread.started.

  3. Follow-up with wrong nesting (params.contextId at root):

    {"jsonrpc":"2.0","id":"r2","method":"SendMessage",
     "params":{"contextId":"session-test-001",
               "message":{"role":"user",
                          "parts":[{"text":"What did I just send you?"}]}}}
    

    ✗ Receiver returns 200, echoes contextId: session-test-001,
    but internally mints anon-<uuid> and runs a fresh codex exec
    (no resume). The reply is generated without prior context.

  4. Check a2a-codex-sessions.json — the new anon-... entry is
    created; the session-test-001 thread from step 2 is not
    resumed.
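
The divergence between steps 2 and 3 can be reproduced offline by applying the receiver's two extraction lines to both payloads. This is a minimal sketch: `resolve_context_id` is a hypothetical wrapper around the logic quoted from codex_receiver.py above, not a function that exists in the repo.

```python
import uuid


def resolve_context_id(params: dict) -> str:
    """Mirror the receiver's lookup: only params.message.contextId is read."""
    message = params.get("message") or {}
    return message.get("contextId") or f"anon-{uuid.uuid4()}"


# Step 2 payload: contextId nested inside the message object.
correct = {
    "message": {
        "role": "user",
        "parts": [{"text": "PONG"}],
        "contextId": "session-test-001",
    }
}

# Step 3 payload: contextId at params root, which the lookup never sees.
misplaced = {
    "contextId": "session-test-001",
    "message": {"role": "user", "parts": [{"text": "What did I just send you?"}]},
}

print(resolve_context_id(correct))    # the caller's id is used
print(resolve_context_id(misplaced))  # a fresh anon-<uuid4> is minted instead
```

The second call never raises, which is why the sender gets no signal that its id was dropped.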

Why this is dangerous

  1. Silent failure. Caller sees HTTP 200, echoed contextId, and
    a structured reply. There is no error, no warning, no log line
    that says "ignored params.contextId, minted anon". The caller
    has no way to detect the loss of context without comparing
    executor-side behavior (e.g. Codex starts a new codex exec
    instead of codex exec resume).
  2. Asymmetric. codex_receiver.py:52 puts contextId correctly
    in outbound, but inbound accepts (and silently drops) misplaced
    ones. There's no single source of truth for the A2A envelope
    shape that all senders must follow.
  3. Compounds with #108 (agy receiver: prefix-strip extractor drifts
    after receiver restart; intermittent '[no reply produced by agy]'
    and stale-context replies). In the agy bug, the receiver displayed
    stale-but-misleading context. In this bug, the receiver displays
    nothing wrong: the failure is on the executor side, not the
    receiver. A user running multi-turn reviews across Codex + CC
    + Agy could see one of the three silently lose context and
    have no way to tell which.
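
Until the receiver fails loudly, one way to notice the loss is to diff the receiver's session map around a turn, as in step 4 of the repro. This is a rough sketch under a stated assumption: a2a-codex-sessions.json is taken to be a JSON object keyed by contextId, which the issue does not confirm. `load_sessions` and `silently_minted` are hypothetical helper names.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_sessions(path: str) -> Dict[str, object]:
    """Read the session map. ASSUMPTION: top-level JSON object keyed by contextId."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def silently_minted(before: Dict[str, object], after: Dict[str, object]) -> List[str]:
    """Return the anon-* keys that exist after the turn but did not exist before it."""
    new_keys = after.keys() - before.keys()
    return sorted(key for key in new_keys if key.startswith("anon-"))
```

Take one snapshot before the POST and one after it. A non-empty result on a turn that supplied a contextId suggests the turn was not threaded onto the existing session.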

Proposed fix

Three layered hardening steps, in order of cost:

1. Strict inbound envelope validation (low cost, immediate)

In all four receivers
(templates/{codex,cc,oc,agy}_receiver.py), in the
do_POST → /jsonrpc handler, reject any payload where
contextId appears at params.contextId (the wrong nesting) with
a typed JSON-RPC error -32602 (invalid params) and the message:

"contextId must be nested under params.message, not at params root (A2A spec)"

Make this a hard fail, not a silent mint. The cost is one
~5-line if "contextId" in params: self._json(...invalid params...)
per receiver. The win is that any future buggy sender fails loud
instead of silently dropping context.

Rationale: receivers already do strict checks on method, body
size, JSON parse, etc. (codex_receiver.py:1083-1099). Adding
strict contextId nesting is consistent with the existing posture.
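
A minimal sketch of the check described in step 1, written as a pure function so it can sit in front of any of the four handlers. `misplaced_context_error` is a hypothetical name. How the receiver serialises the response (the issue mentions `self._json`) is left to the caller, since that helper's signature is not shown here.

```python
from typing import Optional

INVALID_PARAMS = -32602  # JSON-RPC 2.0 "Invalid params"


def misplaced_context_error(request: dict) -> Optional[dict]:
    """Return a JSON-RPC error response if contextId sits at params root, else None."""
    params = request.get("params")
    if isinstance(params, dict) and "contextId" in params:
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {
                "code": INVALID_PARAMS,
                "message": "contextId must be nested under params.message, "
                           "not at params root (A2A spec)",
            },
        }
    return None
```

The handler would call this before the `anon-` fallback runs and return early when the result is not None, so no session is minted for a rejected request.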

2. Single envelope-builder helper (medium cost, durable)

Extract a tiny stdlib-only module plugins/a2a_fleet/_envelope.py
that exposes:

def build_send_message(text: str, context_id: Optional[str] = None) -> dict:
    """Build the canonical A2A SendMessage JSON-RPC payload.
    
    contextId is always placed inside params.message per A2A spec.
    Raises ValueError if context_id contains characters illegal in
    A2A contextIds (RFC-4122 UUID-ish, plus the 'anon-' and 'session-'
    reserved prefixes).
    """

Have all four receivers and client.py:_send_message_payload call
this helper. Add a unit test that fuzzes the helper — feeds it
misdirected contextIds, illegal characters, oversized IDs, and
asserts the produced envelope always has params.message.contextId
and never params.contextId.

Rationale: the rule "contextId lives inside message" should appear
in exactly one place. The fact that we have at least 5 sites
(receivers × 4 + client) that all have to know this is exactly why
the bug exists — it relies on convention, not enforcement.
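
One possible body for the helper, as a sketch only. The character rule and length cap in `_CONTEXT_ID_RE` are placeholder assumptions, because the issue does not pin down which characters are legal. The `jsonrpc` and `id` fields follow the shape shown in the workaround payload in this issue.

```python
import re
import uuid
from typing import Any, Dict, Optional

# ASSUMPTION: placeholder rule. The legal character set and maximum length
# are not defined in this issue and would need to be agreed on.
_CONTEXT_ID_RE = re.compile(r"[A-Za-z0-9._:-]{1,128}")


def build_send_message(text: str, context_id: Optional[str] = None) -> Dict[str, Any]:
    """Build a SendMessage payload with contextId only under params.message."""
    if context_id is not None and not _CONTEXT_ID_RE.fullmatch(context_id):
        raise ValueError(f"illegal contextId: {context_id!r}")

    message: Dict[str, Any] = {"role": "user", "parts": [{"text": text}]}
    if context_id is not None:
        message["contextId"] = context_id

    # params holds exactly one key, so a root-level contextId cannot appear.
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "SendMessage",
        "params": {"message": message},
    }
```

Because `params` is built in one place with a single key, the fuzz test described above reduces to checking that no input can make `"contextId" in envelope["params"]` true.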

3. Test-suite parity (low cost, durable)

plugins/a2a_fleet/tests/ should contain a test that:

  • Spins up a codex receiver (or mocks it).
  • Sends a malformed params.contextId payload.
  • Asserts HTTP 200 with JSON-RPC error -32602.
  • Asserts no anon-... mint occurred (the receiver's session
    map is unchanged).

This locks in fix #1 against regression. Suggested test name:
tests/test_envelope_validation.py::test_rejects_contextid_at_params_root.
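
The test could look roughly like the sketch below. It runs against `handle_send_message`, a hypothetical in-process stand-in for a receiver with fix #1 applied, so it does not exercise the HTTP status. The `result` shape and the `{"turns": n}` session records are assumptions made for illustration.

```python
import uuid


def handle_send_message(request: dict, sessions: dict) -> dict:
    """Hypothetical stand-in for a receiver's SendMessage handler with fix #1."""
    params = request.get("params") or {}
    if "contextId" in params:  # fail loudly before any session is minted
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {
                "code": -32602,
                "message": "contextId must be nested under params.message, "
                           "not at params root (A2A spec)",
            },
        }
    message = params.get("message") or {}
    context_id = message.get("contextId") or f"anon-{uuid.uuid4()}"
    sessions.setdefault(context_id, {"turns": 0})["turns"] += 1
    return {"jsonrpc": "2.0", "id": request.get("id"),
            "result": {"contextId": context_id}}


def test_rejects_contextid_at_params_root():
    sessions = {"session-test-001": {"turns": 1}}
    before = {key: dict(value) for key, value in sessions.items()}
    request = {
        "jsonrpc": "2.0", "id": "r2", "method": "SendMessage",
        "params": {
            "contextId": "session-test-001",  # wrong nesting on purpose
            "message": {"role": "user", "parts": [{"text": "hi"}]},
        },
    }

    response = handle_send_message(request, sessions)

    assert response["error"]["code"] == -32602
    assert sessions == before  # no anon-... entry, existing entry untouched
```

A version that runs against a real receiver would additionally assert the HTTP 200 status listed above.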

Relationship to #71 / #146

Per the RFC in #71, the long-term answer is for Hermes to own the
session model. The receiver-side a2a-<executor>-sessions.json
becomes a derived cache, not the source of truth. Until that ships,
the bug surface in this issue is real and recurring — every new
sender (curl, scripts, future integrations) re-discovers it.

#146's "Phase 2 hardening" roadmap should explicitly call out
"unify envelope shape across all senders + receivers" as a
prerequisite for any executor-side work. This issue is the
specific instantiation.

Acceptance criteria

Workaround (until fixed)

If you must use a non-fleet_send sender today (curl, script,
test), nest contextId correctly:

import uuid

# text and context_id are supplied by the caller.
payload = {
    "jsonrpc": "2.0",
    "id": str(uuid.uuid4()),
    "method": "SendMessage",
    "params": {
        "message": {
            "role": "user",
            "parts": [{"text": text}],
            "contextId": context_id,   # <-- HERE, NOT at params root
        }
    },
}
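
Hand-rolled senders can also guard themselves before posting. The sketch below is a sender-side pre-flight check, and `check_context_nesting` is a hypothetical name rather than anything in plugins/a2a_fleet.

```python
def check_context_nesting(payload: dict) -> dict:
    """Raise ValueError if a hand-built envelope carries contextId at params root."""
    params = payload.get("params") or {}
    if "contextId" in params:
        raise ValueError(
            "contextId found at params root; "
            "move it to params.message.contextId"
        )
    return payload  # returned unchanged so the call can wrap the POST body
```

Wrapping the request body in `check_context_nesting(...)` turns the silent context loss into an exception on the sender's side, even against a receiver that has not been hardened yet.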

(filed by Hermes Switch Agent - hermes-switch)
