When an inbound A2A JSON-RPC request to a fleet receiver (verified on codex_receiver.py, plausible on cc_receiver.py / oc_receiver.py / agy_receiver.py — same envelope shape) places contextId at the wrong nesting level — i.e. params.contextId instead of the
A2A-spec-correct params.message.contextId — the receiver silently mints a fresh anon-<uuid4> contextId for that turn.
The HTTP response echoes back the original caller-supplied contextId, so the caller believes the session was threaded, while
the underlying executor (Codex / Claude Code / OpenCode / Antigravity)
actually starts a brand-new conversation.
In effect: the session-continuity wire protocol is asymmetric and the
failure mode is silent. A2A's contextId is a coroutine identity, not a retry key; treating it as a retry key will produce this class of bug.
Family
This is the same class as #108 (agy extractor drift → [no reply produced by agy]), #82 (stale session_id not cleared → dead-session
loop), and #97 (codex CLI arg drift → rc=1). All three are receivers
that appear to thread context but don't — the gap between
metadata echo and actual executor state. Per #71 / #146, this whole
bug class disappears once Hermes owns the session model rather than
delegating it to receivers; that work is tracked separately. Until
then, this issue proposes the smallest durable hardening for the
senders that DO exist.
message = params.get("message") or {}
context_id = message.get("contextId") or f"anon-{uuid.uuid4()}"
This is correct per the A2A spec — contextId belongs inside the message object, not at the params root. The same convention is
used at line 52 for outbound replies to Hermes:
So the official sender is correct. The bug lives in external
senders — anyone using curl, scripts, tests, or a wrapper that
mints the JSON-RPC envelope by hand.
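To make the silent mint concrete, here is a minimal sketch that wraps the two receiver lines quoted above in a standalone function. The function name `extract_context_id` and the two sample `params` dicts are illustrative assumptions, not code from the repository.

```python
import uuid


def extract_context_id(params: dict) -> str:
    """Mirror the quoted receiver lines: only params.message.contextId is read."""
    message = params.get("message") or {}
    return message.get("contextId") or f"anon-{uuid.uuid4()}"


# Spec-correct nesting: contextId lives inside the message object.
correct = {
    "message": {
        "role": "user",
        "contextId": "session-test-001",
        "parts": [{"text": "PING"}],
    }
}

# Hand-rolled nesting: contextId sits at the params root and is never looked at.
misplaced = {
    "contextId": "session-test-001",
    "message": {"role": "user", "parts": [{"text": "PING"}]},
}

print(extract_context_id(correct))    # the caller's id is used
print(extract_context_id(misplaced))  # a fresh anon-<uuid4> is minted instead
```

Nothing in this path raises or logs, which is why a hand-built envelope can lose its session without any visible signal.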
Real-world trigger
In our 0.17 rebase planning (procedure doc at /Users/rohits/hermes/research/Plans/hermes-0.17-rebase-procedure.md),
I dispatched a review task to a Codex receiver via direct JSON-RPC
POST (not via fleet_send) using the natural reading of the A2A
spec text — params.contextId at the root. The receiver:
ACKed with HTTP 200 and echoed my contextId back as the contextId field in the response envelope.
Internally minted anon-<uuid> and ran a brand-new codex exec
(no codex exec resume), so the executor started from zero.
Returned a structured reply that looked like a follow-up to my
prior turn but actually was generated without the prior context.
I reported the dispatch as "session continuity verified" on the basis
of the echoed contextId. Rohit caught it: the second Codex turn
behaved as a new session, not a continuation. The receiver metadata
echo was theater.
Environment
codex_receiver v0.8.14 (current main)
codex-cli 0.142.0
repo_path: /Users/rohits/.hermes/hermes-agent
Sender: raw curl / python3 -c "import requests..." / hand-rolled
Python wrappers that bypass plugins/a2a_fleet/client.py
✓ Codex returns PONG with first-turn thread.started.
Follow-up with wrong nesting (params.contextId at root):
{"jsonrpc":"2.0","id":"r2","method":"SendMessage",
"params":{"contextId":"session-test-001", # <-- WRONG NESTING
"message":{"role":"user",
"parts":[{"text":"What did I just send you?"}]}}}
✗ Receiver returns 200, echoes contextId: session-test-001,
but internally mints anon-<uuid> and runs a fresh codex exec
(no resume). The reply is generated without prior context.
Check a2a-codex-sessions.json — the new anon-... entry is
created; the session-test-001 thread from step 2 is not
resumed.
Why this is dangerous
Silent failure. Caller sees HTTP 200, echoed contextId, and
a structured reply. There is no error, no warning, no log line
that says "ignored params.contextId, minted anon". The caller
has no way to detect the loss of context without comparing
executor-side behavior (e.g. Codex starts a new codex exec
instead of codex exec resume).
Asymmetric. codex_receiver.py:52 puts contextId correctly
in outbound, but inbound accepts (and silently drops) misplaced
ones. There's no single source of truth for the A2A envelope
shape that all senders must follow.
In all four receivers
(templates/{codex,cc,oc,agy}_receiver.py), in the do_POST → /jsonrpc handler, reject any payload where contextId appears at params.contextId (the wrong nesting) with
a typed JSON-RPC error -32602 (invalid params) and the message:
"contextId must be nested under params.message, not at params root (A2A spec)"
Make this a hard fail, not a silent mint. The cost is one
~5-line if "contextId" in params: self._json(...invalid params...)
per receiver. The win is that any future buggy sender fails loud
instead of silently dropping context.
Rationale: receivers already do strict checks on method, body
size, JSON parse, etc. (codex_receiver.py:1083-1099). Adding
strict contextId nesting is consistent with the existing posture.
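A minimal sketch of the check proposed in step 1, written as a pure function so it can be dropped into any of the four handlers. The function name `check_context_id_nesting` is hypothetical, and how the result is written back (for example through the receiver's own `self._json` helper) is left to the handler, since its exact signature is not shown in this issue.

```python
from typing import Optional

INVALID_PARAMS = -32602  # JSON-RPC 2.0 "Invalid params"

ERROR_MESSAGE = (
    "contextId must be nested under params.message, "
    "not at params root (A2A spec)"
)


def check_context_id_nesting(request: dict) -> Optional[dict]:
    """Return a JSON-RPC error response if contextId sits at params root, else None."""
    params = request.get("params")
    if isinstance(params, dict) and "contextId" in params:
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": INVALID_PARAMS, "message": ERROR_MESSAGE},
        }
    return None


bad = {
    "jsonrpc": "2.0",
    "id": "r2",
    "method": "SendMessage",
    "params": {
        "contextId": "session-test-001",
        "message": {"role": "user", "parts": [{"text": "hi"}]},
    },
}
good = {
    "jsonrpc": "2.0",
    "id": "r1",
    "method": "SendMessage",
    "params": {
        "message": {
            "role": "user",
            "contextId": "session-test-001",
            "parts": [{"text": "hi"}],
        }
    },
}

print(check_context_id_nesting(bad))
print(check_context_id_nesting(good))
```

Running the check before any session lookup means a misplaced contextId is rejected before an anon-... entry can be created.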
2. Single envelope-builder helper (medium cost, durable)
Extract a tiny stdlib-only module plugins/a2a_fleet/_envelope.py
that exposes:
def build_send_message(text: str, context_id: Optional[str] = None) -> dict:
    """Build the canonical A2A SendMessage JSON-RPC payload.

    contextId is always placed inside params.message per A2A spec.
    Raises ValueError if context_id contains characters illegal in A2A
    contextIds (RFC-4122 UUID-ish, plus the 'anon-' and 'session-'
    reserved prefixes).
    """
Have all four receivers and client.py:_send_message_payload call
this helper. Add a unit test that fuzzes the helper — feeds it
misdirected contextIds, illegal characters, oversized IDs, and
asserts the produced envelope always has params.message.contextId
and never params.contextId.
Rationale: the rule "contextId lives inside message" should appear
in exactly one place. The fact that we have at least 5 sites
(receivers × 4 + client) that all have to know this is exactly why
the bug exists — it relies on convention, not enforcement.
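One possible shape for the helper and its fuzz-style check is sketched below. The allowed-character pattern and the 128-character cap are assumptions made for illustration; the issue text does not pin down the exact validation rule, and the request `id` scheme is likewise a guess.

```python
import re
import uuid
from typing import Optional

# Assumption: this character set and length cap are illustrative, not taken from the A2A spec.
_CONTEXT_ID_RE = re.compile(r"[A-Za-z0-9._:-]{1,128}")


def build_send_message(text: str, context_id: Optional[str] = None) -> dict:
    """Build a SendMessage payload with contextId nested under params.message."""
    message = {"role": "user", "parts": [{"text": text}]}
    if context_id is not None:
        if not _CONTEXT_ID_RE.fullmatch(context_id):
            raise ValueError(f"illegal contextId: {context_id!r}")
        message["contextId"] = context_id
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "SendMessage",
        "params": {"message": message},
    }


# Fuzz-style check: every accepted id lands inside params.message, never at params root.
candidates = ["session-test-001", "anon-1234", "bad id", "x" * 500, "", "a/b"]
accepted = []
for candidate in candidates:
    try:
        payload = build_send_message("PING", candidate)
    except ValueError:
        continue  # rejected ids never produce an envelope at all
    assert payload["params"]["message"]["contextId"] == candidate
    assert "contextId" not in payload["params"]
    accepted.append(candidate)

print(accepted)
```

Because the nesting rule lives in one function, a sender cannot produce the root-level shape by accident as long as it goes through the helper.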
3. Test-suite parity (low cost, durable)
plugins/a2a_fleet/tests/ should contain a test that:
Spins up a codex receiver (or mocks it).
Sends a malformed params.contextId payload.
Asserts HTTP 200 with JSON-RPC error -32602.
Asserts no anon-... mint occurred (the receiver's session
map is unchanged).
This locks in fix #1 against regression. Suggested test name: tests/test_envelope_validation.py::test_rejects_contextid_at_params_root.
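The regression test described above could look roughly like the following. `FakeReceiver` is a hypothetical in-memory stand-in invented for this sketch; a real test would drive the actual receiver (or a mock of it) over HTTP and inspect its session map.

```python
import uuid


class FakeReceiver:
    """Hypothetical stand-in for a receiver; not the real codex_receiver."""

    def __init__(self) -> None:
        self.sessions: dict = {}

    def handle(self, request: dict) -> dict:
        params = request.get("params") or {}
        if "contextId" in params:
            # Hard fail before any session is created.
            return {
                "jsonrpc": "2.0",
                "id": request.get("id"),
                "error": {
                    "code": -32602,
                    "message": "contextId must be nested under params.message, "
                               "not at params root (A2A spec)",
                },
            }
        message = params.get("message") or {}
        context_id = message.get("contextId") or f"anon-{uuid.uuid4()}"
        self.sessions.setdefault(context_id, []).append(message)
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "result": {"contextId": context_id},
        }


def test_rejects_contextid_at_params_root() -> None:
    receiver = FakeReceiver()
    before = dict(receiver.sessions)
    response = receiver.handle({
        "jsonrpc": "2.0",
        "id": "r2",
        "method": "SendMessage",
        "params": {
            "contextId": "session-test-001",
            "message": {"role": "user", "parts": [{"text": "hi"}]},
        },
    })
    assert response["error"]["code"] == -32602
    assert receiver.sessions == before  # no anon-... entry was minted
    assert not any(key.startswith("anon-") for key in receiver.sessions)


test_rejects_contextid_at_params_root()
```

The second and third assertions are the important ones: they check executor-side state rather than the response metadata, which is exactly what the echoed contextId failed to reflect.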
Per the RFC in #71, the long-term answer is for Hermes to own the
session model. The receiver-side a2a-<executor>-sessions.json
becomes a derived cache, not the source of truth. Until that ships,
the bug surface in this issue is real and recurring — every new
sender (curl, scripts, future integrations) re-discovers it.
#146's "Phase 2 hardening" roadmap should explicitly call out
"unify envelope shape across all senders + receivers" as a
prerequisite for any executor-side work. This issue is the
specific instantiation.
Acceptance criteria
All 4 receivers reject params.contextId (root nesting) with
JSON-RPC error -32602 and a clear message.
_envelope.py (or equivalent) exists and is the single source
of truth for the SendMessage payload shape.
Fuzz test in tests/test_envelope.py passes against all 4
executor flavors.
Documentation note added to plugins/a2a_fleet/SKILL.md (or
equivalent) explicitly stating: "contextId is a coroutine
identity, NOT a retry key. It lives inside params.message,
not at params root. Misplaced contextIds will be rejected
with -32602."
Summary
When an inbound A2A JSON-RPC request to a fleet receiver (verified on
codex_receiver.py, plausible on cc_receiver.py / oc_receiver.py / agy_receiver.py, which share the same envelope shape) places contextId at the wrong nesting level, i.e.
params.contextId instead of the A2A-spec-correct
params.message.contextId, the receiver silently mints a fresh
anon-<uuid4> contextId for that turn.
The HTTP response echoes back the original caller-supplied
contextId, so the caller believes the session was threaded, while
the underlying executor (Codex / Claude Code / OpenCode / Antigravity)
actually starts a brand-new conversation.
In effect: the session-continuity wire protocol is asymmetric and the
failure mode is silent. A2A's
contextId is a coroutine identity, not a retry key; treating it as a retry key will produce this class of bug.
Family
This is the same class as #108 (agy extractor drift →
[no reply produced by agy]), #82 (stale session_id not cleared → dead-session
loop), and #97 (codex CLI arg drift → rc=1). All three are receivers
that appear to thread context but don't — the gap between
metadata echo and actual executor state. Per #71 / #146, this whole
bug class disappears once Hermes owns the session model rather than
delegating it to receivers; that work is tracked separately. Until
then, this issue proposes the smallest durable hardening for the
senders that DO exist.
Where the receiver is correct
plugins/a2a_fleet/templates/codex_receiver.py:1103-1104:
This is correct per the A2A spec: contextId belongs inside the message object, not at the params root. The same convention is
used at line 52 for outbound replies to Hermes:
And in plugins/a2a_fleet/client.py:31-43 (_send_message_payload):
So the official sender is correct. The bug lives in external
senders: anyone using curl, scripts, tests, or a wrapper that
mints the JSON-RPC envelope by hand.
Real-world trigger
In our 0.17 rebase planning (procedure doc at
/Users/rohits/hermes/research/Plans/hermes-0.17-rebase-procedure.md),
I dispatched a review task to a Codex receiver via direct JSON-RPC
POST (not via fleet_send) using the natural reading of the A2A
spec text: params.contextId at the root. The receiver:
ACKed with HTTP 200 and echoed my contextId back as the contextId field in the response envelope.
Internally minted anon-<uuid> and ran a brand-new codex exec
(no codex exec resume), so the executor started from zero.
Returned a structured reply that looked like a follow-up to my
prior turn but actually was generated without the prior context.
I reported the dispatch as "session continuity verified" on the basis
of the echoed contextId. Rohit caught it: the second Codex turn
behaved as a new session, not a continuation. The receiver metadata
echo was theater.
Environment
codex_receiver v0.8.14 (current main)
codex-cli 0.142.0
repo_path: /Users/rohits/.hermes/hermes-agent
Sender: raw curl / python3 -c "import requests..." / hand-rolled
Python wrappers that bypass plugins/a2a_fleet/client.py
Repro (verified)
1. Deploy a fresh codex receiver:
2. PONG with correct nesting (params.message.contextId); confirm the thread.started event in a2a-codex-transcript.jsonl:
✓ Codex returns PONG with first-turn thread.started.
3. Follow-up with wrong nesting (params.contextId at root):
✗ Receiver returns 200, echoes contextId: session-test-001,
but internally mints anon-<uuid> and runs a fresh codex exec
(no resume). The reply is generated without prior context.
4. Check a2a-codex-sessions.json: the new anon-... entry is
created; the session-test-001 thread from step 2 is not
resumed.
Why this is dangerous
Silent failure. Caller sees HTTP 200, echoed contextId, and
a structured reply. There is no error, no warning, no log line
that says "ignored params.contextId, minted anon". The caller
has no way to detect the loss of context without comparing
executor-side behavior (e.g. Codex starts a new codex exec
instead of codex exec resume).
Asymmetric. codex_receiver.py:52 puts contextId correctly
in outbound, but inbound accepts (and silently drops) misplaced
ones. There's no single source of truth for the A2A envelope
shape that all senders must follow.
stale-but-misleading context. In this bug, the receiver displays
nothing wrong; the failure is on the executor side, not the
receiver. A user running multi-turn reviews across Codex + CC
has no way to tell which.
Proposed fix
Three layered hardening steps, in order of cost:
1. Strict inbound envelope validation (low cost, immediate)
In all four receivers
(templates/{codex,cc,oc,agy}_receiver.py), in the do_POST → /jsonrpc handler, reject any payload where contextId appears at params.contextId (the wrong nesting) with
a typed JSON-RPC error -32602 (invalid params) and the message:
"contextId must be nested under params.message, not at params root (A2A spec)"
Make this a hard fail, not a silent mint. The cost is one
~5-line if "contextId" in params: self._json(...invalid params...)
per receiver. The win is that any future buggy sender fails loud
instead of silently dropping context.
Rationale: receivers already do strict checks on method, body
size, JSON parse, etc. (codex_receiver.py:1083-1099). Adding
strict contextId nesting is consistent with the existing posture.
2. Single envelope-builder helper (medium cost, durable)
Extract a tiny stdlib-only module plugins/a2a_fleet/_envelope.py
that exposes:
Have all four receivers and client.py:_send_message_payload call
this helper. Add a unit test that fuzzes the helper: feed it
misdirected contextIds, illegal characters, oversized IDs, and
assert the produced envelope always has params.message.contextId
and never params.contextId.
Rationale: the rule "contextId lives inside message" should appear
in exactly one place. The fact that we have at least 5 sites
(receivers × 4 + client) that all have to know this is exactly why
the bug exists — it relies on convention, not enforcement.
3. Test-suite parity (low cost, durable)
plugins/a2a_fleet/tests/ should contain a test that:
Spins up a codex receiver (or mocks it).
Sends a malformed params.contextId payload.
Asserts HTTP 200 with JSON-RPC error -32602.
Asserts no anon-... mint occurred (the receiver's session
map is unchanged).
This locks in fix #1 against regression. Suggested test name:
tests/test_envelope_validation.py::test_rejects_contextid_at_params_root.
Relationship to #71 / #146
Per the RFC in #71, the long-term answer is for Hermes to own the
session model. The receiver-side a2a-<executor>-sessions.json
becomes a derived cache, not the source of truth. Until that ships,
the bug surface in this issue is real and recurring — every new
sender (curl, scripts, future integrations) re-discovers it.
#146's "Phase 2 hardening" roadmap should explicitly call out
"unify envelope shape across all senders + receivers" as a
prerequisite for any executor-side work. This issue is the
specific instantiation.
Acceptance criteria
All 4 receivers reject params.contextId (root nesting) with
JSON-RPC error -32602 and a clear message.
_envelope.py (or equivalent) exists and is the single source
of truth for the SendMessage payload shape.
Fuzz test in tests/test_envelope.py passes against all 4
executor flavors.
Documentation note added to plugins/a2a_fleet/SKILL.md (or
equivalent) explicitly stating: "contextId is a coroutine
identity, NOT a retry key. It lives inside params.message,
not at params root. Misplaced contextIds will be rejected
with -32602."
Workaround (until fixed)
If you must use a non-fleet_send sender today (curl, script,
test), nest contextId correctly:
(filed by Hermes Switch Agent - hermes-switch)
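As a stopgap for the workaround above, a hand-rolled sender could normalize its envelope before POSTing. The helper name `nest_context_id` is hypothetical; the sketch assumes `params` and `params.message` are JSON objects and that a contextId already present inside the message should win over one at the root.

```python
import copy
import json


def nest_context_id(payload: dict) -> dict:
    """Return a copy with params.contextId moved to params.message.contextId."""
    fixed = copy.deepcopy(payload)
    params = fixed.setdefault("params", {})
    if "contextId" in params:
        root_id = params.pop("contextId")
        message = params.setdefault("message", {})
        # Assumption: an id already nested in the message takes precedence.
        message.setdefault("contextId", root_id)
    return fixed


wrong = {
    "jsonrpc": "2.0",
    "id": "r2",
    "method": "SendMessage",
    "params": {
        "contextId": "session-test-001",
        "message": {
            "role": "user",
            "parts": [{"text": "What did I just send you?"}],
        },
    },
}

print(json.dumps(nest_context_id(wrong), indent=2))
```

The input dict is left untouched, so the same helper can be used in tests that compare the before and after shapes.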