Summary
The current A2A fleet protocol (v0.5.x) operates as a fire-and-forget message bus: Hermes dispatches tasks to Claude Code workers with no upfront capability exchange, no structured output contract, and no budget signalling. This RFC proposes a three-message bidirectional handshake that closes those gaps.
Status: Draft — awaiting implementer review.
Motivation
Problems with the current protocol
- MCP health blindness. Hermes dispatches tasks requiring specific MCP servers (e.g. context-mode) without knowing whether those servers are alive on the executor. Claude Code discovers the dependency failure mid-task, after tokens have already been spent.
- Model-tier blindness. Hermes has no visibility into whether the executor is running Haiku, Sonnet, or Opus, so complex work may be sent to a tier that degrades quality, or trivial work to a tier that wastes budget.
- No output contract. Claude Code replies in free-form prose, and Hermes must parse unstructured text to extract PR numbers, file paths, or status. A declared reply schema would remove this parsing fragility.
- No token budget signalling. Without a budget ceiling, executors cannot size subagents or choose depth-vs-speed strategies. Tokens are spent before Hermes knows if the work was worth it.
- No priority signalling. Tasks marked "urgent" by the user are treated identically to background research, so executors cannot adapt their verification depth.
Proposal: Three-Message Protocol
Message 1 — SESSION ANNOUNCE (Worker → Orchestrator)
Emitted by the Claude Code receiver shim at startup and on first task receipt. Allows Hermes to build a live peer capability profile before dispatching work.
    {
      "type": "SESSION_ANNOUNCE",
      "protocol_version": "2.0",
      "worker": {
        "agent_id": "claude-code",
        "model": "claude-sonnet-4-20250514",
        "model_tier": "sonnet",
        "context_window": 200000
      },
      "capabilities": {
        "tools": ["browser", "terminal", "file", "patch"],
        "mcp_servers": ["context_mode", "claude_mem", "trek"]
      },
      "mcp_health": {
        "context_mode": "ok",
        "claude_mem": "ok",
        "trek": "timeout"
      },
      "session": {
        "context_id": "handshake:hermes-switch",
        "active_tasks": 2,
        "repo_state": {
          "branch": "main",
          "clean": true
        },
        "session_fresh": true
      }
    }
Key fields for routing decisions:
- model_tier: Hermes routes work to Haiku, Sonnet, or Opus as appropriate.
- mcp_health: Hermes skips tasks that require a dead server; if required_mcp is unmet, the reply is blocked immediately.
- active_tasks: concurrency awareness, to avoid overloading a worker.
- repo_state.clean: write-task dispatches to a dirty working tree are rejected.
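
The routing checks above could be sketched on the Hermes side roughly as follows. This is a minimal illustration rather than the plugin's actual code: the function name `dispatch_blockers`, the `writes_to_repo` flag, and the `max_active_tasks` threshold are all hypothetical, and only the announce fields come from the message shown above.

```python
from typing import Any

def dispatch_blockers(announce: dict[str, Any], required_mcp: list[str],
                      writes_to_repo: bool, max_active_tasks: int = 4) -> list[str]:
    """Return reasons not to dispatch to this worker; an empty list means go ahead.

    max_active_tasks is an assumed threshold, not part of the RFC.
    """
    blockers: list[str] = []
    health = announce.get("mcp_health", {})
    for server in required_mcp:
        status = health.get(server, "missing")
        if status != "ok":
            blockers.append(f"mcp:{server}={status}")
    session = announce.get("session", {})
    if session.get("active_tasks", 0) >= max_active_tasks:
        blockers.append("worker_busy")
    if writes_to_repo and not session.get("repo_state", {}).get("clean", False):
        blockers.append("dirty_working_tree")
    return blockers

announce = {
    "mcp_health": {"context_mode": "ok", "claude_mem": "ok", "trek": "timeout"},
    "session": {"active_tasks": 2, "repo_state": {"branch": "main", "clean": True}},
}
print(dispatch_blockers(announce, ["context_mode"], writes_to_repo=True))  # []
print(dispatch_blockers(announce, ["trek"], writes_to_repo=True))  # ['mcp:trek=timeout']
```

Returning a list of reasons rather than a boolean would let Hermes log why a peer was skipped, which may help when deciding between retrying and rerouting.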
Message 2 — TASK DISPATCH (Orchestrator → Worker)
Sent by Hermes at the start of every task. Establishes the execution contract.
    {
      "type": "TASK_DISPATCH",
      "protocol_version": "2.0",
      "task": {
        "task_id": "abc-123",
        "priority": "high",
        "reply_schema": "structured",
        "token_budget": 4000
      },
      "requirements": {
        "required_mcp": ["context_mode"],
        "required_capabilities": ["terminal", "file"]
      },
      "deadline": "2026-05-31T21:00:00Z",
      "brief": "Refactor the auth module. Focus on token refresh logic..."
    }
Key fields for execution strategy:
- reply_schema: one of prose, structured, or json; eliminates format guessing.
- token_budget: the executor sizes subagents and chooses depth-vs-speed accordingly.
- required_mcp: the executor validates capability before accepting and returns blocked immediately if unmet.
- priority: affects whether the executor deep-verifies or fires-and-summarizes.
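
As an illustration of how a worker might turn this message into a typed contract, here is a small sketch. The `TaskDispatch` dataclass and `parse_dispatch` function are hypothetical names; the only rules enforced are the ones stated above (the three reply_schema values) plus an assumed requirement that token_budget be a positive integer.

```python
from dataclasses import dataclass
from typing import Any

REPLY_SCHEMAS = {"prose", "structured", "json"}  # values listed in this RFC

@dataclass(frozen=True)
class TaskDispatch:
    task_id: str
    priority: str
    reply_schema: str
    token_budget: int
    required_mcp: tuple[str, ...]
    required_capabilities: tuple[str, ...]
    brief: str

def parse_dispatch(msg: dict[str, Any]) -> TaskDispatch:
    """Validate a TASK_DISPATCH dict and return it as a typed object."""
    if msg.get("type") != "TASK_DISPATCH":
        raise ValueError(f"unexpected message type: {msg.get('type')!r}")
    task = msg["task"]
    if task["reply_schema"] not in REPLY_SCHEMAS:
        raise ValueError(f"unknown reply_schema: {task['reply_schema']!r}")
    budget = task["token_budget"]
    # Assumption: a budget must be a positive integer.
    if not isinstance(budget, int) or budget <= 0:
        raise ValueError(f"token_budget must be a positive integer, got {budget!r}")
    req = msg.get("requirements", {})
    return TaskDispatch(
        task_id=task["task_id"],
        priority=task["priority"],
        reply_schema=task["reply_schema"],
        token_budget=budget,
        required_mcp=tuple(req.get("required_mcp", [])),
        required_capabilities=tuple(req.get("required_capabilities", [])),
        brief=msg.get("brief", ""),
    )

msg = {
    "type": "TASK_DISPATCH",
    "protocol_version": "2.0",
    "task": {"task_id": "abc-123", "priority": "high",
             "reply_schema": "structured", "token_budget": 4000},
    "requirements": {"required_mcp": ["context_mode"],
                     "required_capabilities": ["terminal", "file"]},
    "brief": "Refactor the auth module. Focus on token refresh logic...",
}
print(parse_dispatch(msg).reply_schema)  # structured
```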
Message 3 — TASK RESULT (Worker → Orchestrator)
Returned by Claude Code on completion. Structured for reliable Hermes parsing.
    {
      "type": "TASK_RESULT",
      "protocol_version": "2.0",
      "task_id": "abc-123",
      "status": "done",
      "tokens_spent": 3240,
      "next_steps": [
        "Run the full test suite to verify refactor",
        "Open PR for review"
      ],
      "blocked_reason": null,
      "artifacts": {
        "pr": "https://github.com/Interstellar-code/hermes-agent/pull/72",
        "files_changed": [
          "src/auth/token_refresh.py",
          "tests/auth/test_token_refresh.py"
        ]
      }
    }
Status values:
- done: task completed successfully.
- blocked: task could not start; blocked_reason is required.
- error: task failed mid-execution; include an error summary.
- partial: task partially completed; include what remains.
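
On the Hermes side, the status rules above might be checked with something like the following. `parse_result` is a hypothetical helper; it enforces only the two constraints that this RFC states explicitly (the four status values, and blocked_reason being required when status is blocked).

```python
import json
from typing import Any

STATUSES = {"done", "blocked", "error", "partial"}

def parse_result(raw: str) -> dict[str, Any]:
    """Parse a TASK_RESULT JSON string and enforce the status rules."""
    result = json.loads(raw)
    if not isinstance(result, dict) or result.get("type") != "TASK_RESULT":
        raise ValueError("not a TASK_RESULT message")
    status = result.get("status")
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    if status == "blocked" and not result.get("blocked_reason"):
        raise ValueError("a blocked result must carry blocked_reason")
    return result

done = parse_result('{"type": "TASK_RESULT", "task_id": "abc-123", '
                    '"status": "done", "tokens_spent": 3240, "blocked_reason": null}')
print(done["status"], done["tokens_spent"])  # done 3240
```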
Implementation Plan
Phase 1 — Minimal viable (no executor code change)
Hermes begins sending reply_schema, token_budget, priority, and required_mcp as structured YAML hints embedded in the task message body. Claude Code reads them as part of the brief. Because the executor must pick them out of prose, this is fragile, but it requires zero code change on the worker side.
    TASK DISPATCH (inline)
    priority: high
    reply_schema: structured
    token_budget: 4000
    required_mcp: [context_mode]
    Brief: Refactor the auth module...
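
Phase 1 itself needs no parser, since Claude Code reads the hints as text. If a later phase wanted to lift the same inline block into structured fields, a standard-library sketch might look like this. `parse_inline_hints` is a hypothetical helper that handles only the four keys and the simple `[a, b]` list form shown above; it is not a general YAML parser.

```python
from typing import Any

def parse_inline_hints(body: str) -> tuple[dict[str, Any], str]:
    """Split an inline TASK DISPATCH body into (hints, brief)."""
    hints: dict[str, Any] = {}
    brief_lines: list[str] = []
    in_brief = False
    for line in body.splitlines():
        if in_brief:
            # Everything after the "Brief:" line belongs to the brief.
            brief_lines.append(line)
            continue
        key, sep, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if not sep:
            continue  # header line or blank line
        if key == "brief":
            in_brief = True
            brief_lines.append(value)
        elif key == "token_budget":
            hints[key] = int(value)
        elif key == "required_mcp":
            hints[key] = [s.strip() for s in value.strip("[]").split(",") if s.strip()]
        elif key in ("priority", "reply_schema"):
            hints[key] = value
    return hints, "\n".join(brief_lines).strip()

body = """TASK DISPATCH (inline)
priority: high
reply_schema: structured
token_budget: 4000
required_mcp: [context_mode]
Brief: Refactor the auth module..."""
hints, brief = parse_inline_hints(body)
print(hints["token_budget"], hints["required_mcp"])  # 4000 ['context_mode']
```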
Phase 2 — Structured SESSION ANNOUNCE (receiver shim change)
The a2a_receiver.py shim emits a structured JSON SESSION ANNOUNCE block in its first reply, and Hermes parses it into a peer profile cache. This requires changes only to the receiver shim (plugins/a2a_fleet/a2a_receiver.py).
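
One way Hermes could pull the announce block out of a first reply and cache it is sketched below. Everything here is illustrative: `extract_announce` and `PeerProfileCache` are hypothetical names, keying the cache by agent_id is an assumption, and the sketch assumes the JSON object appears verbatim somewhere in the reply text.

```python
import json
from typing import Any, Optional

def extract_announce(reply_text: str) -> Optional[dict[str, Any]]:
    """Return the first SESSION_ANNOUNCE object embedded in reply_text, if any."""
    decoder = json.JSONDecoder()
    start = reply_text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value and ignores trailing text.
            obj, _ = decoder.raw_decode(reply_text[start:])
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and obj.get("type") == "SESSION_ANNOUNCE":
            return obj
        start = reply_text.find("{", start + 1)
    return None

class PeerProfileCache:
    """In-memory peer profiles keyed by agent_id (the key choice is an assumption)."""

    def __init__(self) -> None:
        self._profiles: dict[str, dict[str, Any]] = {}

    def update_from_reply(self, reply_text: str) -> bool:
        announce = extract_announce(reply_text)
        if announce is None:
            return False
        self._profiles[announce["worker"]["agent_id"]] = announce
        return True

    def get(self, agent_id: str) -> Optional[dict[str, Any]]:
        return self._profiles.get(agent_id)

reply = ('Ready.\n{"type": "SESSION_ANNOUNCE", "protocol_version": "2.0", '
         '"worker": {"agent_id": "claude-code", "model_tier": "sonnet"}, '
         '"mcp_health": {"trek": "timeout"}}\nAwaiting tasks.')
cache = PeerProfileCache()
cache.update_from_reply(reply)
print(cache.get("claude-code")["worker"]["model_tier"])  # sonnet
```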
Phase 3 — Structured TASK RESULT (receiver shim change)
Claude Code is instructed (via the task brief) to emit structured TASK RESULT JSON at completion. The receiver shim wraps it in the A2A reply. Hermes parses status, tokens_spent, artifacts, and next_steps directly from JSON.
Phase 4 — Executor-side validation
The receiver shim validates required_mcp and required_capabilities at task receipt. If either is unmet, it replies immediately with a structured TASK RESULT of status: blocked, before any tokens are spent.
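
A minimal sketch of that check, assuming the shim already holds the worker's tool list and MCP health map from its own SESSION ANNOUNCE. `validate_requirements` is a hypothetical function, not existing a2a_receiver.py code, and reporting tokens_spent as 0 in the blocked reply is an assumption.

```python
from typing import Any, Optional

def validate_requirements(dispatch: dict[str, Any], capabilities: list[str],
                          mcp_health: dict[str, str]) -> Optional[dict[str, Any]]:
    """Return a blocked TASK_RESULT if requirements are unmet, else None."""
    req = dispatch.get("requirements", {})
    unhealthy = [s for s in req.get("required_mcp", []) if mcp_health.get(s) != "ok"]
    missing = [c for c in req.get("required_capabilities", []) if c not in capabilities]
    if not unhealthy and not missing:
        return None  # requirements met; the task may proceed
    reasons = []
    if unhealthy:
        reasons.append("MCP servers not healthy: " + ", ".join(unhealthy))
    if missing:
        reasons.append("missing capabilities: " + ", ".join(missing))
    return {
        "type": "TASK_RESULT",
        "protocol_version": "2.0",
        "task_id": dispatch["task"]["task_id"],
        "status": "blocked",
        "tokens_spent": 0,  # assumption: nothing has been spent at this point
        "next_steps": [],
        "blocked_reason": "; ".join(reasons),
        "artifacts": {},
    }

dispatch = {
    "type": "TASK_DISPATCH",
    "task": {"task_id": "abc-123"},
    "requirements": {"required_mcp": ["context_mode"],
                     "required_capabilities": ["terminal", "file"]},
}
tools = ["browser", "terminal", "file", "patch"]
blocked = validate_requirements(dispatch, tools, {"context_mode": "timeout"})
print(blocked["blocked_reason"])  # MCP servers not healthy: context_mode
```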
Priority
| Priority | Change | Impact |
| --- | --- | --- |
| 🔴 High | required_mcp in dispatch + mcp_health in announce | Prevents dead-on-arrival task dispatches; biggest token savings |
| 🔴 High | reply_schema in dispatch | Eliminates format guessing; cleaner Hermes parsing |
| 🟡 Medium | model_tier + token_budget | Token budget management; executor can size subagents |
| 🟡 Medium | priority in dispatch | Executor chooses depth vs speed per task |
| 🟢 Low | repo_state, session_fresh | Nice-to-have; rarely blocking |
Open Questions
- Should SESSION ANNOUNCE be emitted on every receiver restart, or only on the first task of a fresh context_id?
- How should Hermes handle a blocked reply: queue the task for retry, alert the user, or route to a different executor?
- Should tokens_spent be self-reported by Claude Code (estimated) or computed by Hermes from the input/output token delta?
- Is there appetite to version this protocol (protocol_version field) so mixed-version fleets can negotiate?
References
- A2A Fleet plugin: plugins/a2a_fleet/
- Fleet send tool: plugins/a2a_fleet/fleet_tools.py
- Receiver shim: plugins/a2a_fleet/a2a_receiver.py
- Existing fleet.yaml managed-peer config at ~/.hermes/profiles/hermes-switch/fleet.yaml
Filed by: Hermes Agent (hermes-switch profile) via gh CLI
Session: protocol-design:hermes-switch, 2026-05-31