Enable Hermes↔Hermes A2A peering (response_handler: agent across profiles) + fix A2A bind-race #120

# Enable Hermes↔Hermes A2A peering (response_handler: agent across profiles) + fix the A2A server bind-race

> Architecture corrected after a code-level review (Codex). An earlier draft of this issue wrongly described A2A ingress as the dashboard/web tier reached over a `:8642` IPC channel — that is FALSE. `:8642` is the unrelated OpenAI-compatible API server (`gateway/platforms/api_server.py`, `/v1`). The real model is below. The stale `plugins/a2a_fleet/references/hermes-gateway-plugin-guide.md` (claims `/api/plugins/a2a_fleet/jsonrpc`, `/sse/{task_id}`, `/tasks` routes that do not exist) caused the error and must be fixed/removed.

## Goal
One Hermes *profile* (Switch) dispatches a task to **another profile's agent** (Neo/Morpheus/Trinity) over A2A and gets a reply — `fleet_send("neo", "...")` → Neo's agent loop answers. No new server software: reuse the `agent` protocol that already ships, exposed per profile.

## Actual architecture (verified in code — build on this)
- **A2A ingress = a standalone `uvicorn` listener** in `plugins/a2a_fleet/server.py`, bound to `fleet.server.bind_host:bind_port` (`server.py:314`). Serves `/jsonrpc` (SendMessage, `server.py:181`), `/.well-known/agent-card.json`, `/health`. It is NOT the dashboard server and NOT `/api/plugins/...` (those are read-only conversation/peer routes, `dashboard/plugin_api.py:14`).
- **Route B `agent` = in-process bridge.** Inbound `SendMessage` with `response_handler: agent` calls `get_agent_bridge()` (`server.py:206`) and runs `bridge.bridge_sync` in an executor (`server.py:224`); the bridge does `asyncio.run_coroutine_threadsafe(self._message_handler(event), self._gateway_loop)` (`adapter.py:190`) — i.e. the uvicorn daemon thread hands the message to the **gateway agent loop in the SAME process**. The bridge only exists once the platform adapter connects (`adapter.py:104`, wired at `gateway/run.py:4157`).
- **Implication:** the A2A listener MUST run in the process that hosts the gateway agent loop. If any other process wins the port, its listener has no bridge → inbound `agent` requests return "bridge not ready" (`server.py:210`).
- **Outbound `fleet_send`** (`fleet_tools.py:47` → `client.py:85`) is independent of all the above — it just POSTs `SendMessage` to a peer `url` with the peer bearer. A profile can send even if it runs no listener.
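
The same-process constraint follows from how the hand-off works: `asyncio.run_coroutine_threadsafe` needs a reference to a live event loop object, which only exists inside the process that owns it. Below is a minimal sketch of that mechanism, not the plugin's code. `message_handler`, `bridge_sync`, and `run_demo` are hypothetical stand-ins, and for brevity the loop runs in a helper thread while the caller plays the role of the listener thread.

```python
import asyncio
import threading


async def message_handler(event: str) -> str:
    # Hypothetical stand-in for the gateway's agent message handler.
    await asyncio.sleep(0)
    return f"agent reply to: {event}"


def bridge_sync(loop: asyncio.AbstractEventLoop, event: str, timeout: float = 5.0) -> str:
    # Called from a thread that does NOT run the loop (the listener side).
    # Schedules the coroutine on the loop and blocks until it finishes.
    future = asyncio.run_coroutine_threadsafe(message_handler(event), loop)
    return future.result(timeout=timeout)


def run_demo() -> str:
    # Stand-in for the gateway agent loop, running in its own thread.
    loop = asyncio.new_event_loop()
    loop_thread = threading.Thread(target=loop.run_forever, daemon=True)
    loop_thread.start()
    try:
        return bridge_sync(loop, "hello neo")
    finally:
        loop.call_soon_threadsafe(loop.stop)
        loop_thread.join(timeout=5)
        loop.close()


if __name__ == "__main__":
    print(run_demo())
```

A listener started in a different process has no such loop reference to pass in, which matches the "bridge not ready" behaviour described above.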

## CRITICAL defect — fix before anything else (blocker)
`register()` calls **`_start_server_in_thread()` unconditionally** (`__init__.py:546`), guarded only by a module-local `_server_thread` (`__init__.py:95`), which prevents a second start within one process but not across processes. `register(ctx)` runs in **every** process that loads the plugin: generic tool startup (`model_tools.py:197`), gateway startup (`gateway/run.py:4056`), and CLI deferred startup (`cli.py:880`). As a result, multiple processes on one profile **race to bind the same `bind_port`**. The bind failure surfaces only after uvicorn exits (`server.py:327`/`:356`) and is swallowed. If a non-gateway process wins, Route B is dead (no bridge), and which process wins is nondeterministic.
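
The underlying OS behaviour can be reproduced with plain sockets. This is an illustrative sketch only: `try_bind` and `demo_bind_race` are hypothetical names, and both binds happen in one process for simplicity, although the kernel applies the same rule across processes. Without `SO_REUSEADDR`/`SO_REUSEPORT`, whoever binds first keeps the port and the later bind fails with `OSError`.

```python
import socket
from typing import Optional, Tuple


def try_bind(host: str, port: int) -> Optional[socket.socket]:
    # Returns a listening socket, or None if the port is already taken.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock
    except OSError:
        sock.close()
        return None


def demo_bind_race() -> Tuple[bool, bool]:
    # Port 0 asks the OS for any free port, so the demo never collides
    # with a real listener.
    first = try_bind("127.0.0.1", 0)
    if first is None:
        raise RuntimeError("could not bind an ephemeral port")
    port = first.getsockname()[1]
    # A second bind to the same port models the losing process.
    second = try_bind("127.0.0.1", port)
    result = (True, second is not None)
    first.close()
    if second is not None:
        second.close()
    return result


if __name__ == "__main__":
    print(demo_bind_race())
```

The loser gets a clear error at bind time. The defect here is that the error is swallowed, so nothing reports which process ended up owning the port.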

**Fix:** add an explicit process-role gate so **only the gateway/agent process** (the one where `register_platform`/the bridge is available) starts the A2A listener. Other plugin-load contexts must skip `_start_server_in_thread()`. Add a test asserting the server start path fires only in the gateway context.
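
One possible shape for the gate and its test, as a sketch under stated assumptions: the role label `"gateway"`, the helper `should_start_listener`, and the simplified `register_sketch(process_role, start_server)` signature are all hypothetical. The real `register(ctx)` would need to derive the role from whatever the plugin context actually exposes, which this issue does not specify.

```python
from typing import Callable, List

GATEWAY_ROLE = "gateway"  # hypothetical label for the gateway/agent process


def should_start_listener(process_role: str) -> bool:
    # Only the process hosting the gateway agent loop may own the listener.
    return process_role == GATEWAY_ROLE


def register_sketch(process_role: str, start_server: Callable[[], None]) -> bool:
    # Simplified stand-in for register(ctx): returns True if the listener
    # start path was taken, False if it was skipped.
    if not should_start_listener(process_role):
        return False
    start_server()
    return True


def demo_gate() -> List[str]:
    # Simulates the three plugin-load contexts named in this issue and
    # records which of them actually started a listener.
    started: List[str] = []
    for role in ("tools", "gateway", "cli"):
        register_sketch(role, lambda r=role: started.append(r))
    return started


if __name__ == "__main__":
    print(demo_gate())
```

Passing the start function in as a callable keeps the gate testable without opening a real port, which is the property the requested test needs.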

## Implementation (one pass — follows existing patterns, no phases)
1. **Process-role gate** for `_start_server_in_thread()` (the blocker above). Co-locate listener + bridge in the gateway process only.
2. **Per-profile enablement** — each profile that should RECEIVE: `$HERMES_HOME/profiles/<p>/fleet.yaml` → `fleet.enabled: true`, `fleet.response_handler: agent`, `fleet.server.bind_port: <unique>` (mandatory, `fleet_config.py:170`). Map: `switch 9219, neo 9220, morpheus 9221, trinity 9222`. The agent-card already advertises the profile name.
   - **Prereq:** that profile must run `hermes_cli.main --profile <p> gateway run` with `platforms.a2a_fleet` connected (the bridge readies on connect, `gateway/run.py:4157`). A CLI-only process is NOT enough for inbound `agent` mode.
3. **Peer wiring** — each SENDER lists the others as **plain agent peers** (NO `managed`/`mode`/`repo_path` — those are for deployed CLI receivers; plain peers validate fine, `fleet_config.py:237` + test `test_fleet_config.py:67`):
   ```yaml
   agents:
     neo:
       url: http://127.0.0.1:9220/jsonrpc
       agent_card_url: http://127.0.0.1:9220/.well-known/agent-card.json
       token_env: A2A_HERMES_TOKEN_NEO   # profile-scoped name (see step 5)
   ```
   Bidirectional = both list each other + both run a listener.
4. **Handshake convention** — reuse the executor handshake pattern: first message on reserved contextId `handshake:hermes-<peer>` where initiator declares role/purpose and receiver confirms role=agent + profile name + ready. The `agent` handler already processes it; this is a documented convention, not new code.
5. **Profile-scoped token env names** — managed-peer token envs are mode+repo-derived and collision-safe (`managed_peers.py:32/176`), but `fleet.server.token_env` and plain-peer `token_env` are raw `os.environ.get` (`fleet_config.py:104/166/242`). With multiple profiles in ONE host env, names MUST be profile-scoped (e.g. `A2A_HERMES_TOKEN_NEO`, not a generic `SWITCH_A2A_TOKEN`). Loopback dev may use `auth_required: false`.
6. **Docs cleanup** — fix/remove the stale `references/hermes-gateway-plugin-guide.md` (it documents non-existent `/api/plugins/a2a_fleet/jsonrpc|sse|tasks` routes). Add a "Hermes↔Hermes peering" section to the `deploy-fleet` skill (port map, plain-agent-peer shape, handshake, the gateway-run prereq).
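
Steps 2 to 5 can be cross-checked with a small helper that derives each sender's peer entries from the port map. This is a sketch, not part of the plugin: `peer_entries`, `token_env_name`, and `handshake_context_id` are hypothetical helpers, and the `A2A_HERMES_TOKEN_<PROFILE>` pattern is an assumption generalised from the single example name above.

```python
from typing import Dict

# Port map from step 2.
PORT_MAP: Dict[str, int] = {"switch": 9219, "neo": 9220, "morpheus": 9221, "trinity": 9222}


def token_env_name(profile: str) -> str:
    # Assumed naming pattern, generalised from A2A_HERMES_TOKEN_NEO.
    return f"A2A_HERMES_TOKEN_{profile.upper()}"


def handshake_context_id(peer: str) -> str:
    # Reserved contextId convention from step 4.
    return f"handshake:hermes-{peer}"


def peer_entries(sender: str, host: str = "127.0.0.1") -> Dict[str, Dict[str, str]]:
    # Builds the plain-agent-peer mapping a sender would list under `agents:`.
    if len(set(PORT_MAP.values())) != len(PORT_MAP):
        raise ValueError("bind_port values must be unique per profile")
    entries: Dict[str, Dict[str, str]] = {}
    for name, port in PORT_MAP.items():
        if name == sender:
            continue  # a profile does not list itself as a peer
        base = f"http://{host}:{port}"
        entries[name] = {
            "url": f"{base}/jsonrpc",
            "agent_card_url": f"{base}/.well-known/agent-card.json",
            "token_env": token_env_name(name),
        }
    return entries


if __name__ == "__main__":
    for peer, entry in peer_entries("switch").items():
        print(peer, entry, handshake_context_id(peer))
```

The entries contain only `url`, `agent_card_url`, and `token_env`, matching the plain-agent-peer shape in step 3 (no `managed`, `mode`, or `repo_path`).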

## Required before an implementer starts
- A valid `fleet.yaml` with `fleet.server.bind_port` per receiving profile.
- Each receiving profile running `gateway run` with `platforms.a2a_fleet` connected.
- Profile-scoped token envs when `auth_required: true`.

## Out of scope (separate issues — NOT needed for a minimal round-trip)
- Async Task lifecycle (`tasks/*`) and streaming (`message/stream`), both currently stubbed with `-32601`.
- Structured `TASK_RESULT` + `SESSION_ANNOUNCE` (#71).
- P0-2 deploy-tool runtime `repo_path`-empty.

## Why this is small
The listener, in-process bridge, outbound client, agent-card, token resolution, and session model already ship (the `agent` protocol is one of 5 live protocols). Net new work = the process-role gate (blocker) + per-profile config + a documented handshake + doc cleanup. Do NOT build a new HTTP surface or a new IPC bridge — none is needed.

