
[backend] refactor: route web chat through GatewayRunner pipeline for parity with Telegram/Slack/Discord #188

@Interstellar-code


Problem

The Web UI chat currently uses a separate, thin code path (ApiServerAdapter._handle_chat_completions() → _create_agent() → agent.run_conversation()) that bypasses the full gateway pipeline used by all messaging platforms (Telegram, Slack, Discord).

This means the Web UI is missing critical infrastructure that messaging platforms get for free:

| Capability | Telegram | Slack/Discord | Web UI |
|---|---|---|---|
| Persistent sessions (auto-reset, expiry) | ✅ | ✅ | ⚠️ opt-in only |
| Session context injection | ✅ rich | ✅ rich | ❌ none |
| Session hygiene (auto-compress oversized transcripts) | ✅ | ✅ | ❌ |
| Plugin hooks (pre_gateway_dispatch) | ✅ | ✅ | ❌ |
| Auto-skills (topic/channel bindings) | ✅ | ✅ | ❌ |
| Command handling (/stop, /new, /steer, /queue) | ✅ | ✅ | ❌ |
| Interrupt support | ✅ | ✅ | ❌ |
| Agent caching (LRU + idle TTL warm starts) | ✅ | ✅ | ❌ fresh per request |
| Provider routing (allow/ignore/order/sort) | ✅ | ✅ | ⚠️ basic |
| Typing indicators | ✅ | ✅ | ❌ |

Architecture

Current: Two Separate Paths

Web UI:
  POST /v1/chat/completions
    → ApiServerAdapter._handle_chat_completions()
      → _create_agent() [separate, thin]
      → _run_agent()
        → agent.run_conversation()
      ← JSON/SSE response

Telegram/Slack/Discord:
  Platform SDK event
    → PlatformAdapter._handle_*_message()
      → _build_message_event()
      → self.handle_message(event)
        → BasePlatformAdapter.handle_message()
          → _process_message_background()
            → GatewayRunner._handle_message()
              ┌─────────────────────────────────────────┐
              │ FULL GATEWAY PIPELINE                   │
              │ • Auth / pairing                        │
              │ • Session persistence                   │
              │ • build_session_context()               │
              │ • Plugin hooks                          │
              │ • Auto-skills                           │
              │ • Session hygiene (auto-compression)    │
              │ • Vision enrichment                     │
              │ • _run_agent() with agent caching       │
              │ • run_conversation()                    │
              └─────────────────────────────────────────┘
            ← agent_result
            → Response delivery (platform-specific)

Proposed: Shared Pipeline, Forked Delivery

All paths:
  → GatewayRunner._handle_message()
    → [full gateway pipeline]
    ← agent_result

Delivery (platform-specific):
  Telegram → MarkdownV2, chunking, typing, media extraction
  Slack    → Block Kit, threads, reactions
  Web UI   → SSE stream, JSON, no formatting/chunking

Implementation Plan

Phase 1: Route Web Chat Through Gateway Pipeline

File: hermes-agent/gateway/platforms/api_server.py

The /api/sessions/{id}/chat/stream endpoint should:

  1. Build a MessageEvent from the web chat request (mapping session headers to SessionSource)
  2. Route it through self.handle_message(event) like other platform adapters
  3. Let the gateway pipeline handle session management, context injection, hygiene, etc.
  4. Capture the agent result and stream it back as SSE

# Current (thin path):
agent = self._create_agent(ephemeral_system_prompt=system_prompt, session_id=session_id)
result = await self._run_agent(user_message=user_message, ...)

# Proposed (gateway path):
event = self._build_message_event(session_id, user_message, system_prompt, ...)
await self.handle_message(event)  # → full gateway pipeline
# → SSE events from streaming callbacks
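The first three steps above could look roughly like the sketch below. The `MessageEvent` and `SessionSource` classes here are hypothetical stand-ins (the gateway's real types may carry different fields), and it is an assumption that `handle_message()` returns the agent result directly rather than only scheduling background work. The pipeline is passed in as a callable so the sketch runs on its own.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Dict, Optional


# Hypothetical stand-ins for the gateway's own types; real fields may differ.
@dataclass
class SessionSource:
    platform: str
    session_id: str
    memory_scope_key: Optional[str] = None


@dataclass
class MessageEvent:
    text: str
    source: SessionSource
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_message_event(
    session_id: str,
    user_message: str,
    system_prompt: Optional[str] = None,
    platform: str = "api_server",
) -> MessageEvent:
    """Wrap a web chat request in the same event shape other adapters emit."""
    metadata: Dict[str, Any] = {}
    if system_prompt:
        # Assumption: the per-request prompt rides along as event metadata.
        metadata["ephemeral_system_prompt"] = system_prompt
    source = SessionSource(platform=platform, session_id=session_id)
    return MessageEvent(text=user_message, source=source, metadata=metadata)


async def handle_web_chat(
    handle_message: Callable[[MessageEvent], Awaitable[Any]],
    session_id: str,
    user_message: str,
    system_prompt: Optional[str] = None,
) -> Any:
    """Build the event, then defer everything else to the shared pipeline."""
    event = build_message_event(session_id, user_message, system_prompt)
    return await handle_message(event)
```

If `handle_message()` turns out to be fire-and-forget, the result would instead have to come back through the streaming callbacks described in Phase 4.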

Phase 2: SessionSource Mapping

Map web UI headers to SessionSource:

  • X-Hermes-Session-Id → session_id
  • X-Hermes-Session-Key → stable memory scope key
  • Platform = "switchui" (new value) or reuse "api_server"
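A minimal sketch of this mapping, assuming a hypothetical `SessionSource` with `platform`, `session_id`, and `memory_scope_key` fields (the real class may differ). Falling back to the session id when no session key is sent, and rejecting requests without a session id, are both assumptions, not behaviour the issue specifies.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SessionSource:  # hypothetical shape, for illustration only
    platform: str
    session_id: str
    memory_scope_key: str


def session_source_from_headers(
    headers: Mapping[str, str], platform: str = "switchui"
) -> SessionSource:
    """Translate the web UI's X-Hermes-* headers into a session source."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}

    session_id = normalized.get("x-hermes-session-id", "")
    if not session_id:
        raise ValueError("X-Hermes-Session-Id header is required")

    # Assumption: without an explicit key, memory is scoped to the session.
    scope_key = normalized.get("x-hermes-session-key") or session_id
    return SessionSource(
        platform=platform, session_id=session_id, memory_scope_key=scope_key
    )
```

Passing `platform="api_server"` covers the "reuse" option without changing the mapping logic.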

Phase 3: Platform Toolsets Config

Add platform_toolsets.switchui to config.yaml so the web chat has its own toolset configuration, matching the pattern used by platform_toolsets.telegram.
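As a rough illustration, the lookup might behave like the sketch below, with the parsed config.yaml represented as a plain dict. The toolset names are invented placeholders, `resolve_platform_toolsets` is a hypothetical helper, and the fallback to another platform's entry is an assumption about how a missing `switchui` key could be handled.

```python
from typing import Any, List, Mapping


def resolve_platform_toolsets(
    config: Mapping[str, Any], platform: str, fallback: str = "api_server"
) -> List[str]:
    """Return the toolsets configured for a platform, or the fallback's."""
    table = config.get("platform_toolsets") or {}
    if platform in table:
        return list(table[platform])
    # Assumption: an unconfigured platform inherits the fallback's toolsets.
    return list(table.get(fallback, []))


# Stand-in for the parsed config.yaml; toolset names are placeholders.
example_config = {
    "platform_toolsets": {
        "telegram": ["web", "terminal"],
        "api_server": ["web"],
        "switchui": ["web", "terminal", "files"],
    }
}

switchui_toolsets = resolve_platform_toolsets(example_config, "switchui")
```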

Phase 4: Streaming Callbacks Through Gateway Path

Ensure stream_delta_callback, tool_start_callback, and tool_complete_callback are wired into the agent when dispatched through handle_message(), so SSE events still flow to the Web UI.
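One way to keep SSE flowing is a small bridge whose methods are handed to the agent as callbacks and whose output is consumed by the HTTP response. The sketch below is an assumption-heavy illustration: the callback signatures, the `SseBridge` class, and the event names (`delta`, `tool_start`, `tool_complete`) are all hypothetical; only the SSE frame layout (`event:` and `data:` lines ending in a blank line) follows the standard format.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Dict


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Serialise one server-sent event frame."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


class SseBridge:
    """Collects agent callbacks and replays them as SSE frames."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    # Callback signatures below are assumed, not taken from the agent's API.
    def stream_delta_callback(self, text: str) -> None:
        self._queue.put_nowait(("delta", {"text": text}))

    def tool_start_callback(self, name: str, args: Dict[str, Any]) -> None:
        self._queue.put_nowait(("tool_start", {"name": name, "args": args}))

    def tool_complete_callback(self, name: str, result: Any) -> None:
        self._queue.put_nowait(("tool_complete", {"name": name, "result": result}))

    def close(self) -> None:
        """Signal end of stream with a sentinel."""
        self._queue.put_nowait(None)

    async def frames(self) -> AsyncIterator[str]:
        """Yield SSE frames in the order the callbacks fired."""
        while True:
            item = await self._queue.get()
            if item is None:
                return
            event, data = item
            yield format_sse(event, data)
```

If the agent runs in a worker thread, the callbacks would need to hop onto the event loop (for example via `loop.call_soon_threadsafe`), since `asyncio.Queue` is not thread-safe.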

Phase 5: Override Delivery Tail

The Web UI delivery differs from messaging platforms:

  • No MarkdownV2 conversion (Web UI renders its own markdown)
  • No message chunking
  • No MEDIA:<path> extraction (handled differently)
  • No typing indicators (Web UI has its own loading state)
  • JSON/SSE response format

This can be achieved by:

  • The web endpoint providing its own _message_handler callback
  • Or by detecting platform="switchui" and skipping platform-specific formatting
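The second option (branching on the platform value) can be sketched as a small dispatch table. Everything here is illustrative: the function names are hypothetical, the messaging branch only models chunking (not MarkdownV2 or media extraction), and the 4096-character default mirrors Telegram's message text limit.

```python
from typing import Callable, Dict, List


def chunk_text(text: str, limit: int) -> List[str]:
    """Split text into pieces of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)] or [""]


def deliver_messaging(text: str, limit: int = 4096) -> List[str]:
    """Messaging platforms: the reply is chunked to fit message limits."""
    return chunk_text(text, limit)


def deliver_web(text: str) -> List[str]:
    """Web UI: the reply passes through untouched, as a single payload."""
    return [text]


# Hypothetical registry; real adapters would own richer delivery logic.
DELIVERY: Dict[str, Callable[[str], List[str]]] = {
    "telegram": deliver_messaging,
    "switchui": deliver_web,
}


def deliver(platform: str, text: str) -> List[str]:
    """Pick the delivery tail for a platform; default to the messaging path."""
    handler = DELIVERY.get(platform, deliver_messaging)
    return handler(text)
```

The first option (a dedicated `_message_handler` callback) would keep this branching out of shared code, at the cost of the web endpoint owning more of the delivery path.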

Expected Benefits

After implementation, Web UI chat will immediately gain:

  1. Warm agent starts — LRU cache with idle TTL (no cold start per request)
  2. Session context — "You are on Switch UI. User: Rohit..." injected into system prompt
  3. Auto-compression — sessions that grow too large get compressed at 85% threshold
  4. Plugin hooks — full pre_gateway_dispatch plugin chain
  5. Interrupt support: /stop and agent interruption work
  6. Provider routing — per-session model/provider routing applies
  7. Auto-skills — topic/channel skill bindings work if configured
  8. Session hygiene — consistent with messaging platforms
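To make the first benefit concrete, here is a minimal sketch of an LRU cache with an idle TTL. It is not the gateway's actual cache: `AgentCache`, its parameters, and the default sizes are hypothetical, and the clock is injectable only so the behaviour can be exercised without sleeping.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Tuple


class AgentCache:
    """Keeps recently used agents warm; drops idle or least-recent ones."""

    def __init__(
        self,
        max_size: int = 8,
        idle_ttl: float = 600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_size = max_size
        self._idle_ttl = idle_ttl
        self._clock = clock
        # key -> (agent, last_used); insertion order doubles as recency order.
        self._entries: "OrderedDict[Hashable, Tuple[Any, float]]" = OrderedDict()

    def get_or_create(self, key: Hashable, factory: Callable[[], Any]) -> Any:
        now = self._clock()
        self._evict_idle(now)
        if key in self._entries:
            agent, _ = self._entries.pop(key)  # warm start: reuse the agent
        else:
            agent = factory()                  # cold start: build a new one
        self._entries[key] = (agent, now)      # re-insert as most recent
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # drop the least recently used
        return agent

    def _evict_idle(self, now: float) -> None:
        stale = [
            key
            for key, (_, last_used) in self._entries.items()
            if now - last_used > self._idle_ttl
        ]
        for key in stale:
            del self._entries[key]
```

Keying the cache by session id would give each Web UI session a warm agent between requests, which is the behaviour the acceptance criteria ask for.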

Acceptance Criteria

  • Web chat routes through GatewayRunner._handle_message()
  • Session context is injected (platform, user, workspace)
  • Agent caching works (warm starts, not fresh per request)
  • Session hygiene auto-compresses oversized transcripts
  • SSE streaming still works for tool events and token deltas
  • /stop and interrupt work from the Web UI
  • No regression in existing Web UI functionality
  • /v1/chat/completions (OpenAI-compatible) continues to work as-is for external clients

Notes

  • The /v1/chat/completions endpoint (OpenAI-compatible API) should remain unchanged — it serves external clients that expect the OpenAI protocol
  • The /api/sessions/{id}/chat/stream endpoint (used by Switch UI) is the one to refactor
  • The gateway pipeline was designed to be platform-agnostic above the delivery layer; the API server just never connected to it
  • This is primarily a gateway-side change; minimal changes needed in the Switch UI frontend

Related Code

  • Thin path (to be replaced): hermes-agent/gateway/platforms/api_server.py → _handle_chat_completions(), _create_agent(), _run_agent()
  • Full pipeline (to be reused): hermes-agent/gateway/run.py → _handle_message(), _handle_message_with_agent(), _run_agent()
  • Base adapter (routing): hermes-agent/gateway/platforms/base.py → handle_message(), _process_message_background()
  • Telegram adapter (reference): hermes-agent/gateway/platforms/telegram.py → _handle_text_message(), _build_message_event()

Metadata


Labels: P1 (High priority), enhancement (New feature or request)
