Problem
The history pipeline is "fetch everything, trim on the consumer" at every layer:
src/routes/api/history.ts:95 - getMessages(sessionKey) pulls the entire message list from the gateway; :119 then applies messages.slice(-limit) in memory. limit is therefore cosmetic: the full transcript (tool payloads + thinking blocks) always crosses the wire gateway → workspace.
src/server/hermes-api.ts:264-275 and claude-dashboard-api.ts:133-138 - the message endpoints accept no limit/offset.
src/screens/chat/chat-queries.ts:69 - the frontend hardcodes limit: '1000', making the trim a no-op for most sessions.
use-realtime-chat-history.ts:492 - full re-fetch every 30s plus on every stream-clear.
use-chat-history.ts:394-397 - staleTime: 0 + refetchOnWindowFocus: true with no streaming guard: re-focusing the tab mid-stream fires a full history fetch that races with, and can overwrite, the live SSE buffer. Likely contributor to #208 (responses "scrappy" when leaving and returning to the chat window).
history.ts:57-92 - resolving "main" costs listSessions(30,0) + getMessages, i.e. two sequential gateway round-trips per default view; the main-resolution heuristic is duplicated in send-stream.ts.
Symptom: slow session open and heavy 30s background sync on large sessions (e.g. the 505-message session), plus SSR CPU/memory cost on every refetch.
Fix
Push limit/offset (or a ?since=<id> delta) into the gateway messages call so trimming happens at source. This needs a small hermes-agent change; workspace-side mitigation until then: request the tail only and disable focus-refetch while streaming.
Extract the "main" resolution into one shared helper; briefly cache the resolved id.
Found in chat-area audit 2026-06-11. Related: #208.
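For the use-chat-history.ts focus-refetch race described above, one possible workspace-side guard is sketched here. This is only an illustration under assumptions: historyQueryOptions and applyFetchedHistory are hypothetical names, not functions from the repo, and the choice to set staleTime to Infinity while streaming is an assumption rather than the project's actual policy. Only staleTime and refetchOnWindowFocus are taken from the issue text.

```typescript
// Hypothetical sketch: these names are illustrative and do not exist in the repo.

interface HistoryQueryOptions {
  staleTime: number;
  refetchOnWindowFocus: boolean;
}

// Build the history query options from the current streaming state.
// While a stream is active, focus-driven refetches are switched off so a
// full history fetch cannot race the live SSE buffer.
function historyQueryOptions(isStreaming: boolean): HistoryQueryOptions {
  return {
    // Assumption: treat cached history as fresh for the duration of a stream.
    staleTime: isStreaming ? Infinity : 0,
    refetchOnWindowFocus: !isStreaming,
  };
}

// Second line of defence: if a fetch that started earlier resolves while a
// stream is still active, keep the live buffer instead of overwriting it.
function applyFetchedHistory<T>(
  live: T[],
  fetched: T[],
  isStreaming: boolean,
): T[] {
  return isStreaming ? live : fetched;
}
```

The two pieces are independent: the options guard stops new fetches from starting mid-stream, while applyFetchedHistory discards the result of a fetch that was already in flight when the stream began.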
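To make "trimming at source" concrete, here is a minimal sketch of the selection logic a gateway-side messages handler could apply before anything crosses the wire. Everything in it is an assumption for illustration: the Message shape, the PageParams fields, and the selectMessages helper are hypothetical, and the semantics chosen (offset counted back from the newest message, since returning everything after the given id, fallback to the tail window when the id is unknown) are one reasonable reading of the proposal, not a confirmed hermes-agent API.

```typescript
// Hypothetical message shape; the real gateway type is not shown in the issue.
interface Message {
  id: string;
  content: string;
}

// Hypothetical query parameters; values are assumed to be non-negative integers.
interface PageParams {
  limit?: number; // maximum number of messages to return
  offset?: number; // messages to skip, counted back from the newest
  since?: string; // delta sync: return only messages after this id
}

// Select the slice of the transcript to send, oldest first.
function selectMessages(all: Message[], params: PageParams = {}): Message[] {
  const { limit, offset = 0, since } = params;

  if (since !== undefined) {
    const idx = all.findIndex((m) => m.id === since);
    // Known id: return the delta only (limit is not applied to deltas here).
    if (idx !== -1) return all.slice(idx + 1);
    // Unknown id: fall through to the tail window rather than returning nothing.
  }

  const end = Math.max(all.length - offset, 0);
  const start = limit === undefined ? 0 : Math.max(end - limit, 0);
  return all.slice(start, end);
}
```

With this shape, a session open would request something like limit=50 for the tail, and the 30s background sync would send since=<last known id> so that an idle session transfers an empty array instead of the full transcript.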