[CAPABILITY] Cron agent sessions do not survive gateway restarts: interrupted jobs vanish without record or retry (#105)

## Problem (observed 2026-06-10, cost a full daily cycle)
`evolution-implementation` fired on schedule at 22:00:33 and worked for ~30 minutes (46 API calls). At 22:30:47 the gateway was restarted: the operator was applying provider-failover config via Telegram, and the agent itself initiated the planned restart. The cron session was killed mid-flight:
- no result was recorded — `hermes cron list` shows NO "Last run" for the job at all (as if it never ran);
- no retry / no re-fire — the daily slot was simply lost;
- downstream `evolution-integration` ran at 23:01, found nothing, reported "ok" — the whole night looked healthy while producing nothing.

The session log ends abruptly: the last `agent.log` entry for `cron_ac158635a029_20260610_220033` is at 22:30:32, and the new gateway process initializes MCP at 22:30:51.

## Why it matters
Gateway restarts are ROUTINE (config changes, the nightly `hermes update` auto-update at 04:17, `/restart`). Any cron job that overlaps one dies silently. Because no failure is recorded, the loss is invisible: the watchdog proposed in #83 would detect the gap the next day, but the work would still be lost.

## Proposed direction
Any of (in increasing ambition):
1. On gateway startup, detect cron sessions that were in-flight at shutdown (marker file written at job start, cleared at completion) and record them as `interrupted` — making the failure visible to `cron list` and the future watchdog (#83).
2. Re-fire interrupted jobs once on startup if still within N hours of their scheduled slot.
3. Graceful drain: planned restarts already drain user sessions — extend the drain to wait (bounded) for running cron sessions, or delay the restart until the job completes.
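
Options 1 and 2 could be sketched roughly as follows. This is a minimal illustration, not the gateway's actual code: the marker directory, the function names (`mark_started`, `mark_finished`, `scan_interrupted`), the JSON fields, and the `attempt` counter are all hypothetical, and the re-fire window stands in for the "N hours" above.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical location; the real gateway state directory is not given in this issue.
MARKER_DIR = Path(tempfile.gettempdir()) / "cron-inflight-sketch"


def mark_started(job, session_id, scheduled_at, attempt=1, marker_dir=MARKER_DIR):
    """Write a marker file when a cron session starts (option 1)."""
    marker_dir.mkdir(parents=True, exist_ok=True)
    path = marker_dir / f"{session_id}.json"
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({
        "job": job,
        "session_id": session_id,
        "scheduled_at": scheduled_at,  # epoch seconds of the scheduled slot
        "attempt": attempt,            # 1 = original run, 2 = re-fire
    }))
    tmp.replace(path)  # rename into place so a crash never leaves a partial marker
    return path


def mark_finished(session_id, marker_dir=MARKER_DIR):
    """Clear the marker when the session completes, whether it succeeded or failed."""
    (marker_dir / f"{session_id}.json").unlink(missing_ok=True)


def scan_interrupted(marker_dir, now, refire_window_hours):
    """Run once at gateway startup, before the scheduler launches new jobs.

    Any marker still present belongs to a session that was in flight when the
    previous gateway process stopped. The caller is expected to persist each
    returned record (for example as the job's last run) before acting on it.
    """
    if not marker_dir.is_dir():
        return []
    records = []
    for path in sorted(marker_dir.glob("*.json")):
        info = json.loads(path.read_text())
        age_hours = (now - info["scheduled_at"]) / 3600
        info["status"] = "interrupted"
        # Option 2: re-fire at most once, and only close to the scheduled slot.
        info["refire"] = info["attempt"] == 1 and 0 <= age_hours <= refire_window_hours
        records.append(info)
        path.unlink()  # marker is consumed; a re-fired session writes a new one
    return records
```

In this sketch a re-fired session would call `mark_started(..., attempt=2)`, so a second interruption is still recorded as `interrupted` but is not re-fired again. Passing `now` explicitly rather than reading the clock inside the function keeps the startup scan easy to test.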

## Value
- Impact: 0.8 (a routine operation silently destroys daily cycles; happened on day one of observation)
- Effort: 0.4 (option 1 is a marker file + startup scan; options 2 and 3 are incremental)
- Priority Score: 4.0
