/cleanup slash command — detect and resolve orphaned, stale, and hung processes from previous sessions #1790

@berrat

Describe the feature or problem you'd like to solve

When a Copilot CLI session crashes, is force-killed, or exits uncleanly, it can leave behind a range of residual state: orphaned child processes, hanging MCP server processes, stale lock files, lingering worker or sub-agent processes, and session database files in an inconsistent state. Currently there is no built-in way to detect or resolve any of this. Users who experience repeated crashes or unexpected behaviour across sessions have no tool to audit or clean up leftover state — they must either reboot, hunt for processes manually, or live with the side effects.

The fix must be lightweight and token-free — a pure local operation that does not invoke the LLM, consume API quota, or count as a premium request.

Proposed solution

Introduce a /cleanup slash command that performs a fast, local-only scan of all residual state from previous sessions and presents a concise, actionable report — without invoking the model or consuming any API quota.

What it scans:

| Category | What it looks for |
| --- | --- |
| Orphaned processes | Child processes (MCP servers, sub-agents, worker tasks) whose parent session PID no longer exists |
| Hung / unresponsive processes | Copilot CLI processes that are running but not responding (detectable via PID status + lock file age) |
| Stale lock files | .lock or session mutex files left behind by crashed instances |
| Inconsistent session state | session.db files from sessions that never cleanly closed (no clean-exit marker) |
| Abandoned log files | Oversized or orphaned log files from sessions that did not rotate cleanly |
| Zombie sub-agents | Background agent processes that are still alive but whose parent session is gone |
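
The orphan check in the table above could be sketched roughly as follows. This is only an illustration in Python, not how Copilot CLI is implemented: `ChildRecord`, `find_orphans`, and the idea that each session records the PIDs of the children it spawns are all assumptions made for the example. A real implementation would also need to guard against PID reuse, for instance by comparing process start times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildRecord:
    """Hypothetical record a session might persist for each child it spawns."""
    pid: int          # PID of the child (MCP server, sub-agent, worker)
    session_pid: int  # PID of the Copilot CLI session that spawned it
    name: str


def find_orphans(records: list[ChildRecord], live_pids: set[int]) -> list[ChildRecord]:
    """Return children that are still running although their session PID is gone."""
    return [
        r for r in records
        if r.pid in live_pids and r.session_pid not in live_pids
    ]
```

Here `live_pids` stands for a snapshot of currently running PIDs obtained by whatever platform mechanism is available; how that snapshot is taken is deliberately left out of the sketch.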

Output format — compact, structured, actionable:

/cleanup — Session Health Check

  ✖  2 orphaned MCP server processes (playwright PID 18842, git PID 19001)
  ✖  1 stale lock file  (~/.copilot/session-state/813d.../session.lock  age: 4h 12m)
  ⚠  1 inconsistent session DB  (~/.copilot/session-state/813d.../session.db)
  ✔  No hung Copilot CLI processes found
  ✔  No zombie sub-agent processes found

  Found 3 issue(s).  Resolve all? (Y/n)  — or press Enter to review individually
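
One possible way to produce a report body like the one above, shown as a small Python sketch. `Finding`, `render_report`, and `format_age` are hypothetical names introduced for illustration; the icons and the summary wording are taken from the sample output, and the confirmation prompt is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str  # "error", "warn" or "ok" (assumed levels)
    message: str


_ICONS = {"error": "✖", "warn": "⚠", "ok": "✔"}


def format_age(seconds: int) -> str:
    """Format a lock file age as hours and minutes, e.g. '4h 12m'."""
    hours, remainder = divmod(seconds, 3600)
    return f"{hours}h {remainder // 60}m"


def render_report(findings: list[Finding]) -> list[str]:
    """Return the report lines: one per finding, a blank line, then a summary."""
    lines = [f"  {_ICONS[f.severity]}  {f.message}" for f in findings]
    issues = sum(1 for f in findings if f.severity != "ok")
    lines.append("")
    if issues:
        lines.append(f"  Found {issues} issue(s).")
    else:
        lines.append("  ✔ All clear. No orphaned or stale state found.")
    return lines
```

Building the report as a list of strings keeps rendering separate from scanning, so the same findings could later be emitted in another format without touching the checks.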

Resolution options (per item or all-at-once):

No LLM involvement — this is entirely a local process/filesystem operation. It must not trigger any API call, token usage, or premium request.

Additional flags for automation:

/cleanup --dry-run    # Report issues without taking any action
/cleanup --force      # Resolve everything without confirmation prompts
/cleanup --stale-only # Only target lock files and session state, skip processes
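
How the three flags might combine can be sketched with Python's standard `argparse` module. The `plan` function and its result keys are hypothetical, and the rule that `--dry-run` takes precedence when combined with `--force` is an assumption chosen for safety; the proposal itself does not specify this.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="/cleanup", add_help=False)
    parser.add_argument("--dry-run", action="store_true",
                        help="Report issues without taking any action")
    parser.add_argument("--force", action="store_true",
                        help="Resolve everything without confirmation prompts")
    parser.add_argument("--stale-only", action="store_true",
                        help="Only target lock files and session state")
    return parser


def plan(argv: list[str]) -> dict[str, bool]:
    """Translate flags into a simple execution plan (illustrative only)."""
    args = build_parser().parse_args(argv)
    apply_changes = not args.dry_run               # assumption: --dry-run wins over --force
    confirm = apply_changes and not args.force     # prompt only when changes will be made
    return {
        "scan_processes": not args.stale_only,
        "apply": apply_changes,
        "confirm": confirm,
    }
```

For example, `plan(["--stale-only"])` would skip the process scan but still prompt before removing lock files or session state.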

Example prompts or workflows

  1. User returns after a machine sleep that killed their session. Runs /cleanup — two orphaned MCP servers and a stale lock file are found and resolved in seconds, zero tokens spent.
  2. User notices a new session starts slowly. /cleanup --dry-run reveals a hung Copilot CLI process from a previous crash is still holding resources.
  3. CI/CD script runs /cleanup --force before starting a fresh session to guarantee a clean environment with no residual state from a previous run.
  4. User runs /cleanup and finds an inconsistent session DB. Chooses to attempt recovery (handing off to the queue recovery flow proposed in #1780, "Recover queued and pending prompts after agent crash or unexpected exit") rather than delete it.
  5. /cleanup finds nothing: it outputs "✔ All clear. No orphaned or stale state found." and exits immediately, consuming no tokens and negligible time.

Additional context

  • OS: Windows 11 Pro (Build 26200 / 24H2)
  • Shell: PowerShell 7.5.4
  • Copilot CLI: 0.0.420 (win32-x64)
  • This command must never invoke the LLM — it is a local diagnostic and remediation tool only. No API calls, no token usage, no premium requests consumed under any circumstances.
  • /diagnose already handles connectivity checks — /cleanup is the complementary local health check for process and session state.
  • On Windows, process termination should use Stop-Process / TerminateProcess; on Unix, SIGTERM followed by SIGKILL after a grace period.
  • The session state directory (~/.copilot/session-state/) and log directory (~/.copilot/logs/) are the primary scan targets — both are already well-defined locations.
  • Related: #1780, "Recover queued and pending prompts after agent crash or unexpected exit" (queue recovery from crashed sessions). The two are complementary: /cleanup handles processes and locks, while #1780 handles prompt data.
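
The termination strategy mentioned above (polite stop first, forced kill after a grace period) could look roughly like this in Python. It is a sketch under a simplifying assumption: it operates on a `subprocess.Popen` handle, which only exists for processes the caller started itself. Orphans from earlier sessions would have to be addressed by PID through a platform API instead. `stop_gracefully` and `grace_seconds` are hypothetical names.

```python
import subprocess


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask a process to exit, then force-kill it if it outlives the grace period."""
    if proc.poll() is not None:
        return "already-exited"
    proc.terminate()  # SIGTERM on Unix; TerminateProcess on Windows
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL on Unix; equivalent to terminate() on Windows
        proc.wait()   # reap the process so it does not linger as a zombie
        return "killed"
```

On Windows both calls end in TerminateProcess, so the grace period mainly matters on Unix, where a process can handle SIGTERM and shut down cleanly.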
