Skip to content

F4.A — Debug capture: JSONL writer, redaction, retention, SSE live tail #36

@HXYerror

Description

@HXYerror

Part of #23. Depends on F1.D, F2.A, F2.B, F2.C.

Background

Today only --verbose console logs exist. Operators debugging Anthropic↔OpenAI translation, model-specific quirks, or upstream errors need full request/response visibility — but storing prompts/responses on disk is the kind of feature that creates GDPR/SOC2 exposure.

⚠️ Reviewer caveats — must be addressed

Privacy posture (devil-advocate, security S7): Storing other users' prompts on the operator's disk creates a data controller relationship. Default retention should arguably be 0 (in-memory only); on-disk persistence should require explicit opt-in beyond just enabling the flag.

Redaction completeness (security S1, backend B12): The original draft regex /gh[opusr]_/ misses:

  • github_pat_… (fine-grained PATs, contain underscores)
  • The upstream Copilot bearer (JWT shape eyJ…\.eyJ…\.…)
  • OAuth client id Iv1\.b507a08c87ecfe98 (in src/lib/api-config.ts)

Without complete redaction, every captured upstream request leaks a working Copilot JWT in plaintext on disk.

Path traversal (security S5): /admin/traces/:date.jsonl MUST validate input strictly.

Goal

Per-key (or admin-header) capture of full request/response to JSONL with mandatory redaction, bounded retention, and live tail.

Tasks

  • src/middleware/trace.ts: enabled when key.debug_enabled === true OR X-Capi-Debug: 1 from admin tier (per F2.C)
  • Capture both legs: client→proxy AND proxy→upstream (the latter contains the Copilot JWT — redaction MUST cover it)
  • Capture shape per request: {trace_id, ts, key_id, route, req:{method,url,headers,body}, upstream_req:{...}, upstream_res:{status,headers,body|stream_chunks}, res:{...}, latency_ms}
  • Redaction (mandatory, defense-in-depth):
    • Strip headers by name (case-insensitive): authorization, x-api-key, cookie, set-cookie, proxy-authorization
    • Body patterns to mask: gh[oprsu]_[A-Za-z0-9]{20,}, github_pat_[A-Za-z0-9_]{20,}, JWT shape eyJ[A-Za-z0-9_-]+?\.eyJ[A-Za-z0-9_-]+?\.[A-Za-z0-9_-]+, Iv1\.[a-f0-9]{16}
    • Block trace write if any unmasked match is detected after redaction (defense-in-depth: redaction sanity check before persistence)
    • Add fuzz tests
  • JSONL writer: append to ~/.local/share/copilot-api/traces/YYYY-MM-DD.jsonl, mode 0600, line-buffered; rotation by day in filename
  • Retention:
    • Default 7 days, configurable in config.json
    • 1 GB total cap, drop oldest day on overflow; alarm/log when eviction triggers within retention window (per backend review Add Copilot-Vision-Request header for image inputs in Responses path #20)
    • Default traces_days = 0 (in-memory only) when first introducing the feature; admin must explicitly set a non-zero value (per devil-advocate privacy posture)
  • SSE live tail at /admin/traces/stream (admin tier):
    • Single in-process broadcaster (per backend review Anthropic /v1/messages → Responses API adapter #8) — Set<ReadableStreamController>. Writer pushes each post-redaction line to all subscribers. NOT N file watchers.
    • Cap concurrent subscribers at 4
    • Bounded per-subscriber queue (drop-oldest at 1 MB)
    • Heartbeat ping every 15 s
    • Last-Event-ID reconnect support
  • Download endpoint /admin/traces/:date.jsonl (admin tier):
    • Validate ^\d{4}-\d{2}-\d{2}$ strictly (per security S5)
    • Resolve against tracesDir; assert resolved path startsWith tracesDir + sep
    • Reject anything else with 400
  • Trace viewer page /admin/traces showing live tail (with banner if any key has debug_enabled per F2.E)
  • Debug TTL auto-disable hook: works with F2.E debug-ttl-sweeper
  • Audit log the toggle event (F2.D)
  • README security notes: document the threat model (anyone with read access to the data dir can read traces); recommend chmod 0700 on the data dir; install script should set this
  • Tests:
    • Redaction completeness: fuzz test asserts NO unmasked gho_, ghp_, ghu_, ghs_, ghr_, github_pat_, eyJ…, Iv1\.… in any sample after redaction
    • File perms 0600 verified
    • Retention sweep evicts correctly under burst
    • SSE multiple subscribers receive same data
    • Path traversal attempts (../, URL-encoded variants) rejected
    • 1GB cap enforced

Acceptance criteria

  • A request with debug_enabled produces exactly one redacted JSON line; grep -E 'gho_|ghp_|ghu_|github_pat_|eyJ' file finds nothing
  • tail -f and /admin/traces/stream show identical content within 100 ms
  • curl /admin/traces/../../etc/passwd returns 400
  • Traces dir total size never exceeds 1 GB across long-running sessions
  • Default install with no admin action → no traces persisted (because traces_days=0 default)

File pointers

  • New: src/middleware/trace.ts, src/services/trace-writer.ts, src/services/trace-broadcaster.ts, src/services/trace-retention.ts, src/admin/traces/{stream,download,page}.tsx, tests/trace.test.ts, tests/trace-redaction-fuzz.test.ts
  • Touch: src/lib/paths.ts (tracesDir()), src/admin/layout.tsx, README.md

Dependencies

Depends on F1.D, F2.A, F2.B, F2.C.

Metadata

Metadata

Assignees

No one assigned

    Labels

    debugRequest/response debug capturesecuritySecurity-sensitivestoragePersistence layer

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions