[Epic] Admin Plane: model config, multi-key auth, usage dashboard, debug mode

## Background

copilot-api today is a strict single-tenant proxy. To run it as an internal/team gateway we need an admin plane with: configurable model exposure (with optional WebUI), multi-client API key auth, usage telemetry dashboard, and on-demand request/response debug capture.

## Approved architecture

Captured during the OPC build phase:

- **Three storage layers** by access pattern (the append-only JSONL layer holds two logs, traces and audit):
  - `~/.local/share/copilot-api/config.json` — model aliases, retention knobs, feature flags (rare writes, atomic rename + `fs.watch` hot reload)
  - `~/.local/share/copilot-api/copilot-api.db` — SQLite WAL — keys + usage events + sessions (transactional, indexed)
  - `~/.local/share/copilot-api/traces/YYYY-MM-DD.jsonl` — debug request/response (high-volume append-only)
  - `~/.local/share/copilot-api/audit.jsonl` — admin audit log (append-only)
- **Auth**: opaque `sk-cap-<32B base32>` keys, sha256 at rest, two tiers (admin / client). Bearer header carries our key; proxy strips before forwarding upstream.
- **Migration**: see Q1 below — default decision pending operator review.
- **WebUI**: `hono/jsx` server-rendered + uPlot vendored, mounted at `/admin/*` same port. Zero new build pipeline. CSRF + CSP + HttpOnly Secure SameSite=Strict session cookie.
- **Telemetry**: events table records token counts only — never request/response bodies (that's debug mode's job).
- **Debug**: opt-in per-key OR per-request `X-Capi-Debug` (admin only). Default retention 0 days (in-memory only); 7-day cap when enabled; 1 GB total hard cap.
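
The Auth bullet above (opaque `sk-cap-` keys, sha256 at rest) could look roughly like the sketch below. It assumes "32B" means 32 random bytes and that the base32 form is the lowercase RFC 4648 alphabet without padding; neither is settled in this epic. The function names `base32`, `hashKey`, `generateKey`, and `verifyKey` are hypothetical.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Assumption: RFC 4648 base32 alphabet, lowercased, no "=" padding.
const BASE32_ALPHABET = "abcdefghijklmnopqrstuvwxyz234567";

// Encode bytes as unpadded base32, 5 bits per output character.
function base32(bytes: Uint8Array): string {
  let value = 0;
  let bits = 0;
  let out = "";
  bytes.forEach((byte) => {
    value = (value << 8) | byte;
    bits += 8;
    while (bits >= 5) {
      out += BASE32_ALPHABET.charAt((value >>> (bits - 5)) & 31);
      bits -= 5;
    }
    value &= (1 << bits) - 1; // keep only the bits not yet emitted
  });
  if (bits > 0) out += BASE32_ALPHABET.charAt((value << (5 - bits)) & 31);
  return out;
}

// Only this digest would be stored; the plaintext is shown once at creation.
function hashKey(plaintext: string): string {
  return createHash("sha256").update(plaintext).digest("hex");
}

function generateKey(): { plaintext: string; hash: string } {
  const plaintext = "sk-cap-" + base32(randomBytes(32));
  return { plaintext, hash: hashKey(plaintext) };
}

// Compare digests in constant time so timing does not leak how close a guess was.
function verifyKey(presented: string, storedHash: string): boolean {
  const a = Buffer.from(hashKey(presented), "hex");
  const b = Buffer.from(storedHash, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}
```

With 32 random bytes the key body comes out at 52 base32 characters. Because the key is high-entropy, a plain sha256 (rather than a password hash) is a reasonable reading of "sha256 at rest".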

## ⚠️ Open strategic questions (must be decided before implementation)

These were raised during multi-role review and are **not** purely engineering decisions. Resolving them affects scope and even whether sub-issues should be implemented.

### 🔴 Q1 — GitHub abuse-detection risk (devil-advocate, security S2)
The README explicitly warns that fan-out from a single OAuth identity to many concurrent users triggers GitHub Security flags and account suspension. This admin plane *productizes* fan-out. Pick one:
1. Ship as designed with a startup banner
2. **Recommended**: `auth=on` default in v0.8 with auto-bootstrap; `--no-auth` requires `--i-accept-account-suspension-risk` flag and refuses non-loopback bind without it (#33 default)
3. Cap concurrent in-flight upstream requests per OAuth identity by default
4. Recommend litellm for multi-tenant; keep copilot-api single-tenant
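
To make option 3 concrete: the cap could be as small as a counter around the upstream call. This is only a sketch; the class name `InFlightCap` and the choice to reject rather than queue when the cap is hit are assumptions, and the actual limit value is not specified anywhere in this epic.

```typescript
// Hypothetical helper: caps concurrent upstream requests for one OAuth identity.
class InFlightCap {
  private inFlight = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Reserves a slot and returns true, or returns false when the cap is reached.
  tryAcquire(): boolean {
    if (this.inFlight >= this.limit) return false;
    this.inFlight += 1;
    return true;
  }

  // Call exactly once per successful tryAcquire, e.g. in a finally block.
  release(): void {
    if (this.inFlight > 0) this.inFlight -= 1;
  }
}
```

A handler would call `tryAcquire()` before forwarding upstream, answer with an error status when it returns `false`, and call `release()` once the upstream response (including any stream) has finished.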

### 🔴 Q2 — Are we duplicating litellm?
litellm already provides multi-key auth, virtual keys, dashboards, spend tracking, multi-provider, retention. copilot-api's moat is being a small, single-process, protocol-faithful Copilot adapter. Pick one:
1. Ship full plan as designed
2. **Recommended by devil-advocate**: ship MVP only — `config.json` (#24) with model exposure + per-key allow-lists, `/usage` JSON endpoint, no SQLite/WebUI/debug-capture. Defer the rest until concrete demand emerges.
3. Sequence: ship Responses API epic (#1) FIRST; revisit admin plane after.

### 🟡 Q3 — Privacy posture flip
Today the operator processes only their own prompts (no data-controller relationship). Multi-client + debug capture turns the operator into a data controller in EU/UK and a "service provider" under CCPA. Recommended in #36: default `traces_days = 0` (in-memory only); on-disk retention requires explicit operator opt-in.

### 🟡 Q4 — Sequencing vs Responses API epic (#1)
Epic #1 (Responses API + Anthropic fidelity) covers protocol work for the existing solo-Claude-Code user base. Doing the admin plane epic first delays #1. Recommended: ship #1 first, admin plane second.

## 🚨 Hard PR-blockers identified by review

These must land in their respective sub-issues; a PR that misses any one of them is not mergeable:

| Tag | Issue | Blocker |
|---|---|---|
| **S1** | #36 | Trace redaction must cover `github_pat_…`, Copilot JWT `eyJ…\.eyJ…\.…`, `Iv1\.…` OAuth client id; redact by header name unconditionally on **both ingress and egress** |
| **S2** | #33 | Reconsider `--no-auth` default — public-facing port + no auth = account suspension risk |
| **S3** | #28 | Admin key printed to stdout leaks to Docker `json-file` / journald / k8s log shippers permanently — write to mode-0600 file; only TTY shows literal key |
| **S4** | #28 | Bootstrap condition is "zero **admin-tier** keys", not "zero keys"; CLI escape hatch `copilot-api admin recover` for lockout |
| **S5** | #36 | `/admin/traces/:date.jsonl` must reject anything not matching `^\d{4}-\d{2}-\d{2}$` and assert path stays inside traces dir |
| **B1** | #34 | Streaming token counts require injecting `stream_options.include_usage=true` outbound; without it telemetry rows record 0 for every streaming request |
| **B2** | #25 | SSE alias rewrite must operate at the parsed-event layer of `fetch-event-stream`'s `events()`, not regex on raw bytes |
| **B3** | #27 | `PRAGMA journal_mode=WAL` must run **before** any transaction (silently no-ops inside one) |
| **B4** | #29 | Existing `src/lib/rate-limit.ts` mutates global `state.lastRequestTimestamp` — extend for per-key buckets via `Map<keyId, …>`, do NOT mutate globals from middleware |
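
As an illustration of blocker S5, the two checks (strict date pattern, then containment inside the traces directory) might be combined as below. The helper name `resolveTracePath` and the way the traces directory is passed in are assumptions for the sketch.

```typescript
import * as path from "node:path";

// Pattern taken from S5: only YYYY-MM-DD is accepted.
const DATE_RE = /^\d{4}-\d{2}-\d{2}$/;

// Returns the absolute trace file path, or null if the request must be rejected.
function resolveTracePath(tracesDir: string, dateParam: string): string | null {
  if (!DATE_RE.test(dateParam)) return null;

  const root = path.resolve(tracesDir);
  const candidate = path.resolve(root, dateParam + ".jsonl");

  // Defense in depth: even after the regex check, assert containment.
  const rel = path.relative(root, candidate);
  if (rel === "" || rel.startsWith("..") || path.isAbsolute(rel)) return null;

  return candidate;
}
```

The regex alone already rules out separators and `..`, so the containment check is redundant on purpose: it keeps the handler safe if the pattern is ever loosened.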

## Sub-issues

### F1 — Model exposure config + WebUI
- [ ] #24 — config.json schema, atomic write, fs.watch hot reload
- [ ] #25 — Bidirectional model alias rewriting (request + response + SSE) — **scope down to hide-only?** see B2/devil-advocate
- [ ] #26 — Filter /v1/models by config + per-key scope
- [ ] #31 — Admin WebUI shell at /admin/* with login + CSRF + CSP
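
For #24, the "atomic write" half could be sketched as write-to-temp, fsync, then rename over the target. The helper name `writeConfigAtomic`, the temp-file naming, and the 0600 mode are assumptions; `traces_days` is the only config key named in this epic, so the demo uses just that.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Write JSON to a temp file in the same directory, then rename it into place.
// rename() within one filesystem is atomic, so readers see old or new, never partial.
function writeConfigAtomic(configPath: string, config: unknown): void {
  const dir = path.dirname(configPath);
  const tmp = path.join(dir, `.${path.basename(configPath)}.${process.pid}.tmp`);
  const fd = fs.openSync(tmp, "w", 0o600);
  try {
    fs.writeSync(fd, JSON.stringify(config, null, 2) + "\n");
    fs.fsyncSync(fd); // flush contents before the rename makes them visible
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tmp, configPath);
}

// Demo in a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "capi-config-"));
const demoPath = path.join(demoDir, "config.json");
writeConfigAtomic(demoPath, { traces_days: 0 });
const reloaded = JSON.parse(fs.readFileSync(demoPath, "utf8"));
const leftovers = fs.readdirSync(demoDir);
```

One consequence for the hot-reload half: because rename replaces the file, a watcher attached to the file itself may stop firing on some platforms, so watching the containing directory and filtering by filename is likely the safer shape.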

### F2 — Multi-client auth
- [ ] #27 — bun:sqlite + WAL + migration runner
- [ ] #28 — keys table + sk-cap key generator + first-run admin bootstrap
- [ ] #29 — Auth middleware (Bearer extraction, tier + scope check, header sanitization)
- [ ] #30 — Audit log for admin actions
- [ ] #32 — Admin WebUI: keys management page
- [ ] #33 — --no-auth legacy mode + v0.8 / v0.9 deprecation
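
For #29, blocker B4 asks for per-key buckets instead of a shared global timestamp. A minimal sketch follows, assuming the limiter enforces a minimum interval between requests; the class name `PerKeyRateLimiter` and the return convention are hypothetical, and the current behaviour of `src/lib/rate-limit.ts` is not reproduced here.

```typescript
// Hypothetical per-key limiter: state lives in the instance, not in a global.
class PerKeyRateLimiter {
  private readonly lastRequestAt = new Map<string, number>();
  private readonly minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns 0 if the request may proceed (and records it),
  // otherwise the number of milliseconds the caller still has to wait.
  check(keyId: string, nowMs: number): number {
    const last = this.lastRequestAt.get(keyId);
    if (last !== undefined) {
      const waitMs = last + this.minIntervalMs - nowMs;
      if (waitMs > 0) return waitMs;
    }
    this.lastRequestAt.set(keyId, nowMs);
    return 0;
  }
}
```

Passing `nowMs` in rather than reading the clock inside keeps the limiter deterministic under test; the middleware would call `check(keyId, Date.now())`.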

### F3 — Usage dashboard
- [ ] #34 — events table + telemetry middleware + retention sweep
- [ ] #35 — Admin WebUI: usage dashboard with uPlot + CSV export
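
Blocker B1 for #34 amounts to a small rewrite of the outbound payload. The sketch below shows one way to do it without mutating the client's request object; the `ChatPayload` type and the function name `withUsageReporting` are assumptions for illustration.

```typescript
// Hypothetical minimal shape of an outbound chat completion payload.
interface ChatPayload {
  stream?: boolean;
  stream_options?: { include_usage?: boolean; [k: string]: unknown };
  [k: string]: unknown;
}

// For streaming requests, force stream_options.include_usage=true so the
// upstream emits token counts. Non-streaming payloads are returned unchanged.
function withUsageReporting(payload: ChatPayload): ChatPayload {
  if (payload.stream !== true) return payload;
  return {
    ...payload,
    stream_options: { ...payload.stream_options, include_usage: true },
  };
}
```

Any other `stream_options` fields the client sent are preserved; only `include_usage` is overridden.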

### F4 — Debug mode
- [ ] #36 — Debug capture: JSONL writer, redaction, retention, SSE live tail
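
For the redaction part of #36 (blocker S1), a rough sketch is shown below. The three token prefixes come from S1; the character classes after each prefix, the list of sensitive header names, and the `[REDACTED]` placeholder are all assumptions that would need checking against real token formats.

```typescript
// Assumption: header names that are always redacted, whatever their value.
const SENSITIVE_HEADERS = ["authorization", "cookie", "set-cookie", "x-api-key"];

// Prefixes from S1; the trailing character classes are guesses, not a spec.
const TOKEN_PATTERNS: RegExp[] = [
  /github_pat_[A-Za-z0-9_]+/g,
  /eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*/g,
  /Iv1\.[A-Za-z0-9]+/g,
];

// Replace anything that looks like a known token inside free text.
function redactText(text: string): string {
  return TOKEN_PATTERNS.reduce((acc, re) => acc.replace(re, "[REDACTED]"), text);
}

// Redact by header name unconditionally, then pattern-scan the remaining values.
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  Object.keys(headers).forEach((name) => {
    const sensitive = SENSITIVE_HEADERS.indexOf(name.toLowerCase()) >= 0;
    out[name] = sensitive ? "[REDACTED]" : redactText(headers[name] ?? "");
  });
  return out;
}
```

Per S1 the same functions would run on both ingress (client request) and egress (upstream response) before anything reaches the JSONL writer.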

## Dependency graph

```
config-store (#24) ──► alias rewriter (#25) ──► /v1/models filter (#26)
                                                     ▲
sqlite (#27) ──► keys (#28) ──► auth middleware (#29) ─┘
                       │
                       └─► audit log (#30)
                       
auth (#29) ──► WebUI shell (#31) ──► keys page (#32)
                                  └─► usage dashboard (#35)
                                  └─► trace viewer (in #36)

events (#34) ──► usage dashboard (#35)
trace capture (#36) depends on (#27, #29, #31)
--no-auth deprecation (#33) depends on (#28, #29)
```

## Recommended landing order

If proceeding with the full plan after deciding Q1–Q4:

1. **Foundations**: #24 → #27 → #28 → #29 → #30
2. **Compat shipping vehicle**: #31 → #33 (so the --no-auth decision is tested)
3. **Per-feature**: #25 → #26 (F1 user-visible), #32 (key management), #34 → #35 (telemetry), #36 (debug)
4. **Update README + CHANGELOG**
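
As a sanity check, the landing order above can be verified mechanically against the dependency graph. In the sketch below the edges are transcribed by hand from the graph (with #34 having no listed dependencies), and the function name `findOrderViolations` is hypothetical.

```typescript
// issue -> issues it depends on, transcribed from the dependency graph.
const dependsOn: Record<number, number[]> = {
  24: [],
  25: [24],
  26: [25, 29],
  27: [],
  28: [27],
  29: [28],
  30: [28],
  31: [29],
  32: [31],
  33: [28, 29],
  34: [],
  35: [31, 34],
  36: [27, 29, 31],
};

// Steps 1 to 3 of the recommended landing order, flattened.
const landingOrder = [24, 27, 28, 29, 30, 31, 33, 25, 26, 32, 34, 35, 36];

// Report every issue that is scheduled before one of its dependencies.
function findOrderViolations(order: number[], deps: Record<number, number[]>): string[] {
  const position = new Map<number, number>();
  order.forEach((issue, index) => position.set(issue, index));

  const violations: string[] = [];
  order.forEach((issue, index) => {
    (deps[issue] ?? []).forEach((dep) => {
      const depPos = position.get(dep);
      if (depPos === undefined || depPos > index) {
        violations.push(`#${issue} lands before its dependency #${dep}`);
      }
    });
  });
  return violations;
}
```

Run against the data above, `findOrderViolations(landingOrder, dependsOn)` returns an empty list, so the proposed order respects every edge in the graph as transcribed.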

## References

- OPC review artifacts: `.harness/nodes/code-review/run_1/eval-{backend,security,devil-advocate}.md`
- Existing Responses API epic: #1 (recommended to land first)
