Background
copilot-api today is a strict single-tenant proxy. To run it as an internal/team gateway we need an admin plane with: configurable model exposure (with optional WebUI), multi-client API key auth, usage telemetry dashboard, and on-demand request/response debug capture.
Approved architecture
Captured during the OPC build phase:
Three storage layers by access pattern, plus an append-only admin audit log:
~/.local/share/copilot-api/config.json — model aliases, retention knobs, feature flags (rare writes, atomic rename + fs.watch hot reload)
~/.local/share/copilot-api/copilot-api.db — SQLite WAL — keys + usage events + sessions (transactional, indexed)
~/.local/share/copilot-api/traces/YYYY-MM-DD.jsonl — debug request/response (high-volume append-only)
~/.local/share/copilot-api/audit.jsonl — admin audit log (append-only)
Auth: opaque sk-cap-<32B base32> keys, sha256 at rest, two tiers (admin / client). Bearer header carries our key; proxy strips before forwarding upstream.
Migration: see Q1 below — default decision pending operator review.
WebUI: hono/jsx server-rendered + uPlot vendored, mounted at /admin/* same port. Zero new build pipeline. CSRF + CSP + HttpOnly Secure SameSite=Strict session cookie.
Telemetry: events table records token counts only — never request/response bodies (that's debug mode's job).
Debug: opt-in per-key OR per-request X-Capi-Debug (admin only). Default retention 0 days (in-memory only); 7-day cap when enabled; 1 GB total hard cap.
⚠️ Open strategic questions (must be decided before implementation)
These were raised during multi-role review and are not purely engineering decisions. Resolving them affects scope and even whether sub-issues should be implemented.
🔴 Q1 — GitHub abuse-detection risk (devil-advocate, security S2)
The README explicitly warns that fan-out from a single OAuth identity to many concurrent users triggers GitHub Security flags and account suspension. This admin plane productizes fan-out. Pick one:
auth=on default in v0.8 with auto-bootstrap; --no-auth requires --i-accept-account-suspension-risk flag and refuses non-loopback bind without it (F2.F — --no-auth legacy mode + v0.8 / v0.9 deprecation plan #33 default)
Cap concurrent in-flight upstream requests per OAuth identity by default
Recommend litellm for multi-tenant; keep copilot-api single-tenant
🔴 Q2 — Are we duplicating litellm?
litellm already provides multi-key auth, virtual keys, dashboards, spend tracking, multi-provider, retention. copilot-api's moat is being a small, single-process, protocol-faithful Copilot adapter. Pick one:
Ship full plan as designed
Recommended by devil-advocate: ship MVP only — config.json (F1.A — config.json: schema, atomic write, fs.watch hot reload #24) with model exposure + per-key allow-lists, /usage JSON endpoint, no SQLite/WebUI/debug-capture. Defer the rest until concrete demand emerges.
🟡 Q3 — Privacy posture flip
Today the operator processes only their own prompts (no data-controller relationship). Multi-client + debug capture turns the operator into a data controller in EU/UK and a "service provider" under CCPA. Recommended in #36: default traces_days = 0 (in-memory only); on-disk retention requires explicit operator opt-in.
🟡 Q4 — Sequencing vs Responses API epic (#1)
Epic #1 (Responses API + Anthropic fidelity) covers protocol work for the existing solo-Claude-Code user base. Doing this epic first delays #1. Recommended: ship #1 first, admin plane second.
🚨 Hard PR-blockers identified by review
These must land in their respective sub-issues; failing any one is a non-mergeable defect:
Trace redaction must cover github_pat_…, Copilot JWT eyJ…\.eyJ…\.…, Iv1\.… OAuth client id; redact by header name unconditionally on both ingress and egress
… --no-auth default — public-facing port + no auth = account suspension risk
… json-file / journald / k8s log shippers permanently — write to mode-0600 file; only TTY shows literal key
… copilot-api admin recover for lockout
… /admin/traces/:date.jsonl must reject anything not matching ^\d{4}-\d{2}-\d{2}$ and assert path stays inside traces dir
… stream_options.include_usage=true outbound; without it telemetry rows record 0 for every streaming request
… fetch-event-stream's events(), not regex on raw bytes
… PRAGMA journal_mode=WAL must run before any transaction (silently no-ops inside one)
Existing src/lib/rate-limit.ts mutates global state.lastRequestTimestamp — extend for per-key buckets via Map<keyId, …>, do NOT mutate globals from middleware
Sub-issues
F1 — Model exposure config + WebUI
F2 — Multi-client auth
F3 — Usage dashboard
F4 — Debug mode
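Two of the hard PR-blockers for F4 (debug mode) are mechanical enough to sketch: validating the :date segment of /admin/traces/:date.jsonl, and redacting secrets before a trace is written. The following is a minimal sketch under stated assumptions, not the project's implementation. Only the date pattern and the three token prefixes come from this issue; the function names, the character classes after each prefix, and the sensitive header list are hypothetical.

```typescript
import * as path from "node:path";

// Date pattern taken from the PR-blocker text above.
const TRACE_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical helper: map a :date route param to a file inside tracesDir.
export function resolveTraceFile(tracesDir: string, date: string): string {
  if (!TRACE_DATE.test(date)) {
    throw new Error("invalid trace date");
  }
  const root = path.resolve(tracesDir);
  const file = path.resolve(root, `${date}.jsonl`);
  // Defense in depth: assert the resolved path stays inside the traces dir.
  if (!file.startsWith(root + path.sep)) {
    throw new Error("trace path escapes traces dir");
  }
  return file;
}

// Prefixes come from the issue; the character classes are assumptions.
const SECRET_PATTERNS: RegExp[] = [
  /github_pat_[A-Za-z0-9_]+/g,
  /eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,
  /Iv1\.[A-Za-z0-9]+/g,
];

// Assumed header list; the issue only says "redact by header name".
const SENSITIVE_HEADERS = new Set(["authorization", "cookie", "set-cookie", "x-api-key"]);

const MASK = "[REDACTED]";

// Replace every secret-shaped substring in free text.
export function redactText(text: string): string {
  return SECRET_PATTERNS.reduce((acc, re) => acc.replace(re, MASK), text);
}

// Mask sensitive headers by name unconditionally; pattern-scan the rest.
export function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    out[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? MASK : redactText(value);
  }
  return out;
}
```

Masking by header name first means a key in an unexpected format is still hidden, and the pattern pass catches secrets that leak into bodies or non-sensitive headers. The same redactHeaders call would apply to both the inbound request and the outbound upstream request, matching the "ingress and egress" requirement.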
Dependency graph
Recommended landing order
If proceeding with the full plan after deciding Q1–Q4:
References
.harness/nodes/code-review/run_1/eval-{backend,security,devil-advocate}.md