Motivation
Callstack is intended to run unattended on Raspberry Pi + GSM/LTE modem deployments, but the HTTP server currently has SMS/USSD routes only. Operators have no low-friction way to answer basic production questions such as: is the process alive, is the modem connected, how many SMS sends failed, are delivery reports arriving, is signal quality degrading, and did auto-reconnect start flapping?
This decomposes the PII-safe metrics slice from #9 into one implementable PR: add a small health/readiness endpoint and a Prometheus-style metrics endpoint without exposing phone numbers, SMS bodies, SIM identifiers, API keys, or webhook URLs.
User journey
An operator starts Callstack on a Pi with the HTTP server enabled.
A local service manager or monitoring agent polls /healthz to decide whether the process is alive and whether the modem is ready.
Prometheus or a lightweight scraper polls /metrics to collect counters/gauges for SMS, delivery reports, calls, signal quality, reconnects, and uptime.
The operator can diagnose modem/network problems from aggregate metrics without leaking private SMS/call data into logs or monitoring labels.
Suggested status semantics for /healthz:
200 {"status":"ready"} when the server is alive and the modem is connected/initialized.
503 {"status":"degraded"} when the server is alive but modem state is disconnected/reconnecting/unknown.
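The ready/degraded rule above could be computed by a small pure function that the route handler calls. This is a minimal sketch, not the project's implementation: `build_health` and its parameters are hypothetical names, and it assumes the modem state arrives as `True`, `False`, or `None` (unknown).

```python
import json
import time
from typing import Optional, Tuple


def build_health(modem_connected: Optional[bool], started_at: float,
                 now: Optional[float] = None) -> Tuple[int, str]:
    """Return (HTTP status, JSON body) for a /healthz response.

    modem_connected is None when the modem state is unknown. Only an
    explicit True counts as ready; False and None both map to degraded.
    """
    now = time.monotonic() if now is None else now
    ready = modem_connected is True
    body = {
        "status": "ready" if ready else "degraded",
        "modem_connected": bool(modem_connected),
        "uptime_seconds": round(now - started_at, 1),
    }
    return (200 if ready else 503), json.dumps(body)
```

Keeping the decision in a pure function means the degraded path can be tested without a server or modem; an aiohttp handler would only need to wrap the returned status and body in a response.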
Metrics endpoint:
GET /metrics
Example text format:
# HELP callstack_uptime_seconds Seconds since HTTP app startup.
# TYPE callstack_uptime_seconds gauge
callstack_uptime_seconds 1234.5
# HELP callstack_sms_received_total Total completed inbound SMS messages.
# TYPE callstack_sms_received_total counter
callstack_sms_received_total 42
Initial metric names can stay dependency-free and manually rendered; adding prometheus-client is not required for this first slice.
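Manual rendering of the text format can be quite small. The sketch below is one possible shape, assuming a hypothetical `render_metric` helper; it emits the `# HELP` and `# TYPE` lines followed by one line per sample, and escapes label values as the Prometheus text format requires.

```python
from typing import Mapping, Sequence, Tuple

# One sample is (labels, value); labels may be empty.
Sample = Tuple[Mapping[str, str], float]


def _escape_label(value: str) -> str:
    # Label values must escape backslash, newline, and double quote.
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def render_metric(name: str, mtype: str, help_text: str,
                  samples: Sequence[Sample]) -> str:
    """Render one metric family in Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {mtype}"]
    for labels, value in samples:
        label_str = ""
        if labels:
            pairs = ",".join(
                f'{key}="{_escape_label(val)}"' for key, val in sorted(labels.items())
            )
            label_str = "{" + pairs + "}"
        lines.append(f"{name}{label_str} {value}")
    return "\n".join(lines) + "\n"


text = render_metric(
    "callstack_sms_received_total", "counter",
    "Total completed inbound SMS messages.", [({}, 42)],
)
# text ends with the line: callstack_sms_received_total 42
```

A `/metrics` response would be the concatenation of one such block per metric family, served as plain text.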
Technical approach
Add a small in-process stats collector, either in server.py or a focused helper such as callstack/metrics.py.
Initialize it in create_app(modem, ...) and store it in app["callstack_metrics"] or an equivalent explicit location.
Subscribe to typed events on modem.bus and increment counters/gauges for:
inbound SMS completed events,
outbound SMS sent events,
SMS delivery report statuses,
call state transitions / active call gauge,
signal quality RSSI/BER last values,
modem disconnect/reconnect counts,
USSD response count,
HTTP request counts by route/status class if this stays small.
Add /healthz and /metrics routes in create_app.
Keep labels static and low-cardinality. Do not label by phone number, sender, recipient, SMS body, webhook URL, SIM identifier, API key, modem serial/IMEI, or raw error strings.
Modem tracks _connected internally today; this PR can expose a safe read-only property if needed rather than reaching into private state from the server.
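The collector described in this section might look roughly like the following. Everything here is a sketch under stated assumptions: the event classes are stand-ins that only borrow names from `callstack.events.types` (their real fields may differ), `FakeBus` with `subscribe`/`emit` is a hypothetical synchronous bus rather than the actual `modem.bus` API, and the status allowlist is illustrative.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


# Stand-ins for the typed events; the real classes and fields may differ.
@dataclass
class IncomingSMSEvent:
    sender: str
    body: str


@dataclass
class SMSDeliveryReportEvent:
    recipient: str
    status: str


@dataclass
class ModemDisconnectedEvent:
    pass


class FakeBus:
    """Hypothetical synchronous bus used only for this sketch."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event):
        for handler in self._handlers[type(event)]:
            handler(event)


# Fixed allowlist keeps the status label low-cardinality (assumed values).
KNOWN_DELIVERY_STATUSES = frozenset({"delivered", "failed", "pending"})


class StatsCollector:
    """Counts events without storing any per-message data."""

    def __init__(self, bus):
        self.counters = Counter()
        bus.subscribe(IncomingSMSEvent, self._on_sms_received)
        bus.subscribe(SMSDeliveryReportEvent, self._on_delivery_report)
        bus.subscribe(ModemDisconnectedEvent, self._on_disconnect)

    def _on_sms_received(self, event):
        # Sender and body are deliberately ignored.
        self.counters["sms_received_total"] += 1

    def _on_delivery_report(self, event):
        # Unrecognized statuses collapse into "other" instead of a raw string.
        status = event.status if event.status in KNOWN_DELIVERY_STATUSES else "other"
        self.counters[("sms_delivery_reports_total", status)] += 1

    def _on_disconnect(self, event):
        self.counters["modem_disconnects_total"] += 1
```

Because the handlers only increment counters keyed by fixed names, phone numbers and message bodies never reach the collector's state, which makes the PII constraint easier to verify in tests.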
Hardware / modem caveats
Metrics must be useful even when no modem hardware is present in tests. Use event injection and fake modem state.
Signal quality values can be unknown or stale; expose clear unknown/absence semantics rather than pretending a modem recently reported.
Monitoring output must avoid PII and secrets by design. Metrics systems are often broadly readable on a LAN.
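One way to express "unknown" for signal quality is to omit the sample entirely instead of emitting a placeholder number. The sketch below assumes the collector holds raw AT+CSQ style values, where 99 means "not known or not detectable" per 3GPP TS 27.007; the metric names, the `signal_gauge_lines` helper, and the 300 second staleness threshold are all hypothetical.

```python
from typing import List, Optional

CSQ_UNKNOWN = 99  # AT+CSQ reports 99 when RSSI or BER is not known


def signal_gauge_lines(rssi: Optional[int], ber: Optional[int],
                       age_seconds: Optional[float],
                       max_age_seconds: float = 300.0) -> List[str]:
    """Return sample lines for signal gauges, skipping unknown or stale values.

    age_seconds is the time since the last signal report, or None if the
    modem has never reported.
    """
    lines = []
    fresh = age_seconds is not None and age_seconds <= max_age_seconds
    if fresh and rssi is not None and rssi != CSQ_UNKNOWN:
        lines.append(f"callstack_signal_rssi {rssi}")
    if fresh and ber is not None and ber != CSQ_UNKNOWN:
        lines.append(f"callstack_signal_ber {ber}")
    if age_seconds is not None:
        # Exposing the age lets dashboards distinguish "stale" from "never seen".
        lines.append(f"callstack_signal_last_report_age_seconds {age_seconds}")
    return lines
```

With no modem attached (all arguments `None`) the function returns an empty list, so tests without hardware see absent series, not misleading zeros.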
Acceptance criteria
GET /healthz returns JSON with process liveness, modem readiness/degraded state, and uptime.
GET /metrics returns Prometheus-compatible text with # HELP/# TYPE lines and stable metric names.
Counters/gauges cover at least SMS received, SMS sent, delivery report statuses, active call state, last signal RSSI/BER, modem reconnect/disconnect counts, and uptime.
Metrics labels/values do not include phone numbers, SMS bodies, webhook URLs, SIM identifiers, API keys, modem IMEI/serial values, or unbounded raw error strings.
Tests prove metrics update after emitting representative typed events on the event bus.
Tests prove health returns degraded/non-200 when modem readiness is false/unknown.
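The PII criterion above could be checked by scanning the rendered `/metrics` text in a test. The helper below is a rough sketch, not an exhaustive detector: `find_pii_leaks` is a hypothetical name, and treating any run of seven or more digits in a metric name or label as phone-number-like is an assumption that may need tuning.

```python
import re
from typing import Iterable, List

# Assumed heuristic: 7+ consecutive digits, optionally prefixed with "+".
PHONE_LIKE = re.compile(r"\+?\d{7,}")


def find_pii_leaks(metrics_text: str, secrets: Iterable[str]) -> List[str]:
    """Return lines of Prometheus text output that look like they leak data.

    secrets holds known sensitive strings used in the test fixture, such as
    an API key or a webhook URL.
    """
    secret_list = [s for s in secrets if s]
    leaks = []
    for line in metrics_text.splitlines():
        if any(secret in line for secret in secret_list):
            leaks.append(line)
            continue
        if line.startswith("#"):
            continue
        # Drop the trailing sample value so large counters or uptimes
        # are not mistaken for phone numbers.
        series = line.rsplit(" ", 1)[0]
        if PHONE_LIKE.search(series):
            leaks.append(line)
    return leaks
```

A test might emit events carrying a fixture phone number and API key, render the metrics, and assert that this function returns an empty list.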
API / UX sketch
Health endpoint:
GET /healthz
Example response:
{
  "status": "ready",
  "modem_connected": true,
  "uptime_seconds": 1234.5,
  "sms_store_ready": true
}
Technical approach
[…] /metrics; do not add unauthenticated network observability by accident.
[…] pytest-aiohttp; no real serial ports, SIM, carrier network, or Prometheus server should be required.
Affected modules and tests
Likely files:
server.py: add routes and wire stats collector.
callstack/metrics.py: collector/rendering helpers if keeping server.py small.
tests/test_metrics.py: route behavior, event-driven counters, PII-safety assertions.
tests/test_api_auth.py: only if auth middleware behavior for /healthz and /metrics needs explicit coverage.
Existing context:
server.py already imports aiohttp, owns create_app(modem, api_keys=None), and registers SMS/USSD routes.
callstack.events.types already defines IncomingSMSEvent, SMSSentEvent, SMSDeliveryReportEvent, CallStateEvent, SignalQualityEvent, ModemDisconnectedEvent, ModemReconnectedEvent, and USSDResponseEvent.
Exact verification gates
If adding any dependency, also run the packaging-oriented gate and record the exact result:
Non-goals