Add reliable signed SMS webhooks with retry/backoff #21

Description

@Justinabox

Motivation

Callstack already has POST /sms/subscribe and forwards incoming SMS to subscribed webhook URLs, but delivery is currently best-effort: one POST per subscriber, a single 5s timeout, warning-only failure handling, no retry visibility, and no authenticity signal for receivers. That is risky for unattended Raspberry Pi/MFA/automation deployments because transient network failures can silently drop SMS events and unauthenticated callbacks are hard for downstream apps to trust.

This is a small v0.3 hardening slice that improves the existing webhook adapter without changing modem behavior or requiring real hardware.

User journey

  1. An operator starts the HTTP server with an API key and a webhook signing secret.
  2. A local integration subscribes a webhook URL through the authenticated API.
  3. An SMS arrives while the receiver is briefly unavailable.
  4. Callstack retries delivery with bounded exponential backoff and records the final delivery state.
  5. The receiver verifies the callback signature before accepting the SMS event.
  6. The operator can inspect recent webhook delivery attempts without exposing SMS bodies or secrets in logs.

API / UX sketch

Configuration should be explicit and not log secrets:

app = create_app(
    modem,
    api_keys=["..."],
    webhook_signing_secret="loaded-from-env-or-config",
    webhook_max_attempts=3,
)

Callback request shape stays compatible enough for existing subscribers, with new headers:

POST /receiver
Content-Type: application/json
X-Callstack-Event: sms.received
X-Callstack-Delivery-Id: <opaque-id>
X-Callstack-Timestamp: <unix-seconds>
X-Callstack-Signature: sha256=<hmac>

Payload:

{
  "event": "sms.received",
  "sender": "+155****0100",
  "body": "example",
  "received_at": "2026-06-24T00:00:00"
}
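
To make the header contract above concrete, here is a minimal sketch of how a sender could compute X-Callstack-Signature and how a receiver could check it. It assumes the canonical string is the timestamp, a ".", and the raw JSON body bytes, that the digest is hex-encoded, and that receivers reject timestamps older than a tolerance window. The function names, the hex encoding, and the 300-second tolerance are illustrative assumptions, not settled parts of this proposal.

```python
import hashlib
import hmac
import json
import time


def sign_payload(secret: bytes, timestamp: str, raw_body: bytes) -> str:
    """Return a value suitable for the X-Callstack-Signature header."""
    # Assumed canonical string: <timestamp> + "." + <exact body bytes as sent>.
    message = timestamp.encode("ascii") + b"." + raw_body
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: bytes, timestamp: str, raw_body: bytes,
                     header_value: str, *, now=None, tolerance: int = 300) -> bool:
    """Receiver-side check: reject stale timestamps, then compare HMACs."""
    current = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    if abs(current - sent_at) > tolerance:
        return False  # too old or too far in the future (assumed policy)
    expected = sign_payload(secret, timestamp, raw_body)
    # compare_digest avoids leaking where the two values first differ.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))


# Example: sign the exact bytes that go on the wire, then verify them.
secret = b"test-secret"
body = json.dumps({"event": "sms.received", "body": "example"}).encode("utf-8")
timestamp = "1700000000"
signature = sign_payload(secret, timestamp, body)
```

The receiver has to verify against the raw request bytes it read from the socket, not a re-serialized copy of the parsed JSON, because key order or whitespace differences would change the digest.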

Optional diagnostic endpoint, authenticated like the other HTTP routes:

GET /sms/webhook-deliveries?limit=50

Return only delivery metadata by default: delivery id, subscriber URL host/path or redacted URL, attempt count, status, last error class/message, timestamps. Do not include API keys, signing secrets, SIM identifiers, or full SMS bodies in logs/diagnostics.
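
One possible way to produce the "redacted URL" for diagnostics is sketched below: keep the scheme, host, port, and path, and drop userinfo, query string, and fragment, which are the usual places for tokens. The helper name redact_url is hypothetical, and a deployment whose tokens live in the path itself would need a stricter rule (for example, host only).

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Return the URL without credentials, query string, or fragment."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # hostname drops IPv6 brackets; put them back
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Rebuild from scheme, host[:port], and path only.
    return urlunsplit((parts.scheme, host, parts.path, "", ""))
```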

Technical approach

  • Extract webhook delivery out of module globals in server.py into a small testable helper, for example WebhookDispatcher.
  • Keep the first implementation in-process and bounded; durable webhook queues can be a later issue if needed.
  • Generate one opaque delivery id per event/subscriber and compute HMAC over a canonical string such as timestamp + "." + raw_json_body.
  • Add bounded exponential backoff with jitter; avoid infinite loops and make retry timings injectable for fast tests.
  • Record recent delivery attempts in an in-memory ring buffer with redacted URL display.
  • Integrate the dispatcher from the existing on_sms handler; do not alter SMS parsing/reassembly semantics.
  • Ensure webhook callbacks never run before server startup has applied the HTTP auth/security posture from #4 (Secure HTTP server startup instead of disabling auth by default).
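
The bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the intended implementation: the send callable stands in for the real aiohttp POST, the constructor parameters other than max_attempts are hypothetical, "full jitter" is one of several reasonable jitter strategies, and the record fields are only a guess at useful delivery metadata.

```python
import asyncio
import random
import time
import uuid
from collections import deque
from dataclasses import dataclass


@dataclass
class DeliveryRecord:
    delivery_id: str
    url: str              # redact before exposing in logs or diagnostics
    attempts: int = 0
    status: str = "pending"
    last_error: str = ""
    updated_at: float = 0.0


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Full-jitter delay after the given 1-based failed attempt, capped at `cap`."""
    ceiling = min(cap, base * (2 ** (attempt - 1)))
    return rng() * ceiling


class WebhookDispatcher:
    def __init__(self, send, *, max_attempts=3, sleep=asyncio.sleep, history=100):
        self._send = send                # async (url, delivery_id, body) -> HTTP status
        self._max_attempts = max_attempts
        self._sleep = sleep              # injectable so tests never really wait
        self.recent = deque(maxlen=history)  # in-memory ring buffer of records

    async def deliver(self, url, body):
        # One delivery id per event/subscriber, reused across retries.
        record = DeliveryRecord(delivery_id=uuid.uuid4().hex, url=url)
        self.recent.append(record)
        for attempt in range(1, self._max_attempts + 1):
            record.attempts = attempt
            record.updated_at = time.time()
            try:
                status = await self._send(url, record.delivery_id, body)
                if 200 <= status < 300:
                    record.status = "delivered"
                    return record
                record.last_error = f"HTTP {status}"
            except Exception as exc:
                record.last_error = type(exc).__name__  # error class only
            if attempt < self._max_attempts:
                await self._sleep(backoff_delay(attempt))
        record.status = "failed"         # bounded: never loops forever
        return record

    async def dispatch(self, urls, body):
        # Subscribers are delivered concurrently, so one failure cannot block others.
        return await asyncio.gather(*(self.deliver(u, body) for u in urls))
```

In a test, passing a fake send that raises for one URL and a fake sleep that only records the requested delay lets the retry path run in milliseconds while still asserting on attempt counts and final status.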

Affected modules and tests

Likely files:

  • Modify: server.py
  • Create (optional): callstack/webhooks.py
  • Create/modify tests: tests/test_webhooks.py and/or tests/test_api_auth.py
  • Docs follow-up (not in this lane): README HTTP/webhook section after implementation

Existing context:

  • server.py currently stores webhook_urls, received_messages, and delivery_reports as process globals.
  • notify_webhooks() currently uses one aiohttp.ClientSession and logs failures only.
  • tests/test_api_auth.py already uses pytest-aiohttp and can be mirrored for HTTP route tests.

Hardware / modem caveats

  • This feature must be fully testable with fake SMS events and aiohttp test servers; no modem, SIM, phone numbers, or carrier network access should be required.
  • Do not claim exactly-once delivery: retries can create duplicates, so downstream consumers should treat X-Callstack-Delivery-Id as an idempotency key and ignore ids they have already processed.
  • Avoid logging full phone numbers, SMS bodies, webhook URLs containing tokens, or signing secrets.
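
For the duplicate-delivery caveat, a receiver could deduplicate on the delivery id with something like the sketch below. It assumes Callstack reuses the same X-Callstack-Delivery-Id across retries of one event/subscriber pair; the class name and the bounded in-memory capacity are hypothetical choices for illustration, and a receiver that restarts often would need persistent storage instead.

```python
from collections import OrderedDict


class SeenDeliveries:
    """Bounded, in-memory record of delivery ids already processed."""

    def __init__(self, capacity: int = 1000):
        self._seen = OrderedDict()
        self._capacity = capacity

    def first_time(self, delivery_id: str) -> bool:
        """Return True only the first time a delivery id is observed."""
        if delivery_id in self._seen:
            self._seen.move_to_end(delivery_id)  # keep recently seen ids longest
            return False
        self._seen[delivery_id] = True
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)       # evict the oldest id
        return True
```

A handler would call first_time() with the header value after signature verification and skip processing when it returns False.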

Acceptance criteria

  • Webhook callbacks include X-Callstack-Event, X-Callstack-Delivery-Id, X-Callstack-Timestamp, and X-Callstack-Signature: sha256=... when a signing secret is configured.
  • HMAC verification can be reproduced in a unit test from the exact body bytes and timestamp header.
  • Failed webhook POSTs retry with bounded exponential backoff up to a configurable max attempt count.
  • Retry tests run quickly by injecting sleep/backoff behavior; no real waiting for seconds.
  • A permanently failing subscriber records a final failed status instead of looping forever.
  • One failing subscriber does not prevent other subscribers from receiving the same SMS event.
  • Logs and any diagnostics redact webhook secrets/tokens and avoid SMS body disclosure.
  • Existing /sms/send, /sms/subscribe, /sms/messages, /sms/delivery-reports, and /ussd/send behavior remains compatible except for documented new auth/signature behavior.

Exact verification gates

Run these before opening the PR:

git diff --check
PYTHONPATH=. uv run --no-project --with pytest --with pytest-asyncio --with pytest-aiohttp --with pyserial-asyncio --with aiosqlite pytest tests/test_webhooks.py tests/test_api_auth.py -q
PYTHONPATH=. uv run --no-project --with pytest --with pytest-asyncio --with pytest-aiohttp --with pyserial-asyncio --with aiosqlite pytest tests/ -q

If packaging/config files change, also run:

uv run --with pytest --with pytest-asyncio --with pytest-aiohttp --with pyserial-asyncio --with aiosqlite pytest tests/ -q

Non-goals

  • Durable on-disk webhook queue.
  • Webhook management CRUD beyond the existing subscribe flow plus optional diagnostics.
  • Multipart SMS reassembly changes; #10 (Reassemble inbound multipart SMS before public delivery) covers the logical-message boundary.
  • Public internet webhook ingress or tunneling setup.
