Goal
Add internal Slack alerting for the relay Service Bus topic so we are notified when relay delivery is unhealthy.
Scope
- Monitor bildrelaybus / relay-events subscriptions: bnj-dev, bnj-prod, op-dev, op-prod.
- Alert in #eva-ops-alerts when a subscription has dead-letter messages.
- Alert when the oldest active message is older than 30 minutes.
- Keep the implementation simple and stateless: every hourly run may report the same still-open issue again.
- Avoid Blob-backed alert state, dedupe storage, and recovery notices.
- Do not include message bodies, SAS URLs, connection strings, or Slack secrets in alert payloads.
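A minimal sketch of the stateless check described above, assuming each subscription can be summarized as a dead-letter count plus the enqueue time of its oldest active message. The `SubscriptionSnapshot` type and `evaluate` function are hypothetical names for illustration, not the code in the PR; fetching the values from Service Bus and posting to Slack are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Threshold taken from the Scope list: oldest active message older than 30 minutes.
STALE_AFTER = timedelta(minutes=30)


@dataclass(frozen=True)
class SubscriptionSnapshot:
    """Hypothetical summary of one subscription at the time of a run."""
    name: str
    dead_letter_count: int
    # None when the subscription has no active messages.
    oldest_active_enqueued_at: Optional[datetime]


def evaluate(snapshot: SubscriptionSnapshot, now: datetime) -> List[str]:
    """Return alert lines for one subscription.

    No state is read or written, so a still-open issue is reported again
    on every run. Only the subscription name, a count, and an age are
    included; message bodies and secrets never enter the text.
    """
    alerts: List[str] = []
    if snapshot.dead_letter_count > 0:
        alerts.append(
            f"{snapshot.name}: {snapshot.dead_letter_count} dead-letter message(s)"
        )
    oldest = snapshot.oldest_active_enqueued_at
    if oldest is not None and now - oldest > STALE_AFTER:
        age_min = int((now - oldest).total_seconds() // 60)
        alerts.append(f"{snapshot.name}: oldest active message is {age_min} min old")
    return alerts


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    snap = SubscriptionSnapshot("bnj-prod", 2, now - timedelta(minutes=45))
    for line in evaluate(snap, now):
        print(line)
```

Because the function depends only on its inputs, running it twice on the same snapshot produces the same alert lines, which is the repeated-reporting behavior the scope accepts in exchange for having no alert state to store.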
Verification
- Unit coverage for stale-active alerts, dead-letter alerts, stateless repeated reporting, missing Slack config, and monitor failure handling.
- Service Bus smoke test with a temporary relay-alert-smoke subscription.
- Live settings verifier confirms exactly one monitor host is enabled.
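The single-host check could look roughly like the sketch below. It assumes the live settings are available as a mapping from host name to app settings and that a boolean-like flag enables the monitor; the flag name `RELAY_MONITOR_ENABLED` and both function names are hypothetical, not taken from the PR.

```python
from typing import Dict, List, Mapping

# Hypothetical setting name; the real flag may differ.
MONITOR_FLAG = "RELAY_MONITOR_ENABLED"


def enabled_monitor_hosts(settings_by_host: Mapping[str, Mapping[str, str]]) -> List[str]:
    """Return the sorted names of hosts whose monitor flag is set to true."""
    return sorted(
        host
        for host, settings in settings_by_host.items()
        if str(settings.get(MONITOR_FLAG, "")).strip().lower() == "true"
    )


def verify_single_monitor(settings_by_host: Mapping[str, Mapping[str, str]]) -> str:
    """Return the one enabled host, or raise if zero or several are enabled."""
    hosts = enabled_monitor_hosts(settings_by_host)
    if len(hosts) != 1:
        raise ValueError(
            f"expected exactly one monitor host, found {len(hosts)}: {hosts}"
        )
    return hosts[0]


if __name__ == "__main__":
    live: Dict[str, Dict[str, str]] = {
        "host-a": {MONITOR_FLAG: "true"},
        "host-b": {MONITOR_FLAG: "false"},
    }
    print(verify_single_monitor(live))
```

Zero enabled hosts means no alerts are sent at all, and two or more means every issue is posted multiple times per run, so both cases are treated as failures.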
Related
PR: https://github.com/bild-engineering/bild-ia/pull/299