Public-tab uptime is still wrong for most relays: only the 8 allowlist relays had their counters reset; the other ~718 verified relays still carry degraded-bomb-sweep ratios (e.g. 13/32 ≈ 40%), so their cumulative uptime is stuck low even as clean daily probes accrue.
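To illustrate why the figure stays low, here is a rough sketch. It assumes uptime is the plain cumulative ratio checksOnline / checksTotal, which the 13/32 example suggests but this description does not state outright. The function names are hypothetical and are not part of the codebase.

```typescript
// Assumption: uptime = checksOnline / checksTotal, accumulated over all probes.
// Both helpers are illustrative only.

/** Uptime after `cleanProbes` additional probes that all came back online. */
function uptimeAfter(online: number, total: number, cleanProbes: number): number {
  return (online + cleanProbes) / (total + cleanProbes);
}

/**
 * Smallest number of consecutive online probes needed to reach `targetPercent`.
 * Integer arithmetic avoids floating-point edge cases at the boundary.
 */
function probesToReach(online: number, total: number, targetPercent: number): number {
  if (targetPercent >= 100 && online < total) {
    return Infinity; // a cumulative ratio never reaches 100% once a probe has failed
  }
  let n = 0;
  while ((online + n) * 100 < targetPercent * (total + n)) {
    n++;
  }
  return n;
}
```

Under that assumption, a relay at 13/32 that answers every daily probe from now on sits at 43/62 (about 69%) after 30 days, and needs 158 clean probes to reach 90%.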
Fix needs two parts:
- Decouple prober membership from the uptime stat: target relays where lastChecked exists (we've probed them before) instead of checksOnline ≥ 1. This survives a counter reset and keeps offline-but-known relays in rotation, so they can recover. Brand-new harvested candidates (no lastChecked) are still excluded.
- Reset the corrupted counters (checksOnline/checksTotal/uptime) for the tracked set (ops), then re-probe, so that uptime rebuilds clean from honest daily samples.
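The reset-then-reprobe step can be sketched as below. This is a minimal in-memory model, not the actual ops script: the RelayStats shape, resetCounters, and recordProbe are hypothetical names, and uptime is again assumed to be checksOnline / checksTotal.

```typescript
// Hypothetical shape of the per-relay stats; field names follow this description.
interface RelayStats {
  checksOnline: number;
  checksTotal: number;
  uptime: number | null; // null until at least one probe has been recorded
  lastChecked?: Date;
}

/** Zero the counters but keep lastChecked, so the relay stays a sweep target. */
function resetCounters(stats: RelayStats): RelayStats {
  return { ...stats, checksOnline: 0, checksTotal: 0, uptime: null };
}

/** Fold one probe result into the counters and recompute the ratio. */
function recordProbe(stats: RelayStats, online: boolean, at: Date): RelayStats {
  const checksOnline = stats.checksOnline + (online ? 1 : 0);
  const checksTotal = stats.checksTotal + 1;
  return { checksOnline, checksTotal, uptime: checksOnline / checksTotal, lastChecked: at };
}
```

Because the reset leaves lastChecked in place, a relay whose checksOnline drops to 0 is still selected by the new membership rule. The first probe after a reset yields an uptime of exactly 1 or 0, since there is only one sample.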
Change (this PR)
sweepTargets: query { lastChecked: { $exists: true } } instead of { checksOnline: { $gte: 1 } }.
Reset + reprobe done as an ops step after deploy.
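The effect of the query change can be seen with an in-memory approximation of the two filters. The Relay shape and the sample URLs are made up for illustration. The predicates only mimic the MongoDB filters quoted above: $exists: true also matches a field stored as null, which this sketch does not model.

```typescript
// Hypothetical relay record; only the fields used by the two filters.
interface Relay {
  url: string;
  checksOnline: number;
  lastChecked?: Date;
}

// Old membership rule: { checksOnline: { $gte: 1 } }
const oldRule = (r: Relay): boolean => r.checksOnline >= 1;

// New membership rule: { lastChecked: { $exists: true } }
const newRule = (r: Relay): boolean => r.lastChecked !== undefined;

const sampleRelays: Relay[] = [
  { url: "wss://healthy.example", checksOnline: 20, lastChecked: new Date() },
  { url: "wss://just-reset.example", checksOnline: 0, lastChecked: new Date() },
  { url: "wss://offline-known.example", checksOnline: 0, lastChecked: new Date() },
  { url: "wss://new-candidate.example", checksOnline: 0 }, // never probed
];

const oldTargets = sampleRelays.filter(oldRule).map((r) => r.url);
const newTargets = sampleRelays.filter(newRule).map((r) => r.url);
```

Here oldTargets contains only the healthy relay, while newTargets also keeps the reset relay and the offline-but-known relay; the never-probed candidate is excluded by both rules.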
Note: post-reset, day-1 uptime rests on a single sample, so it reads ~100% for a relay that was online and 0% for one that was offline; it becomes meaningful over subsequent days.