systemd Deployment Guide for scan-all

This directory provides systemd unit templates so an Ubuntu host can run security-scanner scan-all on a periodic schedule (weekly by default).

The templates are deliberately not auto-installed. Operators copy and edit them to fit their host layout.


1. Files

| Path | Purpose |
| --- | --- |
| security-scanner-scan-all.service | System-level one-shot unit that invokes security-scanner scan-all. |
| security-scanner-scan-all.timer | Schedules the system .service on a calendar interval. |
| user/security-scanner-scan-all.service | No-sudo user-level variant (%h-based, runs as the invoking user). |
| user/security-scanner-scan-all.timer | Schedules the user .service (default: every 2 hours). |

Scale worker pool + periodic jobs (M3, see §9):

| Path | Purpose |
| --- | --- |
| security-scanner-scan-worker@.service | Instanced daemon template; scan-worker@1..N are N independent processes, each a distinct fence-token holder (FR-4). |
| scan-worker.target | Brings the whole worker pool up/down at once. |
| security-scanner-lease-reaper.{service,timer} | Reclaims expired job + repo leases (FR-6) on a timer. |
| security-scanner-incr-poll.{service,timer} | discover-updates --enqueue --from-catalog --ls-remote-skip (FR-2). |
| security-scanner-baseline.{service,timer} | Per-repo baseline ScanJob enqueue (FR-3). |
| security-scanner-freshness-eval.{service,timer} | Per-repo staleness detector + BREACH_COUNTER rollup (FR-8). |
| security-scanner-catalog-reconcile.{service,timer} | Org catalog reconcile (FR-1). Governance-gated: keep DISABLED until GATE 2 (default provider refuses live fetch). |

The scan-all units come in two flavors:

  • System-level (§4) — requires root, runs as a dedicated scanner user, full hardening. Use for shared/production hosts.
  • User-level (§4b) — no sudo needed, installed under ~/.config/systemd/user/. Use for a single-user host (e.g. a personal Ubuntu box). Auth via gh auth login.

Both are templates with operator-specific placeholders. Review every line before copying.


2. Prerequisites

On the target Ubuntu host:

  • Python 3.10+ and uv installed. which uv resolves to /usr/bin/uv (adjust ExecStart= if installed elsewhere).
  • security-scanner source checked out at WorkingDirectory, with uv sync already run successfully as the service user.
  • gh and glab CLIs installed and reachable on PATH. Required for GitHub/GitLab clone/fetch (spec §6).
  • A reachable DynamoDB-compatible backend on http://localhost:4567. For a single host, run the DynamoDB Local container shipped in the repo's docker-compose.yml (see "Start the local DB" below). This requires Docker with the Compose v2 plugin.
  • A non-root service user (scanner by default) owning:
    • WorkingDirectory (project tree).
    • ~/.cache/security-scanner/ (clone cache, lock file).
    • /var/log/security-scanner/ (notification log).

Create the user and directories:

sudo useradd --system --create-home --shell /usr/sbin/nologin scanner
sudo install -d -o scanner -g scanner /var/log/security-scanner
sudo install -d -o scanner -g scanner /var/cache/security-scanner
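The tool prerequisites listed above can be spot-checked before any unit is installed. The following is a minimal sketch, not part of the shipped templates; missing_tools is a hypothetical helper name, and the tool list passed to it simply mirrors the bullets above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each required tool that is NOT found on PATH,
# one per line. Prints nothing when every tool is present.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (not executed here), run as the service user:
#   missing_tools python3 uv gh glab docker
```

An empty result means every listed tool resolves on PATH for the account that ran the check; it says nothing about versions or authentication.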

Start the local DB. From the project tree, bring up DynamoDB Local and create the table (and its query index) once:

docker compose up -d dynamodb-local

uv run security-scanner init-storage \
  --storage-backend dynamodb \
  --dynamodb-endpoint-url http://localhost:4567 \
  --dynamodb-table security_scanner_local_dev

If host port 4567 is already in use on a test box, set SECURITY_SCANNER_DYNAMO_HOST_PORT=<free-port> for the compose command. The worker service still talks to DynamoDB Local through the compose network.
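When the host port is remapped, host-side commands presumably need to target the same port. The sketch below assumes that init-storage run from the host must point at the remapped port (the source does not state this explicitly); dynamo_endpoint is a hypothetical helper and 4568 is only an example value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the host-side endpoint URL from the same
# variable the compose command reads. Falls back to the default port 4567
# when the variable is unset or empty.
dynamo_endpoint() {
  printf 'http://localhost:%s\n' "${SECURITY_SCANNER_DYNAMO_HOST_PORT:-4567}"
}

# Usage (not executed here):
#   export SECURITY_SCANNER_DYNAMO_HOST_PORT=4568
#   docker compose up -d dynamodb-local
#   uv run security-scanner init-storage \
#     --storage-backend dynamodb \
#     --dynamodb-endpoint-url "$(dynamo_endpoint)" \
#     --dynamodb-table security_scanner_local_dev
```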

DynamoDB Local here is for single-host, local-only use; its data is persisted in the named Compose volume. Register scan targets with add-target before the first scheduled run — see the getting-started guide. Managed DynamoDB is out of scope for now.

Incremental worker local proof. The repository's Docker Compose file also contains a worker service. It is for local verification only, not a production deployment target.

SECURITY_SCANNER_QUICKSTART_TARGET=https://github.com/<owner>/<repo> \
  docker compose up --abort-on-container-exit --exit-code-from worker worker

For custom GitLab domains, add the provider hint:

SECURITY_SCANNER_QUICKSTART_TARGET=https://source.example.test/<group>/<repo> \
SECURITY_SCANNER_SCM_PROVIDER=gitlab \
  docker compose up --abort-on-container-exit --exit-code-from worker worker

That command exercises security-scanner quickstart against the local DynamoDB Local service, creates a current-tip queue job, and runs the worker. Keep credentials out of compose files and inject them through the host environment or the normal service manager only when you intentionally test private repository access.


3. Authentication

For private repos, pass SCM tokens to the service. Two options.

Option A (recommended): EnvironmentFile.

sudo install -d -m 700 /etc/security-scanner
sudo tee /etc/security-scanner/scm.env >/dev/null <<'EOF'
GH_TOKEN=<GH_TOKEN>
GITLAB_TOKEN=<GITLAB_TOKEN>
EOF
sudo chmod 600 /etc/security-scanner/scm.env
sudo chown scanner:scanner /etc/security-scanner/scm.env

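The service file already references this path via EnvironmentFile=-/etc/security-scanner/scm.env. The leading - tells systemd to skip the file silently if it does not exist.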
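Because the token file is only as safe as its permission bits, it can be worth re-checking them after any edit. A minimal sketch, assuming GNU stat as shipped on Ubuntu; env_file_is_private is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file exists and its permission
# bits are exactly 600. Relies on GNU stat (coreutils).
env_file_is_private() {
  local file=$1 mode
  mode=$(stat -c '%a' -- "$file" 2>/dev/null) || return 1
  [ "$mode" = "600" ]
}

# Usage (not executed here), from a root shell:
#   env_file_is_private /etc/security-scanner/scm.env || echo "fix permissions"
```

Since /etc/security-scanner is created with mode 700, the check has to run from a root shell; an unprivileged user cannot stat files inside that directory.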

Option B (less secure): inline Environment= lines in the unit file.

Edit the .service and uncomment the inline Environment=GH_TOKEN= / Environment=GITLAB_TOKEN= lines. This is not recommended: the unit file is installed with mode 644 and systemctl show prints Environment= values, so any local user can read the tokens.


4. Install

# Copy templates (edit them first to suit your host).
sudo install -m 644 deploy/systemd/security-scanner-scan-all.service \
    /etc/systemd/system/security-scanner-scan-all.service
sudo install -m 644 deploy/systemd/security-scanner-scan-all.timer \
    /etc/systemd/system/security-scanner-scan-all.timer

# Reload systemd to pick them up.
sudo systemctl daemon-reload

# Enable + start the timer (this also enables the service to be triggered).
sudo systemctl enable --now security-scanner-scan-all.timer

# Verify.
systemctl list-timers security-scanner-scan-all.timer

OnCalendar=Sun *-*-* 03:00:00 runs every Sunday at 03:00 local time. Adjust in the .timer file before installing.
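Before editing the .timer file, a candidate schedule can be validated with systemd-analyze calendar, which ships with systemd and prints the normalized expression plus the next elapse time. A small sketch; check_calendar is a hypothetical wrapper name, and the weekday example is only an illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: ask systemd to parse a calendar expression.
# Returns non-zero if systemd rejects the expression.
check_calendar() {
  systemd-analyze calendar "$1"
}

# Usage (not executed here):
#   check_calendar 'Sun *-*-* 03:00:00'        # the shipped weekly default
#   check_calendar 'Mon..Fri *-*-* 02:30:00'   # an example alternative
```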


4b. Install (user-level, no sudo)

For a single-user host without root, install under the user systemd manager. Auth comes from gh auth login (token stored in ~/.config/gh/hosts.yml, which the user manager reads without a desktop keyring).

Prereqs (as the scanning user): the project checked out at ~/security-scanner with uv sync already run; uv, git, gitleaks, gh, and docker on PATH; and DynamoDB Local reachable (the unit's ExecStartPre brings it up).

# 1. Authenticate gh (token never leaves the host).
gh auth login          # GitHub.com → HTTPS → paste a read-scoped token

# 2. Bootstrap the catalog table once (DynamoDB Local must be up).
docker compose up -d dynamodb-local
export SECURITY_SCANNER_STORAGE_BACKEND=dynamodb
uv run security-scanner init-storage

# 3. Install the user units (no sudo).
mkdir -p ~/.config/systemd/user
install -m 644 deploy/systemd/user/security-scanner-scan-all.service \
    ~/.config/systemd/user/security-scanner-scan-all.service
install -m 644 deploy/systemd/user/security-scanner-scan-all.timer \
    ~/.config/systemd/user/security-scanner-scan-all.timer
systemctl --user daemon-reload
systemctl --user enable --now security-scanner-scan-all.timer

# 4. Let the timer fire while logged out (one-time; the ONLY step needing sudo,
#    or ask an admin — skip if linger is already enabled for this user).
sudo loginctl enable-linger "$USER"

# 5. Verify next run.
systemctl --user list-timers security-scanner-scan-all.timer

Trigger one run immediately:

systemctl --user start security-scanner-scan-all.service
systemctl --user show security-scanner-scan-all.service -p Result -p ExecMainStatus
journalctl --user -u security-scanner-scan-all.service -e

OnCalendar=*-*-* 00/2:00:00 runs every 2 hours on the even hour. Adjust in the .timer file before installing.

Notes:

  • A oneshot service shows inactive (dead) after a successful run — that is normal. Check Result=success / ExecMainStatus=0, not is-active.
  • Linger is the only step that needs sudo (once). Without it the timer fires only while the user has an active login session.
  • Uninstall: systemctl --user disable --now security-scanner-scan-all.timer, then remove the two files from ~/.config/systemd/user/.
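The two checks described in the notes above (judging a oneshot by its Result rather than is-active, and confirming linger) can be scripted. This is a minimal sketch; oneshot_succeeded and linger_enabled are hypothetical helper names that only parse the output of the real systemctl and loginctl commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `systemctl show -p Result -p ExecMainStatus`
# output on stdin and succeed only if the last run ended with Result=success.
oneshot_succeeded() {
  grep -qx 'Result=success'
}

# Hypothetical helper: read `loginctl show-user USER -p Linger` output on
# stdin and succeed only if lingering is enabled.
linger_enabled() {
  grep -qx 'Linger=yes'
}

# Usage (not executed here):
#   systemctl --user show security-scanner-scan-all.service \
#     -p Result -p ExecMainStatus | oneshot_succeeded && echo "last run ok"
#   loginctl show-user "$USER" -p Linger | linger_enabled || echo "linger is off"
```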

5. First-run verification

Trigger one run immediately without waiting for the timer:

sudo systemctl start security-scanner-scan-all.service
sudo systemctl status security-scanner-scan-all.service
journalctl -u security-scanner-scan-all.service -e

Inspect the structured log:

tail -n 20 /var/log/security-scanner/scan-all.log.jsonl | jq .

You should see one summary record per run plus one finding record per detected leak. See docs/workbench/specs/2026-05-31-scan-all-notification-log.md for the full schema.


5b. Optional: verifier auto-triage (Ollama)

scan-all can run the Ollama verifier after scanning and record terminal finding dispositions (false_positive / true_positive). This is DEFAULT-OFF and trigger-agnostic — it rides whatever runs scan-all (this timer, cron, or a manual run), so it needs no systemd-specific wiring. Enable it purely via environment variables:

| Env var | Meaning |
| --- | --- |
| SECURITY_SCANNER_VERIFY_ARTIFACTS | 1/true/yes/on enables verification; unset/0/false keeps it off. |
| SECURITY_SCANNER_OLLAMA_HOST | Ollama-compatible host, e.g. http://127.0.0.1:11434. |
| SECURITY_SCANNER_OLLAMA_MODEL | Model name. |
| SECURITY_SCANNER_OLLAMA_TIMEOUT_SECONDS | Optional HTTP timeout (default 30). |
| SECURITY_SCANNER_OLLAMA_MIN_CONFIDENCE | Optional min confidence (default 0.60). |
| SECURITY_SCANNER_OLLAMA_API_KEY_ENV | Optional: name of the env var holding the API key (the token itself stays in that separate env var, never inline). |

Notes:

  • The verifier reads only redacted metadata; raw secrets never leave the host.
  • It fails closed: if Ollama is unreachable or returns a low-confidence verdict, the finding stays needs_review (no disposition is written) and the scan still records all findings, so verification never destructively fails a scan. A verifier failure surfaces only as exit code 2 (alertable; see §6).
  • The CLI flag --verify-artifacts / --no-verify-artifacts overrides the env default for one-off runs.
  • This is a full-sweep triage: newly detected findings are verified on the next scan-all run. Per-change verification in the incremental scan-worker path is out of scope here (separate follow-up).
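As an illustration of enabling the verifier through the environment alone, the sketch below writes the three core variables to an owner-only env file. write_verifier_env is a hypothetical helper, the host value is the example from the table above, and the model name is whatever you pass in; whether a given unit actually loads such a file depends on its EnvironmentFile= setting.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write verifier settings to an env file.
# $1 = destination path, $2 = Ollama model name.
# umask 077 makes a newly created file mode 600; an existing file keeps
# its current permissions.
write_verifier_env() {
  local path=$1 model=$2
  (
    umask 077
    cat >"$path" <<EOF
SECURITY_SCANNER_VERIFY_ARTIFACTS=1
SECURITY_SCANNER_OLLAMA_HOST=http://127.0.0.1:11434
SECURITY_SCANNER_OLLAMA_MODEL=${model}
EOF
  )
}

# Usage (not executed here), for a one-off manual run:
#   write_verifier_env ./verifier.env <model-name>
#   set -a; . ./verifier.env; set +a
#   uv run security-scanner scan-all
```

For the scheduled run, the same KEY=VALUE lines could instead be appended to the file referenced by the unit's EnvironmentFile= (see §3), assuming the unit you installed loads it.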

6. Exit code semantics for alerting

The service uses SuccessExitStatus=0 3, so systemd treats exit codes 0 and 3 as non-failure (no failed state). External monitoring should still distinguish them:

| Code | Meaning | Alert? |
| --- | --- | --- |
| 0 | Success or empty catalog | No |
| 1 | Fatal worker error (catalog lookup, environment) | Yes |
| 2 | At least one repo failed fetch or scan; others completed | Yes |
| 3 | Another scan-all held the lock; this run did nothing | No |

Recommended monitoring sources:

  • journalctl -u security-scanner-scan-all.service for stdout + exit code.
  • The JSONL notification log (/var/log/security-scanner/scan-all.log.jsonl) for machine-readable per-repo / per-finding events.
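An external monitor has to re-derive the alert decision itself, because systemd reports both 0 and 3 as success. The mapping in the table above can be sketched as follows; alert_decision is a hypothetical helper, and treating unknown codes as alertable is a design choice of this sketch, not something the source specifies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a scan-all exit code to an alerting decision.
# Prints "ok" or "alert". Codes not listed in the table are treated as
# alertable (an assumption of this sketch).
alert_decision() {
  case "$1" in
    0|3) echo ok ;;
    1|2) echo alert ;;
    *)   echo alert ;;
  esac
}

# Usage (not executed here):
#   code=$(systemctl show security-scanner-scan-all.service -p ExecMainStatus --value)
#   [ "$(alert_decision "$code")" = alert ] && echo "scan-all needs attention"
```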

7. Log rotation

The scanner does not rotate the JSONL log. Use OS-standard tools.

Example logrotate config at /etc/logrotate.d/security-scanner:

/var/log/security-scanner/scan-all.log.jsonl {
    weekly
    rotate 12
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}

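The scanner reopens the file for each record (spec §4), so it would tolerate rename-based rotation on its own. copytruncate is still recommended because log shippers may hold their own file descriptors, and truncating in place keeps those descriptors pointing at the live file.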


8. Uninstall

sudo systemctl disable --now security-scanner-scan-all.timer
sudo rm /etc/systemd/system/security-scanner-scan-all.timer
sudo rm /etc/systemd/system/security-scanner-scan-all.service
sudo systemctl daemon-reload

The DB catalog, cache directory, and notification log are left intact. Remove them manually if you also want to clear state.


9. Scale worker pool (M3) — N processes + periodic timers

The scale redesign (design.md v2, FR-4) replaces the single weekly scan-all oneshot with a queue + N-worker-pool model: a per-repo job queue, N independent worker processes draining it, and several periodic timers feeding and maintaining the queue. The scan-all units above still work; the units in this section are the scale path.

Box-gated. The deployment box is OFFLINE. These artifacts are what a future box deploy instantiates; DEPLOYED behavior (N live processes, Restart=on-failure recovery on a real crash, and the real cadence values) is NOT proven here. The OnCalendar= values in every timer are GATE-1 placeholders — the box load gate sets the real cadences (poll interval, baseline window, N). Do not treat them as load-validated.

9a. Worker pool

security-scanner-scan-worker@.service is an instanced (templated) unit. The systemd instance name %i is threaded into --worker-id scan-worker@%i, so scan-worker@1 .. scan-worker@N run as N independent OS processes, each a distinct RepoLease fence-token holder. The RepoLease CAS (M2) guarantees two instances never scan the same repo concurrently (FR-4).

Bring up N instances (pick N from the box load gate):

sudo systemctl enable --now security-scanner-scan-worker@{1..8}.service     # example: 8 workers
sudo systemctl enable --now scan-worker.target     # group start/stop
# stop the whole pool:
sudo systemctl stop scan-worker.target

Each instance is Type=simple (long-running daemon, polls until SIGTERM) with Restart=on-failure; a crashed instance is restarted by systemd and its stranded leases are reclaimed by the lease-reaper timer below.
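To confirm that every instance in the pool is actually running, the instance names can be generated and checked in a loop. A minimal sketch, assuming the template is installed under its shipped file name security-scanner-scan-worker@.service; worker_units is a hypothetical helper, and 8 is the same example pool size used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the unit names for worker instances 1..N,
# one per line. Prints nothing for N = 0.
worker_units() {
  local n=$1 i=1
  while [ "$i" -le "$n" ]; do
    printf 'security-scanner-scan-worker@%d.service\n' "$i"
    i=$((i + 1))
  done
}

# Usage (not executed here):
#   for unit in $(worker_units 8); do
#     printf '%s: %s\n' "$unit" "$(systemctl is-active "$unit")"
#   done
```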

9b. Periodic timers

sudo systemctl enable --now security-scanner-lease-reaper.timer
sudo systemctl enable --now security-scanner-incr-poll.timer
sudo systemctl enable --now security-scanner-baseline.timer
sudo systemctl enable --now security-scanner-freshness-eval.timer
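After enabling, it helps to list the four timers together and to confirm that the catalog-reconcile timer was not enabled by accident. A small sketch; periodic_timers is a hypothetical helper that only echoes the unit names from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the four periodic timers that are meant to be
# enabled. The catalog-reconcile timer is deliberately left out (see 9c).
periodic_timers() {
  printf '%s\n' \
    security-scanner-lease-reaper.timer \
    security-scanner-incr-poll.timer \
    security-scanner-baseline.timer \
    security-scanner-freshness-eval.timer
}

# Usage (not executed here):
#   systemctl list-timers $(periodic_timers)
#   systemctl is-enabled security-scanner-catalog-reconcile.timer   # should not print "enabled"
```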

9c. catalog-reconcile — DISABLED until GATE 2

Do not systemctl enable security-scanner-catalog-reconcile.timer yet. The reconcile command's default org-list provider is a governance-gated stub that REFUSES to fetch live GitHub (a live org GET is gated to a human PR + the autopilot ghas-live-fetch-or-mutation-required stop-condition, GATE 2). As shipped the unit fails closed; enabling the timer early only schedules failing runs. Enable it only after GATE 2 clears and a live provider is wired.


10. Related documents

  • docs/workbench/adrs/ADR-20260531-periodic-multi-repo-scan-catalog.md
  • docs/workbench/specs/2026-05-31-scan-all-and-target-catalog.md
  • docs/workbench/adrs/ADR-20260531-2-scan-all-notification-log.md
  • docs/workbench/specs/2026-05-31-scan-all-notification-log.md