Add S3 backup scaffold for private knowledge repo #81

Description

@alexeygrigorev

Status: in progress
Tags: docs, process-docs, infra, data, P1
Depends on: private DataTalksClub/dataops-knowledge repository and an AWS backup bucket/role
Blocks: future private knowledge repo bootstrap

Scope

Add a reusable scaffold for the future private DataTalksClub/dataops-knowledge repository so it can back itself up to S3 on a schedule.

The intended model is:

  • Private GitHub repo remains canonical for operational docs/templates/prompts.
  • Private S3 bucket stores backups.
  • Backup job zips the whole repo and uploads only when the current commit SHA differs from the SHA recorded in the last successful backup manifest in S3.
  • Backup includes a manifest and SHA-256 checksums for restore/audit.
  • The workflow uses GitHub Actions OIDC and minimal reasonable S3 permissions.
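
The skip-if-unchanged rule and the manifest described above can be sketched roughly as follows. This is a minimal illustration, not the scaffold's actual script: the manifest field names (`commit_sha`, `archive`, `archive_sha256`, `archive_size`) and the function names are assumptions, and fetching the previous manifest from S3 is left to the caller so the logic stays testable without AWS access.

```python
import hashlib
import json
from typing import Optional


def build_manifest(commit_sha: str, archive_name: str, archive_bytes: bytes) -> dict:
    """Describe one backup archive (field names are hypothetical)."""
    return {
        "commit_sha": commit_sha,
        "archive": archive_name,
        "archive_sha256": hashlib.sha256(archive_bytes).hexdigest(),
        "archive_size": len(archive_bytes),
    }


def should_upload(current_sha: str, last_manifest_json: Optional[str]) -> bool:
    """Return True unless the last manifest records the same commit SHA.

    A missing or unreadable manifest is treated as "no previous backup",
    so the job uploads rather than silently skipping.
    """
    if last_manifest_json is None:
        return True
    try:
        last = json.loads(last_manifest_json)
    except json.JSONDecodeError:
        return True
    if not isinstance(last, dict):
        return True
    return last.get("commit_sha") != current_sha
```

Treating an unreadable manifest as "upload anyway" is one possible design choice: it favors an extra backup over a missed one.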

Acceptance Criteria

  • templates/dataops-knowledge includes a scheduled GitHub Actions workflow template for S3 backups.
  • templates/dataops-knowledge includes a repository-local backup script that creates a zip, a manifest, and checksums, and that skips the upload when the repo is unchanged relative to the last S3 manifest.
  • Docs explain required bucket, prefix, AWS role, and restore/audit expectations.
  • ADR/repository-structure docs mention daily S3 backups for the private knowledge repo while keeping Git as canonical.
  • Existing scaffold validation and planning-doc checks pass.
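
For the restore/audit expectation, one way to check a downloaded backup is to recompute per-file SHA-256 digests from the zip and compare them with the recorded checksums. The sketch below assumes the checksums are available as a mapping from path to hex digest; the function names and that mapping format are illustrative, not taken from the scaffold.

```python
import hashlib
import io
import zipfile


def file_checksums(archive_bytes: bytes) -> dict[str, str]:
    """Map each file path inside the zip to its SHA-256 hex digest."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return {
            info.filename: hashlib.sha256(zf.read(info)).hexdigest()
            for info in zf.infolist()
            if not info.is_dir()
        }


def audit_archive(archive_bytes: bytes, expected: dict[str, str]) -> list[str]:
    """Return a sorted list of paths that are missing, unexpected, or altered."""
    actual = file_checksums(archive_bytes)
    # Paths present on only one side are either missing or unexpected.
    problems = set(expected) ^ set(actual)
    # Paths present on both sides must have matching digests.
    problems |= {
        path
        for path in expected.keys() & actual.keys()
        if expected[path] != actual[path]
    }
    return sorted(problems)
```

An empty list from `audit_archive` means every recorded file is present with a matching digest and the archive contains nothing extra.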

Verification

  • uv run --with pytest python -m pytest tests/planning_docs
  • uv run --project lambda-functions --extra search --with pytest python -m pytest tests/docs_app
  • uv run --project lambda-functions --extra search python -m lambda_functions.validate_knowledge_repo --repo-root . --scaffold-root templates/dataops-knowledge
  • git diff --check

Metadata

Assignees

No one assigned

    Labels

    P1 (Important), data (Data model, migration, storage), docs (Documentation or process docs work), infra (Deployment and infrastructure), process-docs (SOPs, templates, references, playbooks)
