Provision S3 backup infrastructure for private knowledge repo #82

Description

@alexeygrigorev

Status: in progress
Tags: infra, data, process-docs, P1
Depends on: #81
Blocks: enabling scheduled backups in the future private DataTalksClub/dataops-knowledge repository

Scope

Use ../aws-infra, not the public app repo, to provision the AWS resources needed by the private knowledge-repo S3 backup workflow.

The intended model remains:

  • DataTalksClub/dataops is the public app/runtime repo.
  • Private DataTalksClub/dataops-knowledge is canonical for operational docs/templates/prompts.
  • Private S3 stores daily backup artifacts generated from the private Git repo.

Implement a CloudFormation stack under ../aws-infra/sandbox/dataops that creates:

  • a private S3 bucket for dataops-knowledge backups;
  • server-side encryption;
  • bucket versioning;
  • public access blocking;
  • lifecycle rules for daily Git archive/bundle backups;
  • a GitHub Actions OIDC role assumable by DataTalksClub/dataops-knowledge on main;
  • scoped S3 permissions for reading latest/manifest.json and writing backup objects under the configured prefix.
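The trust relationship for the OIDC role in the list above can be sketched as a plain IAM policy document. This is only an illustration, assuming the standard GitHub OIDC provider (token.actions.githubusercontent.com) is registered in the target account. The function name render_trust_policy and its arguments are hypothetical, and the actual template may express the same condition differently.

```shell
#!/usr/bin/env bash
# Sketch: print a trust policy that lets only GitHub Actions workflows from
# DataTalksClub/dataops-knowledge on the main branch assume the backup role.
# The function name and arguments are illustrative, not taken from the issue.

render_trust_policy() {
  local account_id="$1"                               # AWS account ID, supplied by the caller
  local repo="${2:-DataTalksClub/dataops-knowledge}"  # repository allowed to assume the role
  local branch="${3:-main}"                           # branch allowed to assume the role
  local provider="token.actions.githubusercontent.com"

  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${provider}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${provider}:aud": "sts.amazonaws.com",
          "${provider}:sub": "repo:${repo}:ref:refs/heads/${branch}"
        }
      }
    }
  ]
}
EOF
}
```

Calling render_trust_policy with a 12-digit account ID prints the document. Using StringEquals on the sub claim, rather than a wildcard match, is what restricts the role to a single repository and branch.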

Acceptance Criteria

  • Infra source lives in ../aws-infra/sandbox/dataops.
  • CloudFormation validates locally.
  • Stack is deployed/applied in AWS sandbox account.
  • Outputs provide bucket name, prefix, and backup role ARN for future GitHub repository variables.
  • Bucket is private, encrypted, and versioned, with all public access blocked.
  • Backup role trust policy is restricted to DataTalksClub/dataops-knowledge on main.
  • S3 role permissions are limited to reading latest/manifest.json and writing backup objects under the configured prefix in the backup bucket.
  • Infra docs explain how to deploy/update and which variables to set in the private knowledge repo.
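The permission-scoping criterion above could translate into a policy along these lines. This is a hedged sketch, not the stack's actual policy: the function name render_backup_policy, the statement IDs, and the assumption that latest/manifest.json sits under the configured prefix are all illustrative.

```shell
#!/usr/bin/env bash
# Sketch: print an S3 permissions policy for the backup role.
# Assumption (not confirmed by the issue): latest/manifest.json lives under
# the same configured prefix as the backup objects.

render_backup_policy() {
  local bucket="$1"      # backup bucket name
  local prefix="${2%/}"  # configured prefix, with one trailing slash removed

  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadLatestManifest",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::${bucket}/${prefix}/latest/manifest.json"
    },
    {
      "Sid": "WriteBackupObjects",
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::${bucket}/${prefix}/*"
    }
  ]
}
EOF
}
```

One consequence of this narrow scope: without s3:ListBucket, a read of a manifest that does not exist yet returns an access-denied error rather than a not-found error, so the backup workflow would need to treat both as "no previous backup" on its first run.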

Verification

  • aws sts get-caller-identity
  • aws cloudformation validate-template --template-body file://sandbox/dataops/template.knowledge-backups.yaml --region eu-west-1
  • aws cloudformation deploy --stack-name dataops-knowledge-backups --template-file sandbox/dataops/template.knowledge-backups.yaml --region eu-west-1 --capabilities CAPABILITY_NAMED_IAM
  • aws cloudformation describe-stacks --stack-name dataops-knowledge-backups --region eu-west-1
  • aws s3api get-public-access-block, get-bucket-versioning, get-bucket-encryption, and get-bucket-lifecycle-configuration for the created bucket.
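The four s3api checks above can be wrapped in a small helper that reports pass or fail per property. This is a minimal sketch: the function name verify_backup_bucket is hypothetical, the bucket name must come from the stack outputs, and the encryption and lifecycle checks only confirm that a configuration exists, not that its rules match what the issue intends.

```shell
#!/usr/bin/env bash
# Sketch: check the backup bucket's versioning, public access block,
# encryption, and lifecycle configuration using the AWS CLI.
# The function name is illustrative; pass the real bucket name as $1.

verify_backup_bucket() {
  local bucket="$1"
  local region="${2:-eu-west-1}"
  local failed=0
  local versioning pab

  # Versioning status is "Enabled" once versioning has been turned on.
  versioning="$(aws s3api get-bucket-versioning --bucket "$bucket" \
    --region "$region" --query 'Status' --output text)"
  if [ "$versioning" = "Enabled" ]; then
    echo "ok: versioning"
  else
    echo "FAIL: versioning (${versioning:-no response})"; failed=1
  fi

  # Public access block: none of the four flags may be false.
  pab="$(aws s3api get-public-access-block --bucket "$bucket" \
    --region "$region" --query 'PublicAccessBlockConfiguration.*' --output text)"
  if [ -n "$pab" ] && ! echo "$pab" | grep -qi 'false'; then
    echo "ok: public access block"
  else
    echo "FAIL: public access block (${pab:-no response})"; failed=1
  fi

  # Encryption and lifecycle: the calls fail when no configuration exists.
  if aws s3api get-bucket-encryption --bucket "$bucket" --region "$region" >/dev/null; then
    echo "ok: encryption"
  else
    echo "FAIL: encryption"; failed=1
  fi
  if aws s3api get-bucket-lifecycle-configuration --bucket "$bucket" --region "$region" >/dev/null; then
    echo "ok: lifecycle"
  else
    echo "FAIL: lifecycle"; failed=1
  fi

  return "$failed"
}
```

After sourcing the file, run verify_backup_bucket with the bucket name from the stack outputs; the function returns a non-zero status if any check fails, which makes it usable in a CI step.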
