Implement a safe migration path for the Google Spreadsheet TODO data described in work-engine/docs/data.md so pending spreadsheet work becomes normal DataOps operations-manager work, not a separate spreadsheet clone. The operator should see imported work in the same Work/Home/task/bundle surfaces as native tasks, recurring work, workflow bundles, proof-gated tasks, waiting follow-ups, process-doc links, artifacts, notifications, and portable exports.
Use these current references:
.goal-v1.md
docs/operations-manager-platform-jtbd.md
docs/v1-workflow-data-model.md
docs/v1-execution-state-schema.md
docs/v1-execution-data-safety.md
docs/restore-drill.md
work-engine/docs/data.md
work-engine/docs/specs.md
work-engine/docs/templates.md
work-engine/scripts/migrate-data.ts
work-engine/tests/migrate-data.test.ts
work-engine/tests/export-portable.test.ts
work-engine/tests/dry-run-import.test.ts
work-engine/src/types.ts
content/tasks/templates/ and relevant process docs in content/**
source-system references in ../datatasks and ../dtc-operations for comparison only; do not modify source repositories
The current script maps CSV rows into standalone manual tasks. This issue should extend or replace that CSV path so spreadsheet rows can become the right DataOps entities:
Open pending rows from "TODO list - todo.csv" become ordinary active tasks, workflow tasks, or recurring configs according to their content.
Repetitive rows from "TODO list - done.csv" may be analyzed to infer recurring patterns, process-doc references, and proof semantics, but completed history must not flood the active task list by default.
Rows that describe one-off work become tasks with due date, assignee when safely known, source/provenance, comments, process-doc context, proof requirements, waiting/follow-up metadata, and export-safe audit/provenance.
Rows that match existing workflow patterns should attach to existing workflow bundles/tasks when a safe deterministic match exists, or be reported for human review instead of creating disconnected duplicate work.
Rows that are truly periodic should become recurring configs or automatic workflow trigger suggestions using the current recurring/template primitives, not many copied historical task rows.
Process-document creation/update rows should link to in-repo process docs by stable instructionDocId where resolvable and preserve external instructionsUrl/spreadsheet notes as fallback provenance.
Notes, process document links, comments, date-finished values, and status values should feed proof/comment/completion metadata only when the data is safe and unambiguous.
The migration must be file-based and local/test safe. It may consume local CSV fixtures or a human-provided export file, but it must not connect to live Google Sheets, mutate Google Sheets, write external systems, or access production spreadsheets during agent verification.
Acceptance Criteria
The CSV migration supports explicit local input files, for example --source-todo <csv> and optional --source-done <csv>, so tests and operators do not rely on hard-coded production spreadsheet paths.
Dry-run/preview mode is the default or clearly available and writes no records. It reports counts for imported tasks, recurring configs/suggestions, workflow attachment candidates, completed rows skipped, blank rows skipped, unsafe rows, unresolved process docs, unresolved workflow matches, proof requirements, waiting/follow-up tasks, and validation errors.
By default, only open/pending TODO rows become active DataOps work. Completed historical rows are skipped unless an explicit history/analysis flag is used, and recurring pattern inference from history does not create duplicate historical tasks.
Imported spreadsheet work appears in the normal Operations Home, task list, waiting/follow-up, overdue, recurring, and workflow-context surfaces. No separate spreadsheet-import queue is required for daily operation.
Each imported task preserves due date from the row Date field after robust parsing of known formats; invalid or missing dates are reported and either assigned a safe review date or skipped according to documented rules.
Status normalization handles NEW, DONE, DONEDONE, mixed case, blank separator rows, multiline task text, extra empty columns, and done.csv rows whose status does not match the file name.
Imported tasks use source: import or an equivalently documented source value, plus source provenance that includes source file, row number, source date/status, and source task text without exporting spreadsheet credentials or private links that are unsafe to store.
Rows matching known recurring duties, such as Slack invites, Trello/card review, newsletter preparation, Mailchimp backup, Slack dump, sponsor performance follow-up, invoice/receipt checks, or bookkeeping TODO checks, become recurring configs, automatic workflow trigger candidates, or explicit migration suggestions rather than repeated standalone tasks.
A row can attach to an existing workflow bundle/task only when a deterministic match is available, such as a stable source ID, normalized title/date/template match, or explicit operator-approved mapping; ambiguous matches remain standalone tasks or migration warnings.
Process-doc links in notes or done-history metadata resolve to instructionDocId when an in-repo document is known; unresolved links preserve instructionsUrl and appear in the unresolved-doc report for future "Extend process docs with stable IDs" (#33) and workflow mapping work.
Imported process-document work records process-doc title/link/comment context as task comments, proof requirements, or artifact metadata when appropriate; it must not copy whole external docs into DynamoDB.
Completion proof is explicit. Rows that require a Google Doc, spreadsheet update, report, invoice, backup file, public link, comment, or external status set requiredLinkName, requiresFile, proofRequirement, externalStatus, or artifact metadata as appropriate.
Tasks with required proof cannot be imported or updated into done unless the required proof is present and export validation accepts it; ambiguous completed rows are reported instead of silently marked done.
Waiting/follow-up semantics are preserved or inferred conservatively for tasks blocked on a guest, sponsor, author, speaker, publisher, freelancer, accountant, Alexey, Valeria/Valeriia, Grace, or another external/internal reviewer. Waiting tasks must have waitingFor, followUpAt, and a short note/comment.
Ambiguous waiting cases remain todo with a migration note and unresolved waiting warning instead of becoming waiting without a safe follow-up date.
Import write mode is idempotent. Re-running the same CSV inputs updates or skips previously imported records using stable source keys such as source file + row hash/source row ID, and does not create duplicate tasks, recurring configs, artifacts, notifications, or audit/provenance records.
Non-sensitive useful URLs in notes become task links, bundle links, or artifact metadata when they are proof/output links. Temporary signed URLs, OAuth URLs, credentials, cookies, API keys, session values, and binary payloads are rejected or redacted.
Imported tasks that are due, overdue, waiting, missing proof, or ready for follow-up drive existing notification/dashboard behavior, including follow-up-due and missing-evidence context where implemented.
Import/audit provenance is durable and export-safe. Use audit events where supported; otherwise use documented bounded provenance metadata/comment fields that are included in portable export without DynamoDB PK/SK internals.
Portable export after a fixture import includes imported tasks, recurring configs, artifacts/files/notifications/audit/provenance as applicable; validate:export passes; dry-run:import reports valid counts without writing data.
UI/API behavior remains unified: imported spreadsheet work is visible and actionable through existing task, workflow, proof, waiting/follow-up, recurring, and Home surfaces.
[HUMAN] Before any real production Google Spreadsheet export is used, Alexey or Valeria confirms the export source, export date, included tabs/columns, row count, and whether any sensitive rows/links/comments must be redacted.
[HUMAN] Before any production DynamoDB import write, a human confirms target environment, on-demand backup/export location, dry-run summary, skipped/error report, unresolved mappings, and rollback/restore plan.
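The explicit-input and dry-run-by-default criteria above could be sketched as a small argument parser. This is a minimal sketch, not the existing migrate-data.ts interface: only --source-todo and --source-done come from this issue, while the `--write` flag, the `MigrationOptions` shape, and the `parseMigrationArgs` name are assumptions made for illustration.

```typescript
// Hypothetical option shape; only --source-todo and --source-done are named in the issue.
interface MigrationOptions {
  sourceTodo: string;
  sourceDone?: string;
  write: boolean; // false means dry-run, which is the default in this sketch
}

// Parse an argv-style array (without the leading node and script entries).
function parseMigrationArgs(argv: string[]): MigrationOptions {
  let sourceTodo: string | undefined;
  let sourceDone: string | undefined;
  let write = false;

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--source-todo" || arg === "--source-done") {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error(`${arg} requires a CSV path`);
      }
      if (arg === "--source-todo") sourceTodo = value;
      else sourceDone = value;
      i++; // skip the value that was just consumed
    } else if (arg === "--write") {
      write = true; // assumed flag name for opting in to writes
    } else {
      throw new Error(`Unknown argument: ${arg}`);
    }
  }

  if (sourceTodo === undefined) {
    throw new Error("--source-todo <csv> is required");
  }
  const options: MigrationOptions = { sourceTodo, write };
  if (sourceDone !== undefined) options.sourceDone = sourceDone;
  return options;
}
```

Because `write` starts as false, a run that passes only CSV paths stays in preview mode, and there is no hard-coded production path to fall back on.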
Test Scenarios
Scenario: Dry-run previews spreadsheet rows without writes
Given: local CSV fixtures for todo.csv and done.csv with pending rows, done rows, blank separators, multiline task text, mixed statuses, and extra columns
When: the migration runs in dry-run mode
Then: it reports planned active tasks, recurring suggestions/configs, skipped historical rows, skipped blanks, unresolved mappings, proof requirements, waiting/follow-up candidates, and writes no DynamoDB records.
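A dry-run preview like the one in this scenario can be built from a pure classification step that never touches storage. The sketch below assumes NEW means open and that DONE and DONEDONE both mean completed; the `SheetRow` shape, the count names, and the function names are hypothetical.

```typescript
type NormalizedStatus = "pending" | "done" | "blank" | "unknown";

// Hypothetical row shape after CSV parsing; extra empty columns are assumed to be dropped already.
interface SheetRow {
  status: string;
  task: string;
}

// Map raw spreadsheet status text to a small closed set.
function normalizeStatus(row: SheetRow): NormalizedStatus {
  const status = row.status.trim().toUpperCase();
  const task = row.task.trim();
  if (status === "" && task === "") return "blank"; // separator row
  if (status === "NEW") return "pending"; // assumption: NEW is the open status
  if (status === "DONE" || status === "DONEDONE") return "done";
  return "unknown"; // reported for review, never guessed
}

interface PreviewCounts {
  activeTasks: number;
  completedSkipped: number;
  blankSkipped: number;
  needsReview: number;
}

// Count what an import would do without writing anything.
function previewRows(rows: SheetRow[]): PreviewCounts {
  const counts: PreviewCounts = {
    activeTasks: 0,
    completedSkipped: 0,
    blankSkipped: 0,
    needsReview: 0,
  };
  for (const row of rows) {
    switch (normalizeStatus(row)) {
      case "pending": counts.activeTasks++; break;
      case "done": counts.completedSkipped++; break;
      case "blank": counts.blankSkipped++; break;
      case "unknown": counts.needsReview++; break;
    }
  }
  return counts;
}
```

Status is read from the row itself, so a completed row found in todo.csv (or an open row in done.csv) is classified by its status value rather than by the file name.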
Scenario: Pending rows become normal tasks
Given: a local todo.csv fixture with one ad-hoc pending task, one task with notes, and one task with a process-doc link
When: the migration runs in local write mode
Then: the rows create normal task records with due dates, comments, source provenance, process-doc context where resolvable, and visibility through existing task APIs and Operations Home.
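Due dates on imported tasks depend on parsing the row Date field without guessing. The following is a minimal sketch that assumes only two input formats, YYYY-MM-DD and DD.MM.YYYY; the formats actually present in the spreadsheet export are not listed in this issue, and `parseRowDate` is a hypothetical helper name.

```typescript
// Parse a spreadsheet date cell into an ISO date (YYYY-MM-DD), or null when it is not safely parseable.
// Assumed formats: "YYYY-MM-DD" and "DD.MM.YYYY" (day first).
function parseRowDate(raw: string): string | null {
  const text = raw.trim();
  const iso = /^(\d{4})-(\d{2})-(\d{2})$/.exec(text);
  const dotted = /^(\d{1,2})\.(\d{1,2})\.(\d{4})$/.exec(text);

  let year: number;
  let month: number;
  let day: number;
  if (iso) {
    year = Number(iso[1]);
    month = Number(iso[2]);
    day = Number(iso[3]);
  } else if (dotted) {
    day = Number(dotted[1]);
    month = Number(dotted[2]);
    year = Number(dotted[3]);
  } else {
    return null; // unknown format: report it instead of guessing
  }

  // Round-trip through UTC to reject impossible dates such as 31.02.2025.
  const date = new Date(Date.UTC(year, month - 1, day));
  if (
    date.getUTCFullYear() !== year ||
    date.getUTCMonth() !== month - 1 ||
    date.getUTCDate() !== day
  ) {
    return null;
  }
  return date.toISOString().slice(0, 10);
}
```

A null result is the signal for the documented fallback, either a safe review date or a skipped row with a report entry.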
Scenario: Repetitive history becomes recurring work
Given: done.csv contains repeated historical rows for Slack invites, Mailchimp backup, Slack dump, and newsletter preparation
When: the migration analyzes history
Then: it creates or suggests recurring configs/template triggers instead of importing every historical occurrence as an active task, and the summary explains created/skipped/suggested counts.
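One way to turn repeated history into suggestions rather than tasks is to group completed rows by normalized title and look at the gaps between occurrences. This is a sketch under stated assumptions: the threshold of three occurrences, the `DoneRow` shape, and the use of a median interval are illustrative choices, not rules from the existing recurring primitives.

```typescript
// Hypothetical shape for a completed history row; finishedOn is an ISO date (YYYY-MM-DD).
interface DoneRow {
  task: string;
  finishedOn: string;
}

interface RecurringSuggestion {
  title: string;
  occurrences: number;
  medianIntervalDays: number;
}

const MIN_OCCURRENCES = 3; // assumed threshold for calling a row "repetitive"
const MS_PER_DAY = 86400000;

function normalizeTitle(task: string): string {
  return task.toLowerCase().replace(/\s+/g, " ").trim();
}

// Group history by normalized title and suggest a recurring config per repeated group.
function suggestRecurring(rows: DoneRow[]): RecurringSuggestion[] {
  const byTitle = new Map<string, number[]>();
  for (const row of rows) {
    const time = Date.parse(`${row.finishedOn}T00:00:00Z`);
    if (Number.isNaN(time)) continue; // unparseable dates are ignored in this sketch
    const key = normalizeTitle(row.task);
    const times = byTitle.get(key) ?? [];
    times.push(time);
    byTitle.set(key, times);
  }

  const suggestions: RecurringSuggestion[] = [];
  byTitle.forEach((times, title) => {
    if (times.length < MIN_OCCURRENCES) return;
    times.sort((a, b) => a - b);
    const gaps: number[] = [];
    for (let i = 1; i < times.length; i++) {
      gaps.push(Math.round((times[i]! - times[i - 1]!) / MS_PER_DAY));
    }
    gaps.sort((a, b) => a - b);
    // Upper median of the gaps between consecutive occurrences.
    const median = gaps[Math.floor(gaps.length / 2)]!;
    suggestions.push({ title, occurrences: times.length, medianIntervalDays: median });
  });
  return suggestions;
}
```

The output is one suggestion per pattern, so the number of historical rows never affects the size of the active task list.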
Scenario: Process-document TODO keeps doc context
Given: a row asks to create or update a process document and includes a process document title/link/comment
When: the row is imported
Then: the task stores bounded context, resolves instructionDocId when possible, preserves unresolved external instructionsUrl as fallback, and requires URL/comment/artifact proof before completion when the row outcome is a document.
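The resolve-or-fall-back behavior for process docs can be expressed as a lookup that returns either a stable ID or the preserved URL. This is a sketch only: the `KNOWN_DOCS` table, its example.com entry, and the canonicalization rule are hypothetical, and real entries would come from the in-repo process docs.

```typescript
type DocResolution =
  | { kind: "resolved"; instructionDocId: string }
  | { kind: "unresolved"; instructionsUrl: string };

// Hypothetical lookup table from canonical external URL to in-repo instructionDocId.
const KNOWN_DOCS: Record<string, string> = {
  "https://docs.example.com/d/newsletter-process": "newsletter-process",
};

// Strip query string, fragment, and trailing slashes so sharing variants of a link compare equal.
function canonicalizeUrl(url: string): string {
  return url.trim().replace(/[#?].*$/, "").replace(/\/+$/, "");
}

// Resolve to a stable ID when the doc is known; otherwise keep the URL as fallback provenance.
function resolveProcessDoc(
  url: string,
  known: Record<string, string> = KNOWN_DOCS,
): DocResolution {
  const canonical = canonicalizeUrl(url);
  const id = Object.prototype.hasOwnProperty.call(known, canonical)
    ? known[canonical]
    : undefined;
  if (id !== undefined) return { kind: "resolved", instructionDocId: id };
  return { kind: "unresolved", instructionsUrl: canonical };
}
```

Every "unresolved" result would also be added to the unresolved-doc report so the mapping can be completed later.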
Scenario: Waiting work drives follow-ups
Given: a row clearly represents waiting for a guest/sponsor/reviewer and includes a safe follow-up date or notes from which one can be deterministically parsed
When: the row is imported
Then: the task has status=waiting, waitingFor, followUpAt, a short comment, and appears in follow-up views when the date is due.
Scenario: Ambiguous waiting is not guessed
Given: a row mentions a person but does not clearly indicate blocked work or a safe follow-up date
When: the row is imported
Then: it remains todo, keeps source context in comments/provenance, and the migration report records an unresolved waiting inference.
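The two waiting scenarios above amount to a conservative rule: a row becomes waiting only when both who and when are explicit. The sketch below assumes note conventions of "waiting for <name>" and "follow up <YYYY-MM-DD>", which are illustrative and not taken from the spreadsheet; the date is matched by shape only, and a row that merely mentions a person stays a plain todo.

```typescript
interface WaitingDecision {
  status: "waiting" | "todo";
  waitingFor?: string;
  followUpAt?: string;
  warning?: string;
}

// Assumed note conventions; real notes may need a reviewed pattern list.
const WAITING_FOR = /\bwaiting (?:for|on) ([A-Za-z][A-Za-z/ ]*?)(?=[,.;]|$)/i;
const FOLLOW_UP = /\bfollow[- ]?up(?: on| at)? (\d{4}-\d{2}-\d{2})\b/i;

// Promote to waiting only when both the blocker and the follow-up date are explicit.
function inferWaiting(notes: string): WaitingDecision {
  const who = WAITING_FOR.exec(notes);
  const when = FOLLOW_UP.exec(notes);
  if (who && when) {
    return {
      status: "waiting",
      waitingFor: who[1]!.trim(),
      followUpAt: when[1]!,
    };
  }
  if (who || when) {
    // Only one signal is present: keep the task actionable and flag it for review.
    return { status: "todo", warning: "unresolved waiting inference" };
  }
  return { status: "todo" };
}
```

For example, "Waiting for sponsor, follow up 2025-04-02" yields a waiting task, while "Waiting for Grace" alone stays todo with a warning.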
Scenario: Proof-gated completion remains safe
Given: a row marked completed requires a Google Doc link, spreadsheet update, report file, invoice, backup, or public URL as proof
When: the import cannot find the required proof in row data
Then: the task is not silently marked done; it is reported as missing proof or imported as active review work according to documented rules.
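The proof gate in this scenario can be sketched as a decision function over the normalized row. The field requiredLinkName is named in this issue; the `links` map, the `ImportedRow` shape, and the "missing-proof" report value are assumptions for illustration, and the sketch takes the "import as active review work" branch of the documented rules.

```typescript
// Hypothetical normalized row; requiredLinkName is the field named in the issue.
interface ImportedRow {
  sourceStatus: "pending" | "done";
  requiredLinkName?: string;
  links: Record<string, string>; // assumed map of link name to URL found in the row
}

interface ImportDecision {
  status: "todo" | "done";
  report: "missing-proof" | null;
}

// A completed row is imported as done only when its required proof link is present.
function decideImportedStatus(row: ImportedRow): ImportDecision {
  if (row.sourceStatus !== "done") return { status: "todo", report: null };
  const required = row.requiredLinkName;
  if (required === undefined) return { status: "done", report: null };
  const proof = row.links[required];
  if (proof !== undefined && proof.trim() !== "") {
    return { status: "done", report: null };
  }
  // Proof is required but missing: keep the task active and report it.
  return { status: "todo", report: "missing-proof" };
}
```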
Scenario: Workflow attachment avoids duplicates
Given: a spreadsheet row appears related to an existing newsletter/podcast/tax-report workflow but the match is ambiguous
When: the import runs
Then: the row is not attached to an arbitrary bundle; it is imported as standalone review work or reported as an unresolved workflow match.
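A deterministic match, as opposed to an arbitrary one, can be modeled as "exactly one candidate or nothing". In this sketch the `BundleCandidate` shape and the template-plus-date matching rule are hypothetical; the issue also allows stable source IDs and operator-approved mappings, which are not shown.

```typescript
// Hypothetical view of an existing workflow bundle.
interface BundleCandidate {
  bundleId: string;
  templateId: string;
  anchorDate: string; // ISO date the bundle is scheduled around
}

type BundleMatch =
  | { kind: "attach"; bundleId: string }
  | { kind: "unresolved"; reason: "no-match" | "ambiguous" };

// Attach only when exactly one bundle matches both template and date.
function matchBundle(
  templateId: string,
  date: string,
  bundles: BundleCandidate[],
): BundleMatch {
  const hits = bundles.filter(
    (b) => b.templateId === templateId && b.anchorDate === date,
  );
  if (hits.length === 1) return { kind: "attach", bundleId: hits[0]!.bundleId };
  return {
    kind: "unresolved",
    reason: hits.length === 0 ? "no-match" : "ambiguous",
  };
}
```

Both unresolved reasons leave the row as standalone review work and add an entry to the unresolved workflow match report.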
Scenario: Import is idempotent
Given: the same CSV fixtures were already imported once
When: the migration runs again in local write mode
Then: no duplicate tasks, recurring configs, artifacts, notifications, or provenance records are created, and the summary reports created/updated/skipped counts.
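Idempotency follows from deriving a stable key from the source row and checking it before every write. The sketch below uses an in-memory Map in place of the real store and a small FNV-1a-style hash over UTF-16 code units to stay dependency-free; a real implementation would more likely use a cryptographic hash such as SHA-256. All names here are hypothetical.

```typescript
interface SourceRow {
  date: string;
  task: string;
}

interface ImportSummary {
  created: number;
  skipped: number;
}

// FNV-1a-style 32-bit hash over UTF-16 code units (non-cryptographic, for illustration only).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("00000000" + (hash >>> 0).toString(16)).slice(-8);
}

// Stable key: source file plus a hash of the normalized row content.
function sourceKey(file: string, row: SourceRow): string {
  const normalized = [row.date.trim(), row.task.replace(/\s+/g, " ").trim()].join("\u0000");
  return `${file}#${fnv1a(normalized)}`;
}

// Insert rows that are new; skip rows whose key already exists in the store.
function importRows(
  store: Map<string, SourceRow>,
  file: string,
  rows: SourceRow[],
): ImportSummary {
  const summary: ImportSummary = { created: 0, skipped: 0 };
  for (const row of rows) {
    const key = sourceKey(file, row);
    if (store.has(key)) {
      summary.skipped++;
      continue;
    }
    store.set(key, row);
    summary.created++;
  }
  return summary;
}
```

Hashing content rather than row number means reordering rows in a new export does not create duplicates; the trade-off is that two identical rows in one file collapse into a single record, so a row-number tiebreaker may be needed if that case matters.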
Scenario: Export and restore safety holds after migration
Given: fixture spreadsheet rows have been imported into local work-engine data
When: portable export, export validation, and dry-run import are run
Then: relationships validate, waiting tasks have required metadata, proof-gated tasks are valid, redactions are enforced, no secrets or binaries are exported, and dry-run import reports insert/update counts without writing production data.
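One of the export checks in this scenario, keeping DynamoDB key internals out of portable records, can be sketched as a projection step. The function name and the flat record shape are assumptions; the actual exporter in work-engine/src/export/portable.ts may structure this differently.

```typescript
// Attribute names that are storage internals and must not appear in a portable export.
const INTERNAL_KEYS = ["PK", "SK"];

// Copy a stored item into a portable record, dropping DynamoDB key attributes.
function toPortableRecord(item: Record<string, unknown>): Record<string, unknown> {
  const portable: Record<string, unknown> = {};
  for (const key of Object.keys(item)) {
    if (INTERNAL_KEYS.indexOf(key) !== -1) continue;
    portable[key] = item[key];
  }
  return portable;
}

// Validation helper: report any internal keys that leaked into an exported record.
function findInternalKeys(record: Record<string, unknown>): string[] {
  return Object.keys(record).filter((key) => INTERNAL_KEYS.indexOf(key) !== -1);
}
```

Relationships in the export then rely on stable application IDs such as a task or bundle ID, never on the dropped key attributes.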
Scenario: Imported work is visible in the operator flow
Given: imported rows include overdue work, waiting follow-ups, recurring duties, missing proof, and process-doc links
When: the operator opens Operations Home, the task list, and relevant bundle/task detail views
Then: imported work appears in normal due/overdue/waiting/follow-up/recurring/workflow sections, proof blockers are visible, and process docs open from the task context.
Out of Scope
Connecting to live Google Sheets APIs, mutating production spreadsheets, deleting rows, changing spreadsheet statuses, or syncing bidirectionally with Google Sheets.
Importing all historical completed spreadsheet rows as active runtime tasks by default.
Rebuilding the app as a spreadsheet clone, grid editor, or separate spreadsheet-import dashboard.
Creating new integrations with Slack, Airtable, Mailchimp, Dropbox, Google Drive, Google Calendar, Luma, Meetup, YouTube, Spotify, Apple Podcasts, Finom, Wise, Revolut, LinkedIn, X, email, or Telegram.
Performing production DynamoDB writes, destructive restore drills, production backup creation, or external account checks during agent verification.
Modifying ../dtc-operations, ../datatasks, ../podcast-assistant, or any other source repository.
Dependencies
"Implement raw intake inbox for operational inputs" (#31) can later represent spreadsheet imports as raw intake items, but this issue should not require a separate inbox to make imported pending TODOs actionable in Work/Home.
"Extend process docs with stable IDs" (#33) and the workflow mapping issues are useful for stable process-doc IDs. If a process doc cannot be resolved to a stable ID, preserve instructionsUrl and report the unresolved mapping.
Any real production spreadsheet export/import requires human confirmation and must follow docs/v1-execution-data-safety.md and docs/restore-drill.md backup/export/restore guidance.
Affected Areas
work-engine/scripts/migrate-data.ts and any helper modules created for CSV/TODO migration.
work-engine/tests/migrate-data.test.ts and new fixture-based migration tests.
work-engine/src/db/tasks.ts, work-engine/src/db/recurring.ts, work-engine/src/db/artifacts.ts, work-engine/src/db/notifications.ts, and route validation only if existing fields cannot store migration-safe metadata.
work-engine/src/export/portable.ts, work-engine/scripts/export-execution-data.ts, work-engine/scripts/validate-execution-export.ts, and work-engine/scripts/dry-run-import.ts if exported fields or validation rules need updates.
work-engine/src/public/app.js, work-engine/src/pages/index.html, and Playwright specs only if imported work is not visible/actionable through existing operator surfaces.
Process-doc resolution/search code only if implementation adds a resolver from spreadsheet/process-doc URLs to instructionDocId.
Data safety/export docs only if implementation discovers a new durable entity, source-provenance field, or export rule not already covered by the V1 data-safety docs.
Data Safety, Export, And Restore Implications
Use dry-run first for every real export. Production write mode must not run until a human reviews the dry-run summary, unresolved mappings, unsafe URLs, skipped rows, and planned entity counts.
Create an on-demand DynamoDB backup and portable export before any production import write. Local agent scratch exports must use project-local .tmp/exports/.
Store metadata and stable external/storage references only. Do not store Google Sheets credentials, OAuth tokens, cookies, API keys, signed URLs, session values, private credentials, spreadsheet binary exports, or large raw external document bodies in DynamoDB.
Exported records must use stable application IDs and explicit relationships, not DynamoDB PK/SK internals.
Source provenance must be bounded and redacted. It may include source filename/tab, row number, normalized source date/status, row hash/source ID, and short source text, but not raw secrets or entire private spreadsheet dumps.
After local fixture import, run portable export validation and dry-run import validation to prove relationship integrity, redaction, date parseability, waiting-task requirements, proof requirements, artifacts/files, notifications, recurring configs, and audit/omitted entities.
Production restore drills, destructive restore/import checks, and live spreadsheet access are [HUMAN] and must not be performed by agents unless explicitly authorized in a later issue.
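The rule against storing signed URLs, OAuth URLs, and credentials can be approximated with a conservative URL screen applied before any link is kept. This is a heuristic sketch: the parameter list, the OAuth path pattern, and the `classifyUrl` name are assumptions, and a rejected URL should be redacted and reported rather than stored.

```typescript
type UrlDecision = "keep" | "reject";

// Query parameters that usually carry signatures, tokens, or session values (assumed list).
const UNSAFE_QUERY_KEYS =
  /[?&](?:x-amz-signature|x-goog-signature|signature|sig|token|access_token|id_token|api_key|apikey|session|sessionid)=/i;
// OAuth endpoints such as /oauth/authorize or /oauth2/token.
const OAUTH_PATH = /\/oauth2?\//i;
// Credentials embedded in the authority part, e.g. https://user:pass@host/.
const USERINFO = /^[a-z][a-z0-9+.-]*:\/\/[^\/?#]*@/i;
const HTTP_SCHEME = /^https?:\/\//i;

// Keep only plain http(s) links that show no sign of credentials or signing.
function classifyUrl(url: string): UrlDecision {
  const text = url.trim();
  if (!HTTP_SCHEME.test(text)) return "reject";
  if (USERINFO.test(text)) return "reject";
  if (OAUTH_PATH.test(text)) return "reject";
  if (UNSAFE_QUERY_KEYS.test(text)) return "reject";
  return "keep";
}
```

The screen errs on the side of rejection, which fits a migration where a human reviews the list of unsafe URLs in the dry-run summary before any write.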
Blockers
Real production spreadsheet migration is blocked on [HUMAN] confirmation of the spreadsheet export source, tabs, export date, row count, redaction requirements, and target environment.
Production DynamoDB writes are blocked on [HUMAN] approval of backup/export evidence, dry-run summary, unresolved mappings, skipped/error report, and rollback/restore plan.
Full instructionDocId coverage may be blocked by unresolved stable process-doc IDs until "Extend process docs with stable IDs" (#33) or workflow-specific mapping issues cover the remaining docs; this should not block a safe import when fallback instructionsUrl and unresolved mapping reports are present.
If existing work-engine storage lacks a durable place for source provenance/audit events needed for idempotency and export safety, the Architect should review the minimal schema extension before SWE implementation.
If "Implement V1 recurring work strategy" (#40) is still open and recurring behavior conflicts with the import plan, the import should create a reviewed recurring suggestion/report or minimal current-model config rather than implementing a competing recurring-work product.
Required Verification Commands
Run the work-engine checks because this issue changes migration logic, runtime entities, export safety, and possibly operator behavior:
npm --prefix work-engine test
npm --prefix work-engine run typecheck
npm --prefix work-engine run build
Run focused migration/export commands with local fixtures and project-local export directories. The exact fixture paths may differ after implementation, but verification must include dry-run, local write/import, export, export validation, and dry-run restore/import validation:
Migrate spreadsheet TODOs into integrated operations-manager work
Status: pending
Tags: enhancement, migration, portal, process-docs, work-engine, frontend, backend, testing, data, P1
Depends on: #15, #29, #48, #50
Blocks: None
If UI/operator surfaces change, run E2E and capture screenshots for Home/task/workflow states with imported tasks:
If process-doc resolution/search changes, also run:
uv run --project lambda-functions --extra search --with pytest python -m pytest tests/docs_app
cd lambda-functions
uv run --extra search python -m lambda_functions.build_search_index \
  --docs-dir ../content \
  --output ../.tmp/dataops-content-search.index
Before handoff, include: