Add process quality dashboard for runnable workflows #35

Description

@alexeygrigorev

Status: pending
Tags: enhancement, docs, process-docs, portal, frontend, backend, work-engine, testing, data, P1
Depends on: #33, #34
Blocks: None

Scope

Build the first process quality dashboard for DataOps V1, centered on the operations-manager jobs to be done in docs/operations-manager-platform-jtbd.md: the operator opens DataOps to keep daily workflows runnable, complete tasks with proof, follow up on blockers, and use process docs only at the moment of need.

This must not become a disconnected docs analytics page. The dashboard should surface process quality only where it affects completing tasks, starting workflow runs, or maintaining workflow templates. Repo-wide documentation counts may exist as secondary drill-down context, but the default view should answer:

  • Which active workflows or due tasks are blocked or risky because an instruction document cannot be resolved?
  • Which workflow templates still rely on unstable/generated document IDs, legacy Google Docs-only instructionsUrl values, or missing sourceDocIds/instructionDocId mappings?
  • Which tasks lack usable proof or validation instructions, so the operator cannot know what evidence is required before marking work done?
  • Which process docs used by active workflows contain TODOs, empty validation sections, missing metadata, broken links/images, or missing related-doc context?
  • Which findings are operator-blocking today versus maintainer cleanup for later?

Use the implementation from #33 for stable IDs and #34 for deterministic link/reference validation. This issue should consume or expose those signals in product form; it should not redo the stable-ID migration or replace the validator.

Dashboard Placement

Implement the quality signal in the existing operations workspace:

  • Replace the current Operations Home placeholder card for Process Quality in frontend/src/app.js with live process-quality content.
  • Add a compact Operations Home section near the daily lanes / active workflows that shows the highest-priority findings affecting active or due work first.
  • Provide a drill-down view from the same portal shell, under the Processes/process-quality surface, where maintainers can filter all findings by severity, workflow/template, category, document, and status.
  • Keep the workflow/task screen primary. A finding tied to a task should open the task or workflow context first, with a path to the relevant process doc.
  • When live /work/api/* data is unavailable, show that active-work impact cannot be confirmed and fall back only to workflow-template/process-doc quality. Do not present template-only findings as if they are active production blockers.

Quality Signals

Create a normalized process-quality report that can be rendered by the frontend and tested independently. Each finding should include at least:

  • stable id
  • category
  • severity: blocking, warning, or info
  • short operator-facing title
  • actionable summary
  • affected docId and/or docPath when applicable
  • affected templateId, sourceDocIds, taskRef, instructionDocId, taskId, and/or bundleId when applicable
  • workflowSlug or workflow type when derivable
  • source: registry, link validation, lint/TODO scan, workflow-template scan, runtime task scan, or proof/validation scan
  • recommended next action, such as open task, open workflow, open doc, fix metadata, add proof requirement, or add stable doc mapping
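A minimal sketch of how one finding could be modelled and serialized, assuming a Python backend. The class name `Finding`, the helper `to_json_dict`, and the snake_case attribute names are illustrative assumptions; only the field list itself comes from this issue.

```python
from dataclasses import asdict, dataclass
from typing import Optional

SEVERITIES = ("blocking", "warning", "info")


@dataclass(frozen=True)
class Finding:
    """One normalized process-quality finding (hypothetical shape)."""

    id: str
    category: str
    severity: str
    title: str
    summary: str
    source: str
    next_action: str
    doc_id: Optional[str] = None
    doc_path: Optional[str] = None
    template_id: Optional[str] = None
    source_doc_ids: tuple = ()
    task_ref: Optional[str] = None
    instruction_doc_id: Optional[str] = None
    task_id: Optional[str] = None
    bundle_id: Optional[str] = None
    workflow_slug: Optional[str] = None

    def __post_init__(self):
        # Reject severities outside the three levels named in this issue.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")


def _camel(name):
    """Convert snake_case to camelCase, e.g. instruction_doc_id -> instructionDocId."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_json_dict(finding):
    """Serialize a finding, dropping unset optional fields."""
    return {
        _camel(key): list(value) if isinstance(value, tuple) else value
        for key, value in asdict(finding).items()
        if value not in (None, ())
    }
```

Dropping unset fields keeps the payload small and lets the frontend treat a missing key as "relationship not known" rather than as an empty value.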

Cover these categories in the first implementation slice:

  • broken-doc-reference: unresolved related_docs, sourceDocIds, instructionDocId, wiki refs, doc: refs, or internal Markdown links reported by the #34 validator (Add internal link and related-doc validation).
  • broken-asset-reference: missing local screenshots/images used by process docs that are tied to templates or active workflows.
  • unstable-doc-id: workflow-critical docs that still rely on generated IDs or references that are paths/aliases when a canonical stable ID should exist.
  • missing-metadata: workflow-critical docs missing required frontmatter such as id, summary, doc_type, schema_version, source, systems, tags, or related_docs, based on the #33 contract (Extend process docs with stable IDs).
  • todo-or-placeholder: TODO/TBD/placeholder content in task templates, linked SOPs/templates, validation sections, or source/proof fields.
  • missing-validation: SOPs or task templates used by workflows that have empty or absent validation guidance.
  • missing-proof-instructions: work-engine task definitions or runtime tasks that require proof implicitly or explicitly but do not explain the required URL/file/artifact/comment/external status clearly enough for the operator to complete the task.
  • legacy-external-only-doc: template/task instructions that only point to external Google Docs URLs even though an in-repo stable process document exists or should be mapped.
  • template-doc-gap: workflow templates missing sourceDocIds, task definitions missing instructionDocId, or seeded template records drifting from Git-backed process docs.
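As one illustration of a category check, the missing-metadata rule could be sketched as below. The required-key tuple copies the examples listed above and is an assumption; the authoritative list is whatever the #33 contract defines. The function name `missing_metadata` is hypothetical, and the frontmatter is assumed to be already parsed into a dict.

```python
# Assumed required keys, taken from the examples in this issue.
# The #33 contract is the real source of truth.
REQUIRED_FRONTMATTER = (
    "id",
    "summary",
    "doc_type",
    "schema_version",
    "source",
    "systems",
    "tags",
    "related_docs",
)


def missing_metadata(frontmatter):
    """Return required keys that are absent, None, or an empty string.

    An empty list (for example `related_docs: []`) counts as present,
    because a doc may legitimately have no related docs.
    """
    return [
        key
        for key in REQUIRED_FRONTMATTER
        if frontmatter.get(key) in (None, "")
    ]
```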

Severity rules should prioritize daily execution:

  • blocking: affects a due/overdue/waiting task, active workflow, required proof, or workflow template needed to create runnable tasks.
  • warning: affects a workflow template or linked process doc but no loaded active work is currently blocked.
  • info: repo-wide/process-doc cleanup that is not tied to current tasks or priority templates.
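The severity rules above can be read as a small decision function. This is a sketch under assumptions: the function name, the keyword parameters, and the task state strings are hypothetical and would need to match whatever the work-engine actually reports.

```python
# Assumed task states that count as daily-execution impact.
ACTIVE_TASK_STATES = {"due", "overdue", "waiting"}


def classify_severity(
    *,
    task_state=None,
    workflow_active=False,
    affects_required_proof=False,
    template_needed_for_runnable_tasks=False,
    tied_to_template_or_doc=False,
):
    """Map a finding's context to blocking, warning, or info."""
    if (
        task_state in ACTIVE_TASK_STATES
        or workflow_active
        or affects_required_proof
        or template_needed_for_runnable_tasks
    ):
        return "blocking"
    if tied_to_template_or_doc:
        return "warning"
    # Repo-wide cleanup that is not tied to current work.
    return "info"
```

Keeping the rule in one pure function makes the prioritization easy to test without loading live work data.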

Implementation Sequencing

  1. Reuse/extend the #34 validation module and the #33 document registry metadata to produce machine-readable findings instead of scraping CLI text.
  2. Add a backend endpoint or served report, preferably under the existing portal/docs API surface, for example /process-quality or /docs/process-quality.
  3. Include workflow-template context by inspecting content/tasks/templates/*.md and work-engine/scripts/seed-templates.ts or the shared template source used after #33 and #34 land.
  4. Include runtime impact by allowing the frontend to combine quality findings with the existing Operations Home work snapshots from /work/api/tasks and /work/api/bundles, or by having the backend accept/provide equivalent work context if a cleaner local pattern exists.
  5. Render the Operations Home summary for top blocking/warning items that affect active workflows first.
  6. Add the Processes drill-down for maintainers with category/severity/workflow filters and links back to tasks, workflows, and docs.
  7. Document the local verification command and maintainer remediation expectations in the relevant development/process docs.
  8. Add tests before handoff and avoid source-repo changes outside DataTalksClub/dataops.

Acceptance Criteria

  • A deterministic process-quality report exists and can be generated locally without live external accounts or production writes.
  • The report uses the #33 registry and #34 validation behavior rather than duplicating ad hoc stable-ID or link-resolution logic.
  • Findings cover broken doc/link/image references, unstable IDs, missing metadata, TODO/placeholders, missing validation, missing proof instructions, workflow-template doc refs, and legacy external-only doc references.
  • Findings include enough structured context for the UI to tie them to documents, workflow templates, task definitions, runtime tasks, and active bundles when those relationships exist.
  • Operations Home shows top process-quality warnings in the daily operating context, prioritizing active/due/overdue/waiting work over repo-wide cleanup.
  • The existing placeholder text for process quality is removed or replaced with live empty/loading/error states that do not use fake data.
  • The Processes drill-down lets maintainers filter by severity, category, workflow/template, and document, and each finding has an actionable next step.
  • A task or workflow with an unresolved instructionDocId, missing required proof guidance, or broken linked SOP is visibly marked as risky before the operator tries to complete it.
  • Workflow templates with missing sourceDocIds, missing task instructionDocId, generated/path-only doc IDs, or Google Docs-only instruction links are reported before they are treated as fully runnable templates.
  • Empty states clearly distinguish "no process quality findings" from "live work data unavailable" and "validation could not run".
  • Tests cover backend/report generation, frontend rendering, active-work prioritization, filters, empty/error states, and links from findings to task/workflow/doc context.
  • Process Curator expectations are documented: if this dashboard changes process quality rules or exposes real SOP/template gaps, a Process Curator review should confirm the findings and remediation copy are useful for future maintainers.
  • No source repositories outside DataTalksClub/dataops are modified.

Test Scenarios

Scenario: Active task has broken instructions

Given: a live due task has instructionDocId: missing.process.doc
When: Operations Home loads process quality findings and work snapshots
Then: the task appears with a blocking process-quality warning, the warning explains that instructions cannot be opened, and the action opens the task/workflow context.
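A sketch of how this scenario might be checked in a test helper. The task shape (`id`, `status`, `instructionDocId`) and the set of known doc IDs are assumptions; the real data would come from the work snapshots and the #33 registry.

```python
# Assumed task states that make an unresolved doc reference blocking.
ACTIVE_STATES = {"due", "overdue", "waiting"}


def unresolved_instruction_findings(tasks, known_doc_ids):
    """Return one finding per task whose instructionDocId is not in the registry."""
    findings = []
    for task in tasks:
        doc_id = task.get("instructionDocId")
        if not doc_id or doc_id in known_doc_ids:
            continue
        blocking = task.get("status") in ACTIVE_STATES
        findings.append(
            {
                "id": f"broken-doc-reference:{task['id']}:{doc_id}",
                "category": "broken-doc-reference",
                "severity": "blocking" if blocking else "warning",
                "title": "Instructions cannot be opened",
                "summary": f"instructionDocId {doc_id!r} does not resolve to a process doc.",
                "taskId": task["id"],
                "instructionDocId": doc_id,
                "source": "runtime task scan",
                "nextAction": "open task",
            }
        )
    return findings
```

Tasks without any instructionDocId are skipped here on purpose; that gap belongs to the template-doc-gap category rather than broken-doc-reference.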

Scenario: Active task lacks proof guidance

Given: a task requires a link, file, artifact, comment, or external status but has no clear requiredLinkName, proofRequirement.label, validation.requiredEvidence, or equivalent completion instruction
When: the process-quality report is rendered
Then: the finding is categorized as missing-proof-instructions and appears before lower-priority doc cleanup.
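One way the proof-guidance check could look, as a hedged sketch. The field names requiredLinkName, requiresFile, proofRequirement.label, and validation.requiredEvidence come from this issue; the rule for deciding that a task requires proof at all, and the function names, are assumptions.

```python
def _text(value):
    """Return stripped text, or an empty string for non-string values."""
    return value.strip() if isinstance(value, str) else ""


def _as_dict(value):
    """Treat anything that is not a dict as an empty mapping."""
    return value if isinstance(value, dict) else {}


def requires_proof(task):
    # Assumption: any of these fields signals that proof is expected.
    return (
        bool(task.get("requiresFile"))
        or "requiredLinkName" in task
        or "proofRequirement" in task
    )


def has_proof_guidance(task):
    """True when at least one completion instruction has non-blank text."""
    candidates = (
        task.get("requiredLinkName"),
        _as_dict(task.get("proofRequirement")).get("label"),
        _as_dict(task.get("validation")).get("requiredEvidence"),
    )
    return any(_text(candidate) for candidate in candidates)


def lacks_proof_instructions(task):
    """True when the task expects proof but never says what proof to provide."""
    return requires_proof(task) and not has_proof_guidance(task)
```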

Scenario: Template uses unstable process docs

Given: a workflow template has sourceDocIds or task instructionDocId values that resolve only through generated/path IDs, or only has instructionsUrl when an in-repo process doc mapping exists
When: the report is generated
Then: the template is reported with warning or blocking severity, depending on whether active work currently uses it.
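A rough sketch of the template-side scan for this scenario. The template shape (a `tasks` list with `taskRef` entries) and the function name are assumptions. Detecting generated or path-only IDs depends on the #33 contract and is left out here.

```python
from urllib.parse import urlparse


def scan_template(template):
    """Return (category, message) pairs for doc-mapping gaps in one template."""
    gaps = []
    if not template.get("sourceDocIds"):
        gaps.append(("template-doc-gap", "template has no sourceDocIds"))
    for task in template.get("tasks", []):
        ref = task.get("taskRef", "<unknown>")
        if task.get("instructionDocId"):
            continue
        gaps.append(("template-doc-gap", f"{ref}: missing instructionDocId"))
        url = task.get("instructionsUrl") or ""
        # Flag tasks whose only instructions live in an external Google Doc.
        if urlparse(url).hostname == "docs.google.com":
            gaps.append(
                ("legacy-external-only-doc", f"{ref}: only a Google Docs instructionsUrl is set")
            )
    return gaps
```

Severity is deliberately not assigned here; it depends on whether active work uses the template, which the scan alone cannot know.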

Scenario: Linked SOP has TODO and empty validation

Given: a process doc linked from a priority workflow contains TODO text and an empty Validation section
When: the report runs
Then: the dashboard reports TODO/placeholders and missing validation against that workflow/doc, with links to the doc editor.
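A small sketch of the doc-level scan behind this scenario, assuming process docs are Markdown with a heading literally titled "Validation". The function names and the placeholder pattern are illustrative assumptions, not the lint rules of the existing tooling.

```python
import re

# Assumed placeholder markers; the real scan may use a different list.
PLACEHOLDER_RE = re.compile(r"\b(?:TODO|TBD|placeholder)\b", re.IGNORECASE)
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def has_validation_guidance(lines):
    """True when a 'Validation' heading is followed by any non-blank text."""
    in_validation = False
    for line in lines:
        match = HEADING_RE.match(line)
        if match:
            # Any heading starts a new section; only 'Validation' is of interest.
            in_validation = match.group(2).lower() == "validation"
            continue
        if in_validation and line.strip():
            return True
    return False


def scan_doc(markdown_text):
    """Return the finding categories that apply to one process doc."""
    lines = markdown_text.splitlines()
    categories = []
    if any(PLACEHOLDER_RE.search(line) for line in lines):
        categories.append("todo-or-placeholder")
    if not has_validation_guidance(lines):
        categories.append("missing-validation")
    return categories
```

This simple version treats any sub-heading as the end of the Validation section, so a doc that nests its guidance under sub-headings would need a smarter section walker.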

Scenario: Broken image is scoped to workflow impact

Given: a missing screenshot is referenced by a SOP used by an active workflow, and another missing screenshot is in an unrelated archived doc
When: findings are prioritized
Then: the active workflow finding is blocking or warning, while the archived/unrelated finding is lower priority or hidden behind the maintainer drill-down.

Scenario: No live work data

Given: /work/api/* calls fail but docs/template validation succeeds
When: Operations Home renders
Then: the dashboard says active-work impact cannot be confirmed and shows template/process-doc findings only as maintainer warnings, not active task blockers.
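The fallback and empty-state behaviour could be sketched as a single view-building function. The state names, the notice copy, and the function name `build_view` are assumptions; the point is that the three situations stay distinguishable and that blocking findings are demoted when active-work impact cannot be confirmed.

```python
def build_view(findings, *, work_data_ok, validation_ok):
    """Return a hypothetical view model for the Operations Home card."""
    if not validation_ok:
        return {
            "state": "validation-could-not-run",
            "notice": "Validation could not run.",
            "findings": [],
        }
    if not work_data_ok:
        # Without live work data nothing can be called an active blocker,
        # so blocking findings are shown as maintainer warnings instead.
        demoted = [
            dict(f, severity="warning") if f["severity"] == "blocking" else f
            for f in findings
        ]
        return {
            "state": "work-data-unavailable",
            "notice": "Active-work impact cannot be confirmed.",
            "findings": demoted,
        }
    if not findings:
        return {
            "state": "no-findings",
            "notice": "No process quality findings.",
            "findings": [],
        }
    return {"state": "findings", "notice": "", "findings": list(findings)}
```

The demotion copies each finding rather than mutating it, so the underlying report stays unchanged if live work data becomes available later.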

Scenario: Clean active workflow

Given: an active workflow's template has stable sourceDocIds, each task has resolvable instructionDocId, required proof is explained, linked docs have validation guidance, and validation reports no broken links
When: Operations Home renders
Then: the workflow has no process-quality risk badge and can still show normal due/overdue/proof status.

Scenario: Maintainer filters findings

Given: the report contains multiple finding categories across Newsletter, Podcast, and Tax Report workflows
When: a maintainer opens the Processes drill-down and filters by Podcast and blocking
Then: only blocking Podcast findings remain, with links to affected tasks, templates, and docs.
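The filter scenario can be expressed as a pure function over the report, sketched here under the assumption that findings are dicts with `severity`, `category`, `workflowSlug`, and `docId` keys. The function name and parameter names are hypothetical.

```python
SEVERITY_ORDER = {"blocking": 0, "warning": 1, "info": 2}


def filter_findings(findings, *, severity=None, category=None, workflow=None, doc_id=None):
    """Keep findings that match every provided filter, most severe first."""
    wanted = {
        "severity": severity,
        "category": category,
        "workflowSlug": workflow,
        "docId": doc_id,
    }

    def keep(finding):
        return all(
            value is None or finding.get(key) == value
            for key, value in wanted.items()
        )

    # sorted() is stable, so findings of equal severity keep report order.
    return sorted(
        (f for f in findings if keep(f)),
        key=lambda f: SEVERITY_ORDER.get(f.get("severity"), len(SEVERITY_ORDER)),
    )
```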

Out of Scope

  • Migrating stable IDs onto process docs or task templates; that belongs to #33.
  • Implementing the internal link/related-doc validator itself; that belongs to #34.
  • Fixing every reported SOP, template, TODO, metadata gap, or broken link discovered by the dashboard.
  • Building a generic docs analytics product, content scorecard, or vanity metrics page disconnected from workflow execution.
  • Checking external HTTP(S), Google Docs, Loom, GitHub, or account permissions live over the network.
  • Changing production DynamoDB data, external Google Docs, Trello, Google Sheets, Airtable, or other source systems.
  • Moving content/ to a separate dataops-knowledge repository.
  • Editing ../dtc-operations, ../datatasks, or ../podcast-assistant.
  • Reworking the whole Operations Home layout beyond the process-quality integration needed here.

Dependencies

  • #33 (Extend process docs with stable IDs) should land first so workflow-critical docs and task templates have stable document IDs and metadata contracts.
  • #34 (Add internal link and related-doc validation) should land first, or this issue must implement a temporary adapter around whatever validation output exists, clearly marked as transitional.
  • The implementation should reuse lambda-functions/src/lambda_functions/doc_registry.py, lambda-functions/src/lambda_functions/docs_index.py, the #34 validation command/module, existing docs app tests under tests/docs_app/, and current Operations Home work snapshot logic in frontend/src/app.js.
  • Work-engine context should come from existing task/template fields such as sourceDocIds, instructionDocId, instructionsUrl, validation, requiredLinkName, requiresFile, and proofRequirement.
  • No external credentials or human-only checks are expected.

Labels

Use labels: enhancement, docs, process-docs, portal, frontend, backend, work-engine, testing, data, P1.

Remove the needs grooming label after this body is applied.

Verification Expectations

Run the focused process-quality command or endpoint test added by this issue. The exact command may differ by implementation, but it must be documented and deterministic.

Run docs app/backend/frontend tests:

uv run --project lambda-functions --extra search --with pytest python -m pytest tests/docs_app

Build the search index, because registry/search/content quality signals are part of this feature:

cd lambda-functions
uv run --extra search python -m lambda_functions.build_search_index \
  --docs-dir ../content \
  --output ../.tmp/dataops-content-search.index

Run work-engine checks because template/task proof and instruction-doc fields are part of the report:

npm --prefix work-engine test
npm --prefix work-engine run typecheck
npm --prefix work-engine run build

Run UI verification for the Operations Home and Processes drill-down, including screenshots for changed screens and at least one test case where /work/api/* is unavailable.

Before handoff, include:

git diff --check
