Implement unified operator search across docs and work context #32

# Implement unified operator search across docs and work context

Status: pending
Tags: `enhancement`, `portal`, `process-docs`, `work-engine`, `frontend`, `backend`, `testing`, `data`, `design`, `P1`
Depends on: #33, #34
Blocks: None

## Scope

Build the first unified search implementation for the DataOps operator portal. Search should help an operator execute daily work by finding the right task, workflow, SOP, template, artifact, or assistant output in context. It must not become a separate docs-first tool that leaves the operator to manually connect results back to the work they are doing.

Use the current architecture as the starting point:

- `lambda-functions/src/lambda_functions/docs_index.py`, `build_search_index.py`, and `/search` currently index/search Markdown content with `minsearch`.
- `lambda-functions/src/lambda_functions/doc_registry.py`, `/docs/registry`, and `/docs/resolve` expose document metadata and stable-ID resolution.
- The authenticated portal brokers private work-engine APIs through `/work/api/*` in `lambda-functions/src/lambda_functions/full_app_handler.py`.
- `work-engine/` already stores runtime work entities: tasks, templates, bundles/workflows, artifacts, files, assistant jobs, notifications, and export metadata.
- `work-engine` task/template records already support `instructionDocId`, `instructionStepId`, `sourceDocIds`, `phase`, `systems`, and `validation`; instantiated tasks preserve those fields.
- The current `frontend/` search UI searches documents only and renders document rows; the Operations Home/task panels already consume live work snapshots and can open tasks/workflows in context.

The implementation should add one operator-facing search experience that can return typed results from both process knowledge and live work state. Search results should make the result type and next action obvious:

- SOP/template/reference/playbook/prompt/task-template docs open in the portal document view.
- Task results open the task detail/action panel, including process-doc context when `instructionDocId` is present.
- Workflow/bundle results open the workflow detail panel with active tasks, required proof, and linked docs.
- Runtime template/workflow-definition results should make it possible to start or inspect the workflow, using existing quick-start flows where available.
- Artifact/file results should show the owning task/workflow/assistant job and open the relevant context before exposing the raw URL/file.
- Assistant job/output results should show status and owning task/workflow where those records exist; live assistant execution or new assistant lifecycle work is out of scope.
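
The type-to-action mapping in the list above could be captured in a small lookup. This is a minimal sketch, not the portal's actual contract: the `SearchResult` fields, the result type names, and the action names (`open_document`, `open_task_panel`, and so on) are all hypothetical and only illustrate that every result type resolves to exactly one primary action.

```python
from dataclasses import dataclass, field

# Hypothetical result types mapped to the portal context each should open first.
PRIMARY_ACTIONS = {
    "doc": "open_document",
    "task": "open_task_panel",
    "bundle": "open_workflow_panel",
    "template": "inspect_or_start_workflow",
    "artifact": "open_owner_context",
    "assistant_job": "open_owner_context",
}


@dataclass
class SearchResult:
    """Illustrative typed result; field names are assumptions."""
    type: str
    id: str
    title: str
    summary: str = ""
    source: str = ""
    routing: dict = field(default_factory=dict)


def primary_action(result: SearchResult) -> str:
    """Return the name of the action the UI should offer first."""
    try:
        return PRIMARY_ACTIONS[result.type]
    except KeyError:
        # Fail loudly so a new result type cannot ship without a routing decision.
        raise ValueError(f"unknown result type: {result.type!r}") from None
```

Raising on an unknown type is a deliberate choice in this sketch: it forces a routing decision whenever a new result type is added.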

Start with an API/UI slice that covers docs, task templates, live tasks, active bundles/workflows, artifacts/files, and assistant-job/output metadata available from existing APIs. If a data source is not configured locally or the work engine is unavailable, the UI must show partial results and a clear unavailable-state for that source without fabricating matches.
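
One way to get partial results with a named unavailable state is to query each source independently and collect failures next to the matches. The sketch below assumes each source is a plain callable that returns a list of result dicts; `search_all` and the response keys `results` and `unavailable` are hypothetical names, not an existing API.

```python
def search_all(query, sources):
    """Query every source; a failing source is reported, never faked.

    `sources` maps a source name to a callable taking the query string.
    """
    results = []
    unavailable = []
    for name, search_fn in sources.items():
        try:
            results.extend(search_fn(query))
        except Exception as exc:
            # Record which source failed so the UI can name it in a
            # non-blocking status instead of showing an empty or fake list.
            unavailable.append({"source": name, "reason": str(exc)})
    return {"results": results, "unavailable": unavailable}
```

With this shape, the frontend can render whatever is in `results` and show one status line per entry in `unavailable`.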

## Jobs To Be Done Implications

- When an operator starts the day, they can search for a topic such as `Mailchimp newsletter`, `podcast document`, or `Luma` and see both executable work and the SOPs/templates needed to finish it.
- When an operator is inside a task or workflow, search helps answer `what doc/template/evidence do I need now?` rather than sending them to an unrelated documentation library.
- When a workflow is at risk, search exposes nearby context: active bundle, overdue task, required artifact, linked SOP, and assistant output status.
- When content quality is incomplete, search should surface missing/unresolved process-doc links as useful states, not hide the task or show a broken result.

## Affected Areas

- Python Lambda docs portal/search backend: `lambda-functions/src/lambda_functions/docs_index.py`, `search_handler.py`, `api_handler.py`, `full_app_handler.py`, and related tests under `tests/docs_app/`.
- Frontend portal shell: `frontend/index.html`, `frontend/src/app.js`, `frontend/src/styles.css`, and screenshots for search/task/workflow flows.
- Work-engine APIs and models as needed for search payloads: `work-engine/src/routes/*`, `work-engine/src/db/*`, `work-engine/src/types.ts`, `work-engine/src/export/portable.ts`, and related unit/E2E tests.
- Content/search index metadata under `content/**`, especially `content/tasks/templates/*.md`, but do not do broad content cleanup in this issue.

## Acceptance Criteria

- [ ] The portal exposes one authenticated search entry point that returns typed results for process docs/task-template docs plus available runtime work records: tasks, templates/workflows, bundles, artifacts/files, and assistant jobs/outputs.
- [ ] Each result includes a stable `type`, `id`, display title, short summary/context, source label, fields that support relevance ranking, and enough routing metadata for the frontend to open the correct portal context.
- [ ] Search supports at least these filters where data exists: result type/source, doc type, domain, tag, system, task status, assignee, due-date bucket, bundle/workflow, and template/workflow type. Unsupported filters for a source should not break other sources.
- [ ] Document results continue to use the registry/search index and preserve existing `/search?q=...&doc_type=...&domain=...` behavior for docs consumers.
- [ ] Task results include due date/status, bundle/template relationship, assignee when available, required proof state, and `instructionDocId`/`instructionStepId` context when present.
- [ ] Workflow/bundle results include stage/status, progress or active-task counts when available, next due/overdue context, required bundle links/artifacts summary, and related process-doc IDs when present.
- [ ] Runtime template/task-template results make the distinction between Git-backed task-template docs and live work-engine templates clear, while linking both back to source docs through stable IDs when possible.
- [ ] Artifact/file and assistant-job/output results show the owning task/workflow/assistant context and do not expose private raw storage paths as the primary action.
- [ ] The frontend renders grouped or clearly typed results without making docs visually dominate work results; primary actions open the task/workflow/document context inside the portal.
- [ ] Searching from the current sidebar/keyboard shortcut continues to work on desktop and mobile, and empty/error/partial-source states are clear without layout overlap.
- [ ] If `/work/api/*` or the work-engine Lambda is unavailable, document search still works and the UI reports work-search unavailability without fake work results.
- [ ] Existing task-to-process-doc links are reused; this issue must not introduce a second link model or a disconnected search-only document resolver.
- [ ] Result payloads and UI states are covered by backend/unit tests and frontend/E2E tests for mixed docs/work results, filters, missing sources, and routing actions.
- [ ] Search index generation still succeeds for content metadata changes, and work-engine tests still prove task/template process-doc metadata persists through creation, update, instantiation, export, and search result formatting where touched.
- [ ] No source repositories outside `DataTalksClub/dataops` are modified.
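
The due-date bucket filter in the criteria above needs a deterministic mapping from a task's due date to a bucket. The following is a minimal sketch; the bucket names (`overdue`, `today`, `next_7_days`, `later`, `no_due_date`) and the seven-day window are assumptions, since the issue does not define them.

```python
from datetime import date, timedelta
from typing import Optional


def due_bucket(due: Optional[date], today: date) -> str:
    """Classify a due date relative to `today` (bucket names are hypothetical)."""
    if due is None:
        return "no_due_date"
    if due < today:
        return "overdue"
    if due == today:
        return "today"
    if due <= today + timedelta(days=7):
        return "next_7_days"
    return "later"
```

Passing `today` explicitly keeps the function easy to test without mocking the clock.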

## Test Scenarios

### Scenario: Search returns executable work and knowledge together

Given: docs include a Mailchimp newsletter SOP/template and work-engine has an active Newsletter task and workflow
When: the operator searches for `Mailchimp newsletter`
Then: results include the SOP/template doc, the live task, and the active workflow with distinct result types and actions that open the right portal context.

### Scenario: Task result opens with process-doc context

Given: a task has `instructionDocId` and `instructionStepId` that resolve through the document registry
When: the operator selects the task search result
Then: the task panel opens, shows the process-doc title/context, and offers an action to open the doc in the portal.
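
The resolution step in this scenario might look like the sketch below. It assumes the registry can be treated as a mapping from stable doc ID to an entry with a `title`; in the real system that lookup would go through the existing document registry rather than a dict, and the `state` values are hypothetical. An unresolved ID is returned as an explicit state so the task still renders.

```python
from typing import Optional


def process_doc_context(task: dict, registry: dict) -> Optional[dict]:
    """Build the process-doc context shown in the task panel.

    Returns None when the task has no `instructionDocId`.
    """
    doc_id = task.get("instructionDocId")
    if not doc_id:
        return None
    entry = registry.get(doc_id)
    if entry is None:
        # Surface the broken link as a useful state instead of hiding the task.
        return {"docId": doc_id, "state": "unresolved"}
    return {
        "docId": doc_id,
        "stepId": task.get("instructionStepId"),
        "title": entry["title"],
        "state": "resolved",
    }
```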

### Scenario: Workflow result keeps execution context

Given: an active Podcast or Newsletter bundle has overdue tasks, missing required links, and related docs
When: the operator searches for the workflow name or a required link such as `Luma`
Then: the workflow result shows status/progress context and opens the workflow detail rather than a raw docs page.

### Scenario: Artifact result routes through owner context

Given: an artifact or file is attached to a task or bundle
When: the operator searches for the artifact title or type
Then: the result shows the owner task/workflow and opens that context before exposing the artifact URL/file action.

### Scenario: Assistant output search is metadata-only

Given: an assistant job/output record exists with task or bundle ownership
When: the operator searches for the assistant job title or output title
Then: the result shows assistant status and owner context without requiring live Telegram, Groq, Heru, or external assistant credentials.

### Scenario: Partial source failure is visible

Given: the docs index is available but `/work/api/tasks` or another work source is unavailable
When: the operator searches
Then: document results still render, unavailable work sources are named in a non-blocking status, and no fake work results appear.

### Scenario: Filters narrow mixed results

Given: a query matches docs, tasks, and artifacts
When: the operator applies filters such as `type: task`, `status: waiting`, `system: mailchimp`, or `doc_type: sop`
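Then: each source that supports a filter narrows its results accordingly, filters a source does not support do not break that source or any other, and errors from any source stay visible rather than being silently discarded.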
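
One possible reading of per-source filter handling is sketched below: each source declares the filters it understands, applies only those, and reports the rest as ignored. The `SUPPORTED_FILTERS` table, the source names, and the record field names are assumptions for illustration. Whether an unsupported filter should instead exclude that source entirely is a product decision this sketch does not settle.

```python
# Hypothetical per-source filter support; the real sets depend on available data.
SUPPORTED_FILTERS = {
    "docs": {"doc_type", "domain", "tag"},
    "tasks": {"status", "assignee", "system"},
}


def _matches(field_value, wanted):
    """Match scalars by equality and list-like fields by membership."""
    if isinstance(field_value, (list, tuple, set)):
        return wanted in field_value
    return field_value == wanted


def apply_filters(source, records, filters):
    """Return (kept_records, ignored_filter_names) for one source."""
    supported = SUPPORTED_FILTERS.get(source, set())
    active = {k: v for k, v in filters.items() if k in supported}
    ignored = sorted(set(filters) - supported)
    kept = [
        r for r in records
        if all(_matches(r.get(k), v) for k, v in active.items())
    ]
    return kept, ignored
```

Returning the ignored filter names lets the UI say which filters did not apply to a source instead of dropping them silently.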

### Scenario: Mobile and keyboard search remain usable

Given: the operator is on mobile or uses the `/` or `Cmd/Ctrl+K` shortcut
When: they search and open a task/workflow/document result
Then: focus, routing, and panel state behave correctly, and result text fits without overlap or broken navigation.

## Out of Scope

- Replacing `minsearch` with a hosted search service or vector database.
- Building natural-language/RAG answers or assistant-generated search summaries.
- Rebuilding the portal navigation, introducing a new frontend framework, or creating a standalone search app.
- Migrating all legacy Google Docs links to stable `instructionDocId`; use existing fields and depend on #33/#34 for the stable-ID and validation contract.
- Full content cleanup, stable IDs for every imported doc, or workflow-specific doc mapping beyond what is needed to render search context.
- External link availability checks for Google Docs, Loom, GitHub, Airtable, Luma, Meetup, YouTube, Spotify, Slack, Mailchimp, Dropbox, or email systems.
- Mutating production task/workflow/artifact/assistant data as part of indexing. Runtime search should read existing state and metadata only.
- New assistant lifecycle, transcription, Telegram intake, or generated artifact processing.
- Consolidating the TypeScript work-engine into the Python backend.

## Dependencies

- #33 is required so workflow-critical docs and task-template docs have stable IDs that search can display and route through.
- #34 is required so search does not normalize broken `related_docs`, `sourceDocIds`, or `instructionDocId` references into trusted results.
- Existing `/work/api/*` broker behavior should be reused. If the work-engine Lambda is not configured locally, the implementation must still support tests with mocked/local work-engine data and a documented partial-source state.
- The implementation should coordinate with #9/#36/#37/#38 when choosing seed data for realistic workflow search fixtures, but it should not wait for every workflow mapping to be complete.
- Assistant output search depends only on metadata available from current assistant job/artifact/file APIs. Anything requiring external credentials is not part of this issue.

## Data Safety And Export Implications

- Search must treat Git-backed Markdown as process knowledge and DynamoDB/work-engine records as runtime execution state; do not copy runtime work state into Markdown or make content files the source of truth for live tasks.
- Search indexing should be read-only for runtime work data. It must not update task status, assistant job status, artifact records, file records, or bundle links.
- Private or sensitive artifact storage paths must not be exposed as the primary search result text. Prefer title/type/owner/context and existing authenticated open/download routes.
- If a persisted runtime search index/cache is introduced, it must be documented as rebuildable derived data and excluded from portable execution exports unless there is a clear reason to include it.
- Portable export behavior must continue to include canonical task/template/bundle/artifact/assistant metadata, including `instructionDocId`, `instructionStepId`, `sourceDocIds`, and owner links where already supported.
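
The rule about private storage paths could be enforced with an allowlist when shaping artifact results, so a raw path is never copied into the payload by accident. This is a sketch under assumed field names (`taskId`, `bundleId`, `kind`, `storagePath`); the actual artifact record shape in `work-engine` may differ.

```python
# Only these fields are copied into the search result (hypothetical names).
SAFE_FIELDS = ("id", "title", "kind")


def artifact_result(artifact: dict) -> dict:
    """Shape an artifact record into a search result without storage paths."""
    result = {k: artifact[k] for k in SAFE_FIELDS if k in artifact}
    result["type"] = "artifact"
    # Route through the owner so the UI opens context before any file action.
    if artifact.get("taskId"):
        result["owner"] = {"type": "task", "id": artifact["taskId"]}
    elif artifact.get("bundleId"):
        result["owner"] = {"type": "bundle", "id": artifact["bundleId"]}
    else:
        result["owner"] = None
    return result
```

An allowlist is used rather than deleting known-sensitive keys, so a newly added private field stays out of results by default.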

## Blockers

- Engineering should not start until #33 and #34 are closed, or the implementer explicitly downgrades unresolved doc IDs/link validation to warning states with PM approval.
- If there is no stable local way to query representative work-engine records through `/work/api/*`, add a narrow mocked/local test fixture rather than querying production data.
- If assistant output records are not available by implementation time, keep the result type and UI state metadata-ready but mark live assistant-output search as blocked by the assistant job/artifact issue that owns those records.

## Required Verification Commands And Screenshots

Run the docs/search workflow:

```bash
uv run --project lambda-functions --extra search --with pytest python -m pytest tests/docs_app
```

Build the search index when content metadata, registry, search, or routing is touched:

```bash
cd lambda-functions
uv run --extra search python -m lambda_functions.build_search_index \
  --docs-dir ../content \
  --output ../.tmp/dataops-content-search.index
```

Run work-engine checks when runtime work search payloads, APIs, models, or export behavior are touched:

```bash
npm --prefix work-engine test
npm --prefix work-engine run typecheck
npm --prefix work-engine run build
```

Run E2E coverage for changed operator flows:

```bash
npm --prefix work-engine run test:e2e
```

Capture and attach screenshots for:

- Mixed search results showing docs plus tasks/workflows.
- A task result opened from search with process-doc context visible.
- A workflow/bundle result opened from search with active-task/proof context visible.
- Partial-source unavailable state when work search cannot load.
- Mobile search results and opened context panel.

Before handoff, run the following and include its output:

```bash
git diff --check
```
