Import Podcast Assistant into assistants/podcast #7

Description

@alexeygrigorev

Status: pending
Tags: migration, assistant, podcast, testing, P0
Depends on: #12
Blocks: #9, #44; provides implementation source for #30 and #31

Scope

Import the local Podcast Assistant source system into the target DataOps assistant structure at assistants/podcast/, preserving the existing assistant behavior while making it runnable and testable from this repo.

The source of truth for this import is the read-only local directory ../podcast-assistant. Do not modify that source repo/directory. This issue should copy or reconcile the assistant into dataops, not make changes upstream.

The repo currently contains a transitional root-level podcast-assistant/ copy. This issue should move the repo toward the target structure described in _docs/MERGE_PLAN.md and docs/repository-structure-recommendation.md: podcast-assistant/ should be folded into assistants/podcast/. Avoid leaving two active copies. If a temporary compatibility shim is truly needed, document why and make the canonical package path unambiguous.

Preserve these Podcast Assistant capabilities from the source system:

  • Telegram intake for text notes, voice/audio, files, images, links, and command handling.
  • Groq-backed audio transcription and image description boundaries.
  • Heru runner support for Codex/Claude engines, including progress formatting.
  • Retry/resume behavior for interrupted assistant sessions.
  • Inbox staging, used-item movement, local generated document output, and status reporting.
  • Podcast process instructions, guest-intake template, knowledge-base material, example document fixtures, scripts, and unit tests needed to validate the assistant.
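
As an illustration of the retry/resume capability listed above, here is a minimal sketch of a resume loop that can be exercised with a mocked runner. It is not the source system's implementation: `SessionInterrupted`, `run_with_resume`, and the `run(session_id)` callable shape are hypothetical names chosen for this example, and the real session retrier may track sessions differently.

```python
from typing import Callable, Optional


class SessionInterrupted(Exception):
    """Hypothetical error: a run stopped early but left a resumable session."""

    def __init__(self, session_id: str) -> None:
        super().__init__(f"session {session_id} was interrupted")
        self.session_id = session_id


def run_with_resume(
    run: Callable[[Optional[str]], str],
    max_attempts: int = 3,
) -> str:
    """Call run(session_id), resuming the interrupted session on each retry.

    The first attempt starts a fresh session (session_id is None). The last
    allowed attempt re-raises the interruption instead of retrying again.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    session_id: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return run(session_id)
        except SessionInterrupted as exc:
            if attempt == max_attempts:
                raise
            # Remember which session to resume on the next attempt.
            session_id = exc.session_id
    raise AssertionError("unreachable")
```

Because the runner is passed in as a plain callable, a test can substitute a fake that fails once and then succeeds, so no live Codex/Claude/Heru engine is needed to check the retry path.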

Keep the V1 product direction clear: this import makes the assistant available as a DataOps module, but it does not complete the full portal workflow. Local outputs in the style of inbox/, documents/, and heru_runs/ are acceptable only as transitional local/dev outputs. The imported README or module docs must clearly state that production V1 will attach assistant inputs, jobs, logs, and generated podcast drafts to workflow artifacts through the assistant job/artifact work tracked separately.

Implementation should update package metadata, relative paths, README commands, test configuration, .gitignore rules, and import-log entries as needed so the package works from assistants/podcast/ without runtime reads from ../podcast-assistant.
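
One way to keep runtime paths inside assistants/podcast/ is to resolve every local directory against the package root and reject anything that escapes it. The sketch below is an assumption about how this could be done, not the existing code: `resolve_package_dir` is a hypothetical helper name, and the package may already centralize its paths elsewhere.

```python
from pathlib import Path


def resolve_package_dir(package_root: Path, relative: str) -> Path:
    """Resolve `relative` under `package_root`, refusing paths outside it.

    Hypothetical helper: it guards against accidental runtime reads from
    sibling directories such as ../podcast-assistant.
    """
    root = package_root.resolve()
    candidate = (root / relative).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"{relative!r} resolves outside {root}")
    return candidate
```

For example, `resolve_package_dir(root, "inbox")` returns a path under the package root, while `resolve_package_dir(root, "../podcast-assistant")` raises `ValueError`.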

Acceptance Criteria

  • assistants/podcast/ exists and is the canonical in-repo location for the Podcast Assistant.
  • The root-level podcast-assistant/ transitional copy is removed, moved, or reduced to a documented compatibility shim; there are not two independently maintained active assistant copies.
  • The source boundary is respected: ../podcast-assistant is read-only, remains unmodified, and is not used at runtime by the imported package.
  • _docs/import-log.md records the Podcast Assistant import path as assistants/podcast/, the source path ../podcast-assistant, the fact that the source is not a Git repository, copied paths, deliberate exclusions, and validation commands.
  • Sensitive/local-only files are excluded from Git, including .env, .venv, caches, raw private inbox contents, generated Heru logs, and generated episode-specific outputs unless they are intentional small fixtures.
  • Package metadata and uv configuration work from the new location. Any Heru dependency strategy is explicit and tested from the DataOps checkout without modifying ../podcast-assistant.
  • Existing assistant unit tests are imported and pass from the new package path.
  • Tests cover the key preserved behavior: Telegram text saving/status counts, Heru progress formatting, retry/resume behavior, and podcast knowledge-base build/search utilities.
  • README/setup docs under assistants/podcast/ document required env vars (TELEGRAM_BOT_API_KEY, TELEGRAM_CHAT_ID, GROQ_API_KEY, HERU_ENGINE) and local run commands from the new path.
  • The docs explain the transitional storage boundary: local inbox/, documents/, and logs are not production artifact storage; future DataOps integration should create assistant jobs and artifact records.
  • Assistant prompts/process instructions and guest-intake templates remain versioned and readable in Git.
  • [HUMAN] With real credentials, a human can confirm the Telegram bot still accepts an allowed chat, stores an intake item, reports /status, and starts a processing run or dry-run safely.
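
The environment variables named in the criteria above could be checked at startup with something like the following sketch. The variable names come from this issue; the `missing_env_vars` helper and the choice to treat blank values as missing are assumptions made for illustration, not the assistant's current behavior.

```python
import os
from collections.abc import Mapping

# Names taken from the acceptance criteria in this issue.
REQUIRED_ENV_VARS = (
    "TELEGRAM_BOT_API_KEY",
    "TELEGRAM_CHAT_ID",
    "GROQ_API_KEY",
    "HERU_ENGINE",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or blank in `env`."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name, "").strip()]
```

Accepting the environment as a mapping keeps the check testable with a plain dict, so unit tests never need real credentials.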

Test Scenarios

  • Run the assistant unit suite from the imported package:

    uv run --project assistants/podcast pytest

  • Run any focused tests or commands needed to validate packaging from the new directory, for example:

    cd assistants/podcast
    uv run pytest tests/test_main.py tests/test_heru_runner.py tests/test_session_retrier.py
    uv run python scripts/search_podcast_kb.py "AI agents evaluation"

  • Verify local path behavior with temporary test directories: saving Telegram text creates metadata-rich markdown under the package-local raw inbox, /status ignores .gitkeep, and tests do not touch ../podcast-assistant.

  • Verify retry/resume logic with the existing mocked/session tests; do not require live Codex/Claude/Heru engine runs for normal CI.

  • If credentials are available, perform a manual Telegram smoke test for /start, /status, saving one harmless test message, and starting a processing command. Record the result in the issue comment. This check is human-gated because it uses a real Telegram bot and external credentials.
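
The local path scenario above can be exercised against a temporary directory. The following is a minimal sketch under stated assumptions: `save_text_note` and `count_inbox_items` are hypothetical stand-ins for the assistant's real functions, and the file naming and front-matter fields are invented for this example rather than taken from the source system.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def save_text_note(
    raw_inbox: Path,
    text: str,
    chat_id: int,
    now: Optional[datetime] = None,
) -> Path:
    """Write a text note as markdown with a small metadata header."""
    now = now or datetime.now(timezone.utc)
    raw_inbox.mkdir(parents=True, exist_ok=True)
    path = raw_inbox / f"{now:%Y%m%d_%H%M%S}_text.md"
    header = (
        "---\n"
        "type: text\n"
        f"chat_id: {chat_id}\n"
        f"received_at: {now.isoformat()}\n"
        "---\n\n"
    )
    path.write_text(header + text + "\n", encoding="utf-8")
    return path


def count_inbox_items(raw_inbox: Path) -> int:
    """Count staged files, ignoring the .gitkeep placeholder."""
    if not raw_inbox.is_dir():
        return 0
    return sum(
        1 for p in raw_inbox.iterdir() if p.is_file() and p.name != ".gitkeep"
    )
```

In a pytest suite, the `tmp_path` fixture would supply the temporary inbox directory, which keeps the test away from both the real inbox and ../podcast-assistant.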

Out of Scope

Dependencies

  • Issue dependency: #12 (Define import log and source commit policy) should define or confirm the import-log and source-state policy before this import is finalized.
  • Local source dependency: ../podcast-assistant must be available to the engineer as a read-only source directory.
  • Runtime/tooling dependencies: Python 3.13+, uv, Telegram bot credentials for manual smoke testing, GROQ_API_KEY for transcription/image-description paths, and a working Heru engine setup for real processing runs.
  • Review dependency: Assistant Engineer should review the imported prompts, Heru/tool boundary, and retry behavior before final PM acceptance because this module controls assistant execution.

Metadata

Assignees: No one assigned

Labels:

  • P0: Must have
  • assistant: Assistant modules and jobs
  • human: Code done or issue blocked on human verification
  • migration: Import or migration work
  • podcast: Podcast workflow and assistant
  • testing: Tests and QA
