Voice/STT overhaul: moonshine host STT, free Whisper RAM, transcribe-to-stdout, toggle+voice-pick Hammerspoon PTT

This issue captures a session's worth of local voice/STT work so it can be upstreamed. Verified against current `main`; see the corrections at the end (notably: the moonshine backend is **already shipped**; only the docs, the stdout path, and the RAM gate are net-new).

## Summary

Four parts — **three concrete repo edits** (1–3 + docs), **one out-of-repo reference** (4):
1. moonshine as host STT on `:8101` — **already in repo**; needs docs + a service wrapper.
2. Stop the TTS server (`:8100`) unconditionally loading `WhisperModel("large-v3")` — gate behind `AGENTWIRE_TTS_WHISPER` (default off). **CONFIRMED unconditional load.**
3. New `agentwire listen stop --stdout` transcribe-to-stdout mode — net-new.
4. Hammerspoon PTT rewrite (toggle + voice target-picker) — lives in `~/.hammerspoon/init.lua`, ship as a reference example.

Depends on #365, except the Whisper RAM gate (Part 2), which is independent. **Recommend splitting into 3 issues (below).**

## Part 1 — moonshine host STT (already shipped; docs + service)
`agentwire stt start --backend moonshine` is already wired end-to-end:
- `stt/engine.py:17`: `KNOWN_BACKENDS`
- `stt/engine.py:20-39`: `_load_moonshine`
- `stt/engine.py:89-101`: auto tries moonshine first; clear install hint
- `stt/engine.py:140-147`: transcribe branch
- `__main__.py:1740-1767`: `cmd_stt_start` env-passes `STT_BACKEND`/`MOONSHINE_MODEL`, port `:8101`
- `__main__.py:10898-10911`: `stt start`/`serve` parsers already have `--backend`/`--model`/`--port`

**Left to do:** docs (`agentwire-cli`/`agentwire-config` skills + a wiki page for `--backend moonshine` on `:8101` and `stt.moonshine_model`) and a launchd service for `:8101` (follow-up).

## Part 2 — Whisper RAM gate (concrete; confirmed)
`agentwire/tts_server.py` lifespan startup loads large-v3 **every boot, unconditionally** (~3GB+):
```
210  print("Loading Whisper model (large-v3)...")
211  try:
212      whisper_model = WhisperModel("large-v3", device="cuda", compute_type="float16")
213  except (ValueError, RuntimeError):
215      whisper_model = WhisperModel("large-v3", device="cpu", compute_type="int8")
216  print("Whisper model loaded!")
```
`WhisperModel` is imported at `tts_server.py:38`. `/transcribe` (`:436-440`) already returns 503 when `whisper_model is None`, and the health endpoint reports it (`:478`).
**Fix:** gate lines 210–216 (both cuda+cpu branches) behind `AGENTWIRE_TTS_WHISPER` (default off → leave `whisper_model=None`, init at `:74`). Optionally lazy-import `WhisperModel` so the TTS venv doesn't need it when off.
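
A minimal sketch of the gate, under stated assumptions: the set of values that count as "on" is not specified in this issue, `load_whisper` and `whisper_enabled` are hypothetical helper names, and `WhisperModel` is assumed to come from `faster_whisper` (the constructor arguments suggest it, but the import at `tts_server.py:38` is not quoted here). The factory is injectable so the fallback logic can be exercised without the model installed.

```python
import os
from typing import Any, Callable, Mapping, Optional

# Assumption: which strings switch the gate on is not defined in the issue.
_TRUTHY = {"1", "true", "yes", "on"}


def whisper_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True only when AGENTWIRE_TTS_WHISPER is explicitly set to a truthy value."""
    return env.get("AGENTWIRE_TTS_WHISPER", "").strip().lower() in _TRUTHY


def _default_factory(model_name: str, device: str, compute_type: str) -> Any:
    # Lazy import: the package is only required when the gate is on.
    # Assumption: WhisperModel is provided by faster_whisper.
    from faster_whisper import WhisperModel
    return WhisperModel(model_name, device=device, compute_type=compute_type)


def load_whisper(
    env: Mapping[str, str] = os.environ,
    factory: Callable[[str, str, str], Any] = _default_factory,
) -> Optional[Any]:
    """Return a loaded model, or None when the gate is off (the default)."""
    if not whisper_enabled(env):
        return None  # whisper_model stays None, so /transcribe keeps returning 503
    try:
        return factory("large-v3", "cuda", "float16")
    except (ValueError, RuntimeError):
        # Same cuda -> cpu fallback as the existing startup code.
        return factory("large-v3", "cpu", "int8")
```

In the lifespan startup this would reduce to `whisper_model = load_whisper()`, which keeps both the cuda and cpu branches behind the same check.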

## Part 3 — transcribe-to-stdout (net-new)
`agentwire/listen.py:208` `stop_recording(session, voice_prompt=True, type_at_cursor=False)` has two output branches: the `type_at_cursor` paste (303–341) and the default send-to-tmux (342–391). **Add a third `transcribe_only` branch right after `log(f"Transcribed: {text}")` at line 301**: print the raw `text` to stdout and `return 0`, before the `type_at_cursor` dispatch. The new branch inherits the `stt.backend: custom` + `stt.url` (`:8101`) requirement (`listen.py:275-283`).

CLI changes:
- `cmd_listen_stop` (`__main__.py:6561-6566`): add `transcribe_only=getattr(args, 'stdout', False)`.
- `listen stop` parser (`__main__.py:11246-11251`): add `--stdout`.
- `cmd_listen_toggle` (`__main__.py:6575-6582`) also calls `stop_recording`; decide whether `--stdout` is stop-only or also reachable from toggle.
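
The shape of the new branch and flag could look roughly like this. It is a standalone sketch, not the real code: `emit_transcript` and `build_parser` are hypothetical names, and the parser is a stand-in for the actual `listen stop` parser in `__main__.py`, whose surrounding arguments are not shown in this issue.

```python
import argparse
import sys
from typing import Optional, TextIO


def emit_transcript(text: str, transcribe_only: bool,
                    out: Optional[TextIO] = None) -> Optional[int]:
    """Print the raw transcript and return 0 when transcribe_only is set.

    Returns None otherwise, so the caller falls through to the existing
    type_at_cursor and send-to-tmux branches.
    """
    if not transcribe_only:
        return None
    print(text, file=out if out is not None else sys.stdout)
    return 0


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real `listen stop` parser.
    parser = argparse.ArgumentParser(prog="agentwire")
    sub = parser.add_subparsers(dest="command")
    listen = sub.add_parser("listen")
    listen_sub = listen.add_subparsers(dest="listen_command")
    stop = listen_sub.add_parser("stop")
    stop.add_argument("--stdout", action="store_true",
                      help="print the transcript to stdout instead of sending it")
    return parser
```

Reading the flag with `getattr(args, 'stdout', False)` keeps callers such as the toggle path working even if their parser never defines `--stdout`.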

## Part 4 — Hammerspoon reference (out-of-repo, docs only)
Toggle-based PTT + voice target-picker lives in `~/.hammerspoon/init.lua` (not version-controlled). Ship as a reference example (`docs/wiki/voice/hammerspoon-ptt.md` or `examples/`). The repo already accommodates Hammerspoon as an external caller (`listen.py:54-59` path fallbacks; `type_at_cursor` shells `hs -c` at 303–341). Bake in the gotchas the author found: `hs.chooser:choices()` is a setter (keep choices in a var); `hs.chooser:select(row)` fires completion + closes (use to auto-confirm); character-level fuzzy match (Levenshtein + per-word containment bonus) so STT typos match.
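
The fuzzy-match idea can be illustrated independently of Hammerspoon. The real picker lives in Lua in `init.lua`; the sketch below is in Python only to show the scoring, and the bonus weight, the normalisation, and the function names are assumptions rather than the author's actual values.

```python
from typing import Iterable


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance, computed with a rolling row."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def score(query: str, candidate: str, word_bonus: float = 0.25) -> float:
    """Edit-distance similarity plus a bonus for each query word found in the candidate."""
    q, c = query.lower().strip(), candidate.lower().strip()
    longest = max(len(q), len(c)) or 1
    base = 1.0 - levenshtein(q, c) / longest
    bonus = sum(word_bonus for word in q.split() if word in c)
    return base + bonus


def best_match(query: str, candidates: Iterable[str]) -> str:
    """Return the candidate with the highest score for the spoken query."""
    return max(candidates, key=lambda c: score(query, c))
```

With this scoring, an STT result that splits or misspells a target name still lands on the nearest candidate, which is the behaviour the voice target-picker needs.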

## Split recommendation
- **#366a — Whisper RAM gate** (`feature:tts`, `area:tech-debt`): tiny, self-contained, **no #365 dependency — ship first**.
- **#366b — moonshine host STT docs + transcribe-to-stdout** (`feature:stt`): docs for the already-shipped backend + the new `--stdout` path. Depends on #365.
- **#366c — Hammerspoon PTT reference** (`area:docs`, `feature:stt`): pure docs/example; depends on #366b's `--stdout`.

## Dependencies & follow-ups
- **Depends on #365** (the `stt.backend`/`stt.engine` split removes the `--backend moonshine` workaround) — except 366a, which is independent.
- **Follow-up:** launchd service for `:8101` (`stt start` runs in tmux; dies on reboot).
- **Follow-up:** `/health` 2s probe flake — `listen.py:111-117` fast-fails after a 2s health timeout before sending audio; on a cold server this spuriously reports "unavailable." Tune timeout / retry / drop the pre-probe.
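
One of the options for the `/health` follow-up, retrying with a widening timeout, might look like the sketch below. The helper name, attempt count, pause, and backoff factor are all assumptions; `probe` stands in for whatever `listen.py:111-117` does today and is injected so the policy can be tested without a server.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[float], bool],
    attempts: int = 3,
    timeout: float = 2.0,
    backoff: float = 2.0,
    pause: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe(timeout) up to `attempts` times, widening the timeout after each failure."""
    for attempt in range(attempts):
        if probe(timeout):
            return True
        if attempt < attempts - 1:
            sleep(pause)        # short pause so a cold server can finish starting
            timeout *= backoff  # give the next probe more time
    return False
```

A warm server still answers on the first 2s probe, so the common path is unchanged; only a cold start pays for the extra attempts.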

## Files
`agentwire/tts_server.py` (Part 2: 210–216 + import 38) · `agentwire/listen.py` (Part 3: `stop_recording` @208, hook after @301; `/health` @111-117 follow-up) · `agentwire/__main__.py` (Part 3: `cmd_listen_stop` @6561-6566, parser @11246-11251; Part 1 shipped surface @1740-1767/@10898-10911) · `agentwire/stt/engine.py` (Part 1, already supports moonshine — no edit) · docs/skills.

---
> **Code-review corrections (vs original):** (1) Part 1 is **not net-new** — `agentwire stt start --backend moonshine` + the backend already ship on `main`; Part 1's real work is docs + a launchd follow-up. (2) The Whisper load has a cuda→cpu fallback (`tts_server.py:212-215`) — the gate must wrap **both** branches. (3) `stop_recording` has **no** `transcribe_only` param today (signature `(session, voice_prompt=True, type_at_cursor=False)`) — Part 3 adds it. (4) Recommend splitting into 366a/b/c so the RAM win can land immediately without waiting on #365.

