Whisper RAM gate: stop TTS server unconditionally loading large-v3 (~3GB)

Carved from #366 (Part 2). Independent — **no dependency on #365**, ship first.

## Problem

`agentwire/tts_server.py` lifespan startup loads `WhisperModel("large-v3")` on **every boot, unconditionally** (~3GB+ RAM) — even when host STT is served elsewhere (moonshine on `:8101`) or not needed at all.

```
210  print("Loading Whisper model (large-v3)...")
211  try:
212      whisper_model = WhisperModel("large-v3", device="cuda", compute_type="float16")
213  except (ValueError, RuntimeError):
215      whisper_model = WhisperModel("large-v3", device="cpu", compute_type="int8")
216  print("Whisper model loaded!")
```

(import at `tts_server.py:38`; `whisper_model` initialized `None` at `:74`.) `/transcribe` (`:436-440`) already 503s when `whisper_model is None`; `/health` reports it (`:478`). So the off-state is already handled downstream — only the load itself is unconditional.
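As a rough illustration of why the off-state is safe, the downstream behavior described above can be modeled with plain functions. This is a sketch only: the real handlers at `:436-440` and `:478` are not reproduced here, and the helper names `transcribe_status` and `health_payload` and the `whisper_loaded` field are hypothetical.

```python
from typing import Any, Optional

# Module-level global; stays None when the startup load is skipped.
whisper_model: Optional[Any] = None


def transcribe_status(model: Optional[Any]) -> int:
    """Hypothetical stand-in for /transcribe: 503 when no model is loaded."""
    return 503 if model is None else 200


def health_payload(model: Optional[Any]) -> dict:
    """Hypothetical stand-in for /health: reports whether Whisper is available."""
    # The field name is an assumption, not the server's actual schema.
    return {"whisper_loaded": model is not None}
```

With `whisper_model` left at `None`, both helpers report the unavailable state without touching any model code, which mirrors the claim that only the load itself needs gating.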

## Fix (no compat — pre-launch)

Gate lines **210–216** (both the cuda branch and the cpu-fallback branch) behind env `AGENTWIRE_TTS_WHISPER` (default **off** → leave `whisper_model = None`). Optionally lazy-import `WhisperModel` so the TTS venv doesn't need faster-whisper installed when off.
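One possible shape for the gate, as a minimal sketch rather than the final patch. Assumptions: the helper names `whisper_enabled` and `load_whisper_if_enabled` are hypothetical, the set of truthy values beyond `1` is a guess, and the `factory` parameter exists only so the logic can be exercised without faster-whisper installed.

```python
import os
from typing import Any, Callable, Mapping, Optional

# Assumption: which strings count as "on". The issue only specifies "1".
_TRUTHY = {"1", "true", "yes", "on"}


def whisper_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when AGENTWIRE_TTS_WHISPER is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("AGENTWIRE_TTS_WHISPER", "").strip().lower() in _TRUTHY


def _default_factory(name: str, **kwargs: Any) -> Any:
    # Lazy import: faster-whisper is only required when the gate is on.
    from faster_whisper import WhisperModel
    return WhisperModel(name, **kwargs)


def load_whisper_if_enabled(
    env: Optional[Mapping[str, str]] = None,
    factory: Callable[..., Any] = _default_factory,
) -> Optional[Any]:
    """Load large-v3 (cuda first, then cpu fallback) only when the gate is on."""
    if not whisper_enabled(env):
        return None  # default: whisper_model stays None, nothing is loaded
    print("Loading Whisper model (large-v3)...")
    try:
        model = factory("large-v3", device="cuda", compute_type="float16")
    except (ValueError, RuntimeError):
        model = factory("large-v3", device="cpu", compute_type="int8")
    print("Whisper model loaded!")
    return model
```

In the lifespan startup this would reduce to something like `whisper_model = load_whisper_if_enabled()`, with the module-level import at `:38` dropped in favor of the lazy one.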

## Acceptance

- [ ] Default boot (no `AGENTWIRE_TTS_WHISPER`): TTS server starts with `whisper_model = None`, no large-v3 load, ~3GB RAM freed; `/transcribe` → 503, `/health` shows whisper unavailable.
- [ ] `AGENTWIRE_TTS_WHISPER=1`: large-v3 loads as before (cuda, then cpu fallback intact); `/transcribe` works.
- [ ] No other startup behavior changes.

## Files

- `agentwire/tts_server.py` — gate **210–216**, import **38**, init **74**.

