Deduplicate sources in LLM answers #12

## 🧠 Context

Right now the source URLs show up **twice** for every answer:

1. The system prompt (`src/completions/prompts.py`) tells the model to end its answer with its own `Sources:` block, so the model writes a list of URLs *into the answer text*.
2. `completion_service.ask()` *also* attaches the retrieved chunks' URLs to the structured `Answer.sources` field, and the renderers (the dev CLI today, the Discord bot) print that field underneath the answer text.

Net result: the answer text contains a `Sources:` list, and then a second `Sources:` list is printed right after it from `Answer.sources`. The two can even disagree.

This ticket removes the duplication by making the **structured `Answer.sources` field the single source of truth** and stopping the model from writing its own list. The user should see the sources exactly once — rendered from `Answer.sources`.

**Out of scope (deliberately):** deciding *which* of the retrieved URLs the model actually used. `Answer.sources` will continue to list **all** retrieved chunks' URLs (bounded to a small number by `TOP_K`). Occasionally a retrieved-but-not-very-relevant page may be listed — that is accepted for now. Refining the list down to only the pages the model actually used is a possible later effort and is **not** part of this ticket.

---

## 🛠 Implementation Plan

1. **`src/completions/prompts.py` — stop the model from listing sources.**
   * Remove the instruction that tells the model to end with a `Sources:` block.
   * Replace it with an explicit *negative* instruction, e.g. *"Do not list sources, URLs, or a 'Sources:' section. Just write the answer — the source links are attached automatically."*
   * **Keep** the `[Source: <url>]` markers in the context block (that's how the model *sees* which page each chunk came from — we're only changing whether it *echoes* them back).
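
A minimal sketch of what the prompt change in step 1 might look like. The names `SYSTEM_PROMPT` and `build_context` and the exact wording are assumptions for illustration; the real contents of `src/completions/prompts.py` are not shown in this ticket.

```python
# Hypothetical names and wording: the actual prompts.py may be structured differently.
SYSTEM_PROMPT = (
    "Answer the question using only the context below.\n"
    "Do not list sources, URLs, or a 'Sources:' section. "
    "Just write the answer; the source links are attached automatically."
)


def build_context(chunks: list[tuple[str, str]]) -> str:
    """Render (url, text) pairs, keeping the [Source: <url>] markers.

    The markers stay so the model can see which page each chunk came from;
    only the instruction to echo them back is removed.
    """
    return "\n\n".join(f"[Source: {url}]\n{text}" for url, text in chunks)
```

The context block is untouched by this ticket; only the instruction text changes from "list your sources" to "do not list them".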

2. **`src/completions/services/completion_service.py` — keep `Answer.sources` as-is, defend the text.**
   * Leave `Answer.sources = _extract_sources(chunks)` unchanged: all retrieved chunks' URLs, already deduped by `_extract_sources`. No change to `confidence` (`"high"` on this path).
   * As a safety net for the case where the model ignores the instruction and writes a `Sources:` block anyway, add a small pure helper that strips a **trailing, line-anchored** `Sources:` block from the model's text before it becomes `Answer.text`:
     * Match a header that is its **own line** — `^Sources:` with multiline matching — and keep everything **before the last** such line as the prose. If there's no such line, return the text unchanged.
     * This is line-anchored on purpose so the word "sources" appearing **mid-sentence** in normal prose does **not** truncate the answer.
   * **Important design property:** this strip only edits the *display text*. It must **never** feed back into `Answer.sources` (which is independently `_extract_sources(chunks)`). So even a wrong strip can only drop some trailing text — it can never change which sources are cited. That's what makes this safe, unlike trying to parse the model's block to decide the citation list.
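
The strip described in step 2 could be sketched roughly as follows. The function name `strip_trailing_sources` is hypothetical, and the pattern assumes a case-sensitive `Sources:` at the start of a line, as written in the plan above.

```python
import re

# Line-anchored header: "Sources:" must start a line (re.MULTILINE makes ^ match per line).
_SOURCES_HEADER = re.compile(r"^Sources:", re.MULTILINE)


def strip_trailing_sources(text: str) -> str:
    """Return the text before the last line-anchored 'Sources:' header.

    If no line starts with 'Sources:', the text is returned unchanged.
    """
    matches = list(_SOURCES_HEADER.finditer(text))
    if not matches:
        return text
    # Cut at the LAST header so earlier lines stay part of the prose,
    # then drop the blank lines that preceded the removed block.
    return text[: matches[-1].start()].rstrip()
```

For example, `strip_trailing_sources("Use the CLI.\n\nSources:\n- https://a")` returns `"Use the CLI."`, while `"Our sources: the docs."` comes back unchanged because the word is not at the start of a line.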

---

## 📝 Notes

* **Do not change `src/domain/types.py`.** The `Answer` field shapes already support this — you are changing what fills `Answer.text`, not the contract.
* **You are not removing the `sources` field from `Answer`.** Renderers print sources *from that field*; you're only removing the duplicate `Sources:` *text* the model embedded in its prose.
* The abstain path (no chunks retrieved) is unchanged — it returns before the model runs.
* The prompt instruction is the **primary** mechanism; the strip is a **fallback** for non-compliant model output. Unusual formats (e.g. an inline `Sources:` not on its own line) may slip through the strip — that's acceptable, because the worst case is occasional cosmetic duplication, never a wrong source list.
* Refining `Answer.sources` to only the pages the model actually used is intentionally left for a later effort; do not attempt URL parsing/matching here.
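
To illustrate why the strip cannot affect the citation list, here is a small sketch of the assembly step. The `Answer` dataclass, `extract_sources`, and `build_answer` are hypothetical stand-ins; the real `Answer` lives in `src/domain/types.py`, and the real `_extract_sources` takes chunks rather than bare URLs. Order-preserving dedupe is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Answer:  # hypothetical stand-in for the type in src/domain/types.py
    text: str
    sources: list[str]
    confidence: str


def extract_sources(chunk_urls: list[str]) -> list[str]:
    """Order-preserving dedupe; stands in for _extract_sources(chunks)."""
    return list(dict.fromkeys(chunk_urls))


def build_answer(
    model_text: str,
    chunk_urls: list[str],
    strip: Callable[[str], str],
) -> Answer:
    # The two fields have independent inputs: sources come only from
    # retrieval, and the strip only ever touches the display text.
    return Answer(
        text=strip(model_text),
        sources=extract_sources(chunk_urls),
        confidence="high",
    )
```

Even if `strip` misbehaves and returns an empty string, `Answer.sources` is still computed from the retrieved URLs alone, so the worst case stays cosmetic.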

---

## ✅ Acceptance Criteria

* The system prompt no longer instructs the model to list sources and explicitly instructs it **not** to.
* `Answer.sources` is unchanged in shape: the retrieved chunks' URLs, deduped, with no change to `src/domain/types.py`.
* The strip logic lives in a **pure helper** (input: model text → output: text with any trailing `Sources:` block removed) and is unit-tested offline (no Ollama or DB).
* Tests cover: a trailing line-anchored `Sources:` block is stripped; the word "sources" used mid-sentence in the prose is **not** stripped; text with no `Sources:` block is returned unchanged.
* End to end, the user sees the source list exactly once (from `Answer.sources`), not twice.
* `make test` and `make lint` pass.
