No way to measure detection accuracy against real ground-truth labels #4

## What's missing

Our project guidelines (`AGENTS.md`) say computer vision work in this project should be evaluated with standard accuracy metrics — precision, recall, F1, and IoU (intersection-over-union) for segmentation — and that we should validate model output against human inspection where possible.
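For reference, these are the standard definitions of those metrics (not quoted from `AGENTS.md`). Here $TP$, $FP$, and $FN$ are the counts of true positives, false positives, and false negatives, and $A$ and $B$ are a predicted region and a ground-truth region (boxes or masks):

```math
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}, \qquad
\text{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|}
```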

Right now we have unit tests that check the code behaves correctly on small synthetic examples (e.g. "does this function return the right shape"), but nothing that checks: "when we run the detector on a real, human-labeled inspection photo, how close does it get to what a person would have marked?"

There's already a `data/annotations/` folder reserved for ground-truth labels, but no script that compares model output against it.

## Why it matters

Without this, we have no objective way to tell whether a change to the detector (like the NMS change in issue #1) makes results better or worse on real images. We're relying on unit tests and manual eyeballing, which is a real gap for a tool used to flag well-integrity and safety-relevant failures.

## What needs to happen

Add a small evaluation script (e.g. `scripts/evaluate_detection.py`) that:
- Loads any labeled images from `data/annotations/`
- Runs the current detection pipeline on them
- Reports precision, recall, F1, and average IoU against the ground-truth boxes
- Prints a simple summary table

This doesn't need to be fancy — a clear, repeatable number we can watch over time is the goal.
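As a starting point, here is a minimal sketch of the metric core such a script might use. It leaves out loading annotations and running the detector, because neither the annotation format nor the pipeline API is specified in this issue. Everything named here is an assumption for illustration: the `(x_min, y_min, x_max, y_max)` box format, the `0.5` IoU threshold, the greedy one-to-one matching, and the helper names `box_iou`, `match_image`, and `summarize`. "Average IoU" is read as the mean IoU over matched pairs, which is one possible interpretation.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Counts:
    tp: int = 0
    fp: int = 0
    fn: int = 0
    iou_sum: float = 0.0  # summed IoU over matched (true-positive) pairs


def match_image(preds: Sequence[Box], truths: Sequence[Box],
                iou_threshold: float = 0.5) -> Counts:
    """Greedy one-to-one matching of predictions to ground truth, best IoU first."""
    pairs = sorted(
        ((box_iou(p, t), pi, ti)
         for pi, p in enumerate(preds)
         for ti, t in enumerate(truths)),
        reverse=True,
    )
    used_preds, used_truths = set(), set()
    counts = Counts()
    for iou, pi, ti in pairs:
        if iou < iou_threshold:
            break  # remaining pairs overlap too little to count as matches
        if pi in used_preds or ti in used_truths:
            continue
        used_preds.add(pi)
        used_truths.add(ti)
        counts.tp += 1
        counts.iou_sum += iou
    counts.fp = len(preds) - counts.tp   # unmatched predictions
    counts.fn = len(truths) - counts.tp  # missed ground-truth boxes
    return counts


def summarize(per_image: Sequence[Counts]) -> Dict[str, float]:
    """Pool counts over all images and compute the summary metrics."""
    tp = sum(c.tp for c in per_image)
    fp = sum(c.fp for c in per_image)
    fn = sum(c.fn for c in per_image)
    iou_sum = sum(c.iou_sum for c in per_image)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    mean_iou = iou_sum / tp if tp else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "mean_iou": mean_iou}


# Made-up boxes, only to show the call pattern and the summary table.
example = match_image(
    preds=[(0, 0, 10, 10), (20, 20, 30, 30)],
    truths=[(0, 0, 10, 10), (50, 50, 60, 60)],
)
for name, value in summarize([example]).items():
    print(f"{name:<10} {value:>6.3f}")
```

Pooling counts across images before computing precision and recall (micro-averaging) keeps images with many defects from being diluted by images with few. Per-image averaging is an equally valid choice if that suits the inspection workflow better.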
