What's wrong
Failure type classification (src/classification/rules.py) works by checking a list of rules in a fixed order and using whichever one matches first — there's no scoring across all rules to pick the best overall match. One of those rules, _MechanicalDamage, always returns True from matches(), so it silently catches anything that fell through every earlier rule.
Because the rules test hard thresholds (e.g. aspect_ratio > 3.5 for partial plugging, aspect_ratio > 2.0 and circularity < 0.40 for wire wrap), a defect that sits right at one of those boundaries can flip from one failure type to a completely different one after a tiny, possibly noise-level change in the measured shape. Nothing surfaces a "this was a close call between erosion hole and wire wrap" signal anywhere.
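A minimal sketch of the boundary flip, for illustration only. It is not the code in src/classification/rules.py: the Shape class, the rule order, and the label strings are assumptions, and only the two thresholds quoted above are taken from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    # Hypothetical container for the two measurements named in this issue.
    aspect_ratio: float
    circularity: float


# (label, predicate) pairs checked in order. The order and labels are assumed;
# the thresholds are the ones quoted in this issue.
RULES = [
    ("partial_plugging", lambda s: s.aspect_ratio > 3.5),
    ("wire_wrap", lambda s: s.aspect_ratio > 2.0 and s.circularity < 0.40),
    ("mechanical_damage", lambda s: True),  # unconditional catch-all
]


def classify_first_match(shape: Shape) -> str:
    """Return the label of the first rule whose predicate is true."""
    for label, predicate in RULES:
        if predicate(shape):
            return label
    raise AssertionError("unreachable: the catch-all always matches")


# Two nearly identical shapes on either side of the 3.5 threshold.
just_below = classify_first_match(Shape(aspect_ratio=3.49, circularity=0.35))
just_above = classify_first_match(Shape(aspect_ratio=3.51, circularity=0.35))
# A shape that no specific rule recognizes.
unrecognized = classify_first_match(Shape(aspect_ratio=1.2, circularity=0.9))

print(just_below, just_above, unrecognized)
```

Under these assumptions, a 0.02 change in aspect_ratio moves the result from wire wrap to partial plugging with no indication that the call was close, and the unrecognized shape is reported as mechanical damage.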
Why it matters
Failure type drives the engineering interpretation, root-cause hypothesis, and recommended actions shown on the Analysis and Assessment pages. If classification is unstable at rule boundaries, two near-identical defects could get reported as two completely different failure mechanisms, which undermines trust in the tool for engineering decisions — especially since there's currently no real model to fall back on, just these hand-written rules.
What needs to happen
- Replace first-match-wins with a scored approach: evaluate every rule's confidence() regardless of whether matches() is true, and pick the highest-confidence candidate (or expose the top 2 candidates when they're close, e.g. within 0.05 confidence, so the UI can show "could be X or Y").
- Consider removing the unconditional _MechanicalDamage catch-all in favor of letting _Unknown (already the existing fallback path) take over when nothing scores above a minimum confidence, so "mechanical damage" isn't used as a silent dumping ground for unrecognized shapes.
- Add regression tests for defects sitting near rule boundaries (e.g. aspect_ratio = 3.4 vs 3.6) to confirm the new scoring approach degrades gracefully instead of flipping outright.
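One possible shape for the scored approach, as a sketch under stated assumptions rather than a proposed implementation. The ramp() helper, the ramp widths, the 0.5 minimum confidence, the label strings, and the Shape class are all hypothetical; the real rules would supply their own confidence() values. Only the thresholds and the 0.05 tie margin come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Shape:
    # Hypothetical container for the two measurements named in this issue.
    aspect_ratio: float
    circularity: float


def ramp(value: float, threshold: float, width: float) -> float:
    """Soft threshold: 0.5 at `threshold`, clamped to [0, 1] at +/- `width`."""
    return min(1.0, max(0.0, 0.5 + (value - threshold) / (2.0 * width)))


# Assumed stand-ins for each rule's confidence(); the widths are made up.
SCORERS: Dict[str, Callable[[Shape], float]] = {
    "partial_plugging": lambda s: ramp(s.aspect_ratio, 3.5, 0.5),
    "wire_wrap": lambda s: min(
        ramp(s.aspect_ratio, 2.0, 0.5),
        1.0 - ramp(s.circularity, 0.40, 0.1),  # lower circularity scores higher
    ),
}


def classify_scored(
    shape: Shape, min_confidence: float = 0.5, tie_margin: float = 0.05
) -> List[str]:
    """Score every rule; return the best label, or the top 2 when they are close."""
    ranked = sorted(
        ((label, score(shape)) for label, score in SCORERS.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    best_confidence = ranked[0][1]
    if best_confidence < min_confidence:
        return ["unknown"]  # explicit fallback instead of a silent catch-all
    close = [
        label
        for label, confidence in ranked
        if confidence >= min_confidence
        and best_confidence - confidence <= tie_margin
    ]
    return close[:2]


# Boundary cases in the spirit of the regression tests suggested above.
below = classify_scored(Shape(aspect_ratio=3.4, circularity=0.38))
above = classify_scored(Shape(aspect_ratio=3.6, circularity=0.38))
nothing = classify_scored(Shape(aspect_ratio=1.0, circularity=0.9))

print(below, above, nothing)
```

With these made-up ramps, crossing aspect_ratio = 3.5 does not swap one label for another: the 3.4 case reports wire wrap alone, while the 3.6 case reports both wire wrap and partial plugging as close candidates. A shape that nothing scores highly falls through to unknown. The regression tests for the real rules would need to assert whatever behavior the chosen confidence functions are meant to guarantee.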