Classification rules are first-match-wins with an always-true catch-all, so small feature changes can flip the failure type entirely #6

@djimrastephane

What's wrong

Failure type classification (src/classification/rules.py) checks a list of rules in a fixed order and uses whichever one matches first; nothing scores all rules against each other to pick the best overall match. One of those rules, _MechanicalDamage, always returns True from matches(), so it silently catches anything that falls through every earlier rule.
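For readers who have not opened rules.py, the control flow described above looks roughly like the sketch below. This is a reconstruction from the description in this issue, not a copy of the file: the Features container, the _PartialPlugging and _WireWrap class names, the name attributes, and the rule order are all assumptions. Only the thresholds and the always-true _MechanicalDamage.matches() come from the issue text.

```python
from dataclasses import dataclass


@dataclass
class Features:
    # Hypothetical feature container; the real one in rules.py may differ.
    aspect_ratio: float
    circularity: float


class _PartialPlugging:  # hypothetical class name
    name = "partial_plugging"

    def matches(self, f: Features) -> bool:
        return f.aspect_ratio > 3.5  # threshold quoted in this issue


class _WireWrap:  # hypothetical class name
    name = "wire_wrap"

    def matches(self, f: Features) -> bool:
        return f.aspect_ratio > 2.0 and f.circularity < 0.40


class _MechanicalDamage:
    name = "mechanical_damage"

    def matches(self, f: Features) -> bool:
        return True  # unconditional catch-all


# Assumed order; the actual order in rules.py is not stated in this issue.
RULES = [_PartialPlugging(), _WireWrap(), _MechanicalDamage()]


def classify(f: Features) -> str:
    # First-match-wins: later rules are never consulted once one matches.
    for rule in RULES:
        if rule.matches(f):
            return rule.name
    return "unknown"  # unreachable while the catch-all is in the list
```

In this sketch a shape that matches no specific rule, e.g. `classify(Features(1.0, 0.9))`, comes back as "mechanical_damage" rather than "unknown".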

Because the rules test hard thresholds (e.g. aspect_ratio > 3.5 for partial plugging, aspect_ratio > 2.0 and circularity < 0.40 for wire wrap), a defect sitting right at one of those boundaries can flip from one failure type to a completely different one after a tiny, possibly noise-level change in the measured shape. Nothing surfaces a "this was a close call between erosion hole and wire wrap" signal anywhere.
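To make the boundary flip concrete, here is a minimal sketch that uses only the thresholds quoted above. The rule order, the flat function shape, and the sample measurements are assumptions for illustration, not the actual contents of rules.py.

```python
def classify(aspect_ratio: float, circularity: float) -> str:
    """First-match-wins over the thresholds quoted above (order is assumed)."""
    if aspect_ratio > 3.5:
        return "partial_plugging"
    if aspect_ratio > 2.0 and circularity < 0.40:
        return "wire_wrap"
    return "mechanical_damage"  # always-true catch-all


# Two measurements of what is plausibly the same defect, differing by noise.
a = classify(aspect_ratio=3.49, circularity=0.35)
b = classify(aspect_ratio=3.51, circularity=0.35)
print(a, b)  # wire_wrap partial_plugging

# The same effect on the circularity boundary, this time landing in the catch-all.
c = classify(aspect_ratio=2.5, circularity=0.39)
d = classify(aspect_ratio=2.5, circularity=0.41)
print(c, d)  # wire_wrap mechanical_damage
```

In both pairs the inputs differ by 0.02, yet the reported mechanism changes and the caller gets no hint that the decision was marginal.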

Why it matters

Failure type drives the engineering interpretation, root-cause hypothesis, and recommended actions shown on the Analysis and Assessment pages. If classification is unstable at rule boundaries, two near-identical defects could be reported as two completely different failure mechanisms. That undermines trust in the tool for engineering decisions, especially since there is currently no real model to fall back on, only these hand-written rules.

What needs to happen

  • Replace first-match-wins with a scored approach: evaluate every rule's confidence() regardless of whether matches() is true, and pick the highest-confidence candidate (or expose the top 2 candidates when they're close, e.g. within 0.05 confidence, so the UI can show "could be X or Y").
  • Consider removing the unconditional _MechanicalDamage catch-all in favor of letting _Unknown (already the existing fallback path) take over when nothing scores above a minimum confidence, so "mechanical damage" isn't used as a silent dumping ground for unrecognized shapes.
  • Add regression tests for defects sitting near rule boundaries (e.g. aspect_ratio = 3.4 vs 3.6) to confirm the new scoring approach degrades gracefully instead of flipping outright.
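One possible shape for the three points above, as a rough sketch rather than a proposed final design. Everything here is hypothetical: the logistic ramps stand in for whatever each rule's confidence() actually computes, and MIN_CONFIDENCE, the ramp widths, and the SCORERS table are placeholder choices. Only the thresholds and the 0.05 closeness margin come from this issue.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Features:
    # Hypothetical feature container; the real one in rules.py may differ.
    aspect_ratio: float
    circularity: float


def _soft_gt(value: float, threshold: float, width: float) -> float:
    """Smooth stand-in for `value > threshold`; equals 0.5 at the threshold."""
    return 1.0 / (1.0 + math.exp(-(value - threshold) / width))


def _soft_lt(value: float, threshold: float, width: float) -> float:
    """Smooth stand-in for `value < threshold`."""
    return _soft_gt(-value, -threshold, width)


# Hypothetical confidence functions built around the thresholds in this issue.
SCORERS: Dict[str, Callable[[Features], float]] = {
    "partial_plugging": lambda f: _soft_gt(f.aspect_ratio, 3.5, 0.25),
    "wire_wrap": lambda f: (
        _soft_gt(f.aspect_ratio, 2.0, 0.25) * _soft_lt(f.circularity, 0.40, 0.05)
    ),
}

MIN_CONFIDENCE = 0.30  # assumed floor below which the result is "unknown"
CLOSE_MARGIN = 0.05    # "close call" margin suggested in this issue


def classify_scored(f: Features) -> List[Tuple[str, float]]:
    """Score every rule and return the best candidate, plus the runner-up if close."""
    scored = sorted(
        ((name, fn(f)) for name, fn in SCORERS.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    best_conf = scored[0][1]
    if best_conf < MIN_CONFIDENCE:
        # Nothing is convincing: fall back to unknown instead of a catch-all type.
        return [("unknown", best_conf)]
    return [(n, c) for n, c in scored[:2] if best_conf - c <= CLOSE_MARGIN]


# Boundary regression check: confidence should move smoothly across 3.5.
lo = SCORERS["partial_plugging"](Features(3.4, 0.9))
hi = SCORERS["partial_plugging"](Features(3.6, 0.9))
assert abs(hi - lo) < 0.25
```

With these placeholder ramps, a defect exactly on both boundaries (aspect_ratio 3.5, circularity 0.40) returns both partial_plugging and wire_wrap so the UI could show "could be X or Y", while a shape far from every rule returns only "unknown". The ramp widths control how wide the "close call" zone is and would need tuning against real measurement noise.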

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)
