feat: MICE estimator selection via NonlinearityTag + Unpredictable all-fallback #158

@DEVunderdog

Parent

#91

What to build

Replace the hardcoded BayesianRidge inside the MICE fitting block with adaptive estimator selection driven by NonlinearityTag. Collect the tag from each MICE column's NumericStats, take the most-complex tag across the block (ComplexNonlinear > MonotonicNonlinear > Linear > Unpredictable), and pass it with n_rows to RegressionEstimatorFactory. If every MICE column is Unpredictable, skip the MICE block entirely and fall each column back to Median via _fallback_to_median. Record the chosen estimator name and the driving NonlinearityTag in every MICE column's ColumnImputationRecord.signals.
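A minimal sketch of the tag-precedence and all-Unpredictable checks described above. The `NonlinearityTag` enum here is a stand-in defined for illustration only; the project's real enum, its member names, and the helper names `most_complex_tag` and `should_skip_mice` are assumptions, not the actual implementation.

```python
from enum import Enum
from typing import Iterable


class NonlinearityTag(Enum):
    """Stand-in for the project's NonlinearityTag (member names are assumed)."""

    UNPREDICTABLE = "Unpredictable"
    LINEAR = "Linear"
    MONOTONIC_NONLINEAR = "MonotonicNonlinear"
    COMPLEX_NONLINEAR = "ComplexNonlinear"


# Precedence from the issue text:
# ComplexNonlinear > MonotonicNonlinear > Linear > Unpredictable
_TAG_RANK = {
    NonlinearityTag.UNPREDICTABLE: 0,
    NonlinearityTag.LINEAR: 1,
    NonlinearityTag.MONOTONIC_NONLINEAR: 2,
    NonlinearityTag.COMPLEX_NONLINEAR: 3,
}


def most_complex_tag(tags: Iterable[NonlinearityTag]) -> NonlinearityTag:
    """Return the highest-precedence tag across the MICE block (hypothetical helper)."""
    tags = list(tags)
    if not tags:
        raise ValueError("MICE block has no columns")
    return max(tags, key=_TAG_RANK.__getitem__)


def should_skip_mice(tags: Iterable[NonlinearityTag]) -> bool:
    """True when every MICE column is Unpredictable, i.e. fall back to Median."""
    tags = list(tags)
    return bool(tags) and all(t is NonlinearityTag.UNPREDICTABLE for t in tags)


block_tags = [
    NonlinearityTag.LINEAR,
    NonlinearityTag.COMPLEX_NONLINEAR,
    NonlinearityTag.UNPREDICTABLE,
]
driving_tag = most_complex_tag(block_tags)
```

Because Unpredictable ranks lowest, the skip condition is equivalent to the driving tag itself being Unpredictable, so a real implementation could check either one.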

This is the foundation slice: the slices for dynamic max_iter/tol, initial_strategy, and n_nearest_features all build on the parameterised IterativeImputer construction introduced here.

Acceptance criteria

  • MICE fitting block constructs IterativeImputer with the estimator returned by RegressionEstimatorFactory instead of the default BayesianRidge
  • Most-complex-tag precedence is correctly applied: ComplexNonlinear > MonotonicNonlinear > Linear > Unpredictable
  • When all MICE columns are Unpredictable: no MICE model is stored; each column's record is updated to Median strategy via _fallback_to_median with a clear signal
  • Every MICE column's signals contains an entry recording the estimator name (e.g. "mice_estimator: GradientBoostingRegressor (tag=ComplexNonlinear)")
  • Every MICE column's signals contains an entry recording the driving NonlinearityTag
  • GradientBoostingRegressor is selected for ComplexNonlinear blocks above gradient_boost_min_rows; RandomForestRegressor for ComplexNonlinear blocks at or below that threshold
  • Unit test: MICE block with at least one ComplexNonlinear column produces a non-linear estimator signal on all MICE column records
  • Unit test: MICE block where all columns are Unpredictable produces no MICE model and each column's record shows Median strategy with a fallback signal
  • Numpy-style docstrings are present on all in-scope symbols touched by this change
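
The ComplexNonlinear threshold rule and the signal format in the criteria above could be sketched as follows. This is illustrative only: both function names are hypothetical, the value of gradient_boost_min_rows is passed in because the issue does not state it, and the other tags are left out because the issue does not say which estimator they map to. The real RegressionEstimatorFactory would return estimator instances rather than names.

```python
def complex_nonlinear_estimator_name(n_rows: int, gradient_boost_min_rows: int) -> str:
    """Pick the estimator class name for a ComplexNonlinear block (hypothetical helper).

    Strictly above the threshold selects gradient boosting; at or below it
    selects a random forest, as stated in the acceptance criteria.
    """
    if n_rows > gradient_boost_min_rows:
        return "GradientBoostingRegressor"
    return "RandomForestRegressor"


def mice_estimator_signal(estimator_name: str, tag_name: str) -> str:
    """Format the signal entry using the example shape from the criteria."""
    return f"mice_estimator: {estimator_name} (tag={tag_name})"


# Example with an assumed threshold of 1000 rows (not a value from the issue).
name = complex_nonlinear_estimator_name(n_rows=5000, gradient_boost_min_rows=1000)
signal = mice_estimator_signal(name, "ComplexNonlinear")
```

The boundary case (n_rows equal to the threshold) is the one most worth covering in a unit test, since "at or below" places it on the RandomForestRegressor side.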

Blocked by
