Summary
ScanIterator.compareValues(Object a, Object b) catches ClassCastException and returns 0. The canPruneChunk arms prune only when compareValues(...) is strictly < 0 or > 0, so a 0 means "cannot prune". As a result, a filter whose value type does not match the stored zone-map stat type silently prunes nothing and the whole column is decoded. No error, no log: results stay correct, but the performance loss is invisible.
Where
reader/src/main/java/io/github/dfa1/vortex/reader/ScanIterator.java
    private static int compareValues(Object a, Object b) {
        try {
            return ((Comparable<Object>) a).compareTo(b);
        } catch (ClassCastException _) {
            return 0; // <-- mismatch => 0 => canPruneChunk yields false => no pruning
        }
    }
Integer zone stats decode as Long (ArrayStats.decodeScalar returns int64_value()), and float stats as Double. A caller that builds a RowFilter with an Integer (or Float) value against an integer (or F32) column therefore gets zero pruning even though the predicate is perfectly valid: the compareTo call throws ClassCastException on the mismatched boxed type, and the catch block swallows it.
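The mismatch can be reproduced outside the reader. The following is a minimal standalone sketch, not code from the repository: the class name CompareMismatchDemo is made up for illustration, and the method body copies the shape of compareValues quoted above (with a named catch parameter so it compiles on older JDKs).

```java
// Hypothetical demo class; only the shape of compareValues comes from the issue.
final class CompareMismatchDemo {

    @SuppressWarnings("unchecked")
    static int compareValues(Object a, Object b) {
        try {
            return ((Comparable<Object>) a).compareTo(b);
        } catch (ClassCastException e) {
            return 0; // mismatch is swallowed and reported as "equal"
        }
    }

    public static void main(String[] args) {
        Object statMin = Long.valueOf(100L);      // zone stat decoded as Long
        Object filterInt = Integer.valueOf(5);    // filter boxed at I32 width
        Object filterLong = Long.valueOf(5L);     // same value, coerced to Long

        // Long.compareTo(Object) casts its argument to Long, so an Integer
        // argument raises ClassCastException, which the catch turns into 0.
        System.out.println(compareValues(statMin, filterInt));  // 0
        // With matching boxed types the comparison works as intended.
        System.out.println(compareValues(statMin, filterLong)); // positive
    }
}
```

The two calls compare the same numeric values; only the boxed type of the second argument differs, which is exactly the situation described above.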
How it surfaced
While building the Calcite adapter's filter push-down (feat/vortex-calcite-demo), a WHERE date BETWEEN ... on an I32 column pruned 0 of 100 chunks until the literal was coerced from Integer to Long. The filter, expansion, and canPruneChunk path were all correct. Only the boxed type differed, and the swallowed exception hid it.
Impact
- Silent performance cliff: a valid, selective filter degrades to a full scan with no signal.
- Easy to hit from any caller that boxes filter values at the column's "natural" width (Integer for I32, Float for F32) rather than the stat's storage width (Long / Double).
Suggested fix (pick one)
- Normalise numeric comparison in compareValues: if both args are Number but different boxed types, compare via Long/Double (or BigDecimal) instead of returning 0. Most robust; makes pruning width-agnostic.
- Coerce on RowFilter construction: normalise integer values to Long and float values to Double when a RowFilter is built, matching the stat storage types.
- At minimum, do not return 0 on mismatch: a genuine type mismatch (e.g. comparing a string filter to a numeric column) should be a clear error or a logged "pruning skipped", not a silent no-op.
The first option is preferred: it fixes every caller, not just the careful ones.
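A rough sketch of what normalising inside compareValues might look like is given below. It is an illustration under assumptions, not the reader's implementation: the class name NumericCompare and the helper isIntegral are invented here. Mixed integral/floating pairs go through Double.compare, which can lose precision for long values beyond 2^53; a production version might prefer BigDecimal for that case, as the list above suggests.

```java
// Hypothetical sketch of width-agnostic comparison; names are illustrative.
final class NumericCompare {

    // True for the boxed integral types that widen to long without loss.
    static boolean isIntegral(Number n) {
        return n instanceof Byte || n instanceof Short
                || n instanceof Integer || n instanceof Long;
    }

    @SuppressWarnings("unchecked")
    static int compareValues(Object a, Object b) {
        if (a instanceof Number && b instanceof Number) {
            Number x = (Number) a;
            Number y = (Number) b;
            if (isIntegral(x) && isIntegral(y)) {
                // Integer vs Long and similar: exact comparison at 64 bits.
                return Long.compare(x.longValue(), y.longValue());
            }
            // Float vs Double (exact), or mixed integral/floating
            // (approximate for very large longs; see the note above).
            return Double.compare(x.doubleValue(), y.doubleValue());
        }
        try {
            return ((Comparable<Object>) a).compareTo(b);
        } catch (ClassCastException e) {
            // Non-numeric mismatch: unchanged from the current behaviour.
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(compareValues(Long.valueOf(100L), Integer.valueOf(5)) > 0);    // true
        System.out.println(compareValues(Float.valueOf(1.0f), Double.valueOf(2.5)) < 0); // true
    }
}
```

The non-numeric branch deliberately keeps the existing swallow so the sketch changes one thing at a time; turning that branch into an error or a logged "pruning skipped" is the separate third item in the list.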
Notes
The Calcite adapter worked around this by coercing integer literals to Long (commit on feat/vortex-calcite-demo), so it is unblocked — but the underlying reader trap should be fixed so the next caller does not rediscover it.
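For callers that need the same workaround until the reader is fixed, the coercion can be sketched as a small helper applied to each filter value before it is handed to RowFilter. This is a guess at the general shape, not the adapter's actual commit: the class FilterValues and the method toStatType are hypothetical names, and how the value is then passed to RowFilter is left out because that API is not shown in this issue.

```java
// Hypothetical caller-side helper; not taken from the Calcite adapter.
final class FilterValues {

    // Widen a boxed filter value to the type the zone-map stats decode to:
    // integral values become Long, Float becomes Double, everything else
    // is returned unchanged.
    static Object toStatType(Object value) {
        if (value instanceof Byte || value instanceof Short || value instanceof Integer) {
            return Long.valueOf(((Number) value).longValue());
        }
        if (value instanceof Float) {
            return Double.valueOf(((Float) value).doubleValue());
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toStatType(Integer.valueOf(5)).getClass().getSimpleName()); // Long
        System.out.println(toStatType(Float.valueOf(1.5f)).getClass().getSimpleName()); // Double
    }
}
```

This only protects callers that remember to use it, which is why fixing compareValues itself remains the preferred route.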