explorer: multi-tree facet filtering does N membership scans (slow in WASM at scale) — combine into one scan / cube #293

@rdhyee

Follow-up to #281/#282/#291 (facet trees). When several hierarchical dims (material/context/object_type) are selected at once, the table filter (facetFilterSQL) and the live count engine each issue one membership pid-subquery per selected tree dim, AND-ed together. Each subquery is a full sample_facet_membership scan, so N selected tree dims means N scans. That is fast natively (~0.1s, per Codex) but slow in DuckDB-WASM at scale, especially when not bbox-pruned (the global view, or the cross-filter 'other-dim' subqueries, which aren't bbox-scoped).

Single-dim filtering (the common case) is fast and shipped. This issue covers the multi-dim edge case.

Options:

  • Table filter: collapse the N membership subqueries into ONE scan: pid IN (SELECT pid FROM sample_facet_membership WHERE (facet_type='material' AND concept_uri IN (...)) OR (facet_type='context' AND ...) OR ... GROUP BY pid HAVING COUNT(DISTINCT facet_type) = <#selected dims>).
  • Live counts: the per-dim cross-filter 'other-dim' subqueries also scan full membership; same collapse, and/or a precomputed facet_tree_cross_filter cube (also the global-view count follow-up).
  • Measure WASM latency; set the same p95 budget.
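
A minimal sketch of the table-filter collapse, to make the first option concrete. It is not the explorer's actual facetFilterSQL: the helper names (`collapsed_membership_filter`, `per_dim_membership_filter`, `run_demo`), the `samples` table, and the toy rows are all hypothetical, and Python's stdlib sqlite3 stands in for DuckDB-WASM purely to check that the single-scan form returns the same pids as the N AND-ed subqueries. It says nothing about WASM latency, which still needs measuring.

```python
import sqlite3


def per_dim_membership_filter(selected):
    """Current shape (as described above): one membership subquery per dim, AND-ed."""
    clauses, params = [], []
    for facet_type, uris in selected.items():
        if not uris:
            continue
        marks = ", ".join("?" for _ in uris)
        clauses.append(
            "pid IN (SELECT pid FROM sample_facet_membership "
            f"WHERE facet_type = ? AND concept_uri IN ({marks}))"
        )
        params.append(facet_type)
        params.extend(uris)
    return (" AND ".join(clauses) or "1=1"), params


def collapsed_membership_filter(selected):
    """Proposed shape: a single membership scan for all selected dims."""
    dims = {ft: uris for ft, uris in selected.items() if uris}
    if not dims:
        return "1=1", []
    branches, params = [], []
    for facet_type, uris in dims.items():
        marks = ", ".join("?" for _ in uris)
        branches.append(f"(facet_type = ? AND concept_uri IN ({marks}))")
        params.append(facet_type)
        params.extend(uris)
    # A pid qualifies only if it matched in every selected dim, hence the
    # DISTINCT count over facet_type rather than a plain row count.
    sql = (
        "pid IN (SELECT pid FROM sample_facet_membership WHERE "
        + " OR ".join(branches)
        + " GROUP BY pid HAVING COUNT(DISTINCT facet_type) = ?)"
    )
    params.append(len(dims))
    return sql, params


def run_demo():
    """Run both filters over toy data and return (collapsed, per_dim) pid lists."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE samples (pid TEXT)")  # hypothetical table
    con.execute(
        "CREATE TABLE sample_facet_membership "
        "(pid TEXT, facet_type TEXT, concept_uri TEXT)"
    )
    con.executemany("INSERT INTO samples VALUES (?)", [(f"s{i}",) for i in range(1, 6)])
    con.executemany(
        "INSERT INTO sample_facet_membership VALUES (?, ?, ?)",
        [
            ("s1", "material", "m1"), ("s1", "context", "c1"),
            ("s2", "material", "m1"), ("s2", "material", "m2"),  # two hits, one dim
            ("s3", "context", "c1"),
            ("s4", "material", "m2"), ("s4", "context", "c1"),
            ("s4", "object_type", "o1"),
            ("s5", "material", "m3"), ("s5", "context", "c1"),
        ],
    )
    selected = {"material": ["m1", "m2"], "context": ["c1"]}
    results = []
    for build in (collapsed_membership_filter, per_dim_membership_filter):
        where, params = build(selected)
        rows = con.execute(
            f"SELECT pid FROM samples WHERE {where} ORDER BY pid", params
        ).fetchall()
        results.append([r[0] for r in rows])
    con.close()
    return tuple(results)
```

In the toy data, `s2` matches two material concepts but no context, so it is the row that distinguishes `COUNT(DISTINCT facet_type)` from a plain `COUNT(*)`: it must be excluded by both forms.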

Until then: single-dim tree filtering is the fast path; multi-tree works (correct results) but can be slow at scale.
