issues Search Results · language:Dune language:Python language:Java language:JavaScript language:Python language:Ruby
RRF scores are ordinal and not comparable across queries with different candidate set sizes, but may be misread as
relevance probabilities.
Consider surfacing rank instead of (or alongside) raw RRF score ...
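A minimal sketch of what surfacing rank alongside the raw score could look like, assuming the standard Reciprocal Rank Fusion formula, score(d) = sum over rankers of 1 / (k + rank). The function name `rrf_fuse`, the constant `k=60`, and the sample document IDs are illustrative assumptions, not taken from the project.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    `k=60` is a commonly used default, assumed here for illustration.
    Returns rows that carry the fused rank next to the raw score, so a
    UI can show the rank instead of a number that looks like a probability.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by doc_id so output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [
        {"rank": position, "doc_id": doc_id, "rrf_score": score}
        for position, (doc_id, score) in enumerate(ordered, start=1)
    ]


# Hypothetical rankings from a sparse and a dense retriever.
bm25_ranking = ["p1", "p2", "p3"]
dense_ranking = ["p2", "p1", "p4"]
fused = rrf_fuse([bm25_ranking, dense_ranking])
```

With two rankers and `k=60`, no score can exceed 2/61, which illustrates why the raw value reads poorly as a relevance probability while the rank stays interpretable.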
HyDE generates a hypothetical abstract from the raw query alone. Ambiguous/short queries can produce abstracts that skew
dense retrieval toward the wrong direction.
Explore query clarification, multi-sample ...
Describe the feature
Currently the best solution is to mixin the method that constructs the pair of lists. The better way to do it would
likely be to query a tag, and for each item in that tag take 1× ...
enhancement
The regex tokenizer (lowercased word-boundary alphanumeric matching) mishandles tokens like GPT-4, BERT, and math/Greek
symbols, reducing BM25 recall for queries involving model names or equations.
Improve ...
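The tokenizer problem can be illustrated with a small sketch. The baseline pattern below is only a guess at what "lowercased word-boundary alphanumeric matching" looks like in the project, and the alternative pattern is one possible direction, not the project's fix. Both function names are hypothetical.

```python
import re

# Assumed baseline: ASCII alphanumeric runs between word boundaries.
BASELINE = re.compile(r"\b[a-z0-9]+\b")

# One possible alternative: Unicode letters/digits (no underscore), with
# internal hyphens kept so that names like "gpt-4" survive as one token.
KEEP_COMPOUNDS = re.compile(r"[^\W_]+(?:-[^\W_]+)*")


def tokenize_baseline(text):
    """Lowercase, then keep ASCII alphanumeric runs only."""
    return BASELINE.findall(text.lower())


def tokenize_keep_compounds(text):
    """Lowercase, then keep hyphenated compounds and non-ASCII letters."""
    return KEEP_COMPOUNDS.findall(text.lower())


sample = "GPT-4 uses α-divergence"
baseline_tokens = tokenize_baseline(sample)
compound_tokens = tokenize_keep_compounds(sample)
```

On this sample the baseline splits `GPT-4` into `gpt` and `4` and drops `α` entirely, while the alternative keeps `gpt-4` and `α-divergence` whole. Whether whole-compound tokens actually improve BM25 recall would need to be checked against the project's own queries, since the index and the query must be tokenized the same way.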
all-MiniLM-L6-v2 (22M params) may underperform on notation-heavy math/physics queries. Benchmark larger models (e.g.
all-mpnet-base-v2, text-embedding-3-large) against the gold query set for retrieval ...
ChromaDB's query() has no native ID-set filtering, so RAGLR-A over-fetches (5x top_n or the candidate set size) and
post-filters. For very large candidate sets this may not retrieve all relevant papers ...
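The over-fetch and post-filter pattern, and the way it can miss candidates, can be sketched without any vector-store dependency. Everything here is an assumption for illustration: `search_fn` stands in for whatever call returns a ranked list of `(doc_id, distance)` pairs, and taking the larger of `5 * top_n` and the candidate set size is one reading of the issue text, not confirmed behaviour.

```python
def query_within_candidates(search_fn, query, candidate_ids, top_n, overfetch=5):
    """Over-fetch from an unfiltered search, then keep only allowed IDs.

    `search_fn(query, n)` is a hypothetical callable returning the n
    nearest hits as (doc_id, distance) pairs, best first.
    """
    allowed = set(candidate_ids)
    # Assumed rule: fetch the larger of overfetch * top_n and the candidate count.
    n_fetch = max(overfetch * top_n, len(allowed))
    hits = search_fn(query, n_fetch)
    kept = [(doc_id, dist) for doc_id, dist in hits if doc_id in allowed]
    return kept[:top_n]


# Toy index: ten documents, d0 is nearest to every query, d9 is farthest.
TOY_INDEX = [(f"d{i}", i) for i in range(10)]


def toy_search(query, n):
    """Stand-in for a vector store query; ignores the query text."""
    return TOY_INDEX[:n]


near = query_within_candidates(toy_search, "q", {"d1", "d8"}, top_n=1)
far = query_within_candidates(toy_search, "q", {"d8", "d9"}, top_n=1)
```

In the toy run, `near` finds `d1` inside the five fetched hits, while `far` comes back empty because both allowed documents rank below the fetch window. That is the failure mode the issue describes: candidates that sit far down the global ranking are never fetched, so post-filtering cannot recover them.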
Categories are stored on each Paper record but are not used for retrieval filtering. Consider using categories (with
cross-listing awareness) as an optional filter or ranking signal.
See docs/LIMITATIONS.md, ...
Perform weekly maintenance tasks including:
- Code refactoring
- Dependency updates
- Performance optimization
- Code cleanup
- Documentation updates
- Testing improvements
needs-triage
The corpus only reflects arXiv as of the last harvest; new preprints don't appear until the next incremental run. Set up
a recurring update_arxiv_data.py --incremental run (cron/scheduled task) to keep ...
