I've been a maintainer of IResearch (Apache 2.0) since 2015. It's the C++ search core inside ArangoDB, but it's been largely invisible to the wider C++ community.
We recently decoupled it and ran it through the search-benchmark-game created by the Tantivy maintainers. It's currently winning on every query type (term, phrase, intersection, union) for both count and top-k.
Benchmark methodology: 60s warmup, single-threaded execution, median of 10 runs, fixed random seed, query cache disabled. The benchmark is reproducible: clone, run `make bench`, get the same numbers.
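For anyone who wants to apply the same "median of N runs" idea to their own measurements, here is a minimal sketch. It is not the harness used by search-benchmark-game; `Median` and `MedianMicros` are hypothetical helper names made up for this example, and warmup is left out for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Median of a non-empty sample. Takes a copy because nth_element reorders.
inline double Median(std::vector<double> v) {
  assert(!v.empty());
  const auto mid = v.begin() + static_cast<std::ptrdiff_t>(v.size() / 2);
  std::nth_element(v.begin(), mid, v.end());
  if (v.size() % 2 == 1) return *mid;
  // Even count: average the two middle values.
  const double lower = *std::max_element(v.begin(), mid);
  return (lower + *mid) / 2.0;
}

// Runs fn `runs` times and returns the median wall-clock time in microseconds.
// Hypothetical helper: a real harness would also warm up and pin the thread.
template <typename Fn>
double MedianMicros(Fn&& fn, std::size_t runs = 10) {
  assert(runs > 0);
  std::vector<double> samples;
  samples.reserve(runs);
  for (std::size_t i = 0; i < runs; ++i) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    samples.push_back(
        std::chrono::duration<double, std::micro>(stop - start).count());
  }
  return Median(std::move(samples));
}
```

The median is less sensitive than the mean to a single slow run caused by scheduling noise, which is presumably why the benchmark reports it.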
The gains come from five places:
- Vectorized scoring (AVX2)
- `std::nth_element` instead of a priority queue for result collection (TOP_K, TOP_K_COUNT)
- Adaptive block posting compression
- Lazy sparse query evaluation (e.g. phrase, conjunctions)
- No JVM overhead
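To illustrate the `std::nth_element` point, here is a rough sketch of a buffered top-k collector. This is an assumption about the general technique, not IResearch's actual code: `Hit`, `Better`, and `TopKCollector` are hypothetical names invented for this example. Hits are appended to a buffer of capacity 2k; when it fills up, one `std::nth_element` pass keeps the best k and records a score threshold that lets later weak hits be rejected with a single comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical hit type for illustration only.
struct Hit {
  float score;
  std::uint32_t doc;
};

// Higher score first; ties broken by lower doc id for deterministic output.
inline bool Better(const Hit& a, const Hit& b) {
  return a.score > b.score || (a.score == b.score && a.doc < b.doc);
}

class TopKCollector {
 public:
  explicit TopKCollector(std::size_t k) : k_(k) {
    assert(k > 0);
    buf_.reserve(2 * k);
  }

  void Collect(const Hit& hit) {
    // Once a threshold is known, skip hits that cannot enter the top k.
    if (has_threshold_ && !Better(hit, threshold_)) return;
    buf_.push_back(hit);
    if (buf_.size() == 2 * k_) Prune();
  }

  // Returns at most k hits, best first.
  std::vector<Hit> Finish() {
    if (buf_.size() > k_) Prune();
    std::sort(buf_.begin(), buf_.end(), Better);
    return buf_;
  }

 private:
  void Prune() {
    // Partition so the k best hits occupy [0, k), in unspecified order.
    const auto kth = buf_.begin() + static_cast<std::ptrdiff_t>(k_ - 1);
    std::nth_element(buf_.begin(), kth, buf_.end(), Better);
    buf_.erase(kth + 1, buf_.end());
    threshold_ = buf_.back();  // the k-th best hit seen so far
    has_threshold_ = true;
  }

  std::size_t k_;
  std::vector<Hit> buf_;
  Hit threshold_{};
  bool has_threshold_ = false;
};
```

Compared with a binary heap, which pays O(log k) per accepted hit, this does a plain append per hit plus an average O(k) selection every k appends, so the amortized cost per hit is constant and the memory access pattern is sequential.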
Interactive results: https://serenedb.com/search-benchmark-game
If you're building something in C++ that needs search, IResearch is embeddable today. Happy to help you get started.
Repo: https://github.com/serenedb/serenedb/tree/main/libs/iresearch
Update: Tantivy has published the results in their repo: https://tantivy-search.github.io/bench/