IResearch (C++ search engine lib) outperforms Lucene and Tantivy on every query type in the search-benchmark-game


I've been a maintainer of IResearch (Apache 2.0) since 2015. It's the C++ search core inside ArangoDB, but it's been largely invisible to the wider C++ community.

We recently decoupled it and ran it through the search-benchmark-game created by the Tantivy maintainers. It's currently winning on every query type (term, phrase, intersection, union) for both count and top-k.
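For readers unfamiliar with the benchmark's categories, here is a minimal sketch of what intersection, union, count, and top-k mean over sorted posting lists. This is not IResearch's API or implementation. The names `Postings`, `count_intersection`, `count_union`, and `top_k` are hypothetical, and real engines use compressed postings, skip data, and scoring-aware pruning instead of this naive merge.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical types for illustration only (not IResearch types).
using DocId = std::uint32_t;
using Postings = std::vector<DocId>;    // doc ids, sorted ascending, unique
using Scored = std::pair<float, DocId>; // (score, doc id)

// COUNT of an intersection (AND) query: two-pointer merge of sorted lists.
std::size_t count_intersection(const Postings& a, const Postings& b) {
  assert(std::is_sorted(a.begin(), a.end()));
  assert(std::is_sorted(b.begin(), b.end()));
  std::size_t i = 0, j = 0, matches = 0;
  while (i < a.size() && j < b.size()) {
    if (a[i] < b[j]) {
      ++i;
    } else if (b[j] < a[i]) {
      ++j;
    } else {  // same document appears in both lists
      ++matches;
      ++i;
      ++j;
    }
  }
  return matches;
}

// COUNT of a union (OR) query via inclusion-exclusion.
std::size_t count_union(const Postings& a, const Postings& b) {
  return a.size() + b.size() - count_intersection(a, b);
}

// TOP-K: keep the k best-scoring hits in a min-heap, return best first.
std::vector<Scored> top_k(const std::vector<Scored>& hits, std::size_t k) {
  std::priority_queue<Scored, std::vector<Scored>, std::greater<Scored>> heap;
  for (const Scored& hit : hits) {
    if (heap.size() < k) {
      heap.push(hit);
    } else if (k > 0 && heap.top() < hit) {
      heap.pop();  // evict the current worst of the kept hits
      heap.push(hit);
    }
  }
  std::vector<Scored> best;
  best.reserve(heap.size());
  while (!heap.empty()) {
    best.push_back(heap.top());
    heap.pop();
  }
  std::reverse(best.begin(), best.end());  // heap pops worst first
  return best;
}
```

A count query only needs the number of matches, while a top-k query also has to score and rank them. That is why the benchmark reports the two modes separately.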

Benchmark methodology: 60s warmup, single-threaded execution, median of 10 runs, fixed random seed, query cache disabled. The benchmark is reproducible: clone, run `make bench`, get the same numbers.
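To show what "warmup, then median of N runs" looks like in practice, here is a rough single-threaded sketch. It is not the actual search-benchmark-game harness. The function names `median` and `median_latency_us` are hypothetical, and the callable `run_query` stands in for whatever executes one query.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Median of a non-empty sample set (mean of the two middle values if even).
double median(std::vector<double> samples) {
  assert(!samples.empty());
  std::sort(samples.begin(), samples.end());
  const std::size_t mid = samples.size() / 2;
  return samples.size() % 2 == 1
             ? samples[mid]
             : (samples[mid - 1] + samples[mid]) / 2.0;
}

// Hypothetical harness: warm up for a fixed wall-clock duration, then time
// `runs` executions of `run_query` on the calling thread and return the
// median latency in microseconds.
template <typename Query>
double median_latency_us(Query&& run_query,
                         std::chrono::milliseconds warmup,
                         std::size_t runs) {
  using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
  assert(runs > 0);

  // Warmup phase: results are discarded, only caches and branch
  // predictors are meant to benefit.
  const auto deadline = Clock::now() + warmup;
  while (Clock::now() < deadline) {
    run_query();
  }

  // Measurement phase.
  std::vector<double> samples;
  samples.reserve(runs);
  for (std::size_t i = 0; i < runs; ++i) {
    const auto start = Clock::now();
    run_query();
    const auto stop = Clock::now();
    samples.push_back(
        std::chrono::duration<double, std::micro>(stop - start).count());
  }
  return median(std::move(samples));
}
```

With the settings described above, a call would look like `median_latency_us(query, std::chrono::seconds(60), 10)`. The median is used instead of the mean so a single slow run (a page fault, a scheduler hiccup) does not skew the result.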

The gains come from three places:

Interactive results: https://serenedb.com/search-benchmark-game

If you're building something in C++ that needs search, IResearch is embeddable today. Happy to help you get started.

Repo: https://github.com/serenedb/serenedb/tree/main/libs/iresearch

Upd: Tantivy published results to their repo https://tantivy-search.github.io/bench/

submitted by /u/mr_gnusi to r/cpp