Vector Search Benchmarks

Benchmark system for comparing CAGRA (GPU) vs Lucene HNSW (CPU) vector search algorithms.

Setup

Prerequisites:
- JDK 22+
- CUDA libraries
- Python 3.7+
- Python packages: pip install pyyaml matplotlib numpy click pandas
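
The Python side of the prerequisites above can be checked before starting a run. This is a minimal sketch, not part of the repository: the helper names `missing_modules` and `python_version_ok` are hypothetical, and the import names are assumed to match the pip packages listed (with `pyyaml` imported as `yaml`).

```python
import importlib.util
import sys

# Import names for the pip packages listed in the prerequisites
# (assumption: pyyaml is imported as "yaml", the others under their pip names).
REQUIRED_MODULES = ["yaml", "matplotlib", "numpy", "click", "pandas"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_version_ok(minimum=(3, 7)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    if not python_version_ok():
        print("Python 3.7+ is required")
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All required Python modules were found")
```

The check only covers the Python requirements; the JDK and CUDA libraries still need to be verified separately.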

Set library paths:

export LD_LIBRARY_PATH="/path/to/cuvs/build:/path/to/cuda/lib64:/path/to/conda/lib:$LD_LIBRARY_PATH"
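
A mistyped entry in LD_LIBRARY_PATH tends to surface only later as a library loading failure, so it can help to confirm that every entry is a real directory. The sketch below is one way to do that; `missing_library_dirs` is a hypothetical helper, not something the repository provides.

```python
import os


def missing_library_dirs(path_value):
    """Return the entries of an os.pathsep-separated string that are not directories."""
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]


if __name__ == "__main__":
    # Report every LD_LIBRARY_PATH entry that does not exist as a directory.
    for entry in missing_library_dirs(os.environ.get("LD_LIBRARY_PATH", "")):
        print("Not a directory:", entry)
```

Run it in the same shell where the export was issued, since the variable is read from the current environment.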

Run benchmark

cuVS-Lucene benchmarks

./run_sweep.sh --data-dir /data2/vsbench-datasets --datasets datasets.json --sweeps sweeps.json --configs-dir configs --results-dir results --run-benchmarks

Solr benchmarks

./run_sweep.sh --data-dir /data2/vsbench-datasets --datasets datasets.json --mode solr --sweeps solr-sweeps.json --configs-dir configs --results-dir results --run-benchmarks

This mode builds Apache Solr from its main branch and then runs the benchmarks.
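
The two invocations above differ only in the sweep file and the --mode flag, which makes them easy to assemble from a script. The following is a sketch under that assumption: `build_sweep_command` is a hypothetical helper, and it uses only the flags and default values that appear in the commands above.

```python
def build_sweep_command(data_dir, sweeps="sweeps.json", mode=None,
                        datasets="datasets.json", configs_dir="configs",
                        results_dir="results"):
    """Assemble the run_sweep.sh argument list using only the flags shown above."""
    command = ["./run_sweep.sh", "--data-dir", data_dir, "--datasets", datasets]
    if mode is not None:
        # --mode is only passed for the Solr benchmarks in the examples above.
        command += ["--mode", mode]
    command += ["--sweeps", sweeps, "--configs-dir", configs_dir,
                "--results-dir", results_dir, "--run-benchmarks"]
    return command


if __name__ == "__main__":
    # Print both invocations rather than executing them.
    print(" ".join(build_sweep_command("/data2/vsbench-datasets")))
    print(" ".join(build_sweep_command("/data2/vsbench-datasets",
                                       sweeps="solr-sweeps.json", mode="solr")))
```

To execute a command instead of printing it, the returned list can be passed to `subprocess.run(command, check=True)` from the repository root.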

Adding Datasets

Edit datasets.json to add an entry for the new dataset.

Creating Sweeps

Edit sweeps.json, or copy it and edit the copy.
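
Hand-edited JSON is easy to break with a stray comma, so it is worth confirming that a copied sweep file still parses before starting a long run. The sketch below makes no assumptions about the sweep schema and only checks that the file is valid JSON. The helper names are hypothetical, as is the file name `my-sweeps.json`.

```python
import json
import shutil
from pathlib import Path


def copy_sweep_file(source, destination):
    """Copy a sweep file and return the copy's path; refuse to overwrite."""
    source, destination = Path(source), Path(destination)
    if destination.exists():
        raise FileExistsError(f"{destination} already exists")
    shutil.copyfile(source, destination)
    return destination


def load_json_or_report(path):
    """Parse `path` as JSON; return (data, None) or (None, error message)."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle), None
    except json.JSONDecodeError as error:
        return None, f"{path}: line {error.lineno}, column {error.colno}: {error.msg}"


if __name__ == "__main__":
    # Validate the sweep file in the current directory, if there is one.
    if Path("sweeps.json").exists():
        _, problem = load_json_or_report("sweeps.json")
        print(problem or "sweeps.json parses as JSON")
```

A copy created with `copy_sweep_file("sweeps.json", "my-sweeps.json")` can then be passed to run_sweep.sh through the --sweeps flag.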

Visualization

run_sweep.sh already calls run_pareto_analysis.sh, but the script can also be run on its own. Example:

./run_pareto_analysis.sh 3cNWY5 wiki10m

Serve the web UI on port 8000:

cd web-ui-new; python3 -m http.server
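
The same static server can be started from Python without changing directory, which is convenient when the web UI is launched from another script. This is a sketch using only the standard library; `make_webui_server` is a hypothetical helper, and the `directory` argument of `SimpleHTTPRequestHandler` requires Python 3.7 or newer.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_webui_server(directory="web-ui-new", port=8000):
    """Create (but do not start) an HTTP server that serves files from `directory`."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("", port), handler)
```

Calling `make_webui_server().serve_forever()` from the repository root serves web-ui-new on port 8000 until interrupted.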
