
Evaluation of bm42 sparse indexing algorithm


BubbleCal/bm42_eval

 
 


BM42 vs BM25 benchmark

Introduction

Download dataset:

wget https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/quora.zip
mkdir -p data
mv quora.zip data/

cd data
unzip quora.zip
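After unzipping, BEIR datasets ship as corpus.jsonl, queries.jsonl and qrels/test.tsv. As an illustration of the qrels format, here is a minimal parsing sketch (the `load_qrels` helper is hypothetical, not part of this repo):

```python
import csv
import io

def load_qrels(tsv_text):
    """Parse a BEIR qrels TSV (columns: query-id, corpus-id, score)
    into {query_id: {doc_id: relevance}}."""
    qrels = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        qrels.setdefault(row["query-id"], {})[row["corpus-id"]] = int(row["score"])
    return qrels
```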

Install dependencies

pip install -r requirements.txt

(Note: for GPU inference, see fastembed.)

BM25 (updated)

The BM25 version uses the tantivy library for indexing and search.

python index_bm25.py

python evaluate-bm25.py
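tantivy handles the actual scoring; for reference, here is a minimal pure-Python sketch of the classic BM25 formula with common default parameters (k1=1.2, b=0.75) — an illustration, not the repo's code:

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_tokens`
    with the classic BM25 ranking function."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency of each query term across the corpus.
    df = {t: sum(1 for d in docs if t in d) for t in set(query_tokens)}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        s = 0.0
        for t in query_tokens:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(s)
    return scores
```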

Results we got:

Total hits: 12065 out of 15675, which is 0.7696969696969697
Precision: 0.12065
Average precision: 0.12065
Average recall: 0.8952571817831299
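A sketch of how hit rate, average precision and average recall of this kind are typically computed over top-k results (`eval_retrieval` is a hypothetical helper, not the repo's evaluation script):

```python
def eval_retrieval(results, qrels, k=10):
    """results: {query_id: [doc_id, ...]} ranked lists;
    qrels: {query_id: {doc_id: relevance}} ground truth."""
    hits = total_relevant = 0
    precisions, recalls = [], []
    for qid, retrieved in results.items():
        relevant = set(qrels.get(qid, {}))
        found = sum(1 for d in retrieved[:k] if d in relevant)
        hits += found
        total_relevant += len(relevant)
        precisions.append(found / k)
        recalls.append(found / len(relevant) if relevant else 0.0)
    return {
        "hit_rate": hits / total_relevant,       # "Total hits: X out of Y"
        "avg_precision": sum(precisions) / len(precisions),
        "avg_recall": sum(recalls) / len(recalls),
    }
```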

BM25 with sparse vectors

Additionally, we compare a pure sparse-vector implementation of BM25. It uses exactly the same tokenizer and stemmer as BM42, which makes for a fairer comparison.

# Run qdrant
docker run --rm -d --network=host qdrant/qdrant:v1.10.0

python index_bm25_qdrant.py

python evaluate-bm25-qdrant.py
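In the sparse-vector formulation, each document becomes an (indices, values) pair, and matching reduces to a dot product over shared indices. A minimal sketch (deriving token ids via crc32 is an assumption for illustration; the repo uses BM42's tokenizer):

```python
import zlib

def to_sparse(tokens, weights):
    """Map each token to a stable integer index (crc32 here, an assumption)
    and accumulate its weight, giving Qdrant-style (indices, values) arrays."""
    vec = {}
    for tok, w in zip(tokens, weights):
        idx = zlib.crc32(tok.encode())
        vec[idx] = vec.get(idx, 0.0) + w
    indices = sorted(vec)
    return indices, [vec[i] for i in indices]

def sparse_dot(a, b):
    """Dot product of two (indices, values) sparse vectors:
    only indices present in both contribute."""
    da = dict(zip(*a))
    return sum(v * da.get(i, 0.0) for i, v in zip(*b))
```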

Results we got:

Total hits: 11151 out of 15675, which is 0.7113875598086125
Precision: 0.11151
Average precision: 0.1115100000000054
Average recall: 0.8321873943359426

BM42 with all-MiniLM-L6-v2 as a backbone

BM42 uses the fastembed implementation for inference, and Qdrant for indexing and search. IDF values are calculated inside Qdrant.

# Run qdrant
docker run --rm -d --network=host qdrant/qdrant:v1.10.0

python index_bm42.py

python evaluate-bm42.py
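Conceptually, BM42 replaces term frequency with transformer attention: each token's weight is its [CLS] attention score multiplied by the corpus IDF. A toy sketch under that assumption (`bm42_weights` is hypothetical; in the repo the attention comes from the model via fastembed and the IDF from Qdrant):

```python
def bm42_weights(tokens, cls_attention, idf):
    """Combine per-token [CLS] attention with corpus IDF: weight = attention * idf.
    `cls_attention` and `idf` are assumed precomputed inputs."""
    agg = {}
    # Sum attention over repeated occurrences of the same token.
    for tok, att in zip(tokens, cls_attention):
        agg[tok] = agg.get(tok, 0.0) + att
    return {tok: att * idf.get(tok, 0.0) for tok, att in agg.items()}
```

Rare, attention-heavy tokens end up dominating the sparse vector, which is the intended behavior: frequent filler words get low IDF and thus low weight.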

Results we got:

Total hits: 11488 out of 15675, which is 0.7328867623604466
Precision: 0.11488
Average precision: 0.11488000000000238
Average recall: 0.8515208038970792
