Commit

Benchmark milvus (#850)
* Add milvus benchmarking support

* Add latest docstring and tutorial changes

* Edit config

* Disable docker interactive mode

* Add milvus index type support

* Adjust FAISS and Milvus node branching

* Remove duplicate in config

* Revert method for speedup

* Add latest docstring and tutorial changes

* Add latest benchmark run

* Add latest docstring and tutorial changes

* Add json files

* Revert "Add latest docstring and tutorial changes"

This reverts commit e2efa5f.

* Add latest docstring and tutorial changes

* Revert "Add latest docstring and tutorial changes"

This reverts commit b085a67.

* Fix typo

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
brandenchan and github-actions[bot] committed Apr 13, 2021
1 parent b87daed commit 77d4c2c
Showing 13 changed files with 269 additions and 136 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -149,6 +149,7 @@ tutorials/cache
tutorials/mlruns
tutorials/model
models
saved_models
*_build

.DS_Store
5 changes: 3 additions & 2 deletions docs/_src/api/api/document_store.md
@@ -827,8 +827,9 @@ the vector embeddings are indexed in a FAISS Index.
Recommended options:
- "Flat" (default): Best accuracy (= exact). Becomes slow and RAM intense for > 1 Mio docs.
- "HNSW": Graph-based heuristic. If not further specified,
we use a RAM intense, but more accurate config:
HNSW256, efConstruction=256 and efSearch=256
we use the following config:
HNSW64, efConstruction=80 and efSearch=20

- "IVFx,Flat": Inverted Index. Replace x with the number of centroids aka nlist.
Rule of thumb: nlist = 10 * sqrt (num_docs) is a good starting point.
For more details see:
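The `nlist` rule of thumb quoted in the docstring hunk above can be written out as a small helper. This is only a minimal sketch: `suggest_nlist` and `suggest_ivf_index_string` are hypothetical names introduced here for illustration and are not part of this commit, Haystack, or FAISS. The helper does nothing more than build the `IVFx,Flat` string that the docstring describes.

```python
import math


def suggest_nlist(num_docs: int) -> int:
    """Rule of thumb from the docstring: nlist = 10 * sqrt(num_docs)."""
    if num_docs < 1:
        raise ValueError("num_docs must be a positive integer")
    # Round to the nearest integer and never return fewer than one centroid.
    return max(1, round(10 * math.sqrt(num_docs)))


def suggest_ivf_index_string(num_docs: int) -> str:
    """Build an "IVFx,Flat" string, with x replaced by the suggested nlist."""
    return f"IVF{suggest_nlist(num_docs)},Flat"


if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        print(n, suggest_ivf_index_string(n))
```

The result is a starting point only, as the docstring itself says; the right number of centroids depends on the data and should be checked against a benchmark.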
103 changes: 73 additions & 30 deletions docs/_src/benchmarks/retriever_map.json
@@ -8,7 +8,10 @@
"BM25 / ElasticSearch",
"DPR / ElasticSearch",
"DPR / FAISS (flat)",
"DPR / FAISS (HSNW)"
"DPR / FAISS (HNSW)",
"DPR / Milvus (flat)",
"DPR / Milvus (HNSW)"

],
"axis": [
{
@@ -17,50 +20,55 @@
}
],
"data": [
{
"model": "DPR / ElasticSearch",
"n_docs": 1000,
"map": 92.95105322830888
},
{
"model": "DPR / ElasticSearch",
"n_docs": 10000,
"map": 89.87097014904354
"map": 89.87097014904356
},
{
"model": "DPR / ElasticSearch",
"model": "BM25 / ElasticSearch",
"n_docs": 100000,
"map": 86.54564090434241
"map": 56.259591531012504
},
{
"model": "BM25 / ElasticSearch",
"n_docs": 10000,
"map": 66.33019927857616
},
{
"model": "DPR / ElasticSearch",
"n_docs": 500000,
"map": 80.86137228234089
"n_docs": 1000,
"map": 92.95105322830891
},
{
"model": "BM25 / ElasticSearch",
"n_docs": 1000,
"map": 74.20444712972909
},
{
"model": "BM25 / ElasticSearch",
"n_docs": 10000,
"map": 66.20627317806674
"model": "DPR / ElasticSearch",
"n_docs": 100000,
"map": 86.54606328368973
},
{
"model": "BM25 / ElasticSearch",
"n_docs": 100000,
"map": 56.25959153101251
"n_docs": 500000,
"map": 45.60339705629754
},
{
"model": "BM25 / ElasticSearch",
"model": "DPR / ElasticSearch",
"n_docs": 500000,
"map": 45.59452709000341
"map": 80.86137228234091
},
{
"model": "DPR / FAISS (flat)",
"n_docs": 1000,
"map": 92.95105322830888
"map": 92.95105322830891
},
{
"model": "DPR / FAISS (flat)",
"n_docs": 500000,
"map": 80.86137228234091
},
{
"model": "DPR / FAISS (flat)",
@@ -70,32 +78,67 @@
{
"model": "DPR / FAISS (flat)",
"n_docs": 100000,
"map": 86.54606328368972
"map": 86.54606328368973
},
{
"model": "DPR / FAISS (flat)",
"model": "DPR / FAISS (HNSW)",
"n_docs": 10000,
"map": 89.49563682134192
},
{
"model": "DPR / FAISS (HNSW)",
"n_docs": 100000,
"map": 84.33419639513305
},
{
"model": "DPR / FAISS (HNSW)",
"n_docs": 500000,
"map": 80.8613722823409
"map": 75.73315903145605
},
{
"model": "DPR / FAISS (HNSW)",
"n_docs": 1000,
"map": 92.95105322830891
},
{
"model": "DPR / FAISS (HSNW)",
"model": "DPR / Milvus (flat)",
"n_docs": 1000,
"map": 92.95105322830888
"map": 92.95105322830891
},
{
"model": "DPR / FAISS (HSNW)",
"model": "DPR / Milvus (flat)",
"n_docs": 10000,
"map": 89.69941373746582
"map": 89.87097014904354
},
{
"model": "DPR / FAISS (HSNW)",
"model": "DPR / Milvus (flat)",
"n_docs": 100000,
"map": 85.07984377595874
"map": 86.54606328368973
},
{
"model": "DPR / Milvus (flat)",
"n_docs": 500000,
"map": 80.86137228234091
},
{
"model": "DPR / Milvus (HNSW)",
"n_docs": 1000,
"map": 92.95105322830891
},
{
"model": "DPR / Milvus (HNSW)",
"n_docs": 10000,
"map": 89.87097014904354
},
{
"model": "DPR / FAISS (HSNW)",
"model": "DPR / Milvus (HNSW)",
"n_docs": 500000,
"map": 76.91475821598232
"map": 74.85616575291942
},
{
"model": "DPR / Milvus (HNSW)",
"n_docs": 100000,
"map": 86.54606328368973
}
]
}
44 changes: 29 additions & 15 deletions docs/_src/benchmarks/retriever_performance.json
@@ -22,32 +22,46 @@
},
"data": [
{
"model": "DPR / ElasticSearch",
"model": "BM25 / ElasticSearch",
"n_docs": 100000,
"index_speed": 69.75508852811794,
"query_speed": 4.5992769354707805,
"map": 86.54564090434241
"index_speed": 485.5602670200369,
"query_speed": 165.51512861040828,
"map": 56.259591531012504
},
{
"model": "BM25 / ElasticSearch",
"model": "DPR / ElasticSearch",
"n_docs": 100000,
"index_speed": 482.9993330442806,
"query_speed": 162.42378943468643,
"map": 56.25959153101251
"index_speed": 71.36964873196698,
"query_speed": 5.355677072083696,
"map": 86.54606328368973
},
{
"model": "DPR / FAISS (flat)",
"n_docs": 100000,
"index_speed": 95.52108545730724,
"query_speed": 6.511162294559942,
"map": 86.54606328368972
"index_speed": 100.01184910084558,
"query_speed": 6.624479268751268,
"map": 86.54606328368973
},
{
"model": "DPR / FAISS (HNSW)",
"n_docs": 100000,
"index_speed": 89.90389306648805,
"query_speed": 40.68196225525062,
"map": 84.33419639513305
},
{
"model": "DPR / Milvus (flat)",
"n_docs": 100000,
"index_speed": 116.00982709720004,
"query_speed": 28.30393009791128,
"map": 86.54606328368973
},
{
"model": "DPR / FAISS (HSNW)",
"model": "DPR / Milvus (HNSW)",
"n_docs": 100000,
"index_speed": 84.11829911061136,
"query_speed": 33.65729082116796,
"map": 85.07984377595874
"index_speed": 115.61076852516383,
"query_speed": 28.076443272229284,
"map": 86.54606328368973
}
]
}
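To read the 100,000-document entries above side by side, the following minimal sketch computes the query-speed ratio and the MAP difference between two entries. The numbers are copied from the updated values in this file. The helper name `compare` is hypothetical, and the sketch assumes that `query_speed` is a throughput figure where higher is better, since the file itself does not state units.

```python
# Updated 100,000-document values copied from retriever_performance.json above.
ROWS = {
    "DPR / FAISS (flat)": {"query_speed": 6.624479268751268, "map": 86.54606328368973},
    "DPR / FAISS (HNSW)": {"query_speed": 40.68196225525062, "map": 84.33419639513305},
    "DPR / Milvus (flat)": {"query_speed": 28.30393009791128, "map": 86.54606328368973},
    "DPR / Milvus (HNSW)": {"query_speed": 28.076443272229284, "map": 86.54606328368973},
}


def compare(baseline: str, candidate: str) -> tuple[float, float]:
    """Return (query_speed ratio, MAP difference) of candidate relative to baseline."""
    base, cand = ROWS[baseline], ROWS[candidate]
    speed_ratio = cand["query_speed"] / base["query_speed"]
    map_delta = cand["map"] - base["map"]
    return speed_ratio, map_delta


if __name__ == "__main__":
    for store in ("FAISS", "Milvus"):
        ratio, delta = compare(f"DPR / {store} (flat)", f"DPR / {store} (HNSW)")
        print(f"{store}: HNSW vs flat query speed x{ratio:.2f}, MAP change {delta:+.2f}")
```

Under that assumption, the FAISS pair shows roughly a sixfold higher query speed for HNSW at a cost of about 2.2 MAP points, while the two Milvus entries are nearly identical on both measures.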