dropdown for pros and cons documentstore

deepset-ai · PiffPaffM · Sep 23, 2021 · Sep 15, 2021 · Sep 15, 2021 · Sep 15, 2021
commit cad755f45c8ccbb76c6562b3b736a5b4ef4ac0c7
diff --git a/docs/latest/components/document_store.mdx b/docs/latest/components/document_store.mdx
@@ -194,94 +194,117 @@ Having GPU acceleration will significantly speed this up.
 
 The Document Stores have different characteristics. You should choose one depending on the maturity of your project, the use case and technical environment:
 
-### Elasticsearch
-
-**Pros:**
-
-- Fast & accurate sparse retrieval with many tuning options
-- Basic support for dense retrieval
-- Production-ready
-- Support also for Open Distro
-
-**Cons:**
-
-- Slow for dense retrieval with more than ~ 1 Mio documents
-
-<div style={{ marginBottom: "3rem" }} />
-
-### Milvus
-
-**Pros:**
-
-- Scalable DocumentStore that excels at handling vectors (hence suited to dense retrieval methods like DPR)
-- Encapsulates multiple ANN libraries (e.g. FAISS and ANNOY) and provides added reliability
-- Runs as a separate service (e.g. a Docker container)
-- Allows dynamic data management
-
-**Cons:**
-
-- No efficient sparse retrieval
-
-<div style={{ marginBottom: "3rem" }} />
-
-### FAISS
-
-**Pros:**
-
-- Fast & accurate dense retrieval
-- Highly scalable due to approximate nearest neighbour algorithms (ANN)
-- Many options to tune dense retrieval via different index types (more info [here](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index))
-
-**Cons:**
-
-- No efficient sparse retrieval
-
-<div style={{ marginBottom: "3rem" }} />
-
-### In Memory
-
-**Pros:**
-
-- Simple
-- Exists already in many environments
-
-**Cons:**
-
-- Only compatible with minimal TF-IDF Retriever
-- Bad retrieval performance
-- Not recommended for production
-
-### SQL
-
-<div style={{ marginBottom: "3rem" }} />
-
-**Pros:**
-
-- Simple & fast to test
-- No database requirements
-- Supports MySQL, PostgreSQL and SQLite
-
-**Cons:**
-
-- Not scalable
-- Not persisting your data on disk
-
-<div style={{ marginBottom: "3rem" }} />
-
-### Weaviate
-
-**Pros:**
-
-- Simple vector search
-- Stores everything in one place: documents, meta data and vectors - so less network overhead when scaling this up
-- Allows combination of vector search and scalar filtering, i.e. you can filter for a certain tag and do dense retrieval on that subset
-
-**Cons:**
-
-- Less options for ANN algorithms than FAISS or Milvus
-- No BM25 / Tf-idf retrieval
-
-<div style={{ marginBottom: "3rem" }} />
+<Disclosures
+ options={[
+ {
+ title: "Elasticsearch",
+ content: (
+ <div>
+ <strong>Pros:</strong>
+ <ul>
+ <li>Fast & accurate sparse retrieval with many tuning options</li>
+ <li>Basic support for dense retrieval</li>
+ <li>Production-ready</li>
+ <li>Also supports Open Distro</li>
+ </ul>
+ <strong>Cons:</strong>
+ <ul>
+ <li>Slow for dense retrieval with more than ~1 million documents</li>
+ </ul>
+ </div>
+ )
+ },
+ {
+ title: "Milvus",
+ content: (
+ <div>
+ <strong>Pros:</strong>
+ <ul>
+ <li>Scalable DocumentStore that excels at handling vectors (hence suited to dense retrieval methods like DPR)</li>
+ <li>Encapsulates multiple ANN libraries (e.g. FAISS and ANNOY) and provides added reliability</li>
+ <li>Runs as a separate service (e.g. a Docker container)</li>
+ <li>Allows dynamic data management</li>
+ </ul>
+ <strong>Cons:</strong>
+ <ul>
+ <li>No efficient sparse retrieval</li>
+ </ul>
+ </div>
+ )
+ },
+ {
+ title: "FAISS",
+ content: (
+ <div>
+ <strong>Pros:</strong>
+ <ul>
+ <li>Fast & accurate dense retrieval</li>
+ <li>Highly scalable due to approximate nearest neighbour algorithms (ANN)</li>
+ <li>Many options to tune dense retrieval via different index types (more info <a href="https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index">here</a>)</li>
+ </ul>
+ <strong>Cons:</strong>
+ <ul>
+ <li>No efficient sparse retrieval</li>
+ </ul>
+ </div>
+ )
+ },
+ {
+ title: "In Memory",
+ content: (
+ <div>
+ <strong>Pros:</strong>
+ <ul>
+ <li>Simple</li>
+ <li>Exists already in many environments</li>
+ </ul>
+ <strong>Cons:</strong>
+ <ul>
+ <li>Only compatible with minimal TF-IDF Retriever</li>
+ <li>Bad retrieval performance</li>
+ <li>Not recommended for production</li>
+ </ul>
+ </div>
+ )
+ },
+ {
+ title: "SQL",
+ content: (
+ <div>
+ <strong>Pros:</strong>
+ <ul>
+ <li>Simple & fast to test</li>
+ <li>No database requirements</li>
+ <li>Supports MySQL, PostgreSQL and SQLite</li>
+ </ul>
+ <strong>Cons:</strong>
+ <ul>
+ <li>Not scalable</li>
+ <li>Does not persist your data on disk</li>
+ </ul>
+ </div>
+ )
+ },
+ {
+ title: "Weaviate",
+ content: (
+ <div>
+ <strong>Pros:</strong>
+ <ul>
+ <li>Simple vector search</li>
+ <li>Stores everything in one place: documents, metadata and vectors, so there is less network overhead when scaling up</li>
+ <li>Allows combination of vector search and scalar filtering, i.e. you can filter for a certain tag and do dense retrieval on that subset</li>
+ </ul>
+ <strong>Cons:</strong>
+ <ul>
+ <li>Fewer options for ANN algorithms than FAISS or Milvus</li>
+ <li>No BM25 / TF-IDF retrieval</li>
+ </ul>
+ </div>
+ )
+ }
+ ]}
+/>
 
 <div className="max-w-xl bg-yellow-light-theme border-l-8 border-yellow-dark-theme px-6 pt-6 pb-4 my-4 rounded-md dark:bg-yellow-900">