Refactor vector store index (run-llama#508)

Co-authored-by: Jerry Liu <[email protected]>
Disiok and Jerry Liu authored Feb 23, 2023
1 parent 1493872 commit af2836a
Showing 50 changed files with 1,725 additions and 1,708 deletions.
113 changes: 98 additions & 15 deletions docs/how_to/vector_stores.md
@@ -61,54 +61,137 @@ For instance, this is an example usage of the Pinecone data loader `PineconeRead

GPT Index also supports using a vector store itself as an index.
The following classes implement such indices:

- `GPTSimpleVectorIndex`
- `GPTFaissIndex`
- `GPTWeaviateIndex`
- `GPTPineconeIndex`
- `GPTQdrantIndex`
- `GPTChromaIndex`


An API reference of each vector index is [found here](/reference/indices/vector_store.md).

Like any other index in GPT Index (tree, keyword table, list), a vector store index can be built from any collection
of documents. The vector store inside the index holds the embeddings of the input text chunks.

Once constructed, the index can be used for querying.
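
To make the role of the vector store concrete, here is a minimal, library-independent sketch of the idea: keep one embedding per text chunk, and answer a query by returning the chunks whose embeddings are most similar to the query embedding. The `ToyVectorStore` class, its method names, and the hand-written three-dimensional embeddings are assumptions made for illustration and are not part of the GPT Index API; a real index obtains its embeddings from an embedding model.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store mapping a chunk id to (text, embedding)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, List[float]]] = {}

    def add(self, chunk_id: str, text: str, embedding: List[float]) -> None:
        # Store the chunk text next to its embedding.
        self._entries[chunk_id] = (text, embedding)

    def query(self, query_embedding: List[float], top_k: int = 1) -> List[Tuple[str, float]]:
        # Score every stored chunk, then keep the top_k most similar ones.
        scored = [
            (text, cosine_similarity(query_embedding, embedding))
            for text, embedding in self._entries.values()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Made-up embeddings; in practice these come from an embedding model.
store = ToyVectorStore()
store.add("c1", "I wrote short stories as a kid.", [0.9, 0.1, 0.0])
store.add("c2", "Later I studied painting.", [0.1, 0.9, 0.0])
store.add("c3", "Then I started a company.", [0.0, 0.2, 0.9])

top_matches = store.query([1.0, 0.0, 0.0], top_k=2)
```

The index classes documented here differ mainly in where this storage and similarity search happen: in process memory, in a Faiss index, or in an external service.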

**Simple Index Construction/Querying**
```python
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

# Load documents, build the GPTSimpleVectorIndex
documents = SimpleDirectoryReader('../paul_graham_essay/data').load_data()
index = GPTSimpleVectorIndex(documents)

# Query index
response = index.query("What did the author do growing up?")

```

**Faiss Index Construction/Querying**
![](/_static/vector_stores/faiss_index_0.png)
![](/_static/vector_stores/faiss_index_1.png)
```python
from gpt_index import GPTFaissIndex, SimpleDirectoryReader
import faiss

# Creating a faiss index
# d is the embedding dimension (1536 for OpenAI's text-embedding-ada-002)
d = 1536
faiss_index = faiss.IndexFlatL2(d)

# Load documents, build the GPTFaissIndex
documents = SimpleDirectoryReader('../paul_graham_essay/data').load_data()
index = GPTFaissIndex(documents, faiss_index=faiss_index)

# Query index
response = index.query("What did the author do growing up?")

```
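
For intuition about what the flat Faiss index does at query time: `faiss.IndexFlatL2` performs an exact, brute-force nearest-neighbour search and reports squared Euclidean distances. The pure-Python sketch below mimics that behaviour on tiny vectors. The function name `flat_l2_search` and the sample data are assumptions made for illustration, and this is not how Faiss is implemented internally.

```python
from typing import List, Tuple


def squared_l2(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(
    vectors: List[List[float]], query: List[float], k: int
) -> List[Tuple[int, float]]:
    """Return (position, squared distance) for the k stored vectors closest to query."""
    distances = [(i, squared_l2(v, query)) for i, v in enumerate(vectors)]
    distances.sort(key=lambda pair: pair[1])  # smallest distance first
    return distances[:k]


# Hypothetical 2-dimensional "embeddings"; real ones would have d = 1536 entries.
stored = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
nearest = flat_l2_search(stored, query=[1.0, 2.0], k=2)
```

Every stored vector is compared with the query, so the cost grows linearly with the number of chunks. That is acceptable for small document sets like the one in this example.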

**Weaviate Index Construction/Querying**
![](/_static/vector_stores/weaviate_index_0.png)
```python
from gpt_index import GPTWeaviateIndex, SimpleDirectoryReader
import weaviate

# Creating a Weaviate vector store
resource_owner_config = weaviate.AuthClientPassword(
    username="<username>",
    password="<password>",
)
client = weaviate.Client(
    "https://<cluster-id>.semi.network/", auth_client_secret=resource_owner_config
)

# Load documents, build the GPTWeaviateIndex
documents = SimpleDirectoryReader('../paul_graham_essay/data').load_data()
index = GPTWeaviateIndex(documents, weaviate_client=client)

# Query index
response = index.query("What did the author do growing up?")

```

**Pinecone Index Construction/Querying**
![](/_static/vector_stores/pinecone_index_0.png)
```python
from gpt_index import GPTPineconeIndex, SimpleDirectoryReader
import pinecone

# Creating a Pinecone index
api_key = "api_key"
pinecone.init(api_key=api_key, environment="us-west1-gcp")
pinecone.create_index(
    "quickstart",
    dimension=1536,
    metric="euclidean",
    pod_type="p1"
)
pinecone_index = pinecone.Index("quickstart")

# Load documents, build the GPTPineconeIndex
documents = SimpleDirectoryReader('../paul_graham_essay/data').load_data()
index = GPTPineconeIndex(documents, pinecone_index=pinecone_index)

# Query index
response = index.query("What did the author do growing up?")
```

**Qdrant Index Construction/Querying**
![](/_static/vector_stores/qdrant_index_0.png)
```python
import qdrant_client
from gpt_index import GPTQdrantIndex, SimpleDirectoryReader

# Creating a Qdrant vector store
client = qdrant_client.QdrantClient(
    host="<qdrant-host>",
    api_key="<qdrant-api-key>",
    https=True
)
collection_name = "paul_graham"

# Load documents, build the GPTQdrantIndex
documents = SimpleDirectoryReader('../paul_graham_essay/data').load_data()
index = GPTQdrantIndex(documents, collection_name=collection_name, client=client)

# Query index
response = index.query("What did the author do growing up?")
```

**Chroma Index Construction/Querying**

```python

import chromadb
from gpt_index import GPTChromaIndex, SimpleDirectoryReader

# Creating a Chroma vector store
# By default, Chroma will operate purely in-memory.
chroma_client = chromadb.Client()
chroma_collection = chroma_client.create_collection("quickstart")

# N.B: OPENAI_API_KEY must be set as an environment variable.
# Load documents, build the GPTChromaIndex
documents = SimpleDirectoryReader('../paul_graham_essay/data').load_data()
index = GPTChromaIndex(documents, chroma_collection=chroma_collection)

# Query index
response = index.query("What did the author do growing up?")

```

19 changes: 16 additions & 3 deletions docs/reference/indices/vector_store.rst
@@ -3,9 +3,22 @@
Vector Store Index
==================

Building the Vector Store Index
Below we show the vector store index classes.

.. automodule:: gpt_index.indices.vector_store
Each vector store index class is a combination of a base vector store index
class and a vector store, shown below.

NOTE: the vector store is currently not user-facing but will be soon!

.. toctree::
:maxdepth: 1

vector_stores/stores.rst
vector_stores/base_index.rst

.. automodule:: gpt_index.indices.vector_store.vector_indices
:members:
:inherited-members:
:exclude-members: delete, docstore, index_struct, index_struct_cls


9 changes: 9 additions & 0 deletions docs/reference/indices/vector_stores/base_index.rst
@@ -0,0 +1,9 @@
.. _Ref-Indices-VectorStore-OldIndices:

Base Vector Index class
=====================================

.. automodule:: gpt_index.indices.vector_store.base
:members:
:inherited-members:
:exclude-members: delete, docstore, index_struct, index_struct_cls
9 changes: 9 additions & 0 deletions docs/reference/indices/vector_stores/stores.rst
@@ -0,0 +1,9 @@
.. _Ref-Indices-VectorStore-Stores:

Vector Stores
==================

.. automodule:: gpt_index.vector_stores
:members:
:inherited-members:
:exclude-members: delete, docstore, index_struct, index_struct_cls
6 changes: 4 additions & 2 deletions gpt_index/__init__.py
@@ -35,6 +35,7 @@
GPTPineconeIndex,
GPTQdrantIndex,
GPTSimpleVectorIndex,
GPTVectorStoreIndex,
GPTWeaviateIndex,
)

@@ -94,10 +95,11 @@
"GPTListIndex",
"GPTTreeIndex",
"GPTFaissIndex",
"GPTSimpleVectorIndex",
"GPTWeaviateIndex",
"GPTPineconeIndex",
"GPTQdrantIndex",
"GPTSimpleVectorIndex",
"GPTVectorStoreIndex",
"GPTWeaviateIndex",
"GPTChromaIndex",
"GPTSQLStructStoreIndex",
"Prompt",
14 changes: 8 additions & 6 deletions gpt_index/composability/graph.py
@@ -18,12 +18,14 @@
from gpt_index.indices.registry import IndexRegistry
from gpt_index.indices.struct_store.sql import GPTSQLStructStoreIndex
from gpt_index.indices.tree.base import GPTTreeIndex
from gpt_index.indices.vector_store.chroma import GPTChromaIndex
from gpt_index.indices.vector_store.faiss import GPTFaissIndex
from gpt_index.indices.vector_store.pinecone import GPTPineconeIndex
from gpt_index.indices.vector_store.qdrant import GPTQdrantIndex
from gpt_index.indices.vector_store.simple import GPTSimpleVectorIndex
from gpt_index.indices.vector_store.weaviate import GPTWeaviateIndex
from gpt_index.indices.vector_store import (
GPTChromaIndex,
GPTFaissIndex,
GPTPineconeIndex,
GPTQdrantIndex,
GPTSimpleVectorIndex,
GPTWeaviateIndex,
)
from gpt_index.langchain_helpers.chain_wrapper import LLMPredictor
from gpt_index.response.schema import Response

6 changes: 0 additions & 6 deletions gpt_index/data_structs/__init__.py
@@ -6,9 +6,6 @@
IndexList,
KeywordTable,
Node,
QdrantIndexStruct,
SimpleIndexDict,
WeaviateIndexStruct,
)
from gpt_index.data_structs.table import StructDatapoint

@@ -18,8 +15,5 @@
"KeywordTable",
"IndexList",
"IndexDict",
"SimpleIndexDict",
"WeaviateIndexStruct",
"QdrantIndexStruct",
"StructDatapoint",
]