[Question]: index.docstore is empty after persisting nodes in chromadb #14574
Comments
This is correct. The docstore is disabled with most third-party vector stores to simplify storage, since the nodes are stored in Chroma itself. You can override this if you want:
I understand. Could you please clarify the correct way to use BM25Retriever? Instead of providing the nodes during initialization, I supplied a reference to the docstore, but it resulted in an error:
  File "/usr/local/anaconda3/envs/rag-search/lib/python3.9/site-packages/llama_index/retrievers/bm25/base.py", line 73, in from_defaults
    return cls(
  File "/usr/local/anaconda3/envs/rag-search/lib/python3.9/site-packages/llama_index/retrievers/bm25/base.py", line 40, in __init__
    self.bm25 = BM25Okapi(self._corpus)
  File "/usr/local/anaconda3/envs/rag-search/lib/python3.9/site-packages/rank_bm25.py", line 83, in __init__
    super().__init__(corpus, tokenizer)
  File "/usr/local/anaconda3/envs/rag-search/lib/python3.9/site-packages/rank_bm25.py", line 27, in __init__
    nd = self._initialize(corpus)
  File "/usr/local/anaconda3/envs/rag-search/lib/python3.9/site-packages/rank_bm25.py", line 52, in _initialize
    self.avgdl = num_doc / self.corpus_size
ZeroDivisionError: division by zero
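The last frame of the traceback divides a token count by the corpus size, so an empty docstore (zero nodes) would be enough to trigger this error. Below is a minimal sketch of that failure mode. `average_doc_length` is a simplified stand-in for the step shown in the traceback, not the actual `rank_bm25` code, and `require_non_empty` is a hypothetical guard one could run before building the retriever.

```python
from typing import List, Sequence


def average_doc_length(tokenized_corpus: Sequence[List[str]]) -> float:
    """Simplified stand-in for the avgdl computation seen in the traceback."""
    total_tokens = sum(len(doc) for doc in tokenized_corpus)
    corpus_size = len(tokenized_corpus)
    # With an empty corpus this is 0 / 0, which raises ZeroDivisionError.
    return total_tokens / corpus_size


def require_non_empty(nodes: Sequence[object]) -> Sequence[object]:
    """Hypothetical guard: fail early with a clearer message than ZeroDivisionError."""
    if len(nodes) == 0:
        raise ValueError("No nodes available; the docstore is probably empty.")
    return nodes
```

Checking the number of nodes in the docstore before constructing the retriever turns the opaque division error into a message that points at the real cause.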
You'll need to either manually populate the docstore or use the flag above, and then persist the docstore somewhere. Alternatively, you can save the nodes somewhere directly.
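As a rough illustration of the "save the nodes somewhere directly" option, here is a minimal sketch that writes node records to a JSON file and reads them back. It uses only the standard library. The record shape (a dict with `id` and `text` keys) and the helper names `save_nodes` and `load_nodes` are assumptions made for this example; real LlamaIndex nodes are objects and would need to be converted to and from such records, or persisted through the library's own docstore.

```python
import json
from pathlib import Path
from typing import Dict, List

# Hypothetical record shape: one dict per node, with "id" and "text" keys.
NodeRecord = Dict[str, str]


def save_nodes(nodes: List[NodeRecord], path: Path) -> None:
    """Write node records to a JSON file so they can be reloaded later."""
    path.write_text(json.dumps(nodes, ensure_ascii=False), encoding="utf-8")


def load_nodes(path: Path) -> List[NodeRecord]:
    """Read node records back; returns an empty list if the file is missing."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))
```

The reloaded records could then be turned back into nodes and passed to the retriever at initialization, instead of relying on `index.docstore`.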
Got it! Thanks for your help @logan-markewich
Question
Hello,
I have persisted the nodes in ChromaDB along with the storage context. However, when I retrieve the vector index, index.docstore is empty. How can I get the nodes later to use with BM25Retriever? Here is the code used for persisting and retrieving: