How to add search_kwargs when using CustomComponent to create a vectorstore #1323

Open
wodadehencou opened this issue Jan 11, 2024 · 3 comments
Labels: enhancement (New feature or request), question (Further information is requested)

Comments

@wodadehencou

I want to add a custom vectorstore, the way src/backend/langflow/components/vectorstores/Chroma.py does.

The build method's declared return type is Union[VectorStore, BaseRetriever], but the actual code returns a Chroma instance:

return Chroma(persist_directory=persist_directory, client_settings=chroma_settings)

If the next component needs a retriever, .as_retriever() is called automatically when the graph is built.

To control the retrieval policy, such as k or the score threshold, I need to pass a search_kwargs parameter to as_retriever.

I want to configure k or the score threshold on my custom vectorstore, but I can't find a way to do it.
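
For context, the LangChain method is named `VectorStore.as_retriever`, and it accepts `search_type` and `search_kwargs`. The sketch below is not Langflow code: `retriever_options` is a hypothetical helper that only assembles those two arguments from `k` and a score threshold, so the result could be unpacked into `as_retriever(**options)`.

```python
from typing import Any, Dict, Optional


def retriever_options(
    k: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for as_retriever (hypothetical helper)."""
    search_kwargs: Dict[str, Any] = {}
    if k is not None:
        if k < 1:
            raise ValueError("k must be a positive integer")
        search_kwargs["k"] = k

    if score_threshold is None:
        # Plain similarity search: only k (if given) is forwarded.
        return {"search_type": "similarity", "search_kwargs": search_kwargs}

    # Threshold filtering uses a different search_type in LangChain.
    search_kwargs["score_threshold"] = score_threshold
    return {
        "search_type": "similarity_score_threshold",
        "search_kwargs": search_kwargs,
    }


options = retriever_options(k=5, score_threshold=0.8)
print(options)
```

With a real store this would be used as `store.as_retriever(**options)`. The open question in this issue is where a Langflow component can supply those options, since the automatic conversion calls `as_retriever` without them.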

dosubot bot commented Jan 11, 2024

🤖

Hey @wodadehencou, good to see you again! Hope you're doing well. Let's dive into your issue.

To pass a search_kwargs parameter that controls the retrieval policy, such as k or the score threshold, you can modify the build method of your custom component.

Here's how you can do it:

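# Import paths are assumed from the LangChain version Langflow used in early 2024;
# they may differ in other versions.
import json
from typing import List, Optional

from langchain.base_language import BaseLanguageModel
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.schema import BaseRetriever
from langchain.schema.vectorstore import VectorStore


def build(
    self,
    vectorstore: VectorStore,
    document_content_description: str,
    llm: BaseLanguageModel,
    metadata_field_info: List[str],
    search_kwargs: Optional[dict] = None,  # Add this line
) -> BaseRetriever:
    # Each entry is a JSON string with "name", "description" and "type" keys.
    metadata_field_obj = []
    for meta in metadata_field_info:
        meta_obj = json.loads(meta)
        if "name" not in meta_obj or "description" not in meta_obj or "type" not in meta_obj:
            raise ValueError("Incorrect metadata field info format.")
        attribute_info = AttributeInfo(
            name=meta_obj["name"],
            description=meta_obj["description"],
            type=meta_obj["type"],
        )
        metadata_field_obj.append(attribute_info)

    return SelfQueryRetriever.from_llm(
        llm,
        vectorstore,
        document_content_description,
        metadata_field_obj,
        verbose=True,
        # Fall back to an empty dict: the retriever expects a dict, not None.
        search_kwargs=search_kwargs or {},  # Add this line
    )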

In the code above, I added a new search_kwargs parameter to the build method and passed it to SelfQueryRetriever.from_llm. This lets you control the retrieval policy, such as k or the score threshold, of the retriever that build returns. Note that this relies on from_llm forwarding extra keyword arguments to the SelfQueryRetriever constructor. If your LangChain version does not do that, you will need to adjust that call as well.
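
The loop in the snippet above expects each entry of metadata_field_info to be a JSON string with name, description and type keys. A standalone sketch of that parsing step follows, using only the standard library. `parse_metadata_fields` and the sample field are hypothetical, and a plain dict stands in for LangChain's AttributeInfo.

```python
import json
from typing import Dict, List

REQUIRED_KEYS = ("name", "description", "type")


def parse_metadata_fields(metadata_field_info: List[str]) -> List[Dict[str, str]]:
    """Parse JSON strings into field descriptions, rejecting incomplete entries."""
    fields = []
    for raw in metadata_field_info:
        meta = json.loads(raw)
        missing = [key for key in REQUIRED_KEYS if key not in meta]
        if missing:
            raise ValueError(f"Metadata field is missing keys {missing}: {raw}")
        # Keep only the three keys that AttributeInfo would receive.
        fields.append({key: meta[key] for key in REQUIRED_KEYS})
    return fields


# Hypothetical sample entry, written the way it would be typed into the component field.
example = ['{"name": "year", "description": "Publication year", "type": "integer"}']
print(parse_metadata_fields(example))
```

Reporting the missing keys in the error message makes a malformed entry easier to locate than a generic format error.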

You can refer to the VectaraSelfQueryRetriver.py file for more details.

As for the structure of the VectorStore class, it is used to create and manage vector stores. It is a subclass of LangChainTypeCreator and has several properties and methods that are used in the creation of a custom vector store. You can refer to the base.py file for more details.

I hope this helps! If you have any other questions, feel free to ask.


dosubot bot added the enhancement and question labels on Jan 11, 2024
@wodadehencou
Author

VectaraSelfQueryRetriver.py is a kind of converter that turns an existing VectorStore into a retriever. That is how earlier Langflow versions worked: every vectorstore had a corresponding retriever component.

But today, I think almost every vectorstore component can act as either a VectorStore or a BaseRetriever. I want to control the retriever's policy, but I cannot pass search_kwargs to a component that is actually a VectorStore instance.
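
One possible workaround, sketched under assumptions rather than taken from Langflow: let the custom component's build accept a search_kwargs dict and, when it is non-empty, return the retriever itself instead of the bare store, so the automatic conversion with default arguments no longer applies. `store_or_retriever` is a hypothetical helper, and `FakeStore` is a stand-in that only makes the sketch runnable. The helper relies solely on the store exposing `as_retriever(search_kwargs=...)`, as LangChain vector stores do.

```python
from typing import Any, Dict, Optional


def store_or_retriever(
    vectorstore: Any,
    search_kwargs: Optional[Dict[str, Any]] = None,
) -> Any:
    """Return the store unchanged, or a retriever configured with search_kwargs."""
    if not search_kwargs:
        # Nothing to configure: leave the store as-is for downstream conversion.
        return vectorstore
    return vectorstore.as_retriever(search_kwargs=dict(search_kwargs))


class FakeStore:
    """Stand-in for a LangChain vector store (hypothetical, for illustration only)."""

    def as_retriever(self, **kwargs: Any) -> Dict[str, Any]:
        # A real store returns a retriever object; here we just echo the arguments.
        return {"retriever_for": self, "kwargs": kwargs}


store = FakeStore()
unchanged = store_or_retriever(store)
configured = store_or_retriever(store, {"k": 2})
print(configured["kwargs"])
```

Inside a component's build, the final line would then be something like `return store_or_retriever(Chroma(...), search_kwargs)`. Whether Langflow exposes a dict-typed field for search_kwargs on a custom component is not confirmed in this thread.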

dosubot bot commented Apr 12, 2024

Hi, @wodadehencou,

I'm helping the langflow team manage their backlog and am marking this issue as stale. From what I understand, you were having trouble adding search_kwargs when using CustomComponent to create a vectorstore. I responded with a way to pass a search_kwargs parameter by modifying the build method of the custom component. You then explained that almost every vectorstore component can act as either a VectorStore or a BaseRetriever, and that you want to control the retriever's policy but cannot pass search_kwargs to a component that is actually a VectorStore instance.

Could you please confirm if this issue is still relevant to the latest version of the langflow repository? If it is, please let the langflow team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and cooperation. If you have any further questions or need assistance with anything else, feel free to reach out.
