Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor ChromadbRM to use collection's default embedding function #449

Merged
merged 1 commit into from
Feb 25, 2024

Conversation

smwitkowski
Copy link
Contributor

@smwitkowski smwitkowski commented Feb 25, 2024

The previous PR #400 opened by @animtel set chromadb.utils.embedding_functions.DefaultEmbeddingFunction as the default query embedding function. This calls ONNXMiniLM_L6_V2 , which uses all-MiniLM-L6-v2 as the base embedding model.

This introduced unintended behavior where queries could be embedded using a different model than documents unless the embedding function was explicitly provided. If a model other than all-MiniLM-L6-v2 embeds queries, and the embedding function is not specified, the query and document embeddings will use different models. While this may be desired in some cases, this should not be the default behavior.

This PR changes the default query embedding method to use the collection's embedding function, accessed via the class. Users can still provide a custom embedding model.

@okhat okhat merged commit 83f5d01 into stanfordnlp:main Feb 25, 2024
1 check passed
@okhat
Copy link
Collaborator

okhat commented Feb 25, 2024

Thanks for the catch @smwitkowski !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants