Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot specify model version for SentenceTransformer embedding retriever #7810

Closed
1 task done
laffeychris opened this issue Jun 5, 2024 · 3 comments · Fixed by #7811
Closed
1 task done

Cannot specify model version for SentenceTransformer embedding retriever #7810

laffeychris opened this issue Jun 5, 2024 · 3 comments · Fixed by #7811
Milestone

Comments

@laffeychris
Copy link

Describe the bug
Sentence transformer models do not allow version to be specified.

Error message
When passing a model_version argument to nodes.EmbeddingRetriever, it ignores this and only downloads the latest version of the sentence-transformer model from huggingface (or from cache).

Expected behavior
Download the specified version.

Additional context
This is due to not propagating the revision argument to SentenceTransformer(), via _SentenceTransformersEmbeddingEncoder

Here, we should have passed the version:

self.embedding_model = SentenceTransformer(
    retriever.embedding_model, 
    device=str(retriever.devices[0]), 
    use_auth_token=retriever.use_auth_token, 
    revision=retriever.model_version,
)

To Reproduce

from haystack import nodes


nodes.EmbeddingRetriever(
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
    model_version="39c51f707d95c3c40f68b47f1df9ae52319da356",
    model_format="sentence_transformers",
)

Check cache to see what models were downloaded:

ls -al ~/.cache/huggingface/hub/models--sentence-transformers--multi-qa-mpnet-base-dot-v1/snapshots
>> 3af7c6da5b3e1bea796ef6c97fe237538cbe6e7f  # latest version, not specified version

FAQ Check

System:

  • OS: Mac
  • GPU/CPU: both
  • Haystack version (commit or version number): 1.26
  • DocumentStore: N/A
  • Reader: N/A
  • Retriever: EmbeddingRetriever
@laffeychris
Copy link
Author

I am happy to open an MR for this

@laffeychris
Copy link
Author

@masci - could we release this change in a fix (v1.26.3)?
I have an internal issue that this would solve. 🙏

@masci masci added this to the 1.26.3 milestone Jun 19, 2024
@masci
Copy link
Contributor

masci commented Jun 19, 2024

@laffeychris the release candidate is out if you want to give it a spin https://pypi.org/project/farm-haystack/1.26.3rc1/ - I'm planning to release the final artifact tomorrow after some testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants