Add XTransformer backend #716

Draft
wants to merge 12 commits into base: main

juhoinkinen
Member

The code in this PR is by @mo-fu and was originally submitted as PR #540. That PR was accidentally closed and could not be re-opened, so this new PR had to be opened for the XTransformer backend. (This PR starts from the point in the git history just before the unsuccessful commits that attempted to make the original PR re-openable.)

The description of the original PR is below.


This PR adds XTransformer as an optional backend to Annif. For now, the default configuration does not use DistilBERT, as this is not yet available on PyPI.

The tests for the backend rely on mocking, because training would download a pretrained model of at least 500 MB.
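As a rough illustration of the mocking approach described above, here is a minimal sketch using only `unittest.mock` from the standard library. `HypotheticalBackend`, `_load_pretrained`, and `"example-model"` are made-up names for illustration and are not taken from this PR or from pecos; the point is only that patching the loading step lets a test exercise training logic without any download.

```python
from unittest import mock


class HypotheticalBackend:
    """Stand-in for a backend whose training would fetch a large model."""

    def _load_pretrained(self, model_name):
        # A real backend would download pretrained weights here.
        raise RuntimeError("network download attempted in a unit test")

    def train(self, texts, model_name="example-model"):
        model = self._load_pretrained(model_name)
        model.fit(texts)
        return model


def run_mocked_training():
    """Train with the loader patched out; return how often it was called."""
    backend = HypotheticalBackend()
    texts = ["a document", "another document"]
    # Replace the loader with a MagicMock for the duration of the block.
    with mock.patch.object(HypotheticalBackend, "_load_pretrained") as fake_load:
        trained = backend.train(texts)
    # The loader was asked for the model once, and the mock model was fitted.
    fake_load.assert_called_once_with("example-model")
    trained.fit.assert_called_once_with(texts)
    return fake_load.call_count
```

Calling `run_mocked_training()` runs the training path against the mock, so the `RuntimeError` in the real loader is never raised.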
We should also discuss cache directories. At the moment, XTransformer downloads models from the Hugging Face Hub to ~/.cache/huggingface. Is this behavior desired for Annif, or should the cache be placed in the data folder?
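One possible way to place the cache in the data folder, sketched here under assumptions: the Hugging Face libraries honor the `HF_HOME` environment variable as the cache root, so it could be set before those libraries are imported. The helper names and the `cache/huggingface` subdirectory layout below are hypothetical and not part of this PR or of Annif.

```python
import os
from pathlib import Path


def cache_env_for(data_dir):
    """Return env vars pointing the Hugging Face cache inside data_dir."""
    # Hypothetical layout: <data_dir>/cache/huggingface
    cache_dir = Path(data_dir) / "cache" / "huggingface"
    return {"HF_HOME": str(cache_dir)}


def apply_cache_env(data_dir, environ=os.environ):
    """Set the cache variables unless the user already set them."""
    # Assumption: this runs before the Hugging Face libraries are imported,
    # since they typically read the variable when they are first loaded.
    for name, value in cache_env_for(data_dir).items():
        environ.setdefault(name, value)  # keep an explicit user setting
    return environ["HF_HOME"]
```

Using `setdefault` means a user who has already exported `HF_HOME` keeps their own location, which may or may not be the behavior wanted here.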

I also haven't modified the Docker container yet. When I installed pecos in a venv, it required BLAS libraries, so these would probably have to be added to the container. Additionally, pecos installs the GPU-enabled PyTorch, which means the container size will grow. I therefore wanted to check with you before adding it.

sonarcloud bot commented Jun 26, 2023

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

No Coverage information
No Duplication information
