This repository contains Jupyter notebooks describing how we pre-trained the model and how we apply it for fine-tuning.
- What dataset did you use? This is described in our paper, but briefly: we used a subset of the Observed Antibody Space (OAS) database (Kovaltsuk et al., 2018) for pre-training, and a snapshot of SAbDab (Dunbar et al., 2014) as of 26 August 2021 for the paratope prediction task. We've included small snippets of the OAS data used for pre-training, along with the paratope prediction datasets, under `assets` (see the loading sketch after this list).
- Why HuggingFace? We felt that the maturity of the library and its straightforward API were key advantages, and it fits well with cloud compute platforms such as AWS (a minimal usage sketch follows this list).
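
As a minimal sketch of how the bundled data under `assets` could be loaded, assuming the pre-training snippet is plain text with one sequence per line and the paratope data is CSV (the file names here are hypothetical; check the `assets` directory for the actual layout):

```python
from pathlib import Path

import pandas as pd

ASSETS = Path("assets")

# Hypothetical file name: an OAS pre-training snippet, assumed to hold
# one antibody sequence per line.
with open(ASSETS / "oas_pretraining_snippet.txt") as fh:
    sequences = [line.strip() for line in fh if line.strip()]
print(f"{len(sequences)} pre-training sequences loaded")

# Hypothetical file name: a paratope prediction split, assumed to be a CSV
# with one row per sequence and a column of per-residue paratope labels.
paratope = pd.read_csv(ASSETS / "paratope_train.csv")
print(paratope.columns.tolist())
```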
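
To give a flavour of the API, here is a minimal sketch of loading a pre-trained checkpoint for token-level fine-tuning with HuggingFace `transformers`; the paths, label count, and example sequence are placeholders, not this repository's actual configuration:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Placeholder identifiers -- substitute the tokenizer and checkpoint
# used in the notebooks.
tokenizer = AutoTokenizer.from_pretrained("path/to/tokenizer")
model = AutoModelForTokenClassification.from_pretrained(
    "path/to/pretrained-checkpoint",
    num_labels=2,  # e.g. paratope vs. non-paratope residue
)

# Hypothetical antibody sequence fragment; tokenisation details depend on
# the vocabulary shipped with the repository.
inputs = tokenizer("EVQLVESGGGLVQPGG", return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # (batch_size, sequence_length, num_labels)
```

Token classification maps naturally onto per-residue labelling tasks such as paratope prediction, which is one reason the `transformers` API is a convenient fit here.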