Multitask learning with a BERT backbone. Enables easy training of a BERT model with state-of-the-art methods such as PCGrad, Gradient Vaccine, PALs, task scheduling, class-imbalance handling, and many other optimizations.
Updated Oct 8, 2023 - Python
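The listing names PCGrad among the supported methods. As a minimal, framework-agnostic sketch (not the repository's actual implementation), PCGrad resolves conflicts between per-task gradients by projecting each gradient off the opposing component of every other task's gradient before summing. The function name `pcgrad` and the NumPy formulation below are illustrative assumptions.

```python
import numpy as np

def pcgrad(grads, seed=0):
    """Illustrative PCGrad sketch: for each task gradient g_i, remove the
    component that conflicts (negative dot product) with every other task
    gradient g_j, then sum the projected gradients into one update direction.
    `grads` is a list of flat per-task gradient vectors (np.ndarray)."""
    rng = np.random.default_rng(seed)
    projected = []
    for i, g in enumerate(grads):
        g = g.astype(float).copy()
        others = [j for j in range(len(grads)) if j != i]
        rng.shuffle(others)  # iterate the other tasks in random order
        for j in others:
            dot = g @ grads[j]
            if dot < 0:  # conflict: project g onto the normal plane of g_j
                g -= dot / (grads[j] @ grads[j]) * grads[j]
        projected.append(g)
    # combine the de-conflicted task gradients into a single update
    return np.sum(projected, axis=0)

# Two conflicting task gradients: their dot product is negative,
# so each is projected before the sum.
update = pcgrad([np.array([1.0, 0.0]), np.array([-1.0, 1.0])])
# update == [0.5, 1.5]
```

In this toy example, `[1, 0] · [-1, 1] = -1 < 0`, so each gradient sheds its component along the other before summing; without the projection the naive sum `[0, 1]` would discard all progress on the first task's direction.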