-
Hello @stefan-it, the documentation uses `seq_bn` as the example adapter config.
-
I would like to add to the conversation and ask for some guidance and insights on what to look for when using adapters. I've been experimenting with simple NER using DistilBERT models, and it works great on our training and validation corpus, but as soon as I try adapters, performance and accuracy drop massively.
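For context, here is a minimal sketch of the kind of setup described above, assuming the AdapterHub `adapters` package (the checkpoint name, adapter name, and label count are illustrative placeholders):

```python
# Sketch of a token-classification (NER) setup where only the adapter is trained.
# Assumes the AdapterHub `adapters` package; names and label count are placeholders.
import adapters
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained(
    "distilbert-base-uncased",  # placeholder DistilBERT checkpoint
    num_labels=9,               # e.g. a CoNLL-style BIO tag set
)
adapters.init(model)  # add adapter support to the plain Hugging Face model

model.add_adapter("ner_adapter", config="seq_bn")
model.train_adapter("ner_adapter")  # freezes the base model; only adapter + head train
model.set_active_adapters("ner_adapter")
```

One common cause of the accuracy drop described above is reusing full fine-tuning hyper-parameters: adapter training typically needs a noticeably higher learning rate (around 1e-4) and more epochs than full fine-tuning.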
-
Hi everybody 🤗
I would like to run my first experiments with LM adapters! The main intention is to take an existing multilingual LM (more precisely, one of my hmBERT models) and train an LM adapter on top of it. I would like to adapt it to new languages, or to new pretraining corpora for existing languages, e.g. with fewer OCR errors or a different domain (books instead of newspapers).
My question is about the hyper-parameters recommended for pretraining on a) small corpora (in the range of 1-2GB of text) and b) larger corpora (approx. 30GB of text).
The example documentation uses `seq_bn` as adapter config. Is this config recommended for my use case? 🤔 Many thanks!
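For reference, a minimal sketch of such an LM-adapter setup with the AdapterHub `adapters` package (the hmBERT checkpoint id, adapter name, and reduction factor are illustrative assumptions, not tested recommendations):

```python
# Sketch: train a seq_bn LM adapter on top of a frozen multilingual LM.
# Assumes the AdapterHub `adapters` package; checkpoint id, adapter name, and
# reduction_factor are illustrative, not tuned recommendations.
import adapters
from adapters import SeqBnConfig
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "dbmdz/bert-base-historic-multilingual-cased"  # an hmBERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

adapters.init(model)  # enable adapter support on the Hugging Face model

# seq_bn = sequential bottleneck adapter (Pfeiffer-style);
# a smaller reduction_factor gives the adapter more capacity.
config = SeqBnConfig(reduction_factor=16)
model.add_adapter("hist_lm", config=config)
model.train_adapter("hist_lm")  # freeze the base LM; train only adapter weights

# ...then run a standard masked-LM Trainer loop over the new-domain corpus...
```

Since `reduction_factor` controls adapter capacity, it is plausibly the knob to revisit when moving from the 1-2GB to the ~30GB corpus, though the concrete hyper-parameter question remains open here.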