AdaLM: domain, language and task adaptation of pre-trained models.
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains. Yunzhi Yao, Shaohan Huang, Wenhui Wang, Li Dong and Furu Wei, ACL 2021
This repository includes the code to fine-tune the adapted domain-specific models on downstream tasks and the code to generate an incremental vocabulary for a specific domain.
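The vocabulary-generation scripts in this repository implement the paper's procedure. Purely as an illustration of the general idea (not the exact algorithm used here), a domain vocabulary can be grown by training a WordPiece tokenizer on domain text and appending the subwords missing from the original BERT vocabulary; all file names and the target size below are hypothetical placeholders.

```python
# Hedged sketch: build an incremental domain vocabulary by training a new
# WordPiece tokenizer on a domain corpus and keeping only the subwords that
# the original BERT vocabulary does not already contain. File names and the
# target vocabulary size are placeholders, not values used by this repository.
from tokenizers import BertWordPieceTokenizer

ORIG_VOCAB = "bert_vocab.txt"        # original general-domain vocab.txt (placeholder)
DOMAIN_CORPUS = "domain_corpus.txt"  # raw domain text, one document per line (placeholder)
TARGET_SIZE = 40000                  # desired size of the augmented vocabulary (placeholder)

# Train a domain WordPiece vocabulary from scratch on the domain corpus.
domain_tok = BertWordPieceTokenizer(lowercase=True)
domain_tok.train(files=[DOMAIN_CORPUS], vocab_size=TARGET_SIZE)

with open(ORIG_VOCAB, encoding="utf-8") as f:
    original = [line.rstrip("\n") for line in f]
original_set = set(original)

# Rank candidate subwords by the id the trainer assigned (roughly frequency order)
# and append the ones the original vocabulary is missing, up to the target size.
candidates = sorted(domain_tok.get_vocab().items(), key=lambda kv: kv[1])
added = [tok for tok, _ in candidates if tok not in original_set]
augmented = original + added[: max(0, TARGET_SIZE - len(original))]

with open("augmented_vocab.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(augmented) + "\n")
```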
The adapted domain-specific models can be downloaded:
- AdaLM-bio-base 12-layer, 768-hidden, 12-heads, 132M parameters || One Drive
- AdaLM-bio-small 6-layer, 384-hidden, 12-heads, 34M parameters || One Drive
- AdaLM-cs-base 12-layer, 768-hidden, 12-heads, 124M parameters || One Drive
- AdaLM-cs-small 6-layer, 384-hidden, 12-heads, 30M parameters || One Drive
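Each archive contains the model weights together with its config and vocabulary files. As a quick sanity check, a minimal loading sketch with Hugging Face Transformers, assuming the archive is unpacked into a local directory containing pytorch_model.bin, config.json and vocab.txt:

```python
# Hedged sketch: load a downloaded AdaLM checkpoint with Hugging Face Transformers
# (assumes a recent transformers version). The local directory name is a placeholder
# for wherever the One Drive archive was unpacked; it should contain
# pytorch_model.bin, config.json and vocab.txt.
from transformers import BertConfig, BertModel, BertTokenizer

model_dir = "adalm-bio-small"  # placeholder path

config = BertConfig.from_pretrained(model_dir)
tokenizer = BertTokenizer.from_pretrained(model_dir, do_lower_case=True)
model = BertModel.from_pretrained(model_dir, config=config)

inputs = tokenizer("aspirin inhibits cyclooxygenase activity", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 384 for small / 768 for base)
```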
Install the requirements:
pip install -r requirements.txt
Add the project root to your PYTHONPATH:
export PYTHONPATH=$PYTHONPATH:`pwd`
The biomedical downstream tasks can be downloaded from the BLURB Leaderboard. The computer science tasks can be downloaded from allenai.
# Set path to read training/dev dataset
export DATASET_PATH=/path/to/task/data/ # Example: "/path/to/downloaded-data-dir/chemprot/"
# Set path to save the finetuned model and result score
export OUTPUT_PATH=/path/to/save/result_of_finetuning
export TASK_NAME=chemprot
# Set path to the model checkpoint you need to test
export CKPT_PATH=/path/to/your/model/checkpoint
# Set config file
export CONFIG_FILE=/path/to/config/file
# Set vocab file
export VOCAB_FILE=/path/to/vocab/file
# Set paths to cache the tokenized train & dev features (the cache is tied to this tokenizer; do not reuse it with a different vocab)
export TRAIN_CACHE=${DATASET_PATH}/$TASK_NAME.train.bert.cache
export DEV_CACHE=${DATASET_PATH}/$TASK_NAME.dev.bert.cache
# Setting the hyperparameters for the run.
export BSZ=32
export LR=1.5e-5
export EPOCH=30
export WD=0.1
export WM=0.1
CUDA_VISIBLE_DEVICES=0 python finetune/run_classifier.py \
--model_type bert --model_name_or_path $CKPT_PATH \
--config_name $CONFIG_FILE --tokenizer_name $VOCAB_FILE --do_lower_case \
--data_dir $DATASET_PATH --cached_train_file $TRAIN_CACHE --cached_dev_file $DEV_CACHE \
--do_train --do_eval --logging_steps 1000 --output_dir $OUTPUT_PATH --max_grad_norm 0 \
--max_seq_length 128 --per_gpu_train_batch_size $BSZ --learning_rate $LR \
--num_train_epochs $EPOCH --weight_decay $WD --warmup_ratio $WM \
--fp16 --fp16_opt_level O2 --seed 42 --overwrite_output_dir
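After training finishes, $OUTPUT_PATH contains the fine-tuned weights and evaluation scores. A minimal inference sketch, assuming the script saved the model and tokenizer there in standard Hugging Face format (the example sentence and a recent transformers version are assumptions):

```python
# Hedged sketch: run the fine-tuned classifier on a new sentence, assuming
# run_classifier.py saved the model and tokenizer into $OUTPUT_PATH in the
# standard Hugging Face format and a recent transformers version is installed.
# The example sentence is illustrative only.
import os
import torch
from transformers import BertForSequenceClassification, BertTokenizer

output_path = os.environ["OUTPUT_PATH"]
tokenizer = BertTokenizer.from_pretrained(output_path, do_lower_case=True)
model = BertForSequenceClassification.from_pretrained(output_path)
model.eval()

text = "The compound strongly inhibits the kinase."  # illustrative input
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
print("predicted label id:", logits.argmax(dim=-1).item())
```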
To fine-tune the NER tasks, use finetune/run_ner.py as shown below. To fine-tune the PICO task, simply change run_ner to run_pico.
# Set path to read training/dev dataset
export DATASET_PATH=/path/to/ner/task/data/
# Set path to save the finetuned model and result score
export OUTPUT_PATH=/path/to/save/result_of_finetuning
export TASK_NAME=jnlpba
# Set path to the model checkpoint you need to test
export CKPT_PATH=/path/to/your/model/checkpoint
# Set config file
export CONFIG_FILE=/path/to/config/file
# Set vocab file
export VOCAB_FILE=/path/to/vocab/file
# Set label file containing the tag set (e.g. BIO tags)
export LABEL_FILE=/path/to/label/file
# Set path to cache the tokenized train & dev features (the cache is tied to this tokenizer)
export CACHE_DIR=/path/to/cache
# Setting the hyperparameters for the run.
export BSZ=16
export LR=1.5e-5
export EPOCH=30
export WD=0.1
export WM=0.1
CUDA_VISIBLE_DEVICES=0 python finetune/run_ner.py \
--model_type bert --model_name_or_path $CKPT_PATH \
--config_name $CONFIG_FILE --tokenizer_name $VOCAB_FILE --do_lower_case \
--data_dir $DATASET_PATH --cache_dir $CACHE_DIR --labels $LABEL_FILE \
--do_train --do_eval --logging_steps 1000 --output_dir $OUTPUT_PATH --max_grad_norm 0 \
--max_seq_length 128 --per_gpu_train_batch_size $BSZ --learning_rate $LR \
--num_train_epochs $EPOCH --weight_decay $WD --warmup_ratio $WM \
--fp16 --fp16_opt_level O2 --seed 42 --overwrite_output_dir
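The fine-tuned tagger in $OUTPUT_PATH can be used in the same way. A minimal sketch, assuming run_ner.py saved the model and tokenizer there in standard Hugging Face format; the example sentence is illustrative only:

```python
# Hedged sketch: tag a sentence with the fine-tuned NER model, assuming
# run_ner.py saved the model and tokenizer into $OUTPUT_PATH in the standard
# Hugging Face format. The example sentence is illustrative only.
import os
from transformers import BertForTokenClassification, BertTokenizer, pipeline

output_path = os.environ["OUTPUT_PATH"]
tokenizer = BertTokenizer.from_pretrained(output_path, do_lower_case=True)
model = BertForTokenClassification.from_pretrained(output_path)

# The pipeline maps predicted label ids back to tag names via the model config.
tagger = pipeline("token-classification", model=model, tokenizer=tokenizer)
for entity in tagger("Interleukin-2 activates T lymphocytes."):
    print(entity["word"], entity["entity"], round(entity["score"], 3))
```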
Biomedical
| Model | JNLPBA | PICO | ChemProt | Average |
|---|---|---|---|---|
| BERT | 78.63 | 72.34 | 71.86 | 74.28 |
| BioBERT | 79.35 | 73.18 | 76.14 | 76.22 |
| PubmedBERT | 80.06 | 73.38 | 77.24 | 76.89 |
| AdaLM-bio-base | 79.46 | 75.47 | 78.41 | 77.74 |
| AdaLM-bio-small | 79.04 | 74.91 | 72.06 | 75.34 |
Computer Science
| Model | ACL-ARC | SCIERC | Average |
|---|---|---|---|
| BERT | 64.92 | 81.14 | 73.03 |
| AdaLM-cs-base | 73.61 | 81.91 | 77.76 |
| AdaLM-cs-small | 68.74 | 78.88 | 73.81 |
This project is licensed under the license found in the LICENSE file in the root directory of this source tree. Portions of the source code are based on the transformers project. This project follows the Microsoft Open Source Code of Conduct.
For help or issues using AdaLM, please submit a GitHub issue.
For other communications related to AdaLM, please contact Shaohan Huang ([email protected]) and Furu Wei ([email protected]).