XLNet baseline for DREAM dataset

Author: Chenglei Si (River Valley High School, Singapore)

Update: Some runs may degenerate, with performance far below the expected result. This is mainly because training is unstable on smaller datasets. If this happens, try changing the random seed (and perhaps the learning rate, batch size, warmup steps, or other hyperparameters) and restart training. If you want, I can send you a trained checkpoint. Feel free to contact me by email: [email protected]
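
As a rough illustration of what changing the random seed involves, here is a minimal sketch of a seeding helper. The function name set_seed and the list CANDIDATE_SEEDS are hypothetical and are not taken from run_xlnet_dream.py, so check how the training script actually handles seeds before relying on this.

```python
import random


def set_seed(seed: int) -> None:
    """Seed every random number generator available in this environment."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is not installed, so there is nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)  # seed all visible GPUs
    except ImportError:
        pass  # PyTorch is not installed, so there is nothing to seed


# Hypothetical seeds to try when a run degenerates; any distinct integers work.
CANDIDATE_SEEDS = [42, 1234, 2019]
```

Calling set_seed with a different value before building the model and data loaders gives a different initialization and data order, which is often enough to escape a degenerate run.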
Note: Tune hyperparameters on the dev set, then use the test file and the trained model to evaluate on the test data. This is standard practice in ML.
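
The selection step described in the note can be sketched as follows. The function select_best_config and the accuracy values are placeholders for illustration only; they are not results from this repository.

```python
def select_best_config(dev_accuracy):
    """Return the configuration name with the highest dev-set accuracy."""
    if not dev_accuracy:
        raise ValueError("no dev results to choose from")
    return max(dev_accuracy, key=dev_accuracy.get)


# Placeholder dev accuracies keyed by a hypothetical configuration label.
dev_accuracy = {"lr=5e-6": 0.61, "lr=1e-5": 0.70, "lr=2e-5": 0.68}

best = select_best_config(dev_accuracy)
print("Evaluate on the test set once, using:", best)
```

The test set is touched only after the configuration is fixed, so the reported test score is not biased by the tuning.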

Usage:

Download the data and unzip it into this folder.
If you have not installed sentencepiece, run pip install sentencepiece
Run sh run.sh
To test a trained model, run python test_xlnet_dream.py --data_dir=data --xlnet_model=xlnet-large-cased --output_dir=xlnet_dream --checkpoint_name=pytorch_model_3epoch_72_len256.bin --max_seq_length=256 --do_eval --eval_batch_size=1 You may need to change the checkpoint name accordingly.

(The hyperparameters that I used can be found in run.sh)
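
If you evaluate several checkpoints, a small wrapper can help avoid retyping the test command. The sketch below only assembles the command from the usage steps above; the helper name build_test_command is hypothetical, and the flags are copied from the test command as given.

```python
import shlex
import sys


def build_test_command(checkpoint_name, max_seq_length=256):
    """Assemble the argument list for test_xlnet_dream.py for one checkpoint."""
    return [
        sys.executable,
        "test_xlnet_dream.py",
        "--data_dir=data",
        "--xlnet_model=xlnet-large-cased",
        "--output_dir=xlnet_dream",
        f"--checkpoint_name={checkpoint_name}",
        f"--max_seq_length={max_seq_length}",
        "--do_eval",
        "--eval_batch_size=1",
    ]


cmd = build_test_command("pytorch_model_3epoch_72_len256.bin")
# Print the command so it can be inspected before anything is executed.
print(" ".join(shlex.quote(part) for part in cmd))
# To actually run the evaluation: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if a checkpoint name contains spaces.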

Result: 72.0 (SOTA as of July 2019, leaderboard)

Note: My code is built upon Hugging Face's pytorch_transformers implementation. The original XLNet paper is Yang et al., 2019.

NoviScl/XLNet_DREAM

Folders and files

Latest commit

History

Repository files navigation

XLNet baseline for DREAM dataset

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages