🥈 2022 NIPA MRC Competition 2nd Place Solution

QuoQA-NLP/MRC_Baseline

2022 AI Online Competition / Machine Reading Comprehension for Efficient Document Retrieval - Team: QuoQA

Project Overview

Given a text and a question, the task is to find the answer to the question in the passage. The data contains both answerable and unanswerable cases, and performance is evaluated by Exact Match. The machine reading comprehension task takes the form of Extractive Question Answering: the goal is to find the Answer Span within the Context.
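The Exact Match criterion mentioned above can be sketched as follows. This is a minimal illustration, not the competition's official scorer: it assumes that an unanswerable question is represented by an empty string and that the only normalization is stripping surrounding whitespace. The function names are hypothetical.

```python
from __future__ import annotations


def exact_match(prediction: str, reference: str) -> int:
    """Return 1 when the whitespace-stripped strings are identical, else 0."""
    return int(prediction.strip() == reference.strip())


def exact_match_score(predictions: list[str], references: list[str]) -> float:
    """Average exact match over all examples, as a fraction in [0, 1]."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return hits / len(predictions)
```

Under this assumption, predicting any text for an unanswerable question counts as a miss, which is why the answerability decision matters as much as the span itself.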

Methodology and Reproduction Commands

  • The key idea is to use a single Transformer backbone model to do two things at once: judge whether the given passage can answer the question, and extract the answer string.
  • The total loss is a weighted average of two losses: one computed on answerability, and one computed on whether the predicted start and end positions of the answer string match the labels.
  • To train within limited GPU VRAM, we used Gradient Accumulation and Gradient Checkpointing, which also led to improved performance.
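The weighted total loss described above could look roughly like the sketch below. It is written in plain Python for a single example so it runs without a deep learning framework. The mixing coefficient `span_weight`, the two-class layout of the answerability logits, and the function names are assumptions for illustration; the repository's actual weighting is defined in its custom model and trainer code.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example (log-sum-exp for stability)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target]


def joint_qa_loss(start_logits, end_logits, answerable_logits,
                  start_pos, end_pos, is_impossible, span_weight=0.5):
    """Weighted average of the span loss and the answerability loss.

    span_weight is a hypothetical coefficient, not a value from the repository.
    """
    # Span loss: mean of the start-position and end-position cross-entropies.
    span_loss = 0.5 * (cross_entropy(start_logits, start_pos)
                       + cross_entropy(end_logits, end_pos))
    # Assumed layout: index 0 = answerable, index 1 = impossible.
    cls_loss = cross_entropy(answerable_logits, int(is_impossible))
    return span_weight * span_loss + (1.0 - span_weight) * cls_loss
```

With `span_weight=1.0` this reduces to ordinary extractive QA training; lowering it puts more emphasis on deciding whether the question is answerable at all.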

Training command: bash running_train_only.sh

Inference command: bash running_inference_only.sh

Pretrained Weights (Pretrained Language Model)

We used the roberta-large model released by KLUE: Korean Language Understanding Evaluation (2021). (arXiv:2105.09680)

We chose the RoBERTa model for the following reasons.

  1. We confirmed that the RoBERTa backbone performs well on the SQuAD v2.0 and KLUE benchmarks, which evaluate both unanswerable items and answer strings.
  2. Considering the number of trainable parameters and the number of layers, we expected the RoBERTa-large model to have an advantage in training over base-size models such as KPFBert-base.
  3. When the team split the Train Dataset into 5 folds and computed evaluation scores, klue/roberta-large performed best.
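A 5-fold split like the one in the third point can be sketched with the standard library alone. The shuffling, the seed, and the function name are assumptions; the source does not describe how the team assigned examples to folds.

```python
import random


def k_fold_indices(n_examples, k=5, seed=42):
    """Yield (train_idx, valid_idx) pairs; each example is held out exactly once."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    # Round-robin assignment keeps fold sizes within one example of each other.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        valid_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, valid_idx
```

Each candidate backbone would then be fine-tuned on `train_idx` and scored on `valid_idx` for every fold, and the fold scores averaged before comparing models.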

Specifically, we used the model weights uploaded to Huggingface: 🔗 klue/roberta-large

These pre-trained weights were released on June 15, 2021. The PLM was trained with the following datasets, tokenizer, and model structure.

  • Pretrained Corpora (62GB in total)
    • MODU Corpus
      • Korean Corpus containing formal articles and colloquial text released by the National Institute of Korean Language
    • CC-100-Kor
      • Korean portion of the multilingual web crawled corpora used for training XLM-R
    • NAMUWIKI
      • Korean web-based encyclopedia
    • NEWSCRAWL
      • Collection of 12,800,000 news articles from 2011 to 2020
    • PETITION
      • Blue House National Petition: collection of public petitions
  • Tokenizer
    • 32K Vocab Size
    • Morpheme-based subword tokenization
    • Pre-tokenize raw-text into morphemes and then apply BPE
  • Model Structure
    • 24 transformer layers
    • 337M trainable parameters
    • Dynamic / WWM Masking

Dataset

../DATA
|    +- sample_submission.csv
|    +- test.json
|    +- train.json
- Convert 'train.json' into Huggingface's datasets.Dataset class.
- Fine-tune RobertaForV2QuestionAnswering on the train dataset converted to the Dataset class.
- Convert 'test.json' into Huggingface's datasets.Dataset class.
- Generate the 'FINAL_SUBMISSION.csv' file using the fine-tuned RobertaForV2QuestionAnswering model.
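The conversion step above can be sketched as flattening the nested JSON into one record per question. The schema used here (`data` / `paragraphs` / `qas` / `answers` with `text` and `answer_start`) is an assumption borrowed from SQuAD v2.0; the actual keys in train.json may differ, and `flatten_squad_like` is a hypothetical helper, not the repository's loader.

```python
import json


def flatten_squad_like(raw):
    """Turn a SQuAD-style nested dict into a flat list of per-question records."""
    records = []
    for article in raw["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                answers = qa.get("answers", [])
                records.append({
                    "id": qa["id"],
                    "context": paragraph["context"],
                    "question": qa["question"],
                    # Fall back to "no answers listed" when the flag is missing.
                    "is_impossible": qa.get("is_impossible", len(answers) == 0),
                    "answer_texts": [a["text"] for a in answers],
                    "answer_starts": [a["answer_start"] for a in answers],
                })
    return records


# Tiny in-memory stand-in for the contents of ../DATA/train.json.
sample = json.loads("""
{"data": [{"paragraphs": [{
  "context": "Seoul is the capital of Korea.",
  "qas": [
    {"id": "q1", "question": "What is the capital of Korea?",
     "answers": [{"text": "Seoul", "answer_start": 0}], "is_impossible": false},
    {"id": "q2", "question": "Who founded the city?",
     "answers": [], "is_impossible": true}
  ]}]}]}
""")
records = flatten_squad_like(sample)
```

In recent versions of the Huggingface datasets library, a list of records like this can be wrapped with `datasets.Dataset.from_list(records)`.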

ํ•˜๋“œ์›จ์–ด

CPU 10C, Nvidia T4 GPU x 1, 90MEM, 1TB

๋””๋ ‰ํ† ๋ฆฌ ๊ตฌ์กฐ

USER/
├── running_train_only.sh
├── running_inference_only.sh
├── train.py
├── inference.py
├── trainer.py
├── arguments.py
├── question_ids.json
├── README.md
├── requirements.txt
├── .gitignore
│
├── models
│   ├── roberta.py
│   ├── output.py
│   ├── bart.py
│   ├── bert.py
│   └── electra.py
│
├── utils
│   ├── encoder.py
│   ├── loader.py
│   ├── preprocessor.py
│   ├── postprocessor.py
│   └── metric.py
│
├── exps
│   ├── checkpoint-125/ *described in detail below*
│   ├── checkpoint-250/
│   ├── checkpoint-375/
│   ├── checkpoint-500/
│   ├── checkpoint-625/
│   ├── checkpoint-750/
│   └── checkpoint-875/
│
├── mecab-0.996-ko-0.9.2/
│
├── mecab-ko-dic-2.1.1-20180720/
│
└── RESULT * detailed output description *
    ├── final_submission.csv
    └── checkpoint-875
        ├── pytorch_model.bin
        ├── config.json
        ├── optimizer.pt
        ├── rng_state.pth
        ├── scheduler.pt
        ├── special_tokens_map.json
        ├── tokenizer_config.json
        ├── tokenizer.json
        ├── trainer_state.json
        ├── training_args.bin
        └── vocab.txt
  • running_train_only.sh

    • Shell script for training the model.
    • See below for the arguments required for training.
  • running_inference_only.sh

    • Shell script for running inference from a model weight file.
    • See below for the arguments required for inference.
  • train.py

    • Code that runs model training.
    • Saved model checkpoint weight files are in the exps/ folder.
    • The model weight files used for final inference are in the RESULT/ folder.
  • inference.py

    • Code that makes predictions with the trained model weights and saves the results as a csv file.
    • The saved final submission file is in the RESULT/ folder.
  • trainer.py

    • Implements the trainer by inheriting from Huggingface's Trainer class.
    • The compute_loss, evaluate, and predict functions are customized.
  • arguments.py

    • Defines the argument classes required for training and inference.
    • Defines each argument's type, default value, help message, and so on.
  • question_ids.json

    • For reproducibility, we saved the list of train data ids used at training time and use it to sort the data in /DATA/train.json.
  • models/

    • Directory containing the files that implement the model classes.
      • The final model uses only the RobertaForV2QuestionAnswering class in roberta.py.
      • In addition, output.py implements the model output classes.
  • utils/ - Directory containing the files for dataset preprocessing, pre/post-processing of model input data, and evaluation metrics.

    • encoder.py
      • Defines the Encoder class, which tokenizes the data and computes is_impossible, the answer indices, and so on.
    • loader.py
      • Contains the class that loads the raw json data from the /DATA directory (which holds the train and test data) and reshapes it to fit Huggingface's Datasets class.
    • preprocessor.py
      • Defines the Preprocessor class, which handles examples with no answer and examples with two or more answers.
    • postprocessor.py
      • Defines the functions that derive the final prediction from the model output and print it in the required format.
      • Uses Konlpy's mecab morphological analyzer: after morphological analysis, trailing particles and leading/trailing special characters are removed (mecab version: mecab of 0.996/ko-0.9.2)
    • metric.py
      • Defines the Metric class used to evaluate the model.
  • exps/

    • Directory that stores the model checkpoints generated during training when train.py is run.
  • RESULT/

    • Directory that stores the final model checkpoint weight files trained through train.py.

    • Directory that stores the model's predictions on the Test data produced through inference.py.

    • final_submission.csv

      • Submission file containing the final predictions.
    • checkpoint-875/

      • Directory containing the final model weights.
      • pytorch_model.bin
        • File containing the model weights.
      • config.json
        • File describing the model's overall configuration and paths.
      • optimizer.pt
        • File containing the optimizer weights.
      • rng_state.pth
        • File containing the random-state information for python, numpy, and cpu.
      • scheduler.pt
        • File containing the scheduler weights.
      • special_tokens_map.json
        • File containing the special tokens used by the tokenizer.
      • tokenizer_config.json
        • File containing the tokenizer's special tokens, class, and model name information.
      • tokenizer.json
        • File containing the tokenizer's vocab id information.
      • trainer_state.json
        • File containing the learning rate, loss, eval information, and so on for each log step.
      • training_args.bin
        • File containing the train arguments.
      • vocab.txt
        • File containing the tokens handled by the tokenizer.
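The cleanup performed in postprocessor.py (removing trailing particles and leading/trailing special characters) can be sketched as follows. This is not the repository's code: `clean_answer` and `strip_trailing_particles` are hypothetical names, and the tagger is passed in as a callable so the sketch runs without mecab installed. It relies on the fact that particle (josa) tags in the mecab-ko tagset begin with "J".

```python
import re
from typing import Callable, List, Tuple

# Runs of non-word characters at either edge of the string (Hangul counts as \w).
EDGE_SPECIALS = re.compile(r"^\W+|\W+$")

Tagged = List[Tuple[str, str]]


def strip_trailing_particles(text: str, tagged: Tagged) -> str:
    """Remove particle morphemes from the end of text, keeping the first morpheme."""
    text = text.rstrip()
    for surface, tag in reversed(tagged[1:]):
        if tag.startswith("J") and text.endswith(surface):
            text = text[: -len(surface)].rstrip()
        else:
            break
    return text


def clean_answer(text: str, pos_tagger: Callable[[str], Tagged]) -> str:
    """Strip edge special characters, then trailing particles."""
    text = EDGE_SPECIALS.sub("", text)
    return strip_trailing_particles(text, pos_tagger(text))
```

With KoNLPy and mecab installed, `Mecab().pos` from `konlpy.tag` could serve as `pos_tagger`, since it returns a list of (morpheme, tag) pairs.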

Arguments

running_train_only.sh Argument Descriptions

| argument | description |
|---|---|
| do_train | Whether to train the model. |
| group_name | Sets the wandb group name. |
| data_path | Selects the NIPA dataset. |
| use_validation | Whether to run validation. |
| PLM | Selects the pretrained language model (PLM). |
| model_category | Selects which file in the models folder to use. |
| model_name | Selects the specific class within the file chosen by model_category. |
| max_length | Sets the maximum length. |
| save_strategy | Sets how checkpoints are saved, for example by step or by epoch. |
| save_total_limit | Sets the maximum number of checkpoints to keep. |
| learning_rate | Sets the training learning rate. |
| per_device_train_batch_size | Sets the train batch size. |
| per_device_eval_batch_size | Sets the eval batch size. |
| gradient_accumulation_steps | Sets the number of gradient accumulation steps. |
| gradient_checkpointing | Whether to use gradient checkpointing. |
| max_steps | Sets the maximum number of training steps. |
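The way per_device_train_batch_size, gradient_accumulation_steps, and max_steps interact can be made concrete with a little arithmetic. The numbers in the usage note are illustrative only and are not the values set in running_train_only.sh; the helper names are hypothetical. The sketch assumes the Huggingface Trainer convention that one step is one optimizer update.

```python
def effective_batch_size(per_device_train_batch_size,
                         gradient_accumulation_steps,
                         num_devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device_train_batch_size * gradient_accumulation_steps * num_devices


def examples_seen(max_steps, per_device_train_batch_size,
                  gradient_accumulation_steps, num_devices=1):
    """Total examples processed after max_steps optimizer updates."""
    return max_steps * effective_batch_size(
        per_device_train_batch_size, gradient_accumulation_steps, num_devices
    )
```

For example, a per-device batch of 4 with 8 accumulation steps on a single GPU behaves like a batch of 32 for the optimizer while only 4 examples are held in memory at a time, which is what makes a large model trainable on one T4.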

running_inference_only.sh Argument Descriptions

| argument | description |
|---|---|
| do_predict | Whether to run prediction on the given data. |
| PLM | Loads the desired model weights. |
| model_category | Selects which file in the models folder to use. |
| model_name | Selects the specific class within the file chosen by model_category. |
| max_length | Sets the maximum length. |
| output_dir | Sets the path where predictions are saved. |
| file_name | Sets the file name for the predictions. |
