NOPE: A Corpus of Naturally-Occurring Presuppositions in English

This repository hosts the corpus described in "NOPE: A Corpus of Naturally-Occurring Presuppositions in English," as well as the raw data from the human and model experiments.

Downloading the corpus

nope-v1.zip

This archive contains the annotated main corpus (2,386 examples) and the corpus of adversarial examples (346 examples).
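
Both splits ship as JSON Lines files; the evaluation commands below reference them as nli_corpus.main.jsonl and nli_corpus.adv.jsonl. The following is a minimal loading sketch in Python, assuming only one JSON object per line (the paths inside the archive and the annotation field names are assumptions; inspect the keys yourself):

    import json

    def load_jsonl(path):
        # one JSON object per line; skip blank lines
        with open(path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    # paths inside the unzipped archive are an assumption; adjust as needed
    main = load_jsonl("nope-v1/nli_corpus.main.jsonl")
    adv = load_jsonl("nope-v1/nli_corpus.adv.jsonl")

    print(len(main), len(adv))     # the README reports 2,386 main / 346 adversarial examples
    print(sorted(main[0].keys()))  # inspect the annotation fields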

Replicating the model results

InferSent

Refer to the README file in the InferSent directory for instructions on how to install and run the InferSent models.

RoBERTa/DeBERTa

  1. Clone our fork of the anli repo:

    git clone https://github.com/sebschu/anli.git 
    
  2. Set up the ANLI models by following the instructions in ["Start your NLI Research"](https://github.com/sebschu/anli/blob/main/mds/start_your_nli_research.md).

  3. Train the models:

    To train the RoBERTa-large model on SNLI, MNLI, ANLI, and FEVER, run:

    export MASTER_PORT=8888
    export MASTER_ADDR=localhost
    
    # setup conda environment
    source setup.sh
    
    
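    # dataset order in --train_data is snli, mnli, fever, anli_r1, anli_r2, anli_r3;
    # --train_weights 1,1,1,10,20,10 oversamples the ANLI rounds (r1*10, r2*20, r3*10),
    # matching the experiment name below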
    python src/nli/training.py \
      --model_class_name 'roberta-large' \
      --single_gpu \
      -n 1 \
      --seed 32423 \
      -g 1 \
      -nr 0 \
      --fp16 \
      --fp16_opt_level O2 \
      --max_length 156 \
      --gradient_accumulation_steps 1 \
      --per_gpu_train_batch_size 16 \
      --per_gpu_eval_batch_size 32 \
      --save_prediction \
      --train_data \
      snli_train:none,mnli_train:none,fever_train:none,anli_r1_train:none,anli_r2_train:none,anli_r3_train:none \
      --train_weights \
      1,1,1,10,20,10 \
      --eval_data \
      snli_dev:none,mnli_m_dev:none,mnli_mm_dev:none,anli_r1_dev:none,anli_r2_dev:none,anli_r3_dev:none \
      --eval_frequency 2000 \
      --experiment_name 'roberta-large|snli+mnli+fnli+r1*10+r2*20+r3*10|nli'
    

    To train the DeBERTa-v2-XLarge model on SNLI, MNLI, ANLI, and FEVER, run:

    export MASTER_PORT=8888
    export MASTER_ADDR=localhost
    
    # setup conda environment
    source setup.sh
    
    
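    # unlike the single-GPU RoBERTa run above, this uses two GPUs (-g 2) plus a
    # warmup phase and a lower learning rate; -n, -g, and -nr are the node count,
    # GPUs per node, and node rank for distributed training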
    python src/nli/training.py \
      --model_class_name deberta \
      -n 1 \
      --seed 32423 \
      -g 2 \
      -nr 0 \
      --warmup_steps 1000 \
      --learning_rate 3e-6 \
      --fp16 \
      --fp16_opt_level O2 \
      --max_length 156 \
      --gradient_accumulation_steps 1 \
      --per_gpu_train_batch_size 16 \
      --per_gpu_eval_batch_size 32 \
      --save_prediction \
      --train_data \
      snli_train:none,mnli_train:none,fever_train:none,anli_r1_train:none,anli_r2_train:none,anli_r3_train:none \
      --train_weights \
      1,1,1,10,20,10 \
      --eval_data \
      snli_dev:none,mnli_m_dev:none,mnli_mm_dev:none,anli_r1_dev:none,anli_r2_dev:none,anli_r3_dev:none \
      --eval_frequency 2000 \
      --experiment_name 'deberta-v2-xlarge|snli+mnli+fnli+r1*10+r2*20+r3*10|nli'
    
  4. Evaluate the models:

    RoBERTa:

    export MASTER_PORT=8888
    export MASTER_ADDR=localhost
    
    # setup conda environment
    source setup.sh
    
    
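    # replace <PATH_TO_MODEL_FROM_STEP3>, <PATH_TO_NOPE_CORPUS>, and <OUTPUT_PATH>
    # with your local paths; the same placeholders appear in the DeBERTa command below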
    python src/nli/evaluation.py \
        --model_class_name 'roberta-large' \
        --max_length 156 \
        --per_gpu_eval_batch_size 16 \
        --model_checkpoint_path \
        <PATH_TO_MODEL_FROM_STEP3>/model.pt \
        --eval_data \
        nope_main:<PATH_TO_NOPE_CORPUS>/nli_corpus.main.jsonl,nope_adv:<PATH_TO_NOPE_CORPUS>/nli_corpus.adv.jsonl \
        --output_prediction_path <OUTPUT_PATH>/predictions/test_nope/
    
    

    DeBERTa:

    export MASTER_PORT=8888
    export MASTER_ADDR=localhost
    
    # setup conda environment
    source setup.sh
    
    
    python src/nli/evaluation.py \
        --model_class_name 'deberta' \
        --max_length 156 \
        --per_gpu_eval_batch_size 16 \
        --model_checkpoint_path \
        <PATH_TO_MODEL_FROM_STEP3>/model.pt \
        --eval_data \
        nope_main:<PATH_TO_NOPE_CORPUS>/nli_corpus.main.jsonl,nope_adv:<PATH_TO_NOPE_CORPUS>/nli_corpus.adv.jsonl \
        --output_prediction_path <OUTPUT_PATH>/predictions/test_nope/
    
    
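    The evaluation script writes per-example predictions under --output_prediction_path.
    To score them against the gold labels, a sketch like the following works, assuming
    the predictions come out as JSON Lines with gold and predicted labels per example
    (the field names and file name here are hypothetical; check the files your run
    actually produces):

    import json

    def accuracy(pred_path):
        # 'predicted_label' and 'label' are assumed field names, not the
        # documented output format of the anli codebase
        with open(pred_path, encoding="utf-8") as f:
            preds = [json.loads(line) for line in f if line.strip()]
        return sum(p["predicted_label"] == p["label"] for p in preds) / len(preds)

    # <OUTPUT_PATH> is the placeholder from the commands above; the file name is illustrative
    print(accuracy("<OUTPUT_PATH>/predictions/test_nope/nope_main.jsonl"))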

Citing

If you use the NOPE corpus, please cite the following paper:

@inproceedings{NOPE,
  title={{NOPE}: {A} Corpus of Naturally-Occurring Presuppositions in {E}nglish},
  author={Parrish, Alicia and Schuster, Sebastian and Warstadt, Alex and Agha, Omar and Lee, Soo-Hwan and Zhao, Zhuoye and Bowman, Samuel R. and Linzen, Tal},
  booktitle={Proceedings of the 25th Conference on Computational Natural Language Learning (CoNLL)},
  year={2021}
}
