This repo contains the PyTorch implementation of the ACL 2020 paper *End-to-End Bias Mitigation by Modelling Biases in Corpora*.
To download all datasets used in this work, run:

```bash
cd data
bash get_datasets.sh
```
To download the GloVe (v1) embeddings, run:

```bash
cd data
bash get_glove.sh
```
- The Product of Experts (PoE), Debiased Focal Loss (DFL), and RUBi loss implementations are provided in `src/losses.py`.
- The code for the BERT baseline is provided in `src/BERT/`, with scripts to reproduce the results in `src/BERT/scripts/`.
- The code for the InferSent baseline is provided in `src/InferSent/`, with scripts to reproduce the results in `src/InferSent/scripts/`.
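For reference, here is a minimal sketch of how the PoE and DFL objectives are typically formulated, combining the main model's logits with those of a bias-only model. The function names and the `gamma` default are illustrative; the authoritative implementations are in `src/losses.py`.

```python
import torch
import torch.nn.functional as F

def poe_loss(main_logits, bias_logits, labels):
    # Product of Experts: sum the log-probabilities of the main model
    # and the bias-only model (a product in probability space), then
    # take cross-entropy over the renormalized combination.
    combined = F.log_softmax(main_logits, dim=-1) + F.log_softmax(bias_logits, dim=-1)
    return F.cross_entropy(combined, labels)

def dfl_loss(main_logits, bias_logits, labels, gamma=2.0):
    # Debiased Focal Loss: down-weight examples that the bias-only
    # model already classifies confidently, by the factor (1 - p_bias)^gamma.
    bias_probs = F.softmax(bias_logits, dim=-1)
    p_bias = bias_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    log_p = F.log_softmax(main_logits, dim=-1).gather(1, labels.unsqueeze(1)).squeeze(1)
    return -((1.0 - p_bias) ** gamma * log_p).mean()
```

With a uniform bias-only model, PoE reduces to plain cross-entropy on the main model; the bias-only model only reshapes the loss where it is confident.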
Requirements:

- pytorch-transformers 1.1.0
- transformers 2.5.0
- pytorch 1.2.0
- pytorch-pretrained-bert 0.6.2
To download the MNLI Mismatched/Matched development sets from the ACL 2020 paper End-to-End Bias Mitigation by Modelling Biases in Corpora, use these links: mismatched, matched.
Running the `get_datasets.sh` script downloads the generated files under the names `MNLIMismatchedHardWithHardTest` and `MNLIMatchedHardWithHardTest`.
Each dataset has three files:
- `s1.test`: each line contains a premise
- `s2.test`: each line contains a hypothesis
- `labels.test`: each line contains a label
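Since the three files are line-aligned, a split can be loaded with a few lines of Python. The `load_split` helper below is a hypothetical convenience function, not part of the repo:

```python
from pathlib import Path

def load_split(data_dir):
    # Each split consists of three line-aligned files:
    # s1.test (premises), s2.test (hypotheses), labels.test (labels).
    data_dir = Path(data_dir)
    premises = (data_dir / "s1.test").read_text().splitlines()
    hypotheses = (data_dir / "s2.test").read_text().splitlines()
    labels = (data_dir / "labels.test").read_text().splitlines()
    assert len(premises) == len(hypotheses) == len(labels), "files must be line-aligned"
    return list(zip(premises, hypotheses, labels))
```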
If you find this repo useful, please cite our paper:

```bibtex
@inproceedings{karimi2020endtoend,
  title={End-to-End Bias Mitigation by Modelling Biases in Corpora},
  author={Karimi Mahabadi, Rabeeh and Belinkov, Yonatan and Henderson, James},
  booktitle={Annual Meeting of the Association for Computational Linguistics},
  year={2020}
}
```
We hope this repo is useful for your research. For any questions, please open an issue or email [email protected], and we will get back to you as soon as possible.