This repository hosts the work for SNLI dataset language artefacts detection.

VenkateshDas/nli_artefacts_detection

Detection, Evaluation and Mitigation of Language Artefacts in the Competition On Legal Information Extraction and Entailment Dataset

This repository is still a work in progress. The scripts are not yet finalized.

This repository contains the codebase for the thesis work on detecting, evaluating, and mitigating language (dataset) artefacts in the legal information entailment dataset.

The codebase is organized into separate folders containing Python notebooks for running the experiments.

data -> A placeholder folder for the datasets needed for analysis.

src/data scripts -> Scripts for data analysis and data preprocessing.

src/detection -> Scripts for detecting artefacts in the dataset.

src/evaluation -> Scripts for evaluating the robustness of the BERT-based models.

src/mitigation -> Scripts for data augmentation to mitigate the contradiction-word and word-overlap artefacts.
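
To make the two artefacts named above concrete, here is a minimal sketch of how word overlap and contradiction (negation) words might be measured on a single premise/hypothesis pair. It is not taken from the repository's scripts: the function names, the `NEGATION_WORDS` list, the tokenizer, and the example pair are all assumptions made for illustration.

```python
import re

# Hypothetical negation cue list, chosen for illustration only.
NEGATION_WORDS = {"not", "no", "never", "none", "nobody", "nothing", "without"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_overlap(premise, hypothesis):
    """Fraction of distinct hypothesis tokens that also appear in the premise."""
    hyp_tokens = set(tokenize(hypothesis))
    if not hyp_tokens:
        return 0.0
    return len(hyp_tokens & set(tokenize(premise))) / len(hyp_tokens)


def has_negation(hypothesis):
    """True if the hypothesis contains any cue from NEGATION_WORDS."""
    return any(tok in NEGATION_WORDS for tok in tokenize(hypothesis))


# Made-up example pair, not drawn from the dataset.
pair = {
    "premise": "The contract may be rescinded by the seller.",
    "hypothesis": "The contract may not be rescinded by the seller.",
}

overlap = word_overlap(pair["premise"], pair["hypothesis"])
negated = has_negation(pair["hypothesis"])
print(f"overlap={overlap:.2f}, negation={negated}")
```

A pair with high overlap and a negation cue is the kind of example a model could label from surface features alone. If these two signals correlate strongly with the labels across a dataset, that would suggest an artefact worth mitigating.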
