PyTorch implementation of 'Explaining text classifiers with counterfactual representations' (Lemberger & Saillenfest, 2024)

ToineSayan/counterfactual-representations-for-explanation

Explaining text classifiers with counterfactual representations

This repository contains the code, data, and supplementary material for the experiments included in the paper 'Explaining text classifiers with counterfactual representations', accepted at ECAI 2024 (to appear).

Environment

Create and activate a new virtual environment:

conda create -n CFR python=3.10.9 anaconda
conda activate CFR

Data

Download and pre-process the data before running the experiments.

Experiments on synthetic data (sections 5.1 and 5.2)

Run the notebooks:

  • CFRs_EEECp_gender_balanced.ipynb
  • CFRs_EEECp_gender_aggressive.ipynb
  • CFRs_EEECp_race_balanced.ipynb
  • CFRs_EEECp_race_aggressive.ipynb
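
To run the four notebooks without opening them one by one, they can be executed from a script. The following is a minimal sketch, not part of the repository: it assumes that `jupyter` and `nbconvert` are available in the CFR environment, that the data has already been downloaded and pre-processed, and that the notebooks sit in the directory passed as `root` (their actual location in the repository is an assumption). The helper names are hypothetical.

```python
import subprocess
from pathlib import Path

# Notebook names as listed in this README. Where they live inside the
# repository is an assumption; the caller passes the directory explicitly.
SYNTHETIC_NOTEBOOKS = [
    "CFRs_EEECp_gender_balanced.ipynb",
    "CFRs_EEECp_gender_aggressive.ipynb",
    "CFRs_EEECp_race_balanced.ipynb",
    "CFRs_EEECp_race_aggressive.ipynb",
]


def nbconvert_command(notebook: Path) -> list[str]:
    """Build the argument list that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        str(notebook),
    ]


def missing_notebooks(root: Path) -> list[str]:
    """Return the listed notebooks that are not present under `root`."""
    return [name for name in SYNTHETIC_NOTEBOOKS if not (root / name).is_file()]


def run_synthetic_experiments(root: Path) -> None:
    """Execute the four synthetic-data notebooks one after another."""
    missing = missing_notebooks(root)
    if missing:
        raise FileNotFoundError(f"Notebooks not found in {root}: {missing}")
    for name in SYNTHETIC_NOTEBOOKS:
        # check=True stops at the first notebook whose execution fails.
        subprocess.run(nbconvert_command(root / name), check=True)
```

Calling `run_synthetic_experiments(Path("."))` from the directory that holds the notebooks runs them in the order listed above and overwrites each notebook with its executed version.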

Experiments on the natural dataset BiasInBios (section 5.4 and supplementary material D)

Run the notebook:

  • CFRs_biasbios.ipynb

Experiments on CEBaB (section 5.3)

Run the notebook:

  • CFRs_CEBaB_compare_methods.ipynb

Experiment on GloVe embeddings (supplementary material C)

Run the notebook:

  • CFRs_GloVe.ipynb