Source code for the ACL 2021 paper "Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution" [[pdf](https://arxiv.org/abs/2106.06361)]
- If you don't have PyTorch installed, install it here: https://pytorch.org/get-started/locally/
- Install the remaining dependencies with:
```
pip install -r requirements.txt
```
- If necessary, initialize OpenHowNet (required for LWS-Sememe) and NLTK from a Python REPL:
```python
import OpenHowNet
OpenHowNet.download()  # one-time download of the HowNet sememe data

import nltk
nltk.download('all')   # one-time download of the NLTK corpora, including WordNet
```
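If downloading all of NLTK is too heavy, the WordNet-only mode likely needs just the WordNet corpora (this is an assumption; the full download above is the safe default):
```python
import nltk
nltk.download('wordnet')   # WordNet itself
nltk.download('omw-1.4')   # Open Multilingual WordNet, required by recent NLTK versions
```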
To run the main experiment, edit the file src/models/self_learning_poison_nn.py to import your dataset (line 754) and set the model parameters/arguments (starting at line 27). Then run (note that `python -m` takes the module path without the .py suffix):
```
python -m src.models.self_learning_poison_nn <file path to poisoned model> <file path to training statistics> > <experiment log file path>
```
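For example, a possible invocation (the file names here are placeholders, not paths from the repository):
```
python -m src.models.self_learning_poison_nn poisoned_model.pt training_stats.json > experiment.log
```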
- To change how the training data is processed (parallelized), edit lines 736-737.
- To generate poisoning candidates without HowNet/Sememe (WordNet only), set CANDIDATE_FN on line 38 to the desired option; a sketch of the general idea follows.
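For reference, a minimal sketch of WordNet-based candidate generation with NLTK (an illustration of the general approach, not the repository's actual candidate function):
```python
from nltk.corpus import wordnet

def wordnet_candidates(word):
    """Collect single-word WordNet synonyms of `word` as substitution candidates."""
    candidates = set()
    for synset in wordnet.synsets(word):
        for lemma in synset.lemmas():
            name = lemma.name().replace('_', ' ')
            # Skip the word itself and multi-word lemmas.
            if name.lower() != word.lower() and ' ' not in name:
                candidates.add(name)
    return sorted(candidates)

print(wordnet_candidates('movie'))  # e.g. ['film', 'flick', ...]
```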
To run the defense experiment, edit the file src/experiments/eval_onion_defense.py and run:
```
python -m src.experiments.eval_onion_defense <location of poisoned model> > <experiment log file path>
```
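For example (again with placeholder file names):
```
python -m src.experiments.eval_onion_defense poisoned_model.pt > defense.log
```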
To run the baseline experiments:
- To evaluate defense performance against the rule-based word substitution backdoor attack, run src/experiments/eval_onion_static_poisoning.py (see the example below).
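Assuming the script follows the same module-invocation pattern as the other experiments (its arguments, if any, are not documented here), an invocation might look like:
```
python -m src.experiments.eval_onion_static_poisoning > static_poisoning.log
```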
Please kindly cite our paper:
```
@article{qi2021turn,
  title={Turn the combination lock: Learnable textual backdoor attacks via word substitution},
  author={Qi, Fanchao and Yao, Yuan and Xu, Sophia and Liu, Zhiyuan and Sun, Maosong},
  journal={arXiv preprint arXiv:2106.06361},
  year={2021}
}
```