# FLIP
## tl;dr
Official implementation of [FLIP](https://arxiv.org/abs/2310.18933), presented at [NeurIPS 2023](https://neurips.cc/virtual/2023/poster/70392). The implementation is a cleaned-up 'fork' of the [backdoor-suite](https://github.com/SewoongLab/backdoor-suite). Precomputed labels for our main table are available [here](https://github.com/SewoongLab/FLIP/releases/). More details are available in the paper. A more complete (messy) version of the code is available upon request.

**Authors:** [Rishi D. Jha\*](https://rishijha.com/), Jonathan Hayase\*, Sewoong Oh

---
## Abstract
In a backdoor attack, an adversary injects corrupted data into a model's training dataset in order to gain control over its predictions on images with a specific attacker-defined trigger. A typical corrupted training example requires altering both the image, by applying the trigger, and the label. Models trained on clean images, therefore, were considered safe from backdoor attacks. However, in some common machine learning scenarios, the training labels are provided by potentially malicious third-parties. This includes crowd-sourced annotation and knowledge distillation. We, hence, investigate a fundamental question: can we launch a successful backdoor attack by only corrupting labels? We introduce a novel approach to design label-only backdoor attacks, which we call FLIP, and demonstrate its strengths on three datasets (CIFAR-10, CIFAR-100, and Tiny-ImageNet) and four architectures (ResNet-32, ResNet-18, VGG-19, and Vision Transformer). With only 2\% of CIFAR-10 labels corrupted, FLIP achieves a near-perfect attack success rate of $99.4\%$ while suffering only a $1.8\%$ drop in the clean test accuracy. Our approach builds upon the recent advances in trajectory matching, originally introduced for dataset distillation.

![Diagram of algorithm.](/img/flip.png)

---

## In this repo

This repo is split into three main folders: `experiments`, `modules`, and `schemas`. The `experiments` folder (described in more detail [here](#installation)) contains subfolders and `.toml` configuration files from which an experiment may be run. The `modules` folder stores the source code for each part of an experiment; each module corresponds to a step of the FLIP algorithm and reads and writes specific inputs and outputs, as defined by the corresponding `.toml` documentation in the `schemas` folder.
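For orientation, here is a minimal sketch of what an experiment configuration might look like. The file path, table names, keys, and values below are hypothetical; the authoritative field definitions live in the `schemas` folder:

```toml
# experiments/example/config.toml -- hypothetical layout; see schemas/ for the real fields.

[train_expert]        # hypothetical module: train the expert models whose trajectories FLIP matches
dataset = "cifar-10"  # one of the three datasets evaluated in the paper
model = "resnet-32"   # hypothetical architecture identifier

[generate_labels]     # hypothetical module: compute the flipped labels
budget = 0.02         # fraction of labels to corrupt (2% in the headline CIFAR-10 result)
output = "labels.npy" # hypothetical output path consumed by the next module
```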

Additionally, the `precomputed` release provides the labels used for the main table of our paper for further analysis.
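As a usage sketch only (the actual file names and format of the released labels may differ; check the release notes), such labels could be substituted into a standard CIFAR-10 training run along these lines:

```python
# Hypothetical sketch: train on clean CIFAR-10 images with precomputed FLIP labels.
# Assumes the release ships a NumPy array aligned with torchvision's training-set
# order; the file path and format below are assumptions, not documented behavior.
import numpy as np
import torchvision

train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)
flip_labels = np.load("precomputed/cifar10_labels.npy")  # hypothetical path

assert len(flip_labels) == len(train_set.targets)
train_set.targets = flip_labels.tolist()  # images untouched; only labels change
```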

Please don't hesitate to file a GitHub issue or reach out with any questions or requests!
