GitHub - fscdc/scBackdoor: Backdoor attacks in single-cell pretrained models

Unveiling potential threats: backdoor attacks in single-cell pretrained models

🙋 Please let us know if you find a mistake or have any suggestions!

🌟 If you find this resource helpful, please consider starring this repository and citing our research:

Sicheng Feng, Siyu Li, Luonan Chen, Shengquan Chen. Unveiling potential threats: backdoor attacks in single-cell pretrained models. 2024.

Requirements and Installation

Use Python 3.9 from Anaconda.

torch==2.1.2
anndata==0.10.7
datasets==2.19.1
einops==0.8.0
matplotlib==3.9.0
numba==0.59.1
numpy==1.26.3
pandas==2.2.2
scanpy==1.10.1
scgpt==0.2.1
scikit-learn==1.5.0
scipy==1.13.0
torchtext==0.16.2

To install all dependencies:

conda create -n scBackdoor python=3.9
conda activate scBackdoor
pip install -r requirements.txt
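
After installation, it can help to confirm that the environment matches the pinned versions listed above. The following is a minimal sketch and not part of the repository: it assumes requirements.txt contains plain `name==version` lines like the list above, and the helper names `parse_pins`, `check_pins` and `report` are hypothetical.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {name: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


def report(path):
    """Print one line per package whose installed version differs from its pin."""
    with open(path) as fh:
        pins = parse_pins(fh.read())
    for name, (want, have) in check_pins(pins).items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

Calling `report("requirements.txt")` inside the activated scBackdoor environment prints nothing when every pin is satisfied. The comparison is an exact string match, so a build that reports a local version suffix (as some torch builds may) would be listed as a mismatch even if it is compatible.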

Datasets

You can download the example datasets from [scGPT] and [GeneFormer], then place the downloaded contents under Yourpath4Dataset to reproduce the experiments.

Pretrained Models

You can download the pretrained models from [scGPT] (whole-human), [scBERT] and [GeneFormer], then place the downloaded contents under Yourpath4PretrainedModels to reproduce the experiments.
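
Before launching an experiment, a quick check that the download locations exist and are not empty can save a failed run. This is a minimal sketch under stated assumptions: `Yourpath4Dataset` and `Yourpath4PretrainedModels` are the placeholders used in this README and must be replaced with your own paths, and the helper name `missing_or_empty` is hypothetical.

```python
from pathlib import Path


def missing_or_empty(paths):
    """Return the paths that are not directories or that contain no entries."""
    problems = []
    for p in map(Path, paths):
        if not p.is_dir() or not any(p.iterdir()):
            problems.append(str(p))
    return problems


# Placeholder paths from this README; substitute your real locations.
print(missing_or_empty(["Yourpath4Dataset", "Yourpath4PretrainedModels"]))
```

An empty list means both directories exist and contain at least one entry. The check does not verify that the contents are the expected datasets or checkpoints.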

Quick Demos

Download the datasets and pretrained models, place them under the appropriate paths, and adjust the path parameters in the scripts accordingly.
Then you can reproduce the experiments with the provided scripts. For example, to evaluate on the Human Pancreas dataset, run:

nohup ./run.sh & # for scGPT_Exp

Details of Experiments

The commands to run the experiments are as follows (run each command from the folder that contains the corresponding script):

nohup ./run.sh & # for scGPT_Exp
nohup ./run.sh & # for scBERT_Exp
python geneformer_scBackdoor.py # for GeneFormer_Exp
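
Because two of the commands above are identical and differ only in the folder they are run from, a small launcher can make the mapping explicit. This is a sketch, not part of the repository: the working directories are inferred from the folder tree in this README, invoking run.sh through `bash` is an assumption about the script, and the names `EXPERIMENTS`, `build_command` and `launch` are hypothetical.

```python
import subprocess
from pathlib import Path

# Working directory (relative to the repository root) and command per experiment.
# These locations are inferred from the folder tree and are assumptions.
EXPERIMENTS = {
    "scGPT": ("scGPT_Exp/test", ["bash", "run.sh"]),
    "scBERT": ("scBERT_Exp", ["bash", "run.sh"]),
    "GeneFormer": ("GeneFormer_Exp", ["python", "geneformer_scBackdoor.py"]),
}


def build_command(name, repo_root="."):
    """Return (cwd, argv) for the named experiment; raises KeyError if unknown."""
    subdir, argv = EXPERIMENTS[name]
    return Path(repo_root) / subdir, list(argv)


def launch(name, repo_root=".", log_name="nohup.out"):
    """Start the experiment in the background, appending output to a log file."""
    cwd, argv = build_command(name, repo_root)
    log = open(cwd / log_name, "ab")
    return subprocess.Popen(argv, cwd=cwd, stdout=log, stderr=subprocess.STDOUT)
```

For example, `launch("scBERT")` called from the repository root starts the scBERT pipeline and returns immediately, much like the `nohup ... &` form, although unlike `nohup` the child process is not shielded from hangup signals.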

The poisoning-related code is in poison_trigger.py (for scGPT_Exp) or poison_utils.py (for GeneFormer_Exp and scBERT_Exp). You can find these files in each experiment's folder.

The folder tree is as follows:

├── LICENSE
├── README.md                             -- introduction about the project
├── figures                               -- figures used for display
│   └── fig1.png
├── requirements.txt                      -- requirements for installation
├── scGPT_Exp
│   ├── test                              -- the attack pipeline
│   │   ├── run.sh
│   │   └── scBackdoor.py
│   └── utils                             -- the scGPT items
│       ├── detect_tools.py
│       ├── poison_trigger.py
│       ├── preprocess.py
│       ├── print_tools.py
│       └── tools.py
├── GeneFormer_Exp 
│   ├── geneformer                        -- the GeneFormer items
│   │   ├── __init__.py
│   │   ├── classifier.py
│   │   ├── classifier_utils.py
│   │   ├── collator_for_classification.py
│   │   ├── emb_extractor.py
│   │   ├── evaluation_utils.py
│   │   ├── gene_median_dictionary.pkl
│   │   ├── gene_name_id_dict.pkl
│   │   ├── in_silico_perturber.py
│   │   ├── in_silico_perturber_stats.py
│   │   ├── perturber_utils.py
│   │   ├── poison_utils.py
│   │   ├── pretrainer.py
│   │   ├── token_dictionary.pkl
│   │   └── tokenizer.py
│   └── geneformer_scBackdoor.py          -- the attack pipeline
└── scBERT_Exp
    ├── attn_sum_save.py
    ├── finetune.py
    ├── lr_baseline_crossorgan.py
    ├── performer_pytorch                 -- the scBERT items
    │   ├── __init__.py
    │   ├── performer_pytorch.py
    │   └── reversible.py
    ├── poison_utils.py
    ├── predict.py
    ├── preprocess.py
    ├── pretrain.py
    ├── run.sh                            -- the attack pipeline
    └── utils.py

Acknowledgement

We sincerely thank the authors of the following open-source projects:
