mtran14/dp_w2v2

Self-Supervised Speech Pre-training and Representation Learning Toolkit (forked from s3prl/s3prl).

The code is mainly based on the S3PRL Speech Toolkit. Please refer to the S3PRL toolkit for installation instructions. A set of trained models used in the project can be found here.

1. Train the speaker identification (SID) model. For dataset preparation, please follow this link.

```sh
cd s3prl
python3 run_downstream.py -m train -u wav2vec2 -d voxceleb1 -n sid_w2v2
```
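Once training converges, the resulting SID checkpoint can be sanity-checked with the standard S3PRL evaluate mode (a usage sketch; the path assumes the default results/downstream/<expname> layout used throughout this README):

```sh
cd s3prl
python3 run_downstream.py -m evaluate -e results/downstream/sid_w2v2/dev-best.ckpt
```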
2. Generate the Privacy-risk Saliency Map dataset.

```sh
cd s3prl
python3 run_downstream_saliency_map.py -m train -e results/downstream/sid_w2v2/dev-best.ckpt --save_path path_to_store_saliency_maps
```
3. Train the Privacy-risk Saliency Estimator (PSE) model.

```sh
cd s3prl
python3 train_pse.py path_to_store_saliency_maps
```
4. Apply perturbations on the downstream tasks. Please follow the instructions here to prepare the datasets. An example invocation with concrete values is given after the parameter descriptions below.

```sh
cd s3prl
python3 run_downstream.py -m train -u wav2vec2 -d [emotion/asr/fluent_commands/sv_voxceleb1] -n ExperimentName
python3 run_downstream_perturb.py -m evaluate -e path_to_trained_downstream_model --pse path_to_trained_pse --eps EPS --threshold THRESHOLD
```

THRESHOLD: {0 (100% perturbed), 1 (80% perturbed), 2 (60% perturbed), 3 (40% perturbed), 4 (20% perturbed)}

EPS: the amount of perturbation (e.g. 0.5 or 1.0)
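For example, evaluating a trained downstream model with eps 0.5 while perturbing 60% of the input (threshold 2) could look like the following; the checkpoint and PSE paths are placeholders:

```sh
cd s3prl
python3 run_downstream_perturb.py -m evaluate \
    -e results/downstream/ExperimentName/dev-best.ckpt \
    --pse path_to_trained_pse \
    --eps 0.5 --threshold 2
```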

If you use the provided downstream models, consider passing -o config.downstream_expert.datarc.[file_path/root/libri_root] to set the dataset paths correctly.
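For example, assuming the task's config exposes a datarc.root entry (the exact key varies per task, and the key=value override form below is an assumption based on standard S3PRL usage), the evaluation could be pointed at a local dataset like this:

```sh
python3 run_downstream_perturb.py -m evaluate \
    -e path_to_provided_downstream_model \
    --pse path_to_trained_pse --eps 0.5 --threshold 2 \
    -o "config.downstream_expert.datarc.root=/path/to/dataset"
```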
