This repository implements all experiments in the thesis "Federated Principal Component Analysis (PCA) Learning".
Authors: Tung-Anh Nguyen and Huilan Zhu, under the supervision of Dr. Nguyen Tran.
-
Dependencies: torch, torchvision, sklearn, pandas, matplotlib, numpy, scipy, tqdm, pillow, h5py.
-
To install the dependencies: pip3 install -r requirements.txt
-
To generate non-iid MNIST Data:
- Go to data/Mnist and run: "python3 generate_niid_20users.py"
- The number of users and the number of labels per user can be changed through the two variables NUM_USERS = 20 and NUM_LABELS = 2.
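
The role of NUM_USERS and NUM_LABELS can be illustrated with a small label-based split. This is only a sketch under stated assumptions: the helper partition_by_label, the constant NUM_CLASSES, the round-robin class assignment and the toy labels are all hypothetical and are not taken from generate_niid_20users.py.

```python
import random
from collections import defaultdict

NUM_USERS = 20    # number of simulated users (same name as in the script)
NUM_LABELS = 2    # number of distinct labels each user holds
NUM_CLASSES = 10  # assumption: MNIST has 10 digit classes


def partition_by_label(labels, num_users=NUM_USERS, num_labels=NUM_LABELS,
                       num_classes=NUM_CLASSES, seed=0):
    """Hypothetical helper: return ({user: [sample indices]}, {user: [classes]})
    where each user only receives samples from `num_labels` classes."""
    rng = random.Random(seed)

    # Group sample indices by class label and shuffle inside each class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)

    # Assumption: user u is assigned classes u, u+1, ... (mod num_classes).
    user_classes = {
        u: [(u + j) % num_classes for j in range(num_labels)]
        for u in range(num_users)
    }

    # Record which users share each class so its samples can be split evenly.
    owners = defaultdict(list)
    for u, classes in user_classes.items():
        for c in classes:
            owners[c].append(u)

    partition = defaultdict(list)
    for c, users in owners.items():
        share = len(by_class[c]) // len(users)
        for k, u in enumerate(users):
            partition[u].extend(by_class[c][k * share:(k + 1) * share])
    return dict(partition), user_classes


# Toy labels standing in for the real MNIST targets.
toy_labels = [i % NUM_CLASSES for i in range(2000)]
partition, user_classes = partition_by_label(toy_labels)
```

With these toy labels every user ends up with samples from exactly two classes, and no sample is given to more than one user.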
-
To generate non-iid CIFAR-10 Data:
- Go to data/Cifar10 and run: "python3 generate_niid_20users.py"
- The number of users and the number of labels per user can be changed through the two variables NUM_USERS = 20 and NUM_LABELS = 2.
-
To generate non-iid Synthetic data:
- Go to data/Synthetic and run: "python3 generate_synthetic_05_05.py". As with the MNIST data, the number of users and the number of labels per user are configurable.
- The file "plot.py" runs all experiments and generates the figures. The experimental results are already stored as pickle files in the folder "results", so the commands in the following sections render figures from those pickle files. To rerun all experiments instead, delete the pickle files in the "results" folder and then run the same commands.
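
The behaviour described here (reuse stored pickle files, rerun only when they are missing) can be pictured as a load-or-run pattern. The sketch below is an assumption about the general idea, not the code in plot.py: the helper load_or_run, the file name demo.pickle and the stand-in experiment are hypothetical.

```python
import os
import pickle
import tempfile


def load_or_run(path, run_experiment):
    """Hypothetical helper: reuse cached results if `path` exists,
    otherwise run the experiment and cache its output."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    results = run_experiment()
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(results, f)
    return results


calls = []  # records how often the stand-in experiment really runs


def fake_experiment():
    """Stand-in for an expensive training run."""
    calls.append(1)
    return {"loss": [0.9, 0.5, 0.3]}


# Demo in a temporary folder so the real "results" folder is untouched.
with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "results", "demo.pickle")
    first = load_or_run(cache, fake_experiment)    # runs and caches
    second = load_or_run(cache, fake_experiment)   # loads from the cache
```

Deleting the cached file before the second call would make the experiment run again, which mirrors the "delete pickle files and rerun" instruction.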
-
To reproduce the experiments on the effects of hyperparameters on FAPL and FGPL:
-
Effect of the number of local epochs. Run the commands below:
python plot.py R FAPL 30
python plot.py R FGPL 30
-
Effect of the step size for the dual variables. Run the commands below:
python plot.py rho FAPL 30
python plot.py rho FGPL 30
-
Effect of the local learning rate. Run the commands below:
python plot.py eta FAPL 30
python plot.py eta FGPL 30
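
The six hyperparameter commands share one pattern: plot.py, a hyperparameter name, an algorithm name, and the trailing argument 30. A small sketch that builds them in one place is given below. The names HYPERPARAMETERS, ALGORITHMS, LAST_ARG, RUN and build_commands are hypothetical, and the script only prints the commands unless RUN is set to True.

```python
import subprocess
import sys

HYPERPARAMETERS = ["R", "rho", "eta"]  # local epochs, dual step size, local learning rate
ALGORITHMS = ["FAPL", "FGPL"]
LAST_ARG = "30"   # copied verbatim from the commands above
RUN = False       # set to True to actually launch plot.py


def build_commands():
    """Return one command line per hyperparameter/algorithm pair."""
    return [
        [sys.executable, "plot.py", hp, alg, LAST_ARG]
        for hp in HYPERPARAMETERS
        for alg in ALGORITHMS
    ]


commands = build_commands()
for cmd in commands:
    # Show the command in the same form as in this README.
    print("python " + " ".join(cmd[1:]))
    if RUN:
        subprocess.run(cmd, check=True)
```

Using sys.executable keeps the launched interpreter identical to the one running the sketch, which avoids mixing environments when RUN is enabled.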
-
- To reproduce the comparison experiments on FAPL, FGPL and Centralised-PCA using fixed hyperparameters:
python plot.py comparefixed
- To reproduce the comparison experiments on FAPL, FGPL and Centralised-PCA using optimal hyperparameters:
python plot.py compareoptimal
- The main file "main.py" runs a single algorithm on a dataset with the hyperparameters defined in the file "utils/options.py". For example:
python3 main.py --dataset Mnist --batch_size 2 --learning_rate 0.00000002 --ro 1 --num_global_iters 20 --local_epochs 4 --dim 30 --optimizer SGD --algorithm FGPL --subusers 20 --times 1
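
The flags in this example suggest an argparse-style parser. The sketch below is hypothetical: the flag names are copied from the command above, while the types and defaults are assumptions and may differ from what "utils/options.py" actually defines.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags in the example command."""
    p = argparse.ArgumentParser(description="Run one algorithm on one dataset")
    # Types and defaults below are assumptions taken from the example values.
    p.add_argument("--dataset", type=str, default="Mnist")
    p.add_argument("--batch_size", type=int, default=2)
    p.add_argument("--learning_rate", type=float, default=2e-8)
    p.add_argument("--ro", type=float, default=1.0)
    p.add_argument("--num_global_iters", type=int, default=20)
    p.add_argument("--local_epochs", type=int, default=4)
    p.add_argument("--dim", type=int, default=30)
    p.add_argument("--optimizer", type=str, default="SGD")
    p.add_argument("--algorithm", type=str, default="FGPL")
    p.add_argument("--subusers", type=int, default=20)
    p.add_argument("--times", type=int, default=1)
    return p


# Parse the same argument string as in the example command.
example = ("--dataset Mnist --batch_size 2 --learning_rate 0.00000002 --ro 1 "
           "--num_global_iters 20 --local_epochs 4 --dim 30 --optimizer SGD "
           "--algorithm FGPL --subusers 20 --times 1")
args = build_parser().parse_args(example.split())
```

After parsing, values are available as attributes such as args.dataset and args.learning_rate.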