Hongyu Zhang, Dongyi Zheng, Lin Zhong, Xu Yang, Jiyuan Feng, Yunqing Feng, Qing Liao*
This repository contains the source code and baselines of our ECML-PKDD'24 paper *FedHCDR: Federated Cross-Domain Recommendation with Hypergraph Signal Decoupling*. In this paper, we propose FedHCDR, a novel federated cross-domain recommendation framework with hypergraph signal decoupling.
Run the following command to install the dependencies:

```bash
pip install -r requirements.txt
```
Note that my Python version is 3.8.16.
We utilize publicly available datasets from the Amazon website to construct FedCDR scenarios. We select ten domains to generate three cross-domain scenarios: Food-Kitchen-Cloth-Beauty (FKCB), Sports-Clothing-Elec-Cell (SCEC), and Sports-Garden-Home-Toy (SGHT).
The preprocessed CDR datasets can be downloaded from Google Drive and placed in the `./data` directory of this project.
```
FedHCDR
├── LICENSE                    LICENSE file
├── README.md                  README file
├── checkpoint                 Model checkpoints saving directory
│   └── ...
├── data                       Data directory
│   └── ...
├── log                        Log directory
│   └── ...
├── models                     Local model packages
│   ├── __init__.py            Package initialization file
│   ├── dhcf                   dhcf package
│   │   ├── __init__.py        Package initialization file
│   │   ├── dhcf_model.py      Model architecture
│   │   ├── config.py          Model configuration file
│   │   └── modules.py         Backbone modules (such as hypergraph GCN)
│   └── ...
├── pic                        Picture directory
│   └── FedHCDR-Framework.png  Model framework diagram
├── utils                      Tools such as data reading, IO functions, and training strategies
│   ├── __init__.py            Package initialization file
│   ├── data_utils.py          Data reading (including ratings and graphs)
│   ├── io_utils.py            IO functions
│   └── train_utils.py         Training strategies
├── client.py                  Client architecture
├── dataloader.py              Customized dataloader
├── dataset.py                 Customized dataset
├── fl.py                      The overall process of federated learning
├── local_graph.py             Local graph and hypergraph data structures
├── losses.py                  Loss functions
├── main.py                    Main function, including the complete data pipeline
├── requirements.txt           Dependencies
├── server.py                  Server-side aggregation of model parameters and user representations
├── trainer.py                 Training and test methods of FedHCDR and other baselines
└── .gitignore                 .gitignore file
```
To train FedHCDR (ours), run the following command:

```bash
python -u main.py \
    --num_round 60 \
    --local_epoch 3 \
    --eval_interval 1 \
    --frac 1.0 \
    --batch_size 1024 \
    --log_dir log \
    --method FedHCDR \
    --lr 0.001 \
    --seed 42 \
    --lam 2.0 \
    --gamma 2.0 \
    Food Kitchen Clothing Beauty
```
There are a few points to note:

- The positional arguments `Food Kitchen Clothing Beauty` indicate training FedHCDR in the FKCB scenario. If you want to choose another scenario, change them to `Sports Clothing Elec Cell` (SCEC) or `Sports Garden Home Toys` (SGHT).
- The argument `--lam` controls the local-global bi-directional knowledge transfer of the FedHCDR method (ours). The best value is `2.0` for FKCB, `3.0` for SCEC, and `1.0` for SGHT.
- The argument `--gamma` controls the intensity of hypergraph contrastive learning of the FedHCDR method (ours). The best value is `2.0` for FKCB, `1.0` for SCEC, and `3.0` for SGHT.
- If you restart training the model in a certain scenario, you can add the flag `--load_prep` to load the dataset preprocessed in the previous run (including ratings and graphs) and avoid repeated data preprocessing.
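Putting the notes above together, a training run for the SCEC scenario under its reported best hyperparameters (`--lam 3.0`, `--gamma 1.0`) would look like the following sketch; all other flags are kept at the values shown above:

```bash
# Train FedHCDR in the SCEC scenario with its best --lam/--gamma settings.
python -u main.py \
    --num_round 60 \
    --local_epoch 3 \
    --eval_interval 1 \
    --frac 1.0 \
    --batch_size 1024 \
    --log_dir log \
    --method FedHCDR \
    --lr 0.001 \
    --seed 42 \
    --lam 3.0 \
    --gamma 1.0 \
    Sports Clothing Elec Cell
```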
To test FedHCDR, run the following command:

```bash
python -u main.py \
    --log_dir log \
    --method FedHCDR \
    --load_prep \
    --model_id 1709476223 \
    --do_eval \
    --seed 42 \
    Food Kitchen Clothing Beauty
```
Here `--model_id` is the ID under which you saved the model earlier. You can check the IDs of the saved models in the `checkpoint/domain_{$dataset}` directory.
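If you are unsure which IDs are available, you can list them from a shell. This is a sketch assuming the layout described above, with one `domain_*` subdirectory per scenario under `checkpoint/`; adjust the glob if your layout differs:

```bash
# List saved model IDs in each scenario directory under checkpoint/.
for d in checkpoint/domain_*; do
    echo "$d:"
    ls "$d"
done
```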
To train other baselines (LocalMF, LocalGNN, LocalDHCF, FedMF, FedGNN, PriCDR, FedP2FCDR, FedPPDM), run the following command:

```bash
python -u main.py \
    --num_round 60 \
    --local_epoch 3 \
    --eval_interval 1 \
    --frac 1.0 \
    --batch_size 1024 \
    --log_dir log \
    --method FedPPDM \
    --lr 0.001 \
    --seed 42 \
    Food Kitchen Clothing Beauty
```
Here `FedPPDM` can be replaced with the name of the baseline you want to train.
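To run every listed baseline in one pass, a simple shell loop over the method names can be used; this is a sketch that assumes the same flags as the command above:

```bash
# Train each baseline sequentially in the FKCB scenario.
for method in LocalMF LocalGNN LocalDHCF FedMF FedGNN PriCDR FedP2FCDR FedPPDM; do
    python -u main.py \
        --num_round 60 \
        --local_epoch 3 \
        --eval_interval 1 \
        --frac 1.0 \
        --batch_size 1024 \
        --log_dir log \
        --method "$method" \
        --lr 0.001 \
        --seed 42 \
        Food Kitchen Clothing Beauty
done
```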
For the local version without federated aggregation, run the following command:

```bash
python -u main.py \
    --num_round 60 \
    --local_epoch 3 \
    --eval_interval 1 \
    --frac 1.0 \
    --batch_size 1024 \
    --log_dir log \
    --method LocalPPDM \
    --lr 0.001 \
    --seed 42 \
    Food Kitchen Clothing Beauty
```
Similarly, `LocalPPDM` here can be replaced with the local version of the baseline you want to train.
If you find this work useful for your research, please kindly cite FedHCDR by:

```bibtex
@misc{zhang2024fedhcdr,
      title={FedHCDR: Federated Cross-Domain Recommendation with Hypergraph Signal Decoupling},
      author={Hongyu Zhang and Dongyi Zheng and Lin Zhong and Xu Yang and Jiyuan Feng and Yunqing Feng and Qing Liao},
      year={2024},
      eprint={2403.02630},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```