Space4HGNN: A Novel, Modularized and Reproducible Platform to Evaluate Heterogeneous Graph Neural Network
Following GraphGym, we release Space4HGNN, a platform for designing and evaluating Heterogeneous Graph Neural Networks (HGNNs). It is implemented with PyTorch and DGL, using the OpenHGNN package.
The code has been integrated into OpenHGNN. Here we introduce the space4hgnn part of OpenHGNN and how to run it.
.
├── README.md
├── openhgnn
│ ├── __init__.py
│ ├── dataset
│ │ ├── LinkPredictionDataset.py
│ │ ├── NodeClassificationDataset.py
│ │ ├── README.md
│ │ ├── __init__.py
│ │ ├── academic_graph.py
│ │ ├── base_dataset.py
│ │ ├── hgb_dataset.py
│ │ └── utils.py
│ ├── layers
│ │ ├── GeneralGNNLayer.py
│ │ ├── GeneralHGNNLayer.py
│ │ ├── HeteroGraphConv.py
│ │ ├── HeteroLinear.py
│ │ ├── MetapathConv.py
│ │ ├── SkipConnection.py
│ │ └── __init__.py
│ ├── models
│ │ ├── __init__.py
│ │ ├── base_model.py
│ │ ├── general_HGNN.py
│ │ └── homo_GNN.py
│ ├── tasks
│ │ ├── README.md
│ │ ├── __init__.py
│ │ ├── base_task.py
│ │ ├── link_prediction.py
│ │ └── node_classification.py
│ ├── trainerflow
│ │ ├── README.md
│ │ ├── link_prediction.py
│ │ └── node_classification.py
│ └── utils
│ ├── __init__.py
│ ├── activation.py
│ ├── evaluator.py
│ └── utils.py
├── requirements.txt
├── setup.py
├── space4hgnn
│ ├── README.md
│ ├── __init__.py
│ ├── figure
│ │ ├── distribution.py
│ │ └── rank.py
│ ├── generate_yaml.py
│ ├── parallel.sh
│ ├── prediction
│ │ └── excel
│ │ └── gather_all_Csv.py
│ └── utils.py
└── space4hgnn.py
The installation process is the same as in OpenHGNN Get Started.
Here we will generate a random design combination for each dataset and save it in a `.yaml` file. The candidate designs are listed in `./space4hgnn/generate_yaml.py`.
python ./space4hgnn/generate_yaml.py --gnn_type gcnconv --times 1 --key has_bn --configfile test
- `--gnn_type -a`, specify the GNN type; `gcnconv`, `gatconv`, `sageconv`, and `ginconv` are available.
- `--times -t`, the ID of the yaml file, used to control different random samplings.
- `--key -k`, specify a design dimension.
- `--configfile -c`, specify a directory name in which to store the configuration yaml file.
**Note:** the `.yaml` file will be saved at `yaml_file_path`, which is controlled by the four arguments above.
yaml_file_path = './space4hgnn/config/{}/{}/{}_{}.yaml'.format(configfile, key, gnn_type, times)
# Here yaml_file_path = './space4hgnn/config/test/has_bn/gcnconv_1.yaml' with the above example code
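For illustration, generation conceptually amounts to sampling one value per design dimension and dumping the result to `yaml_file_path`. Below is a minimal sketch of that idea; the design dimensions and candidate values shown are illustrative examples only, and the authoritative candidates live in `./space4hgnn/generate_yaml.py`.

```python
import os
import random
import yaml  # requires PyYAML

# Illustrative subset of design dimensions and candidates; the real list is
# defined in ./space4hgnn/generate_yaml.py.
candidates = {
    'has_bn': [True, False],
    'has_l2norm': [True, False],
    'dropout': [0.0, 0.3, 0.6],
    'num_layers': [1, 2, 3],
}

gnn_type, times, key, configfile = 'gcnconv', 1, 'has_bn', 'test'

# Randomly sample one value per design dimension.
design = {dim: random.choice(values) for dim, values in candidates.items()}

yaml_file_path = './space4hgnn/config/{}/{}/{}_{}.yaml'.format(configfile, key, gnn_type, times)
os.makedirs(os.path.dirname(yaml_file_path), exist_ok=True)
with open(yaml_file_path, 'w') as f:
    yaml.dump(design, f)
```

With a configuration file in place, a single experiment can then be launched, for example: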
python space4hgnn.py -m general_HGNN -u metapath -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB
- `--model -m`, name of the model.
- `--subgraph_extraction -u`, subgraph extraction method.
- `--task -t`, name of the task.
- `--dataset -d`, name of the dataset.
- `--gpu -g`, which GPU to use. If you do not have a GPU, set `-g -1`.
- `--repeat -r`, number of times to repeat; default 5.
- `--gnn_type -a`, GNN type.
- `--times -s`, same as when generating the random designs.
- `--key -k`, a design dimension.
- `--value -v`, the value of the key design dimension.
- `--configfile -c`, load the yaml file from the directory `configfile`.
- `--predictfile -p`, the path in which to store prediction files.
We implement three model families in Space4HGNN: the Homogenization model family, the Relation model family, and the Meta-path model family. For example:
For the Homogenization model family, we can omit the parameter `--subgraph_extraction`:
python space4hgnn.py -m homo_GNN -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB
For the Relation model family, `--model` is `general_HGNN` and `--subgraph_extraction` is `relation`:
python space4hgnn.py -m general_HGNN -u relation -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB
For the Meta-path model family, `--model` is `general_HGNN` and `--subgraph_extraction` is `metapath`:
python space4hgnn.py -m general_HGNN -u metapath -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB
**Note:** Similar to generating the yaml file, the experiment will load the design configuration from `yaml_file_path`, and it will save the results into a `.csv` file at `prediction_file_path`.
yaml_file_path = './space4hgnn/config/{}/{}/{}_{}.yaml'.format(configfile, key, gnn_type, times)
# Here yaml_file_path = './space4hgnn/config/test/has_bn/gcnconv_1.yaml'
prediction_file_path = './space4hgnn/prediction/excel/{}/{}_{}/{}_{}_{}_{}.csv'.format(predictfile, key, value, model_family, gnn_type, times, dataset)
# Here prediction_file_path = './space4hgnn/prediction/excel/HGB/has_bn_True/metapath_gcnconv_1_HGBn-ACM.csv'
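The three family-specific commands above differ only in `-m` and `-u`. If you prefer to drive them from Python, a minimal sketch using `subprocess`, simply mirroring the example flag values, could look like this:

```python
import subprocess

# One run per model family: (model, subgraph_extraction); None means the
# --subgraph_extraction flag is omitted (Homogenization model family).
families = [
    ('homo_GNN', None),
    ('general_HGNN', 'relation'),
    ('general_HGNN', 'metapath'),
]

# Flag values taken from the example commands above.
common = ['-t', 'node_classification', '-d', 'HGBn-ACM', '-g', '0', '-r', '5',
          '-a', 'gcnconv', '-s', '1', '-k', 'has_bn', '-v', 'True',
          '-c', 'test', '-p', 'HGB']

for model, extraction in families:
    cmd = ['python', 'space4hgnn.py', '-m', model]
    if extraction is not None:
        cmd += ['-u', extraction]
    cmd += common
    print('Running:', ' '.join(cmd))
    subprocess.run(cmd, check=True)
```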
An example of launching a batch of experiments with `./space4hgnn/parallel.sh`:
./space4hgnn/parallel.sh 0 5 has_bn True node_classification test_paral test_paral
It will generate the configuration files for the batch of experiments and then launch them.
The arguments are described below:
- The first argument controls which GPU to use. Here it is 0.
- Repeat times. Here it is 5.
- Design dimension. Here it is BN.
- Choice of the design dimension. Here BN is set to True.
- Task name. Here it is node_classification.
- Configfile is the path in which to save configuration files.
- Predictfile is the path in which to save prediction files.
**Note:** If you encounter the error `bash: ./space4hgnn/parallel.sh: Permission denied`, try running `chmod +x ./space4hgnn/parallel.sh`.
To gather all experiment results into one `.csv` file, run the following command:
python ./space4hgnn/prediction/excel/gather_all_Csv.py -p ./space4hgnn/prediction/excel/HGB
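Conceptually, this step walks the prediction directory and concatenates every per-run `.csv` into one table. The pandas sketch below illustrates the idea only and is not the actual implementation of `gather_all_Csv.py`; the output name `all_results.csv` is hypothetical.

```python
import glob
import os
import pandas as pd

prediction_dir = './space4hgnn/prediction/excel/HGB'

# Collect every per-experiment .csv below the prediction directory.
csv_files = glob.glob(os.path.join(prediction_dir, '**', '*.csv'), recursive=True)

# Concatenate them into a single table and write one combined file.
combined = pd.concat([pd.read_csv(path) for path in csv_files], ignore_index=True)
combined.to_csv(os.path.join(prediction_dir, 'all_results.csv'), index=False)
print('Gathered {} files'.format(len(csv_files)))
```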
We offer `./figure/result.csv`, which records the experimental results.
We analyze the results with the average ranking, following GraphGym; the corresponding code is in `figure/rank.py`.
We analyze the results with distribution estimates, following NDS; the corresponding code is in `figure/distribution.py`.
Please kindly cite our paper if you use this code:
@inproceedings{zhao2022space4hgnn,
  title={Space4HGNN: A Novel, Modularized and Reproducible Platform to Evaluate Heterogeneous Graph Neural Network},
  author={Zhao, Tianyu and Yang, Cheng and Li, Yibo and Gan, Quan and Wang, Zhenyi and Liang, Fengqi and Zhao, Huan and Shao, Yingxia and Wang, Xiao and Shi, Chuan},
  booktitle={SIGIR},
  year={2022}
}
The code is built on GraphGym, a platform for defining the design space of graph neural networks.