Towards Better Dynamic Graph Learning: New Architecture and Unified Library

Overview

Dynamic Graph Library (DyGLib) is an open-source toolkit with standard training pipelines, extensible coding interfaces, and comprehensive evaluation strategies. It aims to promote standard, scalable, and reproducible dynamic graph learning research. DyGLib covers diverse benchmark datasets and thorough baselines.

Benchmark Datasets and Preprocessing

DyGLib uses thirteen datasets: Wikipedia, Reddit, MOOC, LastFM, Enron, Social Evo., UCI, Flights, Can. Parl., US Legis., UN Trade, UN Vote, and Contact. The first four datasets are bipartite; the others contain nodes of a single type. All of the original dynamic graph datasets come from Towards Better Evaluation for Dynamic Link Prediction and can be downloaded here. First download them and put them in the DG_data folder, then run preprocess_data/preprocess_data.py to preprocess them. For example, to preprocess the Wikipedia dataset, run the following commands:

cd preprocess_data/
python preprocess_data.py --dataset_name wikipedia
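
When several datasets need preprocessing, the command above can be wrapped in a small driver. The following is a minimal sketch, not part of DyGLib: the helper names are hypothetical, it assumes it is launched from the repository root, and only the wikipedia dataset name is taken from this README (add other names only after checking how the script expects them to be spelled). It defaults to a dry run that prints the commands instead of executing them.

```python
import subprocess
import sys

# Only "wikipedia" is confirmed by the README; append further names once
# their exact spelling has been checked against the preprocessing script.
DATASET_NAMES = ["wikipedia"]


def build_preprocess_command(dataset_name):
    """Return the argument list equivalent to the README command."""
    return [sys.executable, "preprocess_data.py", "--dataset_name", dataset_name]


def preprocess_all(dataset_names, dry_run=True):
    """Print (dry run) or execute one preprocessing command per dataset."""
    commands = [build_preprocess_command(name) for name in dataset_names]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # cwd mirrors the "cd preprocess_data/" step shown above.
            subprocess.run(command, cwd="preprocess_data", check=True)
    return commands


preprocess_all(DATASET_NAMES)
```

Passing dry_run=False runs the commands one after another and stops at the first failure, because check=True raises on a non-zero exit code.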

Dynamic Graph Learning Models

DyGLib includes eight popular continuous-time dynamic graph learning methods: JODIE, DyRep, TGAT, TGN, CAWN, EdgeBank, TCL, and GraphMixer. Our recent work, DyGFormer, is also integrated into DyGLib. It explores the correlations between the source node and the destination node through a neighbor co-occurrence encoding scheme, and it benefits effectively and efficiently from longer histories via a patching technique.
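
The two DyGFormer ideas named above can be illustrated with a toy NumPy sketch. This is not the code in the models folder: the function names are hypothetical, padding and learned projections are left out, and the sketch only shows the counting and reshaping steps. For each neighbor in a node's historical sequence, the co-occurrence feature records how often that neighbor appears in the source sequence and in the destination sequence. Patching groups consecutive sequence positions so that the model attends over fewer, wider tokens.

```python
import numpy as np


def neighbor_cooccurrence(src_neighbors, dst_neighbors):
    """Count, per neighbor, its occurrences in the source and destination sequences."""
    src = np.asarray(src_neighbors)
    dst = np.asarray(dst_neighbors)

    def counts(sequence, reference):
        # Number of times each element of `sequence` occurs in `reference`.
        return np.array([int((reference == n).sum()) for n in sequence], dtype=int)

    # Column 0: frequency in the source sequence; column 1: in the destination sequence.
    src_features = np.stack([counts(src, src), counts(src, dst)], axis=1)
    dst_features = np.stack([counts(dst, src), counts(dst, dst)], axis=1)
    return src_features, dst_features


def patchify(features, patch_size):
    """Merge every `patch_size` consecutive rows into a single, wider row."""
    length, dim = features.shape
    if length % patch_size != 0:
        raise ValueError("sequence length must be divisible by patch_size")
    return features.reshape(length // patch_size, patch_size * dim)


# Toy sequences: node ids 1, 2, 3 stand for three distinct neighbors.
src_features, dst_features = neighbor_cooccurrence([1, 2, 1], [2, 2, 1, 3])
print(src_features.tolist())
print(dst_features.tolist())

# A length-64 sequence with patch size 2 becomes 32 patches.
print(patchify(np.zeros((64, 4)), patch_size=2).shape)
```

With the values used in the training examples of this README (max_input_sequence_length 64, patch_size 2), the sequence seen by the model shrinks from 64 positions to 32 patches, which is how longer histories stay affordable.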

Evaluation Tasks

DyGLib supports dynamic link prediction under both transductive and inductive settings with three (i.e., random, historical, and inductive) negative sampling strategies, as well as dynamic node classification.
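
The difference between the three negative sampling strategies can be sketched with plain Python sets. The snippet below is an illustration under stated assumptions, not DyGLib's sampler: it follows the definitions commonly given for these strategies (random: any source/destination pair; historical: edges seen during training; inductive: edges seen at test time but never during training), it removes the edges that are positive at the current step in all three cases, and every function and parameter name is hypothetical.

```python
import random


def negative_candidates(strategy, src_nodes, dst_nodes,
                        train_edges, test_edges, current_edges):
    """Return the sorted pool of candidate negative edges for one strategy."""
    if strategy == "random":
        pool = {(u, v) for u in src_nodes for v in dst_nodes}
    elif strategy == "historical":
        pool = set(train_edges)
    elif strategy == "inductive":
        pool = set(test_edges) - set(train_edges)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Edges that are positive at the current step cannot serve as negatives.
    return sorted(pool - set(current_edges))


def sample_negative_edges(candidates, num_samples, seed=0):
    """Draw negatives with replacement from a non-empty candidate pool."""
    if not candidates:
        raise ValueError("no candidate negative edges available")
    rng = random.Random(seed)
    return [rng.choice(candidates) for _ in range(num_samples)]


# Toy graph (assumed data): sources 1-3, destinations 10-12.
train_edges = [(1, 10), (2, 11)]
test_edges = [(1, 10), (3, 12), (2, 12)]
current_edges = [(3, 12)]

for strategy in ("random", "historical", "inductive"):
    pool = negative_candidates(strategy, [1, 2, 3], [10, 11, 12],
                               train_edges, test_edges, current_edges)
    print(strategy, len(pool), sample_negative_edges(pool, 2))
```

Historical and inductive negatives are harder than random ones because the sampled edges did occur at some point in time, so a model cannot reject them merely by memorizing which pairs ever interacted.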

Incorporate New Datasets or New Models

New datasets and new models are welcome and can be incorporated into DyGLib through pull requests.

For new datasets: The format of a new dataset should satisfy the requirements in DG_data/DATASETS_README.md. Users can put the new dataset in the DG_data folder and then run preprocess_data/preprocess_data.py to get the processed dataset.
For new models: Users can put the model implementation in the models folder and then create the model in train_xxx.py or evaluate_xxx.py to run it.

Environments

PyTorch 1.8.1, numpy, pandas, and tqdm
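
Before launching a long run, it can help to confirm that these packages are importable. The following is a small sketch using only the standard library. The helper name is hypothetical, and it checks presence only, not versions (torch is the import name of PyTorch).

```python
import importlib.util

# Import names of the packages listed above.
REQUIRED_PACKAGES = ["torch", "numpy", "pandas", "tqdm"]


def find_missing_packages(package_names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in package_names
            if importlib.util.find_spec(name) is None]


missing = find_missing_packages(REQUIRED_PACKAGES)
print("missing packages:", ", ".join(missing) if missing else "none")
```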

Executing Scripts

Scripts for Dynamic Link Prediction

Dynamic link prediction can be performed on all thirteen datasets. To load the best model configurations determined by grid search, set the load_best_configs argument to True.

Model Training

Example of training DyGFormer on the Wikipedia dataset:

python train_link_prediction.py --dataset_name wikipedia --model_name DyGFormer --patch_size 2 --max_input_sequence_length 64 --num_runs 5 --gpu 0

To train DyGFormer on the Wikipedia dataset with the best model configurations, run

python train_link_prediction.py --dataset_name wikipedia --model_name DyGFormer --load_best_configs --num_runs 5 --gpu 0

Model Evaluation

Three (i.e., random, historical, and inductive) negative sampling strategies can be used for model evaluation.

Example of evaluating DyGFormer with the random negative sampling strategy on the Wikipedia dataset:

python evaluate_link_prediction.py --dataset_name wikipedia --model_name DyGFormer --patch_size 2 --max_input_sequence_length 64 --negative_sample_strategy random --num_runs 5 --gpu 0

To evaluate DyGFormer with the random negative sampling strategy on the Wikipedia dataset using the best model configurations, run

python evaluate_link_prediction.py --dataset_name wikipedia --model_name DyGFormer --negative_sample_strategy random --load_best_configs --num_runs 5 --gpu 0
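
To evaluate under all three strategies in one go, the command above can be generated in a loop. This is a minimal sketch rather than a script shipped with DyGLib: the helper names are hypothetical, it assumes it is launched from the repository root, and it assumes that --negative_sample_strategy accepts the three strategy names exactly as they are written in this section (only random appears in the examples). It prints the commands by default.

```python
import subprocess
import sys

# Strategy names as listed in this section; "historical" and "inductive"
# are assumed to be valid values for --negative_sample_strategy.
NEGATIVE_SAMPLE_STRATEGIES = ["random", "historical", "inductive"]


def build_eval_command(dataset_name, model_name, strategy, num_runs=5, gpu=0):
    """Return the evaluation command from the README as an argument list."""
    return [
        sys.executable, "evaluate_link_prediction.py",
        "--dataset_name", dataset_name,
        "--model_name", model_name,
        "--negative_sample_strategy", strategy,
        "--load_best_configs",
        "--num_runs", str(num_runs),
        "--gpu", str(gpu),
    ]


def evaluate_all_strategies(dataset_name, model_name, dry_run=True):
    """Print (dry run) or execute one evaluation per negative sampling strategy."""
    commands = [build_eval_command(dataset_name, model_name, strategy)
                for strategy in NEGATIVE_SAMPLE_STRATEGIES]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
    return commands


evaluate_all_strategies("wikipedia", "DyGFormer")
```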

Scripts for Dynamic Node Classification

Dynamic node classification can be performed on Wikipedia and Reddit (the only two datasets with dynamic labels).

Model Training

Example of training DyGFormer on the Wikipedia dataset:

python train_node_classification.py --dataset_name wikipedia --model_name DyGFormer --patch_size 2 --max_input_sequence_length 64 --num_runs 5 --gpu 0

To train DyGFormer on the Wikipedia dataset with the best model configurations, run

python train_node_classification.py --dataset_name wikipedia --model_name DyGFormer --load_best_configs --num_runs 5 --gpu 0

Model Evaluation

Example of evaluating DyGFormer on the Wikipedia dataset:

python evaluate_node_classification.py --dataset_name wikipedia --model_name DyGFormer --patch_size 2 --max_input_sequence_length 64 --num_runs 5 --gpu 0

To evaluate DyGFormer on the Wikipedia dataset with the best model configurations, run

python evaluate_node_classification.py --dataset_name wikipedia --model_name DyGFormer --load_best_configs --num_runs 5 --gpu 0

Acknowledgments

We are grateful to the authors of TGAT, TGN, CAWN, EdgeBank, and GraphMixer for making their project code publicly available.

Citation

Please consider citing our paper when using this project.

@article{yu2023towards,
  title={Towards Better Dynamic Graph Learning: New Architecture and Unified Library},
  author={Yu, Le and Sun, Leilei and Du, Bowen and Lv, Weifeng},
  journal={arXiv preprint arXiv:2303.13047},
  year={2023}
}
