Official github repository for the paper "Efficient Unsupervised Community Search with Pre-trained Graph Transformer"

guaiyoui/TransZero


Code for the paper "Efficient Unsupervised Community Search with Pre-trained Graph Transformer", accepted by VLDB 2024.

License: MIT

Fast Start

0: unzip dataset.zip
1: python link_pretrain.py --dataset cora --batch_size 2708 --dropout 0.1 --hidden_dim 512 --hops 5  --n_heads 8 --n_layers 1 --pe_dim 3 --peak_lr 0.01  --weight_decay=1e-05 --epochs 100
2: python accuracy_globalsearch.py

Train all datasets

bash ./training_all.sh

Test all datasets

bash ./test_all_global.sh >> ./logs/test_all_global.txt 2>&1 &
bash ./test_all_local.sh >> ./logs/test_all_local.txt 2>&1 &

Dataset and query generation

In the folder "dataset_dealing", we provide scripts to download the datasets and generate the queries automatically.

Due to space limits, we provide the processed datasets for cora, citeseer, and photo only. The other datasets can be generated by the following procedure.

1: create the dataset folder via "unzip dataset.zip" or "mkdir dataset"
2: enter the dataset folder via "cd dataset"
3: make a folder for each dataset, e.g., "mkdir texas"
4: use the scripts in dataset_dealing to download the datasets and generate the queries. Note that there are two scripts for each dataset, e.g., "texas_download_pyg.py" and "texas_data.py".
The first script downloads the dataset automatically and the second generates the queries. Put the first script under "./dataset/" and the second under "./dataset/dataset_name/", e.g., "./dataset/texas/".
5: python texas_download_pyg.py
6: python texas_data.py
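For intuition, the following is a minimal sketch of one common query-generation setup for community search: sample query nodes and take all nodes sharing the query's label as the ground-truth community. This is an illustration only; the actual generation is performed by the per-dataset scripts in dataset_dealing (e.g., texas_data.py), and the function name and signature below are hypothetical.

```python
import random

def generate_queries(labels, num_queries=5, seed=0):
    """Sample query nodes; for each, the ground-truth community is the set
    of all nodes with the same label (a common community-search setup)."""
    rng = random.Random(seed)
    nodes = list(range(len(labels)))
    queries = rng.sample(nodes, num_queries)
    # Return (query_node, ground_truth_community) pairs.
    return [(q, [v for v in nodes if labels[v] == labels[q]]) for q in queries]

# Toy example: 6 nodes with 3 community labels.
labels = [0, 0, 1, 1, 1, 2]
pairs = generate_queries(labels, num_queries=2)
```

Each returned community contains its query node, since a node always shares its own label.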

Folder Structure

.
├── dataset                     # make a new folder by "mkdir dataset"
├── dataset_dealing             # scripts to download and process datasets automatically
├── logs                        # the running logs
├── model                       # the saved model
├── pretrain_result             # the pretrained latent representation
├── scripts                     # the scripts to run the model and the experiments
├── accuracy_globalsearch.py    # the IESG solver: Global Binary Search
├── accuracy_localsearch.py     # the IESG solver: Local Search
├── data_loader.py              # data loader
├── early_stop.py               # early stop module to alleviate overfitting
├── layer.py                    # the layer in the network
├── link_pretrain.py            # the overall entrance for the model
├── lr.py                       # the learning rate module
├── model.py                    # the model definition
├── utils.py                    # the utils used
├── test_all_global.sh          # test the performance of all datasets by global binary search
├── test_all_local.sh           # test the performance of all datasets by local search
├── training_all.sh             # the script to train all the models
└── README.md
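As a loose illustration of what a global binary search over node scores does, the sketch below binary-searches a score threshold so that the induced community reaches a target size. This is not the repository's IESG solver (accuracy_globalsearch.py differs in its objective and details); the function name and parameters are assumptions for illustration.

```python
def threshold_community(scores, target_size, iters=30):
    """Binary-search a threshold over per-node scores so that the set of
    nodes scoring at or above it has roughly target_size members."""
    lo, hi = min(scores), max(scores)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        size = sum(s >= mid for s in scores)
        if size > target_size:
            lo = mid  # community too large: raise the threshold
        else:
            hi = mid  # community small enough: lower the upper bound
    thresh = (lo + hi) / 2.0
    return [i for i, s in enumerate(scores) if s >= thresh]

# Toy example: 4 nodes, keep the 2 highest-scoring ones.
members = threshold_community([0.9, 0.8, 0.1, 0.2], target_size=2)
# members == [0, 1]
```

The binary search converges in O(iters) passes over the scores, which is what makes a global (whole-graph) search over pretrained scores cheap at query time.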

Citation

@article{wang2024efficient,
  title={Efficient Unsupervised Community Search with Pre-trained Graph Transformer},
  author={Wang, Jianwei and Wang, Kai and Lin, Xuemin and Zhang, Wenjie and Zhang, Ying},
  journal={arXiv preprint arXiv:2403.18869},
  year={2024}
}
