using graph convolutional neural network and spaital transcriptomics data to infer cell-cell interactions
GCNG for extracellular gene relationship inference. (A) GCNG model using spatial single cell expression data. A binary cell adjacent matrix and an expression matrix are extracted from spatial data. After normalization, both matrices are fed into the graph convolutional network. (B) Training and test data separation and generation strategy. The known ligand and receptor genes can form complicated directed networks. For cross validation, all ligand and receptors are separated exclusively as training and test gene sets, and only gene pairs where both genes are in training (test) are used for training (test). To balance the dataset, each positive ligand-receptor (La; Rb) gene pair with label 1 will have a negative pair sample (La; Rx) with label 0 where Rx was randomly selected from all training (test) receptor genes which are not interacting with La in training (test).
Author's environment is python 3.6.3 in a Linux server which is now running Centos 6.5 as the underlying OS and Rocks 6.1.1 as the cluster management revision.
Please used the old version of spektral as "spektral"github suggests : https://github.com/danielegrattarola/spektral#tensorflow-1-and-keras
Users should first set the path as the downloaded folder.
python data_generation_interaction_ten_fold.py
data_generation_interaction_ten_fold.py
uses the spatial location data to generate normalized adjacent matrix of cells, and save it in seqfish_plus
folder; also uses the expression data to generate expression matrix for ten fold cross validation, and save it in rand_1_10fold
folder.
python gcn_LR2_LR_as_nega_big.py
gcn_LR2_LR_as_nega_big.py
exocrine GNCG that uses normalized adjacent matrix to generate normalized laplacian matrix, and then uses laplacian matrix to train and test GCNG models in ten fold cross validation.
(
python gcn_LR2_LR_as_nega_big_plus_autocrine.py
gcn_LR2_LR_as_nega_big_plus_autocrine.py
autocrine plus GNCG that uses adjacent matrix plus diagonal matrix to generate laplacian matrix, and then uses laplacian matrix to train and test GCNG models in ten fold cross validation
)
python gcn_LR2_LR_as_nega_big_layer_predict_min.py
gcn_LR2_LR_as_nega_big_layer_predict_min.py
tries to find the optimal model during the trainning, by monitoring the validation dataset's accuracy.
python predict_analysis_more_kegg_tfs_average_whole_new_rand.py
predict_analysis_more_kegg_tfs_average_whole_new_rand.py
collects results of the optimal model in each fold to present the final results.