SpaTrio is a computational tool based on optimal transport that aligns single-cell multi-omics data in space while preserving the spatial topology of the tissue section and the local geometry of each modality.
This toolkit is written in both R and Python programming languages. The core optimal transport algorithm is implemented in Python, while the initial data preparation and downstream multimodal analysis are written in R.
# We recommend using Anaconda, and then you can create a new environment.
# Create and activate Python environment
conda create -n spatrio python=3.8
conda activate spatrio
# Install requirements
cd SpaTrio-main
pip install -r requirements.txt
# Install spatrio
python setup.py build
python setup.py install
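After the Python installation steps, a quick check that the package is visible in the active environment can save debugging time later. The snippet below is only a minimal sketch: the import name `spatrio` is an assumption based on the install comment above, and `is_installed` is a hypothetical helper, not part of SpaTrio.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located in the active environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "spatrio" as the import name is an assumption; adjust it if the
    # installed package exposes a different module name.
    for name in ("spatrio", "pandas"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If `spatrio` is reported as missing, the most likely cause is that the `spatrio` conda environment was not active when the install commands were run.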
install.packages("doParallel")
# Install SpaTrio package from local file
install.packages("SpaTrio_1.0.0.tar.gz", repos = NULL, type = "source")
SpaTrio takes five kinds of formatted .csv files as input (they are read with pandas).
- multi_rna.csv/spatial_rna.csv (The gene expression matrix of cells/spots)
| | Cell1 | ··· | Celln |
|---|---|---|---|
| Gene1 | 0 | ··· | 1 |
| ··· | ··· | ··· | ··· |
| Genem | 2 | ··· | 1 |
- multi_meta.csv/spatial_meta.csv (The meta information matrix of cells/spots)
| | id | type |
|---|---|---|
| Cell1 | Cell1 | A |
| ··· | ··· | ··· |
| Celln | Celln | B |
- emb.csv (The low-dimensional embedding matrix of cells)
| | emb1 | ··· | embk |
|---|---|---|---|
| Cell1 | 1.997 | ··· | -0.307 |
| ··· | ··· | ··· | ··· |
| Celln | 2.307 | ··· | 2.119 |
- pos.csv (The spatial location matrix of spots)
| | x | y |
|---|---|---|
| Cell1 | 0.28 | 10.65 |
| ··· | ··· | ··· |
| Celln | 5.98 | 2.16 |
- ref_counts.csv (The number of cells contained in each spot)
| | cell_num |
|---|---|
| Spot1 | 0 |
| ··· | ··· |
| Spotj | 1 |
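Because the input files refer to each other by cell and spot names, it can help to confirm that the labels agree before running the alignment. The following is a minimal sketch, not part of SpaTrio: `read_labels` and `check_inputs` are hypothetical helpers, and the cross-file rules they enforce (expression-matrix columns matching the row names of the metadata, embedding, position and count files) are assumptions inferred from the layouts shown above. It uses only the standard library; `pandas.read_csv(path, index_col=0)` would be an equivalent way to load the same files.

```python
import csv
from pathlib import Path


def read_labels(path):
    """Return (row_labels, column_labels) of a CSV whose first column holds row names."""
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle) if row]
    header, body = rows[0], rows[1:]
    return [row[0] for row in body], header[1:]


def check_inputs(data_dir):
    """Collect label mismatches between the input files; an empty list means consistent."""
    data_dir = Path(data_dir)
    problems = []

    # Single-cell side: the columns of the expression matrix are assumed to
    # name the same cells as the rows of the metadata and embedding files.
    _, cells = read_labels(data_dir / "multi_rna.csv")
    for name in ("multi_meta.csv", "emb.csv"):
        row_labels, _ = read_labels(data_dir / name)
        if set(cells) != set(row_labels):
            problems.append(f"multi_rna.csv columns differ from {name} rows")

    # Spatial side: the columns of the expression matrix are assumed to
    # name the same spots as the rows of the metadata, position and count files.
    _, spots = read_labels(data_dir / "spatial_rna.csv")
    for name in ("spatial_meta.csv", "pos.csv", "ref_counts.csv"):
        row_labels, _ = read_labels(data_dir / name)
        if set(spots) != set(row_labels):
            problems.append(f"spatial_rna.csv columns differ from {name} rows")

    return problems
```

Calling `check_inputs` on a directory that holds the files returns a list of human-readable messages, so an empty list indicates that the labels line up under the assumptions stated above.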
Two test datasets (demo1 and demo2) are included in the tutorial/data/ directory of this repository as examples of how to use SpaTrio to align cells to space.
Simulated data in the strip pattern:
Simulated data in the ring pattern:
More importantly, we support directly calling the core functions written in Python from the R language to facilitate downstream analysis.
DBiT-seq mouse embryo datasets (Google Drive):
10x Visium+ADT mouse liver datasets (Google Drive):
We have applied spatrio to different tissues from multiple species. Step-by-step tutorials are given here for all application scenarios, and the preprocessed datasets they use can be downloaded from Google Drive.
- Using spatrio to reconstruct and analyze single-cell multi-modal data of mouse cerebral cortex
- Using spatrio to reconstruct and analyze single-cell multi-modal data of human steatosis liver
- Using spatrio to reconstruct and analyze single-cell multi-modal data of human breast cancer
Should you have any questions, please feel free to contact the author of the manuscript, Mr. Penghui Yang ([email protected]).