Here we explain how the analysis was performed for our paper ‘Robust decomposition of cell type mixtures in spatial transcriptomics’, which introduces and validates the RCTD R package. You may access the RCTD open-source R package here.
The data generated and/or used in this study may be accessed at the Broad Institute’s Single Cell Portal. This repository contains both the Slide-seq datasets used in this study, and the single-cell RNA-sequencing references.
For each single-cell dataset, we generated a Seurat object and saved it as an RDS file. For example, the script dropSeqProcess.R was used to convert the hippocampus single-cell dataset to a Seurat object.
To obtain a simulated doublet dataset from each of the single-cell and single-nucleus references, we ran the script doubletsimulation.R.
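The idea behind a simulated doublet can be sketched as follows. This is a minimal illustration, not the contents of doubletsimulation.R: it assumes a doublet is built by drawing a fixed number of transcripts from two reference cells, with each transcript coming from the first cell with a chosen probability. The function name `simulate_doublet`, its parameters, and the toy gene counts are all hypothetical.

```python
import random


def simulate_doublet(counts_a, counts_b, proportion_a, total_umi, rng):
    """Mix two cells into one synthetic doublet (illustrative assumption).

    counts_a, counts_b: dicts mapping gene name -> integer UMI count.
    proportion_a: probability that a sampled transcript comes from cell A.
    total_umi: number of transcripts in the simulated doublet.
    rng: a random.Random instance, passed in for reproducibility.
    """
    genes = sorted(set(counts_a) | set(counts_b))
    # Expand each cell into a pool with one entry per observed transcript.
    pool_a = [g for g in genes for _ in range(counts_a.get(g, 0))]
    pool_b = [g for g in genes for _ in range(counts_b.get(g, 0))]
    doublet = dict.fromkeys(genes, 0)
    for _ in range(total_umi):
        # Choose the source cell, then sample one transcript with replacement.
        pool = pool_a if rng.random() < proportion_a else pool_b
        doublet[rng.choice(pool)] += 1
    return doublet


# Toy example with made-up counts for two cells of different types.
cell_a = {"Gad1": 5, "Actb": 5}
cell_b = {"Slc17a7": 8, "Actb": 2}
mixed = simulate_doublet(cell_a, cell_b, 0.3, 100, random.Random(0))
print(mixed)
```

Because the true mixing proportion is known for every simulated doublet, such a dataset gives a ground truth against which a decomposition method can be checked.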
To cluster interneuron subtypes into three subtype classes, we ran the script subcluster.R. This script additionally creates a Seurat object for the interneuron subtypes and computes average cell type profiles.
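As a rough illustration of what an average cell type profile is, the sketch below averages per-cell normalized expression within each cell type. It is not taken from subcluster.R; the normalization (dividing each cell by its total count) and the helper name `average_profiles` are assumptions made for this example.

```python
from collections import defaultdict


def average_profiles(cells, labels):
    """Return cell type -> gene -> mean normalized expression.

    cells: list of dicts mapping gene name -> count, one dict per cell.
    labels: list of cell type names, aligned with `cells`.
    Assumption: each cell is normalized by its total count before averaging.
    """
    sums = defaultdict(lambda: defaultdict(float))
    n_cells = defaultdict(int)
    for counts, label in zip(cells, labels):
        total = sum(counts.values())
        if total == 0:
            continue  # skip empty cells, which cannot be normalized
        n_cells[label] += 1
        for gene, count in counts.items():
            sums[label][gene] += count / total
    return {
        label: {gene: s / n_cells[label] for gene, s in gene_sums.items()}
        for label, gene_sums in sums.items()
    }


# Toy example: two cells of one hypothetical type "x".
profiles = average_profiles([{"A": 1, "B": 1}, {"A": 3, "B": 1}], ["x", "x"])
print(profiles)  # {'x': {'A': 0.625, 'B': 0.375}}
```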
For each dataset, RCTD was run according to the instructions for the RCTD package. Configuration files used are located in conf. Specifically, ‘datasetCerPuck.yml’ was used for the cerebellum Slide-seq dataset, ‘datasetHippoPuck.yml’ was used for the hippocampus Slide-seq dataset, ‘datasetCross.yml’ was used for the simulated cerebellum doublet dataset, and ‘datasetInterneuronCoarse.yml’ and ‘datasetHippoInterneuron.yml’ were used for running RCTD on interneuron subtypes.
On the simulated doublets dataset, in addition to running RCTD with the typical pipeline, the script weightDecompose.R was used to evaluate RCTD’s ability to predict cell type proportion.
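One simple way to score predicted cell type proportions against the known proportions of simulated doublets is the root-mean-square error. The sketch below is an assumption about how such a comparison could be made, not a description of weightDecompose.R; the function name `proportion_rmse` is hypothetical.

```python
import math


def proportion_rmse(true_props, predicted_props):
    """Root-mean-square error between true and predicted proportions.

    Both arguments are equal-length sequences of values in [0, 1],
    one entry per simulated doublet.
    """
    if len(true_props) != len(predicted_props):
        raise ValueError("inputs must have the same length")
    if not true_props:
        raise ValueError("inputs must not be empty")
    squared = [(t - p) ** 2 for t, p in zip(true_props, predicted_props)]
    return math.sqrt(sum(squared) / len(squared))


print(proportion_rmse([0.25, 0.75], [0.5, 0.5]))  # 0.25
```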
We provide R Markdown files that were used to create the main figures:
- Unsupervised clustering on the Slide-seq cerebellum (Figure 1)
- Platform Effect Prediction (Figures 1 and 2)
- Comparison of Ordinary Least Squares and RCTD (Figures 1 and 2)
- Validation of RCTD on decomposition of simulated doublets (Figure 3)
- RCTD on the Slide-seq cerebellum (Figure 4)
- Spatially localizing 27 interneuron subtypes (Figure 5)
- RCTD on the Hippocampus and spatially localizing three interneuron subclasses (Figure 5)
- Finding Astrocyte Genes Dependent on Cellular Colocalization (Figure 6)
- Finding Spatially Variable Genes (Figure 6)
We have also provided here additional R Markdown files to update these analyses to be compatible with the current version of RCTD:
Preprocessing of the Visium dataset was performed using processVisium.R. NMFreg was run on the Slide-seq cerebellum using the NMFreg IPython notebook, with pre-processing and post-processing done in R. Supplemental figures were generated with the supp.Rmd and supp_part2.Rmd R Markdown files.