SECANT (Beta)

SECANT is a biology-guided SEmi-supervised method for Clustering, classification, and ANnoTation of single-cell multi-omics.

SECANT can be used to analyze CITE-seq data, or jointly analyze CITE-seq and scRNA-seq data. The novelties of SECANT include:

1. using confident cell type labels classified from surface protein data through gating as guidance for cell clustering with RNA data
1. providing general annotation of confident cell types for each cell cluster
1. fully utilizing cells with uncertain or missing cell type labels to increase performance
1. accurately predicting, for scRNA-seq data, the confident cell types identified from surface protein data

In general, the inputs to SECANT include:

1. ADT confident cell type labels L, where L ranges from 0 to C. Each value below C refers to one confident cell type, such as B cells or monocytes. The maximum value C indicates an uncertain cell type (e.g., cells on the boundary between cell types in a gating plot)
1. RNA data after dimension reduction (e.g., scVI or PCA)
1. Optional (for the purpose of jointly analyzing CITE-seq and scRNA-seq data): RNA data after dimension reduction and batch effect correction
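
As a rough illustration of the first two inputs, the sketch below builds a label vector and a reduced RNA matrix from synthetic data. The array names, the toy cell types, and the choice of 10 principal components are assumptions made for illustration; they are not part of SECANT's API, and the notebooks below show the actual calls.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical gating output: one string per cell, "uncertain" for boundary cells.
cell_types = ["B cells", "Monocytes"]  # confident cell types, encoded as 0 and 1
gated = rng.choice(cell_types + ["uncertain"], size=200)

# Encode labels as integers 0..C, with the maximum value C marking uncertain cells.
C = len(cell_types)
type_to_code = {name: i for i, name in enumerate(cell_types)}
L = np.array([type_to_code.get(name, C) for name in gated])

# Synthetic stand-in for a normalized cells x genes RNA matrix.
rna = rng.normal(size=(200, 50))

# Dimension reduction with PCA (scVI would be an alternative).
rna_reduced = PCA(n_components=10, random_state=0).fit_transform(rna)

print(L[:10], rna_reduced.shape)
```

Cells whose gating result is not one of the confident types all receive the value C, which matches the labeling convention described in the list above.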

Get Started

Analyzing CITE-seq data

Here, we demonstrate this functionality with public human PBMC data, bone marrow data and upper lobe lung data. The same pipeline would generally be used to analyze any CITE-seq dataset.

PBMC10k: SECANT_GitHub_10X10k_PBMC.ipynb
Bone marrow: SECANT_GitHub_Bone_marrow.ipynb
Lung: SECANT_GitHub_Upper_lobe_lung.ipynb

Jointly analyzing CITE-seq and scRNA-seq data

Here we demonstrate how to jointly analyze CITE-seq and scRNA-seq datasets with SECANT using two public PBMC CITE-seq datasets from 10x Genomics, namely 10X10k and 10X5k. We use the entire 10X10k dataset (i.e., both ADT and RNA) and hold out the ADT data of the 10X5k dataset to mimic scRNA-seq. We keep the held-out ADT values so that we can validate our results against them.

SECANT_GitHub_Joint_10X.ipynb
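
The hold-out validation described above can be sketched in a few lines. All arrays here are hypothetical toy values, and `predicted_5k` is a hand-written stand-in for model output rather than anything produced by SECANT; the point is only to show that the held-out labels are set aside and compared on confident cells afterwards.

```python
import numpy as np

C = 3  # assumed: labels 0..2 are confident cell types, 3 means uncertain

# Hypothetical ADT-derived labels for the two datasets.
labels_10k = np.array([0, 0, 1, 2, 3, 1, 2, 0])
labels_5k = np.array([0, 1, 2, 3, 1])

# Set the 10X5k labels aside; they are used only for validation.
labels_5k_truth = labels_5k.copy()

# Only the 10X10k labels act as guidance; 10X5k cells are treated as scRNA-seq.
guidance_labels = labels_10k

# Stand-in for the predicted cell types of the 10X5k cells.
predicted_5k = np.array([0, 1, 1, 2, 1])

# Compare only on cells whose held-out label is a confident cell type.
confident = labels_5k_truth < C
accuracy = np.mean(predicted_5k[confident] == labels_5k_truth[confident])
print(accuracy)  # 0.75
```

Cells with the uncertain label are excluded from the comparison because they have no confident ground truth to validate against.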

Search for the best configuration of concordance matrix in a data-driven manner

Because this step is computationally intensive, we suggest running it in parallel on a server with multiple CPUs or GPUs. Here is an example: SECANT_GitHub_Search_Best_Config.ipynb

Simulation study

We provide an example simulation study that covers both generating simulated data and assessing performance. Because the simulation is computationally intensive, we recommend running it on a server with multiple CPUs or GPUs. To replicate the results using Google Colab, copy all files under simulation_files to Google Drive and mount Google Drive in Google Colab. SECANT_GitHub_simulation.ipynb
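
A minimal sketch of the Colab setup follows. The helper name `simulation_dir` and both folder names are assumptions; adjust the Drive path to wherever the files were actually copied. Outside Colab the helper simply falls back to a local folder.

```python
import os


def simulation_dir(local_dir="simulation_files", drive_subdir="simulation_files"):
    """Return the folder holding the simulation files.

    In Colab, mount Google Drive and point at a folder inside it;
    elsewhere, fall back to a local folder. Both folder names are assumptions.
    """
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return local_dir
    drive.mount("/content/drive")
    return os.path.join("/content/drive/MyDrive", drive_subdir)


data_dir = simulation_dir()
print(data_dir)
```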

Datasets

A collection of datasets is available with SECANT. All datasets stored in this repository were pre-processed with scVI.

Public data:

Dataset	Number of cells	Description	Original data source
10X10k_PBMC	7,865	Human PBMCs (from 10X Genomics)	source
10X5k_PBMC	5,527	Human PBMCs (from 10X Genomics)	source
Bone_marrow	30,672	Human bone marrow	source
Upper_lobe_lung	5,451	Human upper lobe lung (on GEO, use DropletUtils for pre-processing)	source

In-house data:

In-house data will be available soon.

Installation:

From source

Download a local copy of SECANT and install from the directory:

git clone https://github.com/tarot0410/SECANT.git
cd SECANT
pip install .

Dependencies

PyTorch (torch), scikit-learn (sklearn), umap-learn (umap), pandas, NumPy, and all of their respective dependencies.

Other relevant material

Example of using an automatic gating tool to classify major cell types with CITE-seq data

FLOCK + LDA for PBMC data: AutoGating.html

Clustering uncertainty used in downstream analysis

Here, we give an example of using the clustering uncertainty from SECANT (expressed as posterior probabilities) in downstream analysis. Specifically, as a sensitivity analysis, we remove cells with low-confidence clustering results before running trajectory analysis.

Trajectory analysis of CD8+ T cells: Trajectory.html
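
The filtering step described above can be sketched as follows. The posterior matrix and the 0.8 cutoff are made-up values for illustration; how the posterior probabilities are obtained from SECANT is shown in the notebooks, not here.

```python
import numpy as np

# Hypothetical posterior probabilities: rows are cells, columns are clusters.
posterior = np.array([
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.10, 0.85, 0.05],
    [0.34, 0.33, 0.33],
])

cluster = posterior.argmax(axis=1)   # most likely cluster per cell
confidence = posterior.max(axis=1)   # posterior probability of that cluster

threshold = 0.8                      # assumed cutoff; tune per dataset
keep = confidence >= threshold       # boolean mask of confidently clustered cells

print(cluster[keep])                 # [0 1]
print(keep.sum(), "of", len(keep), "cells kept")
```

Rerunning the downstream analysis with several thresholds shows how sensitive the conclusions are to ambiguous cells.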

Paper

Wang X, Xu Z, Hu H, Zhou X, Zhang Y, Lafyatis R, Chen K, Huang H, Ding Y, Duerr RH, Chen W. SECANT: a biology-guided semi-supervised method for clustering, classification, and annotation of single-cell multi-omics. PNAS Nexus. 2022
