Update tutorials
Fixes minor spell typos and updates the installation page
with the conda installation instructions, tweaks some images
and adds authors to the license.
Coral Fustero-Torre authored and SGMartin committed Jan 25, 2021
1 parent 6df1914 commit 380e7a5
Showing 8 changed files with 117 additions and 39 deletions.
Binary file modified .img/drug_signatures.png
Binary file removed .img/workflow_landscape.png
Binary file not shown.
Binary file modified .img/workflow_tutorial.png
4 changes: 3 additions & 1 deletion LICENSE
@@ -6,7 +6,9 @@ National Cancer Research Center (CNIO), www.cnio.es.

ACADEMIC PUBLIC LICENSE

Copyright (C) 2021 Coral Fustero-Torre, María José Jiménez-Santos, Santiago García-Martín, Carlos Carretero-Puche, Luis García-Jimeno, Tomás Di Domenico, Gonzalo Gómez-López and Fátima Al-Shahrour
Copyright (C) 2021 Coral Fustero-Torre, María José Jiménez-Santos,
Santiago García-Martín, Carlos Carretero-Puche, Luis García-Jimeno,
Tomás Di Domenico, Gonzalo Gómez-López and Fátima Al-Shahrour


Preamble
24 changes: 12 additions & 12 deletions README.md
@@ -7,7 +7,7 @@

## Workflow overview

**Beyondcell workflow.** Given two inputs, the scRNA-seq expression matrix and a collection of drug signatures, the methodology calculates a beyondcell score (BCS) for each drug-cell pair. The BCS ranges from 0 to 1 and measures the susceptibility of each cell to a given drug. The resulting BCS matrix can be used to determine the sample’s therapeutic clusters. Furthermore, drugs are prioritized in a table and each individual drug score can be visualized in a UMAP.
**Beyondcell workflow.** Given two inputs, the scRNA-seq expression matrix and a collection of drug signatures, the methodology calculates a Beyondcell score (BCS) for each drug-cell pair. The BCS ranges from 0 to 1 and measures the susceptibility of each cell to a given drug. The resulting BCS matrix can be used to determine the sample’s therapeutic clusters. Furthermore, drugs are prioritized in a table and each individual drug score can be visualized in a UMAP.

![Beyondcell workflow](./.img/workflow_tutorial.png)

@@ -22,18 +22,19 @@ Depending on the evaluated signatures, the BCS represents the cell perturbation
* If time points are available, identify the changes in drug tolerance of your samples
* Identify mechanisms of resistance

## Installing beyondcell
The **Beyondcell** algorithm is implemented in R (v. 4.0.0 or greater). We recommend running the installation via gitlab using devtools:
## Installing Beyondcell
The Beyondcell algorithm is implemented in R (v. 4.0.0 or greater). We recommend running the installation via conda:

```r
library("devtools")
devtools::install_gitlab("bu_cnio/Beyondcell")
# Create a conda environment
conda create -n beyondcell
# Activate the environment so the package is installed into it
conda activate beyondcell
# Install Beyondcell package and dependencies
conda install -c bu_cnio beyondcell
```

See the DESCRIPTION file for a complete list of R dependencies. If the R dependencies are already installed, installation should finish promptly.

## Results
We have validated Beyondcell in a population of MCF7-AA cells exposed to 500nM of bortezomib and collected at different time points: t0 (before treatment), t12, t48 and t96 (72h treatment followed by drug wash and 24h of recovery) obtained from *Ben-David U, et al., Nature, 2018*. We integrated all four conditions using the Seurat pipeline (left). After calculating the beyondcell scores (BCS) for each cell, a clustering analysis was applied. **Beyondcell** was able to cluster the cells based on their treatment time point, to separate untreated cells from treated cells (center) and to recapitulate the changes arisen by the treatment with bortezomib (right).
We have validated Beyondcell in a population of MCF7-AA cells exposed to 500 nM of bortezomib and collected at different time points: t0 (before treatment), t12, t48 and t96 (72 h of treatment followed by drug wash and 24 h of recovery), obtained from *Ben-David U, et al., Nature, 2018*. We integrated all four conditions using the Seurat pipeline (left). After calculating the BCS for each cell, a clustering analysis was applied. Beyondcell was able to cluster the cells based on their treatment time point, to separate untreated cells from treated cells (center) and to recapitulate the changes induced by the treatment with bortezomib (right).

![results_golub](./.img/integrated_bendavid.png)

@@ -45,17 +46,16 @@ For general instructions on running Beyondcell, check out the [analysis workflow
## Authors

* Coral Fustero-Torre
* María José Jiménez
* María José Jiménez-Santos
* Santiago García-Martín
* Carlos Carretero-Puche
* Luis G. Jimeno
* Luis García-Jimeno
* Tomás Di Domenico
* Gonzalo Gómez-López
* Fátima Al-Shahrour



## References
## Citation

## Support
If you have any question regarding the use of **Beyoncell**, feel free to submit an [issue](https://gitlab.com/bu_cnio/Beyondcell/issues).
If you have any questions regarding the use of Beyondcell, feel free to submit an [issue](https://gitlab.com/bu_cnio/Beyondcell/issues).
73 changes: 73 additions & 0 deletions tutorial/GenerateGenesets/README.md
@@ -0,0 +1,73 @@
# GenerateGenesets function



By default, `GenerateGenesets` returns a `geneset` with the `250` most upregulated and downregulated genes in each drug signature. You can change this behaviour by providing new values for `n.genes` and `mode`. Moreover, a small collection of functional pathways will be included in your `geneset` object. These pathways are related to the regulation of the epithelial-mesenchymal transition (EMT), cell cycle, proliferation, senescence and apoptosis. Note that the `n.genes` and `mode` arguments do not affect functional pathways.

```r
# Generate geneset object with one of the ready to use signature collections.
gset <- GenerateGenesets(PSc)
# Retrieve only the top 100 most upregulated genes in drug signatures (functional pathways remain unchanged)
up100 <- GenerateGenesets(PSc, n.genes = 100, mode = "up")
# You can deactivate the functional pathways option if you are not interested in evaluating them
nopath <- GenerateGenesets(PSc, include.pathways = FALSE)
```
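
To make the meaning of `n.genes` and `mode` more concrete, the following base R sketch picks the top-ranked genes from a toy ranking statistic. It is only an illustration under stated assumptions: the helper `top_genes` and the gene names are hypothetical, and this is not necessarily how Beyondcell selects genes internally.

```r
# Toy ranking statistic (e.g. a t-statistic) for eight hypothetical genes
# in a single signature.
stat <- c(G1 = 2.5, G2 = -1.8, G3 = 0.3, G4 = 1.1,
          G5 = -3.2, G6 = 0.0, G7 = 1.9, G8 = -0.7)

# Hypothetical helper (not part of Beyondcell): return the n genes with the
# highest ("up") and/or lowest ("down") statistic.
top_genes <- function(stat, n.genes = 3, mode = c("up", "down")) {
  mode <- match.arg(mode, several.ok = TRUE)
  ranked <- sort(stat, decreasing = TRUE)
  out <- list()
  if ("up" %in% mode) out$up <- names(head(ranked, n.genes))
  if ("down" %in% mode) out$down <- names(tail(ranked, n.genes))
  out
}

both <- top_genes(stat)                               # up and down gene sets
up_only <- top_genes(stat, n.genes = 2, mode = "up")  # only the up gene set
```

With `mode = "up"` only the upregulated set is returned, which mirrors the `up100` call above in spirit.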

Additionally, you can compute a `geneset` from a pre-loaded PSc subset called DSS.

```r
# Generate a geneset object from the pre-loaded DSS collection
dss <- GenerateGenesets(DSS, include.pathways = FALSE)
```

Also, you can filter the PSc, SSc and DSS objects by several fields (case-insensitive):

* `drugs`: Name of the drug(s) of interest (e.g. sirolimus).
* `IDs`: `sig_id` of the signature(s) of interest.
* `MoA`: Mechanism(s) of action of interest (e.g. MTOR INHIBITOR).
* `targets`: Target gene(s) of interest (e.g. MTOR).
* `source`: `"LINCS"` (for PSc) or `"GDSC"`, `"CCLE"` and/or `"CTRP"` (for SSc).

```r
# Return a `geneset` with all sirolimus signatures, as well as signatures of sirolimus synonyms such as
# rapamycin or BRD-K84937637
sirolimus <- GenerateGenesets(SSc, include.pathways = FALSE, filters = list(drugs = "sirolimus"))
# Return just a subset of sirolimus signatures
my_sigs <- GenerateGenesets(SSc, include.pathways = FALSE, filters = list(IDs = c("sig_2349", "sig_7409")))
# Return all MTOR inhibitors
MTORi <- GenerateGenesets(SSc, include.pathways = FALSE, filters = list(MoA = "MTOR INHIBITOR"))
# Return all drugs targeting MTOR
mtor_targets <- GenerateGenesets(SSc, include.pathways = FALSE, filters = list(targets = "MTOR"))
# Return only signatures derived from GDSC and CCLE
my_sources <- GenerateGenesets(SSc, include.pathways = FALSE, filters = list(source = c("GDSC", "CCLE")))
```

By calling the `ListFilters` function, you can retrieve all the available values for a given field. When several filters are supplied, the signatures that pass **ANY** of them are included in the final `geneset`.

```r
# Values for targets
ListFilters(entry = "targets")
# Geneset with all drugs that target MTOR, plus all sirolimus signatures
filter_combination <- GenerateGenesets(SSc, include.pathways = FALSE,
filters = list(drugs = "sirolimus", targets = "MTOR"))
```
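
The **ANY** semantics can be pictured as a logical OR across fields. The sketch below reproduces that behaviour on a toy annotation table in base R. The table, its values and the variable names are made up for illustration, and the code is an assumption about the behaviour described above, not Beyondcell's actual implementation.

```r
# Toy annotation table (illustrative values only).
info <- data.frame(
  IDs     = c("sig_1", "sig_2", "sig_3", "sig_4"),
  drugs   = c("SIROLIMUS", "EVEROLIMUS", "BORTEZOMIB", "SIROLIMUS"),
  targets = c("MTOR", "MTOR", "PSMB5", "FKBP1A"),
  stringsAsFactors = FALSE
)

filters <- list(drugs = "sirolimus", targets = "mtor")

# For each field, flag the rows that match any requested value,
# ignoring case.
hits <- lapply(names(filters), function(field) {
  toupper(info[[field]]) %in% toupper(filters[[field]])
})

# A signature is kept if it passes ANY filter: combine flags with OR.
keep <- Reduce(`|`, hits)
selected <- info$IDs[keep]
```

Here `sig_2` passes only the `targets` filter and `sig_4` passes only the `drugs` filter, yet both are kept. An AND combination would have kept `sig_1` alone.
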
You can check information about the pre-loaded signatures by inspecting the `drugInfo` object. Also, each `geneset` object obtained from the pre-loaded matrices contains a subset of `drugInfo` for the selected drugs.

```r
# drugInfo of the signatures of interest
gset@info
```

Finally, Beyondcell allows the user to input a GMT file containing the functional pathways/signatures of interest or a numeric matrix (containing a ranking criterion such as the t-statistic or logFoldChange).

* **In case your input is a GMT file:** You must supply the path to the file. Note that the name of each gene set must end in `"_UP"` or `"_DOWN"` to specify its mode. In this case, `n.genes` and `mode` are deprecated.
* **In case your input is a numeric matrix:** Make sure that rows correspond to genes and columns to signatures.

In both cases, the `filters` argument is deprecated, but you must indicate whether the `comparison` that yielded your input was `"treated_vs_control"` or `"sensitive_vs_resistant"`.

```r
# Mock numeric matrix
m <- matrix(rnorm(500 * 25), ncol = 25, dimnames = list(rownames(PSc[[1]])[1:500], colnames(PSc[[1]])[1:25]))
num_matrix <- GenerateGenesets(m, n.genes = 100, mode = c("up", "down"),
comparison = "treated_vs_control", include.pathways = TRUE)
```
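
For GMT input, the `_UP`/`_DOWN` naming requirement described earlier can be checked before calling `GenerateGenesets`. The base R sketch below writes a small mock GMT file and lists the gene set names that lack a valid suffix. It assumes the usual tab-separated GMT layout (name, description, then genes). The file contents and variable names are hypothetical.

```r
# Write a mock GMT file: one gene set per line, tab-separated
# (name, description, genes...).
gmt_path <- tempfile(fileext = ".gmt")
writeLines(c(
  "drugA_UP\tna\tGENE1\tGENE2\tGENE3",
  "drugA_DOWN\tna\tGENE4\tGENE5",
  "pathwayX\tna\tGENE6\tGENE7"
), gmt_path)

# The first field of each line is the gene set name.
fields <- strsplit(readLines(gmt_path), "\t", fixed = TRUE)
set_names <- vapply(fields, `[`, character(1), 1)

# Names that do not end in "_UP" or "_DOWN" need renaming.
bad <- set_names[!grepl("_(UP|DOWN)$", set_names)]
if (length(bad) > 0) {
  message("Gene sets without a mode suffix: ", paste(bad, collapse = ", "))
}
```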
53 changes: 28 additions & 25 deletions tutorial/analysis_workflow/README.md
@@ -9,53 +9,56 @@ We have validated Beyondcell in a population of MCF7-AA cells exposed to 500nM o
## Using Beyondcell
For a correct analysis with **Beyondcell**, users should follow these steps:

1. Read single cell expression matrix
2. Compute Beyondcell scores
3. Compute Therapeutic clusters
1. Read a single-cell expression object
2. Compute the Beyondcell scores (BCS)
3. Compute the Therapeutic Clusters (TCs)
* Check clustering and look for unwanted sources of variation
* Regress out unwanted sources of variation
* Recompute UMAP
* Recompute UMAP reduction
4. Compute ranks
5. [**Visualize**](https://gitlab.com/bu_cnio/Beyondcell/-/tree/master/tutorial/visualization) the results


### 1. Read single cell expression object
In order to correctly compute the scores, the transcriptomic data needs to be pres-processed. This means that proper cell-based quality control filters, as well as normalization, scaling and clustering of the data, should be applied prior to the analysis with **Beyondcell**.
### 1. Read a single-cell expression object
Beyondcell accepts either a single-cell matrix or a Seurat object. In order to correctly compute the scores, the transcriptomics data needs to be pre-processed. This means that proper cell-based quality control filters, as well as normalization and scaling of the data, should be applied prior to the analysis with Beyondcell.

> Note: We recommend using a Seurat object.
```r
library("Beyondcell")
library("beyondcell")
library("Seurat")
# Read single cell experiment
sc = readRDS(path_to_sc)
```
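
As a rough picture of what the pre-processing mentioned above involves, the following base R sketch applies a cell-level quality filter, a library-size log-normalization and gene-wise scaling to a toy count matrix. It is a simplified illustration only: the matrix, the `min_features` threshold and the variable names are assumptions, and in practice a dedicated pipeline such as Seurat (e.g. `NormalizeData` and `ScaleData`) would be used instead.

```r
# Toy count matrix: genes in rows, cells in columns (made-up values).
counts <- matrix(c(
  5, 0, 3, 8,
  2, 1, 0, 4,
  0, 0, 7, 1,
  9, 0, 2, 3,
  1, 0, 0, 6
), nrow = 5, byrow = TRUE,
dimnames = list(paste0("GENE", 1:5), paste0("cell", 1:4)))

# Cell-level QC: keep cells with at least `min_features` detected genes.
min_features <- 2
n_features <- colSums(counts > 0)
counts <- counts[, n_features >= min_features, drop = FALSE]

# Normalization: scale each cell to 10,000 total counts, then log1p.
norm <- log1p(sweep(counts, 2, colSums(counts), "/") * 1e4)

# Scaling: z-score each gene across cells.
scaled <- t(scale(t(norm)))
```

In this toy example `cell2` expresses a single gene and is removed by the quality filter before normalization.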

### 2. Compute BCS
The `bcCompute` function allows you to input either a pre-processed seurat object or a single cell matrix. Have in mind, that when a seurat object is used as an input, the `DefaultAssay` must be specified, both `SCT` and `RNA` assays are accepted.
Note that if you are using a Seurat object, the `DefaultAssay` must be specified. Both `SCT` and `RNA` assays are accepted.

```r
# Set Assay
DefaultAssay(sc) <- "RNA"
```
**Generate Signatures**\
In order to compute the BCS, we also need a **gene signatures object** containing the drug or functional signatures we are interested in evaluating. To create this object, the `GenerateGenesets` function needs to be called. **Beyondcell** includes two drug signature collections that are ready to use:

* The drug Perturbation Signatures collection (PSc): captures the transcriptional changes induced by a drug.
* The drug Sensitivity Signatures collection (SSc): captures the drug sensitivity to a given drug.
### 2. Compute the BCS
We need to perform two steps:

#### Get a geneset object with signatures of interest
In order to compute the BCS, we also need a `geneset` object containing the drug or functional signatures we are interested in evaluating. To create this object, the `GenerateGenesets` function needs to be called. Beyondcell includes two drug signature collections that are ready to use:

A small collection of functional pathways will be included by default in your gene signatures object. These pathways are related to the regulation of the epithelial-mesenchymal transition (EMT), cell cycle, proliferation, senescence and apoptosis.
* **Drug Perturbation Signatures collection (PSc):** Captures the transcriptional changes induced by a drug.
* **Drug Sensitivity Signatures collection (SSc):** Captures the sensitivity to a given drug.

A small collection of functional pathways will be included by default in your gene signatures object. These pathways are related to the regulation of the epithelial-mesenchymal transition (EMT), cell cycle, proliferation, senescence and apoptosis.

```r
# Generate gene signatures object with one of the ready to use signature collections
gs <- GenerateGenesets(PSc, include.pathways = TRUE)
# Generate geneset object with one of the ready to use signature collections
gset <- GenerateGenesets(PSc)
# You can deactivate the functional pathways option if you are not interested in evaluating them
gs <- GenerateGenesets(PSc, include.pathways = FALSE)
nopath <- GenerateGenesets(PSc, include.pathways = FALSE)
```

Furthermore, **Beyondcell** allows the user to input a .GMT file containing the functional pathways/signatures of interest, or a numeric matrix (containing a ranking criteria such as the t-statistic or logFoldChange).

You can check out the structure of the obtained gene set object, information on the drug signatures, mode of action and target genes can be found at `gs@info` or by using the `FindDrugs` function.
PSc and SSc signatures can also be filtered according to several values. Moreover, Beyondcell allows the user to input a GMT file containing the functional pathways/signatures of interest, or a numeric matrix (containing a ranking criterion such as the t-statistic or logFoldChange). For further information, please check the [GenerateGenesets](https://gitlab.com/bu_cnio/Beyondcell/-/tree/master/tutorial/GenerateGenesets) tutorial.

**Compute BCS**
#### Compute the BCS
```r
# Compute score for the PSc. This might take a few minutes depending on the size of your dataset.
bc <- bcScore(sc, gs, expr.thres = 0.1)
@@ -85,13 +88,13 @@ It is important to check whether any unwanted source of variation is guiding the

```r
# Visualize whether cells are clustered based on the number of genes detected in each cell
bcClusters(bc, UMAP = "Beyondcell", idents = "nFeature_RNA", factor.col = FALSE)
bcClusters(bc, UMAP = "beyondcell", idents = "nFeature_RNA", factor.col = FALSE)
```
<img src=".img/nFeature_variation.png" width="500">

```r
# Visualize whether cells are clustered based on their cell cycle status
bcClusters(bc, UMAP = "Beyondcell", idents = "Phase", factor.col = TRUE)
bcClusters(bc, UMAP = "beyondcell", idents = "Phase", factor.col = TRUE)
```
<img src=".img/Phase_variation.png" width="500">

@@ -112,9 +115,9 @@ Once corrected, you will need to recompute the dimensionality reduction and clus
# Recompute UMAP
bc <- bcUMAP(bc, pc = 5, res = 0.2, add.DSS = FALSE, k.neighbors = 20)
# Visualize UMAP
bcClusters(bc, UMAP = "Beyondcell", idents = "nFeature_RNA", factor.col = FALSE, pt.size = 1)
bcClusters(bc, UMAP = "beyondcell", idents = "nFeature_RNA", factor.col = FALSE, pt.size = 1)
# Visualize Therapeutic clusters
bcClusters(bc, UMAP = "Beyondcell", idents = "bc_clusters_res.0.2", pt.size = 1)
bcClusters(bc, UMAP = "beyondcell", idents = "bc_clusters_res.0.2", pt.size = 1)
```

<p float="left">
2 changes: 1 addition & 1 deletion tutorial/visualization/README.md
@@ -69,7 +69,7 @@ bcSignatures(bc, UMAP = "beyondcell", genes = list(values = "PSMA5"), pt.size =
<img src=".img/psma5_expr.png" width="500">

## Ranking visualization
We can summarize the ranking results using the `bc4Squares` function. This function summarizes the top hits obtained for each of the specified condition levels. The residuals are represented in the x axis, the switch point is represented in the y axis. The top-left and bottom-right corners contain the drugs to which all selected cells are most/least sensistive, respectively. The centre quadrants show the drugs with an heterogeneous response. In this case, we can clearly see how the tool predicts an heterogeneous response to bortezomib.
We can summarize the ranking results using the `bc4Squares` function. This function summarizes the top hits obtained for each of the specified condition levels. The residuals are represented on the x axis and the switch point on the y axis. The top-left and bottom-right corners contain the drugs to which all selected cells are least/most sensitive, respectively. The centre quadrants show the drugs with a heterogeneous response. In this case, we can clearly see how the tool predicts a heterogeneous response to bortezomib.

```r
bc4Squares(bc, idents = "condition", lvl = "t0", top = 5)