This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Running CellChat with a large dataset #244

Open
daccachejoe opened this issue Jul 13, 2021 · 12 comments

Comments

@daccachejoe

Hi, I am trying to run CellChat on a large dataset (~100,000 cells), but the 'computeCommunProb' step repeatedly runs into memory issues. The object has already been downsampled by identity class within Seurat from 500,000 cells, so I would prefer not to downsample it further.

Do any options exist within the pipeline to decrease the memory requirement and make the algorithm more scalable? I am running the analysis on 120 cores with 400 GB of memory available.

Thanks!

@sqjin
Owner

sqjin commented Jul 15, 2021

@jad362 Thanks for pointing out this issue. I think the cause is the calculation of the mean value per cell group. Can you try the following:

  1. Check whether you have the do.fast = TRUE option in the function computeCommunProb. If not, please update your CellChat package.
  2. Do not run in parallel, i.e. do not run future::plan("multiprocess", workers = 4).
  3. Set nboot = 20, which runs fewer permutations.
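
The three suggestions above can be sketched as a small wrapper. This is only a minimal sketch: `run_commun_prob_low_mem` is a hypothetical helper name, it assumes the CellChat and future packages are installed, and it leaves out `do.fast` because whether that argument is available depends on the installed CellChat version.

```r
# Hypothetical helper (not part of CellChat): runs computeCommunProb
# sequentially with a reduced number of permutations.
run_commun_prob_low_mem <- function(cellchat, nboot = 20) {
  # Use a sequential plan so that no parallel workers are started.
  future::plan("sequential")
  # nboot is the number of permutations used for the significance test.
  CellChat::computeCommunProb(cellchat, nboot = nboot)
}
```

On an existing CellChat object this would be called as `cellchat <- run_commun_prob_low_mem(cellchat)`.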

@daccachejoe
Author

Hi @sqjin thanks for your quick reply.
I originally had do.fast = TRUE and also tried running without parallelization; those changes did not solve the issue, but setting nboot = 20 had it work right away, so thanks! Could you provide some insight on the drawbacks of decreasing that permutation parameter and how much confidence can I continue to have in the results?

Thanks

@daccachejoe
Author

Also, this is an unrelated question, but is there a way to group cells by a metadata variable that was not used for the analysis itself? For example, to create a summary plot of myeloid cell interactions as a whole with endothelial cells when the analysis itself was run on more specific metadata variables?

thanks!

@sqjin
Owner

sqjin commented Jul 19, 2021

@jad362 Please check the tutorial on
group.cellType <- c(rep("FIB", 4), rep("DC", 4), rep("TC", 4))
group.cellType <- factor(group.cellType, levels = c("FIB", "DC", "TC"))
object.list <- lapply(object.list, function(x) {mergeInteractions(x, group.cellType)})
cellchat <- mergeCellChat(object.list, add.names = names(object.list))

@daccachejoe
Author

Thanks, that works!
Another note: when performing a Bonferroni correction on the ligand-receptor results (as stored in cellchat@net), would I divide by the total number of ligand-receptor pairs in the database I used, or by the number of interactions I was modeling (i.e. the total number of cell type combinations)?

Thanks!

@sqjin
Owner

sqjin commented Jul 22, 2021

@jad362 I think it should be the total number of L-R pairs, similar to differential expression analysis, where you divide by the total number of genes.
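
As a rough illustration of that correction, base R's `p.adjust` accepts the total number of tests through its `n` argument. The p-values and the number of L-R pairs below are made up for the example; they are not taken from any CellChat database or result.

```r
# Hypothetical p-values for three ligand-receptor pairs of interest
pvals <- c(1e-5, 0.002, 0.04)

# Assumed total number of L-R pairs tested (illustrative value only)
n_lr_pairs <- 1000

# Bonferroni multiplies each p-value by n and caps the result at 1
adjusted <- p.adjust(pvals, method = "bonferroni", n = n_lr_pairs)
print(adjusted)
```

Only the first pair would remain below 0.05 after correcting for 1000 tests.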

For your last question 'but setting nboot = 20 had it work right away, so thanks! Could you provide some insight on the drawbacks of decreasing that permutation parameter and how much confidence can I continue to have in the results?'

I think the results will not change too much. If nboot = 100, then thresh = 0.05 corresponds to five permutations having larger communication probabilities. If nboot = 20, then thresh = 0.05 corresponds to one permutation having a larger communication probability.
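
The arithmetic behind this can be sketched in base R. The sketch assumes the permutation p-value is the fraction of permutations with a larger communication probability, as described above; `max_exceeding` and `p_resolution` are hypothetical helper names, not CellChat functions.

```r
# Largest number of permutations that may exceed the observed value
# while the permutation p-value (count / nboot) stays at or below thresh.
max_exceeding <- function(nboot, thresh = 0.05) {
  floor(nboot * thresh + 1e-8)  # small tolerance guards against rounding error
}

# Smallest non-zero p-value that nboot permutations can produce
p_resolution <- function(nboot) 1 / nboot

max_exceeding(100)  # five permutations
max_exceeding(20)   # one permutation
p_resolution(20)    # p-values move in steps of 1/20
```

With fewer permutations the p-values are coarser, so interactions close to the threshold are the ones most likely to change between runs.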

@Fatomk11295

@sqjin Can you please explain this code for me? I am new to R and I can't work out how to adapt it to my data.

group.cellType <- c(rep("FIB", 4), rep("DC", 4), rep("TC", 4))
group.cellType <- factor(group.cellType, levels = c("FIB", "DC", "TC"))
object.list <- lapply(object.list, function(x) {mergeInteractions(x, group.cellType)})
cellchat <- mergeCellChat(object.list, add.names = names(object.list))

@sukks105

sukks105 commented Nov 9, 2022

@sqjin I am trying to compute the differential number of interactions for clusters A, B, C and D in two different conditions. Why did you use "4" in the code group.cellType <- c(rep("FIB", 4), rep("DC", 4), rep("TC", 4))? I got the following error when I used "4".

My code:
group.cellType <- c(rep("A", 4), rep("B",4), rep("C",4), rep("D",4))
group.cellType <- factor(group.cellType, levels = c("A", "B", "C", "D"))
object.list <- lapply(object.list, function(x) {mergeInteractions(x, group.cellType)})

Error: Error in count[group.merged == i, group.merged == j] :
(subscript) logical subscript too long

However, when I used 1 or 2 instead of 4, as in group.cellType <- c(rep("A", 1), rep("B",1), rep("C",1), rep("D",1)) or
group.cellType <- c(rep("A", 2), rep("B",2), rep("C",2), rep("D",2)), I no longer got the error but got completely different results.

Would you please clarify this?

@sqjin
Owner

sqjin commented Nov 12, 2022

@sukks105 My data have four subclusters of FIB, so I group them into one cell type.
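
In other words, the grouping vector needs one label per existing cluster, in the same order as the clusters. The idea can be sketched in base R on a made-up count matrix. This is an illustration of the idea only, not CellChat's implementation of mergeInteractions, and the merged labels "AB" and "CD" are hypothetical.

```r
# Hypothetical 4 x 4 matrix of interaction counts between clusters A-D
# (rows = sending cluster, columns = receiving cluster)
clusters <- c("A", "B", "C", "D")
count <- matrix(1:16, nrow = 4, dimnames = list(clusters, clusters))

# One label per existing cluster: A and B become "AB", C and D become "CD"
group.cellType <- factor(c("AB", "AB", "CD", "CD"), levels = c("AB", "CD"))

# A vector longer than the number of clusters is what triggers
# "logical subscript too long" when it is used to index the matrix
stopifnot(length(group.cellType) == nrow(count))

# Sum counts within merged groups: first over rows, then over columns
by_rows <- rowsum(count, group.cellType)
merged <- t(rowsum(t(by_rows), group.cellType))
print(merged)
```

With four clusters A, B, C and D, `rep("A", 4)` and its siblings produce 16 labels for a 4 x 4 matrix, which is why the length check matters.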

@sukks105

Thanks a lot for the clarification!

@luongthang1908

Hi @sqjin, can I keep nboot = 100 and run computeCommunProb in parallel?
Thanks,

@sqjin
Owner

sqjin commented Jan 22, 2024

@luongthang1908 Yes, you can. You can also perform subsampling before running CellChat.
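
One way to subsample is to cap the number of cells per group so that rare groups are kept whole. The base R sketch below works on a made-up label vector; the group sizes, `max_per_group`, and the variable names are assumptions for illustration, and the resulting indices would still have to be applied to your own expression matrix and metadata.

```r
set.seed(1)  # make the random draw reproducible

# Hypothetical cell labels: 500 FIB, 50 DC and 200 TC cells
labels <- rep(c("FIB", "DC", "TC"), times = c(500, 50, 200))

# Assumed cap on the number of cells kept per group
max_per_group <- 100

# For each group, keep all cells if it is small, otherwise draw a random subset
keep <- unlist(lapply(split(seq_along(labels), labels), function(idx) {
  if (length(idx) > max_per_group) sample(idx, max_per_group) else idx
}), use.names = FALSE)

# Number of retained cells per group
kept_counts <- table(labels[keep])
print(kept_counts)
```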
