Skip to content
This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Getting adjusted p values CellChat #706

Open
jv-20 opened this issue Oct 4, 2023 · 5 comments
Open

Getting adjusted p values CellChat #706

jv-20 opened this issue Oct 4, 2023 · 5 comments

Comments

@jv-20
Copy link

jv-20 commented Oct 4, 2023

Hi @sqjin

Couple of quick questions.

Do we need to do multiple testing to obtain adjusted pvalues / the p-value reported with cellchat identifications of interactions in the individual dataset is sufficient?

Also in terms of identifying dysregulated signalling when comparing two datasets, would you be able to advise the code to how to get adjusted pvalues? Likewise is multiple testing needed to be done here as additional step to filter out false positives?

# define a positive dataset, i.e., the dataset with positive fold change against the other dataset
pos.dataset = "LS"
# define a char name used for storing the results of differential expression analysis
features.name = pos.dataset
# perform differential expression analysis
cellchat <- identifyOverExpressedGenes(cellchat, group.dataset = "datasets", pos.dataset = pos.dataset, features.name = features.name, only.pos = FALSE, thresh.pc = 0.1, thresh.fc = 0.1, thresh.p = 1)
#> Use the joint cell labels from the merged CellChat object
# map the results of differential expression analysis onto the inferred cell-cell communications to easily manage/subset the ligand-receptor pairs of interest
net <- netMappingDEG(cellchat, features.name = features.name)
# extract the ligand-receptor pairs with upregulated ligands in LS
net.up <- subsetCommunication(cellchat, net = net, datasets = "LS",ligand.logFC = 0.2, receptor.logFC = NULL)
# extract the ligand-receptor pairs with upregulated ligands and upregulated recetptors in NL, i.e.,downregulated in LS
net.down <- subsetCommunication(cellchat, net = net, datasets = "NL",ligand.logFC = -0.1, receptor.logFC = -0.1)

Thanks

@jv-20 jv-20 changed the title Getting adjusted p values Getting adjusted p values CellChat Oct 4, 2023
@sqjin
Copy link
Owner

sqjin commented Oct 4, 2023

@jv-20 The statistical test is used to find the interactions that are over-represented in certain cell groups. Thus we think it is sufficient. In addition, since we only permute the cell labels 100 times, it is not suitable for further multiple testing. Finally, based on recent benchmarking study (one paper published in Genome Biol 2023), our method ranks on the top.

For the differential expression analysis, we do already perform the multiple testing using '"bonferroni"' method. You can check the source codes if you want.

@jv-20
Copy link
Author

jv-20 commented Oct 5, 2023

@sqjin Many thanks for the reply.

One last question. So for example, the ligand.pvalues obtained using this function and code below is the adjusted.p.value?

net <- netMappingDEG(cellchat, features.name = features.name)
# extract the ligand-receptor pairs with upregulated ligands in LS
net.up <- subsetCommunication(cellchat, net = net, datasets = "LS",ligand.logFC = 0.2, receptor.logFC = NULL)

Thanks

@jv-20
Copy link
Author

jv-20 commented Oct 5, 2023

@sqjin , also would you be able to advise how to get logFC data frame for all signalling genes across all cell types comparing two datasets. Not just the significant ones.

If I use,

pos.dataset = "d1"
# define a char name used for storing the results of differential expression analysis
features.name = pos.dataset

# perform differential expression analysis
cellchat <- identifyOverExpressedGenes(cellchat.HC_CDI, group.dataset = "datasets", pos.dataset = pos.dataset, features.name = features.name, only.pos = FALSE, thresh.pc = 0, thresh.fc = 0, thresh.p = 1)

#> Use the joint cell labels from the merged CellChat object # map the results of differential expression analysis onto the inferred cell-cell communications to easily manage/subset the ligand-receptor pairs of interest
net <- netMappingDEG(cellchat, features.name = features.name,thresh = 1)

#trying to create a logFC dataframe for ligands
net.d1=net%>%filter(datasets=="d1")
net.d1_long=net.d1%>%select(source,ligand,ligand.pvalues,ligand.logFC)%>%arrange(source)%>%distinct(source,ligand,.keep_all=T)

net.d1_wide=reshape(net.d1_long,idvar="ligand",timevar="source",direction="wide")
net.d1_wide.logFC=net.d1_wide[,grepl("logFC",colnames(net.d1_wide))]
net.d1_wide.pvalues=net.d1_wide[,grepl("pvalues",colnames(net.d1_wide))]

I get information on genes for certain cell types. Some of the cell types show "NA" values. Is the NA values shown for certain cell types are because these cell types sufficiently doesn't express these genes to be included in the analyses so coded as NA?

Is it possible to get information on logFC and adjusted pvalues for all signalling genes across all celltypes?

The bonferroni adjusted p values obtained using

[email protected]$d1.info

returns only handful of genes as significant. It seems to over correct. Would you have any suggestions?

Thanks

@sqjin
Copy link
Owner

sqjin commented Oct 6, 2023

@jv-20 In this case, you need to modify the source code of identifyOverExpressedGenes

@jv-20
Copy link
Author

jv-20 commented Oct 6, 2023

Thanks @sqjin It is little confusing. Could you please help in clarifying which lines of codes to modify in identifyOverExpressedGenes

Thank you

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants