This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Error with netClustering(cellchat, type = "functional") #174

Open
YitengDang opened this issue Apr 13, 2021 · 18 comments

@YitengDang

Hi, I'm simply trying to run the tutorial script in Rstudio, but am running into the following problem.

When running
cellchat <- netClustering(cellchat, type = "functional"),
I get the error message:

[Image: screenshot of the error message, "Screenshot 2021-04-13 at 15 29 06"]

Initially I thought it might be due to UMAP which wasn't installed, but I installed UMAP and verified that it works. Could you please help me solve this issue? Thanks a lot in advance!

@sqjin
Owner

sqjin commented Apr 18, 2021

@YitengDang This is an interesting issue. I reinstalled the package from my GitHub repository, but I could not reproduce the problem. However, you are not the first person to mention this issue recently. You can try the method in the Pull Request suggested by another user.

@ColeKeenum

ColeKeenum commented Apr 20, 2021

@sqjin I am also having this issue when running the vignette!

Getting the following error:

> cellchat <- computeNetSimilarity(cellchat, type = "functional")
> cellchat <- netEmbedding(cellchat, type = "functional")
Manifold learning of the signaling networks for a single dataset 
C:\Users\colek\AppData\Roaming\Python\Python38\site-packages\umap\umap_.py:132: UserWarning: A large number of your vertices were disconnected from the manifold.
Disconnection_distance = 1 has removed 142 edges.
It has fully disconnected 3 vertices.
You might consider using find_disconnected_points() to find and remove these points from your data.
Use umap.utils.disconnected_vertices() to identify them.
  warn(
> #> Manifold learning of the signaling networks for a single dataset
> cellchat <- netClustering(cellchat, type = "functional")
Classification learning of the signaling networks for a single dataset 
Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)

@YitengDang
Author

YitengDang commented Apr 22, 2021

So this issue is persisting for me, even after reinstalling the package and running the tutorial. Exactly the same error also arises when I apply the function to my own data. However, interestingly, if I knit the tutorial (using knitr) to create an html file, it does not give any problems and actually outputs a plot. However, if I knit my own data then it still gives the same error. Altogether, this is very puzzling.

@sqjin
Owner

sqjin commented Apr 22, 2021

@YitengDang @ColeKeenum Can you guys share your cellchat object so that I can replicate the error?

@YitengDang
Author

YitengDang commented Apr 23, 2021

@sqjin I am inviting you to a private repo which contains my own data in the file "./CellChat/cellchat_Cell-ECM_out_run2.Rds". Let me know if you can access the data.
For the tutorial, I checked that loading the data "./tutorial/cellchat_humanSkin_LS.rds" from this repo still gives the same error for me.
Thanks a lot in advance.

@YitengDang
Author

This has been solved in one of the updates of CellChat somewhere between 1.0.0 and 1.1.3. After pulling the latest version from GitHub (1.1.3), I was able to run the functional clustering part.

@zyy-doctor

Hello, I actually have the same questions and errors as you, but my CellChat version is already 1.1.3. So I would like to know which update you think fixed the problem. Thank you so much!

@YitengDang
Author

YitengDang commented Mar 2, 2022

Sorry to reopen this issue, but after some reinstallations and updates I'm encountering the same problem again. There seem to be two separate issues here:

  1. An update in the R package future has caused the parallelization to break down in various parts of the code. The solution is to
    a. rewrite the ifelse loop as explained in this thread
    b. replace the deprecated future::plan("multiprocess") by future::plan("multisession") whenever you encounter future::plan.
    The functions that need to be updated are all functions that use future, including identifyOverExpressedGenes, identifyOverExpressedInteractions, computeCommunProb and netClustering.

  2. An issue with netClustering(cellchat, type = "functional") specifically that has been mentioned in many other issues, e.g. #301, #278, #336 and several others (too many to list). This seems to be related to the fact that the netClustering function cannot deal with NaN values in the UMAPs. Several upstream and downstream functions are affected, so the whole pipeline needs to be examined.
    a. First we calculate similarities between pathways by running cellchat <- computeNetSimilarity(cellchat, type = "functional"). This generates a similarity matrix stored in cellchat@netP$similarity[['functional']]$matrix.
    b. The UMAPs are then calculated from the similarity matrices by running cellchat <- netEmbedding(cellchat, type = "functional"). The result is stored in cellchat@netP$similarity[["functional"]]$dr. However, for reasons that are not yet understood, this sometimes yields pathways with NaN values in their UMAP coordinates.
    c. As a result, the netClustering function breaks down because the K-means clustering algorithm invoked in the line idents <- kmeans(data.use, kRange[x], nstart=10)$cluster cannot deal with NaNs.
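
Step (c) can be checked without CellChat. The following is a minimal sketch in base R, using a made-up 2-D embedding (the matrix `Y` and the pathway names `PW1` to `PW10` are illustrative, not taken from any real CellChat object). It captures the error that `stats::kmeans` raises when one row contains NaN:

```r
# Toy 2-D "embedding" with one NaN row; names and values are illustrative only.
set.seed(1)
Y <- matrix(rnorm(20), ncol = 2,
            dimnames = list(paste0("PW", 1:10), c("UMAP1", "UMAP2")))
Y["PW3", ] <- NaN

# kmeans() hands the data to compiled code, which refuses NA/NaN/Inf values.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch({
  kmeans(Y, centers = 2, nstart = 10)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```

If the printed message mentions NA/NaN/Inf in a foreign function call, that would be consistent with the error reported earlier in this thread coming from non-finite embedding coordinates rather than from the clustering settings.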

Altogether, the following temporary patch solves the issue by removing the pathways that have NaNs:

  • In the netClustering() function, after Y <- methods::slot(object, slot.name)$similarity[[type]]$dr[[comparison.name]] add the following lines:
pathways.ignore <- rownames( Y[rowSums(!is.finite(Y))>0, ] )
cellchat@options$pathways.ignore = pathways.ignore
Y <- Y[!rowSums(!is.finite(Y)),] # filter out rows with NaN, not working downstream
methods::slot(object, slot.name)$similarity[[type]]$dr[[comparison.name]] <- Y
data.use <- Y

This filters out the pathways with NaN values in their UMAP coordinates.

  • In the main code, run netVisual_embedding with option pathway.remove = cellchat@options$pathways.ignore.
    This removes the pathways with NaNs from the plot to avoid an error.
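
The filtering idea in the patch can be tried on toy data first. This is only a sketch in base R under stated assumptions: `Y` is a made-up embedding matrix with row names, and the variable names `bad` and `Y.clean` are introduced here for illustration. It differs slightly from the lines above in that it indexes `rownames(Y)` directly and uses `drop = FALSE`, so the result stays a matrix even when only one row is affected or only one row remains.

```r
# Toy embedding with one non-finite row; values and names are illustrative only.
set.seed(1)
Y <- matrix(rnorm(20), ncol = 2,
            dimnames = list(paste0("PW", 1:10), c("UMAP1", "UMAP2")))
Y["PW3", ] <- NaN

bad <- rowSums(!is.finite(Y)) > 0       # TRUE for rows containing NA/NaN/Inf
pathways.ignore <- rownames(Y)[bad]     # names of the pathways to drop
Y.clean <- Y[!bad, , drop = FALSE]      # keep matrix shape in all cases

# With the non-finite rows removed, k-means runs normally.
fit <- kmeans(Y.clean, centers = 2, nstart = 10)
print(pathways.ignore)
print(table(fit$cluster))
```

The same caveat as above applies: this drops pathways rather than fixing whatever produces the NaNs.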

@sqjin: hopefully this helps in solving these recurrent issues that many users seem to face! This is not a final solution since we just filter out a few pathways that don't work well, but it would be better to directly patch either the UMAP (so it doesn't produce NaNs) or the K-means clustering (so it can deal with NaNs without throwing an error). If I have time I'll try to look into how to solve this.

@YitengDang YitengDang reopened this Mar 2, 2022
@zyy-doctor

zyy-doctor commented Mar 4, 2022 via email

@xiandxing

Hi,
Thanks for this Issue and also thanks for this great package.
I got the same problem in running netClustering for cellchat_humanSkin_LS.rds.
I then solved that problem with @YitengDang's last comment, but I still have a problem with netVisual_embedding, shown below:

my code line:

netVisual_embedding(cellchat, type = "functional", pathway.remove = cellchat@options$pathways.ignore, label.size = 3.5)

the problem is described as below:

Error in data.frame(x = Y[, 1], y = Y[, 2], Commun.Prob. = prob_sum/max(prob_sum), : arguments imply differing number of rows: 10, 13
Traceback:

  1. netVisual_embedding(cellchat, type = "functional", pathway.remove = cellchat@options$pathways.ignore,
    . label.size = 3.5)
  2. data.frame(x = Y[, 1], y = Y[, 2], Commun.Prob. = prob_sum/max(prob_sum),
    . labels = as.character(unlist(dimnames(prob)[3])), Groups = as.factor(Groups))
  3. stop(gettextf("arguments imply differing number of rows: %s",
    . paste(unique(nrows), collapse = ", ")), domain = NA)

And I wonder if there is a problem with my cellchat@options$pathways.ignore, it gives: NULL

It may not give a really bad effect, but I do want to solve this problem. Thanks a lot!
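
The data.frame() error says that the embedding and the per-pathway probabilities have different lengths. A minimal sketch for finding which pathways differ, using toy stand-ins: `Y` and `prob` below are made up (10 embedding rows versus 13 pathways, mirroring the numbers in the error), and it is an assumption that the real embedding carries pathway names as row names.

```r
# Toy stand-ins: 13 pathways in the probability array, 10 rows in the embedding.
set.seed(1)
pathways <- paste0("PW", 1:13)
prob <- array(runif(2 * 2 * 13), dim = c(2, 2, 13),
              dimnames = list(c("A", "B"), c("A", "B"), pathways))
Y <- matrix(rnorm(20), ncol = 2, dimnames = list(pathways[1:10], NULL))

# Pathways still present in prob but no longer present in the embedding.
missing.in.Y <- setdiff(dimnames(prob)[[3]], rownames(Y))
print(missing.in.Y)

# Once those are dropped from prob, the lengths agree and data.frame() succeeds.
keep <- !(dimnames(prob)[[3]] %in% missing.in.Y)
prob.use <- prob[, , keep, drop = FALSE]
prob_sum <- apply(prob.use, 3, sum)
df <- data.frame(x = Y[, 1], y = Y[, 2], Commun.Prob. = prob_sum / max(prob_sum))
print(dim(df))
```

On a real object, a vector found this way could be tried as the pathway.remove argument in place of the NULL value, though that is an untested suggestion rather than a confirmed fix.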

@sofiapuvogelvittini

sofiapuvogelvittini commented Apr 27, 2022

Same problem here, I used [YitengDang] code modification to netClustering() and it worked. However, while using netVisual_embedding(cellchat, type = "functional", label.size = 3.5, pathway.remove = cellchat@options$pathways.ignore)
I obtain the following error
Error in data.frame(x = Y[, 1], y = Y[, 2], Commun.Prob. = prob_sum/max(prob_sum), : arguments imply differing number of rows: 49, 51

And also I obtain NULL while typing cellchat@options$pathways.ignore

Could you please help with this? Thank you very much.
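
A likely cause of the NULL, offered as a hypothesis rather than a confirmed diagnosis: inside netClustering() the CellChat object is the argument `object`, so an assignment to `cellchat@options` made inside the function changes a local copy and is lost when the function returns. The sketch below shows this R scoping behaviour with a made-up S4 class `Toy` and two hypothetical functions, `patched` and `fixed`; none of these names come from CellChat.

```r
# Made-up S4 class with an `options` slot, standing in for a CellChat object.
setClass("Toy", representation(options = "list"))
cellchat <- new("Toy", options = list())

# Mirrors the patch: assigns to `cellchat` inside a function whose argument
# is called `object`. This modifies a local copy that is never returned.
patched <- function(object) {
  cellchat@options$pathways.ignore <- "PW3"
  object
}

# Assigns to the argument that is actually returned.
fixed <- function(object) {
  object@options$pathways.ignore <- "PW3"
  object
}

res.patched <- patched(cellchat)@options$pathways.ignore
res.fixed <- fixed(cellchat)@options$pathways.ignore
print(is.null(res.patched))
print(res.fixed)
```

If this is what happens, writing `object@options$pathways.ignore` inside the modified function, and assigning the function's return value back to `cellchat`, may make the stored pathway names available to netVisual_embedding.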

@sqjin
Owner

sqjin commented Apr 27, 2022

@sofiapuvogelvittini I am sorry to hear that this issue has not been properly addressed. Would you be willing to share your cellchat object with me so that I can test it? Do you have the same issue when running the data in the tutorial, "./tutorial/cellchat_humanSkin_LS.rds"?

@sofiapuvogelvittini

Dear sqjin, thank you very much for offering help. I would be happy to share the cellchat object with you so you can test.
I have the same issue when running the data in the tutorial. By modifying the code as suggested by [YitengDang] I can run my.netClustering() without problem, however while plotting with

netVisual_embedding(cellchat, type = "functional", label.size = 3.5, pathway.remove = cellchat@options$pathways.ignore) i still have the error:
Error in data.frame(x = Y[, 1], y = Y[, 2], Commun.Prob. = prob_sum/max(prob_sum), : arguments imply differing number of rows: 25, 26

and I also obtain NULL in cellchat@options$pathways.ignore.
How can I send you the object?
All the best,
Sofía

@AIYang1210

@YitengDang Hi, may I ask whether you have solved this problem? I ran into the same issue.

  1. future::plan("multiprocess", workers = 4)
    Error: ‘node$session_info$process$pid == pid’ is not TRUE

When I changed the code, the error still appeared.

future::plan("multisession", workers = 4)
Error: ‘node$session_info$process$pid == pid’ is not TRUE

Interestingly, it only worked with the following code, after which "identifyOverExpressedGenes" ran well.
future::plan("multisession", workers = 1)

But if workers is set to more than 1, the error appears again:

future::plan("multisession", workers = 2)
Error: ‘node$session_info$process$pid == pid’ is not TRUE
future::plan("multisession", workers = 3)
Error: ‘node$session_info$process$pid == pid’ is not TRUE
future::plan("multisession", workers = 4)
Error: ‘node$session_info$process$pid == pid’ is not TRUE

  2. For netClustering(), I changed
    netClustering <- function(object, slot.name = "netP", type = c("functional","structural"), comparison = NULL, k = NULL, methods = "kmeans", do.plot = TRUE, fig.id = NULL, do.parallel = TRUE, nCores = 1, k.eigen = NULL)

but it reported:
Error in storage.mode(x) <- "double" :
'list' object cannot be coerced to type 'double'

when I modified "kmeans(data.frame(data.use),kRange[x],nstart=10)$cluster",
it showed "Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
cannot coerce class ‘"umap"’ to a data.frame "

So, could you help me to solve these problems? Thank you very much!

@sqjin
Owner

sqjin commented May 29, 2022

@AIYang1210 I suggest setting future::plan("multisession", workers = 1) when running this function. We will fix it once we find a good solution.
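
For anyone who wants to try more workers without the script stopping on that error, here is a minimal sketch. The helper name `set_plan_safely` is made up for this example; it only wraps `future::plan()` in `tryCatch()` and falls back to a sequential plan if setting up the workers fails. It does not fix the underlying problem.

```r
# Hypothetical helper (not part of CellChat or future): try a multisession
# plan and fall back to sequential processing if worker setup fails.
set_plan_safely <- function(workers = 1) {
  if (!requireNamespace("future", quietly = TRUE)) {
    return("future not installed")
  }
  tryCatch({
    future::plan("multisession", workers = workers)
    "multisession"
  }, error = function(e) {
    message("multisession failed: ", conditionMessage(e))
    future::plan("sequential")
    "sequential"
  })
}

strategy <- set_plan_safely(workers = 1)
print(strategy)
```

Calling `set_plan_safely(workers = 4)` before the CellChat functions that use future would then either parallelise or quietly run sequentially.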

@pouryany

I was facing the same type of error (i.e., NaNs in the embedding). I just changed the UMAP backend package to uwot and the errors seem to be fixed. I guess this is related to the initial seeds of the umap function. So if you can somehow introduce an option to set seeds within the netEmbedding function, such errors could be fixed.

Best,
Pourya
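
A minimal sketch of the seeded uwot approach on toy data, for illustration only. The matrix `X` is random and merely stands in for a similarity matrix, and the parameter values are arbitrary. How this would be wired into netEmbedding is not shown and is not claimed here.

```r
# Toy input: 40 "pathways" described by 5 random features (illustrative only).
set.seed(42)                      # fix R's RNG so the layout is reproducible
X <- matrix(runif(40 * 5), nrow = 40,
            dimnames = list(paste0("PW", 1:40), NULL))

if (requireNamespace("uwot", quietly = TRUE)) {
  # uwot::umap() runs entirely in R/C++, so no Python installation is involved.
  emb <- uwot::umap(X, n_neighbors = 10, min_dist = 0.3, n_components = 2)
  all.finite <- all(is.finite(emb))   # check for NaN/Inf before clustering
} else {
  all.finite <- NA                    # uwot not installed; nothing to check
}
print(all.finite)
```

Checking `all(is.finite(...))` on the embedding right after it is computed is a cheap way to catch the NaN problem before k-means is called.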

@maqingyue

maqingyue commented Mar 20, 2023

I get the same error, but it appears in the computeCommunProb function. Is there a definitive solution?

cellchat <- computeCommunProb(cellchat, raw.use = TRUE, type = "truncatedMean", trim = 0.1, distance.use = TRUE, interaction.length = 200, scale.distance = 0.01)
truncatedMean is used for calculating the average gene expression per cell group.
Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)

@double322

Hi, I'm simply trying to run the tutorial script in Rstudio, but am running into the following problem.

When running
cellchat <- netEmbedding(cellchat, slot.name = 'netP', type = "functional"),
I get the error message:
Manifold learning of the signaling networks for a single dataset
Error in runUMAP(Similarity, min_dist = min_dist, n_neighbors = n_neighbors, :
Cannot find UMAP, please install through pip (e.g. pip install umap-learn or reticulate::py_install(packages = 'umap-learn')).

reticulate::py_install(packages = 'umap-learn')
Using virtual environment "C:/Users/18408/Documents/.virtualenvs/r-reticulate" ...

  • "C:/Users/18408/Documents/.virtualenvs/r-reticulate/Scripts/python.exe" -m pip install --upgrade --no-user umap-learn
    Requirement already satisfied: umap-learn in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (0.5.4)
    Requirement already satisfied: numpy>=1.17 in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (1.25.2)
    Requirement already satisfied: scipy>=1.3.1 in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (1.11.3)
    Requirement already satisfied: scikit-learn>=0.22 in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (1.3.1)
    Requirement already satisfied: numba>=0.51.2 in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (0.58.0)
    Requirement already satisfied: pynndescent>=0.5 in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (0.5.10)
    Requirement already satisfied: tqdm in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (4.66.1)
    Requirement already satisfied: llvmlite<0.42,>=0.41.0dev0 in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from numba>=0.51.2->umap-learn) (0.41.0)
    Requirement already satisfied: joblib>=0.11 in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from pynndescent>=0.5->umap-learn) (1.3.2)
    Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from scikit-learn>=0.22->umap-learn) (3.2.0)
    Requirement already satisfied: colorama in c:\users\18408\documents.virtualenvs\r-reticulate\lib\site-packages (from tqdm->umap-learn) (0.4.6)

cellchat <- netEmbedding(cellchat, type = "functional")
Manifold learning of the signaling networks for a single dataset
Error in runUMAP(Similarity, min_dist = min_dist, n_neighbors = n_neighbors, :
Cannot find UMAP, please install through pip (e.g. pip install umap-learn or reticulate::py_install(packages = 'umap-learn')).
Could you please help me solve this issue? Thanks a lot in advance!
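
One thing worth checking, as a guess rather than a confirmed diagnosis: umap-learn may be installed in one Python environment while the R session is bound to a different one. The sketch below defines a hypothetical helper, `check_umap_python` (not part of CellChat or reticulate), that pins the session to a named virtualenv and reports whether the `umap` module is importable from it. The default environment name `r-reticulate` is taken from the install log above.

```r
# Hypothetical helper: report which Python reticulate is bound to and whether
# the `umap` module (provided by the umap-learn package) can be imported.
check_umap_python <- function(envname = "r-reticulate") {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    stop("The reticulate package is not installed.")
  }
  # Only effective if Python has not yet been initialised in this R session,
  # so run this in a fresh session before loading anything that uses Python.
  reticulate::use_virtualenv(envname, required = TRUE)
  list(python = reticulate::py_config()$python,
       umap_found = reticulate::py_module_available("umap"))
}

# Usage, in a fresh R session:
# check_umap_python()
```

If `umap_found` comes back FALSE, or `python` points somewhere other than the virtualenv in the log, that mismatch would be the first thing to resolve. This only tests importability and may not match exactly how CellChat looks for UMAP.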
