This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

keep dimensions consistent when removing pathways in netVisual_embeddingPairwise #271

Open
wants to merge 2 commits into
base: master

Conversation

ktessema

In reference to #270.

There seems to be an issue with how pathways are removed in netVisual_embeddingPairwise. Pathways to be removed are identified in the similarity matrix, where each pathway is represented multiple times (once for each dataset). Each entry in pathway.remove is labeled with the dataset in which it meets the removal condition (colSums(similarity)==1), e.g. "Pathway1--Dataset1".

However, when these entries are removed from each individual dataset's probability matrix, the "--Dataset1" suffix is dropped, so "Pathway1" is removed from all datasets. As a result, the number of rows in the filtered similarity matrix (datasets combined) differs from the total number of rows across the datasets' filtered probability matrices. This happens whenever a pathway is found in multiple datasets but does not meet the removal condition in all of them.
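To make the mismatch concrete, here is a minimal sketch on toy data. It is not the package's code: the `pathways` list, the `labels` vector, and the toy `similarity` values are all made up for illustration, and each dataset's probability matrix is reduced to just its vector of pathway names.

```r
# Toy stand-in for each dataset's pathways (hypothetical names).
pathways <- list(
  Dataset1 = c("Pathway1", "Pathway2"),
  Dataset2 = c("Pathway1", "Pathway3")
)

# Labels of the combined similarity matrix: "<pathway>--<dataset>".
labels <- unlist(lapply(names(pathways), function(d) {
  paste0(pathways[[d]], "--", d)
}))

# Toy similarity matrix: "Pathway1--Dataset1" is similar only to itself,
# every other entry is linked to the remaining ones.
similarity <- diag(length(labels))
dimnames(similarity) <- list(labels, labels)
linked <- labels[-1]
similarity[linked, linked] <- 0.5
diag(similarity) <- 1

# Removal condition from the description above.
pathway.remove <- names(which(colSums(similarity) == 1))

# The similarity matrix drops only the labeled entry ...
keep <- !(labels %in% pathway.remove)
n_sim <- nrow(similarity[keep, keep, drop = FALSE])

# ... but stripping the "--Dataset" suffix removes the pathway everywhere.
stripped <- sub("--.*$", "", pathway.remove)
kept_per_dataset <- lapply(pathways, function(p) setdiff(p, stripped))
n_prob <- sum(lengths(kept_per_dataset))

cat("similarity rows:", n_sim, " probability rows:", n_prob, "\n")
```

In this toy case only "Pathway1--Dataset1" meets the condition, yet "Pathway1" also disappears from Dataset2, so the two counts no longer agree.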

I can think of two ways to resolve this:
A) remove each pathway in pathway.remove from all datasets (in the similarity matrix and in each individual probability matrix)
B) only remove a pathway from an individual dataset's probability matrix if the pathway met the removal condition in that particular dataset
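The two options can be sketched side by side on toy data. This is only an illustration under assumptions: `remove_option_a`, `remove_option_b`, `pathways`, and `labels` are hypothetical names, each probability matrix is reduced to its vector of pathway names, and pathway names are assumed not to contain "--" themselves.

```r
# Toy stand-in for each dataset's pathways (hypothetical names).
pathways <- list(
  Dataset1 = c("Pathway1", "Pathway2"),
  Dataset2 = c("Pathway1", "Pathway3")
)
labels <- unlist(lapply(names(pathways), function(d) {
  paste0(pathways[[d]], "--", d)
}))

# Suppose only this entry met the removal condition.
pathway.remove <- "Pathway1--Dataset1"

# Option A: drop the pathway from every dataset, and drop every matching
# label from the similarity matrix as well.
remove_option_a <- function(pathways, labels, pathway.remove) {
  stripped <- unique(sub("--.*$", "", pathway.remove))
  list(
    labels = labels[!(sub("--.*$", "", labels) %in% stripped)],
    pathways = lapply(pathways, function(p) p[!(p %in% stripped)])
  )
}

# Option B: drop a pathway from a dataset only if its full
# "<pathway>--<dataset>" label is in pathway.remove.
remove_option_b <- function(pathways, labels, pathway.remove) {
  kept <- lapply(names(pathways), function(d) {
    p <- pathways[[d]]
    p[!(paste0(p, "--", d) %in% pathway.remove)]
  })
  names(kept) <- names(pathways)
  list(
    labels = labels[!(labels %in% pathway.remove)],
    pathways = kept
  )
}

res_a <- remove_option_a(pathways, labels, pathway.remove)
res_b <- remove_option_b(pathways, labels, pathway.remove)

# In both cases the label count matches the per-dataset total.
cat("A:", length(res_a$labels), sum(lengths(res_a$pathways)), "\n")
cat("B:", length(res_b$labels), sum(lengths(res_b$pathways)), "\n")
```

Either option keeps the dimensions consistent; they differ in whether "Pathway1" survives in Dataset2 (it does under B, not under A).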

I am not too familiar with the theory here, so I'm not sure which is best. For now I've edited the function to do option B, and I've labeled the edited portions "edit_1" and "edit_2".