Clustergram - Request to cluster based on a selected aggregated "groupby" column #645
Comments
I just ran into a dataset with so many samples that SciPy raised a "maximum recursion depth exceeded" error when attempting to cluster them, so being able to optionally cluster by an aggregated category would also alleviate this issue.
Hi @adkinsrs. The reordering of the data happens not in the Clustergram component directly, but in the Dendrogram class from the plotly.figure_factory module, so we are not able to fix the main problem of this issue in the dash-bio project. We can create an issue about the reordering problem against the original Dendrogram component in figure_factory. Best wishes,
Currently the Dash clustergram is restricted to clustering based on all row or column values. There are cases where I would like to sort my data based on a chosen metadata category, and then cluster based on the mean value of each group in that category. Right now I am forced to choose between preserving the sort order without clustering, and clustering by the raw data values, which loses the aesthetic grouping that came from pre-sorting the data. Below are two pictures of Dash-Bio Clustergrams (with my own post-processing touches) that show the situation I am trying to convey.
Clustering by individual samples instead of category
![Screen Shot 2021-12-08 at 11 03 43 AM](https://user-images.githubusercontent.com/5665914/145241495-3d9bad28-6116-45ad-8faa-e399245c7f2a.png)
Sorted by a category but no clustering
![Screen Shot 2021-12-08 at 11 03 31 AM](https://user-images.githubusercontent.com/5665914/145241498-74269377-da57-4a34-8231-6a201e166ba7.png)
The functionality I am requesting is similar to the dendrogram option for Scanpy's heatmap function (see https://scanpy.readthedocs.io/en/stable/generated/scanpy.pl.heatmap.html).
I thought a potential solution would be to:

1. Aggregate the data by the chosen category (mean value per group).
2. Run `dashbio.Clustergram` on this to get the dendrogram traces back.
3. Run `dashbio.Clustergram` using the sorted, non-grouped original data.

But I would be running the "clustergram" tool twice, and since the category groups have uneven counts of members, the traces from step 2 would not line up 1-to-1 with the sorted data and the x/y coords would need to be adjusted.
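
For illustration, the ordering part of the steps above could be sketched roughly as follows, using pandas and SciPy directly rather than calling `dashbio.Clustergram` twice. This is only a sketch under assumptions: the function name `order_samples_by_clustered_groups`, the toy data, and the choice of average linkage with Euclidean distance are all hypothetical and not part of dash-bio. It computes the sample order only; it does not rescale dendrogram trace coordinates for groups of uneven size.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, leaves_list


def order_samples_by_clustered_groups(data, groups, method="average"):
    """Hypothetical helper: cluster categories on their mean profiles.

    data   : DataFrame with one row per sample and one column per feature
    groups : Series of category labels sharing the index of `data`
    Returns (sample_order, group_order) as lists of labels.
    """
    # Step 1: collapse the samples to one mean profile per category.
    group_means = data.groupby(groups).mean()

    # Step 2: hierarchically cluster the (small) aggregated matrix.
    # Needs at least two categories.
    Z = linkage(group_means.to_numpy(), method=method, metric="euclidean")
    group_order = list(group_means.index[leaves_list(Z)])

    # Step 3: sort the original samples by their category's position in
    # the dendrogram leaf order. A stable sort keeps the existing
    # within-category order of the samples.
    rank = {g: i for i, g in enumerate(group_order)}
    positions = groups.map(rank).to_numpy()
    sample_order = list(data.index[np.argsort(positions, kind="stable")])
    return sample_order, group_order


# Toy data (assumed for illustration): six samples in three categories.
data = pd.DataFrame(
    [[0.0, 0.1], [10.0, 10.2], [0.2, 0.0], [5.0, 5.1], [10.1, 9.9], [5.2, 4.9]],
    index=["s1", "s2", "s3", "s4", "s5", "s6"],
    columns=["geneA", "geneB"],
)
groups = pd.Series(["low", "high", "low", "mid", "high", "mid"], index=data.index)

sample_order, group_order = order_samples_by_clustered_groups(data, groups)
print(group_order)
print(sample_order)
```

Because only the category means are clustered, the dendrogram has one leaf per category instead of one per sample, which should also keep the tree small enough to avoid the recursion-depth problem on very large datasets. The reordered data could then be passed to the clustergram with clustering disabled, though drawing the category-level dendrogram over unevenly sized groups would still need the coordinate adjustment mentioned above.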
Any thoughts on this enhancement?