Error reporting for distance matrix input is unclear.
There are two problems here: 1) sklearn's pairwise_distances returns a matrix whose diagonal is not exactly 0 when you pass in a pandas DataFrame of floats, which is obviously not your problem. 2) The error message reports a shape mismatch when it shouldn't; it should report that the trace of one of the distance matrices is not 0.
However, I'm wondering whether it also makes sense to change these to soft checks (close to 0 as opposed to exactly 0). I don't have a strong feeling about that either way.
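To make the suggestion concrete, here is a rough sketch of what separate, softer checks could look like. This is not hyppo's code: the function name `check_xy_distmat_soft` and the `atol` parameter are hypothetical, and the tolerance value is only an assumption.

```python
import numpy as np


def check_xy_distmat_soft(x, y, atol=1e-6):
    """Hypothetical check: one error per failure, soft zero-diagonal test."""
    x = np.asarray(x)
    y = np.asarray(y)
    for name, mat in (("x", x), ("y", y)):
        # Shape problems get their own message.
        if mat.ndim != 2 or mat.shape[0] != mat.shape[1]:
            raise ValueError(
                f"{name} must be a square [n, n] distance matrix, "
                f"got shape {mat.shape}."
            )
        # Soft check: the diagonal only has to be close to 0, not exactly 0.
        if not np.allclose(np.diag(mat), 0, atol=atol):
            raise ValueError(
                f"{name} must have a zero diagonal; largest absolute "
                f"diagonal entry is {np.abs(np.diag(mat)).max():.3g}."
            )
    # Both matrices must describe the same number of samples.
    if x.shape != y.shape:
        raise ValueError(
            f"x and y must have the same shape, got {x.shape} and {y.shape}."
        )
```

Note that the default `atol=1e-8` of `np.allclose` would still reject the diagonal value of about 2.1e-08 seen in this report, so the tolerance would need to be chosen deliberately.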
Reproducing code example:
import numpy as np
import pandas as pd
from sklearn.metrics import pairwise_distances
from hyppo.independence import MGC
X = np.random.uniform(size=(100, 2))
Y = np.random.normal(size=(100, 2))
X = pd.DataFrame(X)
X_dist = pairwise_distances(X)
Y_dist = pairwise_distances(Y)
print(np.diag(X_dist).max())
MGC(None).test(X_dist, Y_dist)
Error message
2.1073424255447017e-08
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/mnt/c/Users/t-bpedig/code/graph-embedding-methods/GraphEmbeddingMethods/sandbox/notebooks/2020-06-29-v2-maggot-hemisphere-single-embed.py in
163 Y_dist = pairwise_distances(Y)
164 print(np.diag(X_dist).max())
---> 165 MGC(None).test(X_dist, Y_dist)
~/miniconda3/envs/embed/lib/python3.7/site-packages/hyppo/independence/mgc.py in test(self, x, y, reps, workers)
217
218 if self.is_distance:
--> 219 check_xy_distmat(x, y)
220
221 # using our joblib implementation instead of multiprocessing backend in
~/miniconda3/envs/embed/lib/python3.7/site-packages/hyppo/_utils.py in check_xy_distmat(x, y)
80 if nx != px or ny != py or np.trace(x) != 0 or np.trace(y) != 0:
81 raise ValueError(
---> 82 "Shape mismatch, x and y must be distance matrices "
83 "have shape [n, n] and [n, n]."
84 )
ValueError: Shape mismatch, x and y must be distance matrices have shape [n, n] and [n, n].
Version information
OS: Ubuntu 20.04
Python Version 3.7.3
Package Version 0.1.2
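For anyone hitting the same error, a possible caller-side workaround is to force the diagonal to exactly 0 before passing the matrices in. This is only a sketch: `zero_diagonal` and its `atol` default are hypothetical and not part of hyppo or sklearn.

```python
import numpy as np


def zero_diagonal(dist, atol=1e-6):
    """Return a copy of dist with its diagonal set to exactly 0 (hypothetical helper)."""
    dist = np.array(dist, dtype=float)  # copies, so the input is left untouched
    diag_max = np.abs(np.diag(dist)).max()
    # Refuse to silently hide a diagonal that is clearly not numerical noise.
    if diag_max > atol:
        raise ValueError(
            f"Diagonal is not close to 0 (max abs value {diag_max:.3g})."
        )
    np.fill_diagonal(dist, 0.0)
    return dist


# Small matrix with the kind of diagonal noise reported above.
noisy = np.array([[2.1e-08, 1.0], [1.0, 0.0]])
clean = zero_diagonal(noisy)
print(np.trace(clean))  # the trace is now exactly 0
```

In the example above this would mean calling `zero_diagonal(X_dist)` and `zero_diagonal(Y_dist)` before the MGC test.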