Unbiased squared distance covariance increases as sample size becomes large #353
Hello @loremarchi, I reviewed your code and the error, and it may be due to implementation differences or a lack of statistical power. To resolve this you could use another library, or try this code:

```python
import numpy as np
import hyppo

def _r_distance_corr(X, Y, mode="squared_cov", unbiased=True):
    ...  # function body truncated in the original comment

t = hyppo.independence.Dcorr()
n_samples = [100, 1000, 10000, 50000, 70000, 100000]
```
Dear team, first of all, thank you for maintaining this useful package. I was trying to run the independence test by Shen et al. (2022), but I received some strange results, such as a rejection of independence even though I knew that the two one-dimensional vectors I was testing were independent. I only noticed something was off when working with large samples (more than 70,000 observations).

Although I have not yet identified the source of the problem in the code, I have a script that reproduces the issue. Please note that I have modified the `statistic` method in `dcorr.py` to output both `"stat"` and `"covar"`.
Reproducing code example:
Please note that as the sample size increases (n = 70000 and n = 100000), the hyppo unbiased squared distance covariance becomes unintuitive.
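For context, a minimal NumPy-only sketch of the unbiased (U-centered) squared distance covariance estimator of Székely & Rizzo (2014) for one-dimensional vectors is below. The function names are illustrative, not hyppo's API; for independent inputs the estimate should hover near zero (it can be slightly negative) regardless of sample size, which is the behavior the issue reports as broken.

```python
import numpy as np

def u_centered(D):
    """U-center a pairwise distance matrix (Székely & Rizzo, 2014).

    A_ij = d_ij - row_i/(n-2) - col_j/(n-2) + total/((n-1)(n-2)),
    with the diagonal set to zero.
    """
    n = D.shape[0]
    row = D.sum(axis=1, keepdims=True) / (n - 2)
    col = D.sum(axis=0, keepdims=True) / (n - 2)
    total = D.sum() / ((n - 1) * (n - 2))
    A = D - row - col + total
    np.fill_diagonal(A, 0.0)
    return A

def unbiased_dcov_sq(x, y):
    """Unbiased estimator of the squared distance covariance for 1-D vectors."""
    n = len(x)
    A = u_centered(np.abs(x[:, None] - x[None, :]))
    B = u_centered(np.abs(y[:, None] - y[None, :]))
    # Diagonal entries are zero, so summing the full product is safe.
    return (A * B).sum() / (n * (n - 3))
```

For independent samples this quantity is an unbiased estimate of zero, so it should not grow with n; a systematic increase at n = 70000 and beyond points to a numerical or implementation problem rather than a property of the estimator.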
Results
Version information