Hacker News new | past | comments | ask | show | jobs | submit login

There's a fun paper by Autodesk where they make datasets that look whatever way you want them to.

https://www.autodesk.com/research/publications/same-stats-di...




Yes, a both funny and insightful lesson on how weak basic indicators (mean, standard deviation, correlation) can be, the interest and limits of box-plot with quartiles and whiskers, the benefit of the violin-plot.

Definitely worth a quick look.


NB: Post author here.

This is great! So fun...will have to use in the future...


While I have your attention...

> For the median and average to be equal, the points less than the median and greater than the median must have the same distribution (i.e., there must be the same number of points that are somewhat larger and somewhat smaller and much larger and much smaller).

[0, 2, 5, 9, 9] has both median and mean = 5, but the two sides don't really have the same distribution.


Totally true...thoughts on how I could rephrase? I guess it's more the "weight" of points greater than and less than the median should be the same, so symmetric distributions definitely have it, asymmetric may or may not. Definitely open to revising...


> symmetric distributions definitely have it, asymmetric may or may not

Doesn't have to be any more complicated than that. It's more a curio than an important point anyway :)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: