Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leave division for variance calculation until it's required #11

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rupertmillard
Copy link

(Hi - this is the first time I've used git or github!)
Instead of storing variance, I suggest storing Q = variance * size
This improves speed of add() by ~2% and merge() by ~12%, without affecting the numerical stability of this implementation
Of course, it shall reduce the speed of variance() slightly, but even if you calculate variance after every insertion, we're saving some multiplies over the previous implementation

@BatmanAoD
Copy link

Doesn't this (slightly) increase the risk of an overflow error? (The tradeoff may still be worth it; I'm not sure.)

@rupertmillard
Copy link
Author

No, not at all.

When the variance was stored, at the beginning of the algorithm for inclusion of another sample, q = variance * size was calculated, then at the end of that, variance = q / size was calculated. Now q is stored and only divided by size to calculate variance when you want the variance.

@BatmanAoD
Copy link

Ah, I missed the calculation of prevq. Objection withdrawn!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants