Cov Float32 precision issues #83

Closed
milankl opened this issue Aug 16, 2021 · 2 comments · Fixed by #85

Comments

milankl commented Aug 16, 2021

Julia v1.6, on both macOS and Linux:

julia> using Statistics
julia> A = 20*randn(Float32,10_000_000) .+ 100;
julia> cov(A,A)
293.97824f0

julia> var(A)
399.9234f0

The true variance of A is 400 (standard deviation 20), yet the estimate from cov is way off. In contrast, var seems to be written so that the rounding error does not grow with the number of elements in A, which is how it should behave.
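For illustration, here is a minimal sketch of how the order of accumulation alone can produce an error of this kind in Float32. It assumes the gap comes from sequential accumulation, which the thread does not confirm as the cause in cov. `naive_sumsq` is a hypothetical helper, not part of Statistics.jl; the pairwise variant relies on Base's `sum` over arrays, which uses pairwise summation.

```julia
using Statistics

# Hypothetical helper (not part of Statistics.jl): sum of squared deviations
# accumulated strictly left to right in the element type of `x`.
function naive_sumsq(x::AbstractVector{T}, m::T) where {T}
    s = zero(T)
    for xi in x
        s += abs2(xi - m)
    end
    return s
end

A = 20 .* randn(Float32, 10_000_000) .+ 100
m = mean(A)      # Float32 mean
n = length(A)

# Float64 reference for the same data and the same mean
reference = sum(abs2, Float64.(A) .- Float64(m)) / (n - 1)
# Sequential Float32 accumulation
naive = naive_sumsq(A, m) / (n - 1)
# Base `sum` on an array (pairwise summation), still in Float32
pairwise = sum(abs2, A .- m) / (n - 1)

@show reference naive pairwise
```

Once the running Float32 total is large enough, small squared deviations are partly or entirely rounded away when added, so the sequential estimate tends to come out too low, while the pairwise one should stay close to the Float64 reference.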

@nalimilan
Member

Good catch. See https://github.com/JuliaLang/Statistics.jl/pull/85.

BTW, note that only the var method for arrays is accurate. The method defined for general iterators, which makes a single pass over the data (using Welford's algorithm), gives the same result as cov. Not sure anything can be done about it.

julia> using Statistics

julia> A = 20*randn(Float32,10_000_000) .+ 100;

julia> cov(A,A)
394.65015f0

julia> var(A)
400.24118f0

julia> var((x for x in A))
394.6492f0

julia> var((x for x in A), mean=mean(A))
394.65015f0
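The single-pass update mentioned above can be sketched as follows. This is a textbook version of Welford's algorithm, not the code in Statistics.jl; `welford_var` and its accumulator-type argument are hypothetical names introduced only to show how the accumulator precision affects the result.

```julia
# Hypothetical helper: textbook single-pass (Welford) sample variance,
# with the running mean and sum of squares held in type `T`.
function welford_var(::Type{T}, itr) where {T}
    n = 0
    μ = zero(T)    # running mean
    m2 = zero(T)   # running sum of squared deviations from the mean
    for x in itr
        n += 1
        δ = T(x) - μ
        μ += δ / n
        m2 += δ * (T(x) - μ)
    end
    return m2 / (n - 1)
end

A = 20 .* randn(Float32, 10_000_000) .+ 100

v32 = welford_var(Float32, (x for x in A))  # Float32 accumulators
v64 = welford_var(Float64, (x for x in A))  # same data, Float64 accumulators

@show v32 v64
```

With Float32 accumulators the running sum is updated sequentially, so it is exposed to the same accumulation error as above; holding the accumulators in Float64 is one possible way to reduce it, though nothing in this thread proposes that.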

milankl commented Sep 13, 2021

Amazing work, thanks @nalimilan!
