isapprox broken on large numbers #12375

Closed
jw3126 opened this issue Jul 29, 2015 · 12 comments
Labels: maths (Mathematical functions)

Comments

@jw3126 (Contributor) commented Jul 29, 2015

I was writing some tests for isapprox and made the following horrible discovery:

julia> isapprox(1e17, 1)
true

The issue lies in the default tolerances:

julia> import Base.rtoldefault

julia> rtoldefault(1e17, 1)
2.5198420997897464

More generally, numbers larger than 5e16 are close to everything by default, since

julia> rtoldefault(5e16, 1)
2.0
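
These numbers appear to come straight from the current definition quoted further down the thread, rtoldefault(x, y) = cbrt(max(eps(x), eps(y))): eps is evaluated at the arguments' magnitudes, so the tolerance grows with the inputs. A minimal check (eps(1e17) and eps(5e16) are exactly 16.0 and 8.0 for Float64):

julia> cbrt(max(eps(1e17), eps(1.0)))   # cbrt(16.0)
2.5198420997897464

julia> cbrt(max(eps(5e16), eps(1.0)))   # cbrt(8.0)
2.0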

Julia Version 0.4.0-dev+5491
Commit cb77503 (2015-06-21 09:45 UTC)
Platform Info:
System: Linux (x86_64-unknown-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
LAPACK: libopenblas
LIBM: libopenlibm
LLVM: libLLVM-3.3

stevengj added the 'maths (Mathematical functions)' label Jul 29, 2015
@stevengj (Member)

cc @jiahao, @alanedelman

@stevengj (Member)

My basic problem with the current implementation is that rtoldefault and atoldefault seem wrong (and in fact they are dimensionally incorrect, i.e. they have the wrong "units" if you think of x and y as dimensionful values, and this leads to incorrect scaling):

  • rtoldefault is a relative tolerance, so it should be dimensionless. The current definition, cbrt(max(eps(x), eps(y))), has the cube root of the dimensions of x or y, which is wrong. It should be cbrt(max(eps(typeof(x)), eps(typeof(y)))), assuming you want 1/3 of the digits.
  • atoldefault is currently sqrt(max(eps(x), eps(y))). This is also dimensionally incorrect, because it has the units of the square root of x or y (whereas an absolute tolerance should have the same units as x and y). I think the only meaningful default absolute tolerance we can use is zero, since a nonzero absolute tolerance needs to come from information outside of x or y.

If we change it to

rtoldefault(x::FloatingPoint, y::FloatingPoint) = cbrt(max(eps(typeof(x)), eps(typeof(y))))
atoldefault(x::FloatingPoint, y::FloatingPoint) = 0

then it would seem okay to me, although

  • cbrt (1/3 of the digits) seems pessimistic to me; sqrt (1/2 the digits) might be a better default?
  • It's not clear to me why we are taking the max of the epsilons, which corresponds to comparing equality according to the lower of the two precisions. If I ask whether a Float64 is approximately equal to a BigFloat, aren't I usually asking about equality at the BigFloat's precision? We promote to the higher of two precisions for every other operation, why not this one?
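
For concreteness, here is how those proposed defaults would play out on the example from the original report, using a combined check of the form abs(x - y) <= atol + rtol*max(abs(x), abs(y)) (that exact form is assumed here for illustration, not part of the proposal):

rtol = cbrt(eps(Float64))                              # ≈ 6.06e-6, now dimensionless
abs(1e17 - 1) <= 0 + rtol * max(abs(1e17), abs(1))     # false, as one would hope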

@jiahao (Member) commented Jul 29, 2015

I'd be happy to get rid of isapprox entirely. The use cases are really application specific and don't generalize well.

FWIW Knuth defines two approximate floating-point comparisons based on an (unspecified) threshold ϵ relative to the floating-point exponent: ~ for "approximately equal to" and ≈ for "essentially equal to".

[Screenshot: Knuth's definitions of ~ ("approximately equal to") and ≈ ("essentially equal to")]

Ref: TAoCP 2/e, Vol. 2, p. 218

GSL implements ~ in gsl_fcmp.
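
A rough Julia sketch of the two relations (not GSL's code; eps_ stands for the unspecified threshold ϵ, and the scale factor matches Knuth's b^e only up to a factor of the base b = 2):

# Scale of x for b = 2; note exponent() throws a DomainError for 0, Inf, and NaN.
magnitude(x) = 2.0^exponent(x)

approx_eq(u, v, eps_)      = abs(u - v) <= eps_ * max(magnitude(u), magnitude(v))  # u ~ v
essentially_eq(u, v, eps_) = abs(u - v) <= eps_ * min(magnitude(u), magnitude(v))  # u ≈ v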

@jiahao (Member) commented Jul 29, 2015

I should have read on, for on the next page, Knuth recommends an ϵ based on wanting the associativity of floating point multiplication to hold under ≈:

[Screenshot: Knuth's recommended ϵ, chosen so that floating-point multiplication is associative under ≈]

@jiahao (Member) commented Jul 29, 2015

If we use Knuth's definition for the OP, then we get the rhs of (24) to be 64.00000000000001 (modulo possible sloppiness with factors of 2), which is not too far from

julia> Base.atoldefault(1e17,1)
4.0
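
(That 4.0 is just the old sqrt-based default evaluated at the arguments' magnitudes: atoldefault(1e17, 1) = sqrt(max(eps(1e17), eps(1.0))) = sqrt(16.0) = 4.0.)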

@pao (Member) commented Jul 29, 2015

The original default tolerances discussion (for background, though probably not too enlightening) is in #652.

@stevengj (Member)

@pao, the current tolerances are dimensionally incorrect (they scale incorrectly with x and y), so they are just unacceptable.

@jiahao, it seems to me Knuth's tolerance is too strict to be commonly useful as defaults for isapprox; I don't think he's recommending it for general usage, just for analyzing associativity.

My feeling is that sqrt(ε) or cbrt(ε) [where ε is the relative precision, i.e. eps(T)] is about right as a default: if you've lost more than 1/2 or 2/3 of your digits, then you probably need to go to higher precision. Whereas losing 1/3 of your digits is often considered acceptable.
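
In Float64 terms (eps(Float64) = 2^-52 ≈ 2.2e-16), the two candidates work out to roughly:

sqrt(eps(Float64))   # = 2^-26 ≈ 1.5e-8: require agreement in about half of the ~16 digits
cbrt(eps(Float64))   # ≈ 6.1e-6: require agreement in only about a third of them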

@jiahao (Member) commented Jul 30, 2015

I like the idea of setting atol = 0.0 and rtol = √eps(promote_type(typeof(a), typeof(b))) as defaults.
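
A minimal sketch of what that would look like for floating-point scalars (the name and the atol + rtol*max(...) form are illustrative assumptions, not the code that eventually landed):

# Proposed defaults: atol = 0, rtol = √eps of the promoted type.
function isapprox_sketch(a::FloatingPoint, b::FloatingPoint;
                         rtol = sqrt(eps(promote_type(typeof(a), typeof(b)))),
                         atol = 0)
    abs(a - b) <= atol + rtol * max(abs(a), abs(b))
end

isapprox_sketch(1e17, 1.0)        # false, fixing the original report
isapprox_sketch(1.0, 1.0 + 1e-9)  # true: 1e-9 is within √eps(Float64) ≈ 1.5e-8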

@stevengj (Member)

(And ≈ probably should be a synonym for isapprox.)

@simonbyrne (Contributor)

The only problems I can see are at the boundaries (subnormals or Infs). Other than that, +1.
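
For example, with atol = 0 and a pure relative check like the sketches above, two boundary behaviors worth deciding on explicitly (these specific cases are my illustration, not necessarily the ones meant here):

abs(Inf - Inf)              # NaN, so Inf would compare as not approximately equal to Inf
nextfloat(0.0)              # 5.0e-324, the smallest subnormal
nextfloat(nextfloat(0.0))   # one ulp away, yet a 2x relative difference, so never within rtol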

@pao (Member) commented Jul 30, 2015

the current tolerances are dimensionally incorrect (they scale incorrectly with x and y), so they are just unacceptable

I made no claim to the contrary.

stevengj added commits to stevengj/julia that referenced this issue Jul 30, 2015
stevengj added a commit that referenced this issue Jul 31, 2015
@stevengj (Member) commented Aug 5, 2015

See also #12472 for further developments.
