Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fmaf is sometimes incorrect #160

Open
ikirill opened this issue Sep 3, 2017 · 9 comments
Open

fmaf is sometimes incorrect #160

ikirill opened this issue Sep 3, 2017 · 9 comments

Comments

@ikirill
Copy link

ikirill commented Sep 3, 2017

This example was pointed out by njuffa on scicomp.stackexchange.com. I checked it also with a recent version of Z3 and exact rational arithmetic.

# This is the native cpu instruction
julia> fma(0.9474001f0, 4.639901f-7, -0.24325085f0)
-0.24325041f0

julia> Base.fma_libm(0.9474001f0, 4.639901f-7, -0.24325085f0)
-0.2432504f0

julia> versioninfo()
Julia Version 0.6.0
Commit 903644385b (2017-06-19 13:05 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)
@yuyichao
Copy link
Contributor

yuyichao commented Sep 3, 2017

FWIW, glibc fmaf got this right. (And it also uses hardware instructions when available...)

@ikirill
Copy link
Author

ikirill commented Sep 3, 2017

For reference, this is what glibc seems to be doing, if I translated the code right (in works in my tests):


function get_FE_INEXACT()
  # https://github.com/JuliaLang/julia/blob/2e6d908cea8334e875ea4d22d771a6eab0eef8e7/src/jlapi.c#L413
  u = zeros(Cint, 9) + typemin(Cint)
  ccall(:jl_get_fenv_consts, Void, (Ptr{Cint},), u)
  myid() > 1 || println(STDERR, "jl_get_fenv_consts: ", u)
  u[1]
end

const FE_INEXACT = get_FE_INEXACT()

fetestexcept(excepts) = ccall((:fetestexcept, :libc), Cint, (Cint,), excepts)
feclearexcept(excepts) = ccall((:feclearexcept, :libc), Cint, (Cint,), excepts) == 0 || error("feclearexcept")
feinexact() = fetestexcept(FE_INEXACT) != 0

function glibc_fmaf(x::Float32, y::Float32, z::Float32)
  temp = Float64(x) * Float64(y)
  if temp == Float64(-z)
    return Float32(temp) + z
  end

  setrounding(Float64, RoundToZero)
  feclearexcept(FE_INEXACT)
  u = temp + Float64(z)
  inexact = feinexact()
  setrounding(Float64, RoundNearest)
  w = reinterpret(UInt64, u)
  if inexact && (w & 0x1) == 0 && ((w & 0x7ff0_0000_0000_0000) >> 52) != 0x7ff
    Float32(reinterpret(Float64, w | 0x1))
  else
    Float32(u)
  end
end

@ikirill
Copy link
Author

ikirill commented Sep 3, 2017

One idea I had is that the result - xy == z is not actually a correct way to test whether the sum xy + z was computed exactly, and replace the test with computing the correct error using the double-double 2sum algorithm and checking that the known correct error is zero instead. This is different from what glibc does, which seems based on a different concept entirely (but maybe not, on further thought I'm not sure; maybe all I'm doing is emulating the FE_INEXACT flag). Then it no longer seems to fail (I tested it on 2^35 random values, which is good enough to trigger the original bug):

# Hopefully this causes it to correctly round to zero
@noinline function add_rtz(x::Float64, y::Float64)
  ccall(:fesetround, Cint, (Cint,), Base.Rounding.JL_FE_TOWARDZERO)
  w = x + y
  ccall(:fesetround, Cint, (Cint,), Base.Rounding.JL_FE_TONEAREST)
  return w
end

# This is a translation of openlibm's fmaf
function fma_libm_modified(x::Float32, y::Float32, z::Float32)
  xy = Float64(x) * Float64(y)

  # The original openlibm code:
  # result = xy + Float64(z)
  # exact = result - xy == z

  # A correct test for whether xy+z was computed exactly
  # This is the double-double 2sum algorithm
  result = xy + Float64(z)
  temp = result - xy
  # order of parens is important:
  result_err = (xy - (result - temp)) + (Float64(z) - temp)
  exact = result_err == 0.0

  w = reinterpret(UInt64, result)

  if (w & 0x1fffffff) != 0x10000000 || (w & 0x7ff0000000000000) == 0x7ff0000000000000 || exact
    return Float32(result)
  else
    adjusted_result = add_rtz(xy, Float64(z))
    if result == adjusted_result
      return Float32(reinterpret(Float64, w + 1))
    else
      return Float32(adjusted_result)
    end
  end
end

@simonbyrne
Copy link
Member

We can't really use glibc code for licensing reasons. musl code shouldn't be a problem.

@ikirill
Copy link
Author

ikirill commented Sep 5, 2017

@simonbyrne Right, sorry, I wasn't thinking about licenses when I posted that, I was just curious what the right approach was. They both seem to be based on fma emulation through rounding-to-odd, like this: Emulation of a FMA and correctly-rounded sums:
proved algorithms using rounding to odd
by Sylvie Boldo, Guillaume Melquiond

@ViralBShah
Copy link
Member

In Julia, JuliaLang/julia#42783 fixes the software fma.

@ViralBShah
Copy link
Member

➜  ~ ./julia/julia 
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.0-DEV.894 (2021-11-05)
 _/ |\__'_|_|_|\__'_|  |  Commit 7d41d1eb61 (4 days old master)
|__/                   |

julia> Base.fma_llvm(0.9474001f0, 4.639901f-7, -0.24325085f0)
-0.24325041f0

julia> Base.fma(0.9474001f0, 4.639901f-7, -0.24325085f0)
-0.24325041f0

julia> Base.fma_emulated(0.9474001f0, 4.639901f-7, -0.24325085f0)
-0.24325041f0

@ViralBShah ViralBShah changed the title fmaf is sometimes incorrect fmaf is sometimes incorrect in Julia Nov 10, 2021
@yuyichao
Copy link
Contributor

Is the openlibm function fixed?

@ViralBShah
Copy link
Member

Huh - I thought this issue was about the Julia fma implementation, which is fixed. I see that the openlibm one is still broken.

@ViralBShah ViralBShah reopened this Nov 10, 2021
@ViralBShah ViralBShah changed the title fmaf is sometimes incorrect in Julia fmaf is sometimes incorrect Nov 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants