faster Float32 and Float16 pow #40236

Merged
merged 2 commits into from
Apr 24, 2021

@oscardssmith commented Mar 27, 2021

Returns Float32 pow to roughly its Julia 1.5 speed and also speeds up Float16 pow. The new methods are accurate to about 0.5 ULP (based on limited testing). There is further room to optimize them, but this fixes the regression. At some point I hope to have a Float64 version, but that will be much harder: it requires a log2 function that returns extra bits of precision and an exp2 function that takes a double-double input.
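A minimal sketch of the idea described above, not the code merged in this PR: for Float32 inputs, doing the logarithm and the multiplication in Float64 leaves enough spare bits that the final rounding back to Float32 should dominate the error. The name `pow32_sketch` is hypothetical, and Base's `exp2` stands in for the PR's internal `exp2_fast`.

```julia
# Hypothetical sketch: x^y for Float32 via a Float64 intermediate.
# Base.exp2 is used here in place of the PR's internal exp2_fast.
function pow32_sketch(x::Float32, y::Float32)
    # log2 and the multiply run in Float64, so their rounding error is far
    # smaller than one Float32 ULP; the conversion at the end is the main
    # source of rounding.
    return Float32(exp2(log2(Float64(x)) * Float64(y)))
end

pow32_sketch(2.0f0, 10.0f0)  # 1024.0f0
```

Note that `log2` throws a `DomainError` for negative `x`, so this sketch only covers non-negative bases.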

@dkarrasch added the domain:maths (Mathematical functions) and performance (Must go faster) labels Mar 27, 2021
@oscardssmith
Member Author

Do we require that pow produce exact results for Integer arguments? That's the test that is currently failing.
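The failing test itself is not shown in this thread. As an illustration only, a check of this kind could compare a pow implementation against exact integer arithmetic for small integer-valued arguments. The helper name `inexact_integer_powers` and the argument ranges are assumptions.

```julia
# Hypothetical check: for small integer-valued arguments whose exact power is
# representable in Float32 (below 2^24), compare pow(x, y) with integer math.
function inexact_integer_powers(pow)
    failures = Tuple{Int,Int}[]
    for b in 1:10, e in 0:7
        exact = b^e
        exact < 2^24 || continue   # skip values Float32 cannot hold exactly
        pow(Float32(b), Float32(e)) == Float32(exact) || push!(failures, (b, e))
    end
    return failures
end

# Returns the (base, exponent) pairs whose result was not exact.
inexact_integer_powers(^)
```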

@oscardssmith
Member Author

I think this is ready to go. I haven't tested it fully rigorously, but every test case I've tried has worked, and conceptually the error should be just over 0.5 ULP.

@oscardssmith
Member Author

I've tested a wide variety of random inputs and haven't found an error above 0.5 ULP. Given that the current pow is system dependent (about 0.75 ULP on my system), I think we should merge.
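The test harness used for these measurements is not shown. A rough sketch of how such a random ULP test could be set up, assuming a BigFloat reference and arbitrarily chosen input ranges (`ulp_error` and `max_ulp_error` are hypothetical helpers):

```julia
# Hypothetical helper: error of `approx` against a BigFloat reference,
# measured in units of the Float32 spacing at `approx`.
function ulp_error(approx::Float32, x::Float32, y::Float32)
    reference = big(x)^big(y)   # high-precision reference value
    return Float64(abs(big(approx) - reference) / big(eps(approx)))
end

# Largest error observed over n random samples from an arbitrary range.
function max_ulp_error(f, n::Integer)
    worst = 0.0
    for _ in 1:n
        x = 1.0f0 + 9.0f0 * rand(Float32)   # base in [1, 10)
        y = 4.0f0 * rand(Float32) - 2.0f0   # exponent in [-2, 2)
        worst = max(worst, ulp_error(f(x, y), x, y))
    end
    return worst
end

max_ulp_error(^, 10_000)
```

Random sampling like this can only give a lower bound on the worst-case error, which matches the "limited testing" caveat in the description.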

@oscardssmith
Member Author

Bumping this. Can someone look at it and merge?

@@ -867,29 +867,35 @@ end
     z
 end
 @inline function ^(x::Float32, y::Float32)
-    z = ccall("llvm.pow.f32", llvmcall, Float32, (Float32, Float32), x, y)
+    z = Float32(exp2_fast(log2(Float64(x))*y))
Sponsor Member

why don't we want to trust llvm (aka libm) here anymore?

Member Author

Julia 1.6 introduced roughly a 3x regression here because of a change in which libm gets loaded.

Sponsor Member

Should we fix openlibm?

Member Author

@oscardssmith oscardssmith Apr 21, 2021

Probably. That said, I generally think we shouldn't rely on external libraries, so I'm not that sad about replacing this anyway.

base/math.jl (review thread resolved)
@oscardssmith merged commit 1474566 into JuliaLang:master Apr 24, 2021
@oscardssmith deleted the better-pow-32 branch April 24, 2021 05:31
@kimikage
Contributor

kimikage commented Apr 25, 2021

@oscardssmith, does this cause a problem when used with @fastmath?
(cc: @vtjnash)

julia> versioninfo()
Julia Version 1.7.0-DEV.1006
Commit 248c02f531* (2021-04-24 17:37 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)

julia> Float16(-1)^2
Float16(1.0)

julia> @fastmath Float16(-1)^2 # used in Colors.jl
ERROR: DomainError with -1.0:
log2 will only return a complex result if called with a complex argument. Try log2(Complex(x)).

The previous commit is OK.

julia> versioninfo()
Julia Version 1.7.0-DEV.998
Commit ac7974acef* (2021-04-23 20:59 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)

julia> @fastmath Float16(-1)^2
Float16(1.0)

@kimikage
Contributor

Another case (found in ColorVectorSpace.jl):

julia> (-1.0f0)^2.0f0
ERROR: DomainError with -1.0:
log2 will only return a complex result if called with a complex argument. Try log2(Complex(x)).

cf. PkgEval: https://github.com/JuliaCI/NanosoldierReports/blob/master/pkgeval/by_date/2021-04/25/report.md

@oscardssmith
Member Author

That's unfortunate. I'll revert tomorrow unless I can think of something clever to fix it without too much performance impact.
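For illustration, one way a guard for negative bases could look. This is a hypothetical sketch, not the fix that was eventually applied, and its performance impact has not been measured. It assumes the usual real-valued convention: a negative base is allowed only with an integer-valued exponent, and odd exponents keep the sign of the base. Base's `exp2` again stands in for `exp2_fast`.

```julia
# Hypothetical sketch: handle negative bases before taking log2.
function pow32_signed_sketch(x::Float32, y::Float32)
    if x < 0
        # A negative base with a non-integer exponent has no real result.
        isinteger(y) || throw(DomainError(x, "negative base needs an integer-valued exponent"))
        magnitude = Float32(exp2(log2(Float64(-x)) * Float64(y)))
        # rem(y, 2) is nonzero exactly when the integer-valued y is odd.
        return rem(y, 2.0f0) != 0 ? -magnitude : magnitude
    end
    return Float32(exp2(log2(Float64(x)) * Float64(y)))
end

pow32_signed_sketch(-1.0f0, 2.0f0)  # 1.0f0, the case reported above
```

The extra branch sits on the hot path, which is the performance concern raised in the comment above.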

simeonschaub added a commit that referenced this pull request Apr 26, 2021
KristofferC pushed a commit that referenced this pull request Apr 26, 2021
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
Approximately .5 ULP, relatively fast.
Update float^integer as well
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
jarlebring pushed a commit to jarlebring/julia that referenced this pull request May 4, 2021
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request May 9, 2021
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request May 9, 2021
johanmon pushed a commit to johanmon/julia that referenced this pull request Jul 5, 2021
johanmon pushed a commit to johanmon/julia that referenced this pull request Jul 5, 2021