The test for checking FMA_NATIVE is faulty. #33011

KristofferC · 2019-08-21T15:06:22Z

We currently check for support for FMA (fused multiply add) by doing

const FMA_NATIVE = muladd(nextfloat(1.0),nextfloat(1.0),-nextfloat(1.0,2)) != 0

From #33010 (comment) @yuyichao said:

This test is wrong. LLVM has all the freedom it wants to return either value for muladd, no matter whether FMA is available. FWIW, ARM / AArch64 even have a fast muladd instruction with intermediate rounding.
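To see what the check is measuring, it helps to work through the arithmetic by hand. The sketch below assumes Float64 with the default rounding mode; the variable names x, z, unfused and fused are introduced here purely for illustration.

```julia
# x = 1 + ε and z = -(1 + 2ε), where ε = eps(Float64) = 2^-52.
x = nextfloat(1.0)
z = -nextfloat(1.0, 2)

# Exactly, x*x = 1 + 2ε + ε². Rounding the product to Float64 drops the ε² term,
# so a separate multiply followed by an add gives exactly zero.
unfused = x * x + z

# fma rounds only once, after the addition, so the ε² term survives.
fused = fma(x, x, z)

println((unfused, fused))
```

muladd is allowed to return either of these two values, which is the point of the objection above: a nonzero result shows only that this particular call was fused, not that the hardware has a native FMA instruction. fma, by contrast, always returns the fused value, falling back to software if necessary.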


mbauman · 2019-08-21T15:10:08Z

More details and a breadcrumb for exposing LLVM support here: #9855.

yuyichao · 2019-08-21T15:19:36Z

We don't really need LLVM support since we have all the information in src/processor* now.

It's hard to expose a user-facing API that is generic, flexible, and stable, but for now exposing a temporary internal API should do no worse than what we have now. (Also note that the check cannot be fixed as is, since at most it reflects the status of the compiling machine.)

We could simply add a function jl_is_fma_native to processor_*.cpp, implemented as jl_test_cpu_feature(JL_X86_fma) || jl_test_cpu_feature(JL_X86_fma4) for x86, jl_test_cpu_feature(JL_AArch32_vfp4) for arm, 1 for aarch64, and 0 for the fallback. Note that this reports the native feature of the running machine and does not take any target specification into account.

yuyichao · 2019-08-21T15:24:41Z

FWIW, given the target machines people are likely to run Julia on, I feel like setting it to true for x64 should be fine... assuming the user of it doesn't directly emit FMA instructions, of course.

chriselrod · 2020-03-02T13:07:11Z

The Travis Mac doesn't have fma.

Despite compiling locally on a machine where it should be true, it's somehow set to false (although running the definition in the REPL returns true). I guess that's because

LLVM has all the freedom it wants to return either value for muladd, no matter whether FMA is available.

JeffreySarnoff · 2020-10-05T01:49:58Z

What if we changed
const FMA_NATIVE = muladd(nextfloat(1.0),nextfloat(1.0),-nextfloat(1.0,2)) != 0
to
const FMA_NATIVE = fma(nextfloat(1.0),nextfloat(1.0),-nextfloat(1.0,2)) != 0
At worst, it would be correct much more often. (Currently, AFAIK, everyone gets the slower version of log(::Float64).)

oscardssmith · 2020-10-05T02:01:29Z

Wait, that's our check? That's straight up broken. We absolutely should fix that.

JeffreySarnoff · 2020-10-05T07:03:32Z

now with PR #37886

simonbyrne · 2021-11-18T00:58:21Z

To follow up on @yuyichao's comment, we can now do:

import Base.BinaryPlatforms.CPUID

function has_fma()
    CPUID.test_cpu_feature(CPUID.JL_X86_fma) ||
    CPUID.test_cpu_feature(CPUID.JL_X86_fma4) ||
    CPUID.test_cpu_feature(CPUID.JL_AArch32_vfp4) ||
    CPUID.normalize_arch(String(Sys.ARCH)) == "aarch64"
end

This has two problems:

It doesn't match the CPU target used by Julia, i.e. starting julia -C "nehalem" still gives has_fma() == true on my (Skylake) machine.
It won't be constant propagated, so would need to be used with @static, i.e.:

@static if has_fma()
   ...
end

In other words, we wouldn't be able to use this for the standard library, as it would simply be reflecting the architecture of the buildbot.
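A minimal sketch of why the @static route bakes in the build machine's answer: the condition is evaluated once, when the macro is expanded, and only the chosen branch is ever compiled. The names pretend_has_fma and kernel are hypothetical stand-ins introduced for illustration; a real check would query the CPU as has_fma() does above.

```julia
# Hypothetical stand-in for has_fma(); hard-coded so the sketch runs anywhere.
pretend_has_fma() = true

# @static evaluates the condition at macro-expansion time. For code that is
# precompiled or built into the system image, that happens on the build machine.
@static if pretend_has_fma()
    kernel(x, y, z) = fma(x, y, z)   # branch kept
else
    kernel(x, y, z) = x * y + z      # branch discarded before compilation
end

println(kernel(2.0, 3.0, 1.0))
```

Because the discarded branch never reaches the compiler, changing the CPU target afterwards (for example with julia -C) cannot bring it back, which is the limitation described above.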

simonbyrne · 2021-11-18T01:18:58Z

Is there a way we can query the cpu_target from within Julia?

oscardssmith · 2021-11-18T01:36:14Z

Theoretically #43085 should take care of this for us.

KristofferC added the performance (Must go faster) label Aug 21, 2019

JeffBezanson added the compiler:codegen (Generation of LLVM IR and native code) and domain:maths (Mathematical functions) labels Nov 5, 2019

KristofferC mentioned this issue Mar 2, 2020: Fix FMA_NATIVE constant #32318 (Merged)

chriselrod mentioned this issue Mar 8, 2020: A faster log function #8869 (Closed)

KristofferC mentioned this issue Mar 9, 2020: Odd behavior with native FMA instructions when first launching Julia #35055 (Closed)

This was referenced Jun 19, 2020: hypot performance questionable #36353 (Closed); A CPUID library in Base or stdlib #36367 (Closed)

JeffBezanson mentioned this issue Sep 8, 2020: Faster sinh and cosh for Float32, Float64 #37426 (Merged)

JeffreySarnoff mentioned this issue Oct 5, 2020: improve FMA_NATIVE definition #37886 (Closed)

KristofferC mentioned this issue Aug 28, 2021: ^(::Float, ::Integer) #42031 (Merged)

KristofferC mentioned this issue Sep 8, 2021: Correctly rounded variant of the hypot code #32345 (Merged)

simonbyrne mentioned this issue Nov 15, 2021: Constant prop gives a different result in the presence of FMA #41450 (Closed)

simonbyrne mentioned this issue Nov 18, 2021: indicator for fast FMA #9855 (Closed)

simonbyrne mentioned this issue Nov 18, 2021: FMA multiversioning. #43085 (Merged)

vtjnash closed this as completed Feb 26, 2022
