Support for Renoir Zen 2 CPUs #36826

jebej · 2020-07-27T21:44:36Z

On a laptop with a Ryzen 7 4700U CPU running Windows 10, Julia detects the CPU as znver1 and OpenBLAS reports Prescott (the same has been reported on Discourse under WSL2):

julia> versioninfo()
Julia Version 1.5.0-rc2.0
Commit 7f0ee122d7 (2020-07-27 15:24 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: AMD Ryzen 7 4700U with Radeon Graphics
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, znver1)

julia> LinearAlgebra.versioninfo()
BLAS: libopenblas (OpenBLAS 0.3.9  USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott MAX_THREADS=32)
LAPACK: libopenblas64_

As far as I can tell, LLVM 9 should support znver2, and OpenBLAS should as well.


yuyichao · 2020-07-27T22:13:31Z

#36502

jebej · 2020-07-27T22:20:38Z

This patch seems to have been included in rc2, which was the version I tried...

yuyichao · 2020-07-27T22:24:56Z

No it's not. Only the bug fix part of it was.

jebej · 2020-07-27T22:26:44Z

I see, so support will only come in 1.6 then?

yuyichao · 2020-07-27T22:46:12Z

It should be in 1.6.

jebej · 2020-07-27T22:46:49Z

I tried a nightly and got this:

julia> versioninfo()
Julia Version 1.6.0-DEV.548
Commit 9267bbf1fc (2020-07-27 16:57 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: AMD Ryzen 7 4700U with Radeon Graphics
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, btver1)

julia> LinearAlgebra.versioninfo()
BLAS: libopenblas (OpenBLAS 0.3.9  USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott MAX_THREADS=32)
LAPACK: libopenblas64_

yuyichao · 2020-07-27T23:17:15Z

AFAICT our detection code is identical to LLVM's. Previously you saw a "better" result because the detection code misidentified the CPU as Zen 1 and bypassed LLVM's detection. If you can find out your CPUID, it'll be pretty easy to add it to the list of znver2 ones.

Also, for the OpenBLAS issue, you should report it to https://github.com/xianyi/OpenBLAS instead. AFAICT we are using a version with AMD CPU detection identical to the latest master there.

jebej · 2020-07-28T13:14:11Z

CPUID:

CPUID signature: 860F01
Family: 23 (017h)
Model: 96 (060h)
Stepping: 1 (01h)

Full info uploaded to cpu-world: https://www.cpu-world.com/cgi-bin/CPUID.pl?CPUID=71632
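For readers who want to check how the signature above maps to the family, model and stepping values, here is a minimal sketch in Python. It assumes the usual AMD convention for CPUID leaf 1 EAX, where the extended family and extended model fields are only combined in when the base family is 0xF. The function name `decode_cpuid_signature` is made up for illustration and is not part of Julia or LLVM.

```python
def decode_cpuid_signature(eax: int) -> dict:
    """Split a CPUID leaf-1 EAX signature into family, model and stepping.

    Assumes the AMD convention: extended fields apply only when the
    base family is 0xF.
    """
    stepping = eax & 0xF              # bits 3:0
    base_model = (eax >> 4) & 0xF     # bits 7:4
    base_family = (eax >> 8) & 0xF    # bits 11:8
    ext_model = (eax >> 16) & 0xF     # bits 19:16
    ext_family = (eax >> 20) & 0xFF   # bits 27:20

    if base_family == 0xF:
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family = base_family
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


# The signature reported in this thread.
print(decode_cpuid_signature(0x860F01))
# prints {'family': 23, 'model': 96, 'stepping': 1}
```

The result matches the values listed above: family 23 (0x17), model 96 (0x60), stepping 1.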

jebej · 2020-07-28T13:28:14Z

Regarding OpenBLAS, that might not necessarily be their fault. On Discourse, the OP installed the package from AUR and the architecture got detected properly:

Update: I reinstalled and reconfigured the libopenblas from AUR, now the system Julia has:

julia> LinearAlgebra.versioninfo()
BLAS: libopenblas (OpenBLAS 0.3.10 NO_AFFINITY USE_OPENMP ZEN MAX_THREADS=12)
LAPACK: liblapack

EDIT: never mind that, see OpenMathLib/OpenBLAS#2738

* Missing feature from Apple A13
* Enable Cortex-A78 and Cortex-X1 on LLVM 11
  llvm/llvm-project@954db63
  https://reviews.llvm.org/D83206
* More relaxed Zen detection: treat all family 23 as Zen* and treat all model >= 0x30 as Zen2. GCC uses a similar fallback structure albeit based on feature. This should still generate **correct** code since that is always controlled by available features. It should be as good a scheduling model estimate as anything else.

Fix #36826
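The relaxed Zen fallback described in the commit message above can be sketched roughly as follows. This is only an illustration of the stated rule (family 23 is some Zen, model >= 0x30 is treated as Zen 2), not the actual detection code in Julia or LLVM, which presumably checks specific known models before falling back. The function name `guess_zen_target` and the constant are hypothetical; the target names `znver1` and `znver2` are the ones shown by versioninfo() in this thread.

```python
ZEN_FAMILY = 23  # 0x17, the AMD family covering Zen 1 and Zen 2

def guess_zen_target(family: int, model: int):
    """Fallback guess of the LLVM target name for an AMD family 23 CPU.

    Returns None for any other family, since the rule in the commit
    message only covers family 23.
    """
    if family != ZEN_FAMILY:
        return None
    # Per the commit message: all models >= 0x30 are treated as Zen 2.
    return "znver2" if model >= 0x30 else "znver1"


# The Renoir CPU from this thread: family 23, model 0x60.
print(guess_zen_target(23, 0x60))  # prints znver2
```

As the commit message notes, a wrong guess here would only affect the scheduling model estimate, since code correctness is controlled by the detected features.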

jebej · 2020-07-30T19:56:49Z

Thanks! Do you think the fix could be backported to 1.5?

yuyichao · 2020-07-30T20:16:34Z

#36831 is for master only, so it won't be backported unless it is explicitly marked for that. And you aren't missing out on much on 1.5. It's identified as Zen 1, which affects the scheduling model a little bit, but none of the feature detection is affected.

* Missing feature from Apple A13
* Enable Cortex-A78 and Cortex-X1 on LLVM 11
  llvm/llvm-project@954db63
  https://reviews.llvm.org/D83206
* More relaxed Zen detection: treat all family 23 as Zen* and treat all model >= 0x30 as Zen2. GCC uses a similar fallback structure albeit based on feature. This should still generate **correct** code since that is always controlled by available features. It should be as good a scheduling model estimate as anything else.

Fix #36826

(cherry picked from commit cd3fb4d)

stillyslalom · 2020-08-04T19:10:01Z

I have a Renoir processor that's correctly detected as znver2 by versioninfo(), but it's still described as Prescott in LinearAlgebra.versioninfo() on latest nightly, which leads to terrible BLAS performance:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.0-DEV.580 (2020-08-04)
 _/ |\__'_|_|_|\__'_|  |  Commit 8a6656016b (0 days old master)
|__/

julia> versioninfo()
Julia Version 1.6.0-DEV.580
Commit 8a6656016b (2020-08-04 16:02 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: AMD Ryzen 9 4900HS with Radeon Graphics
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, znver2)
Environment:
  JULIA_NUM_THREADS = 8

julia> using LinearAlgebra

julia> LinearAlgebra.versioninfo()
BLAS: libopenblas (OpenBLAS 0.3.9  USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott MAX_THREADS=32)

julia> BLAS.set_num_threads(1)

julia> peakflops()/1e9
18.47128632768085

yuyichao · 2020-08-04T19:14:38Z

Yes, that's an OpenBLAS issue. Ref OpenMathLib/OpenBLAS#2738. We need to carry a patch and/or bump the OpenBLAS version.

Also, OpenBLAS is in general way too conservative on the dispatch. It appears to never dispatch based on features and only looks for an exact uarch match without a gentle fallback...

jebej · 2020-08-04T19:17:45Z

In the meantime, you can set an environment variable as explained on Discourse: OPENBLAS_CORETYPE=ZEN.
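One way to apply the workaround above without changing the system-wide environment is to launch Julia from a small wrapper that sets the variable for that process only. The sketch below assumes that `julia` is on the PATH and that the variable has to be in place before Julia starts, so that OpenBLAS sees it when it is loaded. The helper names are made up for illustration.

```python
import os
import subprocess

def env_with_openblas_coretype(coretype="ZEN", base_env=None):
    """Return a copy of the environment with OPENBLAS_CORETYPE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENBLAS_CORETYPE"] = coretype
    return env

def run_julia_blas_info(julia="julia"):
    """Start Julia with the override and return its BLAS version report.

    Assumes a `julia` executable is available; not called on import.
    """
    cmd = [julia, "-e", "using LinearAlgebra; LinearAlgebra.versioninfo()"]
    return subprocess.run(
        cmd,
        env=env_with_openblas_coretype(),
        capture_output=True,
        text=True,
    )

if __name__ == "__main__":
    result = run_julia_blas_info()
    print(result.stdout or result.stderr)
```

If the override takes effect, the BLAS line should report ZEN instead of Prescott.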

stillyslalom · 2020-08-04T19:29:53Z

Can confirm, I'm seeing (roughly) expected speeds now:

julia> BLAS.set_num_threads(1)

julia> peakflops(4000)/1e9
61.15865161036225

julia> BLAS.set_num_threads(8)

julia> peakflops(4000)/1e9
225.9833945035955

julia> BLAS.set_num_threads(16)

julia> peakflops(4000)/1e9
239.07667095448187


yuyichao closed this as completed Jul 27, 2020

yuyichao reopened this Jul 27, 2020

yuyichao mentioned this issue Jul 28, 2020

A few processor detection/features tweaks #36831

Merged

yuyichao closed this as completed in #36831 Jul 30, 2020
