Starch config: Add aarch64 #108

gtjoseph · 2021-02-08T00:10:12Z

Added aarch64 to dsp/starchgen.py and Makefile.
Regenerated files

* Added aarch64 to dsp/starchgen.py and Makefile. * Regenerated files

dsp/starchgen.py

gtjoseph · 2021-02-08T05:25:57Z

When run in 64 bit mode, all you get for features is fp asimd evtstrm crc32 cpuid as opposed to the half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 you get in 32 bit mode. There's no neon or vfpv4 and no -mfpu option to gcc.

The arches available are armv8-a armv8.1-a armv8.2-a armv8.3-a and the default is the lowest compatibility (armv8-a) so I think the -march isn't needed.

mutability · 2021-02-08T05:43:12Z

Well, if there's nothing special needed to compile for aarch64 and it doesn't enable any features, then we don't need a special flavor for it, generic will be fine. (Maybe we should move -ffast-math to the generic flavor, it should be fine in all cases, we don't care about the non-exactness it can cause)

However I don't think that's true - aarch64 / armv8 support implies neon support, so we should have a flavor that enables the neon intrinsics in that case. (I have not tested the neon stuff under aarch64 at all - it may need some tweaking - but AFAIK the A64 neon instruction set is a superset of the A32 neon instruction set so I doubt any intrinsics will be missing)

mutability · 2021-02-08T11:41:33Z

The neon stuff at least compiles OK (runtime not tested) with -march=armv8-a+simd -mfpu=neon-fp-armv8

mutability · 2021-02-08T11:48:59Z

So I think the way to go here is to have an armv8_neon flavor with:

features = ['neon']
compile_flags = ['-march=armv8-a+simd', '-mfpu=neon-fp-armv8', '-ffast-math']
test_function = 'cpu_supports_armv8_simd'

where cpu_supports_armv8_simd should test for Aarch64Info.features.asimd

and an aarch64 mix that includes generic + armv8_neon

gtjoseph · 2021-02-08T13:24:10Z

Works for me. Coming up.

gtjoseph · 2021-02-08T13:48:50Z

Well, almost. As I said, there are no -mfpu options for aarch64 so it'll just have to be -march=armv8-a+simd. I'm also going to try a flavor with sve2.

gtjoseph · 2021-02-08T14:43:11Z

I think we've got a catch-22 situation here. Let's say you use -march=armv8-a+sve2 in starchgen and you add a test in cpu.c to test for aarch64_info()->features.sve2, on a processor that doesn't support it, the compile is going to fail because -march=armv8-a+sve2 was included in the compile command line. Doesn't the test for flags have to happen in a module that is compiled without the flags?

mutability · 2021-02-08T15:29:29Z

-mfpu works just fine on my (Raspbian buster) gcc here (and in fact it is necessary when using neon intrinsics)

pi@piaware:~ $ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/8/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Raspbian 8.3.0-6+rpi1' --with-bugurl=file:https:///usr/share/doc/gcc-8/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-8 --program-prefix=arm-linux-gnueabihf- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --disable-libquadmath-support --enable-plugin --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-sjlj-exceptions --with-arch=armv6 --with-fpu=vfp --with-float=hard --disable-werror --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 8.3.0 (Raspbian 8.3.0-6+rpi1) 
pi@piaware:~ $ echo '#include <arm_neon.h>' >test.c
pi@piaware:~ $ gcc -march=armv8-a+simd -mfpu=neon-fp-armv8 -c test.c
pi@piaware:~ $

mutability · 2021-02-08T15:30:23Z

I think we've got a catch-22 situation here.

There's no catch-22. Only the stuff in dsp/generated is compiled with special flags. cpu.c and the bulk of dump1090 itself are compiled without any -march etc.

The assumption is that for a given mix, the compiler is capable of generating all code for the flavors that make up the mix, even if the current CPU can't execute that code. (which is fine, given that gcc's code generation isn't affected by the choice of host machine, only by choice of the target)

mutability · 2021-02-08T15:38:04Z

Maybe this is an aarch64-native vs armv8-on-32-bit target issue. I do notice that gcc seems to be producing 32-bit object files even in armv8 mode. You may need to experiment yourself to find the correct set of compiler flags to get neon intrinsics etc working; I don't have an aarch64 system on hand to try it on.

gtjoseph · 2021-02-08T16:03:49Z

gcc running natively on aarch64 does NOT support -mfpu.

gcc10 supports -march=armv8+sve2 but gcc8 only supports -march=armv8+sve. I added a flavor for armv8_sve2 with a -march=armv8-a+sve2 option along with the armv8_neon_simd flavor. When I compile on my rpi4 with native aarch64 and gcc8, I get...

cc -I. -DMODES_DUMP1090_VERSION=\"unknown\" -DMODES_DUMP1090_VARIANT=\"dump1090-fa\" -D_DEFAULT_SOURCE -DENABLE_CPUFEATURES -Icpu_features/include -DENABLE_RTLSDR -DSTARCH_MIX_AARCH64 -std=c11 -O3 -g -Wall -Wmissing-declarations -Werror -W -D_DEFAULT_SOURCE -fno-common -I/ -c -DSTARCH_MIX_AARCH64 -march=armv8-a+simd -ffast-math dsp/generated/flavor.armv8_neon_simd.c -o dsp/generated/flavor.armv8_neon_simd.o
cc -I. -DMODES_DUMP1090_VERSION=\"unknown\" -DMODES_DUMP1090_VARIANT=\"dump1090-fa\" -D_DEFAULT_SOURCE -DENABLE_CPUFEATURES -Icpu_features/include -DENABLE_RTLSDR -DSTARCH_MIX_AARCH64 -std=c11 -O3 -g -Wall -Wmissing-declarations -Werror -W -D_DEFAULT_SOURCE -fno-common -I/ -c -DSTARCH_MIX_AARCH64 -march=armv8-a+sve2 -ffast-math dsp/generated/flavor.armv8_sve2.c -o dsp/generated/flavor.armv8_sve2.o
cc1: error: invalid feature modifier in ‘-march=armv8-a+sve2’
make: *** [dsp/generated/makefile.aarch64:28: dsp/generated/flavor.armv8_sve2.o] Error 1

It's attempting to compile all the flavors before it even knows what flavors are valid.

mutability · 2021-02-08T16:56:02Z

Well, yeah, that's how it works. If you tell starch to build a given mix, it'll build the flavors associated with the mix, that's how it's designed to work. If you need to build different combinations of flavors depending on the compiler in use, those would need to be separate mixes. See what I said above about the assumption that the compiler can build all flavors in the mix you request.

I deliberately did not put any sort of compiler/architecture detection into starch because it is a real can of worms; those decisions need to be made in the surrounding makefiles when selecting a mix to use.

The flavor is now armv8_neon_simd

gtjoseph · 2021-02-08T17:12:06Z

Gotcha. Updated to use simd anyway.

gtjoseph · 2021-02-08T17:15:36Z

Just FYI... It's almost impossible to rebase when there are starch changes because of conflicts in the generated files.

mutability

Looks good, just need a wisdom.aarch64.

There's a lot of noise changes in the generated code that are just changes in iteration ordering, I'll take a look at making that more deterministic.

mutability · 2021-02-09T05:30:39Z

dsp/starchgen.py

+gen.add_mix(name = 'aarch64',
+ description = 'AARCH64',
+ flavors = ['armv8_neon_simd', 'generic'],
+ wisdom_file = 'wisdom.aarch64')
+


Do you have a suitable wisdom.aarch64 to add?

mutability · 2021-02-09T06:07:16Z

Just FYI... It's almost impossible to rebase when there are starch changes because of conflicts in the generated files.

Maybe the the thing to do here is don't include the starch generated code changed in your PR; I can regenerate after merging.

mutability · 2021-02-09T06:09:24Z

I went ahead and merged this since I have some starch changes about to land that would have caused a bunch of conflicts

gtjoseph · 2021-02-09T12:50:55Z

I was going to ask you about how you wanted to handle wisdom generation. I did add new wisdom.aarch64.pi4b and wisdom.aarch64.tegra to the wisdom directory but wasn't sure if you wanted me to put the pi4b one in the top level directory.

mutability · 2021-02-09T12:57:11Z

The files in the wisdom subdir are just for reference, they're not directly used.

wisdom.arm etc in the top level dir are read by starchgen to provide the default ordering of functions in the generated code, if no custom wisdom is provided at runtime. So we want a good default there. For arm (32-bit) I essentially took the top pi4 armv7-specific result and the top pi0w generic result and used those.

I put together a wisdom.aarch64 for now based on the Pi 4 aarch64 results, with the Pi 0W results as a fallback if for some reason the armv8/neon detection says that it isn't supported (I don't think this is possible with the armv8-a profile, but..). If you want you can run a generic-flavors-only benchmark (-F generic) on aarch64 to generate something more representative for the fallback case.

Starch config: Add aarch64

1b0bcef

* Added aarch64 to dsp/starchgen.py and Makefile. * Regenerated files

mutability reviewed Feb 8, 2021

View reviewed changes

dsp/starchgen.py Outdated Show resolved Hide resolved

Starch: Changed aarch64 flavor to use simd

2308c1d

The flavor is now armv8_neon_simd

mutability requested changes Feb 9, 2021

View reviewed changes

mutability merged commit 5600d3f into flightaware:dev Feb 9, 2021

gtjoseph deleted the dev-fa-starch-aarch64 branch February 9, 2021 12:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Starch config: Add aarch64 #108

Starch config: Add aarch64 #108

gtjoseph commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

mutability commented Feb 8, 2021

mutability commented Feb 8, 2021

mutability commented Feb 8, 2021 •

edited

Loading

gtjoseph commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

mutability commented Feb 8, 2021

mutability commented Feb 8, 2021 •

edited

Loading

mutability commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

mutability commented Feb 8, 2021 •

edited

Loading

gtjoseph commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

mutability left a comment

mutability Feb 9, 2021

mutability commented Feb 9, 2021

mutability commented Feb 9, 2021

gtjoseph commented Feb 9, 2021

mutability commented Feb 9, 2021

Starch config: Add aarch64 #108

Starch config: Add aarch64 #108

Conversation

gtjoseph commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

mutability commented Feb 8, 2021

mutability commented Feb 8, 2021

mutability commented Feb 8, 2021 • edited Loading

gtjoseph commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

mutability commented Feb 8, 2021

mutability commented Feb 8, 2021 • edited Loading

mutability commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

mutability commented Feb 8, 2021 • edited Loading

gtjoseph commented Feb 8, 2021

gtjoseph commented Feb 8, 2021

mutability left a comment

Choose a reason for hiding this comment

mutability Feb 9, 2021

Choose a reason for hiding this comment

mutability commented Feb 9, 2021

mutability commented Feb 9, 2021

gtjoseph commented Feb 9, 2021

mutability commented Feb 9, 2021

mutability commented Feb 8, 2021 •

edited

Loading

mutability commented Feb 8, 2021 •

edited

Loading

mutability commented Feb 8, 2021 •

edited

Loading