Starch config: Add aarch64 #108
gtjoseph commented Feb 8, 2021
- Added aarch64 to dsp/starchgen.py and Makefile.
- Regenerated files
When run in 64-bit mode, all you get for features is
The arches available are armv8-a, armv8.1-a, armv8.2-a, and armv8.3-a, and the default is the lowest compatibility (armv8-a), so I think the -march isn't needed.
Well, if there's nothing special needed to compile for aarch64 and it doesn't enable any features, then we don't need a special flavor for it. However, I don't think that's true: aarch64 / armv8 support implies neon support, so we should have a flavor that enables the neon intrinsics in that case. (I have not tested the neon stuff under aarch64 at all. It may need some tweaking, but AFAIK the A64 neon instruction set is a superset of the A32 neon instruction set, so I doubt any intrinsics will be missing.)
The neon stuff at least compiles OK (runtime not tested) with
So I think the way to go here is to have an
where and an aarch64 mix that includes generic + armv8_neon
Works for me. Coming up.
Well, almost. As I said, there are no -mfpu options for aarch64, so it'll just have to be -march=armv8-a+simd. I'm also going to try a flavor with sve2.
I think we've got a catch-22 situation here. Let's say you use
There's no catch-22. Only the stuff in
The assumption is that for a given mix, the compiler is capable of generating all code for the flavors that make up the mix, even if the current CPU can't execute that code (which is fine, given that gcc's code generation isn't affected by the choice of host machine, only by the choice of target).
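The build-time versus run-time split described above can be sketched with a toy model. This is not starch's actual API: the `Flavor` and `Mix` classes, the `cpu_supports` callback, and the `impl.c` file name are all hypothetical, and the only details taken from this thread are the flavor names and the `-march=armv8-a+simd` flag.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Flavor:
    name: str
    compile_flags: List[str]
    # Hypothetical run-time check, evaluated on the machine that runs the binary.
    cpu_supports: Callable[[], bool]


@dataclass
class Mix:
    name: str
    flavors: List[Flavor]


def build_commands(mix: Mix, source: str) -> List[List[str]]:
    """Build time: every flavor in the mix is compiled, with no CPU check."""
    return [
        ["gcc", *flavor.compile_flags, "-c", source, "-o", f"{flavor.name}.o"]
        for flavor in mix.flavors
    ]


def runnable_flavors(mix: Mix) -> List[str]:
    """Run time: only flavors whose CPU check passes are candidates."""
    return [flavor.name for flavor in mix.flavors if flavor.cpu_supports()]


# Simulate a host whose CPU check for the NEON flavor fails.
aarch64 = Mix("aarch64", [
    Flavor("armv8_neon_simd", ["-march=armv8-a+simd"], lambda: False),
    Flavor("generic", [], lambda: True),
])
```

In this sketch both flavors still get a compile command, but only `generic` is eligible to run on the simulated host, which is why the compiler must be able to build every flavor in the requested mix.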
Maybe this is an aarch64-native vs armv8-on-32-bit target issue. I do notice that gcc seems to be producing 32-bit object files even in armv8 mode. You may need to experiment yourself to find the correct set of compiler flags to get neon intrinsics etc working; I don't have an aarch64 system on hand to try it on.
gcc running natively on aarch64 does NOT support -mfpu. gcc10 supports -march=armv8+sve2 but gcc8 only supports -march=armv8+sve. I added a flavor for armv8_sve2 with a
It's attempting to compile all the flavors before it even knows what flavors are valid.
Well, yeah, that's how it works. If you tell starch to build a given mix, it'll build the flavors associated with the mix; that's how it's designed to work. If you need to build different combinations of flavors depending on the compiler in use, those would need to be separate mixes. See what I said above about the assumption that the compiler can build all flavors in the mix you request. I deliberately did not put any sort of compiler/architecture detection into starch because it is a real can of worms; those decisions need to be made in the surrounding makefiles when selecting a mix to use.
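One way the surrounding build logic could pick a mix is to probe the compiler before invoking starch. The following is only a sketch of that idea, written in Python rather than make: the helper names, the `aarch64_sve2` mix name, and the exact spelling of the SVE2 flag are assumptions, not something this PR defines.

```python
import os
import subprocess
from typing import Callable, List, Sequence, Tuple


def compiler_accepts(cc: str, flags: Sequence[str]) -> bool:
    """Return True if `cc` compiles a trivial C program with `flags`."""
    try:
        result = subprocess.run(
            [cc, *flags, "-x", "c", "-c", "-o", os.devnull, "-"],
            input=b"int main(void) { return 0; }\n",
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # Compiler not installed or not executable.
        return False
    return result.returncode == 0


def choose_mix(candidates: List[Tuple[str, List[str]]],
               accepts: Callable[[Sequence[str]], bool],
               fallback: str = "generic") -> str:
    """Return the first mix whose flags the compiler accepts."""
    for mix_name, flags in candidates:
        if accepts(flags):
            return mix_name
    return fallback


# Most capable mix first. 'aarch64_sve2' is a hypothetical mix name.
CANDIDATES = [
    ("aarch64_sve2", ["-march=armv8-a+sve2"]),  # assumed flag spelling
    ("aarch64", ["-march=armv8-a+simd"]),
]
```

A makefile could call something like `choose_mix(CANDIDATES, lambda flags: compiler_accepts("gcc", flags))` and pass the result to the build, keeping the detection outside starch as described above.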
The flavor is now armv8_neon_simd
Gotcha. Updated to use simd anyway.
Just FYI... It's almost impossible to rebase when there are starch changes because of conflicts in the generated files.
Looks good, just need a wisdom.aarch64.
There are a lot of noise changes in the generated code that are just changes in iteration ordering; I'll take a look at making that more deterministic.
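For illustration, one common way to make generated output independent of iteration order is to sort keys before emitting. This is a generic sketch, not the generator's actual code; `emit_registry` and the function names in the example data are hypothetical.

```python
from typing import Dict, Set


def emit_registry(functions: Dict[str, Set[str]]) -> str:
    """Emit one line per (function, flavor) pair in a stable, sorted order."""
    lines = []
    for func_name in sorted(functions):
        for flavor in sorted(functions[func_name]):
            lines.append(f"{func_name}_{flavor}")
    return "\n".join(lines)


# The same data inserted in two different orders.
first = {"magnitude": {"generic", "armv8_neon_simd"}, "count": {"generic"}}
second = {"count": {"generic"}, "magnitude": {"armv8_neon_simd", "generic"}}
```

Because both the outer and inner loops iterate over sorted names, `emit_registry(first)` and `emit_registry(second)` produce identical text, so regenerating would not create spurious diffs.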
gen.add_mix(name = 'aarch64',
            description = 'AARCH64',
            flavors = ['armv8_neon_simd', 'generic'],
            wisdom_file = 'wisdom.aarch64')
Do you have a suitable wisdom.aarch64 to add?
Maybe the thing to do here is not to include the starch-generated code changes in your PR; I can regenerate after merging.
I went ahead and merged this since I have some starch changes about to land that would have caused a bunch of conflicts.
I was going to ask you about how you wanted to handle wisdom generation. I did add new wisdom.aarch64.pi4b and wisdom.aarch64.tegra files to the wisdom directory but wasn't sure if you wanted me to put the pi4b one in the top-level directory.
The files in the wisdom subdir are just for reference; they're not directly used.
I put together a