add Linux/ARM64 cross-compilation support #425

Open
wants to merge 2 commits into
base: main

dicej
Contributor

@dicej dicej commented Jun 8, 2024

Currently, the Makefile assumes the LLVM toolchain it builds can be executed natively to build wasi-libc etc. That isn't true when cross-compiling for another platform, but we can work around it by:

  1. Building the native LLVM toolchain and using it to build everything else, as usual
  2. Deleting that LLVM build and rebuilding (and reinstalling) it with LLVM_CMAKE_FLAGS set to cross compile
  3. Rebuilding and reinstalling a cross-compiled wasm-component-ld
  4. Building deb and tar files from the above
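
The ordering of the four steps above can be sketched as a dry-run plan. This is only an illustration under stated assumptions: the make target names, the `build/llvm` path, the CMake flag values, and the `cargo install` invocation are hypothetical and are not taken from this PR's workflow files. Only `LLVM_CMAKE_FLAGS` and `wasm-component-ld` come from the description.

```python
# Dry-run sketch of the cross-build workaround. Nothing is executed; the plan
# is only printed. Make targets, paths, and flag values are hypothetical.
CROSS_FLAGS = "-DCMAKE_SYSTEM_NAME=Linux -DCMAKE_SYSTEM_PROCESSOR=aarch64"


def cross_build_plan(cross_flags=CROSS_FLAGS):
    """Return (description, command, extra_env) tuples in execution order."""
    return [
        # Step 1: the native toolchain builds wasi-libc etc., as usual.
        ("build native LLVM and everything else", ["make", "build"], {}),
        # Step 2: discard the native LLVM, then rebuild it for the target.
        ("delete the native LLVM build", ["rm", "-rf", "build/llvm"], {}),
        (
            "rebuild and reinstall LLVM for the target",
            ["make", "build/llvm.BUILT"],
            {"LLVM_CMAKE_FLAGS": cross_flags},
        ),
        # Step 3: the linker wrapper must also run on the target machine.
        (
            "rebuild and reinstall wasm-component-ld for the target",
            ["cargo", "install", "wasm-component-ld",
             "--target", "aarch64-unknown-linux-gnu"],
            {},
        ),
        # Step 4: package what is now installed.
        ("build deb and tar files", ["make", "package"], {}),
    ]


if __name__ == "__main__":
    for description, command, env in cross_build_plan():
        prefix = " ".join(f"{key}='{value}'" for key, value in env.items())
        print(f"# {description}")
        print(" ".join(filter(None, [prefix, *command])))
```

The point of the sketch is the ordering: the sysroot has to be built before the native LLVM is deleted, because the cross-compiled LLVM cannot run on the build machine.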

Note that we now label the tarfiles linux-amd64 and linux-arm64, respectively, for clarity.

The whole approach is a bit hacky, but GitHub is planning to roll out ARM64 runner support for open source projects later this year, at which point we can start building natively, so I don't think we need to invest a lot of effort into this.

I've run CI in my fork and verified the artifact produced there works on my Ubuntu 24.04 ARM64 machine (Asahi Linux on an Apple M2 Pro).

Note that I have not yet updated the dockerbuild CI step, so the dist-ubuntu-bionic artifacts do not yet include an ARM64 build; I'll experiment with that next. Update: now we only do a cross build as part of dockerbuild, since dist-ubuntu-bionic is the one we actually publish in releases.

Fixes #236
Fixes #347

@dicej
Contributor Author

dicej commented Jun 8, 2024

Oops, the new step shouldn't be run on macOS, obviously. I'll fix that.

@alexcrichton
Collaborator

Personally I've always felt that the build logic in this repository is, even prior to this PR, at a bit of a breaking point. I'm worried that adding this logic will push it over the edge to the point that it's significantly more complicated to continue to edit the build and expand it over time. One pain point with this PR is that the cross builds are interacting with the Makefile in subtle ways:

  • The logic is wrapped up in *.yml files, which makes it hard to reproduce locally.
  • This subtly relies on the strip target depending only on LLVM and nothing else (otherwise the rules would currently invoke the wrong-architecture compiler to compile the sysroot).
  • This manually compiles wasm-component-ld, but outside of the Makefile, so, for example, the required version is duplicated.

Now I suspect that none of this is necessarily news to you (or probably others either). I don't want to throw up a blocker for this PR, but at the same time I do want to bring all this up and perhaps see if we could brainstorm ways to make this easier in the future. For example, if we could build everything from scratch, in my mind a more ideal workflow might look like:

  • Don't use a Makefile since it's mostly just used like a shell script here. Instead have something like build.py (or maybe cmake? unsure) or similar which enables doing all of this in a more readable programming language. It'd still call out to cmake/make for sub-projects though.
  • Split the build into host artifacts and the wasi-sysroot.
  • Each host we're producing builds for (e.g. windows/mac/linux-x64/linux-arm64) would have its own builder. This builder would produce just the host tools, e.g. more or less what this PR added to the *.yml files.
  • One target builder, say linux-x64, would wait for the linux-x64 tools and then use that to build the entire sysroot.
  • One final builder would take all these components and weave them together, avoiding building the sysroot on multiple platforms and producing the same tarballs produced today.
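
The builder layout proposed in the list above amounts to a small dependency graph, which can be sketched as follows. The job names (`tools-*`, `sysroot`, `assemble`) are hypothetical labels chosen for illustration; the host list is the one given in the comment. This uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hosts named in the comment above; job names below are made up for the sketch.
HOSTS = ["windows", "mac", "linux-x64", "linux-arm64"]


def build_graph(hosts=HOSTS, sysroot_host="linux-x64"):
    """Map each job name to the set of jobs it must wait for."""
    tool_jobs = [f"tools-{host}" for host in hosts]
    # Host-tool builders have no prerequisites and can run in parallel.
    graph = {job: set() for job in tool_jobs}
    # The sysroot is built once, using the tools from a single host.
    graph["sysroot"] = {f"tools-{sysroot_host}"}
    # The final job weaves every host's tools together with the one sysroot.
    graph["assemble"] = set(tool_jobs) | {"sysroot"}
    return graph


def job_order(graph):
    """Return one valid execution order for the jobs."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for job in job_order(build_graph()):
        print(job)
```

In this shape, adding a new host such as linux-arm64 only adds one `tools-*` node; the sysroot is still built exactly once.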

For example I don't think there's much use in having two Linux builds of clang, one in docker and one not. I think that all Linux builds should go through Docker for "defined glibc compatibility" and that could perhaps also be somewhere to wrap up cross-compile logic from x64 to arm64.

I realize though that what I'm talking about here is not necessarily trivial work and I doubt you're looking to really focus on doing any of this as it's just a means to an end. Despite that though I really do feel that the build here in this repository is at a breaking point and I'm fearful of the size of the debt being grown here if this isn't taken as a time to refactor.

@dicej
Contributor Author

dicej commented Jun 10, 2024

Yeah, that all sounds great to me, and I 100% agree that there's a lot of duplicated and wasted effort, including the redundant builds of the sysroot and the ubuntu-latest and 32-bit Windows artifacts which we build but then discard in favor of the ubuntu-bionic and 64-bit Windows ones. Trimming that bloat would be reason enough to refactor this thing, not to mention the maintenance benefits.

OTOH, I don't have the bandwidth right now for a significant refactor. Happy to open an issue to track it, though!

Currently, the Makefile assumes the LLVM toolchain it builds can be executed
natively to build `wasi-libc` etc., which isn't true when cross-compiling for
another platform, but we can work around that by:

1. Building the native LLVM toolchain and using it to build everything else, as usual
2. Deleting that LLVM build and rebuilding (and reinstalling) it with `LLVM_CMAKE_FLAGS` set to cross compile
3. Rebuilding and reinstalling a cross-compiled `wasm-component-ld`
4. Building deb and tar files from the above

Note that we now label the tarfiles `linux-amd64` and `linux-arm64`,
respectively for clarity.

The whole approach is a bit hacky, but GitHub is planning to roll out ARM64
runner support for open source projects later this year, at which point we can
start building natively, so I don't think we need to invest a lot of effort into
this.

I've run CI in my fork and verified the artifact produced there works on my
Ubuntu 24.04 ARM64 machine (Asahi Linux on an Apple M2 Pro).

Fixes WebAssembly#236
Fixes WebAssembly#347

Signed-off-by: Joel Dice <[email protected]>
@dicej
Contributor Author

dicej commented Jun 10, 2024

I just pushed an update; now we only build for Linux/ARM64 as part of the dockerbuild job since dist-ubuntu-bionic is the build we actually care about.

@dicej dicej marked this pull request as ready for review June 10, 2024 21:41
alexcrichton added a commit to alexcrichton/wasi-sdk that referenced this pull request Jun 16, 2024
This commit is an attempt to provide a concrete path forward on
WebAssembly#425. I personally think it's pretty important to
get the ability to have more architectures here but at the same time I
also think it's important to take this as an opportunity to refactor
and improve the build system of this repository. To that end this
represents my attempt to improve the status quo.

This removes the old `Makefile` and replaces it with a CMake-based
system to build all these projects. Overall this is intended to be a "no
functional change" intended sort of refactoring. Changing build systems
inevitably causes issues, however, so this change additionally has a
very high likelihood of needing follow-up fixes. At a high enough level
this commit introduces two major changes to how this repository is
built:

1. The `make`-based system (the root `Makefile`) is replaced with CMake.
   This additionally updates tests to use CMake.
2. A single "build" is split into either building a toolchain or
   building a sysroot. This enables builds to only build one or the
   other as necessary.

The first change, using CMake, is due to the fact that using `make` on
Windows basically is not pleasant coupled with the fact that more
advanced logic, such as changing flags, compilers, etc, is much easier
with a CMake-based system. The second change is intended to cover the
use case of WebAssembly#425 in addition to refactoring the current build.

Throughout this change I have intentionally not tried to keep a 1:1
correspondence with behaviors in the old `Makefile` because much of this
PR is intended to address shortcomings in the old build system. A list
of changes, improvements, etc, made here are:

* CMake provides a much nicer portability story to Windows than `make`.
  This is moving towards the direction of not needing `bash`, for
  example, to build an SDK. Currently `wasi-libc` still requires this,
  but that's now the only "hard" dependency.

* The set of targets built can now be configured for smaller builds
  and/or debugging just a single target. All WASI targets are still
  built by default but it's much easier to add/remove them.

* Different targets are now able to be built in parallel as opposed to
  the unconditional serial-nature of the `Makefile`.

* Use of `ninja` is no longer required and separate build systems can be
  used if desired.

* The sysroot and the toolchain can now be built with different CMake
  build profiles. For example the `Makefile` hardcoded `MinSizeRel` and
  `RelWithDebInfo` and this can now be much more easily customized by
  the SDK builder.

* Tarballs are now more consistently produced and named. For a tarball
  of the name `foo.tar.gz` it's guaranteed that there's a single folder
  `foo` created when unpacking the tarball.

* The macOS binaries are no longer hybrid x64/arm64 binaries which
  greatly inflates the size of the SDK. There's now a separate build for
  each architecture.

* CI now produces arm64-linux binaries. The sysroot is not built on the
  arm64-linux builder and the sysroot from the x86_64-linux builder is
  used instead.

* Windows now executes tests in CI.

* Tests are now integrated into CMake. This means that the wasm binaries
  are able to be built in parallel and the tests are additionally
  executed in parallel with `ctest`. It is possible to build/run a
  single test. Tests no longer place all of their output in the source
  tree.

* Out-of-tree builds are now possible and the build/installation
  directories can both be customized.

* CI configuration of Windows/macOS/Linux is much more uniform by having
  everything in one build matrix instead of separate matrices.

* Linux builds are exclusively done in docker containers in CI now. CI
  no longer produces two Linux builds only for one to be discarded when
  artifacts are published.

* Windows 32-bit builds are no longer produced in CI since it's expected
  that everyone actually wants the 64-bit ones instead.

* Use of `ccache` is now automatically enabled if it's detected on the
  system.

* Many preexisting shell scripts are now translated to CMake one way or
  another.

* There's no longer a separate build script for how to build wasi-sdk in
  docker and outside of docker which needs to be kept in sync,
  everything funnels through the same script.
alexcrichton added a commit to alexcrichton/wasi-sdk that referenced this pull request Jun 16, 2024
This commit is an attempt to provide a concrete path forward on
WebAssembly#425. I personally think it's pretty important to
get the ability to have more architectures here but at the same time I
also think it's important to take this as an opportunity to refactor
and improve the build system of this repository. To that end this
represents my attempt to improve the status quo.

This removes the old `Makefile` and replaces it with a CMake-based
system to build all these projects. Overall this is intended to be a "no
functional change" intended sort of refactoring. Changing build systems
inevitably causes issues, however, so this change additionally has a
very high likelihood of needing follow-up fixes. At a high enough level
this commit introduces two major changes to how this repository is
built:

1. The `make`-based system (the root `Makefile`) is replaced with CMake.
   This additionally updates tests to use CMake.
2. A single "build" is split into either building a toolchain or
   building a sysroot. This enables builds to only build one or the
   other as necessary.

The first change, using CMake, is due to the fact that using `make` on
Windows basically is not pleasant coupled with the fact that more
advanced logic, such as changing flags, compilers, etc, is much easier
with a CMake-based system. The second change is intended to cover the
use case of WebAssembly#425 in addition to refactoring the current build.

Throughout this change I have intentionally not tried to keep a 1:1
correspondence with behaviors in the old `Makefile` because much of this
PR is intended to address shortcomings in the old build system. A list
of changes, improvements, etc, made here are:

* CMake provides a much nicer portability story to Windows than `make`.
  This is moving towards the direction of not needing `bash`, for
  example, to build an SDK. Currently `wasi-libc` still requires this,
  but that's now the only "hard" dependency.

* The set of targets built can now be configured for smaller builds
  and/or debugging just a single target. All WASI targets are still
  built by default but it's much easier to add/remove them.

* Different targets are now able to be built in parallel as opposed to
  the unconditional serial-nature of the `Makefile`.

* Use of `ninja` is no longer required and separate build systems can be
  used if desired.

* The sysroot and the toolchain can now be built with different CMake
  build profiles. For example the `Makefile` hardcoded `MinSizeRel` and
  `RelWithDebInfo` and this can now be much more easily customized by
  the SDK builder.

* Tarballs are now more consistently produced and named. For a tarball
  of the name `foo.tar.gz` it's guaranteed that there's a single folder
  `foo` created when unpacking the tarball.

* The macOS binaries are no longer hybrid x64/arm64 binaries which
  greatly inflates the size of the SDK. There's now a separate build for
  each architecture.

* CI now produces arm64-linux binaries. The sysroot is not built on the
  arm64-linux builder and the sysroot from the x86_64-linux builder is
  used instead.

* Tests are almost ready to execute on Windows, there's just a few minor
  issues related to exit statuses and probably line endings which need
  to be worked out. Will require someone with a Windows checkout, however.

* Tests are now integrated into CMake. This means that the wasm binaries
  are able to be built in parallel and the tests are additionally
  executed in parallel with `ctest`. It is possible to build/run a
  single test. Tests no longer place all of their output in the source
  tree.

* Out-of-tree builds are now possible and the build/installation
  directories can both be customized.

* CI configuration of Windows/macOS/Linux is much more uniform by having
  everything in one build matrix instead of separate matrices.

* Linux builds are exclusively done in docker containers in CI now. CI
  no longer produces two Linux builds only for one to be discarded when
  artifacts are published.

* Windows 32-bit builds are no longer produced in CI since it's expected
  that everyone actually wants the 64-bit ones instead.

* Use of `ccache` is now automatically enabled if it's detected on the
  system.

* Many preexisting shell scripts are now translated to CMake one way or
  another.

* There's no longer a separate build script for how to build wasi-sdk in
  docker and outside of docker which needs to be kept in sync,
  everything funnels through the same script.

* The `docker/Dockerfile` build of wasi-sdk now uses the actual
  toolchain built from CI and additionally doesn't duplicate various
  CMake-based configuration files.

Overall one thing I want to additionally point out is that I'm not a CMake
expert. I suspect there's lots of little stylistic and such improvements
that can be made.
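
The tarball guarantee described in the commit message above (a tarball named `foo.tar.gz` unpacks to a single folder `foo`) is easy to check mechanically. The following is a minimal sketch of such a check, not something taken from the referenced commit; the function name is hypothetical.

```python
import os
import posixpath
import tarfile


def unpacks_to_single_folder(tarball_path):
    """Return True if `foo.tar.gz` has exactly one top-level entry named `foo`.

    Only the first path component of each member is inspected; the archive is
    never extracted.
    """
    name = os.path.basename(tarball_path)
    if not name.endswith(".tar.gz"):
        raise ValueError(f"expected a .tar.gz file, got {name!r}")
    expected = name[: -len(".tar.gz")]
    with tarfile.open(tarball_path, "r:gz") as tar:
        # normpath drops a leading "./" so "./foo/bin" and "foo/bin" agree.
        roots = {
            posixpath.normpath(member.name).split("/", 1)[0]
            for member in tar.getmembers()
        }
    return roots == {expected}
```

A CI job could run a check like this on each produced artifact before upload, so that a naming regression fails the build rather than surprising users at unpack time.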
@alexcrichton
Collaborator

I'm personally always very hesitant to put stop energy on things when I don't have a great alternative. To that end, I've tried to put my money where my mouth is in #429.

Development

Successfully merging this pull request may close these issues.

Package release for Linux ARM64
Add binary release for Linux on Apple Silicon M1
2 participants