Provide an official ARM docker image and/or provide documentation for building on ARM #1861

Closed
gizmochief7 opened this issue Oct 16, 2017 · 97 comments

@gizmochief7

gizmochief7 commented Oct 16, 2017

Would it be possible to get an official Docker image of Envoy on ARM or at least some documentation on how to build it?
My team and I have tried off and on to build Envoy on ARM using the Bazel build process without any luck. It's challenging, to say the least. I just saw PR #1795 by @costinm, dated a couple of weeks ago and based on issue #1781, which talked about building it on ARM successfully after a few fixes. This is great news, but we aren't sure how to reproduce it. @costinm mentioned that he used a custom CMake build and a pi-cross-compiler to make it work. Documentation on this would be very helpful. Also, if there exists an image I could use for testing, that would be very helpful too. Thanks again for such an awesome product!

@gizmochief7 gizmochief7 changed the title Provide an official ARM docker image and/or provide documentation for building on ARM Provide an official ARM docker image and/or provide documentation for building on ARM label:"help wanted" Oct 16, 2017
@gizmochief7 gizmochief7 changed the title Provide an official ARM docker image and/or provide documentation for building on ARM label:"help wanted" Provide an official ARM docker image and/or provide documentation for building on ARM Label: Help Wanted Oct 16, 2017
@gizmochief7 gizmochief7 changed the title Provide an official ARM docker image and/or provide documentation for building on ARM Label: Help Wanted Provide an official ARM docker image and/or provide documentation for building on ARM [Label: Help Wanted] Oct 16, 2017
@mattklein123 mattklein123 changed the title Provide an official ARM docker image and/or provide documentation for building on ARM [Label: Help Wanted] Provide an official ARM docker image and/or provide documentation for building on ARM Oct 16, 2017
@vielmetti

@gizmochief7 it's been a little while, have you made progress? It sounds like you're building for 32-bit ARMv7; my particular interest is in arm64 builds.

@gizmochief7
Author

@vielmetti It has been, and I almost forgot about posting this. We did not have any further luck with this attempt, and without any community help we decided to move past Envoy. We tried out another reverse proxy/sidecar project for microservices and it just worked out of the box on both 32-bit and 64-bit. So we haven't spent any more time here. Sorry I couldn't be more help.

@costinm
Contributor

costinm commented Nov 21, 2017 via email

@rshriram
Member

fwiw, I tried to build bazel on my pi (yes, it took several hours), and then built envoy on pi for armv7l 64bit. While envoy started up and was able to talk to http services, it had difficulty talking to certain https services (esp twitter). There seemed to be some boringssl issue that prevented ssl connections from working. My hunch was that it didn't compile properly. At that point, I gave up, because I had spent 4 solid days fighting bazel on pi.

@costinm
Contributor

costinm commented Nov 21, 2017 via email

@costinm
Contributor

costinm commented Nov 21, 2017 via email

@moderation
Contributor

I've secured a Pine64 Rock64 SoC which is ARMv8 / aarch64 and has 4GB of RAM. I've successfully compiled Bazel 0.13.0. I'm not having any luck compiling Envoy.

I've leveraged @clnperez's way of removing LuaJIT.

I can't get her trick of using the locally installed Go, go_register_toolchains(go_version="host"), to work. When I use it, Bazel symlinks to the installed Go tools (~/Library/go/pkg/tool/linux_arm64), tries to compile Gazelle and protoc-gen-go, and fails because link can't see gcc on $PATH. Installing these tools on the host doesn't fix the issue. I expect this has something to do with breaking the hermetic seal by using go_version="host".

I've modified source/extensions/extensions_build_config.bzl to only include a minimum number of extensions.

I'm using the following build command with the WORKSPACE below:

bazel --bazelrc=/dev/null build -s -c opt //source/exe:envoy-static.stripped --define google_grpc=disabled --define signal_trace=disabled --define hot_restart=disabled --verbose_failures

workspace(name = "envoy")

load("//bazel:repositories.bzl", "envoy_dependencies")
load("//bazel:cc_configure.bzl", "cc_configure")

envoy_dependencies(
   skip_targets=['luajit']
)
cc_configure()

load("@envoy_api//bazel:repositories.bzl", "api_dependencies")
api_dependencies()

load("@io_bazel_rules_go//go:def.bzl", "go_download_sdk", "go_rules_dependencies", "go_register_toolchains")
load("@com_lyft_protoc_gen_validate//bazel:go_proto_library.bzl", "go_proto_repositories")

go_proto_repositories(shared=0)
go_rules_dependencies()

go_download_sdk(
    name = "go_sdk",
    sdks = {
        "linux_arm64": ("go1.10.2.linux-arm64.tar.gz", "d6af66c71b12d63c754d5bf49c3007dc1c9821eb1a945118bfd5a539a327c4c8"),
    },
)

go_register_toolchains()

This prevents the build from automatically downloading Go for linux_amd64.

The build then fails with:

ERROR: ~/.cache/bazel/_bazel_moderation/1d277edd85b03195d275adc4870a9bef/external/com_lyft_protoc_gen_validate/validate/BUILD:35:1: no such package '@com_github_golang_protobuf//ptypes/duration': no such package '@io_bazel_rules_go_repository_tools//': no such package '@go_sdk//': Unsupported host linux_amd64 and referenced by '@com_lyft_protoc_gen_validate//validate:go_default_library'
ERROR: Analysis of target '//source/exe:envoy-static.stripped' failed; build aborted: no such package '@com_github_golang_protobuf//ptypes/duration': no such package '@io_bazel_rules_go_repository_tools//': no such package '@go_sdk//': Unsupported host linux_amd64

My suspicion is that lyft/protoc-gen-validate is hard-coded for linux_amd64 (or, considering that compilation works on macOS / Darwin, maybe it just doesn't support ARM).

Any help from Bazel wizards like @jmillikin-stripe or from people who have tried non-amd64 compilation like @clnperez @costinm @rshriram would be great.

@vielmetti

Issue noted in bufbuild/protoc-gen-validate#75 - there are definitely some hard-coded dependencies in lyft/protoc-gen-validate to amd64 binaries.

@clnperez
Contributor

clnperez commented May 17, 2018

I'll try and help. I don't have much ARM experience but a coworker (@tophj-ibm) does. So we'll try to put our heads together on this one. Any progress we can make would be great. We'd also like to have a ppc64le docker image.

wonder twins unite?

My first guess is that something in cc_configure works differently for ARM. I ran into some things where, when using certain flags, gcc would behave differently for power than it did for x86. envoy, for the last release, seems to have come to rely a lot more on how bazel configured the gcc toolchain and did a little less customization. That got rid of almost all the issues I was seeing, but maybe there is some more tweaking needed for ARM.

For your WORKSPACE file, what's the logic behind including the call to go_download_sdk?

@moderation
Contributor

Wonder twins unite @clnperez! Without the specific go_download_sdk call, when building on ARM you can see Bazel download the amd64 version of Go and then fail, because the Go it downloaded doesn't work on aarch64. I agree that the gcc wrapping could be problematic, but the errors I'm currently seeing are linux_amd64 errors.

@vielmetti

The issue of Bazel mis-identifying the aarch64 system is noted at bazelbuild/rules_go#1506 - thanks for the diagnosis.
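For anyone following along, the failure mode above (an aarch64 host ending up as linux_amd64) comes down to how a host is mapped to a Go SDK platform name. Below is a minimal Python sketch of such a mapping. It is not the rules_go implementation; the function name `go_platform` and the lookup table are assumptions made for illustration.

```python
import platform

# Hypothetical table: `uname -m` style machine names -> Go architecture names.
_GOARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
    "armv7l": "arm",
    "ppc64le": "ppc64le",
    "s390x": "s390x",
}


def go_platform(system=None, machine=None):
    """Return a Go SDK style platform name such as 'linux_arm64'.

    Falls back to the running host when arguments are omitted. Unknown
    machines raise instead of silently defaulting to amd64.
    """
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    try:
        return f"{system}_{_GOARCH[machine]}"
    except KeyError:
        raise ValueError(f"unsupported host: {system}/{machine}") from None


if __name__ == "__main__":
    print(go_platform())
```

Raising on an unknown machine, rather than falling back to a default, is the design choice that would have surfaced the problem early instead of downloading an amd64 SDK.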

@moderation
Contributor

Before resuming work on this, I think we should prioritize moving to rules_go 0.12.0. At the moment a build with 0.12.0 fails with a "cycle in dependency graph" error. I suspect the root cause of both this and the inability to build on aarch64 is lyft/protoc-gen-validate. It relies on a number of hacks and workarounds; it works today, but I think that without some maintenance it will block the move to rules_go 0.12.0 and the ability to compile for platforms other than amd64, darwin, and s390x.

@clnperez
Contributor

So, I'm hitting this now as well on ppc64le. :D

@moderation
Contributor

moderation commented Jun 28, 2018

@clnperez I've had success this week building on aarch64 / arm64 using these patches from @lubinsz #3681 (comment). @htuch unblocked protobuf 3.6.0 with this patch to lyft/protoc-gen-validate bufbuild/protoc-gen-validate@345b6b4. rules_go 0.12.0 is still blocked.

@clnperez
Contributor

@moderation i just took a look at those patches, and they look like they're all adding arm support, whereas we already have power support. i'm actually hitting a different error today than I did last night, and that is the gazelle compile (at least i think it's the gazelle compile) isn't finding gcc in my PATH. I might open a new issue for that.

@moderation
Contributor

@clnperez We were having the same issue on arm64. The patch for rules_go at lubinsz/rules_go@497b488 fixed that issue. Not sure how portable that patch is but might give you some ideas.

@clnperez
Contributor

clnperez commented Jun 29, 2018

@moderation using it as-is seems to break compatibility with lyft/protoc-gen-validate as the patch for rules_go is based off of a branch that deprecated go_repositories. so i tried just changing the path bits for the linking (using what was in that patch) and it didn't get me around that.

there are so many moving parts to trying out one thing, though, i could have missed something. but i think i have it all lined up correctly. i'll dig some more and see what i find.

edit: here is the link-only patch, based off of rules_go 0.11.1: clnperez/rules_go@2f55b46

@clnperez
Contributor

@moderation it turns out this was what i needed: bazelbuild/bazel-gazelle#242

@moderation
Contributor

@clnperez this is similar to this patch that enabled the arm64 build - lubinsz/rules_go@497b488#diff-b88d1275b3ee8565f7821ba59b402139. Thanks @lubinsz. It would be good to go back and have these changes included upstream.

@vielmetti

Several upstream issues in bazel have been resolved, specifically bazelbuild/rules_go#1550 which might make this worth a retry.

@mattklein123 mattklein123 added this to the 1.9.0 milestone Nov 16, 2018
@jared2501

Would be super interested in an arm64 docker image!

@trilom

trilom commented Jul 7, 2020

I put this together, but it does not build Envoy: something about the external/go_sdk/bin location not being found. I've been able to nurse it past that, but have run into a couple of other issues that I believe might be environment related. Either way, this feels pretty close to an Alpine build on edge, if it's useful to anyone.

FROM alpine:edge AS bazel

ARG BAZEL_VERSION='3.3.1'

RUN mkdir /bazel
WORKDIR /bazel
RUN wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-dist.zip
RUN unzip bazel-${BAZEL_VERSION}-dist.zip
# script needs bash, build needs rest
RUN apk add bash=5.0.17-r0 \
            openjdk8=8.242.08-r2 \
            build-base=0.5-r2 \
            linux-headers=5.4.5-r1 \
            zip=3.0-r8 \
            python3=3.8.3-r0 
RUN ln -s /usr/bin/python3 /usr/local/bin/python && \
    env JAVA_HOME=/usr/lib/jvm/java-1.8-openjdk EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh

FROM alpine:edge AS envoy
RUN apk add git=2.27.0-r0 \
            openjdk8=8.242.08-r2 \
            autoconf=2.69-r2 \
            automake=1.16.2-r0 \
            libtool=2.4.6-r7 \
            samurai=1.1-r0 \
            cmake=3.17.3-r0 \
            python3=3.8.3-r0 \
            build-base=0.5-r2 \
            llvm10-dev=10.0.0-r2 \
            go=1.14.4-r1 \
            clang=10.0.0-r3 \
            linux-headers=5.4.5-r1 \
            libc6-compat=1.1.24-r9 \
            musl-dev=1.1.24-r9
RUN go get -u github.com/bazelbuild/buildtools/buildifier
RUN go get -u github.com/bazelbuild/buildtools/buildozer
RUN git clone https://github.com/envoyproxy/envoy.git /envoy
WORKDIR /envoy
COPY --from=bazel /bazel/output/bazel /usr/local/bin/bazel

RUN ln -sf /usr/lib/llvm10/ /opt/llvm && \
    ln -sf /usr/bin/python3 /usr/local/bin/python && \
    env JAVA_HOME=/usr/lib/jvm/java-1.8-openjdk \
    bazel build //source/exe:envoy-static --sandbox_debug

@trilom

trilom commented Jul 9, 2020

ci/run_envoy_docker.sh 'ci/do_ci.sh bazel.release.server_only' no longer works: Tick-Tocker/bazel-arm64 does not have a Bazel 3.3.1 release, and .bazelversion specifies 3.3.1. If you switch it to 3.3.0 it will build, but I am not sure yet whether the build succeeds.

@lizan
Member

lizan commented Jul 9, 2020

@Jingzhao123 ^^

@lizan
Member

lizan commented Jul 9, 2020

@trilom It is not recommended to build a musl (Alpine) based binary: you have to disable signal_actions and gperftools. Disabling the latter will impact performance significantly (around 30~40% on x64 in a simple benchmark).

@trilom

trilom commented Jul 9, 2020

> @trilom It is not recommended to build a musl (Alpine) based binary: you have to disable signal_actions and gperftools. Disabling the latter will impact performance significantly (around 30~40% on x64 in a simple benchmark).

I did read that at one point. I'm not well versed in building with musl, and that is for the most part why I stopped with the Alpine approach. Thank you for the information. My ideal goal is to use something similar to docker buildx for this so that I can have some CI for this image (something similar to this), but aside from that the command above should fit my needs.

@Jingzhao123
Contributor

@lizan @trilom It has been updated.

@CelsoSantos

CelsoSantos commented Jul 14, 2020

Realistically speaking, how far are we from an arm64 image that could be considered stable?
From my understanding, we already have an image that runs on arm64 and can be used to build Envoy on arm64. But does this produce an arm64 image, or just a binary that still needs to be containerized?

Also (though this one is my fault for not knowing), how long does it take to build an arm64 image with that container? Minutes, hours, days? I have an RPi4 available that I could use to build Envoy.

Also, I'm getting an error trying to pull the images:

pi@k3s:~ $ docker pull envoyproxy/envoy-build-ubuntu:f21773ab398a879f976936f72c78c9dd3718ca1e
f21773ab398a879f976936f72c78c9dd3718ca1e: Pulling from envoyproxy/envoy-build-ubuntu
no matching manifest for linux/arm/v7 in the manifest list entries

EDIT: Never mind the error. I forgot I had a 32-bit OS on this Pi.
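The "no matching manifest" message means the image's manifest list has no entry for the platform the Docker client asked for (here linux/arm/v7, because of the 32-bit OS). A minimal Python sketch of that lookup follows. The function name `supports` and the `SAMPLE` data are illustrative assumptions; the sample only mimics the os/architecture/variant fields of a manifest list and is not the real contents of envoyproxy/envoy-build-ubuntu.

```python
def supports(manifest_list, os_name, arch, variant=None):
    """Return True if any manifest entry matches the requested platform."""
    for entry in manifest_list.get("manifests", []):
        plat = entry.get("platform", {})
        if plat.get("os") != os_name or plat.get("architecture") != arch:
            continue
        # Only compare variants when the caller asked for a specific one.
        if variant is None or plat.get("variant") == variant:
            return True
    return False


# Assumed sample: an image published for amd64 and arm64 only.
SAMPLE = {
    "manifests": [
        {"platform": {"os": "linux", "architecture": "amd64"}},
        {"platform": {"os": "linux", "architecture": "arm64"}},
    ]
}

if __name__ == "__main__":
    print(supports(SAMPLE, "linux", "arm", "v7"))
    print(supports(SAMPLE, "linux", "arm64"))
```

With a 64-bit OS on the same Pi, the client would request linux/arm64 instead, which is why the error goes away.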

@carmiac

carmiac commented Jul 14, 2020

@CelsoSantos I ran the build just a couple of days ago on my RPi4. It produced a static non-containerized binary that I was able to drop onto my target device and run. The build time was around 12 hours, though I did notice that it was just using a single core.

@moderation
Contributor

@carmiac the single core issue you saw began with a recent release of Bazel. Here is what I posted in the Bazel Slack.

This line in the Envoy Proxy config is the culprit: https://github.com/envoyproxy/envoy/blob/master/.bazelrc#L14. According to the documentation, this flag still defaults to false, but my compilation dropped to a single core when I jumped to Bazel 3.2.0. The Envoy config sets the flag without an explicit true or false. My best guess is that 3.2.0 changed how a bare flag is interpreted, so the Envoy Proxy definition flipped to being evaluated as true. The following override in ~/.bazelrc has me back to using multiple cores concurrently.

# disable as this was restricting Bazel actions to 1
build --experimental_local_memory_estimate=false

Setting this flag will allow you to use more than one core. I had a 4 GB RAM RPi4 and had to restrict the build using --jobs=3 or --jobs=2 to prevent memory exhaustion. I just upgraded to an 8 GB RAM RPi4, no longer have this issue, and can use all 4 cores. All of the above assumes arm64 using a 64-bit OS like Ubuntu Server.
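As a rough rule of thumb for picking --jobs on a small board, the two data points above (4 GB needing 2 or 3 jobs, 8 GB coping with 4) are consistent with budgeting roughly 2 GB of RAM per concurrent job. The Python sketch below encodes that heuristic. The 2 GB figure and the function name `suggest_jobs` are assumptions fitted to those two observations, not a measured requirement.

```python
def suggest_jobs(cores, ram_gb, gb_per_job=2.0):
    """Suggest a Bazel --jobs value bounded by both cores and RAM.

    gb_per_job is an assumed memory budget per concurrent compile action.
    """
    by_memory = int(ram_gb // gb_per_job)
    return max(1, min(cores, by_memory))


if __name__ == "__main__":
    for ram in (4, 8):
        print(f"{ram} GB RAM, 4 cores: --jobs={suggest_jobs(4, ram)}")
```

The result is meant to be passed on the command line, for example as --jobs=2 on the 4 GB board. If the build still runs out of memory, lower the value further.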

Lastly check out https://stevesloka.com/compile-envoy-on-raspberry-pi4/.

@al45tair

@CelsoSantos It's been a while, but IIRC the command I ran back on the 18th of May worked and generated a Docker image (which I uploaded to Docker Hub as al45tair/envoy-arm64). It seems to work.

@moderation @carmiac It's probably worth getting some time on an ARM64 server machine rather than building on a Pi. Even if you don't have access to one at work (I'm lucky in that regard), you can spin them up in AWS — there are Graviton and Graviton2 instances available, I believe — and they'll build it a lot faster than a Pi will.

@moderation
Contributor

@al45tair I'm pretty happy with my local RPi4 dev environment. After the initial build, subsequent incremental builds are relatively quick, and compiling out all the extensions you don't need works well. I'm in a very small minority of people who don't think Docker helps with anything, so I'm building Bazel and Envoy straight on the RPi4.

@lizan
Member

lizan commented Jul 14, 2020

> Realistically speaking, how far are we from an arm64 image that could be considered stable?

I wouldn't be comfortable saying it is stable until it passes most of our test suite on arm64.

@vielmetti

A note that Bazel 3.4.0 now ships with official arm64 binaries -

bazelbuild/bazel#8833 (comment)

That may save you a build step, @moderation, if you can adapt to any changes associated with the Bazel point release at 3.4.0.

@moderation
Contributor

Thanks @vielmetti. Looks like they need to work on their automation a bit, as the latest release, 3.4.1, isn't available yet. But it definitely looks promising.

lizan added a commit that referenced this issue Jul 22, 2020
Risk Level: Low
Testing: CI
Docs Changes: N/A
Release Notes: N/A (should be added when docker build is enabled)
Part of #1861
 
Signed-off-by: Lizan Zhou <[email protected]>
@lizan lizan closed this as completed in 9d70da7 Jul 30, 2020
KBaichoo pushed a commit to KBaichoo/envoy that referenced this issue Jul 30, 2020
Risk Level: Low
Testing: CI
Docs Changes: N/A
Release Notes: N/A (should be added when docker build is enabled)
Part of envoyproxy#1861

Signed-off-by: Lizan Zhou <[email protected]>
Signed-off-by: Kevin Baichoo <[email protected]>
chaoqin-li1123 pushed a commit to chaoqin-li1123/envoy that referenced this issue Aug 7, 2020
In this patch, it will enable the envoyproxy/envoy arm image to build
in community arm CI environments.
1. Do some modifications in docker_ci.sh script for building arm images
   by buildx. It will firstly set up environments. Then use the buildx
   tool to build the envoyproxy/envoy arm images on x86 platform.
2. Modify the docker build job for building multi-arch images.
   It will firstly download the arm64 and amd64 envoy binaries. Then
   invoke the docker_ci.sh scripts to generate images.

Risk Level: Medium (of breaking images)
Testing: CI
Docs Changes: N/A
Release Notes: Added
Fixes envoyproxy#1861 

Signed-off-by: Jingzhao.Ni <[email protected]>