Why running the action on Linux VMs is slow #46

Closed
ychescale9 opened this issue Apr 2, 2020 · 27 comments

@ychescale9
Member

ychescale9 commented Apr 2, 2020

Hi! Thanks for trying out this action.

I'd like to share a few words on the recently added Linux support to give you a bit of context on why it's much slower than running on macOS.

In general, if you run this action on a Linux VM (e.g. ubuntu-latest), you'll get very few benefits over other CI solutions (perhaps other than easier configuration and less scripting), as there's no hardware acceleration on the Ubuntu VMs provided by GitHub Actions.

While some emulator versions will sort of run without hardware acceleration, in practice you really need hardware acceleration enabled to run the Android Emulator, period. In fact I almost believe that the emulator binary should just not start at all and throw an error if it can't be hardware-accelerated, otherwise it defeats the purpose of having these modern x86 and x86_64 system images.

To give you a bit more context, I created this action solely to take advantage of the hardware acceleration support on the macOS VM (enabled by HAXM, which is installed on macOS VMs), and never intended to add Linux support because I knew it wouldn't be a good experience (at least not better than what you already have) without hardware acceleration.

I wrote this article about running Android emulator on CI which should give you more context on running instrumentation tests on CI in general and why the lack of hardware acceleration support from the host machine is still the biggest challenge for running fast and stable instrumented tests on CI.

I understand this might be a bit of a disappointment for some of you, but unfortunately this is a much bigger problem than what can be solved / improved by a GitHub Action.

If you look at the code, there's no magic at all. It basically automates the following process and makes it a little easier to configure for the common use cases:

  1. Install / update the required Android SDK components including build-tools, platform-tools, platform (for the required API level), emulator and system-images (for the required API level).
  2. Create a new instance of AVD with the provided configurations.
  3. Launch a new Emulator with the provided configurations.
  4. Wait until the Emulator is booted and ready for use.
  5. Run a custom script provided by user once the Emulator is up and running - e.g. ./gradlew connectedCheck.
  6. Kill the Emulator and finish the action.

As soon as you see "Emulator booted." in the log, the job of this action is basically done. From this point on the environment is handed over to your script, so anything that works or doesn't work has little to do with the action itself, which only makes sure a live Emulator instance is available in the background.
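For anyone who wants to see roughly what those six steps look like outside the action, here is a minimal shell sketch. It is not the action's actual code: it assumes `sdkmanager`, `avdmanager`, `emulator` and `adb` are already on the PATH, and the function names, the AVD name `test`, the default API level and the retry count are all hypothetical choices for illustration. The build-tools package is left out because its version is project-specific.

```shell
#!/usr/bin/env bash
# Sketch of the six steps listed above. Names and defaults are assumptions,
# not taken from the action's source.

API_LEVEL="${API_LEVEL:-29}"      # assumed default API level
AVD_NAME="${AVD_NAME:-test}"      # hypothetical AVD name
IMAGE="system-images;android-${API_LEVEL};default;x86_64"

# Step 4 helper: poll sys.boot_completed until it reports 1 or retries run out.
wait_for_boot() {
  local retries="$1" delay="${2:-2}" i=0
  while [ "$i" -lt "$retries" ]; do
    if [ "$(adb shell getprop sys.boot_completed 2>/dev/null | tr -d '\r')" = "1" ]; then
      echo "Emulator booted."
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Timed out waiting for the emulator to boot." >&2
  return 1
}

# Runs the command passed as arguments against a freshly booted emulator.
run_with_emulator() {
  # 1. Install the required SDK components.
  sdkmanager --install "platform-tools" "platforms;android-${API_LEVEL}" "emulator" "$IMAGE"
  # 2. Create the AVD (answer "no" to the custom hardware profile prompt).
  echo no | avdmanager create avd --force -n "$AVD_NAME" -k "$IMAGE"
  # 3. Launch the emulator headless in the background.
  emulator -avd "$AVD_NAME" -no-window -no-snapshot -noaudio -no-boot-anim &
  # 4. Wait until the device is visible and has finished booting.
  adb wait-for-device
  wait_for_boot 300 || return 1
  # 5. Run the user's script, remembering its exit status.
  "$@"
  local rc=$?
  # 6. Kill the emulator and report the script's status.
  adb emu kill
  return "$rc"
}
```

A call such as `run_with_emulator ./gradlew connectedCheck` would then mirror step 5. Everything after "Emulator booted." is the user's script, which is why failures past that point usually come from the tests or the host rather than from the setup.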


For those of you who are not fortunate enough to be able to use the macOS VM, I'd encourage you to have a look at Cirrus CI as an alternative. It provides KVM-enabled Linux VMs and the pricing seems pretty reasonable (100% free for open-source projects). I wrote a long article going over all the features relevant to Android and how you can optimize your pipeline. I also created some templates to help you get started.

Hope that gives you some context regarding the Linux support, why it's expected to be slow and the alternative you might want to look at.

Thanks!

@filipkowicz

I just saw that the Windows machine has the HAXM component installed: https://github.com/actions/virtual-environments/blob/2378e1c967bc72fdf555f9b33e8c99446c06b4c4/images/win/Windows2016-Readme.md
Maybe that's the way to go (Windows is still way cheaper than macOS).

@ychescale9
Member Author

ychescale9 commented Jul 7, 2020

Thanks! Definitely worth giving it a shot. I’ll create a separate issue.

@sisoje

sisoje commented Sep 18, 2020

macOS minutes are 10 times more expensive than Linux minutes though, for private repos.

@brookmg

brookmg commented Sep 18, 2020

There is no better option at the moment @sisoje. Neither the Linux nor the Windows instances provided by GitHub Actions support hardware-accelerated VMs.

@eighthave

I hope this isn't too off topic. I'm interested in furthering the development of running the emulator in CI on GNU/Linux. I've gotten the new emulator running under Docker in GitLab CI. It works without KVM, where it's fast enough to run JUnit tests and really simple Espresso tests, but too slow for more elaborate things. The big update is that it works with the default GitLab CI runners, i.e. without acceleration. It is important to use the default emulator images rather than google_apis because the Google apps seem to slow down the boot process a lot. Also, the android-22 through android-27 system images seem to require fewer resources than the newer ones.

Our Docker image and how it's used are documented in our wiki: https://gitlab.com/fdroid/wiki/-/wikis/Running-emulators-in-GitLab-CI

@francois-spectre

Looks like some people are running their tests on Ubuntu and are happy with it, even if it is slower: ably/ably-flutter#232 (comment). I would be interested too, because we have a very short test suite and because of the cost. I found their config here: https://github.com/ably/ably-flutter/blob/2481f2ba55caf64c9f3df3e34afea1e5b0db966c/.github/workflows/flutter_integration.yaml

@martijnarts

There's potentially nested virt support on the larger runners: actions/runner-images#183 (comment)

Has anyone gotten into the beta and been able to verify this?

@AndrewGable

AndrewGable commented Oct 19, 2022

I ran a test on the Ubuntu 20.04 64 core "large runner" and from what I could tell there was no KVM support.

See the run here: https://github.com/Expensify/App/actions/runs/3276676725/jobs/5393009096

Relevant Logs:

ProbeKVM: This user doesn't have permissions to use KVM (/dev/kvm).
  The KVM line in /etc/group is: [kvm:x:108:]
  
  If the current user has KVM permissions,
  the KVM line in /etc/group should end with ":" followed by your username.
  
  If we see LINE_NOT_FOUND, the kvm group may need to be created along with permissions:
      sudo groupadd -r kvm
      # Then ensure /lib/udev/rules.d/50-udev-default.rules contains something like:
      # KERNEL=="kvm", GROUP="kvm", MODE="0660"
      # and then run:
      sudo gpasswd -a $USER kvm
  
  If we see kvm:... but no username at the end, running the following command may allow KVM access:
      sudo gpasswd -a $USER kvm
  
  You may need to log out and back in for changes to take effect.
  
  WARNING | x86 emulation may not work without hardware acceleration!

I'm happy to run any further tests if I've misconfigured anything in the test, just let me know. cc @hannojg
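The ProbeKVM message above boils down to "is the current user listed at the end of the kvm line in /etc/group?". A small shell sketch of that check follows. The function names are hypothetical, and it only looks at the member list of the group line, so a user whose primary group is kvm would not be detected.

```shell
#!/usr/bin/env bash
# Sketch: check whether a user appears in an /etc/group style line.
# Function names are illustrative, not part of any tool.

# group_line_has_user "kvm:x:108:alice,bob" "bob" returns status 0.
group_line_has_user() {
  local line="$1" user="$2" members
  members="${line##*:}"          # text after the last ':' is the member list
  case ",${members}," in
    *",${user},"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Look up the real kvm group and report on the current user.
check_kvm_group() {
  local line
  line="$(getent group kvm)" || { echo "LINE_NOT_FOUND"; return 2; }
  if group_line_has_user "$line" "$(id -un)"; then
    echo "$(id -un) is in the kvm group"
  else
    echo "$(id -un) is NOT in the kvm group: $line"
    return 1
  fi
}
```

With the line from the log, `group_line_has_user "kvm:x:108:" "$USER"` fails because the member list is empty, which matches the warning the emulator printed.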

@nebuk89

nebuk89 commented Jan 26, 2023

👋 hello :) I am a PM at GitHub, if anyone fancies a chat about android nested virt testing on linux drop me an email on [email protected] - would love to talk about what we have upcoming

@JavierSegoviaCordoba

@nebuk89 can't you share anything here?

@nebuk89

nebuk89 commented Jan 27, 2023

@JavierSegoviaCordoba I can do :D, to be a bit more centralised for everyone watching I wrote up where we are at under actions/runner-images#183

If y'all do any testing I would <3 feedback on the perf bits, I saw better perf on the 4-core than on the Macs which netted out pretty well :)

@tugceaktepe

tugceaktepe commented Mar 15, 2023

Hi,

I want to speed up emulator booting with KVM.
I'm getting the error below even though KVM seems OK. I'm trying the android-29 emulator.


KVM: entry failed, hardware error 0x8
FATAL   | Emulator: exiting becase of the internal error 'kvm_arch_handle_exit: hardware error happened in KVM, aborting now

I checked the following. Everything seems to be OK, but I can't understand why I'm getting this error when trying to use VM acceleration.

$./emulator -accel-check
accel:
0
KVM (version 12) is installed and usable.
accel

$kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used

Environment info:
5.15.0-67-generic 74-Ubuntu SMP Wed Feb 22 14:14:39 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Virtualization features:
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full

memory 32GiB System Memory

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz
CPU family: 6
Model: 106
Thread(s) per core: 2
Core(s) per socket: 1
Socket(s): 8

@mattjohnsonpint
Contributor

FWIW, we started using a 4-core larger runner for our Android tests and it's been working great so far.

See https://github.com/getsentry/sentry-dotnet/blob/main/.github/workflows/device-tests-android.yml#L46

The results are interesting. It's not any slower than macOS, but it's not dramatically faster either. However - it does seem to be more stable. When we were running on macOS, we frequently got failures booting the emulator before our tests could run at all. That doesn't seem to be happening after the switch.

The downside is that the macOS runner was free (for public repos) and the 4-core Linux runner costs $0.016/min. We matrix for multiple versions of android, and run often, so it adds up.

@mattjohnsonpint
Contributor

One gotcha I should mention. When first switching over, nothing worked. It crashed immediately when trying to start the emulator. The culprit turned out to be caching. We were using an AVD image built on macOS on the Linux runner. Easy fix - make sure that runner.os is part of the cache key.
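As a rough illustration of that fix, the cache key can be derived from the runner's OS so that an AVD built on macOS is never restored on Linux. This is a sketch only: `RUNNER_OS` and `GITHUB_OUTPUT` are provided by GitHub Actions, while the function name and the `avd-<os>-<api>-<arch>` key layout are assumptions, not what any particular workflow uses.

```shell
#!/usr/bin/env bash
# Sketch: build an AVD cache key that includes the runner OS.
# The key layout and function name are hypothetical.

avd_cache_key() {
  local api_level="$1" arch="$2"
  printf 'avd-%s-%s-%s\n' "${RUNNER_OS:-unknown}" "$api_level" "$arch"
}

# Inside a workflow step, the key could be exposed to later steps with:
#   echo "key=$(avd_cache_key 30 x86_64)" >> "$GITHUB_OUTPUT"
```

The same effect comes from writing the runner OS directly into the `key:` expression of the cache step; the point is only that the OS must be part of whatever string identifies the cached AVD.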

@mattjohnsonpint
Contributor

On the KVM setup, we use the bits mentioned here:
https://github.blog/changelog/2023-02-23-hardware-accelerated-android-virtualization-on-actions-windows-and-linux-larger-hosted-runners

- name: Enable KVM group perms
  run: |
      echo 'KERNEL=="kvm", GROUP="kvm", MODE="0666", OPTIONS+="static_node=kvm"' | sudo tee /etc/udev/rules.d/99-kvm4all.rules
      sudo udevadm control --reload-rules
      sudo udevadm trigger --name-match=kvm
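On runners that do not expose KVM at all, the `udevadm trigger` line fails because there is no kvm device to match. A hedged sketch of a guard around the same three commands is below. The function names are made up for illustration, and it assumes that the presence of a `/dev/kvm` character device is a good enough signal that the runner supports acceleration.

```shell
#!/usr/bin/env bash
# Sketch: only apply the udev rule when the runner exposes a KVM device.
# Function names are hypothetical.

# Status 0 when the given path (default /dev/kvm) is a character device.
kvm_device_present() {
  [ -c "${1:-/dev/kvm}" ]
}

enable_kvm_if_present() {
  if ! kvm_device_present; then
    echo "No /dev/kvm on this runner; the emulator would run without hardware acceleration." >&2
    return 1
  fi
  # Same rule and reload sequence as in the step above.
  echo 'KERNEL=="kvm", GROUP="kvm", MODE="0666", OPTIONS+="static_node=kvm"' \
    | sudo tee /etc/udev/rules.d/99-kvm4all.rules
  sudo udevadm control --reload-rules
  sudo udevadm trigger --name-match=kvm
}
```

Calling `enable_kvm_if_present` as the body of the `run:` step would then produce a readable message instead of a udev error when the device is missing.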

@mrk-han
Collaborator

mrk-han commented Mar 18, 2023

Just did some benchmarking of the 4, 8, 16, 32 core linux agent with KVM enabled vs macOS

The project I used was a non-compose project with about 10 UI tests. This was a fully fledged and shipped android application and not a sample app.

The 4 core is super cheap and stable, and still significantly faster than macOS (7min46sec vs 14min22sec total build + test time).

The 8 core seemed to be the most cost-efficient and was about 33% faster than the 4 core.

The 16 core was a slight build improvement above the 8 core but almost the same testing time.

The 32 core showed negligible improvement over the 16 core, and testing sometimes took longer.

Seems like the 8 core will be what I'll be trying to use, and it is still only $0.032 per minute compared to the $0.08 per minute of the macOS agent.

4 Core: $0.016 per minute
Build 3m59s
Test 3m47s
Total 7m46s

8 Core: $0.032 per minute
Build 2m41s
Test 2m47s
Total 5m28s

16 Core: $0.064 per minute
Build 2m13s
Test 2m50s
Total 5m3s

32 Core: $0.128 per minute
Build 2m4s
Test 2m59s
Total 5m3s

macOS: $0.08 per minute
Build 5m15s
Test 9m7s
Total 14m22s
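To put the rates and totals side by side, the cost of a single run can be approximated as rate times duration. The sketch below assumes the rates are US dollars per minute (as with the $0.016/min figure quoted earlier in the thread) and ignores any rounding the billing system applies; `run_cost` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: approximate cost of one run = per-minute rate x total duration.
# Assumes rates are in US dollars per minute and ignores billing rounding.

# run_cost RATE MINUTES SECONDS prints the cost in dollars, three decimals.
run_cost() {
  local rate="$1" minutes="$2" seconds="$3"
  LC_ALL=C awk -v r="$rate" -v m="$minutes" -v s="$seconds" \
    'BEGIN { printf "%.3f\n", r * (m * 60 + s) / 60 }'
}

run_cost 0.016 7 46    # 4 core,  total 7m46s
run_cost 0.032 5 28    # 8 core,  total 5m28s
run_cost 0.064 5 3     # 16 core, total 5m3s
run_cost 0.128 5 3     # 32 core, total 5m3s
run_cost 0.08 14 22    # macOS,   total 14m22s
```

Under these assumptions the four Linux sizes come out between roughly $0.12 and $0.65 per run, against roughly $1.15 for the macOS run, so the trade-off between the Linux sizes is mostly one of wall-clock time versus cost.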

I actually ran this test with GMD and not ReactiveCircus but will do more benchmarking with avd caching + reactive circus later.

        managedDevices {
            devices {
                pixel2api30 (ManagedVirtualDevice) {
                    device = "Pixel 2"
                    apiLevel = 30
                    systemImageSource = "google"
                }
            }
        }

Linux is certainly not slow anymore! 😍

matejdro added a commit to inovait/kotlinova that referenced this issue Apr 11, 2023
These tests do not work well on linux images that we are using (see ReactiveCircus/android-emulator-runner#46)

This reverts commit df473cd.
@henrikra

henrikra commented Jun 8, 2023

@mrk-han Share the GHA config file? Was there anything special?

@nebuk89

nebuk89 commented Jun 14, 2023

I am glad this has helped folks out! Always feel free to reach out in the future <3
(I am currently working on KVM support for the 2-core runners, so watch this space as well :) )

@mattjohnsonpint
Contributor

@nebuk89 - Alternatively, it would be awesome if the 4-core Linux runner was free for public repos. In my case, when we switched from macOS to Linux for the KVM support, we drastically improved the stability of our test runs, but we also went from completely free to having a monthly cost associated with this. Our repo is public because it's for an open source library, not an application.

@erawhctim

> I actually ran this test with GMD and not ReactiveCircus but will do more benchmarking with avd caching + reactive circus later.

@mrk-han Thanks for posting the detailed diagnostic info from your test runs using the various XL Linux runners! Super super helpful 🙏

Wanted to follow up on your last comment: did you ever have a chance to benchmark this with AVD caching + android-emulator-runner? Any findings to add here?

@nebuk89

nebuk89 commented Jul 26, 2023

Hey all, sorry I haven't followed up. We are still actively working on migration plans for the existing 2-core machines to enable this. I am sorry it has taken longer than I expected 😞 and please keep an eye here for me to follow up.

@ychescale9
Member Author

This issue is no longer relevant, as running hardware-accelerated emulators on the upgraded free Linux runners is much faster than on the macOS runners:

https://github.blog/2024-01-17-github-hosted-runners-double-the-power-for-open-source/
https://github.blog/changelog/2023-02-23-hardware-accelerated-android-virtualization-on-actions-windows-and-linux-larger-hosted-runners/

@ychescale9 ychescale9 unpinned this issue Jan 19, 2024
@ychescale9 ychescale9 changed the title Why running the action on Linux VMs is slow (and that you probably shouldn't do it) Why running the action on Linux VMs is slow Jan 19, 2024
@isen-ng

isen-ng commented Jan 20, 2024

I'm trying to use ubuntu-latest on my repo and I'm still getting kvm not found.

Run echo 'KERNEL=="kvm", GROUP="kvm", MODE="0666", OPTIONS+="static_node=kvm"' | sudo tee /etc/udev/rules.d/99-kvm4all.rules
KERNEL=="kvm", GROUP="kvm", MODE="0666", OPTIONS+="static_node=kvm"
Failed to open the device 'kvm': No such file or directory
Error: Process completed with exit code 1.

Is there something else I need to do?

Edit:
It seems like my CPU count is still 2

Run lscpu
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      46 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             2

Edit2:
Looks like the 4-cpu upgrade is only for public repositories. Free runners for private repositories remain at 2-cpu. Really got me excited for nothing.
