Kurtosis testing for nimbus eth1 and eth2 #2281

advaita-saha · 2024-06-02T06:07:49Z

Fixes #172
The kurtosis package that will spin up a private Ethereum testnet over Docker or Kubernetes with multi-client support, Flashbot's mev-boost infrastructure for PBS-related testing/validation, and other useful network tools (transaction spammer, monitoring tools, etc). Kurtosis packages are entirely reproducible and composable, so this will work the same way over Docker or Kubernetes, in the cloud or locally on your machine. For more details https://github.com/kurtosis-tech/ethereum-package

This PR adds easier support to spin up test networks and run testnets both locally and over the github CI. For running the testnet and then the checks run the run-kurtosis-check.sh script. This will spin up a testnet with your local nimbus-eth1 build and pull the nimbus-eth2 build from statusim dockerhub, then run the following tests

Check if atleast one client is ready
Check if all EL and CL clients are synced
- Check if CL clients are synced
- Check if EL clients are synced
Check consensus finality
Check consensus attestation stats
Check consensus reorgs
Check consensus forks
Check if all clients propose blocks with legacy EOA transactions
10 EOA transactions per block
check block proposals
Execute all opcodes as contract deployment
Execute all opcodes as contract call
Check precompile calls
Check chain stability ( finality checks )

The networks configs can be easily changes and multi-client testing can be introduced in this whole setup only, by just changing the network configs in kurtosis-network-params.yml

NOTE: This kurtosis CI check is now set to fail, when the checks pass

run-kurtosis-check.sh

kurtosis-network-params.yml

Dockerfile

run-kurtosis-check.sh

tersec · 2024-06-03T02:51:21Z

In general, it's useful to run these shell script through shellcheck. Some of what it turns up (useless use of cat) is essentially cosmetic/aesthetic, and some of it is in practice probably harmless (using $foo rather than "$foo" when it's a URL which will typically be well-behaved regarding spaces, etc), but still worth running it and seeing if it's worth addressing, e.g., a sample of its output:

In run-kurtosis-check.sh line 42:
cat kurtosis-network-params.yml | envsubst > assertoor.yaml
    ^-------------------------^ SC2002 (style): Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.


In run-kurtosis-check.sh line 72:
docker logs -f $assertoor_container &
               ^------------------^ SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean: 
docker logs -f "$assertoor_container" &


In run-kurtosis-check.sh line 76:
  tasks=$(curl -s ${assertoor_url}/api/v1/test_run/$1 | jq -c ".data.tasks[] | {index, parent_index, name, title, status, result}")
                  ^--------------^ SC2086 (info): Double quote to prevent globbing and word splitting.
                                                   ^-- SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean: 
  tasks=$(curl -s "${assertoor_url}"/api/v1/test_run/"$1" | jq -c ".data.tasks[] | {index, parent_index, name, title, status, result}")


In run-kurtosis-check.sh line 80:
  while read task; do
        ^--^ SC2162 (info): read without -r will mangle backslashes.


In run-kurtosis-check.sh line 90:
    if [ ! -z "$task_graph" ]; then
         ^-- SC2236 (style): Use -n instead of ! -z.


In run-kurtosis-check.sh line 111:
  done <<< $(echo "$tasks")
           ^--------------^ SC2046 (warning): Quote this to prevent word splitting.
           ^--------------^ SC2116 (style): Useless echo? Instead of 'cmd $(echo foo)', just use 'cmd foo'.


In run-kurtosis-check.sh line 128:
  tests=$(curl -s ${assertoor_url}/api/v1/test_runs | jq -c ".data[] | {run_id, test_id, name, status}")
                  ^--------------^ SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean: 
  tests=$(curl -s "${assertoor_url}"/api/v1/test_runs | jq -c ".data[] | {run_id, test_id, name, status}")


In run-kurtosis-check.sh line 129:
  while read test; do
        ^--^ SC2162 (info): read without -r will mangle backslashes.

.dockerignore

run-kurtosis-check.sh

advaita-saha · 2024-06-05T21:10:42Z

Removed the need for pushing the temporary image built to dockerhub, now it will be able to use a local docker image

tersec · 2024-06-06T23:35:09Z

One general concern is, adding the Kurtosis tests right now means CI will always be failing. While this is to some extent an accurate reflection of, well, nimbus-eth1 doesn't really function as an EL currently, the CI also functions as

a set of regression tests so at least what works now doesn't stop working without people noticing; and
when people look at a list of commits or PRs, it results in everything looking "failing". Which, see above, and it should be made to work when it can be, but until it's actionable information, it just masks other CI failures in these summary views

Merging this PR as-is would hinder these goals. We've used a few approaches here:

not running it CI (but it's kind of useful to do so, good to make sure it stays working)
reduce the standards it needs to run, so it runs and meets a lower standard (e.g., just finalizes in a way that's allowed to be optimistic, for example, and does not contain any odd Kurtosis-injected behavior)
add the moral equivalent of || true to the end of the relevant parts of the scripts, with a noted TODO to fix it
some kind of "skip" status, not sure if that's well supported by GitHub Actions.

So for example

Check if all clients propose blocks with legacy EOA transactions

10 EOA transactions per block

check block proposals

Execute all opcodes as contract deployment

Execute all opcodes as contract call

Check precompile calls

Check chain stability ( finality checks )

There's a lot of flexibility here, which is useful -- might be able to find a useful subset of those which work today, and merge it with those working (to get the regression properties in master), and then create a PR on top of which adds the next checks one wants to get green in CI, then merge that when it is.

But arguably, just adding guaranteed-fails to the CI is a regression unless otherwise mitigated.

Apropos the the shellcheck warnings, at least some of them are worth addressing:

the specific usage of a bash variable before declaration was removed, but that's something I noticed by hand, and it would still be better to add set -eu (or the other "bash strict modes" to taste, set -euo pipefail, etc, but the specific error would have been caught automatically by set -eu)
some of the things it points out should be quoted probably really should be quoted. Decimal numbers aren't likely to suddenly become bash-unfriendly, but $assertoor_container, well, yes, that might; there's no intrinsic reason the those container names must be bash-friendly, nor is there there an intrinsic reason URLs are. Just an example, but it might be easier to just unconditionally add all the quotes it suggests. Sometimes one is intentionally aiming for some wildcard/globbing behavior, but that doesn't appear to be the case here.

We've run into random "standard task doesn't work when a user on Windows with, horror, spaces in their path tries to build/etc" things it's worth being careful about the quoting. Considered creating a lint job which runs/enforces some version of shellcheck.

arnetheduck · 2024-06-07T04:57:25Z

CI will always be failing.

we can run them without reporting the error to the runner - or even beer, report an error when they start working so we know when that happens

tersec · 2024-06-07T09:42:55Z

CI will always be failing.

we can run them without reporting the error to the runner - or even beer, report an error when they start working so we know when that happens

Yeah, especially if there's no useful subset that is interesting. But it depends where it fails. If all but one of the listed Kurtosis checks works, that's might be worth genuinely checking as part of CI. I suspect this is not the case though:

time="2024-06-05T21:53:59Z" level=warning msg="upstream client error: beacon node is optimistic, retrying in 10 sec..." client=1-nimbus-nimbus module=consensus
time="2024-06-05T21:53:59Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:53:59Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:04Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:54:04Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:09Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:54:09Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:09Z" level=warning msg="upstream client error: beacon node is optimistic, retrying in 10 sec..." client=1-nimbus-nimbus module=consensus
time="2024-06-05T21:54:14Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:54:14Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:19Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:19Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:54:19Z" level=warning msg="upstream client error: beacon node is optimistic, retrying in 10 sec..." client=1-nimbus-nimbus module=consensus

and since most/all of the listed tests depend on the engine API and non-optimistic block processing to be interesting (in the context of nimbus-eth1, whether nimbus-eth2 can maintain an optimistic sync without an EL isn't especially interesting), and this appears not to be functioning from the beginning, agree that right now running and triggering error on success would be a good approach.

.github/workflows/kurtosis.yml

Dockerfile

run-kurtosis-check.sh

.github/workflows/kurtosis.yml

advaita-saha added 2 commits June 2, 2024 11:39

feat: kurtosis local testing script

42ea85c

fix: updated correct copyright year

a058aa8

advaita-saha force-pushed the kurtosis-integration branch from c3ee850 to a058aa8 Compare June 2, 2024 06:09

fix: remove unwanted files, and introduce a post script cleanup

ef3010d

tersec reviewed Jun 3, 2024

View reviewed changes

run-kurtosis-check.sh Outdated Show resolved Hide resolved

tersec reviewed Jun 3, 2024

View reviewed changes

kurtosis-network-params.yml Outdated Show resolved Hide resolved

tersec reviewed Jun 3, 2024

View reviewed changes

Dockerfile Outdated Show resolved Hide resolved

tersec reviewed Jun 3, 2024

View reviewed changes

Dockerfile Outdated Show resolved Hide resolved

tersec reviewed Jun 3, 2024

View reviewed changes

run-kurtosis-check.sh Outdated Show resolved Hide resolved

tersec reviewed Jun 3, 2024

View reviewed changes

.dockerignore Outdated Show resolved Hide resolved

tersec reviewed Jun 3, 2024

View reviewed changes

run-kurtosis-check.sh Show resolved Hide resolved

advaita-saha added 6 commits June 3, 2024 12:43

fix: remove placeholder image-name & unwanted lines

265130b

fix: check for jq shifted from dpkg

7227077

changed copyright year to 2024

1272815

add optimizations and optional features

8ddc38d

feat: kurtosis CI for eth1-eth2 interop

cfbcbe9

fix: added to run on PRs

4bd33d7

advaita-saha marked this pull request as ready for review June 5, 2024 21:16

fix: curl installation missing in docker

57a52cc

advaita-saha requested a review from tersec June 6, 2024 09:35