Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kurtosis testing for nimbus eth1 and eth2 #2281

Merged
merged 14 commits into from
Jun 10, 2024
Merged

Kurtosis testing for nimbus eth1 and eth2 #2281

merged 14 commits into from
Jun 10, 2024

Conversation

advaita-saha
Copy link
Contributor

@advaita-saha advaita-saha commented Jun 2, 2024

Fixes #172
The kurtosis package that will spin up a private Ethereum testnet over Docker or Kubernetes with multi-client support, Flashbot's mev-boost infrastructure for PBS-related testing/validation, and other useful network tools (transaction spammer, monitoring tools, etc). Kurtosis packages are entirely reproducible and composable, so this will work the same way over Docker or Kubernetes, in the cloud or locally on your machine. For more details https://github.com/kurtosis-tech/ethereum-package

This PR adds easier support to spin up test networks and run testnets both locally and over the github CI. For running the testnet and then the checks run the run-kurtosis-check.sh script. This will spin up a testnet with your local nimbus-eth1 build and pull the nimbus-eth2 build from statusim dockerhub, then run the following tests

  • Check if atleast one client is ready

  • Check if all EL and CL clients are synced

    • Check if CL clients are synced
    • Check if EL clients are synced
  • Check consensus finality

  • Check consensus attestation stats

  • Check consensus reorgs

  • Check consensus forks

  • Check if all clients propose blocks with legacy EOA transactions

  • 10 EOA transactions per block

  • check block proposals

  • Execute all opcodes as contract deployment

  • Execute all opcodes as contract call

  • Check precompile calls

  • Check chain stability ( finality checks )

The networks configs can be easily changes and multi-client testing can be introduced in this whole setup only, by just changing the network configs in kurtosis-network-params.yml

NOTE: This kurtosis CI check is now set to fail, when the checks pass

run-kurtosis-check.sh Outdated Show resolved Hide resolved
kurtosis-network-params.yml Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
run-kurtosis-check.sh Outdated Show resolved Hide resolved
@tersec
Copy link
Contributor

tersec commented Jun 3, 2024

In general, it's useful to run these shell script through shellcheck. Some of what it turns up (useless use of cat) is essentially cosmetic/aesthetic, and some of it is in practice probably harmless (using $foo rather than "$foo" when it's a URL which will typically be well-behaved regarding spaces, etc), but still worth running it and seeing if it's worth addressing, e.g., a sample of its output:

In run-kurtosis-check.sh line 42:
cat kurtosis-network-params.yml | envsubst > assertoor.yaml
    ^-------------------------^ SC2002 (style): Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.


In run-kurtosis-check.sh line 72:
docker logs -f $assertoor_container &
               ^------------------^ SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean: 
docker logs -f "$assertoor_container" &


In run-kurtosis-check.sh line 76:
  tasks=$(curl -s ${assertoor_url}/api/v1/test_run/$1 | jq -c ".data.tasks[] | {index, parent_index, name, title, status, result}")
                  ^--------------^ SC2086 (info): Double quote to prevent globbing and word splitting.
                                                   ^-- SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean: 
  tasks=$(curl -s "${assertoor_url}"/api/v1/test_run/"$1" | jq -c ".data.tasks[] | {index, parent_index, name, title, status, result}")


In run-kurtosis-check.sh line 80:
  while read task; do
        ^--^ SC2162 (info): read without -r will mangle backslashes.


In run-kurtosis-check.sh line 90:
    if [ ! -z "$task_graph" ]; then
         ^-- SC2236 (style): Use -n instead of ! -z.


In run-kurtosis-check.sh line 111:
  done <<< $(echo "$tasks")
           ^--------------^ SC2046 (warning): Quote this to prevent word splitting.
           ^--------------^ SC2116 (style): Useless echo? Instead of 'cmd $(echo foo)', just use 'cmd foo'.


In run-kurtosis-check.sh line 128:
  tests=$(curl -s ${assertoor_url}/api/v1/test_runs | jq -c ".data[] | {run_id, test_id, name, status}")
                  ^--------------^ SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean: 
  tests=$(curl -s "${assertoor_url}"/api/v1/test_runs | jq -c ".data[] | {run_id, test_id, name, status}")


In run-kurtosis-check.sh line 129:
  while read test; do
        ^--^ SC2162 (info): read without -r will mangle backslashes.

.dockerignore Outdated Show resolved Hide resolved
@advaita-saha
Copy link
Contributor Author

Removed the need for pushing the temporary image built to dockerhub, now it will be able to use a local docker image

@advaita-saha advaita-saha marked this pull request as ready for review June 5, 2024 21:16
@advaita-saha advaita-saha requested a review from tersec June 6, 2024 09:35
@tersec
Copy link
Contributor

tersec commented Jun 6, 2024

One general concern is, adding the Kurtosis tests right now means CI will always be failing. While this is to some extent an accurate reflection of, well, nimbus-eth1 doesn't really function as an EL currently, the CI also functions as

  • a set of regression tests so at least what works now doesn't stop working without people noticing; and
  • when people look at a list of commits or PRs, it results in everything looking "failing". Which, see above, and it should be made to work when it can be, but until it's actionable information, it just masks other CI failures in these summary views

Merging this PR as-is would hinder these goals. We've used a few approaches here:

  • not running it CI (but it's kind of useful to do so, good to make sure it stays working)
  • reduce the standards it needs to run, so it runs and meets a lower standard (e.g., just finalizes in a way that's allowed to be optimistic, for example, and does not contain any odd Kurtosis-injected behavior)
  • add the moral equivalent of || true to the end of the relevant parts of the scripts, with a noted TODO to fix it
  • some kind of "skip" status, not sure if that's well supported by GitHub Actions.

So for example

Check if all clients propose blocks with legacy EOA transactions

10 EOA transactions per block

check block proposals

Execute all opcodes as contract deployment

Execute all opcodes as contract call

Check precompile calls

Check chain stability ( finality checks )

There's a lot of flexibility here, which is useful -- might be able to find a useful subset of those which work today, and merge it with those working (to get the regression properties in master), and then create a PR on top of which adds the next checks one wants to get green in CI, then merge that when it is.

But arguably, just adding guaranteed-fails to the CI is a regression unless otherwise mitigated.

Apropos the the shellcheck warnings, at least some of them are worth addressing:

  • the specific usage of a bash variable before declaration was removed, but that's something I noticed by hand, and it would still be better to add set -eu (or the other "bash strict modes" to taste, set -euo pipefail, etc, but the specific error would have been caught automatically by set -eu)

  • some of the things it points out should be quoted probably really should be quoted. Decimal numbers aren't likely to suddenly become bash-unfriendly, but $assertoor_container, well, yes, that might; there's no intrinsic reason the those container names must be bash-friendly, nor is there there an intrinsic reason URLs are. Just an example, but it might be easier to just unconditionally add all the quotes it suggests. Sometimes one is intentionally aiming for some wildcard/globbing behavior, but that doesn't appear to be the case here.

We've run into random "standard task doesn't work when a user on Windows with, horror, spaces in their path tries to build/etc" things it's worth being careful about the quoting. Considered creating a lint job which runs/enforces some version of shellcheck.

@arnetheduck
Copy link
Member

arnetheduck commented Jun 7, 2024

CI will always be failing.

we can run them without reporting the error to the runner - or even beer, report an error when they start working so we know when that happens

@tersec
Copy link
Contributor

tersec commented Jun 7, 2024

CI will always be failing.

we can run them without reporting the error to the runner - or even beer, report an error when they start working so we know when that happens

Yeah, especially if there's no useful subset that is interesting. But it depends where it fails. If all but one of the listed Kurtosis checks works, that's might be worth genuinely checking as part of CI. I suspect this is not the case though:

time="2024-06-05T21:53:59Z" level=warning msg="upstream client error: beacon node is optimistic, retrying in 10 sec..." client=1-nimbus-nimbus module=consensus
time="2024-06-05T21:53:59Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:53:59Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:04Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:54:04Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:09Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:54:09Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:09Z" level=warning msg="upstream client error: beacon node is optimistic, retrying in 10 sec..." client=1-nimbus-nimbus module=consensus
time="2024-06-05T21:54:14Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:54:14Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:19Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_execution_sync_status taskidx=5
time="2024-06-05T21:54:19Z" level=info msg="Check result: false, Failed Clients: [1-nimbus-nimbus]" RunID=1 TestID=external-0 module=test task=check_consensus_sync_status taskidx=4
time="2024-06-05T21:54:19Z" level=warning msg="upstream client error: beacon node is optimistic, retrying in 10 sec..." client=1-nimbus-nimbus module=consensus

and since most/all of the listed tests depend on the engine API and non-optimistic block processing to be interesting (in the context of nimbus-eth1, whether nimbus-eth2 can maintain an optimistic sync without an EL isn't especially interesting), and this appears not to be functioning from the beginning, agree that right now running and triggering error on success would be a good approach.

Dockerfile Outdated Show resolved Hide resolved
run-kurtosis-check.sh Outdated Show resolved Hide resolved
run-kurtosis-check.sh Outdated Show resolved Hide resolved
run-kurtosis-check.sh Outdated Show resolved Hide resolved
@tersec tersec merged commit 5fca57a into master Jun 10, 2024
8 checks passed
@tersec tersec deleted the kurtosis-integration branch June 10, 2024 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create a sharable debugging and testing environment for Ethereum
3 participants