publishBlockV2 fails gossip validation for valid block #6205

Closed
nflaig opened this issue Apr 16, 2024 · 7 comments

Comments

@nflaig

nflaig commented Apr 16, 2024

Describe the bug

I am testing Nimbus BN with Lodestar VC on Kurtosis. The previous issue (#6176) is now fixed on the unstable branch (c5f04dd), but there is another issue during block publishing. It only happens with Nimbus from the unstable branch; the stable branch works.

This is the error logged on the Nimbus BN

WRN 2024-04-16 12:01:56.939+00:00 Block failed validation                    topics="beacval" blockRoot=00000000 blck="(slot: 1, proposer_index: 1147, parent_root: \"515bb412\", state_root: \"145b6448\", eth1data: (deposit_root: d70a234731285c6804c2a4f56711ddb8c82c99740f207854891028af34e27e5e, deposit_count: 0, block_hash: 6903b6723e1921a599446c55bbb140dac226f150d5ba737bd94456171c8b3ad9), graffiti: \"5-geth-nimbus-lodestar\", proposer_slashings_len: 0, attester_slashings_len: 0, attestations_len: 0, deposits_len: 0, voluntary_exits_len: 0, sync_committee_participants: 0, block_number: 1, block_hash: \"0x8e132997d62f55007d8a1d67fe7eca60c5f2f738dbe47e1790f582db2cf25efe\", parent_hash: \"0x6903b6723e1921a599446c55bbb140dac226f150d5ba737bd94456171c8b3ad9\", fee_recipient: \"0x8943545177806ed17b9f23f0a21ee5948ecaa776\", bls_to_execution_changes_len: 0, blob_kzg_commitments_len: 0)" signature=aefb60f2 error="(Reject, \"BeaconBlock: Invalid proposer signature\")"

From the looks of it, the block is failing gossip validation. The publish request made by Lodestar looks like this:

Apr 16 12:01:56.928 2024 REQUEST /eth/v2/beacon/blocks?broadcast_validation=gossip
{
  "message": {
    "slot": "1",
    "proposer_index": "1147",
    "parent_root": "0x515bb41259ab0902ad3e5ec658327cd25f387dfd5cf7763f7cd593060e213e7f",
    "state_root": "0x145b6448f76dd454df1b339ae4731e6e30ce73b2c8ea4a008f290899de2c339a",
    "body": {
      "randao_reveal": "0x87b1cce981a31d800402667eee9c7de84de72c0a01ffbe8744106aed6c50660faa5faed1e5846a08207202784bb28105054d9425621e8b56db41dde74d2b6ab775f4dac84bb43593db34f65570ba296b025a86da859dd9a5097c4a083b0932c5",
      "eth1_data": {
        "deposit_root": "0xd70a234731285c6804c2a4f56711ddb8c82c99740f207854891028af34e27e5e",
        "deposit_count": "0",
        "block_hash": "0x6903b6723e1921a599446c55bbb140dac226f150d5ba737bd94456171c8b3ad9"
      },
      "graffiti": "0x352d676574682d6e696d6275732d6c6f64657374617200000000000000000000",
      "proposer_slashings": [],
      "attester_slashings": [],
      "attestations": [],
      "deposits": [],
      "voluntary_exits": [],
      "sync_aggregate": {
        "sync_committee_bits": "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
        "sync_committee_signature": "0xc00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
      },
      "execution_payload": {
        "parent_hash": "0x6903b6723e1921a599446c55bbb140dac226f150d5ba737bd94456171c8b3ad9",
        "fee_recipient": "0x8943545177806ED17B9F23F0a21ee5948eCaa776",
        "state_root": "0xaa65ab8098cfddd2a7e893480584e2a331ea7be5a91afede0bce3b8e4515e643",
        "receipts_root": "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
        "logs_bloom": "0x
        "prev_randao": "0x6903b6723e1921a599446c55bbb140dac226f150d5ba737bd94456171c8b3ad9",
        "block_number": "1",
        "gas_limit": "25024413",
        "gas_used": "0",
        "timestamp": "1713268916",
        "extra_data": "0xd883010d0e846765746888676f312e32312e37856c696e7578",
        "base_fee_per_gas": "875000000",
        "block_hash": "0x8e132997d62f55007d8a1d67fe7eca60c5f2f738dbe47e1790f582db2cf25efe",
        "transactions": [],
        "withdrawals": []
      },
      "bls_to_execution_changes": []
    }
  },
  "signature": "0xaefb60f20fce54c1c9a4669f2a5d7b9fda279b602b60221d57befcb2d78358232542494ab89530a21d72229784e1b3d4009f0f04967a0e19ead7cb45a9f38739cc17cdcdd148701d7160886816432e183339ecc715f2d4203c8cf19d9366e800"
}
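
One way to confirm that the block in the Nimbus warning is the same one Lodestar submitted is to compare the abbreviated fields from the log with the full values in the request. A minimal Python sketch, assuming the log abbreviates roots and signatures to their first four bytes and prints the graffiti as zero-padded UTF-8 text (both helper names are hypothetical):

```python
def strip_0x(value: str) -> str:
    """Drop a leading 0x prefix if present."""
    return value[2:] if value.startswith("0x") else value


def short_hex(value: str, num_bytes: int = 4) -> str:
    """Return the first num_bytes of a hex string, without the 0x prefix."""
    return strip_0x(value)[: num_bytes * 2]


def decode_graffiti(value: str) -> str:
    """Decode a 32-byte hex graffiti into text, dropping the zero padding."""
    raw = bytes.fromhex(strip_0x(value))
    return raw.rstrip(b"\x00").decode("utf-8", errors="replace")


# Values copied from the publish request.
parent_root = "0x515bb41259ab0902ad3e5ec658327cd25f387dfd5cf7763f7cd593060e213e7f"
state_root = "0x145b6448f76dd454df1b339ae4731e6e30ce73b2c8ea4a008f290899de2c339a"
graffiti = "0x352d676574682d6e696d6275732d6c6f64657374617200000000000000000000"
signature_start = "0xaefb60f20fce54c1"  # leading bytes only, shortened for brevity

print(short_hex(parent_root))      # 515bb412
print(short_hex(state_root))       # 145b6448
print(short_hex(signature_start))  # aefb60f2
print(decode_graffiti(graffiti))   # 5-geth-nimbus-lodestar
```

All four values match the `parent_root`, `state_root`, `signature` and `graffiti` fields in the warning, so the rejected block appears to be the one from this request.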

And the response from Nimbus BN

Apr 16 12:01:56.939 2024 RESPONSE (status 503 Service Unavailable)
{
  "code": 503,
  "message": "Beacon node is currently syncing and not serving request on that endpoint",
  "stacktraces": [
    "BeaconBlock: Invalid proposer signature"
  ]
}
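
Note that the 503 body is misleading here: `message` talks about syncing, while the actual rejection reason only shows up in `stacktraces`. A client that wants to log the real cause could prefer the `stacktraces` entries when they are present. A minimal Python sketch; `describe_error` is a hypothetical helper and not part of any client:

```python
import json


def describe_error(body: str) -> str:
    """Summarize an error response, preferring stacktraces over the message."""
    payload = json.loads(body)
    code = payload.get("code")
    message = payload.get("message", "")
    # Keep only non-empty stacktrace entries; the field may be absent.
    traces = [t for t in (payload.get("stacktraces") or []) if t]
    detail = "; ".join(traces) if traces else message
    return f"{code}: {detail}"


# Body copied from the Nimbus response.
response_body = """
{
  "code": 503,
  "message": "Beacon node is currently syncing and not serving request on that endpoint",
  "stacktraces": ["BeaconBlock: Invalid proposer signature"]
}
"""

print(describe_error(response_body))  # 503: BeaconBlock: Invalid proposer signature
```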

To Reproduce
Steps to reproduce the behavior:

Kurtosis config file

participants:
# ...
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: nimbus
    cl_image: ethpandaops/nimbus-eth2:unstable-c5f04dd
    vc_type: lodestar
    vc_image: chainsafe/lodestar:next
    # vc_image: nflaig/lodestar:use-publish-v1 <-- using publish v1 api solves the issue
    count: 1
# ...

Additional context

The issue seems to be related to publishBlockV2 only, forcing Lodestar VC to use publishBlock (v1) resolves the issue.
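
For reference, the two publish routes differ in the path version and, for v2, in the `broadcast_validation` query parameter seen in the request log. A minimal Python sketch of how the URLs could be built; `publish_block_url` and the base URL are hypothetical, and the v1 path is assumed from the standard Beacon API rather than taken from this issue:

```python
from typing import Optional
from urllib.parse import urlencode


def publish_block_url(
    base_url: str, version: int, broadcast_validation: Optional[str] = None
) -> str:
    """Build the publishBlock (v1) or publishBlockV2 (v2) URL."""
    if version not in (1, 2):
        raise ValueError(f"unsupported publish API version: {version}")
    url = f"{base_url.rstrip('/')}/eth/v{version}/beacon/blocks"
    # Only the v2 route takes the broadcast_validation query parameter.
    if version == 2 and broadcast_validation is not None:
        url += "?" + urlencode({"broadcast_validation": broadcast_validation})
    return url


base = "http://localhost:5052"  # assumed beacon node address
print(publish_block_url(base, 1))
print(publish_block_url(base, 2, "gossip"))
```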

@cheatfate
Contributor

This issue should be fixed in unstable.

@pinebit

pinebit commented May 28, 2024

When is the fix from the unstable branch going to be released?
Also, is this affecting Kurtosis only, or does it happen on real network(s)?
Thank you!

EDIT: we tested Nimbus BN with kurtosis for all combinations of VCs (various vendors) and this failed for them all.

@tersec
Contributor

tersec commented May 28, 2024

In theory, it's in https://github.com/status-im/nimbus-eth2/releases/tag/v24.5.1 as #6261

Is what you're seeing specifically still this issue or another cause of failure?

@nflaig
Author

nflaig commented May 28, 2024

> Is what you're seeing specifically still this issue or another cause of failure?

I gave this another try using statusim/nimbus-eth2:amd64-latest and I am still seeing the same error as reported in the issue.

@pinebit

pinebit commented May 29, 2024

The specific errors we saw are similar (if not identical) to those in the original description:

WRN 2024-05-28 13:25:19.254+00:00 Could not obtain blinded execution payload header topics="beacval" error="Unable to decode Deneb blinded header: Serialization error with HTTP status 204, Content-Type <missing> and content @[]" slot=138 validator_index=173 head=c8980f7d:137

NTC 2024-05-28 13:25:19.255+00:00 Payload builder error                      topics="beacval" slot=138 head=c8980f7d:137 validator=9592c95f err="Unable to decode Deneb blinded header: Serialization error with HTTP status 204, Content-Type <missing> and content @[]"

WRN 2024-05-28 13:25:19.279+00:00 Block failed validation                    topics="beacval" blockRoot=00000000 blck="(slot: 138, proposer_index: 173, parent_root: \"c8980f7d\", state_root: \"a36af555\", eth1data: (deposit_root: d70a234731285c6804c2a4f56711ddb8c82c99740f207854891028af34e27e5e, deposit_count: 0, block_hash: 1491705c0bc212c9be1c15f33c39cc6f6fdf58358d37040299770921e5d495ea), graffiti: \"charon/v1.1.0-dev-ddb95fd\", proposer_slashings_len: 0, attester_slashings_len: 0, attestations_len: 1, deposits_len: 0, voluntary_exits_len: 0, sync_committee_participants: 512, block_number: 78, block_hash: \"0x5d573185b1a5a78af8b4483a64ad8013e8ec746e3a89de8fd31c215e002a3aa5\", parent_hash: \"0x14ed1972d36be2d0960942e86eb2035f864c1e2706fe9d25ab5c1e2b690949f0\", fee_recipient: \"0x8943545177806ed17b9f23f0a21ee5948ecaa776\", bls_to_execution_changes_len: 0, blob_kzg_commitments_len: 0)" signature=b9c904e6 error="(Reject, \"BeaconBlock: Invalid proposer signature\")"

We (at Obol) use Nimbus on mainnet with no issues, so we concluded that the issue above is related to Kurtosis. We use Kurtosis intensively to test the compatibility of our product (Charon DV) against various BN/VC combinations from different vendors. Other vendors (Lighthouse, Teku, Lodestar, Prysm) work just fine, but not Nimbus. Therefore we suspect a bug in Nimbus that only affects the Kurtosis environment in some way.

I tried to build the mentioned unstable branch into a local Docker image, but the provided scripts failed to build for any target arch. For example, on a macOS ARM64 host:

What's Next?
  View a summary of image vulnerabilities and recommendations → docker scout quickview
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
==================STARTING BUILD==================
Build Tools = 0

PLATFORM=Linux_amd64
Building: Nim compiler
make[2]: *** [vendor/nimbus-build-system/makefiles/targets.mk:81: build-nim] Error 126
make[1]: *** [vendor/nimbus-build-system/makefiles/targets.mk:124: vendor/nimbus-build-system/vendor/Nim/bin/nim] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [vendor/nimbus-build-system/makefiles/targets.mk:116: update-common] Error 2
make: *** [dist-amd64] Error 2

Docker says the builder image is running as AMD64 under Rosetta. I quickly checked the build script and did not find any obvious switches to change this behavior. I also tried to build for the AMD64 target platform, but without success: it fails with the same errors.

@nflaig
Author

nflaig commented May 29, 2024

Did some more testing: if I force Lodestar to use a JSON body to submit the block to publishBlockV2, the error is gone.

participants:
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lodestar
    cl_image: nflaig/lodestar:ssz-api-json-publish
    vc_type: nimbus
    vc_image: statusim/nimbus-validator-client:amd64-latest
    count: 2
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: nimbus
    cl_image: statusim/nimbus-eth2:amd64-latest
    vc_type: lodestar
    vc_image: nflaig/lodestar:ssz-api-json-publish
    count: 2
# ...

With this setup, the block production assertions passed, the chain finalized, and there were no missed proposals.
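
The only difference between the failing and the working setup is the encoding of the request body. A rough Python sketch of the headers involved, assuming the standard Beacon API conventions (`application/json` for JSON, `application/octet-stream` for SSZ, plus an `Eth-Consensus-Version` header naming the fork); `publish_headers` is a hypothetical helper, not Lodestar code, and `deneb` is an assumption about the fork in use:

```python
def publish_headers(wire_format: str, fork: str) -> dict:
    """Return the HTTP headers for a publishBlockV2 request body."""
    content_types = {
        "json": "application/json",
        "ssz": "application/octet-stream",
    }
    if wire_format not in content_types:
        raise ValueError(f"unknown wire format: {wire_format}")
    return {
        "Content-Type": content_types[wire_format],
        # The fork name tells the beacon node how to decode the body.
        "Eth-Consensus-Version": fork,
    }


print(publish_headers("ssz", "deneb"))   # encoding used by the failing setup
print(publish_headers("json", "deneb"))  # encoding used by the working setup
```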

@pinebit

pinebit commented May 29, 2024

Great, I just did the same for our Charon DV by setting https://github.com/attestantio/go-eth2-client/blob/e02b07f2405232b26018a50a25d9fcd9ed75c205/http/parameters.go#L100 (because we use go-eth2-client), and this seems to have helped.
Leaving this for the Nimbus team to investigate why this is happening.

pk910 added a commit to ethpandaops/ethereum-package that referenced this issue Jun 11, 2024
Lodestar works with all clients now on `unstable` branch(es). 

There is just one exception which is related to publishing blocks to
Nimbus BN but that's an issue with all other VCs as well and seems to be
only happening when running via kurtosis as confirmed here
status-im/nimbus-eth2#6205 (comment).
This issue can be solved though by forcing Lodestar to publish blocks as
JSON, see
#664 (comment).


The kurtosis config I was using
```yaml
participants:
  # Lighthouse
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lodestar
    cl_image: chainsafe/lodestar:next
    vc_type: lighthouse
    vc_image: sigp/lighthouse:latest
    count: 1
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lighthouse
    cl_image: sigp/lighthouse:latest
    vc_type: lodestar
    vc_image: chainsafe/lodestar:next
    count: 1
  # Teku
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lodestar
    cl_image: chainsafe/lodestar:next
    vc_type: teku
    vc_image: consensys/teku:latest
    count: 1
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: teku
    cl_image: consensys/teku:latest
    vc_type: lodestar
    vc_image: chainsafe/lodestar:next
    count: 1
  # Nimbus
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lodestar
    cl_image: chainsafe/lodestar:next
    vc_type: nimbus
    vc_image: statusim/nimbus-validator-client:amd64-latest
    vc_extra_params:
      - --doppelganger-detection=off
    count: 1
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: nimbus
    cl_image: statusim/nimbus-eth2:amd64-latest
    vc_type: lodestar
    vc_image: chainsafe/lodestar:next
    vc_extra_params:
      - --http.requestWireFormat=json
    count: 1
  # Grandine
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: grandine
    cl_image: sifrai/grandine:stable
    vc_type: lodestar
    vc_image: chainsafe/lodestar:next
  # Prysm
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lodestar
    cl_image: nflaig/lodestar:ignore-empty-statuses
    vc_type: prysm
    # vc_image: gcr.io/prysmaticlabs/prysm/validator:latest
    vc_image: ethpandaops/prysm-validator:develop-dfe31c9
    count: 1
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: prysm
    # cl_image: gcr.io/prysmaticlabs/prysm/beacon-chain:latest
    cl_image: ethpandaops/prysm-beacon-chain:develop-dfe31c9
    vc_type: lodestar
    vc_image: chainsafe/lodestar:next
    count: 1
  # Lodestar stable
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lodestar
    cl_image: chainsafe/lodestar:next
    vc_type: lodestar
    vc_image: chainsafe/lodestar:latest
    count: 1
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lodestar
    cl_image: chainsafe/lodestar:latest
    vc_type: lodestar
    vc_image: chainsafe/lodestar:next
    count: 1
  # Lodestar ssz
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lodestar
    cl_image: chainsafe/lodestar:next
    vc_type: lodestar
    vc_image: chainsafe/lodestar:next
    vc_extra_params:
      - --http.requestWireFormat=ssz
    count: 1
  - el_type: geth
    el_image: ethereum/client-go:stable
    cl_type: lodestar
    cl_image: chainsafe/lodestar:next
    vc_type: lodestar
    vc_image: chainsafe/lodestar:next
    count: 1
network_params:
  genesis_delay: 120
  num_validator_keys_per_node: 64
launch_additional_services: true
additional_services:
  - assertoor
  - dora
snooper_enabled: false
disable_peer_scoring: true
assertoor_params:
  image: "ethpandaops/assertoor:master"
  run_stability_check: false
  run_block_proposal_check: false
  tests:
    - https://raw.githubusercontent.com/ethpandaops/assertoor-test/2a45f2f78dd2c336ac99bf15e61edc076f15ce67/assertoor-tests/block-proposal-check.yaml

```

---------

Co-authored-by: pk910 <[email protected]>