Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync to Goerli #687

Closed
3 tasks
jlokier opened this issue May 27, 2021 · 10 comments
Closed
3 tasks

Sync to Goerli #687

jlokier opened this issue May 27, 2021 · 10 comments
Labels
Sync Prevents or affects sync with Ethereum network

Comments

@jlokier
Copy link
Contributor

jlokier commented May 27, 2021

We have a number of sync issues, but here's one that sets a simple goal that comes up often: Sync to goerli.
The goal is:

  • Nimbus eth1 to bulk sync to Ethereum Goerli state in any acceptable sync mode (fast, full, snap, beam, etc)
  • Track real-time chain updates as they happen, including reorgs to a new winning branch
  • Performs chain validation and PoA consensus, on bulk and real-time updates

Not a goal for this issue:

  • We do not require block mining or transaction pool for this goal

I believe Nimbus has synced with goerli before, but time moved on and it won't do so at the moment. There is also excessive space usage at the moment, though because it's Goerli it's managable.

@jlokier jlokier added the Sync Prevents or affects sync with Ethereum network label May 27, 2021
@jangko
Copy link
Contributor

jangko commented Sep 21, 2021

Finally, after premix debugging tools is fixed, I found a legit bug at block number 4_463_713. This is not far from the beginning of Berlin(4_460_644).

The validation code complains about:

DBG 2021-09-21 12:37:36.427+07:00 wrong receiptRoot in block                 tid=12308 blockNumber=4463713 actual=FBAFEA49FD186AFE450FEC9DDD1E8CD19A6511C946C280423A79F1A5034A6B6C expected=382CC45D2242719F9F612409324273EC42140174F38E0BC9E38022F8326E84CD

This is the receipt of transaction no 16:

  {
    "index": 16,
    "type": 0,
    "isHash": false,
    "status": false,   ## this is true in Geth's receipt
    "hash": "0x0000000000000000000000000000000000000000000000000000000000000000",
    "cumulativeGasUsed": 1458153,
    "bloom": "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
    "logs": []
  },

And why nimbus produce wrong EVM status code, this need deeper investigation. but at least we know where to start and we have the data to reproduce this bug.

@jangko
Copy link
Contributor

jangko commented Sep 22, 2021

the bug is fixed in #842, let's see if it can sync to the tip of london

@jlokier
Copy link
Contributor Author

jlokier commented Sep 28, 2021

I confirm fix #842 makes it pass block 4463617 now on my instance. It raises some questions though:

  1. Why does chain sync stop at block 4463617 when the excution error is on block 4463713?

    This is a problem in the chain sync code, it should progress to the best block that validates, i.e. 4463712. We can clearly see it works in batches, but it should handle a partial batch.

  2. Why does stateRoot pass?

    Gas consumption usually affects account balance. Do we have a problem in stateRoot validation, or it correct in this case?

@jangko
Copy link
Contributor

jangko commented Sep 29, 2021

copied from discord discussion:

  1. I'm not entirely sure why
  2. Why does stateRoot pass? because provided gasLimit is exactly the amount consumed by evm execution. evm error will consume all gas too. no matter if there is evm error or not, all available gas will be consumed by this tx. the only difference is in receipt's status.

@jangko
Copy link
Contributor

jangko commented Sep 29, 2021

pineapple — 29 Sept 2021 6:31 AM
Not that far for me. nimbus --goerli now stops due to some block or blocks in the range 4494913..4495040. I get nowhere near 5080941. I guess you are testing in a different way? (edited)

so we have two new places where sync stop:

  • 4494913..4495040 -> via nimbus --goerli -- It's a networking logic error in p2p/blockchain_sync.nim, now solved
  • 5080941-> via premix/persist + geth

@jangko
Copy link
Contributor

jangko commented Sep 29, 2021

bug in block 5080941 turn out to be another area not covered by ethereum/tests.
it's a CREATE/CREATE2 opcode bug. all nim-evm, evmc, and vm2 are suffered from this bug. but evmc-evm also reveals another potential uncovered area by ethereum/tests.

CREATE/CREATE2 opcode in nim-evm and vm2:

buggy:
  ...
  if child.isSuccess:
     ...
  else:         
     c.returnData = child.output

fixed:
  ...
  if child.isSuccess:
     ...
  elif not child.error.burnsGas: # OP REVERT EXECUTED
     # set returnData if last child's opcode is REVERT
     c.returnData = child.output

CREATE/CREATE2 opcode in evmc-evm

buggy:
   ...
   c.returnData = @(makeOpenArray(c.res.outputData, c.res.outputSize.int))  # <--- always set `returnData`
   ...

fixed:
  ...
  if c.res.status_code == EVMC_REVERT:
     # set returnData if last child's opcode is REVERT
     c.returnData = @(makeOpenArray(c.res.outputData, c.res.outputSize.int))
  ...

@jlokier
Copy link
Contributor Author

jlokier commented Sep 29, 2021

By sheer coincidence that is exactly the issue I was working through the first two days of this week, when refactoring for Beam sync, asynchronous EVM execution and evmone compatibility.

I had some local ethereum/tests failures after the refactor, while I tried to follow the specs and reconcile with the existing code. And the code in writeContract just looked wrong to me. So rather than setting c.returnData, I've looked at what should be stored in c.res.outputData by writeContract - and that's still a place which should be changed.

The c.returnData assignment doesn't need to be changed when writeContract is correct.

I ended up carefully going through EIP-2, EIP-140, EIP-170 and EIP-211 to work out what the behaviour of writeContract and afterExecCreate really should be. An unclear part of the spec language is whether "the failure data in the failure case" (EIP-211) includes failure to write the contract due to EIP-170 (or any other constraints).

There are actually about 20 tests in ethereum/tests which cover interactions in this area. But evidently, as we pass those, a corner case has been missed which occurs in Goerli block 5080941.

@jlokier
Copy link
Contributor Author

jlokier commented Sep 29, 2021

I'm unable to test block 5080941 yet, because since passing 4494721..4495040 yesterday, even on a fast machine and network, due to various other crashes, non-deterministic networking issues, etc. it's only up to 4787521. So it will be a while before I can be sure if my changes are correct for block 5080941 (moving writeContract and afterExecCreate from the EVM to the host side).

@jangko
Copy link
Contributor

jangko commented Sep 30, 2021

I agree the logic in writeContract and afterExecCreate is convoluted and the branching of if is hard to follow. It will be nice if we can make it less convoluted.

If the spec language is not clear, we can always see what the implementation language does.

geth create/create2 opcode:

  ...
  if suberr == ErrExecutionReverted {
    return res, nil       # keep child's output data when revert executed
  }
  return nil, nil   # no output data on success, halt, or, exception

besu also do similar thing, only revert keeps the output data, while other conditions will remove output data.

  private void exceptionalHalt(final MessageFrame frame) {
    clearAccumulatedStateBesidesGasAndOutput(frame);
    frame.clearGasRemaining();
    frame.clearOutputData();
    frame.setState(MessageFrame.State.COMPLETED_FAILED);
  }

   protected void revert(final MessageFrame frame) {
    clearAccumulatedStateBesidesGasAndOutput(frame);
    frame.setState(MessageFrame.State.COMPLETED_FAILED);
  }

  private void completedSuccess(final MessageFrame frame) {
    frame.getWorldUpdater().commit();
    frame.getMessageFrameStack().removeFirst();  # this operation remove output data
    frame.notifyCompletion();
  }

  private void completedFailed(final MessageFrame frame) {
    frame.getMessageFrameStack().removeFirst();  # this operation remove output data
    frame.notifyCompletion();
  }

@tersec
Copy link
Contributor

tersec commented May 25, 2024

@tersec tersec closed this as completed May 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Sync Prevents or affects sync with Ethereum network
Projects
None yet
Development

No branches or pull requests

3 participants