
[R4R]add sharedStorage for prefetching to L1 #792

Merged: 15 commits into bnb-chain:develop on Mar 28, 2022

Conversation

flywukong
Contributor

@flywukong flywukong commented Mar 11, 2022

Description

In the previous prefetch path for importing blocks, the prefetcher and the main process used completely separate state DBs: the prefetcher could only put the data it retrieved into the L3 layer, and if the main process wanted that data it still had to go through the L1-L2-L3 layers. The new design lets the main process and the prefetcher share a shared_pool attached to the StateDB. This pool stores the originStorage data (each state object has a corresponding entry), so prefetched data can be fetched directly at the L1 layer.
The shared_pool can be represented as the following two-level map structure. It is a private member variable of the StateDB, and each state object holds a pointer to its corresponding entry in the shared_pool.

[Figure: two-level map structure of the shared_pool]
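To make the diagram concrete, here is a minimal sketch of that two-level structure in Go. The names follow the ones used elsewhere in this PR (StoragePool, sharedMap); the constructor is illustrative, not necessarily the exact merged code.

package state

import (
	"sync"

	"github.com/ethereum/go-ethereum/common"
)

// StoragePool is shared between the main process's StateDB and the
// prefetcher's StateDB. The first level maps an account address to its
// entry; the second level (*sync.Map) maps storage slot keys to their
// origin values, so it is safe for concurrent use by both processes.
type StoragePool struct {
	sync.RWMutex
	sharedMap map[common.Address]*sync.Map
}

// NewStoragePool returns an empty pool ready to be attached to a StateDB.
func NewStoragePool() *StoragePool {
	return &StoragePool{
		sharedMap: make(map[common.Address]*sync.Map),
	}
}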

This PR also removes the overhead of TryPreload, which cost 2%-3% in the main process.

Rationale

Much of the data the main process previously had to fetch from the L3 layer can now be read directly from the L1 layer, which removes a large share of the L2 and L3 lookup overhead.

Performance

The two test machines are on AWS; both nodes run in full-sync mode and have caught up to the latest block.

This is the block-processing delay; the two machines running the master branch show nearly identical performance:
[Figure: block-processing delay, both nodes on master]

After updating one machine with this PR's code, the delay is clearly reduced (green curve); the average processing delay drops by about 7%:

[Figure: block-processing delay after applying the PR (green)]

The proportion of I/O overhead in the main process's total block cost dropped from 33% to 30.5%:
[Figure: I/O share of total block cost]

The measured overhead as a share of total block cost dropped from 3.67% to 1.1% with the PR code:
[Figure: overhead share of total block cost, before and after]

Example

add an example CLI or API response...

Changes

Notable changes:

  • add each change in a bullet point here
  • ...

@flywukong flywukong changed the title add sharedStorage for prefetching to L1 [WIP]add sharedStorage for prefetching to L1 Mar 11, 2022

@RenRick RenRick left a comment


LGTM

// sharedStorage is used to store maps of originStorage of stateObjects
type SharedStorage struct {
	poolLock   *sync.RWMutex
	shared_map map[common.Address]*sync.Map
}

"shared_map" needs to keep the same naming style with poolLock

Contributor Author

updated

 go func(start time.Time, followup *types.Block, throwaway *state.StateDB, interrupt *uint32) {
 	bc.prefetcher.Prefetch(followup, throwaway, bc.vmConfig, &followupInterrupt)
-}(time.Now(), block, throwaway, &followupInterrupt)
+}(time.Now(), block, statedb, &followupInterrupt)
Contributor

Would not suggest sharing the StateDB between the prefetcher and the main StateDB.
Use a prefetch DB instead; the main StateDB can access the prefetch DB to get the originStorage.

Contributor

The prefetch DB could be used as an L1.5 cache.

Contributor Author

@flywukong flywukong Mar 18, 2022

The StateDBs of the main process and the prefetcher are different; they only share the shared pool, which stores the originStorage maps. I have adjusted the sharedPool to act as an L1.5 cache in the new commit.
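To illustrate the L1.5 framing, here is a sketch of the resulting read path inside package state, based on the GetCommittedState excerpt shown later in this review; getOriginStorage and sharedOriginStorage are the names discussed below, and the exact merged code may differ.

// getOriginStorage checks the object's private originStorage cache (L1)
// first, then the shared prefetch pool (the L1.5 cache); on a hit the
// value is promoted into the private cache so later reads stay in L1.
func (s *StateObject) getOriginStorage(key common.Hash) (common.Hash, bool) {
	if value, cached := s.originStorage[key]; cached {
		return value, true
	}
	// Try to get it from the origin storage prefetch pool (L1.5).
	if s.sharedOriginStorage != nil {
		if val, ok := s.sharedOriginStorage.Load(key); ok {
			value := val.(common.Hash)
			s.originStorage[key] = value
			return value, true
		}
	}
	// Miss: the caller falls back to snapshot or trie (L2/L3).
	return common.Hash{}, false
}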

storage.RUnlock()
if !ok {
	m := new(sync.Map)
	storage.Lock()
Contributor

concurrent race condition?

Contributor Author

Yes, newObject called by the main process or the prefetcher will both call this function to check or update the shared pool.
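A common way to close this window is double-checked locking: re-check the map after upgrading from the read lock to the write lock, so that the main process and the prefetcher racing on the same address end up sharing a single *sync.Map. A sketch under that assumption, using the StoragePool shape from the description above, not the exact merged fix:

// getStorageMap returns the per-account shared map, creating it on first
// use. The second lookup under the write lock prevents two goroutines
// from installing different maps for the same address.
func (s *StoragePool) getStorageMap(address common.Address) *sync.Map {
	s.RLock()
	storageMap, ok := s.sharedMap[address]
	s.RUnlock()
	if ok {
		return storageMap
	}
	s.Lock()
	defer s.Unlock()
	// Re-check: the other goroutine may have inserted between the locks.
	if m, ok := s.sharedMap[address]; ok {
		return m
	}
	m := new(sync.Map)
	s.sharedMap[address] = m
	return m
}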

Collaborator

@unclezoro unclezoro left a comment

Please fix lint.

storageMap, ok := s.sharedMap[address]
s.RUnlock()
if !ok {
	m := new(sync.Map)
Contributor

Has race condition.

Contributor Author

updated

}
}

// Check whether the storage exist in pool,
Contributor

Would be better to follow the commenting style used in go-ethereum, like:

// Copy creates a deep, independent copy of the state.
// Snapshots of the copied state cannot be applied to the copy.
func (s *StateDB) Copy() *StateDB {

Contributor Author

updated

@unclezoro
Collaborator

LGTM

@richardrich975 richardrich975 left a comment

LGTM

@unclezoro unclezoro changed the title [WIP]add sharedStorage for prefetching to L1 [R4R]add sharedStorage for prefetching to L1 Mar 28, 2022
@unclezoro unclezoro merged commit 77cb056 into bnb-chain:develop Mar 28, 2022
@@ -204,7 +236,8 @@ func (s *StateObject) GetCommittedState(db Database, key common.Hash) common.Hash
 	if value, pending := s.pendingStorage[key]; pending {
 		return value
 	}
-	if value, cached := s.originStorage[key]; cached {
+	if value, cached := s.getOriginStorage(key); cached {
Contributor

Can leave it unchanged; the getOriginStorage API can be removed:

// Try to get it from origin storage prefetch pool
if s.sharedOriginStorage != nil {
	val, ok := s.sharedOriginStorage.Load(key)
	if ok {
		s.originStorage[key] = val.(common.Hash)
		return val.(common.Hash), true
	}
}

return common.Hash{}, false
}

func (s *StateObject) setOriginStorage(key common.Hash, value common.Hash) {
Contributor

No need to add a new API setOriginStorage either.

@@ -0,0 +1,37 @@
package state
Contributor

How about renaming the file to shared_storage_pool.go? shared_pool.go is a bit confusing: shared what?

@@ -1633,3 +1575,7 @@ func (s *StateDB) GetDirtyAccounts() []common.Address {
 	}
 	return accounts
 }
+
+func (s *StateDB) GetStorage(address common.Address) *sync.Map {
Contributor

No need to add GetStorage;
GetStorage is paired with SetStorage, which is for fakeStorage.
Do not occupy the name.

@@ -79,7 +80,9 @@ type StateObject struct {
 	trie Trie // storage trie, which becomes non-nil on first access
 	code Code // contract bytecode, which gets set when code is loaded

 	originStorage       Storage   // Storage cache of original entries to dedup rewrites, reset for every transaction
+	sharedOriginStorage *sync.Map // Storage cache of original entries to dedup rewrites, reset for every transaction
Contributor

The comment "// Storage cache of original entries to dedup rewrites, reset for every transaction" is for originStorage; please add a new comment for sharedOriginStorage.

@@ -120,14 +123,21 @@ func newObject(db *StateDB, address common.Address, data Account) *StateObject {
 	if data.Root == (common.Hash{}) {
 		data.Root = emptyRoot
 	}
+	var storageMap *sync.Map
Contributor

Rename storageMap to sharedStorage; the storage is a map by itself.

@@ -101,6 +98,8 @@ type StateDB struct {
 	stateObjectsPending map[common.Address]struct{} // State objects finalized but not yet written to the trie
 	stateObjectsDirty   map[common.Address]struct{} // State objects modified in the current execution

+	storagePool *StoragePool // sharedPool to store L1 originStorage of stateObjects
Contributor

name: storagePool -> originStoragePool to make it clear.

@@ -101,6 +98,8 @@ type StateDB struct {
 	stateObjectsPending map[common.Address]struct{} // State objects finalized but not yet written to the trie
 	stateObjectsDirty   map[common.Address]struct{} // State objects modified in the current execution

+	storagePool          *StoragePool // sharedPool to store L1 originStorage of stateObjects
+	writeOnSharedStorage bool         // Write to the shared origin storage of a stateObject while reading from the underlying storage layer.
Contributor

rename writeOnSharedStorage to updateSharedOriginStorage

unclezoro pushed a commit that referenced this pull request Mar 29, 2022
* add sharedStorage for prefetching to L1

* remote originStorage in stateObjects

* fix core

* fix bug of sync map

* remove read lock when get & set keys

* statedb copy use CopyWithSharedStorage

* reduce lock access

* fix comment

* avoid sharedPool effects on other modules

* remove  tryPreload

* fix comment

* fix var name

* fix lint

* fix L1 miss data && data condition

* fix comment
setunapo pushed a commit to setunapo/bsc that referenced this pull request Apr 1, 2022
owen-reorg pushed a commit to bnb-chain/op-geth that referenced this pull request May 31, 2023
Description: This PR refers to the code optimization done by `BSC`, which mainly includes the following parts:
1. Do BlockBody verification concurrently.
2. Do the calculation of intermediate root concurrently.
3. Commit the MPTs concurrently.
4. Preload accounts before processing blocks.
5. Make the snapshot layers configurable.
6. Reuse some objects to reduce GC.
7. Add shared_pool for `stateDB` to improve cache usage.

References:
- bnb-chain/bsc#257
- bnb-chain/bsc#792

---------

Co-authored-by: j75689 <[email protected]>
Co-authored-by: realowen <[email protected]>