resource usage during sync #262

Closed
stefantalpalaru opened this issue Mar 5, 2019 · 18 comments
Labels
Sync Prevents or affects sync with Ethereum network

Comments

@stefantalpalaru
Contributor

stefantalpalaru commented Mar 5, 2019

SVG graph with per-process statistics provided by pidstat (missing the network activity, for now, but still interesting):
SVG graph

That CPU usage over 100% must come from the multi-threaded rocksdb library.
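
The graph above was produced with pidstat (the script was later published in this thread). Just to illustrate the kind of per-process sampling involved, here is a minimal Python sketch using psutil instead of pidstat, illustrative only and not the tooling actually used; it records CPU, RSS and disk I/O once per second as CSV:

  # Illustrative sketch: sample CPU, RSS and disk I/O for one PID, roughly
  # what pidstat -u -r -d reports. Requires the third-party psutil package.
  import sys
  import time
  import psutil

  def sample(pid: int, interval: float = 1.0) -> None:
      proc = psutil.Process(pid)
      print("timestamp,cpu_percent,rss_bytes,read_bytes,write_bytes")
      proc.cpu_percent(None)  # prime the counter; the first call returns 0.0
      while proc.is_running():
          time.sleep(interval)
          with proc.oneshot():  # batch the /proc reads
              cpu = proc.cpu_percent(None)   # can exceed 100% for multi-threaded processes
              rss = proc.memory_info().rss   # resident set size, in bytes
              io = proc.io_counters()        # cumulative disk I/O (Linux)
          print(f"{time.time():.0f},{cpu:.1f},{rss},{io.read_bytes},{io.write_bytes}")

  if __name__ == "__main__":
      sample(int(sys.argv[1]))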

@jangko
Contributor

jangko commented Mar 5, 2019

Interesting. The disk write pattern shows there is still room for improvement.

@stefantalpalaru
Contributor Author

Here is another run of ./build/nimbus --prune:archive --port:30304, this time with network traffic added (using nethogs), better colours, and some variables drawn as areas instead of lines.

Full dataset, one second per pixel:

nimbus3-long.svg

Five seconds per pixel, to better see the memory leak:

nimbus3-short.svg

Except for the short RocksDB spikes every 6-7 minutes or so, most of the time is spent waiting for data from the network or maxing out a CPU core while processing that data. It all looks very serialised, which means it will benefit from parallelisation.

The average download speed is extremely low, at 7.83 kB/s. Disk I/O is a non-issue right now on the SSD I'm using.

@jangko
Contributor

jangko commented Mar 6, 2019

Is the steadily climbing red RSS line an indicator of a memory leak? If so, that is very bad.

@stefantalpalaru
Contributor Author

stefantalpalaru commented Mar 6, 2019

Yep: https://en.wikipedia.org/wiki/Resident_set_size
When it drops, that's a garbage collection. I don't see a legitimate reason to hang on to so much data in RAM during execution, so my guess is that the upward trend is due to a memory leak.

What's weird is that even the stack keeps growing, albeit much more slowly.

@jangko
Contributor

jangko commented Mar 6, 2019

Which region of blocks are you syncing? I mean the block numbers. I noticed that from block 600K to 700K memory consumption is very high, then it is stable from block 800K to 900K.
I think I will do some measurements myself to improve block sync speed.

@stefantalpalaru: can you share the script with me? How did you produce that SVG?

@stefantalpalaru
Contributor Author

Which region of blocks are you syncing? I mean the block numbers. I noticed that from block 600K to 700K memory consumption is very high, then it is stable from block 800K to 900K.

I started with an empty db and let it run until it crashed due to an assert in transaction rollback (vendor/nim-eth/eth/trie/db.nim:145 - "doAssert t.db.mostInnerTransaction == t and t.state == Pending").

I don't see block numbers in the output log, because those are logged at the TRACE level, which is not included by default.

@stefantalpalaru: can you share the script with me? How did you produce that SVG?

Freshly published: https://github.com/status-im/process-stats

@jangko
Contributor

jangko commented Mar 6, 2019

Thank you very much.

@jangko
Contributor

jangko commented Mar 19, 2019

The backend database contributes significantly to block syncing speed.
Once the database size reaches 20 GB+, syncing becomes slower and slower because RocksDB is doing background compaction.
Writing to the database does not seem to slow down, thanks to the WAL (write-ahead log) mechanism, but reading from the database can be really, really slow when it competes with compaction.

At 50 GB+ (900K blocks), it becomes very slow. My current workaround: I create separate databases on separate physical drives. Every time I have synced around 20 GB, I move the database to drive A and open it as a read-only database.

When a database is opened read-only on drive A, its compaction finishes faster because it does not have to compete with the regular read/write operations on drive B.

Without this poor man's sharding, drive activity is always at 100%; with this simple sharding, disk activity on both drive A and drive B stays below 30%.

For comparison, using a single db, syncing 1.4M blocks takes many hours, but using several 20 GB dbs it takes less than one hour.
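
To illustrate the access pattern, here is a small Python sketch assuming the python-rocksdb bindings (Nimbus itself goes through nim-rocksdb, and the paths and shard layout below are made up for the example): new data goes to the active database on one drive, while the frozen ~20 GB shards on the other drive are opened read-only and only consulted on a miss.

  # Sketch of "poor man's sharding": one writable DB plus read-only shards.
  from typing import Optional
  import rocksdb

  class ShardedDB:
      def __init__(self, active_path, frozen_paths):
          self.active = rocksdb.DB(active_path, rocksdb.Options(create_if_missing=True))
          # Frozen shards live on another drive and are opened read-only, so
          # their compaction does not compete with the live read/write traffic.
          self.frozen = [rocksdb.DB(p, rocksdb.Options(), read_only=True)
                         for p in frozen_paths]

      def put(self, key: bytes, value: bytes) -> None:
          self.active.put(key, value)      # all new data goes to the active shard

      def get(self, key: bytes) -> Optional[bytes]:
          value = self.active.get(key)
          if value is not None:
              return value
          for shard in self.frozen:        # fall back to older shards, newest first
              value = shard.get(key)
              if value is not None:
                  return value
          return None

  db = ShardedDB("/driveB/active", ["/driveA/shard-02", "/driveA/shard-01"])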

@zah
Member

zah commented Mar 19, 2019

Thanks for sharing this, @jangko. BTW, how does the lmdb performance compare to rocksdb?

@jangko
Contributor

jangko commented Mar 19, 2019

I stopped using it because it was slower than RocksDB even when syncing below 100K blocks; I don't know how it performs once it contains more data.

@Swader
Contributor

Swader commented Mar 20, 2019

Would it be possible to actually use this "poor man's sharding" approach as a solution? Maybe divide the data into 10 GB snapshots, where each snapshot is one such shard, i.e. one RocksDB database, and then use those same snapshots to retrieve data across the network for faster sync among Nimbus clients?

@arnetheduck
Member

https://www.zeroknowledge.fm/9 - interview with one of the parity devs about how they're tuning rocksdb

@stefantalpalaru
Contributor Author

stefantalpalaru commented Mar 31, 2019

A look at allocated RAM (RSS) versus heap usage according to the GC:

nimbus4.svg

heap.svg

To get these heap stats, I added at the end of persistBlocks(), in nimbus/p2p/chain.nim:

  dumpNumberOfInstances()
  echo "===", getTime().toUnix()

(and an import times above the function)

Nimbus compile flags:
make NIMFLAGS="--opt:speed -d:nimTypeNames" nimbus

I ran Nimbus like this:
rm -rf ~/.cache/nimbus/db; ./build/nimbus --prune:archive --maxpeers:250 --log-level:trace --log-file:output6.log > heap.txt

I processed "heap.txt" using this quick and dirty script: https://gist.github.com/stefantalpalaru/0b502def452591aaca289ec8fc119e8b
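
As a rough sketch of what such a script has to do (the "===" separators come from the echo added above; the per-type lines printed by dumpNumberOfInstances() are kept raw here, since their exact format depends on the -d:nimTypeNames output):

  # Split the captured stdout into per-sample chunks, using the "===<unix time>"
  # separator printed after each dumpNumberOfInstances() call.
  def parse_heap_log(path="heap.txt"):
      samples = []          # list of (timestamp, raw_lines) tuples
      current = []
      with open(path) as f:
          for line in f:
              line = line.rstrip("\n")
              if line.startswith("==="):
                  ts = int(line[3:])     # unix time from getTime().toUnix()
                  samples.append((ts, current))
                  current = []
              else:
                  current.append(line)
      return samples

  for ts, lines in parse_heap_log():
      print(ts, len(lines), "heap lines")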


This looks like memory fragmentation to me, with the RSS growing from 47 to 219 MiB in 37 minutes.

The memory leak is extremely small in comparison, with the used heap minimum going from about 5 to about 10 MiB.

@jangko
Contributor

jangko commented Apr 15, 2019

Currently, our RocksDB uses the default configuration:

  • target_file_size_base=64MB
  • target_file_size_multiplier=1
  • filter_policy=null

If we change some of these settings:

  • target_file_size_base=64MB
  • target_file_size_multiplier=4 or 8 -> reduces the number of files and file descriptors, giving faster file access.
  • filter_policy=10-bit bloom filter -> speeds up random reads when an account is not in the state trie.

(A sketch of these settings follows below.)
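
For illustration, here are the proposed settings expressed through the python-rocksdb bindings (the real change would go through nim-rocksdb, where the option names may differ slightly):

  # Sketch of the proposed tuning: larger target file sizes per level plus a
  # 10-bit bloom filter, matching the bullet points above.
  import rocksdb

  opts = rocksdb.Options(create_if_missing=True)
  opts.target_file_size_base = 64 * 1024 * 1024   # 64 MB base file size (the default)
  opts.target_file_size_multiplier = 4            # or 8: fewer, larger files per level
  opts.table_factory = rocksdb.BlockBasedTableFactory(
      filter_policy=rocksdb.BloomFilterPolicy(10)  # ~10 bits/key: cheap negative lookups
  )

  db = rocksdb.DB("nimbus.db", opts)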

@zah
Member

zah commented Apr 25, 2019

@jangko, can we use Premix's regress tool as a benchmarking utility when deciding whether to go for these RocksDB tweaks? It would be nice if we could create a database of blocks that can be distributed efficiently to multiple machines with various hardware configurations; then we could use regress to obtain statistics telling us which settings work best.

@jangko
Contributor

jangko commented Apr 25, 2019

regress is too complicated. From what I have observed, the bottleneck of database operations comes from building the state trie.

Here is what I have done:
block 4,174,280 already contains 5,819,335 accounts (~24.9 GB), and it took almost 19 hours to move those 5.8M accounts from one SSD to another SSD.

We can use the hexary trie to tweak and benchmark the database. Both the hexary trie and the database need more optimization.
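
Back-of-the-envelope arithmetic on those figures:

  # Rough throughput implied by the measurement above: ~5.8M accounts in ~19 hours.
  accounts = 5_819_335
  hours = 19
  print(f"{accounts / (hours * 3600):.0f} accounts/s")   # roughly 85 accounts per second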

@jlokier jlokier added the Sync Prevents or affects sync with Ethereum network label May 11, 2021
@SjonHortensius

Apparently this is still an issue: syncing a fresh Nimbus instance on a high-performance machine results in mediocre sync performance (less than 10 blocks/s), with one thread pegged at 100% and all the other cores below 10% CPU usage.
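
One way to confirm the single hot thread (a sketch using psutil, not something taken from this issue) is to sample per-thread CPU times twice and compare:

  # Show which threads of a process accumulated CPU time over an interval.
  # Pass the nimbus PID on the command line; requires psutil.
  import sys
  import time
  import psutil

  def hot_threads(pid: int, interval: float = 5.0) -> None:
      proc = psutil.Process(pid)
      before = {t.id: t.user_time + t.system_time for t in proc.threads()}
      time.sleep(interval)
      after = {t.id: t.user_time + t.system_time for t in proc.threads()}
      busiest = sorted(after, key=lambda tid: after[tid] - before.get(tid, 0.0), reverse=True)
      for tid in busiest:
          busy = after[tid] - before.get(tid, 0.0)
          print(f"thread {tid}: {100.0 * busy / interval:.0f}% CPU")

  if __name__ == "__main__":
      hot_threads(int(sys.argv[1]))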

@arnetheduck
Member

Obsoleted by aristo - will need to be re-run
