
Getting org.rocksdb.RocksDBException: bad entry in block #9279

Open
Rafat001 opened this issue Dec 10, 2021 · 18 comments

@Rafat001

Expected behaviour

Get the value when get() is called.

Actual behaviour

java.lang.Exception: org.rocksdb.RocksDBException: bad entry in block
    at com.techspot.store.RocksStore.encodePacket(RocksStore.java:684) ~[techspot-encoder-dev.jar:?]
    at com.techspot.store.workers.PacketEncoder.run(PacketEncoder.java:67) [techspot-encoder-dev.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_212]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_212]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: org.rocksdb.RocksDBException: bad entry in block
    at org.rocksdb.RocksDB.get(Native Method) ~[rocksdbjni-6.13.3.jar:?]
    at org.rocksdb.RocksDB.get(RocksDB.java:1948) ~[rocksdbjni-6.13.3.jar:?]
    at com.techspot.store.RocksStore.packetExists(RocksStore.java:402) ~[techspot-encoder-dev.jar:?]
    at com.techspot.store.RocksStore.encodePacket(RocksStore.java:634) ~[techspot-encoder-dev.jar:?]
    ... 6 more

The error first started occurring when the database held ~350 million entries and was ~18 GB on disk. I'm using RocksDB version 6.13.3 with the following options:

Options options = new Options();
BlockBasedTableConfig blockBasedTableConfig = new BlockBasedTableConfig();
blockBasedTableConfig.setBlockSize(16 * 1024); // 16 KB

options.setWriteBufferSize(64 * 1024 * 1024); // 64 MB
options.setMaxWriteBufferNumber(8);
options.setMinWriteBufferNumberToMerge(1);

options.setTableCacheNumshardbits(8);
options.setLevelZeroSlowdownWritesTrigger(1000);
options.setLevelZeroStopWritesTrigger(2000);
options.setLevelZeroFileNumCompactionTrigger(1);

options.setCompressionType(CompressionType.LZ4_COMPRESSION);
options.setTableFormatConfig(blockBasedTableConfig);
options.setCompactionStyle(CompactionStyle.UNIVERSAL);
options.setCreateIfMissing(Boolean.TRUE);

options.setEnablePipelinedWrite(true);
options.setIncreaseParallelism(8);

Steps to reproduce the behaviour

This issue is hard to reproduce; I tried writing ~700 million entries but couldn't trigger it again.

@hx235
Contributor

hx235 commented Dec 10, 2021

Hi @Rafat001 - did you by any chance still have the information logs around? From there, we might be able to see what operations happen between Get and the "bad entry in block" exception.

@hx235 hx235 added the "question", "up-for-grabs", and "waiting" (Waiting for a response from the issue creator) labels Dec 10, 2021
@suyanlong

2021-12-13 11:25:13,407 [-dispatcher-4] INFO  a.e.s.Slf4jLogger                      - Slf4jLogger started
2021-12-13 11:25:14,020 [main]          ERROR o.alephium.app.Boot$                   - Cannot initialize system: {}
org.alephium.io.IOError$RocksDB: org.rocksdb.RocksDBException: bad entry in block
	at org.alephium.io.IOUtils$$anonfun$error$1.applyOrElse(IOUtils.scala:67)
	at org.alephium.io.IOUtils$$anonfun$error$1.applyOrElse(IOUtils.scala:64)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:35)
	at org.alephium.io.IOUtils$.tryExecute(IOUtils.scala:54)
	at org.alephium.flow.io.NodeStateStorage.isInitialized(NodeStateStorage.scala:41)
	at org.alephium.flow.io.NodeStateStorage.isInitialized$(NodeStateStorage.scala:39)
	at org.alephium.flow.io.NodeStateRockDBStorage.isInitialized(NodeStateStorage.scala:148)
	at org.alephium.flow.client.Node$.buildBlockFlowUnsafe(Node.scala:134)
	at org.alephium.flow.client.Node$Default.<init>(Node.scala:74)
	at org.alephium.flow.client.Node$.build(Node.scala:59)
	at org.alephium.app.Server.node(Server.scala:44)
	at org.alephium.app.Server.node$(Server.scala:44)
	at org.alephium.app.Server$Impl.node$lzycompute(Server.scala:109)
	at org.alephium.app.Server$Impl.node(Server.scala:109)
	at org.alephium.app.Server$Impl.<init>(Server.scala:124)
	at org.alephium.app.Server$.apply(Server.scala:106)
	at org.alephium.app.BootUp.<init>(Boot.scala:58)
	at org.alephium.app.Boot$.delayedEndpoint$org$alephium$app$Boot$1(Boot.scala:37)
	at org.alephium.app.Boot$delayedInit$body.apply(Boot.scala:35)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1(App.scala:76)
	at scala.App.$anonfun$main$1$adapted(App.scala:76)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:919)
	at scala.App.main(App.scala:76)
	at scala.App.main$(App.scala:74)
	at org.alephium.app.Boot$.main(Boot.scala:35)
	at org.alephium.app.Boot.main(Boot.scala)
Caused by: org.rocksdb.RocksDBException: bad entry in block
	at org.rocksdb.RocksDB.get(Native Method)
	at org.rocksdb.RocksDB.get(RocksDB.java:2084)
	at org.alephium.io.RocksDBColumn.existsRawUnsafe(RocksDBColumn.scala:78)
	at org.alephium.io.RocksDBColumn.existsRawUnsafe$(RocksDBColumn.scala:77)
	at org.alephium.flow.io.NodeStateRockDBStorage.existsRawUnsafe(NodeStateStorage.scala:148)
	at org.alephium.flow.io.NodeStateStorage.$anonfun$isInitialized$1(NodeStateStorage.scala:41)
	at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.scala:17)
	at org.alephium.io.IOUtils$.tryExecute(IOUtils.scala:53)
	... 27 common frames omitted

@Rafat001
Author

> Hi @Rafat001 - did you by any chance still have the information logs around? From there, we might be able to see what operations happen between Get and the "bad entry in block" exception.

Hi @hx235, here's the LOG file. GitHub doesn't support uploading it directly, so I've uploaded it to Drive and attached the link.

@Rafat001
Author

Using either SNAPPY or ZSTD compression instead of LZ4 seems to solve the problem. Any idea about what's wrong with LZ4?
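For reference, the change amounts to a single line in the options posted above (a sketch; ZSTD shown here, SNAPPY also worked for me):

```java
import org.rocksdb.CompressionType;
import org.rocksdb.Options;

Options options = new Options();
// Switching from LZ4 to ZSTD (or SNAPPY) made the "bad entry in block"
// error stop appearing in my setup; all other options were left unchanged.
options.setCompressionType(CompressionType.ZSTD_COMPRESSION);
// options.setCompressionType(CompressionType.SNAPPY_COMPRESSION); // alternative
```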

@ajkr ajkr removed the waiting Waiting for a response from the issue creator. label Apr 22, 2022
@wqshr12345
Contributor

Hello, I recently encountered the same issue in a production environment. Did you ever follow up on a solution to this problem?

@Rafat001
Author

Hi @wqshr12345, for me the compression algorithm was the culprit. And changing the algorithm fixed the problem. I’m still not sure what was wrong with LZ4. Did you try changing your compression algorithm?

@wqshr12345
Contributor

Hi @Rafat001, my production environment uses the Snappy compression algorithm, and I downloaded the SST files that triggered the error from the server. I can open and iterate through them normally using sst_dump, so the contents of the files don't appear to be corrupted. I suspect a memory issue, but I am not certain.
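For reference, the sst_dump invocations I used to check the files looked roughly like this (the file path is a placeholder):

```shell
# Scan and print every key-value pair in the suspect file
sst_dump --file=/path/to/suspect.sst --command=scan

# Iterate the file and verify block checksums without printing contents
sst_dump --file=/path/to/suspect.sst --command=check --verify_checksum
```

Both commands completed without reporting corruption on the files that had produced "bad entry in block" at runtime.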

@Rafat001
Author

It was similar for me. We got the error while querying certain keys, but I could always query those same keys manually from the same set of SST files. Even when I tried to find the corrupted file by manually querying every key in the database, I couldn't reproduce it. It was totally random, so I agree with you: the SST files aren't corrupted; something else is causing the problem.

@ajkr
Contributor

ajkr commented Jun 17, 2024

@wqshr12345 Does the problem happen again after reopening the DB?

@wqshr12345
Contributor

@ajkr Unfortunately, we run RocksDB on a distributed KV system. Upon encountering the "bad entry in block" error, the upper-layer scheduler deleted all the data from that node, making it impossible for me to attempt a reopen. However, we plan to save the crash-site files the next time it occurs so we can try to reproduce the problem.

@wqshr12345
Contributor

@ajkr Hi, yesterday we encountered a similar problem again, and this time we retained the on-site logs. Upon reopening the database, we found that every key was intact, including the one that had the problem, leading us to suspect an issue with the block cache. Additionally, we are using an older version of RocksDB, 6.4.6. Do you know of any potential issues with this version? Thanks.

@ajkr
Contributor

ajkr commented Jun 27, 2024

Sorry for the delay. When you verify the keys are intact, what does the verification involve? I noticed there are some checks that can produce "bad entry in block" that are only present in binary search. So, for example, Get() would trigger those checks while iterator scan would not.
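To illustrate the two read paths, a hypothetical diagnostic sketch (assumes an already-open `db` handle and a `key` byte array; requires rocksdbjni):

```java
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;

// Point lookups binary-search inside the block and run extra entry sanity
// checks, so Get() may surface "bad entry in block" on a block that a
// sequential scan reads without complaint.
try {
    byte[] value = db.get(key); // may throw RocksDBException: bad entry in block
} catch (RocksDBException e) {
    System.out.println("Get failed: " + e.getMessage());
}

try (RocksIterator it = db.newIterator()) {
    // A full scan takes a different code path through the block, so it can
    // succeed over the same data even when Get() fails.
    for (it.seekToFirst(); it.isValid(); it.next()) {
        // iterate without error
    }
}
```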

@wqshr12345
Contributor

@ajkr In our application layer on top of RocksDB, we encountered the "bad entry" problem when calling RocksDB's Iterator Seek for a specific key. After reopening, we used the same Seek interface to seek the same key again and did not encounter the problem.

@ajkr
Contributor

ajkr commented Jun 29, 2024

Do you use IngestExternalFile(), and if so, was the bad key in a block from an ingested file? The only potentially relevant bug fix I found in the C++ code is #6669. Java bug fixes and possible memory corruptions are other possibilities.

@wqshr12345
Contributor

@ajkr Hi, we did not use IngestExternalFile(). I would like to ask whether #7405 could potentially lead to a "bad entry in block" problem?

@ajkr
Contributor

ajkr commented Jul 8, 2024

With #7405 the corruption would be persisted so should still be present after a restart. It sounded like the problem did not come back after a restart so I thought that one was unlikely. It's possible I missed something though.

edit: Oh, I did miss something. It would only be persisted if compaction hit the incorrect block in block cache. If a user read hit it, it would just mess up that one query. If you crash after that query finds the corruption, then the mis-ordering would not be persisted.

@ajkr
Contributor

ajkr commented Jul 8, 2024

> edit: Oh, I did miss something. It would only be persisted if compaction hit the incorrect block in block cache. If a user read hit it, it would just mess up that one query. If you crash after that query finds the corruption, then the mis-ordering would not be persisted.

Still, I don't see how this problem (using the wrong block) can cause "bad entry in block". The wrong block should not be corrupt so those integrity checks should pass.

@wqshr12345
Contributor

> edit: Oh, I did miss something. It would only be persisted if compaction hit the incorrect block in block cache. If a user read hit it, it would just mess up that one query. If you crash after that query finds the corruption, then the mis-ordering would not be persisted.
>
> Still, I don't see how this problem (using the wrong block) can cause "bad entry in block". The wrong block should not be corrupt so those integrity checks should pass.

I'm not sure if this is correct, but here's my guess.
When I call BinarySeek, I locate a data block through the index block. However, due to the aforementioned bug, the block cache might return a block from an SST file that was deleted long ago (it could be a data block, index block, filter block, etc.; let's assume it's an index block). We mistakenly treat it as a data block and try to parse key-value pairs based on the restart-point positions, which results in a "bad entry" error. The reason is obvious: this block is not a data block and is not organized in the restart-point format.
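A toy illustration of that guess (not RocksDB code; real data blocks use varint32 entry headers, single bytes are used here for clarity): a decoder that expects (shared, non_shared, value_length) entry headers will reject bytes that were never laid out in that format.

```java
// Toy model of the data-block entry sanity checks: each entry starts with a
// (shared, nonShared, valueLen) header followed by the non-shared key bytes
// and the value bytes. Inconsistent headers are roughly what surfaces as
// "bad entry in block".
class BlockEntryChecker {
    static boolean looksLikeDataBlock(byte[] block) {
        int pos = 0;
        String prevKey = "";
        while (pos < block.length) {
            if (pos + 3 > block.length) return false;      // truncated header
            int shared = block[pos] & 0xFF;
            int nonShared = block[pos + 1] & 0xFF;
            int valueLen = block[pos + 2] & 0xFF;
            pos += 3;
            if (shared > prevKey.length()) return false;   // shared prefix too long
            if (pos + nonShared + valueLen > block.length) return false; // overrun
            prevKey = prevKey.substring(0, shared)
                    + new String(block, pos, nonShared);
            pos += nonShared + valueLen;
        }
        return true;
    }
}
```

A block laid out as the entries ("abc" -> "1", "abd" -> "2") passes, while arbitrary bytes from, say, a stale index block almost always trip one of the checks.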
