serde: fix bug in handling of invalid UTF-8
This fixes a bug that was tripped if one tried to use Serde to
deserialize records that are not valid UTF-8. In particular, given a
struct like the following:

    #[derive(Debug, Deserialize)]
    struct Row {
        h1: String,
        #[serde(with = "serde_bytes")]
        h2: Vec<u8>,
        h3: String,
    }

then using ByteRecord::deserialize would still return an invalid UTF-8
error if invalid UTF-8 bytes were found in the h2 field, even though the
field explicitly supports arbitrary bytes.

The root cause of this bug is that the deserializer was attempting to
run raw byte buffers through UTF-8 validation. Instead, it should simply
use the raw bytes directly.

We add a regression test that exhibits code that did not work but now
does work.

Thanks to @dcarosone, whose question led to this bug's discovery:
https://users.rust-lang.org/t/csv-serde-vs-non-utf8-easily/23262
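The root cause described above can be illustrated with a minimal, standard-library-only sketch. This is not the csv crate's actual deserializer code. The names `FIELD_H2`, `bytes_via_utf8`, `bytes_direct`, and `string_field` are hypothetical and exist only to contrast validating a byte field as UTF-8 with handing the raw bytes through unchanged.

```rust
/// Hypothetical raw contents of the `h2` field: 0xFF can never appear in valid UTF-8.
const FIELD_H2: &[u8] = b"b\xFFar";

/// The problematic approach (assumed shape): validate as UTF-8 first, even
/// though the caller only asked for bytes.
fn bytes_via_utf8(field: &[u8]) -> Result<Vec<u8>, std::str::Utf8Error> {
    let s = std::str::from_utf8(field)?; // fails on invalid UTF-8
    Ok(s.as_bytes().to_vec())
}

/// The corrected approach (assumed shape): use the raw bytes directly.
fn bytes_direct(field: &[u8]) -> Result<Vec<u8>, std::str::Utf8Error> {
    Ok(field.to_vec())
}

/// `String` fields such as `h1` and `h3` still require UTF-8 validation.
fn string_field(field: &[u8]) -> Result<String, std::str::Utf8Error> {
    std::str::from_utf8(field).map(|s| s.to_string())
}

fn main() {
    // Validating a byte field rejects data the field is meant to accept.
    assert!(bytes_via_utf8(FIELD_H2).is_err());
    // Passing the bytes through unchanged accepts it.
    assert_eq!(bytes_direct(FIELD_H2).unwrap(), vec![b'b', 0xFF, b'a', b'r']);
    // String fields keep their validation.
    assert_eq!(string_field(b"foo").unwrap(), "foo");
    assert!(string_field(FIELD_H2).is_err());
}
```

In this sketch only the byte-oriented path changes. Fields typed as `String` continue to report invalid UTF-8 as an error, which matches the behavior the commit message describes for `h1` and `h3`.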
Perhaps this commented-out code was intended to be removed?