Custom Struct Encoding/Decoding #15

Closed
JP-Ellis opened this issue Dec 26, 2014 · 3 comments

Comments

@JP-Ellis
Contributor

Firstly, I am not sure whether this issue/question belongs here or over at rust-lang/rustc-serialize, but I think I should ask this here first.

I was wondering how one goes about telling CSV and/or serialize that a particular Struct is formatted a particular way. This is probably best illustrated with the following example:

#![feature(phase)]

extern crate csv;
extern crate regex;
#[phase(plugin)] extern crate regex_macros;
extern crate "rustc-serialize" as rustc_serialize;

use std::fmt;
use std::str;

#[deriving(RustcDecodable)]
struct Rational {
    numerator: i64,
    denominator: i64
}

impl fmt::Show for Rational {
    /// Display the rational number as `numerator/denominator`.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}/{}", self.numerator, self.denominator)
    }
}

impl str::FromStr for Rational {
    /// Parse a string into a Rational.  Allow for the possibility of whitespace around `/`.
    fn from_str(s: &str) -> Option<Rational> {
        if let Some(caps) = regex!(r"^([0-9]+)\s*/\s*([0-9]+)$").captures(s) {
            Some(Rational {
                numerator: caps.at(1).unwrap().parse().unwrap(),
                denominator: caps.at(2).unwrap().parse().unwrap()
            })
        } else {
            None
        }
    }
}

fn main() {
    let data = "
X,Y,Rational
1.1,2.3,5/8
13.21,34.55,144/233
377.610,987.1597,2584/4181";

    let mut rdr = csv::Reader::from_string(data).has_headers(true);
    for row in rdr.decode() {
        let (x, y, r): (f64, f64, Rational) = row.unwrap();
        println!("({}, {}): {}", x, y, r);
    }
}

This will compile fine, but it unsurprisingly results in the error

     Running `target/csv-test`
thread '<main>' panicked at 'called `Result::unwrap()` on an `Err` value: CSV decode error: Struct 'Rational' has 2 fields but current record has 1 fields.', /build/rust-git/src/rust/src/libcore/result.rs:746

since the derived decoder expects two columns for Rational.

I have tried to replace the #[deriving(RustcDecodable)] by implementing that trait manually, but I do not seem to be getting very far. This is my attempt so far:

impl rustc_serialize::serialize::Decodable<csv::Decoded, csv::Error> for Rational {
    fn decode(d: &mut csv::Decoded) -> Result<Rational, csv::Error> {
        if let Ok(s) = d.read_str() {
            if let Some(r) = s.as_slice().parse::<Rational>() {
                Ok(r)
            } else {
                Err("Unable to parse the string into rational number.")
            }
        } else {
            Err("Unable to read the string.")
        }
    }
}

but for some reason, it seems that read_str is not implemented for csv::Decoded yet the documentation (and source) clearly show that it should be implemented. (It is getting late, so I am perhaps missing something obvious).

Curiously, why does CSV use the serialize crate in order to parse the data? Why not simply use FromStr? It would seem to me a lot more straightforward; especially since as far as I see it, the whole serialize crate is acting like a big and overly complicated wrapper for just FromStr in this context.

BurntSushi added a commit that referenced this issue Dec 26, 2014
@BurntSushi
Owner

but for some reason, it seems that read_str is not implemented for csv::Decoded yet the documentation (and source) clearly show that it should be implemented. (It is getting late, so I am perhaps missing something obvious).

My guess is that your code is missing an import for the Decoder trait. i.e., use rustc_serialize::Decoder. Without that in scope, you'll get errors exactly like what you're seeing.
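
To illustrate the point about trait imports, here is a minimal, std-only sketch in current Rust. The module `decoding`, the trait `Decoder`, the type `Decoded`, and the function `first_field` are hypothetical stand-ins invented for this example; they are not the rustc-serialize or csv API. The sketch only shows the general rule that a trait method can be called only when the trait is in scope.

```rust
mod decoding {
    /// Hypothetical stand-in for a decoder trait (not the real rustc-serialize trait).
    pub trait Decoder {
        fn read_str(&mut self) -> Result<String, String>;
    }

    /// Hypothetical stand-in for a concrete decoder type.
    pub struct Decoded {
        fields: Vec<String>,
    }

    impl Decoded {
        pub fn new(fields: &[&str]) -> Decoded {
            // Store the fields reversed so that `pop` yields them in order.
            Decoded {
                fields: fields.iter().rev().map(|s| s.to_string()).collect(),
            }
        }
    }

    impl Decoder for Decoded {
        fn read_str(&mut self) -> Result<String, String> {
            self.fields.pop().ok_or_else(|| "no fields left".to_string())
        }
    }
}

// Removing this import makes `d.read_str()` below fail to compile:
// the impl exists, but the method is not visible without the trait in scope.
use decoding::Decoder;

fn first_field(fields: &[&str]) -> Result<String, String> {
    let mut d = decoding::Decoded::new(fields);
    d.read_str()
}
```

With the import in place, `first_field(&["5/8", "1.1"])` returns `Ok("5/8")`.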

Note that you can implement Decodable more generically, so it will work with anything that uses the serialize infrastructure:

impl<E, D: Decoder<E>> Decodable<D, E> for Rational {
    fn decode(d: &mut D) -> Result<Rational, E> {
        let field = try!(d.read_str());
        match field.parse() {
            Some(rat) => Ok(rat),
            None => Err(d.error(&*format!(
                "Could not parse '{}' as a rational.", field))),
        }
    }
}
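
The snippet above uses pre-1.0 syntax. As a rough sketch of the same idea in current Rust, using only the standard library: the traits `Decoder` and `Decodable` below, the toy decoder `OneField`, and the helper `decode_field` are simplified, hypothetical stand-ins written for this example and should not be read as the rustc-serialize API. The sketch also assumes a `FromStr` impl that returns `Result`, and it parses with `i64::from_str`, which accepts a leading sign, so it is slightly more permissive than the regex in the question.

```rust
use std::str::FromStr;

/// Hypothetical, simplified decoder trait with an associated error type.
trait Decoder {
    type Error;
    fn read_str(&mut self) -> Result<String, Self::Error>;
    fn error(&mut self, msg: &str) -> Self::Error;
}

/// Hypothetical counterpart of `Decodable`: works with any `Decoder`.
trait Decodable: Sized {
    fn decode<D: Decoder>(d: &mut D) -> Result<Self, D::Error>;
}

#[derive(Debug, PartialEq)]
struct Rational {
    numerator: i64,
    denominator: i64,
}

impl FromStr for Rational {
    type Err = ();

    /// Parse `numerator/denominator`, allowing whitespace around `/`.
    fn from_str(s: &str) -> Result<Rational, ()> {
        let (num, den) = s.split_once('/').ok_or(())?;
        Ok(Rational {
            numerator: num.trim_end().parse::<i64>().map_err(|_| ())?,
            denominator: den.trim_start().parse::<i64>().map_err(|_| ())?,
        })
    }
}

impl Decodable for Rational {
    fn decode<D: Decoder>(d: &mut D) -> Result<Rational, D::Error> {
        // Read one field as a string, then defer to the FromStr impl.
        let field = d.read_str()?;
        match field.parse() {
            Ok(rat) => Ok(rat),
            Err(()) => Err(d.error(&format!(
                "Could not parse '{}' as a rational.", field))),
        }
    }
}

/// Toy decoder that yields a single field, only for exercising the impl.
struct OneField(Option<String>);

impl Decoder for OneField {
    type Error = String;

    fn read_str(&mut self) -> Result<String, String> {
        self.0.take().ok_or_else(|| "no field left".to_string())
    }

    fn error(&mut self, msg: &str) -> String {
        msg.to_string()
    }
}

fn decode_field(s: &str) -> Result<Rational, String> {
    Rational::decode(&mut OneField(Some(s.to_string())))
}
```

Because `decode` is generic over the decoder, the error type comes from whichever decoder is plugged in; here `decode_field("abc")` yields the `String` error built by `OneField::error`.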

Curiously, why does CSV use the serialize crate in order to parse the data? Why not simply use FromStr? It would seem to me a lot more straightforward; especially since as far as I see it, the whole serialize crate is acting like a big and overly complicated wrapper for just FromStr in this context.

For the most part, the serialize infrastructure is useful for representing rows as structured data, either with a struct or a tuple. It also has limited support for enums and Option types. (See the docs for examples.)

For example, you're actually using a lot more than just FromStr in your example. This line

        let (x, y, r): (f64, f64, Rational) = row.unwrap();

in your main function is using the Decodable impl for 3-tuples to properly load x, y and r. Similarly, you could use a struct:

#[deriving(RustcDecodable)]
struct Row {
    x: f64,
    y: f64,
    rat: Rational,
}

and decode it like so: let row: Row = row.unwrap().

All of this is only made possible because of the automatic deriving implementations provided by the serialize crate. Otherwise, you would have to write the FromStr impl for Row yourself.
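
For a sense of what the derived impls save you, here is a hedged sketch in current Rust of a hand-written FromStr impl for Row, using only the standard library. The helper `parse_field` and the error messages are made up for this example, and the comma split is deliberately naive: it ignores the quoting and escaping that a real CSV parser handles.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
struct Rational {
    numerator: i64,
    denominator: i64,
}

#[derive(Debug, PartialEq)]
struct Row {
    x: f64,
    y: f64,
    rat: Rational,
}

/// Hypothetical helper: trim a field and parse it, naming the field on failure.
fn parse_field<T: FromStr>(field: &str, name: &str) -> Result<T, String> {
    field
        .trim()
        .parse::<T>()
        .map_err(|_| format!("could not parse '{}' as {}", field, name))
}

impl FromStr for Rational {
    type Err = String;

    fn from_str(s: &str) -> Result<Rational, String> {
        let (num, den) = s
            .split_once('/')
            .ok_or_else(|| format!("missing '/' in '{}'", s))?;
        Ok(Rational {
            numerator: parse_field(num, "numerator")?,
            denominator: parse_field(den, "denominator")?,
        })
    }
}

impl FromStr for Row {
    type Err = String;

    fn from_str(line: &str) -> Result<Row, String> {
        // Naive split on commas; not a substitute for real CSV parsing.
        let fields: Vec<&str> = line.split(',').collect();
        if fields.len() != 3 {
            return Err(format!("expected 3 fields, found {}", fields.len()));
        }
        // Every field has to be wired up by hand, in order.
        Ok(Row {
            x: parse_field(fields[0], "x")?,
            y: parse_field(fields[1], "y")?,
            rat: parse_field(fields[2], "rat")?,
        })
    }
}
```

Every new struct or tuple shape would need its own impl along these lines, which is the boilerplate the deriving support removes.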

Finally, the CSV parsing and decoding are decoupled. You don't have to use the decoding infrastructure if you don't want to. You can just read Vec<String> with the records iterator.

This was a really phenomenal question, so I've added a fully working example to the repo. :-)

@JP-Ellis
Contributor Author

Wow! Thank you for this amazing and very thorough answer; it is much appreciated :)

Regarding the missing read_str method, it was indeed a silly mistake: I forgot the appropriate import in the file.

Thanks for demonstrating how to implement the Decodable trait and even showing how it can be done more generically, and thank you also for explaining the reasoning behind using the serialize crate; it makes perfect sense.

Lastly, I'm glad you were able to adapt my question and make an example out of it :)

@BurntSushi
Owner

No problem! Happy hacking! :-)
