Can't change set of accepted linefeed characters for parser #35

Closed
uybhatti opened this issue Sep 8, 2017 · 7 comments

@uybhatti

uybhatti commented Sep 8, 2017

The documentation says that the linefeed character can be changed, but when I try a different linefeed character the input is not parsed correctly. Here is a test case:

    void testWithDifferentLinefeeds(boolean useBytes) throws Exception {

        String CSV = "data11,data12,data13|data21,data22,data23";

        CsvMapper mapper = new CsvMapper();
        mapper.disable(CsvParser.Feature.WRAP_AS_ARRAY);

        CsvParser csvParser = mapper.getFactory().createParser(CSV);

        CsvSchema csvSchema = CsvSchema.builder()
                .setLineSeparator("|")
                .setColumnSeparator(',')
                .build();

        csvParser.setSchema(csvSchema);

        MappingIterator<Object[]> mappingIterator = mapper.readerFor(Object[].class).readValues(csvParser);

        assertTrue(mappingIterator.hasNext());
        Object[] record = mappingIterator.nextValue();
        assertNotNull(record);
        assertEquals("data11", record[0]);
        assertEquals("data12", record[1]);
        assertEquals("data13", record[2]);
        assertEquals(3, record.length);

        assertTrue(mappingIterator.hasNext());
        record = mappingIterator.nextValue();
        assertNotNull(record);
        assertEquals("data21", record[0]);
        assertEquals("data22", record[1]);
        assertEquals("data23", record[2]);
        assertEquals(3, record.length);

        assertFalse(mappingIterator.hasNext());
        mappingIterator.close();
    }

But the same test case works when I change the linefeed character in my data to "\n", e.g.:

    void testWithDifferentLinefeeds(boolean useBytes) throws Exception {

        String CSV = "data11,data12,data13\ndata21,data22,data23";
        CsvMapper mapper = new CsvMapper();
        mapper.disable(CsvParser.Feature.WRAP_AS_ARRAY);
        CsvParser csvParser = mapper.getFactory().createParser(CSV);
        CsvSchema csvSchema = CsvSchema.builder()
                .setLineSeparator("\n")
                .setColumnSeparator(',')
                .build();

        csvParser.setSchema(csvSchema);

        MappingIterator<Object[]> mappingIterator = mapper.readerFor(Object[].class).readValues(csvParser);

        assertTrue(mappingIterator.hasNext());
        Object[] record = mappingIterator.nextValue();
        assertNotNull(record);
        assertEquals("data11", record[0]);
        assertEquals("data12", record[1]);
        assertEquals("data13", record[2]);
        assertEquals(3, record.length);

        assertTrue(mappingIterator.hasNext());
        record = mappingIterator.nextValue();
        assertNotNull(record);
        assertEquals("data21", record[0]);
        assertEquals("data22", record[1]);
        assertEquals("data23", record[2]);
        assertEquals(3, record.length);

        assertFalse(mappingIterator.hasNext());
        mappingIterator.close();
    }

Does this mean that the Jackson CSV parser currently supports only "\n" as a line delimiter?

@cowtowncoder
Member

Almost but not quite: line separator is only configurable for writing (generation); for reading (parsing) three standard line-feeds ("\n", "\r", "\r\n") are accepted.
There is no way to change this behavior currently.

@cowtowncoder changed the title from "Can't change the default Linefeed character" to "Can't change set of accepted linefeed characters for parser" on Nov 30, 2017

@tyler2cr

This requested feature may resolve an unexpected scenario:

These tests fail with java.lang.AssertionError: expected null, but was:<>

I've been unable to determine where the <> value comes from.

    import com.fasterxml.jackson.annotation.JsonPropertyOrder;
    import com.fasterxml.jackson.databind.MappingIterator;
    import com.fasterxml.jackson.dataformat.csv.CsvMapper;
    import com.fasterxml.jackson.dataformat.csv.CsvSchema;
    import org.junit.Test;

    import java.io.IOException;

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertNotNull;
    import static org.junit.Assert.assertNull;
    import static org.junit.Assert.assertTrue;

    public class newlinetest {

        @JsonPropertyOrder({"column1", "column2"})
        static class Pojo {
            private String column1;
            private String column2;

            public String getColumn1() {
                return column1;
            }

            public void setColumn1(String column1) {
                this.column1 = column1;
            }

            public String getColumn2() {
                return column2;
            }

            public void setColumn2(String column2) {
                this.column2 = column2;
            }
        }

        @Test
        public void two_rows() throws IOException {
            // setup
            String csv = "value|\nvalue|\n";
            CsvMapper csvMapper = new CsvMapper();
            CsvSchema csvSchema = csvMapper.schemaFor(Pojo.class).withColumnSeparator('|');

            // execute
            MappingIterator<Pojo> mappingIterator = csvMapper.readerFor(Pojo.class).with(csvSchema).readValues(csv);

            // test
            assertTrue(mappingIterator.hasNext());
            Pojo row1 = mappingIterator.next();
            assertNotNull(row1);
            assertEquals("value", row1.column1);
            assertNull(row1.column2);

        }

        @Test
        public void one_row() throws IOException {
            // setup
            String csv = "value|\n";
            CsvMapper csvMapper = new CsvMapper();
            CsvSchema csvSchema = csvMapper.schemaFor(Pojo.class).withColumnSeparator('|');

            // execute
            Pojo pojo = csvMapper.readerFor(Pojo.class).with(csvSchema).readValue(csv);

            // test
            assertNotNull(pojo);
            assertEquals("value", pojo.column1);
            assertNull(pojo.column2);
        }
    }

@rage-shadowman

rage-shadowman commented Aug 21, 2019

@tyler2cr the "<>" comes from JUnit. It's just telling you that it expected null but found an empty string (or something that toString()d to an empty string). If you did Assert.assertNull("") you'd get the same output, as would Assert.assertNull(new Object() { public String toString() { return ""; } });. JUnit just surrounds the unexpected value in angle brackets in its error message.
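To make the point above concrete, here is a minimal sketch that builds a message in the same shape as the one quoted in the failing tests. The class `AssertNullMessageDemo` and the method `describeNonNull` are hypothetical names used only for illustration; this is a stand-in that mimics the message format and is not JUnit's own code.

```java
class AssertNullMessageDemo {

    // Hypothetical stand-in: wraps the unexpected value in angle brackets,
    // mimicking the shape of the "expected null, but was:<...>" message.
    static String describeNonNull(Object actual) {
        return "expected null, but was:<" + actual + ">";
    }

    public static void main(String[] args) {
        // An empty string leaves nothing between the brackets.
        System.out.println(describeNonNull(""));
        // A non-empty value shows up between them.
        System.out.println(describeNonNull("value"));
    }
}
```

With an empty string the brackets end up adjacent, which is why the failure reads `<>` even though the actual value is not null.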

@tyler2cr

@rage-shadowman 🤦‍♂ thank you!!

After this realization, I used .withNullValue("") on the CsvSchema, and it works like a charm

@cowtowncoder
Member

OK, I think I can close this issue, since I am not sure what work would be expected here: the question seems to be about alternate linefeeds like \r (or \r\n), which should already be allowed.
If I misunderstood, the issue may be reopened with additional information explaining what is desired (for example, Unicode defines other linefeed characters).

@tyler2cr

@cowtowncoder could the reader be enhanced to use customized line feed endings? (the writer already can, per your comment)
#35 (comment)

If so, this ticket could be reopened since it's talking about the parser

@cowtowncoder
Member

@tyler2cr In theory, yes; in practice, that is not necessarily an easy change.

But what I would need is a real use case for why such a feature is needed: I have learned not to work on hypothetical use cases. So: given that all standard linefeeds (\n, \r, \r\n) work, what is the missing use case (why and what)? Especially since users may implement Readers that handle linefeed conversion on the fly, if that is needed (just as an example).

So: if someone has a specific use case, please add details here, and this issue can be reopened and perhaps someone could try to implement it.
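The Reader-based workaround mentioned above could look roughly like the sketch below. It uses only `java.io`; the class `SeparatorTranslatingReader` and its `translate` helper are hypothetical names, not part of Jackson. It assumes the custom record separator is a single character that never occurs inside field values, since it rewrites every occurrence, including any inside quoted fields.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper: rewrites one custom record-separator character to '\n'
// on the fly, so a parser that only accepts standard linefeeds can read the data.
class SeparatorTranslatingReader extends FilterReader {
    private final char customSeparator;

    SeparatorTranslatingReader(Reader in, char customSeparator) {
        super(in);
        this.customSeparator = customSeparator;
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return (c == customSeparator) ? '\n' : c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // When n is -1 (end of stream) the loop body never runs.
        for (int i = off; i < off + n; i++) {
            if (buf[i] == customSeparator) {
                buf[i] = '\n';
            }
        }
        return n;
    }

    // Convenience method for demonstration: drains the reader into a String.
    static String translate(String input, char separator) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new SeparatorTranslatingReader(new StringReader(input), separator)) {
            char[] buf = new char[64];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

For the data in the original report, `SeparatorTranslatingReader.translate("data11,data12,data13|data21,data22,data23", '|')` yields the two records separated by "\n". In real use the wrapped reader would presumably be handed to the parser directly (for example, as the `Reader` argument when creating the parser) rather than drained into a String first.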
