Empty string throws exception #59

Open
eric-j-ason opened this issue Jun 13, 2024 · 3 comments

Comments

@eric-j-ason

Empty strings in a CSV file, such as the middle field in Greg,,The world's Greggest!, cause an exception to be thrown when the file is read.

Example:

from dataclasses import dataclass
from dataclass_csv import DataclassReader, DataclassWriter
from io import StringIO

@dataclass
class Person:
    firstname: str
    lastname: str
    motto: str

csv_data = """firstname,lastname,motto
Greg,,The world's Greggest!
"""

print(list(DataclassReader(StringIO(csv_data), Person)))

Result:

dataclass_csv.exceptions.CsvValueError: The field `lastname` is required. [CSV Line number: 2]

Using dataclass-csv 1.4.0.

@theta-lin

@eric-j-ason An easy workaround is to specify an empty string as the default value:

@dataclass
class Person:
    firstname: str
    lastname: str = ""
    motto: str

However, I do believe that dataclass-csv should interpret an empty field as an empty string, which would be consistent with the behavior of Python's csv module. The example below compares Python's csv module with dataclass-csv and shows what currently works with dataclass-csv:

import io
import csv
from dataclasses import dataclass
from typing import Optional
from dataclass_csv import DataclassReader, DataclassWriter

output = io.StringIO()
# csv module's default behavior is to output empty string and `None` both as empty string
# Using `csv.QUOTE_NOTNULL` in Python 3.12 can differentiate between `None` and not `None` though
writer = csv.writer(output, quoting=csv.QUOTE_MINIMAL)
writer.writerow(["empty", "", "none", None])
print(output.getvalue())

# When reading an empty string, it would still be interpreted as an empty string
output.seek(0)
reader = csv.reader(output)
print(list(reader))


@dataclass
class Test:
    a: str

    # (*) This works, but not providing a default value would cause issues below
    b: str = ""

    # The line below also works, except that the default value is `None`
    # b: Optional[str] = None

    # This does not work
    # b: str

    # It's a bit counterintuitive, but this also does not work
    # b: Optional[str]


output = io.StringIO()
writer = DataclassWriter(output, [Test("empty", "")], Test)
writer.write()
print(output.getvalue())

output.seek(0)
reader = DataclassReader(output, Test)
# Line below would raise exception if `Test.b` does not have a default value according to (*)
print(list(reader))

output = io.StringIO()
# There is a type mismatch here using `None` though, but this is just for demonstration
writer = DataclassWriter(output, [Test("none", None)], Test)
writer.write()
print(output.getvalue())

output.seek(0)
reader = DataclassReader(output, Test)
# Line below would raise exception if `Test.b` does not have a default value according to (*)
print(list(reader))

The output is:

empty,,none,

[['empty', '', 'none', '']]
a,b
empty,

[Test(a='empty', b='')]
a,b
none,

[Test(a='none', b='')]

Additionally, you can see that using Optional[str] as the type hint on its own does not work either, supposing you want an empty string in the CSV to be interpreted as None. This is because a field is treated as optional only if it has a default value or default factory; the Optional type hint is not taken into consideration:

def _get_optional_fields(self):
    return [
        field.name
        for field in dataclasses.fields(self._cls)
        if not isinstance(field.default, dataclasses._MISSING_TYPE)
        or not isinstance(field.default_factory, dataclasses._MISSING_TYPE)
    ]
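
To make the default-based rule concrete, here is a minimal standalone sketch that uses only the standard library. The `Example` class and the `fields_with_defaults` helper are hypothetical names made up for illustration, and the check uses the public `dataclasses.MISSING` sentinel rather than the private `_MISSING_TYPE`. It is not the library's code, just the same idea under those assumptions:

```python
import dataclasses
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Example:  # hypothetical class, for illustration only
    a: str                                  # no default
    b: Optional[str]                        # Optional hint, but still no default
    c: str = ""                             # has a default value
    d: list = field(default_factory=list)   # has a default factory


def fields_with_defaults(cls):
    """Return the names of fields that have a default or a default factory."""
    return [
        f.name
        for f in dataclasses.fields(cls)
        if f.default is not dataclasses.MISSING
        or f.default_factory is not dataclasses.MISSING
    ]


print(fields_with_defaults(Example))  # ['c', 'd']
```

Note that `b` is not in the result even though it is annotated as `Optional[str]`, which matches the behavior described above.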

@eric-j-ason
Author

An easy workaround is to specify an empty string as the default value:

@dataclass
class Person:
    firstname: str
    lastname: str = ""
    motto: str

Thank you, but that catches me out with another error:

TypeError: non-default argument 'motto' follows default argument

@theta-lin

@eric-j-ason Oops, I forgot to test that. Dataclass fields without a default must come before fields with one, so putting lastname as the last attribute should work:

@dataclass
class Person:
    firstname: str
    motto: str
    lastname: str = ""

Or you can add a default value to motto as well:

@dataclass
class Person:
    firstname: str
    lastname: str = ""
    motto: str = ""
