Fix accountBalance format and file encoding #73

Merged: 4 commits into master from boursorama, Jun 11, 2022

mincong-h (Owner) commented on Jun 11, 2022

Account Balance

TL;DR: format(amount)=FR, format(accountbalance)=ISO

Previously, the column "accountbalance" was written in French format:

  • "1 000,00"
  • 370,00

Now it uses the ISO format, without a thousands separator:

  • 1000.00
  • 370.00

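However, the other column, "amount", is unchanged and still uses the French format. Since the decimal and thousands settings passed to pandas apply to the whole file, the two columns can no longer be parsed with the same settings. To fix this, we read the column "accountbalance" as a string when loading the CSV file and cast it ourselves.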
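The approach can be sketched as follows. This is a minimal illustration, not the actual code in finance_toolkit/boursorama.py: the sample rows, the `SAMPLE` constant and the `read_sample` helper are hypothetical, and the cast to float is an assumption about what "cast it ourselves" means.

```python
import io

import pandas as pd

# Hypothetical sample mimicking the export: "amount" is in French format,
# "accountbalance" is in ISO format.
SAMPLE = (
    "dateOp;amount;accountbalance\n"
    '2022-06-10;"1 000,00";1000.00\n'
    "2022-06-11;-30,00;370.00\n"
)


def read_sample(text: str) -> pd.DataFrame:
    """Parse "amount" with French settings and cast "accountbalance" manually."""
    df = pd.read_csv(
        io.StringIO(text),
        delimiter=";",
        decimal=",",      # French decimal separator, used by "amount"
        thousands=" ",    # French thousands separator, used by "amount"
        dtype={"accountbalance": "str"},  # keep the raw text, no French parsing
    )
    # The ISO text ("1000.00") converts directly to a float.
    df["accountbalance"] = df["accountbalance"].astype(float)
    return df


df = read_sample(SAMPLE)
```

Reading the column as a string keeps the file-wide French settings from being applied to the ISO values.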

Encoding

Previously, the encoding of the file was ISO-8859-1; it has now changed to UTF-8. When the file is still read as ISO-8859-1, the first header is no longer recognized as dateOp (presumably because leading bytes are decoded as extra characters in front of the name), as this error shows:

finance_toolkit.pipeline.PipelineDataError: Failed to read new Boursorama data. Details:
  path=/data/source/export-operations-11-06-2022_09-52-55.csv
  headers=dateOp;dateVal;label;category;categoryParent;amount;comment;accountNum;accountLabel;accountbalance
  pandas_kwargs={'decimal': ',', 'delimiter': ';', 'dtype': {'accountNum': 'str'}, 'encoding': 'ISO-8859-1', 'parse_dates': ['dateOp', 'dateVal'], 'skipinitialspace': True, 'thousands': ' '}
  pandas_error=Missing column provided to 'parse_dates': 'dateOp'
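One plausible cause of this error is a UTF-8 byte order mark (BOM) at the start of the file. The PR does not confirm that the export contains a BOM, so treat it as an assumption. The sketch below uses a hypothetical header line and a hypothetical `first_header` helper to show how the same bytes yield different first column names depending on the encoding.

```python
import codecs

# Hypothetical first bytes of the export, assuming it starts with a UTF-8 BOM.
raw = codecs.BOM_UTF8 + "dateOp;dateVal;label\n".encode("utf-8")


def first_header(data: bytes, encoding: str) -> str:
    """Decode the bytes and return the name of the first column."""
    return data.decode(encoding).splitlines()[0].split(";")[0]


# ISO-8859-1 maps each BOM byte to a visible character, so the name changes.
latin1_name = first_header(raw, "ISO-8859-1")

# "utf-8-sig" decodes UTF-8 and drops a leading BOM if present.
utf8_name = first_header(raw, "utf-8-sig")

print(repr(latin1_name), repr(utf8_name))
```

Under this assumption, a column lookup for 'dateOp' fails with ISO-8859-1 because the decoded name carries three extra leading characters.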

codecov-commenter commented on Jun 11, 2022

Codecov Report

Merging #73 (4c360d8) into master (bc55bb8) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master      #73   +/-   ##
=======================================
  Coverage   94.23%   94.23%           
=======================================
  Files          11       11           
  Lines         607      607           
  Branches       97       97           
=======================================
  Hits          572      572           
  Misses         20       20           
  Partials       15       15           
Impacted Files Coverage Δ
finance_toolkit/boursorama.py 94.20% <100.00%> (ø)


mincong-h changed the title from "Fix accountBalance format" to "Fix accountBalance format and file encoding" on Jun 11, 2022
mincong-h merged commit eac79bf into master on Jun 11, 2022
mincong-h deleted the boursorama branch on June 11, 2022 17:20
mincong-h (Owner, Author) commented

It's related to #40
