Data / Appending data to an existing dataset with custom schema fails #2957

Closed
ferblape opened this issue Apr 6, 2020 · 1 comment · Fixed by #2969

ferblape commented Apr 6, 2020

If you create a dataset with a custom schema, any later request that updates the dataset with the append option overwrites that schema. I'd expect the schema not to be overwritten by an action that only updates the data.

Steps to reproduce:

  1. Define the $API_TOKEN_ADMIN env var.

  2. Create the dataset using a custom schema:

     ruby $DEV_DIR/gobierto-etl-utils/operations/gobierto_data/upload-dataset/run.rb --api-token $API_TOKEN_ADMIN --gobierto-url http://madrid.gobierto.test --name "Calidad aire" --slug calidad-aire --table-name calidad_aire --csv-separator=';' --file-path=data.csv --schema-path=$DEV_DIR/gobierto-etl-datos/datasets/calidad_del_aire_madrid/schema_create.json

  3. Update the dataset with today's data (using the append option):

     ruby $DEV_DIR/gobierto-etl-utils/operations/gobierto_data/upload-dataset/run.rb --api-token $API_TOKEN_ADMIN --gobierto-url http://madrid.gobierto.test --name "Calidad aire" --slug calidad-aire --table-name calidad_aire --file-path=daily_data.csv --append

In the code that builds the create statements, the table is created from scratch instead of reusing the schema of the existing table.
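
To make the expected behaviour concrete, here is a minimal sketch of the decision I have in mind. It is not the actual Gobierto code: `resolve_schema`, `load_statements`, and the column names are hypothetical, and a schema is modelled as a plain Hash of column name to SQL type. The idea is only that an append to an existing table keeps the stored schema and emits no create statement.

```ruby
# Hypothetical sketch; none of these names come from the Gobierto codebase.
# A schema is modelled as a Hash of column name => SQL type.

# Pick the schema to use for a load.
# On append, an existing table's schema wins over anything inferred from the file.
def resolve_schema(existing_schema:, inferred_schema:, append:)
  return existing_schema if append && existing_schema && !existing_schema.empty?

  inferred_schema
end

# Build the DDL statements to run before inserting rows.
# On append to an existing table nothing is dropped or created.
def load_statements(table_name:, schema:, append:, table_exists:)
  return [] if append && table_exists

  columns = schema.map { |name, type| "#{name} #{type}" }.join(", ")
  [
    "DROP TABLE IF EXISTS #{table_name}",
    "CREATE TABLE #{table_name} (#{columns})"
  ]
end
```

With this shape, the second command in the steps above would resolve to the schema from the first upload and run no DDL, so the custom types would survive the daily append.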

Associated Rollbar: https://rollbar.com/Populate/gobierto/items/3649/

@ferblape ferblape added this to Backlog in Gobierto Data via automation Apr 6, 2020
@ferblape ferblape moved this from Backlog to To do in Gobierto Data Apr 6, 2020

ferblape commented Apr 6, 2020

Issue description updated. You can grab the data from the Jenkins server or ask me for it.

@entantoencuanto entantoencuanto moved this from To do to In progress in Gobierto Data Apr 13, 2020
@entantoencuanto entantoencuanto moved this from In progress to Review in progress in Gobierto Data Apr 15, 2020
Gobierto Data automation moved this from Review in progress to Done Apr 16, 2020