
It'll just throw an error when COMMITing, most likely.



And roll back the whole load? Yikes


Yup... I’ve been there and it sucks. Only realising, several hours in and with your load of ~1 billion records almost complete, that a couple of rows were duplicated and the PK constraint failed.


I wish there were a way for the db to filter out or set aside rows that violate a constraint while still loading the data that does meet the constraints. I guess the fix here is to run data quality checks before letting the db validate constraints, and delete or fix any rows that would fail. But then, depending on the transaction isolation level, you might not be able to see uncommitted rows? I dunno.


> I wish there were a way for the db to filter out or set aside rows that violate a constraint while still loading the data that does meet the constraints.

You can definitely have PG ignore rows that would violate constraints by specifying "ON CONFLICT DO NOTHING"[1] as part of your INSERT statement.
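For example, a minimal sketch (the table and column names are just placeholders):

    -- hypothetical target table "events" with a primary key on id,
    -- loaded from a hypothetical staging table "events_staging"
    INSERT INTO events (id, payload)
    SELECT id, payload
    FROM events_staging
    ON CONFLICT (id) DO NOTHING;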

If you wanted to capture those rows on the fly, one way would be to perform the load via a PL/pgSQL program with an exception handler specific to the particular constraint violation(s) you expect to hit (rough sketch below). But then you'd have to load the rows one at a time... which isn't too different from just doing that in your high-level language of choice. Also, the PL/pgSQL docs specifically recommend against this approach, since blocks with exception handlers are significantly more expensive to enter and exit.[2]

[1]: https://www.postgresql.org/docs/11/sql-insert.html#SQL-ON-CO...

[2]: https://www.postgresql.org/docs/11/plpgsql-control-structure...
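For what it's worth, a rough sketch of that row-at-a-time approach (the table names and columns are made up, and it assumes the PK violation is the only error you care about):

    -- hypothetical tables: "events" is the target, "events_staging" holds the
    -- raw load, and "events_rejected" captures the rows that failed the PK check
    DO $$
    DECLARE
      r events_staging%ROWTYPE;
    BEGIN
      FOR r IN SELECT * FROM events_staging LOOP
        BEGIN
          INSERT INTO events (id, payload) VALUES (r.id, r.payload);
        EXCEPTION WHEN unique_violation THEN
          -- set the offending row aside instead of aborting the whole load;
          -- only this sub-block rolls back, the loop keeps going
          INSERT INTO events_rejected (id, payload) VALUES (r.id, r.payload);
        END;
      END LOOP;
    END $$;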

EDIT: formatting and clarity


Of course! I forgot about that option. Makes sense.



