[BUG] Duplicates loaded into existing satellite #221
Comments
Hi, thanks for this report! This is something we've noticed internally on our own projects. We have recently developed a fix and hope to release it in due course. We've also decided to change
Hi @DVAlexHiggs, that's good to know. When you say "due course", what sort of time frame are you looking at? Are we talking a month, 6 months, 2 years?
Apologies for the vagueness! The target is this month; we've got a few QoL and performance improvements and various other things coming down the line 😄
Amazing, thank you!
Fixed in v0.10.2 😄 Thanks for your patience while waiting for this release! Please let us know if you experience any issues by responding here or opening a new issue.
Describe the bug
When loading into a Satellite table where the Stage contains two or more duplicate records (the same hash_key, hash_diff and load_timestamp) that have not previously been loaded into the satellite, all of the duplicates are loaded into the satellite rather than just one.
Environment
dbt version: 1.5.2
automate_dv version: 0.10.1
Database/Platform: Snowflake
To Reproduce
Expected behavior
The root cause
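When debugging the Satellite script, we noticed that in this CTE the rank() function assigns rank 1 to every duplicate record. Because the duplicates tie on the ordering column, none of them is filtered out and all of them are loaded. As a suggestion, changing this to row_number() would solve the issue, since row_number() gives each row in a partition a distinct number even when the rows tie.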
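To illustrate the difference between the two window functions, here is a minimal sketch using Python's built-in sqlite3 module rather than Snowflake. The table name `stage`, the sample rows, the `rn` alias and the `rn = 1` filter are assumptions made for this illustration; this is not the actual automate_dv satellite macro, and it needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical stage rows: (hash_key, hash_diff, load_timestamp).
STAGE_ROWS = [
    ("hk1", "hd1", "2023-07-01 00:00:00"),
    ("hk1", "hd1", "2023-07-01 00:00:00"),  # exact duplicate of the row above
    ("hk2", "hd2", "2023-07-01 00:00:00"),
]


def rows_kept(window_fn: str) -> int:
    """Count stage rows that survive an `rn = 1` filter for the given window function."""
    if window_fn not in ("RANK", "ROW_NUMBER"):
        raise ValueError(f"unsupported window function: {window_fn}")

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stage (hash_key TEXT, hash_diff TEXT, load_timestamp TEXT)"
    )
    conn.executemany("INSERT INTO stage VALUES (?, ?, ?)", STAGE_ROWS)

    # Number the rows per hash_key, then keep only the rows numbered 1.
    query = f"""
        SELECT COUNT(*) FROM (
            SELECT hash_key,
                   {window_fn}() OVER (
                       PARTITION BY hash_key
                       ORDER BY load_timestamp
                   ) AS rn
            FROM stage
        )
        WHERE rn = 1
    """
    (count,) = conn.execute(query).fetchone()
    conn.close()
    return count


print(rows_kept("RANK"))        # both hk1 duplicates tie at rank 1, so 3 rows pass
print(rows_kept("ROW_NUMBER"))  # only one row per hash_key passes, so 2 rows pass
```

With RANK the tied duplicates both pass the filter, which matches the behaviour described in this report. With ROW_NUMBER only one row per hash_key passes, although which of the tied rows is kept is arbitrary unless the ORDER BY includes a tie-breaker.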