Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update v61 migration to handle duplicate job names before unique constraint #2464

Merged
merged 2 commits into from
Mar 31, 2023

Conversation

collado-mike
Copy link
Collaborator

Problem

The v61 migration reapplies a unique constraint on job name and namespace without regard to parent job. Unfortunately, there are some cases when job FQNs are duplicated (mostly due to the issue fixed in #2097 , but also including cases when a job previously had no parent and now has a parent, but the same FQN). This update to the migration renames jobs that have been symlinked to point to newer versions of themselves so that the job FQN doesn't conflict and the unique constraint can be applied.

Any installations that have already applied this migration will not see any new operations on their data, but installations that have duplicates will need this fix for the migration to successfully complete.

I tested this on an installation with ~200,000 job names and was able to complete the migration successfully.

Checklist

  • You've signed-off your work
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • You've included a one-line summary of your change for the CHANGELOG.md (Depending on the change, this may not be necessary).
  • You've versioned your .sql database schema migration according to Flyway's naming convention (if relevant)
  • You've included a header in any source code files (if relevant)

@boring-cyborg boring-cyborg bot added the api API layer changes label Mar 31, 2023
@collado-mike collado-mike force-pushed the fix/61_db_migration_duplicate_jobs branch from a78e9e8 to 9912536 Compare March 31, 2023 00:06
@codecov
Copy link

codecov bot commented Mar 31, 2023

Codecov Report

Merging #2464 (9912536) into main (3b0cde3) will not change coverage.
The diff coverage is n/a.

❗ Current head 9912536 differs from pull request most recent head 0ef8e13. Consider uploading reports for the commit 0ef8e13 to get more accurate results

@@            Coverage Diff            @@
##               main    #2464   +/-   ##
=========================================
  Coverage     83.53%   83.53%           
  Complexity     1207     1207           
=========================================
  Files           231      231           
  Lines          5503     5503           
  Branches        267      267           
=========================================
  Hits           4597     4597           
  Misses          762      762           
  Partials        144      144           

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

jobs j
WHERE j.uuid = f.uuid
)
UPDATE jobs SET name=(q.simple_name || '_' || q.row)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will set a job with a conflict to {job_name}_{row_num}?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but row_number() over (PARTITION BY j.namespace_name, j.name ORDER BY j.created_at) AS row means that all jobs that have the same name will have an incrementing counter suffixed so we avoid the conflict. E.g., if we had three jobs with the FQN a.b, the names would be

  1. a.b_1
  2. a.b_2
  3. a.b_3
    Thus, we can add the uniqueness constraint because they'll all end up with different names.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, thanks for clarifying and the examples 👍

Copy link
Member

@wslulciuc wslulciuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your writeup in the PR description was helpful in understanding all that's going on in the migration script. Thanks! And, also, 👍

@collado-mike collado-mike enabled auto-merge (squash) March 31, 2023 21:13
@collado-mike collado-mike merged commit da3863c into main Mar 31, 2023
@collado-mike collado-mike deleted the fix/61_db_migration_duplicate_jobs branch March 31, 2023 21:28
Xavier-Cliquennois pushed a commit to Xavier-Cliquennois/marquez that referenced this pull request Jul 26, 2023
…traint (MarquezProject#2464)

Signed-off-by: Michael Collado <[email protected]>
Signed-off-by: Xavier-Cliquennois <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api API layer changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants