api: projects: long running job cleanup (draft) #2242

Draft: wants to merge 2 commits into base: master

gioelecerati

What does this pull request do? Explain your changes. (required)

Specific updates (required)

How did you test each of these updates (required)

Does this pull request close any open issues?

Screenshots (optional)

Checklist

  • I have read the CONTRIBUTING document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.

vercel bot commented Jul 8, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| livepeer-studio | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Jul 9, 2024 4:16pm |

@@ -177,4 +181,115 @@
res.end();
});

app.post(
"/job/projects-cleanup",
authorizer({ anyAdmin: true }),

Check failure

Code scanning / CodeQL

Missing rate limiting High

This route handler performs authorization, but is not rate-limited.
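
To make the CodeQL finding concrete, here is a minimal sketch of a fixed-window rate limiter that a route like this could consult before doing any work. The class name, its parameters, and the injected clock are all hypothetical; this is not the project's middleware, and a maintained rate-limiting package would normally be preferable.

```typescript
// Hypothetical fixed-window limiter; not the project's actual middleware.
type Clock = () => number;

class FixedWindowLimiter {
  private hits = new Map<string, { windowStart: number; count: number }>();
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly now: Clock;

  constructor(maxRequests: number, windowMs: number, now: Clock = Date.now) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true when the caller identified by `key` may proceed.
  allow(key: string): boolean {
    const t = this.now();
    const entry = this.hits.get(key);
    // No entry yet, or the previous window has expired: start a new window.
    if (!entry || t - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: t, count: 1 });
      return true;
    }
    // Window still open and the quota is used up: reject.
    if (entry.count >= this.maxRequests) {
      return false;
    }
    entry.count += 1;
    return true;
  }
}
```

In an Express app this check would typically live in a middleware registered on the route, keyed by user ID or IP address, and answer with HTTP 429 when `allow` returns false.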

@victorges left a comment

did a skim pre-review 👀

Comment on lines +184 to +186
app.post(
"/job/projects-cleanup",
authorizer({ anyAdmin: true }),

Btw if you're creating this only because the other jobs have something like it, it's unnecessary! The only reason they're like that is tech debt: we already had such APIs that were called from GitHub Actions.

The cleanest way for a new job IMO is to implement everything in the job. Then if we later want an API to trigger it manually, we can add an API that calls the job function (not the other way around).

Your call though, especially if you're already so far into this API that getting rid of it now would be a big refactor.
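
A rough sketch of the layering suggested here: the job function owns the logic, and an optional manual trigger only delegates to it. Every name below (`JobDeps`, `runProjectsCleanupJob`, `handleManualTrigger`) is made up for illustration and is not taken from the PR.

```typescript
// All names here are hypothetical; the real job and DB interfaces may differ.
interface Project {
  id: string;
}

interface JobDeps {
  listDeletedProjects(limit: number): Promise<Project[]>;
  cleanUpProject(project: Project): Promise<void>;
}

// Core logic lives in the job and depends only on what it is handed.
async function runProjectsCleanupJob(
  deps: JobDeps,
  limit = 100,
): Promise<string[]> {
  const projects = await deps.listDeletedProjects(limit);
  const cleaned: string[] = [];
  for (const project of projects) {
    await deps.cleanUpProject(project);
    cleaned.push(project.id);
  }
  return cleaned;
}

// Optional thin trigger: an HTTP handler would parse input and delegate.
async function handleManualTrigger(
  deps: JobDeps,
  body: { limit?: number },
): Promise<{ status: number; cleaned: string[] }> {
  const cleaned = await runProjectsCleanupJob(deps, body.limit ?? 100);
  return { status: 200, cleaned };
}
```

With this shape, a scheduler can call `runProjectsCleanupJob` directly, and the API route can be added or dropped without touching the job.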

});

for (const asset of assets) {
await req.taskScheduler.deleteAsset(asset.id);

Make sure this doesn't send webhooks I guess? Or do we want webhooks?

}

for (const stream of streams) {
await db.stream.update(stream.id, {

FWIW there's a markDeleted function on the tables, which is what we generally use for soft deletion.
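
To illustrate why a dedicated helper is preferable to ad hoc `update` calls, here is a sketch of an in-memory table with a `markDeleted` method. The class and its signatures are assumptions for illustration only; the real tables' `markDeleted` may take different arguments or do more.

```typescript
// Hypothetical in-memory table; the project's real table API may differ.
interface SoftDeletable {
  id: string;
  deleted?: boolean;
}

class InMemoryTable<T extends SoftDeletable> {
  private rows = new Map<string, T>();

  insert(row: T): void {
    this.rows.set(row.id, row);
  }

  get(id: string): T | undefined {
    return this.rows.get(id);
  }

  // Single place that defines what "soft delete" means for this table,
  // so callers cannot forget a field or set extra ones by accident.
  markDeleted(id: string): void {
    const row = this.rows.get(id);
    if (!row) {
      throw new Error(`row not found: ${id}`);
    }
    this.rows.set(id, { ...row, deleted: true });
  }
}
```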

for (const signingKey of signingKeys) {
await db.signingKey.update(signingKey.id, {
deleted: true,
disabled: true,

Do we need to disable them as well? Does the code that checks disabled not also check deleted? If so, that's the thing that needs to be fixed instead, IMO.

My reasoning is that if we ever want to "undelete" a project then we would need to figure out which signing keys were already disabled and which ones we disabled automatically here.
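
The concern can be sketched as follows, assuming (hypothetically) that every reader goes through a single `isKeyUsable` check that looks at both flags. Deletion then never needs to touch `disabled`, and an undelete restores exactly the state the user had chosen.

```typescript
// Hypothetical shape; the real signing key records have more fields.
interface SigningKey {
  id: string;
  disabled: boolean;
  deleted: boolean;
}

// Readers check both flags, so deletion does not have to set `disabled`.
function isKeyUsable(key: SigningKey): boolean {
  return !key.deleted && !key.disabled;
}

// Soft delete only flips `deleted`; the user's `disabled` choice is preserved.
function softDeleteKey(key: SigningKey): SigningKey {
  return { ...key, deleted: true };
}

// Undelete only flips `deleted` back; no guessing about who disabled what.
function undeleteKey(key: SigningKey): SigningKey {
  return { ...key, deleted: false };
}
```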

Comment on lines +83 to +85
const triggerSpy = jest
.spyOn(projectsController, "triggerCleanUpProjectsJob")
.mockImplementation(() => [[], Promise.resolve()]);

Same point regarding the "inverted" nature of the current jobs implementation. Let's make sure the "real" tests are implemented here and check the effects of the deletion job. The API should be the "sidekick" that calls the job, while the core logic lives in the job itself.
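
As a sketch of what an effects-based test could look like, the example below runs a stand-in job against an in-memory fake and asserts on the resulting state instead of spying on a trigger function. The fake DB shape and `cleanUpProjects` are invented for illustration, and plain `throw` checks are used so the snippet stands alone; in the repository's suite these would presumably be Jest `expect` calls.

```typescript
// Hypothetical fake DB and job; the real job touches more resource types.
interface FakeProject {
  id: string;
  deleted?: boolean;
}

interface FakeStream {
  id: string;
  projectId: string;
  deleted?: boolean;
}

interface FakeDb {
  projects: FakeProject[];
  streams: FakeStream[];
}

// Stand-in job: soft-deletes streams that belong to deleted projects.
async function cleanUpProjects(db: FakeDb): Promise<void> {
  const deletedIds = new Set(
    db.projects.filter((p) => p.deleted).map((p) => p.id),
  );
  for (const stream of db.streams) {
    if (deletedIds.has(stream.projectId)) {
      stream.deleted = true;
    }
  }
}

// The test asserts on what the job changed, not on whether it was called.
async function testJobOnlyTouchesDeletedProjects(): Promise<void> {
  const db: FakeDb = {
    projects: [{ id: "gone", deleted: true }, { id: "live" }],
    streams: [
      { id: "s1", projectId: "gone" },
      { id: "s2", projectId: "live" },
    ],
  };
  await cleanUpProjects(db);
  if (db.streams[0].deleted !== true) {
    throw new Error("stream of deleted project should be soft-deleted");
  }
  if (db.streams[1].deleted) {
    throw new Error("stream of live project should be untouched");
  }
}
```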

Comment on lines +23 to +26
let [projects] = await jobsDb.stream.find([sql`data->>'deleted' = 'true'`], {
limit,
order: "data->>'lastSeen' DESC",
});

We probably need something like a "deleting phase" on the projects too, otherwise this will keep listing deleted projects that have already been cleaned up ad infinitum. Maybe a simple cleanedUp boolean on the project that gets set by this job?
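
A minimal sketch of the `cleanedUp` idea, using an in-memory array in place of the real query. The row shape and function names are assumptions; in the actual job the same condition would be an additional filter on the database query.

```typescript
// Hypothetical row shape; `cleanedUp` is the flag proposed in this comment.
interface ProjectRow {
  id: string;
  deleted: boolean;
  cleanedUp?: boolean;
}

// Only deleted projects that have not been cleaned up yet are selected.
function selectProjectsToClean(rows: ProjectRow[], limit: number): ProjectRow[] {
  return rows.filter((p) => p.deleted && !p.cleanedUp).slice(0, limit);
}

// One job pass: clean each selected project, then mark it as done so the
// next pass does not pick it up again.
function runCleanupPass(
  rows: ProjectRow[],
  limit: number,
  cleanUp: (project: ProjectRow) => void,
): string[] {
  const batch = selectProjectsToClean(rows, limit);
  for (const project of batch) {
    cleanUp(project);
    project.cleanedUp = true;
  }
  return batch.map((p) => p.id);
}
```

Setting the flag only after `cleanUp` returns means a pass that fails midway leaves the project eligible for a retry.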


let [projects] = await jobsDb.stream.find([sql`data->>'deleted' = 'true'`], {
limit,
order: "data->>'lastSeen' DESC",

Does this exist? I thought it was a user/api-key/stream thing only.
