CI: Add GH Action for /retest comments to re-run failed jobs #12864

Closed
agilgur5 opened this issue Mar 30, 2024 · 3 comments · Fixed by #13000 or #13007
Labels
area/build Build or GithubAction/CI issues solution/suggested A solution to the bug has been suggested. Someone needs to implement it. type/feature Feature request

Comments

@agilgur5
Member

agilgur5 commented Mar 30, 2024

Summary

Create a GH Action Workflow that reads comments by Members on PRs and detects /retest. If detected, it should use the GH API to "re-run failed jobs".

Right now, re-running failed jobs is limited to Approver+ (those with "write" permissions), so the Action would perform the re-run on behalf of Members and Reviewers. This will be particularly useful for test flakes.

This would be similar to upstream k8s's bot, which reruns CI after detecting a /retest comment.

Use Cases

In particular, this is useful when the repo has a bout of flakey tests, such as:

Contributors, including me, have asked how to retry in those cases in the past:

Pushing an empty commit (or closing and re-opening the PR) works, but it re-runs all GH jobs, not just the failed one(s). Using /retest to re-run only the failed jobs would be faster and more efficient.

We should still fix test flakes, especially as they are sometimes due to unhandled race conditions in the source code (not just test races). In the interim, while they are being diagnosed, root-caused, and fixed, such a /retest command is very useful.

Implementation Details

Similar to #12592 (comment) for /cherry-pick, we can run an action when a comment is made on a PR:

  1. We can check that the comment is from a Member of the org with `if: github.event.comment.author_association == 'MEMBER'`
  2. We can re-run failed jobs via the GH CLI: `gh run rerun RUN_ID --failed`
    • See the re-running jobs docs for the CLI
    • There's a slight complication here: you have to get the latest RUN_ID from the PR number. I think there are several ways of doing that and seemingly no shortcut command?
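
To make the two steps above concrete, here is a minimal sketch of what such a workflow could look like. It is not the implementation that was eventually merged. The workflow name, the exact-match check on the comment body, the `PR_NUMBER` variable, and the way the run IDs are looked up (PR head commit, then failed runs for that commit via the REST API) are all assumptions made for illustration; this is just one of the several possible ways to get from a PR number to a RUN_ID.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/retest.yaml (name is an assumption)
name: Retest
on:
  issue_comment:
    types: [created]

permissions:
  actions: write        # required to re-run workflow runs
  pull-requests: read   # required to look up the PR's head commit

jobs:
  retest:
    # Step 1: only react to "/retest" comments from org Members, and only on PRs
    # (issue_comment also fires for plain issues, hence the pull_request check).
    # Matching the whole comment body exactly is an assumption of this sketch.
    if: >-
      github.event.issue.pull_request &&
      github.event.comment.author_association == 'MEMBER' &&
      github.event.comment.body == '/retest'
    runs-on: ubuntu-latest
    steps:
      - name: Re-run failed jobs for the PR's latest commit
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GH_REPO: ${{ github.repository }}
          PR_NUMBER: ${{ github.event.issue.number }}
        run: |
          # Resolve the PR number to its current head commit SHA.
          sha="$(gh pr view "$PR_NUMBER" --json headRefOid --jq .headRefOid)"

          # List workflow runs for that commit and keep only the failed ones.
          gh api "repos/$GH_REPO/actions/runs?head_sha=$sha" \
            --jq '.workflow_runs[] | select(.conclusion == "failure") | .id' |
            while read -r run_id; do
              # Step 2: re-run only the failed jobs of each failed run.
              gh run rerun "$run_id" --failed
            done
```

Because the comment body is only compared inside the `if:` expression and never interpolated into the shell script, untrusted comment text does not reach the `run:` step. If no run for the head commit has failed, the loop body never executes and the job ends without doing anything.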

We could also extract step 2 into its own separate OSS action for other repos to use. I couldn't find one from some searching, so I don't think it exists already?


Message from the maintainers:

Love this feature request? Give it a 👍. We prioritise the proposals with the most 👍.

@agilgur5 agilgur5 added type/feature Feature request area/build Build or GithubAction/CI issues solution/suggested A solution to the bug has been suggested. Someone needs to implement it. labels Mar 30, 2024
@terrytangyuan
Member

This is fine. We use this for other projects as well

@miltalex
Member

@agilgur5 I could have a look into this if it is up for grabs.

@agilgur5
Member Author

Go for it

miltalex added a commit to miltalex/argo-workflows that referenced this issue May 1, 2024
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 1, 2024
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 1, 2024
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 1, 2024
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 2, 2024
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 2, 2024
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 2, 2024
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 2, 2024
agilgur5 added a commit that referenced this issue May 2, 2024
Co-authored-by: Anton Gilgur <[email protected]>
Signed-off-by: Miltiadis Alexis <[email protected]>
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 4, 2024
agilgur5 pushed a commit that referenced this issue May 4, 2024
3 participants