0012-keep-job-retrying-off.md

  • Start Date: 2022-09-20
  • RFC Type: informational
  • RFC PR: #12

Summary

eng-pipes (our internal service for handling webhooks) attempts to auto-retry GitHub Actions builds for getsentry (internal sentry) in two cases:

  1. any job which fails the ensure docker image step
  2. any failed required job on the primary branch

the latter was recently disabled after we discovered that it was broken and was also blocking internal messaging.

the proposal is to remove this functionality entirely.
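
for reference, the two retry conditions listed above amount to a small predicate. the following is only a sketch of that rule, not the eng-pipes implementation: the `FailedJob` type, the step name string, and the primary branch name are all hypothetical placeholders.

```python
from dataclasses import dataclass

# Hypothetical constants: the real step and branch names are not given in this RFC.
ENSURE_IMAGE_STEP = "ensure docker image"
PRIMARY_BRANCH = "master"


@dataclass
class FailedJob:
    """Minimal, assumed description of a failed CI job."""
    failed_step: str  # name of the step that failed
    branch: str       # branch the workflow ran on
    required: bool    # whether the job is a required check


def should_retry(job: FailedJob) -> bool:
    """Return True if the job matches either auto-retry condition."""
    # Condition 1: any job which fails the ensure docker image step.
    if job.failed_step == ENSURE_IMAGE_STEP:
        return True
    # Condition 2: any failed required job on the primary branch.
    return job.required and job.branch == PRIMARY_BRANCH
```

under this sketch, removing the feature means no failed job is ever rerun automatically, regardless of which condition it would have matched.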

Motivation

  1. dev-infra believes it is more important to improve job reliability than to invest in a big-hammer retry, which is more likely to lead to the actual problems being ignored
  2. it would require significant investment to make it work properly
  3. removing this feature removes complexity in eng-pipes

Background

we've invested a lot recently into reducing flakiness of setup tasks:

we also already have 5x retries for python tests, which we believe is too high but is generally a better retry mechanism than rerunning the whole job. in the future we'd like to reduce this, as it enables flaky tests as much as it improves the CI experience; however, that is out of scope for this rfc.
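
to illustrate why a per-test retry both helps and hides problems, here is a minimal sketch of such a mechanism. it is not the retry tooling actually used in getsentry; `run_with_retries` is a hypothetical helper, and reading "5x retries" as one initial attempt plus up to five reruns is an assumption.

```python
from typing import Callable


def run_with_retries(test: Callable[[], None], max_retries: int = 5) -> int:
    """Run a single test, rerunning it on failure.

    Returns the number of attempts used. Re-raises the last failure once
    the initial attempt plus max_retries reruns have all failed.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            test()
            return attempts
        except AssertionError:
            # Only this one test is rerun, not the whole job.
            if attempts > max_retries:
                raise
```

a test that fails intermittently will usually pass within the allowed attempts and be reported as green, which is what makes the CI experience smoother and also what lets the flakiness go unnoticed.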

Supporting Data

I cannot find any successful transactions of this feature in the ENG-PIPES sentry project -- there are however (resolved) failures.

Options Considered

the other option is to invest in fixing and supporting this functionality.

Drawbacks

the main drawback is that, if this functionality actually worked, it would potentially improve the CI experience.

Unresolved questions

  • dev-infra agrees with this plan but wants to get input before moving forward