-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
governance: noting CI is a maintainer responsibility, emailing failures to on-call #6342
Conversation
…failures to on-call Signed-off-by: Alyssa Wilk <[email protected]>
Oh wow, the defaults for notification are 0 m: alert on calls for Envoy So.... I'm going to go ahead and fix that. For now, I'm going to remove forwarding to secondary and "all", and replace that with "also notify alyssa". If I can get high signal for build breakage I think it'd be cool to notify the maintainer team for full master breaks, but realistically I suspect I won't be able to do that without also including a bunch of flake-spam, I'll remove myself and call it a day. |
|
||
notify: | ||
webhooks: | ||
-url: https://api.opsgenie.com/v1/json/circleci?apiKey=f45d58a7-15a4-4800-8d2c-f460ec02cf71 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for us to document where this magic key comes from in some place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a security issue? Should this API key be injected via env variable? Do we need to revoke and generate a new key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the opsgenie doc linked in the description this is best practice for setting up build alerts.
Given the only thing we use the maintainer rotation for is for "who is triaging" and the only thing this does is hook circle up to opsgenie, I think it's Ok (happy to revoke and rerun if folks disagree). I don't know if opsgenie does client cert auth on this url (I could ask if we care) but I guess fundamentally I'm not terribly worried about malicious use of this. If people want to spam the Envoy maintainers there are easier routes and we can always disable it if we need to :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you just paste that link in here as a comment? I suppose it's reasonable that future maintainers who need to adjust the key for whatever reason will follow source control to this PR and its description, but it might be easier if it was a direct comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, pasted both the instructions and the config link in my last update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks Alyssa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh. For the record, they do not do client cert auth, based on some curl tests.
I'd still be inclined to check this in and deal with spam if we get it, but open to other ideas.
I'll also reach out to opsgenie and ask they do better here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you potentially add a comment here about how to revoke this if we end up with a spam/security issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM. I'm mildly appalled they do no checks on this but given we don't carry pagers I care less than I otherwise would :-P
Signed-off-by: Alyssa Wilk <[email protected]>
Signed-off-by: Alyssa Wilk <[email protected]>
I believe per https://docs.opsgenie.com/docs/circleci-integration and https://app.opsgenie.com/integration#/edit/CircleCi/d9585dd7-b9c5-4723-abee-647a623639e4 (logged in users only) this will result in CircleCI failures getting emailed to maintainer-on-call
I think emails are better than relying on slack or polling to notice build failures
Risk Level: n/a for Envoy. Medium for annoying @jmarantz (or all of us!) if it ends up spammy :-P
Testing: Testing via submit / rollback. Whee!
Docs Changes: yep
Release Notes: nope
Fixes our build breakages going into the void