Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the ability to disable SIGTERM reporting #4013

Closed
sindresorhus opened this issue May 23, 2024 · 10 comments · Fixed by #4025
Closed

Add the ability to disable SIGTERM reporting #4013

sindresorhus opened this issue May 23, 2024 · 10 comments · Fixed by #4025

Comments

@sindresorhus
Copy link

Problem Statement

Support for SIGTERM was added in #3895, but in practice, I don't find it very useful or actionable. I have just been spammed with a lot of notifications about SIGTERM, which turns out to be users simply rebooting their Mac (I have a crash report form, so some users filled it out with that reason), and the system sending SIGTERM to all apps.

Solution Brainstorm

Add a setting to disable the SIGTERM handling.

Are you willing to submit a PR?

No

@philipphofmann
Copy link
Member

philipphofmann commented May 24, 2024

You can drop these events in beforeSend for now. We're going to add a flag. Would you say it makes sense to disable this per default on macOS, @sindresorhus?

@sindresorhus
Copy link
Author

Would you say it makes sense to disable this per default on macOS, @sindresorhus?

I think it could still be useful to attach more information to watchdog events (as mentioned in #3895). For example, if the app is frozen and macOS tries to terminate it, it would be ideal to get a stack trace of what's currently running, so I could find the culprit. But reporting all SIGTERM events is not useful. So maybe catch SIGTERM events internally, but only use it to enrich other events? (I don't have deep knowledge how the Sentry SDK works though).

@philipphofmann
Copy link
Member

It’s the OS trying to tell you that it wants apps to exit gracefully. I’ve seen it happen in many different scenarios. Some of the interesting ones are when the device is shutdown, when an app will be terminated to update it and when something changes in traits that might require the OS to restart apps.

What’s interesting is that previously this would all go unnoticed to us the developers. If these happen when the user can see them, or perceive them, then we have an issue that need to be resolved. If they happen in the background, then we just hope that state restauration is well implemented and users don’t notice it.

@naftaly you wrote this ⬆️ here #3895 (comment). I'm starting to believe we shouldn't mark these events as fatal crashes but report them as normal events or do you think it would be better to attach this information to other events as suggested above? I think the story is different for macOS and iOS.

@naftaly
Copy link
Contributor

naftaly commented May 27, 2024

@naftaly you wrote this ⬆️ here #3895 (comment). I'm starting to believe we shouldn't mark these events as fatal crashes but report them as normal events or do you think it would be better to attach this information to other events as suggested above? I think the story is different for macOS and iOS.

Interesting. What is making you believe we shouldn’t report these as fatal? Agreed they aren’t really “crashes”, they are OS terminations, but they are still relevant and will cause a cold start which is something we might want to avoid.

I’m not sure suggested attaching the info to another event, there usually won’t be another event since the app is exited after this signal. I’m curious to hear more about your thoughts on this idea.

Update: I just read a bit more from the issue above. I see that in the case @sindresorhus this type of issue might not be relevant. These issues are the kind you want to track in your dashboard, understand why it’s happening and act on them if required, not all will require action. For @sindresorhus, I wouldn’t show these issues to users locally, they’re more interested in “programmer error” type issues. Over time, the vision of how to take on reliability issues will grow and these graceful exits will become much more relevant.

I think a local filter or flag to not report these will work just fine.

@sindresorhus
Copy link
Author

These issues are the kind you want to track in your dashboard, understand why it’s happening and act on them if required, not all will require action.

IMHO, if something is not actionable, it should not be reported as a fatal crash with a notification (and something that count towards the event limit).

@naftaly
Copy link
Contributor

naftaly commented May 28, 2024

These issues are the kind you want to track in your dashboard, understand why it’s happening and act on them if required, not all will require action.

IMHO, if something is not actionable, it should not be reported as a fatal crash with a notification (and something that count towards the event limit).

@sindresorhus You're right. Terminology wise, these aren't even "crashes" (what's a non-fatal crash btw?), they are OS terminations that can sometimes be the apps fault, sometimes not. I'd say Sentry here should provide a way for you to filter what gets reported and counts against your event limit.

@philipphofmann
Copy link
Member

I think adding a flag anyway makes sense, especially for macOS. As the SIGTERM signal is only a request to terminate your app, we must change the level to non-fatal. IMO, the level should be either warning or info. Still, for some users, these events might be useful.

@naftaly, can you elaborate on how you treat SIGTERMs? What would you do to avoid them? Is the stacktrace relevant to you?

@naftaly
Copy link
Contributor

naftaly commented May 28, 2024

I think adding a flag anyway makes sense, especially for macOS. As the SIGTERM signal is only a request to terminate your app, we must change the level to non-fatal. IMO, the level should be either warning or info. Still, for some users, these events might be useful.

FWIW, I think sigterm should be left as fatal since it « is ». In theory, we can catch the signal, and the app wouldn’t be terminated. Of course, that’s theory, in reality, the app then receives a SIGKILL. MacOS and iOS both have many reasons they receive SIGTERM, all being undocumented. What SIGTERM gives you is an extra piece of information, and time to gather data before being terminated. It won’t always be an event you want to make a big deal of.

@naftaly, can you elaborate on how you treat SIGTERMs? What would you do to avoid them? Is the stacktrace relevant to you?

The reliability systems I worked with reported everything. At the point of reporting from client to server, there was an option to filter what you wanted to send up. On the server, there was an option to filter what was important, and categorize them. For example, you had crashes, terminations which were split into smaller categories such as exceptions, OOMs, hangs and so on. There was always a large portion leftover that we had no clue what they were. This is where SIGTERM comes in as it allows us to get a lot more insight into those issues. With that in mind, we were able to reduce that « unexplained » portion by a very large amount. We would use the stack trace (not simply the main thread, but all threads) as well as any other « breadcrumbs » we would collect.

SIGTERM was a game changer at that point. I suggest all systems have it in their backpocket.

@lm
Copy link

lm commented May 29, 2024

iOS (and probably others) sends lots of SIGTERMs to widget extensions. So if you are using Sentry within widget you will end up with great deal of random "crashes" - which they are clearly not.

@kahest
Copy link
Member

kahest commented May 29, 2024

We will add an option for this and make it opt-in. Once we have more feedback, we will decide if and how to iterate (e.g. auto-enable it, adapt grouping, etc.).

@philipphofmann philipphofmann self-assigned this May 29, 2024
philipphofmann added a commit that referenced this issue May 31, 2024
We added support for SIGTERM reporting in the last release and enabled
it by default. For some users, SIGTERM events were verbose and not
actionable. Therefore, we disable it per default.

Fixes GH-4013
philipphofmann added a commit that referenced this issue May 31, 2024
We added support for SIGTERM reporting in the last release and enabled
it by default. For some users, SIGTERM events were verbose and not
actionable. Therefore, we disable it per default.

Fixes GH-4013
philipphofmann added a commit that referenced this issue May 31, 2024
We added support for SIGTERM reporting in the last release and enabled
it by default. For some users, SIGTERM events were verbose and not
actionable. Therefore, we disable it per default.

Fixes GH-4013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants