
feat: Add SIGTERM support #3895

Merged: 9 commits into getsentry:main, May 8, 2024

Conversation

@naftaly (Contributor) commented Apr 24, 2024

📜 Description

Added support for catching the SIGTERM signal.

(Two screenshots attached, taken 2024-04-24 at 12:45 PM, showing the change.)

💡 Motivation and Context

All Apple OSes send SIGTERM (like any good Unix-based system) when they want to request a graceful termination of an application; if the app doesn't quit in those circumstances, it receives a SIGKILL. Catching SIGTERM often gives us the opportunity to get a stack trace in cases where we know Apple wants the app terminated but don't know why (watchdog events). This is just one more piece of information that can be added to the puzzle of figuring out unexplained app terminations.
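For background, here is a minimal sketch of how a process can observe SIGTERM with a POSIX sigaction handler and then re-raise the signal so termination still proceeds. The handler name and the write-then-re-raise strategy are illustrative assumptions, not the SentryCrash implementation:

```c
#include <signal.h>
#include <string.h>
#include <unistd.h>

// Hypothetical handler name; only async-signal-safe calls belong in here.
static void handle_sigterm(int signum, siginfo_t *info, void *context)
{
    (void)info;
    (void)context;

    static const char msg[] = "received SIGTERM, shutting down\n";
    write(STDERR_FILENO, msg, sizeof(msg) - 1);

    // Restore the default disposition and re-raise so the process still
    // terminates the way the OS asked it to.
    signal(signum, SIG_DFL);
    raise(signum);
}

int main(void)
{
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_flags = SA_SIGINFO;
    action.sa_sigaction = handle_sigterm;
    sigemptyset(&action.sa_mask);
    sigaction(SIGTERM, &action, NULL);

    pause(); // block until a signal arrives
    return 0;
}
```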

💚 How did you test it?

I ran the tests and also tested on a real app.

📝 Checklist

You have to check all boxes before merging:

  • I reviewed the submitted code.
  • I added tests to verify the changes.
  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
  • I updated the docs if needed.
  • Review from the native team if needed.
  • No breaking change or entry added to the changelog.
  • No breaking change for hybrid SDKs or communicated to hybrid SDKs.

🔮 Next steps

@armcknight (Member) left a comment:

Thanks for the PR @naftaly! I think it's fine if we leave in the SIGTRAP logic; it looks like that was a bug that has since been fixed upstream. Would you mind making that small change?

codecov bot commented Apr 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.604%. Comparing base (1267cb0) to head (4932e35).

Additional details and impacted files


@@              Coverage Diff              @@
##              main     #3895       +/-   ##
=============================================
- Coverage   90.815%   90.604%   -0.211%     
=============================================
  Files          590       589        -1     
  Lines        45946     45854       -92     
  Branches     16380     16273      -107     
=============================================
- Hits         41726     41546      -180     
- Misses        4040      4217      +177     
+ Partials       180        91       -89     
| Files | Coverage | Δ |
|---|---|---|
| Sources/SentryCrash/Recording/SentryCrashDoctor.m | 55.597% <100.000%> | (+2.012%) ⬆️ |
| ...entryCrash/Recording/Tools/SentryCrashSignalInfo.c | 100.000% <ø> | (ø) |
| ...ntryTests/SentryCrash/SentryCrashDoctorTests.swift | 100.000% <100.000%> | (ø) |

... and 42 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1267cb0...4932e35.

@armcknight (Member) left a comment:

The code changes look good to me; we might just want to push this diff up in one of our own branches to get all the other tests running.

@naftaly (Contributor, Author) commented May 6, 2024

@armcknight is there anything you need me to do to get this in?

@armcknight mentioned this pull request May 6, 2024

@armcknight (Member) commented:
I'm just running the other tests really quick in #3946, and then we can merge this! Thanks @naftaly for your patience.

@armcknight (Member) left a comment:

I'm happy with the results of the tests in #3946! 🙏🏻

@philipphofmann (Member) left a comment:

Just some missing tests. Apart from that LGTM. Thanks for doing this @naftaly 💯

@philipphofmann (Member) left a comment:

LGTM, thanks for the test @naftaly 👏

@philipphofmann changed the title from "Add SIGTERM support" to "feat: Add SIGTERM support" May 8, 2024
@philipphofmann merged commit ad404de into getsentry:main May 8, 2024
63 of 68 checks passed
philipphofmann added a commit that referenced this pull request May 10, 2024
Add support for catching sigterm signals.

Co-authored-by: Philipp Hofmann <[email protected]>
threema-matteo pushed a commit to threema-ch/sentry-cocoa that referenced this pull request May 21, 2024
Add support for catching sigterm signals.

Co-authored-by: Philipp Hofmann <[email protected]>
@eric commented May 21, 2024

What is this telling me when I get this on iOS and tvOS? Is this a good or a bad thing?

What would Sentry have been reporting otherwise? The Watchdog event?

@naftaly (Contributor, Author) commented May 21, 2024

> What is this telling me when I get this on iOS and tvOS? Is this a good or a bad thing?

It’s the OS trying to tell you that it wants apps to exit gracefully. I’ve seen it happen in many different scenarios. Some of the interesting ones are when the device is shut down, when an app is about to be terminated so it can be updated, and when something changes in traits that might require the OS to restart apps.

What’s interesting is that previously this would all go unnoticed by us, the developers. If these happen when the user can see or perceive them, then we have an issue that needs to be resolved. If they happen in the background, then we just hope that state restoration is well implemented and users don’t notice.

> What would Sentry have been reporting otherwise? The Watchdog event?

Likely nothing; it would have gone unnoticed.

@eric commented May 21, 2024

We've always had a large number of WatchdogTermination events that didn't correspond to memory issues. Could it have been that these were actually receiving SIGTERM and it was not handled/reported and thus turned into the WatchdogTermination event?

@naftaly (Contributor, Author) commented May 21, 2024

> We've always had a large number of WatchdogTermination events that didn't correspond to memory issues. Could it have been that these were actually receiving SIGTERM and it was not handled/reported and thus turned into the WatchdogTermination event?

I guess it’s possible, since the Watchdog code in Sentry (if I’m not mistaken) simply returns at the end of its heuristics and calls it an OOM/Watchdog event. So if the actual signal isn’t caught, then that might be the default.

FWIW, I’ve also proposed a PR to add real OOM recognition, so that could help with other false positives. It might end up in KSCrash first though, which is what Sentry uses for crash reporting.
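For illustration only, an exclusion-style heuristic of that shape could look roughly like the sketch below; the names and flags are invented for the example and are not Sentry's actual watchdog code:

```c
#include <stdbool.h>

typedef enum {
    PREVIOUS_EXIT_CRASH,
    PREVIOUS_EXIT_USER_QUIT,
    PREVIOUS_EXIT_OS_UPDATE,
    PREVIOUS_EXIT_WATCHDOG_OR_OOM, // fallback when nothing else explains the exit
} previous_exit_t;

// Hypothetical exclusion heuristic: rule out the exits we can positively
// detect; whatever is left over gets attributed to a watchdog/OOM termination.
// An uncaught SIGTERM would also fall through to the fallback case.
static previous_exit_t
classify_previous_exit(bool had_crash_report, bool user_quit, bool os_updated)
{
    if (had_crash_report) {
        return PREVIOUS_EXIT_CRASH;
    }
    if (user_quit) {
        return PREVIOUS_EXIT_USER_QUIT;
    }
    if (os_updated) {
        return PREVIOUS_EXIT_OS_UPDATE;
    }
    return PREVIOUS_EXIT_WATCHDOG_OR_OOM;
}
```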

@naftaly (Contributor, Author) commented May 21, 2024

@eric if you’re getting SIGTERMs now, it would be interesting to know what type of event is going down (if any). That would help understand exactly what is happening, or will happen, for your app.

@eric commented May 21, 2024

I just received the first SIGTERM on our TestFlight beta today, and so wasn't sure what to do with it... it didn't really feel actionable.

@eric commented May 21, 2024

(Screenshot attached, taken 2024-05-21 at 15:55.)

We do currently send memory usage with all of our breadcrumbs to try to give us a better idea of where we have actual memory issues. It always seemed silly to me that Sentry couldn't at least sample the memory usage once per second and store it somewhere to give a more informed message related to the Watchdog events.
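As a rough sketch of that idea, assuming the Mach task_vm_info API and a plain once-per-second polling loop (the helper name is hypothetical, not a Sentry API):

```c
#include <mach/mach.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

// Returns the process's current physical memory footprint in bytes,
// or 0 if the query fails.
static uint64_t current_footprint_bytes(void)
{
    task_vm_info_data_t info;
    mach_msg_type_number_t count = TASK_VM_INFO_COUNT;
    kern_return_t kr = task_info(mach_task_self(), TASK_VM_INFO,
                                 (task_info_t)&info, &count);
    return kr == KERN_SUCCESS ? info.phys_footprint : 0;
}

int main(void)
{
    // Sample roughly once per second. A real integration would persist the
    // most recent value so it could be attached to a later watchdog report.
    for (int i = 0; i < 10; i++) {
        printf("footprint: %.1f MB\n",
               current_footprint_bytes() / (1024.0 * 1024.0));
        sleep(1);
    }
    return 0;
}
```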

@naftaly (Contributor, Author) commented May 21, 2024

> I just received the first SIGTERM on our Testflight beta today, and so wasn't sure what to do with it... it didn't really feel actionable.

If it’s background and it feels unactionable, then I would not prioritize it, but still keep in mind it’s there and happening. If it’s foreground, then I usually start adding breadcrumbs to make it actionable if the stack isn’t enough to give you an idea of what is going on. These issues are definitely going to be harder than your run-of-the-mill exception, since they happen based on what happened in the past vs. a mistake made at that point in time. If you can share the stack (of all threads), I would be happy to take a look and see if I spot anything.

@eric commented May 21, 2024

It doesn't seem like much of anything is happening here: https://fancy-bits-llc.sentry.io/issues/5386894502/

What is the signal that is sent when the OS decides that it's time for your app to stop running in the background when it needs to get more memory? Is SIGTERM ever going to be sent during normal operation?

@naftaly (Contributor, Author) commented May 21, 2024

> It doesn't seem like much of anything is happening here: https://fancy-bits-llc.sentry.io/issues/5386894502/

I don’t have access. Maybe copy it to a gist or something.

> What is the signal that is sent when the OS decides that it's time for your app to stop running in the background when it needs to get more memory?

That would be SIGKILL; we can’t catch it. Sometimes, if we’re really lucky, a SIGTERM comes before it, but I’ve not seen it often.

> Is SIGTERM ever going to be sent during normal operation?

Depends what normal operation is, but yes, it can happen at any time. It’s a vestige of Unix, where it’s meant to tell apps to do their cleanup and exit, or otherwise be “exited”.

@philipphofmann (Member) commented:

> We do currently send memory usage with all of our breadcrumbs to try to give us a better idea of where we have actual memory issues.

We have that planned here: #3518 (comment), but we haven't gotten to it yet, sorry.

> It always seemed silly to me that Sentry couldn't at least sample the memory usage once per second and store it somewhere to give a more informed message related to the Watchdog events.

Yes, that makes sense. I created an issue. Thanks for the input 👏, @eric: #4003
