Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce additional types of exceptions next to mechanism.handled exceptions #10

Open
wants to merge 23 commits into
base: main
Choose a base branch
from
Open
Changes from 22 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
120 changes: 120 additions & 0 deletions text/0010-exception-types.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
* Start Date: 2022-09-17
* RFC Type: feature
* RFC PR: 0010
* RFC Status: draft

# Summary

This RFC suggests a feature which introduces additional types of exceptions next to `mechanism.handled`.

# Motivation

Currently, exception which cause the running software to exit (the process died/hard crash) are marked as `handled: false`. This isn't enough for SDKs where an exception can be unhandled but at the same time doesn't cause the software to exit.

Sentry shows exceptions which aren't caught by the developer (unhandled) but also do not cause the software to exit in the same way as exceptions which are manually caught by the developer. This seems rather unintuitive and makes exceptions seem less severe than they are.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seem inaccurate to me. Or maybe I'm not understanding.

Sentry doesn't show exceptions that are not captured by the user (unhandled) in the same way it shows exceptions exceptions manually captured (handled). One has the red banner:
image

Do you mean that Sentry shows all unhandled errors in the same way, independent if they caused the app to crash?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unhandled exceptions in Flutter don't have that red banner because they don't crash the application. I'm not sure how similar behavior is implemented in other SDKs.

That means there's no visual differentiation between handled and unhandled exceptions as long as they don't cause the software to exit.

Do you mean that Sentry shows all unhandled errors in the same way, independent if they caused the app to crash?

No, unhandled exceptions which cause the software to exit are shown visually different.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, Handled and Unhandled exceptions are handled: null or handled: true so there's no red banner, no way to filter or alert them differently.
Only when it crashes on the Native side (C code or Native threads), the flag is handled: false.
So yes, it's bad and the RFC tries to fix this.


This issue impacts for example the [Dart/Flutter SDK](https://github.com/getsentry/sentry-dart/issues/456), the Unity SDK, the React Native SDK and possibly more.

Another issue is, that excpetions which don't cause the software to exit but are unhandled, are not considered in the [`session health` metric](https://develop.sentry.dev/sdk/sessions/).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by 'they are not considered'?

My understanding is that we mark a session as crashed for all unhandled errors. For example, on Browser, we finish the session as crashed if there's an unhandled error, even though there was no crash.

Is this referring to Flutter where in the event of a Dart unhandled exception we don't close the session as crashed? But if there's a native unhandled error (which terminates the app) we do?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we mark a session as crashed for all unhandled errors. For example, on Browser, we finish the session as crashed if there's an unhandled error, even though there was no crash.

That probably wouldn't be the case for Flutter, if it had sessions, based on the current behavior. Even for Flutter for Web.

Is this referring to Flutter where in the event of a Dart unhandled exception we don't close the session as crashed? But if there's a native unhandled error (which terminates the app) we do?

Yes, exactly.

Based on your questions, I'm assuming that crashed/unhandled mean different things across different SDKs. That's one more reason to align across SDKs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your questions, I'm assuming that crashed/unhandled mean different things across different SDKs. That's one more reason to align across SDKs.

Yes, this RFC tries to align this inconsistency as well @ueman .

Sessions are only marked as crashes when it cashes on the Native side.

Currently, the session would be marked as `errored` instead of `crashed`.

The attribute `thread.errored` was added in the past for similar reasons, but it got [reverted on Relay](https://github.com/getsentry/relay/pull/306) and [Android](https://github.com/getsentry/sentry-android/pull/187).

# Option 1 (recommended)

Based on the problem stated above, I propose to introduce the types of `handled`, `unhandled`, `process termination` (this is the same as the current `handled`, but rephrased to avoid confusion). I'm open for better phrasing of those types, but I'll stick to those names for the rest of the RFC. The meaning of those types is as follows:

- `handled`: The exception was recorded by a developer via `Sentry.capture*` method. May or may not be visually indicated by the Sentry user interface.
- `unhandled`: Indicates whether the exception was recorded automatically by Sentry through the use of a global exception handler or similar. This exception however didn't cause the software to exit, and the software will continue to be executed. This should be visualized in the Sentry user interface, they have a higher severity than the `handled` ones.
- `process termination`: The exception was recorded automatically by Sentry through the use of a exception handler or similar. The exception caused the software to terminate the execution. This should be visualized in the Sentry user interface. This is currently done by the `handled: false` flag in the [exception mechanism](https://develop.sentry.dev/sdk/event-payloads/exception/#exception-mechanism).

A user of Sentry should be able to

- filter events on the issues page or discover for the newly introduces exception types (3 categories).
- highlight (similar to the unhandled label) events of the type unhandled and process termination.
- get alerted for events of the type unhandled and process termination separately.

Currently, there's an unhandled label on the issue's page but it's only highlighted for process termination errors (if `handled: false`).

In order to propagate those exception types, the exception mechanism needs to be adapted:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this option more, because it is backwards compatible.
But I would not add another boolean, the new "process_terminated" variable needs to be added because "handled" is a boolean and thus not very future proof.
So I would add a number/enum kind of type called something more general like error_state or such that can then have values like process_terminated, timeout, deadlock, caught-fire, ...


```json
{
"exception": {
"values": [
{
"type": "Error",
"value": "An error occurred",
"mechanism": {
"type": "generic",
"handled": true,
"process_terminated": false // <--- newly introduced field
}
}
]
}
}
```

In order to achieve backwards compatibility, in the absence of the `process_terminated` flag, the current behavior stays as is.
As soon as the `process_terminated` flag is present the bavior is as follows:

- `handled = false` and `process_terminated = false`: Exception is not handled by the user but didn't cause the software to terminate. Same as `unhandled` in the list above
- `handled = false` and `process_terminated = true`: Software terminated after an unhandled exception. Same as `process termination` in the list above
- `handled = true` and `process_terminated = false`: Exception was reported via `Sentry.capture*()` method. Same as `handled` in the list above.
- `handled = true` and `process_terminated = true`: Software was gracefully shut down after an handled exception. This should never happen and is invalid.

In the absence of the `handled` or its value being null, it's assumed to be `handled = true`. This is also the current behavior.

The introduction of the `process_terminated` flag enables the consideration of such exception types in the `session health` metric.

The [session protocol](https://docs.sentry.io/product/releases/health/#session-status) would need to change as well though, because there are only `errored` and `crashed` states.

# Option 2

This one is very similar to option 1, however instead of an additional flag, this introduces an enum for the different types.
Once again, the mechanism needs to be adapted:

```json
{
"exception": {
"values": [
{
"type": "Error",
"value": "An error occurred",
"mechanism": {
"type": "generic",
"exception_type": "handled|unhandled|process_termination", // <--- newly introduced field
}
}
]
}
}
```

If the currently available `handled` flag is also present, the `exception_type` flag takes precedence. The `handled` flag however should become deprecated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag is how users current filter and set alerts. Deprecating this feel like a big change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's why it's not the recommended one, if back compatibility is important.


The introduction of the `exception_type` flag enables the consideration of such exception types in the `session health` metric.
marandaneto marked this conversation as resolved.
Show resolved Hide resolved

# Option 3

Unhandled exceptions, which don't cause a process termination, are considered like exceptions which cause the process to terminate and are marked `handled: false`
That would however make it impossible to differentiate between those exception types in the `session health` metric, filters, alerts, etc.

# Approaches taken by other monitoring tools

- Crashlytics just differentiates between manually caught exceptions and unhandled exceptions, regardless of whether they cause the process to terminate.

# List of SDK to which this applies

This list might be incomplete

- [Flutter](https://github.com/getsentry/sentry-dart/issues/456)
- Browser SDKs
- Unity
- React Native
ueman marked this conversation as resolved.
Show resolved Hide resolved
- .NET (UnobservedTaskException)

# Unresolved Questions

- Are there any other exception types next to the ones metioned in this RFC?
marandaneto marked this conversation as resolved.
Show resolved Hide resolved
ueman marked this conversation as resolved.
Show resolved Hide resolved