Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfc(feature): Sentry Semantic Conventions #116

Merged
merged 10 commits into from
Oct 24, 2023

Conversation

AbhiPrasad
Copy link
Member

@AbhiPrasad AbhiPrasad commented Oct 12, 2023

closes getsentry/team-webplatform-meta#107

The purpose of this RFC is to formalize the a set of semantic conventions in Sentry, aligning with OpenTelemetry's semantic conventions. These will be a standardized naming scheme for operations and data that will be shared across the SDKs, ingest, and the product.

This moves us closer to OpenTelemetry, helps reduce friction when creating new product features that rely on conventions around data, and helps us avoid the need to create new conventions for data that is already covered by OpenTelemetry.

At the current time, these semantic conventions are meant to apply to spans, breadcrumbs, and metrics, but not to errors/transactions/replays/crons. This may change in the future.

Rendered RFC

@AbhiPrasad AbhiPrasad force-pushed the rfc/sentry-semantic-conventions branch from c0f8108 to 3613abc Compare October 12, 2023 13:50
@AbhiPrasad AbhiPrasad marked this pull request as ready for review October 12, 2023 17:05

[User interface](https://develop.sentry.dev/sdk/event-payloads/user/):

- `user.id` -> `sentry.user.id`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will update, thank you for the reminder.


By having a versionig/packaging structure like this, it also makes it much easier for the SDKs to identify what parts of the product are using the data they are sending.

## Mapping Sentry Specific Fields to Attributes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So some things that are missing here, at least things I needed to manually handle in the POTEL experiments, are:

  • span origin (sentry.origin ?)
  • span op (sentry.op ?)
  • span source (sentry.source ?)
  • sample rate (sentry.sample_rate ?)

we also kind of have a parent sampled flag, but this may also just remain an internal thing I guess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm not sure this will ever get to an exhaustive state, more just important examples, but I think we need a good analog for trace context. Will add that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, these were just the things I noticed we need to kind of maintain feature parity - esp. origin, op and source I believe? sample rate is maybe less important I guess.

- `sdk.packages` -> `sentry.sdk.packages`

For the contexts defined in as part of our contexts payload, we can flatten it so that the context name become the prefix to the attribute key. For example the `os` context would become `os.name`, `os.version`, `os.kernel_version`, etc. Some of these new keys will have to be renamed to better match OpenTelemetry's semantic conventions, but that can be approached in a case by case basis.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other thing that we may have to handle, but maybe also want to do in a follow up, are breadcrumbs? The timed events in OTEL also use the same attributes schema. In the POTEL experiment I have:

  • sentry.breadcrumb.type
  • sentry.breadcrumb.level
  • sentry.breadcrumb.event_id
  • sentry.breadcrumb.category
  • sentry.breadcrumb.data --> this is a bit tricky as it is arbitrary JSON, currently I store this as serialized JSON string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally breadcrumbs data matches the attributes, we can structure them the same way otel does it's logs signal.

for span events that are attached that map to breadcrumbs, we probably want to "unroll" breadcrumb data into the top level keys for attributes on span events.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to clarify, I do store these as regular attributes on a timed event :) But we still need universally understood attributes on there, I guess? timed events have only a name other than the attributes. Not sure what we do with these fields on the breadcrumb right now though, tbh.

- `request.url` -> [`url.full`](https://github.com/open-telemetry/semantic-conventions/blob/cadfe53949266d33476b15ca52c92f682600a29c/model/trace/http.yaml#L47)
- `request.query_string` -> [`url.query_string`](https://github.com/open-telemetry/semantic-conventions/blob/cadfe53949266d33476b15ca52c92f682600a29c/model/trace/http.yaml#L47)
- `request.cookies` -> `http.request.cookies`
- `request.cookies` -> `http.request.headers`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `request.cookies` -> `http.request.headers`
- `request.headers` -> `http.request.headers`

Is the idea of cookies / headers to store a JSON string or to store them in nested dot notation, e.g.

span.setAttribute('request.headers.content-type', 'application/json');

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still undecided if we should do

we will allow dictionaries as attribute values for better backwards compatibility with existing Sentry fields

The idea here was to use a JSON string because it is the easiest to adapt from what we have, but maybe we should strictly enforce the otel attributes schema of no nesting. Still going back and forth on it.


## Backwards Compatibility

We will support the `data` field as an alias for `attributes` for backwards compatibility. This means that breadcrumbs and spans can adopt the `data` field immediately. There is no timeline for us to remove the `data` field, but new signals like crons or metrics should use `attributes` as the field name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean SDKs would send a hybrid format for the time being? If it wasn't for customer relays, I would suggest the SDK just starts sending the new format, and relay takes care of converting to whatever format we use internally.

Maybe during the transition phase, SDKs could send a simple "do you support the new schema?" request to relay before sending any data, and fall back to the old schema only if that request fails.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an SDK sends stuff in the data field we should be good to go for backwards compat in breadcrumbs/spans.

I think we're not going to make these changes to errors (too much friction to change) or transactions (getting phased out by spans anyway).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was about errors too. At least we mention errors in the intro of this RFC. Are errors in scope or not? Maybe we should clarify this?


## Attributes Schema

Attribute keys should be unique and well-known, and should not be used for multiple purposes. In OpenTelemetry the attribute values cannot be dictionaries, only primitives and arrays of primitives. We will try to follow this convention as well, but we will allow dictionaries as attribute values for better backwards compatibility with existing Sentry fields.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In OpenTelemetry the attribute values cannot be dictionaries, only primitives and arrays of primitives

Does this mean data would no longer be nested, but rather a flat list of attributes? That is

{"sentry.user.id": 1}

instead of

{"sentry": {"user": {"id": 1}}}

Asking because we have some heavily nested objects in our event schema (e.g. exceptions with multiple stack traces, which each have multiple frames). I guess those would never adhere to the attributes schema?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes in the otel schema they have to be flattened. I think we should be more liberal than that though to start off, we can eventually get to a flattened state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also wondering how to represent stack traces in this format. It's an important use case we cannot dismiss.

It's not just stacktraces. debug_meta for example too.

Ideally we have some sort of plan of attack or migration path lined out in this doc for fields like these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, OTEL has the built in type EXCEPTION_STACKTRACE which is just a string.

Comment on lines 83 to 85
| Array<string>
| Array<number>
| Array<boolean>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will mean no mixed arrays? eg ['foo', 42, true]?

It's probably fine but just raising to be sure somebody thought about this.


## Attributes Schema

Attribute keys should be unique and well-known, and should not be used for multiple purposes. In OpenTelemetry the attribute values cannot be dictionaries, only primitives and arrays of primitives. We will try to follow this convention as well, but we will allow dictionaries as attribute values for better backwards compatibility with existing Sentry fields.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also wondering how to represent stack traces in this format. It's an important use case we cannot dismiss.

It's not just stacktraces. debug_meta for example too.

Ideally we have some sort of plan of attack or migration path lined out in this doc for fields like these.


## Backwards Compatibility

We will support the `data` field as an alias for `attributes` for backwards compatibility. This means that breadcrumbs and spans can adopt the `data` field immediately. There is no timeline for us to remove the `data` field, but new signals like crons or metrics should use `attributes` as the field name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was about errors too. At least we mention errors in the intro of this RFC. Are errors in scope or not? Maybe we should clarify this?

@AbhiPrasad
Copy link
Member Author

AbhiPrasad commented Oct 24, 2023

I spent a bunch of time mulling over various decisions, but I've decided on a couple things

  1. The convention attributes can no longer be nested - they must be flat top level on the attributes object. We will provide mappings for all possible values in the event schema to address this
  2. The conventions are only applying to spans/breadcrumbs/metrics. Crons/errors/replays/transactions are not addressed in this case, but those signals can choose to adopt the conventions if they wish. This is to assist with ease of adoption and reduce hard to deal with changes
  3. I've more explicitly wrote down mappings for op, sample_rate and other sentry specific performance monitoring values. I've also put a note saying that the attribute transforms are not exhaustive.

@AbhiPrasad
Copy link
Member Author

I'm moving this RFC into implementation state. We're going to work on creating a repo to store our conventions, and publish libraries that both sentry and relay can use.

@AbhiPrasad AbhiPrasad merged commit 29586dc into main Oct 24, 2023
2 checks passed
@AbhiPrasad AbhiPrasad deleted the rfc/sentry-semantic-conventions branch October 24, 2023 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC about semantic conventions
4 participants