Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfc(decision): Usage of Transaction Types #73

Merged
merged 24 commits into from
Mar 29, 2023

Conversation

philipphofmann
Copy link
Member

@philipphofmann philipphofmann commented Feb 10, 2023

This RFC aims to give Sentry developers insights into which types of transactions and spans our customers use.

Rendered RFC

@philipphofmann philipphofmann changed the title rfc(decision): Usage of transaction types rfc(decision): Event origin Feb 10, 2023
text/0073-event-origin.md Outdated Show resolved Hide resolved
text/0073-event-origin.md Outdated Show resolved Hide resolved

## Option 1: Event SDK Origin <a name="option-1"></a>

Add a new property to the [SDK interface of the event payload](https://develop.sentry.dev/sdk/event-payloads/sdk/) named `origin` to determine which part of the SDK created the event.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that errors can be mostly figured out based on

  1. name in SDK interface
  2. integration set
  3. mechanism
  4. contextual tags/message/event type

In addition - does it really matter what created an error? I fear this data will just cause anchoring bias.

I see how this is valuable for spans though - perhaps we add a new attribute to spans to track what generated them (transactions would inherit this). IMO I would like to see this proposal to be transactions/spans only, and focus the options on how we can identify them. This also can help in the UI, showing users what parts of their transaction were auto-instrumented/manually-instrumented. In addition, this difference can help the performance product build new features and performance issues..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Java errors it's not as easy. While it's possible in many cases there are also cases like OTEL + Spring Boot + Logging Integration where any of the three shows up as having initialized Sentry (sdk/version) but all three can be active at the same time. Maybe we need to take another look at how we use mechanism and add it in more places for errors.

Mechanism doesn't apply to messages, only exceptions afaict.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback, @AbhiPrasad. I will update the RFC accordingly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adinauer, which option would work best for your requirements?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume we could use mechanism more thoroughly for errors, e.g. in logging integrations. I still believe a common solution for errors and transactions would be nice but I don't wanna drag this on just because of that. So we can try the mechanism approach first and then revisit if that doesn't work out. Created getsentry/sentry-java#2555


# Options Considered

For every option, Looker picks up the field, but we don't need to index it and make it searchable in Discover. Amplitude could look at this field as a property when users visit issue or transaction detail pages.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not terribly familiar with this stuff. Does this mean we only get stats for events / transactions that were actually viewed in the UI by a user? Then this wouldn't fix the wrong attribution of events / transactions to different parts of the SDK. For example in Java our analytics show logging integrations to be amongst the top integrations when it comes to transactions but logging integrations aren't capable of creating transactions themselves.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that were actually viewed in the UI by a user

Yes exactly. This option works great in combination with any other option and is not mutually exclusive.

Most transactions/spans already contain enough information to identify the type. We can use Amplitude to grab that information, such as transaction and span names and operations, to classify them. This option works great in combination with any other option and is not mutually exclusive..


## Option 1: Event SDK Origin <a name="option-1"></a>

Add a new property to the [SDK interface of the event payload](https://develop.sentry.dev/sdk/event-payloads/sdk/) named `origin` to determine which part of the SDK created the event.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Java errors it's not as easy. While it's possible in many cases there are also cases like OTEL + Spring Boot + Logging Integration where any of the three shows up as having initialized Sentry (sdk/version) but all three can be active at the same time. Maybe we need to take another look at how we use mechanism and add it in more places for errors.

Mechanism doesn't apply to messages, only exceptions afaict.

@philipphofmann philipphofmann changed the title rfc(decision): Event origin rfc(decision): Usage of Transaction Types Feb 17, 2023
@markushi
Copy link
Member

markushi commented Mar 1, 2023

Nice one! I like the idea of extending the span interface with an origin field. Could also be quite useful to see which SDK features are burning a customers quota.
I'm still wondering what kind of insight an SDK developer would get out of this? If there are spans which are not part of any (performance) issues, they're likely less visible - but that's not necessarily bad right? I think we just need to be a bit careful when interpreting some aggregated numbers here.

@philipphofmann philipphofmann marked this pull request as ready for review March 15, 2023 08:56
@marandaneto
Copy link
Contributor

I believe a single option here won't do, if the goal is to track the adoption and usage of our integrations similarly to how we do with integrations and mechanism for errors, we need at least 2 of the adoptions.

  • We don't need to add a top-level property in the event protocol since there are already integrations and mechanism for that (errors specifically).
  • We need to focus on the usage of transactions and spans, they are equally important since some integrations create transactions and some others only spans.

I believe that we should add a new property to the trace context and in the span protocol.
Example (look at origin):

{
  "contexts": {
    "trace": {
      "op": "navigation",
      "description": "User clicked on <Link />",
      "trace_id": "743ad8bbfdd84e99bc38b4729e2864de",
      "span_id": "a0cfbde2bdff3adc",
      "status": "ok",
      "origin": "UserInteractionTracing",
    }
  }
}

And:

{
  "spans": [
    {
      "start_timestamp": 1588601261.481961,
      "description": "GET /sockjs-node/info",
      "tags": {
        "http.status_code": "200"
      },
      "timestamp": 1588601261.488901,
      "parent_span_id": "b0e6f15b45c36b12",
      "trace_id": "1e57b752bc6e4544bbaa246cd1d05dee",
      "op": "http",
      "data": {
        "url": "http:https://localhost:8080/sockjs-node/info?t=1588601703755",
        "status_code": 200,
        "type": "xhr",
        "method": "GET"
      },
      "span_id": "b01b9f6349558cd1"
      "origin": "HttpTracing",
    },
    {
      "start_timestamp": 1588601261.535386,
      "description": "Vue <App>",
      "timestamp": 1588601261.544196,
      "parent_span_id": "9312d0d18bf51736",
      "trace_id": "1e57b752bc6e4544bbaa246cd1d05dee",
      "op": "update",
      "span_id": "b980d4dec78d7344"
      "origin": "VueTracing",
    }
  ]
}

Now I know that the transaction was created by the UserInteractionTracing integration.
One of the spans was created by the HttpTracing integration.
And the last span was created by the VueTracing integration.

If this is consumed by Looker/Redash (this is a MUST IMO), we are able to track usage and adoption of our integrations and prioritize accordingly with its importance for bug fixing, marketing pushes, looking into more features, etc.

Overall this is a +1 for this RFC.
Thanks for doing this @philipphofmann
I really wanna see us taking decisions based on data, more, and more.

@philipphofmann
Copy link
Member Author

Thanks for your input @marandaneto. I added option 6 for the transaction context.

Copy link
Contributor

@marandaneto marandaneto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option 5 is my favorite.

@philipphofmann philipphofmann requested a review from a team March 21, 2023 14:22
Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option 5 sounds good to me. Please don't forget to open a Relay PR that adds the field to both types.

@philipphofmann philipphofmann merged commit 05077db into main Mar 29, 2023
@philipphofmann philipphofmann deleted the rfc/usage-of-transaction-types branch March 29, 2023 06:41
philipphofmann added a commit to getsentry/relay that referenced this pull request Apr 7, 2023
Adds origin to both trace context and the span. The origin of the span
indicates what created the span.

Related RFC getsentry/rfcs#73 and develop docs
PR getsentry/develop#887
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants