Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client span keys: suppressing same instrumentation #3691

Merged
merged 7 commits into from
Aug 14, 2021

Conversation

lmolkova
Copy link
Contributor

@lmolkova lmolkova commented Jul 27, 2021

This is a draft implementation of client span keys per instrumentations that suppresses spans with the same semantics from being created.
E.g.

DB (fine)
  HTTP (fine)
    HTTP (suppressed)
       user custom (fine)
  • Supports HTTP, RPC, DB and Messaging

Features:

  • when instrumentation type is set (affects client spans only), suppresses duplicate layers
  • nested server/consumer spans are suppressed - no change here
  • Generic instrumentation is never suppressed
  • if Instrumenter is built with no type (configurable with system property), preserves current behavior - suppresses any nested spans. Going forward we can remove current behavior and switch to suppression based on types
  • Instrumenter builder can figure out instrumentation type from attribute extractors (i.e. we can potentially make IntrumentationType package-private.

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InstrumentationType class is created to achieve extensibility and we may also use it to improve the registration of custom types. But I'm also happy to shrink it back to enum.

I sent a small PR to your PR, to simplify InstrumentationType a bit (short of making it an enum). I also like the idea of simplifying it all the way to an enum.

It would be good to have a configuration setting to suppress all nested CLIENT spans (preserving current behavior), since some backends don't handle nested CLIENT spans well yet.

Comment on lines 70 to 65
// TODO we should be able to figure it out from the schema_url?
// but then we'd have multiple versions with different urls
// and we want to suppress multiple versions of the same instrumentation
private final InstrumentationType instrumentationType;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think schema_url is something like https://opentelemetry.io/schemas/1.5.0 and doesn't include the semantic convention

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think http and db schemas may be versioned differently and in the same app you may have instrumentations with different versions (e.g. auto and native or manual instrumentations).
Also schema_url is an arbitrary url that you get Tracer with, so ideally it may be somehow associated with just this instrumentation type (e.g. https://opentelemetry.io/schemas/http/1.5.0). Anyway, that's not how it works now.

SpanNameExtractor<? super REQUEST> spanNameExtractor) {
return new InstrumenterBuilder<>(openTelemetry, instrumentationName, spanNameExtractor);
return new InstrumenterBuilder<>(openTelemetry, instrumentationName, instrumentationType, spanNameExtractor);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since instrumentationType is only for client spans(?), it would be nice to scope it to just client instrumenters. Let's see if @anuraaga and @mateuszrzeszutek have any thoughts how to do this.

Copy link
Contributor Author

@lmolkova lmolkova Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall we thought it would be useful to have a server span key as well, but as a way to enrich it rather than suppress other. Is there a need in java, e.g. in Spring to ever suppress server spans? I remember such cases in .NET world.
Anyway, agree that beyond client spans it's not that important

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh that makes sense, we do suppress nested SERVER spans in Java instrumentation today. i c now, since ClientSpan creates it's own context keys, those wouldn't overlap with context keys created by ServerSpan 👍

@iNikem
Copy link
Contributor

iNikem commented Jul 28, 2021

It would be good to have a configuration setting to suppress all nested CLIENT spans (preserving current behavior), since some backends don't handle nested CLIENT spans well yet.

Also the opposite: not suppress anything.

@trask
Copy link
Member

trask commented Jul 28, 2021

It would be good to have a configuration setting to suppress all nested CLIENT spans (preserving current behavior), since some backends don't handle nested CLIENT spans well yet.

Also the opposite: not suppress anything.

In SIG last week I thought your use case was capturing HTTP spans under Database spans (and network spans under HTTP spans in the future). And that you didn't need to capture "duplicate" HTTP client spans under another HTTP client span. Though the logic seems nicely centralized with this approach and so adding such an option seems reasonable if someone really wants that capability.

Comment on lines 30 to 34
instrumentationKeys.put(InstrumentationType.HTTP, ContextKey.named(TYPED_CONTEXT_KEY_PREFIX + InstrumentationType.HTTP));
instrumentationKeys.put(InstrumentationType.DB, ContextKey.named(TYPED_CONTEXT_KEY_PREFIX + InstrumentationType.DB));
instrumentationKeys.put(InstrumentationType.RPC, ContextKey.named(TYPED_CONTEXT_KEY_PREFIX + InstrumentationType.RPC));
instrumentationKeys.put(
InstrumentationType.MESSAGING, ContextKey.named(TYPED_CONTEXT_KEY_PREFIX + InstrumentationType.MESSAGING));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, using a map (instead of just static final fields) might make our bridging really complicated (see AgentContextStorage). Perhaps it'd be a bit simpler (from the javaagent perspective) if each InstrumentationTypea had its own ContextKey` field?

@mateuszrzeszutek
Copy link
Member

Hey @lmolkova ,
I tried to implement something that solves the problems I've found: #3734
Please take a look at it, maybe you'll find some part that'll be useful in your PR.

@lmolkova
Copy link
Contributor Author

lmolkova commented Aug 3, 2021

Thanks everyone for comments and suggestions!

@mateuszrzeszutek, I followed your idea : #3734 in this version, please take a look

@lmolkova
Copy link
Contributor Author

lmolkova commented Aug 7, 2021

@trask @mateuszrzeszutek this one is ready for review, can you please take a look.

It needs #3771 to distinguish server/consumer and pass tests (thanks @mateuszrzeszutek for creating it).

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sent a PR to your PR, with an idea to rename/restructure InstrumentationType/SuppressingSpanWrapper into SpanKey and SpanSuppressionStrategy.

I think the explicit ContextKey constants in the SpanKey class will make bridging easier (between user/library instrumentation and javaagent instrumentation).

import org.checkerframework.checker.nullness.qual.Nullable;

// TODO make class (and fields/methods below) package-private after tracers are gone and no need for
// bridging from ClientSpan/ServerSpan/ConsumerSpan
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something like ServerSpan#fromContextOrNull() will always be needed - there are many instrumentations that use it to update the server span name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes sense, we can remove the comment (it was mine)


// this is used instead of above, depending on the configuration value for
// otel.instrumentation.experimental.span-suppression-by-type
// named outgoing because it covers both client and producer spans
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we didn't suppress PRODUCER spans at all - should we maintain that behavior? Having a PRODUCER -> CLIENT relationship happens a few times in our messaging instrumentations (a producer calls a http/rpc client that actually sends the message to the broker)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we never suppressed them because there as never 2 producer spans so far and if there were, we'd want to suppress nested ones? With suppression-by-type we'd not suppress HTTP client under messaging producer because types are different. You're right that without it we'd start suppressing them.

BTW, it seems messaging producer can sometimes be client, not sure if we follow it though.

A producer of a message should set the span kind to PRODUCER unless it synchronously waits for a response: then it should use CLIENT.

Anyway, I've separated producer strategy from client and we'd suppress nested producers, but never suppress CLIENT under PRODUCER.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we never suppressed them because there as never 2 producer spans so far and if there were, we'd want to suppress nested ones?

I that case we probably would want to suppress them.

BTW, it seems messaging producer can sometimes be client, not sure if we follow it though.

🤯
I don't think that we follow it; we always use PRODUCER for all outgoing messaging spans. (Anyway, that's another reason why not using the span kind when suppressing might be a good idea)

Comment on lines 109 to 135
boolean shouldSuppress(SpanKind spanKind, Context parentContext) {
switch (spanKind) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to remove the association between span kind and span suppression at some point in the future - but I guess that's still impossible at this point. Maybe when we split our AttributesExtractor implementations into server/client? We've been thinking of doing that for HTTP, we could as well do the same thing for other conventions.

Copy link
Contributor Author

@lmolkova lmolkova Aug 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems we'd have to specialize instrumenters (so specific instumenter instance work for one span kind) to remove the association between kind/suppression. Or perhaps, if we move CompositeStrategy into the instrumenter and it would have 4 of them - one per kind, it would not remove the association but maybe a bit cleaner.
Either way, a combination of type + kind is important for suppression (i.e. server HTTP span should not suppress client HTTP span)

Another thought is that all strategies can be the same regardless of kind - suppress same type+kind (or suppress the same kind if type disabled) - it removes the association. I assume it is rare to have nested SERVERs of different types and we can easily allow that.

@mateuszrzeszutek
Copy link
Member

Sorry for being a bit late, but between some additional surprise work and GH issues I didn't manage to look at it yesterday 🙇‍♂️

Can you rebase your branch with main? My spring-integration PR has been merged recently, so rebasing should make your PR a bit easier to review.

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx @lmolkova!!

I have some vague thoughts around naming changes, but will keep thinking and possibly propose some naming changes in a follow-up PR, e.g.

  • calling the options "single"/"layered"/"all" (I'm hoping no one really wants the "all", but we'll see...)
  • the SpanSuppressionStrategy class is more than just about span suppression, it's also about populating span keys for later enrichment, but I'm not sure how to convey both of those purposes. one possible answer to the naming problem is to split SpanSuppressionStratety into two classes, one that handles suppression and one that handles span key management.

import org.checkerframework.checker.nullness.qual.Nullable;

// TODO make class (and fields/methods below) package-private after tracers are gone and no need for
// bridging from ClientSpan/ServerSpan/ConsumerSpan
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes sense, we can remove the comment (it was mine)

@mateuszrzeszutek
Copy link
Member

the SpanSuppressionStrategy class is more than just about span suppression, it's also about populating span keys for later enrichment, but I'm not sure how to convey both of those purposes. one possible answer to the naming problem is to split SpanSuppressionStratety into two classes, one that handles suppression and one that handles span key management.

We pretty much only use ServerSpan and it's just for updating the span name. I don't have any strong opinion on this, but I keep wondering whether we should introduce the LocalRootSpan after all - not for suppression, just for keeping it in context so that it's easy to retrieve it (and rename it).

Copy link
Contributor

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small points, thanks!

instrumentation-api/build.gradle.kts Outdated Show resolved Hide resolved
private static final boolean ENABLE_SPAN_SUPPRESSION_BY_TYPE =
Config.get()
.getBooleanProperty(
"otel.instrumentation.experimental.outgoing-span-suppression-by-type", false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this can be overridden through direct call to enableInstrumentationTypeSuppression I think we should use default- in the property name. Or maybe can remove that method

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing enableInstrumentationTypeSuppression method sounds good 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing this method makes suppression untestable through instrumetner API. The method is package-private, though, so instrumentations won't be able to call it.

@trask trask merged commit 080c85d into open-telemetry:main Aug 14, 2021
@trask
Copy link
Member

trask commented Aug 14, 2021

🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants