Add experimental metric attributes advice API #3546

mateuszrzeszutek · 2023-06-08T09:18:40Z

Changes

Introduces a new option to the metrics advice API, allowing to set the default list of attribute keys that should be preserved in a metrics measurement.
E.g. the following http.client.duration definition will keep the attributes defined in its semconv, and discard any extra attributes passed by the user (Java pseudocode):

duration = meter
    .histogramBuilder("http.client.duration")
    .setUnit("s")
    .setDescription("The duration of the outbound HTTP request")
    .setAdvice(advice -> 
        advice
            // default buckets as defined in semconv
            .setExplicitBucketBoundaries(List.of(0.0, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 7.5, 10.0))
            // default attributes as defined in semconv
            .setAttributes(Set.of(
                "http.request.method",
                "http.response.status_code",
                "network.protocol.name",
                "network.protocol.version",
                "server.address",
                "server.port",
                "server.socket.address"))
    .build();

The user can override the advice using the View API and either further limit or extend the attributes list that are to be kept on a measurement.

MrAlias · 2023-06-08T14:50:38Z

When will this be useful?

The instrument definition is made in instrumentation. The instrumentation already has complete control over what attributes are recorded for the instrument during measurements it itself makes. Why does it need a way to filter attributes via advice instead of just not recording attributes it itself doesn't want to record?

trask · 2023-06-08T14:58:12Z

The use case we have in mind is that this will allow us to send all HTTP span attributes to corresponding HTTP metrics. Then users who want more or less of those attributes on their HTTP metrics (which is a request we've had a few times) can configure a metric view to get that behavior.

MrAlias · 2023-06-08T15:04:16Z

The use case we have in mind is that this will allow us to send all HTTP span attributes to corresponding HTTP metrics. Then users who want more or less of those attributes on their HTTP metrics (which is a request we've had a few times) can configure a metric view to get that behavior.

Ah, gotcha 👍 . Thanks

specification/metrics/api.md

lmolkova · 2023-06-08T15:26:03Z

I suggest coming up with a bit more specific method name - setAllowedAttributes or setPreservedAttributes.

Also, assuming we combine it with #3545, we might get into multiple methods configuring attributes and would need to resolve conflicts and whatnot. Can we define a more-or-less single API for attributes like:

ConfigureAttribute("http.request.method", 
  preserve: true, 
  knownValues: ["GET", "POST", "FOO"], 
  futureOptionalConfigurationOptions: ...)

mateuszrzeszutek · 2023-06-08T15:54:39Z

Hmm, WDYT about sth like

duration = meter
    .histogramBuilder("http.client.duration")
    .setUnit("s")
    .setDescription("The duration of the outbound HTTP request")
    .setAdvice(advice ->
        advice
            // default buckets as defined in semconv
            .setExplicitBucketBoundaries(List.of(0.0, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 7.5, 10.0))
            // default attributes as defined in semconv
            .setAttributesAdvice(attributesAdvice ->
                attributesAdvice
                    // could also be a function of attribute name, as Tyler suggested
                    .preserveAttributes(
                        "http.request.method",
                        "http.response.status_code",
                        "network.protocol.name",
                        "network.protocol.version",
                        "server.address",
                        "server.port",
                        "server.socket.address")
                    .limitAttribute("http.request.method", List.of("GET", ...), "other")))
    .build();

Having one more level to the advice would allow us to add more complex transformations in the future

lmolkova · 2023-06-08T16:04:43Z

Hmm, WDYT about sth like

@mateuszrzeszutek looks great, thnx!

jack-berg · 2023-06-08T19:26:22Z

I think I disagree with adding preservedAttributes. If the set of allowed values are known at the time of instrument creation, than the instrumentation can clamp the values to the known set when recording measurements. Why add more complexity to the SDK to do this?

lmolkova · 2023-06-12T20:20:17Z

came up during discussions open-telemetry/semantic-conventions#35:

when defining hints (in general), we might need to apply them across signals. If this is the case, we'd rather build hints on attributes than on signals.

E.g.

When I drop (or sanitize) an attribute, I might want to do it across all signals at once.
When I provide a list of known values to limit cardinality, I might want to do the same for events/spans (for consistency, or aggregation purposes)

If this is the case, we'd rather do something like:

OpenTelemetry.configureAttribute(a -> a
  .metricsAdvice(...)
  .tracesAdvice(...)
  .logAdvice(...))

Also, with performance considerations, it's not reasonable to post-process attributes on traces and logs.
The instrumentation should be the one that gets advice from OpenTelemetry and applies it. probably not the case

E.g. I'm writing HTTP server instrumentation.

AttributeTraceAdvice  DEFAULT_TRACE_HTTP_METHOD_ADVICE = ...;

Tracer tracer = otel.tracerBuilder("my.http.lib").defaultAdvice(DEFAULT_TRACE_HTTP_METHOD_ADVICE).build();

// HTTP request instrumentation
span.addAttribute("http.request.method",  request.getMethod()); // SDK can apply the advice

If it's not performant enough to get advice for each attribute name inside addAttribute, we can also do

AttributeTraceAdvice advice = otel.getTraceAdvice("http.request.method");
if (advice == null)  advice = DEFAULT_HTTP_METHOD_ADVICE;
...
 
// HTTP request instrumentation
span.addAttribute("http.request.method",  advice.apply(request.getMethod()));

I'd design the API around allowing both options, so high-perf instrumentations can leverage the 2nd one

mateuszrzeszutek · 2023-06-14T13:27:31Z

I think I disagree with adding preservedAttributes. If the set of allowed values are known at the time of instrument creation, than the instrumentation can clamp the values to the known set when recording measurements. Why add more complexity to the SDK to do this?

After some thought, I agree with that - the advice API is intended to be used by the instrumentation authors, and its use case is pretty well defined. I believe the views API is supposed to be the general data manipulation tool aimed to the end users.

mateuszrzeszutek · 2023-06-14T13:28:42Z

when defining hints (in general), we might need to apply them across signals. If this is the case, we'd rather build hints on attributes than on signals.

E.g.

When I drop (or sanitize) an attribute, I might want to do it across all signals at once.

When I provide a list of known values to limit cardinality, I might want to do the same for events/spans (for consistency, or aggregation purposes)

Can you create a separate issue for that? I think it's not in scope for this issue/PR; and the advice API is still experimental, we can add improvements/experiment on it iteratively.

mateuszrzeszutek · 2023-06-16T09:07:51Z

One example where this API could be used is open-telemetry/semantic-conventions#109
The instrumentation would use the attributes advice API to limit the attributes and exclude server.*, and the end user would opt into collecting them by overriding the instrumentation authors' advice with a view. (as an extra benefit: less configuration flags)

github-actions · 2023-06-24T03:19:58Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

mateuszrzeszutek · 2023-06-26T07:02:07Z

/unstale

Ping @open-telemetry/technical-committee @open-telemetry/specs-approvers @open-telemetry/specs-metrics-approvers

Is there anything you'd like me to add/change in this PR?

reyang · 2023-06-27T14:42:55Z

@jsuereth FYI.

jack-berg · 2023-06-27T16:15:31Z

I did have one reservation about this. This advice encourages instrumentation to record with a broad set of attributes, just for those attributes to be trimmed down later to the default set as specified by advice. In implementations like java where attributes are immutable (and I suspect in other implementations as well) this requires that the attributes be copied and trimmed on every measurement recording, which incurs additional allocations. This will likely be fine in many cases, but will no be suitable in cases that require high performance and are sensitive to allocations.

In order to support this type of behavior while avoiding allocations, we could:

Add some sort of introspection API, where instrumentation code can query which attributes an instrument is going to keep and only record those required.
Expose some sort of attribute builder as a part of the instrument API, which instrumentation would use to construct attributes, and which would ignore attributes not required by resulting metric streams.

I'm still in favor of merging this as is, but wanted to call out that this is probably a tool we'd want to employ thoughtfully.

jmacd · 2023-06-27T16:18:04Z

@jack-berg wrote

Why add more complexity to the SDK to do this?

and in the not-very-different thread here (cc @MrAlias):

What do we gain from allowing you to filter on key-value pairs as opposed to just the keys?

It looks like we'd gain nearly the solution requested in this thread. However, I am also concerned about complexity. Suppose we agree that the attributes-advice API requested here can be translated into a MeasurementProcessor--the real question (to me) is whether an http instrumentation is going to then design itself to (1) generate the list or set of all attributes, and (2) pass them to the SDK to implement attribute reduction?

OR, would you imagine the http instrumentation interprets the advice itself, and skips generating the attributes that were not selected bypassing the second step? I'm not sure either of these approaches is necessarily good.

Thinking now, if I may, about a language-specific example. In OTel-Go there is an attribute.Set type and an []attribute.KeyValue type. Both of these are in fact quite expensive, and as far as I've seen attribute handling is almost always the dominant cost in metrics instrumentation. Neither of these http instrumentations is going to perform well because there are so many attributes, and I do not want (personally) to take either of these approaches without considering the end-to-end performance of the http instrumentation.

Speaking still about the situation in Go, sorting and de-duplicating attributes is expensive, but even just creating lists of attributes is expensive. If we aren't careful, by the time an http metric event reaches the SDK it will have:

processed a list of attributes multiple times for multiple metric instruments
applied one or more measurement processors to limit the set of attributes, allocating intermediate objects along the way.

And isn't the point of limiting attributes predominantly to reduce cost? As a project, I think the synchronous metrics API should support a batch operation (the way OpenCensus did) so that we can at least process the attributes once per request, not once per metrics.

Then, I think each language should consider all its options to get these API calls to be as fast as possible, including any idiomatic approaches that benefit performance. In Go the segmentio/statslibrary shows a nice way to optimize metrics instrumentation-- I would definitely support something like that, and once we have something like that then I think implementing the kind of advice requested here will be not very expensive and worth having.,

mateuszrzeszutek · 2023-06-28T11:02:07Z

I did have one reservation about this. This advice encourages instrumentation to record with a broad set of attributes, just for those attributes to be trimmed down later to the default set as specified by advice. In implementations like java where attributes are immutable (and I suspect in other implementations as well) this requires that the attributes be copied and trimmed on every measurement recording, which incurs additional allocations. This will likely be fine in many cases, but will no be suitable in cases that require high performance and are sensitive to allocations.

I agree, but we're already doing that in the Java instrumentations -- for example, when we record the HTTP metrics we have to take all the attributes that were set on the span and trim them to fit the metric semconv. The advice API would just move the execution of that logic to the SDK, and make it overrideable to the end user.

Expose some sort of attribute builder as a part of the instrument API, which instrumentation would use to construct attributes, and which would ignore attributes not required by resulting metric streams.

I really like that idea 👍

github-actions · 2023-07-06T03:21:12Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

jmacd · 2023-07-11T15:18:11Z

@mateuszrzeszutek What did you conclude about your own proposal above, #3546 (comment)?

mateuszrzeszutek · 2023-07-17T11:39:01Z

@mateuszrzeszutek What did you conclude about your own proposal above, #3546 (comment)?

Ah, good question. In the discussion above (#3546 (comment)) I re-thought the whole thing and decided to keep it as simple as possible. I was worried that the nested attributesAdvice might be a bit too much for the API; and that there isn't a low of use cases for that, especially now that the http.request.method case was solved in a completely different way.

reyang · 2023-07-24T20:51:25Z

@mateuszrzeszutek would you resolve the merge conflicts?

Co-authored-by: Reiley Yang <[email protected]>

mateuszrzeszutek requested review from a team June 8, 2023 09:18

github-actions bot assigned carlosalberto Jun 8, 2023

trask approved these changes Jun 8, 2023

View reviewed changes

MrAlias reviewed Jun 8, 2023

View reviewed changes

specification/metrics/api.md Outdated Show resolved Hide resolved

jack-berg mentioned this pull request Jun 8, 2023

Add attribute filtering function to View metric data stream configuration #3550

Closed

mateuszrzeszutek mentioned this pull request Jun 16, 2023

Make server.socket.* attributes on the HTTP server side opt-in open-telemetry/opentelemetry-java-instrumentation#8747

Merged

jack-berg approved these changes Jun 16, 2023

View reviewed changes

lmolkova mentioned this pull request Jun 16, 2023

Advice/hint APIs should be applied to all signals (when applicable) #3557

Open

mateuszrzeszutek mentioned this pull request Jun 20, 2023

Remove server.{address,port} from HTTP server metrics open-telemetry/opentelemetry-java-instrumentation#8771

Merged

github-actions bot added the Stale label Jun 24, 2023

github-actions bot removed the Stale label Jun 27, 2023

reyang approved these changes Jun 27, 2023

View reviewed changes

github-actions bot added the Stale label Jul 6, 2023

github-actions bot removed the Stale label Jul 12, 2023

Mateusz Rzeszutek and others added 3 commits July 26, 2023 14:11

Add experimental metric attributes advice API

98c631f

add changelog link

03b98cf

Update specification/metrics/api.md

71483d1

Co-authored-by: Reiley Yang <[email protected]>

mateuszrzeszutek force-pushed the attributes-advice-api branch from e25a4c5 to 71483d1 Compare July 26, 2023 12:14

reyang merged commit e8b6199 into open-telemetry:main Jul 26, 2023
6 checks passed

mateuszrzeszutek deleted the attributes-advice-api branch July 26, 2023 18:36

arminru added spec:metrics Related to the specification/metrics directory area:api Cross language API specification issue area:sdk Related to the SDK labels Aug 1, 2023

mateuszrzeszutek mentioned this pull request Aug 3, 2023

Add attributes advice API (just DoubleCounter for now) open-telemetry/opentelemetry-java#5677

Merged

trask mentioned this pull request Sep 25, 2024

How is the cardinality limit applied when attributes are being filtered? #3798

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add experimental metric attributes advice API #3546

Add experimental metric attributes advice API #3546

mateuszrzeszutek commented Jun 8, 2023

MrAlias commented Jun 8, 2023

trask commented Jun 8, 2023 •

edited

Loading

MrAlias commented Jun 8, 2023

lmolkova commented Jun 8, 2023 •

edited

Loading

mateuszrzeszutek commented Jun 8, 2023

lmolkova commented Jun 8, 2023

jack-berg commented Jun 8, 2023

lmolkova commented Jun 12, 2023 •

edited

Loading

mateuszrzeszutek commented Jun 14, 2023

mateuszrzeszutek commented Jun 14, 2023

mateuszrzeszutek commented Jun 16, 2023

github-actions bot commented Jun 24, 2023

mateuszrzeszutek commented Jun 26, 2023

reyang commented Jun 27, 2023

jack-berg commented Jun 27, 2023

jmacd commented Jun 27, 2023 •

edited

Loading

mateuszrzeszutek commented Jun 28, 2023

github-actions bot commented Jul 6, 2023

jmacd commented Jul 11, 2023

mateuszrzeszutek commented Jul 17, 2023

reyang commented Jul 24, 2023

Add experimental metric attributes advice API #3546

Add experimental metric attributes advice API #3546

Conversation

mateuszrzeszutek commented Jun 8, 2023

Changes

MrAlias commented Jun 8, 2023

trask commented Jun 8, 2023 • edited Loading

MrAlias commented Jun 8, 2023

lmolkova commented Jun 8, 2023 • edited Loading

mateuszrzeszutek commented Jun 8, 2023

lmolkova commented Jun 8, 2023

jack-berg commented Jun 8, 2023

lmolkova commented Jun 12, 2023 • edited Loading

mateuszrzeszutek commented Jun 14, 2023

mateuszrzeszutek commented Jun 14, 2023

mateuszrzeszutek commented Jun 16, 2023

github-actions bot commented Jun 24, 2023

mateuszrzeszutek commented Jun 26, 2023

reyang commented Jun 27, 2023

jack-berg commented Jun 27, 2023

jmacd commented Jun 27, 2023 • edited Loading

mateuszrzeszutek commented Jun 28, 2023

github-actions bot commented Jul 6, 2023

jmacd commented Jul 11, 2023

mateuszrzeszutek commented Jul 17, 2023

reyang commented Jul 24, 2023

trask commented Jun 8, 2023 •

edited

Loading

lmolkova commented Jun 8, 2023 •

edited

Loading

lmolkova commented Jun 12, 2023 •

edited

Loading

jmacd commented Jun 27, 2023 •

edited

Loading