
Metric Exporters: Specify/Support Batching per-data point on top of per-metric? #3494

Open
alxbl opened this issue May 10, 2023 · 0 comments
Labels
spec:metrics Related to the specification/metrics directory triage:deciding:community-feedback

Comments

@alxbl
Member

alxbl commented May 10, 2023

I was initially going to open this against the opentelemetry-dotnet SDK (specifically its OTLP exporter), but it could technically apply to all exporters, or to the OTLP exporter spec itself. So I'd first like to find out whether there is interest in making this part of the spec, or whether it is considered an implementation detail left to each exporter's interpretation. In the latter case, I will open this issue against the .NET SDK instead. Let me know.

What are you trying to achieve?

When a process produces a large number of data points for a single metric, I would like the OTLP exporter to batch the data points across several OTLP requests to the collector, so that no single message grows large enough to be rejected.

The problem right now is that batching happens at the metric level, so if a process reports a single metric on behalf of many different services, this can lead to extremely large batches.

What did you expect to see?

The exporter could split the data points over several messages and send them spaced over a short time interval, reducing network-bandwidth spikes while supporting large numbers of data points.
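To make the idea concrete, here is a minimal, language-agnostic sketch (in Python, not the actual .NET SDK API) of the splitting step: the data points of one metric are chunked into fixed-size slices, each of which would become its own export request. The function name and the batch-size parameter are hypothetical, chosen only for illustration.

```python
# Hypothetical sketch: split one metric's data points into fixed-size
# batches so that no single export request exceeds the collector's
# message-size limit. In a real exporter, each batch would be wrapped in
# its own OTLP request, optionally with a short delay between sends to
# smooth out the bandwidth spike at each export interval.
def batch_data_points(data_points, max_points_per_request):
    """Yield successive slices of at most max_points_per_request points."""
    for start in range(0, len(data_points), max_points_per_request):
        yield data_points[start:start + max_points_per_request]


# Example: 130,000 data points with a 50,000-point cap per request
points = list(range(130_000))
batches = list(batch_data_points(points, 50_000))
# -> 3 requests of 50,000 / 50,000 / 30,000 data points
```

The same approach generalizes to splitting across the resource/scope/metric hierarchy of an OTLP payload; the sketch only shows the innermost (per-data-point) level, which is where today's per-metric batching stops.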

Additional context.

For business reasons, we have a single process that monitors several thousand (even hundreds of thousands) of non-telemetry-aware devices and reports their up metric on their behalf (possibly a target_info as well). I was performing some scale tests with 1–5 labels per data point and ran into issues with default collector configurations at around 50,000–60,000 data points: the message becomes too large for the collector to accept. I know I could raise the collector's maximum message size, but that does not address the burst of network bandwidth that will inevitably occur at every export interval.

I experimented with adding per-data-point batching and was able to scale to 500,000 data points per process very easily (it could probably go higher).

My main reason for creating this issue is that I would like to avoid maintaining a fork of the SDK for these changes, so I want to know whether this is of interest to the OpenTelemetry maintainers, or whether it is an edge case best handled by a custom exporter or a fork of the existing OTLP exporter implementation.

I can provide a proof-of-concept patch (for the .NET SDK) if that helps. The code is currently experimental and incomplete, because I was only attempting to put numbers on the possible performance gains.

EDIT: If there is interest in this, I would be happy to contribute to the development or specification effort.
