
[processor/spanmetricsprocessor] Allow set metrics namespace #18199

Closed
Cluas wants to merge 7 commits

Conversation

Cluas
Contributor

@Cluas Cluas commented Jan 31, 2023

Description:

The currently generated metric names differ from those described in the Tempo APM table documentation, e.g. calls_total vs traces_spanmetrics_calls_total. Why don't we simply expose a namespace setting on the processor side, so that integrating with the Tempo APM table becomes easy? Then we could just do this:

processors:
  batch:
  spanmetrics:
    metrics_exporter: otlp/spanmetrics
    namespace: traces_spanmetrics   # proposed new option: prefix applied to the generated metric names
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
    dimensions_cache_size: 1000
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"
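
With this (assuming the namespace is simply joined to the existing metric names with an underscore, as the example above suggests), the generated series would look like the names the Tempo document describes:

calls_total  ->  traces_spanmetrics_calls_total
latency      ->  traces_spanmetrics_latency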

Link to tracking Issue:

Testing:

Documentation:

@Cluas Cluas requested a review from a team as a code owner January 31, 2023 15:18
@linux-foundation-easycla

linux-foundation-easycla bot commented Jan 31, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: Cluas / name: Cluas (83a9736)

@github-actions github-actions bot added the processor/spanmetrics Span Metrics processor label Jan 31, 2023
Contributor

@albertteoh albertteoh left a comment


Thanks for the contribution; it looks good, just a comment about the test assertion.

@Cluas Cluas requested review from albertteoh and removed request for bogdandrutu February 3, 2023 16:45
Contributor

@albertteoh albertteoh left a comment


👍🏼

@devrimdemiroz

I am looking forward to this change as well 🥇

Thanks @Cluas !

@devrimdemiroz

devrimdemiroz commented Feb 10, 2023

@Cluas

I hope it resolves with no side effects. I noticed some other ugly behaviour as well, such as it reusing the same exporter config as spanmetrics; I documented it here.

For now, the metricstransform processor at least eased my headache.

@Cluas
Contributor Author

Cluas commented Feb 10, 2023

metricstransform:
  transforms:
    # Rename the generated counter to the Tempo APM table name and
    # rename its labels for the Tempo integration.
    - include: calls_total
      match_type: strict
      action: update
      new_name: traces_spanmetrics_calls_total
      operations:
        - action: update_label
          label: operation
          new_label: span.name
        - action: update_label
          label: service.name
          new_label: service
    # Rename the generated latency histogram as well.
    - include: latency
      match_type: strict
      action: update
      new_name: traces_spanmetrics_latency
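
(This is the workaround shape: the transformed series carry the traces_spanmetrics_* names described in the Tempo APM table document, without needing the namespace option from this PR.)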

@Cluas
Contributor Author

Cluas commented Feb 10, 2023

@dashpole PTAL

@kovrus
Member

kovrus commented Feb 10, 2023

Thanks for addressing this issue!

However, it seems that calls_total mixes the Prometheus naming convention into the metric name. There was some significant work done in this area; see #8950 and the attached PRs, and also the prometheus-metric-points-to-otlp spec. IMHO, the spanmetrics processor should probably generate a valid OTel metric name, which can be further converted to the Prometheus metric name format by Prometheus exporters or transformed with certain processors if needed.

I am wondering whether we shall provide the ability to set histogram and counter names explicitly via configuration instead of only providing a prefix? cc: @dashpole @gouthamve @albertteoh
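
For illustration, a minimal sketch of exposing the generated metrics through the collector's Prometheus exporter, which is one place where that OTel-to-Prometheus name conversion can happen; the endpoint value is just a placeholder:

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"   # placeholder listen address for the scrape endpoint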

@gouthamve
Member

I agree that the OTLP metric should be calls. When it's exported to Prometheus, the exporter will automatically add _total.

I am wondering whether we shall provide the ability to set histogram and counter names explicitly via configuration instead of only providing a prefix?

I like the idea, but I am also a little conflicted. This means if we are generating 5 different metrics, we'll need to set 5 different names? A prefix makes it easier. However if we know that we will only export a couple of metrics from this processor, maybe explicit naming makes sense.

@kovrus
Member

kovrus commented Feb 10, 2023

I agree that the OTLP metric should be calls. When it's exported to Prometheus, the exporter will automatically add _total.

Yes, we should drop _total.

I am wondering whether we shall provide the ability to set histogram and counter names explicitly via configuration instead of only providing a prefix?

I like the idea, but I am also a little conflicted. This means if we are generating 5 different metrics, we'll need to set 5 different names? A prefix makes it easier. However if we know that we will only export a couple of metrics from this processor, maybe explicit naming makes sense.

For now, it is only 2. Right, maybe the prefix makes more sense. Also, shall we use . instead of _ to compose the OTel metric names?

@gouthamve
Member

I would use . yep.

@Cluas
Contributor Author

Cluas commented Feb 11, 2023

@gouthamve @kovrus Thank you both for your review suggestions!

@kovrus
Member

kovrus commented Feb 11, 2023

@Cluas thanks! Do you mind opening a separate PR with the breaking renaming change (calls_total to calls)?

@Cluas
Contributor Author

Cluas commented Feb 11, 2023

@Cluas thanks! Do you mind opening a separate PR with the breaking renaming change (calls_total to calls)?

OK, I'll remove this change first and submit it in a separate PR.

@albertteoh
Contributor

I like the idea of dropping _total (and it's a good idea to move it to a separate PR) to keep to OTel standards and stay agnostic to the metrics storage backend.

Re: joining on ., it's not clear to me what the consensus was from this discussion: open-telemetry/opentelemetry-specification#2740. Either way, I think it'll be okay since, IIUC, it looks like the Prometheus exporter will normalise the names and replace . with _.

When we do make these changes, I think we'd need to add clear instructions to include the --feature-gates=pkg.translator.prometheus.NormalizeName feature flag when running the OTel Collector.
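
For illustration, assuming the processor ends up emitting OTel-style names joined with . (e.g. under a hypothetical traces.spanmetrics namespace) and the exporter-side normalization described above is enabled, the mapping would look roughly like:

traces.spanmetrics.calls    ->  traces_spanmetrics_calls_total
traces.spanmetrics.latency  ->  traces_spanmetrics_latency_bucket / _sum / _count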

@runforesight

runforesight bot commented Feb 12, 2023

Foresight Summary

Major Impacts

build-and-test duration (33 minutes 45 seconds) has decreased by 31 minutes 53 seconds compared to the main branch average (1 hour 5 minutes 38 seconds).

All workflows finished on 14th Feb, 2023:

⭕  build-and-test-windows finished in 6 seconds (40 minutes 44 seconds less than the main branch avg.); the windows-unittest-matrix and windows-unittest jobs reported no test results.

✅  check-links finished in 1 minute 21 seconds (1 minute 14 seconds less than the main branch avg.).

✅  telemetrygen finished in 1 minute 49 seconds (1 minute 18 seconds less than the main branch avg.).

✅  changelog finished in 2 minutes 20 seconds (1 minute 23 seconds less than the main branch avg.).

✅  prometheus-compliance-tests finished in 12 minutes 21 seconds (⚠️ 3 minutes 32 seconds more than the main branch avg.); 21 tests passed, 0 failed.

❌  build-and-test finished in 33 minutes 45 seconds (31 minutes 53 seconds less than the main branch avg.); 3 jobs failed: unittest-matrix (1.18, processor) in its Run Unit Tests step (1123 tests passed, 6 failed), plus the dependent unittest (1.18) and unittest (1.19) Interpret result steps. All 6 test failures are in that processor unit-test job; the remaining unit-test, correctness, integration, lint, and build jobs passed.

✅  e2e-tests finished in 15 minutes 10 seconds.

❌  load-tests finished in 25 minutes 50 seconds (⚠️ 8 minutes 48 seconds more than the main branch avg.); 1 job failed: loadtest (TestBallastMemory|TestLog10kDPS) with 2 test failures (17 passed, 2 failed). The other loadtest jobs passed.


@kovrus
Member

kovrus commented Feb 14, 2023

@Cluas thanks! Do you mind opening a separate PR with the breaking renaming change (calls_total to calls)?

OK, I'll remove this change first and submit it in a separate PR.

Let's wait till #18535 is merged and open a PR dropping _total against the new spanmetrics connector component, so as not to break the spanmetrics processor.

@Cluas Cluas closed this Feb 15, 2023
@Cluas
Contributor Author

Cluas commented Feb 15, 2023

So let's do that on the new spanmetrics connector component and avoid a breaking change.

Labels
processor/spanmetrics Span Metrics processor

6 participants