
Add exceptions connector #21090

Merged
merged 49 commits into from
Jul 19, 2023
Merged

Conversation

@marctc (Contributor) commented Apr 20, 2023

Description: Added a connector to create metrics from recorded exceptions. The implementation
is heavily inspired by spanmetricsconnector.

Link to tracking Issue: #17272

Documentation: Added examples in README.
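For context on how such a connector is wired up: a connector acts as an exporter in one pipeline and a receiver in another. Below is an illustrative sketch only (component and exporter names are assumptions, not taken from this PR's README, assuming the connector is registered as `exceptions`):

```yaml
receivers:
  otlp:
    protocols:
      grpc:

connectors:
  exceptions:

exporters:
  logging:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [exceptions]
    metrics/exceptions:
      receivers: [exceptions]
      exporters: [logging]
    logs/exceptions:
      receivers: [exceptions]
      exporters: [logging]
```

The traces pipeline feeds the connector, which then acts as the receiver for the metrics and logs pipelines it generates data into.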

@marctc marctc changed the title Add initial version of exceptionmetrics connector Add exceptions connector Apr 24, 2023
@atoulme atoulme marked this pull request as ready for review April 29, 2023 18:46
@atoulme atoulme requested a review from a team as a code owner April 29, 2023 18:46
@atoulme (Contributor) commented Apr 29, 2023

Sorry, I took this PR out of draft by mistake. I meant to accept the builds to run.

const (
	serviceNameKey = conventions.AttributeServiceName
	spanKindKey    = "span.kind"   // OpenTelemetry non-standard constant.
	statusCodeKey  = "status.code" // OpenTelemetry non-standard constant.
)
Contributor:

What does this mean?

Contributor:

OK, I see how it's used. It might be a good idea to make this standard by formalizing the mapping from trace to metric on opentelemetry-specification. WDYT?

Contributor Author:

Added a TODO to fix later. Is that enough?

Resolved (outdated) review threads on:
- connector/exceptionsconnector/connector.go
- connector/exceptionsconnector/README.md
- connector/exceptionsconnector/internal/cache/cache.go
- connector/exceptionsconnector/mocks/MetricsConsumer.go

@marctc (Contributor Author) commented May 5, 2023

Hey, thanks for the review comments! I initially opened the PR as a draft because I still needed to add the logs part. I think the PR comments are still relevant; I will address them once I'm back from holidays 🙏.

In the meantime, maybe you could help me with a couple of questions that I have about connectors. In this branch, I also added support for generating logs in the same connector. It's great because it reduces configuration, but is there a way to make the same connector share the same code and process? I initially tried to do everything in one component, but if you don't specify, for example, logs in the service's pipelines, the logs consumer panics when trying to send logs. We don't have a connectorhelper, so this requires some code duplication for both signals.

On the other hand, I was also wondering whether the connector can share the same process when sending metrics and logs. The current implementation relies on a cache that stores the aggregated observed spans with exceptions, which are later used to generate metrics and logs. If someone configured the collector to report exceptions as both metrics and logs, they would have two running instances of the component with two duplicated copies of the cache in memory. Is that expected?

@jpkrohling (Member) commented:

> if you don't specify, for example, logs in the service's pipelines the log's consumer will panic when trying to send logs

It should not panic, I believe. Perhaps @djaglowski can share what's the expectation there.

> would have 2 running processes of component with 2 duplicated instances of the cache in memory

This is up to the component. Some components use a singleton pattern, with metrics/traces/logs processors made from a single instance. I believe the same can be accomplished with connectors.
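The singleton pattern mentioned here can be sketched as follows. This is an illustrative, self-contained example, not the collector's actual API: connectors built from the same configuration key look up one shared instance, so a traces-to-metrics and a traces-to-logs connector would reuse a single exception cache instead of keeping two copies.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedConnector is a stand-in for the component instance that both the
// traces->metrics and traces->logs connectors would wrap. The cache field
// represents the shared exception cache described in the discussion.
type sharedConnector struct {
	cache map[string]int
}

var (
	registry   = map[string]*sharedConnector{}
	registryMu sync.Mutex
)

// getOrCreate returns the existing instance for a config key, creating it
// on first use. Subsequent calls with the same key get the same pointer.
func getOrCreate(cfgKey string) *sharedConnector {
	registryMu.Lock()
	defer registryMu.Unlock()
	if c, ok := registry[cfgKey]; ok {
		return c
	}
	c := &sharedConnector{cache: map[string]int{}}
	registry[cfgKey] = c
	return c
}

func main() {
	toMetrics := getOrCreate("exceptions")
	toLogs := getOrCreate("exceptions")
	fmt.Println(toMetrics == toLogs) // true: both signals share one instance
}
```

The spanmetrics components use a similar approach; the key point is that instance lifetime is managed by the factory rather than by each signal-specific wrapper.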

@jpkrohling (Member) left a comment:

I'll review the actual code later on, but I wanted to mention that this PR seems to be missing a few things that are expected from new components, such as versions.yaml or entries to the internal distribution used for testing.

Here's some information on what's expected from new components: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#adding-new-components

@djaglowski (Member) commented:

> It should not panic, I believe. Perhaps @djaglowski can share what's the expectation there.

Correct.

@marctc, if I'm understanding correctly, you tried to implement ConsumeTraces so that one call would emit both metrics and logs. You actually need to implement two separate ConsumeTraces functions and trust the collector to call the appropriate one based on the user's configuration.

In other words, consider TracesToMetrics and TracesToLogs to be entirely independent capabilities. The user can use one or the other, or both. You can share code between them, if it makes sense, but don't assume any shared state or try to use one function to emit to both consumers.

There are a couple ways to do this, but here's a straightforward approach. If there's obvious code to share, great. If not, it's ok to just keep everything separate.

// connector.go
type connectorImp struct {
	sharedStuff
}

type tracesToMetricsConnector struct {
	connectorImp
	metricsStuff
	metricsConsumer consumer.Metrics
}

func (tm *tracesToMetricsConnector) ConsumeTraces(ctx context.Context, traces ptrace.Traces) error {
	metrics := doMetricsStuff(traces)
	return tm.metricsConsumer.ConsumeMetrics(ctx, metrics)
}

type tracesToLogsConnector struct {
	connectorImp
	logsStuff
	logsConsumer consumer.Logs
}

func (tl *tracesToLogsConnector) ConsumeTraces(ctx context.Context, traces ptrace.Traces) error {
	logs := doLogsStuff(traces)
	return tl.logsConsumer.ConsumeLogs(ctx, logs)
}

// factory.go
func NewFactory() connector.Factory {
	return connector.NewFactory(
		typeStr,
		createDefaultConfig,
		connector.WithTracesToMetrics(createTracesToMetricsConnector, stability),
		connector.WithTracesToLogs(createTracesToLogsConnector, stability),
	)
}

func createTracesToMetricsConnector(ctx context.Context, params connector.CreateSettings, cfg component.Config, nextConsumer consumer.Metrics) (connector.Traces, error) {
	c, err := newConnector(params.Logger, cfg, ...)
	if err != nil {
		return nil, err
	}
	return &tracesToMetricsConnector{
		connectorImp: c,
		metricsStuff: initMetricsStuff(),
		metricsConsumer: nextConsumer,
	}, nil
}

func createTracesToLogsConnector(ctx context.Context, params connector.CreateSettings, cfg component.Config, nextConsumer consumer.Logs) (connector.Traces, error) {
	c, err := newConnector(params.Logger, cfg, ...)
	if err != nil {
		return nil, err
	}
	return &tracesToLogsConnector{
		connectorImp: c,
		logsStuff: initLogsStuff(),
		logsConsumer: nextConsumer,
	}, nil
}

@github-actions bot commented Jun 7, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jun 7, 2023
@djaglowski (Member) commented:
I'd love to see this move forward. @marctc will you still be able to work on this?

@marctc (Contributor Author) commented Jun 7, 2023

> I'd love to see this move forward. @marctc will you still be able to work on this?

Hey Daniel. Yes, I'm actually working on this right now after my vacations. Thanks for the encouragement!

@djaglowski djaglowski removed the Stale label Jun 7, 2023
@github-actions github-actions bot added the cmd/otelcontribcol otelcontribcol command label Jun 8, 2023
@marctc (Contributor Author) commented Jun 8, 2023

The PR is ready for review, thanks for waiting. I addressed some of the initial concerns; I hope this goes in the right direction.

@github-actions bot commented:
This PR was marked stale due to lack of activity. It will be closed in 14 days.

@marctc marctc requested a review from jpkrohling July 14, 2023 16:12
- Span kind
- Status code
- Exception message
- Exception type
Member:

Could you please create a tracking issue and mark this as resolved?


Each log will additionally have the following attributes:
- Exception stacktrace
- HTTP attributes (if available), such as `http.method` and `http.host` among others.
Member:

I forgot already, are you getting all HTTP attributes, or are you using a fixed list of attributes? It'd be good to be clear here. If it's all HTTP attributes, it would be good to state exactly which semantic convention version(s) you support.

And as a feature request: allow users to specify which semantic convention(s) they want to follow.

func extractHTTP(attr pcommon.Map) map[string]string {
	http := make(map[string]string)
	attr.Range(func(k string, v pcommon.Value) bool {
		if strings.HasPrefix(k, "http.") {
			http[k] = v.Str()
		}
		return true // continue iterating over all attributes
	})
	return http
}
Member:

This answers my previous question. The readme has to tell users to expect all attributes starting with "http." are going to be copied over.

@marctc (Contributor Author) commented Jul 18, 2023

@jpkrohling I addressed the README feedback. I think that in order to create follow-up issues we have to merge this one first (then I can choose this component in the Component(s) dropdown of the Issue: Feature request form), unless you suggest doing it beforehand. Thanks!

@jpkrohling jpkrohling merged commit 9c470a0 into open-telemetry:main Jul 19, 2023
91 checks passed
@github-actions github-actions bot added this to the next release milestone Jul 19, 2023