Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/transform] Add ContextStatements config #15382

Merged
merged 29 commits into from
Nov 8, 2022

Conversation

TylerHelmuth
Copy link
Member

Description:
Adds the ContextStatement config and its supporting API. This config will allow users to specify a list of statements associated to a context. The processor will then be updated to use this API to processing all the statements associated to a context in the defined order.

As new Contexts are added the switch statement and ParserCollection will grow. Traces, Metrics, and Logs processors will use the corresponding New[Signal]ParserCollection to gain access to the parsers needed for their possible contexts.

Link to tracking Issue:
#14578

Related to #15381

@TylerHelmuth TylerHelmuth requested review from a team and bogdandrutu as code owners October 21, 2022 21:53
@TylerHelmuth TylerHelmuth added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Oct 21, 2022
Copy link
Member

@kentquirk kentquirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit, this is a bit wordier than I had originally thought we were talking about -- but from an efficiency point of view this is exactly the right way to do it -- if you don't specify a deep context you don't have to pay for it.

This is solid work. Thanks.

@TylerHelmuth
Copy link
Member Author

@bogdandrutu @evan-bradley please review.

Copy link
Member

@kovrus kovrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Copy link
Contributor

@evan-bradley evan-bradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I have some minor nitpicks/questions, but nothing blocking.

processor/transformprocessor/internal/common/processor.go Outdated Show resolved Hide resolved
processor/transformprocessor/internal/common/processor.go Outdated Show resolved Hide resolved
@TylerHelmuth
Copy link
Member Author

@kovrus would the routing processor want to take advantage of this type of configuration/API?

@kovrus
Copy link
Member

kovrus commented Oct 27, 2022

@kovrus would the routing processor want to take advantage of this type of configuration/API?

As of now, routing statements use the resource context. I am now aware of any request to extend the context scope, but I guess, once we get such request we can make use of the ContextStatement configuration.

@TylerHelmuth
Copy link
Member Author

TylerHelmuth commented Nov 5, 2022

@bogdandrutu @evan-bradley @kentquirk @kovrus I push some commits that I think gives the best of all worlds.

The API now has Trace/Metric/Log specific Contexts so that the Trace/Metric/Log specific processors can use their specific types. At the same time, ParserCollection is now a base that allows handling ResourceStatements and ScopeStatements are parsed in exactly 1 place and reused for the [Trace/Metric/Log]ParserCollection that encapsulates everything that a user would need to parse statements for the specific signal. We end up with an API that has no extraneous functions/interfaces.

When the logs processor uses this API it looks like

type Processor struct {
	contexts []common.LogsContext
}

func NewProcessor(statements []common.ContextStatements, settings component.TelemetrySettings) (*Processor, error) {
	ottlp := common.NewLogParserCollection(Functions(), settings)
	parsedStatements, err := ottlp.ParseContextStatements(statements)
	if err != nil {
		return nil, err
	}
	return &Processor{
		contexts: parsedStatements,
	}, nil
}

func (p *Processor) ProcessLogs(ctx context.Context, td plog.Logs) (plog.Logs, error) {
	for _, context := range p.contexts {
		err := context.ProcessLogs(ctx, td)
		if err != nil {
			return td, err
		}
	}
	return td, nil
}

Please re-review with fresh eyes.

processor/transformprocessor/internal/common/config.go Outdated Show resolved Hide resolved
Comment on lines +44 to +50
func ResourceFunctions() map[string]interface{} {
return Functions[ottlresource.TransformContext]()
}

func ScopeFunctions() map[string]interface{} {
return Functions[ottlscope.TransformContext]()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need these specialized calls?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu I made specialized calls to represent the possibility of Resource-only functions or Scope-only functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we need a public API for that, just add a comment to the "Functions" that can be initialized with multiple contexts.

Copy link
Member Author

@TylerHelmuth TylerHelmuth Nov 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu This isn't a public API, it would never be moved to the OTTL. This is a way for the transform processor to enforce its definition of what functions are allowed to be used for Scope and Resource. Similar to how each signal in the transform processor has their own functions.go.

}
}

func (pc TraceParserCollection) ParseContextStatements(contextStatements []ContextStatements) ([]TracesContext, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason to return a slice? I think we can return just one and call the statements sequentially.

Copy link
Member Author

@TylerHelmuth TylerHelmuth Nov 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but it will mean implementing loops in config.Validate and each signal's NewProcessor function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think otherwise you will duplicate the loops in the user code (consumer of the package).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu If we remove this function's ability to handle a list of ContextStatements then the NewProcessor and Validate functions all look like:

type Processor struct {
	contexts []consumer.Logs
}

func NewProcessor(contextStatements []common.ContextStatements, settings component.TelemetrySettings) (*Processor, error) {
	pc, err := common.NewLogParserCollection(Functions(), settings)
	if err != nil {
		return nil, err
	}
	contexts := make([]consumer.Logs, len(contextStatements))
	for i, c := range contextStatements {
		context, err := pc.ParseContextStatements(c)
		if err != nil {
			return nil, err
		}
		contexts[i] = context
	}
	return &Processor{
		contexts: contexts,
	}, nil
}

I would definitely prefer to hide this shared logic in the ParseContextStatement functions. In the future, if this all moves into the OTTL, we can also expose a function for handling single ContextStatements if necessary

Comment on lines 31 to 33
type TracesContext interface {
ProcessTraces(ctx context.Context, td ptrace.Traces) error
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question I asked, why not replace this with consumer.Traces

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu I don't see the value of type traceStatements []*ottl.Statement[ottltraces.TransformContext] implementing a Capabilities() Capabilities function. Also, I don't like how the API becomes ConsumeTraces instead of ProcessTraces as the function does not consume traces in the sense that the consumer.Traces API means.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu switched to consumer so you can see what it looks like.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think the direction is good.

processor/transformprocessor/internal/common/logs.go Outdated Show resolved Hide resolved
return nil
}

type LogParserCollection struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this type, or we can replace it with something like this:

func ParseLogStatements(functions map[string]interface{}, settings component.TelemetrySettings, contextStatements ContextStatements) (consumer.Logs, error) {
	logParser := ottllogs.NewParser(functions, settings)
	switch contextStatements.Context {
	case Log:
		lStatements, err := logParser.ParseStatements(contextStatements.Statements)
		if err != nil {
			return nil, err
		}
		return logStatements(lStatements), nil
	default:
		statements, err := pc.parseCommonContextStatements(contextStatements)
		if err != nil {
			return nil, err
		}
		return statements, nil
	}
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu I started with this approach but added the extra type to simplify the ParseSignalStatements functions parameters. If there isn't enough value, initializing in the function isn't a big deal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if ParseSignalStatements only takes in a single ContextStatement, then we will be recreating each parser every time we call the function. Seems wasteful.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets iterate after we see the usage in the processor.

@bogdandrutu bogdandrutu merged commit 9f68e4e into open-telemetry:main Nov 8, 2022
@TylerHelmuth TylerHelmuth deleted the tp-enhanced-contexts-2 branch November 8, 2022 00:37
dineshg13 pushed a commit to DataDog/opentelemetry-collector-contrib that referenced this pull request Nov 21, 2022
)

* Add ContextStatements config

* Add high-level context

* respond to feedback

* Fix merge

* Adjust ParserCollection

* Add example usage temporarily

* apply feedback

* Split into individual files

* Add spanevent and metric contexts

* revert logs processor example

* Fix NewParserCollection

* Fix lint

* Fix checks

* Refactor ParserCollection to have a base

* Update interface to handle context.Context

* cleanup var names

* Fix case statement

* Implement UnmarshalText for ContextID

* Switch to consumer.Signal

* Fix lint

* Update processor/transformprocessor/internal/common/logs.go

Co-authored-by: Evan Bradley <[email protected]>

* Update processor/transformprocessor/internal/common/metrics.go

Co-authored-by: Evan Bradley <[email protected]>

* apply feedback

* Add back Options for each parser

Co-authored-by: Evan Bradley <[email protected]>
shalper2 pushed a commit to shalper2/opentelemetry-collector-contrib that referenced this pull request Dec 6, 2022
)

* Add ContextStatements config

* Add high-level context

* respond to feedback

* Fix merge

* Adjust ParserCollection

* Add example usage temporarily

* apply feedback

* Split into individual files

* Add spanevent and metric contexts

* revert logs processor example

* Fix NewParserCollection

* Fix lint

* Fix checks

* Refactor ParserCollection to have a base

* Update interface to handle context.Context

* cleanup var names

* Fix case statement

* Implement UnmarshalText for ContextID

* Switch to consumer.Signal

* Fix lint

* Update processor/transformprocessor/internal/common/logs.go

Co-authored-by: Evan Bradley <[email protected]>

* Update processor/transformprocessor/internal/common/metrics.go

Co-authored-by: Evan Bradley <[email protected]>

* apply feedback

* Add back Options for each parser

Co-authored-by: Evan Bradley <[email protected]>
@plantfansam plantfansam mentioned this pull request Jul 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Skip Changelog PRs that do not require a CHANGELOG.md entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants