Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[internal/stanza] Batch in emitter instead of converter #6378

Merged
merged 21 commits into from
Dec 2, 2021

Conversation

BinaryFissionGames
Copy link
Contributor

@BinaryFissionGames BinaryFissionGames commented Nov 18, 2021

Description:

  • Moved batching in the internal/stanza package from the Converter to the LogEmitter to increase performance.

When the worker pattern was implemented, batching was put after the worker loop in the converter. The problem with this is that the workers have very little work to do per cycle. Ideally, workers have large amounts of work they do before they request more. Otherwise, workers begin to incur a lot of overhead from requesting work constantly (in this case selecting from a channel), which can (and does) cause performance issues under high load.

Instead of a unit of work being a single entry, this PR makes a unit of work a batch of entries, moving the batching logic into the LogEmitter.

Testing:
Added tests to LogEmitter to validate MaxBatchSize and FlushInterval are still respected.

Benchmarks of emitter-to-consumer performance:

Before:

goos: darwin
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/internal/stanza
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkEmitterToConsumer/worker_count=1-12                  10        5860113666 ns/op        3075350700 B/op 61522383 allocs/op
BenchmarkEmitterToConsumer/worker_count=2-12                  10        4330471031 ns/op        3075330906 B/op 61521719 allocs/op
BenchmarkEmitterToConsumer/worker_count=4-12                  10        3720432579 ns/op        3075325519 B/op 61521505 allocs/op
BenchmarkEmitterToConsumer/worker_count=6-12                  10        3540257550 ns/op        3075323456 B/op 61521449 allocs/op
BenchmarkEmitterToConsumer/worker_count=8-12                  10        3540537514 ns/op        3075327725 B/op 61521481 allocs/op
PASS

After:

goos: darwin
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/internal/stanza
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkEmitterToConsumer/worker_count=1-12                  10        3025573325 ns/op        3087150870 B/op 60540716 allocs/op
BenchmarkEmitterToConsumer/worker_count=2-12                  10        1677928788 ns/op        3087136036 B/op 60540400 allocs/op
BenchmarkEmitterToConsumer/worker_count=4-12                  10        1235078416 ns/op        3087130332 B/op 60540275 allocs/op
BenchmarkEmitterToConsumer/worker_count=6-12                  10        1223689228 ns/op        3087129233 B/op 60540246 allocs/op
BenchmarkEmitterToConsumer/worker_count=8-12                  10        1209764305 ns/op        3087128364 B/op 60540240 allocs/op

@djaglowski djaglowski self-assigned this Nov 18, 2021
@djaglowski djaglowski self-requested a review November 18, 2021 21:09
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 26, 2021
@BinaryFissionGames BinaryFissionGames marked this pull request as ready for review November 29, 2021 15:49
@BinaryFissionGames BinaryFissionGames requested a review from a team as a code owner November 29, 2021 15:49
@djaglowski djaglowski removed the Stale label Nov 29, 2021
internal/stanza/converter_test.go Outdated Show resolved Hide resolved
internal/stanza/converter.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Show resolved Hide resolved
internal/stanza/converter.go Outdated Show resolved Hide resolved
internal/stanza/converter.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Outdated Show resolved Hide resolved
internal/stanza/converter_test.go Outdated Show resolved Hide resolved
internal/stanza/converter_test.go Outdated Show resolved Hide resolved
internal/stanza/emitter.go Outdated Show resolved Hide resolved
@djaglowski
Copy link
Member

@BinaryFissionGames This is looking really nice. Can you post benchmarks once more now that we've had some changes?

@BinaryFissionGames
Copy link
Contributor Author

@djaglowski
Here's the benchmarks with what is currently in this PR:

goos: darwin
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/internal/stanza
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkEmitterToConsumer/worker_count=1-12                  10        3001780091 ns/op        3087148339 B/op 60540706 allocs/op
BenchmarkEmitterToConsumer/worker_count=2-12                  10        1722811735 ns/op        3087141495 B/op 60540454 allocs/op
BenchmarkEmitterToConsumer/worker_count=4-12                  10        1263703437 ns/op        3087132067 B/op 60540290 allocs/op
BenchmarkEmitterToConsumer/worker_count=6-12                  10        1233186755 ns/op        3087133180 B/op 60540251 allocs/op
BenchmarkEmitterToConsumer/worker_count=8-12                  10        1247131258 ns/op        3087130500 B/op 60540259 allocs/op

Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks @BinaryFissionGames

@djaglowski djaglowski added the ready to merge Code review completed; ready to merge by maintainers label Dec 1, 2021
@bogdandrutu bogdandrutu merged commit 2229b76 into open-telemetry:main Dec 2, 2021
jamesmoessis pushed a commit to atlassian-forks/opentelemetry-collector-contrib that referenced this pull request Dec 8, 2021
…ry#6378)

* Batch stanza entries up front

* Use same as previous default

* fix filelogreceiver tests

* refactor emitter.go to always defer mux unlocks

This makes it more clear/explicit what the critical sections are

* batchCount -> batchSize

* flatten resource aggregation code

* fix out of date comment for maxBatchSize

* Move checks for nil batch to avoid function calls

* Move emitter Stop directly after Start

* Use aggregation over batch for resource aggregation step

Also cleaned up comments and removed some unused bits here.

* Move converter resource aggregation map to reduced scope

Reducing the scope to just the aggregation loop prevents concurrent access.

* Remove reliance on internal/stanza defaults for filelogreceiver

* Remove paranoid nil check for cancel func

* default cancel func is noop for emitter

* Remove unnecessary option interface for emitter.

* Make sure to shutdown emitter in correct order.

* Combine some statements with following if in emitter

* Clarify batching logic in converter tests

* Fold some functions in emitter

Folded out makeNewBatch, makeNewBatchNoLock,
and appendEntry.

* add back in appendEntry, makeNewBatch

* Fix regressed if statement
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready to merge Code review completed; ready to merge by maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants