
[testbed] Detect and fix data race at testbed integration test #30549

Merged: 6 commits into open-telemetry:main on Jan 16, 2024

Conversation

@james-ryans (Contributor) commented on Jan 14, 2024

Description:
Fixing a bug: the testbed module produces many data race warnings when tested with `go test -race`, especially for traces. The data race issues and their fixes are listed below:

1. Add a mutex around the MockBackend `startedAt` variable.
   `tc := testbed.NewTestCase()` starts a `logStats()` goroutine that calls `MockBackend.GetStats()`, which reads `startedAt`, so a data race occurs when `tc.StartBackend()` later writes to `startedAt`. (A mutex sketch follows this list.)
2. Add a mutex around the LoadGenerator `startedAt` variable.
   The reason is the same as for MockBackend.
3. Increment the MockBackend `numSpansReceived` counter only after `ConsumeTraces()` returns.
   Incrementing with `tc.numSpansReceived.Add(uint64(td.SpanCount()))` before `ConsumeTraces()` finishes makes `tc.LoadGenerator.DataItemsSent()` equal `tc.MockBackend.DataItemsReceived()` too early; the test case then assumes MockBackend has already received all the spans and proceeds to `tc.ValidateData()` while MockBackend is still consuming the traces. (An ordering sketch follows this list.)
4. Read `td.SpanCount()` before `batchprocessor` consumes the spans in `opencensusreceiver` and `zipkinreceiver`.
   With `batchprocessor` in the pipeline, one step moves the source `td.ResourceSpans()` into the batch, so calling `td.SpanCount()` while `batchprocessor` is consuming the traces causes a data race. (A span-count sketch follows this list.)
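
A minimal mutex sketch for items 1 and 2, not the actual testbed code; the type and method names besides `startedAt` are illustrative:

```go
package testbed

import (
	"sync"
	"time"
)

// mockBackendState sketches the race: startedAt is written when the
// backend starts and read concurrently by the logStats() goroutine,
// so both sides must go through the same mutex.
type mockBackendState struct {
	mu        sync.Mutex
	startedAt time.Time
}

// Start is the writer side (called when the backend is started).
func (m *mockBackendState) Start() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.startedAt = time.Now()
}

// Uptime is the reader side (called from periodic stats logging).
func (m *mockBackendState) Uptime() time.Duration {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.startedAt.IsZero() {
		return 0
	}
	return time.Since(m.startedAt)
}
```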
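
An ordering sketch for item 3, assuming a stand-in `mockTraceConsumer` type with illustrative field names; the actual MockBackend wiring differs:

```go
package testbed

import (
	"context"
	"sync/atomic"

	"go.opentelemetry.io/collector/consumer"
	"go.opentelemetry.io/collector/pdata/ptrace"
)

// mockTraceConsumer is an illustrative stand-in for the MockBackend
// trace consumer.
type mockTraceConsumer struct {
	next             consumer.Traces
	numSpansReceived atomic.Uint64
}

// ConsumeTraces forwards the traces first and counts them only after
// the downstream consumer returns, so DataItemsReceived() cannot
// catch up with DataItemsSent() while consumption is still running.
func (m *mockTraceConsumer) ConsumeTraces(ctx context.Context, td ptrace.Traces) error {
	if err := m.next.ConsumeTraces(ctx, td); err != nil {
		return err
	}
	m.numSpansReceived.Add(uint64(td.SpanCount()))
	return nil
}
```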
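
A span-count sketch for item 4; `handleTraces` is a hypothetical helper, not the receivers' actual code:

```go
package testbed

import (
	"context"

	"go.opentelemetry.io/collector/consumer"
	"go.opentelemetry.io/collector/pdata/ptrace"
)

// handleTraces captures td.SpanCount() before handing td to the next
// consumer: a downstream batchprocessor may move td's ResourceSpans
// into its batch while still consuming, which would race with a
// SpanCount() call made after ConsumeTraces().
func handleTraces(ctx context.Context, next consumer.Traces, td ptrace.Traces) (int, error) {
	numSpans := td.SpanCount() // read before ConsumeTraces, not after
	err := next.ConsumeTraces(ctx, td)
	// numSpans remains valid for reporting even if the batch processor
	// already took ownership of td's spans.
	return numSpans, err
}
```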

Testing:
Add the `-race` flag to the `go test` invocation at line 22 of `testbed/runtests.sh`.
Run `TESTS_DIR=correctnesstests/traces make e2e-test`.

linux-foundation-easycla bot commented Jan 14, 2024

CLA Signed. The committers listed above are authorized under a signed CLA.

@djaglowski djaglowski added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Jan 16, 2024
@djaglowski djaglowski merged commit 95e673e into open-telemetry:main Jan 16, 2024
89 of 90 checks passed
@github-actions github-actions bot added this to the next release milestone Jan 16, 2024
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this pull request Feb 1, 2024
anthoai97 pushed a commit to anthoai97/opentelemetry-collector-contrib that referenced this pull request Feb 12, 2024
Labels
receiver/opencensus · receiver/zipkin · Skip Changelog (PRs that do not require a CHANGELOG.md entry) · testbed