Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pkg/pdatautil] Fixed panic in nested map hashing #18912

Merged

Conversation

cpheps
Copy link
Contributor

@cpheps cpheps commented Feb 24, 2023

Description:
Fixes #18910

Changed how the keysBuf works so that nested maps will append their keys to the buffer for that recursion then they will be removed before the recursion exist. This allows keeping a single buffer while also protecting ensure recursion doesn't pollute the buffer order.

Link to tracking Issue: #18910

Testing: Added a unit test that could replicate the panic in the old code.

@runforesight
Copy link

runforesight bot commented Feb 24, 2023

Foresight Summary

    
Major Impacts

TestStartAndShutdownRemote ❌ failed 2 times in 13 runs (15% fail rate).
build-and-test duration(20 minutes 1 second) has decreased 54 minutes 20 seconds compared to main branch avg(1 hour 14 minutes 21 seconds).
View More Details

⭕  build-and-test-windows workflow has finished in 7 seconds (43 minutes 2 seconds less than main branch avg.) and finished at 27th Feb, 2023.


Job Failed Steps Tests
windows-unittest-matrix -     🔗  N/A See Details
windows-unittest -     🔗  N/A See Details

✅  check-links workflow has finished in 54 seconds (1 minute 45 seconds less than main branch avg.) and finished at 27th Feb, 2023.


Job Failed Steps Tests
changed files -     🔗  N/A See Details
check-links -     🔗  N/A See Details

✅  telemetrygen workflow has finished in 1 minute 1 second (2 minutes 17 seconds less than main branch avg.) and finished at 27th Feb, 2023.


Job Failed Steps Tests
build-dev -     🔗  N/A See Details
publish-latest -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details

✅  changelog workflow has finished in 1 minute 37 seconds (1 minute 9 seconds less than main branch avg.) and finished at 27th Feb, 2023.


Job Failed Steps Tests
changelog -     🔗  N/A See Details

✅  prometheus-compliance-tests workflow has finished in 3 minutes 20 seconds (6 minutes 7 seconds less than main branch avg.) and finished at 27th Feb, 2023.


Job Failed Steps Tests
prometheus-compliance-tests -     🔗  ✅ 21  ❌ 0  ⏭ 0    🔗 See Details

✅  load-tests workflow has finished in 7 minutes 3 seconds (11 minutes 17 seconds less than main branch avg.) and finished at 27th Feb, 2023.


Job Failed Steps Tests
loadtest (TestTraceAttributesProcessor) -     🔗  ✅ 3  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestIdleMode) -     🔗  ✅ 1  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetric10kDPS|TestMetricsFromFile) -     🔗  ✅ 6  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceNoBackend10kSPS|TestTrace1kSPSWithAttrs) -     🔗  ✅ 8  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetricResourceProcessor|TestTrace10kSPS) -     🔗  ✅ 12  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceBallast1kSPSWithAttrs|TestTraceBallast1kSPSAddAttrs) -     🔗  ✅ 10  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details
loadtest (TestBallastMemory|TestLog10kDPS) -     🔗  ✅ 18  ❌ 0  ⏭ 0    🔗 See Details

✅  e2e-tests workflow has finished in 13 minutes 17 seconds (3 minutes 11 seconds less than main branch avg.) and finished at 27th Feb, 2023.


Job Failed Steps Tests
kubernetes-test (v1.26.0) -     🔗  N/A See Details
kubernetes-test (v1.25.3) -     🔗  N/A See Details
kubernetes-test (v1.24.7) -     🔗  N/A See Details
kubernetes-test (v1.23.13) -     🔗  N/A See Details

 build-and-test workflow has finished in 12 minutes 37 seconds (1 hour 1 minute 44 seconds less than main branch avg.) and finished at 27th Feb, 2023.


Job Failed Steps Tests
unittest-matrix (1.19, receiver-0) N/A  ✅ 2580  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, receiver-0) N/A  ✅ 2580  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, exporter) N/A  ✅ 2456  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, receiver-1) N/A  ✅ 1937  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, receiver-1) N/A  ✅ 1937  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, exporter) N/A  ✅ 2456  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, other) N/A  ✅ 4772  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, other) N/A  ✅ 4772  ❌ 0  ⏭ 0    🔗 See Details

🔎 See details on Foresight

*You can configure Foresight comments in your organization settings page.

@cpheps cpheps force-pushed the fix/pdata-util-nested-map-panic branch from 3a6e8b1 to 4f9ba25 Compare February 27, 2023 13:22
@cpheps
Copy link
Contributor Author

cpheps commented Feb 27, 2023

I don't think PR failures are related to my changes.

Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than throw out the optimization altogether, can we find a way to handle recursion correctly?

At a minimum, I think we could pass a bool to writeMapHash that indicates whether it is a recursive call or not. MapHash can pass false, while writeValueHash passes true. Then at least we keep the optimization for non-recursive maps, which is likely the majority of cases.

Possibly an even better solution would have nested maps request their own hashWriter from the pool and use that, rather than polluting the existing one. This would require a bit more work, so maybe can be a separate issue.

@djaglowski djaglowski changed the title Fixed panic in pdatautil nested map hashing [pkg/pdatautil] Fixed panic in nested map hashing Feb 27, 2023
@cpheps
Copy link
Contributor Author

cpheps commented Feb 27, 2023

@djaglowski I reworked this to use a single buffer and some slice tricks to only focus on the current recursions key set. Let me know if this seems like a better solution.

pkg/pdatautil/hash_test.go Outdated Show resolved Hide resolved
pkg/pdatautil/hash.go Show resolved Hide resolved
@cpheps cpheps force-pushed the fix/pdata-util-nested-map-panic branch from 65537c8 to 33fe06f Compare February 27, 2023 17:44
Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@djaglowski djaglowski merged commit 848486f into open-telemetry:main Feb 27, 2023
newly12 pushed a commit to newly12/opentelemetry-collector-contrib that referenced this pull request Feb 28, 2023
* Fixed panic in pdatautil map hashing

Signed-off-by: Corbin Phelps <[email protected]>
@cpheps cpheps deleted the fix/pdata-util-nested-map-panic branch February 28, 2023 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[pkg/pdatautil] Hashing Maps can panic with nested maps
4 participants