Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[extension/opamp]: Add mechanism to detect whether the collector has been orphaned #32564

Merged

Conversation

BinaryFissionGames
Copy link
Contributor

@BinaryFissionGames BinaryFissionGames commented Apr 19, 2024

Description:

  • Allows the process to monitor a passed in ppid, which should be the parent process ID for the collector. When the parent process ID exits, the extension emits a fatal error event, which triggers a collector shutdown.

Link to tracking Issue: This is part of #32189 - It does not resolve this issue because the supervisor still needs changes to pass the its PID in.

Testing:
Added some unit tests.

I've manually tested it on my macbook with this PR: observIQ#4715
(running supervisor, kill -9 the supervisor, and take a look at the agent logs to see it shut down).

I've tried testing this out on Windows, but the supervisor doesn't get past bootstrapping (the Commander's Stop function does not work on windows), so I wasn't able to fully test it.

Documentation:
Added new parameter to README

@BinaryFissionGames BinaryFissionGames marked this pull request as ready for review April 19, 2024 19:03
@BinaryFissionGames BinaryFissionGames requested a review from a team as a code owner April 19, 2024 19:03
@evan-bradley evan-bradley added the Run Windows Enable running windows test on a PR label Apr 19, 2024
Copy link
Contributor

@evan-bradley evan-bradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot to publish my review. Mostly looks good to me, just a few comments. I'll take another look through it shortly.

extension/opampextension/monitor_ppid.go Outdated Show resolved Hide resolved
extension/opampextension/monitor_ppid_test.go Outdated Show resolved Hide resolved
@BinaryFissionGames BinaryFissionGames force-pushed the feat/opamp-extension-monitor-ppid branch 2 times, most recently from 287fc82 to f09e1f4 Compare May 3, 2024 19:10
@github-actions github-actions bot added the cmd/otelcontribcol otelcontribcol command label May 3, 2024
@BinaryFissionGames BinaryFissionGames force-pushed the feat/opamp-extension-monitor-ppid branch from 20978e6 to 51ec112 Compare May 3, 2024 21:15
@evan-bradley evan-bradley merged commit e8d997d into open-telemetry:main May 3, 2024
178 checks passed
@github-actions github-actions bot added this to the next release milestone May 3, 2024
rimitchell pushed a commit to rimitchell/opentelemetry-collector-contrib that referenced this pull request May 8, 2024
…been orphaned (open-telemetry#32564)

**Description:** <Describe what has changed.>
* Allows the process to monitor a passed in ppid, which should be the
parent process ID for the collector. When the parent process ID exits,
the extension emits a fatal error event, which triggers a collector
shutdown.

**Link to tracking Issue:** This is part of open-telemetry#32189 - It does not resolve
this issue because the supervisor still needs changes to pass the its
PID in.

**Testing:**
Added some unit tests.

I've manually tested it on my macbook with this PR:
observIQ#4715
(running supervisor, kill -9 the supervisor, and take a look at the
agent logs to see it shut down).

I've tried testing this out on Windows, but the supervisor doesn't get
past bootstrapping (the Commander's Stop function does not work on
windows), so I wasn't able to fully test it.

**Documentation:** 
Added new parameter to README
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this pull request Jul 11, 2024
…been orphaned (open-telemetry#32564)

**Description:** <Describe what has changed.>
* Allows the process to monitor a passed in ppid, which should be the
parent process ID for the collector. When the parent process ID exits,
the extension emits a fatal error event, which triggers a collector
shutdown.

**Link to tracking Issue:** This is part of open-telemetry#32189 - It does not resolve
this issue because the supervisor still needs changes to pass the its
PID in.

**Testing:**
Added some unit tests.

I've manually tested it on my macbook with this PR:
observIQ#4715
(running supervisor, kill -9 the supervisor, and take a look at the
agent logs to see it shut down).

I've tried testing this out on Windows, but the supervisor doesn't get
past bootstrapping (the Commander's Stop function does not work on
windows), so I wasn't able to fully test it.

**Documentation:** 
Added new parameter to README
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cmd/otelcontribcol otelcontribcol command extension/opamp Run Windows Enable running windows test on a PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants