Additional system attributes #31627

Open
bhupenbisht opened this issue Mar 6, 2024 · 32 comments

@bhupenbisht

bhupenbisht commented Mar 6, 2024

Component(s)

Describe the issue you're reporting

Looking for a system uptime metric. This metric would provide useful context on the machine that is generating telemetry and would be useful for infrastructure monitoring.

  1. Server uptime (since last boot)
@bhupenbisht bhupenbisht added the "needs triage" label Mar 6, 2024
@github-actions github-actions bot added the "processor/resourcedetection" label Mar 6, 2024
Contributor

github-actions bot commented Mar 6, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1 crobert-1 added the "enhancement" label Mar 6, 2024
@bhupenbisht
Author

Server uptime is a crucial metric from an infrastructure observability point of view. Could anyone look into this request? All other tools provide the same details.

Contributor

Pinging code owners for receiver/hostmetrics: @dmitryax @braydonk. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@andrzej-stencel andrzej-stencel removed the "processor/resourcedetection" label Mar 21, 2024
@andrzej-stencel
Member

andrzej-stencel commented Mar 21, 2024

This seems to fit into Host Metrics receiver more, so I have updated the label. It has been requested before (the issue was closed as inactive):

I continue to believe this is a reasonable addition. Would you be up for implementing it @bhupenbisht?

@bhupenbisht
Author

@astencel-sumo Sure, I would like to contribute. Please let me know how I can.

@andrzej-stencel
Member

@bhupenbisht You can prepare a pull request implementing the change. See the original issue for tips on how to implement this (a new system scraper in the Host Metrics receiver).

@kevinnoel-be
Contributor

We do have an internal receiver/scraper to gather uptime. If there is still interest, we could push it as part of the hostmetrics receiver.

@andrzej-stencel
Member

@kevinnoel-be this sounds great. Is this code open source - can you point to it to take a look?

@kevinnoel-be
Contributor

@andrzej-stencel It is in an internal/private OTel build, so you won't be able to see it. We created a separate receiver with scraper as we cannot extend the hostmetrics receiver and we didn't want to fork it only for this metric.

I could take some time to port this, but I'm wondering what an appropriate name for this metric would be, as I don't see much movement on open-telemetry/semantic-conventions#648.
Our internal definition for it is:

metrics:
  system.uptime:
    enabled: true
    description: The time since the system started
    unit: s
    sum:
      value_type: int
      monotonic: true
      aggregation_temporality: cumulative

@andrzej-stencel
Member

Understood, thanks for your response @kevinnoel-be.

I believe system.uptime is a good name and I think we can implement this without waiting on a semantic convention for this.

Also see #14130 for previous discussion and considerations regarding implementing it.

@andrzej-stencel
Member

@kevinnoel-be do you want to prepare a PR adding the system.uptime metric to the Host Metrics receiver? If yes, I'll assign this issue to you.

@kernelpanic77
Contributor

@andrzej-stencel I'd like to contribute as well. If it's okay with @kevinnoel-be, can I take a shot at this?

@kevinnoel-be
Contributor

kevinnoel-be commented May 15, 2024

@kernelpanic77 Sure.
If you cannot find the time, ping me back and I'll pick it up

@andrzej-stencel
Member

Sure, thanks for offering your help @kernelpanic77! Assigning this issue to you.

@mx-psi
Member

mx-psi commented May 16, 2024

I don't think we should add this metric to the hostmetrics receiver without adding it to semantic conventions. It's fine to add it under a feature flag as a PoC for the semantic conventions, but we must not risk deviating from semantic conventions here.

@kernelpanic77
Contributor

Hi @kevinnoel-be,

I understand that the repository is internal. Could you guide me on how you implemented the uptime metric, or is there a way I can take a look at the implementation in your fork?

@andrzej-stencel @mx-psi We can create a draft PR for this until the semantic convention is approved.

@kevinnoel-be
Contributor

kevinnoel-be commented May 20, 2024

@kernelpanic77 Created a new uptime scraper in the hostmetrics receiver using shirou/gopsutil host.UptimeWithContext() method behind the scenes

@mx-psi
Member

mx-psi commented May 20, 2024

we can create a draft PR for this until the semantic conventions is approved.

To be clear: to my knowledge, nobody is actively working on this on the semantic conventions side. I am happy to guide you through the process if you want to contribute it yourself to semantic conventions

@krantishetty

@kevinnoel-be Is there any update on the uptime metric? We are looking forward to it.

@kernelpanic77
Contributor

@krantishetty Give me some time, I'm working on a draft PR.

@kernelpanic77
Contributor

we can create a draft PR for this until the semantic conventions is approved.

To be clear: to my knowledge, nobody is actively working on this on the semantic conventions side. I am happy to guide you through the process if you want to contribute it yourself to semantic conventions

Sure, let's create a PR for semantic-conventions as well. Could you please guide me?
Sorry for my delayed response.

@kernelpanic77
Contributor

@kernelpanic77 Created a new uptime scraper in the hostmetrics receiver using shirou/gopsutil host.UptimeWithContext() method behind the scenes

This is helpful, @kevinnoel-be. Thanks.

@mx-psi
Member

mx-psi commented May 28, 2024

Sure, let's create a PR for semantic-conventions as well. Could you please guide me?
Sorry for my delayed response.

@kernelpanic77 No worries! Take a look at this recent PR that adds another metric to system metrics: open-telemetry/semantic-conventions/pull/1078. You would have to file a PR with roughly the same structure, noting that the Markdown files are autogenerated (see here how this works and how to set up your development environment).

@andrzej-stencel
Member

@krantishetty

This covers only process uptime; however, we are looking for server uptime, which can be captured from /proc/uptime. PR 2824 does not cover server uptime.

@kernelpanic77
Contributor

@kevinnoel-be I believe that the existing code for the processes scraper is already calling gopsutil/host, so I think we can use the existing scraper to scrape host.UptimeWithContext().

@kernelpanic77
Contributor

@krantishetty Are you referring to application uptime? Could you give an example of your use case for more clarity?

@bhupenbisht
Author

@kernelpanic77 I believe krantishetty is talking about OTel process uptime. Here we are looking for server uptime.

@krantishetty

@krantishetty Are you referring to application uptime? Could you give an example of your use case for more clarity?

I'm talking about server uptime, that is, the time since the last reboot of the server. For example, running the uptime command shows the server uptime since the last reboot.
This can be read from /proc/uptime on a Linux machine.

@bhupenbisht
Author

@kernelpanic77 Any luck with system uptime?

@krantishetty

Hi
Any luck on system uptime?

@Ruppsn

Ruppsn commented Oct 17, 2024

Hey,
this is an important metric. Any news?
