Exporter v1.4.0 up to v1.7.0 is constantly failing to retrieve data in time. #244
OK, I've increased the scrape interval and the scrape timeout, and that seems to solve the issue (more or less; I'm still waiting to see whether we get holes in the metrics). Default values: New values:
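For anyone tuning the same two settings, here is a rough sketch of a sanity check on a scrape interval and scrape timeout pair. Prometheus rejects a job whose timeout is greater than its interval, and the 5 minute ceiling is the figure quoted elsewhere in this thread. The helper names `parse_duration` and `check_scrape_settings` are made up for illustration, and the values in the usage line are placeholders, not the ones from this deployment.

```python
import re

# Seconds per unit for simple single-unit durations such as "30s" or "5m".
# Compound forms like "1m30s" are deliberately not handled in this sketch.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text):
    """Convert a single-unit duration like '90s' or '5m' to seconds."""
    match = re.fullmatch(r"(\d+)(s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def check_scrape_settings(scrape_interval, scrape_timeout):
    """Return a list of human-readable problems; empty means none found."""
    interval = parse_duration(scrape_interval)
    timeout = parse_duration(scrape_timeout)
    problems = []
    if timeout > interval:
        # Prometheus refuses to load a job configured like this.
        problems.append("scrape_timeout is greater than scrape_interval")
    if interval > 300:
        # Assumption: 5 minutes is the ceiling mentioned in this thread;
        # longer gaps between samples tend to show up as holes in graphs.
        problems.append("scrape_interval is longer than 5 minutes")
    return problems


# Placeholder values, only to show the call shape.
print(check_scrape_settings("60s", "120s"))
```

An empty list only means these two rules pass; it says nothing about whether the exporter actually answers within the timeout.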
Hi guys, the infamous bug is back! We increased the thresholds up to: However, even with those thresholds, metric retrieval is not working correctly: we now have periods where we can't graph the latest 5 minutes of data because the retrieval failed due to the following error: https://paste.opendev.org/show/byl7Eyuu5f4xfNwgc04q/ We're using the 1.7.0 release, and this retrieval-time issue has been present from 1.4.0 up to the latest release.
We are facing similar issues. It seems to me that the memory usage and the response time increase continuously… @gtherond Do you see high memory usage (>30GB) as well?
Nope, it barely reaches 6 GB of RAM, and even in debug mode it doesn't show anything special. What's really weird is that the HTTP error is not caught by any stack in between. Here are my latest debug logs, attached: If anything can help, let me know!
Are you using Heat intensively?
Hi! Not at all, in fact ^^ Another issue is that OpenStack answers slowly, whereas Prometheus advises keeping scrape and collection durations under 2 minutes as much as possible, and 5 minutes at most.
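To compare the exporter's real response time against those durations, one option is to time a single direct request to its /metrics endpoint. This is only a sketch: `time_scrape` is a hypothetical helper, the URL in the usage comment is a placeholder to replace with your own exporter address, and the `fetch` parameter exists only so the timing logic can be tried without a live exporter.

```python
import time
import urllib.request


def time_scrape(url, timeout_s, fetch=None):
    """Fetch url once; return (elapsed_seconds, body or None on failure)."""
    if fetch is None:
        def fetch(target, timeout):
            # Note: this timeout bounds each blocking socket operation,
            # not the total transfer time.
            with urllib.request.urlopen(target, timeout=timeout) as response:
                return response.read()

    started = time.monotonic()
    try:
        body = fetch(url, timeout_s)
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        body = None
    return time.monotonic() - started, body


# Example call (placeholder address, not taken from this thread):
#   elapsed, body = time_scrape("http://exporter.example:9180/metrics", 120)
#   print(f"{elapsed:.1f}s", "failed" if body is None else f"{len(body)} bytes")
```

If the elapsed time is regularly close to or above the configured scrape timeout, the gaps in the graphs are more likely a slow collection than a Prometheus problem.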
Hi everyone,
I'm using the exporter on a kolla-ansible based platform, but I've been facing a weird issue with it for a while.
I can query this exporter directly using cURL, but when I go through the Prometheus server, either from Grafana or by calling /metrics on the Prometheus server with cURL, I don't get any metrics.
What's really weird is that as soon as this message appears, the process restarts (not the container).
I'm using the latest 1.6.0, but I've had this behavior since 1.4.0.
I've seen a lot of projects with this issue, and there is even one reported here: #130
My Prometheus server has plenty of resources available, with a lot of disk IOPS, CPU, and RAM.
Here is the Prometheus configuration:
The Prometheus server is at version 1.8.2.
All our images are CentOS 8 Stream based.
All our hosts are CentOS 8 Stream based too.
Let me know if you need more information.