
[scrapehelper] Make the initial scrape interval configurable #7635

Merged

Conversation

MovieStoreGuy
Contributor

Description:
This change allows scrapers to start gathering values when the scrape controller starts, instead of having to wait one full scrape controller interval before performing the initial scrape.

This is particularly useful for components such as the hostmetrics receiver, which generate utilization metrics that require an initial state to compute a value. With this change, those metrics will start being sent at Start + Duration instead of Start + 2*Duration, which for a scrape duration like 30s means the value would otherwise not be sent until a minute had elapsed.

Link to tracking Issue:
N/A
Mostly because I forgot to raise an issue before I did this.

Testing:

Updated the test cases where appropriate.

Documentation:

I don't believe any user / dev docs need to be updated beyond the changelog, since the only behavioural change is when the scrapers are called, which shouldn't impact any scraper behaviours.

@MovieStoreGuy MovieStoreGuy requested a review from a team as a code owner May 7, 2023 02:54

codecov bot commented May 7, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.35% 🎉

Comparison is base (deffd48) 90.74% compared to head (fec8609) 91.10%.

❗ Current head fec8609 differs from pull request most recent head 52ed230. Consider uploading reports for the commit 52ed230 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7635      +/-   ##
==========================================
+ Coverage   90.74%   91.10%   +0.35%     
==========================================
  Files         296      297       +1     
  Lines       14784    14662     -122     
==========================================
- Hits        13416    13358      -58     
+ Misses       1092     1043      -49     
+ Partials      276      261      -15     
Impacted Files Coverage Δ
receiver/scraperhelper/scrapercontroller.go 94.87% <100.00%> (+5.68%) ⬆️

... and 18 files with indirect coverage changes


@MovieStoreGuy
Contributor Author

One thing worth pointing out for this change is that having scrapers start on collector / component start will increase resource usage (CPU, memory, I/O, etc.) since the scraper's start will happen shortly after create, and the previously missed interval for hostmetrics will now be included in the metrics results.

@dmitryax
Member

dmitryax commented May 7, 2023

With this change, the utilization metrics will still be misaligned with others, right? Maybe we can better fix those instead? We can record the initial utilization values on Start.

I like the current state of reporting metrics on Start + Duration because it's easy to differentiate start issues from scraping issues by looking at the logs. And, as you mentioned, it puts more memory/cpu pressure on the start procedure, which already can be pretty busy.
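The suggestion to record the initial values on Start could look roughly like this sketch (`cpuScraper` and its `read` hook are hypothetical stand-ins, not collector code): the baseline counters are sampled in Start, so the very first Scrape already has a delta to compute utilization from.

```go
package main

import "fmt"

// cpuScraper is a hypothetical sketch of recording baseline counters at
// component start so the first Scrape can already compute a utilization value.
type cpuScraper struct {
	prevBusy, prevTotal float64
	read                func() (busy, total float64) // stand-in for a real counter read
}

// Start takes the initial sample instead of leaving the first interval empty.
func (s *cpuScraper) Start() {
	s.prevBusy, s.prevTotal = s.read()
}

// Scrape returns utilization (0..1) over the window since the previous sample.
func (s *cpuScraper) Scrape() float64 {
	busy, total := s.read()
	dBusy, dTotal := busy-s.prevBusy, total-s.prevTotal
	s.prevBusy, s.prevTotal = busy, total
	if dTotal == 0 {
		return 0
	}
	return dBusy / dTotal
}

func main() {
	// Fake counter values: busy/total CPU time at two successive reads.
	samples := [][2]float64{{10, 100}, {30, 140}}
	i := 0
	s := &cpuScraper{read: func() (float64, float64) {
		v := samples[i]
		if i+1 < len(samples) {
			i++
		}
		return v[0], v[1]
	}}
	s.Start()               // baseline: busy=10, total=100
	fmt.Println(s.Scrape()) // (30-10)/(140-100) = 0.5
}
```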

@MovieStoreGuy
Contributor Author

With this change, the utilization metrics will still be misaligned with others, right?

Yes, they will continue to be offset by one collection (specifically for the hostmetrics receiver; I am not sure if other receivers also experience this).

Maybe we can better fix those instead? We can record the initial utilization values on Start.

I like this idea, perhaps worth raising an issue in contrib for the host receiver (since I only noticed it there.)

I like the current state of reporting metrics on Start + Duration because it's easy to differentiate start issues from scraping issues by looking at the logs. And, as you mentioned, it puts more memory/cpu pressure on the start procedure, which already can be pretty busy.

My only hesitation about keeping it as Start + Duration is that it gives scrapers a minimum time before they start producing data, which is problematic for scrapers that can't fully validate their configuration until they are started.
This could lead to, for example, setting the collection interval to 1s and potentially flooding the database, instead of doing one scrape at 30s and failing fast on component start.

@bogdandrutu
Member

I like this, just brainstorming an idea: maybe we have a config for "initial scrape delay" and have that set to 1sec or something like that, then we do the interval from there?

@MovieStoreGuy
Contributor Author

I like this, just brainstorming an idea: maybe we have a config for "initial scrape delay" and have that set to 1sec or something like that, then we do the interval from there?

Yeah, that makes sense to me. We can keep the current behaviour (i.e. initial_delay = collection_interval) but expose it as a field.

@MovieStoreGuy MovieStoreGuy force-pushed the msg/enhacement-scrape-on-start branch from 0003918 to 5b958b2 Compare May 10, 2023 12:28
receiver/scraperhelper/scrapercontroller.go — 3 review threads (outdated, resolved)
@MovieStoreGuy MovieStoreGuy force-pushed the msg/enhacement-scrape-on-start branch from 5b958b2 to a92f0ac Compare May 17, 2023 11:13
@MovieStoreGuy
Contributor Author

I've updated the PR to take an opt-out approach, so users are required to explicitly set initial_delay: 0 within the configuration, instead of the opt-in approach, which convoluted the design a bit.

Member

@andrzej-stencel andrzej-stencel left a comment


I love this enhancement.

I wonder if/what changes we need to make in contrib to make sure this change does not surprise users. I noticed that the ScraperControllerSettings struct is often initialized like this:

ScraperControllerSettings: scraperhelper.ScraperControllerSettings{
    CollectionInterval: defaultCollectionInterval,
},

In those cases, the InitialDelay will be initialized to 0, right? Should we make sure it initializes with the scraper controller's default value of 1 second?

@MovieStoreGuy
Contributor Author

I love this enhancement.

I wonder if/what changes we need to make in contrib to make sure this change does not surprise users. I noticed that the ScraperControllerSettings struct is often initialized like this:

ScraperControllerSettings: scraperhelper.ScraperControllerSettings{
    CollectionInterval: defaultCollectionInterval,
},

In those cases, the InitialDelay will be initialized to 0, right? Should we make sure it initializes with the scraper controller's default value of 1 second?

Mmm, this is a good point.

Let me go find the receivers referencing the struct directly, and I will update them to use the function so this shouldn't come as a complete surprise.

@github-actions
Contributor

github-actions bot commented Jun 2, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jun 2, 2023
@MovieStoreGuy
Contributor Author

@bogdandrutu, @mx-psi, @astencel-sumo

This change should be safe to merge in now without any issues.

@dmitryax dmitryax removed the Stale label Jun 2, 2023
Member

@andrzej-stencel andrzej-stencel left a comment


This is a great enhancement!

We're going to need to update all the affected contrib components' README files with the new initial_delay option. I think it's going to be done in this PR:

@MovieStoreGuy
Contributor Author

This is a great enhancement!

We're going to need to update all the affected contrib components' README files with the new initial_delay option, I think it's going to be done in this PR:

Should be done :)

@dmitryax dmitryax merged commit ce65350 into open-telemetry:main Jun 2, 2023
31 of 32 checks passed
@github-actions github-actions bot added this to the next release milestone Jun 2, 2023
@dmitryax dmitryax changed the title [scrapehelper] Start scraping from component start [scrapehelper] Make the initial scrape interval configurable Jun 2, 2023
@MovieStoreGuy MovieStoreGuy deleted the msg/enhacement-scrape-on-start branch June 3, 2023 09:50