Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix for 'rrdtool plugin: illegal attempt to update ... (minimum one second step)' #4060

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

sorcky
Copy link

@sorcky sorcky commented Nov 6, 2022

ChangeLog: collectd: 'rrdtool plugin: illegal attempt to update ... (minimum one second step)'

When rrdtool is used on collectd server for data storage it is crucial for clients to send metrics with strictly increasing timestamps. Equal adjacent timestamps cause error similar to this:
rrdtool plugin: rrd_update_r (/opt/collectd/rrd/testhost1/GenericJMX-jetty/gauge-idleThreads.rrd) failed: /opt/collectd/rrd/testhost1/GenericJMX-jetty/gauge-idleThreads.rrd: illegal attempt to update using time 1667670455 when last update time is 1667670455 (minimum one second step)

@sorcky sorcky requested a review from a team as a code owner November 6, 2022 14:31
* past too much. */
rf->rf_next_read = now;
/* Check, if `rf_next_read' is on the same second as 'now' or in the past */
if (CDTIME_T_TO_TIME_T(rf->rf_next_read) <= CDTIME_T_TO_TIME_T(now)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the addition of CDTIME_T_TO_TIME_T()?

Copy link
Author

@sorcky sorcky Nov 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CDTIME_T_TO_TIME_T() to round timestamp to nearest integer second.
E.g.:
now = 1667983304.2
rf->rf_next_read = 1667983304.4
It means, that current metric is collected at 1667983304.2 and next one will be collected at 1667983304.4
Technically, condition rf->rf_next_read < now is false but if we don't increment rf->rf_next_read, server gets metrics with these timestamps in sequence: 1667983304.2, 1667983304.4. rrdtool storage "resolution" is 1 second and the both timestamps are rounded to 1667983304. So rrdtool rejects whole batch of data, causing gap on graph and the error.
To avoid this we need to be sure that all sent timestamps round to different seconds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collectd supports intervals smaller than one second. With this change, we'd cut that off.

/* `rf_next_read' is on the same second as `now' or earlier. it can cause
* 'rrdtool plugin: illegal attempt to update...(minimum one second step)'
* on server so move it to the next second. */
rf->rf_next_read = now + rf->rf_effective_interval;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Effective interval could be temporarily up to max interval, so use of rf_interval could be better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand rf->rf_effective_interval is calculated dynamically on each iteration of read loop according to callback status. It reflects current state of plugin more accurately. Thus I decided to use it.

@sorcky
Copy link
Author

sorcky commented Nov 29, 2022

Should I take any actions or make changes for approval?

@mrunge
Copy link
Member

mrunge commented Nov 30, 2022

Thank you for your PR!

I would prefer to round the timestamps down in the rrd write plugin[1]; other write targets do support timestamps in sub-seconds resolution.

[1] https://github.com/collectd/collectd/blob/main/src/rrdtool.c

@sorcky
Copy link
Author

sorcky commented Dec 2, 2022

How are you going to handle situation if rrd write plugin gets two timestamps within one second?
This pull request avoids this situation on client side.

@mrunge
Copy link
Member

mrunge commented Dec 12, 2022

I'd then use either the first or the last metrics in rrd_write and document that behavior. The current behavior sort of uses the behavior of using the first data and to ignore the following within the same second (or: complain about receiving more data than supported).

@sorcky
Copy link
Author

sorcky commented Dec 14, 2022

Current behavior of rrd_write is to ignore whole bunch of data (not only data with timestamps within the same second). In our clients we flush data to server once per 120 secs. So it leads to gaps of up to 120 secs of data.
It would be nice indeed to drop only data with timestamps that rounds to the same second.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants