syslog files without a year and leap date Feb. 29 do not revise the guesstimated year #245

Open
jtmoon79 opened this issue Mar 17, 2024 · 1 comment
Labels
bug (Something isn't working), difficult (A difficult problem; a major coding effort or difficult algorithm to perfect), P2 (less important)

jtmoon79 commented Mar 17, 2024

Summary

A syslog file whose log message format has a datestamp without a year does not revise the guesstimated year when it encounters a log message with a datestamp on leap day, February 29.

To Reproduce

Use the committed log file ./logs/other/tests/dtf9c-23-12x2.log.gz, which has a filesystem Last Modified Time of September 2018. The GZIP encoded datetime is September 2, 2022 9:45:35 PM GMT-07:00 (derived from modified time 1662180335).
The GZIP encoded datetime is used to place the last log message, Dec 12 12:00:00 23; its datestamp Dec 12 12:00:00 is guesstimated to be in year 2022.
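
As a rough illustration of where that GZIP encoded datetime comes from: RFC 1952 stores MTIME as a little-endian 32-bit Unix timestamp at byte offset 4 of the gzip header. The sketch below uses only the Rust standard library; the function names `gzip_mtime` and `utc_date_from_epoch` are hypothetical and are not taken from the s4 code base.

```rust
/// Read the MTIME field from the start of a gzip member (RFC 1952).
/// Returns `None` if the slice is too short, lacks the gzip magic bytes,
/// or MTIME is 0 (which RFC 1952 defines as "no time stamp available").
/// Hypothetical helper, not part of s4.
pub fn gzip_mtime(header: &[u8]) -> Option<u32> {
    if header.len() < 8 || header[0] != 0x1f || header[1] != 0x8b {
        return None;
    }
    // Bytes 4..8 hold MTIME, least significant byte first.
    let mtime = u32::from_le_bytes([header[4], header[5], header[6], header[7]]);
    if mtime == 0 {
        None
    } else {
        Some(mtime)
    }
}

/// Convert seconds since the Unix epoch to a UTC (year, month, day),
/// using the common days-to-civil-date arithmetic.
/// Hypothetical helper, not part of s4.
pub fn utc_date_from_epoch(secs: u32) -> (i32, u32, u32) {
    // Shift the epoch so that day 0 is 0000-03-01.
    let z = (secs / 86_400) as i64 + 719_468;
    let era = z / 146_097; // 400-year cycles
    let doe = z - era * 146_097; // day of era
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month index
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    // January and February belong to the following civil year.
    let year = (yoe + era * 400 + if month <= 2 { 1 } else { 0 }) as i32;
    (year, month, day)
}
```

For the modified time 1662180335 this yields September 3, 2022 in UTC, which is the same instant as September 2, 2022 9:45:35 PM at GMT-07:00.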

During processing, log message Feb 29 02:00:00 1 is encountered. It is given the incorrect year 2021, which is not a leap year.

$ ls -l ./logs/other/tests/dtf9c-23-12x2.log.gz
-rw-r--r-- 1 user user 207 Sep  2  2018 ./logs/other/tests/dtf9c-23-12x2.log.gz

$ gunzip --stdout -k ./logs/other/tests/dtf9c-23-12x2.log.gz
Jan 1 01:00:00 0
Feb 29 02:00:00 1
Mar 3 03:00:00 2
Apr 4 04:00:00 3
May 5 05:00:00 4
Jun 6 06:00:00 5
Jul 7 07:00:00 6
Aug 8 08:00:00 7
Sep 9 09:00:00 8
Oct 10 10:00:00 9
Nov 11 11:00:00 10
Dec 12 12:00:00 11
Jan 1 01:00:00 12
Feb 28 02:00:00 13
Mar 3 03:00:00 14
Apr 4 04:00:00 15
May 5 05:00:00 16
Jun 6 06:00:00 17
Jul 7 07:00:00 18
Aug 8 08:00:00 19
Sep 9 09:00:00 20
Oct 10 10:00:00 21
Nov 11 11:00:00 22
Dec 12 12:00:00 23

The result

$ ./target/release/s4 ./logs/other/tests/dtf9c-23-12x2.log.gz -u
20210101T080000.000+0000:Jan 1 01:00:00 0
20210101T080000.000+0000:Feb 29 02:00:00 1
20210303T100000.000+0000:Mar 3 03:00:00 2
20210404T110000.000+0000:Apr 4 04:00:00 3
20210505T120000.000+0000:May 5 05:00:00 4
20210606T130000.000+0000:Jun 6 06:00:00 5
20210707T140000.000+0000:Jul 7 07:00:00 6
20210808T150000.000+0000:Aug 8 08:00:00 7
20210909T160000.000+0000:Sep 9 09:00:00 8
20211010T170000.000+0000:Oct 10 10:00:00 9
20211111T180000.000+0000:Nov 11 11:00:00 10
20211212T190000.000+0000:Dec 12 12:00:00 11
20220101T080000.000+0000:Jan 1 01:00:00 12
20220228T090000.000+0000:Feb 28 02:00:00 13
20220303T100000.000+0000:Mar 3 03:00:00 14
20220404T110000.000+0000:Apr 4 04:00:00 15
20220505T120000.000+0000:May 5 05:00:00 16
20220606T130000.000+0000:Jun 6 06:00:00 17
20220707T140000.000+0000:Jul 7 07:00:00 18
20220808T150000.000+0000:Aug 8 08:00:00 19
20220909T160000.000+0000:Sep 9 09:00:00 20
20221010T170000.000+0000:Oct 10 10:00:00 21
20221111T180000.000+0000:Nov 11 11:00:00 22
20221212T190000.000+0000:Dec 12 12:00:00 23

Expected

Since the GZIP Modified Time is Sept. 2022, the last message Dec 12 12:00:00 23 should be given year 2022 (which it currently is). However, upon encountering Feb 29 02:00:00 1, the processing gives it date 20210101T080000.000+0000 (Jan 1 2021).
Instead, the processing should notice that the attempt to create a datetime for that log message failed. It should then retry with a leap-day-valid year, e.g. 2000; if that succeeds, the date is confirmed to be Feb. 29. Once a Feb. 29 datestamp is confirmed, the processing should revise its year guesstimate and then update the previously processed log messages.
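
A minimal sketch of that retry, assuming the month and day have already been parsed from the datestamp. It uses only the standard library rather than whatever date library the project relies on, and the names `DateCheck`, `check_date`, and `KNOWN_LEAP_YEAR` are hypothetical.

```rust
/// Gregorian leap year rule.
pub fn is_leap_year(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// True if `year-month-day` is a real calendar date.
pub fn is_valid_date(year: i32, month: u32, day: u32) -> bool {
    let days_in_month = match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 => if is_leap_year(year) { 29 } else { 28 },
        _ => return false,
    };
    (1..=days_in_month).contains(&day)
}

/// Outcome of checking a year-less datestamp against a guessed year.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub enum DateCheck {
    /// The date exists in the guessed year.
    Valid,
    /// The date does not exist in the guessed year but does exist in a
    /// leap year, so it is Feb. 29 and the year guess needs revising.
    LeapDay,
    /// The date exists in no year (e.g. Feb. 30).
    Invalid,
}

/// Any leap year works as the retry year; 2000 is the one the text suggests.
const KNOWN_LEAP_YEAR: i32 = 2000;

/// Try the guessed year first, then retry with a known leap year.
pub fn check_date(guess_year: i32, month: u32, day: u32) -> DateCheck {
    if is_valid_date(guess_year, month, day) {
        DateCheck::Valid
    } else if is_valid_date(KNOWN_LEAP_YEAR, month, day) {
        DateCheck::LeapDay
    } else {
        DateCheck::Invalid
    }
}
```

Under these assumptions, `check_date(2021, 2, 29)` reports `DateCheck::LeapDay`, which would be the signal to revise the year guesstimate instead of reusing the previous message's datetime.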

In this case, during the backwards search for log messages in process_missing_year, it should allow matching the date Feb 29 no matter the guesstimated year.
The revised year guesstimate should result in the last message Dec 12 12:00:00 23 being given year 2017, and message Feb 29 02:00:00 1 being given a valid datetime in year 2016, i.e. 20160229T090000.000+0000.

$ ./target/release/s4 ./logs/other/tests/dtf9c-23-12x2.log.gz -u
20160101T080000.000+0000:Jan 1 01:00:00 0
20160229T090000.000+0000:Feb 29 02:00:00 1
20160303T100000.000+0000:Mar 3 03:00:00 2
20160404T110000.000+0000:Apr 4 04:00:00 3
20160505T120000.000+0000:May 5 05:00:00 4
20160606T130000.000+0000:Jun 6 06:00:00 5
20160707T140000.000+0000:Jul 7 07:00:00 6
20160808T150000.000+0000:Aug 8 08:00:00 7
20160909T160000.000+0000:Sep 9 09:00:00 8
20161010T170000.000+0000:Oct 10 10:00:00 9
20161111T180000.000+0000:Nov 11 11:00:00 10
20161212T190000.000+0000:Dec 12 12:00:00 11
20170101T080000.000+0000:Jan 1 01:00:00 12
20170228T090000.000+0000:Feb 28 02:00:00 13
20170303T100000.000+0000:Mar 3 03:00:00 14
20170404T110000.000+0000:Apr 4 04:00:00 15
20170505T120000.000+0000:May 5 05:00:00 16
20170606T130000.000+0000:Jun 6 06:00:00 17
20170707T140000.000+0000:Jul 7 07:00:00 18
20170808T150000.000+0000:Aug 8 08:00:00 19
20170909T160000.000+0000:Sep 9 09:00:00 20
20171010T170000.000+0000:Oct 10 10:00:00 21
20171111T180000.000+0000:Nov 11 11:00:00 22
20171212T190000.000+0000:Dec 12 12:00:00 23
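
One possible way to revise the guesstimate, sketched under stated assumptions: walk the messages backwards to find year rollovers, then move the whole file back one year at a time until every Feb 29 lands on a leap year. This is not the existing process_missing_year logic; `MonthDay`, `assign_years`, and the `last_year_guess` parameter are hypothetical, and how the starting guess is chosen is outside this sketch.

```rust
/// Month and day parsed from a year-less syslog datestamp.
/// Hypothetical type for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct MonthDay {
    pub month: u32,
    pub day: u32,
}

fn is_leap_year(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Assign a year to every message in file order, given an initial guess
/// for the year of the last message. Returns `None` if no nearby year
/// makes every Feb 29 valid.
pub fn assign_years(stamps: &[MonthDay], last_year_guess: i32) -> Option<Vec<i32>> {
    // Pass 1: walk backwards and record each message's year offset
    // relative to the last message (0, -1, -2, ...).
    let mut offsets = vec![0i32; stamps.len()];
    let mut offset = 0;
    for i in (0..stamps.len()).rev() {
        if i + 1 < stamps.len() {
            let cur = (stamps[i].month, stamps[i].day);
            let next = (stamps[i + 1].month, stamps[i + 1].day);
            // An earlier message with a later calendar date means a
            // year boundary was crossed.
            if cur > next {
                offset -= 1;
            }
        }
        offsets[i] = offset;
    }
    // Pass 2: leap years are at most 8 years apart, so 8 candidate
    // years for the last message are enough to find a fit if one exists.
    (0..8)
        .map(|shift| last_year_guess - shift)
        .find(|&base| {
            stamps.iter().zip(&offsets).all(|(s, &off)| {
                !(s.month == 2 && s.day == 29) || is_leap_year(base + off)
            })
        })
        .map(|base| offsets.iter().map(|&off| base + off).collect())
}
```

For the 24 messages in the test file and a starting guess of 2018 for the last message, this sketch assigns 2016 to the first twelve messages (including Feb 29) and 2017 to the last twelve, which matches the years in the expected output above.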

Additional context

Found while investigating #189.

jtmoon79 commented Mar 20, 2024

I need to confirm that the date Feb. 29 can be matched under some circumstances, i.e. that it is not ignored in all circumstances.
