Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pkg/stanza] log when files are rotated/moved/truncated #33237

Merged
merged 4 commits into from
May 31, 2024

Conversation

ChrsMark
Copy link
Member

@ChrsMark ChrsMark commented May 27, 2024

Description:

This PR adds the logging part from #31256.
With this addition every time that is identified that file is rotated
either by move/create or copy/truncate, proper logging takes place.

Link to tracking Issue: #31256

Testing:
Updated existing unit tests

How to test this manually

Using the following config:

receivers:
  filelog:
    start_at: beginning
    poll_interval: 5s
    include:
    - /var/log/busybox/monitoring/stable*.log

exporters:
  debug:
    verbosity: detailed

service:
  telemetry:
    logs:
      level: info
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
      processors: []

Testing truncate (out of pattern)

echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log &&  
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log && 
sleep 6 && 
cp /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc.log.1 && 
: > /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line new0" >> /var/log/busybox/monitoring/stable_trunc.log

Testing truncate (in pattern)

echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
cp /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc_1.log &&
: > /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new1" >> /var/log/busybox/monitoring/stable_trunc.log

Testing move/create (out of pattern)

echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
mv /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc.log.1 &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new0" >> /var/log/busybox/monitoring/stable_trunc.log

Testing move/create (in pattern)

echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
mv /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc_1.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new0" >> /var/log/busybox/monitoring/stable_trunc.log

Documentation:
Add some extra notes in the design.md

@github-actions github-actions bot requested a review from djaglowski May 27, 2024 13:13
@ChrsMark ChrsMark force-pushed the add_filelog_logging branch 2 times, most recently from 08f16a6 to b7343dd Compare May 29, 2024 14:42
@ChrsMark ChrsMark changed the title [wip] [pkg/stanza] log when files are moved/truncated [wip] [pkg/stanza] log when files are rotated/moved/truncated May 29, 2024
@ChrsMark ChrsMark changed the title [wip] [pkg/stanza] log when files are rotated/moved/truncated [pkg/stanza] log when files are rotated/moved/truncated May 29, 2024
Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add actual test cases to make sure we don't regress? We could use zaptest/observer to validated the expected logs.

// the Validate method to ensure that the file has not been truncated.
if !oldReader.Validate() {
m.set.Logger.Debug("File has been rotated(truncated)", zap.String("path", newReader.GetFileName()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it doesn't technically matter but do you think it would be slightly more readable to use oldReader here, simply because we are communicating to the reader about the old file that is no longer here?

Comment on lines 213 to 220
A) When a file it rotated out of pattern via move/create, we detect that
our old handle is still valid and we attempt to read from it.
B) When a file it rotated out of pattern via copy/truncate, we detect that
our old handle is invalid and we do not attempt to read from it.
C) When a file is copied within the pattern with unread logs on the end, then the original is truncated,
we get the unread logs on the copy as well as any new logs written to the truncated file
D) When a file is moved within the pattern with unread logs on the end, then the original is created again,
we get the unread logs on the moved as well as any new logs written to the newly created file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good explainations. Can I suggest we reverse the order so that we start with the simplest case (move/create within pattern) and progress to the most complex (copy/truncate out of pattern)?

Comment on lines 234 to 238
In case of the file has been rotated with copy/truncate, the old handle will be pointing
to the original file which is truncated. So we don't have a handle so as to consume any remaining
logs from the moved file. This can cause data loss.
In case of the file has been rotated with move/create, the old handle will be pointing
to the moved file so we can still consume from it even if it's out of the pattern.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well, I think move/create is simpler to understand since copy/truncate comes with a possible data loss scenario.

@ChrsMark ChrsMark force-pushed the add_filelog_logging branch 2 times, most recently from a698bda to 5bd7b4d Compare May 31, 2024 13:40
@ChrsMark ChrsMark marked this pull request as ready for review May 31, 2024 14:16
@ChrsMark ChrsMark requested a review from a team as a code owner May 31, 2024 14:16
@djaglowski djaglowski merged commit 0978757 into open-telemetry:main May 31, 2024
164 checks passed
@github-actions github-actions bot added this to the next release milestone May 31, 2024
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this pull request Jul 11, 2024
…ry#33237)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
This PR adds the logging part from
open-telemetry#31256.
With this addition every time that is identified that file is rotated
either by move/create or copy/truncate, proper logging takes place. 

**Link to tracking Issue:** <Issue number if applicable>
open-telemetry#31256

**Testing:** <Describe what testing was performed and which tests were
added.>
Updated existing unit tests


### How to test this manually

Using the following config:

```yaml
receivers:
  filelog:
    start_at: beginning
    poll_interval: 5s
    include:
    - /var/log/busybox/monitoring/stable*.log

exporters:
  debug:
    verbosity: detailed

service:
  telemetry:
    logs:
      level: info
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
      processors: []
```

#### Testing truncate (out of pattern)

```console
echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log &&  
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log && 
sleep 6 && 
cp /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc.log.1 && 
: > /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line new0" >> /var/log/busybox/monitoring/stable_trunc.log
```

#### Testing truncate (in pattern)

```console
echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
cp /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc_1.log &&
: > /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new1" >> /var/log/busybox/monitoring/stable_trunc.log
```

#### Testing move/create  (out of pattern)

```console
echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
mv /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc.log.1 &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new0" >> /var/log/busybox/monitoring/stable_trunc.log
```

#### Testing move/create (in pattern)

```console
echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log && 
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
mv /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc_1.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new0" >> /var/log/busybox/monitoring/stable_trunc.log
```

**Documentation:** <Describe the documentation added.>
Add some extra notes in the `design.md`

---------

Signed-off-by: ChrsMark <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants