Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG][DISCUSSION] file.manage_file might cause race condition due to permission config #65651

Open
m-czernek opened this issue Nov 30, 2023 · 2 comments · May be fixed by #65657
Open

[BUG][DISCUSSION] file.manage_file might cause race condition due to permission config #65651

m-czernek opened this issue Nov 30, 2023 · 2 comments · May be fixed by #65657
Assignees
Labels
Bug broken, incorrect, or confusing behavior Chlorine v3007.0 Linux P3 Priority 3

Comments

@m-czernek
Copy link
Contributor

Description
The file.manage_file module can cause unexpected problems due to the way it handles permissions. I am happy to provide a PR, but wanted first discuss which approach would make the most sense to the maintainers.

The file.manage_file module currently works as such:

  1. Create a tmp file
  2. Copy tmp file to dst
  3. Set permissions on the dst

The tmp file always has the 600 permissions, together with the current user + group.

Race condition case

  • Salt is modifying a file in use, for example /etc/hosts, /etc/nsswitch.conf, or /etc/nscd.conf.
  • Salt finishes executing step 2. - new file is copied over the old file. However, the new file uses the 600 permissions.
  • The system attempts to use the file before step 3 finishes, and fails in its attempt.

Downstream issue:

Discussion

This is an understandable issue--we are modifying a file that is currently in use, and we cannot guarantee correct file permissions while the module is still executing. However, in some cases (like managing DNS in a running environment), it is impractical to stop the service that uses the file.

A few solutions that come to my mind:

Solution 1

Possibly the least intrusive way to solve the issue is to ensure that the temporary file uses the final permissions. The advantage is that when the tmp file is copied to destination, it's immediately ready to be used.

I can imagine corner cases, like changing the permissions such that salt itself cannot write into the temp file. We could mitigate this by having an additional property, e.g. atomic_permissions, that'd turn off the behavior by default.

Solution 2

We could alternatively modify the salt.utils.files.copyfile function to include permissions, and do a best-effort user/group/mode config within the function.

This would mitigate the issue by greatly reducing the window between file copying and file permission change. The issue would not be 100% solved though.

The disadvantage, in my mind, is that this would have to be a best-effort user/group/mode configuration because we probably wouldn't want to duplicate or pull in https://github.com/saltstack/salt/blob/master/salt/modules/file.py#L5069 into the utils module.

Do you have any preference for the approach to fix/mitigate the issue? Alternatively, is there another solution that I haven't considered?

@m-czernek m-czernek added Bug broken, incorrect, or confusing behavior needs-triage labels Nov 30, 2023
Copy link

welcome bot commented Nov 30, 2023

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey.
Please be sure to review our Code of Conduct. Also, check out some of our community resources including:

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar.
If you have additional questions, email us at [email protected]. We’re glad you’ve joined our community and look forward to doing awesome things with you!

@Akm0d Akm0d self-assigned this May 8, 2024
@Akm0d
Copy link
Contributor

Akm0d commented Jul 12, 2024

I've taken a close look at the issue with the file.manage_file module and the potential race condition due to how permissions are handled. Here are my thoughts on the solutions you presented and a detailed suggestion based on our discussion:

Thoughts on Proposed Solutions

  1. Ensuring Temporary File Uses Final Permissions:
    This is likely the least intrusive way to handle the issue. Setting the correct permissions on the temporary file before copying it to the destination minimizes the race condition window. However, there could be corner cases where changing permissions might prevent Salt from writing to the temporary file. This can be mitigated by ensuring permissions are set correctly beforehand.

  2. Modifying salt.utils.files.copyfile:
    I strongly believe that copying a file and keeping the destination's permissions is an absurd thing to do. We should keep the default copy behavior as it is, without adding any new properties to handle this case.

Proposed Solution
Given these considerations, I suggest we directly implement changes within the file.manage_file function. This approach ensures that the temporary file is created with the correct permissions and minimizes the race condition without modifying core utility functions.

Suggested Changes
Here’s a specific code snippet to illustrate the changes:

def manage_file(...):
    ...
    if contents is not None:
        # Write the static contents to a temporary file
        tmp = salt.utils.files.mkstemp(
            prefix=salt.utils.files.TEMPFILE_PREFIX, text=True
        )
        # TODO apply previous file permissions to the tempfile right here
    ...

This approach avoids modifying salt.utils.files.copyfile and adding new properties, which I believe is essential to maintaining the expected behavior. By creating the temporary file with the correct permissions before moving it, we reduce the window for any race condition and ensure the file is immediately usable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior Chlorine v3007.0 Linux P3 Priority 3
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants