
K3s etcd snapshot reconcile consumes excessive memory when a large number of snapshots are present #10450

Closed
purha opened this issue Jul 4, 2024 · 9 comments

purha commented Jul 4, 2024

Environmental Info:
K3s Version: v1.28.10+k3s1 (a4c5612)
go version go1.21.9

This also affects some earlier versions.

Node(s) CPU architecture, OS, and Version: Linux kubernetes-worker-f-1 6.1.0-22-arm64 #1 SMP Debian 6.1.94-1 (2024-06-21) aarch64 GNU/Linux

The issue also exists at least on Ubuntu 22.04 arm64.

Cluster Configuration: 3 servers, each with all roles

Describe the bug:

When there are a lot of snapshots in S3, K3s will at some point consume all available memory, and then the OOM killer kicks in.
Normally old snapshots get cleaned up, but because of bug #10292 they don't.
This only affects a single node in the cluster.

Steps To Reproduce:

  • Install K3s with etcd snapshots to S3 (AWS) enabled. As snapshots pile up, memory runs out faster and faster.

Expected behavior:

Memory should not run out.

Actual behavior:

Memory runs out on the node, causing the OOM killer to start killing processes. Restarting the k3s service fixes the issue for a while, until memory runs out again.

Additional context / logs:

Memory consumption example graph attached.
[screenshot: memory consumption graph]


brandond commented Jul 5, 2024

> When there are a lot of snapshots in S3, K3s will at some point consume all available memory, and then the OOM killer kicks in.

Is there actually a cumulative memory leak, or is the memory required to manage the snapshots directly proportional to the number of snapshots found on disk and in S3?

If there is a cumulative memory leak, this should show up as increasing memory usage over time despite a static number of etcd snapshots.

@brandond brandond self-assigned this Jul 5, 2024
@brandond brandond added this to the v1.30.3+k3s1 milestone Jul 5, 2024

purha commented Jul 6, 2024

> > When there are a lot of snapshots in S3, K3s will at some point consume all available memory, and then the OOM killer kicks in.
>
> Is there actually a cumulative memory leak, or is the memory required to manage the snapshots directly proportional to the number of snapshots found on disk and in S3?
>
> If there is a cumulative memory leak, this should show up as increasing memory usage over time despite a static number of etcd snapshots.

It seems cumulative; the number of snapshots affects how long it takes for memory to run out. With S3 snapshots disabled, the issue is gone and memory usage is stable.


brandond commented Jul 8, 2024

What are the units on your graph? Can you show the actual memory utilization of the k3s process in bytes? How many snapshots did you have in the cluster when you saw the memory utilization growing?

I'm trying to reproduce this by profiling k3s with S3 enabled, retention set to 120, and snapshots taken once per minute, but I'm not quite sure that I'm seeing exactly the same thing as you.
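
For reference, a config.yaml along these lines matches that test setup; the bucket and region values below are placeholders rather than anything taken from this issue:

```yaml
# /etc/rancher/k3s/config.yaml
etcd-snapshot-schedule-cron: "* * * * *"  # one snapshot per minute
etcd-snapshot-retention: 120              # keep the last 120 snapshots
etcd-s3: true
etcd-s3-bucket: k3s-snapshots             # placeholder bucket name
etcd-s3-region: us-east-1                 # placeholder region
```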


brandond commented Jul 8, 2024

I am also curious if you've tried adding a memory limit to the k3s systemd unit. By default the k3s systemd unit does not have a memory limit on it, and without any external memory pressure, golang will not free memory back to the operating system. So you could just be seeing secondary effects of k3s requiring more memory to reconcile a large number of snapshots, and golang not freeing memory until it absolutely needs to.
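
As a rough sketch, a systemd drop-in like the one below adds such a limit; the unit name k3s.service is the default from the install script, and the limit values are examples only:

```ini
# /etc/systemd/system/k3s.service.d/memory-limit.conf
[Service]
# Soft limit: the kernel applies memory pressure above this, which also
# encourages the Go runtime to return freed memory to the OS sooner.
MemoryHigh=2G
# Hard limit: exceeding this triggers the OOM killer for the unit.
MemoryMax=3G
```

After creating the drop-in, run `systemctl daemon-reload` and restart k3s for it to take effect.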


brandond commented Jul 8, 2024

Just to share what I'm seeing: I do see k3s allocating a lot of memory while reconciling snapshots, but this memory is freed at the end of each snapshot save cycle. Note that the memory is allocated but no longer in use, which means that it is available to be freed or reused. This is NOT a leak, but I can try to see if there is some potential for enhancement here to avoid the momentary spike in memory during reconcile.

[screenshot: alloc_space profile]
[screenshot: inuse_space profile]
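
For anyone who wants to reproduce these two views, and assuming k3s was started with --enable-pprof (which exposes /debug/pprof on the supervisor port, 6443 by default; whether the endpoint needs credentials may vary by version), something like the following works:

```sh
# Grab a heap profile from the local supervisor endpoint
curl -sk https://127.0.0.1:6443/debug/pprof/heap -o heap.pprof

# Memory currently held by live objects (inuse_space)
go tool pprof -sample_index=inuse_space -top heap.pprof

# Cumulative allocations, including memory that has since been freed (alloc_space)
go tool pprof -sample_index=alloc_space -top heap.pprof
```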


brandond commented Jul 8, 2024

Just from glancing at this, I suspect that just adding some pagination to the various list operations would take the memory utilization down a lot. The current code pulls a full list into memory on every pass, which will be expensive with hundreds of snapshots.

The profiling also makes it clear that this is NOT a leak, and is not related to minio. So I am going to edit the issue title to better reflect the root of the problem.

[screenshot: heap profile]
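
As an illustration of the pagination/streaming idea (not the actual k3s code), the minio-go v7 client already lets a listing be consumed as a stream instead of being accumulated into one big slice; the endpoint, credentials, bucket, and prefix below are made up:

```go
package main

import (
	"context"
	"log"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	// Hypothetical endpoint and credentials, for illustration only.
	client, err := minio.New("s3.example.com", &minio.Options{
		Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
		Secure: true,
	})
	if err != nil {
		log.Fatal(err)
	}

	// ListObjects returns a channel, so each snapshot's metadata can be
	// handled one object at a time rather than held in a full in-memory list.
	opts := minio.ListObjectsOptions{Prefix: "etcd-snapshot-", Recursive: true}
	for obj := range client.ListObjects(context.Background(), "k3s-snapshots", opts) {
		if obj.Err != nil {
			log.Fatal(obj.Err)
		}
		log.Printf("reconciling %s (%d bytes)", obj.Key, obj.Size)
	}
}
```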

@brandond brandond changed the title Memory leak in s3 snapshots / minio client K3s etcd snapshot reconcile consumes excessive memory when a large number of snapshots are present Jul 8, 2024

purha commented Jul 9, 2024

I don't have the data anymore, but I think there were around 300 snapshots or more, from roughly 60 days, plus a few on-demand snapshots. I deleted all but the last 14 days, and you can see from the graph that this helped slightly. You can also see when I disabled the snapshots altogether. I've also attached the heap profile that I took; at that point k3s was already consuming gigabytes of memory. I didn't try setting memory limits for the service.

[attachment: heap profile (profile008)]

[screenshot: memory usage graph]


purha commented Jul 9, 2024

The usage is in percent, and that's a node with 8 GB of memory.

@aganesh-suse

Closing based on release-1.30 results here: #10559
