
support quota set on overlay snapshot #5859

Open · wants to merge 1 commit into base: main

Conversation

@yylt (Contributor) commented Aug 11, 2021:

Support setting a quota on overlay snapshots,

as discussed in issue #3329

Fixes #3329

Enable quota setting as follows:

  1. Make sure the root directory of containerd is mounted with 'pquota'
  2. Set 'overlay' as the default snapshotter and enable quota in the 'overlay' config
  3. Set the default quota size in CRI

The config.toml looks like this:

  [plugins.'io.containerd.cri.v1.runtime']
    enable_selinux = false
    default_snapshot_quota_size = '2M'   #+
...
  [plugins.'io.containerd.snapshotter.v1.overlayfs']
    root_path = ''
    upperdir_label = false
    sync_remove = false
    slow_chown = false
    mount_options = []
    enable_quota = true  #+

Check the containerd root mount info:

# /etc/fstab
...
/dev/mapper/containerd_root  /var/lib/containerd  xfs     defaults,pquota        0 0

# mount ...
/dev/mapper/containerd_root on /var/lib/containerd type xfs (rw,relatime,prjquota)

Check that the quota has been set for the container:

xfs_quota -x -c "report -h" /var/lib/containerd/

Signed-off-by: Yang Yang [email protected]

@k8s-ci-robot commented:

Hi @yylt. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@theopenlab-ci bot commented Aug 11, 2021:

Build succeeded.

See the License for the specific language governing permissions and
limitations under the License.
*/

Review comment (Member):

This file seems copied from Moby. Please clarify the origin of the code and the license in code comments.

#define Q_XGETPQUOTA QCMD(Q_XGETQUOTA, PRJQUOTA)
#endif
*/
import "C"
Review comment (Member):

Can we avoid cgo?

Reply (Contributor Author):

> Can we avoid cgo?

Done!

@AkihiroSuda AkihiroSuda added the status/needs-discussion Needs discussion and decision from maintainers label Aug 11, 2021
@AkihiroSuda AkihiroSuda added this to the 1.6 milestone Aug 11, 2021
@dmcgowan dmcgowan added this to New in Code Review via automation Aug 11, 2021
@dmcgowan dmcgowan moved this from New to Needs Discussion in Code Review Aug 11, 2021
@dmcgowan (Member) commented:

Quota handling is a topic we should discuss at a high level first and possibly a good topic for a future community meeting. However, getting library support for setting project quotas across different backing FS makes sense to do now.

As an end feature, we need to discuss how it will be integrated with Kubernetes and how we can use it from our own tooling and go libraries. It seems backwards to have the cri layer here enable the quota then the snapshot set a static quota. I would think the quota would be set by the client adding a quota and the snapshotter enabling it via configuration and erroring out if the backing FS couldn't support it when enabled. The difficult part is figuring out from the client perspective whether quota is enabled so it knows whether it can avoid expensive ephemeral storage accounting (such as in the Kubelet).
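
For illustration only, here is a minimal sketch of the client-driven direction described above: the client attaches the requested quota to the snapshot as a label, and a quota-aware snapshotter reads it when preparing the snapshot and errors out if the backing filesystem cannot enforce it. The package name, label key, and WithQuotaSize helper are assumptions, not code from this PR.

package quotaexample

import "github.com/containerd/containerd/snapshots"

// WithQuotaSize is a hypothetical snapshots.Opt that records the requested
// quota as a label; a quota-aware snapshotter would read it at Prepare time
// and fail if the backing filesystem cannot enforce it.
func WithQuotaSize(size string) snapshots.Opt {
	return func(info *snapshots.Info) error {
		if info.Labels == nil {
			info.Labels = map[string]string{}
		}
		// "containerd.io/snapshot/quota-size" is an assumed label key.
		info.Labels["containerd.io/snapshot/quota-size"] = size
		return nil
	}
}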

@@ -266,6 +266,9 @@ type PluginConfig struct {
// of being placed under the hardcoded directory /var/run/netns. Changing this setting requires
// that all containers are deleted.
NetNSMountsUnderStateDir bool `toml:"netns_mounts_under_state_dir" json:"netnsMountsUnderStateDir"`
// SupportSetQuota indicates whether to set a quota on the container's read-write snapshot.
// If the snapshotter does not support setting quotas, nothing is done.
SupportSetQuota bool `toml:"support_set_quota" json:"supportSetQuota"`
@lining2020x commented Nov 11, 2021:

Maybe EnableRootfsQuota or EnableSetQuota sounds better.

@dmcgowan dmcgowan modified the milestones: 1.6, 1.7 Dec 9, 2021
@pacoxu (Contributor) commented Jun 9, 2022:

> The difficult part is figuring out from the client perspective whether quota is enabled so it knows whether it can avoid expensive ephemeral storage accounting (such as in the Kubelet).

In Kubernetes, there is a feature gate and a check method SupportsQuotas; see the KEP "Quotas for Ephemeral Storage" (kubernetes/enhancements#1029):
https://github.com/kubernetes/kubernetes/blob/226323178e23b4c476001266beab1e2f116b3879/pkg/volume/util/fsquota/common/quota_linux_common_impl.go#L180-L190

// SupportsQuotas determines whether the filesystem supports quotas.
func SupportsQuotas(mountpoint string, qType QuotaType) (bool, error) {
	data, err := runXFSQuotaCommand(mountpoint, "state -p")
	if err != nil {
		return false, err
	}
	if qType == FSQuotaEnforcing {
		return strings.Contains(data, "Enforcement: ON"), nil
	}
	return strings.Contains(data, "Accounting: ON"), nil
}

@yylt (Contributor Author) commented Jun 16, 2022:

Sorry for the long delay.

> As an end feature, we need to discuss how it will be integrated with Kubernetes and how we can use it from our own tooling and go libraries. It seems backwards to have the cri layer here enable the quota then the snapshot set a static quota. I would think the quota would be set by the client adding a quota and the snapshotter enabling it via configuration and erroring out if the backing FS couldn't support it when enabled. The difficult part is figuring out from the client perspective whether quota is enabled so it knows whether it can avoid expensive ephemeral storage accounting (such as in the Kubelet).

That is a good suggestion. The quota setter now focuses on overlay and is implemented in pure Go.

I also updated the snapshotter a little and added a new option, WithQuotaSize. If the option is not defined, the default size from the overlayfs configuration is used.

such as:

  [plugins."io.containerd.snapshotter.v1.overlayfs"]
    root_path = ""
    quota_size = "128MB"
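
As a usage sketch only (the exact API in this PR may differ), a client could pass such an option when preparing the rootfs snapshot; when it is omitted, the snapshotter would fall back to the quota_size value configured above. This continues the hypothetical WithQuotaSize sketch from earlier.

import (
	"context"

	"github.com/containerd/containerd/mount"
	"github.com/containerd/containerd/snapshots"
)

// prepareWithQuota is a hypothetical helper showing how the WithQuotaSize
// option sketched earlier might be used. Omitting the option leaves the
// snapshotter to apply its configured default quota_size.
func prepareWithQuota(ctx context.Context, sn snapshots.Snapshotter, key, parent string) ([]mount.Mount, error) {
	return sn.Prepare(ctx, key, parent, WithQuotaSize("128MB"))
}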

@pacoxu (Contributor) left a comment:

I built containerd to do some testing. Too many quotas are created in my environment.

[root@daocloud ~]# xfs_quota -xc "report -a -p -h"  | head -n 100
Project quota on /var/lib/kubelet (/dev/sdc)
                        Blocks
Project ID   Used   Soft   Hard Warn/Grace
---------- ---------------------------------
#0           820K      0      0  00 [------]
volume1048590      0     8E     8E  00 [------]
volume1048605      0     8E     8E  00 [------]
volume1048607      0     8E     8E  00 [------]
volume1048610      0     8E     8E  00 [------]

Project quota on /var/lib/containerd (/dev/sdb)
                        Blocks
Project ID   Used   Soft   Hard Warn/Grace
---------- ---------------------------------
#0          10.2G      0      0  00 [------]
#2              0      0      0  00 [------]
#3              0    15G    15G  00 [------]
#4            12K    15G    15G  00 [------]
#5              0    15G    15G  00 [------]
#6              0    15G    15G  00 [------]
#7              0    15G    15G  00 [------]
#8              0    15G    15G  00 [------]
#9              0    15G    15G  00 [------]
#10             0    15G    15G  00 [------]
#11             0    15G    15G  00 [------]
#12             0    15G    15G  00 [------]
#13             0    15G    15G  00 [------]
#14             0    15G    15G  00 [------]
#15             0    15G    15G  00 [------]
#16             0    15G    15G  00 [------]
#17             0    15G    15G  00 [------]
#18             0    15G    15G  00 [------]
#19             0    15G    15G  00 [------]
#20             0    15G    15G  00 [------]
#21             0    15G    15G  00 [------]
#22             0    15G    15G  00 [------]
#23             0    15G    15G  00 [------]
#24            4K    15G    15G  00 [------]
#25             0    15G    15G  00 [------]
#26             0    15G    15G  00 [------]
#27             0    15G    15G  00 [------]
#28             0    15G    15G  00 [------]
#29            4K    15G    15G  00 [------]
#30             0    15G    15G  00 [------]
#31             0    15G    15G  00 [------]
#32             0    15G    15G  00 [------]
#33             0    15G    15G  00 [------]
#34             0    15G    15G  00 [------]
#35            4K    15G    15G  00 [------]
#36            4K    15G    15G  00 [------]
#37             0    15G    15G  00 [------]
#38          540K    15G    15G  00 [------]
#39             0    15G    15G  00 [------]
#40             0    15G    15G  00 [------]
#41             0    15G    15G  00 [------]
#42             0    15G    15G  00 [------]
#43             0    10G    10G  00 [------]
#44             0    10G    10G  00 [------]
#45             0    10G    10G  00 [------]
#46             0    10G    10G  00 [------]
#47             0    10G    10G  00 [------]
#48             0    10G    10G  00 [------]
#49             0    10G    10G  00 [------]
#50             0    10G    10G  00 [------]
#51             0    10G    10G  00 [------]
#52             0    10G    10G  00 [------]
#53             0    10G    10G  00 [------]
#54             0    10G    10G  00 [------]
#55             0    10G    10G  00 [------]
#56             0    10G    10G  00 [------]
#57             0    10G    10G  00 [------]
#58             0    10G    10G  00 [------]
#59             0    10G    10G  00 [------]
#60             0    10G    10G  00 [------]
#61             0    10G    10G  00 [------]
#62             0    10G    10G  00 [------]
#63             0    10G    10G  00 [------]

I set the size to 10G at first and 15G later.

[root@daocloud ~]# xfs_quota -xc "report -a -p -h"  | wc -l
1956507

@@ -57,6 +62,15 @@ func init() {
oOpts = append(oOpts, overlay.WithUpperdirLabel)
}

ic.Meta.Exports["root"] = root
Review comment (Contributor):

dup with line 74.

Reply (Contributor Author):

done

Comment on lines 121 to 125
fsMagic, err := overlayutils.GetFSMagic(root)
if err != nil {
return nil, err
}
if config.quotaSize > 0 {
Review comment (Contributor):

Suggested change (only call GetFSMagic when a quota size is configured):

Original:
	fsMagic, err := overlayutils.GetFSMagic(root)
	if err != nil {
		return nil, err
	}
	if config.quotaSize > 0 {

Suggested:
	if config.quotaSize > 0 {
		fsMagic, err := overlayutils.GetFSMagic(root)
		if err != nil {
			return nil, err
		}

} else {
size = config.quotaSize
}
return quotaCtl.SetAllQuota(size, targets...)
Review comment (Contributor):

Setting the quota may need some logging, probably at debug level.
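
For reference, a minimal sketch of the kind of debug log being requested, assuming containerd's log package and the size/targets values from the hunk above (names and import path may differ by version):

import (
	"context"

	"github.com/containerd/log"
)

// logQuota records what is about to be applied so surprising or failed
// limits are easy to trace at debug level.
func logQuota(ctx context.Context, size uint64, targets []string) {
	log.G(ctx).WithFields(log.Fields{
		"size":    size,
		"targets": targets,
	}).Debug("setting project quota on snapshot")
}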

Reply (Contributor Author):

Done

@yylt (Contributor Author) commented Jun 27, 2022:

> I built containerd to do some testing. Too many quotas are created in my environment. […] I set the size to 10G at first and 15G later.

Yes, that is because the quota was also set on committed layers, which is not the expected behavior.
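
One way to avoid that, sketched here only as an assumed approach rather than the PR's final code, is to apply the quota to active (read-write) snapshots only and skip committed layers and views:

import "github.com/containerd/containerd/snapshots"

// quotaSetter is an assumed interface over the PR's quota controller.
type quotaSetter interface {
	SetQuota(upperdir string, size uint64) error
}

// maybeSetQuota is a hypothetical helper: committed layers and views are
// read-only, so only active snapshots get a project quota on their upperdir.
func maybeSetQuota(q quotaSetter, kind snapshots.Kind, size uint64, upperdir string) error {
	if kind != snapshots.KindActive || size == 0 {
		return nil
	}
	return q.SetQuota(upperdir, size)
}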

@pacoxu (Contributor) commented Jul 6, 2022:

I got some new errors with the latest code.

Jul 06 14:57:03 paco containerd[10220]: time="2022-07-06T14:57:03.299383030+08:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapshotter.v1
Jul 06 14:57:03 paco containerd[10220]: time="2022-07-06T14:57:03.299968488+08:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.overlayfs" error="Failed to set quota limit for projid 1 on /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/backingFsBlockDev: function not implemented"
Jul 06 14:57:03 paco containerd[10220]: time="2022-07-06T14:57:03.300169800+08:00" level=warning msg="could not use snapshotter overlayfs in metadata plugin" error="Failed to set quota limit for projid 1 on /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/backingFsBlockDev: function not implemented"

And the quota is not set as expected.

@yylt yylt force-pushed the quota branch 2 times, most recently from f72d83c to ed16c4b Compare July 29, 2022 07:42
@yylt (Contributor Author) commented Jul 29, 2022:

I tried the command GOGC=off containerd -c myconfig.toml and it always succeeded with the code below, but without GOGC=off it does not work.

	_, _, errno := syscall.Syscall6(syscall.SYS_QUOTACTL, uintptr(qcmd(Q_XSETQLIM, XFS_PROJ_QUOTA)),
		(*[2]uintptr)(unsafe.Pointer(&backingFsBlockDev))[0], uintptr(projectID),
		uintptr(unsafe.Pointer(&fd)), 0, 0)

I used bpftrace to debug, and the retval of the syscall _sys_quotactl could be 0, -2, or -20.

bpftrace -e 'kretprobe:__x64_sys_quotactl {printf("%s, retval %d\n", kstack, retval)}'

Finally, using the following to convert the string to a uintptr works fine.

	devbyte := append([]byte(backingFsBlockDev), 0)

	_, _, errno := syscall.Syscall6(syscall.SYS_QUOTACTL, uintptr(qcmd(Q_XSETQLIM, XFS_PROJ_QUOTA)),
		uintptr(unsafe.Pointer(&devbyte[0])), uintptr(fd.id),
		uintptr(unsafe.Pointer(&fd)), 0, 0)
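
An alternative sketch, using golang.org/x/sys/unix to build the NUL-terminated device path instead of the manual byte-slice conversion (qcmd, Q_XSETQLIM, XFS_PROJ_QUOTA, projectID, fd, and backingFsBlockDev are taken from the surrounding code):

	// BytePtrFromString returns a stable *byte pointing at a NUL-terminated copy.
	devPtr, err := unix.BytePtrFromString(backingFsBlockDev)
	if err != nil {
		return err
	}
	_, _, errno := unix.Syscall6(unix.SYS_QUOTACTL, uintptr(qcmd(Q_XSETQLIM, XFS_PROJ_QUOTA)),
		uintptr(unsafe.Pointer(devPtr)), uintptr(projectID),
		uintptr(unsafe.Pointer(&fd)), 0, 0)
	if errno != 0 {
		return errno
	}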

@dmcgowan dmcgowan marked this pull request as draft November 30, 2022 01:53
@yylt yylt force-pushed the quota branch 4 times, most recently from fcf0f99 to bdf63b4 Compare April 3, 2024 10:00
@yylt (Contributor Author) commented Apr 3, 2024:

/retest

@k8s-ci-robot commented:

@yylt: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@yylt (Contributor Author) commented Apr 8, 2024:

Is it necessary to use the new version of the package in this way to use the fs magic? This looks quite ugly.

require github.com/containerd/continuity v0.4.4-0.20240406122502-56a67e3ba427

@pacoxu (Contributor) commented Apr 11, 2024:

> Is it necessary to use the new version of the package in this way to use the fs magic? This looks quite ugly.
>
> require github.com/containerd/continuity v0.4.4-0.20240406122502-56a67e3ba427

Can this be done in a separate PR?

@yylt (Contributor Author) commented Apr 11, 2024:

> Can this be done in a separate PR?

It doesn't seem appropriate, because overlay's check of whether the filesystem supports setting quotas requires the fs magic.
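
For context, the kind of filesystem check under discussion can also be written directly against statfs(2); this is only a sketch with an assumed, incomplete set of magic numbers, and a matching pquota/prjquota mount option is still required for enforcement:

import "golang.org/x/sys/unix"

// supportsProjectQuota reports whether the filesystem under root is one the
// snapshotter could plausibly set project quotas on. XFS_SUPER_MAGIC and
// EXT4_SUPER_MAGIC come from golang.org/x/sys/unix.
func supportsProjectQuota(root string) (bool, error) {
	var st unix.Statfs_t
	if err := unix.Statfs(root, &st); err != nil {
		return false, err
	}
	switch st.Type {
	case unix.XFS_SUPER_MAGIC, unix.EXT4_SUPER_MAGIC:
		return true, nil
	}
	return false, nil
}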

Signed-off-by: Yang Yang <[email protected]>
@k8s-ci-robot commented:

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Labels: needs-rebase, size/XXL, status/needs-discussion (Needs discussion and decision from maintainers)
Projects: Code Review · Needs Discussion (Status: Work In Progress)
Development

Successfully merging this pull request may close these issues.

Feature request: overlayfs quota limitation
6 participants