Q: request of new minor release with an updated syscall table to support newer kernels #406

Closed
shaba opened this issue May 5, 2023 · 22 comments

@shaba

shaba commented May 5, 2023

Can you please release a new minor version that just adds support for current kernels?

@pcmoore pcmoore changed the title Request of new minor release with actual kernels support Q: request of new minor release with an updated syscall table to support newer kernels May 5, 2023
@pcmoore pcmoore added this to the v2.5.5 milestone May 5, 2023
@pcmoore
Member

pcmoore commented May 5, 2023

Thoughts @drakenclimber?

It's been roughly a year since the last v2.5.x release and while I don't see any significant changes in the release-2.5 branch, a new release with an updated syscall table might be a good idea.

@drakenclimber
Member

> Thoughts @drakenclimber?
>
> It's been roughly a year since the last v2.5.x release and while I don't see any significant changes in the release-2.5 branch, a new release with an updated syscall table might be a good idea.

Yeah, I definitely support doing a new 2.5.x release. I have some obligations that will likely consume the next few weeks, but I should have time after that. Does June or July sound reasonable?

@pcmoore
Member

pcmoore commented May 5, 2023

Next week I'm going to be spending some quality time stuck on planes/airports, I might be able to put a release together next week, but I don't want to step on your toes :)

I am doing two PRs to update the syscall tables on main and release-2.5, and it looks like we don't need to update main (no syscall changes between v6.2 and v6.3).

@drakenclimber
Member

> Next week I'm going to be spending some quality time stuck on planes/airports, I might be able to put a release together next week, but I don't want to step on your toes :)
>
> I am doing two PRs to update the syscall tables on main and release-2.5, and it looks like we don't need to update main (no syscall changes between v6.2 and v6.3).

If you have the time and the desire to release v2.5.5, I'm totally cool with that. I know you've been pretty busy lately, so I didn't want to burden you with more work.

@drakenclimber
Member

I should have time to help review/test the v2.5.5 release if you want a second set of eyes.

@pcmoore
Member

pcmoore commented May 5, 2023

Actually, wait a minute ... looking at the syscall table changes between Linux v5.17 (what we shipped in the libseccomp v2.5.4 release) and Linux v6.3 I only see one change: memfd_secret() is defined for riscv64. Given the limitations of memfd_secret() I'm beginning to wonder if a new release is really worth it ... ?

@shaba what problems are you seeing with libseccomp v2.5.4 that you need a new release with updated kernel support?

@pcmoore
Member

pcmoore commented May 5, 2023

Related PR to update the release-2.5 branch with the Linux v6.3 syscall information:

@shaba
Author

shaba commented May 5, 2023

I didn't know there were no significant syscall changes. It's good to know a release is not needed. We just wanted an updated libseccomp for current kernels.

For LoongArch support I can wait for the next major release.

@drakenclimber
Member

This summer I'll work on paring down the open issues for 2.6.0. We have a lot of cool features there that I would love to get released.

@pcmoore
Member

pcmoore commented May 5, 2023

> This summer I'll work on paring down the open issues for 2.6.0. We have a lot of cool features there that I would love to get released.

Yeah, I've been trying to carve out one day a week to work on libseccomp lately ... although most weeks I've been failing miserably at that :/

@pcmoore
Member

pcmoore commented May 5, 2023

@shaba I'm going to go ahead and close this out as I think we've resolved your concern, but if I'm mistaken please feel free to re-open.

@pcmoore pcmoore closed this as completed May 5, 2023
@drakenclimber
Member

> Yeah, I've been trying to carve out one day a week to work on libseccomp lately ... although most weeks I've been failing miserably at that :/

Sounds all too familiar :)

@vt-alt

vt-alt commented Sep 16, 2023

There are already three new syscalls since then as of v6.6-rc1: cachestat (since v6.5), fchmodat2, and map_shadow_stack. Can you add them in a minor release?
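
One way to spot syscalls like these is to diff two name-to-number tables. The sketch below is illustrative only: the two dictionaries are hand-written excerpts rather than parsed kernel sources, the x86_64 numbers are assumptions that should be checked against the kernel's own syscall table, and `new_syscalls` is a hypothetical helper, not part of libseccomp.

```python
def new_syscalls(old_table, new_table):
    """Return (name, number) pairs present in new_table but not in old_table.

    The result is sorted by syscall number.
    """
    added = set(new_table) - set(old_table)
    return sorted(((name, new_table[name]) for name in added),
                  key=lambda item: item[1])


# Hand-written x86_64 excerpts (assumed numbers; verify against kernel sources).
TABLE_V6_3 = {"futex_waitv": 449, "set_mempolicy_home_node": 450}
TABLE_V6_6_RC1 = dict(TABLE_V6_3,
                      cachestat=451, fchmodat2=452, map_shadow_stack=453)

for name, number in new_syscalls(TABLE_V6_3, TABLE_V6_6_RC1):
    print(f"{number}\t{name}")
```

A real check would build both tables from the kernel's per-architecture syscall definitions, since numbers and availability differ between architectures.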

@pcmoore
Member

pcmoore commented Sep 18, 2023

We are working on a minor release, although there is no set date yet so please don't ask ;)

@vt-alt

vt-alt commented Oct 17, 2023

Because it's impossible to add a rule for a syscall that is present in the kernel but not in libseccomp (such as fchmodat2), it's also impossible to create workarounds for syscalls that libseccomp does not yet support (for example, just returning ENOSYS for all of them).

Please make libseccomp synchronized with the current kernel version? Or allow adding unsupported syscalls?

@pcmoore
Member

pcmoore commented Oct 17, 2023

> Please make libseccomp synchronized with the current kernel version?

#406 (comment)

> Or allow adding unsupported syscalls?

See seccomp_rule_add_exact(); it should allow arbitrary syscall numbers. We use it all the time in the bundled regression tests.

@vt-alt

vt-alt commented Oct 17, 2023

> See seccomp_rule_add_exact(); it should allow arbitrary syscall numbers. We use it all the time in the bundled regression tests.

Ah, thanks. We misinterpreted the -EFAULT returned when adding a rule (with seccomp_rule_add_exact()) for a non-native arch (SCMP_ARCH_X86 on SCMP_ARCH_X86_64). For the native arch it works well.

@keszybz

keszybz commented Nov 30, 2023

glibc started using fchmodat2 to implement fchmod with flags [1], so the lack of support for fchmodat2 in libseccomp is causing problems with programs sandboxed by systemd. In particular, tar now fails with the default SystemCallFilter="@system-service" sandbox [2]. We'd appreciate a quick release to support fchmodat2.

[1] bminor/glibc@65341f7
[2] systemd/systemd#30250
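
A likely reason such a failure becomes visible to programs like tar is that a libc wrapper usually falls back to an older code path only when the new syscall reports ENOSYS. The sketch below is a toy model of that pattern, not glibc's actual code: `chmod_with_flags` and `blocked_with` are hypothetical names, and EPERM is just an example of a non-ENOSYS errno a filter might return.

```python
import errno


def chmod_with_flags(new_call, old_call):
    """Try the newer syscall first; fall back only when it reports ENOSYS."""
    try:
        return new_call()
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            raise  # any other errno is treated as a real failure
        return old_call()


def blocked_with(err):
    """Build a stand-in for a syscall that a filter rejects with errno `err`."""
    def call():
        raise OSError(err, "blocked by filter")
    return call


# ENOSYS lets the caller fall back to the older implementation.
print(chmod_with_flags(blocked_with(errno.ENOSYS), lambda: "fallback used"))

# Any other errno propagates to the application.
try:
    chmod_with_flags(blocked_with(errno.EPERM), lambda: "fallback used")
except OSError as exc:
    print("failed:", errno.errorcode[exc.errno])
```

In this model the first call prints `fallback used` and the second prints `failed: EPERM`, which mirrors why the errno chosen by the sandbox matters.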

@pcmoore
Member

pcmoore commented Nov 30, 2023

@drakenclimber see above. I'll send you an email this morning.

keszybz added a commit to keszybz/systemd that referenced this issue Dec 1, 2023
glibc started using fchmodat2 to implement fchmod with flags [1], but
the current version of libseccomp does not support fchmodat2 [2]. This is
causing problems with programs sandboxed by systemd. libseccomp needs to know
a syscall to be able to set any kind of filter for it, so for syscalls unknown
to libseccomp we would always do the default action, i.e. either return the
errno set by SystemCallErrorNumber or send a fatal signal. For glibc to ignore
the unknown syscall and gracefully fall back to the older implementation,
we need to return ENOSYS. In particular, tar now fails with the default
SystemCallFilter="@system-service" sandbox [3].

This is of course a wider problem: any time the kernel gains new syscalls,
before libseccomp and systemd have caught up, we'd behave incorrectly. Let's
do the same as we already were doing in nspawn since
3573e03, and do the "default action" only
for syscalls which are known by us and libseccomp, and return ENOSYS for
anything else. This means that users can start using a sandbox with the new
syscalls only after libseccomp and systemd have been updated, but before that
happens they get behaviour that is backwards-compatible.

[1] bminor/glibc@65341f7
[2] seccomp/libseccomp#406
[3] systemd#30250

Fixes systemd#30250.
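
The policy described in this commit message can be sketched as a small decision function. This is a simplified model under stated assumptions, not systemd's implementation: `filter_action`, the `KNOWN` and `ALLOWED` sets, and the string action labels are all hypothetical, and real filters match on per-architecture syscall numbers rather than names.

```python
def filter_action(syscall, allowed, known, default_action):
    """Pick the action for one syscall.

    allowed: syscalls the sandbox permits.
    known:   syscalls the filter library has in its table.
    """
    if syscall in allowed:
        return "ALLOW"
    if syscall in known:
        return default_action   # known but not allowed: configured default
    return "ERRNO(ENOSYS)"      # unknown: behave as if the syscall did not exist


KNOWN = {"fchmodat", "openat", "reboot"}   # hypothetical, tiny syscall table
ALLOWED = {"fchmodat", "openat"}           # hypothetical allow-list

for name in ("openat", "reboot", "fchmodat2"):
    print(name, "->", filter_action(name, ALLOWED, KNOWN, "ERRNO(EPERM)"))
```

Here `fchmodat2` is absent from the table, so it maps to `ERRNO(ENOSYS)` and callers can fall back, while the known but disallowed `reboot` still gets the configured default.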
@keszybz

keszybz commented Dec 1, 2023

systemd/systemd#30291 makes systemd handle unknown (to itself or libseccomp) syscalls gracefully by returning ENOSYS. So the ask here is less urgent: things should work as before, but we need an updated libseccomp to allow users to specify fchmodat2 in filters and/or to use it from sandboxed services.

@drakenclimber
Member

Thanks, @keszybz. That looks like a good addition to systemd.

I'm going to start working on the 2.5.5 release right now. It's been long overdue.

@drakenclimber
Member

For those watching this issue - I have just released libseccomp v2.5.5. Thanks for all the help 👍

keszybz added a commit to keszybz/systemd that referenced this issue Dec 2, 2023
In seccomp_restrict_sxid() there's a chunk conditionalized with
'#if defined(__SNR_fchmodat2)'. We need to keep that because its caller,
seccomp_restrict_suid_sgid(), uses SCMP_ACT_ALLOW as the default action.