Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

seccomp: Sync fields with runtime-spec fields #42604

Merged
merged 1 commit into from
Jul 15, 2021

Conversation

rata
Copy link
Contributor

@rata rata commented Jul 7, 2021

The runtime spec we are using has support for these 3 fields[1], but
moby doesn't have them in its seccomp struct. This patch just adds and
copies them when they are in the profile.

DefaultErrnoRet is implemented in the runc version moby is using (it is
implemented since runc-rc95[2]) but if we create a container without
this moby patch, we don't see an error nor the expected behavior. This
is not clear for the user (the profile they specify is valid, the syntax
is ok, but the wrong behavior is seen).

This is because the DefaultErrnoRet field is not copied to the config
passed ultimately to runc (i.e. is like the field was not specified).
With this patch, we see the expected behavior.

The other two fileds are in the runtime-spec but not yet in runc (a PR
is open and targets 1.1.0 milestone[3]). However, I took the liberty to
copy them now too for two reasons:

  1. If we don't add them now and end up using a runc version that
    supports them, then the error that the user will see is not clear at
    all:

    docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: listenerPath is not set: unknown.

And it is not obvious to debug for the user, as the field is set in
the profile they specify (just not copied by moby to the profile moby
specifies ultimately to runc).

  1. When using a runc without seccomp notify support (like today), the
    error we see is the same with and without this moby patch (when using a
    seccomp profile with the new fields):

    docker: Error response from daemon: OCI runtime create failed: string SCMP_ACT_NOTIFY is not a valid action for seccomp: unknown.

Then, it seems like a clear win to add them now: we don't have to do it
later (that implies not clear errors to the user if we forget, like we
did with DefaultErrnoRet) and the user sees the exact same error when
using a runc version that doesn't support these fields.

[1]: Note we are vendoring version 1c3f411f041711bbeecf35ff7e93461ea6789220 and this version has these 3 fields https://github.com/opencontainers/runtime-spec/blob/1c3f411f041711bbeecf35ff7e93461ea6789220/config-linux.md#seccomp
[2]: opencontainers/runc#2954
[3]: opencontainers/runc#2682

Signed-off-by: Rodrigo Campos [email protected]

- What I did
Sync moby struct seccomp fields with runtime-spec seccomp fields. This makes DefaultErrnoRet field in the seccomp spec work and prepares for other runc changes in the future.

- How I did it
Added the fields to the moby struct and copied them.

- How to verify it

I tested this with runc 1.0 compiled locally, but should work with runc-rc95 too.

Download the attached seccomp-defaulterrno.txt, let's say you save it as /tmp/seccomp-default-errno.txt.

Then, you should expect the following when running dockerd with this patch:

docker run --rm -it --security-opt seccomp=/tmp/seccomp-defaulterrno.txt debian mkdir a`
mkdir: cannot create directory 'a': No such device or address

This is because the seccomp profile is using errno 6:

$ errno 6
ENXIO 6 No such device or address

If you run dockerd without this patch, you should see EPERM (the DefaultErrnoRet option is ignored without this patch, and therefore it defaults to EPERM):

$ docker run --rm -it --security-opt seccomp=/tmp/seccomp-defaulterrno.txt debian mkdir a
mkdir: cannot create directory 'a': Operation not permitted

- Description for the changelog

Honor DefaultErrnoRet in seccomp profile

- A picture of a cute animal (not mandatory but encouraged)
Of course I chose a rat 馃槃

@rata rata force-pushed the rata/seccomp-new-fields branch from 364f79e to 4ed2302 Compare July 8, 2021 11:05
@rata
Copy link
Contributor Author

rata commented Jul 8, 2021

The CI failing seems unrelated to this PR. Pushed with no changes to re-run them, but I noticed it is the same failure for some commits merged in master, so I guess CI is failing in master, is unrelated to this and will continue to fail :)

@thaJeztah
Copy link
Member

DefaultErrnoRet is implemented in the runc version moby is using (it is implemented since runc-rc95[2]) but if we create a container without this moby patch, we don't see an error nor the expected behavior. This is not clear for the user (the profile they specify is valid, the syntax is ok, but the wrong behavior is seen).

Thanks; I had the DefaultErrnoRet on my list #42005 (comment).

I don't exactly understand your description of the "error" condition though; as far as I'm aware, the field (and value) is optional (see opencontainers/runtime-spec#1087) so if it's unset, the previous default will apply.

We should also

The other two fileds are in the runtime-spec but not yet in runc (a PR
is open and targets 1.1.0 milestone[3]) However, I took the liberty to
copy them now too for two reason
If we don't add them now and end up using a runc version that
supports them, then the error that the user will see is not clear at
all

That looks problematic to me; if that's the case, it means runc updates will break backward compatibility with all current runtimes. Both of those fields are marked OPTIONAL in the spec.

I'm also wondering: if I understand that feature correctly, the listenerPath needs to specify a socket that provides information about a specific container; adding it here means that path (and listenerMetadata) can be specified in a seccomp (JSON) file.

  • who provides the socket?
  • if the socket (or listenerPath) is unique per container; does it make sense to set it through a JSON file? Such a JSON file could only be used for a single container (i.e., not as a generic profile for all container), plus it would require the author of the file to know the ID (or name) of the container in advance.

Perhaps I don't understand the feature, and I'm completely wrong there, but if that's correct, I feel like those values should be either set dynamically (automatically) by the docker daemon (or by containerd or runc), and not defined in the profile?

@thaJeztah thaJeztah added area/security/seccomp kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/2-code-review impact/changelog labels Jul 8, 2021
@rata
Copy link
Contributor Author

rata commented Jul 8, 2021

DefaultErrnoRet is implemented in the runc version moby is using (it is implemented since runc-rc95[2]) but if we create a container without this moby patch, we don't see an error nor the expected behavior. This is not clear for the user (the profile they specify is valid, the syntax is ok, but the wrong behavior is seen).

Thanks; I had the DefaultErrnoRet on my list #42005 (comment).

Cool :)

I don't exactly understand your description of the "error" condition though; as far as I'm aware, the field (and value) is optional (see opencontainers/runtime-spec#1087) so if it's unset, the previous default will apply.

Yes, it is optional and if unset, the previous behaviour applies. That is indeed what happens without this patch: the previous behavior is applied.

My point was that is not clear when it is used or not. Let's say you add the DefaultErrnoRet field to your seccomp profile (like in the config attached in the PR description), but you run the container and you don't see the DefaultErrnoRet being used. You have a valid profile, your runc version might even support it and the moby version you are using is meant to run with that runc version. It's not obvious why you don't see the behavior specifiec by DefaultErrnoRet.

Maybe my example was not the best one? :) But yes, though, what is seen is the old behavior :)

We should also

* add a test that loads a profile (from a json file) that uses this field; https://github.com/moby/moby/blob/master/profiles/seccomp/seccomp_test.go

Will add a test, thanks!

* (possibly) explicitly set the default in our default profile (although that's probably better done separately)

Cool, I can do a follow-up PR for that too. What should the default be? ENOSYS as you mentioned in the comment? I guess that will help userspace to fallback correctly if some syscall is not allowed.

The other two fileds are in the runtime-spec but not yet in runc (a PR
is open and targets 1.1.0 milestone[3]) However, I took the liberty to
copy them now too for two reason
If we don't add them now and end up using a runc version that
supports them, then the error that the user will see is not clear at
all

That looks problematic to me; if that's the case, it means runc updates will break backward compatibility with all current runtimes.
Both of those fields are marked OPTIONAL in the spec.

Yes, both fields are optional but I don't think backwards compatibility is broken. That error only happens if you use the new action SCMP_ACT_NOTIFY in your seccomp profile. That is, if you use the new action that specifies to notify a userspace process on some syscalls, then you need to specify where to send info so a process actually handles that syscall.

So, you only see the error if the new action is specified. IOW, that error only happens if you use a "new" seccomp profile (a profile with SCMP_ACT_NOTIFY and listenerPath), with a runc that support seccomp notify and moby without this patch.

Sorry if I wasn't clear before.

I'm also wondering: if I understand that feature correctly, the listenerPath needs to specify a socket that provides information about a specific container; adding it here means that path (and listenerMetadata) can be specified in a seccomp (JSON) file.

Yes, it can only be specified in the json file.

* who provides the socket?

A seccomp agent should be listening on that listenerPath socket. The seccomp agent can be a container or a process in the host, but something that is deployed before we create a container that uses seccomp notify in its profile.

* if the socket (or `listenerPath`) is _unique_ per container; does it make sense to set it through a JSON file? Such a JSON file could only be used for a single container (i.e., not as a generic profile for all container), plus it would require the author of the file to know the ID (or name) of the container in advance.

No, is not unique per container. One seccomp agent can handle several containers (and typically it does). Also, the seccomp agent is tightly coupled with a seccomp profile: if the profile specifies that syscall X should notify to userpsace, the seccomp agent (listening in the listenerPath specified in the json) should handle syscall X.

You may want to run different agents (like one different agent for containers that are allowed to run more high-privileged operations, etc.) and you can, but anyways you usually handle more than one container per seccomp agent.

Does this clarifies the idea? Let me know if I'm not explaining myself correctly :)

The runtime spec we are using has support for these 3 fields[1], but
moby doesn't have them in its seccomp struct. This patch just adds and
copies them when they are in the profile.

DefaultErrnoRet is implemented in the runc version moby is using (it is
implemented since runc-rc95[2]) but if we create a container without
this moby patch, we don't see an error nor the expected behavior. This
is not clear for the user (the profile they specify is valid, the syntax
is ok, but the wrong behavior is seen).

This is because the DefaultErrnoRet field is not copied to the config
passed ultimately to runc (i.e. is like the field was not specified).
With this patch, we see the expected behavior.

The other two fileds are in the runtime-spec but not yet in runc (a PR
is open and targets 1.1.0 milestone). However, I took the liberty to
copy them now too for two reasons:

1. If we don't add them now and end up using a runc version that
supports them, then the error that the user will see is not clear at
all:

	docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: listenerPath is not set: unknown.

And it is not obvious to debug for the user, as the field _is_ set in
the profile they specify (just not copied by moby to the profile moby
specifies ultimately to runc).

2. When using a runc without seccomp notify support (like today), the
error we see is the same with and without this moby patch (when using a
seccomp profile with the new fields):

	docker: Error response from daemon: OCI runtime create failed: string SCMP_ACT_NOTIFY is not a valid action for seccomp: unknown.

Then, it seems like a clear win to add them now: we don't have to do it
later (that implies not clear errors to the user if we forget, like we
did with DefaultErrnoRet) and the user sees the exact same error when
using a runc version that doesn't support these fields.

[1]: Note we are vendoring version 1c3f411f041711bbeecf35ff7e93461ea6789220 and this version has these 3 fields https://github.com/opencontainers/runtime-spec/blob/1c3f411f041711bbeecf35ff7e93461ea6789220/config-linux.md#seccomp
[2]: opencontainers/runc#2954
[3]: opencontainers/runc#2682

Signed-off-by: Rodrigo Campos <[email protected]>
@rata rata force-pushed the rata/seccomp-new-fields branch from 4ed2302 to 5d24467 Compare July 8, 2021 15:13
@rata
Copy link
Contributor Author

rata commented Jul 8, 2021

Added tests now, PTAL :)

@thaJeztah
Copy link
Member

My point was that is not clear when it is used or not. Let's say you add the DefaultErrnoRet field to your seccomp profile (like in the config attached in the PR description), but you run the container and you don't see the DefaultErrnoRet being used.

Ah, yes, gotcha. We're back on the same page again 馃槄

What should the default be? ENOSYS as you mentioned in the comment? I guess that will help userspace to fallback correctly if some syscall is not allowed.

I suggest to do it in two steps; first a PR to "add the current (EPERM) default explicitly" to the profile, then a follow up PR to "change default to ENOSYS"

I'm open to have the discussion on a changing the default to be ENOSYS, but I recall the heated debates on the runc repository on this subject. Reality is that we used EPERM for years; whether or not that was the correct default, I don't know, but it's there, and if we change, we must be sure it's (at least) discussed, and that we're comfortable with the change (and don't expect a regression in behaviour).

At least, with this PR, users will be able to use a custom profile with ENOSYS as a default, which takes us "one step closer" to changing (we could even consider providing both profiles for the user to choose from).

So, you only see the error if the new action is specified. IOW, that error only happens if you use a "new" seccomp profile (a profile with SCMP_ACT_NOTIFY and listenerPath), with a runc that support seccomp notify and moby without this patch.

Yup, that should be perfectly fine! My primary concern was that (e.g.) a user with an older version of docker/containerd installed to upgrade runc, and no longer being able to run containers.

A seccomp agent should be listening on that listenerPath socket. The seccomp agent can be a container or a process in the host, but something that is deployed before we create a container that uses seccomp notify in its profile.
No, is not unique per container. One seccomp agent can handle several containers (and typically it does).

Gotcha, clear! I thought this was a socket that the / a runtime had to provide, but it's the other way round. I clearly should've taken slightly more time to read up on the feature.

The seccomp agent can be a container

Just so I'm prepared for this; will there be a clear error message if the seccomp agent socket is not ready / doesn't exist?

Running services in a container that other containers depend on (especially when required to start) lead to race-conditions ("agent" container not yet running). Of course this is a more advanced feature, but, well, users "try things" and then report "it doesn't work!!", so hoping the error is clear enough to allow them to identify the problem.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@thaJeztah
Copy link
Member

(test failures are unrelated; it's being worked on, and a bit whack-a-mole currently: I'll kick CI to roll the dice)

@rata
Copy link
Contributor Author

rata commented Jul 8, 2021

What should the default be? ENOSYS as you mentioned in the comment? I guess that will help userspace to fallback correctly if some syscall is not allowed.

I suggest to do it in two steps; first a PR to "add the current (EPERM) default explicitly" to the profile, then a follow up PR to "change default to ENOSYS"

I'm open to have the discussion on a changing the default to be ENOSYS, but I recall the heated debates on the runc repository on this subject. Reality is that we used EPERM for years; whether or not that was the correct default, I don't know, but it's there, and if we change, we must be sure it's (at least) discussed, and that we're comfortable with the change (and don't expect a regression in behaviour).

Cool, after this PR I'll open a new one setting the DefaltErrnoRet to EPERM at least.

To be honest, I don't have an informed opinion on what the default should be, and it is definitely a big change that can break several users. From memory, switching to ENOSYS can help, like it happened with libc using a new syscall, not allowed by seccomp (didn't know about it) and the fallback was not used because the code expected ENOSYS when the syscall was not implemented.

I'll start with being explicit in the default (setting DefaultErrnoRet to EPERM) and then we see :)

At least, with this PR, users will be able to use a custom profile with ENOSYS as a default, which takes us "one step closer" to changing (we could even consider providing both profiles for the user to choose from).

Yeap, should be super helpful for some cases :)

So, you only see the error if the new action is specified. IOW, that error only happens if you use a "new" seccomp profile (a profile with SCMP_ACT_NOTIFY and listenerPath), with a runc that support seccomp notify and moby without this patch.

Yup, that should be perfectly fine! My primary concern was that (e.g.) a user with an older version of docker/containerd installed to upgrade runc, and no longer being able to run containers.

Ohh, I see. Sorry that wasn't clear before! :)

Gotcha, clear! I thought this was a socket that the / a runtime had to provide, but it's the other way round. I clearly should've taken slightly more time to read up on the feature.

Ohh, yes, it is the other way around :)

The seccomp agent can be a container

Just so I'm prepared for this; will there be a clear error message if the seccomp agent socket is not ready / doesn't exist?

Yes, runc should output:

ERRO[0017] container_linux.go:380: starting container process caused: dial unix /tmp/seccompagent.sock: connect: no such file or directory 

Running services in a container that other containers depend on (especially when required to start) lead to race-conditions ("agent" container not yet running). Of course this is a more advanced feature, but, well, users "try things" and then report "it doesn't work!!", so hoping the error is clear enough to allow them to identify the problem.

Yes, good call. I think the error is clear, but there is only one way to know... ;)

@thaJeztah
Copy link
Member

Thanks for the extra info!

w.r.t. the error (didn't look at the runc PR yet), would be great if there was some "clue" what socket it is / where it comes from (just the name of the socket may not always give a clue; your /tmp/seccompagent.sock example of course helps, but it could have any name).

@rata
Copy link
Contributor Author

rata commented Jul 8, 2021

@thaJeztah oh, good point. Will update the runc PR next week making it clearer in the error msg. Thanks!

@thaJeztah thaJeztah added this to the 21.xx milestone Jul 9, 2021
@thaJeztah
Copy link
Member

@AkihiroSuda @tianon ptal

Copy link
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah merged commit a2da507 into moby:master Jul 15, 2021
@rata rata deleted the rata/seccomp-new-fields branch July 16, 2021 10:45
@rata
Copy link
Contributor Author

rata commented Jul 16, 2021

Just for the record, I improved the runc error msg: opencontainers/runc#2682 (comment)

Thanks again! :)

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

profiles/seccomp/seccomp.go

``

profiles/seccomp/seccomp.go Show resolved Hide resolved
@thaJeztah
Copy link
Member

^^ reported the account for abuse / spam

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/security/seccomp impact/changelog kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/2-code-review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants