Currently moby drops ep sets before the entrypoint is executed.
This does mean that with combination of no-new-privileges the
file capabilities stops working with non-root containers.
This is undesired as the usability of such containers is harmed
comparing to running root containers.
This commit therefore sets the effective/permitted set in order
to allow use of file capabilities or libcap(3)/prctl(2) respectively
with combination of no-new-privileges and without respectively.
For no-new-privileges the container will be able to obtain capabilities
that are requested.
Signed-off-by: Luboslav Pivarc <lpivarc@redhat.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
(cherry picked from commit 3aef732e61)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Audit the OCI spec options used for Linux containers to ensure they are
less order-dependent. Ensure they don't assume that any pointer fields
are non-nil and that they don't unintentionally clobber mutations to the
spec applied by other options.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 8a094fe609)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Older versions of Go don't format comments, so committing this as
a separate commit, so that we can already make these changes before
we upgrade to Go 1.19.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Linux kernel never sets the Inheritable capability flag to anything
other than empty. Moby should have the same behavior, and leave it to
userspace code within the container to set a non-empty value if desired.
Reported-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Samuel Karp <skarp@amazon.com>
This fixes a panic when an invalid "device cgroup rule" is passed, resulting
in an "index out of range".
This bug was introduced in the original implementation in 1756af6faf,
but was not reproducible when using the CLI, because the same commit also added
client-side validation on the flag before making an API request. The following
example, uses an invalid rule (`c *:* rwm` - two spaces before the permissions);
```console
$ docker run --rm --network=host --device-cgroup-rule='c *:* rwm' busybox
invalid argument "c *:* rwm" for "--device-cgroup-rule" flag: invalid device cgroup format 'c *:* rwm'
```
Doing the same, but using the API results in a daemon panic when starting the container;
Create a container with an invalid device cgroup rule:
```console
curl -v \
--unix-socket /var/run/docker.sock \
"http://localhost/v1.41/containers/create?name=foobar" \
-H "Content-Type: application/json" \
-d '{"Image":"busybox:latest", "HostConfig":{"DeviceCgroupRules": ["c *:* rwm"]}}'
```
Start the container:
```console
curl -v \
--unix-socket /var/run/docker.sock \
-X POST \
"http://localhost/v1.41/containers/foobar/start"
```
Observe the daemon logs:
```
2021-01-22 12:53:03.313806 I | http: panic serving @: runtime error: index out of range [0] with length 0
goroutine 571 [running]:
net/http.(*conn).serve.func1(0xc000cb2d20)
/usr/local/go/src/net/http/server.go:1795 +0x13b
panic(0x2f32380, 0xc000aebfc0)
/usr/local/go/src/runtime/panic.go:679 +0x1b6
github.com/docker/docker/oci.AppendDevicePermissionsFromCgroupRules(0xc000175c00, 0x8, 0x8, 0xc0000bd380, 0x1, 0x4, 0x0, 0x0, 0xc0000e69c0, 0x0, ...)
/go/src/github.com/docker/docker/oci/oci.go:34 +0x64f
```
This patch:
- fixes the panic, allowing the daemon to return an error on container start
- adds a unit-test to validate various permutations
- adds a "todo" to verify the regular expression (and handling) of the "a" (all) value
We should also consider performing this validation when _creating_ the container,
so that an error is produced early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Add support for exact list of capabilities, support only OCI model
- Support OCI model on CapAdd and CapDrop but remain backward compatibility
- Create variable locally instead of declaring it at the top
- Use const for magic "ALL" value
- Rename `cap` variable as it overlaps with `cap()` built-in
- Normalize and validate capabilities before use
- Move validation for conflicting options to validateHostConfig()
- TweakCapabilities: simplify logic to calculate capabilities
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>