0ct0pu5/moby

Author	SHA1	Message	Date
Cory Snider	d222bf097c	daemon: reload runtimes w/o breaking containers The existing runtimes reload logic went to great lengths to replace the directory containing runtime wrapper scripts as atomically as possible within the limitations of the Linux filesystem ABI. Trouble is, atomically swapping the wrapper scripts directory solves the wrong problem! The runtime configuration is "locked in" when a container is started, including the path to the runC binary. If a container is started with a runtime which requires a daemon-managed wrapper script and then the daemon is reloaded with a config which no longer requires the wrapper script (i.e. some args -> no args, or the runtime is dropped from the config), that container would become unmanageable. Any attempts to stop, exec or otherwise perform lifecycle management operations on the container are likely to fail due to the wrapper script no longer existing at its original path. Atomically swapping the wrapper scripts is also incompatible with the read-copy-update paradigm for reloading configuration. A handler in the daemon could retain a reference to the pre-reload configuration for an indeterminate amount of time after the daemon configuration has been reloaded and updated. It is possible for the daemon to attempt to start a container using a deleted wrapper script if a request to run a container races a reload. Solve the problem of deleting referenced wrapper scripts by ensuring that all wrapper scripts are immutable for the lifetime of the daemon process. Any given runtime wrapper script must always exist with the same contents, no matter how many times the daemon config is reloaded, or what changes are made to the config. This is accomplished by using everyone's favourite design pattern: content-addressable storage. Each wrapper script file name is suffixed with the SHA-256 digest of its contents to (probabilistically) guarantee immutability without needing any concurrency control. Stale runtime wrapper scripts are only cleaned up on the next daemon restart. Split the derived runtimes configuration from the user-supplied configuration to have a place to store derived state without mutating the user-supplied configuration or exposing daemon internals in API struct types. Hold the derived state and the user-supplied configuration in a single struct value so that they can be updated as an atomic unit. Signed-off-by: Cory Snider <csnider@mirantis.com>	2023-06-01 14:45:25 -04:00
Cory Snider	0b592467d9	daemon: read-copy-update the daemon config Ensure data-race-free access to the daemon configuration without locking by mutating a deep copy of the config and atomically storing a pointer to the copy into the daemon-wide configStore value. Any operations which need to read from the daemon config must capture the configStore value only once and pass it around to guarantee a consistent view of the config. Signed-off-by: Cory Snider <csnider@mirantis.com>	2023-06-01 14:45:24 -04:00
Sebastiaan van Stijn	ab35df454d	remove pre-go1.17 build-tags Removed pre-go1.17 build-tags with go fix; go mod init go fix -mod=readonly ./... rm go.mod Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2023-05-19 20:38:51 +02:00
Sebastiaan van Stijn	3eebf4d162	container: split security options to a SecurityOptions struct - Split these options to a separate struct, so that we can handle them in isolation. - Change some tests to use subtests, and improve coverage Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2023-04-29 00:03:37 +02:00
Cory Snider	4bafaa00aa	Refactor libcontainerd to minimize c8d RPCs The containerd client is very chatty at the best of times. Because the libcontainerd API is stateless and references containers and processes by string ID for every method call, the implementation is essentially forced to use the containerd client in a way which amplifies the number of redundant RPCs invoked to perform any operation. The libcontainerd remote implementation has to reload the containerd container, task and/or process metadata for nearly every operation. This in turn amplifies the number of context switches between dockerd and containerd to perform any container operation or handle a containerd event, increasing the load on the system which could otherwise be allocated to workloads. Overhaul the libcontainerd interface to reduce the impedance mismatch with the containerd client so that the containerd client can be used more efficiently. Split the API out into container, task and process interfaces which the consumer is expected to retain so that libcontainerd can retain state---especially the analogous containerd client objects---without having to manage any state-store inside the libcontainerd client. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-08-24 14:59:08 -04:00
Sebastiaan van Stijn	686be57d0a	Update to Go 1.17.0, and gofmt with Go 1.17 Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2021-08-24 23:33:27 +02:00
Sebastiaan van Stijn	6d1eceb509	Fix panic in TestExecSetPlatformOpt, TestExecSetPlatformOptPrivileged These tests would panic; - in WithRLimits(), because HostConfig was not set; `470ae8422f/daemon/oci_linux.go (L46-L47)` - in daemon.mergeUlimits(), because daemon.configStore was not set; `470ae8422f/daemon/oci_linux.go (L1069)` This panic was not discovered because the current version of runc/libcontainer that we vendor would always return false for `apparmor.IsEnabled()` when running docker-in-docker or if `apparmor_parser` is not found. Starting with v1.0.0-rc93 of libcontainer, this is no longer the case (changed in `bfb4ea1b1b`) This patch; - changes the tests to initialize Daemon.configStore and Container.HostConfig - Combines TestExecSetPlatformOpt and TestExecSetPlatformOptPrivileged into a new test (TestExecSetPlatformOptAppArmor) - Runs the test both if AppArmor is enabled and if not (in which case it tests that the container's AppArmor profile is left empty). - Adds a FIXME comment for a possible bug in execSetPlatformOpts, which currently prefers custom profiles over "privileged". Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2021-04-23 00:39:39 +02:00
Sebastiaan van Stijn	2834f842ee	Use containerd's apparmor package to detect if apparmor can be used The runc/libcontainer apparmor package on master no longer checks if apparmor_parser is enabled, or if we are running docker-in-docker. While those checks are not relevant to runc (as it doesn't load the profile), these checks _are_ relevant to us (and containerd). So switching to use the containerd apparmor package, which does include the needed checks. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2021-04-08 20:22:08 +02:00
Sebastiaan van Stijn	9f0b3f5609	bump gotest.tools v3.0.1 for compatibility with Go 1.14 full diff: https://github.com/gotestyourself/gotest.tools/compare/v2.3.0...v3.0.1 Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2020-02-11 00:06:42 +01:00
Sebastiaan van Stijn	a33cf495f2	daemon: use constants for AppArmor profiles Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2019-10-13 19:16:12 +02:00
Sebastiaan van Stijn	07ff4f1de8	goimports: fix imports Format the source according to latest goimports. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2019-09-18 12:56:54 +02:00
Vincent Demeester	3845728524	Update tests to use gotest.tools 👼 Signed-off-by: Vincent Demeester <vincent@sbr.pm>	2018-06-13 09:04:30 +02:00
Sebastiaan van Stijn	8f3308ae10	Fix AppArmor not being applied to Exec processes Exec processes do not automatically inherit AppArmor profiles from the container. This patch sets the AppArmor profile for the exec process. Before this change: apparmor_parser -q -r <<EOF #include <tunables/global> profile deny-write flags=(attach_disconnected) { #include <abstractions/base> file, network, deny /tmp/** w, capability, } EOF docker run -dit --security-opt "apparmor=deny-write" --name aa busybox docker exec aa sh -c 'mkdir /tmp/test' (no error) With this change applied: docker exec aa sh -c 'mkdir /tmp/test' mkdir: can't create directory '/tmp/test': Permission denied Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2018-03-02 14:05:36 +01:00

13 commits