moby/daemon/config/config_linux.go

package config // import "github.com/docker/docker/daemon/config"
import (
"context"
"fmt"
"net"
"os/exec"
"path/filepath"
"github.com/containerd/cgroups/v3"
"github.com/containerd/log"
"github.com/docker/docker/api/types/container"
"github.com/docker/docker/api/types/system"
"github.com/docker/docker/libnetwork/drivers/bridge"
"github.com/docker/docker/opts"
"github.com/docker/docker/pkg/homedir"
"github.com/docker/docker/pkg/rootless"
units "github.com/docker/go-units"
"github.com/pkg/errors"
)
const (
	// DefaultIpcMode is the default for a container's IpcMode, if not set otherwise.
	DefaultIpcMode = container.IPCModePrivate

	// DefaultCgroupNamespaceMode is the default mode for a container's cgroup namespace when using cgroups v2.
	DefaultCgroupNamespaceMode = container.CgroupnsModePrivate

	// DefaultCgroupV1NamespaceMode is the default mode for a container's cgroup namespace when using cgroups v1.
	DefaultCgroupV1NamespaceMode = container.CgroupnsModeHost

	// StockRuntimeName is the reserved name/alias used to represent the
	// OCI runtime being shipped with the docker daemon package.
	StockRuntimeName = "runc"

	// userlandProxyBinary is the name of the userland-proxy binary.
	// In rootless-mode, [rootless.RootlessKitDockerProxyBinary] is used instead.
	userlandProxyBinary = "docker-proxy"
)
// BridgeConfig stores all the parameters for both the bridge driver and the default bridge network.
type BridgeConfig struct {
	DefaultBridgeConfig

	EnableIPTables      bool   `json:"iptables,omitempty"`
	EnableIP6Tables     bool   `json:"ip6tables,omitempty"`
	EnableIPForward     bool   `json:"ip-forward,omitempty"`
	EnableIPMasq        bool   `json:"ip-masq,omitempty"`
	EnableUserlandProxy bool   `json:"userland-proxy,omitempty"`
	UserlandProxyPath   string `json:"userland-proxy-path,omitempty"`
}
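
// An illustrative daemon.json fragment for the proxy-related fields above
// (values are examples only; "/usr/local/bin/docker-proxy" is not a default
// taken from this file). The same options map to the --userland-proxy and
// --userland-proxy-path daemon flags.
//
//	{
//	  "userland-proxy": true,
//	  "userland-proxy-path": "/usr/local/bin/docker-proxy"
//	}
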
// DefaultBridgeConfig stores all the parameters for the default bridge network.
type DefaultBridgeConfig struct {
	commonBridgeConfig

	// Fields below here are platform specific.
	EnableIPv6                  bool   `json:"ipv6,omitempty"`
	FixedCIDRv6                 string `json:"fixed-cidr-v6,omitempty"`
	MTU                         int    `json:"mtu,omitempty"`
	DefaultIP                   net.IP `json:"ip,omitempty"`
	IP                          string `json:"bip,omitempty"`
	DefaultGatewayIPv4          net.IP `json:"default-gateway,omitempty"`
	DefaultGatewayIPv6          net.IP `json:"default-gateway-v6,omitempty"`
	InterContainerCommunication bool   `json:"icc,omitempty"`
}
// Config defines the configuration of a docker daemon.
// It includes JSON tags to deserialize configuration from a file
// using the same names as the corresponding command-line flags.
type Config struct {
	CommonConfig

	// Fields below here are platform specific.
	Runtimes             map[string]system.Runtime `json:"runtimes,omitempty"`
	DefaultInitBinary    string                    `json:"default-init,omitempty"`
	CgroupParent         string                    `json:"cgroup-parent,omitempty"`
	EnableSelinuxSupport bool                      `json:"selinux-enabled,omitempty"`
	RemappedRoot         string                    `json:"userns-remap,omitempty"`
	Ulimits              map[string]*units.Ulimit  `json:"default-ulimits,omitempty"`
	CPURealtimePeriod    int64                     `json:"cpu-rt-period,omitempty"`
	CPURealtimeRuntime   int64                     `json:"cpu-rt-runtime,omitempty"`
	Init                 bool                      `json:"init,omitempty"`
	InitPath             string                    `json:"init-path,omitempty"`
	SeccompProfile       string                    `json:"seccomp-profile,omitempty"`
	ShmSize              opts.MemBytes             `json:"default-shm-size,omitempty"`
	NoNewPrivileges      bool                      `json:"no-new-privileges,omitempty"`
	IpcMode              string                    `json:"default-ipc-mode,omitempty"`
	CgroupNamespaceMode  string                    `json:"default-cgroupns-mode,omitempty"`

	// ResolvConf is the path to the configuration of the host resolver
	ResolvConf string `json:"resolv-conf,omitempty"`

	Rootless bool `json:"rootless,omitempty"`
}
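
// A minimal, hypothetical sketch of how a populated Config might be checked;
// the daemon's real loading and merging logic lives elsewhere, and
// fileContents is an assumed []byte holding a daemon.json document.
//
//	var conf Config
//	if err := json.Unmarshal(fileContents, &conf); err != nil {
//		return err
//	}
//	if err := conf.ValidatePlatformConfig(); err != nil {
//		return err
//	}
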
// GetExecRoot returns the user-configured exec root directory.
func (conf *Config) GetExecRoot() string {
	return conf.ExecRoot
}
// GetInitPath returns the configured docker-init path
func (conf *Config) GetInitPath() string {
	if conf.InitPath != "" {
		return conf.InitPath
	}
	if conf.DefaultInitBinary != "" {
		return conf.DefaultInitBinary
	}
	return DefaultInitBinary
}
// LookupInitPath returns an absolute path to the "docker-init" binary by searching relevant "libexec" directories (per FHS 3.0 & 2.3) followed by PATH
func (conf *Config) LookupInitPath() (string, error) {
	binary := conf.GetInitPath()
	if filepath.IsAbs(binary) {
		return binary, nil
	}

	for _, dir := range []string{
		// FHS 3.0: "/usr/libexec includes internal binaries that are not intended to be executed directly by users or shell scripts. Applications may use a single subdirectory under /usr/libexec."
		// https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html
		"/usr/local/libexec/docker",
		"/usr/libexec/docker",

		// FHS 2.3: "/usr/lib includes object files, libraries, and internal binaries that are not intended to be executed directly by users or shell scripts."
		// https://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA
		"/usr/local/lib/docker",
		"/usr/lib/docker",
	} {
		// exec.LookPath has a fast-path short-circuit for paths that contain "/" (skipping the PATH lookup) that then verifies whether the given path is likely to be an actual executable binary (so we invoke that instead of reimplementing the same checks)
		if file, err := exec.LookPath(filepath.Join(dir, binary)); err == nil {
			return file, nil
		}
	}

	// if we checked all the "libexec" directories and found no matches, fall back to PATH
	return exec.LookPath(binary)
}
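
// Illustrative use only (not part of this file): a caller would typically
// resolve the init binary once at startup and treat a lookup failure as a
// configuration error.
//
//	initPath, err := conf.LookupInitPath()
//	if err != nil {
//		return errors.Wrap(err, "docker-init binary not found")
//	}
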
// GetResolvConf returns the appropriate resolv.conf.
// See setupResolvConf for how this is selected.
func (conf *Config) GetResolvConf() string {
	return conf.ResolvConf
}
// IsSwarmCompatible verifies whether swarm mode can be enabled with this configuration.
func (conf *Config) IsSwarmCompatible() error {
	if conf.LiveRestoreEnabled {
		return fmt.Errorf("--live-restore daemon configuration is incompatible with swarm mode")
	}
	return nil
}
func verifyDefaultIpcMode(mode string) error {
	const hint = `use "shareable" or "private"`

	dm := container.IpcMode(mode)
	if !dm.Valid() {
		return fmt.Errorf("default IPC mode setting (%v) is invalid; "+hint, dm)
	}
	if dm != "" && !dm.IsPrivate() && !dm.IsShareable() {
		return fmt.Errorf(`IPC mode "%v" is not supported as default value; `+hint, dm)
	}
	return nil
}
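
// Illustrative results, derived from the checks above:
//
//	verifyDefaultIpcMode("private")   // nil
//	verifyDefaultIpcMode("shareable") // nil
//	verifyDefaultIpcMode("")          // nil, assuming the unset value passes Valid()
//	verifyDefaultIpcMode("host")      // error: not supported as default value
//	verifyDefaultIpcMode("bogus")     // error: invalid IPC mode setting
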
func verifyDefaultCgroupNsMode(mode string) error {
	cm := container.CgroupnsMode(mode)
	if !cm.Valid() {
		return fmt.Errorf(`default cgroup namespace mode (%v) is invalid; use "host" or "private"`, cm)
	}
	return nil
}
// ValidatePlatformConfig checks if any platform-specific configuration settings are invalid.
func (conf *Config) ValidatePlatformConfig() error {
if conf.EnableUserlandProxy {
if conf.UserlandProxyPath == "" {
return errors.New("invalid userland-proxy-path: userland-proxy is enabled, but userland-proxy-path is not set")
}
if !filepath.IsAbs(conf.UserlandProxyPath) {
return errors.New("invalid userland-proxy-path: must be an absolute path: " + conf.UserlandProxyPath)
}
// Using exec.LookPath here, because it also produces an error if the
// given path is not a valid executable, or is a directory.
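// For illustration (hypothetical paths), the failures this reports at
// validation time rather than when the first port is mapped:
//
//	/usr/local/bin/no-such-binary   -> "no such file or directory"
//	/usr/local/bin/is-a-directory   -> "is a directory"
//	/usr/local/bin/not-executable   -> "permission denied"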
if _, err := exec.LookPath(conf.UserlandProxyPath); err != nil {
return errors.Wrap(err, "invalid userland-proxy-path")
}
}
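// For illustration, a daemon.json fragment that exercises the checks above
// (the path is an example, not a built-in default); "dockerd --validate"
// reports these errors at startup instead of at the first port mapping:
//
//	{
//	    "userland-proxy": true,
//	    "userland-proxy-path": "/usr/local/bin/docker-proxy"
//	}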
if err := verifyDefaultIpcMode(conf.IpcMode); err != nil {
return err
}
if err := bridge.ValidateFixedCIDRV6(conf.FixedCIDRv6); err != nil {
return errors.Wrap(err, "invalid fixed-cidr-v6")
}
if _, ok := conf.Features["windows-dns-proxy"]; ok {
return errors.New("feature option 'windows-dns-proxy' is only available on Windows")
}
return verifyDefaultCgroupNsMode(conf.CgroupNamespaceMode)
}
// IsRootless returns conf.Rootless on Linux but false on Windows
func (conf *Config) IsRootless() bool {
return conf.Rootless
}
func setPlatformDefaults(cfg *Config) error {
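// Baseline Linux defaults, applied in both rootful and rootless mode. For
// reference, DefaultShmSize is 64 MiB upstream and SeccompProfileDefault
// selects the built-in seccomp profile.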
cfg.Ulimits = make(map[string]*units.Ulimit)
cfg.ShmSize = opts.MemBytes(DefaultShmSize)
cfg.SeccompProfile = SeccompProfileDefault
cfg.IpcMode = string(DefaultIpcMode)
cfg.Runtimes = make(map[string]system.Runtime)
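// Pick the default cgroup namespace mode based on the host's cgroup layout:
// cgroup v1 hosts keep the historical "host" default, while unified
// (cgroup v2) hosts use the regular default ("private" upstream).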
if cgroups.Mode() != cgroups.Unified {
cfg.CgroupNamespaceMode = string(DefaultCgroupV1NamespaceMode)
} else {
cfg.CgroupNamespaceMode = string(DefaultCgroupNamespaceMode)
}
if rootless.RunningWithRootlessKit() {
cfg.Rootless = true
var err error
// use rootlesskit-docker-proxy for exposing the ports in RootlessKit netns to the initial namespace.
cfg.BridgeConfig.UserlandProxyPath, err = exec.LookPath(rootless.RootlessKitDockerProxyBinary)
if err != nil {
return errors.Wrapf(err, "running with RootlessKit, but %s not installed", rootless.RootlessKitDockerProxyBinary)
}
dataHome, err := homedir.GetDataHome()
if err != nil {
return err
}
runtimeDir, err := homedir.GetRuntimeDir()
if err != nil {
return err
}
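// In rootless mode, daemon state lives under the user's XDG directories,
// typically ~/.local/share/docker and $XDG_RUNTIME_DIR/docker (illustrative
// locations; the exact paths come from the homedir helpers above).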
cfg.Root = filepath.Join(dataHome, "docker")
cfg.ExecRoot = filepath.Join(runtimeDir, "docker")
cfg.Pidfile = filepath.Join(runtimeDir, "docker.pid")
} else {
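// Rootful mode: resolve the regular userland proxy binary ("docker-proxy"
// upstream) from PATH once at startup, rather than per mapped port.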
var err error
cfg.BridgeConfig.UserlandProxyPath, err = exec.LookPath(userlandProxyBinary)
if err != nil {
// Log, but don't error here. This allows running a daemon with
// userland-proxy disabled (which does not require the binary
// to be present).
//
// An error is still produced by [Config.ValidatePlatformConfig] if
// userland-proxy is enabled in the configuration.
//
// We log this at "debug" level, as this code is also executed
// when running "--version", and we don't want to print logs in
// that case.
log.G(context.TODO()).WithError(err).Debug("failed to lookup default userland-proxy binary")
}
cfg.Root = "/var/lib/docker"
cfg.ExecRoot = "/var/run/docker"
cfg.Pidfile = "/var/run/docker.pid"
}
return nil
}