2018-02-05 21:05:59 +00:00
package config // import "github.com/docker/docker/daemon/config"
2017-01-23 11:23:07 +00:00
import (
portmapper: move userland-proxy lookup to daemon config
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).
This was both inefficient (looking up the binary for each port), inconsistent
(when running in rootless-mode, the binary was looked-up once), as well as
inconvenient, because a missing binary, or a mis-configureed userland-proxy-path
would not be detected daeemon startup, and not produce an error until starting
the container;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
However, the container would still be created (but invalid);
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
869f41d7e94f nginx:alpine "/docker-entrypoint.…" 10 seconds ago Created romantic_wiles
This patch changes how the userland-proxy is configured;
- The path of the userland-proxy is now looked up / configured at daemon
startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled, an error is
produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
is now required to be set. It no longer looks up the executable, and
produces an error if no path was provided. While this change was not
required, it makes the daemon config the canonical source of truth, instead
of logic spread accross multiplee locations.
Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;
- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
preference in PATH)
With this patch:
Validating the config produces an error if the binary is not found:
dockerd --validate
WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
userland-proxy is enabled, but userland-proxy-path is not set
Disabling userland-proxy prints a warning, but validates as "OK":
dockerd --userland-proxy=false --validate
WARN[2023-12-29T11:38:30.752523879Z] ffailed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
configuration OK
Speficying a non-absolute path produces an error:
dockerd --userland-proxy-path=docker-proxy --validate
invalid userland-proxy-path: must be an absolute path: docker-proxy
Befor this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
Specifying an invalid userland-proxy-path produces an error as well:
dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory
mkdir -p /usr/local/bin/not-a-file
dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory
touch /usr/local/bin/not-an-executable
dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied
Same when using the daemon.json config-file;
echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary
dockerd --userland-proxy-path=hello --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-29 12:52:45 +00:00
"context"
2017-01-23 11:23:07 +00:00
"fmt"
2021-07-10 12:44:02 +00:00
"net"
2022-06-06 19:11:08 +00:00
"os/exec"
"path/filepath"
2017-01-23 11:23:07 +00:00
2023-01-30 14:43:31 +00:00
"github.com/containerd/cgroups/v3"
portmapper: move userland-proxy lookup to daemon config
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).
This was both inefficient (looking up the binary for each port), inconsistent
(when running in rootless-mode, the binary was looked-up once), as well as
inconvenient, because a missing binary, or a mis-configureed userland-proxy-path
would not be detected daeemon startup, and not produce an error until starting
the container;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
However, the container would still be created (but invalid);
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
869f41d7e94f nginx:alpine "/docker-entrypoint.…" 10 seconds ago Created romantic_wiles
This patch changes how the userland-proxy is configured;
- The path of the userland-proxy is now looked up / configured at daemon
startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled, an error is
produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
is now required to be set. It no longer looks up the executable, and
produces an error if no path was provided. While this change was not
required, it makes the daemon config the canonical source of truth, instead
of logic spread accross multiplee locations.
Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;
- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
preference in PATH)
With this patch:
Validating the config produces an error if the binary is not found:
dockerd --validate
WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
userland-proxy is enabled, but userland-proxy-path is not set
Disabling userland-proxy prints a warning, but validates as "OK":
dockerd --userland-proxy=false --validate
WARN[2023-12-29T11:38:30.752523879Z] ffailed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
configuration OK
Speficying a non-absolute path produces an error:
dockerd --userland-proxy-path=docker-proxy --validate
invalid userland-proxy-path: must be an absolute path: docker-proxy
Befor this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
Specifying an invalid userland-proxy-path produces an error as well:
dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory
mkdir -p /usr/local/bin/not-a-file
dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory
touch /usr/local/bin/not-an-executable
dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied
Same when using the daemon.json config-file;
echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary
dockerd --userland-proxy-path=hello --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-29 12:52:45 +00:00
"github.com/containerd/log"
2022-04-02 15:40:59 +00:00
"github.com/docker/docker/api/types/container"
2023-07-03 11:14:14 +00:00
"github.com/docker/docker/api/types/system"
Allow overlapping change in bridge's IPv6 network.
Calculate the IPv6 addreesses needed on a bridge, then reconcile them
with the addresses on an existing bridge by deleting then adding as
required.
(Previously, required addresses were added one-by-one, then unwanted
addresses were removed. This meant the daemon failed to start if, for
example, an existing bridge had address '2000:db8::/64' and the config
was changed to '2000:db8::/80'.)
IPv6 addresses are now calculated and applied in one go, so there's no
need for setupVerifyAndReconcile() to check the set of IPv6 addresses on
the bridge. And, it was guarded by !config.InhibitIPv4, which can't have
been right. So, removed its IPv6 parts, and added IPv4 to its name.
Link local addresses, the example given in the original ticket, are now
released when containers are stopped. Not releasing them meant that
when using an LL subnet on the default bridge, no container could be
started after a container was stopped (because the calculated address
could not be re-allocated). In non-default bridge networks using an
LL subnet, addresses leaked.
Linux always uses the standard 'fe80::/64' LL network. So, if a bridge
is configured with an LL subnet prefix that overlaps with it, a config
error is reported. Non-overlapping LL subnet prefixes are allowed.
Signed-off-by: Rob Murray <rob.murray@docker.com>
2023-11-25 14:38:03 +00:00
"github.com/docker/docker/libnetwork/drivers/bridge"
2017-01-23 11:23:07 +00:00
"github.com/docker/docker/opts"
2022-06-06 19:11:08 +00:00
"github.com/docker/docker/pkg/homedir"
2023-01-03 12:08:40 +00:00
"github.com/docker/docker/pkg/rootless"
2019-08-05 14:37:47 +00:00
units "github.com/docker/go-units"
2022-06-06 19:11:08 +00:00
"github.com/pkg/errors"
2017-01-23 11:23:07 +00:00
)
Implement none, private, and shareable ipc modes
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-06-27 21:58:50 +00:00
const (
// DefaultIpcMode is default for container's IpcMode, if not set otherwise
2022-04-02 15:40:59 +00:00
DefaultIpcMode = container . IPCModePrivate
2019-10-13 12:18:57 +00:00
// DefaultCgroupNamespaceMode is the default mode for containers cgroup namespace when using cgroups v2.
2022-04-02 15:40:59 +00:00
DefaultCgroupNamespaceMode = container . CgroupnsModePrivate
2019-10-13 12:18:57 +00:00
// DefaultCgroupV1NamespaceMode is the default mode for containers cgroup namespace when using cgroups v1.
2022-04-02 15:40:59 +00:00
DefaultCgroupV1NamespaceMode = container . CgroupnsModeHost
2021-02-26 23:23:55 +00:00
// StockRuntimeName is the reserved name/alias used to represent the
// OCI runtime being shipped with the docker daemon package.
StockRuntimeName = "runc"
daemon: raise default minimum API version to v1.24
The daemon currently provides support for API versions all the way back
to v1.12, which is the version of the API that shipped with docker 1.0. On
Windows, the minimum supported version is v1.24.
Such old versions of the client are rare, and supporting older API versions
has accumulated significant amounts of code to remain backward-compatible
(which is largely untested, and a "best-effort" at most).
This patch updates the minimum API version to v1.24, which is the fallback
API version used when API-version negotiation fails. The intent is to start
deprecating older API versions, but no code is removed yet as part of this
patch, and a DOCKER_MIN_API_VERSION environment variable is added, which
allows overriding the minimum version (to allow restoring the behavior from
before this patch).
With this patch the daemon defaults to API v1.24 as minimum:
docker version
Client:
Version: 24.0.2
API version: 1.43
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:50:49 2023
OS/Arch: linux/arm64
Context: default
Server:
Engine:
Version: dev
API version: 1.44 (minimum version 1.24)
Go version: go1.21.3
Git commit: 0322a29b9ef8806aaa4b45dc9d9a2ebcf0244bf4
Built: Mon Dec 4 15:22:17 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: v1.7.9
GitCommit: 4f03e100cb967922bec7459a78d16ccbac9bb81d
runc:
Version: 1.1.10
GitCommit: v1.1.10-0-g18a0cb0
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Trying to use an older version of the API produces an error:
DOCKER_API_VERSION=1.23 docker version
Client:
Version: 24.0.2
API version: 1.23 (downgraded from 1.43)
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:50:49 2023
OS/Arch: linux/arm64
Context: default
Error response from daemon: client version 1.23 is too old. Minimum supported API version is 1.24, please upgrade your client to a newer version
To restore the previous minimum, users can start the daemon with the
DOCKER_MIN_API_VERSION environment variable set:
DOCKER_MIN_API_VERSION=1.12 dockerd
API 1.12 is the oldest supported API version on Linux;
docker version
Client:
Version: 24.0.2
API version: 1.43
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:50:49 2023
OS/Arch: linux/arm64
Context: default
Server:
Engine:
Version: dev
API version: 1.44 (minimum version 1.12)
Go version: go1.21.3
Git commit: 0322a29b9ef8806aaa4b45dc9d9a2ebcf0244bf4
Built: Mon Dec 4 15:22:17 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: v1.7.9
GitCommit: 4f03e100cb967922bec7459a78d16ccbac9bb81d
runc:
Version: 1.1.10
GitCommit: v1.1.10-0-g18a0cb0
docker-init:
Version: 0.19.0
GitCommit: de40ad0
When using the `DOCKER_MIN_API_VERSION` with a version of the API that
is not supported, an error is produced when starting the daemon;
DOCKER_MIN_API_VERSION=1.11 dockerd --validate
invalid DOCKER_MIN_API_VERSION: minimum supported API version is 1.12: 1.11
DOCKER_MIN_API_VERSION=1.45 dockerd --validate
invalid DOCKER_MIN_API_VERSION: maximum supported API version is 1.44: 1.45
Specifying a malformed API version also produces the same error;
DOCKER_MIN_API_VERSION=hello dockerd --validate
invalid DOCKER_MIN_API_VERSION: minimum supported API version is 1.12: hello
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-04 12:44:21 +00:00
portmapper: move userland-proxy lookup to daemon config
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).
This was both inefficient (looking up the binary for each port), inconsistent
(when running in rootless-mode, the binary was looked-up once), as well as
inconvenient, because a missing binary, or a mis-configureed userland-proxy-path
would not be detected daeemon startup, and not produce an error until starting
the container;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
However, the container would still be created (but invalid);
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
869f41d7e94f nginx:alpine "/docker-entrypoint.…" 10 seconds ago Created romantic_wiles
This patch changes how the userland-proxy is configured;
- The path of the userland-proxy is now looked up / configured at daemon
startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled, an error is
produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
is now required to be set. It no longer looks up the executable, and
produces an error if no path was provided. While this change was not
required, it makes the daemon config the canonical source of truth, instead
of logic spread accross multiplee locations.
Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;
- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
preference in PATH)
With this patch:
Validating the config produces an error if the binary is not found:
dockerd --validate
WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
userland-proxy is enabled, but userland-proxy-path is not set
Disabling userland-proxy prints a warning, but validates as "OK":
dockerd --userland-proxy=false --validate
WARN[2023-12-29T11:38:30.752523879Z] ffailed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
configuration OK
Speficying a non-absolute path produces an error:
dockerd --userland-proxy-path=docker-proxy --validate
invalid userland-proxy-path: must be an absolute path: docker-proxy
Befor this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
Specifying an invalid userland-proxy-path produces an error as well:
dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory
mkdir -p /usr/local/bin/not-a-file
dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory
touch /usr/local/bin/not-an-executable
dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied
Same when using the daemon.json config-file;
echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary
dockerd --userland-proxy-path=hello --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-29 12:52:45 +00:00
// userlandProxyBinary is the name of the userland-proxy binary.
// In rootless-mode, [rootless.RootlessKitDockerProxyBinary] is used instead.
userlandProxyBinary = "docker-proxy"
Implement none, private, and shareable ipc modes
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-06-27 21:58:50 +00:00
)
2023-11-03 10:08:22 +00:00
// BridgeConfig stores all the parameters for both the bridge driver and the default bridge network.
2021-07-10 12:40:25 +00:00
type BridgeConfig struct {
2023-11-03 10:08:22 +00:00
DefaultBridgeConfig
EnableIPTables bool ` json:"iptables,omitempty" `
EnableIP6Tables bool ` json:"ip6tables,omitempty" `
EnableIPForward bool ` json:"ip-forward,omitempty" `
EnableIPMasq bool ` json:"ip-masq,omitempty" `
EnableUserlandProxy bool ` json:"userland-proxy,omitempty" `
UserlandProxyPath string ` json:"userland-proxy-path,omitempty" `
}
// DefaultBridgeConfig stores all the parameters for the default bridge network.
type DefaultBridgeConfig struct {
2021-07-10 12:40:25 +00:00
commonBridgeConfig
// Fields below here are platform specific.
2023-11-03 10:08:22 +00:00
EnableIPv6 bool ` json:"ipv6,omitempty" `
FixedCIDRv6 string ` json:"fixed-cidr-v6,omitempty" `
2023-07-05 12:21:04 +00:00
MTU int ` json:"mtu,omitempty" `
2021-07-10 12:44:02 +00:00
DefaultIP net . IP ` json:"ip,omitempty" `
IP string ` json:"bip,omitempty" `
DefaultGatewayIPv4 net . IP ` json:"default-gateway,omitempty" `
DefaultGatewayIPv6 net . IP ` json:"default-gateway-v6,omitempty" `
InterContainerCommunication bool ` json:"icc,omitempty" `
2021-07-10 12:40:25 +00:00
}
2017-01-23 11:23:07 +00:00
// Config defines the configuration of a docker daemon.
// It includes json tags to deserialize configuration from a file
// using the same names that the flags in the command line uses.
type Config struct {
CommonConfig
// Fields below here are platform specific.
2023-07-03 11:14:14 +00:00
Runtimes map [ string ] system . Runtime ` json:"runtimes,omitempty" `
DefaultInitBinary string ` json:"default-init,omitempty" `
CgroupParent string ` json:"cgroup-parent,omitempty" `
EnableSelinuxSupport bool ` json:"selinux-enabled,omitempty" `
RemappedRoot string ` json:"userns-remap,omitempty" `
Ulimits map [ string ] * units . Ulimit ` json:"default-ulimits,omitempty" `
CPURealtimePeriod int64 ` json:"cpu-rt-period,omitempty" `
CPURealtimeRuntime int64 ` json:"cpu-rt-runtime,omitempty" `
Init bool ` json:"init,omitempty" `
InitPath string ` json:"init-path,omitempty" `
SeccompProfile string ` json:"seccomp-profile,omitempty" `
ShmSize opts . MemBytes ` json:"default-shm-size,omitempty" `
NoNewPrivileges bool ` json:"no-new-privileges,omitempty" `
IpcMode string ` json:"default-ipc-mode,omitempty" `
CgroupNamespaceMode string ` json:"default-cgroupns-mode,omitempty" `
2018-07-17 19:11:38 +00:00
// ResolvConf is the path to the configuration of the host resolver
ResolvConf string ` json:"resolv-conf,omitempty" `
2018-10-15 07:52:53 +00:00
Rootless bool ` json:"rootless,omitempty" `
2017-01-23 11:23:07 +00:00
}
2021-07-10 12:50:32 +00:00
// GetExecRoot returns the user configured Exec-root
func ( conf * Config ) GetExecRoot ( ) string {
return conf . ExecRoot
}
// GetInitPath returns the configured docker-init path
func ( conf * Config ) GetInitPath ( ) string {
if conf . InitPath != "" {
return conf . InitPath
}
if conf . DefaultInitBinary != "" {
return conf . DefaultInitBinary
}
return DefaultInitBinary
}
2023-03-22 20:26:43 +00:00
// LookupInitPath returns an absolute path to the "docker-init" binary by searching relevant "libexec" directories (per FHS 3.0 & 2.3) followed by PATH
func ( conf * Config ) LookupInitPath ( ) ( string , error ) {
binary := conf . GetInitPath ( )
if filepath . IsAbs ( binary ) {
return binary , nil
}
for _ , dir := range [ ] string {
// FHS 3.0: "/usr/libexec includes internal binaries that are not intended to be executed directly by users or shell scripts. Applications may use a single subdirectory under /usr/libexec."
// https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html
"/usr/local/libexec/docker" ,
"/usr/libexec/docker" ,
// FHS 2.3: "/usr/lib includes object files, libraries, and internal binaries that are not intended to be executed directly by users or shell scripts."
// https://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA
"/usr/local/lib/docker" ,
"/usr/lib/docker" ,
} {
// exec.LookPath has a fast-path short-circuit for paths that contain "/" (skipping the PATH lookup) that then verifies whether the given path is likely to be an actual executable binary (so we invoke that instead of reimplementing the same checks)
if file , err := exec . LookPath ( filepath . Join ( dir , binary ) ) ; err == nil {
return file , nil
}
}
// if we checked all the "libexec" directories and found no matches, fall back to PATH
return exec . LookPath ( binary )
}
2021-07-10 12:50:32 +00:00
// GetResolvConf returns the appropriate resolv.conf
// Check setupResolvConf on how this is selected
func ( conf * Config ) GetResolvConf ( ) string {
return conf . ResolvConf
}
2017-01-23 11:23:07 +00:00
// IsSwarmCompatible defines if swarm mode can be enabled in this config
func ( conf * Config ) IsSwarmCompatible ( ) error {
if conf . LiveRestoreEnabled {
return fmt . Errorf ( "--live-restore daemon configuration is incompatible with swarm mode" )
}
return nil
}
Implement none, private, and shareable ipc modes
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-06-27 21:58:50 +00:00
func verifyDefaultIpcMode ( mode string ) error {
2021-05-31 09:48:30 +00:00
const hint = ` use "shareable" or "private" `
Implement none, private, and shareable ipc modes
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-06-27 21:58:50 +00:00
2022-04-02 15:40:59 +00:00
dm := container . IpcMode ( mode )
Implement none, private, and shareable ipc modes
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-06-27 21:58:50 +00:00
if ! dm . Valid ( ) {
2021-05-31 09:48:30 +00:00
return fmt . Errorf ( "default IPC mode setting (%v) is invalid; " + hint , dm )
Implement none, private, and shareable ipc modes
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-06-27 21:58:50 +00:00
}
if dm != "" && ! dm . IsPrivate ( ) && ! dm . IsShareable ( ) {
2021-05-31 09:48:30 +00:00
return fmt . Errorf ( ` IPC mode "%v" is not supported as default value; ` + hint , dm )
Implement none, private, and shareable ipc modes
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-06-27 21:58:50 +00:00
}
return nil
}
2019-03-15 03:44:18 +00:00
func verifyDefaultCgroupNsMode ( mode string ) error {
2022-04-02 15:40:59 +00:00
cm := container . CgroupnsMode ( mode )
2019-03-15 03:44:18 +00:00
if ! cm . Valid ( ) {
2021-05-31 09:48:30 +00:00
return fmt . Errorf ( ` default cgroup namespace mode (%v) is invalid; use "host" or "private" ` , cm )
2019-03-15 03:44:18 +00:00
}
return nil
}
Implement none, private, and shareable ipc modes
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-06-27 21:58:50 +00:00
// ValidatePlatformConfig checks if any platform-specific configuration settings are invalid.
func ( conf * Config ) ValidatePlatformConfig ( ) error {
portmapper: move userland-proxy lookup to daemon config
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).
This was both inefficient (looking up the binary for each port), inconsistent
(when running in rootless-mode, the binary was looked-up once), as well as
inconvenient, because a missing binary, or a mis-configureed userland-proxy-path
would not be detected daeemon startup, and not produce an error until starting
the container;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
However, the container would still be created (but invalid);
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
869f41d7e94f nginx:alpine "/docker-entrypoint.…" 10 seconds ago Created romantic_wiles
This patch changes how the userland-proxy is configured;
- The path of the userland-proxy is now looked up / configured at daemon
startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled, an error is
produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
is now required to be set. It no longer looks up the executable, and
produces an error if no path was provided. While this change was not
required, it makes the daemon config the canonical source of truth, instead
of logic spread accross multiplee locations.
Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;
- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
preference in PATH)
With this patch:
Validating the config produces an error if the binary is not found:
dockerd --validate
WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
userland-proxy is enabled, but userland-proxy-path is not set
Disabling userland-proxy prints a warning, but validates as "OK":
dockerd --userland-proxy=false --validate
WARN[2023-12-29T11:38:30.752523879Z] ffailed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
configuration OK
Speficying a non-absolute path produces an error:
dockerd --userland-proxy-path=docker-proxy --validate
invalid userland-proxy-path: must be an absolute path: docker-proxy
Befor this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
Specifying an invalid userland-proxy-path produces an error as well:
dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory
mkdir -p /usr/local/bin/not-a-file
dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory
touch /usr/local/bin/not-an-executable
dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied
Same when using the daemon.json config-file;
echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary
dockerd --userland-proxy-path=hello --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-29 12:52:45 +00:00
if conf . EnableUserlandProxy {
if conf . UserlandProxyPath == "" {
return errors . New ( "invalid userland-proxy-path: userland-proxy is enabled, but userland-proxy-path is not set" )
}
if ! filepath . IsAbs ( conf . UserlandProxyPath ) {
return errors . New ( "invalid userland-proxy-path: must be an absolute path: " + conf . UserlandProxyPath )
}
// Using exec.LookPath here, because it also produces an error if the
// given path is not a valid executable or a directory.
if _ , err := exec . LookPath ( conf . UserlandProxyPath ) ; err != nil {
return errors . Wrap ( err , "invalid userland-proxy-path" )
}
}
2019-03-15 03:44:18 +00:00
if err := verifyDefaultIpcMode ( conf . IpcMode ) ; err != nil {
return err
}
Allow overlapping change in bridge's IPv6 network.
Calculate the IPv6 addreesses needed on a bridge, then reconcile them
with the addresses on an existing bridge by deleting then adding as
required.
(Previously, required addresses were added one-by-one, then unwanted
addresses were removed. This meant the daemon failed to start if, for
example, an existing bridge had address '2000:db8::/64' and the config
was changed to '2000:db8::/80'.)
IPv6 addresses are now calculated and applied in one go, so there's no
need for setupVerifyAndReconcile() to check the set of IPv6 addresses on
the bridge. And, it was guarded by !config.InhibitIPv4, which can't have
been right. So, removed its IPv6 parts, and added IPv4 to its name.
Link local addresses, the example given in the original ticket, are now
released when containers are stopped. Not releasing them meant that
when using an LL subnet on the default bridge, no container could be
started after a container was stopped (because the calculated address
could not be re-allocated). In non-default bridge networks using an
LL subnet, addresses leaked.
Linux always uses the standard 'fe80::/64' LL network. So, if a bridge
is configured with an LL subnet prefix that overlaps with it, a config
error is reported. Non-overlapping LL subnet prefixes are allowed.
Signed-off-by: Rob Murray <rob.murray@docker.com>
2023-11-25 14:38:03 +00:00
if err := bridge . ValidateFixedCIDRV6 ( conf . FixedCIDRv6 ) ; err != nil {
return errors . Wrap ( err , "invalid fixed-cidr-v6" )
}
2024-03-15 17:04:41 +00:00
if _ , ok := conf . Features [ "windows-dns-proxy" ] ; ok {
return errors . New ( "feature option 'windows-dns-proxy' is only available on Windows" )
}
2019-03-15 03:44:18 +00:00
return verifyDefaultCgroupNsMode ( conf . CgroupNamespaceMode )
Implement none, private, and shareable ipc modes
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-06-27 21:58:50 +00:00
}
2018-10-15 07:52:53 +00:00
2021-07-10 12:40:25 +00:00
// IsRootless returns conf.Rootless on Linux but false on Windows
2018-10-15 07:52:53 +00:00
func ( conf * Config ) IsRootless ( ) bool {
return conf . Rootless
}
2022-06-06 19:11:08 +00:00
func setPlatformDefaults ( cfg * Config ) error {
cfg . Ulimits = make ( map [ string ] * units . Ulimit )
cfg . ShmSize = opts . MemBytes ( DefaultShmSize )
cfg . SeccompProfile = SeccompProfileDefault
cfg . IpcMode = string ( DefaultIpcMode )
2023-07-03 11:14:14 +00:00
cfg . Runtimes = make ( map [ string ] system . Runtime )
2022-06-06 19:11:08 +00:00
if cgroups . Mode ( ) != cgroups . Unified {
cfg . CgroupNamespaceMode = string ( DefaultCgroupV1NamespaceMode )
} else {
cfg . CgroupNamespaceMode = string ( DefaultCgroupNamespaceMode )
}
if rootless . RunningWithRootlessKit ( ) {
cfg . Rootless = true
var err error
// use rootlesskit-docker-proxy for exposing the ports in RootlessKit netns to the initial namespace.
cfg . BridgeConfig . UserlandProxyPath , err = exec . LookPath ( rootless . RootlessKitDockerProxyBinary )
if err != nil {
return errors . Wrapf ( err , "running with RootlessKit, but %s not installed" , rootless . RootlessKitDockerProxyBinary )
}
dataHome , err := homedir . GetDataHome ( )
if err != nil {
return err
}
runtimeDir , err := homedir . GetRuntimeDir ( )
if err != nil {
return err
}
cfg . Root = filepath . Join ( dataHome , "docker" )
cfg . ExecRoot = filepath . Join ( runtimeDir , "docker" )
cfg . Pidfile = filepath . Join ( runtimeDir , "docker.pid" )
} else {
portmapper: move userland-proxy lookup to daemon config
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).
This was both inefficient (looking up the binary for each port), inconsistent
(when running in rootless-mode, the binary was looked-up once), as well as
inconvenient, because a missing binary, or a mis-configureed userland-proxy-path
would not be detected daeemon startup, and not produce an error until starting
the container;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
However, the container would still be created (but invalid);
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
869f41d7e94f nginx:alpine "/docker-entrypoint.…" 10 seconds ago Created romantic_wiles
This patch changes how the userland-proxy is configured;
- The path of the userland-proxy is now looked up / configured at daemon
startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled, an error is
produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
is now required to be set. It no longer looks up the executable, and
produces an error if no path was provided. While this change was not
required, it makes the daemon config the canonical source of truth, instead
of logic spread accross multiplee locations.
Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;
- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
preference in PATH)
With this patch:
Validating the config produces an error if the binary is not found:
dockerd --validate
WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
userland-proxy is enabled, but userland-proxy-path is not set
Disabling userland-proxy prints a warning, but validates as "OK":
dockerd --userland-proxy=false --validate
WARN[2023-12-29T11:38:30.752523879Z] ffailed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
configuration OK
Speficying a non-absolute path produces an error:
dockerd --userland-proxy-path=docker-proxy --validate
invalid userland-proxy-path: must be an absolute path: docker-proxy
Befor this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
Specifying an invalid userland-proxy-path produces an error as well:
dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory
mkdir -p /usr/local/bin/not-a-file
dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory
touch /usr/local/bin/not-an-executable
dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied
Same when using the daemon.json config-file;
echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary
dockerd --userland-proxy-path=hello --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-29 12:52:45 +00:00
var err error
cfg . BridgeConfig . UserlandProxyPath , err = exec . LookPath ( userlandProxyBinary )
if err != nil {
2024-01-09 09:31:06 +00:00
// Log, but don't error here. This allows running a daemon with
// userland-proxy disabled (which does not require the binary
portmapper: move userland-proxy lookup to daemon config
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).
This was both inefficient (looking up the binary for each port), inconsistent
(when running in rootless-mode, the binary was looked-up once), as well as
inconvenient, because a missing binary, or a mis-configureed userland-proxy-path
would not be detected daeemon startup, and not produce an error until starting
the container;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
However, the container would still be created (but invalid);
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
869f41d7e94f nginx:alpine "/docker-entrypoint.…" 10 seconds ago Created romantic_wiles
This patch changes how the userland-proxy is configured;
- The path of the userland-proxy is now looked up / configured at daemon
startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled, an error is
produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
is now required to be set. It no longer looks up the executable, and
produces an error if no path was provided. While this change was not
required, it makes the daemon config the canonical source of truth, instead
of logic spread accross multiplee locations.
Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;
- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
preference in PATH)
With this patch:
Validating the config produces an error if the binary is not found:
dockerd --validate
WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
userland-proxy is enabled, but userland-proxy-path is not set
Disabling userland-proxy prints a warning, but validates as "OK":
dockerd --userland-proxy=false --validate
WARN[2023-12-29T11:38:30.752523879Z] ffailed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
configuration OK
Speficying a non-absolute path produces an error:
dockerd --userland-proxy-path=docker-proxy --validate
invalid userland-proxy-path: must be an absolute path: docker-proxy
Befor this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
Specifying an invalid userland-proxy-path produces an error as well:
dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory
mkdir -p /usr/local/bin/not-a-file
dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory
touch /usr/local/bin/not-an-executable
dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied
Same when using the daemon.json config-file;
echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary
dockerd --userland-proxy-path=hello --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-29 12:52:45 +00:00
// to be present).
//
// An error is still produced by [Config.ValidatePlatformConfig] if
// userland-proxy is enabled in the configuration.
2024-01-09 09:31:06 +00:00
//
// We log this at "debug" level, as this code is also executed
// when running "--version", and we don't want to print logs in
// that case..
log . G ( context . TODO ( ) ) . WithError ( err ) . Debug ( "failed to lookup default userland-proxy binary" )
portmapper: move userland-proxy lookup to daemon config
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).
This was both inefficient (looking up the binary for each port), inconsistent
(when running in rootless-mode, the binary was looked-up once), as well as
inconvenient, because a missing binary, or a mis-configureed userland-proxy-path
would not be detected daeemon startup, and not produce an error until starting
the container;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
However, the container would still be created (but invalid);
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
869f41d7e94f nginx:alpine "/docker-entrypoint.…" 10 seconds ago Created romantic_wiles
This patch changes how the userland-proxy is configured;
- The path of the userland-proxy is now looked up / configured at daemon
startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled, an error is
produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
is now required to be set. It no longer looks up the executable, and
produces an error if no path was provided. While this change was not
required, it makes the daemon config the canonical source of truth, instead
of logic spread accross multiplee locations.
Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;
- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
preference in PATH)
With this patch:
Validating the config produces an error if the binary is not found:
dockerd --validate
WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
userland-proxy is enabled, but userland-proxy-path is not set
Disabling userland-proxy prints a warning, but validates as "OK":
dockerd --userland-proxy=false --validate
WARN[2023-12-29T11:38:30.752523879Z] ffailed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
configuration OK
Speficying a non-absolute path produces an error:
dockerd --userland-proxy-path=docker-proxy --validate
invalid userland-proxy-path: must be an absolute path: docker-proxy
Befor this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
Specifying an invalid userland-proxy-path produces an error as well:
dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory
mkdir -p /usr/local/bin/not-a-file
dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory
touch /usr/local/bin/not-an-executable
dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied
Same when using the daemon.json config-file;
echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary
dockerd --userland-proxy-path=hello --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-29 12:52:45 +00:00
}
2022-06-06 19:11:08 +00:00
cfg . Root = "/var/lib/docker"
cfg . ExecRoot = "/var/run/docker"
cfg . Pidfile = "/var/run/docker.pid"
}
return nil
}