Commit graph

36 commits

Author SHA1 Message Date
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Cory Snider
0f6eeecac0 daemon: consolidate runtimes config validation
The daemon has made a habit of mutating the DefaultRuntime and Runtimes
values in the Config struct to merge defaults. This would be fine if it
was a part of the regular configuration loading and merging process,
as is done with other config options. The trouble is it does so in
surprising places, such as in functions with 'verify' or 'validate' in
their name. It has been necessary in order to validate that the user has
not defined a custom runtime named "runc" which would shadow the
built-in runtime of the same name. Other daemon code depends on the
runtime named "runc" always being defined in the config, but merging it
with the user config at the same time as the other defaults are merged
would trip the validation. The root of the issue is that the daemon has
used the same config values for both validating the daemon runtime
configuration as supplied by the user and for keeping track of which
runtimes have been set up by the daemon. Now that a completely separate
value is used for the latter purpose, surprising contortions are no
longer required to make the validation work as intended.

Consolidate the validation of the runtimes config and merging of the
built-in runtimes into the daemon.setupRuntimes() function. Set the
result of merging the built-in runtimes config and default default
runtime on the returned runtimes struct, without back-propagating it
onto the config.Config argument.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Akihiro Suda
5045a2de24
Support recursively read-only (RRO) mounts
`docker run -v /foo:/foo:ro` is now recursively read-only on kernel >= 5.12.

Automatically falls back to the legacy non-recursively read-only mount mode on kernel < 5.12.

Use `ro-non-recursive` to disable RRO.
Use `ro-force-recursive` or `rro` to explicitly enable RRO. (Fails on kernel < 5.12)

Fix issue 44978
Fix docker/for-linux issue 788

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-05-26 01:58:24 +09:00
Sebastiaan van Stijn
f72548956f
remove deprecated legacy "overlay" storage-driver
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-04-19 17:06:45 +02:00
Sebastiaan van Stijn
3903f16cd6
daemon: remove deprecated AuFS storage driver
There's still some locations refering to AuFS;

- pkg/archive: I suspect most of that code is because the whiteout-files
  are modelled after aufs (but possibly some code is only relevant to
  images created with AuFS as storage driver; to be looked into).
- contrib/apparmor/template: likely some rules can be removed
- contrib/dockerize-disk.sh: very old contribution, and unlikely used
  by anyone, but perhaps could be updated if we want to (or just removed).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-04-15 01:27:16 +02:00
Cory Snider
cc19eba579 daemon: let libnetwork assign default bridge IPAM
The netutils.ElectInterfaceAddresses function is only used in one place
outside of tests: in the daemon, to configure the default bridge
network. The function is also messy to reason about as it references the
shared mutable state of ipamutils.PredefinedLocalScopeDefaultNetworks.
It uses the list of predefined default networks to always return an IPv4
address even if the named interface does not exist or does not have any
IPv4 addresses. This list happens to be the same as the one used to
initialize the address pool of the 'builtin' IPAM driver, though that is
far from obvious. (Start with "./libnetwork".initIPAMDrivers and trace
the dataflow of the addressPool value. Surprise! Global state is being
mutated using the value of other global mutable state.)

The daemon does not need the fallback behaviour of
ElectInterfaceAddresses. In fact, the daemon does not have to configure
an address pool for the network at all! libnetwork will acquire one of
the available address ranges from the network's IPAM driver when the
preferred-pool configuration is unset. It will do so using the same list
of address ranges and the exact same logic
(netutils.FindAvailableNetworks) as ElectInterfaceAddresses. So unless
the daemon needs to force the network to use a specific address range
because the bridge interface already exists, it can leave the details
up to libnetwork.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-01-26 14:54:57 -05:00
Sebastiaan van Stijn
19c5d21e6f
daemon: getPluginExecRoot(): pass config
This makes it more transparent that it's unused for Linux,
and we don't pass "root", which has no relation with the
path on Linux.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-17 15:22:10 +02:00
Sebastiaan van Stijn
cb7b329911
daemon: fix daemon.Shutdown, daemon.Cleanup not cleaning up overlay2 mounts
While working on deprecation of the `aufs` and `overlay` storage-drivers, the
`TestCleanupMounts` had to be updated, as it was currently using `aufs` for
testing. When rewriting the test to use `overlay2` instead (using an updated
`mountsFixture`), I found out that the test was failing, and it appears that
only `overlay`, but not `overlay2` was taken into account.

These cleanup functions were added in 05cc737f54,
but at the time the `overlay2` storage driver was not yet implemented;
05cc737f54/daemon/graphdriver

This omission was likely missed in 23e5c94cfb,
because the original implementation re-used the `overlay` storage driver, but
later on it was decided to make `overlay2` a separate storage driver.

As a result of the above, `daemon.cleanupMountsByID()` would ignore any `overlay2`
mounts during `daemon.Shutdown()` and `daemon.Cleanup()`.

This patch:

- Adds a new `mountsFixtureOverlay2` with example mounts for `overlay2`
- Rewrites the tests to use `gotest.tools` for more informative output on failures.
- Adds the missing regex patterns to `daemon/getCleanPatterns()`. The patterns
  are added at the start of the list to allow for the fasted match (`overlay2`
  is the default for most setups, and the code is iterating over possible
  options).

As a follow-up, we could consider adding additional fixtures for different
storage drivers.

Before the fix is applied:

    go test -v -run TestCleanupMounts ./daemon/
    === RUN   TestCleanupMounts
    === RUN   TestCleanupMounts/aufs
    === RUN   TestCleanupMounts/overlay2
    daemon_linux_test.go:135: assertion failed: 0 (unmounted int) != 1 (int): Expected to unmount the shm (and the shm only)
    --- FAIL: TestCleanupMounts (0.01s)
    --- PASS: TestCleanupMounts/aufs (0.00s)
    --- FAIL: TestCleanupMounts/overlay2 (0.01s)
    === RUN   TestCleanupMountsByID
    === RUN   TestCleanupMountsByID/aufs
    === RUN   TestCleanupMountsByID/overlay2
    daemon_linux_test.go:171: assertion failed: 0 (unmounted int) != 1 (int): Expected to unmount the root (and that only)
    --- FAIL: TestCleanupMountsByID (0.00s)
    --- PASS: TestCleanupMountsByID/aufs (0.00s)
    --- FAIL: TestCleanupMountsByID/overlay2 (0.00s)
    FAIL
    FAIL	github.com/docker/docker/daemon	0.054s
    FAIL

With the fix applied:

    go test -v -run TestCleanupMounts ./daemon/
    === RUN   TestCleanupMounts
    === RUN   TestCleanupMounts/aufs
    === RUN   TestCleanupMounts/overlay2
    --- PASS: TestCleanupMounts (0.00s)
    --- PASS: TestCleanupMounts/aufs (0.00s)
    --- PASS: TestCleanupMounts/overlay2 (0.00s)
    === RUN   TestCleanupMountsByID
    === RUN   TestCleanupMountsByID/aufs
    === RUN   TestCleanupMountsByID/overlay2
    --- PASS: TestCleanupMountsByID (0.00s)
    --- PASS: TestCleanupMountsByID/aufs (0.00s)
    --- PASS: TestCleanupMountsByID/overlay2 (0.00s)
    PASS
    ok  	github.com/docker/docker/daemon	0.042s

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-29 16:28:13 +02:00
Brian Goff
a0a473125b Fix libnetwork imports
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 21:51:23 +00:00
정재영
19eda6b9a2 Update daemon_linux.go for preventing off-by-one
Array length should be bigger than 5, when accessing index 4

Signed-off-by: J-jaeyoung <jjy600901@gmail.com>
2020-11-17 14:08:48 +09:00
Sebastiaan van Stijn
eb14d936bf
daemon: rename variables that collide with imported package names
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:23 +02:00
Sebastiaan van Stijn
5d040cbd16
daemon: fix capitalization of some functions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:19 +02:00
Kir Kolyshkin
39048cf656 Really switch to moby/sys/mount*
Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential
outside users.

This commit was generated by the following bash script:

```
set -e -u -o pipefail

for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do
	sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \
		-e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \
		$file
	goimports -w $file
done
```

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-03-20 09:46:25 -07:00
Sebastiaan van Stijn
bad0b4e604
Remove skip evaluation of symlinks to data root on IoT Core
This fix was added in 8e71b1e210 to work around
a go issue (https://github.com/golang/go/issues/20506).

That issue was fixed in
66c03d39f3,
which is part of Go 1.10 and up. This reverts the changes that were made in
8e71b1e210, and are no longer needed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-07-13 23:44:51 +02:00
Tibor Vass
8ff4ec98cf build: buildkit now also uses systemd's resolv.conf
Signed-off-by: Tibor Vass <tibor@docker.com>
2019-06-04 16:04:10 +00:00
Flavio Crisciani
e353e7e3f0
Fixes for resolv.conf
Handle the case of systemd-resolved, and if in place
use a different resolv.conf source.
Set appropriately the option on libnetwork.
Move unix specific code to container_operation_unix

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2018-07-26 11:17:56 -07:00
Vincent Demeester
53982e3fc1
Merge pull request #36091 from kolyshkin/mount
pkg/mount improvements
2018-04-21 11:03:54 +02:00
Kir Kolyshkin
d3ebcde82a daemon.cleanupMounts(): use mount.SingleEntryFilter
Use mount.SingleEntryFilter as we're only interested in a single entry.

Test case data of TestShouldUnmountRoot is modified accordingly, as
from now on:

1. `info` can't be nil;

2. the mountpoint check is not performed (as SingleEntryFilter
   guarantees it to be equal to daemon.root).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-04-19 14:48:25 -07:00
Kir Kolyshkin
bb934c6aca pkg/mount: implement/use filter for mountinfo parsing
Functions `GetMounts()` and `parseMountTable()` return all the entries
as read and parsed from /proc/self/mountinfo. In many cases the caller
is only interested only one or a few entries, not all of them.

One good example is `Mounted()` function, which looks for a specific
entry only. Another example is `RecursiveUnmount()` which is only
interested in mount under a specific path.

This commit adds `filter` argument to `GetMounts()` to implement
two things:
 1. filter out entries a caller is not interested in
 2. stop processing if a caller is found what it wanted

`nil` can be passed to get a backward-compatible behavior, i.e. return
all the entries.

A few filters are implemented:
 - `PrefixFilter`: filters out all entries not under `prefix`
 - `SingleEntryFilter`: looks for a specific entry

Finally, `Mounted()` is modified to use `SingleEntryFilter()`, and
`RecursiveUnmount()` is using `PrefixFilter()`.

Unit tests are added to check filters are working.

[v2: ditch NoFilter, use nil]
[v3: ditch GetMountsFiltered()]
[v4: add unit test for filters]
[v5: switch to gotestyourself]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-04-19 14:48:09 -07:00
Brian Goff
c403f0036b Extra check before unmounting on shutdown
This makes sure that if the daemon root was already a self-binded mount
(thus meaning the daemonc only performed a remount) that the daemon does
not try to unmount.

Example:

```
$ sudo mount --bind /var/lib/docker /var/lib/docker
$ sudo dockerd &
```

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-04-18 20:43:42 -04:00
Brian Goff
487c6c7e73 Ensure daemon root is unmounted on shutdown
This is only for the case when dockerd has had to re-mount the daemon
root as shared.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-02-15 15:58:20 -05:00
Daniel Nephin
4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
Derek McGowan
1009e6a40b
Update logrus to v1.0.1
Fixes case sensitivity issue

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2017-07-31 13:16:46 -07:00
Darren Stahl
8e71b1e210 Skip evaluation of symlinks to data root on IoT Core
Signed-off-by: Darren Stahl <darst@microsoft.com>
2017-06-13 15:02:35 -07:00
Anusha Ragunathan
26517a0161 Add Windows specific exec root for plugins.
Fixes #30572

Signed-off-by: Anusha Ragunathan <anusha.ragunathan@docker.com>
2017-02-02 14:00:12 -08:00
Tonis Tiigi
05cc737f54 Fix container mount cleanup issues
- Refactor generic and path based cleanup functions into a single function.
- Include aufs and zfs mounts in the mounts cleanup.
- Containers that receive exit event on restore don't require manual cleanup.
- Make missing sandbox id message a warning because currently sandboxes are always cleared on startup. libnetwork#975
- Don't unmount volumes for containers that don't have base path. Shouldn't be needed after #21372

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-03-30 17:25:49 -07:00
Tonis Tiigi
9c4570a958 Replace execdrivers with containerd implementation
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
2016-03-18 13:38:32 -07:00
Lei Jitang
af614a19dc Clean up container rootf mounts on daemon start fixes #19679
When the daemon shutdown ungracefully, it will left the running
containers' rootfs still be mounted. This will cause some error
when trying to remove the containers.

Signed-off-by: Lei Jitang <leijitang@huawei.com>
2016-02-03 20:52:32 -05:00
Brian Goff
78bd17e805 Force IPC mount to unmount on daemon shutdown/init
Instead of using `MNT_DETACH` to unmount the container's mqueue/shm
mounts, force it... but only on daemon init and shutdown.

This makes sure that these IPC mounts are cleaned up even when the
daemon is killed.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2015-10-30 15:41:48 -04:00
Chun Chen
213a0f9d86 Do not try to cleanupMounts if daemon.repository is empty
Signed-off-by: Chun Chen <ramichen@tencent.com>
2015-09-29 11:30:18 +08:00
David Calavera
b1d2f52bb2 Improvements to the original sharing implementation.
- Print the mount table as in /proc/self/mountinfo
- Do not exit prematurely when one of the ipc mounts doesn't exist.
- Do not exit prematurely when one of the ipc mounts cannot be unmounted.
- Add a unit test to see if the cleanup really works.
- Use syscall.MNT_DETACH to cleanup mounts after a crash.
- Unmount IPC mounts when the daemon unregisters an old running container.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-09-23 12:07:24 -04:00
Mrunal Patel
c8291f7107 Add support for sharing /dev/shm/ and /dev/mqueue between containers
This changeset creates /dev/shm and /dev/mqueue mounts for each container under
/var/lib/containers/<id>/ and bind mounts them into the container. When --ipc:container<id/name>
is used, then the /dev/shm and /dev/mqueue of the ipc container are used instead of creating
new ones for the container.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)

(cherry picked from commit d88fe447df)
2015-09-11 14:02:11 -04:00
David Calavera
688dd8477e Revert "Add support for sharing /dev/shm/ and /dev/mqueue between containers"
This reverts commit d88fe447df.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-08-26 05:23:00 -04:00
Mrunal Patel
d88fe447df Add support for sharing /dev/shm/ and /dev/mqueue between containers
This changeset creates /dev/shm and /dev/mqueue mounts for each container under
/var/lib/containers/<id>/ and bind mounts them into the container. When --ipc:container<id/name>
is used, then the /dev/shm and /dev/mqueue of the ipc container are used instead of creating
new ones for the container.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
2015-08-19 12:36:52 -04:00