Commit graph

7106 commits

Author SHA1 Message Date
Sebastiaan van Stijn
821b4d4108
daemon/config: DefaultShmSize: minor tweak and improve docs
I had to check what the actual size was, so added it to the const's documentation.

While at it, also made use of it in a test, so that we're testing against the expected
value, and changed one alias to be consistent with other places where we alias this
import.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-18 23:29:36 +01:00
Sebastiaan van Stijn
705f9b68cc
some cleaning up of isolation checks, and platform information
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-18 22:58:37 +01:00
Sebastiaan van Stijn
1b3fef5333
Windows: require Windows Server RS5 / ltsc2019 (build 17763) as minimum
Windows Server 2016 (RS1) reached end of support, and Docker Desktop requires
Windows 10 V19H2 (version 1909, build 18363) as a minimum.

This patch makes Windows Server RS5 /  ltsc2019 (build 17763) the minimum version
to run the daemon, and removes some hacks for older versions of Windows.

There is one check remaining that checks for Windows RS3 for a workaround
on older versions, but recent changes in Windows seemed to have regressed
on the same issue, so I kept that code for now to check if we may need that
workaround (again);

085c6a98d5/daemon/graphdriver/windows/windows.go (L319-L341)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-18 22:58:28 +01:00
Akihiro Suda
54d35c071d
Merge pull request #43130 from thaJeztah/daemon_cache_sysinfo
daemon: load and cache sysInfo on initialization
2022-02-18 13:46:15 +09:00
Sebastiaan van Stijn
79cad59d97
daemon/images: ImageService.Cleanup(): return error instead of logging
This makes the function a bit more idiomatic, and leaves it to the caller to
decide wether or not the error can be ignored.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-17 22:04:03 +01:00
Sebastiaan van Stijn
32e5fe5099
Merge pull request #43182 from thaJeztah/layer_remove_unused_error
layer: remove unused error return from .Size() and .DiffSize()
2022-02-17 20:51:45 +01:00
Brian Goff
047d58f007
Merge pull request #43187 from thaJeztah/remove_lcow_checks
Remove various leftover LCOW checks
2022-02-17 11:22:19 -08:00
Brian Goff
81db56f0a9
Merge pull request #43253 from thaJeztah/remove_kernel_version_check
daemon: remove kernel version check and DOCKER_NOWARN_KERNEL_VERSION
2022-02-17 11:17:15 -08:00
Sebastiaan van Stijn
dd4cf4b641
daemon: remove some unused stubs on Windows
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-17 17:57:51 +01:00
Sebastiaan van Stijn
1240f8b41d
daemon: remove kernel version check and DOCKER_NOWARN_KERNEL_VERSION
All regular, non-EOL Linux distros now come with more recent kernels
out of the box. There may still be users trying to run on kernel 3.10
or older (some embedded systems, e.g.), but those should be a rare
exception, which we don't have to take into account.

This patch removes the kernel version check on Linux, and the corresponding
DOCKER_NOWARN_KERNEL_VERSION environment that was there to skip this
check.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-17 17:47:22 +01:00
Sebastiaan van Stijn
699174347c
daemon: use RWMutex for stateCounter
Use an RWMutex to allow concurrent reads of these counters

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-15 18:04:18 +01:00
Tianon Gravi
469749947b
Merge pull request #43181 from thaJeztah/image_service_debuglogs
daemon/images.NewImageService() don't print debug logs
2022-02-14 09:45:40 -08:00
Samuel Karp
0d9a37d0c2
oci: inheritable capability set should be empty
The Linux kernel never sets the Inheritable capability flag to anything
other than empty.  Moby should have the same behavior, and leave it to
userspace code within the container to set a non-empty value if desired.

Reported-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Samuel Karp <skarp@amazon.com>
2022-02-08 14:33:44 -08:00
Sebastiaan van Stijn
b88f4e2604
daemon/logger/awslogs: suppress false positive on hardcoded creds (gosec)
daemon/logger/awslogs/cloudwatchlogs.go:42:2: G101: Potential hardcoded credentials (gosec)
        credentialsEndpointKey = "awslogs-credentials-endpoint"
        ^
    daemon/logger/awslogs/cloudwatchlogs.go:67:2: G101: Potential hardcoded credentials (gosec)
        credentialsEndpoint = "http://169.254.170.2"
        ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-08 09:43:22 +01:00
Sebastiaan van Stijn
f9fb5d4f25
daemon/graphdriver/fuse-overlayfs: Init(): fix directory permissions (staticcheck)
daemon/graphdriver/fuse-overlayfs/fuseoverlayfs.go:101:63: SA9002: file mode '700' evaluates to 01274; did you mean '0700'? (staticcheck)
        if err := idtools.MkdirAllAndChown(path.Join(home, linkDir), 700, currentID); err != nil {
                                                                     ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-28 12:50:30 +01:00
Akihiro Suda
65b8bcc321
Merge pull request #43174 from thaJeztah/move_platformcheck
distribution: remove RootFSDownloadManager interface, and remove "os" argument from Download()
2022-01-26 14:08:44 +09:00
Sebastiaan van Stijn
b36d896fce
layer: remove OS from layerstore
This was added in commits fc21bf280b and
0380fbff37 in support of LCOW, but was
now always set to runtime.GOOS.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-25 15:23:23 +01:00
Sebastiaan van Stijn
da277f891a
daemon.cleanupContainer() remove named return variable
It only made the code more difficult to read, adding cognitive overload.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-25 15:23:22 +01:00
Sebastiaan van Stijn
cae1dbee01
ImageService.ReleaseLayer(): remove unused containerOS argument
This looks to be a leftover from LCOW.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-25 15:23:20 +01:00
Sebastiaan van Stijn
e30a4a438b
daemon: remove leftover LCOW platform checks
This removes some of the checks that were added in 0cba7740d4,
but should no longer be needed.

- `Daemon.create()`: fix the error message, which assumed it could only occur on Windows.
- `Daemon.cleanupContainer()`: no need to validate container platform to delete it.
- `Daemon.containerExport`: if a container was created, we should be able to
  export it; no need to validate.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-25 15:23:18 +01:00
Sebastiaan van Stijn
b2ef2e8c83
daemon/images: remove leftover LCOW platform checks
This removes some of the checks that were added in 0cba7740d4,
but should no longer be needed.

- `ImageService.ImageDelete()`: no need to validate image platform to delete it.
- `ImageService.ImageHistory()`: no need to validate image platform to show its
  history; if it made it into the local image cache, it should be valid.
- `ImageService.ImportImage()`: `dockerfile.BuildFromConfig()` is used for
  `docker (container) commmit` and `docker (image) import`. For `docker import`,
   it's more transparent to perform validation early.
- `ImageService.LookupImage()`: no need to validate image platform to inspect it;
  if it made it into the local image cache, it should be valid.
- `ImageService.SquashImage()`: same. This code was actually broken, because it
  wrapped an `err` that was always `nil`, so would never return an error.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-25 12:15:50 +01:00
Sebastiaan van Stijn
f5db4b01c0
daemon/images: ImageService.LookupImage(): minor cleanup
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-24 18:45:49 +01:00
Sebastiaan van Stijn
e1ea911aba
layer: remove unused error return from .Size() and .DiffSize()
None of the implementations used return an error, so removing the error
return can simplify using these.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-24 18:45:47 +01:00
Sebastiaan van Stijn
01ae9525dd
Add support for platform (os and architecture) on image import
Commit 0380fbff37 added the ability to pass a
--platform flag on `docker import` when importing an archive. The intent
of that commit was to allow importing a Linux rootfs on a Windows daemon
(as part of the experimental LCOW feature).

A later commit (337ba71fc1) changed some
of this code to take both OS and Architecture into account (for `docker build`
and `docker pull`), but did not yet update the `docker image import`.

This patch updates the import endpoitn to allow passing both OS and
Architecture. Note that currently only matching OSes are accepted,
and an error will be produced when (e.g.) specifying `linux` on Windows
and vice-versa.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-24 18:24:51 +01:00
Sebastiaan van Stijn
5c870b421a
daemon/images.NewImageService() don't print debug logs
These logs were meant to be logged when starting the daemon. Moving the logs
to the daemon startup code (which also prints similar messages) instead of
having the images service log them.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-24 15:55:45 +01:00
Sebastiaan van Stijn
0b0a995d9d
distribution: remove RootFSDownloadManager interface
This interface only had a single implementation (xfer.LayerDownloadManager),
and all places where it was used already imported the xfer package.
Removing the interface, also makes it a closer match to the "upload" part,
as `xfer.LayerUploadManager()` did not use an interface.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-21 13:53:36 +01:00
Sebastiaan van Stijn
bf051447b9
Merge pull request #43139 from samuelkarp/awslogs-tests
awslogs: replace channel-based mocks
2022-01-13 15:31:15 +01:00
Sebastiaan van Stijn
483aa6294b
daemon: load and cache sysInfo on initialization
The `daemon.RawSysInfo()` function can be a heavy operation, as it collects
information about all cgroups on the host, networking, AppArmor, Seccomp, etc.

While looking at our code, I noticed that various parts in the code call this
function, potentially even _multiple times_ per container, for example, it is
called from:

- `verifyPlatformContainerSettings()`
- `oci.WithCgroups()` if the daemon has `cpu-rt-period` or `cpu-rt-runtime` configured
- in `ContainerDecoder.DecodeConfig()`, which is called on boith `container create` and `container commit`

Given that this information is not expected to change during the daemon's
lifecycle, and various information coming from this (such as seccomp and
apparmor status) was already cached, we may as well load it once, and cache
the results in the daemon instance.

This patch updates `daemon.RawSysInfo()` to use a `sync.Once()` so that
it's only executed once for the daemon's lifecycle.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-12 18:28:15 +01:00
Brian Goff
f045d0de94
Merge pull request #43105 from kzys/follow-struct
daemon/logger: refactor followLogs and replace flaky TestFollowLogsHandleDecodeErr
2022-01-11 16:57:02 -08:00
Samuel Karp
71119a5649
awslogs: use gotest.tools/v3/assert more
Signed-off-by: Samuel Karp <skarp@amazon.com>
2022-01-10 21:18:11 -08:00
Samuel Karp
f0e450992c
awslogs: replace channel-based mocks
Signed-off-by: Samuel Karp <skarp@amazon.com>
2022-01-10 18:42:11 -08:00
Sebastiaan van Stijn
c741ab0efa
daemon: remove daemon/discovery as it's now unused
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-06 18:28:24 +01:00
Sebastiaan van Stijn
9492354782
daemon: remove daemon.discoveryWatcher
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-06 18:28:22 +01:00
Sebastiaan van Stijn
f28fc8bc8d
daemon: remove discovery inits
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-06 18:28:21 +01:00
Sebastiaan van Stijn
ff2a5301b8
daemon: remove discovery-related config handling
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-06 18:28:17 +01:00
Sebastiaan van Stijn
702cb7fe14
daemon: remove discovery related tests
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-06 18:28:10 +01:00
Sebastiaan van Stijn
6faceabbb3
Merge pull request #43053 from thaJeztah/use_containerd_oci_devices
daemon.WithDevices(): use containerd's HostDevices()
2022-01-05 20:57:45 +01:00
Brian Goff
520dfc36f9
Merge pull request #43100 from conorevans/conorevans/update-fluent
vendor: github.com/fluent/fluent-logger-golang v1.9.0
2022-01-05 11:46:11 -08:00
Tobias Klauser
cfd26afabe
Use syscall.Timespec.Unix
Use the syscall method instead of repeating the type conversions for
the syscall.Stat_t Atim/Mtim members. This also allows to drop the
//nolint: unconvert comments.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2022-01-03 16:51:02 +01:00
Kazuyoshi Kato
c91e09bee2 daemon/logger: replace flaky TestFollowLogsHandleDecodeErr
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-12-27 09:11:46 -08:00
Kazuyoshi Kato
7a10f5a558 daemon/logger: refactor followLogs to write more unit tests
followLogs() is getting really long (170+ lines) and complex.
The function has multiple inner functions that mutate its variables.

To refactor the function, this change introduces follow{} struct.
The inner functions are now defined as ordinal methods, which are
accessible from tests.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-12-27 09:11:12 -08:00
Conor Evans
5cbc08ce57
The flag ForceStopAsyncSend was added to fluent logger lib in v1.9.0
* When async is enabled, this option defines the interval (ms) at which the connection
to the fluentd-address is re-established. This option is useful if the address
may resolve to one or more IP addresses, e.g. a Consul service address.

While the change in #42979 resolves the issue where a Docker container can be stuck
if the fluentd-address is unavailable, this functionality adds an additional benefit
in that a new and healthy fluentd-address can be resolved, allowing logs to flow once again.

This adds a `fluentd-async-reconnect-interval` log-opt for the fluentd logging driver.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Conor Evans <coevans@tcd.ie>

Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Conor Evans <coevans@tcd.ie>
2021-12-24 22:04:08 +01:00
Kazuyoshi Kato
8b4c445f54 test: use os.CreateTemp instead of ioutil.TempFile
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-12-23 09:09:47 -08:00
Kazuyoshi Kato
f2e458ebc5 daemon/logger: test followLogs' handleDecodeErr case
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-12-15 15:13:02 -08:00
Kazuyoshi Kato
48d387a757 daemon/logger: read the length header correctly
Before this change, if Decode() couldn't read a log record fully,
the subsequent invocation of Decode() would read the record's non-header part
as a header and cause a huge heap allocation.

This change prevents such a case by having the intermediate buffer in
the decoder struct.

Fixes #42125.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-12-15 15:13:02 -08:00
Sebastiaan van Stijn
f6848ae321
Merge pull request #42979 from akerouanton/bump-fluent-logger
vendor: github.com/fluent/fluent-logger-golang v1.8.0
2021-12-02 20:51:04 +01:00
Sebastiaan van Stijn
787b8fe14f
Merge pull request #42838 from sanjams2/42731-development
Add an option to specify log format for awslogs driver
2021-12-02 20:48:06 +01:00
Albin Kerouanton
bd61629b6b
fluentd: Turn ForceStopAsyncSend true when async connect is used
The flag ForceStopAsyncSend was added to fluent logger lib in v1.5.0 (at
this time named AsyncStop) to tell fluentd to abort sending logs
asynchronously as soon as possible, when its Close() method is called.
However this flag was broken because of the way the lib was handling it
(basically, the lib could be stucked in retry-connect loop without
checking this flag).

Since fluent logger lib v1.7.0, calling Close() (when ForceStopAsyncSend
is true) will really stop all ongoing send/connect procedure,
wherever it's stucked.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2021-12-02 01:15:28 +01:00
Sebastiaan van Stijn
9d9b8e0cf3
daemon.WithDevices(): use containerd's HostDevices()
Trying to reduce the use of libcontainer/devices, as it's considered
to be an "internal" package by runc.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-12-01 15:42:18 +01:00
Akihiro Suda
40ccedd61b
Merge pull request #42785 from sanchayanghosh/42753-fix-host.internal
Fixed docker.internal.gateway not displaying properly on live restore
2021-11-16 13:26:20 +09:00
Akihiro Suda
d116e12c6d
Merge pull request #42726 from thaJeztah/daemon_simplify_nwconfig
daemon: simplify networking config
2021-11-12 01:19:07 +09:00
Tianon Gravi
65cc84abc5
Merge pull request #42152 from AkihiroSuda/fix-rootless-info-42151
info: unset cgroup-related fields when CgroupDriver == none
2021-11-08 14:45:11 -08:00
sanchayanghosh
894230b82d
Fixed docker.internal.gateway not displaying properly on live restore
Also includes review suggestions in daemon.initNetworkController():

- update godoc for setHostGatewayIP()
- change setHostGatewayIP() to get config, instead of daemon
- remove redundant nil check for controller

Signed-off-by: sanchayanghosh <sanchayanghosh@outlook.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-27 12:44:56 +02:00
Sebastiaan van Stijn
76016b846d
daemon: make sure proxy settings are sanitized when printing
The daemon can print the proxy configuration as part of error-messages,
and when reloading the daemon configuration (SIGHUP). Make sure that
the configuration is sanitized before printing.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-27 12:39:02 +02:00
Anca Iordache
427c7cc5f8
Add http(s) proxy properties to daemon configuration
This allows configuring the daemon's proxy server through the daemon.json con-
figuration file or command-line flags configuration file, in addition to the
existing option (through environment variables).

Configuring environment variables on Windows to configure a service is more
complicated than on Linux, and adding alternatives for this to the daemon con-
figuration makes the configuration more transparent and easier to use.

The configuration as set through command-line flags or through the daemon.json
configuration file takes precedence over env-vars in the daemon's environment,
which allows the daemon to use a different proxy. If both command-line flags
and a daemon.json configuration option is set, an error is produced when starting
the daemon.

Note that this configuration is not "live reloadable" due to Golang's use of
`sync.Once()` for proxy configuration, which means that changing the proxy
configuration requires a restart of the daemon (reload / SIGHUP will not update
the configuration.

With this patch:

    cat /etc/docker/daemon.json
    {
        "http-proxy": "http://proxytest.example.com:80",
        "https-proxy": "https://proxytest.example.com:443"
    }

    docker pull busybox
    Using default tag: latest
    Error response from daemon: Get "https://registry-1.docker.io/v2/": proxyconnect tcp: dial tcp: lookup proxytest.example.com on 127.0.0.11:53: no such host

    docker build .
    Sending build context to Docker daemon  89.28MB
    Step 1/3 : FROM golang:1.16-alpine AS base
    Get "https://registry-1.docker.io/v2/": proxyconnect tcp: dial tcp: lookup proxytest.example.com on 127.0.0.11:53: no such host

Integration tests were added to test the behavior:

- verify that the configuration through all means are used (env-var,
  command-line flags, damon.json), and used in the expected order of
  preference.
- verify that conflicting options produce an error.

Signed-off-by: Anca Iordache <anca.iordache@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-27 12:38:59 +02:00
Sebastiaan van Stijn
a6ce7eff65
daemon: move maskCredentials to config package
This allows the utility to be used in other places.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-27 12:38:56 +02:00
Sebastiaan van Stijn
0c887404a8
daemon: fix TestVerifyPlatformContainerResources not capturing variable
This test runs with t.Parallel() _and_ uses subtests, but didn't capture
the `tc` variable, which potentialy (likely) makes it test the same testcase
multiple times.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-20 09:57:54 +02:00
Sebastiaan van Stijn
a80c450fb3
Merge pull request #41935 from alexisries/Issue-41871-Restore-healthcheck-at-dockerd-restart
Resume healthcheck when daemon restarts
2021-10-15 12:46:43 +02:00
Sebastiaan van Stijn
ba16293330
Merge pull request #42907 from thaJeztah/master_forward_port_security_fixes
[master] forward-port security fixes from 20.10.9
2021-10-14 20:43:01 +02:00
James Sanders
68e3034322 Add an option to specify log format for awslogs driver
Added an option 'awslogs-format' to allow specifying
a log format for the logs sent CloudWatch from the aws log driver.
For now, only the 'json/emf' format is supported.
If no option is provided, the log format header in the
request to CloudWatch will be omitted as before.

Signed-off-by: James Sanders <james3sanders@gmail.com>
2021-10-13 07:38:54 -07:00
Alexis Ries
9f39889dee Fixes #41871: Update daemon/daemon.go: resume healthcheck on restore
Call updateHealthMonitor for alive non-paused containers

Signed-off-by: Alexis Ries <alexis.ries.ext@orange.com>
2021-10-07 21:23:27 +02:00
Tianon Gravi
1430d849a4
Merge pull request #42878 from thaJeztah/daemon_check_cd_once
daemon.UsingSystemd(): don't call getCD() multiple times
2021-10-06 17:28:35 -07:00
Brian Goff
03f1c3d78f
Lock down docker root dir perms.
Do not use 0701 perms.
0701 dir perms allows anyone to traverse the docker dir.
It happens to allow any user to execute, as an example, suid binaries
from image rootfs dirs because it allows traversal AND critically
container users need to be able to do execute things.

0701 on lower directories also happens to allow any user to modify
     things in, for instance, the overlay upper dir which neccessarily
     has 0755 permissions.

This changes to use 0710 which allows users in the group to traverse.
In userns mode the UID owner is (real) root and the GID is the remapped
root's GID.

This prevents anyone but the remapped root to traverse our directories
(which is required for userns with runc).

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit ef7237442147441a7cadcda0600be1186d81ac73)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 93ac040bf0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-05 09:57:00 +02:00
Akihiro Suda
30413e5efb
Merge pull request #42736 from thaJeztah/cap_net_raw_usens_detection
daemon.WithCommonOptions() fix detection of user-namespaces
2021-09-24 22:40:59 +09:00
Sebastiaan van Stijn
3ce1dcc25d
daemon.UsingSystemd(): don't call getCD() multiple times
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-09-24 13:51:39 +02:00
Brian Goff
7ccf750daa Allow switching Windows runtimes.
This adds support for 2 runtimes on Windows, one that uses the built-in
HCSv1 integration and another which uses containerd with the runhcs
shim.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-09-23 17:44:04 +00:00
Sebastiaan van Stijn
a826ca3aef
daemon.WithCommonOptions() fix detection of user-namespaces
Commit dae652e2e5 added support for non-privileged
containers to use ICMP_PROTO (used for `ping`). This option cannot be set for
containers that have user-namespaces enabled.

However, the detection looks to be incorrect; HostConfig.UsernsMode was added
in 6993e891d1 / ee2183881b,
and the property only has meaning if the daemon is running with user namespaces
enabled. In other situations, the property has no meaning.
As a result of the above, the sysctl would only be set for containers running
with UsernsMode=host on a daemon running with user-namespaces enabled.

This patch adds a check if the daemon has user-namespaces enabled (RemappedRoot
having a non-empty value), or if the daemon is running inside a user namespace
(e.g. rootless mode) to fix the detection.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-30 19:48:29 +02:00
Akihiro Suda
0cd1bd42b4
Merge pull request #42770 from thaJeztah/eventtype_enums
api/types/events: add "Type" type for event-type enum
2021-08-28 00:23:56 +09:00
Akihiro Suda
9e7bbdb9ba
Merge pull request #40084 from thaJeztah/hostconfig_const_cleanup
api/types: hostconfig: add some constants/enums and minor code cleanup
2021-08-28 00:21:31 +09:00
Eng Zer Jun
c55a4ac779
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated in Go 1.16. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-08-27 14:56:57 +08:00
Sebastiaan van Stijn
686be57d0a
Update to Go 1.17.0, and gofmt with Go 1.17
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-24 23:33:27 +02:00
Sebastiaan van Stijn
8207c05cfc
Merge pull request #41479 from olljanat/ci-win-containerd-support
Windows CI: Add support for testing with containerd
2021-08-24 22:29:14 +02:00
Sebastiaan van Stijn
247f4796d2
api/types/events: add "Type" type for event-type enum
Currently just an alias for string, but we can change it to be  an
actual type.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-23 21:14:55 +02:00
Sebastiaan van Stijn
b585c64e2b
info: remove "expected" check for tini version
These checks were added when we required a specific version of containerd
and runc (different versions were known to be incompatible). I don't think
we had a similar requirement for tini, so this check was redundant. Let's
remove the check altogether.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-23 13:25:14 +02:00
Olli Janatuinen
1285c6d125 Windows CI: Add support for testing with containerd
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2021-08-17 07:09:40 -07:00
Brian Goff
7681a3eb40
Merge pull request #42481 from thaJeztah/seccomp_unconfined_daemon 2021-08-11 10:02:05 -07:00
Sebastiaan van Stijn
343665850e
Merge pull request #42201 from AkihiroSuda/annotate-btrfs-error
rootless: btrfs: annotate error with human-readable hint string
2021-08-11 16:12:59 +02:00
Roman Volosatovs
135cec5d4d
daemon,volume: share disk usage computations
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
2021-08-09 19:59:39 +02:00
Roman Volosatovs
5adc29ffe2
daemon: sort imports according to gofmt
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
2021-08-09 19:59:37 +02:00
Sebastiaan van Stijn
e8e278c44f
daemon: simplify networking config
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-09 11:15:49 +02:00
Sebastiaan van Stijn
27aaadb710
daemon: normalize seccomp profile as part of setupSeccompProfile()
This makes sure that the value set in the daemon can be used as-is,
without having to replicate the normalization logic elsewhere.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-07 15:41:46 +02:00
Sebastiaan van Stijn
04f932ac86
daemon: move custom seccomp profile warning from CLI to daemon side
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-07 15:41:44 +02:00
Sebastiaan van Stijn
f8795ed364
daemon: allow "builtin" as valid value for seccomp profiles
This allows containers to use the embedded default profile if a different
default is set (e.g. "unconfined") in the daemon configuration. Without this
option, users would have to copy the default profile to a file in order to
use the default.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-07 15:40:47 +02:00
Sebastiaan van Stijn
68e96f88ee
Fix daemon.json and daemon --seccomp-profile not accepting "unconfined"
Commit b237189e6c implemented an option to
set the default seccomp profile in the daemon configuration. When that PR
was reviewed, it was discussed to have the option accept the path to a custom
profile JSON file; https://github.com/moby/moby/pull/26276#issuecomment-253546966

However, in the implementation, the special "unconfined" value was not taken into
account. The "unconfined" value is meant to disable seccomp (more factually:
run with an empty profile).

While it's likely possible to achieve this by creating a file with an an empty
(`{}`) profile, and passing the path to that file, it's inconsistent with the
`--security-opt seccomp=unconfined` option on `docker run` and `docker create`,
which is both confusing, and makes it harder to use (especially on Docker Desktop,
where there's no direct access to the VM's filesystem).

This patch adds the missing check for the special "unconfined" value.

Co-authored-by: Tianon Gravi <admwiggin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-07 15:40:45 +02:00
Sebastiaan van Stijn
ac449d6b5a
daemon/config: rename the default seccomp profile to "builtin"
Using "default" as a name is a bit ambiguous, because the _daemon_ default
can be changed using the '--seccomp-profile' daemon flag.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-07 15:37:03 +02:00
Sebastiaan van Stijn
ee02257553
Add const for "unconfined" and default seccomp profiles
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-07 15:36:06 +02:00
Sebastiaan van Stijn
6948ab4fa1
api/types: hostconfig: fix LogMode enum
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-06 19:05:58 +02:00
Sebastiaan van Stijn
09cf117b31
api/types: hostconfig: create enum for CgroupnsMode
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-06 19:05:54 +02:00
Sebastiaan van Stijn
98f0f0dd87
api/types: hostconfig: define consts for IpcMode
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-06 19:05:51 +02:00
Sebastiaan van Stijn
5e498e20f7
Merge pull request #42710 from rvolosatovs/parallelize_system_df
daemon: paralellize disk usage computations
2021-08-06 09:55:51 +02:00
Roman Volosatovs
a18cf3e4ef
daemon: paralellize disk usage computations
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
2021-08-05 14:42:31 +02:00
Sebastiaan van Stijn
0c88b0dc82
Merge pull request #42618 from thaJeztah/remove_common_unix_config
daemon/config: remove commonUnixBridgeConfig and CommonUnixConfig
2021-08-03 16:52:10 +02:00
Sebastiaan van Stijn
656a5e2bdf
Merge pull request #42559 from rvolosatovs/system_df_types
Add `type` parameter to `/system/df`
2021-08-02 21:03:05 +02:00
Brian Goff
51b06c6795
Merge pull request #42683 from thaJeztah/remove_lcow_step6
Remove LCOW (step 6)
2021-07-29 11:34:29 -07:00
Brian Goff
ad268e79c4
Merge pull request #42193 from lzhfromustc/3_23
discovery & test: Fix goroutine leaks by adding 1 buffer to channel
2021-07-28 15:25:37 -07:00
Sebastiaan van Stijn
0c84c322ae
daemon, oci: remove LCOW bits
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-27 13:35:59 +02:00
Roman Volosatovs
47ad2f3dd6
API,daemon: support type URL parameter to /system/df
Let clients choose object types to compute disk usage of.

Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-27 12:17:45 +02:00
Akihiro Suda
3deac5dc85
btrfs: annotate error with human-readable hint string
Add hints for "Failed to destroy btrfs snapshot <DIR> for <ID>: operation not permitted" on rootless

Related to issue 41762

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-27 15:45:02 +09:00
Brian Goff
12f1b3ce43
Merge pull request #42616 from thaJeztah/migrate_pkg_signal
replace pkg/signal with moby/sys/signal v0.5.0
2021-07-26 10:47:28 -07:00
Brian Goff
9674540ccf
Merge pull request #42520 from thaJeztah/remove_lcow_step5_alternative
Remove LCOW (step 5): volumes/mounts: remove LCOW code (alternative)
2021-07-26 10:24:52 -07:00
yufeifly
17f39dcb4d fix a typo
Signed-off-by: yufeifly <yufei.xiong@qq.com>
2021-07-25 00:33:59 +08:00
Sebastiaan van Stijn
28409ca6c7
replace pkg/signal with moby/sys/signal v0.5.0
This code was moved to the moby/sys repository

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-23 09:32:54 +02:00
Sebastiaan van Stijn
d5dbbb5369
storage-driver: promote overlay2, make Btrfs and ZFS opt-in
The daemon uses a priority list to automatically select the best-matching storage
driver for the backing filesystem that is used.

Historically, overlay2 was not supported on Btrfs and ZFS, and the daemon would
automatically pick the `btrfs` or `zfs` storage driver if that was the Backing
File System.

Commits 649e4c8889 and e226aea280
improved our detection to check if overlay2 was supported on the backing file-
system, allowing overlay2 to be used on top of Btrfs or ZFS,  but did not change
the priority list.

While both Btrfs and ZFS have advantages for certain use-cases, and provide
advanced features that are not available to overlay2, they also are known
to require more "handholding", and are generally considered to be mostly
useful for "advanced" users.

This patch changes the storage-driver priority list, to prefer overlay2 (if
supported by the backing filesystem), and effectively makes btrfs and zfs
opt-in storage drivers.

This change does not affect existing installations; the daemon will detect
the storage driver that was previously in use (based on the presence of
storage directories in `/var/lib/docker`).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-21 14:53:56 +02:00
Brian Goff
9a6ff685a8
Merge pull request #42641 from thaJeztah/make_signal_selfcontained 2021-07-19 14:46:15 -07:00
Sebastiaan van Stijn
627bbd3fa4
Merge pull request #42132 from xia-wu/add-create-log-stream
Add an option to skip create log stream for awslogs driver
2021-07-19 16:42:36 +02:00
Justin Cormack
b337c70bdc
Merge pull request #42639 from thaJeztah/system_info_clean
pkg/sysinfo: assorted cleanup/refactoring for handling warnings and logging
2021-07-19 15:17:07 +01:00
Justin Cormack
ab974f6b57
Merge pull request #42620 from thaJeztah/daemon_stats_literal
daemon: use object literal for stats
2021-07-19 15:14:41 +01:00
Sebastiaan van Stijn
ea5c94cdb9
pkg/signal: move signal.DumpStacks() to a separate package
It is not directly related to signal-handling, so can well live
in its own package.

Also added a variant that doesn't take a directory to write files
to, for easier consumption / better match to how it's used.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-15 18:09:43 +02:00
Sebastiaan van Stijn
9b795c3e50
pkg/sysinfo.New(), daemon.RawSysInfo(): remove "quiet" argument
The "quiet" argument was only used in a single place (at daemon startup), and
every other use had to pass "false" to prevent this function from logging
warnings.

Now that SysInfo contains the warnings that occurred when collecting the
system information, we can make leave it up to the caller to use those
warnings (and log them if wanted).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-14 23:10:07 +02:00
Roman Volosatovs
bf9c76f0a8
API, daemon/images: add ImageListOptions and pass context
This makes it easier to add more options to the backend without having to change
the signature.

While we're changing the signature, also adding a context.Context, which is not
currently used, but probably should be at some point.

Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
2021-07-13 13:45:24 +02:00
Sebastiaan van Stijn
115b37b8f7
daemon: use object literal for stats
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-11 14:16:13 +02:00
Sebastiaan van Stijn
0ff80c844d
daemon/config.New(): rewrite to be slightly more idiomatic
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-11 11:06:56 +02:00
Sebastiaan van Stijn
5588a78ab3
daemon/config: restrict "unix" code is linux
This code is not generically useful on "unix", and contains linux-
specific code, so make it only compile on linux.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-11 11:06:55 +02:00
Sebastiaan van Stijn
96f843ef30
daemon/config: move "common" tests
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-11 11:06:53 +02:00
Sebastiaan van Stijn
9d9679975f
daemon/config: remove CommonUnixConfig type
This type was added to support Solaris (which didn't support these
options). Solaris support was removed, so we can integrate this type
back into the "unix" type.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-11 11:06:51 +02:00
Sebastiaan van Stijn
defeab7387
daemon/config: remove commonUnixBridgeConfig
This type was added to support Solaris (which didn't support these
options). Solaris support was removed, so we can integrate this type
back into the "unix" type.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-11 11:06:49 +02:00
Sebastiaan van Stijn
a65f83317c
daemon/config: reorganize code between unix and windows files
Put variables and functions in the same owrder between both,
to allow for easier comparing between platforms.

Also synchronised some comments/godoc between both.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-11 11:06:42 +02:00
Sebastiaan van Stijn
ababae665d
Merge pull request #42550 from rvolosatovs/fix_image_shared_size
Fix SharedSize computation in `ImageService.Image` for filtered requests
2021-07-02 18:16:55 +02:00
Sebastiaan van Stijn
300c11c7c9
volume/mounts: remove "containerOS" argument from NewParser (LCOW code)
This changes mounts.NewParser() to create a parser for the current operatingsystem,
instead of one specific to a (possibly non-matching, in case of LCOW) OS.

With the OS-specific handling being removed, the "OS" parameter is also removed
from `daemon.verifyContainerSettings()`, and various other container-related
functions.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-02 13:51:55 +02:00
Roman Volosatovs
af3e5568fc
daemon/images: fix shared size computation for filtered requests
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
2021-07-02 11:46:25 +02:00
Adam Williams
a8d92be6e8 Use crypto/rand
Signed-off-by: Adam Williams <awilliams@mirantis.com>
2021-07-01 14:15:39 -07:00
Adam Williams
9f0e268b00 Fix use of unsafe ptr #42444
Signed-off-by: Adam Williams <awilliams@mirantis.com>
2021-07-01 12:24:33 -07:00
Roman Volosatovs
b308097ec3
daemon/images: refactor image listing
- Rename image summary constructor
    - Rename `newImage` into `newImageSummary`, since the returned type is
      `*types.ImageSummary`
- Rename variables for clarity
    - Rename `newImage` into `summary`, since the variable type is
      `*types.ImageSummary`
    - Rename `imagesMap` into `summaryMap`, since the value type
      contained is `*types.ImageSummary`
- Only compute `DiffSize` when more than 1 reference to the layer
  exists, since it is not used otherwise
- Move variable declarations closer to where they are used

Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-30 11:32:32 +02:00
Sebastiaan van Stijn
314759dc2f
Merge pull request #42393 from aiordache/daemon_config
Daemon config validation
2021-06-23 19:32:07 +02:00
Rich Horwood
8f80e55111 Add configuration validation option and tests.
Fixes #36911

If config file is invalid we'll exit anyhow, so this just prevents
the daemon from starting if the configuration is fine.

Mainly useful for making config changes and restarting the daemon
iff the config is valid.

Signed-off-by: Rich Horwood <rjhorwood@apple.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Anca Iordache <anca.iordache@docker.com>
2021-06-23 09:54:55 +00:00
Brian Goff
bf11970fd5
Merge pull request #42536 from thaJeztah/replace_deprecated_userns
replace uses of deprecated containerd/sys.RunningInUserNS()
2021-06-18 13:20:19 -07:00
Akihiro Suda
7729ebfa1b
Merge pull request #42432 from dperny/fix-ip-overlap 2021-06-19 01:07:27 +09:00
Sebastiaan van Stijn
472f21b923
replace uses of deprecated containerd/sys.RunningInUserNS()
This utility was moved to a separate package, which has no
dependencies.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-18 11:01:24 +02:00
Akihiro Suda
9e8cf1016e
Merge pull request #42473 from thaJeztah/unfork_buildkit
revendor BuildKit (master branch)
2021-06-17 10:56:25 +09:00
Sebastiaan van Stijn
2773f81aa5
Merge pull request #42445 from thaJeztah/bump_golang_ci
[testing] ~update~ fix linting issues found by golangci-lint v1.40.1
2021-06-16 22:15:01 +02:00
Sebastiaan van Stijn
3eb1257698
revendor BuildKit (master branch)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-16 01:17:48 +02:00
Tianon Gravi
a060328874
Merge pull request #42472 from thaJeztah/improve_rootless_option
daemon: improve handling of ROOTLESSKIT_PARENT_EUID
2021-06-11 13:03:31 -07:00
Justin Cormack
1ba54a5fd0
Merge pull request #42511 from thaJeztah/remove_lcow_step4
Remove LCOW (step 4): remove layerstore indexing by OS (used for LCOW)
2021-06-11 18:31:59 +01:00
Samuel Karp
17bf6211af
Merge pull request #42325 from thaJeztah/warn_on_non_matching_platform
docker pull: warn when pulled single-arch image does not match --platform
2021-06-10 16:53:50 -07:00
Sebastiaan van Stijn
dc7cbb9b33
remove layerstore indexing by OS (used for LCOW)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 17:49:11 +02:00
Sebastiaan van Stijn
e6dabfa977
graphdriver: temporarily ignore unsafeptr: possible misuse of reflect.SliceHeader
Probably needs a similar change as c208f03fbd,
but this code makes my head spin, so for now suppressing, and created a
tracking issue:

    daemon/graphdriver/graphtest/graphtest_unix.go:305:12: unsafeptr: possible misuse of reflect.SliceHeader (govet)
        header := *(*reflect.SliceHeader)(unsafe.Pointer(&buf))
                  ^
    daemon/graphdriver/graphtest/graphtest_unix.go:308:36: unsafeptr: possible misuse of reflect.SliceHeader (govet)
        data := *(*[]byte)(unsafe.Pointer(&header))
                                          ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:47 +02:00
Sebastiaan van Stijn
d61b7c1211
daemon: var-declaration: should omit type bool (revive)
daemon/list.go:556:18: var-declaration: should omit type bool from declaration of var shouldSkip; it will be inferred from the right-hand side (revive)
                shouldSkip    bool = true
                              ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:45 +02:00
Sebastiaan van Stijn
16ced7622b
daemon/config: error strings should not be capitalized
daemon/config/config_unix.go:92:21: error-strings: error strings should not be capitalized or end with punctuation or a newline (revive)
            return fmt.Errorf("Default cgroup namespace mode (%v) is invalid. Use \"host\" or \"private\".", cm) // nolint: golint
                              ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:43 +02:00
Sebastiaan van Stijn
bb17074119
reformat "nolint" comments
Unlike regular comments, nolint comments should not have a leading space.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:42 +02:00
Sebastiaan van Stijn
4004a39d53
daemon/splunk: ignore G402: TLS MinVersion too low for now
daemon/logger/splunk/splunk.go:173:16: G402: TLS MinVersion too low. (gosec)
    	tlsConfig := &tls.Config{}
    	              ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:38 +02:00
Sebastiaan van Stijn
09191c0936
daemon/stats: fix notRunningErr / notFoundErr detected as unused (false positive)
Also looks like a false positive, but given that these were basically
testing for the `errdefs.Conflict` and `errdefs.NotFound` interfaces, I
replaced these with those;

    daemon/stats/collector.go:154:6: type `notRunningErr` is unused (unused)
    type notRunningErr interface {
         ^
    daemon/stats/collector.go:159:6: type `notFoundErr` is unused (unused)
    type notFoundErr interface {
         ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:34 +02:00
Sebastiaan van Stijn
b4c0c7c076
G601: Implicit memory aliasing in for loop
daemon/cluster/executor/container/adapter.go:446:42: G601: Implicit memory aliasing in for loop. (gosec)
            req := c.container.volumeCreateRequest(&mount)
                                                   ^
    daemon/network.go:577:10: G601: Implicit memory aliasing in for loop. (gosec)
                np := &n
                      ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:31 +02:00
Sebastiaan van Stijn
d13997b4ba
gosec: G601: Implicit memory aliasing in for loop
plugin/v2/plugin.go:141:50: G601: Implicit memory aliasing in for loop. (gosec)
                    updateSettingsEnv(&p.PluginObj.Settings.Env, &s)
                                                                 ^
    libcontainerd/remote/client.go:572:13: G601: Implicit memory aliasing in for loop. (gosec)
                cpDesc = &m
                         ^
    distribution/push_v2.go:400:34: G601: Implicit memory aliasing in for loop. (gosec)
                (metadata.CheckV2MetadataHMAC(&mountCandidate, pd.hmacKey) ||
                                              ^
    builder/dockerfile/builder.go:261:84: G601: Implicit memory aliasing in for loop. (gosec)
            currentCommandIndex = printCommand(b.Stdout, currentCommandIndex, totalCommands, &meta)
                                                                                             ^
    builder/dockerfile/builder.go:278:46: G601: Implicit memory aliasing in for loop. (gosec)
            if err := initializeStage(dispatchRequest, &stage); err != nil {
                                                       ^
    daemon/container.go:283:40: G601: Implicit memory aliasing in for loop. (gosec)
            if err := parser.ValidateMountConfig(&cfg); err != nil {
                                                 ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:29 +02:00
Sebastiaan van Stijn
f7433d6190
staticcheck: SA4001: &*x will be simplified to x. It will not copy x
daemon/volumes_unix_test.go:228:13: SA4001: &*x will be simplified to x. It will not copy x. (staticcheck)
                mp:      &(*c.MountPoints["/jambolan"]), // copy the mountpoint, expect no changes
                         ^
    daemon/logger/local/local_test.go:214:22: SA4001: &*x will be simplified to x. It will not copy x. (staticcheck)
            dst.PLogMetaData = &(*src.PLogMetaData)
                               ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:25 +02:00
Sebastiaan van Stijn
d43bcc8974
daemon/logger/journald: fix linting errors
daemon/logger/journald/read.go:128:3 comment on exported function `CErr` should be of the form `CErr ...`

    daemon/logger/journald/read.go:131:36: unnecessary conversion (unconvert)
            return C.GoString(C.strerror(C.int(-ret)))
	                                  ^
    daemon/logger/journald/read.go:380:2: S1023: redundant `return` statement (gosimple)
        return
        ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 13:03:21 +02:00
Sebastiaan van Stijn
076d9c6037
daemon: remove graphdriver indexing by OS (used for LCOW)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 10:17:26 +02:00
Sebastiaan van Stijn
08ddbfbdac
libcontainerd: remove LCOW bits
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-09 22:05:10 +02:00
Drew Erny
0d9b0ed678 Fix possible overlapping IPs
A node is no longer using its load balancer IP address when it no longer
has tasks that use the network that requires that load balancer. When
this occurs, the swarmkit manager will free that IP in IPAM, and may
reaassign it.

When a task shuts down cleanly, it attempts removal of the networks it
uses, and if it is the last task using those networks, this removal
succeeds, and the load balancer IP is freed.

However, this behavior is absent if the container fails. Removal of the
networks is never attempted.

To address this issue, I amend the executor. Whenever a node load
balancer IP is removed or changed, that information is passedd to the
executor by way of the Configure method. By keeping track of the set of
node NetworkAttachments from the previous call to Configure, we can
determine which, if any, have been removed or changed.

At first, this seems to create a race, by which a task can be attempting
to start and the network is removed right out from under it. However,
this is already addressed in the controller. The controller will attempt
to recreate missing networks before starting a task.

Signed-off-by: Drew Erny <derny@mirantis.com>
2021-06-09 11:21:41 -06:00
Tianon Gravi
adb26d3fbe
Merge pull request #42488 from sparrc/log-fix-wait-timeout
Fix log statement 'failed to exit' timeout accuracy
2021-06-08 15:14:07 -07:00
Brian Goff
a7ea29a5a6
Merge pull request #42451 from thaJeztah/remove_lcow_step1
Remove LCOW code (step 1)
2021-06-08 13:41:45 -07:00
Cam
d15ce134ef
Fix log statement 'failed to exit' timeout accuracy
log statement should reflect how long it actually waited, not how long
it theoretically could wait based on the 'seconds' integer passed in.

Signed-off-by: Cam <gh@sparr.email>
2021-06-08 13:37:58 -07:00
Akihiro Suda
0ad2293d0e
Merge pull request #41656 from thaJeztah/unexport_things 2021-06-08 12:07:40 +09:00
Sebastiaan van Stijn
424c0eb3c0
docker pull: warn when pulled single-arch image does not match --platform
This takes the same approach as was implemented on `docker build`, where a warning
is printed if `FROM --platform=...` is used (added in 399695305c)

Before:

    docker rmi armhf/busybox
    docker pull --platform=linux/s390x armhf/busybox

    Using default tag: latest
    latest: Pulling from armhf/busybox
    d34a655120f5: Pull complete
    Digest: sha256:8e51389cdda2158935f2b231cd158790c33ae13288c3106909324b061d24d6d1
    Status: Downloaded newer image for armhf/busybox:latest
    docker.io/armhf/busybox:latest

With this change:

    docker rmi armhf/busybox
    docker pull --platform=linux/s390x armhf/busybox

    Using default tag: latest
    latest: Pulling from armhf/busybox
    d34a655120f5: Pull complete
    Digest: sha256:8e51389cdda2158935f2b231cd158790c33ae13288c3106909324b061d24d6d1
    Status: Downloaded newer image for armhf/busybox:latest
    WARNING: image with reference armhf/busybox was found but does not match the specified platform: wanted linux/s390x, actual: linux/arm64
    docker.io/armhf/busybox:latest

And daemon logs print:

   WARN[2021-04-26T11:19:37.153572667Z] ignoring platform mismatch on single-arch image  error="image with reference armhf/busybox was found but does not match the specified platform: wanted linux/s390x, actual: linux/arm64" image=armhf/busybox

When pulling without specifying `--platform, no warning is currently printed (but we can add a warning in future);

    docker rmi armhf/busybox
    docker pull armhf/busybox

    Using default tag: latest
    latest: Pulling from armhf/busybox
    d34a655120f5: Pull complete
    Digest: sha256:8e51389cdda2158935f2b231cd158790c33ae13288c3106909324b061d24d6d1
    Status: Downloaded newer image for armhf/busybox:latest
    docker.io/armhf/busybox:latest

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-07 23:08:45 +02:00
Sebastiaan van Stijn
aa4dce742f
daemon: improve handling of ROOTLESSKIT_PARENT_EUID
- daemon.WithRootless():  make sure ROOTLESSKIT_PARENT_EUID is valid int
- daemon.RawSysInfo(): minor simplification, and rename variable that
  clashed with imported package.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-05 21:12:32 +02:00
Sebastiaan van Stijn
e047d984dc
Remove LCOW code (step 1)
The LCOW implementation in dockerd has been deprecated in favor of re-implementation
in containerd (in progress). Microsoft started removing the LCOW V1 code from the
build dependencies we use in Microsoft/opengcs (soon to be part of Microsoft/hcshhim),
which means that we need to start removing this code.

This first step removes the lcow graphdriver, the LCOW initialization code, and
some LCOW-related utilities.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-03 21:16:21 +02:00
Brian Goff
a77317882d
Merge pull request #42262 from cpuguy83/move_libnetwork
Move libnetwork
2021-06-03 12:06:31 -07:00
Brian Goff
20eb137e0a
Merge pull request #42334 from AkihiroSuda/rootless-overlay2-k511-selinux
rootless: disable overlay2 if running with SELinux
2021-06-03 10:33:27 -07:00
Brian Goff
4b981436fe Fixup libnetwork lint errors
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 23:48:32 +00:00
Brian Goff
a0a473125b Fix libnetwork imports
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 21:51:23 +00:00
Sebastiaan van Stijn
bf07c06c63
daemon: move DefaultShimBinary, DefaultRuntimeBinary to config package
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-05-31 19:06:16 +02:00
Sebastiaan van Stijn
34b854f965
daemon: un-export ModifyRootKeyLimit()
it's only used internally, so no need to export

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-05-31 19:06:14 +02:00
Sebastiaan van Stijn
95d69658be
daemon: un-export VerifyCgroupDriver()
it's only used internally, so no need to export

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-05-31 19:06:12 +02:00
Sebastiaan van Stijn
a506630e57
daemon: use sync.Once for systemd detection
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-05-31 19:06:10 +02:00
Sebastiaan van Stijn
e7ba5cacc6
daemon: un-export IsRunningSystemd()
This utility was added after 19.03, and is only used in the daemon code
itself, so we can un-export it, until there's an external use for it.

Also updated the description, because the runc code already copied it
from coreos/go-systemd, so better to describe the actual source.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-05-31 19:06:07 +02:00
Brian Goff
a8a769f04f
Merge pull request #42291 from angelcar/awslogs-dont-log-messge-discarded-errors
Limit the rate at which logger errors are logged into daemon logs
2021-05-27 19:33:44 -07:00
Sebastiaan van Stijn
08096e3ee6
Merge pull request #42320 from anujva/fix_moby_ring_logger
#42316 Wait for `run` goroutine to exit before `Close`
2021-05-27 15:00:03 +02:00
Sebastiaan van Stijn
97303df921
Merge pull request #41588 from sparrc/kill-refactor
docker kill: fix bug where failed kills didnt fallback to unix kill
2021-05-25 14:47:58 +02:00
Angel Velazquez
fb5a9ec741
Limit the rate at which logger errors are logged into daemon logs
Logging to daemon logs every time there's an error with a log driver can be
problematic since daemon logs can grow rapidly, potentially exhausting disk
space.

Instead, it's preferable to limit the rate at which log driver errors are allowed
to be written. By default, this limit is 333 entries per second max.

Signed-off-by: Angel Velazquez <angelcar@amazon.com>
2021-05-24 16:41:38 -07:00
Brian Goff
e02bc91dcb
Merge pull request #42339 from awmirantis/allow-vhdx-as-data-root-windows
Add security privilege needed to write layers to VHDX
2021-05-20 14:19:52 -07:00
Muhammad Zohaib Aslam
56c88c94dd Added missing test cleanup for temporary directory
A temporary directory was created but not removed at the end of the test.
The missing remove directory call is added now.

Signed-off-by: Muhammad Zohaib Aslam <zohaibse011@gmail.com>
2021-05-01 15:39:50 +03:00
Anuj Varma
cf259eb8a0 Wait for run goroutine to exit before Close
The underlying Loggers Close() function can be called with the the
run() goroutine still writing to the driver. This is causing the
fluentd-golang-logger to panic cause it doesn't defensively check
for the closing of the channel before writing to it.
It relies on the docker daemon to keep the contract of not calling Log()
if Close() has already been called.

Contributions by: James Johnston <james.johnston@thumbtack.com>
                  Nathan Wong <nathanw@thumbtack.com>

Signed-off-by: Anuj Varma <anujvarma@thumbtack.com>
2021-04-30 17:23:32 -07:00
Sebastiaan van Stijn
dd3275c5f9
Merge pull request #42182 from thaJeztah/fix_exec_start_err_handling
Fix error-handling in `daemon.ContainerExecStart()` and `daemon.getExecConfig()`
2021-04-29 21:33:19 +02:00
Adam Williams
489f57b877 Add security privilege needed to write layers when windows VHDX used as docker data root
Signed-off-by: Adam Williams <awilliams@mirantis.com>
2021-04-29 10:41:19 -07:00
Akihiro Suda
4300a52606
rootless: disable overlay2 if running with SELinux
Kernel 5.11 introduced support for rootless overlayfs, but incompatible with SELinux.

On the other hand, fuse-overlayfs is compatible.

Close issue 42333

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-28 18:22:06 +09:00
Sebastiaan van Stijn
6d1eceb509
Fix panic in TestExecSetPlatformOpt, TestExecSetPlatformOptPrivileged
These tests would panic;

- in WithRLimits(), because HostConfig was not set;
  470ae8422f/daemon/oci_linux.go (L46-L47)
- in daemon.mergeUlimits(), because daemon.configStore was not set;
  470ae8422f/daemon/oci_linux.go (L1069)

This panic was not discovered because the current version of runc/libcontainer that we vendor
would not always return false for `apparmor.IsEnabled()` when running docker-in-docker or if
`apparmor_parser` is not found. Starting with v1.0.0-rc93 of libcontainer, this is no longer
the case (changed in bfb4ea1b1b)

This patch;

- changes the tests to initialize Daemon.configStore and Container.HostConfig
- Combines TestExecSetPlatformOpt and TestExecSetPlatformOptPrivileged into a new test
  (TestExecSetPlatformOptAppArmor)
- Runs the test both if AppArmor is enabled and if not (in which case it tests
  that the container's AppArmor profile is left empty).
- Adds a FIXME comment for a possible bug in execSetPlatformOpts, which currently
  prefers custom profiles over "privileged".

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-23 00:39:39 +02:00
Tianon Gravi
72fef53cec
Merge pull request #42270 from cpuguy83/bump_hcsshim
Bump hcsshim to get some fixes.
2021-04-20 14:42:29 -07:00
Brian Goff
225e046d9d Error string match: do not match command path
Whether or not the command path is in the error message is a an
implementation detail.
For example, on Windows the only reason this ever matched was because it
dumped the entire container config into the error message, but this had
nothing to do with the actual error.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-04-14 23:03:18 +00:00
Cam
e57a365ab1
docker kill: fix bug where failed kills didnt fallback to unix kill
1. fixes #41587
2. removes potential infinite Wait and goroutine leak at end of kill
function

fixes #41587

Signed-off-by: Cam <gh@sparr.email>
2021-04-14 15:43:44 -07:00
Brian Goff
6110ba3d7c
Merge pull request #41586 from sparrc/stop-refactor
Fix hung docker stop if stop signal and daemon.Kill both fail
2021-04-14 13:33:14 -07:00
Cam
8e362b75cb
docker daemon container stop refactor
this refactors the Stop command to fix a few issues and behaviors that
dont seem completely correct:

1. first it fixes a situation where stop could hang forever (#41579)
2. fixes a behavior where if sending the
stop signal failed, then the code directly sends a -9 signal. If that
fails, it returns without waiting for the process to exit or going
through the full docker kill codepath.
3. fixes a behavior where if sending the stop signal failed, then the
code sends a -9 signal. If that succeeds, then we still go through the
same stop waiting process, and may even go through the docker kill path
again, even though we've already sent a -9.
4. fixes a behavior where the code would wait the full 30 seconds after
sending a stop signal, even if we already know the stop signal failed.

fixes #41579

Signed-off-by: Cam <gh@sparr.email>
2021-04-13 09:53:00 -07:00
Michal Rostecki
1ec689c4c2 btrfs: Do not disable quota on cleanup
Before this change, cleanup of the btrfs driver (occuring on each daemon
shutdown) resulted in disabling quotas. It was done with an assumption
that quotas can be enabled or disabled on a subvolume level, which is
not true - enabling or disabling quota is always done on a filesystem
level.

That was leading to disabling quota on btrfs filesystems on each daemon
shutdown.

This change fixes that behavior and removes misleading `subvol` prefix
from functions and methods which set up quota (on a filesystem level).

Fixes: #34593
Fixes: 401c8d1767 ("Add disk quota support for btrfs")
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
2021-04-13 16:23:39 +01:00
Tibor Vass
68bec0fcf7
Merge pull request #42276 from thaJeztah/apparmor_detect_fix
Use containerd's apparmor package to detect if apparmor can be used
2021-04-09 16:09:54 -07:00
Sebastiaan van Stijn
be95eae6d2
Merge pull request #41999 from diakovliev/fix_update_sync
Fix for lack of synchronization in daemon/update.go
2021-04-09 00:27:43 +02:00
Sebastiaan van Stijn
2834f842ee
Use containerd's apparmor package to detect if apparmor can be used
The runc/libcontainer apparmor package on master no longer checks if apparmor_parser
is enabled, or if we are running docker-in-docker.

While those checks are not relevant to runc (as it doesn't load the profile), these
checks _are_ relevant to us (and containerd). So switching to use the containerd
apparmor package, which does include the needed checks.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-08 20:22:08 +02:00
Tianon Gravi
f76958f612
Merge pull request #42245 from thaJeztah/use_proper_domains
Use designated test domains (RFC2606) in tests
2021-04-05 09:44:18 -07:00
Sebastiaan van Stijn
1df3d5c1de
Merge pull request #42203 from AkihiroSuda/btrfs-allow-unprivileged
btrfs: Allow unprivileged user to delete subvolumes (kernel >= 4.18)
2021-04-05 16:35:12 +02:00
Sebastiaan van Stijn
97a5b797b6
Use designated test domains (RFC2606) in tests
Some tests were using domain names that were intended to be "fake", but are
actually registered domain names (such as domain.com, registry.com, mytest.com).

Even though we were not actually making connections to these domains, it's
better to use domains that are designated for testing/examples in RFC2606:
https://tools.ietf.org/html/rfc2606

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-02 14:06:27 +02:00
Tibor Vass
5b11047c25
Merge pull request #42188 from AkihiroSuda/fix-overlay2-naivediff
rootless: overlay2: fix "createDirWithOverlayOpaque(...) ... input/output error"
2021-04-01 05:03:24 -07:00
Akihiro Suda
248f98ef5e
rootless: bind mount: fix "operation not permitted"
The following was failing previously, because `getUnprivilegedMountFlags()` was not called:
```console
$ sudo mount -t tmpfs -o noexec none /tmp/foo
$ $ docker --context=rootless run -it --rm -v /tmp/foo:/mnt:ro alpine
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:520: container init caused: rootfs_linux.go:60: mounting "/tmp/foo" to rootfs at "/home/suda/.local/share/docker/overlay2/b8e7ea02f6ef51247f7f10c7fb26edbfb308d2af8a2c77915260408ed3b0a8ec/merged/mnt" caused: operation not permitted: unknown.
```

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-01 14:58:11 +09:00
Akihiro Suda
6322dfc217
archive: do not use overlayWhiteoutConverter for UserNS
overlay2 no longer sets `archive.OverlayWhiteoutFormat` when
running in UserNS, so we can remove the complicated logic in the
archive package.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-29 14:47:12 +09:00
Akihiro Suda
67aa418df2
overlay2: doesSupportNativeDiff: add fast path for userns
When running in userns, returns error (i.e. "use naive, not native")
immediately.

No substantial change to the logic.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-29 14:47:09 +09:00
Akihiro Suda
dd97134232
overlay2: call d.naiveDiff.ApplyDiff when useNaiveDiff==true
Previously, `d.naiveDiff.ApplyDiff` was not used even when
`useNaiveDiff()==true`

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-26 14:34:56 +09:00
Akihiro Suda
62b5194f62
btrfs: Allow unprivileged user to delete subvolumes (kernel >= 4.18)
Fix issue 41762

Cherry-pick "drivers: btrfs: Allow unprivileged user to delete subvolumes" from containers/storage
831e32b6bd

> In btrfs, subvolume can be deleted by IOC_SNAP_DESTROY ioctl but there
> is one catch: unprivileged IOC_SNAP_DESTROY call is restricted by default.
>
> This is because IOC_SNAP_DESTROY only performs permission checks on
> the top directory(subvolume) and unprivileged user might delete dirs/files
> which cannot be deleted otherwise. This restriction can be relaxed if
> user_subvol_rm_allowed mount option is used.
>
> Although the above ioctl had been the only way to delete a subvolume,
> btrfs now allows deletion of subvolume just like regular directory
> (i.e. rmdir sycall) since kernel 4.18.
>
> So if we fail to cleanup subvolume in subvolDelete(), just fallback to
> system.EnsureRmoveall() to try to cleanup subvolumes again.
> (Note: quota needs privilege, so if quota is enabled we do not fallback)
>
> This fix will allow non-privileged container works with btrfs backend.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-26 14:30:40 +09:00
lzhfromustc
5ffcd162b5 discovery & test: Fix goroutine leaks by adding 1 buffer to channel
Signed-off-by: Ziheng Liu <lzhfromustc@gmail.com>
2021-03-24 10:32:39 -04:00
Sebastiaan van Stijn
a432eb4b3a
ContainerExecStart(): don't wrap getExecConfig() errors, and prevent panic
daemon.getExecConfig() already returns typed errors; by wrapping those errors
we may loose the actual reason for failures. Changing the error-type was
originally added in 2d43d93410, but I think
it was not intentional to ignore already-typed errors. It was later refactored
in a793564b25, which added helper functions
to create these errors, but kept the same behavior.

Also adds error-handling to prevent a panic in situations where (although
unlikely) `daemon.containers.Get()` would not return a container.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-22 13:37:05 +01:00
Sebastiaan van Stijn
6eb5720233
Fix daemon.getExecConfig(): not using typed errNotRunning() error
This makes daemon.getExecConfig return a errdefs.Conflict() error if the
container is not running.

This was originally the case, but a refactor of this code changed the typed
error (`derr.ErrorCodeContainerNotRunning`) to a non-typed error;
a793564b25

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-22 13:37:03 +01:00
Brian Goff
788f2883d2
Merge pull request #42104 from cpuguy83/41820_fix_json_unexpected_eof 2021-03-18 14:18:11 -07:00
Brian Goff
a84d824c5f
Merge pull request #42068 from AkihiroSuda/ovl-k511 2021-03-18 11:54:01 -07:00
Brian Goff
5a664dc87d jsonfile: more defensive reader implementation
Tonis mentioned that we can run into issues if there is more error
handling added here. This adds a custom reader implementation which is
like io.MultiReader except it does not cache EOF's.
What got us into trouble in the first place is `io.MultiReader` will
always return EOF once it has received an EOF, however the error
handling that we are going for is to recover from an EOF because the
underlying file is a file which can have more data added to it after
EOF.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-03-18 18:44:46 +00:00
Brian Goff
ece4cd4c4d
Merge pull request #41757 from thaJeztah/carry_39371_remove_more_v1_code 2021-03-18 11:38:07 -07:00
Akihiro Suda
039e9670cb
info: unset cgroup-related fields when CgroupDriver == none
Fix issue 42151

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-16 16:17:22 +09:00
Brian Goff
4be98a38e7 Fix handling for json-file io.UnexpectedEOF
When the multireader hits EOF, we will always get EOF from it, so we
cannot store the multrireader fro later error handling, only for the
decoder.

Thanks @tobiasstadler for pointing this error out.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-03-11 20:01:03 +00:00
Akihiro Suda
a8008f7313
overlayutils/userxattr.go: add "fast path" for kernel >= 5.11.0
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-11 15:18:59 +09:00
Akihiro Suda
11ef8d3ba9
overlay2: support "userxattr" option (kernel 5.11)
The "userxattr" option is needed for mounting overlayfs inside a user namespace with kernel >= 5.11.

The "userxattr" option is NOT needed for the initial user namespace (aka "the host").

Also, Ubuntu (since circa 2015) and Debian (since 10) with kernel < 5.11 can mount the overlayfs in a user namespace without the "userxattr" option.

The corresponding kernel commit: 2d2f2d7322ff43e0fe92bf8cccdc0b09449bf2e1
> **ovl: user xattr**
>
> Optionally allow using "user.overlay." namespace instead of "trusted.overlay."
> ...
> Disable redirect_dir and metacopy options, because these would allow privilege escalation through direct manipulation of the
> "user.overlay.redirect" or "user.overlay.metacopy" xattrs.

Fix issue 42055

Related to containerd/containerd PR 5076

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-11 15:12:41 +09:00
Xia Wu
d10046f228 Add an option to skip create log stream for awslogs driver
Added an option `awslogs-create-stream` to allow skipping log stream
creation for awslogs log driver. The default value is still true to
keep the behavior be consistent with before.

Signed-off-by: Xia Wu <xwumzn@amazon.com>
2021-03-09 15:49:43 -08:00
Sebastiaan van Stijn
328de0b8d9
Update documentation links
- Using "/go/" redirects for some topics, which allows us to
  redirect to new locations if topics are moved around in the
  documentation.
- Updated some old URLs to their new location.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-25 12:11:50 +01:00
Tibor Vass
7a50fe8a52
Remove more of registry v1 code.
Signed-off-by: Tibor Vass <tibor@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-23 09:49:46 +01:00
Sebastiaan van Stijn
0f3b94a5c7
daemon: remove migration code from docker 1.11 to 1.12
This code was added in 391441c28b, to fix
upgrades from docker 1.11 to 1.12 with existing containers.

Given that any container after 1.12 should have the correct configuration
already, it should be safe to assume this upgrade logic is no longer needed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-22 11:36:43 +01:00
Sebastiaan van Stijn
8b6d9eaa55
Merge pull request #42044 from nathanlcarlson/labels_regex_length_check
Check the length of the correct variable #42039
2021-02-18 22:22:44 +01:00
Sebastiaan van Stijn
56ffa614d6
Merge pull request #41955 from cpuguy83/fallback_manifest_on_bad_plat
Fallback to  manifest list when no platform match
2021-02-18 20:59:51 +01:00
Brian Goff
50f39e7247 Move cpu variant checks into platform matcher
Wrap platforms.Only and fallback to our ignore mismatches due to  empty
CPU variants. This just cleans things up and makes the logic re-usable
in other places.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-02-18 16:58:48 +00:00
Nathan Carlson
8d73c1ad68 Check the length of the correct variable #42039
Signed-off-by: Nathan Carlson <carl4403@umn.edu>
2021-02-18 10:27:35 -06:00
Brian Goff
4be5453215 Fallback to manifest list when no platform match
In some cases, in fact many in the wild, an image may have the incorrect
platform on the image config.
This can lead to failures to run an image, particularly when a user
specifies a `--platform`.
Typically what we see in the wild is a manifest list with an an entry
for, as an example, linux/arm64 pointing to an image config that has
linux/amd64 on it.

This change falls back to looking up the manifest list for an image to
see if the manifest list shows the image as the correct one for that
platform.

In order to accomplish this we need to traverse the leases associated
with an image. Each image, if pulled with Docker 20.10, will have the
manifest list stored in the containerd content store with the resource
assigned to a lease keyed on the image ID.
So we look up the lease for the image, then look up the assocated
resources to find the manifest list, then check the manifest list for a
platform match, then ensure that manifest referes to our image config.

This is only used as a fallback when a user specified they want a
particular platform and the image config that we have does not match
that platform.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-02-17 19:10:48 +00:00
Akihiro Suda
1d2a660093
Move cgroup v2 out of experimental
We have upgraded runc to rc93 and added CI for cgroup 2.
So we can move cgroup v2 out of experimental.

Fix issue 41916

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-16 17:54:28 +09:00
Tibor Vass
7359a3b1e9
Merge pull request #41567 from J-jaeyoung/fix_off_by_one
Update array length check logic for preventing off-by-one error
2021-02-11 11:18:23 -08:00
Sebastiaan van Stijn
264353425a
Merge pull request #41698 from cpuguy83/fix_shutdown_handling
Move container exit state to after cleanup.
2021-02-11 20:18:00 +01:00
Sebastiaan van Stijn
45bb0860b6
Merge pull request #41320 from pjbgf/add-seccomp-tests
Add test coverage to seccomp.
2021-02-10 17:14:15 +01:00
Sebastiaan van Stijn
1c39b1c44c
Merge pull request #41842 from jchorl/master
Reject null manifests during tar import
2021-02-09 12:06:27 +01:00
dmytro.iakovliev
58825ffc32 Fix for lack of syncromization in daemon/update.go
Signed-off-by: dmytro.iakovliev <dmytro.iakovliev@zodiacsystems.com>
2021-02-09 09:34:20 +02:00
Paulo Gomes
137f86067c
Add test coverage for seccomp implementation
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
2021-02-04 19:47:07 +00:00
Josh Chorlton
654f854fae reject null manifests
Signed-off-by: Josh Chorlton <jchorlton@gmail.com>
2021-02-02 09:24:53 -08:00
Tibor Vass
2bd6213363
Merge pull request #41965 from thaJeztah/buildkit_apparmor_master
[master] Ensure AppArmor and SELinux profiles are applied when building with BuildKit
2021-02-02 08:52:11 -08:00
Brian Goff
94c07441c2
buildkit: Apply apparmor profile
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 611eb6ffb3)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-02 13:32:24 +01:00
Brian Goff
7f5e39bd4f
Use real root with 0701 perms
Various dirs in /var/lib/docker contain data that needs to be mounted
into a container. For this reason, these dirs are set to be owned by the
remapped root user, otherwise there can be permissions issues.
However, this uneccessarily exposes these dirs to an unprivileged user
on the host.

Instead, set the ownership of these dirs to the real root (or rather the
UID/GID of dockerd) with 0701 permissions, which allows the remapped
root to enter the directories but not read/write to them.
The remapped root needs to enter these dirs so the container's rootfs
can be configured... e.g. to mount /etc/resolve.conf.

This prevents an unprivileged user from having read/write access to
these dirs on the host.
The flip side of this is now any user can enter these directories.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit e908cc3901)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-02 13:01:25 +01:00
Brian Goff
4b5aa28f24
Do not set DOCKER_TMP to be owned by remapped root
The remapped root does not need access to this dir.
Having this owned by the remapped root opens the host up to an
uprivileged user on the host being able to escalate privileges.

While it would not be normal for the remapped UID to be used outside of
the container context, it could happen.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit bfedd27259)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-02 13:01:22 +01:00
Brian Goff
e192ce4009 Move container exit state to after cleanup.
Before this change, there is no way to know if container (runtime)
resources have been cleaned up unless you actually remove the container.

This change allows callers of the wait API or the events API to know
that all runtime resources for the container are released (e.g. IP
addresses).

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-01-28 11:28:41 -08:00
Sebastiaan van Stijn
f266f13965
Merge pull request #41636 from TBBle/37352-test-and-fix
Set 127GB default sandbox size for WCOW, and ensure storage-opts is honoured on all paths under WCOW and LCOW
2021-01-25 14:34:34 +01:00
Brian Goff
b865beba22
Merge pull request #41894 from AkihiroSuda/silence-dockerinfo 2021-01-21 09:32:37 -08:00
Sebastiaan van Stijn
d5612a0ef8
Merge pull request #41854 from cpuguy83/for-linux-1169-plugins-custom-runtime-panic
Add shim config for custom runtimes for plugins
2021-01-21 16:26:36 +01:00
Sebastiaan van Stijn
44aacff3fc
Merge pull request #41873 from cpuguy83/fix_builder_inconsisent_platform
Fix builder inconsistent error on buggy platform
2021-01-21 16:23:28 +01:00
Kazuyoshi Kato
bb11365e96 Handle long log messages correctly on SizedLogger
Loggers that implement BufSize() (e.g. awslogs) uses the method to
tell Copier about the maximum log line length. However loggerWithCache
and RingBuffer hide the method by wrapping loggers.

As a result, Copier uses its default 16KB limit which breaks log
lines > 16kB even the destinations can handle that.

This change implements BufSize() on loggerWithCache and RingBuffer to
make sure these logger wrappes don't hide the method on the underlying
loggers.

Fixes #41794.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-01-20 16:44:06 -08:00
Akihiro Suda
00225e220f
docker info: adjust warning strings for cgroup v2
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-01-20 13:42:32 +09:00
Akihiro Suda
8086443a44
docker info: silence unhandleable warnings
The following warnings in `docker info` are now discarded,
because there is no action user can actually take.

On cgroup v1:
- "WARNING: No blkio weight support"
- "WARNING: No blkio weight_device support"

On cgroup v2:
- "WARNING: No kernel memory TCP limit support"
- "WARNING: No oom kill disable support"

`docker run` still prints warnings when the missing feature is being attempted to use.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-01-19 15:10:21 +09:00
Brian Goff
399695305c Fix builder inconsistent error on buggy platform
When pulling an image by platform, it is possible for the image's
configured platform to not match what was in the manifest list.
The image itself is buggy because either the manifest list is incorrect
or the image config is incorrect. In any case, this is preventing people
from upgrading because many times users do not have control over these
buggy images.

This was not a problem in 19.03 because we did not compare on platform
before. It just assumed if we had the image it was the one we wanted
regardless of platform, which has its own problems.

Example Dockerfile that has this problem:

```Dockerfile
FROM --platform=linux/arm64 k8s.gcr.io/build-image/debian-iptables:buster-v1.3.0
RUN echo hello
```

This fails the first time you try to build after it finishes pulling but
before performing the `RUN` command.
On the second attempt it works because the image is already there and
does not hit the code that errors out on platform mismatch (Actually it
ignores errors if an image is returned at all).

Must be run with the classic builder (DOCKER_BUILDKIT=0).

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-01-14 21:45:45 +00:00
Brian Goff
2903863a1d Add shim config for custom runtimes for plugins
This fixes a panic when an admin specifies a custom default runtime,
when a plugin is started the shim config is nil.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-01-14 19:28:28 +00:00
gunadhya
64465f3b5f Fix Error in daemon_unix.go and docker_cli_run_unit_test.go
Signed-off-by: gunadhya <6939749+gunadhya@users.noreply.github.com>
2021-01-05 16:56:29 +05:30
Akihiro Suda
8891c58a43
Merge pull request #41786 from thaJeztah/test_selinux_tip
vendor: opencontainers/selinux v1.8.0, and remove selinux build-tag and stubs
2020-12-26 00:07:49 +09:00
Tibor Vass
ffc4dc9aec
Merge pull request #41817 from simonferquel/desktop-startup-hang
Fix a potential hang when starting after a non-clean shutdown
2020-12-23 23:22:00 -08:00
Sebastiaan van Stijn
1c0af18c6c
vendor: opencontainers/selinux v1.8.0, and remove selinux build-tag and stubs
full diff: https://github.com/opencontainers/selinux/compare/v1.7.0...v1.8.0

Remove "selinux" build tag

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-12-24 00:47:16 +01:00
Brian Goff
4a175fd050 Cleanup container shutdown check and add test
Adds a test case for the case where dockerd gets stuck on startup due to
hanging `daemon.shutdownContainer`

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-12-23 16:59:03 +00:00
Oscar Bonilla
c923f6ac3b Fix off-by-one bug
This is a fix for https://github.com/docker/for-linux/issues/1012.

The code was not considering that C strings are NULL-terminated so
we need to leave one extra byte.

Without this fix, the testcase in https://github.com/docker/for-linux/issues/1012
fails with

```
Step 61/1001 : RUN echo 60 > 60
 ---> Running in dde85ac3b1e3
Removing intermediate container dde85ac3b1e3
 ---> 80a12a18a241
Step 62/1001 : RUN echo 61 > 61
error creating overlay mount to /23456789112345678921234/overlay2/d368abcc97d6c6ebcf23fa71225e2011d095295d5d8c9b31d6810bea748bdf07-init/merged: no such file or directory
```

with the output of `dmesg -T` as:

```
[Sat Dec 19 02:35:40 2020] overlayfs: failed to resolve '/23456789112345678921234/overlay2/89e435a1b24583c463abb73e8abfad8bf8a88312ef8253455390c5fa0a765517-init/wor': -2
```

with this fix, you get the expected:

```
Step 126/1001 : RUN echo 125 > 125
 ---> Running in 2f2e56da89e0
max depth exceeded
```

Signed-off-by: Oscar Bonilla <6f6231@gmail.com>
2020-12-20 16:23:25 -08:00
Simon Ferquel
af0665861b Fix a potential hang when starting after a non-clean shutdown
Previous startup sequence used to call "containerStop" on containers that were persisted with a running state but are not alive when restarting (can happen on non-clean shutdown).
This call was made before fixing-up the RunningState of the container, and tricked the daemon to trying to kill a non-existing process and ultimately hang.

The fix is very simple - just add a condition on calling containerStop.

Signed-off-by: Simon Ferquel <simon.ferquel@docker.com>
2020-12-18 10:20:56 +01:00
Brian Goff
b5f863c67e
Merge pull request #41811 from AkihiroSuda/fuseoverlayfs-wrong-comment
fuse-overlayfs: fix godoc
2020-12-17 10:15:02 -08:00
Akihiro Suda
727d597452
Merge pull request #41806 from dperny/fix-jobs-filter-spelling
Fix service job mode filter
2020-12-16 20:05:41 +09:00
Akihiro Suda
188a691db7
fuse-overlayfs: fix godoc
"fuse-overlayfs" storage driver had wrong godoc comments
that were copied from "overlay2".

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-16 19:21:03 +09:00
Akihiro Suda
109be6b2bd
Merge pull request #41800 from thaJeztah/daemon_improve_logging
daemon: improve log messages during startup / shutdown
2020-12-16 09:58:32 +09:00
Drew Erny
295fb1c35e Fix jobs mode filter spelling
Oops.

Signed-off-by: Drew Erny <derny@mirantis.com>
2020-12-15 14:45:05 -06:00
Sebastiaan van Stijn
7e600eaae0
daemon: improve log messages during startup / shutdown
Consistently set "container ID" as a field for log messages, so that
logs can be associated with a container.

With this logs look like;

    INFO[2020-12-15T12:30:46.239329903Z] Loading containers: start.
    DEBU[2020-12-15T12:30:46.239919357Z] processing event stream      module=libcontainerd namespace=moby
    DEBU[2020-12-15T12:30:46.242061458Z] loaded container             container=622dec5f737d532da347bc627655ebc351fa5887476e8b8c33e5fbc5d0e48b5c paused=false running=false
    DEBU[2020-12-15T12:30:46.242185251Z] loaded container             container=47f348160645f46a17c758d120dec600967eed4adf08dd28b809725971d062cc paused=false running=false
    DEBU[2020-12-15T12:30:46.242912375Z] loaded container             container=e29c34c14b84810bc1e6cb6978a81e863601bfbe9ffe076c07dd5f6a439289d6 paused=false running=false
    DEBU[2020-12-15T12:30:46.243165260Z] loaded container             container=31d40ee3e591a50ebee790b08c2bec751610d2eca51ca1a371ea1ff66ea46c1d paused=false running=false
    DEBU[2020-12-15T12:30:46.243585164Z] loaded container             container=03dd5b1dc251a12d2e74eb54cb3ace66c437db228238a8d4831a264c9313c192 paused=false running=false
    DEBU[2020-12-15T12:30:46.244870764Z] loaded container             container=b774141975cc511cc61fc5f374793503bb2e8fa774d6580ac47111a089de1b9b paused=false running=false
    DEBU[2020-12-15T12:30:46.245140276Z] loaded container             container=b8a7229824fb84ff6f5af537a8ba987d106bf9a24a9aad3b628605d26b3facc4 paused=false running=false
    DEBU[2020-12-15T12:30:46.245457025Z] loaded container             container=b3256ff87fc6f243d9e044fb3d7988ef61c86bfb957d90c0227e8a9697ffa49c paused=false running=false
    DEBU[2020-12-15T12:30:46.292515417Z] restoring container          container=b3256ff87fc6f243d9e044fb3d7988ef61c86bfb957d90c0227e8a9697ffa49c paused=false running=false
    DEBU[2020-12-15T12:30:46.292612379Z] restoring container          container=31d40ee3e591a50ebee790b08c2bec751610d2eca51ca1a371ea1ff66ea46c1d paused=false running=false
    DEBU[2020-12-15T12:30:46.292573767Z] restoring container          container=b8a7229824fb84ff6f5af537a8ba987d106bf9a24a9aad3b628605d26b3facc4 paused=false running=false
    DEBU[2020-12-15T12:30:46.292602437Z] restoring container          container=b774141975cc511cc61fc5f374793503bb2e8fa774d6580ac47111a089de1b9b paused=false running=false
    DEBU[2020-12-15T12:30:46.305032730Z] restoring container          container=47f348160645f46a17c758d120dec600967eed4adf08dd28b809725971d062cc paused=false running=false
    DEBU[2020-12-15T12:30:46.305421360Z] restoring container          container=622dec5f737d532da347bc627655ebc351fa5887476e8b8c33e5fbc5d0e48b5c paused=false running=false
    DEBU[2020-12-15T12:30:46.305558773Z] restoring container          container=03dd5b1dc251a12d2e74eb54cb3ace66c437db228238a8d4831a264c9313c192 paused=false running=false
    DEBU[2020-12-15T12:30:46.307662990Z] restoring container          container=e29c34c14b84810bc1e6cb6978a81e863601bfbe9ffe076c07dd5f6a439289d6 paused=false running=false
    ...
    INFO[2020-12-15T12:30:46.536506204Z] Loading containers: done.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-12-15 15:57:39 +01:00
Arnaud Rebillout
6349b32e1b daemon/oci_linux_test: Skip privileged tests when non-root
These tests fail when run by a non-root user

  === RUN   TestTmpfsDevShmNoDupMount
      oci_linux_test.go:29: assertion failed: error is not nil: mkdir /var/lib/docker: permission denied
  --- FAIL: TestTmpfsDevShmNoDupMount (0.00s)
  === RUN   TestIpcPrivateVsReadonly
      oci_linux_test.go:29: assertion failed: error is not nil: mkdir /var/lib/docker: permission denied
  --- FAIL: TestIpcPrivateVsReadonly (0.00s)
  === RUN   TestSysctlOverride
      oci_linux_test.go:29: assertion failed: error is not nil: mkdir /var/lib/docker: permission denied
  --- FAIL: TestSysctlOverride (0.00s)
  === RUN   TestSysctlOverrideHost
      oci_linux_test.go:29: assertion failed: error is not nil: mkdir /var/lib/docker: permission denied
  --- FAIL: TestSysctlOverrideHost (0.00s)

Signed-off-by: Arnaud Rebillout <elboulangero@gmail.com>
2020-12-15 09:47:44 +07:00
Sebastiaan van Stijn
cf31b9622a
Merge pull request #41622 from bboehmke/ipv6_nat
IPv6 iptables config option
2020-12-07 11:59:42 +01:00