Commit graph

66 commits

Author SHA1 Message Date
Brian Goff
677d41aa3b Plumb context through info endpoint
I was trying to find out why `docker info` was sometimes slow, so this
plumbs a context through the info endpoint to propagate trace data.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-11-10 20:09:25 +00:00
Sebastiaan van Stijn
cff4f20c44
migrate to github.com/containerd/log v0.1.0
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.

This patch moves our own uses of the package to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-11 17:52:23 +02:00
Sebastiaan van Stijn
c90229ed9a
api/types: move system info types to api/types/system
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-07 13:01:36 +02:00
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Sebastiaan van Stijn
59b5c6075f
pkg/rootless: remove GetRootlessKitClient, and move to daemon
This utility was only used in a single location (as part of `docker info`),
but the `pkg/rootless` package is imported in various locations, causing
rootlesskit to be a dependency for consumers of that package.

Move GetRootlessKitClient to the daemon code, which is the only location
it was used.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-12 13:44:30 +02:00
Cory Snider
0f6eeecac0 daemon: consolidate runtimes config validation
The daemon has made a habit of mutating the DefaultRuntime and Runtimes
values in the Config struct to merge defaults. This would be fine if it
was a part of the regular configuration loading and merging process,
as is done with other config options. The trouble is it does so in
surprising places, such as in functions with 'verify' or 'validate' in
their name. It has been necessary in order to validate that the user has
not defined a custom runtime named "runc" which would shadow the
built-in runtime of the same name. Other daemon code depends on the
runtime named "runc" always being defined in the config, but merging it
with the user config at the same time as the other defaults are merged
would trip the validation. The root of the issue is that the daemon has
used the same config values for both validating the daemon runtime
configuration as supplied by the user and for keeping track of which
runtimes have been set up by the daemon. Now that a completely separate
value is used for the latter purpose, surprising contortions are no
longer required to make the validation work as intended.

Consolidate the validation of the runtimes config and merging of the
built-in runtimes into the daemon.setupRuntimes() function. Set the
result of merging the built-in runtimes config and the default value of the default
runtime on the returned runtimes struct, without back-propagating it
onto the config.Config argument.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.
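
The content-addressing idea can be sketched as follows. The naming scheme used here (runtime name, a dot, then the hex digest) and the function name `wrapperScriptName` are assumptions made for illustration; the commit only states that the file name is suffixed with the SHA-256 digest of the contents.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// wrapperScriptName derives a file name from the script contents, so that a
// given name can only ever refer to one exact script body.
// The "<name>.<digest>" layout is an assumption of this sketch.
func wrapperScriptName(runtimeName string, script []byte) string {
	sum := sha256.Sum256(script)
	return runtimeName + "." + hex.EncodeToString(sum[:])
}

func main() {
	withArgs := []byte("#!/bin/sh\nexec /usr/bin/myrt --debug \"$@\"\n")
	noArgs := []byte("#!/bin/sh\nexec /usr/bin/myrt \"$@\"\n")

	// Different contents produce different file names, so a reload that
	// changes the arguments never overwrites a script that is still in use.
	fmt.Println(wrapperScriptName("myrt", withArgs))
	fmt.Println(wrapperScriptName("myrt", noArgs))
}
```

Because the name is a pure function of the contents, writing the same script twice is idempotent and no locking is needed between a reload and a container start.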

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.
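
A minimal read-copy-update sketch, assuming Go 1.19 or later for `atomic.Pointer`. The `Config` fields and the `reload` helper are hypothetical; only the `configStore` name comes from the commit message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Config is a stand-in for the daemon configuration (hypothetical fields).
type Config struct {
	Debug  bool
	Labels []string
}

// clone returns a deep copy so that a published value is never mutated.
func (c *Config) clone() *Config {
	cp := *c
	cp.Labels = append([]string(nil), c.Labels...)
	return &cp
}

// configStore holds the currently published configuration.
var configStore atomic.Pointer[Config]

// reload applies mutate to a private copy and then publishes the copy.
// Concurrent reloads would still need to be serialized by the caller.
func reload(mutate func(*Config)) {
	next := configStore.Load().clone()
	mutate(next)
	configStore.Store(next)
}

func main() {
	configStore.Store(&Config{Labels: []string{"a"}})

	cfg := configStore.Load() // a handler captures the config exactly once
	reload(func(c *Config) { c.Debug = true; c.Labels[0] = "b" })

	// The captured snapshot stays consistent; new readers see the update.
	fmt.Println(cfg.Debug, cfg.Labels[0])
	fmt.Println(configStore.Load().Debug, configStore.Load().Labels[0])
}
```

Readers never block and never observe a half-updated config, at the cost of one deep copy per reload.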

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Sebastiaan van Stijn
7796891381
Merge pull request #45475 from thaJeztah/remove_old_buildtags
2023-05-20 02:10:19 +02:00
Sebastiaan van Stijn
411a9e1b86
daemon: remove devicemapper driver-warnings
commit dc11d2a2d8 removed the devicemapper
storage-driver, so these warnings are no longer relevant.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-05-19 20:45:31 +02:00
Sebastiaan van Stijn
424a1c5d21
daemon: remove warning for overlay/overlay2 without d_type
commit 0abb8dec3f removed support for
running overlay/overlay2 on top of a backing filesystem without d_type
support, and turned it into a fatal error when starting the daemon,
so there's no need to generate warnings for this situation.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-05-19 20:43:51 +02:00
Sebastiaan van Stijn
ab35df454d
remove pre-go1.17 build-tags
Removed pre-go1.17 build-tags with go fix:

    go mod init
    go fix -mod=readonly ./...
    rm go.mod

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-05-19 20:38:51 +02:00
Cory Snider
6690d2969c pkg/archive: bail if setting xattrs is unsupported
Extended attributes are set on files in container images for a reason.
Fail to unpack if extended attributes are present in a layer and setting
the attributes on the unpacked files fails for any reason.

Add an option to the vfs graph driver to opt into the old behaviour
where ENOTSUPP and EPERM errors encountered when setting extended
attributes are ignored. Make it abundantly clear to users and anyone
triaging their bug reports that they are shooting themselves in the
foot by enabling this option.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-05-18 17:21:12 -04:00
Sebastiaan van Stijn
fb96b94ed0
daemon: remove handling for deprecated "oom-score-adjust", and produce error
This option was deprecated in 5a922dc162, which
is part of the v24.0.0 release, so we can remove it from master.

This patch:

- adds a check to ValidatePlatformConfig, and produces a fatal error
  if oom-score-adjust is set
- removes the deprecated libcontainerd/supervisor.WithOOMScore
- removes the warning from docker info
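
A sketch of the kind of validation check the first item describes. The `platformConfig` type is a hypothetical stand-in, and treating zero as "not set" is a simplification of this example; the error text is the one shown in the output below.

```go
package main

import (
	"errors"
	"fmt"
)

// platformConfig is a hypothetical stand-in for the daemon configuration.
type platformConfig struct {
	OOMScoreAdjust int // zero means "not set" in this sketch
}

// validatePlatformConfig rejects configurations that still set the removed option.
func validatePlatformConfig(c platformConfig) error {
	if c.OOMScoreAdjust != 0 {
		return errors.New(`DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.`)
	}
	return nil
}

func main() {
	fmt.Println(validatePlatformConfig(platformConfig{OOMScoreAdjust: -500}))
	fmt.Println(validatePlatformConfig(platformConfig{}))
}
```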

With this patch:

    dockerd --oom-score-adjust=-500 --validate
    Flag --oom-score-adjust has been deprecated, and will be removed in the next release.
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.

And when using `daemon.json`:

    dockerd --validate
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-05-06 16:36:17 +02:00
Sebastiaan van Stijn
20a1d23b39
Merge pull request #45320 from akerouanton/info-no-new-privileges
Add no-new-privileges to SecurityOptions returned by /info
2023-04-18 14:37:15 +02:00
Albin Kerouanton
eb7738221c
Add no-new-privileges to SecurityOptions returned by /info
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-04-18 09:34:08 +02:00
Sebastiaan van Stijn
5a922dc162
daemon: deprecate --oom-score-adjust for the daemon
The `oom-score-adjust` option was added in a894aec8d8,
to prevent the daemon from being OOM-killed before other processes. This
option was mostly added as a "convenience", as running the daemon as a
systemd unit was not yet common.

Having the daemon set its own limits is not best-practice, and something
better handled by the process-manager starting the daemon.

Commit cf7a5be0f2 fixed this option to allow
disabling it, and 2b8e68ef06 removed the default
score adjust.

This patch deprecates the option altogether, recommending users to set these
limits through the process manager used, such as the "OOMScoreAdjust" option
in systemd units.

With this patch:

    dockerd --oom-score-adjust=-500 --validate
    Flag --oom-score-adjust has been deprecated, and will be removed in the next release.
    configuration OK

    echo '{"oom-score-adjust":-500}' > /etc/docker/daemon.json
    dockerd
    INFO[2023-04-12T21:34:51.133389627Z] Starting up
    INFO[2023-04-12T21:34:51.135607544Z] containerd not running, starting managed containerd
    WARN[2023-04-12T21:34:51.135629086Z] DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" option will be removed in the next release.

    docker info
    Client:
      Context:    default
      Debug Mode: false
    ...
    DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" option will be removed in the next release

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-04-13 00:02:39 +02:00
Tianon Gravi
6caaa8cadc Prefer loading docker-init from an appropriate "libexec" directory
The `docker-init` binary is not intended to be a user-facing command, and as such it is more appropriate for it to be found in `/usr/libexec` (or similar) than in `PATH` (see the FHS, especially https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html and https://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA).

This adjusts the logic for using that configuration option to take this into account and appropriately search for `docker-init` (or the user's configured alternative) in these directories before falling back to the existing `PATH` lookup behavior.

This behavior _used_ to exist for the old `dockerinit` binary (of a similar name and used in a similar way but for an alternative purpose), but that behavior was removed in 4357ed4a73 when that older `dockerinit` was also removed.

Most of this reasoning _also_ applies to `docker-proxy` (and various `containerd-xxx` binaries such as the shims), but this change does not affect those.  It would be relatively straightforward to adapt `LookupInitPath` to be a more generic function such as `libexecLookupPath` or similar if we wanted to explore that.

See 14482589df/cli-plugins/manager/manager_unix.go for the related path list in the CLI which loads CLI plugins from a similar set of paths (with a similar rationale - plugin binaries are not typically intended to be run directly by users but rather invoked _via_ the CLI binary).

Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
2023-03-24 14:25:12 -07:00
Jan Garcia
6ab12ec8f4 rootless: move ./rootless to ./pkg/rootless
Signed-off-by: Jan Garcia <github-public@n-garcia.com>
2023-01-09 16:26:06 +01:00
Brian Goff
e6ee27a541 Allow containerd shim refs in default-runtime
Since runtimes can now just be containerd shims, we need to check if the
reference is possibly a containerd shim.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2022-08-18 18:41:03 +00:00
Sebastiaan van Stijn
52c1a2fae8
gofmt GoDoc comments with go1.19
Older versions of Go don't format comments, so committing this as
a separate commit, so that we can already make these changes before
we upgrade to Go 1.19.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-07-08 19:56:23 +02:00
Akihiro Suda
de6732a403
version: add RootlessKit, slirp4netns, and VPNKit version
```console
$ docker --context=rootless version
...
Server:
...
 rootlesskit:
  Version:          0.14.2
  ApiVersion:       1.1.1
  NetworkDriver:    slirp4netns
  PortDriver:       builtin
  StateDir:         /tmp/rootlesskit245426514
 slirp4netns:
  Version:          1.1.9
  GitCommit:        4e37ea557562e0d7a64dc636eff156f64927335e
```

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-03-15 15:44:42 +09:00
Tianon Gravi
65cc84abc5
Merge pull request #42152 from AkihiroSuda/fix-rootless-info-42151
info: unset cgroup-related fields when CgroupDriver == none
2021-11-08 14:45:11 -08:00
Sebastiaan van Stijn
686be57d0a
Update to Go 1.17.0, and gofmt with Go 1.17
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-24 23:33:27 +02:00
Sebastiaan van Stijn
b585c64e2b
info: remove "expected" check for tini version
These checks were added when we required a specific version of containerd
and runc (different versions were known to be incompatible). I don't think
we had a similar requirement for tini, so this check was redundant. Let's
remove the check altogether.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-23 13:25:14 +02:00
Akihiro Suda
039e9670cb
info: unset cgroup-related fields when CgroupDriver == none
Fix issue 42151

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-16 16:17:22 +09:00
Akihiro Suda
1d2a660093
Move cgroup v2 out of experimental
We have upgraded runc to rc93 and added CI for cgroup v2.
So we can move cgroup v2 out of experimental.

Fix issue 41916

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-16 17:54:28 +09:00
Akihiro Suda
00225e220f
docker info: adjust warning strings for cgroup v2
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-01-20 13:42:32 +09:00
Akihiro Suda
8086443a44
docker info: silence unhandleable warnings
The following warnings in `docker info` are now discarded,
because there is no action the user can actually take.

On cgroup v1:
- "WARNING: No blkio weight support"
- "WARNING: No blkio weight_device support"

On cgroup v2:
- "WARNING: No kernel memory TCP limit support"
- "WARNING: No oom kill disable support"

`docker run` still prints warnings when a container attempts to use the missing feature.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-01-19 15:10:21 +09:00
Akihiro Suda
b8ca7de823
Deprecate KernelMemory
Kernel memory limit is not supported on cgroup v2.
Even on cgroup v1, kernel memory limit (`kmem.limit_in_bytes`) has been deprecated since kernel 5.4.
0158115f70

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-24 20:44:29 +09:00
Brian Goff
260c26b7be
Merge pull request #41016 from kolyshkin/cgroup-init
2020-07-16 11:26:52 -07:00
Brian Goff
1022c6608e
Merge pull request #41083 from thaJeztah/more_warnings
info: add warnings about missing blkio cgroup support
2020-07-09 11:51:09 -07:00
Akihiro Suda
97708281eb
info: improve "WARNING: Running in rootless-mode without cgroup"
The cgroup v2 mode uses systemd driver by default.
Suggesting that users set exec-opt "native.cgroupdriver=systemd" isn't meaningful.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-06-29 20:59:47 +09:00
Kir Kolyshkin
afbeaf6f29 pkg/sysinfo: rm duplicates
The CPU CFS cgroup-aware scheduler is one single kernel feature, not
two, so it does not make sense to have two separate booleans
(CPUCfsQuota and CPUCfsPeriod). Merge these into CPUCfs.

Same for CPU realtime.

For compatibility reasons, /info stays the same for now.
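
One way to keep the API shape stable while merging the internal flags is to fan the single value out at the boundary. The struct and function names below are hypothetical and show only a small subset of fields.

```go
package main

import "fmt"

// sysInfo mirrors the merged internal representation: one flag per kernel feature.
type sysInfo struct {
	CPUCfs bool
}

// apiInfo keeps the two separate fields that existing API clients expect
// (hypothetical subset of the /info response).
type apiInfo struct {
	CPUCfsPeriod bool
	CPUCfsQuota  bool
}

// toAPIInfo populates both legacy fields from the single internal flag.
func toAPIInfo(s sysInfo) apiInfo {
	return apiInfo{CPUCfsPeriod: s.CPUCfs, CPUCfsQuota: s.CPUCfs}
}

func main() {
	fmt.Printf("%+v\n", toAPIInfo(sysInfo{CPUCfs: true}))
}
```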

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-06-26 16:19:52 -07:00
Sebastiaan van Stijn
d378625554
info: add warnings about missing blkio cgroup support
These warnings were only logged, and could therefore be overlooked
by users. This patch makes these more visible by returning them as
warnings in the API response.

We should probably consider adding "boolean" (?) fields for these
as well, so that they can be consumed in other ways. In addition,
some of these warnings could potentially be grouped to reduce the
number of warnings that are printed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-06-08 17:16:44 +02:00
Akihiro Suda
f350b53241 cgroup2: implement docker info
ref: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-17 07:20:01 +09:00
Sebastiaan van Stijn
339fb74cbc
prevent panic if TINI_COMMIT isn't set during build
If TINI_COMMIT isn't set, .go-autogen sets an empty value
as the "expected" commit. Attempting to truncate the value
caused a panic in that situation.
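
A bounds-checked truncation avoids this class of panic. The helper name `shortCommit` and the length of 7 are assumptions of this sketch, not the code changed by the commit.

```go
package main

import "fmt"

// shortCommit returns at most the first 7 characters of a commit ID.
// Slicing commit[:7] directly would panic when the string is shorter.
func shortCommit(commit string) string {
	const n = 7
	if len(commit) > n {
		return commit[:n]
	}
	return commit
}

func main() {
	fmt.Printf("%q\n", shortCommit("fec3683abcdef0123"))
	fmt.Printf("%q\n", shortCommit("")) // an empty value is returned unchanged
}
```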

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-01-10 15:27:20 +01:00
Akihiro Suda
d2d8e96f51
Merge pull request #39940 from carlosedp/runtime-version
Change version parsing to support alternate runtimes
2019-11-26 09:55:53 +09:00
Sebastiaan van Stijn
2030daf2ee
TestParseInitVersion: add some additional tests
Also slightly harden parseInitVersion

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-10-16 03:54:14 +02:00
Carlos de Paula
1a96cf95ca Parse runtime name
Signed-off-by: Carlos de Paula <me@carlosedp.com>
2019-09-15 12:33:52 -04:00
Carlos de Paula
4ab1e808d1 Change version parsing to support alternate runtimes
Signed-off-by: Carlos de Paula <me@carlosedp.com>
2019-09-17 07:06:19 -03:00
Rob Gulewich
072400fc4b Make cgroup namespaces configurable
This adds both a daemon-wide flag and a container creation property:
- Set the `CgroupnsMode: "host|private"` HostConfig property at
  container creation time to control what cgroup namespace the container
  is created in
- Set the `--default-cgroupns-mode=host|private` daemon flag to control
  what cgroup namespace containers are created in by default
- Set the default if the daemon flag is unset to "host", for backward
  compatibility
- Default to CgroupnsMode: "host" for client versions < 1.40
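
The precedence rules in the list above can be sketched as a single function. `resolveCgroupnsMode` and `apiLessThan` are hypothetical helpers written for this example; an empty string stands for "not set".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// apiLessThan reports whether dotted version a is older than b.
// Missing or malformed components are treated as 0 in this sketch.
func apiLessThan(a, b string) bool {
	pa, pb := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(pa) || i < len(pb); i++ {
		var x, y int
		if i < len(pa) {
			x, _ = strconv.Atoi(pa[i])
		}
		if i < len(pb) {
			y, _ = strconv.Atoi(pb[i])
		}
		if x != y {
			return x < y
		}
	}
	return false
}

// resolveCgroupnsMode picks the cgroup namespace mode for a new container.
func resolveCgroupnsMode(requested, daemonDefault, apiVersion string) string {
	if apiLessThan(apiVersion, "1.40") {
		return "host" // older clients cannot express the option
	}
	if requested != "" {
		return requested // per-container HostConfig setting wins
	}
	if daemonDefault != "" {
		return daemonDefault // daemon-wide flag
	}
	return "host" // backward-compatible default when the flag is unset
}

func main() {
	fmt.Println(resolveCgroupnsMode("", "private", "1.40"))
	fmt.Println(resolveCgroupnsMode("private", "private", "1.39"))
}
```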

Signed-off-by: Rob Gulewich <rgulewich@netflix.com>
2019-05-07 10:22:16 -07:00
Tonis Tiigi
f9b9d5f584 builder-next: fixes for rootless mode
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2019-02-28 10:44:21 -08:00
Sebastiaan van Stijn
dd94555787
Merge pull request #32519 from darkowlzz/32443-docker-update-pids-limit
Add pids-limit support in docker update
2019-02-23 15:20:59 +01:00
Sunny Gogoi
74eb258ffb Add pids-limit support in docker update
- Adds updating PidsLimit in UpdateContainer().
- Adds setting PidsLimit in toContainerResources().

Signed-off-by: Sunny Gogoi <indiasuny000@gmail.com>
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-02-21 14:17:38 -08:00
Akihiro Suda
ec87479b7e allow running dockerd in an unprivileged user namespace (rootless mode)
Please refer to `docs/rootless.md`.

TLDR:
 * Make sure `/etc/subuid` and `/etc/subgid` contain the entry for you
 * `dockerd-rootless.sh --experimental`
 * `docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...`

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2019-02-04 00:24:27 +09:00
Sebastiaan van Stijn
2137b8ccf2
Add containerd, runc, and docker-init versions to /version
This patch adds version information about the containerd,
runc, and docker-init components to the /version endpoint.

With this patch applied, running:

```
curl --unix-socket /var/run/docker.sock http://localhost/version | jq .
```

Will produce this response:

```json
{
  "Platform": {
    "Name": ""
  },
  "Components": [
    {
      "Name": "Engine",
      "Version": "dev",
      "Details": {
        "ApiVersion": "1.40",
        "Arch": "amd64",
        "BuildTime": "2018-11-08T10:23:42.000000000+00:00",
        "Experimental": "false",
        "GitCommit": "7d02782d2f",
        "GoVersion": "go1.11.2",
        "KernelVersion": "4.9.93-linuxkit-aufs",
        "MinAPIVersion": "1.12",
        "Os": "linux"
      }
    },
    {
      "Name": "containerd",
      "Version": "v1.1.4",
      "Details": {
        "GitCommit": "9f2e07b1fc1342d1c48fe4d7bbb94cb6d1bf278b"
      }
    },
    {
      "Name": "runc",
      "Version": "1.0.0-rc5+dev",
      "Details": {
        "GitCommit": "a00bf0190895aa465a5fbed0268888e2c8ddfe85"
      }
    },
    {
      "Name": "docker-init",
      "Version": "0.18.0",
      "Details": {
        "GitCommit": "fec3683"
      }
    }
  ],
  "Version": "dev",
  "ApiVersion": "1.40",
  "MinAPIVersion": "1.12",
  "GitCommit": "7d02782d2f",
  "GoVersion": "go1.11.2",
  "Os": "linux",
  "Arch": "amd64",
  "KernelVersion": "4.9.93-linuxkit-aufs",
  "BuildTime": "2018-11-08T10:23:42.000000000+00:00"
}
```

When using a recent version of the CLI, that information is included in the
output of `docker version`:

```
Client: Docker Engine - Community
 Version:           18.09.0
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        4d60db4
 Built:             Wed Nov  7 00:46:51 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          dev
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.11.2
  Git commit:       7d02782d2f
  Built:            Thu Nov  8 10:23:42 2018
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.1.4
  GitCommit:        9f2e07b1fc1342d1c48fe4d7bbb94cb6d1bf278b
 runc:
  Version:          1.0.0-rc5+dev
  GitCommit:        a00bf0190895aa465a5fbed0268888e2c8ddfe85
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-01-14 23:27:05 +01:00
Sebastiaan van Stijn
6f70946a27
Add warning to /info if KernelMemoryTCP is not supported
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-11-27 22:47:39 +01:00
Yong Tang
f023816608 Add memory.kernelTCP support for linux
This fix addresses the issue raised in 37038, where there was no
memory.kernelTCP support for Linux.

This fix adds MemoryKernelTCP to HostConfig and passes the config
through to the runtime-spec.

An additional test case has been added.

This fixes 37038.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-11-26 21:03:08 +00:00
Sebastiaan van Stijn
de1094bc95
Remove redundant nil checks
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-10-11 23:19:01 +02:00