Commit graph

1015 commits

Author SHA1 Message Date
Albin Kerouanton
4eed3dcdfe api: normalize the default NetworkMode
The NetworkMode "default" is now normalized into the value it
aliases ("bridge" on Linux and "nat" on Windows) by the
ContainerCreate endpoint, the legacy image builder, Swarm's
cluster executor and by the container restore codepath.

builder-next is left untouched as it already uses the normalized
value (i.e. "bridge").

Going forward, this will make maintenance easier as there's one
less NetworkMode to care about.
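
As a rough sketch of the normalization (an illustrative helper, not the actual daemon code):

    // normalizeNetworkMode maps the "default" alias to the platform-specific
    // NetworkMode it stands for. Illustrative only.
    func normalizeNetworkMode(mode string, isWindows bool) string {
        if mode != "default" {
            return mode
        }
        if isWindows {
            return "nat" // Windows default network
        }
        return "bridge" // Linux default network
    }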

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2024-03-28 12:34:23 +01:00
Sebastiaan van Stijn
c516804d6f
vendor: OTEL v0.46.1 / v1.21.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-23 10:11:07 +01:00
Sebastiaan van Stijn
e8346c53d9
Merge pull request #46786 from rumpl/c8d-userns-namespace
c8d: Use a specific containerd namespace when userns are remapped
2024-01-24 20:36:40 +01:00
Djordje Lukic
3a617e5463
c8d: Use a specific containerd namespace when userns are remapped
We need to isolate the images that we are remapping to a userns; we
can't mix them with "normal" images. In the graph driver case this means
we create a new root directory where we store the images and everything
else; in the containerd case we can use a new namespace.
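
Roughly, selecting the namespace could look like this (the namespace name and helper are hypothetical):

    import (
        "context"

        "github.com/containerd/containerd/namespaces"
    )

    // Hypothetical sketch: pick a dedicated containerd namespace when
    // user-namespace remapping is active, so remapped images never mix
    // with the default ones.
    func withImageNamespace(ctx context.Context, usernsRemapped bool) context.Context {
        ns := "moby"
        if usernsRemapped {
            ns = "moby-userns" // hypothetical name for the isolated namespace
        }
        return namespaces.WithNamespace(ctx, ns)
    }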

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2024-01-24 15:46:16 +01:00
Sebastiaan van Stijn
00c9785e2e
fix "host-gateway-ip" label not set for builder workers
Commit 21e50b89c9 added a label on the buildkit
worker to advertise the host-gateway-ip. This option can either be set by the
user in the daemon config, or otherwise defaults to the gateway-ip.

If no value is set by the user, discovery of the gateway-ip happens when
initializing the network-controller (`NewDaemon`, `daemon.restore()`).

However, d222bf097c changed how we handle the
daemon config. As a result, the `cli.Config` used when initializing the
builder only holds configuration information from the daemon config
(user-specified or defaults), but is not updated with information set
by `NewDaemon`.

This patch adds an accessor on the daemon to get the current daemon config.
An alternative could be to return the config by `NewDaemon` (which should
likely be a _copy_ of the config).
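
A minimal sketch of such an accessor, assuming the config is kept behind an atomic pointer (field names hypothetical):

    import (
        "sync/atomic"

        "github.com/docker/docker/daemon/config"
    )

    // Sketch: an accessor that returns the live daemon config, so the
    // builder no longer reads the stale cli.Config snapshot captured
    // before NewDaemon ran.
    type Daemon struct {
        configStore atomic.Pointer[config.Config] // set during daemon startup
    }

    func (d *Daemon) Config() config.Config {
        return *d.configStore.Load()
    }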

Before this patch:

    docker buildx inspect default
    Name:   default
    Driver: docker

    Nodes:
    Name:      default
    Endpoint:  default
    Status:    running
    Buildkit:  v0.12.4+3b6880d2a00f
    Platforms: linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
    Labels:
     org.mobyproject.buildkit.worker.moby.host-gateway-ip: <nil>

After this patch:

    docker buildx inspect default
    Name:   default
    Driver: docker

    Nodes:
    Name:      default
    Endpoint:  default
    Status:    running
    Buildkit:  v0.12.4+3b6880d2a00f
    Platforms: linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
    Labels:
     org.mobyproject.buildkit.worker.moby.host-gateway-ip: 172.18.0.1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-23 14:58:01 +01:00
Paweł Gronowski
5bbcc41c20
volumes/subpath: Plumb context
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-19 17:32:21 +01:00
Paul "TBBle" Hampson
efadb70ef8
The Windows snapshotter and graphdriver have different names
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-17 16:29:24 +01:00
Cory Snider
0e62dbadcd daemon: reinit health monitor on live-restore
The container may have been running without health probes for an
indeterminate amount of time. The container may have become unhealthy in
the interim. We should probe it sooner than in steady-state, while also
giving it some leeway to recover from e.g. timed-out connections. This
is easy to achieve by probing the container like a freshly-started one.
The original author of health-checks came to the same conclusion; the
health monitor was reinitialized on live-restored containers before
v17.11.0, when health monitoring of live-restored containers was
accidentally broken. Revert to the original behavior.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-01-09 11:03:06 -05:00
Sebastiaan van Stijn
07d2ad30e5
daemon/cluster: Cluster.imageWithDigestString: include mirrors to resolve digest
If the daemon is configured to use a mirror for the default (Docker Hub)
registry, the endpoint did not fall back to querying the upstream if the mirror
did not contain the given reference.

For pull-through registry-mirrors, this was not a problem, as in that case the
registry would forward the request, but for other mirrors, no fallback would
happen. This was inconsistent with how "pulling" images handled this situation;
when pulling images, both the mirror and upstream would be tried.

This patch brings the daemon-side lookup of image-manifests on-par with the
client-side lookup (the GET /distribution endpoint) as used in API 1.30 and
higher.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-04 16:53:45 +01:00
Sebastiaan van Stijn
8aacbb3ba9
api: fix "GET /distribution" endpoint ignoring mirrors
If the daemon is configured to use a mirror for the default (Docker Hub)
registry, the endpoint did not fall back to querying the upstream if the mirror
did not contain the given reference.

For pull-through registry-mirrors, this was not a problem, as in that case the
registry would forward the request, but for other mirrors, no fallback would
happen. This was inconsistent with how "pulling" images handled this situation;
when pulling images, both the mirror and upstream would be tried.

This problem was caused by the logic used in GetRepository, which had an
optimization to only return the first registry it was successfully able to
configure (and connect to), with the assumption that the mirror either contained
all images used, or was configured as a pull-through mirror.

This patch:

- Introduces a GetRepositories method, which returns all candidates (both
  mirror(s) and upstream).
- Updates the endpoint to try all candidates until one succeeds (see the
  sketch below).
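
A hedged sketch of that fallback loop (identifiers are illustrative, not the actual implementation):

    // Illustrative: try every candidate lookup (mirrors first, then the
    // upstream registry) and return the first success, keeping the last
    // error if all of them fail.
    func tryCandidates[T any](candidates []func() (T, error)) (T, error) {
        var (
            zero    T
            lastErr error
        )
        for _, lookup := range candidates {
            res, err := lookup()
            if err == nil {
                return res, nil
            }
            lastErr = err
        }
        return zero, lastErr
    }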

Before this patch:

    # the daemon is configured to use a mirror for Docker Hub
    cat /etc/docker/daemon.json
    { "registry-mirrors": ["http://localhost:5000"]}

    # start the mirror (empty registry, not configured as pull-through mirror)
    docker run -d --name registry -p 127.0.0.1:5000:5000 registry:2

    # querying the endpoint fails, because the image-manifest is not found in the mirror:
    curl -s --unix-socket /var/run/docker.sock http://localhost/v1.43/distribution/docker.io/library/hello-world:latest/json
    {
      "message": "manifest unknown: manifest unknown"
    }

With this patch applied:

    # the daemon is configured to use a mirror for Docker Hub
    cat /etc/docker/daemon.json
    { "registry-mirrors": ["http://localhost:5000"]}

    # start the mirror (empty registry, not configured as pull-through mirror)
    docker run -d --name registry -p 127.0.0.1:5000:5000 registry:2

    # querying the endpoint succeeds (manifest is fetched from the upstream Docker Hub registry):
    curl -s --unix-socket /var/run/docker.sock http://localhost/v1.43/distribution/docker.io/library/hello-world:latest/json | jq .
    {
      "Descriptor": {
        "mediaType": "application/vnd.oci.image.index.v1+json",
        "digest": "sha256:1b9844d846ce3a6a6af7013e999a373112c3c0450aca49e155ae444526a2c45e",
        "size": 3849
      },
      "Platforms": [
        {
          "architecture": "amd64",
          "os": "linux"
        }
      ]
    }

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-04 15:46:32 +01:00
Brian Goff
12c7411b6b
Merge pull request #46750 from thaJeztah/daemon_start_log
2024-01-03 17:19:59 -08:00
Sebastiaan van Stijn
2cf230951f
add //go:build directives to prevent downgrading to go1.16 language
This repository is not yet a module (i.e., does not have a `go.mod`). This
is not problematic when building the code in GOPATH or "vendor" mode, but
when using the code as a module-dependency (in module-mode), different semantics
are applied since Go1.21, which switches Go _language versions_ on a per-module,
per-package, or even per-file basis.

A condensed summary of that logic [is as follows][1]:

- For modules that have a go.mod containing a go version directive; that
  version is considered a minimum _required_ version (starting with the
  go1.19.13 and go1.20.8 patch releases: before those, it was only a
  recommendation).
- For dependencies that don't have a go.mod (not a module), go language
  version go1.16 is assumed.
- Likewise, for modules that have a go.mod, but the file does not have a
  go version directive, go language version go1.16 is assumed.
- If a go.work file is present, but does not have a go version directive,
  language version go1.17 is assumed.

When switching language versions, Go _downgrades_ the language version,
which means that language features (such as generics, and `any`) are not
available, and compilation fails. For example:

    # github.com/docker/cli/cli/context/store
    /go/pkg/mod/github.com/docker/cli@v25.0.0-beta.2+incompatible/cli/context/store/storeconfig.go:6:24: predeclared any requires go1.18 or later (-lang was set to go1.16; check go.mod)
    /go/pkg/mod/github.com/docker/cli@v25.0.0-beta.2+incompatible/cli/context/store/store.go:74:12: predeclared any requires go1.18 or later (-lang was set to go1.16; check go.mod)

Note that these fallbacks are per-module, per-package, and can even be
per-file, so _(indirect) dependencies_ can still use modern language
features, as long as their respective go.mod has a version specified.

Unfortunately, these failures do not occur when building locally (using
vendor / GOPATH mode), but will affect consumers of the module.

Obviously, this situation is not ideal, and the ultimate solution is to
move to go modules (add a go.mod), but this comes with a non-insignificant
risk in other areas (due to our complex dependency tree).

We can revert to using go1.16 language features only, but this may be
limiting, and may still be problematic when (e.g.) matching signatures
of dependencies.

There is an escape hatch: adding a `//go:build` directive to files that
make use of go language features. From the [go toolchain docs][2]:

> The go line for each module sets the language version the compiler enforces
> when compiling packages in that module. The language version can be changed
> on a per-file basis by using a build constraint.
>
> For example, a module containing code that uses the Go 1.21 language version
> should have a `go.mod` file with a go line such as `go 1.21` or `go 1.21.3`.
> If a specific source file should be compiled only when using a newer Go
> toolchain, adding `//go:build go1.22` to that source file both ensures that
> only Go 1.22 and newer toolchains will compile the file and also changes
> the language version in that file to Go 1.22.

This patch adds `//go:build` directives to those files using recent additions
to the language. It's currently using go1.19 as the version, to match the version
in our "vendor.mod", but we can consider being more permissive ("any" requires
go1.18 or up), or more "optimistic" (force go1.21, which is the version we
currently use to build).

For completeness' sake, note that any file _without_ a `//go:build` directive
will continue to use go1.16 language version when used as a module.
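
For illustration, a file that relies on generics could carry such a directive (per the toolchain docs quoted above):

    //go:build go1.19

    package example

    // The constraint above both restricts this file to Go 1.19+ toolchains
    // and raises the file's language version, so `any` and generics stay
    // available even when the surrounding non-module code is treated as go1.16.
    func first[T any](s []T) (T, bool) {
        var zero T
        if len(s) == 0 {
            return zero, false
        }
        return s[0], true
    }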

[1]: 58c28ba286/src/cmd/go/internal/gover/version.go (L9-L56)
[2]: https://go.dev/doc/toolchain

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-15 15:24:15 +01:00
Sebastiaan van Stijn
3b1d9f1a26
add validation and migration for deprecated logentries driver
A validation step was added to prevent the daemon from considering "logentries"
a dynamically loaded plugin, which would cause it to keep retrying to load the plugin;

    WARN[2023-12-12T21:53:16.866857127Z] Unable to locate plugin: logentries, retrying in 1s
    WARN[2023-12-12T21:53:17.868296836Z] Unable to locate plugin: logentries, retrying in 2s
    WARN[2023-12-12T21:53:19.874259254Z] Unable to locate plugin: logentries, retrying in 4s
    WARN[2023-12-12T21:53:23.879869881Z] Unable to locate plugin: logentries, retrying in 8s

But would ultimately be returned as an error to the user:

    docker container create --name foo --log-driver=logentries nginx:alpine
    Error response from daemon: error looking up logging plugin logentries: plugin "logentries" not found

With the additional validation step, an error is returned immediately:

    docker container create --log-driver=logentries busybox
    Error response from daemon: the logentries logging driver has been deprecated and removed

A migration step was added on container restore. Containers using the
"logentries" logging driver are migrated to use the "local" logging driver:

    WARN[2023-12-12T22:38:53.108349297Z] migrated deprecated logentries logging driver  container=4c9309fedce75d807340ea1820cc78dc5c774d7bfcae09f3744a91b84ce6e4f7 error="<nil>"
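
A hypothetical sketch of the two steps (not the actual daemon code): reject the removed driver at validation time, and rewrite it to "local" on restore.

    import (
        "errors"

        "github.com/docker/docker/api/types/container"
    )

    // Sketch: fail fast when the removed driver is requested for a new container.
    func validateLogDriver(name string) error {
        if name == "logentries" {
            return errors.New("the logentries logging driver has been deprecated and removed")
        }
        return nil
    }

    // Sketch: migrate existing containers to the "local" driver on restore.
    func migrateLogDriver(cfg *container.LogConfig) {
        if cfg.Type == "logentries" {
            cfg.Type = "local"
        }
    }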

As an alternative to the validation step, I also considered using a "stub"
deprecation driver, however this would not result in an error when creating
the container, and only produce an error when starting:

    docker container create --name foo --log-driver=logentries nginx:alpine
    4c9309fedce75d807340ea1820cc78dc5c774d7bfcae09f3744a91b84ce6e4f7

    docker start foo
    Error response from daemon: failed to create task for container: failed to initialize logging driver: the logentries logging driver has been deprecated and removed
    Error: failed to start containers: foo

For containers, this validation is added in the backend (daemon). For services,
this was not sufficient, as SwarmKit would try to schedule the task, which
caused an endless retry loop;

    docker service create --log-driver=logentries --name foo nginx:alpine
    zo0lputagpzaua7cwga4lfmhp
    overall progress: 0 out of 1 tasks
    1/1: no suitable node (missing plugin on 1 node)
    Operation continuing in background.

    DEBU[2023-12-12T22:50:28.132732757Z] Calling GET /v1.43/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22zo0lputagpzaua7cwga4lfmhp%22%3Atrue%7D%7D
    DEBU[2023-12-12T22:50:28.137961549Z] Calling GET /v1.43/nodes
    DEBU[2023-12-12T22:50:28.340665007Z] Calling GET /v1.43/services/zo0lputagpzaua7cwga4lfmhp?insertDefaults=false
    DEBU[2023-12-12T22:50:28.343437632Z] Calling GET /v1.43/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22zo0lputagpzaua7cwga4lfmhp%22%3Atrue%7D%7D
    DEBU[2023-12-12T22:50:28.345201257Z] Calling GET /v1.43/nodes

So a validation was added in the service create and update endpoints;

    docker service create --log-driver=logentries --name foo nginx:alpine
    Error response from daemon: the logentries logging driver has been deprecated and removed

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-13 01:10:05 +01:00
Sebastiaan van Stijn
484e6b784c
api/types: move ContainerCreateConfig, ContainerRmConfig to api/types/backend
The `ContainerCreateConfig` and `ContainerRmConfig` structs are used for
options to be passed to the backend, and are not used in client code.

These structs are currently intended for internal use only (for example,
`AdjustCPUShares` is an internal implementation detail to adjust the container's
config when older API versions are used).

Somewhat ironically, the signature of the Backend has a nicer UX than that
of the client's `ContainerCreate` signature (which expects all options to
be passed as separate arguments), so we may want to update that signature
to be closer to what the backend is using, but that can be left as a future
exercise.

This patch moves the `ContainerCreateConfig` and `ContainerRmConfig` structs
to the backend package to prevent them from being imported in the client, and to
make it clearer that they are part of internal APIs, and not public-facing.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-05 16:41:36 +01:00
Paweł Gronowski
c5ea3d595c
liverestore: Don't remove --rm containers on restart
When live-restore is enabled, containers with autoremove enabled
shouldn't be forcibly killed when the engine restarts.
They should still be removed if they exited while the engine was down,
though.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2023-11-28 12:59:38 +01:00
Brian Goff
677d41aa3b Plumb context through info endpoint
I was trying to find out why `docker info` was sometimes slow, so this
plumbs a context through to propagate trace data.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-11-10 20:09:25 +00:00
Sebastiaan van Stijn
1c50526092
daemon: improve daemon start informational log message
When starting a daemon in debug mode (such as used in CI), many log-messages
are printed during startup. As a result, the log message indicating whether
graph-drivers or snapshotters are used may appear far separate from the
informational log about the daemon (and selected storage-driver).

The existing log message also unconditionally uses the legacy "graph-driver"
terminology, instead of the more generic "storage-driver".

This patch changes the log message shown during startup to use the generic
"storage-driver" as the field name, and adds a new field that indicates whether
we're using snapshotters or graph-drivers.

Given that snapshotters will be the default at some point, an alternative
could be to include the _type_ of driver used, for example;
`io.containerd.snapshotter.v1`, which may continue to be relevant after
snapshotters become the default, and at which point (potentially) the
type of snapshotter becomes more relevant.

Before this change:

    TEST_INTEGRATION_USE_SNAPSHOTTER=1 DOCKER_GRAPHDRIVER=overlayfs dockerd
    ...
    INFO[2023-10-31T09:12:33.586269801Z] Starting daemon with containerd snapshotter integration enabled
    INFO[2023-10-31T09:12:33.586322176Z] Loading containers: start.
    INFO[2023-10-31T09:12:33.640514759Z] Loading containers: done.
    INFO[2023-10-31T09:12:33.646498134Z] Docker daemon                                 commit=dcf7287d647bcb515015e389df46ccf1e09855b7 graphdriver=overlayfs version=dev
    INFO[2023-10-31T09:12:33.646706551Z] Daemon has completed initialization
    INFO[2023-10-31T09:12:33.658840592Z] API listen on /var/run/docker.sock

With this change;

    TEST_INTEGRATION_USE_SNAPSHOTTER=1 DOCKER_GRAPHDRIVER=overlayfs dockerd
    ...
    INFO[2023-10-31T08:41:38.841155928Z] Starting daemon with containerd snapshotter integration enabled
    INFO[2023-10-31T08:41:38.841207512Z] Loading containers: start.
    INFO[2023-10-31T08:41:38.902461053Z] Loading containers: done.
    INFO[2023-10-31T08:41:38.910535137Z] Docker daemon                                 commit=dcf7287d647bcb515015e389df46ccf1e09855b7 containerd-snapshotter=true storage-driver=overlayfs version=dev
    INFO[2023-10-31T08:41:38.910936803Z] Daemon has completed initialization

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-31 10:20:27 +01:00
Sebastiaan van Stijn
aad51c0b4e
daemon: daemon.shutdownContainer: use context.WithoutCancel
Use context.WithoutCancel so that both the containerStop and
container.Wait can share the same parent context. This context is still
a "TODO", but can be wired up in future.

It's worth noting that daemon.containerStop already uses context.WithoutCancel,
so in that function, we'll be wrapping the context twice, but this should
likely not cause issues (just redundant for this code-path).
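
The pattern, in a minimal sketch (the stop/wait helpers are hypothetical): both calls share one parent context that keeps the caller's values, such as trace data, but can no longer be cancelled.

    import "context"

    // Sketch: derive a non-cancellable context once and use it for both
    // the stop call and the subsequent wait.
    func shutdownContainer(ctx context.Context, stop, wait func(context.Context) error) error {
        ctx = context.WithoutCancel(ctx)
        if err := stop(ctx); err != nil {
            return err
        }
        return wait(ctx)
    }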

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-20 17:50:06 +02:00
Sebastiaan van Stijn
80a9fc6d36
Merge pull request #46565 from vvoland/c8d-mirrors-fix
daemon/RegistryHosts: Don't lose mirrors
2023-10-13 22:31:24 +02:00
Sebastiaan van Stijn
9670d9364d
api/types: move ContainerListOptions to api/types/container
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-12 11:29:24 +02:00
Sebastiaan van Stijn
48cacbca24
api/types: move image-types to api/types/image
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-12 11:29:20 +02:00
Sebastiaan van Stijn
cff4f20c44
migrate to github.com/containerd/log v0.1.0
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.

This patch moves our own uses of the package to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-11 17:52:23 +02:00
Paweł Gronowski
d871a665de
daemon/RegistryHosts: Don't lose mirrors
`docker.io` is present in the `IndexConfigs`, so the `Mirrors` property
would get lost because a fresh `RegistryConfig` object was created.

Instead of creating a new object, reuse the existing one and just
mutate its fields.
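
Illustrative sketch of the fix (the types are the api/types/registry ones, the helper is hypothetical):

    import "github.com/docker/docker/api/types/registry"

    // Sketch: update the existing *registry.IndexInfo in place so its
    // Mirrors survive, instead of overwriting the map entry with a freshly
    // constructed value (which is what dropped them).
    func keepMirrors(indexes map[string]*registry.IndexInfo) {
        if idx, ok := indexes["docker.io"]; ok {
            idx.Secure = true // mutate only the fields that need changing; idx.Mirrors stays intact
        }
    }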

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2023-10-11 11:43:54 +02:00
Sebastiaan van Stijn
bd523abd44
remove more direct uses of logrus
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-09-15 20:12:27 +02:00
Djordje Lukic
0313544f4a
c8d: Handle userns properly
If the daemon is run with --userns-remap, we need to chown the prepared
snapshot.

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2023-09-11 16:39:29 +02:00
Brian Goff
642e9917ff Add otel support
This uses otel standard environment variables to configure tracing in
the daemon.
It also adds support for propagating trace contexts in the client and
reading those from the API server.

See
https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/
for details on otel environment variables.
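
A small sketch of wiring a tracer provider that honours those standard variables (OTEL_EXPORTER_OTLP_ENDPOINT and friends), assuming the stock OTLP HTTP exporter is used:

    import (
        "context"

        "go.opentelemetry.io/otel"
        "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
        sdktrace "go.opentelemetry.io/otel/sdk/trace"
    )

    // Sketch only: the OTLP exporter reads the OTEL_EXPORTER_OTLP_*
    // environment variables, so no explicit endpoint configuration is
    // needed here.
    func setupTracing(ctx context.Context) (*sdktrace.TracerProvider, error) {
        exp, err := otlptracehttp.New(ctx)
        if err != nil {
            return nil, err
        }
        tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
        otel.SetTracerProvider(tp)
        return tp, nil
    }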

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-09-07 18:38:19 +00:00
Sebastiaan van Stijn
1148a24e64
migrate to new github.com/distribution/reference module
The "reference" package was moved to a separate module, which was extracted
from b9b19409cf

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-09-05 12:09:26 +02:00
Sebastiaan van Stijn
2be118379e
api/types/container: add RestartPolicyMode type and enum
Also move the validation function to live with the type definition,
which allows it to be used outside of the daemon as well.
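
Roughly the shape of such a type and validation helper (a sketch, not the exact definition):

    import "fmt"

    // A string-backed enum plus a validation helper that lives next to the
    // type, so callers outside of the daemon can reuse it.
    type RestartPolicyMode string

    const (
        RestartPolicyDisabled      RestartPolicyMode = "no"
        RestartPolicyAlways        RestartPolicyMode = "always"
        RestartPolicyOnFailure     RestartPolicyMode = "on-failure"
        RestartPolicyUnlessStopped RestartPolicyMode = "unless-stopped"
    )

    func ValidateRestartPolicy(mode RestartPolicyMode) error {
        switch mode {
        case RestartPolicyDisabled, RestartPolicyAlways, RestartPolicyOnFailure, RestartPolicyUnlessStopped:
            return nil
        default:
            return fmt.Errorf("invalid restart policy: %q", mode)
        }
    }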

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-22 16:40:57 +02:00
Sebastiaan van Stijn
74354043ff
remove uses of libnetwork/Network.Info()
Now that we removed the interface, there's no need to cast the Network
to a NetworkInfo interface, so we can remove uses of the `Info()` method.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-08 22:05:30 +02:00
Sebastiaan van Stijn
5e2a1195d7
swap logrus types for their containerd/logs aliases
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-01 13:02:55 +02:00
Sebastiaan van Stijn
96b473a0bd
Merge pull request #45992 from jedevc/registry-hosts-use-service-config
daemon: use the registry service config for getting registry hosts
2023-07-24 16:45:09 +02:00
Sebastiaan van Stijn
750c441dfd
daemon: remove intermediate var
The imgSvcConfig is defined locally, and discarded if an error occurs,
so there's no need for the intermediate vars here.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-18 14:25:41 +02:00
Sebastiaan van Stijn
375c4eb31c
daemon: rename vars that shadowed package-level types
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-18 14:06:33 +02:00
Sebastiaan van Stijn
034e7e153f
daemon: rename containerdCli to containerdClient
The containerdCli was somewhat confusing (is it the CLI?); let's rename
to make it match what it is :)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-18 13:57:27 +02:00
Justin Chadwell
eb68c5e747 daemon: support using CIDR notation for getting registry hosts
insecure-registries supports using CIDR notation; however, BuildKit in
Moby was not respecting these. We can update the RegistryHosts function
to support this by inserting the correct host into the lookup map if
it's explicitly marked as insecure.
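
A sketch of the CIDR check involved (helper name hypothetical):

    import "net"

    // Sketch: report whether host falls inside any insecure-registry entry
    // given in CIDR notation, e.g. "10.0.0.0/8".
    func insecureByCIDR(host string, insecureCIDRs []string) bool {
        ip := net.ParseIP(host)
        if ip == nil {
            return false
        }
        for _, cidr := range insecureCIDRs {
            if _, ipnet, err := net.ParseCIDR(cidr); err == nil && ipnet.Contains(ip) {
                return true
            }
        }
        return false
    }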

Signed-off-by: Justin Chadwell <me@jedevc.com>
2023-07-17 12:34:54 +01:00
Justin Chadwell
e0b065cc33 daemon: use the registry service config for getting registry hosts
The RegistryHosts lookup function is used by both BuildKit and by the
containerd snapshotter. However, this function differs in behaviour from
the config parser for the RegistryConfig:

- The protocol for insecure registries is treated as significant by
  RegistryHosts, while the RegistryConfig strips this information.
- RegistryConfig validates and deduplicates mirrors.
- RegistryConfig does not parse the insecure-registries as URLs, which
  can lead to parsing opaque URLs as was possible by the RegistryHosts
  function.

This patch updates the lookup function to ensure consistency.

Signed-off-by: Justin Chadwell <me@jedevc.com>
2023-07-17 12:30:34 +01:00
Bjorn Neergaard
8805e38398
Merge pull request #45799 from cpuguy83/containerd_logrus
Switch all logging to use containerd log pkg
2023-06-26 11:51:44 -06:00
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.
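
Call sites roughly move to the context-aware logger, e.g.:

    import (
        "context"

        "github.com/containerd/containerd/log"
    )

    // Illustrative: log.G(ctx) returns a logger bound to whatever logging
    // and trace context is attached to ctx, replacing bare logrus calls.
    func doWork(ctx context.Context) {
        log.G(ctx).WithField("component", "example").Info("doing work")
    }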

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Cory Snider
165dfd6c3e daemon: fix restoring container with missing task
Before 4bafaa00aa, if the daemon was
killed while a container was running and the container shim is killed
before the daemon is restarted, such as if the host system is
hard-rebooted, the daemon would restore the container to the stopped
state and set the exit code to 255. The aforementioned commit introduced
a regression where the container's exit code would instead be set to 0.
Fix the regression so that the exit code is once again set to 255 on
restore.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-23 11:28:45 -04:00
Djordje Lukic
32d58144fd c8d: Use reference counting while mounting a snapshot
Some snapshotters (like overlayfs or zfs) can't mount the same
directories twice. For example, if the same directory is used as an upper
directory in two mounts the kernel will output this warning:

    overlayfs: upperdir is in-use as upperdir/workdir of another mount, accessing files from both mounts will result in undefined behavior.

And indeed, accessing the files from both mounts will result in a "No
such file or directory" error.

This change introduces reference counts for the mounts: if a directory
is already mounted, the mount interface will only increment the mount
counter and return the mount target, effectively making sure that the
filesystem doesn't end up in undefined behavior.
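
A simplified sketch of reference-counted mounting (all identifiers illustrative):

    import "sync"

    // Only perform the real mount on the first request for a target; later
    // requests just bump the counter and reuse the existing mount.
    type refCountedMounter struct {
        mu     sync.Mutex
        counts map[string]int
        mount  func(target string) error
        umount func(target string) error
    }

    func (m *refCountedMounter) Mount(target string) error {
        m.mu.Lock()
        defer m.mu.Unlock()
        if m.counts == nil {
            m.counts = make(map[string]int)
        }
        if m.counts[target] == 0 {
            if err := m.mount(target); err != nil {
                return err
            }
        }
        m.counts[target]++
        return nil
    }

    func (m *refCountedMounter) Unmount(target string) error {
        m.mu.Lock()
        defer m.mu.Unlock()
        switch m.counts[target] {
        case 0:
            return nil // not mounted
        case 1:
            delete(m.counts, target)
            return m.umount(target)
        default:
            m.counts[target]--
            return nil
        }
    }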

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2023-06-07 15:50:01 +02:00
Cory Snider
0f6eeecac0 daemon: consolidate runtimes config validation
The daemon has made a habit of mutating the DefaultRuntime and Runtimes
values in the Config struct to merge defaults. This would be fine if it
was a part of the regular configuration loading and merging process,
as is done with other config options. The trouble is it does so in
surprising places, such as in functions with 'verify' or 'validate' in
their name. It has been necessary in order to validate that the user has
not defined a custom runtime named "runc" which would shadow the
built-in runtime of the same name. Other daemon code depends on the
runtime named "runc" always being defined in the config, but merging it
with the user config at the same time as the other defaults are merged
would trip the validation. The root of the issue is that the daemon has
used the same config values for both validating the daemon runtime
configuration as supplied by the user and for keeping track of which
runtimes have been set up by the daemon. Now that a completely separate
value is used for the latter purpose, surprising contortions are no
longer required to make the validation work as intended.

Consolidate the validation of the runtimes config and merging of the
built-in runtimes into the daemon.setupRuntimes() function. Set the
result of merging the built-in runtimes config and default default
runtime on the returned runtimes struct, without back-propagating it
onto the config.Config argument.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.
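
A sketch of the content-addressed naming (paths and helper name hypothetical):

    import (
        "crypto/sha256"
        "fmt"
        "os"
        "path/filepath"
    )

    // Sketch: the wrapper script's file name embeds the SHA-256 of its
    // contents, so a given script is effectively immutable and scripts for
    // different configurations never collide.
    func writeWrapperScript(dir string, contents []byte) (string, error) {
        sum := sha256.Sum256(contents)
        path := filepath.Join(dir, fmt.Sprintf("runtime-%x", sum))
        if err := os.WriteFile(path, contents, 0o700); err != nil {
            return "", err
        }
        return path, nil
    }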

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.
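
In outline, the pattern looks like this (a sketch with hypothetical names):

    import "sync/atomic"

    // Readers Load() the pointer once and use that value throughout; a
    // reload deep-copies the current config, mutates the copy, and
    // atomically swaps it in.
    type Config struct {
        Debug  bool
        Labels []string
    }

    func (c *Config) clone() *Config {
        next := *c
        next.Labels = append([]string(nil), c.Labels...)
        return &next
    }

    var configStore atomic.Pointer[Config]

    func reload(mutate func(*Config)) {
        cur := configStore.Load()
        if cur == nil {
            cur = &Config{} // seeded at daemon startup in the real code
        }
        next := cur.clone()
        mutate(next)
        configStore.Store(next)
    }

    func currentConfig() *Config {
        return configStore.Load() // capture once and pass around for a consistent view
    }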

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Cory Snider
038449467e Update BuildKit registry config on daemon reload
Historically, daemon.RegistryHosts() has returned a docker.RegistryHosts
callback function which closes over a point-in-time snapshot of the
daemon configuration. When constructing the BuildKit builder at daemon
startup, the return value of daemon.RegistryHosts() has been used.
Therefore the BuildKit builder would use the registry configuration as
it was at daemon startup for the life of the process, even if the
registry configuration is changed and the configuration reloaded.
Provide BuildKit with a RegistryHosts callback which reflects the
live daemon configuration after reloads so that registry operations
performed by BuildKit always use the same configuration as the rest of
the daemon.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:21 -04:00
Cory Snider
982e4fb448 api/server: get features from a callback fn
Passing around a bare pointer to the map of configured features in order
to propagate to consumers changes to the configuration across reloads is
dangerous. Map operations are not atomic, so concurrently reading from
the map while it is being updated is a data race as there is no
synchronization. Use a getter function to retrieve the current features
map so the features can be retrieved race-free.

Remove the unused features argument from the build router.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:43:27 -04:00
Cory Snider
9b9c5242eb daemon: lock in snapshotter setting at daemon init
Feature flags are one of the configuration items which can be reloaded
without restarting the daemon. Whether the daemon uses the containerd
snapshotter service or the legacy graph drivers is controlled by a
feature flag. However, much of the code which checks the snapshotter
feature flag assumes that the flag cannot change at runtime. Make it so
that the snapshotter setting can only be changed by restarting the
daemon, even if the flag state changes after a live configuration
reload.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-05-24 16:56:17 -04:00
Kevin Alvarez
6d139e5e95
build: use daemon id as worker id for the graph driver controller
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
2023-05-18 21:17:29 +02:00
Sebastiaan van Stijn
fb96b94ed0
daemon: remove handling for deprecated "oom-score-adjust", and produce error
This option was deprecated in 5a922dc162, which
is part of the v24.0.0 release, so we can remove it from master.

This patch;

- adds a check to ValidatePlatformConfig, and produces a fatal error
  if oom-score-adjust is set
- removes the deprecated libcontainerd/supervisor.WithOOMScore
- removes the warning from docker info

With this patch:

    dockerd --oom-score-adjust=-500 --validate
    Flag --oom-score-adjust has been deprecated, and will be removed in the next release.
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.

And when using `daemon.json`:

    dockerd --validate
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-05-06 16:36:17 +02:00
Sebastiaan van Stijn
a5d46a15f5
split GetRepository from ImageService
The GetRepository method interacts directly with the registry, and does
not depend on the snapshotter, but is used for two purposes;

For the GET /distribution/{name:.*}/json route;
dd3b71d17c/api/server/router/distribution/backend.go (L11-L15)

And to satisfy the "executor.ImageBackend" interface as used by Swarm;
58c027ac8b/daemon/cluster/executor/backend.go (L77)

This patch removes the method from the ImageService interface, and instead
implements it through a composite struct that satisfies both interfaces,
and an ImageBackend() method is added to the daemon.
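
In rough terms, such a composite could look like this (all identifiers hypothetical):

    import "context"

    // Embedding lets a single value satisfy both the distribution router's
    // backend interface and Swarm's executor.ImageBackend, without putting
    // GetRepository back on the ImageService itself.
    type imageService interface {
        PullImage(ctx context.Context, ref string) error
    }

    type repositoryGetter interface {
        GetRepository(ctx context.Context, ref string) (any, error)
    }

    type imageBackend struct {
        imageService     // the usual image operations
        repositoryGetter // registry-backed repository lookups
    }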

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

remove GetRepository from ImageService

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-04-09 12:07:57 +02:00
Djordje Lukic
15b9176d53
Add the events services to the containerd image service
No events are sent yet, these will come at a later stage.

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2023-03-30 17:48:51 +02:00