This error is returned when attempting to walk a descriptor that
*should* be an index or a manifest.
Without this, the error is not very helpful since there's no way to tell
what triggered it.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The issue was caused by the changes in https://github.com/moby/moby/pull/45504,
first released in v25.0.0-beta.1.
Signed-off-by: Christopher Petito <47751006+krissetto@users.noreply.github.com>
Don't mutate the container's `Config.WorkingDir` permanently with a
cleaned path when creating a working directory.
Move the `filepath.Clean` call into `translateWorkingDir` instead.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The `normalizeWorkdir` function has two branches, one that returns a
result of `filepath.Join` which always returns a cleaned path, and
another one where the input string is returned unmodified.
To make these two outputs consistent, also clean the path in the second
branch.
This also makes the cleaning of the container workdir explicit in the
`normalizeWorkdir` function instead of relying on
`SetupWorkingDirectory` to mutate it.
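A minimal sketch of the resulting shape, with a simplified signature (not the
exact daemon code):
```
package daemon

import "path/filepath"

// Both branches now return a cleaned path, so SetupWorkingDirectory no
// longer has to mutate the container's Config.WorkingDir afterwards.
func normalizeWorkdir(current, requested string) string {
	if !filepath.IsAbs(requested) {
		// filepath.Join always returns a cleaned path.
		return filepath.Join(string(filepath.Separator), current, requested)
	}
	// Previously returned unmodified; now cleaned for consistency.
	return filepath.Clean(requested)
}
```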
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make the internal DNS resolver for Windows containers forward requests
to upstream DNS servers when it cannot respond itself, rather than
returning SERVFAIL.
Windows containers are normally configured with the internal resolver
first for service discovery (container name lookup), then external
resolvers from '--dns' or the host's networking configuration.
When a tool like ping gets a SERVFAIL from the internal resolver, it
tries the other nameservers. But, nslookup does not, and with this
change it does not need to.
The internal resolver learns external server addresses from the
container's HNSEndpoint configuration, so it will use the same DNS
servers as processes in the container.
The internal resolver for Windows containers listens on the network's
gateway address, and each container may have a different set of external
DNS servers. So, the resolver uses the source address of the DNS request
to select external resolvers.
On Windows, daemon.json feature option 'windows-no-dns-proxy' can be used
to prevent the internal resolver from forwarding requests (restoring the
old behaviour).
Signed-off-by: Rob Murray <rob.murray@docker.com>
Deprecations:
- deprecate Prestart hook
- deprecate kernel memory limits
Additions:
- config: add idmap and ridmap mount options
- config.md: allow empty mappings for [r]idmap
- features-linux: Expose idmap information
- mount: Allow relative mount destinations on Linux
- features: add potentiallyUnsafeConfigAnnotations
- config: add support for org.opencontainers.image annotations
Minor fixes:
- config: improve bind mount and propagation doc
full diff: https://github.com/opencontainers/runtime-spec/compare/v1.1.0...v1.2.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds some nolint-comments for the deprecated kernel-memory options; we
deprecated these, but they could technically still be accepted by alternative
runtimes.
daemon/daemon_unix.go:108:3: SA1019: memory.Kernel is deprecated: kernel-memory limits are not supported in cgroups v2, and were obsoleted in [kernel v5.4]. This field should no longer be used, as it may be ignored by runtimes. (staticcheck)
memory.Kernel = &config.KernelMemory
^
daemon/update_linux.go:63:3: SA1019: memory.Kernel is deprecated: kernel-memory limits are not supported in cgroups v2, and were obsoleted in [kernel v5.4]. This field should no longer be used, as it may be ignored by runtimes. (staticcheck)
memory.Kernel = &resources.KernelMemory
^
Prestart hooks are deprecated, and more granular hooks should be used instead.
CreateRuntime hooks are the closest equivalent, and are executed at the same
point as Prestart hooks, but depending on what these hooks do, possibly one of
the other hooks could be used instead (such as CreateContainer or StartContainer).
As these hooks are still supported, this patch adds nolint comments, but adds
some TODOs to consider migrating to something else;
daemon/nvidia_linux.go:86:2: SA1019: s.Hooks.Prestart is deprecated: use [Hooks.CreateRuntime], [Hooks.CreateContainer], and [Hooks.StartContainer] instead, which allow more granular hook control during the create and start phase. (staticcheck)
s.Hooks.Prestart = append(s.Hooks.Prestart, specs.Hook{
^
daemon/oci_linux.go:76:5: SA1019: s.Hooks.Prestart is deprecated: use [Hooks.CreateRuntime], [Hooks.CreateContainer], and [Hooks.StartContainer] instead, which allow more granular hook control during the create and start phase. (staticcheck)
s.Hooks.Prestart = append(s.Hooks.Prestart, specs.Hook{
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
No IPAM IPv6 address is given to an interface in a network with
'--ipv6=false', but the kernel would assign a link-local address and,
in a macvlan/ipvlan network, the interface may get a SLAAC-assigned
address.
So, disable IPv6 on the interface to avoid that.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This reverts commit a77e147d32.
The ipvlan integration tests have been skipped in CI because of a check
intended to ensure the kernel has ipvlan support - which failed, but
seems to be unnecessary (probably because kernels have moved on).
Signed-off-by: Rob Murray <rob.murray@docker.com>
We document that a macvlan network with no parent interface is
equivalent to a '--internal' network. But, in this case, a macvlan
network was still configured with a gateway. So, DNS proxying would
be enabled in the internal resolver (and, if the host's resolver
was on a localhost address, requests to external resolvers from the
host's network namespace would succeed).
This change disables configuration of a gateway for a macvlan Endpoint
if no parent interface is specified.
(Note if a parent interface with no external network is supplied as
'-o parent=<dummy>', the gateway will still be set up. Documentation
will need to be updated to note that '--internal' should be used to
prevent DNS request forwarding in this case.)
Signed-off-by: Rob Murray <rob.murray@docker.com>
The internal DNS resolver should only forward requests to external
resolvers if the libnetwork.Sandbox served by the resolver has external
network access (so, no forwarding for '--internal' networks).
The test for external network access was whether the Sandbox had an
Endpoint with a gateway configured.
However, an ipvlan-l3 network with external network access does not
have a gateway; it has a default route bound to an interface.
Also, we document that an ipvlan network with no parent interface is
equivalent to a '--internal' network. But, in this case, an ipvlan-l2
network was configured with a gateway. So, DNS proxying would be enabled
in the internal resolver (and, if the host's resolver was on a localhost
address, requests to external resolvers from the host's network
namespace would succeed).
So, this change adjusts the test for enabling DNS proxying to include
a check for '--internal' (as a shortcut) and, for non-internal networks,
checks for a default route as well as a gateway. It also disables
configuration of a gateway or a default route for an ipvlan Endpoint if
no parent interface is specified.
(Note if a parent interface with no external network is supplied as
'-o parent=<dummy>', the gateway/default route will still be set up
and external DNS proxying will be enabled. The network must be
configured as '--internal' to prevent that from happening.)
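A minimal sketch of the adjusted decision, using illustrative names rather
than the real libnetwork API:
```
package libnetwork

// proxyDNS reports whether the internal resolver should forward requests to
// external resolvers: never for '--internal' networks; otherwise either a
// gateway (L2 networks) or a default route (ipvlan-l3) counts as evidence
// of external connectivity.
func proxyDNS(internal, hasGateway, hasDefaultRoute bool) bool {
	if internal {
		return false
	}
	return hasGateway || hasDefaultRoute
}
```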
Signed-off-by: Rob Murray <rob.murray@docker.com>
go1.21.9 (released 2024-04-03) includes a security fix to the net/http
package, as well as bug fixes to the linker, and the go/types and
net/http packages. See the [Go 1.21.9 milestone](https://github.com/golang/go/issues?q=milestone%3AGo1.21.9+label%3ACherryPickApproved)
for more details.
These minor releases include 1 security fix following the security policy:
- http2: close connections when receiving too many headers
Maintaining HPACK state requires that we parse and process all HEADERS
and CONTINUATION frames on a connection. When a request's headers exceed
MaxHeaderBytes, we don't allocate memory to store the excess headers but
we do parse them. This permits an attacker to cause an HTTP/2 endpoint
to read arbitrary amounts of header data, all associated with a request
which is going to be rejected. These headers can include Huffman-encoded
data which is significantly more expensive for the receiver to decode
than for an attacker to send.
Set a limit on the amount of excess header frames we will process before
closing a connection.
Thanks to Bartek Nowotarski (https://nowotarski.info/) for reporting this issue.
This is CVE-2023-45288 and Go issue https://go.dev/issue/65051.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.22.2
- https://github.com/golang/go/issues?q=milestone%3AGo1.21.9+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.21.8...go1.21.9
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
full diff: https://github.com/golang/net/compare/v0.22.0...v0.23.0
Includes a fix for CVE-2023-45288, which is also addressed in go1.22.2
and go1.21.9;
> http2: close connections when receiving too many headers
>
> Maintaining HPACK state requires that we parse and process
> all HEADERS and CONTINUATION frames on a connection.
> When a request's headers exceed MaxHeaderBytes, we don't
> allocate memory to store the excess headers but we do
> parse them. This permits an attacker to cause an HTTP/2
> endpoint to read arbitrary amounts of data, all associated
> with a request which is going to be rejected.
>
> Set a limit on the amount of excess header frames we
> will process before closing a connection.
>
> Thanks to Bartek Nowotarski for reporting this issue.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diffs; changes relevant to vendored code:
- https://github.com/golang/net/compare/v0.18.0...v0.22.0
- websocket: add support for dialing with context
- http2: remove suspicious uint32->v conversion in frame code
- http2: send an error of FLOW_CONTROL_ERROR when exceed the maximum octets
- https://github.com/golang/crypto/compare/v0.17.0...v0.21.0
- internal/poly1305: drop Go 1.12 compatibility
- internal/poly1305: improve sum_ppc64le.s
- ocsp: don't use iota for externally defined constants
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
illumos is the open-source continuation of OpenSolaris after Oracle
closed-sourced it (again).
For example use see: https://github.com/openbao/openbao/pull/205.
Signed-off-by: Jasper Siepkes <siepkes@serviceplanet.nl>
This was brought up by bmitch: it's not expected to have a platform
object in the config descriptor.
Also checked with tianon, who agreed: it's not _wrong_, but it is unexpected
and doesn't necessarily make sense to have it there.
Also, while technically incorrect, ECR throws an error when it sees
this.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This was using `errors.Wrap` when there was no error to wrap; instead,
we are supposed to be creating a new error.
Found this while investigating some log corruption issues and
unexpectedly getting a nil reader and a nil error from `getTailReader`.
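For illustration, a small self-contained example of the difference (not the
actual log-reader code):
```
package main

import (
	"fmt"

	"github.com/pkg/errors"
)

func main() {
	// errors.Wrap returns nil when the error being wrapped is nil, so the
	// caller saw a nil reader *and* a nil error. errors.New is what was
	// intended here.
	fmt.Println(errors.Wrap(nil, "no tail reader available")) // <nil>
	fmt.Println(errors.New("no tail reader available"))       // no tail reader available
}
```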
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The NetworkMode "default" is now normalized into the value it
aliases ("bridge" on Linux and "nat" on Windows) by the
ContainerCreate endpoint, the legacy image builder, Swarm's
cluster executor and by the container restore codepath.
builder-next is left untouched as it already uses the normalized
value (ie. bridge).
Going forward, this will make maintenance easier as there's one
less NetworkMode to care about.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Partially reverts 0046b16 "daemon: set libnetwork sandbox key w/o OCI hook"
Running SetKey to store the OCI Sandbox key after task creation, rather
than from the OCI prestart hook, meant it happened after sysctl settings
were applied by the runtime - which was the intention, we wanted to
complete Sandbox configuration after IPv6 had been disabled by a sysctl
if that was going to happen.
But, it meant '--sysctl' options for a specific network interface caused
container task creation to fail, because the interface is only moved into
the network namespace during SetKey.
This change restores the SetKey prestart hook, and regenerates config
files that depend on the container's support for IPv6 after the task has
been created. It also adds a regression test that makes sure it's possible
to set an interface-specific sysctl.
Signed-off-by: Rob Murray <rob.murray@docker.com>
The `identity.ChainIDs` call was accidentally removed in
b37ced2551.
This broke the shared size calculation for images with more than one
layer that were sharing the same compressed layer.
This could be reproduced with:
```
$ docker pull docker.io/docker/desktop-kubernetes-coredns:v1.11.1
$ docker pull docker.io/docker/desktop-kubernetes-etcd:3.5.10-0
$ docker system df
```
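For reference, a minimal sketch of the restored conversion (the surrounding
wiring is illustrative, not the exact daemon code):
```
package images

import (
	"github.com/opencontainers/go-digest"
	"github.com/opencontainers/image-spec/identity"
)

// Snapshot usage is keyed by chain ID, not by raw diff ID, so the diff IDs
// from the image config must be converted before layer sizes are summed.
func layerChainIDs(diffIDs []digest.Digest) []digest.Digest {
	return identity.ChainIDs(diffIDs)
}
```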
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
After a535a65c4b the size reported by the
image list was changed to include all platforms of that image.
This made the "shared size" calculation consider all diff IDs of all the
platforms available in the image, which caused "snapshot not found"
errors when multiple images shared the same layer that wasn't
unpacked.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is better because not every possible platform combination
needs to be defined in the Dockerfile. If built
for a platform where Delve is not supported, then it is simply
skipped.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Copy the swagger / OpenAPI file to the documentation. This is the API
version used by the upcoming v26.0.0 release.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Benchmark the `Images` implementation (image list) against an image
store with 10, 100 and 1000 random images. Currently the images are
single-platform only.
The images are generated randomly, but a fixed seed is used so the
actual testing data will be the same across different executions.
Because the content store is not a real containerd image store but a
local implementation, a small delay (500us) is added to each content
store method call. This is to simulate a real-world usage where each
containerd client call requires a gRPC call.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Commit 8921897e3b introduced the uses of `clear()`,
which requires go1.21, but Go is downgrading this file to go1.16 when used in
other projects (due to us not yet being a go module);
0.175 + xx-go build '-gcflags=' -ldflags '-X github.com/moby/buildkit/version.Version=b53a13e -X github.com/moby/buildkit/version.Revision=b53a13e4f5c8d7e82716615e0f23656893df89af -X github.com/moby/buildkit/version.Package=github.com/moby/buildkit -extldflags '"'"'-static'"'" -tags 'osusergo netgo static_build seccomp ' -o /usr/bin/buildkitd ./cmd/buildkitd
181.8 # github.com/docker/docker/libnetwork/internal/resolvconf
181.8 vendor/github.com/docker/docker/libnetwork/internal/resolvconf/resolvconf.go:509:2: clear requires go1.21 or later (-lang was set to go1.16; check go.mod)
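A minimal sketch of a go1.16-compatible replacement, assuming the value being
reset is a map (names are illustrative):
```
package resolvconf

// clearOptions is a go1.16-compatible stand-in for clear(options): deleting
// map entries while ranging over the map is well-defined in Go.
func clearOptions(options map[string]string) {
	for k := range options {
		delete(options, k)
	}
}
```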
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
52a80b40e2 extracted the `imageSummary`
function, but introduced a bug causing the whole caller function to
return when an image should merely be skipped.
`imageSummary` returns a nil error and nil image when the image doesn't
have any platform or all its platforms are not available locally.
In this case that particular image should be skipped, instead of failing
the whole image list operation.
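A minimal sketch of the fixed caller loop, with illustrative names and types:
```
package images

import "context"

// Summary is a stand-in for the API image summary type.
type Summary struct{ ID string }

// A nil summary with a nil error means "skip this image", not "abort the
// whole listing".
func listSummaries(ctx context.Context, names []string,
	imageSummary func(ctx context.Context, name string) (*Summary, error)) ([]*Summary, error) {
	var out []*Summary
	for _, name := range names {
		sum, err := imageSummary(ctx, name)
		if err != nil {
			return nil, err
		}
		if sum == nil {
			continue // no locally-available platform: skip, don't fail
		}
		out = append(out, sum)
	}
	return out, nil
}
```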
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't run the filter function, which would only iterate through the images,
reading their configs without actually checking any label.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
commit c655b7dc78 added a check to make sure
the TMP_OUT variable was not set to an empty value, as such a situation would
perform an `rm -rf /**` during cleanup.
However, it was a bit too eager, because Makefile conditionals (`ifeq`) are
evaluated when parsing the Makefile, which happens _before_ the make target
is executed.
As a result, `$@_TMP_OUT` was always empty when the `ifeq` was evaluated,
making it impossible to execute the `generate-files` target.
This patch changes the check to use a shell command to evaluate if the var
is set to an empty value.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fix `error mounting "/etc/hosts" to rootfs at "/etc/hosts": mount
/etc/hosts:/etc/hosts (via /proc/self/fd/6), flags: 0x5021: operation
not permitted`.
This error was introduced in 7d08d84b03
(`dockerd-rootless.sh: set rootlesskit --state-dir=DIR`) that changed
the filesystem of the state dir from /tmp to /run (in a typical setup).
Fix issue 47248
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This code is currently only used in the daemon, but is also needed in other
places. We should consider moving this code to github.com/moby/sys, so that
BuildKit can also use the same implementation instead of maintaining a fork;
moving it to internal allows us to reuse this code inside the repository, but
does not allow external consumers to depend on it (which we don't want as
it's not a permanent location).
As our code only uses this in linux files, I did not add a stub for other
platforms (but we may decide to do that in the moby/sys repository).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit cbc2a71c2 makes `connect` syscall fail fast when a container is
only attached to an internal network. Thanks to that, if such a
container tries to resolve an "external" domain, the embedded resolver
returns an error immediately instead of waiting for a timeout.
This commit makes sure the embedded resolver doesn't even try to forward
to upstream servers.
Co-authored-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
Adds an experimental `DOCKER_BUILDKIT_RUNC_COMMAND` variable that allows
specifying a different runc-compatible binary to be used by BuildKit's
runc executor.
This allows runtimes like sysbox to be used for the containers spawned by
BuildKit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
full diffs:
- https://github.com/protocolbuffers/protobuf-go/compare/v1.31.0...v1.33.0
- https://github.com/golang/protobuf/compare/v1.5.3...v1.5.4
From the Go security announcement list;
> Version v1.33.0 of the google.golang.org/protobuf module fixes a bug in
> the google.golang.org/protobuf/encoding/protojson package which could cause
> the Unmarshal function to enter an infinite loop when handling some invalid
> inputs.
>
> This condition could only occur when unmarshaling into a message which contains
> a google.protobuf.Any value, or when the UnmarshalOptions.UnmarshalUnknown
> option is set. Unmarshal now correctly returns an error when handling these
> inputs.
>
> This is CVE-2024-24786.
In a follow-up post;
> A small correction: This vulnerability applies when the UnmarshalOptions.DiscardUnknown
> option is set (as well as when unmarshaling into any message which contains a
> google.protobuf.Any). There is no UnmarshalUnknown option.
>
> In addition, version 1.33.0 of google.golang.org/protobuf inadvertently
> introduced an incompatibility with the older github.com/golang/protobuf
> module. (https://github.com/golang/protobuf/issues/1596) Users of the older
> module should update to github.com/golang/protobuf@v1.5.4.
govulncheck results in our code:
govulncheck ./...
Scanning your code and 1221 packages across 204 dependent modules for known vulnerabilities...
=== Symbol Results ===
Vulnerability #1: GO-2024-2611
Infinite loop in JSON unmarshaling in google.golang.org/protobuf
More info: https://pkg.go.dev/vuln/GO-2024-2611
Module: google.golang.org/protobuf
Found in: google.golang.org/protobuf@v1.31.0
Fixed in: google.golang.org/protobuf@v1.33.0
Example traces found:
#1: daemon/logger/gcplogs/gcplogging.go:154:18: gcplogs.New calls logging.Client.Ping, which eventually calls json.Decoder.Peek
#2: daemon/logger/gcplogs/gcplogging.go:154:18: gcplogs.New calls logging.Client.Ping, which eventually calls json.Decoder.Read
#3: daemon/logger/gcplogs/gcplogging.go:154:18: gcplogs.New calls logging.Client.Ping, which eventually calls protojson.Unmarshal
Your code is affected by 1 vulnerability from 1 module.
This scan found no other vulnerabilities in packages you import or modules you
require.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Turn warnings into a deprecation notice and highlight that it will
prevent daemon startup in future releases.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- full diff: https://github.com/containerd/containerd/compare/v1.7.13...v1.7.14
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.14
Welcome to the v1.7.14 release of containerd!
The fourteenth patch release for containerd 1.7 contains various fixes and updates.
Highlights
- Update builds to use go 1.21.8
- Fix various timing issues with docker pusher
- Register imagePullThroughput and count with MiB
- Move high volume event logs to Trace level
Container Runtime Interface (CRI)
- Handle pod transition states gracefully while listing pod stats
Runtime
- Update runc-shim to process exec exits before init
Dependency Changes
- github.com/containerd/nri v0.4.0 -> v0.6.0
- github.com/containerd/ttrpc v1.2.2 -> v1.2.3
- google.golang.org/genproto/googleapis/rpc 782d3b101e98 -> cbb8c96f2d6d
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
With both rootless and live restore enabled, there's some race condition
which causes the container to be `Unmount`ed before the refcount is
restored.
This makes sure we don't underflow the refcount (uint64) when
decrementing it.
The root cause of this race condition still needs to be investigated and
fixed, but at least this deflakes `TestLiveRestore`.
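A minimal sketch of the guard, with illustrative types (not the exact
graphdriver code):
```
package graphdriver

import "sync"

// refCounter never decrements a zero refcount, which would wrap the uint64
// around to MaxUint64.
type refCounter struct {
	mu     sync.Mutex
	counts map[string]uint64
}

func (r *refCounter) decrement(id string) uint64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.counts[id] > 0 {
		r.counts[id]--
	}
	return r.counts[id]
}
```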
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use a separate `devcontainer` Dockerfile target; this allows including
`gopls` in the devcontainer so it doesn't have to be installed by
the Go VS Code extension.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make sure the `ping` command used by `TestBridgeICC` actually has
the `-6` flag when it runs IPv6 test cases. Without this flag,
IPv6 connectivity isn't tested properly.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Currently this won't have any real effect because the platform matcher
matches all platforms and is only used for sorting.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Move containers counting out of `singlePlatformImage` and count them
based on the `ImageManifest` property.
(also remove ChainIDs calculation as they're no longer used)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Avoid fetching `SnapshotService` from the client every time. Fetch it once
and store it when creating the image service.
This also allows passing a custom snapshotter implementation for unit
testing.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use the `image.Store` and `content.Store` stored in the ImageService struct
instead of fetching them every time from the containerd client.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Both containerd and graphdriver image service use the same code to
create the cache - they only supply their own `cacheAdaptor` struct.
Extract the shared code to `cache.New`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Move image store backend specific code out of the cache code and move it
to a separate interface to allow using the same cache code with
containerd image store.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Rather than error out if the host's resolv.conf has a bad ndots option,
just ignore it. Still validate ndots supplied via '--dns-option' and
treat failure as an error.
Signed-off-by: Rob Murray <rob.murray@docker.com>
When this was called concurrently from the moby image
exporter there could be a data race where a layer was
written to the refs map when it was already there.
In that case the reference count got mixed up and on
release only one of these layers was actually released.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
When IPv6 is disabled in a container by, for example, using the --sysctl
option - an IPv6 address/gateway is still allocated. Don't attempt to
apply that config because doing so enables IPv6 on the interface.
Signed-off-by: Rob Murray <rob.murray@docker.com>
When configuring the internal DNS resolver - rather than keep IPv6
nameservers read from the host's resolv.conf in the container's
resolv.conf, treat them like IPv4 addresses and use them as upstream
resolvers.
For IPv6 nameservers, if there's a zone identifier in the address or
the container itself doesn't have IPv6 support, mark the upstream
addresses for use in the host's network namespace.
Signed-off-by: Rob Murray <rob.murray@docker.com>
RootlessKit will print hints if something is still unsatisfied.
e.g., `kernel.apparmor_restrict_unprivileged_userns` constraint
rootless-containers/rootlesskit@33c3e7ca6c
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
In de2447c, the creation of the 'lower' file was changed from using
os.Create to using ioutils.AtomicWriteFile, which ignores the system's
umask. This means that even though the requested permission in the
source code was always 0666, it was 0644 on systems with default
umask of 0022 prior to de2447c, so the move to AtomicFile potentially
increased the file's permissions.
This is not a security issue because the parent directory does not
allow writes into the file, but it can confuse security scanners on
Linux-based systems into giving false positives.
Signed-off-by: Jaroslav Jindrak <dzejrou@gmail.com>
The field will still be present in the response, but will always be
`false`.
Searching for `is-automated=true` will yield no results, while
`is-automated=false` will effectively be a no-op.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When using devcontainers in VSCode, install the Go extension
automatically in the container.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
While github.com/stretchr/testify is not used directly by any of the
repository code, it is a transitive dependency via Swarmkit and
therefore still easy to use without having to revendor. Add lint rules
to ban importing testify packages to make sure nobody does.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Apply command gotest.tools/v3/assert/cmd/gty-migrate-from-testify to the
cnmallocator package to be consistent with the assertion library used
elsewhere in moby.
Signed-off-by: Cory Snider <csnider@mirantis.com>
In a container-create API request, HostConfig.NetworkMode (the identity
of the "main" network) may be a name, id or short-id.
The configuration for that network, including preferred IP address etc,
may be keyed on network name or id - it need not match the NetworkMode.
So, when migrating the old container-wide MAC address to the new
per-endpoint field - it is not safe to create a new EndpointSettings
entry unless there is no possibility that it will duplicate settings
intended for the same network (because one of the duplicates will be
discarded later, dropping the settings it contains).
This change introduces a new API restriction: if the deprecated
container-wide field is used in the new API, and EndpointsConfig is provided
for any network, the NetworkMode and the key under which the EndpointsConfig
is stored must be the same - no mixing of IDs and names.
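A minimal sketch of the restriction, with hypothetical names (not the exact
daemon validation code):
```
package daemon

import "fmt"

// When the deprecated container-wide MAC address is set and EndpointsConfig
// is provided, one of the EndpointsConfig keys must match NetworkMode
// exactly, so the migrated address cannot end up duplicated under a second,
// differently-spelled key for the same network.
func validateMacMigration(networkMode, containerWideMAC string, endpointKeys []string) error {
	if containerWideMAC == "" || len(endpointKeys) == 0 {
		return nil
	}
	for _, key := range endpointKeys {
		if key == networkMode {
			return nil
		}
	}
	return fmt.Errorf("the container-wide MAC address requires EndpointsConfig to be keyed by NetworkMode (%q): no mixing of IDs and names", networkMode)
}
```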
Signed-off-by: Rob Murray <rob.murray@docker.com>
This message accidentally changed in ac2a028dcc
because my IDE's "refactor tool" was a bit over-enthusiastic. It also went and
updated the tests accordingly, so CI didn't catch this :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Moby imports Swarmkit; Swarmkit no longer imports Moby. In order to
accomplish this feat, Swarmkit has introduced a new plugin.Getter
interface so it could stop importing our pkg/plugingetter package. This
new interface is not entirely compatible with our
plugingetter.PluginGetter interface, necessitating a thin adapter.
Swarmkit had to jettison the CNM network allocator to stop having to
import libnetwork as the cnmallocator package is deeply tied to
libnetwork. Move the CNM network allocator into libnetwork, where it
belongs. The package had a short and uninteresting Git history in the
Swarmkit repository, so no effort was made to retain it.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This patch disables pulling legacy (schema1 and schema 2, version 1) images by
default.
A `DOCKER_ENABLE_DEPRECATED_PULL_SCHEMA_1_IMAGE` environment-variable is
introduced to allow re-enabling this feature, aligning with the environment
variable used in containerd 2.0 (`CONTAINERD_ENABLE_DEPRECATED_PULL_SCHEMA_1_IMAGE`).
With this patch, attempts to pull a legacy image produce an error:
With graphdrivers:
docker pull docker:1.0
1.0: Pulling from library/docker
[DEPRECATION NOTICE] Docker Image Format v1, and Docker Image manifest version 2, schema 1 support will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format, or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
With the containerd image store enabled, output is slightly different
as it returns the error before printing the `1.0: pulling ...`:
docker pull docker:1.0
Error response from daemon: [DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
Using the "distribution" endpoint to resolve the digest for an image also
produces an error:
curl -v --unix-socket /var/run/docker.sock http://foo/distribution/docker.io/library/docker:1.0/json
* Trying /var/run/docker.sock:0...
* Connected to foo (/var/run/docker.sock) port 80 (#0)
> GET /distribution/docker.io/library/docker:1.0/json HTTP/1.1
> Host: foo
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 400 Bad Request
< Api-Version: 1.45
< Content-Type: application/json
< Docker-Experimental: false
< Ostype: linux
< Server: Docker/dev (linux)
< Date: Tue, 27 Feb 2024 16:09:42 GMT
< Content-Length: 354
<
{"message":"[DEPRECATION NOTICE] Docker Image Format v1, and Docker Image manifest version 2, schema 1 support will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format, or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/"}
* Connection #0 to host foo left intact
Starting the daemon with the `DOCKER_ENABLE_DEPRECATED_PULL_SCHEMA_1_IMAGE`
env-var set to a non-empty value allows pulling the image;
docker pull docker:1.0
[DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
b0a0e6710d13: Already exists
d193ad713811: Already exists
ba7268c3149b: Already exists
c862d82a67a2: Already exists
Digest: sha256:5e7081837926c7a40e58881bbebc52044a95a62a2ea52fb240db3fc539212fe5
Status: Image is up to date for docker:1.0
docker.io/library/docker:1.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When creating a new daemon in `TestDaemonProxy`, reset
`OTEL_EXPORTER_OTLP_ENDPOINT` to an empty value to disable OTEL
collection, so it doesn't hit the proxy.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This allows enabling host loopback by setting
DOCKERD_ROOTLESS_ROOTLESSKIT_DISABLE_HOST_LOOPBACK to false
(it defaults to true).
Signed-off-by: serhii.n <serhii.n@thescimus.com>
Don't use all `*.json` files blindly; take only those that are likely to
be reports from `go test`.
Also, use `find ... -exec` instead of piping results to `xargs`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The current implementation of Checkpoint/Restore (C/R) in Docker
writes the checkpoint to the content store. However, when restoring,
libcontainerd uses .Digest().Encoded(), which drops the algorithm
information, leading to an error.
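For illustration, a small self-contained example of the difference (not the
actual libcontainerd code):
```
package main

import (
	"fmt"

	"github.com/opencontainers/go-digest"
)

func main() {
	d := digest.FromString("checkpoint")
	// Encoded() drops the algorithm prefix, so the value can no longer be
	// resolved back to a content-store entry; String() keeps "sha256:<hex>".
	fmt.Println(d.Encoded()) // "<hex>"
	fmt.Println(d.String())  // "sha256:<hex>"
}
```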
Signed-off-by: huang-jl <1046678590@qq.com>
BuildKit added support for exporting metrics in:
7de2e4fb32
Explicitly set the protocol for exporting metrics like we do for the
traces. We need that because BuildKit defaults to gRPC.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
30c069cb03
removed the `ResolveImageConfig` method in favor of the more generic
`ResolveSourceMetadata`, which can also support sources other than images.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
e358792815
changed that field to a function and added an `OverrideResource`
function that allows overriding it.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The StaticDirSource definition changed, and it can no longer be initialized
from a composite literal.
a80b48544c
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All other progress updates are emitted with a truncated ID.
```diff
$ docker pull --platform linux/amd64 alpine
Using default tag: latest
latest: Pulling from library/alpine
-sha256:4abcf20661432fb2d719aaf90656f55c287f8ca915dc1c92ec14ff61e67fbaf8: Pulling fs layer
+4abcf2066143: Download complete
Digest: sha256:c5b1261d6d3e43071626931fc004f70149baeba2c8ec672bd4f27761f8e1ad6b
Status: Image is up to date for alpine:latest
docker.io/library/alpine:latest
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Keep the same behavior for older clients.
Otherwise, a client can't opt out (because `ReadOnlyNonRecursive` is
unsupported before API 1.44).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Commit e6907243af applied a fix for situations
where the client was configured with API-version negotiation, but did not yet
negotiate a version.
However, the checkVersion() function that was implemented copied the semantics
of cli.NegotiateAPIVersion, which ignored connection failures with the
assumption that connection errors would still surface further down.
Consequently, when the result of a failed negotiation was used for NewVersionError,
an API version mismatch error would be produced, masking the actual connection
error.
This patch changes the signature of checkVersion to return unexpected errors,
including failures to connect to the API.
Before this patch:
docker -H unix:///no/such/socket.sock secret ls
"secret list" requires API version 1.25, but the Docker daemon API version is 1.24
With this patch applied:
docker -H unix:///no/such/socket.sock secret ls
Cannot connect to the Docker daemon at unix:///no/such/socket.sock. Is the docker daemon running?
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function has various errors that are returned when failing to make a
connection (due to permission issues, TLS mis-configuration, or failing to
resolve the TCP address).
The errConnectionFailed error is currently used as a special case when
processing Ping responses. The current code did not consistently treat
connection errors, and because of that could either absorb the error,
or process the empty response.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
NegotiateAPIVersion was ignoring errors returned by Ping. The intent here
was to handle API responses from a daemon that may be in an unhealthy state,
however this case is already handled by Ping itself.
Ping only returns an error when either failing to connect to the API (daemon
not running or permissions errors), or when failing to parse the API response.
Neither of those should be ignored in this code, or considered a successful
"ping", so update the code to return those errors.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was added in 27ef09a46f, which changed
the Ping handling to ignore internal server errors. That case is tested in
TestPingFail, which verifies that we accept the Ping response if a 500
status code was received.
The TestPingWithError test was added to verify behavior if a protocol
(connection) error occurred; however the mock-client returned both a
response, and an error; the error returned would only happen if a connection
error occurred, which means that the server would not provide a reply.
Running the test also shows that returning a response is unexpected, and
ignored:
=== RUN TestPingWithError
2024/02/23 14:16:49 RoundTripper returned a response & error; ignoring response
2024/02/23 14:16:49 RoundTripper returned a response & error; ignoring response
--- PASS: TestPingWithError (0.00s)
PASS
This patch updates the test to remove the response.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't error out when the mount source doesn't exist and the mount has the
`CreateMountpoint` option enabled.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Any PR that is labeled with any `impact/*` label should have a
description for the changelog and an `area/*` label.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
A common pattern in libnetwork is to delete an object using
`DeleteAtomic` (i.e. checking the optimistic lock), wrapped in a retry
loop that refreshes the data and the version index used by the optimistic
lock.
This commit introduces a new `Delete` method that deletes without
checking the optimistic lock. It is used only in the few places where
it's obvious the calling code doesn't rely on the side effects of the
retry loop (i.e. refreshing the object to be deleted).
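A minimal sketch of the two patterns, using illustrative interfaces rather
than the real datastore API:
```
package datastore

import "errors"

var errKeyModified = errors.New("key modified by concurrent writer")

// kvObject and store are illustrative, not the real libnetwork types.
type kvObject interface {
	Key() string
	Index() uint64 // version used by the optimistic lock
}

type store interface {
	GetObject(key string, o kvObject) error
	DeleteAtomic(o kvObject) error // fails with errKeyModified if Index() is stale
	Delete(o kvObject) error       // new: unconditional delete, no version check
}

// deleteWithRetry is the old pattern: refresh and retry until DeleteAtomic
// succeeds. Callers that don't need the refreshed object can now call
// Delete once instead.
func deleteWithRetry(s store, o kvObject) error {
	for {
		err := s.DeleteAtomic(o)
		if !errors.Is(err, errKeyModified) {
			return err // nil, or a real error
		}
		if err := s.GetObject(o.Key(), o); err != nil {
			return err
		}
	}
}
```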
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
I noticed that this log didn't use structured logs;
[resolver] failed to query DNS server: 10.115.11.146:53, query: ;google.com.\tIN\t A" error="read udp 172.19.0.2:46361->10.115.11.146:53: i/o timeout
[resolver] failed to query DNS server: 10.44.139.225:53, query: ;google.com.\tIN\t A" error="read udp 172.19.0.2:53991->10.44.139.225:53: i/o timeout
But other logs did;
DEBU[2024-02-20T15:48:51.026704088Z] [resolver] forwarding query client-addr="udp:172.19.0.2:39661" dns-server="udp:192.168.65.7:53" question=";google.com.\tIN\t A"
DEBU[2024-02-20T15:48:51.028331088Z] [resolver] forwarding query client-addr="udp:172.19.0.2:35163" dns-server="udp:192.168.65.7:53" question=";google.com.\tIN\t AAAA"
DEBU[2024-02-20T15:48:51.057329755Z] [resolver] received AAAA record "2a00:1450:400e:801::200e" for "google.com." from udp:192.168.65.7
DEBU[2024-02-20T15:48:51.057666880Z] [resolver] received A record "142.251.36.14" for "google.com." from udp:192.168.65.7
As we're already constructing a logger with these fields, we may as well use it.
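A minimal, self-contained sketch of the pattern (the daemon's actual logger
setup differs; addresses and error text are taken from the log lines above):
```
package main

import (
	"errors"

	"github.com/sirupsen/logrus"
)

func main() {
	logger := logrus.WithFields(logrus.Fields{
		"client-addr": "udp:172.19.0.2:39661",
		"dns-server":  "udp:10.115.11.146:53",
		"question":    ";google.com.\tIN\t A",
	})
	// Reusing the field-scoped logger keeps this failure message consistent
	// with the other structured "[resolver]" debug lines.
	logger.WithError(errors.New("i/o timeout")).Error("[resolver] failed to query DNS server")
}
```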
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Allow overriding the PAGER/GIT_PAGER variables inside the container.
Use `cat` as the pager when running in GitHub Actions (to avoid things like
`git diff` stalling the CI).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't use OTEL tracing in this test because we're testing the HTTP proxy
behavior here and we don't want OTEL to interfere.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This will return a single entry for each name/value pair, and for now
all the "image specific" metadata (labels, config, size) should be
either "default platform" or "first platform we have locally" (which
then matches the logic for commands like `docker image inspect`, etc)
with everything else (just ID, maybe?) coming from the manifest
list/index.
That leaves room for the longer-term implementation to add new fields to
describe the _other_ images that are part of the manifest list/index.
Co-authored-by: Tianon Gravi <admwiggin@gmail.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
v1.33.0 is also available, but it would also cause
`github.com/aws/aws-sdk-go-v2` to change from v1.24.1 to v1.25.0
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
DNS names were only set up for user-defined networks. On Linux, none
of the built-in networks (bridge/host/none) have built-in DNS, so they
don't need DNS names.
But, on Windows, the default network is "nat" and it does need the DNS
names.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This matches the prior behavior before 2a6ff3c24f.
This also updates the Swagger documentation for the current version to note that the field might be the empty string and what that means.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Archives being unpacked by Dockerfiles may have been created on other
OSes with different conventions and semantics for xattrs, making them
impossible to apply when extracting. Restore the old best-effort xattr
behaviour users have come to depend on in the classic builder.
The (archive.Archiver).UntarPath function does not allow the options
passed to Untar to be customized. It also happens to be a trivial
wrapper around the Untar function. Inline the function body and add the
option.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Update to the latest patch release, which contains changes from v0.13.5 to
remove the reference package from "github.com/docker/distribution", which
is now a separate module.
full diff: https://github.com/containerd/nydus-snapshotter/compare/v0.8.2...v0.13.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Non-swarm networks created before network-creation-time validation
was added in 25.0.0 continued working, because the checks are not
re-run.
But, swarm creates networks when needed (with 'agent=true'), to
ensure they exist on each agent - ignoring the NetworkNameError
that says the network already existed.
By ignoring validation errors on creation of a network with
agent=true, pre-existing swarm networks with IPAM config that would
fail the new checks will continue to work too.
New swarm (overlay) networks are still validated, because they are
initially created with 'agent=false'.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This spec is not directly relevant for the image spec, and the Docker
documentation no longer includes the actual specification.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prior to release 25.0.0, the bridge in an internal network was assigned
an IP address - making the internal network accessible from the host,
giving containers on the network access to anything listening on the
bridge's address (or INADDR_ANY on the host).
This change restores that behaviour. It does not restore the default
route that was configured in the container, because packets sent outside
the internal network's subnet have always been dropped. So, a 'connect()'
to an address outside the subnet will still fail fast.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Replace regex matching/replacement and re-reading of generated files
with a simple parser, and struct to remember and manipulate the file
content.
Annotate the generated file with a header comment saying the file is
generated, but can be modified, and a trailing comment describing how
the file was generated and listing external nameservers.
Always start with the host's resolv.conf file, whether generating config
for host networking, or with/without an internal resolver - rather than
editing a file previously generated for a different use-case.
Resolves an issue where rewrites of the generated file resulted in
default IPv6 nameservers being unnecessarily added to the config.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This const contains the minimum API version that can be supported by the
API server. The daemon is currently configured to use the same version,
but we may increment the _configured_ minimum version when deprecating
old API versions in future.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 08e4e88482 (Docker Engine v25.0.0)
deprecated API version v1.23 and lower, but older API versions could be
enabled through the DOCKER_MIN_API_VERSION environment variable.
This patch removes all support for API versions < v1.24.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.20 (Docker Engine v1.11.0) and older allowed a HostConfig to be passed
when starting a container. This feature was deprecated in API v1.21 (Docker
Engine v1.10.0) in 3e7405aea8, and removed in
API v1.23 (Docker Engine v1.12.0) in commit 0a8386c8be.
API v1.23 and older are deprecated, and this patch removes the feature.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 322e2a7d05 changed the format of errors
returned by the API to be in JSON format for API v1.24. Older versions of
the API returned errors in plain-text format.
API v1.23 and older are deprecated, so we can remove support for plain-text
error responses.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This endpoint was deprecated in API v1.20 (Docker Engine v1.8.0) in
commit db9cc91a9e, in favor of the
`PUT /containers/{id}/archive` and `HEAD /containers/{id}/archive`
endpoints, and disabled in API v1.24 (Docker Engine v1.12.0) through
commit 428328908d.
This patch removes the endpoint, and the associated `daemon.ContainerCopy`
method in the backend.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.21 (Docker Engine v1.9.0) enforces the request to have a JSON
content-type on exec start (see 45dc57f229).
An exception was added in 0b5e628e14 to
make this check conditional (supporting API < 1.21).
API v1.23 and older are deprecated, and this patch removes the feature.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.23 and older are deprecated, so we can remove the code to adjust
responses for API v1.20 and lower.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The TestInspectAPIContainerResponse mentioned that Windows does not
support API versions before v1.25.
While, technically, no stable release existed for Windows with API versions
before that (see f811d5b128), API version
v1.24 was enabled in e4af39aeb3 to have
a consistent fallback version for API version negotiation.
This patch updates the test to reflect that change.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.23 and older are deprecated, so we can remove the code to adjust
responses for API v1.19 and lower.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.20 and up produces an error when signalling / killing a non-running
container (see c92377e300). Older API versions
allowed this, and an exception was added in 621e3d8587.
API v1.23 and older are deprecated, so we can remove this handling.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API versions before 1.19 allowed CpuShares that were greater than the maximum
or less than the minimum supported by the kernel, and relied on the kernel to
do the right thing.
Commit ed39fbeb2a introduced code to adjust the
CPU shares to be within the accepted range when using API version 1.18 or
lower.
API v1.23 and older are deprecated, so we can remove support for this
functionality.
Currently, there's no validation for CPU shares to be within an acceptable
range; a TODO was added to add validation for this option, and to use the
`linuxMinCPUShares` and `linuxMaxCPUShares` consts for this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "pull" option was added in API v1.16 (Docker Engine v1.4.0) in commit
054e57a622, which gated the option by API
version.
API v1.23 and older are deprecated, so we can remove the gate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "rm" option was made the default in API v1.12 (Docker Engine v1.0.0)
in commit b60d647172, and "force-rm" was
added in 667e2bd4ea.
API v1.23 and older are deprecated, so we can remove these gates.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "pause" flag was added in API v1.13 (Docker Engine v1.1.0), and is
enabled by default (see 17d870bed5).
API v1.23 and older are deprecated, so we can remove the version-gate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Inspect and history used two different ways to find the present images.
This made history fail in some cases where image inspect would work (if
a configuration of a manifest wasn't found in the content store).
With this change we now use the same logic for both inspect and history.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 9f6c532f59
futex: Add sys_futex_wake()
To complement sys_futex_waitv() add sys_futex_wake(). This syscall
implements what was previously known as FUTEX_WAKE_BITSET except it
uses 'unsigned long' for the bitmask and takes FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cb8c4312af
futex: Add sys_futex_wait()
To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
syscall implements what was previously known as FUTEX_WAIT_BITSET
except it uses 'unsigned long' for the value and bitmask arguments,
takes timespec and clockid_t arguments for the absolute timeout and
uses FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 0f4b5f9722
futex: Add sys_futex_requeue()
Finish off the 'simple' futex2 syscall group by adding
sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are
too numerous to fit into a regular syscall. As such, use struct
futex_waitv to pass the 'source' and 'destination' futexes to the
syscall.
This syscall implements what was previously known as FUTEX_CMP_REQUEUE
and uses {val, uaddr, flags} for source and {uaddr, flags} for
destination.
This design explicitly allows requeueing between different types of
futex by having a different flags word per uaddr.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: c35559f94e
x86/shstk: Introduce map_shadow_stack syscall
When operating with shadow stacks enabled, the kernel will automatically
allocate shadow stacks for new threads, however in some cases userspace
will need additional shadow stacks. The main example of this is the
ucontext family of functions, which require userspace allocating and
pivoting to userspace managed stacks.
Unlike most other user memory permissions, shadow stacks need to be
provisioned with special data in order to be useful. They need to be setup
with a restore token so that userspace can pivot to them via the RSTORSSP
instruction. But, the security design of shadow stacks is that they
should not be written to except in limited circumstances. This presents a
problem for userspace, as to how userspace can provision this special
data, without allowing for the shadow stack to be generally writable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 09da082b07
fs: Add fchmodat2()
On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].
There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.
The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
unlike existing fchmodat.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cf264e1329
NAME
cachestat - query the page cache statistics of a file.
SYNOPSIS
#include <sys/mman.h>
struct cachestat_range {
__u64 off;
__u64 len;
};
struct cachestat {
__u64 nr_cache;
__u64 nr_dirty;
__u64 nr_writeback;
__u64 nr_evicted;
__u64 nr_recently_evicted;
};
int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
struct cachestat *cstat, unsigned int flags);
DESCRIPTION
cachestat() queries the number of cached pages, number of dirty
pages, number of pages marked for writeback, number of evicted
pages, number of recently evicted pages, in the bytes range given by
`off` and `len`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syscall is gated by CAP_SYS_NICE, matching the profile in containerd.
containerd: a6e52c74fa
libseccomp: d83cb7ac25
kernel: c6018b4b25
mm/mempolicy: add set_mempolicy_home_node syscall
This syscall can be used to set a home node for the MPOL_BIND and
MPOL_PREFERRED_MANY memory policy. Users should use this syscall after
setting up a memory policy for the specified range as shown below.
mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
new_nodes->size + 1, 0);
sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
home_node, 0);
The syscall allows specifying a home node/preferred node from which
kernel will fulfill memory allocation requests first.
...
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The compatibility depends on whether `hyperv` or `process` container
isolation is used.
This fixes cache not being used when building images based on older
Windows versions on a newer Windows host.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Only print the tag when the received reference has a tag; if
we can't cast the received reference to `reference.Tagged`, then
skip printing the tag, as it's likely a digest.
Fixes a panic when trying to install a plugin from a reference
with a digest such as
`vieux/sshfs@sha256:1d3c3e42c12138da5ef7873b97f7f32cf99fb6edde75fa4f0bcf9ed277855811`
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Since 964ab7158c, we explicitly set the bridge MTU if it was specified.
Unfortunately, kernels < v4.17 have a check preventing us from manually
setting the MTU to anything greater than 1500 if no links are attached to
the bridge, which is how we do things -- create the bridge, set its MTU,
and later on attach veths to it.
Relevant kernel commit: 804b854d37
As we still have to support CentOS/RHEL 7 (and their old v3.10 kernels)
for a few more months, we need to ignore EINVAL if the MTU is > 1500
(but <= 65535).
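A minimal sketch of the workaround, assuming a netlink-based setter (not the
exact call site):
```
package bridge

import (
	"errors"

	"github.com/vishvananda/netlink"
	"golang.org/x/sys/unix"
)

// setBridgeMTU tolerates EINVAL for otherwise-valid MTU values, because
// kernels older than v4.17 reject an MTU > 1500 on a bridge that has no
// ports attached yet.
func setBridgeMTU(link netlink.Link, mtu int) error {
	err := netlink.LinkSetMTU(link, mtu)
	if errors.Is(err, unix.EINVAL) && mtu > 1500 && mtu <= 65535 {
		return nil // old kernel: ignore the error
	}
	return err
}
```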
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Commit 4f47013feb introduced a new validation step to make sure no
IPv6 subnet is configured on a network which has EnableIPv6=false.
Commit 5d5eeac310 then removed that validation step and automatically
enabled IPv6 for networks with a v6 subnet. But this specific commit
was reverted in c59e93a67b and now the error introduced by 4f47013feb
is re-introduced.
But it turns out some users expect a network created with an IPv6
subnet and EnableIPv6=false to actually have no IPv6 connectivity.
This restores that behavior.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The previous commit made getDBhandle a one-liner returning a struct
member -- making it useless. Inline it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This parameter was used to tell the boltdb kvstore not to open/close
the underlying boltdb db file before/after each get/put operation.
Since d21d0884ae, we've had a single datastore instance shared by all
components that need it. That commit set `PersistConnection=true`.
We can now safely remove this param altogether, and remove all the
code that was opening and closing the db file before and after each
operation -- it's dead code!
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This test is non-representative of what we now do in libnetwork.
Since the ability of opening the same boltdb database multiple
times in parallel will be dropped in the next commit, just remove
this test.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Adds a test case for installing a plugin from a remote in the form
of `plugin-content-trust@sha256:d98f2f8061...`, which is currently
causing the daemon to panic, as we found while running the CLI e2e
tests:
```
docker plugin install registry:5000/plugin-content-trust@sha256:d98f2f806144bf4ba62d4ecaf78fec2f2fe350df5a001f6e3b491c393326aedb
```
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
The monitorDaemon() goroutine calls startContainerd() then blocks on
<-daemonWaitCh to wait for it to exit. The startContainerd() function
would (re)initialize the daemonWaitCh so a restarted containerd could be
waited on. This implementation was race-free because startContainerd()
would synchronously initialize the daemonWaitCh before returning. When
the call to start the managed containerd process was moved into the
waiter goroutine, the code to initialize the daemonWaitCh struct field
was also moved into the goroutine. This introduced a race condition.
Move the daemonWaitCh initialization to guarantee that it happens before
the startContainerd() call returns.
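A simplified, hypothetical sketch of the fix: the channel is created
synchronously inside startContainerd(), so the caller can block on it without
racing the waiter goroutine (the real code starts and waits on a containerd
process):
```
package main

import "time"

type remote struct {
	daemonWaitCh chan struct{}
}

func (r *remote) startContainerd() {
	// Initialize the channel before returning (and before spawning the waiter),
	// so <-r.daemonWaitCh in the caller never observes a nil or stale channel.
	r.daemonWaitCh = make(chan struct{})
	go func() {
		defer close(r.daemonWaitCh)
		time.Sleep(100 * time.Millisecond) // stand-in for waiting on the process
	}()
}

func main() {
	r := &remote{}
	r.startContainerd()
	<-r.daemonWaitCh // safe: the channel was initialized before startContainerd returned
}
```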
Signed-off-by: Cory Snider <csnider@mirantis.com>
Containers attached to an 'internal' bridge network are unable to
communicate when the host is running firewalld.
Non-internal bridges are added to a trusted 'docker' firewalld zone, but
internal bridges were not.
DOCKER-ISOLATION iptables rules are still configured for an internal
network; they block traffic to/from addresses outside the network's subnet.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Do not set 'Config.MacAddress' in inspect output unless the MAC address
is configured.
Also, make sure it is filled in for a configured address on the default
network before the container is started (by translating the network name
from 'default' to 'config' so that the address lookup works).
Signed-off-by: Rob Murray <rob.murray@docker.com>
The API's EndpointConfig struct has a MacAddress field that's used for
both the configured address, and the current address (which may be generated).
A configured address must be restored when a container is restarted, but a
generated address must not.
The previous attempt to differentiate between the two, without adding a field
to the API's EndpointConfig that would show up in 'inspect' output, was a
field in the daemon's version of EndpointSettings, MACOperational. It did
not work: MACOperational was set to true even when a configured address was
used. So, while it ensured addresses were regenerated, it failed to preserve
a configured address.
So, this change removes that code, and adds DesiredMacAddress to the wrapped
version of EndpointSettings, where it is persisted but does not appear in
'inspect' results. Its value is copied from MacAddress (the API field) when
a container is created.
Signed-off-by: Rob Murray <rob.murray@docker.com>
File paths can contain commas, particularly paths returned from
t.TempDir() in subtests which include commas in their names. There is
only one datastore provider and it only supports a single address, so
the only use of parsing the address is to break tests in mysterious
ways.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The bbolt library wants exclusive access to the boltdb file and uses
file locking to assure that is the case. The controller and each network
driver that needs persistent storage instantiates its own unique
datastore instance, backed by the same boltdb file. The boltdb kvstore
implementation works around multiple access to the same boltdb file by
aggressively closing the boltdb file between each transaction. This is
very inefficient. Have the controller pass its datastore instance into
the drivers and enable the PersistConnection option to disable closing
the boltdb between transactions.
Set data-dir in unit tests which instantiate libnetwork controllers so
they don't hang trying to lock the default boltdb database file.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The double quotes inside a single-quoted string don't need to be
escaped.
It looks like different PowerShell versions treat this differently,
and it started failing unexpectedly without any changes on our side.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- full diff: https://github.com/actions/setup-go/compare/v3.5.0...v5.0.0
v5
This release changes the Node.js runtime from node16 to node20 and updates
some dependencies to the latest versions.
It also contains the following changes:
- Fix hosted tool cache usage on windows
- Improve documentation regarding dependencies caching
V4
The V4 edition of the action offers:
- Enabled caching by default
- The action will try to enable caching unless the cache input is explicitly
set to false.
Please see "Caching dependency files and build outputs" for more information:
https://github.com/actions/setup-go#caching-dependency-files-and-build-outputs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If a reader has caught up to the logger and is waiting for the next
message, it should stop waiting when the logger is closed. Otherwise
the reader will unnecessarily wait the full closedDrainTimeout for no
log messages to arrive.
This case was overlooked when the journald reader was recently
overhauled to be compatible with systemd 255, and the reader tests only
failed when a logical race happened to settle in such a way to exercise
the bugged code path. It was only after implicit flushing on close was
added to the journald test harness that the Follow tests would
repeatably fail due to this bug. (No new regression tests are needed.)
Signed-off-by: Cory Snider <csnider@mirantis.com>
The journald reader test harness injects an artificial asynchronous
delay into the logging pipeline: a logged message won't be written to
the journal until at least 150ms after the Log() call returns. If a test
returns while log messages are still in flight to be written, the logs
may attempt to be written after the TempDir has been cleaned up, leading
to spurious errors.
The logger read tests which interleave writing and reading have to
include explicit synchronization points to work reliably with this delay
in place. On the other hand, tests should not be required to sync the
logger explicitly before returning. Override the Close() method in the
test harness wrapper to wait for in-flight logs to be flushed to disk.
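A minimal sketch of such a wrapper (hypothetical types; the real harness feeds
entries to systemd-journal-remote):
```
package main

import (
	"sync"
	"time"
)

type asyncLogger struct {
	wg sync.WaitGroup
}

// Log simulates the harness's artificial delay: the message is written
// asynchronously, at least 150ms after Log returns.
func (l *asyncLogger) Log(msg string) {
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		time.Sleep(150 * time.Millisecond)
		_ = msg // the real harness writes the entry to the journal here
	}()
}

// Close blocks until every in-flight message has been flushed, so tests can
// return without racing the cleanup of their temporary directories.
func (l *asyncLogger) Close() error {
	l.wg.Wait()
	return nil
}

func main() {
	l := &asyncLogger{}
	l.Log("hello")
	_ = l.Close()
}
```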
Signed-off-by: Cory Snider <csnider@mirantis.com>
- Check the return value when logging messages
- Log the stream (stdout/stderr) and list of messages that were not read
- Wait until the logger is closed before returning early (panic/fatal)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Writing the systemd-journal-remote command output directly to os.Stdout
and os.Stderr makes it nearly impossible to tell which test case the
output is related to when the tests are not run in verbose mode. Extend
the journald sender fake to redirect output to the test log so they
interleave with the rest of the test output.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Go race detector was detecting a data race when running the
TestLogRead/Follow/Concurrent test against the journald logging driver.
The race was in the test harness, specifically syncLogger. The waitOn
field would be reassigned each time a log entry is sent to the journal,
which is not concurrency-safe. Make it concurrency-safe using the same
patterns that are used in the log follower implementation to synchronize
with the logger.
Signed-off-by: Cory Snider <csnider@mirantis.com>
When saving an image, treat `image@sha256:abcdef...` the same as
`abcdef...`. This makes it:
- Not export the digested tag as the image name
- Not try to export all tags from the image repository
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Saving an image via digested reference, ID or truncated ID doesn't store
the image reference in the archive. This also causes the save code to
not add the image's manifest to the index.json.
This commit explicitly adds the untagged manifests to the index.json if
no tagged manifests were added.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
errDrainDone is a sentinel error which is never supposed to escape the
package. Consequently, it needs to be filtered out of returns all over
the place, adding boilerplate. Forgetting to filter out these errors
would be a logic bug which the compiler would not help us catch. Replace
it with boolean multi-valued returns as they can't be accidentally
ignored or propagated.
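A small illustration of the two styles (hypothetical function names, not the
actual journald reader code):
```
package main

import (
	"errors"
	"fmt"
)

// Before: a sentinel error that must never escape the package, so every
// caller has to remember to filter it out.
var errDrainDone = errors.New("drain done")

func drainWithSentinel(done bool) error {
	if done {
		return errDrainDone
	}
	return nil
}

// After: a boolean return can't be accidentally propagated as an error.
func drain(done bool) (more bool) {
	return !done
}

func main() {
	if err := drainWithSentinel(true); err != nil && !errors.Is(err, errDrainDone) {
		fmt.Println("unexpected error:", err)
	}
	if more := drain(true); !more {
		fmt.Println("drained, nothing more to read")
	}
}
```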
Signed-off-by: Cory Snider <csnider@mirantis.com>
While it doesn't really matter if the reader waits for an extra
arbitrary period beyond an arbitrary hardcoded timeout, it's also
trivial and cheap to implement, and nice to have.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The journald reader uses a timer to set an upper bound on how long to
wait for the final log message of a stopped container. However, the
timer channel is only received from in non-blocking select statements!
There isn't enough benefit to using a timer to offset the cost of having
to manage the timer resource. Setting a deadline and comparing the
current time is just as effective, without having to manage the
lifecycle of any runtime resources.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Synthesize a boot ID for journal entries fed into
systemd-journal-remote, as required by systemd 255.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Following logs with a non-negative tail when the container log is empty
is broken on the journald driver when used with systemd 255. Add tests
which cover this edge case to our loggertest suite.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Previously this was done indirectly: the `compare` function didn't
check `ArgsEscaped`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Restrict cache candidates only to images that were built locally.
This doesn't affect builds using `--cache-from`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Store additional image property which makes it possible to distinguish
if image was built locally.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is a follow-up to 2cf230951f, adding
more directives to adjust for some new code added since:
Before this patch:
make -C ./internal/gocompat/
GO111MODULE=off go generate .
GO111MODULE=on go mod tidy
GO111MODULE=on go test -v
# github.com/docker/docker/internal/sliceutil
internal/sliceutil/sliceutil.go:3:12: type parameter requires go1.18 or later (-lang was set to go1.16; check go.mod)
internal/sliceutil/sliceutil.go:3:14: predeclared comparable requires go1.18 or later (-lang was set to go1.16; check go.mod)
internal/sliceutil/sliceutil.go:4:19: invalid map key type T (missing comparable constraint)
# github.com/docker/docker/libnetwork
libnetwork/endpoint.go:252:17: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
# github.com/docker/docker/daemon
daemon/container_operations.go:682:9: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
daemon/inspect.go:42:18: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
With this patch:
make -C ./internal/gocompat/
GO111MODULE=off go generate .
GO111MODULE=on go mod tidy
GO111MODULE=on go test -v
=== RUN TestModuleCompatibllity
main_test.go:321: all packages have the correct go version specified through //go:build
--- PASS: TestModuleCompatibllity (0.00s)
PASS
ok gocompat 0.031s
make: Leaving directory '/go/src/github.com/docker/docker/internal/gocompat'
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Functional programming for the win! Add a utility function to map the
values of a slice, along with a curried variant, to tide us over until
equivalent functionality gets added to the standard library
(https://go.dev/issue/61898)
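A minimal sketch of such a helper and its curried variant (assuming Go 1.18+
generics; not necessarily identical to the internal package):
```
package main

import (
	"fmt"
	"strconv"
)

// Map applies fn to every element of s and returns the mapped slice.
func Map[In, Out any](s []In, fn func(In) Out) []Out {
	res := make([]Out, len(s))
	for i, v := range s {
		res[i] = fn(v)
	}
	return res
}

// Mapper is the curried variant: it returns a function that maps any slice
// using fn, handy when the same mapping is applied in several places.
func Mapper[In, Out any](fn func(In) Out) func([]In) []Out {
	return func(s []In) []Out { return Map(s, fn) }
}

func main() {
	fmt.Println(Map([]int{1, 2, 3}, strconv.Itoa)) // [1 2 3] (as strings)
	toStrings := Mapper(strconv.Itoa)
	fmt.Println(toStrings([]int{4, 5}))
}
```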
Signed-off-by: Cory Snider <csnider@mirantis.com>
We need to isolate the images that we are remapping to a userns, we
can't mix them with "normal" images. In the graph driver case this means
we create a new root directory where we store the images and everything
else, in the containerd case we can use a new namespace.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
These types were deprecated in v25.0, and moved to api/types/container;
This patch removes the aliases for;
- api/types.ResizeOptions (deprecated in 95b92b1f97)
- api/types.ContainerAttachOptions (deprecated in 30f09b4a1a)
- api/types.ContainerCommitOptions (deprecated in 9498d897ab)
- api/types.ContainerRemoveOptions (deprecated in 0f77875220)
- api/types.ContainerStartOptions (deprecated in 7bce33eb0f)
- api/types.ContainerListOptions (deprecated in 9670d9364d)
- api/types.ContainerLogsOptions (deprecated in ebef4efb88)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in v25.0, and moved to api/types/swarm;
This patch removes the aliases for;
- api/types.ServiceUpdateResponse (deprecated in 5b3e6555a3)
- api/types.ServiceCreateResponse (deprecated in ec69501e94)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in 48cacbca24
(v25.0), and moved to api/types/image.
This patch removes the aliases for;
- api/types.ImageDeleteResponseItem
- api/types.ImageSummary
- api/types.ImageMetadata
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in b688af2226
(v25.0), and moved to api/types/checkpoint.
This patch removes the aliases for;
- api/types.CheckpointCreateOptions
- api/types.CheckpointListOptions
- api/types.CheckpointDeleteOptions
- api/types.Checkpoint
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in c90229ed9a
(v25.0), and moved to api/types/system.
This patch removes the aliases for;
- api/types.Info
- api/types.Commit
- api/types.PluginsInfo
- api/types.NetworkAddressPool
- api/types.Runtime
- api/types.SecurityOpt
- api/types.KeyValue
- api/types.DecodeSecurityOptions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
To prevent a circular import between api/types and api/types/image,
the RequestPrivilegeFunc reference was not moved, but defined as
part of the PullOptions / PushOptions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 8b7af1d0f added some code to update the DNSNames of all
endpoints attached to a sandbox by loading a new instance of each
affected endpoints from the datastore through a call to
`Network.EndpointByID()`.
This method then calls `Network.getEndpointFromStore()`, that in
turn calls `store.GetObject()`, which then calls `cache.get()`,
which calls `o.CopyTo(kvObject)`. This effectively creates a fresh
new instance of an Endpoint. However, endpoints are already kept in
memory by Sandbox, meaning we now have two in-memory instances of
the same Endpoint.
As it turns out, libnetwork is built around the idea that no two objects
representing the same thing should live in memory at once; otherwise mutex
locking and optimistic locking break (as both instances will have a drifting
version-tracking ID -- dbIndex in libnetwork parlance).
In this specific case, the bug materializes as container rename failing
when applied a second time to a given container. An integration test is
added to make sure this won't happen again.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
I made a mistake in the last commit: after resolving the IP from the
passed `addr` for CIFS, it would still resolve the `device` part.
Apply only one name resolution.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Prior to 7a9b680a, the container short ID was added to the network
aliases only for custom networks. However, this logic wasn't preserved
in 6a2542d and now the cid is always added to the list of network
aliases.
This commit reintroduces the old logic.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- pass the cluster as an argument instead of manually setting it after
creating the router-options
- remove the "opts" variable, to prevent it accidentally being used (with
the assumption that's the value returned)
- use a struct-literal for the returned options.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 21e50b89c9 added a label on the buildkit
worker to advertise the host-gateway-ip. This option can be either set by the
user in the daemon config, or otherwise defaults to the gateway-ip.
If no value is set by the user, discovery of the gateway-ip happens when
initializing the network-controller (`NewDaemon`, `daemon.restore()`).
However d222bf097c changed how we handle the
daemon config. As a result, the `cli.Config` used when initializing the
builder only holds configuration information from the daemon config
(user-specified or defaults), but is not updated with information set
by `NewDaemon`.
This patch adds an accessor on the daemon to get the current daemon config.
An alternative could be to return the config by `NewDaemon` (which should
likely be a _copy_ of the config).
Before this patch:
docker buildx inspect default
Name: default
Driver: docker
Nodes:
Name: default
Endpoint: default
Status: running
Buildkit: v0.12.4+3b6880d2a00f
Platforms: linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
Labels:
org.mobyproject.buildkit.worker.moby.host-gateway-ip: <nil>
After this patch:
docker buildx inspect default
Name: default
Driver: docker
Nodes:
Name: default
Endpoint: default
Status: running
Buildkit: v0.12.4+3b6880d2a00f
Platforms: linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
Labels:
org.mobyproject.buildkit.worker.moby.host-gateway-ip: 172.18.0.1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 8ae94cafa5 added a DNS resolution
of the `device` part of the volume option.
The previous way to resolve the passed hostname was to use `addr`
option, which was handled by the same code path as the `nfs` mount type.
The issue is that `addr` is also an SMB module option handled by kernel
and passing a hostname as `addr` produces an invalid argument error.
To fix that, restore the old behavior to handle `addr` the same way as
before, and only perform the new DNS resolution of `device` if there is
no `addr` passed.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Update the version of compose used in CI to the latest version.
- full diff: docker/compose@v2.24.1...v2.24.2
- release notes: https://github.com/docker/compose/releases/tag/v2.24.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also fixes some potentially unclosed file-handles,
inlines some variables, and uses consts for fixed
values.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also fixes a "defer in loop" warning by changing to sub-tests, and
simplifies some code by using os.WriteFile() instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The names of extended attributes are not completely freeform. Attributes
are namespaced, and the kernel enforces (among other things) that only
attributes whose names are prefixed with a valid namespace are
permitted. The name of the attribute therefore needs to be known in
order to diagnose issues with lsetxattr. Include the name of the
extended attribute in the errors returned from the Lsetxattr and
Lgetxattr so users and us can more easily troubleshoot xattr-related
issues. Include the name in a separate rich-error field to provide code
handling the error enough information to determine whether or not the
failure can be ignored.
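A sketch of the idea, wrapping golang.org/x/sys/unix with a hypothetical
rich-error type that carries the attribute name:
```
package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// XattrError carries the attribute name so callers can decide whether a
// failure (e.g. ENOTSUP for a "user." attribute) can safely be ignored.
type XattrError struct {
	Op, Attr, Path string
	Err            error
}

func (e *XattrError) Error() string {
	return fmt.Sprintf("%s %s %s: %v", e.Op, e.Attr, e.Path, e.Err)
}

func (e *XattrError) Unwrap() error { return e.Err }

// Lsetxattr wraps unix.Lsetxattr and includes the attribute name in the error.
func Lsetxattr(path, attr string, data []byte, flags int) error {
	if err := unix.Lsetxattr(path, attr, data, flags); err != nil {
		return &XattrError{Op: "lsetxattr", Attr: attr, Path: path, Err: err}
	}
	return nil
}

func main() {
	if err := Lsetxattr("/tmp/example", "user.demo", []byte("value"), 0); err != nil {
		fmt.Println(err) // the attribute name is part of the message
	}
}
```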
Signed-off-by: Cory Snider <csnider@mirantis.com>
The `GetImageOpts` struct is used for options to be passed to the backend,
and is not used in client code. This struct is currently intended for
internal use only.
This patch moves the `GetImageOpts` struct to the backend package to prevent
it being imported in the client, and to make it more clear that this is part
of internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The MAC address of a running container was stored in the same place as
the configured address for a container.
When starting a stopped container, a generated address was treated as a
configured address. If that generated address (based on an IPAM-assigned
IP address) had been reused, the containers ended up with duplicate MAC
addresses.
So, remember whether the MAC address was explicitly configured, and
clear it if not.
Signed-off-by: Rob Murray <rob.murray@docker.com>
With containerd snapshotters enabled `docker run` currently fails when
creating a container from an image that doesn't have the default host
platform without an explicit `--platform` selection:
```
$ docker run image:amd64
Unable to find image 'asdf:amd64' locally
docker: Error response from daemon: pull access denied for asdf, repository does not exist or may require 'docker login'.
See 'docker run --help'.
```
This is confusing and the graphdriver behavior is much better here,
because it runs whatever platform the image has, but prints a warning:
```
$ docker run image:amd64
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
```
This commit changes the containerd snapshotter behavior to be the same
as the graphdriver. This doesn't affect container creation when platform
is specified explicitly.
```
$ docker run --rm --platform linux/arm64 asdf:amd64
Unable to find image 'asdf:amd64' locally
docker: Error response from daemon: pull access denied for asdf, repository does not exist or may require 'docker login'.
See 'docker run --help'.
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Order the layers in OCI manifest by their actual apply order. This is
required by the OCI image spec.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Since v25.0 (commit ff50388), we validate endpoint settings when
containers are created, instead of doing so when containers are started.
However, a container created prior to that release would still trigger a
validation error at start time. In such a case, the API returns a 500
status code because the Go error isn't wrapped into an InvalidParameter
error. This is now fixed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This test was added in f301c5765a to test
inspect output for API > v1.21, however, it was pinned to API v1.21,
which is now deprecated.
Remove the fixed version, as the intent was to test "current" API versions
(API v1.21 and up).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was added in f301c5765a to test
inspect output for API > v1.21, however, it was pinned to API v1.21,
which is now deprecated.
Remove the fixed version, as the intent was to test "current" API versions
(API v1.21 and up).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was added in 75f6929b44, but pinned
to the API version that was current at the time (v1.20), which is now
deprecated.
Update the test to use the current API version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tables linked to deprecated API versions, and an up-to-date version of
the matrix is already included at https://docs.docker.com/engine/api/#api-version-matrix
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- add some asserts for unhandled errors
- use consts for fixed values, and slightly re-format Dockerfile content
- inline one-line Dockerfiles
- fix some vars to be properly camel-cased
- improve assert for error-types;
Before:
=== RUN TestBuildPlatformInvalid
build_test.go:685: assertion failed: expression is false: errdefs.IsInvalidParameter(err)
--- FAIL: TestBuildPlatformInvalid (0.01s)
FAIL
After:
=== RUN TestBuildPlatformInvalid
build_test.go:689: assertion failed: error is Error response from daemon: "foobar": unknown operating system or architecture: invalid argument (errdefs.errSystem), not errdefs.IsInvalidParameter
--- FAIL: TestBuildPlatformInvalid (0.01s)
FAIL
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This matcher was only used internally in the containerd implementation of
the image store. Un-export it, and make it a local utility in that package
to prevent external use.
This package was introduced in 1616a09b61
(v24.0), and there are no known external consumers of this package, so there
should be no need to deprecate / alias the old location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When resolving names in swarm mode, services with exposed ports are
connected to user overlay network, ingress network, and local (docker_gwbridge)
networks. Name resolution should prioritize returning the VIP/IPs on user
overlay network over ingress and local networks.
Sandbox.ResolveName implemented this by taking the list of endpoints,
splitting the list into 3 separate lists based on the type of network
that the endpoint was attached to (dynamic, ingress, local), and then
creating a new list, applying the networks in that order.
This patch refactors that logic to use a custom sorter (sort.Interface),
which makes the code more transparent, and prevents iterating over the
list of endpoints multiple times.
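A toy illustration of sorting endpoints by network type with sort.Interface
(the types here are made up for the example):
```
package main

import (
	"fmt"
	"sort"
)

type netKind int

const (
	kindDynamic netKind = iota // user overlay networks sort first
	kindIngress
	kindLocal
)

type endpoint struct {
	name string
	kind netKind
}

type byNetworkType []endpoint

func (s byNetworkType) Len() int           { return len(s) }
func (s byNetworkType) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }
func (s byNetworkType) Less(i, j int) bool { return s[i].kind < s[j].kind }

func main() {
	eps := []endpoint{{"gwbridge", kindLocal}, {"ingress", kindIngress}, {"mynet", kindDynamic}}
	sort.Stable(byNetworkType(eps))
	fmt.Println(eps) // mynet first, then ingress, then gwbridge
}
```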
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Permit container network attachments to set any static IP address within
the network's IPAM master pool, including when a subpool is configured.
Users have come to depend on being able to statically assign container
IP addresses which are guaranteed not to collide with automatically-
assigned container addresses.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This package was introduced in af59752712
as a utility package for devicemapper, which was removed in commit
dc11d2a2d8 (v25.0.0), and the package
was deprecated in bf692d47fb.
This patch removes the package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This flag was marked deprecated in commit 5a922dc16 (released in v24.0)
and to be removed in the next release.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Some configuration in a container depends on whether it has support for
IPv6 (including default entries for '::1' etc in '/etc/hosts').
Before this change, the container's support for IPv6 was determined by
whether it was connected to any IPv6-enabled networks. But, that can
change over time, it isn't a property of the container itself.
So, instead, detect IPv6 support by looking for '::1' on the container's
loopback interface. It will not be present if the kernel does not have
IPv6 support, or the user has disabled it in new namespaces by other
means.
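A minimal sketch of that detection (assumed to run inside the container's
network namespace; not the actual daemon code):
```
package main

import (
	"fmt"
	"net"
)

// loopbackHasIPv6 reports whether ::1 is assigned to the loopback interface,
// which indicates that the namespace has working IPv6 support.
func loopbackHasIPv6() (bool, error) {
	lo, err := net.InterfaceByName("lo")
	if err != nil {
		return false, err
	}
	addrs, err := lo.Addrs()
	if err != nil {
		return false, err
	}
	for _, a := range addrs {
		if ipnet, ok := a.(*net.IPNet); ok && ipnet.IP.Equal(net.IPv6loopback) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	ok, err := loopbackHasIPv6()
	fmt.Println("IPv6 supported:", ok, "err:", err)
}
```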
Once IPv6 support has been determined for the container, its '/etc/hosts'
is re-generated accordingly.
The daemon no longer disables IPv6 on all interfaces during initialisation.
It now disables IPv6 only for interfaces that have not been assigned an
IPv6 address. (But, even if IPv6 is disabled for the container using the
sysctl 'net.ipv6.conf.all.disable_ipv6=1', interfaces connected to IPv6
networks still get IPv6 addresses that appear in the internal DNS. There's
more to-do!)
Signed-off-by: Rob Murray <rob.murray@docker.com>
All components of the path are locked before the check, and
released once the path is already mounted.
This makes it impossible to replace the mounted directory until it's
actually mounted in the container.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All subpath components are opened with openat, relative to the base
volume directory, and checked so that they cannot escape the volume.
The final file descriptor is mounted from the /proc/self/fd/<fd> to a
temporary mount point owned by the daemon and then passed to the
underlying container runtime.
The temporary mountpoint is removed after the container is started.
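A very rough, Linux-only sketch of the final step (illustrative paths; it
requires privileges, and the real code resolves and validates every path
component with openat before this point):
```
package main

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

// mountFD bind-mounts the directory referred to by fd onto target via
// /proc/self/fd, so later renames or symlink swaps of the original path
// cannot redirect what gets mounted.
func mountFD(fd int, target string) error {
	source := fmt.Sprintf("/proc/self/fd/%d", fd)
	return unix.Mount(source, target, "", unix.MS_BIND, "")
}

func main() {
	fd, err := unix.Open("/var/lib/example-volume/subdir", unix.O_PATH|unix.O_DIRECTORY, 0)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer unix.Close(fd)
	if err := mountFD(fd, "/run/example-mountpoint"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```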
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`VolumeOptions` now has a `Subpath` field which allows specifying a path
relative to the volume that should be mounted as a destination.
Symlinks are supported, but they cannot escape the base volume
directory.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
We constructed a "function level" logger, which was used once "as-is", but
also added additional Fields in a loop (for each resource), effectively
overwriting the previous one for each iteration. Adding additional
fields can result in some overhead, so let's construct a logger only
inside the loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We have many "image" packages, so these vars easily conflict/shadow
imports. Let's rename them (and in some cases use a const) to
prevent that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For some time, when adding an interface with no IPv6 address (an
interface to a network that does not have IPv6 enabled), we've been
disabling IPv6 on that interface.
As part of a separate change, I'm removing that logic - there's nothing
wrong with having IPv6 enabled on an interface with no routable address.
The difference is that the kernel will assign a link-local address.
TestAddRemoveInterface does this...
- Assign an IPv6 link-local address to one end of a veth interface, and
add it to a namespace.
- Add a bridge with no assigned IPv6 address to the namespace.
- Remove the veth interface from the namespace.
- Put the veth interface back into the namespace, still with an
explicitly assigned IPv6 link local address.
When IPv6 is disabled on the bridge interface, the test passes.
But, when IPv6 is enabled, the bridge gets a kernel assigned link-local
address.
Then, when re-adding the veth interface, the test generates an error in
'osl/interface_linux.go:checkRouteConflict()'. The conflict is between
the explicitly assigned fe80::2 on the veth, and a route for fe80::/64
belonging to the bridge.
So, in preparation for not-disabling IPv6 on these interfaces, use a
unique-local address in the test instead of link-local.
I don't think that changes the intent of the test.
With the change to not-always disable IPv6, it is possible to repro the
problem with a real container, disconnect and re-connect a user-defined
network with '--subnet fe80::/64' while the container's connected to an
IPv4 network. So, strictly speaking, that will be a regression.
But, it's also possible to repro the problem in master, by disconnecting
and re-connecting the fe80::/64 network while another IPv6 network is
connected. So, I don't think it's a problem we need to address, perhaps
other than by prohibiting '--subnet fe80::/64'.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Message is different with containerd backend. The Linux test
`TestPullLinuxImageFailsOnLinux` was adjusted before, but we missed this
one.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All commonly used filesystems should use ref-counted mounter, so make it
the default instead of having to whitelist them.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Prior to this commit, a container running with `--net=host` had
`{"type":"network","path":"/var/run/docker/netns/default"}` in
the `.linux.namespaces` field of the OCI Runtime Config,
but this wasn't needed.
Close issue 47100
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The actual divergence is due to differences in the snapshotter and
graphfilter mount behaviour on Windows, but the snapshotter behaviour is
better, so we deal with it here rather than changing the snapshotter
behaviour.
We're relying on the internals of containerd's Windows mount
implementation here. Unless this code flow is replaced, future work is
to move getBackingDeviceForContainerdMount into containerd's mount
implementation.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
That means 'null', not that we can call builder-next on Windows. If and
when we do get builder-next going, this will need to be solved properly
in some way.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The existing API ImageService.GetLayerFolders didn't have access to the
ID of the container, and once we have that, the snapshotter Mounts API
provides all the information we need here.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Needed for Diff on Windows. Don't remount it afterwards as the layer is
going to be released anyway.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is consistent with layerStore's CreateRWLayer behaviour.
Potentially this can be refactored to avoid creating the -init layer,
but as noted in layerStore's initMount, this name may be special, and
should be cleared-out all-at-once.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This change adds a TempDir function that ensures the correct permissions for
the fake-root user in rootless mode.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Now the state dir is set to `${XDG_RUNTIME_DIR}/dockerd-rootless`.
This is similar to `${XDG_RUNTIME_DIR}/containerd-rootless` used in nerdctl:
https://github.com/containerd/nerdctl/blob/v1.7.2/extras/rootless/containerd-rootless.sh#L35
Prior to this commit, the state dir was unset and a random dir under `/tmp` was used.
(e.g., `/tmp/rootlesskit1869901982`)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
XDG_RUNTIME_DIR will contain sockets so its path mustn't be too long.
Prior to this commit, it was set to very long path like
`/go/src/github.com/docker/docker/bundles/test-integration/TestDiskUsage/de4fb36576d7d/xdgrun`
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Consider only images that were built `FROM scratch` as valid candidates
for the `FROM scratch` + INSTRUCTION build step.
The images are marked as built `FROM scratch` by the classic builder
with a special label. It must be a new label instead of an empty parent
label, because empty label values are not persisted.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
In order for the cache in the classic builder to work we need to:
- use the same comparison function as the graph driver implementation
- save the container config when committing the image
- use all images to search for a 'FROM "scratch"' image
- load all images if `cacheFrom` is empty
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Protecting the environment relies on the shared state (containers,
images, etc) which might already be mutated by other tests if the test
opted into parallel execution before Protect was called.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The health status and probe log of containers are not mission-critical
data which must survive a crash. It is not worth prematurely wearing out
consumer-grade flash storage by overwriting and fsync()ing the container
config after every probe. Update only the live Container object and
the ViewDB replica on every container health probe instead. It will
eventually get checkpointed along with some other state (or config)
change. Running containers will not be checkpointed on daemon shutdown
when live-restore is enabled, but it does not matter: the health status
and probe log will be zeroed out when the daemon starts back up.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The "builtin" port driver was marked as "Slow" in the row for the lxc-user-nic
network driver, while it was marked as "Fast" in other rows.
It had to be consistently marked as "Fast" regardless of the network driver.
It is still not as fast as rootful.
Follow-up to PR 47076
Fixes: b5a5ecf4a3
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
setupTest should be called before Parallel as it modifies the test
environment which might produce:
```
fatal error: concurrent map writes
goroutine 143 [running]:
github.com/docker/docker/testutil/environment.(*Execution).ProtectContainer(...)
/go/src/github.com/docker/docker/testutil/environment/protect.go:59
github.com/docker/docker/testutil/environment.ProtectContainers({0x12e8d98, 0xc00040e420}, {0x12f2878?, 0xc0004fc340}, 0xc0001fac00)
/go/src/github.com/docker/docker/testutil/environment/protect.go:68 +0xb1
github.com/docker/docker/testutil/environment.ProtectAll({0x12e8d98, 0xc00040e210}, {0x12f2878, 0xc0004fc340}, 0xc0001fac00)
/go/src/github.com/docker/docker/testutil/environment/protect.go:45 +0xf3
github.com/docker/docker/integration/image.setupTest(0xc0004fc340)
/go/src/github.com/docker/docker/integration/image/main_test.go:46 +0x59
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Save the unmodified manifest list to keep the image ID of the
multi-platform images when not all platforms are present.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`diffIDPaths` is not used and can be removed.
`savedConfig` stores if the config was already saved (ID of the image is
the ID of the config).
`savedLayers` stores if the layer (diff ID) was already saved.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
https://github.com/rootless-containers/rootlesskit/releases/tag/v2.0.0
=== Pasta ===
RootlessKit v2 adds the support for pasta (https://passt.top/passt/).
Pasta is similar to slirp4netns but its port forwarder achieves better
throughput than slirp4netns port driver.
It is still not faster than RootlessKit's `builtin` port driver, but unlike the
`builtin` port driver, pasta can retain source IP address information.
Network driver | Port driver | Net throughput | Port throughput | Src IP | No SUID | Note
---------------|----------------|----------------|-----------------|--------|---------|--------------------------------------------
slirp4netns | builtin | Slow | Fast ✅ | ❌ | ✅ | Default in typical setup
vpnkit | builtin | Slow | Fast ✅ | ❌ | ✅ | Default when slirp4netns is not installed
slirp4netns | slirp4netns | Slow | Slow | ✅ | ✅ |
**pasta** | **implicit** | Slow | Fast ✅ | ✅ | ✅ | Experimental
lxc-user-nic | builtin | Fast ✅ | Slow | ❌ | ❌ | Experimental
(bypass4netns) | (bypass4netns) | Fast ✅ | Fast ✅ | ✅ | ✅ | (Not integrated to RootlessKit)
=== Detach-netns ===
Aside from pasta, RootlessKit v2 also brings support for
"detach-netns" mode, which leaves the runtime in the host network namespace to
eliminate the slirp overhead for pull/push and to allow accessing the "real"
127.0.0.1.
See containerd/nerdctl PR 2723 for how detach-netns is being adopted in
nerdctl v2.
Integrating detach-netns into Docker/Moby will need extra work and will be
deferred to Docker v26 (or later).
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Update the TestDaemonRestartKilContainers integration test to assert
that a container's healthcheck status is always reset to the Starting
state after a daemon restart, even when the container is live-restored.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The layer size was computed as the sum of the individual file sizes, not
the size of the tar archive. Use the total bytes read returned by `io.Copy` to populate the
`Size` field.
Also set the digest to the actual digest of the tar archive.
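A small sketch of deriving both values from the tar stream itself (assuming
the opencontainers go-digest package; file names are illustrative):
```
package main

import (
	"fmt"
	"io"
	"os"

	"github.com/opencontainers/go-digest"
)

// writeLayer copies the tar stream to dst while hashing it; the byte count
// returned by io.Copy is the size of the archive, not the sum of the sizes
// of the files inside it.
func writeLayer(dst io.Writer, tarStream io.Reader) (digest.Digest, int64, error) {
	digester := digest.Canonical.Digester()
	n, err := io.Copy(io.MultiWriter(dst, digester.Hash()), tarStream)
	if err != nil {
		return "", 0, err
	}
	return digester.Digest(), n, nil
}

func main() {
	f, err := os.Open("layer.tar")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer f.Close()
	dgst, size, err := writeLayer(io.Discard, f)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println(dgst, size)
}
```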
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Switch github.com/imdario/mergo to dario.cat/mergo v1.0.0, because
the module was renamed, and reached v1.0.0
full diff: https://github.com/imdario/mergo/compare/v0.3.13...v1.0.0
vendor: github.com/containerd/containerd v1.7.12
- full diff: https://github.com/containerd/containerd/compare/v1.7.11...v1.7.12
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.12
Welcome to the v1.7.12 release of containerd!
The twelfth patch release for containerd 1.7 contains various fixes and updates.
Notable Updates
- Fix on dialer function for Windows
- Improve `/etc/group` handling when appending groups
- Update shim pidfile permissions to 0644
- Update runc binary to v1.1.11
- Allow import and export to reference missing content
- Remove runc import
- Update Go version to 1.20.13
Deprecation Warnings
- Emit deprecation warning for `containerd.io/restart.logpath` label usage
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.11...v1.7.12
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.12
Welcome to the v1.7.12 release of containerd!
The twelfth patch release for containerd 1.7 contains various fixes and updates.
Notable Updates
- Fix on dialer function for Windows
- Improve `/etc/group` handling when appending groups
- Update shim pidfile permissions to 0644
- Update runc binary to v1.1.11
- Allow import and export to reference missing content
- Remove runc import
- Update Go version to 1.20.13
Deprecation Warnings
- Emit deprecation warning for `containerd.io/restart.logpath` label usage
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The new OCI-compatible archive export relies on the Descriptors returned
by the layer (`distribution.Describable` interface implementation).
The issue with that is that the `roLayer` and the `referencedCacheLayer`
types don't implement this interface. Implementing that interface for
them based on their `descriptor` doesn't work though, because that
descriptor is empty.
To work around this issue, just create a new descriptor if the one
provided by the layer is empty.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Split task creation and start into two separate method calls in the
libcontainerd API. Clients now have the opportunity to inspect the
freshly-created task and customize its runtime environment before
starting execution of the user-specified binary.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The container may have been running without health probes for an
indeterminate amount of time. The container may have become unhealthy in
the interim. We should probe it sooner than in steady-state, while also
giving it some leeway to recover from e.g. timed-out connections. This
is easy to achieve by probing the container like a freshly-started one.
The original author of health-checks came to the same conclusion; the
health monitor was reinitialized on live-restored containers before
v17.11.0, when health monitoring of live-restored containers was
accidentally broken. Revert to the original behavior.
Signed-off-by: Cory Snider <csnider@mirantis.com>
commit 4f9db655ed moved looking up the
userland-proxy binary to early in the startup process, and introduced
a log-message if the binary was missing.
However, a side-effect of this was this message would also be printed
when running "--version";
dockerd --version
time="2024-01-09T09:18:53.705271292Z" level=warning msg="failed to lookup default userland-proxy binary" error="exec: \"docker-proxy\": executable file not found in $PATH"
Docker version v25.0.0-rc.1, build 9cebefa717
We should look into whether we can avoid this, but let's change the message
to a debug message as a short-term workaround.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 0b1c1877c5 updated the version in
hack/dockerfile/install/rootlesskit.installer, but forgot to update the
version in Dockerfile.
Also updating both to use a tag, instead of commit. While it's good to pin by
an immutable reference, I think it's reasonably safe to use the tag, which is
easier to use, and what we do for other binaries, such as runc as well.
Full diff: https://github.com/rootless-containers/rootlesskit/compare/v1.1.0...v1.1.1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The only use is in `builder/builder-next/adapters/snapshot.EnsureLayer()`,
which always calls the function with an _empty_ `oldTarDataPath`;
7082aecd54/builder/builder-next/adapters/snapshot/layer.go (L81)
When called with an empty `oldTarDataPath`, this function was an alias for
`checksumForGraphIDNoTarsplit`, so let's make it that.
Note that this code was added in 500e77bad0, as
part of the migration from "v1" images to "v2" (content-addressable) images.
Given that the remaining code lives in a "migration" file, possibly more code
can be removed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ExecBackend.ContainerKill()` function was called before removing a build-
container.
This function is backed by `daemon.ContainerKill()` which, if no signal is passed,
performed a `daemon.Kill()`, using `SIGKILL` as signal. However, the
`ExecBackend.ContainerRm()` (backed by `daemonContainerRm()`), which is called
after this, is executed with the `ForceRemove` option set, which calls
`daemon.cleanupContainer()` with `ForceRemove` set, which also results in
`daemon.Kill()` being called:
1a0c15abbb/daemon/delete.go (L84-L95)
This makes the `ExecBackend.ContainerKill()` redundant, so removing this from
the interface.
While looking at this code, one (possible) race-condition was found in
`daemon.cleanupContainer()`, where `daemon.Kill()` could return a `errdefs.Conflict`
if the container was already stopped. An extra check was added for this case to
prevent `daemon.cleanupContainer()` from terminating early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prevent cleanup from terminating early when failing to remove a container;
- continue trying to remove remaining containers
- ignore errors due to containers that were not found
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was just a very thin wrapper for backend.ContainerRm(), and the
error it returned was not handled, so moving this code inline.
Moving it inline also allows differentiating the error message to
distinguish the "removing all intermediate containers" from "removing container"
(when cancelling a build).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The output var was used in a `defer`, but named `err` and shadowed in various
places. Rename the var to a more explicit name to make clear where it's used
and to prevent it being accidentally shadowed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes:
- NewSystemd handles UnitExists when starting units
- makefile fixes
- cgroups2: export memory max usage and swap max usage
- build(deps): bump github.com/cilium/ebpf from v0.9.1 to v0.11.0
- support psi
- feat: add Threads for cgroupv2
- Linux.Swap is defined as memory+swap combined, while in cgroup v2 swap is a separate value
- fix(): support re-enabling oom killer refs #307 by @kestrelcjx
full diff: https://github.com/containerd/cgroups/compare/v3.0.2...v3.0.3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
To make the version format in the `moby-bin` consistent with the
version we use in the release pipeline.
```diff
Server: Docker Engine - Community
Engine:
- Version: v25.0.0
+ Version: 25.0.0
...
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
If the daemon is configured to use a mirror for the default (Docker Hub)
registry, the endpoint did not fall back to querying the upstream if the mirror
did not contain the given reference.
For pull-through registry-mirrors, this was not a problem, as in that case the
registry would forward the request, but for other mirrors, no fallback would
happen. This was inconsistent with how "pulling" images handled this situation;
when pulling images, both the mirror and upstream would be tried.
This patch brings the daemon-side lookup of image-manifests on-par with the
client-side lookup (the GET /distribution endpoint) as used in API 1.30 and
higher.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the daemon is configured to use a mirror for the default (Docker Hub)
registry, the endpoint did not fall back to querying the upstream if the mirror
did not contain the given reference.
For pull-through registry-mirrors, this was not a problem, as in that case the
registry would forward the request, but for other mirrors, no fallback would
happen. This was inconsistent with how "pulling" images handled this situation;
when pulling images, both the mirror and upstream would be tried.
This problem was caused by the logic used in GetRepository, which had an
optimization to only return the first registry it was successfully able to
configure (and connect to), with the assumption that the mirror either contained
all images used, or to be configured as a pull-through mirror.
This patch:
- Introduces a GetRepositories method, which returns all candidates (both
mirror(s) and upstream).
- Updates the endpoint to try all candidates (both mirror(s) and upstream).
Before this patch:
# the daemon is configured to use a mirror for Docker Hub
cat /etc/docker/daemon.json
{ "registry-mirrors": ["http://localhost:5000"]}
# start the mirror (empty registry, not configured as pull-through mirror)
docker run -d --name registry -p 127.0.0.1:5000:5000 registry:2
# querying the endpoint fails, because the image-manifest is not found in the mirror:
curl -s --unix-socket /var/run/docker.sock http://localhost/v1.43/distribution/docker.io/library/hello-world:latest/json
{
"message": "manifest unknown: manifest unknown"
}
With this patch applied:
# the daemon is configured to use a mirror for Docker Hub
cat /etc/docker/daemon.json
{ "registry-mirrors": ["http://localhost:5000"]}
# start the mirror (empty registry, not configured as pull-through mirror)
docker run -d --name registry -p 127.0.0.1:5000:5000 registry:2
# querying the endpoint succeeds (manifest is fetched from the upstream Docker Hub registry):
curl -s --unix-socket /var/run/docker.sock http://localhost/v1.43/distribution/docker.io/library/hello-world:latest/json | jq .
{
"Descriptor": {
"mediaType": "application/vnd.oci.image.index.v1+json",
"digest": "sha256:1b9844d846ce3a6a6af7013e999a373112c3c0450aca49e155ae444526a2c45e",
"size": 3849
},
"Platforms": [
{
"architecture": "amd64",
"os": "linux"
}
]
}
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A check was added to the bridge driver to detect when it was called to
create the default bridge nw whereas a stale default bridge already
existed. In such case, the bridge driver was deleting the stale network
before re-creating it. This check was introduced in docker/libnetwork@6b158eac6a
to fix an issue related to newly introduced live-restore.
However, since commit docker/docker@ecffb6d58c,
the daemon doesn't even try to create default networks if there are
active sandboxes (ie. due to live-restore).
Thus, it's now impossible for the default bridge network to be stale and
to exist when the driver's CreateNetwork() method is called. As such,
the check introduced in the first commit mentioned above is dead code
and can be safely removed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Turn subsequent `Close` calls into a no-op and produce a warning with an
optional stack trace (if debug mode is enabled).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This hopefully makes the test less flaky (or removes any flake that
would be caused by the test itself).
1. Adds tail of cluster daemon logs when there is a test failure so we
can more easily see what may be happening
2. Scans the daemon logs to check if the key is rotated before
restarting the daemon. This is a little hacky but a little better
than assuming it is done after a hard-coded 3 seconds.
3. Cleans up the `node ls` check such that it uses a poll function
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The Go 1.21.5 compiler has a bug: per-file language version override
directives do not take effect when instantiating generic functions which
have certain nontrivial type constraints. Consequently, a module-mode
project with Moby as a dependency may fail to compile when the compiler
incorrectly applies go1.16 semantics to the generic function call.
As the offending function is trivial and is only used in one place, work
around the issue by converting it to a concretely-typed function.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Copy the swagger / OpenAPI file to the documentation. This is the API
version used by the upcoming v25.0.0 release.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The following fields are never written and are now marked as deprecated:
- `HairpinMode`
- `LinkLocalIPv6Address`
- `LinkLocalIPv6PrefixLen`
- `SecondaryIPAddress`
- `SecondaryIPv6Addresses`
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This is the eleventh patch release in the 1.1.z release branch of runc.
It primarily fixes a few issues with runc's handling of containers that
are configured to join existing user namespaces, as well as improvements
to cgroupv2 support.
- Fix several issues with userns path handling.
- Support memory.peak and memory.swap.peak in cgroups v2.
Add swapOnlyUsage in MemoryStats. This field reports swap-only usage.
For cgroupv1, Usage and Failcnt are set by subtracting memory usage
from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage
are set.
- build(deps): bump github.com/cyphar/filepath-securejoin.
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.11
- full diff: https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is the eleventh patch release in the 1.1.z release branch of runc.
It primarily fixes a few issues with runc's handling of containers that
are configured to join existing user namespaces, as well as improvements
to cgroupv2 support.
- Fix several issues with userns path handling.
- Support memory.peak and memory.swap.peak in cgroups v2.
Add swapOnlyUsage in MemoryStats. This field reports swap-only usage.
For cgroupv1, Usage and Failcnt are set by subtracting memory usage
from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage
are set.
- build(deps): bump github.com/cyphar/filepath-securejoin.
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.11
- full diff: https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use stdlib's filepath.VolumeName to get the volume-name (if present) instead
of a self-crafted implementation, and unify the implementations for Windows
and Unix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This package provided utilities to obtain the apparmor_parser version, as well
as loading a profile.
Commit e3e715666f (included in v24.0.0 through
bfffb0974e) deprecated GetVersion, as it was no
longer used, which made LoadProfile the only utility remaining in this package.
LoadProfile appears to have no external consumers, and the only use in our code
is "profiles/apparmor".
This patch moves the remaining code (LoadProfile) to profiles/apparmor as a
non-exported function, and deletes the package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit e3e715666f (included in v24.0.0 through
bfffb0974e) deprecated GetVersion, as it was no
longer used.
This patch removes the deprecated utility, and inlines the remaining code into
the LoadProfile function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function never returned an error, and was not matching an interface, so
remove the error-return.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).
This was inefficient (looking up the binary for each port), inconsistent
(when running in rootless mode, the binary was looked up only once), and
inconvenient, because a missing binary or a mis-configured userland-proxy-path
would not be detected at daemon startup, and would not produce an error until
starting the container;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
However, the container would still be created (but invalid);
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
869f41d7e94f nginx:alpine "/docker-entrypoint.…" 10 seconds ago Created romantic_wiles
This patch changes how the userland-proxy is configured;
- The path of the userland-proxy is now looked up / configured at daemon
startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled but no usable
binary is found, an error is produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
is now required to be set. It no longer looks up the executable, and
produces an error if no path was provided. While this change was not
required, it makes the daemon config the canonical source of truth, instead
of logic spread across multiple locations.
Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;
- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
preference in PATH)
With this patch:
Validating the config produces an error if the binary is not found:
dockerd --validate
WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
userland-proxy is enabled, but userland-proxy-path is not set
Disabling userland-proxy prints a warning, but validates as "OK":
dockerd --userland-proxy=false --validate
WARN[2023-12-29T11:38:30.752523879Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
configuration OK
Specifying a non-absolute path produces an error:
dockerd --userland-proxy-path=docker-proxy --validate
invalid userland-proxy-path: must be an absolute path: docker-proxy
Before this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
Specifying an invalid userland-proxy-path produces an error as well:
dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory
mkdir -p /usr/local/bin/not-a-file
dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory
touch /usr/local/bin/not-an-executable
dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied
Same when using the daemon.json config-file;
echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary
dockerd --userland-proxy-path=hello --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The cleanup function never returns an error, so it didn't add much value. This
patch removes the closure and calls the code inline to remove the extra
indirection, and removes the error which would never be returned.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The defer was set after the switch, but various code-paths inside the switch
could return with an error after the port was allocated / reserved, which
could result in those ports not being released.
This patch moves the defer into each individual branch of the switch to set
it immediately after successfully reserving the port.
We can also remove a redundant ReleasePort from the cleanup function, as
it's only called if an error occurs, and the defers already take care of
that.
Note that the cleanup function was handling errors returned by ReleasePort,
but this function never returns an error, so it was fully redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prevent accidentally shadowing the error, which is used in a defer.
Also re-format the code to make it more clear we're not acting on
a locally-scoped "allocatedHostPort" variable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/klauspost/compress/compare/v1.17.2...v1.17.4
v1.17.4:
- huff0: Speed up symbol counting
- huff0: Remove byteReader
- gzhttp: Allow overriding decompression on transport
- gzhttp: Clamp compression level
- gzip: Error out if reserved bits are set
v1.17.3:
- fse: Fix max header size
- zstd: Improve better/best compression
- gzhttp: Fix missing content type on Close
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
All underlying jobs inherit the status of all parent jobs in the
tree, not just their immediate parent. We need to apply the same
kind of special condition.
Signed-off-by: CrazyMax <1951866+crazy-max@users.noreply.github.com>
If IPv6 is enabled for a bridge network, by the time configuration
is applied, the bridge will always have an address. Assert that, by
raising an error when the configuration is validated.
Use that to simplify the logic used to calculate which addresses
should be assigned to a bridge. Also remove a redundant check in
setupGatewayIPv6() and the error associated with it.
Fix unit tests that enabled IPv6, but didn't supply an IPv6 IPAM
address/pool. Before this change, these tests passed but silently
left the bridge without an IPv6 address.
(The daemon already ensured there was an IPv6 address, this change
does not add a new restriction on config at that level.)
Signed-off-by: Rob Murray <rob.murray@docker.com>
Some checks in 'networkConfiguration.Validate()' were not running as
expected, they'd always pass - because 'parseNetworkOptions()' called
it before 'config.processIPAM()' had added IP addresses and gateways.
Signed-off-by: Rob Murray <rob.murray@docker.com>
No more concept of "anonymous endpoints". The equivalent is now an
endpoint with no DNSNames set.
Some of the code removed by this commit was mutating user-supplied
endpoint's Aliases to add container's short ID to that list. In order to
preserve backward compatibility for the ContainerInspect endpoint, this
commit also takes care of adding that short ID (and the container
hostname) to `EndpointSettings.Aliases` before returning the response.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The remapping in the commit code was in the wrong place: we would create
a diff and then remap the snapshot, but the descriptor created in
"CreateDiff" was still pointing to the old snapshot. We now remap the
snapshot before creating a diff. Also make sure we don't lose any
capabilities; they used to be lost after the chown.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
If the resolver's DNSBackend returns a name that cannot be marshaled
into a well-formed DNS message, the resolver will only discover this
when it attempts to write the reply message and it fails with an error.
No reply message is sent, leaving the client to wait out its timeout and
the user in the dark about what went wrong.
When writing the intended reply message fails, retry once with a
ServFail response to inform the client and user that the DNS query was
not resolved due to a problem with the resolver, not the network.
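A minimal sketch of that retry, assuming the github.com/miekg/dns types already used by the resolver (function and parameter names are illustrative):

package resolver

import "github.com/miekg/dns"

// writeReply tries to send the resolved reply; if marshalling/writing fails,
// it falls back to a SERVFAIL so the client is not left waiting for a timeout.
func writeReply(w dns.ResponseWriter, query, reply *dns.Msg) {
    if err := w.WriteMsg(reply); err != nil {
        servfail := new(dns.Msg).SetRcode(query, dns.RcodeServerFailure)
        _ = w.WriteMsg(servfail)
    }
}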
Signed-off-by: Cory Snider <csnider@mirantis.com>
The well-formedness of a DNS message is only checked when it is
serialized, through the (*dns.Msg).Pack() method. Add a call to Pack()
to our tstwriter mock to mirror the behaviour of the real
dns.ResponseWriter implementation. And fix tests which generated
ill-formed DNS query messages.
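For example, the mock's WriteMsg could look roughly like this (a sketch assuming miekg/dns; the field layout is illustrative):

package resolver

import "github.com/miekg/dns"

// tstwriter packs the reply before accepting it, so ill-formed messages fail
// in tests the same way they do with a real dns.ResponseWriter.
type tstwriter struct {
    msg *dns.Msg
}

func (w *tstwriter) WriteMsg(m *dns.Msg) error {
    if _, err := m.Pack(); err != nil {
        return err
    }
    w.msg = m
    return nil
}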
Signed-off-by: Cory Snider <csnider@mirantis.com>
They fail because exporting an image which targets a manifest list when
only one platform is available exports only the platform-specific
manifest so the ID of the loaded image is different (ID of the platform
manifest, not manifest list).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The semantics of an "anonymous" endpoint has always been weird: it was
set on endpoints which name shouldn't be taken into account when
inserting DNS records into libnetwork's `Controller.svcRecords` (and
into the NetworkDB). However, in that case the endpoint's aliases would
still be used to create DNS records; thus, making those "anonymous
endpoints" not so anonymous.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The `(*Endpoint).rename()` method is changed to only mutate `ep.name`
and let a new method `(*Endpoint).UpdateDNSNames()` handle DNS updates.
As a consequence, the rollback code that was part of
`(*Endpoint).rename()` is now removed, and DNS updates are now
rolled back by `ContainerRename`.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Instead of special-casing anonymous endpoints, use the list of DNS names
associated to the endpoint.
`(*Endpoint).isAnonymous()` has no more uses, so let's delete it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This new property will be empty if the daemon was upgraded with
live-restore enabled. To not break DNS resolutions for restored
containers, we need to populate dnsNames based on endpoint's myAliases &
anonymous properties.
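A sketch of that fallback, assuming the legacy fields are available on the restored endpoint (names are illustrative, not the actual libnetwork code):

package libnetwork

// legacyDNSNames rebuilds an endpoint's DNS names from the pre-existing
// myAliases and anonymous fields when dnsNames was not persisted.
func legacyDNSNames(name string, myAliases []string, anonymous bool) []string {
    if anonymous {
        // An anonymous endpoint never registered its own name in DNS,
        // only its aliases.
        return append([]string(nil), myAliases...)
    }
    return append([]string{name}, myAliases...)
}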
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Instead of special-casing anonymous endpoints in libnetwork, let the
daemon specify what (non fully qualified) DNS names should be associated
to container's endpoints.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
update the package, which contains a fix in the ssh package.
full diff: https://github.com/golang/crypto/compare/v0.16.0...v0.17.0
from the security mailing:
> Hello gophers,
>
> Version v0.17.0 of golang.org/x/crypto fixes a protocol weakness in the
> golang.org/x/crypto/ssh package that allowed a MITM attacker to compromise
> the integrity of the secure channel before it was established, allowing
> them to prevent transmission of a number of messages immediately after
> the secure channel was established without either side being aware.
>
> The impact of this attack is relatively limited, as it does not compromise
> confidentiality of the channel. Notably this attack would allow an attacker
> to prevent the transmission of the SSH2_MSG_EXT_INFO message, disabling a
> handful of newer security features.
>
> This protocol weakness was also fixed in OpenSSH 9.6.
>
> Thanks to Fabian Bäumer, Marcus Brinkmann, and Jörg Schwenk from Ruhr
> University Bochum for reporting this issue.
>
> This is CVE-2023-48795 and Go issue https://go.dev/issue/64784.
>
> Cheers,
> Roland on behalf of the Go team
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Ensure that when removing an image, an image is checked consistently
against the images with the same target digest. Add unit testing around
delete.
Signed-off-by: Derek McGowan <derek@mcg.dev>
Simplify the hijack process by just performing the http request/response
on the connection and returning the raw conn after success. The client
conn from httputil is deprecated and easily replaced.
Signed-off-by: Derek McGowan <derek@mcg.dev>
This new property is meant to replace myAliases and anonymous
properties.
The end goal is to get rid of both properties by letting the daemon
determine what (non fully qualified) DNS names should be associated to
them.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
They just happen to exist on a network that doesn't support DNS-based
service discovery (ie. no embedded DNS servers are started for them).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Calculate the IPv6 addresses needed on a bridge, then reconcile them
with the addresses on an existing bridge by deleting then adding as
required.
(Previously, required addresses were added one-by-one, then unwanted
addresses were removed. This meant the daemon failed to start if, for
example, an existing bridge had address '2000:db8::/64' and the config
was changed to '2000:db8::/80'.)
IPv6 addresses are now calculated and applied in one go, so there's no
need for setupVerifyAndReconcile() to check the set of IPv6 addresses on
the bridge. And, it was guarded by !config.InhibitIPv4, which can't have
been right. So, removed its IPv6 parts, and added IPv4 to its name.
Link local addresses, the example given in the original ticket, are now
released when containers are stopped. Not releasing them meant that
when using an LL subnet on the default bridge, no container could be
started after a container was stopped (because the calculated address
could not be re-allocated). In non-default bridge networks using an
LL subnet, addresses leaked.
Linux always uses the standard 'fe80::/64' LL network. So, if a bridge
is configured with an LL subnet prefix that overlaps with it, a config
error is reported. Non-overlapping LL subnet prefixes are allowed.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Before this change `ParentId` was filled for images when calling the
`/images/json` (image list) endpoint but was not for the
`/images/<image>/json` (image inspect).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This command was originally added by ea7f555446
to test the code snippet put into libnet's README.md. Nothing compiles
this file and it doesn't add any value to the project. So it's better to
remove it than to maintain it.
This commit also removes the code snippet from libnet's README.md for
the same reasons.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This repository is not yet a module (i.e., does not have a `go.mod`). This
is not problematic when building the code in GOPATH or "vendor" mode, but
when using the code as a module-dependency (in module-mode), different semantics
are applied since Go 1.21, which switches Go _language versions_ on a per-module,
per-package, or even per-file basis.
A condensed summary of that logic [is as follows][1]:
- For modules that have a go.mod containing a go version directive; that
version is considered a minimum _required_ version (starting with the
go1.19.13 and go1.20.8 patch releases: before those, it was only a
recommendation).
- For dependencies that don't have a go.mod (not a module), go language
version go1.16 is assumed.
- Likewise, for modules that have a go.mod, but the file does not have a
go version directive, go language version go1.16 is assumed.
- If a go.work file is present, but does not have a go version directive,
language version go1.17 is assumed.
When switching language versions, Go _downgrades_ the language version,
which means that language features (such as generics, and `any`) are not
available, and compilation fails. For example:
# github.com/docker/cli/cli/context/store
/go/pkg/mod/github.com/docker/cli@v25.0.0-beta.2+incompatible/cli/context/store/storeconfig.go:6:24: predeclared any requires go1.18 or later (-lang was set to go1.16; check go.mod)
/go/pkg/mod/github.com/docker/cli@v25.0.0-beta.2+incompatible/cli/context/store/store.go:74:12: predeclared any requires go1.18 or later (-lang was set to go1.16; check go.mod)
Note that these fallbacks are per-module, per-package, and can even be
per-file, so _(indirect) dependencies_ can still use modern language
features, as long as their respective go.mod has a version specified.
Unfortunately, these failures do not occur when building locally (using
vendor / GOPATH mode), but will affect consumers of the module.
Obviously, this situation is not ideal, and the ultimate solution is to
move to go modules (add a go.mod), but this comes with a not insignificant
risk in other areas (due to our complex dependency tree).
We can revert to using go1.16 language features only, but this may be
limiting, and may still be problematic when (e.g.) matching signatures
of dependencies.
There is an escape hatch: adding a `//go:build` directive to files that
make use of go language features. From the [go toolchain docs][2]:
> The go line for each module sets the language version the compiler enforces
> when compiling packages in that module. The language version can be changed
> on a per-file basis by using a build constraint.
>
> For example, a module containing code that uses the Go 1.21 language version
> should have a `go.mod` file with a go line such as `go 1.21` or `go 1.21.3`.
> If a specific source file should be compiled only when using a newer Go
> toolchain, adding `//go:build go1.22` to that source file both ensures that
> only Go 1.22 and newer toolchains will compile the file and also changes
> the language version in that file to Go 1.22.
This patch adds `//go:build` directives to those files using recent additions
to the language. It's currently using go1.19 as the version, to match the version
in our "vendor.mod", but we can consider being more permissive ("any" requires
go1.18 or up), or more "optimistic" (force go1.21, which is the version we
currently use to build).
For completeness' sake, note that any file _without_ a `//go:build` directive
will continue to use go1.16 language version when used as a module.
[1]: 58c28ba286/src/cmd/go/internal/gover/version.go (L9-L56)
[2]: https://go.dev/doc/toolchain
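For illustration, a hypothetical file in this (non-module) repository that uses `any` and generics could opt in to a newer language version like this:

//go:build go1.19

package store

// keys requires the go1.18+ language version; the build constraint above
// raises the per-file language version when this repository is consumed in
// module mode.
func keys[K comparable, V any](m map[K]V) []K {
    out := make([]K, 0, len(m))
    for k := range m {
        out = append(out, k)
    }
    return out
}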
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These fields were an implementation detail of the classic image builder
and are empty when using buildkit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When the doc job is skipped, the dependent ones will be skipped
as well. To fix this issue, we need to apply special conditions so that
dependent jobs always run, except when the workflow is canceled or has failed.
Signed-off-by: CrazyMax <1951866+crazy-max@users.noreply.github.com>
build --squash is an experimental feature that is not implemented in the
containerd image store.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
We already have everything needed to work inside a container; with this
configuration file, developing in moby is even easier: the IDE will ask
you if you want to run everything inside a container and set it up for
you. No need to know that you have to run "BIN_DIR=. make shell" any
more.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
No need to copy the parent label from the source dangling image, because
it will already be copied from the source image.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
A validation step was added to prevent the daemon from considering "logentries"
a dynamically loaded plugin; without it, the daemon would keep trying to load the plugin;
WARN[2023-12-12T21:53:16.866857127Z] Unable to locate plugin: logentries, retrying in 1s
WARN[2023-12-12T21:53:17.868296836Z] Unable to locate plugin: logentries, retrying in 2s
WARN[2023-12-12T21:53:19.874259254Z] Unable to locate plugin: logentries, retrying in 4s
WARN[2023-12-12T21:53:23.879869881Z] Unable to locate plugin: logentries, retrying in 8s
But would ultimately be returned as an error to the user:
docker container create --name foo --log-driver=logentries nginx:alpine
Error response from daemon: error looking up logging plugin logentries: plugin "logentries" not found
With the additional validation step, an error is returned immediately:
docker container create --log-driver=logentries busybox
Error response from daemon: the logentries logging driver has been deprecated and removed
A migration step was added on container restore. Containers using the
"logentries" logging driver are migrated to use the "local" logging driver:
WARN[2023-12-12T22:38:53.108349297Z] migrated deprecated logentries logging driver container=4c9309fedce75d807340ea1820cc78dc5c774d7bfcae09f3744a91b84ce6e4f7 error="<nil>"
As an alternative to the validation step, I also considered using a "stub"
deprecation driver, however this would not result in an error when creating
the container, and only produce an error when starting:
docker container create --name foo --log-driver=logentries nginx:alpine
4c9309fedce75d807340ea1820cc78dc5c774d7bfcae09f3744a91b84ce6e4f7
docker start foo
Error response from daemon: failed to create task for container: failed to initialize logging driver: the logentries logging driver has been deprecated and removed
Error: failed to start containers: foo
For containers, this validation is added in the backend (daemon). For services,
this was not sufficient, as SwarmKit would try to schedule the task, which
caused an endless loop;
docker service create --log-driver=logentries --name foo nginx:alpine
zo0lputagpzaua7cwga4lfmhp
overall progress: 0 out of 1 tasks
1/1: no suitable node (missing plugin on 1 node)
Operation continuing in background.
DEBU[2023-12-12T22:50:28.132732757Z] Calling GET /v1.43/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22zo0lputagpzaua7cwga4lfmhp%22%3Atrue%7D%7D
DEBU[2023-12-12T22:50:28.137961549Z] Calling GET /v1.43/nodes
DEBU[2023-12-12T22:50:28.340665007Z] Calling GET /v1.43/services/zo0lputagpzaua7cwga4lfmhp?insertDefaults=false
DEBU[2023-12-12T22:50:28.343437632Z] Calling GET /v1.43/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22zo0lputagpzaua7cwga4lfmhp%22%3Atrue%7D%7D
DEBU[2023-12-12T22:50:28.345201257Z] Calling GET /v1.43/nodes
So a validation was added in the service create and update endpoints;
docker service create --log-driver=logentries --name foo nginx:alpine
Error response from daemon: the logentries logging driver has been deprecated and removed
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The service was discontinued on November 15, 2022, so
remove mentions of this driver in the API docs.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The service was discontinued on November 15, 2022, so
remove mentions of this driver in the API docs.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Logentries service will be discontinued next week:
> Dear Logentries user,
>
> We have identified you as the owner of, or collaborator of, a Logentries account.
>
> The Logentries service will be discontinued on November 15th, 2022. This means that your Logentries account access will be removed and all your log data will be permanently deleted on this date.
>
> Next Steps
> If you are interested in an alternative Rapid7 log management solution, InsightOps will be available for purchase through December 16th, 2022. Please note, there is no support to migrate your existing Logentries account to InsightOps.
>
> Thank you for being a valued user of Logentries.
>
> Thank you,
> Rapid7 Customer Success
There is no reason to preserve this code in Moby as a result.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit switches our code to use semconv 1.21, which is the version matching
the OTEL modules, as well as the containerd code.
The BuildKit 0.12.x module currently uses an older version of the OTEL modules,
and uses the semconv 0.17 schema. Mixing schema-versions is problematic, but
we still want to consume BuildKit's "detect" package to wire-up other parts
of OTEL.
To align the versions in our code, this patch sets the BuildKit detect.Resource
with the correct semconv version.
It's worth noting that the BuildKit package has a custom "serviceNameDetector";
https://github.com/moby/buildkit/blob/v0.12.4/util/tracing/detect/detect.go#L153-L169
Which is merged with OTEL's default resource:
https://github.com/moby/buildkit/blob/v0.12.4/util/tracing/detect/detect.go#L100-L107
There's no need to duplicate that code, as OTEL's `resource.Default()` already
provides this functionality:
- It uses fromEnv{} detector internally: https://github.com/open-telemetry/opentelemetry-go/blob/v1.19.0/sdk/resource/resource.go#L208
- fromEnv{} detector reads OTEL_SERVICE_NAME: https://github.com/open-telemetry/opentelemetry-go/blob/v1.19.0/sdk/resource/env.go#L53
This patch also removes uses of the httpconv package, which is no longer included
in semconv 1.21 and is now an internal package. Removing the use of this package
means that hijacked connections will not have the HTTP attributes on the Moby
client span, which isn't ideal, but a limited loss that would only impact exec/attach.
The span itself will still exist, it just won't have the additional attributes that
are added by that package.
Alternatively, the httpconv call COULD remain - it will not error and will send
syntactically valid spans, but we would be mixing & matching semconv versions,
so it won't be compliant.
Some parts of the httpconv package were preserved through a very minimal local
implementation; a variant of `httpconv.ClientStatus(resp.StatusCode)` is added
to set the span status (`span.SetStatus()`). The `httpconv` package has complex
logic for this, but it mostly boils down to the HTTP status range (1xx/2xx/3xx/4xx/5xx)
to determine if the status was successful or non-successful (4xx/5xx).
The additional logic it provided was to validate actual status-codes, and to
convert "bogus" status codes in "success" ranges (1xx, 2xx) into an error. That
code seemed over-reaching (and not accounting for potential future _valid_
status codes). Let's assume we only get valid status codes.
- https://github.com/open-telemetry/opentelemetry-go/blob/v1.21.0/semconv/v1.17.0/httpconv/http.go#L85-L89
- https://github.com/open-telemetry/opentelemetry-go/blob/v1.21.0/semconv/internal/v2/http.go#L322-L330
- https://github.com/open-telemetry/opentelemetry-go/blob/v1.21.0/semconv/internal/v2/http.go#L356-L404
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upgrade to the latest OpenTelemetry libraries; this will unblock a lot of
downstream projects in the ecosystem to upgrade, as some of the parts here
were pre-1.0/unstable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
update the dependency to v0.2.4 to prevent scanners from flagging the
vulnerability (GHSA-6xv5-86q9-7xr8 / GO-2023-2048). Note that the vulnerability
only affects Windows, and the package is currently only used in runc/libcontainer,
so it should not impact our use (as that code is Linux-only).
full diff: https://github.com/cyphar/filepath-securejoin/compare/v0.2.3...v0.2.4
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.10...v1.7.11
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.11
Welcome to the v1.7.11 release of containerd!
The eleventh patch release for containerd 1.7 contains various fixes and
updates including one security issue.
Notable Updates
- Fix Windows default path overwrite issue
- Update push to always inherit distribution sources from parent
- Update shim to use net dial for gRPC shim sockets
- Fix otel version incompatibility
- Fix Windows snapshotter blocking snapshot GC on remove failure
- Mask /sys/devices/virtual/powercap path in runtime spec and deny in
default apparmor profile [GHSA-7ww5-4wqc-m92c]
Deprecation Warnings
- Emit deprecation warning for AUFS snapshotter
- Emit deprecation warning for v1 runtime
- Emit deprecation warning for deprecated CRI configs
- Emit deprecation warning for CRI v1alpha1 usage
- Emit deprecation warning for CRIU config in CRI
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.9...v1.7.10
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.10
Welcome to the v1.7.10 release of containerd!
The tenth patch release for containerd 1.7 contains various fixes and
updates.
Notable Updates
- Enhance container image unpack client logs
- cri: fix using the pinned label to pin image
- fix: ImagePull should close http connection if there is no available data to read.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When reading logs, timestamps should always be presented in UTC. Unlike
the "json-file" and other logging drivers, the "local" logging driver
was using local time.
Thanks to Roman Valov for reporting this issue, and locating the bug.
Before this change:
echo $TZ
Europe/Amsterdam
docker run -d --log-driver=local nginx:alpine
fc166c6b2c35c871a13247dddd95de94f5796459e2130553eee91cac82766af3
docker logs --timestamps fc166c6b2c35c871a13247dddd95de94f5796459e2130553eee91cac82766af3
2023-12-08T18:16:56.291023422+01:00 /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2023-12-08T18:16:56.291056463+01:00 /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
2023-12-08T18:16:56.291890130+01:00 /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
...
With this patch:
echo $TZ
Europe/Amsterdam
docker run -d --log-driver=local nginx:alpine
14e780cce4c827ce7861d7bc3ccf28b21f6e460b9bfde5cd39effaa73a42b4d5
docker logs --timestamps 14e780cce4c827ce7861d7bc3ccf28b21f6e460b9bfde5cd39effaa73a42b4d5
2023-12-08T17:18:46.635967625Z /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2023-12-08T17:18:46.635989792Z /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
2023-12-08T17:18:46.636897417Z /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
...
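The gist of the change, as a sketch (the actual encoding in the "local" driver differs):

package local

import "time"

// formatTimestamp records/prints log timestamps in UTC rather than in the
// daemon's local time zone.
func formatTimestamp(t time.Time) string {
    return t.UTC().Format(time.RFC3339Nano)
}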
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If no `dangling` filter is specified, prune should only delete dangling
images.
This wasn't visible by doing `docker image prune` because the CLI
explicitly sets this filter to true.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Isolate the prune effects by running the test in a separate daemon.
This minimizes the impact of/on other integration tests.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Config serialization performed by the graphdriver implementation
maintained the distinction between an empty array and having no Cmd set.
With containerd integration we serialize the OCI types directly, which use
the `omitempty` option and therefore don't persist that distinction.
Considering that both values should have exactly the same semantics (no
cmd being passed), it should be fine if, in this case, the Cmd is
null instead of an empty array.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Acquire the mutex in the help handler to synchronize access to the
handlers map. While a trivial issue---a panic in the request handler if
the node joins a swarm at just the right time, which would only result
in an HTTP 500 response---it is also a trivial race condition to fix.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We don't need C-style callback functions which accept a void* context
parameter: Go has closures. Drop the unnecessary httpHandlerCustom type
and refactor the diagnostic server handler functions into closures which
capture whatever context they need implicitly.
If the node leaves and rejoins a swarm, the cluster agent and its
associated NetworkDB are discarded and replaced with new instances. Upon
rejoin, the agent registers its NetworkDB instance with the diagnostic
server. These handlers would all conflict with the handlers registered
by the previous NetworkDB instance. Attempting to register a second
handler on a http.ServeMux with the same pattern will panic, which the
diagnostic server would historically deal with by ignoring the duplicate
handler registration. Consequently, the first NetworkDB instance to be
registered would "stick" to the diagnostic server for the lifetime of
the process, even after it is replaced with another instance. Improve
duplicate-handler registration such that the most recently-registered
handler for a pattern is used for all subsequent requests.
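A minimal sketch of the improved registration behaviour (not the actual diagnostic-server code):

package diagnostic

import (
    "net/http"
    "sync"
)

// replacingMux is a tiny mux where registering the same pattern twice
// replaces the previous handler instead of panicking like http.ServeMux does.
type replacingMux struct {
    mu       sync.Mutex
    handlers map[string]http.Handler
}

func (m *replacingMux) Handle(pattern string, h http.Handler) {
    m.mu.Lock()
    defer m.mu.Unlock()
    if m.handlers == nil {
        m.handlers = make(map[string]http.Handler)
    }
    m.handlers[pattern] = h // the most recently registered handler wins
}

func (m *replacingMux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    m.mu.Lock()
    h, ok := m.handlers[r.URL.Path]
    m.mu.Unlock()
    if !ok {
        http.NotFound(w, r)
        return
    }
    h.ServeHTTP(w, r)
}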
Signed-off-by: Cory Snider <csnider@mirantis.com>
Rewrite the `.build-empty-images` shell script that produced special images
(emptyfs with no layers, and an empty dangling image) into Go functions
that construct the same archives in a temporary directory.
Use them to load these images on demand only in the tests that need
them.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This struct is intended for internal use only for the backend, and is
not intended to be used externally.
This moves the plugin-related `NetworkListConfig` types to the backend
package to prevent it being imported in the client, and to make it more
clear that this is part of internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These structs are intended for internal use only for the backend, and are
not intended to be used externally.
This moves the plugin-related `PluginRmConfig`, `PluginEnableConfig`, and
`PluginDisableConfig` types to the backend package to prevent them being
imported in the client, and to make it more clear that this is part of
internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.21.5 (released 2023-12-05) includes security fixes to the go command,
and the net/http and path/filepath packages, as well as bug fixes to the
compiler, the go command, the runtime, and the crypto/rand, net, os, and
syscall packages. See the Go 1.21.5 milestone on our issue tracker for
details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.21.5+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.21.4...go1.21.5
from the security mailing:
[security] Go 1.21.5 and Go 1.20.12 are released
Hello gophers,
We have just released Go versions 1.21.5 and 1.20.12, minor point releases.
These minor releases include 3 security fixes following the security policy:
- net/http: limit chunked data overhead
A malicious HTTP sender can use chunk extensions to cause a receiver
reading from a request or response body to read many more bytes from
the network than are in the body.
A malicious HTTP client can further exploit this to cause a server to
automatically read a large amount of data (up to about 1GiB) when a
handler fails to read the entire body of a request.
Chunk extensions are a little-used HTTP feature which permit including
additional metadata in a request or response body sent using the chunked
encoding. The net/http chunked encoding reader discards this metadata.
A sender can exploit this by inserting a large metadata segment with
each byte transferred. The chunk reader now produces an error if the
ratio of real body to encoded bytes grows too small.
Thanks to Bartek Nowotarski for reporting this issue.
This is CVE-2023-39326 and Go issue https://go.dev/issue/64433.
- cmd/go: go get may unexpectedly fallback to insecure git
Using go get to fetch a module with the ".git" suffix may unexpectedly
fallback to the insecure "git://" protocol if the module is unavailable
via the secure "https://" and "git+ssh://" protocols, even if GOINSECURE
is not set for said module. This only affects users who are not using
the module proxy and are fetching modules directly (i.e. GOPROXY=off).
Thanks to David Leadbeater for reporting this issue.
This is CVE-2023-45285 and Go issue https://go.dev/issue/63845.
- path/filepath: retain trailing \ when cleaning paths like \\?\c:\
Go 1.20.11 and Go 1.21.4 inadvertently changed the definition of the
volume name in Windows paths starting with \\?\, resulting in
filepath.Clean(\\?\c:\) returning \\?\c: rather than \\?\c:\ (among
other effects). The previous behavior has been restored.
This is an update to CVE-2023-45283 and Go issue https://go.dev/issue/64028.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.21.4 (released 2023-11-07) includes security fixes to the path/filepath
package, as well as bug fixes to the linker, the runtime, the compiler, and
the go/types, net/http, and runtime/cgo packages. See the Go 1.21.4 milestone
on our issue tracker for details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.21.4+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.21.3...go1.21.4
from the security mailing:
[security] Go 1.21.4 and Go 1.20.11 are released
Hello gophers,
We have just released Go versions 1.21.4 and 1.20.11, minor point releases.
These minor releases include 2 security fixes following the security policy:
- path/filepath: recognize `\??\` as a Root Local Device path prefix.
On Windows, a path beginning with `\??\` is a Root Local Device path equivalent
to a path beginning with `\\?\`. Paths with a `\??\` prefix may be used to
access arbitrary locations on the system. For example, the path `\??\c:\x`
is equivalent to the more common path c:\x.
The filepath package did not recognize paths with a `\??\` prefix as special.
Clean could convert a rooted path such as `\a\..\??\b` into
the root local device path `\??\b`. It will now convert this
path into `.\??\b`.
`IsAbs` did not report paths beginning with `\??\` as absolute.
It now does so.
VolumeName now reports the `\??\` prefix as a volume name.
`Join(`\`, `??`, `b`)` could convert a seemingly innocent
sequence of path elements into the root local device path
`\??\b`. It will now convert this to `\.\??\b`.
This is CVE-2023-45283 and https://go.dev/issue/63713.
- path/filepath: recognize device names with trailing spaces and superscripts
The `IsLocal` function did not correctly detect reserved names in some cases:
- reserved names followed by spaces, such as "COM1 ".
- "COM" or "LPT" followed by a superscript 1, 2, or 3.
`IsLocal` now correctly reports these names as non-local.
This is CVE-2023-45284 and https://go.dev/issue/63713.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon currently provides support for API versions all the way back
to v1.12, which is the version of the API that shipped with docker 1.0. On
Windows, the minimum supported version is v1.24.
Such old versions of the client are rare, and supporting older API versions
has accumulated significant amounts of code to remain backward-compatible
(which is largely untested, and a "best-effort" at most).
This patch updates the minimum API version to v1.24, which is the fallback
API version used when API-version negotiation fails. The intent is to start
deprecating older API versions, but no code is removed yet as part of this
patch, and a DOCKER_MIN_API_VERSION environment variable is added, which
allows overriding the minimum version (to allow restoring the behavior from
before this patch).
With this patch the daemon defaults to API v1.24 as minimum:
docker version
Client:
Version: 24.0.2
API version: 1.43
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:50:49 2023
OS/Arch: linux/arm64
Context: default
Server:
Engine:
Version: dev
API version: 1.44 (minimum version 1.24)
Go version: go1.21.3
Git commit: 0322a29b9ef8806aaa4b45dc9d9a2ebcf0244bf4
Built: Mon Dec 4 15:22:17 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: v1.7.9
GitCommit: 4f03e100cb967922bec7459a78d16ccbac9bb81d
runc:
Version: 1.1.10
GitCommit: v1.1.10-0-g18a0cb0
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Trying to use an older version of the API produces an error:
DOCKER_API_VERSION=1.23 docker version
Client:
Version: 24.0.2
API version: 1.23 (downgraded from 1.43)
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:50:49 2023
OS/Arch: linux/arm64
Context: default
Error response from daemon: client version 1.23 is too old. Minimum supported API version is 1.24, please upgrade your client to a newer version
To restore the previous minimum, users can start the daemon with the
DOCKER_MIN_API_VERSION environment variable set:
DOCKER_MIN_API_VERSION=1.12 dockerd
API 1.12 is the oldest supported API version on Linux;
docker version
Client:
Version: 24.0.2
API version: 1.43
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:50:49 2023
OS/Arch: linux/arm64
Context: default
Server:
Engine:
Version: dev
API version: 1.44 (minimum version 1.12)
Go version: go1.21.3
Git commit: 0322a29b9ef8806aaa4b45dc9d9a2ebcf0244bf4
Built: Mon Dec 4 15:22:17 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: v1.7.9
GitCommit: 4f03e100cb967922bec7459a78d16ccbac9bb81d
runc:
Version: 1.1.10
GitCommit: v1.1.10-0-g18a0cb0
docker-init:
Version: 0.19.0
GitCommit: de40ad0
When using the `DOCKER_MIN_API_VERSION` with a version of the API that
is not supported, an error is produced when starting the daemon;
DOCKER_MIN_API_VERSION=1.11 dockerd --validate
invalid DOCKER_MIN_API_VERSION: minimum supported API version is 1.12: 1.11
DOCKER_MIN_API_VERSION=1.45 dockerd --validate
invalid DOCKER_MIN_API_VERSION: maximum supported API version is 1.44: 1.45
Specifying a malformed API version also produces the same error;
DOCKER_MIN_API_VERSION=hello dockerd --validate
invalid DOCKER_MIN_API_VERSION: minimum supported API version is 1.12: hello
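A rough sketch of the DOCKER_MIN_API_VERSION validation, assuming the existing api/types/versions comparison helpers (constants and helper name are illustrative):

package config

import (
    "fmt"

    "github.com/docker/docker/api/types/versions"
)

const (
    defaultMinAPIVersion = "1.24" // fallback used when version negotiation fails
    minSupportedVersion  = "1.12" // oldest API version the daemon can still serve
    maxAPIVersion        = "1.44" // current API version
)

// resolveMinAPIVersion picks the minimum API version the daemon accepts,
// optionally overridden through DOCKER_MIN_API_VERSION.
func resolveMinAPIVersion(override string) (string, error) {
    if override == "" {
        return defaultMinAPIVersion, nil
    }
    if versions.LessThan(override, minSupportedVersion) {
        return "", fmt.Errorf("invalid DOCKER_MIN_API_VERSION: minimum supported API version is %s: %s", minSupportedVersion, override)
    }
    if versions.GreaterThan(override, maxAPIVersion) {
        return "", fmt.Errorf("invalid DOCKER_MIN_API_VERSION: maximum supported API version is %s: %s", maxAPIVersion, override)
    }
    return override, nil
}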
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ContainerCreateConfig` and `ContainerRmConfig` structs are used for
options to be passed to the backend, and are not used in client code.
These structs are currently intended for internal use only (for example,
`AdjustCPUShares` is an internal implementation detail to adjust the container's
config when older API versions are used).
Somewhat ironically, the signature of the Backend has a nicer UX than that
of the client's `ContainerCreate` signature (which expects all options to
be passed as separate arguments), so we may want to update that signature
to be closer to what the backend is using, but that can be left as a future
exercise.
This patch moves the `ContainerCreateConfig` and `ContainerRmConfig` structs
to the backend package to prevent them from being imported in the client, and to
make it more clear that this is part of internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Follow-up to e72c4818c4, which updated the
Dockerfile to use Debian 12 "bookworm", but forgot to update the package
repository to use for the CRIU packages. Note that the criu stage is currently
not built by default (see d3d2823edf), so to
verify the stage, it needs to be built manually;
docker build --target=criu .
This patch adds an extra `criu --version` to the build, so that it's verified
to be "functional".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tests build new images; setupTest sets up the test cleanup
function that clears the created images, containers, etc. from the test
environment.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This removes various skips that accounted for running the integration tests
against older versions of the daemon before 20.10 (API version v1.41). Those
versions are EOL, and we don't run tests against them.
This reverts most of e440831802, and similar
PRs.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the minimum API version as advertised by the test-daemon, instead of the
hard-coded API version from code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was using API version 1.20 to test old behavior, but the actual change
in behavior was API v1.25; see commit 6d98e344c7
and 63b5a37203.
This updates the test to use API v1.24 to test the old behavior.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
TestKillDifferentUserContainer was migrated from integration-cli in
commit 0855922cd3. Before migration, it
was not using a specific API version, so we can assume "current"
API version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 59b83d8aae containerized these steps,
as they didn't work well on Debian Jessie:
> Because the `mount` here will sometimes fail when run in `debian:jessie`,
> which is what the environment hosting the test suite is running if run
> from the `Makefile`.
> Also, why the heck not containerize it, all the things.
Follow-up commits, such as 228d74842f, and
1c5806cf57 updated the Debian distro, but
also updated this comment, losing the original context (the issue was
(originally) related to Debian Jessie).
This patch changes the test back to not use containers, which seems to
work fine (at least "it worked on my machine").
make TEST_IGNORE_CGROUP_CHECK=1 TEST_FILTER=TestDaemonNoSpaceLeftOnDeviceError DOCKER_GRAPHDRIVER=overlay2 test-integration
=== RUN TestDockerDaemonSuite/TestDaemonNoSpaceLeftOnDeviceError
check_test.go:589: [df36ad96a412b] daemon is not started
--- PASS: TestDockerDaemonSuite (5.12s)
--- PASS: TestDockerDaemonSuite/TestDaemonNoSpaceLeftOnDeviceError (5.12s)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was originally added in 8ec8564691,
at which time the upstream debian package repositories were not always
reliable, so using a mirror helped with CI stability and performance.
Debian's package repositories are a lot more reliable now, so there's no
longer a need to use a mirror.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When creating a CIFS volume, generate an error if the device URL
includes a port number, for example:
--opt device="//some.server.com:2345/thepath"
The port must be specified in the port option instead, for example:
--opt o=username=USERNAME,password=PASSWORD,vers=3,sec=ntlmsspi,port=1234
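A sketch of the check (illustrative helper, not the actual volume driver code):

package cifs

import (
    "errors"
    "net/url"
)

// checkDevice rejects a port embedded in the CIFS device URL; the port has to
// be passed through the 'port' mount option instead.
func checkDevice(device string) error {
    u, err := url.Parse(device)
    if err != nil {
        return err
    }
    if u.Port() != "" {
        return errors.New("port must not be specified in the device URL, use the 'port' option instead")
    }
    return nil
}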
Signed-off-by: Rob Murray <rob.murray@docker.com>
Add the TaskStatus, PortStatus and ContainerStatus to the API docs. TaskStatus was moved to the swagger definitions root from an anonymous type definition, and PortStatus and ContainerStatus are its dependencies.
Signed-off-by: Martin Jirku <martin@jirku.sk>
Move the initialization logic to the attachContext itself, so that
the container doesn't have to be aware of mutexes and other logic.
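Roughly, the pattern looks like this (a sketch; the actual field and method names may differ):

package container

import (
    "context"
    "sync"
)

// attachContext owns its own lazy initialization, so the container type no
// longer needs to deal with the mutex directly.
type attachContext struct {
    mu     sync.Mutex
    ctx    context.Context
    cancel context.CancelFunc
}

// init returns the context, creating it on first use.
func (ac *attachContext) init() context.Context {
    ac.mu.Lock()
    defer ac.mu.Unlock()
    if ac.ctx == nil {
        ac.ctx, ac.cancel = context.WithCancel(context.Background())
    }
    return ac.ctx
}

// cancelFunc cancels and resets the context, if one was created.
func (ac *attachContext) cancelFunc() {
    ac.mu.Lock()
    defer ac.mu.Unlock()
    if ac.cancel != nil {
        ac.cancel()
        ac.ctx, ac.cancel = nil, nil
    }
}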
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Skip TestListDanglingImagesWithDigests which tests graphdriver
implementation specific behavior of `docker images --filter
dangling=true`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This removes various templating functions that were added for the
docker CLI. These are not needed for the dockerd binary, which does
not have subcommands or management commands.
Revert "Only hide commands if the env variable is set."
This reverts commit a7c8bcac2b.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We used DEBIAN_FRONTEND in some places to prevent installation of packages
from being blocked. However, debian bookworm now [includes a fix][1] for
situations like this (it was specifically reported for Docker situations <3),
so we can get rid of these.
Thanks to Tianon for noticing this, and for linking to the Debian ticket!
[1]: https://bugs.debian.org/929417
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Replace `time.Sleep` with a poll that checks whether the process no longer
exists, to avoid a possible race condition.
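For example, using the gotest.tools poll helpers already used in the integration tests (a sketch; the helper name is an assumption):

package integration

import (
    "syscall"
    "testing"
    "time"

    "gotest.tools/v3/poll"
)

// waitProcessGone polls until the given PID no longer exists, instead of
// sleeping for a fixed duration.
func waitProcessGone(t *testing.T, pid int, timeout time.Duration) {
    poll.WaitOn(t, func(poll.LogT) poll.Result {
        // Signal 0 performs error checking only; ESRCH means the process is gone.
        if err := syscall.Kill(pid, 0); err == syscall.ESRCH {
            return poll.Success()
        }
        return poll.Continue("process %d still running", pid)
    }, poll.WithTimeout(timeout))
}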
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
I noticed this message being logged as an error, but the kill logic actually
proceeds after this (doing a "direct" kill instead). While containers are usually
expected to exit within the given timeout, I don't think this
needs to be logged as an error (an error is returned after we fail to
kill the container).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The auth service error response is not a part of the spec, and containerd
doesn't parse it the way Docker's distribution code does.
Check for containerd-specific errors instead.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Debian Bookworm ships with a newer version of iptables, which caused two
tests to fail:
=== FAIL: amd64.integration-cli TestDockerDaemonSuite/TestDaemonICCLinkExpose (1.18s)
docker_cli_daemon_test.go:841: assertion failed: false (matched bool) != true (true bool): iptables output should have contained "DROP.*all.*ext-bridge6.*ext-bridge6", but was "Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)\n pkts bytes target prot opt in out source destination \n 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- * ext-bridge6 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED\n 0 0 DOCKER 0 -- * ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- ext-bridge6 !ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n 0 0 DROP 0 -- ext-bridge6 ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n"
--- FAIL: TestDockerDaemonSuite/TestDaemonICCLinkExpose (1.18s)
=== FAIL: amd64.integration-cli TestDockerDaemonSuite/TestDaemonICCPing (1.19s)
docker_cli_daemon_test.go:803: assertion failed: false (matched bool) != true (true bool): iptables output should have contained "DROP.*all.*ext-bridge5.*ext-bridge5", but was "Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)\n pkts bytes target prot opt in out source destination \n 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- * ext-bridge5 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED\n 0 0 DOCKER 0 -- * ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- ext-bridge5 !ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n 0 0 DROP 0 -- ext-bridge5 ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n"
--- FAIL: TestDockerDaemonSuite/TestDaemonICCPing (1.19s)
Both the `TestDaemonICCPing`, and `TestDaemonICCLinkExpose` test were introduced
in dd0666e64f. These tests called `iptables` with
the `-n` (`--numeric`) option, which prevents it from doing a reverse-DNS lookup
as an optimization.
However, the `-n` option did not affect the `prot` column before
commit [da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa] (iptables < v1.8.9 or v1.8.8).
Newer versions, such as the iptables version shipping with Debian Bookworm, do,
so we need to update the expected output for this version.
This patch removes the `-n` option, to keep the test more portable, also when
run non-containerized, and removes the use of regular expressions to check the
result, as these regular expressions were quite permissive (using `.*` wildcard
matching). Instead, the test now checks the iptables output for the expected
rules directly.
With this change;
make DOCKER_GRAPHDRIVER=vfs TEST_FILTER=TestDaemonICC TEST_IGNORE_CGROUP_CHECK=1 test-integration
...
--- PASS: TestDockerDaemonSuite (139.11s)
--- PASS: TestDockerDaemonSuite/TestDaemonICCLinkExpose (54.62s)
--- PASS: TestDockerDaemonSuite/TestDaemonICCPing (84.48s)
[da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa]: https://git.netfilter.org/iptables/commit/?id=da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When live-restore is enabled, containers with autoremove enabled
shouldn't be forcibly killed when the engine restarts.
They should still be removed if they exited while the engine was down,
though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Change the repo name used for an intermediate image so it doesn't
try to mount from the image pushed by `TestBuildMultiStageImplicitPull`.
Before this patch, this test failed because the distribution.source
labels are not cleared between tests and the busybox content still has
the distribution.source label pointing to the `dockercli/testf`
repository which is no longer present in the test registry.
So both `dockercli/busybox` and `dockercli/testf` are equally valid
mount candidates for `dockercli/crossrepopush`, and the containerd algorithm
just happens to select the last one.
This changes the repo name to not have the common repository component
(`dockercli`) with the `dockercli/testf` repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Starting with [6e0ed3d19c54603f0f7d628ea04b550151d8a262], the minimum
allowed size is now 300MB. Given that this is a sparse image, and
the size of the image is irrelevant to the test (we check for
limits defined through project-quotas, not the size of the
device itself), we can raise the size of this image.
[6e0ed3d19c54603f0f7d628ea04b550151d8a262]: https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/commit/?id=6e0ed3d19c54603f0f7d628ea04b550151d8a262
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On bookworm, AppArmor failed to start inside the container, which can be
seen at startup of the dev-container:
Created symlink /etc/systemd/system/systemd-firstboot.service → /dev/null.
Created symlink /etc/systemd/system/systemd-udevd.service → /dev/null.
Created symlink /etc/systemd/system/multi-user.target.wants/docker-entrypoint.service → /etc/systemd/system/docker-entrypoint.service.
hack/dind-systemd: starting /lib/systemd/systemd --show-status=false --unit=docker-entrypoint.target
systemd 252.17-1~deb12u1 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
Detected virtualization docker.
Detected architecture x86-64.
modprobe@configfs.service: Deactivated successfully.
modprobe@dm_mod.service: Deactivated successfully.
modprobe@drm.service: Deactivated successfully.
modprobe@efi_pstore.service: Deactivated successfully.
modprobe@fuse.service: Deactivated successfully.
modprobe@loop.service: Deactivated successfully.
apparmor.service: Starting requested but asserts failed.
proc-sys-fs-binfmt_misc.automount: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 49 (systemd-binfmt)
+ source /etc/docker-entrypoint-cmd
++ hack/make.sh dynbinary test-integration
When checking "aa-status", an error was printed that the filesystem was
not mounted:
aa-status
apparmor filesystem is not mounted.
apparmor module is loaded.
Checking if "local-fs.target" was loaded, that seemed to be the case;
systemctl status local-fs.target
● local-fs.target - Local File Systems
Loaded: loaded (/lib/systemd/system/local-fs.target; static)
Active: active since Mon 2023-11-27 10:48:38 UTC; 18s ago
Docs: man:systemd.special(7)
However, **on the host**, "/sys/kernel/security" has a mount, which was not
present inside the container:
mount | grep securityfs
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
Interestingly, on `debian:bullseye`, this was not the case either; no
`securityfs` mount was present inside the container, and apparmor did not
actually start, but reported success silently:
mount | grep securityfs
systemctl start apparmor
systemctl status apparmor
● apparmor.service - Load AppArmor profiles
Loaded: loaded (/lib/systemd/system/apparmor.service; enabled; vendor preset: enabled)
Active: active (exited) since Mon 2023-11-27 11:59:09 UTC; 44s ago
Docs: man:apparmor(7)
https://gitlab.com/apparmor/apparmor/wikis/home/
Process: 43 ExecStart=/lib/apparmor/apparmor.systemd reload (code=exited, status=0/SUCCESS)
Main PID: 43 (code=exited, status=0/SUCCESS)
CPU: 10ms
Nov 27 11:59:09 9519f89cade1 apparmor.systemd[43]: Not starting AppArmor in container
Same, using the `/etc/init.d/apparmor` script:
/etc/init.d/apparmor start
Starting apparmor (via systemctl): apparmor.service.
echo $?
0
And apparmor was not actually active:
aa-status
apparmor module is loaded.
apparmor filesystem is not mounted.
aa-enabled
Maybe - policy interface not available.
After further investigation, I found that the non-systemd dind script
had a mount for AppArmor, which was added in 31638ab2ad.
The systemd variant was missing this mount, which may have gone unnoticed
because `debian:bullseye` silently ignored the failure when starting the
apparmor service.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
BaseFS is not serialized and is lost after an unclean shutdown. The Unmount
method in the containerd image service implementation will not work
correctly in that case.
This patch will allow Unmount to restore the BaseFS if the target is
still mounted.
The reason it works with graphdrivers is that it doesn't directly
operate on BaseFS. It uses RWLayer, which is explicitly restored
as soon as the container is loaded.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is purely cosmetic - if a non-default MTU is configured, the bridge
will have the default MTU=1500 until a container's 'veth' is connected
and an MTU is set on the veth. That's disconcerting; it looks like the
config has been ignored - so, set the bridge's MTU explicitly.
Fixes #37937
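A sketch of the explicit MTU set, assuming the vishvananda/netlink package used by libnetwork (helper name is illustrative):

package bridge

import "github.com/vishvananda/netlink"

// setBridgeMTU applies the configured MTU to the bridge link itself, so the
// bridge doesn't sit at the default 1500 until the first veth is attached.
func setBridgeMTU(bridgeName string, mtu int) error {
    if mtu <= 0 {
        return nil // keep the kernel default
    }
    link, err := netlink.LinkByName(bridgeName)
    if err != nil {
        return err
    }
    return netlink.LinkSetMTU(link, mtu)
}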
Signed-off-by: Rob Murray <rob.murray@docker.com>
The graphdriver implementation sets the ModTime of all image content to
match the `Created` time from the image config, whereas containerd's
archive export code just leaves it empty (zero).
Adjust the test, in the case where containerd integration is enabled, to
check whether the config file ModTime is equal to zero (Unix epoch) instead.
This behaviour is not a part of the Docker Image Specification, and the
intention behind introducing it was to make `docker save` produce
the same archive regardless of the time at which it was performed.
It would also be a bit problematic with the OCI archive layout which can
contain multiple images referencing the same content.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Also, the error variable `e` is renamed to the more standard `err`, as the defer
already uses `retErr` to avoid clashes (changed in f5a611a74).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
DNS config is a property of each adapter on Windows, thus we have a
dedicated `EndpointOption` for that.
The list of `EndpointOption` that should be applied to a given endpoint
is built by `buildCreateEndpointOptions`. This function contains a
seemingly flawed condition that adds the DNS config _iff_:
1. the network isn't internal;
2. no ports are published / exposed through another sandbox endpoint;
While 1. does make sense, there's actually no justification for 2.,
hence this commit removes this part of the condition.
This logic flaw has been made obvious by 0fd0e82, but it was originally
introduced by d1e0a78. Commit and PR comments don't mention why this is
done like so. Most probably, this was overlooked both by the original
author and the PR reviewers.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The `buildCreateEndpointOptions` does a lot of things to build the list
of `libnetwork.EndpointOption` from the `EndpointSettings` spec. To skip
ports-related options, an early return was put in the middle of that
function body.
Early returns are generally great, but put in the middle of a 150-loc
long function that does a lot, they're just a potential footgun. And I'm
the one who pulled the trigger in 052562f. Since this commit, generic
options won't be applied to endpoints if there's already one with
exposed/published ports. As a consequence, only the first endpoint can
have a user-defined MAC address right now.
Instead of moving up the code line that adds generic options, a better
change IMO is to move ports-related options, and the early-return gating
those options, to a dedicated func to make `buildCreateEndpointOptions`
slightly easier to read and reason about.
There was actually one oddity in the original
`buildCreateEndpointOptions`: the early-return also gates the addition
of `CreateOptionDNS`. These options are Windows-specific; a comment is
added to explain that. But the oddity is really: why are we checking if
an endpoint with exposed / published ports joined this sandbox to decide
whether we want to configure DNS server on the endpoint's adapter? Well,
this early-return was most probably overlooked by the original author
and by reviewers at the time these options were added (in commit d1e0a78).
Let's fix that in a follow-up commit.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The example still used the deprecated types.ContainerListOptions;
also slightly updated the example to show both stopped and running
containers, so that the example works even if no container is running.
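For reference, a minimal Go sketch of the updated usage with the non-deprecated
`container.ListOptions` type (assuming the current Go client API; output
formatting is illustrative):
```
package main

import (
	"context"
	"fmt"

	"github.com/docker/docker/api/types/container"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// All: true also lists stopped containers, so the example produces
	// output even when no container is running.
	containers, err := cli.ContainerList(context.Background(), container.ListOptions{All: true})
	if err != nil {
		panic(err)
	}
	for _, ctr := range containers {
		fmt.Printf("%s %s (status: %s)\n", ctr.ID, ctr.Image, ctr.Status)
	}
}
```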
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The DirCopy() function in "graphdriver/copy/copy.go" has a special case to
skip file-attribute copying when making a hard link to an already-copied
file, if "copyMode == Hardlink". Do the same for copies of hard-links in
the source filesystem.
Significantly speeds up vfs's copy of a BusyBox filesystem (which
consists mainly of hard links to a single binary), making moby's
integration tests run more quickly and more reliably in a dev container.
Fixes #46810
Signed-off-by: Rob Murray <rob.murray@docker.com>
- full diff: https://github.com/opencontainers/runc/compare/v1.1.9...v1.1.10
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.10
This is the tenth (and most likely final) patch release in the 1.1.z
release branch of runc. It mainly fixes a few issues in cgroups, and a
umask-related issue in tmpcopyup.
- Add support for `hugetlb.<pagesize>.rsvd` limiting and accounting.
Fixes the issue of postgres failing when hugepage limits are set.
- Fixed permissions of newly created directories to not depend on the value
of umask in the tmpcopyup feature implementation.
- libcontainer: cgroup v1 GetStats now ignores missing `kmem.limit_in_bytes`
(fixes the compatibility with Linux kernel 6.1+).
- Fix a semi-arbitrary cgroup write bug when given a malicious hugetlb
configuration. This issue is not a security issue because it requires a
malicious config.json, which is outside of our threat model.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a strong type for the DNS IP-addresses so that we can use flags.IPSliceVar,
instead of implementing our own option-type and validation.
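A minimal Go sketch of the idea using spf13/pflag's IPSliceVar (illustrative
only; the actual dockerd flag wiring differs):
```
package main

import (
	"fmt"
	"net"

	"github.com/spf13/pflag"
)

func main() {
	var dns []net.IP
	flags := pflag.NewFlagSet("daemon", pflag.ContinueOnError)
	// IPSliceVar parses and validates each value, so no custom
	// option type or hand-written validation is needed.
	flags.IPSliceVar(&dns, "dns", nil, "DNS server to use")

	if err := flags.Parse([]string{"--dns", "1.1.1.1", "--dns", "8.8.8.8"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dns) // [1.1.1.1 8.8.8.8]
}
```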
Behavior should be the same, although error-messages have slightly changed:
Before this patch:
dockerd --dns 1.1.1.1oooo --validate
Status: invalid argument "1.1.1.1oooo" for "--dns" flag: 1.1.1.1oooo is not an ip address
See 'dockerd --help'., Code: 125
cat /etc/docker/daemon.json
{"dns": ["1.1.1.1"]}
dockerd --dns 2.2.2.2 --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: dns: (from flag: [2.2.2.2], from file: [1.1.1.1])
cat /etc/docker/daemon.json
{"dns": ["1.1.1.1oooo"]}
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: 1.1.1.1ooooo is not an ip address
With this patch:
dockerd --dns 1.1.1.1oooo --validate
Status: invalid argument "1.1.1.1oooo" for "--dns" flag: invalid string being converted to IP address: 1.1.1.1oooo
See 'dockerd --help'., Code: 125
cat /etc/docker/daemon.json
{"dns": ["1.1.1.1"]}
dockerd --dns 2.2.2.2 --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: dns: (from flag: [2.2.2.2], from file: [1.1.1.1])
cat /etc/docker/daemon.json
{"dns": ["1.1.1.1oooo"]}
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid IP address: 1.1.1.1oooo
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- document accepted values
- add test-coverage for the function's behavior (including whitespace handling),
and use sub-tests.
- improve error-message to use uppercase for "IP", and to use a common prefix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I was trying to find out why `docker info` was sometimes slow, so I'm
plumbing a context through to propagate trace data.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The forwarding database (fdb) of Linux VXLAN links is restricted to
entries with a destination VXLAN tunnel endpoint (VTEP) address of a
single address family. Which address family is permitted is set when the
link is created and cannot be modified. The overlay network driver
creates VXLAN links such that the kernel only allows fdb entries to be
created with IPv4 destination VTEP addresses. If the Swarm is configured
with IPv6 advertise addresses, creating fdb entries for remote peers
fails with EAFNOSUPPORT (address family not supported by protocol).
Make overlay networks functional over IPv6 transport by configuring the
VXLAN links for IPv6 VTEPs if the local node's advertise address is an
IPv6 address. Make encrypted overlay networks secure over IPv6 transport
by applying the iptables rules to the ip6tables when appropriate.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Early return if the iface or its address is nil to make the whole
function slightly easier to read.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
It's used in various defers, but was using `err` as its name, which can be
confusing and increases the risk of accidentally shadowing the error.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
We remap the snapshot when we create a container; we have to do the
inverse when we commit the container into an image.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This test broke in 98323ac114.
This commit renamed WithMacAddress into WithContainerWideMacAddress.
This helper sets the MacAddress field in container.Config. However, API
v1.44 now ignores this field if the NetworkMode has no matching entry in
EndpointsConfig.
This fix uses the WithMacAddress helper and specifies for which
EndpointConfig the MacAddress is set.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
When server address is not provided with the auth configuration,
use the domain from the image provided with the auth.
Signed-off-by: Derek McGowan <derek@mcg.dev>
The workaround is no longer required. The bug has been fixed in stable
versions of all supported containerd branches.
This reverts commit fb7ec1555c.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Change the non-refcounted implementation to perform the mount using the
same identity and access rights. These should be the same regardless of
whether we're refcounting or not.
This also allows refactoring refCountMounter into a mounter decorator.
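A rough Go sketch of the decorator idea (the interface and names below are
assumptions, not the actual moby types; locking is omitted for brevity):
```
package mounts // illustrative only

// mounter is a hypothetical minimal interface that both the plain and the
// ref-counted implementations satisfy.
type mounter interface {
	Mount(source, target, fstype, options string) error
	Unmount(target string) error
}

// refCountMounter decorates another mounter: it performs the real mount on
// the first reference and the real unmount on the last one, but always
// delegates to the wrapped implementation, so the identity and access
// rights used for mounting are identical in both code paths.
type refCountMounter struct {
	base   mounter
	counts map[string]int
}

func newRefCountMounter(base mounter) *refCountMounter {
	return &refCountMounter{base: base, counts: map[string]int{}}
}

func (m *refCountMounter) Mount(source, target, fstype, options string) error {
	if m.counts[target]++; m.counts[target] > 1 {
		return nil // already mounted; just bump the refcount
	}
	err := m.base.Mount(source, target, fstype, options)
	if err != nil {
		m.counts[target]--
	}
	return err
}

func (m *refCountMounter) Unmount(target string) error {
	if m.counts[target]--; m.counts[target] > 0 {
		return nil
	}
	delete(m.counts, target)
	return m.base.Unmount(target)
}
```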
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This test was rewritten from an integration-cli test in commit
68d9beedbe, and originally implemented in
f4942ed864, which rewrote it from a unit-
test to an integration test.
Originally, it would check for the raw JSON response from the daemon, and
check for individual fields to be present in the output, but after commit
0fd5a65428, `client.Info()` was used, and
now the response is unmarshalled into a `system.Info`.
The remainder of the test remained the same in that rewrite, and as a
result we were now effectively testing whether a `system.Info` struct,
when marshalled as JSON, would show all the fields (surprise: it does).
TL;DR: the test would even pass with an empty `system.Info{}` struct,
which didn't provide much coverage, as it passed without a daemon:
func TestInfoAPI(t *testing.T) {
// always shown fields
stringsToCheck := []string{
"ID",
"Containers",
"ContainersRunning",
"ContainersPaused",
"ContainersStopped",
"Images",
"LoggingDriver",
"OperatingSystem",
"NCPU",
"OSType",
"Architecture",
"MemTotal",
"KernelVersion",
"Driver",
"ServerVersion",
"SecurityOptions",
}
out := fmt.Sprintf("%+v", system.Info{})
for _, linePrefix := range stringsToCheck {
assert.Check(t, is.Contains(out, linePrefix))
}
}
This patch makes the test _slightly_ better by checking if the fields
are non-empty. More work is needed on this test though; currently it
uses the (already running) daemon, so it's hard to check for specific
fields to be correct (without knowing the state of the daemon), but it's
not unlikely that other tests (partially) cover some of that. A TODO
comment was added to look into that (we should probably combine some
tests to prevent overlap, and make it easier to spot "gaps" as well).
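The gist of the stronger check, as a standalone Go sketch against a running
daemon (using the public client rather than the integration-test helpers;
the field selection is illustrative):
```
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/docker/docker/client"
)

func main() {
	apiClient, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		log.Fatal(err)
	}
	defer apiClient.Close()

	info, err := apiClient.Info(context.Background())
	if err != nil {
		log.Fatal(err)
	}

	// Check actual values instead of only checking that the field names
	// appear when an (empty) struct is formatted with %+v.
	checks := map[string]bool{
		"ID":       info.ID != "",
		"Driver":   info.Driver != "",
		"OSType":   info.OSType != "",
		"NCPU":     info.NCPU > 0,
		"MemTotal": info.MemTotal > 0,
	}
	for name, ok := range checks {
		fmt.Printf("%-8s non-empty: %v\n", name, ok)
	}
}
```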
While working on this, also moving the `SystemTime` into this test,
because that field is (no longer) dependent on "debug" state.
(It was actually this change that led me down this rabbit-hole.)
()_()
(-.-)
'(")(")'
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The following tests are implemented in this specific commit:
- Inter-container communications for internal and non-internal
bridge networks, over IPv4 and IPv6.
- Inter-container communications using IPv6 link-local addresses for
internal and non-internal bridge networks.
- Inter-network communications for internal and non-internal bridge
networks, over IPv4 and IPv6, are disallowed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This commit introduces a new integration test suite aimed at testing
networking features like inter-container communication, network
isolation, port mapping, etc... and how they interact with daemon-level
and network-level parameters.
So far, there are pretty much no tests making sure our networks are well
configured: 1. there are a few tests for port mapping, but they don't
cover all use cases; 2. there are a few tests that check whether a specific
iptables rule exists, but that doesn't prevent that specific iptables
rule from being wrong in the first place.
As we're planning to refactor how iptables rules are written, and change
some of them to fix known security issues, we need a way to test all
combinations of parameters. So far, this was done by hand, which is
particularly painful and time consuming. As such, this new test suite is
foundational to upcoming work.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
When the start interval is 0 we should treat that as unset.
This is especially important for older API versions where we reset the
value to 0.
Instead of using the default probe value we should be using the
configured `interval` value (which may be a default as well) which gives
us back the old behavior before support for start interval was added.
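A minimal Go sketch of the selection logic described above (names are
illustrative, not the actual daemon code):
```
package healthcheck // illustrative only

import "time"

// probeInterval returns the delay to use between health probes while the
// container is still within its start period. A startInterval of 0 means
// "unset" (for example, reset by an older API version), so fall back to
// the regular configured interval.
func probeInterval(startInterval, interval time.Duration) time.Duration {
	if startInterval > 0 {
		return startInterval
	}
	return interval
}
```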
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This syncs the seccomp profile with changes made to containerd's default
profile in [1].
The original containerd issue and PR mention:
> Security experts generally believe io_uring to be unsafe. In fact
> Google ChromeOS and Android have turned it off, plus all Google
> production servers turn it off. Based on the blog published by Google
> below it seems like a bunch of vulnerabilities related to io_uring can
> be exploited to breakout of the container.
>
> [2]
>
> Other security researchers also hold this opinion: see [3] for a
> blackhat presentation on io_uring exploits.
For the record, these syscalls were added to the allowlist in [4].
[1]: a48ddf4a20
[2]: https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html
[3]: https://i.blackhat.com/BH-US-23/Presentations/US-23-Lin-bad_io_uring.pdf
[4]: https://github.com/moby/moby/pull/39415
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This test is very weird: the Size in the manifests that it creates is
wrong. Graph drivers only print a warning in that case, but containerd
fails because it verifies more things. The media types are also wrong in
the containerd case, the manifest list forces the media type to be
"schema2.MediaTypeManifest" but in the containerd case the media type is
an OCI one.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The build target is not quoted, which makes it difficult for some
people to see what the problem is.
By quoting it we emphasize that the target name is variable.
Signed-off-by: Frank Villaro-Dixon <frank.villarodixon@merkle.com>
I am finally convinced that, given two netip.Prefix values a and b, the
expression
a.Contains(b.Addr()) || b.Contains(a.Addr())
is functionally equivalent to
a.Overlaps(b)
The (netip.Prefix).Contains method works by masking the address with the
prefix's mask and testing whether the remaining most-significant bits
are equal to the same bits in the prefix. The (netip.Prefix).Overlaps
method works by masking the longer prefix to the length of the shorter
prefix and testing whether the remaining most-significant bits are
equal. This is equivalent to
shorterPrefix.Contains(longerPrefix.Addr()), therefore applying Contains
symmetrically to two prefixes will always yield the same result as
applying Overlaps to the two prefixes in either order.
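The equivalence can be spot-checked with the standard library (a
demonstration, not a proof):
```
package main

import (
	"fmt"
	"net/netip"
)

func main() {
	a := netip.MustParsePrefix("10.0.0.0/8")
	for _, s := range []string{"10.1.0.0/16", "10.0.0.0/24", "192.168.0.0/16", "0.0.0.0/0"} {
		b := netip.MustParsePrefix(s)
		containsEither := a.Contains(b.Addr()) || b.Contains(a.Addr())
		fmt.Printf("%s vs %s: contains-either=%v overlaps=%v\n", a, b, containsEither, a.Overlaps(b))
	}
}
```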
Signed-off-by: Cory Snider <csnider@mirantis.com>
This delete was originally added in b37fdc5dd1
and migrated from `deleteImages(repoName)` in commit 1e55ace875,
however, deleting `foobar-save-multi-images-test` (`foobar-save-multi-images-test:latest`)
always resulted in an error;
Error response from daemon: No such image: foobar-save-multi-images-test:latest
This patch removes the redundant image delete.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Shutting down containers on Windows can take a long time (with hyper-v),
causing this test to be flaky; seen failing on windows 2022;
=== FAIL: github.com/docker/docker/integration/image TestSaveRepoWithMultipleImages (23.16s)
save_test.go:104: timeout waiting for container to exit
Looking at the test, we run a container only to commit it, and the test
does not make changes to the container's filesystem; it only runs a container
with a custom command (`true`).
Instead of running the container, we can _create_ a container and commit it;
this simplifies the tests, and prevents having to wait for the container to
exit (before committing).
To verify:
make BIND_DIR=. DOCKER_GRAPHDRIVER=vfs TEST_FILTER=TestSaveRepoWithMultipleImages test-integration
INFO: Testing against a local daemon
=== RUN TestSaveRepoWithMultipleImages
--- PASS: TestSaveRepoWithMultipleImages (1.20s)
PASS
DONE 1 tests in 2.668s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When starting a daemon in debug mode (such as used in CI), many log-messages
are printed during startup. As a result, the log message indicating whether
graph-drivers or snapshotters are used may appear far separate from the
informational log about the daemon (and selected storage-driver).
The existing log message also unconditionally uses the legacy "graph-driver"
terminology, instead of the more generic "storage-driver".
This patch changes the log message shown during startup to use the generic
"storage-driver" field, and adds a new field that indicates whether we're
using snapshotters or graph-drivers.
Given that snapshotters will be the default at some point, an alternative
could be to include the _type_ of driver used, for example,
`io.containerd.snapshotter.v1`, which may continue to be relevant after
snapshotters become the default, and at which point (potentially) the
type of snapshotter becomes more relevant.
Before this change:
TEST_INTEGRATION_USE_SNAPSHOTTER=1 DOCKER_GRAPHDRIVER=overlayfs dockerd
...
INFO[2023-10-31T09:12:33.586269801Z] Starting daemon with containerd snapshotter integration enabled
INFO[2023-10-31T09:12:33.586322176Z] Loading containers: start.
INFO[2023-10-31T09:12:33.640514759Z] Loading containers: done.
INFO[2023-10-31T09:12:33.646498134Z] Docker daemon commit=dcf7287d647bcb515015e389df46ccf1e09855b7 graphdriver=overlayfs version=dev
INFO[2023-10-31T09:12:33.646706551Z] Daemon has completed initialization
INFO[2023-10-31T09:12:33.658840592Z] API listen on /var/run/docker.sock
With this change;
TEST_INTEGRATION_USE_SNAPSHOTTER=1 DOCKER_GRAPHDRIVER=overlayfs dockerd
...
INFO[2023-10-31T08:41:38.841155928Z] Starting daemon with containerd snapshotter integration enabled
INFO[2023-10-31T08:41:38.841207512Z] Loading containers: start.
INFO[2023-10-31T08:41:38.902461053Z] Loading containers: done.
INFO[2023-10-31T08:41:38.910535137Z] Docker daemon commit=dcf7287d647bcb515015e389df46ccf1e09855b7 containerd-snapshotter=true storage-driver=overlayfs version=dev
INFO[2023-10-31T08:41:38.910936803Z] Daemon has completed initialization
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These platforms are filled by default from the containerd
introspection API and may not be normalized. Initializing
the wrong platform here results in an incorrect platform
for BUILDPLATFORM and TARGETPLATFORM build-args for
Dockerfile frontend (and probably other side effects).
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Redirecting check-config.sh output to a file puts control characters
into that file, which isn't helpful for reading.
Disable colorized output if either:
1. the NO_COLOR environment variable is set to "1", or
2. stdout is not a terminal.
Signed-off-by: Scott Moser <smoser@brickies.net>
In case of `docker push -a`, we need to return an error if there is no
image for the given repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't wrap the `no basic auth credentials` error from containerd and
return it as-is.
The error will look like:
```
failed to resolve reference "docker.io/library/aodkoakds:latest": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The 403 error might not only be raised in swarm operations. It is
also returned when the given container is already connected to the
network and is currently running. I noticed this during the
following PR: https://github.com/containers/podman/pull/20365
Signed-off-by: Philipp Fruck <dev@p-fruck.de>
When writing a tar file with archive/tar, extended attributes in the
deprecated (tar.Header).Xattrs map take precedence over conflicting
'SCHILY.xattr' records in the (tar.Header).PAXRecords map. Update
package tarsum to follow the same precedence rules as archive/tar.
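A Go sketch of the precedence rule (an illustrative helper, not the actual
tarsum code): PAX records are read first, then the deprecated Xattrs map,
so Xattrs wins on conflicting keys, matching archive/tar's write behaviour.
```
package main

import (
	"archive/tar"
	"fmt"
	"strings"
)

const paxSchilyXattr = "SCHILY.xattr."

// xattrs merges both sources, letting the deprecated Xattrs map override
// conflicting SCHILY.xattr PAX records.
func xattrs(h *tar.Header) map[string]string {
	out := map[string]string{}
	for k, v := range h.PAXRecords {
		if name, ok := strings.CutPrefix(k, paxSchilyXattr); ok {
			out[name] = v
		}
	}
	for k, v := range h.Xattrs { //nolint:staticcheck // deprecated, but still honoured
		out[k] = v
	}
	return out
}

func main() {
	h := &tar.Header{
		Xattrs:     map[string]string{"user.foo": "from-xattrs"},
		PAXRecords: map[string]string{paxSchilyXattr + "user.foo": "from-pax"},
	}
	fmt.Println(xattrs(h)["user.foo"]) // from-xattrs
}
```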
Signed-off-by: Cory Snider <csnider@mirantis.com>
Adds test ensuring that additional groups set with `--group-add`
are kept on exec when container had `--user` set on run.
Regression test for https://github.com/moby/moby/issues/46712
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Add a new `com.docker.network.host_ipv6` bridge option to complement
the existing `com.docker.network.host_ipv4` option. When set to an
IPv6 address, this causes the bridge to insert `SNAT` rules instead of
`MASQUERADE` rules (assuming `ip6tables` is enabled). `SNAT` makes it
possible for users to control the source IP address used for outgoing
connections.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
Kept `coci` import alias since we use it elsewhere,
maybe to prevent confusion with our own `oci` package.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Having a sandbox/container-wide MacAddress field makes little sense
since a container can be connected to multiple networks at the same
time. This field is an artefact of old times when a container could be
connected to a single network only.
As we now have a way to specify a per-endpoint MAC address, this field is
now deprecated.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prior to this commit, only container.Config had a MacAddress field and
it's used only for the first network the container connects to. It's a
relic of old times when custom networks were not supported.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- Merge BC conds for API < v1.42 together
- Merge BC conds for API < v1.44 together
- Re-order BC conds by API version
- Move pids-limit normalization after BC conds
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The same error is already returned by `(*Daemon).containerCreate()` but
since this function is also called by the cluster executor, the error
has to be duplicated.
Doing that allows removing a nil check on container config in
`postContainersCreate`.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
containerd's `WithUser` function now resets this property, starting with
[3eda46af12b1deedab3d0802adb2e81cb3521950][1] (v1.7.0-beta.4), so we no
longer need this function.
[1]: 3eda46af12
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The github.com/opencontainers/runc/libcontainer/user package was moved
to a separate module. While there's still uses of the old module in
our code-base, runc itself is migrating to the new module, and deprecated
the old package (for runc 1.2).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit def549c8f6 passed through the context
to the daemon.ContainerStart function. As a result, restarting containers
is no longer an atomic operation, because a context cancellation could
interrupt the restart (between "stopping" and "(re)starting"), resulting
in the container being stopped, but not restarted.
Restarting a container, or more precisely, making a successful request on
the `/containers/{id}/restart` endpoint, should be an atomic operation.
This patch uses a context.WithoutCancel for restart requests.
It's worth noting that daemon.containerStop already uses context.WithoutCancel,
so in that function, we'll be wrapping the context twice, but this should
likely not cause issues (just redundant for this code-path).
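Schematically, the change looks like the following Go sketch (the stop/start
functions are stand-ins, not the real daemon methods):
```
package main

import (
	"context"
	"fmt"
)

func containerStop(ctx context.Context, name string) error  { fmt.Println("stop", name); return nil }
func containerStart(ctx context.Context, name string) error { fmt.Println("start", name); return nil }

// restartContainer keeps the request's values (trace data, etc.) but drops
// its cancellation, so a client disconnect mid-restart can no longer leave
// the container stopped without being started again.
func restartContainer(ctx context.Context, name string) error {
	ctx = context.WithoutCancel(ctx)
	if err := containerStop(ctx, name); err != nil {
		return err
	}
	return containerStart(ctx, name)
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate the client going away mid-request
	_ = restartContainer(ctx, "myself")
}
```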
Before this patch, starting a container that bind-mounts the docker socket,
then restarting itself from within the container would cancel the restart
operation. The container would be stopped, but not started after that:
docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
docker exec myself sh -c 'docker restart myself'
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3a2a741c65ff docker:cli "docker-entrypoint.s…" 26 seconds ago Exited (128) 7 seconds ago myself
With this patch: the stop still cancels the exec, but does not cancel the
restart operation, and the container is started again:
docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
docker exec myself sh -c 'docker restart myself'
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4393a01f7c75 docker:cli "docker-entrypoint.s…" About a minute ago Up 4 seconds myself
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fix a silly bug in the implementation which had the effect of
len(h.Xattrs) blank entries being inserted in the middle of
orderedHeaders. Luckily this is not a load-bearing bug: empty headers
are ignored as the tarsum digest is computed by concatenating header
keys and values without any intervening delimiter.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing pkg/archive unit tests are primarily round-trip tests which
assert that pkg/archive produces tarballs which pkg/archive can unpack.
While these tests are effective at catching regressions in archiving or
unarchiving, they have a blind spot for regressions in compatibility
with the rest of the ecosystem. For example, a typo in the capabilities
extended attribute constant would result in subtly broken image layer
tarballs, but the existing tests would not catch the bug if both the
archiving and unarchiving implementations have the same typo.
Extend the test for archiving an overlay filesystem layer to assert that
the overlayfs style whiteouts (extended attributes and device files) are
transformed into AUFS-style whiteouts (magic file names).
Extend the test for archiving files with extended attributes to assert
that the extended attribute is encoded into the file's tar header in the
standard, interoperable format compatible with the rest of the
ecosystem.
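A small Go illustration of the interoperable encoding being asserted
(standard-library types only; the real test exercises pkg/archive itself,
and the attribute name here is just an example):
```
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"log"
)

func main() {
	// Write one entry that carries an extended attribute in the
	// interoperable form: a "SCHILY.xattr.<name>" PAX record.
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	err := tw.WriteHeader(&tar.Header{
		Name:     "file",
		Typeflag: tar.TypeReg,
		Mode:     0o644,
		Format:   tar.FormatPAX,
		PAXRecords: map[string]string{
			"SCHILY.xattr.user.example": "hello",
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	if err := tw.Close(); err != nil {
		log.Fatal(err)
	}

	// Any consumer using archive/tar (or other tar implementations) sees
	// the attribute under the same PAX key.
	tr := tar.NewReader(&buf)
	for {
		h, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(h.Name, "->", h.PAXRecords["SCHILY.xattr.user.example"])
	}
}
```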
Signed-off-by: Cory Snider <csnider@mirantis.com>
Use context.WithoutCancel so that both the containerStop and
container.Wait can share the same parent context. This context is still
a "TODO", but can be wired up in future.
It's worth noting that daemon.containerStop already uses context.WithoutCancel,
so in that function, we'll be wrapping the context twice, but this should
likely not cause issues (just redundant for this code-path).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Follow-up to fc94ed0a86. Now that
f6e44bc0e8 added the compatcontext
package, we can start using context.WithoutCancel.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In the tagged case the error message when the image/tag is not found
should be "tag does not exist: ref"
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
While there is nothing inherently wrong with goto statements, their use
here is not helping with readability.
Signed-off-by: Cory Snider <csnider@mirantis.com>
ds.cache is never nil so the uncached code paths are unreachable in
practice. And given how many KVObject deep-copy implementations shallow
copy pointers and other reference-typed values, there is the distinct
possibility that disabling the datastore cache could break things.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The datastore cache only uses the reference to its datastore to get a
reference to the backing store. Modify the cache to take the backing
store reference directly so that methods on the datastore can't get
called, as that might result in infinite recursion between datastore and
cache methods.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Also removing some waitRun calls, as they were not actually checked for
results, and the tests depended on that behavior (to get events about
the container starting etc).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When choosing the next image, don't reject images without the classic
builder parent label. The intention was to *prefer* such images instead
of making that a condition.
This fixes the ID not being filled for parent images that weren't built
with the classic builder.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The `Tags` slice of each history entry was filled with tags of parent
image. Change it to correctly assign the current image tags.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Check for accurate values that may contain content sizes unknown to the
usage test in the calculation. Avoid asserting using deep equals when
only the expected value range is known to the test.
Signed-off-by: Derek McGowan <derek@mcg.dev>
Add IP_NF_MANGLE to "Generally Required" kernel features, since it appears to be necessary for Docker Swarm to work.
Closes https://github.com/moby/moby/issues/46636
Signed-off-by: Stephan Henningsen <stephan-henningsen@users.noreply.github.com>
Inline the tortured logic for deciding when to skip updating the svc
records to give us a fighting chance at deciphering the logic behind the
logic and spotting logic bugs.
Update the service records synchronously. The only potential for issues
is if this change introduces deadlocks, which should be fixed by
restructuring the mutexes rather than papering over the issue with
sketchy hacks like deferring the operation to a goroutine.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Its only remaining purpose is to elide removing the endpoint from the
service records if it was not previously added. Deleting the service
records is an idempotent operation so it is harmless to delete service
records which do not exist.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The service db entry for each network is deleted by
(*Controller).cleanupServiceDiscovery() when the network is deleted.
There is no need to also eagerly delete it whenever the network's
endpoint count drops to zero.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The logic to rename an endpoint includes code which would synchronize
the renamed service records to peers through the distributed datastore.
It would trigger the remote peers to pick up the rename by touching a
datastore object which remote peers would have subscribed to events on.
The code also asserts that the local peer is subscribed to updates on
the network associated with the endpoint, presumably as a proxy for
asserting that the remote peers would also be subscribed.
https://github.com/moby/libnetwork/pull/712
Libnetwork no longer has support for distributed datastores or
subscribing to datastore object updates, so this logic can be deleted.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The meaning of the (*Controller).isDistributedControl() method is not
immediately clear from the name, and it does not have any doc comment.
It returns true if and only if the controller is neither a manager node
nor an agent node -- that is, if the daemon is _not_ participating in a
Swarm cluster. The method name likely comes from the old abandoned
datastore-as-IPC control plane architecture for libnetwork. Refactor
c.isDistributedControl() -> !c.isSwarmNode()
to make it easier to understand code which consumes the method.
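A minimal Go sketch of the rename (the Controller and role checks below are
stubs; the real method names for the role checks differ):
```
package libnetworksketch

// Controller is a stub standing in for libnetwork's controller.
type Controller struct{ manager, agent bool }

func (c *Controller) isManagerNode() bool { return c.manager }
func (c *Controller) isAgentNode() bool   { return c.agent }

// Old name: true iff the daemon is NOT participating in a Swarm cluster,
// which is hard to guess from "distributed control".
func (c *Controller) isDistributedControl() bool {
	return !c.isManagerNode() && !c.isAgentNode()
}

// New name: call sites become `if !c.isSwarmNode() { ... }`, which states
// the condition directly.
func (c *Controller) isSwarmNode() bool {
	return c.isManagerNode() || c.isAgentNode()
}
```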
Signed-off-by: Cory Snider <csnider@mirantis.com>
server: prohibit more than MaxConcurrentStreams handlers from running at once
(CVE-2023-44487).
In addition to this change, applications should ensure they do not leave running
tasks behind related to the RPC before returning from method handlers, or should
enforce appropriate limits on any such work.
- https://github.com/grpc/grpc-go/compare/v1.56.2...v1.56.3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We support importing images for other platforms when
using the containerd image store, so we shouldn't validate
the image OS on import.
This commit also splits the test into two, so that we can
keep the "success" import-with-a-custom-platform tests running
with c8d while skipping the "error/rejection" test cases.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
After a successful push, all pushed blobs should have a
distribution.source label pointing to the new registry.
Before this commit, the label was only appended to the top-level blob
(manifest or manifest list). Adjust this to also apply the label recursively to
its children.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use the distribution code to query the remote repository for tags and
pull them sequentially just like the non-c8d pull.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Add a function to return tags for the given repository reference. This
is needed to implement the `pull -a` (pull all tags) for containerd
which doesn't directly use distribution, but we need to somehow make an
API call to the registry to obtain the available tags.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When the default bridge is disabled by setting dockerd's `--bridge=none`
option, the daemon still creates a sandbox for containers with no
network attachment specified. In that case `NetworkDisabled` will be set
to true.
However, currently the `releaseNetwork` call will early return if
NetworkDisabled is true. Thus, these sandboxes won't be deleted until
the daemon is restarted. If a high number of such containers are
created, the daemon would then take a few minutes to start.
See https://github.com/moby/moby/issues/42461.
Signed-off-by: payall4u <payall4u@qq.com>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This is a follow-up to 2216d3ca8d, which
implemented the StartInterval for health-checks, but did not add validation
for the minimum accepted interval;
> The time to wait between checks in nanoseconds during the start period.
> It should be 0 or at least 1000000 (1 ms). 0 means inherit.
This patch adds validation for the minimum accepted interval (1ms).
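A minimal Go sketch of the added check (illustrative; the daemon's validation
code is structured differently):
```
package healthcheck // illustrative only

import (
	"fmt"
	"time"
)

const minimumStartInterval = time.Millisecond

// validateStartInterval enforces the API contract quoted above:
// 0 means "inherit", anything else must be at least 1ms.
func validateStartInterval(startInterval time.Duration) error {
	if startInterval == 0 {
		return nil
	}
	if startInterval < minimumStartInterval {
		return fmt.Errorf("StartInterval in Healthcheck cannot be less than %s", minimumStartInterval)
	}
	return nil
}
```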
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While updating the docker/docker dependency in BuildKit, I noticed that the
dependency tree showed _two_ separate versions of the semconv package;
BuildKit and containerd were using the v1.17.0 version and docker/docker was
using v1.7.0.
This patch updates the version we use to align with BuildKit and containerd.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Rename all variables/fields/map keys associated with the
`com.docker.network.host_ipv4` option from `HostIP` to `HostIPv4`.
Rationale:
* This makes the variable/field name consistent with the option
name.
* This makes the code more readable because it is clear that the
variable/field does not hold an IPv6 address. This will hopefully
avoid bugs like <https://github.com/moby/moby/issues/46445> in the
future.
* If IPv6 SNAT support is ever added, the names will be symmetric.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
Rather than pass an `iptables.IPVersion` value alongside every
`iptRule` parameter, embed the IP version in the `iptRule` struct.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
That field was only used to pass `-t nat` for NAT rules. Now `-t
<tableName>` (where `<tableName>` is one of the `iptables.Table`
values) is always passed, eliminating the need for `preArgs`.
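Taken together with the previous change, the rule representation roughly
becomes the following Go sketch (field names and the arg-building helper are
assumptions, not the actual libnetwork code):
```
package main

import "fmt"

// Hypothetical stand-ins for libnetwork's iptables.IPVersion and iptables.Table.
type ipVersion string
type table string

const (
	iptablesV4 ipVersion = "iptables"
	iptablesV6 ipVersion = "ip6tables"
	natTable   table     = "nat"
)

// iptRule: the IP version and table are part of the rule itself, so callers
// no longer pass an IPVersion alongside every rule, and "-t <table>" is
// always emitted instead of ad-hoc preArgs.
type iptRule struct {
	ipv   ipVersion
	tbl   table
	chain string
	args  []string
}

func (r iptRule) cmdArgs(op string) []string {
	return append([]string{"-t", string(r.tbl), op, r.chain}, r.args...)
}

func main() {
	r := iptRule{
		ipv:   iptablesV4,
		tbl:   natTable,
		chain: "POSTROUTING",
		args:  []string{"-s", "172.17.0.0/16", "-j", "MASQUERADE"},
	}
	fmt.Println(r.ipv, r.cmdArgs("-A"))
}
```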
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
Pass the entire `*networkConfiguration` struct to
`setupIPTablesInternal` to simplify the function signature and improve
code readability.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
update buildkit to the latest code in the v0.12 branch:
full diff: f94ed7cec3...6560bb937e
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
This reverts commit 8777592397, which
turns out to break other test cases/the registry flow.
The correct place to handle missing credentials is instead
15bf23df09/remotes/docker/authorizer.go (L200).
Co-authored-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Use a unique parent view snapshot key for each diff request.
I considered using singleflight at first, but I realized it wouldn't
really be correct.
The diff can take some time, so there's a window of time between the
diff start and finish, where the file system can change.
These changes will not always be reflected in the running diff.
With singleflight, a second diff request that happened before the
previous diff finished would not include changes made to the
container filesystem after the first diff request had started.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Implement a behavior from the graphdriver's export where `docker save
something` (untagged reference) would export all images matching the
specified repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
On modern kernels this is an alias; however newer code has preferred
ctstate while older code has preferred the deprecated 'state' name.
Prefer the newer name for uniformity in the rules libnetwork creates,
and because some implementations/distributions of the xtables userland
tools may not support the legacy alias.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The image that this test pulls contains an error in the linux/amd64
manifest description: the reported size is 424 but the actual size is
524, making this test fail with containerd.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
go1.21.3 (released 2023-10-10) includes a security fix to the net/http package.
See the Go 1.21.3 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.21.3+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.21.2...go1.21.3
From the security mailing:
[security] Go 1.21.3 and Go 1.20.10 are released
Hello gophers,
We have just released Go versions 1.21.3 and 1.20.10, minor point releases.
These minor releases include 1 security fixes following the security policy:
- net/http: rapid stream resets can cause excessive work
A malicious HTTP/2 client which rapidly creates requests and
immediately resets them can cause excessive server resource consumption.
While the total number of requests is bounded to the
http2.Server.MaxConcurrentStreams setting, resetting an in-progress
request allows the attacker to create a new request while the existing
one is still executing.
HTTP/2 servers now bound the number of simultaneously executing
handler goroutines to the stream concurrency limit. New requests
arriving when at the limit (which can only happen after the client
has reset an existing, in-flight request) will be queued until a
handler exits. If the request queue grows too large, the server
will terminate the connection.
This issue is also fixed in golang.org/x/net/http2 v0.17.0,
for users manually configuring HTTP/2.
The default stream concurrency limit is 250 streams (requests)
per HTTP/2 connection. This value may be adjusted using the
golang.org/x/net/http2 package; see the Server.MaxConcurrentStreams
setting and the ConfigureServer function.
This is CVE-2023-39325 and Go issue https://go.dev/issue/63417.
This is also tracked by CVE-2023-44487.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.21.2 (released 2023-10-05) includes one security fix to the cmd/go package,
as well as bug fixes to the compiler, the go command, the linker, the runtime,
and the runtime/metrics package. See the Go 1.21.2 milestone on our issue
tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.21.2+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.21.1...go1.21.2
From the security mailing:
[security] Go 1.21.2 and Go 1.20.9 are released
Hello gophers,
We have just released Go versions 1.21.2 and 1.20.9, minor point releases.
These minor releases include 1 security fixes following the security policy:
- cmd/go: line directives allows arbitrary execution during build
"//line" directives can be used to bypass the restrictions on "//go:cgo_"
directives, allowing blocked linker and compiler flags to be passed during
compilation. This can result in unexpected execution of arbitrary code when
running "go build". The line directive requires the absolute path of the file in
which the directive lives, which makes exploiting this issue significantly more
complex.
This is CVE-2023-39323 and Go issue https://go.dev/issue/63211.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/golang/net/compare/v0.13.0...v0.17.0
This fixes the same CVE as go1.21.3 and go1.20.10;
- net/http: rapid stream resets can cause excessive work
A malicious HTTP/2 client which rapidly creates requests and
immediately resets them can cause excessive server resource consumption.
While the total number of requests is bounded to the
http2.Server.MaxConcurrentStreams setting, resetting an in-progress
request allows the attacker to create a new request while the existing
one is still executing.
HTTP/2 servers now bound the number of simultaneously executing
handler goroutines to the stream concurrency limit. New requests
arriving when at the limit (which can only happen after the client
has reset an existing, in-flight request) will be queued until a
handler exits. If the request queue grows too large, the server
will terminate the connection.
This issue is also fixed in golang.org/x/net/http2 v0.17.0,
for users manually configuring HTTP/2.
The default stream concurrency limit is 250 streams (requests)
per HTTP/2 connection. This value may be adjusted using the
golang.org/x/net/http2 package; see the Server.MaxConcurrentStreams
setting and the ConfigureServer function.
This is CVE-2023-39325 and Go issue https://go.dev/issue/63417.
This is also tracked by CVE-2023-44487.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.
This patch moves our own uses of the package to use the new module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Fix storageDriver gcs not registered in binaries
- reference: replace uses of deprecated function SplitHostname
- Don't parse errors as JSON unless Content-Type is set to JSON
- update to go1.20.8
- Set Content-Type header in registry client ReadFrom
- deprecate reference package, migrate to github.com/distribution/reference
- digestset: deprecate package in favor of go-digest/digestset
- Do not close HTTP request body in HTTP handler
full diff: https://github.com/distribution/distribution/compare/v2.8.2...v2.8.3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`docker build --squash` is an experimental feature which is not
implemented for containerd image store.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`docker.io` is present in the `IndexConfigs` so the `Mirrors` property
would get lost because a fresh `RegistryConfig` object was created.
Instead of creating a new object, reuse the existing one and just
mutate its fields.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Extract the distribution source label append into its own function and
make it not fail on any error; we do still log the error.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
These are not yet implemented with containerd snapshotters. We skip them
now because implementing this is not trivial with containerd.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
So far, internal networks were only isolated from the host by iptables
DROP rules. As a consequence, outbound connections from containers would
time out instead of being "rejected" through an immediate ICMP dest/port
unreachable, a TCP RST or a failing `connect` syscall.
This was visible when internal containers were trying to resolve a
domain that doesn't match any container on the same network (be it a truly
"external" domain, or a container that doesn't exist/is dead). In that
case, the embedded resolver would try to forward DNS queries for the
different values of resolv.conf `search` option, making DNS resolution
slow to return an error, and the slowness being exacerbated by some libc
implementations.
This change makes the `connect` syscall return ENETUNREACH, and thus
solves the broader issue of failing fast when external connections are
attempted.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Path prefixes were originally disallowed in the `--registry-mirrors`
option because the /v1 endpoint was assumed to be at the root of the
URI. This is no longer the case in v2.
Closes #36598
Signed-off-by: Régis Behmo <regis@behmo.com>
This change creates a few OTEL spans and plumb context through the DNS
resolver and DNS backends (ie. Sandbox and Network). This should help
better understand how much lock contention impacts performance, and
help debug issues related to DNS queries (we basically have no
visibility into what's happening here right now).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This isn't something that a user should do, but technically the dangling
images exist in the image store and a user can pass one by name (`moby-dangling@digest`).
Change it so rmi now recognizes that it's actually a dangling image and
doesn't handle it like a regular tagged image.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- Pass empty containerd socket which forces the daemon to create a new
supervised containerd. Otherwise a global containerd daemon will be
used and the pulled image data will be stored in its data directory,
instead of the newly specified `data-root` that has a limited
storage capacity.
- Don't try to use `vfs` snapshotter, instead use `native` which is
containerd's equivalent for `vfs`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Instead of passing a completely fresh context without any values, just
discard the cancellation.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This reverts commit a9fa147a92.
The commit is unfortunately broken as it is still using `providerHandle`
to write events but that handle is never actually set, so it is always
invalid. All logging fails.
Note: This is not a straight revert due to the change to
containerd/log.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
To match the graphdriver's push behavior which only shows the progress
for layers.
Exclude indexes, manifests and image configs from the push progress.
Don't explicitly check for `IsLayerType` to also handle other
potentially big blobs (like buildkit attestations).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This was mistakenly added to bklog.
Since this is getting attached to the standard logger, and bklog is
using the standard logger, we only need this added once.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Fix issue 46563 "Rootful-in-Rootless dind doesn't work since systemd v250 (due to oom score adj)"
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Before this commit, `doPack`, `doUnpack` and `doUnpackLayer` were not implemented for Darwin, causing a build failure.
This change allows all non-Linux Unixes to use the FreeBSD reexec-based pack/unpack implementation.
See also: moby/buildkit#4059
See also: 8b843732b3
Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
This required changes to the download-URL, as downloads are now provided
using the full version (including the `.0` patch version);
curl -sI https://go.dev/dl/go1.21.windows-amd64.zip | grep 'location'
location: https://dl.google.com/go/go1.21.windows-amd64.zip
curl -sI https://dl.google.com/go/go1.21.windows-amd64.zip
HTTP/2 404
# ...
curl -sI https://dl.google.com/go/go1.21.0.windows-amd64.zip
HTTP/2 200
# ...
Unfortunately this also means that the GO_VERSION can no longer be set to
versions lower than 1.21.0 (without additional changes), because older
versions do NOT provide the `.0` version, and Go 1.21.0 and up, no longer
provides URLs _without_ the `.0` version.
Co-authored-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test was depending on the client constructing an error based on the
http-status code, and the client not reading the response body if the
response was not a JSON response.
This fix:
- adds the correct content-type headers in the response
- includes error-messages in the response
- adds additional tests to cover both the plain (non-JSON) and JSON
error responses, as well as an empty response.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was setting the content-type header after WriteHeader was
called, and the header was not sent because of that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use named return variables to make the function more self-describing
- rename variable for readability
- slightly optimize slice initialization, and keep linters happy
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Copy the implementation of `context.WithoutCancel` introduced in Go 1.21
to be able to use it when building with older versions.
This will use the stdlib directly when building with Go 1.21+.
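A minimal Go sketch of such a fallback, assuming a build-tag split (the real
compatcontext package may differ; this mirrors the stdlib semantics of keeping
values while dropping cancellation and deadline):
```
//go:build !go1.21

package compatcontext

import (
	"context"
	"time"
)

// WithoutCancel returns a context that keeps the parent's values but is
// never cancelled and has no deadline, mirroring Go 1.21's
// context.WithoutCancel.
func WithoutCancel(parent context.Context) context.Context {
	if parent == nil {
		panic("cannot create context from nil parent")
	}
	return withoutCancelCtx{parent}
}

type withoutCancelCtx struct{ parent context.Context }

func (withoutCancelCtx) Deadline() (time.Time, bool) { return time.Time{}, false }
func (withoutCancelCtx) Done() <-chan struct{}       { return nil }
func (withoutCancelCtx) Err() error                  { return nil }
func (c withoutCancelCtx) Value(key any) any         { return c.parent.Value(key) }
```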
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This commit moves one-shot stats processing out of the publishing
channels, i.e. collects stats directly.
It also changes getSystemCPUUsage() on Linux to additionally return the
number of online CPUs.
Signed-off-by: Xinfeng Liu <XinfengLiu@icloud.com>
It was only used in a single place, and it was defined far away from
where it was used.
Move the code inline, so that it's clear at a glance what it's doing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The store field is only mutated by Controller.initStores(), which is
only called inside the constructor (libnetwork.New), so there should be
no need to protect the field with a mutex in non-exported functions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Controller.getNetworkFromStore() already returns a ErrNoSuchNetwork if
no network was found, so we don't need to convert the existing error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently, all traces coming from the API have an empty operation
string, which makes them indistinguishable from each other without looking
at the logs of the root span, and prevent proper filtering on Jaeger UI.
With this change, traces get the route pattern as the operation string.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The reason it doesn't change with graphdrivers is an
implementation detail and the fact that the image is loaded into the
same daemon it was saved from.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Rewrite TestSaveMultipleNames and TestSaveSingleTag so that they don't
use legacy `repositories` file (which isn't present in the OCI
archives).
`docker save` output is now OCI compatible, so we don't need
to use the legacy file.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The input archive is in the old Docker format that's not OCI compatible
and is not supported by the containerd archive import:
```
17d1436ef796af2fc2210cc37c4672e5aa1b62cb08ac4b95dd15372321105a66/
17d1436ef796af2fc2210cc37c4672e5aa1b62cb08ac4b95dd15372321105a66/VERSION
17d1436ef796af2fc2210cc37c4672e5aa1b62cb08ac4b95dd15372321105a66/json
17d1436ef796af2fc2210cc37c4672e5aa1b62cb08ac4b95dd15372321105a66/layer.tar
25445a0fc5025c3917a0cd6e307d92322540e0da691614312ddea22511b71513/
25445a0fc5025c3917a0cd6e307d92322540e0da691614312ddea22511b71513/VERSION
25445a0fc5025c3917a0cd6e307d92322540e0da691614312ddea22511b71513/json
25445a0fc5025c3917a0cd6e307d92322540e0da691614312ddea22511b71513/layer.tar
9c7cb910d84346a3fbf3cc2be046f44bf0af7f11eb8db2ef1f45e93c1202faac/
9c7cb910d84346a3fbf3cc2be046f44bf0af7f11eb8db2ef1f45e93c1202faac/VERSION
9c7cb910d84346a3fbf3cc2be046f44bf0af7f11eb8db2ef1f45e93c1202faac/json
9c7cb910d84346a3fbf3cc2be046f44bf0af7f11eb8db2ef1f45e93c1202faac/layer.tar
repositories
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This exposes `ACTIONS_RUNTIME_TOKEN` and `ACTIONS_CACHE_URL`, which are
used to skip cache exporter tests, when combined with
a8789cbd4a
Co-authored-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Now that this is generic, we can define a struct type at the package
level, and remove the casting logic necessary when we had to use
interface{}.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
SourcePolicy was accounted for in 330cf7ae7d
TODO: replace applySourcePolicies with BuildKit's implementation, which
is currently unexported.
Co-authored-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
With BuildKit 0.12, some existing types are now required to be wrapped
by new types:
* containerd's LeaseManager and ContentStore have to be a
(namespace-aware) BuildKit type since f044e0a946
* BuildKit's solver.CacheManager is used instead of
bboltstorage.CacheKeyStorage since 2b30693409
* The MaxAge config field is a bkconfig.Duration since e06c96274f
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The following changes were required:
* integration/build: progressui's signature changed in 6b8fbed01e
* builder-next: flightcontrol.Group has become a generic type in 8ffc03b8f0
* builder-next/executor: add github.com/moby/buildkit/executor/resources types, necessitated by 6e87e4b455
* builder-next: stub util/network/Namespace.Sample(), necessitated by 963f16179f
Co-authored-by: CrazyMax <crazy-max@users.noreply.github.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The DeepEqual ignore required in the daemon tests is a bit ugly, but it
works given the new protoc output.
We also have to ignore lints related to schema1 deprecations; these do
not apply as we must continue to support this schema version.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The current executor is only tested on Linux, so let's be honest about
that. Stubbing this correctly helps avoid incorrectly trying to call
into Linux-only code in e.g. libnetwork.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Simplify the lock/unlock cycle, and make the "lookupAlias" branch
more similar to the non-lookupAlias variant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Skip faster when we're looking for aliases. Also check for the list
of aliases to be empty, not just `nil` (although in practice it should
be equivalent).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use `nameOrAlias` for the name (or alias) to resolve
- use `lookupAlias` to indicate what the intent is; this function
is either looking up aliases or "regular" names. Ideally we would
split the function, but let's keep that for a future exercise.
- name the `ipv6Miss` output variable. The "ipv6 miss" logic is rather
confusing, and should probably be revisited, but let's start with
giving the variable a name to make it more apparent what it is.
- use `nw` for networks, which is the more common local name
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove some intermediate vars, or move them closer to where they're used.
- ResolveService: use strings.SplitN to limit number of elements. This
code is only used to validate the input, results are not used.
- ResolveService: return early instead of breaking the loop. This makes
it clearer from the code that were not returning anything (nil, nil).
- Controller.sandboxCleanup(): rename a var, and slight refactor of
error-handling.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was introduced in
0a79e67e4f
Make use of it throughout our log-format handling code, and convert back
to a string before we pass it to the containerd client.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
This function was used to check if the network is a multi-host, swarm-scoped
network. Part of this check involved a check whether the cluster-agent was
present.
In all places where this function was used, the next step after checking if
the network was "cluster eligible", was to get the agent, and (again) check
if it was not nil.
This patch rewrites the isClusterEligible utility into a clusterAgent utility,
which both checks if the network is cluster-eligible, and returns the agent
(if set). For convenience, an "ok" bool is added, which callers can use to
return early (although just checking for nilness would likely have been
sufficient).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes redundant nil-checks in Endpoint.deleteServiceInfoFromCluster
and Endpoint.addServiceInfoToCluster.
These functions return early if the network is not ["cluster eligible"][1],
and the function used for that (`Network.isClusterEligible`) requires the
[agent to not be `nil`][2].
This check moved around a few times ([3][3], [4][4]), but was originally
added in [libnetwork 1570][5] which, among others, tried to avoid a nil-pointer
exception reported in [moby 28712][6], which accessed the `Controller.agent`
[without locking][7]. That issue was addressed by adding locks, adding a
`Controller.getAgent` accessor, and updating deleteServiceInfoFromCluster
to use a local var. It also sprinkled in this `nil` check to be on the safe
side, but as `Network.isClusterEligible` already checks that the agent is
not `nil`, this extra check should be redundant.
[1]: 5b53ddfcdd/libnetwork/agent.go (L529-L534)
[2]: 5b53ddfcdd/libnetwork/agent.go (L688-L696)
[3]: f2307265c7
[4]: 6426d1e66f
[5]: 8dcf9960aa
[6]: https://github.com/moby/moby/issues/28712
[7]: 75fd88ba89/vendor/github.com/docker/libnetwork/agent.go (L452)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When no graphdriver is provided, the graphdriver is looked up
from `docker info`, but without quotes the lookup may fail and set the
graphdriver to an incorrect value.
Signed-off-by: Derek McGowan <derek@mcg.dev>
It's not set when containerd is used as an image store and buildkit
never sets it either, so let's skip this test if snapshotters are used
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
We try to perform API-version negotiation as lazily as possible (and only
execute it when we are about to make an API request). However, some code
requires API-version dependent handling (to set options, or remove options,
based on the version of the API we're using).
Currently this code depends on the caller to perform API negotiation (or
to configure the API version) first, which may not happen, and because of
that we may be missing options (or set options that are not supported on
older API versions).
This patch:
- splits the code that triggers API-version negotiation into a separate
  Client.checkVersion() function.
- updates NewVersionError to accept a context
- updates NewVersionError to perform API-version negotiation (if enabled)
- updates various Client functions to manually trigger API-version negotiation
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
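For illustration, here is a minimal sketch of how a caller can make sure
negotiation happened before making a version-dependent decision. This uses the
public client API only; it is not the actual Client.checkVersion implementation,
and the "someNewOption" mentioned in the comments is a hypothetical option.

    package main

    import (
        "context"
        "fmt"

        "github.com/docker/docker/api/types/versions"
        "github.com/docker/docker/client"
    )

    func main() {
        ctx := context.Background()

        // WithAPIVersionNegotiation enables lazy negotiation: it only runs when a
        // request is made, or when NegotiateAPIVersion is called explicitly.
        cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
        if err != nil {
            panic(err)
        }
        defer cli.Close()

        // Trigger negotiation explicitly before inspecting the negotiated version;
        // otherwise ClientVersion() may still report the default (maximum) version.
        cli.NegotiateAPIVersion(ctx)

        if versions.LessThan(cli.ClientVersion(), "1.44") {
            fmt.Println("older API: leaving out someNewOption (hypothetical option)")
        } else {
            fmt.Println("API >= 1.44: setting someNewOption (hypothetical option)")
        }
    }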
Moby and containerd have slightly different error messages when someone
tries to pull an image that doesn't contain the current platform.
Instead of looking inside the error returned by containerd, we match the
errors in the test based on which image backend we are using.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Diffing a container yielded some extra changes that come from the
files/directories that we mount inside the container (/etc/resolv.conf,
for example). To avoid that, we create an intermediate snapshot that has
these files; with this, we can now diff the container fs with its parent
and only get the differences that were made inside the container.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Final progress messages were sent after the progress updater finished,
which meant the "Downloading" progress was never updated to "Download
complete".
Fix by sending the final messages after the progress has finished.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
It's still not "great", but implement a `newInterface()` constructor
to create a new Interface instance, instead of creating a partial
instance and applying "options" after the fact.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We're only using the results if the interface doesn't have an address
yet, so skip this step if we don't use it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Flatten some nested "if"-statements, and improve the errors.
Errors returned by this function are not handled, and only logged, so
make them more informative if debugging is needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
They were not consistently used, and the locations where they were
used were already "setters", so we may as well inline the code.
Also updating Namespace.Restore to keep the lock slightly longer,
instead of locking/unlocking for each property individually, although
we should consider keeping the lock for the duration of the whole
function to make it more atomic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the mutex internal to the Namespace; locking/unlocking should not
be done externally, and this makes it easier to see where it's used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Interface.Remove() was directly accessing Namespace "internals", such
as locking/unlocking. Move the code from Interface.Remove() into the
Namespace instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We weren't checking for the requested platform when the image was a
manifest, only when it was a manifest list.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Makes it possible to pull `application/vnd.docker.distribution.manifest.v1+prettyjws`
legacy manifests.
They are not stored in their original form but are converted to OCI
manifests.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
While this is not strictly necessary as the default OCI config masks this
path, it is possible that the user disabled path masking, passed their
own list, or is using a forked (or future) daemon version that has a
modified default config/allows changing the default config.
Add some defense-in-depth by also masking out this problematic hardware
device with the AppArmor LSM.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The ability to read these files may offer a power-based sidechannel
attack against any workloads running on the same kernel.
This was originally [CVE-2020-8694][1], which was fixed in
[949dd0104c496fa7c14991a23c03c62e44637e71][2] by restricting read access
to root. However, since many containers run as root, this is not
sufficient for our use case.
While untrusted code should ideally never be run, we can add some
defense in depth here by masking out the device class by default.
[Other mechanisms][3] to access this hardware exist, but they should not
be accessible to a container due to other safeguards in the
kernel/container stack (e.g. capabilities, perf paranoia).
[1]: https://nvd.nist.gov/vuln/detail/CVE-2020-8694
[2]: 949dd0104c
[3]: https://web.eece.maine.edu/~vweaver/projects/rapl/
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Return the number of containers that use an image if it was requested,
for example during a `docker system df` call.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This issue wasn't caught on ContainerCreate or NetworkConnect (when the
container wasn't started yet).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Thus far, validation code would stop as soon as a bad value was found.
Now, we try to validate as much as we can, to return all errors to the
API client.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
So far, only a subset of NetworkingConfig was validated when calling
ContainerCreate. Other parameters would be validated when the container
was started. And the same goes for EndpointSettings on NetworkConnect.
This commit adds two validation steps:
1. Check if the IP addresses set in the endpoint's IPAMConfig are valid,
   when ContainerCreate and ConnectToNetwork are called;
2. Check if the network allows static IP addresses, only on
   ConnectToNetwork, as we need libnetwork's Network for that and it
   might not exist until NetworkAttachment requests are sent to the
   Swarm leader (which happens only when starting the container).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
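As a rough sketch of step 1 (not the daemon's actual validation code; the
helper name and error wording are made up), checking the static addresses in an
endpoint's IPAMConfig boils down to parsing them:

    package main

    import (
        "fmt"
        "net"

        "github.com/docker/docker/api/types/network"
    )

    // validateEndpointIPAMConfig is an illustrative helper: it rejects any
    // statically configured address that does not parse as an IP.
    func validateEndpointIPAMConfig(epConfig *network.EndpointSettings) error {
        if epConfig == nil || epConfig.IPAMConfig == nil {
            return nil
        }
        for _, addr := range []string{epConfig.IPAMConfig.IPv4Address, epConfig.IPAMConfig.IPv6Address} {
            if addr == "" {
                continue
            }
            if ip := net.ParseIP(addr); ip == nil {
                return fmt.Errorf("invalid IP address in IPAMConfig: %q", addr)
            }
        }
        return nil
    }

    func main() {
        err := validateEndpointIPAMConfig(&network.EndpointSettings{
            IPAMConfig: &network.EndpointIPAMConfig{IPv4Address: "10.0.0.300"},
        })
        fmt.Println(err) // invalid IP address in IPAMConfig: "10.0.0.300"
    }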
Just return a regular error, because the API converts the error to
the expected ErrorResponse. Before/After produce the same API response:
curl -v --unix-socket /var/run/docker.sock 'http://localhost/v1.43/images/search?term=hello'
* Trying /var/run/docker.sock:0...
* Connected to localhost (/var/run/docker.sock) port 80 (#0)
> GET /v1.43/images/search?term=hello HTTP/1.1
> Host: localhost
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 500 Internal Server Error
< Api-Version: 1.44
< Content-Type: application/json
< Docker-Experimental: false
< Ostype: linux
< Server: Docker/dev (linux)
< Traceparent: 00-c38c2da5cf30305fcb66836a28e227bf-d16f4f7d2c7002a1-01
< Date: Mon, 18 Sep 2023 14:30:18 GMT
< Content-Length: 41
<
{"message":"Unexpected status code 409"}
* Connection #0 to host localhost left intact
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make `PullImage` accept `reference.Named` directly instead of
duplicating the parsing code for both graphdriver and containerd image
service implementations.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This test checks how the layer store works, so we don't need it when we
use containerd as image store
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This package was introduced in af59752712
as a utility package for devicemapper, which was removed in commit
dc11d2a2d8 (v25.0.0).
It looks like there are no external consumers of this package, so we should
consider removing it, but let's deprecate it first, just in case.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use local variables and remove some intermediate variables
- handle the events inside the switch itself; this makes all the
switch branches use the same logic, instead of "some" using
a `continue`, and others falling through to have the event handled
outside of the switch.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We were sending the "Pulling from ..." message too early; if the pull
progress wasn't able to resolve the image, we wouldn't send the error
back. Sending that first message would have flushed the output stream,
and image_routes.go would return a nil error.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
full diff: https://github.com/containerd/containerd/compare/v1.6.22...v1.6.24
v1.6.24 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.6.23...v1.6.24
The twenty-fourth patch release for containerd 1.6 contains various fixes
and updates.
Notable Updates
- CRI: fix leaked shim caused by high IO pressure
- Update to go1.20.8
- Update runc to v1.1.9
- Backport: add configurable mount options to overlay snapshotter
- log: cleanups and improvements to decouple more from logrus
v1.6.23 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.6.22...v1.6.23
The twenty-third patch release for containerd 1.6 contains various fixes
and updates.
Notable Updates
- Add stable ABI support in windows platform matcher + update hcsshim tag
- cri: Don't use rel path for image volumes
- Upgrade GitHub actions packages in release workflow
- update to go1.19.12
- backport: ro option for userxattr mount check + cherry-pick: Fix ro mount option being passed
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add "The push referers to repository X" message which is present in the
push output when using the graphdrivers.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Update the version used in testing;
full diff: https://github.com/containerd/containerd/compare/v1.7.3...v1.7.6
v1.7.6 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.7.5...v1.7.6
The sixth patch release for containerd 1.7 contains various fixes and updates.
- Fix log package for clients overwriting the global logger
- Fix blockfile snapshotter copy on Darwin
- Add support for Linux usernames on non-Linux platforms
- Update Windows platform matcher to invoke stable ABI compatibility function
- Update Golang to 1.20.8
- Update push to inherit distribution sources from parent
v1.7.5 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.7.4...v1.7.5
The fifth patch release for containerd 1.7 fixes a versioning issue from
the previous release and includes some internal logging API changes.
v1.7.4 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.7.3...v1.7.4
The fourth patch release for containerd 1.7 contains remote differ plugin support,
a new block file based snapshotter, and various fixes and updates.
Notable Updates
- Add blockfile snapshotter
- Add remote/proxy differ
- Update runc binary to v1.1.9
- Cri: Don't use rel path for image volumes
- Allow attaching to any combination of stdin/out/err
- Fix ro mount option being passed
- Fix leaked shim caused by high IO pressure
- Add configurable mount options to overlay snapshotter
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The API endpoint `/containers/create` has accepted several EndpointsConfig
since v1.22, but the daemon would error out in that case. This check is
moved from the daemon to the API, and is now applied only for API < 1.44,
effectively allowing the daemon to create containers connected to
several networks.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
When registry token is provided, the authorization header can be
directly applied to the registry request. No other type of
authorization will be attempted when the registry token is provided.
Signed-off-by: Derek McGowan <derek@mcg.dev>
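A minimal sketch of what "directly applied" means here (illustrative only, not
the actual registry client code): the provided token is set as the
Authorization header on the outgoing request, and no token handshake or basic
auth is attempted.

    package main

    import (
        "fmt"
        "net/http"
    )

    func newRegistryRequest(url, registryToken string) (*http.Request, error) {
        req, err := http.NewRequest(http.MethodGet, url, nil)
        if err != nil {
            return nil, err
        }
        if registryToken != "" {
            // Token provided up front: use it as-is and skip any other auth scheme.
            req.Header.Set("Authorization", "Bearer "+registryToken)
        }
        return req, nil
    }

    func main() {
        req, _ := newRegistryRequest("https://registry.example.com/v2/library/busybox/manifests/latest", "example-token")
        fmt.Println(req.Header.Get("Authorization"))
    }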
We occasionally receive contributions to this script that are outside
its intended scope. Let's add a comment to the script that outlines
what it's meant for, and a link to a GitHub ticket with alternatives.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Synchronize the code to do the same thing as Exec.
reap doesn't need to be called before the start event is sent.
There's already a defer block which cleans up the process in case
an error occurs.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The error check in the defer block used the wrong error variable, which is
always nil if the flow reaches the defer. This caused `newProcess.Kill` to
never be called if the subsequent attempt to attach to the stdio failed.
Although this only happens in Exec (as Start does overwrite the error),
this also adjusts Start to use the returned error, to avoid this kind of
mistake in future changes.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
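A contrived illustration of the pitfall (hypothetical names, not the actual
libcontainerd code): the deferred cleanup checks a local error that is already
nil by the time the later failure is assigned to a different variable, so
checking the named return value is the safer pattern.

    package main

    import (
        "errors"
        "fmt"
    )

    type process struct{}

    func (p *process) Kill() { fmt.Println("killed leaked process") }

    func startBuggy() (retErr error) {
        var p process

        var err error // pretend: process created successfully, err stays nil
        defer func() {
            // BUG: checks the stale local `err` instead of the returned error,
            // so the cleanup never runs when attaching stdio fails below.
            if err != nil {
                p.Kill()
            }
        }()

        if attachErr := errors.New("attach stdio failed"); attachErr != nil {
            return attachErr // retErr is set, but the defer looks at `err`
        }
        return nil
    }

    func startFixed() (retErr error) {
        var p process
        defer func() {
            // Check the named return value instead, so any failure on the way
            // out triggers the cleanup.
            if retErr != nil {
                p.Kill()
            }
        }()
        return errors.New("attach stdio failed")
    }

    func main() {
        fmt.Println("buggy:", startBuggy()) // no "killed leaked process" printed
        fmt.Println("fixed:", startFixed()) // cleanup runs
    }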
None of the code using this function was setting the value, so let's
simplify and remove the argument.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When the daemon process or the host running it is abruptly terminated,
the layer metadata file can become inconsistent on the file system.
Specifically, `link` and `lower` files may exist but be empty, leading
to overlay mounting errors during layer extraction, such as:
"failed to register layer: error creating overlay mount to <path>:
too many levels of symbolic links."
This commit introduces the use of `AtomicWriteFile` to ensure that the
layer metadata files contain correct data when they exist on the file system.
Signed-off-by: Mike <mike.sul@foundries.io>
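A minimal sketch of the write pattern (directory layout and contents are
illustrative): AtomicWriteFile from pkg/ioutils writes to a temporary file and
renames it into place, so a crash should not leave behind an empty `link` or
`lower` file.

    package main

    import (
        "log"
        "os"
        "path/filepath"

        "github.com/docker/docker/pkg/ioutils"
    )

    // writeLayerMetadata writes the overlay layer metadata files atomically.
    func writeLayerMetadata(layerDir, linkID, lower string) error {
        if err := ioutils.AtomicWriteFile(filepath.Join(layerDir, "link"), []byte(linkID), 0o644); err != nil {
            return err
        }
        return ioutils.AtomicWriteFile(filepath.Join(layerDir, "lower"), []byte(lower), 0o644)
    }

    func main() {
        dir, err := os.MkdirTemp("", "layer")
        if err != nil {
            log.Fatal(err)
        }
        defer os.RemoveAll(dir)

        if err := writeLayerMetadata(dir, "ABC123", "l/PARENT1:l/PARENT2"); err != nil {
            log.Fatal(err)
        }
    }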
Fixes #18864, #20648, #33561, #40901.
[This GH comment][1] makes clear network name uniqueness has never been
enforced due to the eventually consistent nature of Classic Swarm
datastores:
> there is no guaranteed way to check for duplicates across a cluster of
> docker hosts.
And this is further confirmed by other comments made by @mrjana in that
same issue, eg. [this one][2]:
> we want to adopt a schema which can pave the way in the future for a
> completely decentralized cluster of docker hosts (if scalability is
> needed).
This decentralized model is what Classic Swarm was trying to be. It's
been superseded since then by Docker Swarm, which has a centralized
control plane.
To circumvent this drawback, the `NetworkCreate` endpoint accepts a
`CheckDuplicate` flag. However it's not perfectly reliable as it won't
catch concurrent requests.
Due to this design decision, API clients like Compose have to implement
workarounds to make sure names are really unique (eg.
docker/compose#9585). And the daemon itself has seen a string of issues
due to that decision, including some that aren't fixed to this day (for
instance moby/moby#40901):
> The problem is, that if you specify a network for a container using
> the ID, it will add that network to the container but it will then
> change it to reference the network by using the name.
To summarize, this "feature" is broken, has no practical use and is a
source of pain for Docker users and API consumers. So let's just remove
it for _all_ API versions.
[1]: https://github.com/moby/moby/issues/18864#issuecomment-167201414
[2]: https://github.com/moby/moby/issues/18864#issuecomment-167202589
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Windows doesn't support "FROM scratch", and the platform was only used
for validation on other platforms if a platform was provided, so no need
to set defaults.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
strong-type the fields with the expected type, to make it more explicit
what we're expecting here.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
PR 4f47013feb added a validation step to `NetworkCreate` to ensure
no IPv6 subnet could be set on a network if its `EnableIPv6` parameter
is false.
Before that, the daemon was accepting such request but was doing nothing
with the IPv6 subnet.
This validation step is now deleted, and we automatically set
`EnableIPv6` if an IPv6 subnet was specified.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Use `ImageService.unpackImage` when we want to unpack an image and we
know the exact platform-manifest to be unpacked beforehand.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
DiffID is only a digest of a single layer's tar and matches the snapshot ID
only for the first layer (DiffID = ChainID).
Instead of generating a random ID as a key for rolayer, just use the
snapshot ID of the unpacked image content and use it later as a parent
for creating a new RWLayer.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
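To illustrate the DiffID/ChainID relationship mentioned above (digest values
below are made up): the chain ID equals the diff ID only for the bottom layer,
while deeper layers also hash the parent chain, which is why snapshot IDs and
diff IDs diverge past the first layer.

    package main

    import (
        "fmt"

        "github.com/opencontainers/go-digest"
        "github.com/opencontainers/image-spec/identity"
    )

    func main() {
        d0 := digest.FromString("layer-0 tar") // bottom layer
        d1 := digest.FromString("layer-1 tar")

        // ChainIDs computes the chain in place, so pass a copy of the diff IDs.
        chainIDs := identity.ChainIDs([]digest.Digest{d0, d1})

        fmt.Println(chainIDs[0] == d0) // true: ChainID == DiffID for the first layer
        fmt.Println(chainIDs[1] == d1) // false: deeper layers also hash the parent chain
    }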
diffID is the digest of a tar archive containing changes to the parent
layer - rolayer doesn't have any changes to the parent.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
go1.20.8 (released 2023-09-06) includes two security fixes to the html/template
package, as well as bug fixes to the compiler, the go command, the runtime,
and the crypto/tls, go/types, net/http, and path/filepath packages. See the
Go 1.20.8 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.8+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.7...go1.20.8
From the security mailing:
[security] Go 1.21.1 and Go 1.20.8 are released
Hello gophers,
We have just released Go versions 1.21.1 and 1.20.8, minor point releases.
These minor releases include 4 security fixes following the security policy:
- cmd/go: go.mod toolchain directive allows arbitrary execution
The go.mod toolchain directive, introduced in Go 1.21, could be leveraged to
execute scripts and binaries relative to the root of the module when the "go"
command was executed within the module. This applies to modules downloaded using
the "go" command from the module proxy, as well as modules downloaded directly
using VCS software.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-39320 and Go issue https://go.dev/issue/62198.
- html/template: improper handling of HTML-like comments within script contexts
The html/template package did not properly handle HTML-like "<!--" and "-->"
comment tokens, nor hashbang "#!" comment tokens, in <script> contexts. This may
cause the template parser to improperly interpret the contents of <script>
contexts, causing actions to be improperly escaped. This could be leveraged to
perform an XSS attack.
Thanks to Takeshi Kaneko (GMO Cybersecurity by Ierae, Inc.) for reporting this
issue.
This is CVE-2023-39318 and Go issue https://go.dev/issue/62196.
- html/template: improper handling of special tags within script contexts
The html/template package did not apply the proper rules for handling occurrences
of "<script", "<!--", and "</script" within JS literals in <script> contexts.
This may cause the template parser to improperly consider script contexts to be
terminated early, causing actions to be improperly escaped. This could be
leveraged to perform an XSS attack.
Thanks to Takeshi Kaneko (GMO Cybersecurity by Ierae, Inc.) for reporting this
issue.
This is CVE-2023-39319 and Go issue https://go.dev/issue/62197.
- crypto/tls: panic when processing post-handshake message on QUIC connections
Processing an incomplete post-handshake message for a QUIC connection caused a panic.
Thanks to Marten Seemann for reporting this issue.
This is CVE-2023-39321 and CVE-2023-39322 and Go issue https://go.dev/issue/62266.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
schema1 was deprecated a while ago, and containerd fails to push to a
schema1 registry, so let's just skip these tests for the containerd
integration.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Graph drivers create the parent directory with
rootPair().GID:CurrentIdentity().UID as the owner. This change brings these
in line.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Constants for both platform-specific and platform-independent networks
are added to the api/network package.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Before this commit, setting the `com.docker.network.host_ipv4` bridge
option when `enable_ipv6` is true and the experimental `ip6tables`
option is enabled would cause Docker to fail to create the network:
> failed to create network `test-network`: Error response from daemon:
> Failed to Setup IP tables: Unable to enable NAT rule: (iptables
> failed: `ip6tables --wait -t nat -I POSTROUTING -s fd01::/64 ! -o
> br-test -j SNAT --to-source 192.168.0.2`: ip6tables
> v1.8.7 (nf_tables): Bad IP address "192.168.0.2"
>
> Try `ip6tables -h` or `ip6tables --help` for more information.
> (exit status 2))
Fix this error by passing nil -- not the `host_ipv4` address -- when
creating the IPv6 rules.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
- store linkIndex in a local variable so that it can be reused
- remove / rename some intermediate vars that shadowed existing declaration
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This argument was originally added in libnetwork:
03f440667f
At the time, this argument was conditional, but currently it's always set
to "true", so let's remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The code ignores these errors, but will unconditionally print a warning;
> If the kernel deletion fails for the neighbor entry still remote it
> from the namespace cache. Otherwise if the neighbor moves back to the
> same host again, kernel update can fail.
Let's reduce noise if the neighbor wasn't found, to prevent logs like:
Aug 16 13:26:35 master1.local dockerd[4019880]: time="2023-08-16T13:26:35.186662370+02:00" level=warning msg="error while deleting neighbor entry" error="no such file or directory"
Aug 16 13:26:35 master1.local dockerd[4019880]: time="2023-08-16T13:26:35.366585939+02:00" level=warning msg="error while deleting neighbor entry" error="no such file or directory"
Aug 16 13:26:42 master1.local dockerd[4019880]: time="2023-08-16T13:26:42.366658513+02:00" level=warning msg="error while deleting neighbor entry" error="no such file or directory"
While changing this code, also slightly rephrase the code-comment, and
fix a typo ("remote -> remove").
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
libnetwork/osl: Namespace.DeleteNeighbor: rephrase code-comment
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
container.Run() should be a synchronous operation in normal circumstances;
the container is created and started, so polling after that for the
container to be in the "running" state should not be needed.
This should also prevent issues when a container (for whatever reason)
exited immediately after starting; in that case we would continue
polling for it to be running (which likely would never happen).
Let's skip the polling; if the container is not in the expected state
(i.e. exited), tests should fail as well.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
validateEndpoint was doing more than just validating; it was also implicitly
mutating the endpoint that was passed to it (by reference).
Given that validation only happened when constructing a new v1Endpoint, let's
merge these functions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon would pass an EndpointCreateOption to set the interface MAC
address if the network name and the provided network mode were matching.
Obviously, if the network mode is a network ID, it won't work. To make
things worse, the network mode is never normalized if it's a partial ID.
To fix that: 1. the condition under what the container's mac-address is
applied is updated to also match the full ID; 2. the network mode is
normalized to a full ID when it's only a partial one.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Without these compile flags, Delve is unable to report the value of some
variables and it's not possible to jump into inlined code.
As the contributing docs already mention that `DOCKER_DEBUG` should
disable "build optimizations", the env var is reused here instead of
introducing a new one.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
validateEndpoint uses `v1Endpoint.ping` to verify if the search API can
use a secure connection, and to fall back to basic auth. For Docker Hub,
we don't allow insecure connections, and `v1Endpoint.ping` will not connect
to Docker Hub (Docker Hub also does not implement the `_ping` endpoint,
so doing so would always fail).
Let's make it more clear that we don't do any validation, and return
early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We have the response available, which is an io.Reader, so we don't have
to read the entire response into memory before decoding.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was making a request to the `_ping` endpoint, which (if
implemented) would return a JSON response, which we unmarshal (the only
field we use from the response is the `Standalone` field).
However, if the response had a `X-Docker-Registry-Standalone`, that header
took precedence, and would overwrite the earlier `Standalone` value we
obtained from the JSON response.
This patch adds a fast-path for situations where the header is present,
in which case we can skip handling the JSON response altogether.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We don't really want the daemon to panic for this, so let's log a warning
about max downloads and uploads.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
With containerd image store the images don't depend on each other even
if they share the same content and it's totally fine to delete the
"parent" image.
The skip is necessary because deleting the "parent" image does not
produce an error with the c8d image store and deleting the `busybox`
image breaks other tests.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- The `Version` field was not used for any purpose, other than a debug log
- The `X-Docker-Registry-Version` header was part of the registry v1 spec,
however, as we're not using the `Version` field, we don't need the
header for anything.
- The `X-Docker-Registry-Config` header was only set by the mock registry;
there's no code consuming it, so we don't need to mock it (even if an
actual v1 registry / search API would return it).
It's also worth noting that we never call the `_ping` endpoint when using
Docker Hub's search API, and Docker Hub does not even implement the `_ping`
endpoint;
curl -fsSL https://index.docker.io/_ping | head -n 4
<!DOCTYPE html>
<html lang="en">
<head>
<title>Docker</title>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In some cases, when the daemon launched by a test panics and quits, the
cleanup code would end with an error when trying to kill it by its pid.
In those cases the whole suite would end up waiting for the daemon that
we start in .integration-daemon-start to finish, and we would end up
waiting 2 hours for the CI to cancel after a timeout.
Using process substitution makes the integration tests quit.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Images built by the classic builder will have an additional label (in the
containerd image object, not the image config) pointing to the parent of
that image.
This allows differentiating intermediate images (dangling
images created as a result of each Dockerfile instruction) from the
final images.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Reduce some of the boilerplate; by combining the tests, we skip
the testenv.Clean() in between each of the tests. The performance gain isn't
really measurable, but every bit should help :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
container.Run should be a synchronous operation; the container should
be running after the request was made (or produce an error). Simplify
these tests, and remove the redundant polling.
These were added as part of 8f800c9415,
but no such polls were in place before the refactor, and there's no
mention of these during review of the PR, so I assume these were just
added either as a "precaution", or a result of "copy/paste" from another
test.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was failing frequently on Windows, where the test was waiting
for the container to exit before continuing;
=== FAIL: github.com/docker/docker/integration/container TestResizeWhenContainerNotStarted (18.69s)
resize_test.go:58: timeout hit after 10s: waiting for container to be one of (exited), currently running
It looks like this test is merely validating that a container in any non-
running state should produce an error, so there's no need to run a container
(waiting for it to stop), and just "creating" a container (which would be
in `created` state) should work for this purpose.
Looking at 8f800c9415, I see `createSimpleContainer`
and `runSimpleContainer` utilities were added, so I'm even wondering if the
original intent was to use `createSimpleContainer` for this test.
While updating, also check if we get the expected error-type, instead of
only checking for the error-message.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implement a function that returns an error to replace existing uses of
the IsOSSupported utility, where callers had to produce the error after
checking.
The IsOSSupported function was used in combination with images, so implement
a utility in "image" to prevent having to import pkg/system (which contains
many unrelated functions).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
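A rough sketch of the change in shape (hypothetical helper names, not the
actual moby code): instead of a boolean check that forces every caller to
construct the same error, the check itself returns the error.

    package main

    import (
        "errors"
        "fmt"
        "runtime"
    )

    // Before: callers check the bool, then construct an error themselves.
    func isOSSupported(os string) bool {
        return os == runtime.GOOS
    }

    // After: the check produces the error, so callers just propagate it.
    func checkOS(os string) error {
        if os != runtime.GOOS {
            return errors.New("operating system is not supported: " + os)
        }
        return nil
    }

    func main() {
        // Before-style call site: every caller writes its own error message.
        if !isOSSupported("plan9") {
            fmt.Println("image operating system \"plan9\" cannot be used on this platform")
        }

        // After-style call site: the error is produced once, in one place.
        if err := checkOS("plan9"); err != nil {
            fmt.Println(err)
        }
    }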
This wires up the integration tests to export spans to a Jaeger instance.
After the tests are finished, it exports the data out of Jaeger and uploads
it as an artifact to the action run.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Integration tests will now configure clients to propagate traces as well
as create spans for all tests.
Some extra changes were needed (or desired for trace propagation) in the
test helpers to pass through tracing spans via context.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This uses otel standard environment variables to configure tracing in
the daemon.
It also adds support for propagating trace contexts in the client and
reading those from the API server.
See
https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/
for details on otel environment variables.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- check if we have to download layers and print the appropriate message
- show the digest of the pulled manifest(list)
- skip pulling if we already have the right manifest
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The compressor is already closed a few lines below, and there are no error
returns in between, so the defer is not needed.
Calling Close twice on a writerCloserWrapper is unsafe as it causes it
to put the same buffer to the pool multiple times.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When running a `docker cp` to copy files to/from a container, the
lookup of the `getent` executable happens within the container's
filesystem, so we cannot re-use the results.
Unfortunately, that also means we can't preserve the results for
any other uses of these functions, but probably the lookup should not
be "too" costly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This causes the test to have a saner error message when `images -q`
returns multiple images separated by newlines.
Before this the test would fail with `invalid reference format` when
parsing the multiline string as an image reference.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The test-file had a duplicate definition for ErrNotImplemented, which
caused an error in this package, and was not used otherwise, so we can
remove this file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Updated the description to clarify that this is the endpoint to use if
you want to pull an image.
Signed-off-by: David Karlsson <35727626+dvdksn@users.noreply.github.com>
This makes the c8d code which creates/reads OCI types not lose
Docker-specific features like ONBUILD or Healthcheck.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Parent is a graph-driver only field which is stored in the ImageStore.
It's not available when using containerd snapshotters.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
These were dependent on the DOCKER_ENGINE_GOARCH environment variable,
but this var is no longer set. There was also some weird check to see
if the architecture is "windows", which doesn't make sense. Seeing how
nothing has failed since the TIMEOUT stopped being platform-dependent,
we can safely remove this check.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The test considered `Foo/bar` to be an invalid name, with the assumption
that it was `[docker.io]/Foo/bar`. However, this was incorrect, and the
test passed because the reference parsing had a bug; if the first element
(`Foo`) is not lowercase (so not a valid namespace / "path element"), then
it *should* be considered a domain (as uppercase domain names are valid).
The reference parser did not account for this, and running the test with
a version of the parser with a fix caused the test to fail:
=== Failed
=== FAIL: client TestImageTagInvalidSourceImageName/invalidRepo/FOO/bar (0.00s)
image_tag_test.go:54: assertion failed: expected error to contain "not a valid repository/tag", got "Error response from daemon: client should not have made an API call"
Error response from daemon: client should not have made an API call
=== FAIL: client TestImageTagInvalidSourceImageName (0.00s)
This patch removes the faulty test-case.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add back files at the old locations, as there may be external links
referencing the specification.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This merges the v1.2 specs to provide a single history of the
specification.
To view the combined history:
git log --follow image/spec/spec.md
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was added in b18ae8c9cc, which
was part of v1.12.0-rc1 and later, which used image spec v1.2.0.
This patch amends the v1.2 spec to include the missing field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was added in 9db5db1b94, which
was part of v1.10.0-rc1 and later, which used image spec v1.1.0.
It's worth noting that documentation for the v1.1.0 image spec was not
yet available until commit 4fa0eccd10,
which was included in v1.12.0-rc1 and up. The `ArgsEscaped` field was
also adopted by the OCI image spec since [v1.1.0-rc3][1], but considered
deprecated, and not recommended to be used.
This patch amends the v1.1 and v1.2 specifications to describe the field.
[1]: 59780aa569
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was added in commit 9f994c9646,
which was merged before the image-spec v1.0.0 was released (which happened
in commit 79910625f0).
This patch backfills the specifications to describe the property.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove some trailing commas, which made the JSON invalid (some of these
were fixed in the 1.2 spec, but not in older versions).
- synchronise some formatting / phrasing between versions, to make them
easier to compare.
- remove non-breaking spaces (`NBSP`) in example outputs, and replace
them with regular spaces.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While there's not much we can do if we fail to store a snapshot of the
container's state, let's log the error in case it happens, instead of
discarding it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Daemon.handleContainerExit() returns an error if snapshotting the container's
state to disk fails. There's not much we can do with the error if it occurs,
but let's log the error if that happens, instead of discarding it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function either had to create a new StaticRoute, or add the destination
to the list of routes. Skip creating a StaticRoute struct if we're not
going to use it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Tests are failing with this error:
E ValueError: scheme http+docker is invalid
Which is reported in docker-py in https://github.com/docker/docker-py/issues/1478.
Not sure what changed in the tests; it could be due to an updated Python
version or dependencies, but let's skip it for now.
Test failure:
___________ AttachContainerTest.test_run_container_reading_socket_ws ___________
tests/integration/api_container_test.py:1245: in test_run_container_reading_socket_ws
pty_stdout = self.client.attach_socket(container, opts, ws=True)
docker/utils/decorators.py:19: in wrapped
return f(self, resource_id, *args, **kwargs)
docker/api/container.py:98: in attach_socket
return self._attach_websocket(container, params)
docker/utils/decorators.py:19: in wrapped
return f(self, resource_id, *args, **kwargs)
docker/api/client.py:312: in _attach_websocket
return self._create_websocket_connection(full_url)
docker/api/client.py:315: in _create_websocket_connection
return websocket.create_connection(url)
/usr/local/lib/python3.7/site-packages/websocket/_core.py:601: in create_connection
websock.connect(url, **options)
/usr/local/lib/python3.7/site-packages/websocket/_core.py:245: in connect
options.pop('socket', None))
/usr/local/lib/python3.7/site-packages/websocket/_http.py:117: in connect
hostname, port, resource, is_secure = parse_url(url)
/usr/local/lib/python3.7/site-packages/websocket/_url.py:62: in parse_url
raise ValueError("scheme %s is invalid" % scheme)
E ValueError: scheme http+docker is invalid
------- generated xml file: /src/bundles/test-docker-py/junit-report.xml -------
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was introduced in 1980deffae, which
changed the implementation, but forgot to update imports in these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There is no meaningful distinction between driverapi.Registerer and
drvregistry.DriverNotifyFunc. They are both used to register a network
driver with an interested party. They have the same function signature.
The only difference is that the latter could be satisfied by an
anonymous closure. However, in practice the only implementation of
drvregistry.DriverNotifyFunc is the
(*libnetwork.Controller).RegisterDriver method. This same method also
makes the libnetwork.Controller type satisfy the Registerer interface,
therefore the DriverNotifyFunc type is redundant. Change
drvregistry.Networks to notify a Registerer and drop the
DriverNotifyFunc type.
Signed-off-by: Cory Snider <csnider@mirantis.com>
On startup all local volumes were unmounted as a cleanup mechanism for
the non-clean exit of the last engine process.
This caused live-restored volumes that used special volume opt mount
flags to be broken. While the refcount was restored, the _data directory
was just unmounted, so all new containers mounting this volume would
just have access to the empty _data directory instead of the real
volume.
With this patch, the mountpoint isn't unmounted. Instead, if the volume
is already mounted, just mark it as mounted, so the next time Mount is
called only the ref count is incremented, but no second attempt to mount
it is performed.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
A package only needs one "import" comment to enforce the import path, so
keep the one in the go doc.
It should be noted that, even with that, in most cases Go will ignore
these comments (if go modules are used, even in "vendor" mode).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single test, and was not using any of
the gotest.tools features, so let's remove it as dependency.
With this, the package has no external dependencies (only stdlib).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Log a warning if we encounter an error when releasing leases. While it
may not have direct consequences, failing to release the lease should be
unexpected, so let's make them visible.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was used when the code supported both "v1" and "v2" registries.
We no longer support v1 registries, and the only v1 endpoint that's still
used is for the legacy "search" endpoint, which does not use the APIEndpoint
type.
As no code is using this field, and the value will always be set to "v2",
we can deprecate the Version field.
I'm keeping this field for 1 release, to give notice to any potential
external consumer, after which we can delete it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure that the content in the live-restored volume mounted in a new
container is the same as the content in the old container.
This checks if volume's _data directory doesn't get unmounted on
startup.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Explain that search is not supported on v2 endpoints, and include the
offending endpoint in the error-message.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
First, remove the loop over `apiVersions`. The `apiVersions` map has two
entries (`APIVersion1 => "v1"`, and `APIVersion2 => "v2"`), and `APIVersion1`
is skipped, which means that the loop effectively translates to;
    if apiVersionStr == "v2" {
        return "", invalidParamf("unsupported V1 version path %s", apiVersionStr)
    }
Which leaves us with "anything else" being returned as-is.
This patch removes the loop, and replaces the remaining handling to check
for the "v2" suffix to produce an error, or to strip the "v1" suffix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Combine the two tests into a TestV1EndpointParse function, and rewrite
them to use gotest.tools for asserting.
Also changing the test-cases to use "https://", as the scheme doesn't
matter for this test, but using "http://" may trip-up some linters,
so let's avoid that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Define consts for the Actions we use for events, instead of "ad-hoc" strings.
Having these consts makes it easier to find where specific events are triggered,
makes the events less error-prone, and allows documenting each Action (if needed).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
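A small sketch of the pattern (the type and constant names here are
illustrative, not necessarily the exact ones added): typed constants make call
sites grep-able and avoid scattering ad-hoc strings.

    package main

    import "fmt"

    // Action is an event action (an illustrative type; the real package defines
    // its own event types).
    type Action string

    const (
        ActionCreate Action = "create"
        ActionStart  Action = "start"
        ActionDie    Action = "die"
    )

    func logEvent(action Action, containerID string) {
        fmt.Printf("container %s: %s\n", containerID, action)
    }

    func main() {
        // Call sites use the constants rather than ad-hoc string literals, which
        // makes it easy to find every place a given event is emitted.
        logEvent(ActionStart, "abc123")
    }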
Commit 70ad5b818f changed event.Type
to be a strong type, no longer an alias for string. For some reason,
this test passed on the PR, but failed later on;
=== Failed
=== FAIL: daemon/events TestLoadBufferedEventsOnlyFromPast (0.00s)
events_test.go:203: assertion failed: network (messages[0].Type events.Type) != network (string)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the error message slightly clearer on "what" part is not valid,
and provide suggestions on what are acceptable values.
Before this change:
docker create --restart=always:3 busybox
Error response from daemon: invalid restart policy: maximum retry count cannot be used with restart policy 'always'
docker create --restart=always:-1 busybox
Error response from daemon: invalid restart policy: maximum retry count cannot be used with restart policy 'always'
docker create --restart=unknown busybox
Error response from daemon: invalid restart policy 'unknown'
After this change:
docker create --restart=always:3 busybox
Error response from daemon: invalid restart policy: maximum retry count can only be used with 'on-failure'
docker create --restart=always:-1 busybox
Error response from daemon: invalid restart policy: maximum retry count can only be used with 'on-failure' and cannot be negative
docker create --restart=unknown busybox
Error response from daemon: invalid restart policy: unknown policy 'unknown'; use one of 'no', 'always', 'on-failure', or 'unless-stopped'
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests were testing the deprecated fields, instead of their non-deprecated
alternatives.
This patch adds a utility to verify that they match, and rewrites the tests
to check the non-deprecated fields instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- clean up "//import" comment, as test-files cannot be imported, and only
one "//import" comment is needed per package.
- remove some intermediate variables
- rewrite assertions to use gotest.tools
- use assert.Check() (non-fatal) where possible
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was added in 247f4796d2, and
at the time was added as an alias for string;
> api/types/events: add "Type" type for event-type enum
>
> Currently just an alias for string, but we can change it to be an
> actual type.
Now that all code uses the defined types, we should be able to make
this an actual type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
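The difference in a nutshell (an illustrative snippet, not the events package
itself): an alias is interchangeable with string, while a defined type needs an
explicit conversion when mixed with string-typed values.

    package main

    import "fmt"

    type AliasType = string // alias: just another name for string
    type StrongType string  // defined type: distinct from string

    func main() {
        var kind = "network" // a plain string variable

        var a AliasType = kind              // fine: alias and string are identical
        var s StrongType = StrongType(kind) // a defined type needs an explicit conversion

        fmt.Println(a == kind)         // compiles: same type
        fmt.Println(string(s) == kind) // comparing to a string variable needs a conversion
        // fmt.Println(s == kind)      // would not compile: mismatched types
    }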
Updating the version of crun that we use in our tests to a version that
supports the "features" command (crun v1.8.6 and up). This should prevent
some warnings in our logs:
WARN[2023-08-26T17:05:35.042978552Z] Failed to run [/usr/local/bin/crun features]: "unknown command features\n" error="exit status 1"
full diff: https://github.com/containers/crun/compare/1.4.5...1.8.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add a fast-path to some functions, to prevent locking/unlocking,
or other operations that would not be needed;
- Network.addDriverInfoToCluster
- Network.deleteDriverInfoFromCluster
- Network.addServiceInfoToCluster
- Network.deleteServiceInfoFromCluster
- Network.addDriverWatches
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- return early when failing to fetch the driver
- store network-ID and controller in a variable to prevent repeatedly
locking/unlocking. We don't expect the network's ID to change
during this operation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This method is not part of any interface, and identical to Endpoint.Iface,
but one returns an Interface-type (driverapi.InterfaceInfo) and the other
returns a concrete type (EndpointInterface).
Interface-matching should generally happen on the receiver side, and this
function was only used in a single location, and passed as argument to
Driver.CreateEndpoint, which already matches the interface by accepting
a driverapi.InterfaceInfo.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Interface-matching should generally happen on the receiver side, and this
function was only used in a single location, and passed as argument to
Driver.CreateEndpoint, which already matches the interface by accepting
a driverapi.InterfaceInfo.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Interface-matching should generally happen on the receiver side, and this
function was only used in a single location, and passed as argument to
Driver.CreateEndpoint, which already matches the interface by accepting
a driverapi.InterfaceInfo.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also swapping the order of arguments; putting the "attributes" arguments
last, so that variables can be more cleanly inlined.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The content of this file was removed in c0bc14e8dd,
and all it contained since was the package name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The devicemapper graphdriver has been removed in commit
dc11d2a2d8, and we should
no longer need this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This check was added in 98fe4bd8f1, to check
whether dm_task_deferred_remove could be removed, and to conditionally set
the corresponding build-tags. Now that the devicemapper graphdriver has been
removed in dc11d2a2d8, we no longer need this.
This patch:
- removes uses of the (no longer used) `libdm`, `dlsym_deferred_remove`,
and `libdm_no_deferred_remove` build-tags.
- removes the `add_buildtag` utility, which is now unused.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This documentation moved to a different page, and the Go documentation
moved to the https://go.dev/ domain.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit ab35df454d removed most of the pre-go1.17
build-tags, but for some reason "go fix" doesn't remove these, so remove
the remaining ones manually.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Copying relevant documentation from the EndpointInfo interface. We should
remove this interface, and the related Info() function, but it's currently
acting as a "gate" to prevent accessing the Endpoint's accessors without
making sure it's fully hydrated.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While working on this code, I noticed that there's currently an issue
with userns enabled. When userns is enabled, joining another container's
namespace must also join its user-namespace.
However, a container can only be in a single user namespace, so if a
container joins namespaces from multiple containers, latter user-namespaces
overwrite former ones.
We must add validation for this, but in the meantime, add notes / todo's.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Most error-message returned would already include "container" and the
container ID in the error-message (e.g. "container %s is not running"),
so there's no need to add a custom prefix for that.
- os.Stat returns a PathError, which already includes the operation ("stat"),
the path, and the underlying error that occurred.
And while updating, let's also fix the name to be proper camelCase :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function didn't need the whole container, only its ID, so let's
use that as argument. This also makes it consistent with getIpcContainer.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`Daemon.getPidContainer()` was wrapping the error-message with a message
("cannot join PID of a non running container") that did not reflect the
actual reason for the error; `Daemon.GetContainer()` could either return
an invalid parameter (invalid / empty identifier), or a "not found" error
if the specified container-ID could not be found.
In the latter case, we don't want to return a "not found" error through
the API, as this would indicate that the container we're _starting_ was
not found (which is not the case), so we need to convert the error into
an `errdefs.ErrInvalidParameter` (the container-ID specified for the PID
namespace is invalid if the container doesn't exist).
This logic is similar to what we do for IPC namespaces. which received
a similar fix in c3d7a0c603.
This patch updates the error-types, and moves them into the getIpcContainer
and getPidContainer container functions, both of which should return
an "invalid parameter" if the container was not found.
It's worth noting that, while `WithNamespaces()` may return an "invalid
parameter" error, the `start` endpoint itself may _not_ return one. As
outlined in commit bf1fb97575, starting a container
that has an invalid configuration should be considered an internal server
error, and is not an invalid _request_. However, for uses other than
container "start", `WithNamespaces()` should return the correct error
to allow code to handle it accordingly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We were using a mixture of approaches for these; aligning them a bit
to all use switch statements.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 12485d62ee to save some
duplication, but was really over-engineered to save a few lines of code,
at the cost of hiding away what it does and also potentially returning
inconsistent errors (not addressed in this patch). Let's start with
inlining these.
This removes;
- Daemon.checkContainer
- daemon.containerIsRunning
- daemon.containerIsNotRestarting
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We used to have to clone and build the registry v2, but now that we have
updated the version, we can directly copy the binary from the official
distribution/distribution image.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Previously, a new, partially filled image object was created.
This caused child images to lose their parent's layers.
Instead of creating a new object and trying to replace its fields, just
clone the original passed image and change its ID to the manifest
digest.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
There's only one implementation; let's use that.
Also fixing a linting issue;
libnetwork/osl/interface_linux.go:91:2: S1001: should use copy(to, from) instead of a loop (gosimple)
for i, iface := range n.iFaces {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
InterfaceOptions() returned an IfaceOptionSetter interface, which contained
"methods" that returned functional options. Such a construct could have made
sense if the functional options returned would (e.g.) be pre-propagated with
information from the Sandbox (network namespace), but none of that was the case.
There was only one implementation of IfaceOptionSetter (networkNamespace),
which happened to be the same as the only implementation of Sandbox, so remove
the interface as well, to help networkNamespace with its multi-personality
disorder.
This patch:
- removes Sandbox.Bridge() and makes it a regular function (WithIsBridge)
- removes Sandbox.Master() and makes it a regular function (WithMaster)
- removes Sandbox.MacAddress() and makes it a regular function (WithMACAddress)
- removes Sandbox.Address() and makes it a regular function (WithIPv4Address)
- removes Sandbox.AddressIPv6() and makes it a regular function (WithIPv6Address)
- removes Sandbox.LinkLocalAddresses() and makes it a regular function (WithLinkLocalAddresses)
- removes Sandbox.Routes() and makes it a regular function (WithRoutes)
- removes Sandbox.InterfaceOptions().
- removes the IfaceOptionSetter interface.
Note that the IfaceOption signature was changed as well to allow returning
an error. This is not currently used, but will be used for some options
in the near future, so adding that in preparation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
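A generic sketch of the resulting functional-option shape (simplified types,
not the osl package itself): options are plain functions that may return an
error, rather than methods hanging off an interface.

    package main

    import (
        "fmt"
        "net"
    )

    type iface struct {
        bridge bool
        mac    net.HardwareAddr
    }

    // IfaceOption mirrors the shape described above: it mutates the interface
    // being constructed and may return an error.
    type IfaceOption func(*iface) error

    func WithIsBridge(b bool) IfaceOption {
        return func(i *iface) error {
            i.bridge = b
            return nil
        }
    }

    func WithMACAddress(mac string) IfaceOption {
        return func(i *iface) error {
            hw, err := net.ParseMAC(mac)
            if err != nil {
                return err
            }
            i.mac = hw
            return nil
        }
    }

    // newIface applies all options and fails fast on the first error, instead
    // of creating a partial instance and patching it afterwards.
    func newIface(opts ...IfaceOption) (*iface, error) {
        i := &iface{}
        for _, opt := range opts {
            if err := opt(i); err != nil {
                return nil, err
            }
        }
        return i, nil
    }

    func main() {
        i, err := newIface(WithIsBridge(true), WithMACAddress("02:42:ac:11:00:02"))
        fmt.Println(i, err)
    }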
NeighborOptions() returned an NeighborOptionSetter interface, which
contained "methods" that returned functional options. Such a construct
could have made sense if the functional options returned would (e.g.)
be pre-populated with information from the Sandbox (network namespace),
but none of that was the case.
There was only one implementation of NeighborOptionSetter (networkNamespace),
which happened to be the same as the only implementation of Sandbox, so
remove the interface as well, to help networkNamespace with its multi-personality
disorder.
This patch:
- removes Sandbox.LinkName() and makes it a regular function (WithLinkName)
- removes Sandbox.Family() and makes it a regular function (WithFamily)
- removes Sandbox.NeighborOptions().
- removes the NeighborOptionSetter interface
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
osl.NewSandbox() always returns a nil interface on Windows (and other non-Linux
platforms). This means that these fields are always nil, and
any code using these fields must be considered Linux-only.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
osl.NewSandbox() always returns a nil interface on Windows (and other non-Linux
platforms). This means that these fields are always nil, and
any code using these fields must be considered Linux-only;
- libnetwork/Controller.defOsSbox
- libnetwork/Sandbox.osSbox
Ideally, these fields would live in Linux-only files, but they're referenced
in various platform-neutral parts of the code, so let's start with moving
the initialization code to Linux-only files.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Copying the descriptions from the Sandbox, Info, NeighborOptionSetter,
and IfaceOptionSetter interfaces that it implements.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Check if firewalld is running before running the function, so that consumers
of the function don't have to check for the status.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test is currently failing with containerd-integration, which should
be looked into, but let's start with preventing it from panicking, to make
the test-failures less noisy;
--- FAIL: TestDiskUsage/after_container.Run (0.26s)
panic: runtime error: index out of range [0] with length 0 [recovered]
panic: runtime error: index out of range [0] with length 0
goroutine 280 [running]:
testing.tRunner.func1.2({0xb07a00, 0x40002006a8})
/usr/local/go/src/testing/testing.go:1526 +0x1c8
testing.tRunner.func1()
/usr/local/go/src/testing/testing.go:1529 +0x364
panic({0xb07a00, 0x40002006a8})
/usr/local/go/src/runtime/panic.go:884 +0x1f4
github.com/docker/docker/integration/system.TestDiskUsage.func3(0x0?, {0x0, {0x14ea4a8, 0x0, 0x0}, {0x14ea4a8, 0x0, 0x0}, {0x14ea4a8, 0x0, ...}, ...})
/go/src/github.com/docker/docker/integration/system/disk_usage_test.go:82 +0x7e4
github.com/docker/docker/integration/system.TestDiskUsage.func4(0x4000235c80?)
/go/src/github.com/docker/docker/integration/system/disk_usage_test.go:118 +0x8c
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also remove integration-cli: `DockerAPISuite.TestContainerAPIDeleteConflict`,
which was testing the same conditions as `TestRemoveContainerRunning` in
integration/container.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Saw this failure in a flaky test, and I wondered why we consider this an
error condition;
=== RUN TestKillWithStopSignalAndRestartPolicies
main_test.go:32: assertion failed: error is not nil: Error response from daemon: Could not kill running container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7, cannot remove - Container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7 is not running: failed to remove 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7
--- FAIL: TestKillWithStopSignalAndRestartPolicies (0.84s)
=== RUN TestKillWithStopSignalAndRestartPolicies/same-signal-disables-restart-policy
--- PASS: TestKillWithStopSignalAndRestartPolicies/same-signal-disables-restart-policy (0.42s)
=== RUN TestKillWithStopSignalAndRestartPolicies/different-signal-keep-restart-policy
--- PASS: TestKillWithStopSignalAndRestartPolicies/different-signal-keep-restart-policy (0.23s)
In the above;
1. `Error response from daemon: Could not kill running container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7`
2. `cannot remove - Container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7 is not running`
3. `failed to remove 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7`
So it looks like the removal fails because we couldn't kill the container
because it was already stopped. This may be a race condition, where the first
check shows the container to be running, but it may already be in the process
of being removed or killed. In either case, we probably shouldn't fail the
removal if the container is already stopped.
This patch adds a `isNotRunning()` utility, so that we can ignore this case,
and proceed with the removal.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function never returns an error, so let's remove the error-return,
and give it a slightly more to-the-point name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Windows uses the container-ID as the sandbox-ID, so there's no need to
generate an ID when running on Windows.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The BuildKit dockerignore package was integrated in the patternmatcher
repository / module. This patch updates our uses of the BuildKit package
with its new location.
A small local change was made to keep the format of the existing error message,
because the "ignorefile" package is slightly more agnostic in that respect
and doesn't include ".dockerignore" in the error message.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the lease doesn't exist (for example when creating the container
failed), just ignore the not found error.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Prior to moby/moby#44968, libnetwork would happily accept a ChildSubnet
with a bigger mask than its parent subnet. In such case, it was
producing IP addresses based on the parent subnet, and the child subnet
was not allocated from the address pool.
This commit automatically fixes invalid ChildSubnet for networks stored
in libnetwork's datastore.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Currently, IPAM config is never validated by the API. Some checks
are done by the CLI, but they're not exhaustive. And some of these
misconfigurations might be caught early by libnetwork (ie. when the
network is created), and others only surface when connecting a container
to a misconfigured network. In both cases, the API would return a 500.
Although the `NetworkCreate` endpoint might already return warnings,
these are never displayed by the CLI. As such, it was decided during a
maintainers' call to return validation errors _for all API versions_.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also move the validation function to live with the type definition,
which allows it to be used outside of the daemon as well.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the image for the wanted platform doesn't exist then the lease
doesn't exist either. Returning this error hides the real error, so
let's not return it.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The Controller.Sandboxes method was used by some SandboxWalkers. Now
that those have been removed, there are no longer any consumers of this
method, so let's remove it for now.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I had a CI run fail to "Upload reports":
Exponential backoff for retry #1. Waiting for 4565 milliseconds before continuing the upload at offset 0
Finished backoff for retry #1, continuing with upload
Total file count: 211 ---- Processed file #160 (75.8%)
...
Total file count: 211 ---- Processed file #164 (77.7%)
Total file count: 211 ---- Processed file #164 (77.7%)
Total file count: 211 ---- Processed file #164 (77.7%)
A 503 status code has been received, will attempt to retry the upload
##### Begin Diagnostic HTTP information #####
Status Code: 503
Status Message: Service Unavailable
Header Information: {
"content-length": "592",
"content-type": "application/json; charset=utf-8",
"date": "Mon, 21 Aug 2023 14:08:10 GMT",
"server": "Kestrel",
"cache-control": "no-store,no-cache",
"pragma": "no-cache",
"strict-transport-security": "max-age=2592000",
"x-tfs-processid": "b2fc902c-011a-48be-858d-c62e9c397cb6",
"activityid": "49a48b53-0411-4ff3-86a7-4528e3f71ba2",
"x-tfs-session": "49a48b53-0411-4ff3-86a7-4528e3f71ba2",
"x-vss-e2eid": "49a48b53-0411-4ff3-86a7-4528e3f71ba2",
"x-vss-senderdeploymentid": "63be6134-28d1-8c82-e969-91f4e88fcdec",
"x-frame-options": "SAMEORIGIN"
}
###### End Diagnostic HTTP information ######
Retry limit has been reached for chunk at offset 0 to https://pipelinesghubeus5.actions.githubusercontent.com/Y2huPMnV2RyiTvKoReSyXTCrcRyxUdSDRZYoZr0ONBvpl5e9Nu/_apis/resources/Containers/8331549?itemPath=integration-reports%2Fubuntu-22.04-systemd%2Fbundles%2Ftest-integration%2FTestInfoRegistryMirrors%2Fd20ac12e48cea%2Fdocker.log
Warning: Aborting upload for /tmp/reports/ubuntu-22.04-systemd/bundles/test-integration/TestInfoRegistryMirrors/d20ac12e48cea/docker.log due to failure
Error: aborting artifact upload
Total file count: 211 ---- Processed file #165 (78.1%)
A 503 status code has been received, will attempt to retry the upload
Exponential backoff for retry #1. Waiting for 5799 milliseconds before continuing the upload at offset 0
As a result, the "Download reports" continued retrying:
...
Total file count: 1004 ---- Processed file #436 (43.4%)
Total file count: 1004 ---- Processed file #436 (43.4%)
Total file count: 1004 ---- Processed file #436 (43.4%)
An error occurred while attempting to download a file
Error: Request timeout: /Y2huPMnV2RyiTvKoReSyXTCrcRyxUdSDRZYoZr0ONBvpl5e9Nu/_apis/resources/Containers/8331549?itemPath=integration-reports%2Fubuntu-20.04%2Fbundles%2Ftest-integration%2FTestCreateWithDuplicateNetworkNames%2Fd47798cc212d1%2Fdocker.log
at ClientRequest.<anonymous> (/home/runner/work/_actions/actions/download-artifact/v3/dist/index.js:3681:26)
at Object.onceWrapper (node:events:627:28)
at ClientRequest.emit (node:events:513:28)
at TLSSocket.emitRequestTimeout (node:_http_client:839:9)
at Object.onceWrapper (node:events:627:28)
at TLSSocket.emit (node:events:525:35)
at TLSSocket.Socket._onTimeout (node:net:550:8)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)
Exponential backoff for retry #1. Waiting for 5305 milliseconds before continuing the download
Total file count: 1004 ---- Processed file #436 (43.4%)
And, it looks like GitHub doesn't allow cancelling the job, possibly
because it is defined with `if: always()`?
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This functionality has been replaced with Controller.GetSandbox, and is
no longer used anywhere.
This patch removes:
- the Controller.WalkSandboxes method
- the SandboxContainerWalker SandboxWalker
- the SandboxWalker type
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Various parts of the code were using "walkers" to iterate over the
controller's sandboxes, and the only condition for all of them was
to find the sandbox for a given container-ID. Iterating over all
sandboxes was also sub-optimal, because on Windows, the ContainerID
is used as Sandbox-ID, which can be used to lookup the sandbox from
the "sandboxes" map on the controller.
This patch implements a GetSandbox method on the controller that
looks up the sandbox for a given container-ID, using the most optimal
approach (depending on the platform).
The new method can return errors for invalid (empty) container-IDs, and
a "not found" error to allow consumers to detect non-existing sandboxes,
or potentially invalid IDs.
This new method replaces the (non-exported) Daemon.getNetworkSandbox(),
which was only used internally, in favor of directly accessing the
controller's method.
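A rough sketch of the lookup described above (type names, fields, and locking are assumptions for illustration, not the actual libnetwork code):
```
package libnetworksketch

// Illustrative sketch only; field and type names are assumptions.

import (
	"errors"
	"runtime"
	"sync"
)

type Sandbox struct {
	containerID string
}

type Controller struct {
	mu        sync.Mutex
	sandboxes map[string]*Sandbox // keyed by sandbox-ID
}

var errNoSuchSandbox = errors.New("no sandbox found")

// GetSandbox returns the sandbox attached to the given container-ID.
func (c *Controller) GetSandbox(containerID string) (*Sandbox, error) {
	if containerID == "" {
		return nil, errors.New("invalid parameter: empty container ID")
	}
	c.mu.Lock()
	defer c.mu.Unlock()

	if runtime.GOOS == "windows" {
		// On Windows the container-ID doubles as the sandbox-ID, so a
		// direct map lookup works.
		if sb, ok := c.sandboxes[containerID]; ok {
			return sb, nil
		}
		return nil, errNoSuchSandbox
	}
	// On other platforms, scan for the sandbox attached to the container.
	for _, sb := range c.sandboxes {
		if sb.containerID == containerID {
			return sb, nil
		}
	}
	return nil, errNoSuchSandbox
}
```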
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was not exported so let's remove the abstraction to not make it look
like something more than it is.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
"Pay no attention to the implementation behind the curtain!"
There's only one implementation of the Sandbox interface, and only one implementation
of the Info interface, and they both happen to be implemented by the same type:
networkNamespace. Let's merge these interfaces.
And now that we know that there's one, and only one Info, we can drop the charade,
and relieve the Sandbox from its dual personality.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This better aligns to GHA/CI settings, and is in general a better
practice in the year 2023.
We also drop the 'unsupported' fallback for `git rev-parse` in the
Makefile; we have a better fallback behavior for an empty
DOCKER_GITCOMMIT in `hack/make.sh`.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
There are still messy special cases (e.g. DOCKER_GITCOMMIT vs VERSION),
but this makes things a little easier to follow, as we keep
GHA-specifics in the GHA files.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
This is something that stood out to me: removing the intermediate
container is part of a build step, but unlike the other output from
the build, wasn't indented (and prefixed with `--->`) to be shown
as part of the build.
This patch adds the `--->` prefix, to make it clearer what step the
removal was part of.
While at it, I also updated the message itself: this output is printed
_after_ the intermediate container has been removed, so we may as well
make it match reality, so I changed "removing" to "removed".
Before:
echo -e 'FROM busybox\nRUN echo hello > /dev/null\nRUN echo world > /dev/null\n' | DOCKER_BUILDKIT=0 docker build --no-cache -
Sending build context to Docker daemon 2.048kB
Step 1/3 : FROM busybox
---> a416a98b71e2
Step 2/3 : RUN echo hello > /dev/null
---> Running in a1a65b9365ac
Removing intermediate container a1a65b9365ac
---> 8c6b57ebebdd
Step 3/3 : RUN echo world > /dev/null
---> Running in 9fa977b763a5
Removing intermediate container 9fa977b763a5
---> 795c1f2fc7b9
Successfully built 795c1f2fc7b9
After:
echo -e 'FROM busybox\nRUN echo hello > /dev/null\nRUN echo world > /dev/null\n' | DOCKER_BUILDKIT=0 docker build --no-cache -
Sending build context to Docker daemon 2.048kB
Step 1/3 : FROM busybox
---> fc9db2894f4e
Step 2/3 : RUN echo hello > /dev/null
---> Running in 38d7c34c2178
---> Removed intermediate container 38d7c34c2178
---> 7c0dbc45111d
Step 3/3 : RUN echo world > /dev/null
---> Running in 629620285d4c
---> Removed intermediate container 629620285d4c
---> b92f70f2e57d
Successfully built b92f70f2e57d
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Slightly refactor Resolver.dialExtDNS:
- use net.JoinHostPort to properly format IPv6 addresses
- define a const for the default port, and avoid int -> string
conversion if no custom port is defined
- slightly simplify logic if the HostLoopback is used (at the cost of
duplicating one line); in that case we don't need to define the closure
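For illustration, the JoinHostPort part boils down to something like the following (the const name and helper are assumptions, not the actual resolver code); note how IPv6 literals get bracketed:
```
package dnssketch

import "net"

// defaultDNSPort is an assumed name for the constant mentioned above.
const defaultDNSPort = "53"

// extDNSAddr formats an upstream resolver address. net.JoinHostPort wraps
// IPv6 literals in brackets (e.g. "[fd00::53]:53"), which a plain string
// concatenation would get wrong, and no int -> string conversion is needed
// when the default port is used.
func extDNSAddr(host, port string) string {
	if port == "" {
		port = defaultDNSPort
	}
	return net.JoinHostPort(host, port)
}
```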
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in 36fd9d02be
(libnetwork: ce6c6e8c35),
because there were multiple places where a DNS response was created,
which had to use the same options. However, new "common" options were
added since, and having it in a function separate from the other (also
common) options was just hiding logic, so let's remove it.
What the above probably _should_ have done was to create a common utility
to create a DNS response (as all other options are shared as well). This
was actually done in 0c22e1bd07 (libnetwork:
be3531759b),
which added a `createRespMsg` utility, but missed that it could be used
for both cases.
This patch:
- removes the setCommonFlags function
- uses createRespMsg instead to share common options
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Removes the deprecated consts, which moved to a separate "scope" package
in commit 6ec03d6745, and are no longer used;
- datastore.LocalScope
- datastore.GlobalScope
- datastore.SwarmScope
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
UnavailableError is now compatible with errdefs.UnavailableError. These
errors will now return a 503 instead of a 500.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
InvalidParameter is now compatible with errdefs.InvalidParameter. Thus,
these errors will now return a 400 status code instead of a 500.
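For illustration, "compatible with errdefs" here means implementing the marker interface; a minimal sketch (the type name and helper below are illustrative, not the actual libnetwork error):
```
package errsketch

import "github.com/docker/docker/errdefs"

// InvalidParameterError sketches the idea: implementing the empty marker
// method is enough for the errdefs helpers (and the daemon's HTTP layer)
// to classify the error.
type InvalidParameterError string

func (e InvalidParameterError) Error() string { return string(e) }

// InvalidParameter marks the error as an errdefs invalid-parameter error.
func (e InvalidParameterError) InvalidParameter() {}

// asStatusCode shows the effect: invalid-parameter errors become a 400
// instead of the generic 500.
func asStatusCode(err error) int {
	if errdefs.IsInvalidParameter(err) {
		return 400
	}
	return 500
}
```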
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Fix a failure to inspect image if any of its present manifest references
an image config which isn't present locally.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- un-export ZoneSettings, because it's only used internally
- make conversion to a "interface" slice a method on the struct
- remove the getDockerZoneSettings() function, and move the type-definition
close to where it's used, as it was only used in a single location
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test didn't make a lot of sense, because `checkRunning()` depends on
the `connection` package-var being set, which is done by `firewalldInit()`,
so it would never be true on its own.
Add a small utility that opens its own D-Bus connection to verify if
firewalld is running, and otherwise skips the tests (preserving any
error in the process).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
DelInterfaceFirewalld returns an error if the interface to delete was
not found. Let's ignore cases where we were successfully able to get
the list of interfaces in the zone, but the interface was not part of
the zone.
This patch changes the error for these cases to an errdefs.ErrNotFound,
and updates IPTable.ProgramChain to ignore those errors.
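A minimal sketch of the producer and consumer sides of this change (function names and arguments are placeholders, not the actual firewalld/iptables code):
```
package iptablessketch

import (
	"fmt"

	"github.com/docker/docker/errdefs"
)

// deleteInterface stands in for DelInterfaceFirewalld: when the interface is
// simply not part of the zone, it wraps the error so callers can recognise it.
func deleteInterface(inZone bool, iface string) error {
	if !inZone {
		return errdefs.NotFound(fmt.Errorf("interface %s not found in zone", iface))
	}
	// ... perform the actual removal via D-Bus ...
	return nil
}

// cleanup stands in for the caller side (roughly what ProgramChain now does):
// ignore "not found" and only propagate real failures.
func cleanup(iface string) error {
	if err := deleteInterface(false, iface); err != nil && !errdefs.IsNotFound(err) {
		return err
	}
	return nil
}
```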
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As we have a hard time figuring out what moby/moby#46099 should look
like, this drop-in replacement will solve the initial formatting problem
we have. It's made internal such that we can remove it whenever we want
and unlike moby/moby#46099 doesn't require thoughtful API changes.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's used in various defers, but was using `err` as name, which can be
confusing, and increases the risk of accidentally shadowing the error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's used in various defers, but was using `err` as name, which can be
confusing, and increases the risk of accidentally shadowing the error.
This patch:
- introduces a `retErr` output variable, to be used in defer statements.
- explicitly changes some `err` uses to locally-scoped variables.
- moves some variable definitions closer to where they're used (where possible).
While working on this change, there was one point in the code where
error handling was ambiguous. I added a note for that, in case this
was not a bug:
> This code was previously assigning the error to the global "err"
> variable (before it was renamed to "retErr"), but in case of a
> "MaskableError" did not *return* the error:
> b325dcbff6/libnetwork/controller.go (L566-L573)
>
> Depending on code paths further down, that meant that this error
> was either overwritten by other errors (and thus not handled in
> defer statements) or handled (if no other code was overwriting it).
>
> I suspect this was a bug (but possible without effect), but it could
> have been intentional. This logic is confusing at least, and even
> more so combined with the handling in defer statements that check for
> both the "err" return AND "skipCfgEpCount":
> b325dcbff6/libnetwork/controller.go (L586-L602)
>
> To save future visitors some time to dig up history:
>
> - config-only networks were added in 25082206df
> - the special error-handling and "skipCfgEpCount" was added in ddd22a8198
> - and updated in 87b082f365 to not use string-matching
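A minimal sketch of the `retErr` convention this patch introduces (the network type and helpers are stand-ins, not the actual controller code):
```
package daemonsketch

import "fmt"

type network struct{ name string }

func allocate(name string) (*network, error) { return &network{name: name}, nil }
func configure(*network) error               { return nil }
func release(*network)                       {}

// createNetwork illustrates the naming: the named return value is retErr, so
// defer blocks unambiguously refer to the function's result, while
// short-lived errors stay locally scoped.
func createNetwork(name string) (retErr error) {
	nw, err := allocate(name) // locally-scoped err
	if err != nil {
		return err
	}
	defer func() {
		if retErr != nil {
			// roll back only when the function as a whole failed
			release(nw)
		}
	}()
	if err := configure(nw); err != nil {
		return fmt.Errorf("configuring %s: %w", name, err)
	}
	return nil
}
```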
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There were quite some places where the type collided with variables
named `agent`. Let's rename the type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function has _four_ output variables of the same type, and several
defer statements that checked the error returned (but using the `err`
variable).
This patch names the return variables to make it clearer what's being
returned, and renames the error-return to `retErr` to make it clearer
where we're dealing with the returned error (and not any local err), to
prevent accidentally shadowing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There's nothing handling these results, and they're logged as debug-logs,
so we may as well remove the returned variables.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Both functions were generating debug logs if there was nothing to log.
The function already produces logs if things failed while deleting entries,
so these logs would only be printed if there was nothing to delete, and can
safely be discarded.
Before this change:
DEBU[2023-08-14T12:33:23.082052638Z] Revoking external connectivity on endpoint sweet_swirles (1519f9376a3abe7a1c981600c25e8df6bbd0a3bc3a074f1c2b3bcbad0438443b)
DEBU[2023-08-14T12:33:23.085782847Z] DeleteConntrackEntries purged ipv4:0, ipv6:0
DEBU[2023-08-14T12:33:23.085793847Z] DeleteConntrackEntriesByPort for udp ports purged ipv4:0, ipv6:0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This goroutine was added in c458bca6dc, and
looks for errors from the wait channel. If no error is returned, it attempts
to start the container, and *updates* the error if a failure happened while
doing so, so that the code below it can update the container's status, and
perform auto-remove (if set for the container).
However, due to the formatting of the code, it was easy to overlook that
the "err" variable was not local to the "if" statement.
This patch breaks up the if-statement in an attempt to make it clearer that
this is not a local "err" variable, and adds a code-comment explaining the
logic.
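A simplified sketch of the pitfall (not the actual daemon code; `waitCh` and `start` are placeholders):
```
package daemonsketch

// waitAndStart illustrates why the scope of "err" matters here. In the
// commented-out form, err is scoped to the if-statement, so a failure from
// start() would be invisible to the status-update / auto-remove handling
// below; in the second form, the outer err is deliberately reused.
func waitAndStart(waitCh <-chan error, start func() error) error {
	var err error

	// Pitfall (shadowing):
	// if err := <-waitCh; err == nil {
	// 	err = start() // assigned to the inner err, lost after the if-block
	// }

	// Intended behaviour: reuse the outer err.
	err = <-waitCh
	if err == nil {
		err = start()
	}

	// ... update the container's status and handle auto-remove based on err ...
	return err
}
```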
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The function was checking in a loop if networking for the container was
disabled. Change the function to return early, and to only set hooks
if one needs to be set.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's only called as part of the "libnetwork-setkey" re-exec, so un-exporting
it to make clear it's not for external use.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The basepath is only used on Linux, so no need to call it on other
platforms. SetBasePath was already stubbed out on other platforms,
but "osl" was still imported in various places where it was not actually
used, so trying to reduce imports to get a better picture of what parts
are used (and not used).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests were implicitly skipped through the `getTestEnv()` utility,
which made it hard to discover they were not run on Windows.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it easier to spot if code is only used on Linux. Note that "all of"
the bridge driver is Linux-only.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Removing this type, because:
- containerNotModifiedError is not an actual error, and abstracting it away
was hiding some of these details. It also wasn't used as a sentinel error
anywhere, so doesn't have to be its own type.
- Defining a type just to toggle the error-message between "not running"
and "not stopped" felt a bit over-the-top, as each variant was only used once.
- So "it only had one job", and it didn't even do that right; it produced
capitalized error messages, which makes linters unhappy.
So, let's just inline what it does in the two places it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There's no need for this to be a closure; let's just make it a regular
function. While moving it out, also make some minor code-changes and
add some code-comments to describe the flow / intent, which may not
be trivial for people that are not familiar with these details.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The container rw layer may already be mounted, so it's not safe to use
it in another overlay mount. Use the ref counted mounter (which will
reuse the existing mount if it exists) to avoid that.
Also, mount the parent mounts (layers of the base image) in a read-only
mode.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
To prevent mounting the container rootfs in a rw mode if it's already
mounted. This can't use `mount.WithReadonlyTempMount` because the
archive code does a chroot with a pivot_root, which creates a new
directory in the rootfs.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Add checks for operations that could potentially perform overlayfs mounts
that could cause undefined behaviors.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Any error that occurs while creating the spec, even if it's the
result of an invalid container config, must be considered a System
error (internal server error), as it's not an error with the request
to start the container.
Invalid configuration in the config itself must be validated when
creating the container (creating its config), but some errors are
dependent on the current state, for example when starting a container
that shares a namespace with another container, and that container
is not running (or missing).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was only used for a single test, and it was very limited
in functionality as it only allowed for a certain error-string to be
matched.
Let's change it into a more generic function; a helper that allows a
container to be created from a `TestContainerConfig` (which can be
constructed using `NewTestConfig`) and that returns the response from
client.ContainerCreate(), so that any result from that can be tested,
leaving it up to the test to check the results.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Introduce a NewTestConfig utility, to allow using the available utilities
for constructing a config, and use them with the regular API client.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `client` variable was colliding with the `client` import. In some cases
the confusing `cli` name (it's not the "cli") was used. Given that such names
can easily start spreading (through copy/paste, or "code by example"), let's
make a one-time pass through all of them in this package to use the same name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `client` variable was colliding with the `client` import in various
files. While it didn't conflict in all files, there was inconsistency
in the naming, sometimes using the confusing `cli` name (it's not the
"cli"), and such names can easily start spreading (through copy/paste,
or "code by example").
Let's make a one-time pass through all of them in this package to use
the same name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In these cases, continuing after a non-nil error will result in a
nil-dereference panic.
Change the `assert.Check` to `assert.NilError` to avoid that.
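A small illustration of the difference, assuming a hypothetical `inspect()` call under test: gotest.tools' `assert.NilError` stops the test on failure, while `assert.Check` records the failure and keeps going, so the next line would dereference a nil result.
```
package clientsketch

import (
	"testing"

	"gotest.tools/v3/assert"
)

type result struct{ ID string }

// inspect is a placeholder for the call under test.
func inspect() (*result, error) { return &result{ID: "abc"}, nil }

func TestInspectSketch(t *testing.T) {
	res, err := inspect()
	assert.NilError(t, err) // stops here on error, so the next line is safe
	assert.Check(t, res.ID != "")
}
```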
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This test was testing the client-side validation, so might as well
move it there, and validate that the client invalidates before
trying to make an API call.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Attach the context to the request while we're creating it, instead of
creating the context first, and adding the context later.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Re-use the request, and change the method to GET instead of building
a new request "from scratch".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't exit immediately (due to `set -e` bash behavior) when grep returns
with a non-zero exit code. Use empty dirs instead and let it print
messages about all tests being filtered out.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
To avoid passing the `/` prefix in the -test.run to the integration test
suite, which for some reason executes all tests, but works fine with
integration-cli.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The previous check verified whether ANY of the test directories isn't
integration-cli. This meant it was true if TEST_FILTER matched multiple
tests from both the integration and integration-cli suites.
Remove the grep `-v` inversion and replace it with a bash negation, so
it actually checks if there is no `integration-cli` in test dirs.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The mutex is only used on reads, but there's nothing protecting writes,
and it looks like nothing is mutating fields after creation, so let's
remove this altogether.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
No context in the commit that added it, but PR discussion shows that
the API was mostly exploratory, and it was 8 years ago, so let's not
head in that direction :) b646784859
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that we removed the interface, there's no need to cast the Network
to a NetworkInfo interface, so we can remove uses of the `Info()` method.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These errors aren't used in our repo and seem unused by the OSS
community (this was checked with Sourcegraph).
- ErrIpamInternalError has never been used
- ErrInvalidRequest is unused since moby/libnetwork@c85356efa
- ErrPoolNotFound has never been used
- ErrOverlapPool has never been used
- ErrNoAvailablePool has never been used
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
PR moby/moby#45759 is going to use the new `errors.Join` function to
return a list of validation errors.
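A minimal sketch of the validation pattern `errors.Join` (Go 1.20) enables: collect every problem and return them as a single error, which is nil when the list is empty. The config type and messages below are illustrative, not the actual validation code.
```
package networksketch

import (
	"errors"
	"fmt"
)

type ipamConfig struct {
	Subnet  string
	Gateway string
}

// validateIPAMConfig collects all validation errors; errors.Join returns nil
// when the slice is empty.
func validateIPAMConfig(cfg ipamConfig) error {
	var errs []error
	if cfg.Subnet == "" {
		errs = append(errs, fmt.Errorf("no subnet specified"))
	}
	if cfg.Gateway == "" {
		errs = append(errs, fmt.Errorf("no gateway specified"))
	}
	return errors.Join(errs...)
}
```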
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Check the preferredPool first, as other checks could be doing more
(such as locking, or validating / parsing). Also adding a note, as
it's unclear why we're ignoring invalid pools here.
The "invalid" conditions was added in [libnetwork#1095][1], which
moved code to reduce os-specific dependencies in the ipam package,
but also introduced a types.IsIPNetValid() function, which considers
"0.0.0.0/0" invalid, and added it to the condition to return early.
Unfortunately, the review does not mention this change, so there's no
context why. Possibly this was done to prevent errors further down
the line (when checking for overlaps), but returning an error here
instead would likely have avoided that as well, so we can only guess.
To make this code slightly more transparent, this patch also inlines
the "types.IsIPNetValid" function, as it's not used anywhere else,
and inlining it makes it more visible.
[1]: 5ca79d6b87 (diff-bdcd879439d041827d334846f9aba01de6e3683ed8fdd01e63917dae6df23846)
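Per the description above, the inlined check amounts to something like:
```
package ipamsketch

import "net"

// isIPNetValid mirrors what the inlined check boils down to: the unspecified
// IPv4 network ("0.0.0.0/0") is not considered a usable pool.
func isIPNetValid(nw *net.IPNet) bool {
	return nw.String() != "0.0.0.0/0"
}
```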
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was only run if no preferred pool was specified, however,
since [libnetwork#1162][2], the function would already return early
if a preferred pool was set (and the overlap check would be skipped),
so this was now just dead code.
[2]: 9cc3385f44
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function intentionally holds a lock / lease on address-pools to
prevent trying the same pool repeatedly.
Let's try to make this logic slightly more transparent, and prevent
defining defers in a loop. Releasing all the pools in a single defer
also allows us to get the network-name once, which prevents locking
and unlocking the network for each iteration.
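A rough sketch of the resulting shape (the pool/lease helpers are placeholders, not the actual allocator code):
```
package ipamsketch

// findPool sketches the "single defer" shape described above.
func findPool(tryPool func() (string, bool), release func(string), overlaps func(string) bool) (string, bool) {
	var rejected []string
	defer func() {
		// Release every pool kept on lease while probing, in one place,
		// instead of stacking a defer per loop iteration.
		for _, p := range rejected {
			release(p)
		}
	}()
	for {
		p, ok := tryPool()
		if !ok {
			return "", false // address space exhausted
		}
		if !overlaps(p) {
			return p, true // keep the lease on the chosen pool
		}
		// Keep the lease on rejected pools so the same pool isn't
		// offered again; they're all released by the defer above.
		rejected = append(rejected, p)
	}
}
```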
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Both functions have multiple output vars with generic types, which made
it hard to grasp what's what.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it easier to consume, without first having to create an empty
PoolID.
Performance is the same:
BenchmarkPoolIDFromString-10 6100345 196.5 ns/op 112 B/op 3 allocs/op
BenchmarkPoolIDFromString-10 6252750 192.0 ns/op 112 B/op 3 allocs/op
Note that I opted not to change the return-type to a pointer, as that seems
to perform worse;
BenchmarkPoolIDFromString-10 6252750 192.0 ns/op 112 B/op 3 allocs/op
BenchmarkPoolIDFromString-10 5288682 226.6 ns/op 192 B/op 4 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As this function may be called repeatedly to convert to/from a string,
it may be worth optimizing it a bit. Adding a minimal Benchmark for
it as well.
Before/after:
BenchmarkPoolIDToString-10 2842830 424.3 ns/op 232 B/op 12 allocs/op
BenchmarkPoolIDToString-10 7176738 166.8 ns/op 112 B/op 7 allocs/op
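The benchmark itself has roughly this minimal shape (`poolID` below is a simplified stand-in, not the real type, but it exercises the same kind of String() work):
```
package ipamsketch

import "testing"

type poolID struct {
	AddressSpace string
	Subnet       string
}

func (p poolID) String() string {
	return p.AddressSpace + "/" + p.Subnet
}

func BenchmarkPoolIDToString(b *testing.B) {
	p := poolID{AddressSpace: "LocalDefault", Subnet: "172.17.0.0/16"}
	for i := 0; i < b.N; i++ {
		_ = p.String()
	}
}
```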
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
network.requestPoolHelper and Allocator.RequestPool have many args and
output vars with generic types. Add names for them to make it easier to
grasp what's what.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The options are unused, other than for debug-logging, which made it look
as if they were actually consumed somewhere, but they aren't.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it slightly more readable to see what's returned in each of
the code-paths. Also move validation of pool/subpool earlier in the
function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests were testing non-existing plugins, and therefore triggered
the retry-loop, which times out after 15-30 seconds. Add some options
to allow overriding this timeout during tests.
Before:
go test -v -run '^(TestGet|TestNewClientWithTimeout)$'
=== RUN TestGet
=== RUN TestGet/success
=== RUN TestGet/not_implemented
=== RUN TestGet/not_exists
WARN[0000] Unable to locate plugin: vegetable, retrying in 1s
WARN[0001] Unable to locate plugin: vegetable, retrying in 2s
WARN[0003] Unable to locate plugin: vegetable, retrying in 4s
WARN[0007] Unable to locate plugin: vegetable, retrying in 8s
--- PASS: TestGet (15.02s)
--- PASS: TestGet/success (0.00s)
--- PASS: TestGet/not_implemented (0.00s)
--- PASS: TestGet/not_exists (15.02s)
=== RUN TestNewClientWithTimeout
client_test.go:166: started remote plugin server listening on: http://127.0.0.1:36275
WARN[0015] Unable to connect to plugin: 127.0.0.1:36275/Test.Echo: Post "http://127.0.0.1:36275/Test.Echo": context deadline exceeded (Client.Timeout exceeded while awaiting headers), retrying in 1s
WARN[0017] Unable to connect to plugin: 127.0.0.1:36275/Test.Echo: Post "http://127.0.0.1:36275/Test.Echo": context deadline exceeded (Client.Timeout exceeded while awaiting headers), retrying in 2s
WARN[0019] Unable to connect to plugin: 127.0.0.1:36275/Test.Echo: Post "http://127.0.0.1:36275/Test.Echo": net/http: request canceled (Client.Timeout exceeded while awaiting headers), retrying in 4s
WARN[0024] Unable to connect to plugin: 127.0.0.1:36275/Test.Echo: Post "http://127.0.0.1:36275/Test.Echo": net/http: request canceled (Client.Timeout exceeded while awaiting headers), retrying in 8s
--- PASS: TestNewClientWithTimeout (17.64s)
PASS
ok github.com/docker/docker/pkg/plugins 32.664s
After:
go test -v -run '^(TestGet|TestNewClientWithTimeout)$'
=== RUN TestGet
=== RUN TestGet/success
=== RUN TestGet/not_implemented
=== RUN TestGet/not_exists
WARN[0000] Unable to locate plugin: this-plugin-does-not-exist, retrying in 1s
--- PASS: TestGet (1.00s)
--- PASS: TestGet/success (0.00s)
--- PASS: TestGet/not_implemented (0.00s)
--- PASS: TestGet/not_exists (1.00s)
=== RUN TestNewClientWithTimeout
client_test.go:167: started remote plugin server listening on: http://127.0.0.1:45973
--- PASS: TestNewClientWithTimeout (0.04s)
PASS
ok github.com/docker/docker/pkg/plugins 1.050s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If TEST_INTEGRATION_FAIL_FAST is not set, run the integration-cli tests
even if integration tests failed.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Collect a list of all the links we successfully enabled (if any), and
use a single defer to disable them.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The iptables package has types defined for these actions; use them directly
instead of creating a string only to convert it to a known value.
As the linkContainers() function is only used internally, and with fixed
values, we can also remove the validation and the InvalidIPTablesCfgError
error, which is now unused.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Partially revert commit 94b880f.
The CheckDuplicate field has been introduced in commit 2ab94e1. At that
time, this check was done in the network router. It was then moved to
the daemon package in commit 3ca2982. However, commit 94b880f duplicated
the logic into the network router for no apparent reason. Finally,
commit ab18718 made sure a 409 would be returned instead of a 500.
As this logic is first done by the daemon, the error -> warning
conversion can't happen because CheckDuplicate has to be true for the
daemon package to return an error. If it's false, the daemon proceeds
with the network creation, sets the Warning field of its return value, and
returns no error.
Thus, the CheckDuplicate logic in the api is removed and
libnetwork.NetworkNameError now implements the ErrConflict interface.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The MediaType was changed twice in;
- b3b7eb2723 ("application/vnd.docker.plugins.v1+json" -> "application/vnd.docker.plugins.v1.1+json")
- 54587d861d ("application/vnd.docker.plugins.v1.1+json" -> "application/vnd.docker.plugins.v1.2+json")
But the (integration) tests were still using the old version, so let's
use the VersionMimeType const that's defined, and use the updated version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The MediaType was changed twice in;
- b3b7eb2723 ("application/vnd.docker.plugins.v1+json" -> "application/vnd.docker.plugins.v1.1+json")
- 54587d861d ("application/vnd.docker.plugins.v1.1+json" -> "application/vnd.docker.plugins.v1.2+json")
But the (integration) tests were still using the old version, so let's
use the VersionMimeType const that's defined, and use the updated version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The MediaType was changed twice in;
- b3b7eb2723 ("application/vnd.docker.plugins.v1+json" -> "application/vnd.docker.plugins.v1.1+json")
- 54587d861d ("application/vnd.docker.plugins.v1.1+json" -> "application/vnd.docker.plugins.v1.2+json")
But the (integration) tests were still using the old version, so let's
use the VersionMimeType const that's defined, and use the updated version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
During review, it was decided to remove `LimitNOFILE` from `docker.service` to rely on the systemd v240 implicit default of `1024:524288`. On supported platforms with systemd prior to v240, packagers will patch the service with an explicit `LimitNOFILE=1024:524288`.
- `1024` soft limit is an implicit default, avoiding unexpected breakage. Software that needs a higher limit should request to raise the soft limit for its process.
- `524288` hard limit is an implicit default since systemd v240 and is adequate for most processes (_half of the historical limit from `fs.nr_open` of `1048576`_), while 4096 is the implicit default from the kernel (often too low). Individual containers can be started with `--ulimit` when a larger hard limit is required.
- The hard limit may not exceed `fs.nr_open` (_which a value of `infinity` will resolve to_). On most systems with systemd v240 or newer, this will resolve to an excessive size of 2^30 (over 1 billion).
- When set to `infinity` (usually as the soft limit) software may experience significantly increased resource usage, resulting in a performance regression or runtime failures that are difficult to troubleshoot.
- OpenRC's current config approach lacks support for setting different soft/hard limits, as it adjusts additional limits and `ulimit` does not support mixed usage of `-H` + `-S`. A soft limit of `524288` is not ideal, but 2^19 is much less overhead than 2^30, whilst a hard limit of 4096 would be problematic for Docker.
Signed-off-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>
Remove some intermediate vars, move vars closer to where they're used,
and introduce local var for `nw.Name()` to reduce some locking/unlocking in:
- `Daemon.allocateNetwork()`
- `Daemon.releaseNetwork()`
- `Daemon.connectToNetwork()`
- `Daemon.disconnectFromNetwork()`
- `Daemon.findAndAttachNetwork()`
Also un-wrapping some lines to make it slightly easier to read the conditions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Remove intermediate variable
- Optimize the order of checks in the condition; check for unmanaged containers
first, before getting information about cluster state and network information.
- Simplify the log messages, as the error would already contain the same
information about the network (name or ID) and container (ID), so would
print the network ID twice:
error detaching from network <ID>: could not find network attachment for container <ID> to network <name or ID>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The function was declaring an err variable which was shadowed. It was
intended for directly assigning to a struct field, but as this function
is directly mutating an existing object, and the err variable was declared
far away from its use, let's use an intermediate var for that to make it
slightly more atomic.
While at it, also combined two "if" branches.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Store network.Name() in a variable to reduce repeatedly locking/unlocking
of the network (although this is very, very minimal in the grand scheme
of things).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tests were made parallel to speed up the execution, but this
turned out to be flaky, because they mutate some shared state.
The tests use shared `storage` variable without any synchronization.
However, adding synchronization is not enough in all cases, some tests
register the same plugin, so they can't be run in parallel to each
other.
This commit adds the synchronization around `storage` variable
modification and removes parallel from the tests where it's not enough.
Before:
```
$ go test -race -v . -count 1
...
--- FAIL: TestGet (15.02s)
--- FAIL: TestGet/not_implemented (0.00s)
testing.go:1446: race detected during execution of test
testing.go:1446: race detected during execution of test
FAIL
FAIL github.com/docker/docker/pkg/plugins 17.655s
FAIL
```
After:
```
$ go test -race -v . -count 1
ok github.com/docker/docker/pkg/plugins 32.702s
```
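A minimal sketch of the added synchronization (names are illustrative, not the actual pkg/plugins code): a lock guards every write to the shared registry, while tests that register the same plugin name still cannot run in parallel with each other.
```
package pluginsketch

import "sync"

type Plugin struct{ Name string }

// pluginStorage sketches the shared "storage" variable with a mutex guarding
// modifications.
type pluginStorage struct {
	mu      sync.Mutex
	plugins map[string]*Plugin
}

func (s *pluginStorage) add(p *Plugin) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.plugins == nil {
		s.plugins = make(map[string]*Plugin)
	}
	s.plugins[p.Name] = p
}
```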
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This function is called by `daemon.containerCreate()` which is already
wrapping errors coming from `verifyNetworkingConfig()` with
`errdefs.InvalidParameter()`. So `verifyNetworkingConfig()` should only
return standard errors.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
"HEAD" will still be used as a version if no DOCKER_COMMIT is provided
(for example when not running via `make`), but it won't prevent it being
set to the GITHUB_SHA variable when it's present.
This should fix `Git commit` reported by `docker version` for the
binaries generated by `moby-bin`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This code was initializing a new PortBinding, and creating a deep copy
for each binding. It's unclear what the intent was here, but at least
PortBinding.GetCopy() wasn't adding much value, as it created a new
PortBinding, [copying all values from the original][1], which includes
a [copy of IPAddresses in it][2]. Our original "template" did not have any
of that, so let's forego that, and just create new PortBindings as we go.
[1]: 454b6a7cf5/libnetwork/types/types.go (L110-L120)
[2]: 454b6a7cf5/libnetwork/types/types.go (L236-L244)
Benchmarking before/after;
BenchmarkPortBindingCopy-10 166752 6230 ns/op 1600 B/op 100 allocs/op
BenchmarkPortBindingNoCopy-10 226989 5056 ns/op 1600 B/op 100 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were not adding much, so just getting rid of them. Also added a
TODO to move this code to the type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move variables closer to where they're used instead of defining them all
at the start of the function.
Also removing some intermediate variables, unwrapped some lines, and combined
some checks to a single check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Outside of some tests, these options are the only code setting these fields,
so we can update them to set the value, instead of appending.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was created as a "method", but didn't use the Daemon in any
way, and all other options were checked inline, so let's not pretend this
function is more "special" than the other checks, and inline the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- store network.Name() in a variable to reduce repeatedly locking/unlocking
of the network (although this is very, very minimal in the grand scheme
of things).
- un-wrap long conditions
- ever so slightly optimise some conditions by changing the order of checks.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was initializing a new PortBinding, and creating a deep copy
for each binding. It's unclear what the intent was here, but at least
PortBinding.GetCopy() wasn't adding much value, as it created a new
PortBinding, [copying all values from the original][1], which includes
a [copy of IPAddresses in it][2]. Our original "template" did not have any
of that, so let's forego that, and just create new PortBindings as we go.
[1]: 454b6a7cf5/libnetwork/types/types.go (L110-L120)
[2]: 454b6a7cf5/libnetwork/types/types.go (L236-L244)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were not adding much, so just getting rid of them. Also added a
TODO to move this code to the type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move variables closer to where they're used instead of defining them all
at the start of the function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`getPortMapInfo` does many things; it creates a copy of all the sandbox
endpoints, gets the driver, endpoints, and network from store, and creates
port-bindings for all exposed and mapped ports.
We should look if we can create a more minimal implementation for this
purpose, but in the meantime, let's prevent it being called if we don't
need it by making it the second condition in the check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- don't initialize slices; initializing them isn't needed in order to append to them
- store network-ID in a var to prevent repeated lock/unlocking in nw.ID()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When we added this deprecation warning, some registries had not yet
moved away from the deprecated specification, so we made the warning
conditional for pulling from Docker Hub.
That condition was added in 647dfe99a5,
which is over 4 years ago, which should be time enough for images
and registries to have moved to current specifications.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Use the same warning for both "v1 in manifest-index" and bare "v1" images.
- Update URL to use a "/go/" redirect, which allows the docs team to more
easily redirect the URL to relevant docs (if things move).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The original code in container.Exec was potentially leaking the copy
goroutine when the context was cancelled or timed out. The new
`demultiplexStreams()` function won't return until the goroutine has
finished its work and, to ensure that, it takes care of closing the
hijacked connection.
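A rough sketch of the pattern described above (not the exact helper; the buffers and field usage are assumptions): start the demultiplexing copy in a goroutine, and on cancellation close the hijacked connection so the copy unblocks, returning only once the goroutine has finished.
```
package execsketch

import (
	"bytes"
	"context"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/pkg/stdcopy"
)

func demultiplexStreams(ctx context.Context, resp types.HijackedResponse, outBuf, errBuf *bytes.Buffer) error {
	done := make(chan error, 1)
	go func() {
		// Split the multiplexed stream into stdout/stderr buffers.
		_, err := stdcopy.StdCopy(outBuf, errBuf, resp.Reader)
		done <- err
	}()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		// Closing the connection makes StdCopy return, so the goroutine
		// cannot leak; wait for it before returning.
		resp.Close()
		<-done
		return ctx.Err()
	}
}
```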
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Includes a fix for CVE-2023-29409
go1.20.7 (released 2023-08-01) includes a security fix to the crypto/tls
package, as well as bug fixes to the assembler and the compiler. See the
Go 1.20.7 milestone on our issue tracker for details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.20.7+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.20.6...go1.20.7
From the mailing list announcement:
[security] Go 1.20.7 and Go 1.19.12 are released
Hello gophers,
We have just released Go versions 1.20.7 and 1.19.12, minor point releases.
These minor releases include 1 security fix following the security policy:
- crypto/tls: restrict RSA keys in certificates to <= 8192 bits
Extremely large RSA keys in certificate chains can cause a client/server
to expend significant CPU time verifying signatures. Limit this by
restricting the size of RSA keys transmitted during handshakes to <=
8192 bits.
Based on a survey of publicly trusted RSA keys, there are currently only
three certificates in circulation with keys larger than this, and all
three appear to be test certificates that are not actively deployed. It
is possible there are larger keys in use in private PKIs, but we target
the web PKI, so causing breakage here in the interests of increasing the
default safety of users of crypto/tls seems reasonable.
Thanks to Mateusz Poliwczak for reporting this issue.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.20.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it slightly clearer what it does, as "resolve" may give the
impression it's doing more than just returning the TLS config configured
for the client.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
fallbackDial was only used in a single place, and it was defined far away
from where it's used, so let's inline it, so that it's clear at a glance
what we're doing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The is-automated field is being deprecated by Docker Hub's search API,
and will always be set to "false" in future.
This patch deprecates the field and related filter for the Engine's API.
In future, the `is-automated` filter will no longer yield any results
when searching for `is-automated=true`, and will be ignored when
searching for `is-automated=false`.
Given that this field is deprecated by an external API, the deprecation
will not be versioned, and will apply to any API version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I noticed this was always being skipped because of race conditions
checking the logs.
This change adds a log scanner which will look through the logs line by
line rather than allocating a big buffer.
Additionally it adds a `poll.Check` which we can use to actually wait
for the desired log entry.
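A rough sketch of what such a `poll.Check` can look like (`getLogs` is a placeholder for however the test fetches the log output):
```
package logsketch

import (
	"bufio"
	"strings"
	"testing"
	"time"

	"gotest.tools/v3/poll"
)

// waitForLogEntry scans the logs line by line until the expected entry shows
// up, instead of reading everything into one big buffer and racing against it.
func waitForLogEntry(t *testing.T, getLogs func() string, want string) {
	check := func(poll.LogT) poll.Result {
		scanner := bufio.NewScanner(strings.NewReader(getLogs()))
		for scanner.Scan() {
			if strings.Contains(scanner.Text(), want) {
				return poll.Success()
			}
		}
		return poll.Continue("%q not found in logs yet", want)
	}
	poll.WaitOn(t, check, poll.WithDelay(100*time.Millisecond), poll.WithTimeout(10*time.Second))
}
```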
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
IPv4AddrNoMatchError and IPv6AddrNoMatchError are currently implementing
BadRequestError. They are returned in two cases, and none are due to a
bad user request:
- When calling daemon's CreateNetwork route, if the bridge's IPv4
address or none of the bridge's IPv6 addresses match what's requested.
If that happens, there's a big issue somewhere in libnetwork or the
kernel.
- When restoring a network, for the same reason. In that case, the
on-disk state drifted from the interface state.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This error can only be reached because of an error in our code, so it's
not a "bad user request". As it's never type asserted, no need to keep
it around.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This error is only used in defensive checks whereas the precondition is
already checked by the caller. If we reach it, we messed up something else. So
it's definitely not a BadRequest. Also, it's not type asserted anywhere,
so just inline it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
It was not used as a sentinel error, and didn't carry a specific type,
which made it a rather complex way to create an error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was moved to the types package as part of a refactor
in 778e2a72b3
but the introduction of the sandbox API changed the existing API to
weak types (not using a plain string);
9a47be244a
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- InvalidIPTablesCfgError: implement InternalError instead of
BadRequestError. This error is returned when an invalid iptables
action is passed as argument (ie. none of -A, -I, or -D).
- ErrInvalidDriverConfig: don't implement BadRequestError. This is
returned when libnetwork controller initialization passes a bad driver
config -- there's no call from an HTTP route.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- full diff: https://github.com/containerd/containerd/compare/v1.6.21...v1.6.22
- release notes: https://github.com/containerd/containerd/releases/tag/v1.6.22
---
Notable Updates
- RunC: Update runc binary to v1.1.8
- CRI: Fix `additionalGids`: it should fallback to `imageConfig.User`
when `securityContext.RunAsUser`, `RunAsUsername` are empty
- CRI: Write generated CNI config atomically
- Fix concurrent writes for `UpdateContainerStats`
- Make `checkContainerTimestamps` less strict on Windows
- Port-Forward: Correctly handle known errors
- Resolve `docker.NewResolver` race condition
- SecComp: Always allow `name_to_handle_at`
- Adding support to run hcsshim from local clone
- Pinned image support
- Runtime/V2/RunC: Handle early exits w/o big locks
- CRITool: Move up to CRI-TOOLS v1.27.0
- Fix cpu architecture detection issue on emulated ARM platform
- Task: Don't `close()` io before `cancel()`
- Fix panic when remote differ returns empty result
- Plugins: Notify readiness when registered plugins are ready
- Unwrap io errors in server connection receive error handling
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- go.mod: update dependencies and go version by
- Use Go1.20
- Fix couple of typos
- Added `WithStdout` and `WithStderr` helpers
- Moved `cmdOperators` handling from `RunCmd` to `StartCmd`
- Deprecate `assert.ErrorType`
- Remove outdated Dockerfile
- add godoc links
full diff: https://github.com/gotestyourself/gotest.tools/compare/v3.4.0...v3.5.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Follow-up to fca38bcd0a, which made the
Discover API optional for drivers to implement, but forgot to remove the
stubs from the Windows drivers, which didn't implement this API.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "Capability" type defines DataScope and ConnectivityScope fields,
but their value was set from consts in the datastore package, which
required importing that package and its dependencies for the consts
only.
This patch:
- Moves the consts to a separate "scope" package
- Adds aliases for the consts in the datastore package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
changes:
- specs-go: remove artifact prefixed annotations
- Switch from scratch to empty
- Remove artifact media type reference
- image-index: add artifactType to specs and schema
- Add artifactType to image index
- Apply version change from #1050
- Specify the content of the scratch blob
- Add language from artifacttype field to forbid allowlists of media types
- spec: clarify descriptor, align with de facto artifact usage
- Remove special guidance around wasm
- Update descriptor.go
- releases: use +dev as in-development suffix
- version: bump HEAD back to -dev
- image-index: add the subject field
full diff: https://github.com/opencontainers/image-spec/compare/v1.1.0-rc3...v1.1.0-rc4
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Most drivers do not implement this, so detect if a driver implements
the discoverAPI, and remove the implementation from drivers that do
not support it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.2...v1.7.3
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.3
----
Welcome to the v1.7.3 release of containerd!
The third patch release for containerd 1.7 contains various fixes and updates.
Notable Updates
- RunC: Update runc binary to v1.1.8
- CRI: Fix `additionalGids`: it should fallback to `imageConfig.User`
when `securityContext.RunAsUser`,`RunAsUsername` are empty
- CRI: write generated CNI config atomically
- Port-Forward: Correctly handle known errors
- Resolve docker.NewResolver race condition
- Fix `net.ipv4.ping_group_range` with userns
- Runtime/V2/RunC: handle early exits w/o big locks
- SecComp: always allow `name_to_handle_at`
- CRI: Windows Pod Stats: Add a check to skip stats for containers that
are not running
- Task: don't `close()` io before cancel()
- Remove CNI conf_template deprecation
- Fix issue for HPC pod metrics
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.1...v1.7.2
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.2
----
Welcome to the v1.7.2 release of containerd!
The second patch release for containerd 1.7 includes enhancements to CRI
sandbox mode, Windows snapshot mounting support, and CRI and container IO
bug fixes.
CRI/Sandbox Updates
- Publish sandbox events
- Make stats respect sandbox's platform
Other Notable Updates
- Mount snapshots on Windows
- Notify readiness when registered plugins are ready
- Fix `cio.Cancel()` should close pipes
- CDI: Use CRI `Config.CDIDevices` field for CDI injection
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Go 1.15.7 contained a security fix for CVE-2021-3115, which allowed arbitrary
code to be executed at build time when using cgo on Windows.
This issue was not limited to the go command itself, and could also affect binaries
that use `exec.Command`, `exec.LookPath`, etc.
From the related blogpost (https://blog.golang.org/path-security):
> Are your own programs affected?
>
> If you use exec.LookPath or exec.Command in your own programs, you only need to
> be concerned if you (or your users) run your program in a directory with untrusted
> contents. If so, then a subprocess could be started using an executable from dot
> instead of from a system directory. (Again, using an executable from dot happens
> always on Windows and only with uncommon PATH settings on Unix.)
>
> If you are concerned, then we’ve published the more restricted variant of os/exec
> as golang.org/x/sys/execabs. You can use it in your program by simply replacing
At the time of the go1.15 release, the Go team considered changing the behavior
of `exec.Command()` and `exec.LookPath()` to be a breaking change, and made the
behavior "opt-in" by providing the `golang.org/x/sys/execabs` package as a
replacement.
However, for the go1.19 release, this changed, and the default behavior of
`exec.Command()` and `exec.LookPath()` was changed. From the release notes:
https://go.dev/doc/go1.19#os-exec-path
> Command and LookPath no longer allow results from a PATH search to be found
> relative to the current directory. This removes a common source of security
> problems but may also break existing programs that depend on using, say,
> exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe)
> in the current directory. See the os/exec package documentation for information
> about how best to update such programs.
>
> On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath
> environment variable, making it possible to disable the default implicit search
> of “.” in PATH lookups on Windows systems.
A result of this change was that registering the daemon as a Windows service
no longer worked when done from within the directory of the binary itself:
C:\> cd "Program Files\Docker\Docker\resources"
C:\Program Files\Docker\Docker\resources> dockerd --register-service
exec: "dockerd": cannot run executable found relative to current directory
Note that using an absolute path would work around the issue:
C:\Program Files\Docker\Docker>resources\dockerd.exe --register-service
This patch changes `registerService()` to use `os.Executable()`, instead of
depending on `os.Args[0]` and `exec.LookPath()` for resolving the absolute
path of the binary.
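A minimal sketch of the approach, assuming the resolved path is then used to
build the Windows service configuration:
```
// serviceBinaryPath is an illustrative helper: os.Executable returns the
// absolute path of the running binary, without consulting PATH or the
// current working directory (which go1.19+ no longer allows implicitly).
func serviceBinaryPath() (string, error) {
	exePath, err := os.Executable()
	if err != nil {
		return "", err
	}
	// exePath can be used directly as the service's binary path, so
	// "dockerd --register-service" also works from the binary's own directory.
	return exePath, nil
}
```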
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The IPv6 iptables rules are exactly the same as the IPv4 rules, although both
protocols don't use the same networking model. This has bad consequences,
for instance: 1. the current v6 rules disallow Neighbor
Solicitation/Advertisement; 2. multicast addresses can't be used; 3.
link-local addresses are blocked too.
To solve this, this commit changes the following rules:
```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
```
into:
```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 ! -i br-21502e5b2c6c -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c ! -o br-21502e5b2c6c -j DROP
```
These rules only limit the traffic ingressing/egressing the bridge, but
not traffic between veth on the same bridge.
Note that the kernel takes care of dropping invalid IPv6 packets, e.g.
loopback spoofing, thus these rules don't need to be more specific.
Solves #45460.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
refreshImage is the only function used as a reducer and it doesn't use
the `filter *listContext`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Aggregate same images into one object and add the list of tags pointing
to it to the RepoTags array
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
- use assert.Check to continue the test even if a check fails
- assert the total number of images returned, not only their RepoTags
- use subtests
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Refactor GetContainerLayerSize to calculate unpacked image size only by
following the snapshot parent tree directly instead of following it by
using diff ids from image config.
This works even if the original manifest/config used to create that
container is no longer present in the content store.
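A minimal sketch of the idea, assuming `sn` is a containerd
`snapshots.Snapshotter` and `snapshotKey` is the container's snapshot (names
are illustrative):
```
func unpackedSize(ctx context.Context, sn snapshots.Snapshotter, snapshotKey string) (int64, error) {
	var total int64
	for next := snapshotKey; next != ""; {
		usage, err := sn.Usage(ctx, next)
		if err != nil {
			return 0, err
		}
		total += usage.Size

		info, err := sn.Stat(ctx, next)
		if err != nil {
			return 0, err
		}
		// Walk up the parent chain directly; no manifest or image config
		// (diff IDs) is needed for this.
		next = info.Parent
	}
	return total, nil
}
```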
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Check for generic `errdefs.NotFound` rather than specific error helper
struct when checking if the error is caused by the image not being
present.
It still works for `ErrImageDoesNotExist` because it
implements the NotFound errdefs interface too.
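A minimal sketch of the generic check, assuming `err` came from an image
lookup:
```
// Any error implementing the errdefs NotFound interface matches, including
// the concrete ErrImageDoesNotExist helper.
if errdefs.IsNotFound(err) {
	// the image is not present locally; handle accordingly (e.g. pull it)
}
```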
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Now that all helper functions are updated, we can use a struct-literal
for this function, which makes it slightly easier to read.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Split the buildDetailedNetworkResources function into separate functions for
collecting container attachments (`buildContainerAttachments`) and service
attachments (`buildServiceAttachments`). This allows us to get rid of the
"verbose" bool, and makes the logic slightly more transparent.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Pass the endpoint and endpoint-info, instead of individual fields from the
endpoint.
- Remove redundant nil-check, as it's already checked on the call-side
in `buildDetailedNetworkResources`, which skips endpoints without info.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the function return the constructed network.IPAM instead of applying
it to a network struct, and rename it to "buildIPAMResources".
Rewrite the function itself:
- Use struct-literals where possible to make it slightly more readable.
- Use a boolean (hasIPv4Config, hasIPv6Config) for both IPv4 and IPv6 to
check whether the IPAM-info needs to be added. This makes the logic the
same for both, and makes the processing order-independent. This also
allows for the `network.IpamInfo()` call to be skipped if it's not needed.
- Change the order of the "ipv4 config / ipv4 info" and "ipv6 config / ipv6 info"
blocks to make it slightly clearer (and to allow skipping the aforementioned
call to `network.IpamInfo()`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move the length-check into the function, and change the code to
be a basic type-cast, as networkdb.PeerInfo and network.PeerInfo
are identical types.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This variable was only accessed from within LocalRegistry methods, but
due to being a package-level variable, tests had to deal with setting
and resetting it.
Move it to be a field scoped to the LocalRegistry. This simplifies the
tests, and to make this more transparent, also remove the "Setup()"
helper (which wasn't marked as a t.Helper()).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The client's transport can only be set by newClientWithTransport, which
is not exported, and always uses a transport.HTTPTransport.
However, requestFactory is mocked in one of the tests, so keep the interface,
but make it a local, non-exported one.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The interface is not consumed anywhere, and only non-exported functions
produced one, so we can remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was exported, but never mutated outside of the package, and was
effectively a rather "creative" way to define a method on LocalRegistry.
While un-exporting it, also store these paths in a field, instead of constructing
them on every call, as the results won't change during the lifecycle of the
LocalRegistry.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Split the exported SpecsPaths from the platform-specific implementations,
so that documentation can be maintained in a single location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since 0e5eaf8ee3, these implementations
were fully identical, so removing the duplicate, and move it to a
platform-agnostic file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use gotest.tools assertions
- use consts and struct-literals where possible
- use assert.Check instead of t.Fatal() where possible
- fix some unhandled errors
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was used by the postNetworkConnect() handler, but is handled
by the backend itself, starting with d63a5a1ff5.
Since that commit, this function was no longer used, so we can remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The rootChain variable that the Key function references is a
package-global slice. As the append() built-in may append to the slice's
backing array in place, it is theoretically possible for the temporary
slices in concurrent Key() calls to share the same backing array, which
would be a data race. Thankfully in my tests (on Go 1.20.6)
cap(rootChain) == len(rootChain)
held true, so in practice a new slice is always allocated and there is
no race. But that is a very brittle assumption to depend upon, which
could blow up in our faces at any time without warning. Rewrite the
implementation in a way which cannot lead to data races.
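A minimal sketch of a race-free shape, assuming rootChain is the
package-level []string and that key elements are joined with "/" (details are
illustrative):
```
func Key(key ...string) string {
	var b strings.Builder
	for _, s := range rootChain {
		b.WriteString(s)
		b.WriteString("/")
	}
	for _, s := range key {
		b.WriteString(s)
		b.WriteString("/")
	}
	// No append() on rootChain, so concurrent calls can never end up sharing
	// (and overwriting) the same backing array.
	return b.String()
}
```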
Signed-off-by: Cory Snider <csnider@mirantis.com>
Before this change, integration tests would fail fast and not execute all
test suites when one suite fails.
Change this behavior to be opt-in, enabled by the TEST_INTEGRATION_FAIL_FAST
variable.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
It only had a single implementation, so let's remove the interface.
While changing, also renaming;
- datastore.DataStore -> datastore.Store
- datastore.NewDataStore -> datastore.New
- datastore.NewDataStoreFromConfig -> datastore.FromConfig
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were only used internally, and ErrConntrackNotConfigurable was not used
as a sentinel error anywhere. Remove ErrConntrackNotConfigurable, and change
IsConntrackProgrammable to return an error instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
arrangeUserFilterRule uses the package-level [`ctrl` variable][1], which
holds a reference to a controller instance. This variable is set by
[`setupArrangeUserFilterRule()`][2], which is called when initializing
a controller ([`libnetwork.New`][3]).
In normal circumstances, there would only be one controller, created during
daemon startup, and the instance of the controller would be the same as
the controller that `NewNetwork` is called from, but there's no protection
for the `ctrl` variable, and various integration tests create their own
controller instance.
The global `ctrl` var was introduced in [54e7900fb89b1aeeb188d935f29cf05514fd419b][4],
with the assumption that [only one controller could ever exist][5].
This patch tries to reduce uses of the `ctrl` variable, and as we're calling
this code from inside a method on a specific controller, we inline the code
and use that specific controller instead.
[1]: 37b908aa62/libnetwork/firewall_linux.go (L12)
[2]: 37b908aa62/libnetwork/firewall_linux.go (L14-L17)
[3]: 37b908aa62/libnetwork/controller.go (L163)
[4]: 54e7900fb8
[5]: https://github.com/moby/libnetwork/pull/2471#discussion_r343457183
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in libnetwork through 50964c9948
and, based on the name of the function and its signature, I think it
was meant to be a test. This patch refactors it to be one.
Changing it into a test made it slightly broken:
go test -v -run TestErrorInterfaces
=== RUN TestErrorInterfaces
errors_test.go:15: Failed to detect err network not found is of type BadRequestError. Got type: libnetwork.ErrNoSuchNetwork
errors_test.go:15: Failed to detect err endpoint not found is of type BadRequestError. Got type: libnetwork.ErrNoSuchEndpoint
errors_test.go:42: Failed to detect err unknown driver "" is of type ForbiddenError. Got type: libnetwork.NetworkTypeError
errors_test.go:42: Failed to detect err unknown network id is of type ForbiddenError. Got type: *libnetwork.UnknownNetworkError
errors_test.go:42: Failed to detect err unknown endpoint id is of type ForbiddenError. Got type: *libnetwork.UnknownEndpointError
--- FAIL: TestErrorInterfaces (0.00s)
FAIL
This was because some errors were tested twice, but for the wrong type
(`NetworkTypeError`, `UnknownNetworkError`, `UnknownEndpointError`).
Moving them to the right test left no test-cases for `types.ForbiddenError`,
so I added `ActiveContainerError` to not make that part of the code feel lonely.
Other failures were because some errors were changed from `types.BadRequestError`
to a `types.NotFoundError` error in commit ba012a703a,
so I moved those to the right part.
Before this patch:
go test -v -run TestErrorInterfaces
=== RUN TestErrorInterfaces
--- PASS: TestErrorInterfaces (0.00s)
PASS
ok github.com/docker/docker/libnetwork 0.013s
After this patch:
go test -v -run TestErrorInterfaces
=== RUN TestErrorInterfaces
--- PASS: TestErrorInterfaces (0.00s)
PASS
ok github.com/docker/docker/libnetwork 0.013s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit ffd75c2e0c updated this function to
set up the DOCKER-USER chain for both iptables and ip6tables, however the
function would return early if a failure happened (instead of continuing
with the next iptables version).
This patch extracts setting up the chain to a separate function, and updates
arrangeUserFilterRule to log the failure as a warning, but continue with
the next iptables version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These functions were mostly identical, except for iptables being enabled
by default (unless explicitly disabled by config).
Rewrite the function into an enabledIptablesVersions function, which returns
the list of iptables-versions that are enabled for the controller. This prevents
having to acquire a lock twice, and simplifies arrangeUserFilterRule, which
can now just iterate over the enabled versions.
Also moving this function to a linux-only file, as other platforms don't have
the iptables types defined.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
"ro-non-recursive", "ro-force-recursive", and "rro" are
now removed from the legacy mount API.
CLI may still support them via the new mount API (if we want).
Follow-up to PR 45278
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Unfortunately also brings in golang.org/x/tools as a dependency, due to
go-winio using a "tools.go" file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove gotest.tools dependency as it was only used in one test,
and only for a trivial check
- use t.TempDir()
- rename vars that collided with package types
- don't use un-keyed structs
- explicitly ignore some errors to please linters
- use iotest.ErrReader
- TestReadCloserWrapperClose: verify reading works before closing :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.8
full diff: https://github.com/opencontainers/runc/compare/v1.1.7...v1.1.8
This is the eighth patch release of the 1.1.z release branch of runc.
The most notable change is the addition of RISC-V support, along with a
few bug fixes.
- Support riscv64.
- init: do not print environment variable value.
- libct: fix a race with systemd removal.
- tests/int: increase num retries for oom tests.
- man/runc: fixes.
- Fix tmpfs mode opts when dir already exists.
- docs/systemd: fix a broken link.
- ci/cirrus: enable some rootless tests on cs9.
- runc delete: call systemd's reset-failed.
- libct/cg/sd/v1: do not update non-frozen cgroup after frozen failed.
- CI: bump Fedora, Vagrant, bats.
- .codespellrc: update for 2.2.5.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- jsonpb: accept 'null' as a valid representation of NullValue in unmarshal
The canonical JSON representation for NullValue is JSON "null".
full diff: https://github.com/golang/protobuf/compare/v1.5.2...v1.5.3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- unmarshal does not return nil object when value is nil
- fixes "ctr tasks checkpoint returns invalid task checkpoint option for io.containerd.runc.v2: unknown"
full diff: https://github.com/containerd/typeurl/compare/v2.1.0...v2.1.1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change adds leases for all the content that will be exported; once
the image(s) are exported the lease is removed, thus letting
containerd's GC do its job if needed. This fixes the case where
someone would remove an image that is still being exported.
This fixes the TestAPIImagesSaveAndLoad cli integration test.
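A minimal sketch of the pattern, assuming `client` is a *containerd.Client
and `exportImages` is a hypothetical helper doing the actual export:
```
ctx, done, err := client.WithLease(ctx,
	leases.WithRandomID(),
	leases.WithExpiration(1*time.Hour),
)
if err != nil {
	return err
}
// Release the lease when the export is done, so containerd's GC is free to
// clean up the content if nothing else references it.
defer done(ctx)

if err := exportImages(ctx, out, refs...); err != nil {
	return err
}
```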
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
When resolving a reference that is both a Named and Digested, it could
be resolved to an image that has the same digest, but completely
different repository name.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
If image name is already an untagged digested reference, don't produce
additional digested ref.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Post-f8c0d92a22bad004cb9cbb4db704495527521c42, BUILDKIT_REPO doesn't
really do what it claims to. Instead, don't allow overloading since the
import path for BuildKit is always the same, and make clear the
provenance of values when generating the final variable definitions.
We also better document the script, and follow some best practices for
both POSIX sh and Bash.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The imgSvcConfig is defined locally, and discarded if an error occurs,
so no need to use the intermediate vars here.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The interface is defined on the receiver-side, and returning concrete
types makes it more transparent what we're creating.
As these namespaced wrappers were not exported, let's inline them, so
that it's clear at a glance what it's doing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The containerdCli was somewhat confusing (is it the CLI?); let's rename
to make it match what it is :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
gotest.tools has an init() which registers a '-update' flag;
a80f057529/internal/source/update.go (L21-L23)
The quota helper contains a testhelpers file, which is meant for usage
in (integration) tests, but as it's in the same package as production
code, would also trigger the gotest.tools init.
This patch removes the gotest.tools code from this file.
Before this patch:
$ (exec -a libnetwork-setkey "$(which dockerd)" -help)
Usage of libnetwork-setkey:
-exec-root string
docker exec root (default "/run/docker")
-update
update golden values
With this patch applied:
$ (exec -a libnetwork-setkey "$(which dockerd)" -help)
Usage of libnetwork-setkey:
-exec-root string
docker exec root (default "/run/docker")
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add `-f` to output nothing to tar if the curl fails, and `-S` to report
errors if they happen.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Use bind-mounts instead of a `COPY` for cli.sh, and use `COPY --link`
for rootlesskit's build stage.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Use a non-slash escape sequence to support mirrors with a path
component, and do not unconditionally replace the mirror in
Dockerfile.simple.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
This aligns `docker build` as invoked by the Makefile with both `docker
buildx bake` as invoked by the Makefile and directly by the user.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
- Fix copy on windows plus tests
- Fix follow symlinkResolver on Windows
- Implement proper renameFile on Windows
- Fix potential nil pointer dereference
- Use RunWithPrivileges
- Fix leaking file handle
- handle mkdir race for diskwriter
- walk: avoid stat()'ing files unnecessarily
- ci: fix freebsd workflow
- update to Go 1.20
full diff: fb433841cb...36ef4d8c0d
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
insecure-registries supports using CIDR notation, however, buildkit in
moby was not respecting these. We can update the RegistryHosts function
to support this by inserting the correct host into the lookup map if
it's explicitly marked as insecure.
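A minimal sketch of the kind of check involved, assuming `insecureCIDRs`
holds the parsed *net.IPNet values from the daemon's insecure-registries
configuration (names are illustrative):
```
func matchesInsecureCIDR(host string, insecureCIDRs []*net.IPNet) bool {
	hostname := host
	if h, _, err := net.SplitHostPort(host); err == nil {
		hostname = h
	}
	ip := net.ParseIP(hostname)
	if ip == nil {
		return false // not an IP literal; CIDR matching does not apply
	}
	for _, ipnet := range insecureCIDRs {
		if ipnet.Contains(ip) {
			return true // treat this registry host as insecure
		}
	}
	return false
}
```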
Signed-off-by: Justin Chadwell <me@jedevc.com>
The RegistryHosts lookup function is used by both BuildKit and by the
containerd snapshotter. However, this function differs in behaviour from
the config parser for the RegistryConfig:
- The protocol for insecure registries is treated as significant by
RegistryHosts, while the RegistryConfig strips this information.
- RegistryConfig validates and deduplicates mirrors.
- RegistryConfig does not parse the insecure-registries as URLs, which
can lead to parsing opaque URLs as was possible by the RegistryHosts
function.
This patch updates the lookup function to ensure consistency.
Signed-off-by: Justin Chadwell <me@jedevc.com>
- remove some intermediate variables
- explicitly return "nil" if there's no error
- remove redundant check for response-headers being nil
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The driver-configurations are only set when creating a new controller,
using the `config.OptionDriverConfig()` option that can be passed to
`New()`, and used as "read-only" after that.
As no other paths set these options, the only type used
for per-driver options is a `map[string]interface{}`, so we can change
the type from `map[string]interface{}` to a `map[string]map[string]interface{}`
(or its "modern" variant: `map[string]map[string]any`), so that it's
no longer needed to cast the type before use.
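A minimal sketch of the new shape, with illustrative driver and option names:
```
// One nested map per driver; callers can index by driver name without a
// type assertion.
driverCfg := map[string]map[string]any{
	"bridge": {
		"illustrative-option": true,
	},
}
bridgeOpts := driverCfg["bridge"] // already a map[string]any; no cast needed
_ = bridgeOpts
```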
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't fail early if we can still test more, and be slightly more strict
in what error we're looking for.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test already creates instances for each ip-version, so let's
re-use them. While changing, also use assert.Check to not fail early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
New() allows for driver-options to be passed using the config.OptionDriverConfig.
Update the test to not manually mutate the controller's configuration after
creating.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Not critical, but when used from ChainInfo, we had to construct an IPTable
based on the version of the ChainInfo, which then only used the version
we passed to get the right loopback.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make some variables local to the if-branches to be slightly more idiomatic,
and to make clear they're only used in that branch.
Move the bestEffortLock locking later in IPtable.raw(), because that function
could return before the lock was even needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that all consumers of these functions are passing non-empty values,
let's validate that no empty strings for either chain or table are passed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's only used internally, and it was last used in commit:
0220b06cd6
But moved into the iptables package in this commit:
998f3ce22c
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was not used for "Config", but for Networks and Endpoints.
Having this utility made it look like more than it was, and the related
test was effectively testing stdlib.
Abstracting the validation also was hiding that, while validation does
not allow "empty" names, it happily allows leading/trailing whitespace,
and does not remove that before creating networks or endpoints;
docker network create "bridge "
docker network create "bridge "
docker network create "bridge "
docker network create " bridge "
docker network create " bridge "
docker network create " bridge"
docker network ls --filter driver=bridge
NETWORK ID NAME DRIVER SCOPE
d4d53210f185 bridge bridge local
e9afba0d99de bridge bridge local
69fb7a7ba67c bridge bridge local
a452bf065403 bridge bridge local
49d96c59061d bridge bridge local
8eae1c4be12c bridge bridge local
86dd65b881b9 bridge bridge local
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the NewGeneric utility as it was not used anywhere, except for
in tests.
Also "modernize" the type, and use `any` instead of `interface{}`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Use a more modern approach to check error-types
- Touch-up grammar of the error-message
- Remove redundant "nil" check for errors, as it's never nil at that point.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
tlsconfig.Client() does various things, including reading certs and
checking them. So we may as well return early if we're not gonna be
able to use the config.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- don't use un-keyed structs
- use http consts where possible
- use errors.As instead of manually checking the error-type
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use Client.buildRequest instead of a local copy of the same logic so
that we're using the same logic, and there's less chance of diverging.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.20.6 (released 2023-07-11) includes a security fix to the net/http package,
as well as bug fixes to the compiler, cgo, the cover tool, the go command,
the runtime, and the crypto/ecdsa, go/build, go/printer, net/mail, and text/template
packages. See the Go 1.20.6 milestone on our issue tracker for details.
https://github.com/golang/go/issues?q=milestone%3AGo1.20.6+label%3ACherryPickApproved
Full diff: https://github.com/golang/go/compare/go1.20.5...go1.20.6
These minor releases include 1 security fix following the security policy:
net/http: insufficient sanitization of Host header
The HTTP/1 client did not fully validate the contents of the Host header.
A maliciously crafted Host header could inject additional headers or entire
requests. The HTTP/1 client now refuses to send requests containing an
invalid Request.Host or Request.URL.Host value.
Thanks to Bartek Nowotarski for reporting this issue.
Includes security fixes for [CVE-2023-29406][1] and Go issue https://go.dev/issue/60374
[1]: https://github.com/advisories/GHSA-f8f7-69v5-w4vx
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For local communications (npipe://, unix://), the hostname is not used,
but we need a valid and meaningful hostname.
The current code used the socket path as hostname, which gets rejected by
go1.20.6 and go1.19.11 because of a security fix for [CVE-2023-29406][1],
which was implemented in https://go.dev/issue/60374.
Prior versions of Go would clean the host header, and strip slashes in the
process, but go1.20.6 and go1.19.11 no longer do, and reject the host
header.
Before this patch, tests would fail on go1.20.6:
=== FAIL: pkg/authorization TestAuthZRequestPlugin (15.01s)
time="2023-07-12T12:53:45Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 1s"
time="2023-07-12T12:53:46Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 2s"
time="2023-07-12T12:53:48Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 4s"
time="2023-07-12T12:53:52Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 8s"
authz_unix_test.go:82: Failed to authorize request Post "http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq": http: invalid Host header
[1]: https://github.com/advisories/GHSA-f8f7-69v5-w4vx
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For local communications (npipe://, unix://), the hostname is not used,
but we need a valid and meaningful hostname.
The current code used the client's `addr` as hostname in some cases, which
could contain the path for the unix-socket (`/var/run/docker.sock`), which
gets rejected by go1.20.6 and go1.19.11 because of a security fix for
[CVE-2023-29406][1], which was implemented in https://go.dev/issue/60374.
Prior versions of Go would clean the host header, and strip slashes in the
process, but go1.20.6 and go1.19.11 no longer do, and reject the host
header.
This patch introduces a `DummyHost` const, and uses this dummy host for
cases where we don't need an actual hostname.
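A minimal sketch of the idea, with an illustrative placeholder value; for
unix:// and npipe:// the Host header is not used for routing, so any
syntactically valid placeholder works:
```
// The transport dials the local socket regardless of the request's host,
// but go1.20.6 / go1.19.11 now require the Host header to be a valid value.
const dummyHost = "api.moby.localhost"

req, err := http.NewRequest(http.MethodGet, "http://"+dummyHost+"/_ping", nil)
if err != nil {
	return err
}
```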
Before this patch (using go1.20.6):
make GO_VERSION=1.20.6 TEST_FILTER=TestAttach test-integration
=== RUN TestAttachWithTTY
attach_test.go:46: assertion failed: error is not nil: http: invalid Host header
--- FAIL: TestAttachWithTTY (0.11s)
=== RUN TestAttachWithoutTTy
attach_test.go:46: assertion failed: error is not nil: http: invalid Host header
--- FAIL: TestAttachWithoutTTy (0.02s)
FAIL
With this patch applied:
make GO_VERSION=1.20.6 TEST_FILTER=TestAttach test-integration
INFO: Testing against a local daemon
=== RUN TestAttachWithTTY
--- PASS: TestAttachWithTTY (0.12s)
=== RUN TestAttachWithoutTTy
--- PASS: TestAttachWithoutTTy (0.02s)
PASS
[1]: https://github.com/advisories/GHSA-f8f7-69v5-w4vx
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Combine TestAttachWithTTY and TestAttachWithoutTTy to a single test using sub-tests
- Set up and tear-down the test-environment once
- Remove redundant client.ContainerRemove, as it's taken care of by testEnv.Clean()
- Run both tests in parallel
make TEST_FILTER=TestAttach DOCKER_GRAPHDRIVER=overlay2 TESTDEBUG=1 test-integration
Loaded image: busybox:latest
Loaded image: busybox:glibc
Loaded image: debian:bullseye-slim
Loaded image: hello-world:latest
Loaded image: arm32v7/hello-world:latest
INFO: Testing against a local daemon
=== RUN TestAttach
=== RUN TestAttach/without_TTY
=== PAUSE TestAttach/without_TTY
=== RUN TestAttach/with_TTY
=== PAUSE TestAttach/with_TTY
=== CONT TestAttach/without_TTY
=== CONT TestAttach/with_TTY
--- PASS: TestAttach (0.00s)
--- PASS: TestAttach/without_TTY (0.03s)
--- PASS: TestAttach/with_TTY (0.03s)
PASS
DONE 3 tests in 1.347s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Calling the function returned from setupTest (which calls testEnv.Clean) in
a defer block inside a test that spawns parallel subtests caused the
cleanup function to be called before any of the subtests did anything.
Change the defer expressions to use `t.Cleanup` instead, so the cleanup is
only called after all subtests have also finished.
This only changes tests which have parallel subtests.
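A minimal sketch of the difference, assuming a hypothetical setupTest helper
that returns a teardown function calling testEnv.Clean:
```
func TestSomething(t *testing.T) {
	teardown := setupTest(t)
	// "defer teardown()" would run as soon as TestSomething returns, which is
	// before any paused parallel subtest executes; t.Cleanup waits until all
	// subtests (including parallel ones) have finished.
	t.Cleanup(teardown)

	t.Run("subtest", func(t *testing.T) {
		t.Parallel()
		// ...
	})
}
```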
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- use assert.Check() where possible to not fail early
- improve checks for error-types
- rename "testURL" var to be more descriptive, and use a const
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests are testing timeouts and take a long time to run. Run the tests
in parallel, so that the test-suite takes less time to run.
Before:
ok github.com/docker/docker/pkg/plugins 34.013s
After:
ok github.com/docker/docker/pkg/plugins 17.945s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Refactor setupRemotePluginServer() to be a helper, and to spin up a test-
server for each test instead of sharing the same instance between tests.
This allows the tests to be run in parallel without stepping on each-other's
toes (tearing down the server).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's convenient to have in the dev container when debugging issues which
reproduce consistently when deploying containers through compose.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We can't upload the same file in a matrix, so generate the metadata in the
prepare job instead. This also fixes the wrong bake metadata file in the
merge job.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Use http.Header, which is more descriptive on intent, and we're already
importing the package in the client. Removing the "header" type also fixes
various locations where the type was shadowed by local variables named
"headers".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "version" header was added in c0afd9c873,
but used the wrong information to get the API version.
This issue was fixed in a9d20916c3, which switched
the API handler code to get the API version from the context. That change is part
of Docker Engine 20.10 (API v1.30 and up)
This patch updates the code to only set the header on API v1.29 and older, as it's
not used by newer API versions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
With this change, the API will now return a 403 instead of a 500 when
trying to create an overlay network on a non-manager node.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The commit befff0e13f inadvertently
disabled the error returned when trying to create an overlay network on
a node which is not part of a Swarm cluster.
Since commit e3708a89cc the overlay
netdriver returns the error: `no VNI provided`.
This commit reinstates the original error message by checking if the node
is a manager before calling libnetwork's `controller.NewNetwork()`.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This patch contains some optimizations I still had stashed when working
on eaa9494b71.
- Use the bytes package for handling the output of "lsof", instead of
converting to a string.
- Count the number of newlines in the output, instead of splitting the
output into a slice of strings. We're only interested in the number
of lines in the output.
- Use lsof's -F option to only print the file-descriptor for each line,
as we don't need other information.
- Use the -l, -n, and -P options to omit converting usernames, host names,
and port numbers.
From the [LSOF(8)][1] man-page:
-l This option inhibits the conversion of user ID numbers to
login names. It is also useful when login name lookup is
working improperly or slowly.
-n This option inhibits the conversion of network numbers to host
names for network files. Inhibiting conversion can make lsof run faster.
It is also useful when host name lookup is not working properly.
-P This option inhibits the conversion of port numbers to port names for network files.
Inhibiting the conversion can make lsof run a little faster.
It is also useful when host name lookup is not working properly.
Output looks something like;
lsof -lnP -Ff -p 39849
p39849
fcwd
ftxt
ftxt
f0
f1
f2
f3
f4
f5
f6
f7
f8
f9
f10
f11
Before/After:
BenchmarkGetTotalUsedFds-10 122 9479384 ns/op 10816 B/op 63 allocs/op
BenchmarkGetTotalUsedFds-10 154 7814697 ns/op 7257 B/op 60 allocs/op
[1]: https://opensource.apple.com/source/lsof/lsof-49/lsof/lsof.man.auto.html
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- return an errdefs.System if we fail to decode the registry's response
- use strconv.Itoa instead of fmt.Sprintf
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon sleeps for 15 seconds at start up when the API binds to a TCP
socket with no TLS certificate set. That's what the hack/make/run script
does, but it doesn't explicitly disable tls, thus we're experiencing
this annoying delay every time we use this script.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Golang map iteration order is not guaranteed, so in some cases the built slice has its output out of order as well. This means that testing for exact warning messages in docker build output would result in random test failures, making it more annoying for end-users to test against this functionality.
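One way to make such output deterministic (a sketch, not necessarily the
exact fix applied here) is to sort the slice built from the map before
emitting it:
```
// warnings was built by iterating over a map, so its order is random;
// sorting makes the build output stable and testable.
sort.Strings(warnings)
```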
Signed-off-by: Jose Diaz-Gonzalez <email@josediazgonzalez.com>
...that Swarmkit no longer needs now that it has been migrated to use
the new-style driver registration APIs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The only remaining user is Swarmkit, which now has its own private copy
of the package tailored to its needs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The daemon.lazyInitializeVolume() function only handles restoring Volumes
if a Driver is specified. The Container's MountPoints field may also
contain other kind of mounts (e.g., bind-mounts). Those were ignored, and
don't return an error; 1d9c8619cd/daemon/volumes.go (L243-L252C2)
However, the prepareMountPoints() assumed each MountPoint was a volume,
and logged an informational message about the volume being restored;
1d9c8619cd/daemon/mounts.go (L18-L25)
This would panic if the MountPoint was not a volume;
github.com/docker/docker/daemon.(*Daemon).prepareMountPoints(0xc00054b7b8?, 0xc0007c2500)
/root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/mounts.go:24 +0x1c0
github.com/docker/docker/daemon.(*Daemon).restore.func5(0xc0007c2500, 0x0?)
/root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/daemon.go:552 +0x271
created by github.com/docker/docker/daemon.(*Daemon).restore
/root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/daemon.go:530 +0x8d8
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x564e9be4c7c0]
This issue was introduced in 647c2a6cdd
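A minimal sketch of the guard, assuming non-volume mounts (such as
bind-mounts) have a nil Volume field on the mount point:
```
for _, m := range container.MountPoints {
	if m.Volume == nil {
		// Not a volume (e.g. a bind-mount); nothing to restore or log here.
		continue
	}
	// ... restore the volume reference and log the informational message ...
}
```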
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently moby drops the effective and permitted capability sets before the
entrypoint is executed.
This means that, in combination with no-new-privileges, file capabilities
stop working for non-root containers.
This is undesired, as the usability of such containers is harmed compared
to running root containers.
This commit therefore sets the effective/permitted sets, in order to allow
the use of file capabilities (in combination with no-new-privileges) or
libcap(3)/prctl(2) (without it).
For no-new-privileges the container will be able to obtain the capabilities
that are requested.
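A minimal sketch of the idea in terms of the OCI runtime spec, assuming
`caps` holds the container's resolved capability list and `s` is the
generated *specs.Spec:
```
// Populate the effective and permitted sets (not just the bounding set), so
// file capabilities and libcap(3)/prctl(2) keep working for non-root users,
// including when NoNewPrivileges is set.
s.Process.Capabilities.Bounding = caps
s.Process.Capabilities.Permitted = caps
s.Process.Capabilities.Effective = caps
```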
Signed-off-by: Luboslav Pivarc <lpivarc@redhat.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
...which ignore the config argument. Notably, none of the network
drivers referenced by Swarmkit use config, which is good as Swarmkit
unconditionally passes nil for the config when registering drivers.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Albin is currently a curator, has been contributing for various years prior
to that, and has taken on the daunting task to work on Moby's networking stack.
Albin would be a great addition to our list of maintainers and to allow him
to perform his work in these areas in a more official capacity.
I nominated Albin as maintainer, and votes passed, so opening a PR to
make it official.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Kevin is a maintainer for BuildKit, Buildx, and Docker's official GitHub
actions (among others), has been our "in-house GitHub actions expert"
for a long time, and has made significant contributions to the integration
with BuildKit, and to improve our build pipeline(s).
Kevin would be a great addition to our list of maintainers and to allow him
to perform his work in these areas in a more official capacity.
I nominated Kevin as maintainer, and votes passed, so opening a PR to
make it official.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Laura has done significant work on the containerd integration, helping
triage and fixing bugs, both in this repository, containerd, and the
docker CLI, and would make a great addition to our list of maintainers.
I nominated Laura as maintainer, and votes passed, so opening a PR to
make it official.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds an additional interval to be used by healthchecks during the
start period.
Typically when a container is just starting you want to check if it is
ready more quickly than a typical healthcheck might run. Without this
users have to balance between running healthchecks to frequently vs
taking a very long time to mark a container as healthy for the first
time.
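A minimal sketch of how such a configuration could look with the API's
container.HealthConfig type, assuming the new field is named StartInterval as
described (values are illustrative):
```
hc := &container.HealthConfig{
	Test:          []string{"CMD-SHELL", "curl -fsS http://localhost/ || exit 1"},
	Interval:      30 * time.Second, // steady-state probe interval
	Timeout:       5 * time.Second,
	Retries:       3,
	StartPeriod:   2 * time.Minute, // grace period while the app boots
	StartInterval: 2 * time.Second, // probe more often during the start period
}
```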
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fixes a case where on Docker For Mac if you need to bind mount the
bundles dir (e.g. to get test results back).
The unix socket does not work over oxsfs, so instead we put it in a
tmpfs.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
1. On failed start tail the daemon logs
2. Exposes generic tailing functions to make test debugging simpler
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
setupBridgeNetFiltering:
- Indicate that the bridgeInterface argument is unused (but it's needed
to satisfy the signature).
- Return instead of nullifying the err. Still not great, but I thought it
was very slightly more logical thing to do.
checkBridgeNetFiltering:
- Remove unused argument, and scope ipVerName to the branch where it's
used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
initConnection was effectively just part of the constructor; it was not
used elsewhere. Merge the two functions to simplify things a bit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This const was added in 8301dcc6d7, before
being moved to libnetwork, and moved back, but it was never used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove local bridgeName variable that shadowed the const, but
used the same value
- remove some redundant `var` declarations, and changed fixed
values to a const
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
None of these errors were string-matched anywhere, so let's change them
to be non-capitalized, as they should.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
looks like this error was added in 1cbdaebaa1,
and later moved to libnetwork in 44c96449c2
which also updated the description to something that doesn't match what
it means.
In either case, this error was never used as a special / sentinel error,
so we can just use a regular error return.
While at it, I also lower-cased the error-message; it's not string-matched
anywhere, so we can update it to make linters more happy.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- validate input variables before constructing the ChainInfo
- only construct the ChainInfo if things were successful
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Clarify that the argument to New is an exclusive upper bound.
Correct the documentation for SetAnyInRange: the end argument is
inclusive rather than exclusive.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The idm package wraps bitseq.Handle to provide an offset and
synchronization. bitseq.Handle wraps bitmap.Bitmap to provide
persistence in a datastore. As no datastore is passed and the offset is
zero, the idm.Idm instance is nothing more than a concurrency-safe
wrapper around a bitmap.Bitmap with differently-named methods. Switch
over to using bitmap.Bitmap directly, using the ovmanager driver's mutex
for concurrency control.
Hold the driver mutex for the entire duration that VXLANs are being
assigned to the new network. This makes allocating VXLANs for a network
an atomic operation.
Signed-off-by: Cory Snider <csnider@mirantis.com>
In the network.obtainVxlanID() method, the mutex only guards a local
variable and a function argument. Locking is therefore unnecessary.
The network.releaseVxlanID() method is only called in two contexts:
driver.NetworkAllocate(), where the network struct is a local variable
and network.releaseVxlanID() is only called in failure code-paths in
which the network does not escape; and driver.NetworkFree(), while the
driver mutex is held. Locking is therefore unnecessary.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Multiple daemons starting/running concurrently can collide with each
other when editing iptables rules. Most integration tests which opt into
parallelism and start daemons work around this problem by starting the
daemon with the --iptables=false option. However, some of the tests
neglect to pass the option when starting or restarting the daemon,
resulting in those tests being flaky.
Audit the integration tests which call t.Parallel() and (*Daemon).Stop()
and add --iptables=false arguments where needed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestClientWithRequestTimeout has been observed to flake in CI. The
timing in the test is quite tight, only giving the client a 10ms window
to time out, which could potentially be missed if the host is under
load and the goroutine scheduling is unlucky. Give the client a full
five seconds of grace to time out before failing the test.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Now that the MTU field was moved, this function only needs the BridgeConfig,
which contains all options for the default "bridge" network.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option is only used for the default bridge network; let's move the
field to that struct to make it clearer what it's used for.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The --mtu option is only used for the default "bridge" network on Linux.
On Windows, the flag is available, but ignored. As this option has been
available for a long time, and was always silently ignored, deprecating
or removing it would be a breaking change (and perhaps it's possible to
support it in future).
This patch:
- hides the option on Windows binaries
- logs a warning if the option is set to any non-zero value other than
the default on a Windows binary
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The OptionLocalKVProvider, OptionLocalKVProviderURL, and OptionLocalKVProviderConfig
options were only used in tests, so un-export them, and move them to the
test-files.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was implemented in dd4950f36d
which added a "key" field, but that field was never used anywhere, and
still appears unused.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `store.Watch()` was only used in `Controller.processEndpointCreate()`,
and skipped if the store was not "watchable" (`store.Watchable()`).
Whether a store is watchable depends on the store's datastore.scope;
local stores are not watchable;
func (ds *datastore) Watchable() bool {
return ds.scope != LocalScope
}
datastore is only initialized in two locations, and both locations set the
scope field to LocalScope:
datastore.newClient() (also called by datastore.NewDataStore()):
3e4c9d90cf/libnetwork/datastore/datastore.go (L213)
datastore.NewTestDataStore() (used in tests);
3e4c9d90cf/libnetwork/datastore/datastore_test.go (L14-L17)
Furthermore, the backing BoltDB kvstore does not implement the Watch()
method;
3e4c9d90cf/libnetwork/internal/kvstore/boltdb/boltdb.go (L464-L467)
Based on the above;
- our datastore is never Watchable()
- so datastore.Watch() is never used
This patch removes the Watchable(), Watch(), and RestartWatch() functions,
as well as the code handling watching.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The sequential field determined whether a lock was needed when storing
and retrieving data. This field was always set to true, with the exception
of NewTestDataStore() in the tests.
This field was added in a18e2f9965
to make locking optional for non-local scoped stores. Such stores are no
longer used, so we can remove this field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the code slightly more idiomatic; remove some "var" declarations,
remove some intermediate variables and redundant error-checks, and remove
the filePerm const.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This boolean was not used anywhere, so we can remove it. Also cleaning up
the implementation a bit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The WriteOptions struct was only used to set the "IsDir" option. This option
was added in d635a8e32b
and was only supported by the etcd libkv store.
The BoltDB store does not support this option, making the WriteOptions
struct fully redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The only remaining kvstore is BoltDB, which doesn't use TLS connections
or authentication, so we can remove these options.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a string-literal to reduce escaped quotes, which makes for easier grepping.
While at it, also changed http -> https to keep some linters at bay.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prevent potential congestion when many concurrent requests happen on
the /info endpoint. It's worth noting that with this change,
requests to the endpoint while another request is still in flight
will share the results, hence might be slightly incorrect (for example,
the output includes SystemTime, which may now be incorrect).
Assuming that under normal circumstances, requests will still
happen fast enough to not be shared, this may not be a problem,
but we could decide to update specific fields to not be shared.
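A minimal sketch of the sharing pattern using golang.org/x/sync/singleflight;
the field and helper names here are hypothetical:
```
// infoGroup is a singleflight.Group (hypothetical field name); concurrent
// callers with the same key share one in-flight computation and its result.
v, err, _ := daemon.infoGroup.Do("system-info", func() (interface{}, error) {
	return daemon.collectSystemInfo(ctx), nil // hypothetical helper
})
if err != nil {
	return nil, err
}
// v holds the shared result; type-assert it back to the concrete info type.
```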
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Got a linter warning on this one, and I don't think eventFilter() was
intentionally using a value (not pointer).
> Struct containerConfig has methods on both value and pointer receivers.
> Such usage is not recommended by the Go Documentation
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These aliases were not needed, and only used in a couple of places,
which made it inconsistent, so let's use the import without aliasing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
They're not used anywhere, so let's remove them; better to have
a compile error than a panic at runtime.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Add the field as a "deprecated" field in the API type.
- Don't error when failing to parse the options, but produce a warning
instead, because the client won't be able to fix issues in the daemon
configuration. This was unlikely to happen, as the daemon probably
would fail to start with an invalid config, but just in case.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ClusterStore` and `ClusterAdvertise` fields were deprecated in commit
616e64b42f (and would no longer be included in
the `/info` API response), and were fully removed in 24.0.0 through commit
68bf777ece
This patch removes the fields from the swagger file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ClusterStore` and `ClusterAdvertise` fields were deprecated in commit
616e64b42f (and would no longer be included in
the `/info` API response), and were fully removed in 24.0.0 through commit
68bf777ece
This patch removes the fields from the swagger file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ClusterStore` and `ClusterAdvertise` fields were deprecated in commit
616e64b42f (and would no longer be included in
the `/info` API response), and were fully removed in 24.0.0 through commit
68bf777ece
This patch removes the fields from the swagger file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is an indirect dependency for github.com/fluent/fluent-logger-golang,
which does not yet use a go.mod. Update the dependency to the latest patch
release, which contains some fixes, and updates for newer go versions;
full diff: https://github.com/tinylib/msgp/compare/v1.1.6...v1.1.8
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/cgroups/compare/v3.0.1...v3.0.2
relevant changes:
- cgroup2: only enable the cpuset controller if cpus or mems is specified
- cgroup1 delete: proceed to the next subsystem when a cgroup is not found
- Cgroup2: Reduce allocations for manager.Stat
- Improve performance by for pid stats (cgroups1) re-using readuint
- Reduce allocs in ReadUint64 by pre-allocating byte buffer
- cgroup2: rm/simplify some code
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If an image is only referenced by id instead of its name, don't prune it
completely, but only untag it and create a dangling image for it.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This field was added in f0e6e135a8, and
from that change I suspect it was intended to store the default SELinux
mount-labels to be set on containers.
However, it was never used, so let's remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the intermediate variable, and move the option closer
to where it's used, as in some cases we created the variable,
but could return with an error before it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was not using the DriverCallback interface, and only
required the Registerer interface.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
CI failed sometimes if no daemon.json was present:
Run sudo rm /etc/docker/daemon.json
sudo rm /etc/docker/daemon.json
sudo service docker restart
docker version
docker info
shell: /usr/bin/bash -e {0}
env:
DESTDIR: ./build
BUILDKIT_REPO: moby/buildkit
BUILDKIT_TEST_DISABLE_FEATURES: cache_backend_azblob,cache_backend_s3,merge_diff
BUILDKIT_REF: 798ad6b0ce9f2fe86dfb2b0277e6770d0b545871
rm: cannot remove '/etc/docker/daemon.json': No such file or directory
Error: Process completed with exit code 1.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Linux 6.2 and up (commit [f1f1f2569901ec5b9d425f2e91c09a0e320768f3][1])
provides a fast path for the number of open files for the process.
From the [Linux docs][2]:
> The number of open files for the process is stored in 'size' member of
> `stat()` output for /proc/<pid>/fd for fast access.
[1]: f1f1f25699
[2]: https://docs.kernel.org/filesystems/proc.html#proc-pid-fd-list-of-symlinks-to-open-files
This patch adds a fast-path for kernels that support this, and falls back
to the slow path if the Size field is zero.
Comparing on a Fedora 38 (kernel 6.2.9-300.fc38.x86_64):
Before/After:
go test -bench ^BenchmarkGetTotalUsedFds$ -run ^$ ./pkg/fileutils/
BenchmarkGetTotalUsedFds 57264 18595 ns/op 408 B/op 10 allocs/op
BenchmarkGetTotalUsedFds 370392 3271 ns/op 40 B/op 3 allocs/op
Note that the slow path has 1 more file-descriptor, due to the open
file-handle for /proc/<pid>/fd during the calculation.
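A minimal sketch of the fast path with a fallback, assuming the caller only
needs the count:
```
func totalUsedFds() int {
	fdPath := fmt.Sprintf("/proc/%d/fd", os.Getpid())
	if fi, err := os.Stat(fdPath); err == nil && fi.Size() > 0 {
		// Linux 6.2+ reports the number of open fds in the directory's size.
		return int(fi.Size())
	}
	// Slow path: count the directory entries (costs one extra open fd).
	f, err := os.Open(fdPath)
	if err != nil {
		return -1
	}
	defer f.Close()
	names, err := f.Readdirnames(-1)
	if err != nil {
		return -1
	}
	return len(names)
}
```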
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use File.Readdirnames instead of os.ReadDir, as we're only interested in
the number of files, and results don't have to be sorted.
Before:
BenchmarkGetTotalUsedFds-5 149272 7896 ns/op 945 B/op 20 allocs/op
After:
BenchmarkGetTotalUsedFds-5 153517 7644 ns/op 408 B/op 10 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 8d56108ffb moved this function from
the generic (no build-tags) fileutils.go to a unix file, adding "freebsd"
to the build-tags.
This likely was a wrong assumption (as other files had freebsd build-tags).
FreeBSD's procfs does not mention `/proc/<pid>/fd` in the manpage, and
we don't test FreeBSD in CI, so let's drop it, and make this a Linux-only
file.
While updating also dropping the import-tag, as we're planning to move
this file internal to the daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update docker to support a '--log-format' option, which accepts either
'text' (default) or 'json'. Propagate the log format to containerd as
well, to ensure that everything will be logged consistently.
Signed-off-by: Philip K. Warren <pkwarren@gmail.com>
When live-restoring a container, the volume driver needs to be notified that
there is an active mount for the volume.
Before this change the count is zero until the container stops, at which
point the uint64 overflows, pretty much making it so the volume can never
be removed until another daemon restart.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
I think this may be missing a sudo (as all other operations do use
sudo to access daemon.json);
Run if [ ! -e /etc/docker/daemon.json ]; then
if [ ! -e /etc/docker/daemon.json ]; then
echo '{}' | tee /etc/docker/daemon.json >/dev/null
fi
DOCKERD_CONFIG=$(jq '.+{"experimental":true,"live-restore":true,"ipv6":true,"fixed-cidr-v6":"2001:db8:1::/64"}' /etc/docker/daemon.json)
sudo tee /etc/docker/daemon.json <<<"$DOCKERD_CONFIG" >/dev/null
sudo service docker restart
shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
env:
GO_VERSION: 1.20.5
GOTESTLIST_VERSION: v0.3.1
TESTSTAT_VERSION: v0.1.3
ITG_CLI_MATRIX_SIZE: 6
DOCKER_EXPERIMENTAL: 1
DOCKER_GRAPHDRIVER: overlay2
tee: /etc/docker/daemon.json: Permission denied
Error: Process completed with exit code 1.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Dockerfile in this repository performs many stages in parallel. If any of
those stages fails to build (which could be due to networking congestion),
other stages are also (forcibly?) terminated, which can cause an unclean
shutdown.
In some cases, this can cause `git` to be terminated, leaving a `.lock` file
behind in the cache mount. Retrying the build now will fail, and the only
workaround is to clean the build-cache (which causes many stages to be
built again, potentially triggering the problem again).
> [dockercli-integration 3/3] RUN --mount=type=cache,id=dockercli-integration-git-linux/arm64/v8,target=./.git --mount=type=cache,target=/root/.cache/go-build,id=dockercli-integration-build-linux/arm64/v8 /download-or-build-cli.sh v17.06.2-ce https://github.com/docker/cli.git /build:
#0 1.575 fatal: Unable to create '/go/src/github.com/docker/cli/.git/shallow.lock': File exists.
#0 1.575
#0 1.575 Another git process seems to be running in this repository, e.g.
#0 1.575 an editor opened by 'git commit'. Please make sure all processes
#0 1.575 are terminated then try again. If it still fails, a git process
#0 1.575 may have crashed in this repository earlier:
#0 1.575 remove the file manually to continue.
This patch:
- Updates the Dockerfile to remove `.lock` files (`shallow.lock`, `index.lock`)
that may have been left behind from previous builds. I put this code in the
Dockerfile itself (not the script), as the script may be used in other
situations outside of the Dockerfile (for which we cannot guarantee no other
git session is active).
- Adds a `docker --version` step to the stage; this is mostly to verify the
build was successful (and to be consistent with other stages).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
re-enable the DCO check, which was temporarily disabled to migrate
old commits from github.com/docker/libkv
This reverts commit 7d7225fae6.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Ideally, this should actually do a lookup across images that have no parent, but I wasn't 100% sure how to accomplish that so I opted for the smaller change of having `FROM scratch` builds not be cached for now.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
The migrated history has some commits that missed a DCO:
These commits do not have a proper 'Signed-off-by:' marker:
- 3fa22634a617e2c52d2c5f061826e5107e27985f
- 9b11053e9147884c43c9a9d8ebfcd7bb9470e8b5
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A reduced set of the dependency, only taking the parts that are used. Taken from
upstream commit: dfacc563de
# install filter-repo (https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
brew install git-filter-repo
cd ~/projects
# create a temporary clone of libkv
git clone https://github.com/docker/libkv.git temp_libkv
cd temp_libkv
# create branch to work with
git checkout -b migrate_libkv
# remove all code, except for the files we need; rename the remaining ones to their new target location
git filter-repo --force \
--path libkv.go \
--path store/store.go \
--path store/boltdb/boltdb.go \
--path-rename libkv.go:libnetwork/internal/kvstore/kvstore_manage.go \
--path-rename store/store.go:libnetwork/internal/kvstore/kvstore.go \
--path-rename store/boltdb/:libnetwork/internal/kvstore/boltdb/
# go to the target github.com/moby/moby repository
cd ~/projects/docker
# create a branch to work with
git checkout -b integrate_libkv
# add the temporary repository as an upstream and make sure it's up-to-date
git remote add temp_libkv ~/projects/temp_libkv
git fetch temp_libkv
# merge the upstream code
git merge --allow-unrelated-histories --signoff -S temp_libkv/migrate_libkv
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was only used in tests, and internally, and no longer
used since we switched to using os.UserHomeDir() from stdlib.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was last used in the pkg/mflag package, which was removed
in 14712f9ff0, and is no longer used in
libnetwork code since e6de8aec2f
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was added in f0e5b3d7d8 to
account for older versions of the engine (Docker EE LTS versions), which
did not yet provide the OSType field in Docker info, and had to be manually
set using the TEST_OSTYPE env-var.
This patch removes the field in favor of the equivalent in DaemonInfo. It's
more verbose, but also less ambiguous about what information we're using (i.e.,
the platform the daemon is running on, not the local platform).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This env-var was added in f0e5b3d7d8 to
account for older versions of the engine (Docker EE LTS versions), which
did not yet provide the OSType field in Docker info.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before 4bafaa00aa, if the daemon was
killed while a container was running and the container shim is killed
before the daemon is restarted, such as if the host system is
hard-rebooted, the daemon would restore the container to the stopped
state and set the exit code to 255. The aforementioned commit introduced
a regression where the container's exit code would instead be set to 0.
Fix the regression so that the exit code is once again set to 255 on
restore.
Signed-off-by: Cory Snider <csnider@mirantis.com>
If an exec fails to start in such a way that containerd publishes an
exit event for it, daemon.ProcessEvent will race
daemon.ContainerExecStart in handling the failure. This race has been a
long-standing bug, which was mostly harmless until
4bafaa00aa. After that change, the daemon
would dereference a nil pointer and crash if ProcessEvent won the race.
Restore the status quo buggy behaviour by adding a check to skip the
dereference if execConfig.Process is nil.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We missed a case when parsing extra hosts from the dockerfile
frontend so the build fails.
To handle this case we need to set a dedicated worker label
that contains the host gateway IP so clients like Buildx
can just set the proper host:ip when parsing extra hosts
that contain the special string "host-gateway".
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
The stargz snapshotter cannot be re-mounted, so the reference-counted
path must be used.
Co-authored-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Commit 90de570cfa passed through the request
context to daemon.ContainerStop(). As a result, cancelling the context would
cancel the "graceful" stop of the container, and would proceed with forcefully
killing the container.
This patch partially reverts the changes from 90de570cfa
and breaks the context to prevent cancelling the context from cancelling the stop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The official Python images on Docker Hub switched to debian bookworm,
which is now the current stable version of Debian.
However, the location of the apt repository config file changed, which
causes the Dockerfile build to fail;
Loaded image: emptyfs:latest
Loaded image ID: sha256:0df1207206e5288f4a989a2f13d1f5b3c4e70467702c1d5d21dfc9f002b7bd43
INFO: Building docker-sdk-python3:5.0.3...
tests/Dockerfile:6
--------------------
5 | ARG APT_MIRROR
6 | >>> RUN sed -ri "s/(httpredir|deb).debian.org/${APT_MIRROR:-deb.debian.org}/g" /etc/apt/sources.list \
7 | >>> && sed -ri "s/(security).debian.org/${APT_MIRROR:-security.debian.org}/g" /etc/apt/sources.list
8 |
--------------------
ERROR: failed to solve: process "/bin/sh -c sed -ri \"s/(httpredir|deb).debian.org/${APT_MIRROR:-deb.debian.org}/g\" /etc/apt/sources.list && sed -ri \"s/(security).debian.org/${APT_MIRROR:-security.debian.org}/g\" /etc/apt/sources.list" did not complete successfully: exit code: 2
This needs to be fixed in docker-py, but in the meantime, we can pin to
the bullseye variant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This enables picking up OTLP tracing context for the gRPC
requests.
Also sets up the in-memory recorder that the BuildKit History API
can use to store the traces associated with a specific build
in a database after the build completes.
This doesn't enable Jaeger tracing endpoints from env
but this can be easily enabled by adding another import if
maintainers want it.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
go1.20.5 (released 2023-06-06) includes four security fixes to the cmd/go and
runtime packages, as well as bug fixes to the compiler, the go command, the
runtime, and the crypto/rsa, net, and os packages. See the Go 1.20.5 milestone
on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.5+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.4...go1.20.5
These minor releases include 3 security fixes following the security policy:
- cmd/go: cgo code injection
The go command may generate unexpected code at build time when using cgo. This
may result in unexpected behavior when running a go program which uses cgo.
This may occur when running an untrusted module which contains directories with
newline characters in their names. Modules which are retrieved using the go command,
i.e. via "go get", are not affected (modules retrieved using GOPATH-mode, i.e.
GO111MODULE=off, may be affected).
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-29402 and Go issue https://go.dev/issue/60167.
- runtime: unexpected behavior of setuid/setgid binaries
The Go runtime didn't act any differently when a binary had the setuid/setgid
bit set. On Unix platforms, if a setuid/setgid binary was executed with standard
I/O file descriptors closed, opening any files could result in unexpected
content being read/written with elevated privileges. Similarly if a setuid/setgid
program was terminated, either via panic or signal, it could leak the contents
of its registers.
Thanks to Vincent Dehors from Synacktiv for reporting this issue.
This is CVE-2023-29403 and Go issue https://go.dev/issue/60272.
- cmd/go: improper sanitization of LDFLAGS
The go command may execute arbitrary code at build time when using cgo. This may
occur when running "go get" on a malicious module, or when running any other
command which builds untrusted code. This can be triggered by linker flags,
specified via a "#cgo LDFLAGS" directive.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-29404 and CVE-2023-29405 and Go issues https://go.dev/issue/60305 and https://go.dev/issue/60306.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon.generateNewName() already reserves the generated name, but its name
did not indicate it did. The daemon.registerName() assumed that the generated
name still had to be reserved, which could mean it would try to reserve the
same name again.
This patch renames daemon.generateNewName to daemon.generateAndReserveName
to make it clearer what it does, and updates registerName() to return early
if it successfully generated (and registered) the container name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The most notable change here is that the OCI type uses a pointer for `Created`, which we probably should have been using too, so most of these changes are accounting for that (and embedding our `Equal` implementation in the single place it was used).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
This utility was only used in a single location (as part of `docker info`),
but the `pkg/rootless` package is imported in various locations, causing
rootlesskit to be a dependency for consumers of that package.
Move GetRootlessKitClient to the daemon code, which is the only location
it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For configured runtimes with a runtimeType other than
io.containerd.runc.v1, io.containerd.runc.v2 and
io.containerd.runhcs.v1, the only supported way to pass configuration is
through the generic containerd "runtimeoptions/v1".Options type. Add a
unit test case which verifies that the options set in the daemon config
are correctly unmarshaled into the daemon's in-memory runtime config,
and that the map keys for the daemon config align with the ones used
when configuring cri-containerd (PascalCase, not camelCase or
snake_case).
Signed-off-by: Cory Snider <csnider@mirantis.com>
When constructing the client, and setting the User-Agent, care must be
taken to apply the header in the right location, as custom headers can
be set in the CLI configuration, and merging these custom headers should
not override the User-Agent header.
This patch adds a dedicated `WithUserAgent()` option, which stores the
user-agent separate from other headers, centralizing the merging of
other headers, so that other parts of the (CLI) code don't have to be
concerned with merging them in the right order.
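A minimal sketch of that approach (names here are illustrative, not the actual client API): the user-agent is kept out of the generic header map and applied last, so merging custom headers can never clobber it.

package client

import "net/http"

type clientOpts struct {
    userAgent     string
    customHeaders map[string]string
}

type Opt func(*clientOpts)

// WithUserAgent stores the user-agent separately from other custom headers.
func WithUserAgent(ua string) Opt {
    return func(o *clientOpts) { o.userAgent = ua }
}

// WithHTTPHeaders sets custom headers, e.g. from the CLI configuration.
func WithHTTPHeaders(headers map[string]string) Opt {
    return func(o *clientOpts) { o.customHeaders = headers }
}

// buildHeaders merges the custom headers first and sets User-Agent last,
// so the dedicated option always wins.
func (o *clientOpts) buildHeaders() http.Header {
    hdr := http.Header{}
    for k, v := range o.customHeaders {
        hdr.Set(k, v)
    }
    if o.userAgent != "" {
        hdr.Set("User-Agent", o.userAgent)
    }
    return hdr
}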
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Convert CreateMountpoint, ReadOnlyNonRecursive, and ReadOnlyForceRecursive.
See moby/swarmkit PR 3134
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Some snapshotters (like overlayfs or zfs) can't mount the same
directories twice. For example, if the same directory is used as an upper
directory in two mounts the kernel will output this warning:
overlayfs: upperdir is in-use as upperdir/workdir of another mount, accessing files from both mounts will result in undefined behavior.
And indeed, accessing the files from both mounts will result in a "No
such file or directory" error.
This change introduces reference counts for the mounts, if a directory
is already mounted the mount interface will only increment the mount
counter and return the mount target effectively making sure that the
filesystem doesn't end up in an undefined behavior.
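A minimal sketch of the reference-counting idea (types and names here are assumptions, not the actual snapshotter code):

package mounter

import "sync"

// refCounter performs the real mount only for the first reference to a
// target, and the real unmount only when the last reference is released.
type refCounter struct {
    mu     sync.Mutex
    counts map[string]int // mount target -> active references
}

func newRefCounter() *refCounter {
    return &refCounter{counts: map[string]int{}}
}

func (r *refCounter) Mount(target string, mount func() error) error {
    r.mu.Lock()
    defer r.mu.Unlock()
    if r.counts[target] == 0 {
        if err := mount(); err != nil {
            return err
        }
    }
    r.counts[target]++
    return nil
}

func (r *refCounter) Unmount(target string, unmount func() error) error {
    r.mu.Lock()
    defer r.mu.Unlock()
    if r.counts[target] <= 1 {
        if err := unmount(); err != nil {
            return err
        }
        delete(r.counts, target)
        return nil
    }
    r.counts[target]--
    return nil
}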
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Audit the OCI spec options used for Linux containers to ensure they are
less order-dependent. Ensure they don't assume that any pointer fields
are non-nil and that they don't unintentionally clobber mutations to the
spec applied by other options.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Many of the fields in LinuxResources struct are pointers to scalars for
some reason, presumably to differentiate between set-to-zero and unset
when unmarshaling from JSON, despite zero being outside the acceptable
range for the corresponding kernel tunables. When creating the OCI spec
for a container, the daemon sets the container's OCI spec CPUShares and
BlkioWeight parameters to zero when the corresponding Docker container
configuration values are zero, signifying unset, despite the minimum
acceptable value for CPUShares being two, and BlkioWeight ten. This has
gone unnoticed as runC does not distinguish set-to-zero from unset as it
also uses zero internally to represent unset for those fields. However,
kata-containers v3.2.0-alpha.3 tries to apply the explicit-zero resource
parameters to the container, exactly as instructed, and fails loudly.
The OCI runtime-spec is silent on how the runtime should handle the case
when those parameters are explicitly set to out-of-range values and
kata's behaviour is not unreasonable, so the daemon must therefore be in
the wrong.
Translate unset values in the Docker container's resources HostConfig to
omit the corresponding fields in the container's OCI spec when starting
and updating a container in order to maximize compatibility with
runtimes.
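For illustration, the translation boils down to a helper like this (hypothetical name; the pointer fields are the ones from the OCI runtime-spec types):

package oci

// uint64PtrOrNil returns nil when the Docker-side value is unset (zero),
// so the corresponding OCI spec field is omitted instead of being set to
// an explicit, out-of-range zero.
func uint64PtrOrNil(v int64) *uint64 {
    if v == 0 {
        return nil
    }
    u := uint64(v)
    return &u
}

// For example, CPU shares would only be populated when set:
//
//	spec.Linux.Resources.CPU.Shares = uint64PtrOrNil(hostConfig.CPUShares)
//
// leaving the pointer nil (omitted) when hostConfig.CPUShares is zero.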
Signed-off-by: Cory Snider <csnider@mirantis.com>
Switch to using t.TempDir() instead of rolling our own.
Clean up mounts leaked by the tests as otherwise the tests fail due to
the leaked mounts because unlike the old cleanup code, t.TempDir()
cleanup does not ignore errors from os.RemoveAll.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Avoids invalidation of dev-systemd-true and dev-base when changing the
CLI version/repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't show `error: No such remote: 'origin'` error when building for the
first time and the cached git repository doesn't have a remote yet.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Installs the buildx cli plugin in the container shell by default.
Previously user had to manually download the buildx binary to use
buildkit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use a separate CLI for integration-cli to allow using a newer CLI for
interactive dev shell usage.
Both versions can be overridden with DOCKERCLI_VERSION or
DOCKERCLI_INTEGRATION_VERSION. Binary is downloaded from
download.docker.com if it's available, otherwise it's built from the
source.
For backwards compatibility DOCKER_CLI_PATH overrides BOTH clis.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
I missed the most important COPY in 637ca59375
Copying the source code into the dev-container does not depend on the parent
layers, so can use the --link option as well.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon has made a habit of mutating the DefaultRuntime and Runtimes
values in the Config struct to merge defaults. This would be fine if it
was a part of the regular configuration loading and merging process,
as is done with other config options. The trouble is it does so in
surprising places, such as in functions with 'verify' or 'validate' in
their name. It has been necessary in order to validate that the user has
not defined a custom runtime named "runc" which would shadow the
built-in runtime of the same name. Other daemon code depends on the
runtime named "runc" always being defined in the config, but merging it
with the user config at the same time as the other defaults are merged
would trip the validation. The root of the issue is that the daemon has
used the same config values for both validating the daemon runtime
configuration as supplied by the user and for keeping track of which
runtimes have been set up by the daemon. Now that a completely separate
value is used for the latter purpose, surprising contortions are no
longer required to make the validation work as intended.
Consolidate the validation of the runtimes config and merging of the
built-in runtimes into the daemon.setupRuntimes() function. Set the
result of merging the built-in runtimes config and default default
runtime on the returned runtimes struct, without back-propagating it
onto the config.Config argument.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.
Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.
Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.
Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.
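A sketch of the content-addressed naming (helper names are made up):

package runtimes

import (
    "crypto/sha256"
    "fmt"
    "os"
    "path/filepath"
)

// wrapperPath derives an immutable file name for a runtime wrapper script:
// the SHA-256 of the script contents is part of the name, so identical
// configs map to the same file and a config change produces a new file
// instead of overwriting the old one.
func wrapperPath(dir, runtimeName string, contents []byte) string {
    sum := sha256.Sum256(contents)
    return filepath.Join(dir, fmt.Sprintf("%s-%x", runtimeName, sum))
}

// writeWrapper creates the script only if it does not already exist;
// existing scripts are never modified for the lifetime of the daemon.
func writeWrapper(dir, runtimeName string, contents []byte) (string, error) {
    p := wrapperPath(dir, runtimeName, contents)
    if _, err := os.Stat(p); err == nil {
        return p, nil
    }
    if err := os.WriteFile(p, contents, 0o700); err != nil {
        return "", err
    }
    return p, nil
}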
Signed-off-by: Cory Snider <csnider@mirantis.com>
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.
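A minimal sketch of the pattern (the Config type and copy logic here are placeholders):

package daemon

import "sync/atomic"

type Config struct {
    Debug    bool
    Features map[string]bool
}

var configStore atomic.Pointer[Config]

func init() {
    configStore.Store(&Config{Features: map[string]bool{}})
}

// currentConfig captures the snapshot once; callers pass the returned value
// around instead of re-loading it mid-operation, so they see one consistent
// view even if a reload happens concurrently.
func currentConfig() *Config {
    return configStore.Load()
}

// reload mutates a deep copy and atomically publishes it; readers holding
// the old pointer keep a consistent (if slightly stale) view.
func reload(mutate func(*Config)) {
    cur := configStore.Load()
    next := &Config{Debug: cur.Debug, Features: map[string]bool{}}
    for k, v := range cur.Features {
        next.Features[k] = v
    }
    mutate(next)
    configStore.Store(next)
}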
Signed-off-by: Cory Snider <csnider@mirantis.com>
Config reloading has interleaved validations and other fallible
operations with mutating the live daemon configuration. The daemon
configuration could be left in a partially-reloaded state if any of the
operations returns an error. Mutating a copy of the configuration and
atomically swapping the config struct on success is not currently an
option as config values are not copyable due to the presence of
sync.Mutex fields. Introduce a two-phase commit protocol to defer any
mutations of the daemon state until after all fallible operations have
succeeded.
Reload transactions are not yet entirely hermetic. The platform
reloading logic for custom runtimes on *nix could still leave the
directory of generated runtime wrapper scripts in an indeterminate state
if an error is encountered.
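Conceptually, the two-phase commit looks like this (a sketch, with illustrative names):

package daemon

// reloadStep validates and prepares one part of the reload without touching
// live daemon state; the returned commit function applies the change and
// must not fail.
type reloadStep func(newCfg map[string]interface{}) (commit func(), err error)

// reloadTwoPhase runs every step's fallible phase first and only then
// applies the queued commits, so a failure part-way through leaves the
// daemon configuration untouched.
func reloadTwoPhase(newCfg map[string]interface{}, steps []reloadStep) error {
    commits := make([]func(), 0, len(steps))
    for _, step := range steps {
        commit, err := step(newCfg)
        if err != nil {
            return err // phase one failed: nothing has been mutated yet
        }
        commits = append(commits, commit)
    }
    for _, commit := range commits {
        commit() // phase two: infallible mutations only
    }
    return nil
}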
Signed-off-by: Cory Snider <csnider@mirantis.com>
Historically, daemon.RegistryHosts() has returned a docker.RegistryHosts
callback function which closes over a point-in-time snapshot of the
daemon configuration. When constructing the BuildKit builder at daemon
startup, the return value of daemon.RegistryHosts() has been used.
Therefore the BuildKit builder would use the registry configuration as
it was at daemon startup for the life of the process, even if the
registry configuration is changed and the configuration reloaded.
Provide BuildKit with a RegistryHosts callback which reflects the
live daemon configuration after reloads so that registry operations
performed by BuildKit always use the same configuration as the rest of
the daemon.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Passing around a bare pointer to the map of configured features in order
to propagate to consumers changes to the configuration across reloads is
dangerous. Map operations are not atomic, so concurrently reading from
the map while it is being updated is a data race as there is no
synchronization. Use a getter function to retrieve the current features
map so the features can be retrieved race-free.
Remove the unused features argument from the build router.
Signed-off-by: Cory Snider <csnider@mirantis.com>
With this patch, the user-agent has information about the containerd-client
version and the storage-driver that's used when using the containerd-integration;
time="2023-06-01T11:27:07.959822887Z" level=info msg="listening on [::]:5000" go.version=go1.19.9 instance.id=53590f34-096a-4fd1-9c58-d3b8eb7e5092 service=registry version=2.8.2
...
172.18.0.1 - - [01/Jun/2023:11:30:12 +0000] "HEAD /v2/multifoo/blobs/sha256:c7ec7661263e5e597156f2281d97b160b91af56fa1fd2cc045061c7adac4babd HTTP/1.1" 404 157 "" "docker/dev go/go1.20.4 git-commit/8d67d0c1a8 kernel/5.15.49-linuxkit-pr os/linux arch/arm64 containerd-client/1.6.21+unknown storage-driver/overlayfs UpstreamClient(Docker-Client/24.0.2 \\(linux\\))"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was not really "inserting" anything, just formatting and appending.
Simplify this by changing it into a `getUpstreamUserAgent()` function,
which returns the upstream User-Agent (if any) wrapped in `UpstreamClient()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a const for the characters to escape, instead of implementing
this as a generic escaping function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this, the client would report itself as containerd, and the containerd
version from the containerd go module:
time="2023-06-01T09:43:21.907359755Z" level=info msg="listening on [::]:5000" go.version=go1.19.9 instance.id=67b89d83-eac0-4f85-b36b-b1b18e80bde1 service=registry version=2.8.2
...
172.18.0.1 - - [01/Jun/2023:09:43:33 +0000] "HEAD /v2/multifoo/blobs/sha256:cb269d7c0c1ca22fb5a70342c3ed2196c57a825f94b3f0e5ce3aa8c55baee829 HTTP/1.1" 404 157 "" "containerd/1.6.21+unknown"
With this patch, the user-agent has the docker daemon information;
time="2023-06-01T11:27:07.959822887Z" level=info msg="listening on [::]:5000" go.version=go1.19.9 instance.id=53590f34-096a-4fd1-9c58-d3b8eb7e5092 service=registry version=2.8.2
...
172.18.0.1 - - [01/Jun/2023:11:27:20 +0000] "HEAD /v2/multifoo/blobs/sha256:c7ec7661263e5e597156f2281d97b160b91af56fa1fd2cc045061c7adac4babd HTTP/1.1" 404 157 "" "docker/dev go/go1.20.4 git-commit/8d67d0c1a8 kernel/5.15.49-linuxkit-pr os/linux arch/arm64 UpstreamClient(Docker-Client/24.0.2 \\(linux\\))"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Fix timeouts from very long raft messages
- fix: code optimization
- update dependencies
full diff: 75e92ce14f...01bb7a4139
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Build-cache for the build-stages themselves are already invalidated if the
base-images they're using is updated, and the COPY operations don't depend
on previous steps (as there's no overlap between artifacts copied).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The default implementation of the containerd.Image interface provided by
containerd operates on the parent index/manifest list of the image
and the platform matcher.
This isn't convenient when a specific manifest is already known and it's
redundant to search the whole index for a manifest that matches the
given platform matcher. It can also result in a different manifest
picked up than expected when multiple manifests with the same platform
are present.
This introduces a walkImageManifests function, which walks the provided image
and calls a handler with an ImageManifest, a simple wrapper that
implements the containerd.Image interface and performs all containerd.Image
operations against a platform-specific manifest instead of the root
manifest list/index.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Resolver.setupIPTable() checks whether it needs to flush or create the
user chains used for NATing container DNS requests by testing for the
existence of the rules which jump to said user chains. Unfortunately it
does so using the IPTable.RawCombinedOutputNative() method, which
returns a non-nil error if the iptables command returns any output even
if the command exits with a zero status code. While that is fine with
iptables-legacy as it prints no output if the rule exists, iptables-nft
v1.8.7 prints some information about the rule. Consequently,
Resolver.setupIPTable() would incorrectly think that the rule does not
exist during container restore and attempt to create it. This happened
to work by coincidence before 8f5a9a741b
because the failure to create the already-existing table would be
ignored and the new NAT rules would be inserted before the stale rules
left in the table from when the container was last started/restored. Now
that failing to create the table is treated as a fatal error, the
incompatibility with iptables-nft is no longer hidden.
Switch to using IPTable.ExistsNative() to test for the existence of the
jump rules as it correctly only checks the iptables command's exit
status without regard for whether it outputs anything.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The method to restore a network namespace takes a collection of
interfaces to restore with the options to apply. The interface names are
structured data, tuples of (SrcName, DstPrefix) but for whatever reason
are being passed into Restore() serialized to strings. A refactor,
f0be4d126d, accidentally broke the
serialization by dropping the delimiter. Rather than fix the
serialization and leave the time-bomb for someone else to trip over,
pass the interface names as structured data.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Switching snapshotter implementations would result in an error when
preparing a snapshot; check that the image is indeed unpacked for the
current snapshotter before trying to prepare a snapshot.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This type (as well as TarsumBackup) was used for the experimental --stream
support for the classic builder. This feature was removed in commit
6ca3ec88ae, which also removed uses of
the CachableSource type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Dockerfile.e2e is not used anymore. Integration tests run
through the main Dockerfile.
Also removes the daemon OS/Arch detection script that is not
necessary anymore. It was used to select the Dockerfile based
on the arch like Dockerfile.arm64 but we don't have those
anymore. Was also used to check referenced frozen images
in the Dockerfile.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Adds a Dockerfile and make targets to update and validate
generated files (proto, seccomp default profile)
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
This makes the output of `docker save` fully OCI compliant.
When using the containerd image store, this code is not used. That
exporter will just use containerd's export method and should give us the
output we want for multi-arch images.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In cases where an exec start failed the exec process will be nil even
though the channel to signal that the exec started was closed.
Ideally ExecConfig would get a nice refactor to handle this case better
(ie. it's not started so don't close that channel).
This is a minimal fix to prevent an NPE. Luckily this would only get
called by a client, and only the http request goroutine gets the panic
(the http lib recovers the panic).
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
While the VXLAN interface and the iptables rules to mark outgoing VXLAN
packets for encryption are configured to use the Swarm data path port,
the XFRM policies for actually applying the encryption are hardcoded to
match packets with destination port 4789/udp. Consequently, encrypted
overlay networks do not pass traffic when the Swarm is configured with
any other data path port: encryption is not applied to the outgoing
VXLAN packets and the destination host drops the received cleartext
packets. Use the configured data path port instead of hardcoding port
4789 in the XFRM policies.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This struct was never modified; let's just use consts for these.
Also remove the args return from detectContentType(), as it was
not used anywhere.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type (as well as TarsumBackup) was used for the experimental --stream
support for the classic builder. This feature was removed in commit
6ca3ec88ae, which also removed uses of
the CachableSource type.
As far as I could find, there are no external consumers of these types,
but let's deprecate them, to give potential users a heads-up that they
will be removed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It turns out that the unnecessary serialization removed in
b75246202a happened to work around a bug
in containerd. When many exec processes are started concurrently in the
same containerd task, it takes seconds to minutes for them all to start.
Add the workaround back in, only deliberately this time.
Signed-off-by: Cory Snider <csnider@mirantis.com>
`docker run -v /foo:/foo:ro` is now recursively read-only on kernel >= 5.12.
Automatically falls back to the legacy non-recursively read-only mount mode on kernel < 5.12.
Use `ro-non-recursive` to disable RRO.
Use `ro-force-recursive` or `rro` to explicitly enable RRO. (Fails on kernel < 5.12)
Fix issue 44978
Fix docker/for-linux issue 788
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Feature flags are one of the configuration items which can be reloaded
without restarting the daemon. Whether the daemon uses the containerd
snapshotter service or the legacy graph drivers is controlled by a
feature flag. However, much of the code which checks the snapshotter
feature flag assumes that the flag cannot change at runtime. Make it so
that the snapshotter setting can only be changed by restarting the
daemon, even if the flag state changes after a live configuration
reload.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Starting with go1.19, the Go runtime on Windows now supports the `netgo` build-
flag to use a native Go DNS resolver. Prior to that version, the build-flag
only had an effect on non-Windows platforms. When using the `netgo` build-flag,
the Windows's host resolver is not used, and as a result, custom entries in
`etc/hosts` are ignored, which is a change in behavior from binaries compiled
with older versions of the Go runtime.
From the go1.19 release notes: https://go.dev/doc/go1.19#net
> Resolver.PreferGo is now implemented on Windows and Plan 9. It previously
> only worked on Unix platforms. Combined with Dialer.Resolver and Resolver.Dial,
> it's now possible to write portable programs and be in control of all DNS name
> lookups when dialing.
>
> The net package now has initial support for the netgo build tag on Windows.
> When used, the package uses the Go DNS client (as used by Resolver.PreferGo)
> instead of asking Windows for DNS results. The upstream DNS server it discovers
> from Windows may not yet be correct with complex system network configurations,
> however.
Our Windows binaries are compiled with the "static" (`make/binary-daemon`)
script, which has the `netgo` option set by default. This patch unsets the
`netgo` option when cross-compiling for Windows.
Co-authored-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Docker with containerd integration emits "Exists" progress action when a
layer of the currently pulled image already exists. This is different
from the non-c8d Docker which emits "Already exists".
This makes both implementations consistent by emitting backwards
compatible "Already exists" action.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
To allow skipping integration tests that don't apply to the
containerd snapshotter.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This moves the blobs around so they follow the OCI spec.
Note that because docker reads paths from the manifest.json inside the
tar this is not a breaking change.
This does, however, remove the old layer "VERSION" file which had a big
"why is this even here" in the code comments. I suspect it does not
matter at all even for really old versions of Docker. In any case it is
a useless file for any even relatively modern version of Docker.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
TestProxyNXDOMAIN has proven to be susceptible to failing as a
consequence of unlocked threads being set to the wrong network
namespace. As the failure mode looks a lot like a bug in the test
itself, it seems prudent to add a check for mismatched namespaces to the
test so we will know for next time that the root cause lies elsewhere.
Signed-off-by: Cory Snider <csnider@mirantis.com>
osl.setIPv6 mistakenly captured the calling goroutine's thread's network
namespace instead of the network namespace of the thread getting its
namespace temporarily changed. As this function appears to only be
called from contexts in the process's initial network namespace, this
mistake would be of little consequence at runtime. The libnetwork unit
tests, on the other hand, unshare network namespaces so as not to
interfere with each other or the host's network namespace. But due to
this bug, the isolation backfires and the network namespace of
goroutines used by a test which are expected to be in the initial
network namespace can randomly become the isolated network namespace of
some other test. Symptoms include a loopback network server running in
one goroutine inexplicably and randomly becoming unreachable by a
client in another goroutine.
Capture the original network namespace of the thread from the thread to
be tampered with, after locking the goroutine to the thread.
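The corrected ordering, sketched with x/sys/unix (this is an illustration, not the libnetwork code; nsFD is a placeholder for however the target namespace is obtained):

package osl

import (
    "runtime"

    "golang.org/x/sys/unix"
)

// withNetNS runs fn with the calling thread switched into the network
// namespace referred to by nsFD, restoring the thread's original namespace
// afterwards. The original namespace is captured only after the goroutine
// has been locked to its thread, so the save/restore pair always refers to
// the same thread.
func withNetNS(nsFD int, fn func() error) error {
    runtime.LockOSThread()
    defer runtime.UnlockOSThread()

    orig, err := unix.Open("/proc/thread-self/ns/net", unix.O_RDONLY|unix.O_CLOEXEC, 0)
    if err != nil {
        return err
    }
    defer unix.Close(orig)

    if err := unix.Setns(nsFD, unix.CLONE_NEWNET); err != nil {
        return err
    }
    // Best-effort restore before the thread is unlocked.
    defer unix.Setns(orig, unix.CLONE_NEWNET)

    return fn()
}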
Signed-off-by: Cory Snider <csnider@mirantis.com>
Swapping out the global logger on the fly is causing tests to flake out
by logging to a test's log output after the test function has returned.
Refactor Resolver to use a dependency-injected logger and the resolver
unit tests to inject a private logger instance into the Resolver under
test.
Signed-off-by: Cory Snider <csnider@mirantis.com>
tstwriter mocks the server-side connection between the resolver and the
container, not the resolver and the external DNS server, so returning
the external DNS server's address as w.LocalAddr() is technically
incorrect and misleading. Only the protocols need to match as the
resolver uses the client's choice of protocol to determine which
protocol to use when forwarding the query to the external DNS server.
While this change has no material impact on the tests, it makes the
tests slightly more comprehensible for the next person.
Signed-off-by: Cory Snider <csnider@mirantis.com>
commit dc11d2a2d8 removed the devicemapper
storage-driver, so these warnings are no longer relevant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 0abb8dec3f removed support for
running overlay/overlay2 on top of a backing filesystem without d_type
support, and turned it into a fatal error when starting the daemon,
so there's no need to generate warnings for this situation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/spf13/cobra/releases/tag/v1.7.0
Features
- Allow to preserve ordering of completions in bash, zsh, pwsh, & fish
- Add support for PowerShell 7.2+ in completions
- Allow sourcing zsh completion script
Bug fixes
- Don't remove flag values that match sub-command name
- Fix powershell completions not returning single word
- Remove masked template import variable name
- Correctly detect completions with dash in argument
Testing & CI/CD
- Deprecate Go 1.15 in CI
- Deprecate Go 1.16 in CI
- Add testing for Go 1.20 in CI
- Add tests to illustrate unknown flag bug
Maintenance
- Update main image to better handle dark backgrounds
- Fix stale.yaml mispellings
- Remove stale bot from GitHub actions
- Add makefile target for installing dependencies
- Add Sia to projects using Cobra
- Add Vitess and Arewefastyet to projects using cobra
- Fixup for Kubescape github org
- Fix route for GitHub workflows badge
- Fixup for GoDoc style documentation
- Various bash scripting improvements for completion
- Add Constellation to projects using Cobra
Documentation
- Add documentation about disabling completion descriptions
- Improve MarkFlagsMutuallyExclusive example in user guide
- Update shell_completions.md
- Update copywrite year
- Document suggested layout of subcommands
- Replace deprecated ExactValidArgs with MatchAll in doc
full diff: https://github.com/spf13/cobra/compare/v1.6.1...v1.7.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Extended attributes are set on files in container images for a reason.
Fail to unpack if extended attributes are present in a layer and setting
the attributes on the unpacked files fails for any reason.
Add an option to the vfs graph driver to opt into the old behaviour
where ENOTSUPP and EPERM errors encountered when setting extended
attributes are ignored. Make it abundantly clear to users and anyone
triaging their bug reports that they are shooting themselves in the
foot by enabling this option.
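Roughly, the opt-in behaviour amounts to this (the option and helper names are made up):

package vfs

import (
    "errors"
    "fmt"

    "golang.org/x/sys/unix"
)

// setXattr applies one extended attribute. By default any failure aborts
// the unpack; only when the user opted into the legacy behaviour are
// "not supported" / "permission denied" errors silently dropped.
func setXattr(path, attr string, value []byte, ignoreUnsupported bool) error {
    err := unix.Setxattr(path, attr, value, 0)
    if err == nil {
        return nil
    }
    if ignoreUnsupported && (errors.Is(err, unix.ENOTSUP) || errors.Is(err, unix.EPERM)) {
        return nil // explicitly requested lossy behaviour
    }
    return fmt.Errorf("setting xattr %q on %s: %w", attr, path, err)
}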
Signed-off-by: Cory Snider <csnider@mirantis.com>
Our resolver is just a forwarder for external DNS so it should act like
it. Unless it's a server failure or refusal, take the response at face
value and forward it along to the client. RFC 8020 is only applicable to
caching recursive name servers and our resolver is neither caching nor
recursive.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This fixes a bug where, if a user pulls an image with a tag != `latest` and
a specific platform, we return a NotFound error for the wrong (`latest`) tag.
see: https://github.com/moby/moby/issues/45558
This bug was introduced in 779a5b3029
in the changes to `daemon/images/image_pull.go`, when we started returning the error from the call to
`GetImage` after the pull. We do this call, if pulling with a specified platform, to check if the platform
of the pulled image matches the requested platform (for cases with single-arch images).
However, when we call `GetImage` we're not passing the image tag, only name, so `GetImage` assumes `latest`
which breaks when the user has requested a different tag, since there might not be such an image in the store.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
The error returned by DecodeConfig was changed in
b6d58d749c and caused this to regress.
Allow empty request bodies for this endpoint once again.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The internal Client request methods which accept an object as a body use
nil to signal that the request should not have a body. But it is easy to
accidentally pass a typed-nil value as the object, e.g. if the object
comes from a function argument or struct field of a concrete type. The
result is that these requests will, surprisingly, have a JSON body of
`null`. Treat typed-nil pointers the same as untyped nils for the
purposes of determining whether or not the request should include a
body.
Stop assuming that POST requests should always have a body. POST /commit
does not require a body, for example.
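The typed-nil case can be detected with reflection, roughly like this (a sketch, not the exact client code):

package client

import "reflect"

// bodyIsNil reports whether obj should be treated as "no request body":
// either an untyped nil, or a typed-nil pointer/map/slice/interface stored
// in the interface value (which compares unequal to a plain nil).
func bodyIsNil(obj interface{}) bool {
    if obj == nil {
        return true
    }
    v := reflect.ValueOf(obj)
    switch v.Kind() {
    case reflect.Ptr, reflect.Map, reflect.Slice, reflect.Interface:
        return v.IsNil()
    }
    return false
}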
Signed-off-by: Cory Snider <csnider@mirantis.com>
Upstart has been EOL for 8 years and isn't used by any distributions we support any more.
Additionally, this removes the "cgroups v1" setup code because it's more reasonable now for us to expect something _else_ to have set up cgroups appropriately (especially cgroups v2).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
These changes add basic CDI integration to the docker daemon.
A CDI driver is added to handle CDI device requests. This
is gated by an experimental feature flag and is only supported on Linux.
This change also adds a CDISpecDirs (cdi-spec-dirs) option to the config.
This allows the default values of `/etc/cdi` and `/var/run/cdi` to be overridden,
which is useful for testing.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Fixing case where username may contain a backslash.
This case can happen for winbind/samba active directory domain users.
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
Use more meaningful variable name
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
Update contrib/dockerd-rootless-setuptool.sh
Co-authored-by: Akihiro Suda <suda.kyoto@gmail.com>
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
Use more meaningful variable name
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
Update contrib/dockerd-rootless-setuptool.sh
Co-authored-by: Akihiro Suda <suda.kyoto@gmail.com>
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
When the `ServerAddress` in the `AuthConfig` provided by the client is
empty, default to the default registry (registry-1.docker.io).
This makes the behaviour the same as with the containerd image store
integration disabled.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- use is.ErrorType
- replace uses of client.IsErrNotFound with errdefs.IsNotFound, as
the client no longer returns the old error-type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use is.ErrorType
- replace uses of client.IsErrNotFound with errdefs.IsNotFound, as
the client no longer returns the old error-type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use is.ErrorType
- replace uses of client.IsErrNotFound with errdefs.IsNotFound, as
the client no longer returns the old error-type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The client no longer returns the old error-types, so there's no need
to keep the compatibility code. We can consider deprecating this function
in favor of the errdefs equivalent.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that most uses of reexec have been replaced with non-reexec
solutions, most of the reexec.Init() calls peppered throughout the test
suites are unnecessary. Furthermore, most of the reexec.Init() calls in
test code neglects to check the return value to determine whether to
exit, which would result in the reexec'ed subprocesses proceeding to run
the tests, which would reexec another subprocess which would proceed to
run the tests, recursively. (That would explain why every reexec
callback used to unconditionally call os.Exit() instead of returning...)
Remove unneeded reexec.Init() calls from test and example code which no
longer needs it, and fix the reexec.Init() calls which are not inert to
exit after a reexec callback is invoked.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Our templates no longer contain version-specific rules, so this function
is no longer used. This patch deprecates it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 7008a51449 removed version-conditional
rules from the template, so we no longer need the apparmor_parser Version.
This patch removes the call to `aaparser.GetVersion()`
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 2e19a4d56b removed all other version-
conditional statements from the AppArmor template, but left this one in place.
These conditions were added in 8cf89245f5
to account for old versions of debian/ubuntu (apparmor_parser < 2.9)
that lacked some options;
> This allows us to use the apparmor profile we have in contrib/apparmor/
> and solves the problems where certain functions are not apparent on older
> versions of apparmor_parser on debian/ubuntu.
Those patches were from 2015/2016, and all currently supported distro
versions should now have more current versions than that. Looking at the
oldest supported versions;
Ubuntu 18.04 "Bionic":
apparmor_parser --version
AppArmor parser version 2.12
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2012 Canonical Ltd.
Debian 10 "Buster"
apparmor_parser --version
AppArmor parser version 2.13.2
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2018 Canonical Ltd.
This patch removes the remaining conditionals.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add `execDuration` field to the event attributes map. This is useful for tracking how long the container ran.
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Ports over all the previous image delete logic, such as:
- Introduce `prune` and `force` flags
- Introduce the concept of hard and soft image delete conflicts, which represent:
- image referenced in multiple tags (soft conflict)
- image being used by a stopped container (soft conflict)
- image being used by a running container (hard conflict)
- Implement delete logic such as:
- if deleting by reference, and there are other references to the same image, just
delete the passed reference
- if deleting by reference, and there is only 1 reference and the image is being used
by a running container, throw an error if !force, or delete the reference and create
a dangling reference otherwise
- if deleting by imageID, and force is true, remove all tags (otherwise soft conflict)
- if imageID, check if stopped container is using the image (soft conflict), and
delete anyway if force
- if imageID was passed in, check if running container is using the image (hard conflict)
- if `prune` is true, and the image being deleted has dangling parents, remove them
This commit also implements logic to get image parents in c8d by comparing shared layers.
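For illustration, the conflict rules above boil down to something like this (the types are assumptions, not the daemon's internal ones):

package images

type conflictKind int

const (
    conflictNone conflictKind = iota
    conflictSoft // extra tags, or image used by a stopped container
    conflictHard // image used by a running container
)

// classifyDeleteConflict mirrors the rules listed above: a running
// container always makes the delete a hard conflict; additional references
// or stopped containers are soft conflicts that force can override.
func classifyDeleteConflict(extraTags int, usedByStopped, usedByRunning bool) conflictKind {
    switch {
    case usedByRunning:
        return conflictHard
    case usedByStopped || extraTags > 0:
        return conflictSoft
    default:
        return conflictNone
    }
}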
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
This option was deprecated in 5a922dc162, which
is part of the v24.0.0 release, so we can remove it from master.
This patch;
- adds a check to ValidatePlatformConfig, and produces a fatal error
if oom-score-adjust is set
- removes the deprecated libcontainerd/supervisor.WithOOMScore
- removes the warning from docker info
With this patch:
dockerd --oom-score-adjust=-500 --validate
Flag --oom-score-adjust has been deprecated, and will be removed in the next release.
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.
And when using `daemon.json`:
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was deprecated in dbb48e4b29, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was deprecated in 818ee96219, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were deprecated in 9d5e754caa, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was deprecated in 9f3e5eead5, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were deprecated in 2d49080056, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was deprecated in c63ea32a17, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This const was deprecated in 5c78cbd3be, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field is deprecated since 1261fe69a3,
and will now be omitted on API v1.44 and up for the `GET /images/json`,
`GET /images/{id}/json`, and `GET /system/df` endpoints.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 1261fe69a3 deprecated the VirtualSize
field, but forgot to mention that it's also included in the /system/df
endpoint.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 24.0 branch was created, so changes in master/main should now be
targeting the next version of the API (1.44).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a workaround to have buildinfo with deps embedded in the
binary. We need to create a go.mod file before building with
-modfile=vendor.mod, otherwise it fails with:
"-modfile cannot be used to set the module root directory."
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
This sets the BuildKit version from the build information embedded
in the running binary so we are aligned with the expected vendoring.
We iterate over all dependencies and find the BuildKit one
and set the right version. We also check if the module is
replaced and use it in this case.
There are also additional checks if a pseudo-version is
detected. See comments in code for more info.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Encrypted overlay networks are unique in that they are the only kind of
network for which libnetwork programs an iptables rule to explicitly
accept incoming packets. No other network driver does this. The overlay
driver doesn't even do this for unencrypted networks!
Because the ACCEPT rule is appended to the end of INPUT table rather
than inserted at the front, the rule can be entirely inert on many
common configurations. For example, FirewallD programs an unconditional
REJECT rule at the end of the INPUT table, so any ACCEPT rules appended
after it have no effect. And on systems where the rule is effective, its
presence may subvert the administrator's intentions. In particular,
automatically appending the ACCEPT rule could allow incoming traffic
which the administrator was expecting to be dropped implicitly with a
default-DROP policy.
Let the administrator always have the final say in how incoming
encrypted overlay packets are filtered by no longer automatically
programming INPUT ACCEPT iptables rules for them.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This has been around for a long time - since v17.04 (API v1.28)
but was never documented.
It allows removing a plugin even if it's still in use.
Signed-off-by: Milas Bowman <milas.bowman@docker.com>
While doing a boltdb operation, if the bucket is not found
we should not return a boltdb-specific bucket-not-found error,
because this causes a leaky abstraction wherein the user of libkv
needs to know about boltdb and import boltdb dependencies,
neither of which is desirable. Replaced all the bucket-not-found
errors with the more generic `store.ErrKeyNotFound` error, which
is more appropriate.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When using AtomicPut with 'previous' set to nil, this is interpreted
as a request to create the Key as part of the AtomicPut. Instead of
returning a generic error, we return store.ErrKeyExists if the
key already exists in the store during the operation.
Signed-off-by: Alexandre Beslic <abronan@docker.com>
Currently boltdb uses a handle which can be accessed
concurrently from multiple goroutines, all of them
trying to open and close the boltdb db handle, which can
cause havoc. Use a mutex to serialize db access and
handle access.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This commit migrates the old 'go-etcd' client, which is deprecated
to the new 'coreos/etcd/client'.
One notable change is the ability to specify an 'IsDir' parameter
to the 'Put' call. This allows circumventing the limitations of etcd
regarding the key/directory distinction while setting up Watches on
a directory. A conservative measure to set up a watch that should be
used the same way for etcd/consul/zookeeper is to enforce the 'IsDir'
parameter with 'WriteOptions' on 'Put' to avoid the 'NotANode' error
thrown by etcd on the Watch call. Consul and zookeeper are not using the
'IsDir' parameter.
Signed-off-by: Alexandre Beslic <abronan@docker.com>
AtomicPut can now be used to Compare-and-Swap against the state
where the key doesn't yet exist. E.g. a race where two clients
create the same key: one succeeds, the other fails.
Pass nil for the previous argument of AtomicPut for this
behavior. Before, this would cause an error.
Implements this change for all three backends.
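A usage sketch of that behaviour (the interface shape below is an assumption modeled on libkv, not its exact API):

package example

import "fmt"

type KVPair struct {
    Key       string
    Value     []byte
    LastIndex uint64
}

type Store interface {
    AtomicPut(key string, value []byte, previous *KVPair, options interface{}) (bool, *KVPair, error)
}

// createIfAbsent passes a nil previous pair, meaning "create only if the
// key does not exist yet"; when two clients race, one succeeds and the
// other gets a key-exists error rather than a generic failure.
func createIfAbsent(s Store, key string, value []byte) (*KVPair, error) {
    ok, pair, err := s.AtomicPut(key, value, nil, nil)
    if err != nil {
        return nil, fmt.Errorf("atomic create of %q: %w", key, err)
    }
    if !ok {
        return nil, fmt.Errorf("atomic create of %q was not applied", key)
    }
    return pair, nil
}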
run: echo "::error::PR title suggests targetting the ${{ steps.title_branch.outputs.branch }} branch, but is opened against ${{ github.event.pull_request.base.ref }}" && exit 1
DOCKER_GITCOMMIT := $(shell git rev-parse --short HEAD || echo unsupported)
DOCKER_GITCOMMIT := $(shell git rev-parse HEAD)
export DOCKER_GITCOMMIT
# allow overriding the repository and branch that validation scripts are running
@@ -20,6 +16,9 @@ export VALIDATE_REPO
export VALIDATE_BRANCH
export VALIDATE_ORIGIN_BRANCH
export PAGER
export GIT_PAGER
# env vars passed through directly to Docker's build scripts
# to allow things like `make KEEPBUNDLE=1 binary` easily
# `project/PACKAGERS.md` has some limited documentation of some of these
@@ -28,10 +27,9 @@ export VALIDATE_ORIGIN_BRANCH
# option of "go build". For example, a built-in graphdriver priority list
# can be changed during build time like this:
#
# make DOCKER_LDFLAGS="-X github.com/docker/docker/daemon/graphdriver.priority=overlay2,devicemapper" dynbinary
# make DOCKER_LDFLAGS="-X github.com/docker/docker/daemon/graphdriver.priority=overlay2,zfs" dynbinary
#
DOCKER_ENVS := \
-e BUILD_APT_MIRROR \
-e BUILDFLAGS \
-e KEEPBUNDLE \
-e DOCKER_BUILD_ARGS \
@@ -41,6 +39,10 @@ DOCKER_ENVS := \
-e DOCKER_BUILDKIT \
-e DOCKER_BASH_COMPLETION_PATH \
-e DOCKER_CLI_PATH \
-e DOCKERCLI_VERSION \
-e DOCKERCLI_REPOSITORY \
-e DOCKERCLI_INTEGRATION_VERSION \
-e DOCKERCLI_INTEGRATION_REPOSITORY \
-e DOCKER_DEBUG \
-e DOCKER_EXPERIMENTAL \
-e DOCKER_GITCOMMIT \
@@ -58,8 +60,10 @@ DOCKER_ENVS := \
-e TEST_FORCE_VALIDATE \
-e TEST_INTEGRATION_DIR \
-e TEST_INTEGRATION_USE_SNAPSHOTTER \
-e TEST_INTEGRATION_FAIL_FAST \
-e TEST_SKIP_INTEGRATION \
-e TEST_SKIP_INTEGRATION_CLI \
-e TEST_IGNORE_CGROUP_CHECK \
-e TESTCOVERAGE \
-e TESTDEBUG \
-e TESTDIRS \
@@ -75,7 +79,12 @@ DOCKER_ENVS := \
-e PLATFORM \
-e DEFAULT_PRODUCT_LICENSE \
-e PRODUCT \
-e PACKAGER_NAME
-e PACKAGER_NAME \
-e PAGER \
-e GIT_PAGER \
-e OTEL_EXPORTER_OTLP_ENDPOINT \
-e OTEL_EXPORTER_OTLP_PROTOCOL \
-e OTEL_SERVICE_NAME
# note: we _cannot_ add "-e DOCKER_BUILDTAGS" here because even if it's unset in the shell, that would shadow the "ENV DOCKER_BUILDTAGS" set in our Dockerfile, which is very important for our official builds
# to allow `make BIND_DIR=. shell` or `make BIND_DIR= test`
@@ -37,6 +37,6 @@ There is hopefully enough example material in the file for you to copy a similar
When you make edits to `swagger.yaml`, you may want to check the generated API documentation to ensure it renders correctly.
Run `make swagger-docs` and a preview will be running at `http://localhost`. Some of the styling may be incorrect, but you'll be able to ensure that it is generating the correct documentation.
Run `make swagger-docs` and a preview will be running at `http://localhost:9000`. Some of the styling may be incorrect, but you'll be able to ensure that it is generating the correct documentation.
The production documentation is generated by vendoring `swagger.yaml` into [docker/docker.github.io](https://github.com/docker/docker.github.io).
expectedErr: fmt.Sprintf("invalid API version: the minimum API version (%s) is higher than the default version (%s)", api.DefaultVersion, api.MinSupportedAPIVersion),
},
{
doc: "invalid default too low",
defaultVersion: "0.1",
minVersion: api.MinSupportedAPIVersion,
expectedErr: fmt.Sprintf("invalid default API version (0.1): must be between %s and %s", api.MinSupportedAPIVersion, api.DefaultVersion),
},
{
doc: "invalid default too high",
defaultVersion: "9999.9999",
minVersion: api.DefaultVersion,
expectedErr: fmt.Sprintf("invalid default API version (9999.9999): must be between %s and %s", api.MinSupportedAPIVersion, api.DefaultVersion),
},
{
doc: "invalid minimum too low",
defaultVersion: api.MinSupportedAPIVersion,
minVersion: "0.1",
expectedErr: fmt.Sprintf("invalid minimum API version (0.1): must be between %s and %s", api.MinSupportedAPIVersion, api.DefaultVersion),
},
{
doc: "invalid minimum too high",
defaultVersion: api.DefaultVersion,
minVersion: "9999.9999",
expectedErr: fmt.Sprintf("invalid minimum API version (9999.9999): must be between %s and %s", api.MinSupportedAPIVersion, api.DefaultVersion),
errString: "client version 0.1 is too old. Minimum supported API version is 1.2.0, please upgrade your client to a newer version",
errString: fmt.Sprintf("client version 0.1 is too old. Minimum supported API version is %s, please upgrade your client to a newer version", api.MinSupportedAPIVersion),
},
{
reqVersion: "9999.9999",
errString: "client version 9999.9999 is too new. Maximum supported API version is 1.10.0",
errString: fmt.Sprintf("client version 9999.9999 is too new. Maximum supported API version is %s", api.DefaultVersion),
// There is existing endpoint config - if it's not indexed by NetworkMode.Name(), we
// can't tell which network the container-wide settings were intended for. NetworkMode,
// the keys in EndpointsConfig and the NetworkID in EndpointsConfig may mix network
// name/id/short-id. It's not safe to create EndpointsConfig under the NetworkMode
// name to store the container-wide MAC address, because that may result in two sets
// of EndpointsConfig for the same network and one set will be discarded later. So,
// reject the request ...
ep, ok := networkingConfig.EndpointsConfig[nwName]
if !ok {
return "", errdefs.InvalidParameter(errors.New("if a container-wide MAC address is supplied, HostConfig.NetworkMode must match the identity of a network in NetworkSettings.Networks"))
}
// ep is the endpoint that needs the container-wide MAC address; migrate the address
// to it, or bail out if there's a mismatch.
if ep.MacAddress == "" {
ep.MacAddress = deprecatedMacAddress
} else if ep.MacAddress != deprecatedMacAddress {
return "", errdefs.InvalidParameter(errors.New("the container-wide MAC address must match the endpoint-specific MAC address for the main network, or be left empty"))
}
}
}
warning = "The container-wide MacAddress field is now deprecated. It should be specified in EndpointsConfig instead."
config.MacAddress = "" //nolint:staticcheck // ignore SA1019: field is deprecated, but still used on API < v1.44.
stream := grpc.StreamInterceptor(grpc_middleware.ChainStreamServer(otelgrpc.StreamServerInterceptor(), grpcerrors.StreamServerInterceptor)) //nolint:staticcheck // TODO(thaJeztah): ignore SA1019 for deprecated options: see https://github.com/moby/moby/issues/47437
withTrace := otelgrpc.UnaryServerInterceptor() //nolint:staticcheck // TODO(thaJeztah): ignore SA1019 for deprecated options: see https://github.com/moby/moby/issues/47437
the URL is not supported by the daemon, an HTTP `400 Bad Request` error message
is returned.
If you omit the version-prefix, the current version of the API (v1.43) is used.
For example, calling `/info` is the same as calling `/v1.43/info`. Using the
If you omit the version-prefix, the current version of the API (v1.45) is used.
For example, calling `/info` is the same as calling `/v1.45/info`. Using the
API without a version-prefix is deprecated and will be removed in a future release.
Engine releases in the near future should support this version of the API,
@@ -388,6 +388,20 @@ definitions:
description: "Create mount point on host if missing"
type: "boolean"
default: false
ReadOnlyNonRecursive:
description: |
Make the mount non-recursively read-only, but still leave the mount recursive
(unless NonRecursive is set to `true` in conjunction).
Added in v1.44, before that version all read-only mounts were
non-recursive by default. To match the previous behaviour this
will default to `true` for clients on versions prior to v1.44.
type: "boolean"
default: false
ReadOnlyForceRecursive:
description: "Raise an error if the mount cannot be made recursively read-only."
type: "boolean"
default: false
VolumeOptions:
description: "Optional configuration for the `volume` type."
type: "object"
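As a hedged illustration of the two new bind-mount flags above, a Go sketch using the api/types/mount structs (the field names are assumed to mirror the swagger properties; the paths are placeholders):

package example

import "github.com/docker/docker/api/types/mount"

// roBind makes only the top-level bind mount read-only while leaving
// submounts writable; enabling ReadOnlyForceRecursive instead would fail the
// request when submounts cannot also be made read-only.
var roBind = mount.Mount{
	Type:     mount.TypeBind,
	Source:   "/data",     // placeholder host path
	Target:   "/mnt/data", // placeholder container path
	ReadOnly: true,
	BindOptions: &mount.BindOptions{
		ReadOnlyNonRecursive: true,
		// ReadOnlyForceRecursive: true,
	},
}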
@@ -413,6 +427,10 @@ definitions:
type: "object"
additionalProperties:
type: "string"
Subpath:
description: "Source path inside the volume. Must be relative without any back traversals."
type: "string"
example: "dir-inside-volume/subdirectory"
TmpfsOptions:
description: "Optional configuration for the `tmpfs` type."
type: "object"
@@ -794,6 +812,12 @@ definitions:
1000000 (1 ms). 0 means inherit.
type: "integer"
format: "int64"
StartInterval:
description: |
The time to wait between checks in nanoseconds during the start period.
It should be 0 or at least 1000000 (1 ms). 0 means inherit.
type: "integer"
format: "int64"
Health:
description: |
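For orientation, a hedged Go sketch of how StartInterval sits alongside the existing health-check timings (field names assumed to match the swagger properties; the probe command is a placeholder):

package example

import (
	"time"

	"github.com/docker/docker/api/types/container"
)

// health probes every 2s during the start period, then falls back to the
// regular 30s Interval once the container has reported healthy.
var health = &container.HealthConfig{
	Test:          []string{"CMD-SHELL", "curl -f http://localhost/ || exit 1"},
	StartPeriod:   30 * time.Second,
	StartInterval: 2 * time.Second,
	Interval:      30 * time.Second,
	Timeout:       5 * time.Second,
	Retries:       3,
}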
@@ -1297,7 +1321,10 @@ definitions:
type: "boolean"
x-nullable: true
MacAddress:
description: "MAC address of the container."
description: |
MAC address of the container.
Deprecated: this field is deprecated in API v1.44 and up. Use EndpointSettings.MacAddress instead.
type: "string"
x-nullable: true
OnBuild:
@@ -1347,16 +1374,16 @@ definitions:
EndpointsConfig:
description: |
A mapping of network name to endpoint configuration for that network.
The endpoint configuration can be left empty to connect to that
network with no particular endpoint configuration.
type: "object"
additionalProperties:
$ref: "#/definitions/EndpointSettings"
example:
# putting an example here, instead of using the example values from
# /definitions/EndpointSettings, because containers/create currently
# does not support attaching to multiple networks, so the example request
# would be confusing if it showed that multiple networks can be contained
# in the EndpointsConfig.
# TODO remove once we support multiple networks on container create (see https://github.com/moby/moby/blob/07e6b843594e061f82baa5fa23c2ff7d536c2a05/daemon/create.go#L323)
# /definitions/EndpointSettings, because EndpointSettings contains
# operational data returned when inspecting a container that we don't
# accept here.
EndpointsConfig:
isolated_nw:
IPAMConfig:
@@ -1365,19 +1392,22 @@ definitions:
LinkLocalIPs:
- "169.254.34.68"
- "fe80::3468"
MacAddress: "02:42:ac:12:05:02"
Links:
- "container_1"
- "container_2"
Aliases:
- "server_x"
- "server_y"
database_nw: {}
NetworkSettings:
description: "NetworkSettings exposes the network settings in the API"
type: "object"
properties:
Bridge:
description: Name of the network's bridge (for example, `docker0`).
description: |
Name of the default bridge interface when dockerd's --bridge flag is set.
type: "string"
example: "docker0"
SandboxID:
@@ -1387,34 +1417,40 @@ definitions:
HairpinMode:
description: |
Indicates if hairpin NAT should be enabled on the virtual interface.
Deprecated: This field is never set and will be removed in a future release.
type: "boolean"
example: false
LinkLocalIPv6Address:
description: IPv6 unicast address using the link-local prefix.
description: |
IPv6 unicast address using the link-local prefix.
Deprecated: This field is never set and will be removed in a future release.
type: "string"
example: "fe80::42:acff:fe11:1"
example: ""
LinkLocalIPv6PrefixLen:
description: Prefix length of the IPv6 unicast address.
description: |
Prefix length of the IPv6 unicast address.
Deprecated: This field is never set and will be removed in a future release.
type: "integer"
example: "64"
example: ""
Ports:
$ref: "#/definitions/PortMap"
SandboxKey:
description: SandboxKey identifies the sandbox
description: SandboxKey is the full path of the netns handle
type: "string"
example: "/var/run/docker/netns/8ab54b426c38"
# TODO is SecondaryIPAddresses actually used?
SecondaryIPAddresses:
description: ""
description: "Deprecated: This field is never set and will be removed in a future release."
type: "array"
items:
$ref: "#/definitions/Address"
x-nullable: true
# TODO is SecondaryIPv6Addresses actually used?
SecondaryIPv6Addresses:
description: ""
description: "Deprecated: This field is never set and will be removed in a future release."
type: "array"
items:
$ref: "#/definitions/Address"
@@ -1715,18 +1751,27 @@ definitions:
description: |
Date and time at which the image was created, formatted in
[RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) format with nano-seconds.
This information is only available if present in the image,
and omitted otherwise.
type: "string"
x-nullable: false
format: "dateTime"
x-nullable: true
example: "2022-02-04T21:20:12.497794809Z"
Container:
description: |
The ID of the container that was used to create the image.
Depending on how the image was created, this field may be empty.
**Deprecated**: this field is kept for backward compatibility, but
description: "operation not supported for pre-defined networks"
description: |
Forbidden operation. This happens when trying to create a network named after a pre-defined network,
or when trying to create an overlay network on a daemon which is not part of a Swarm cluster.
schema:
$ref: "#/definitions/ErrorResponse"
404:
@@ -9956,13 +10106,7 @@ paths:
type: "string"
CheckDuplicate:
description: |
Check for networks with duplicate names. Since Network is
primarily keyed based on a random ID and not on the name, and
network name is strictly a user-friendly alias to the network
which is uniquely identified using ID, there is no guaranteed
way to check for duplicates. CheckDuplicate is there to provide
a best effort checking of any networks which has the same name
but it is not guaranteed to catch all name collisions.
Deprecated: CheckDuplicate is now always enabled.
type: "boolean"
Driver:
description: "Name of the network driver plugin to use."
@@ -10030,14 +10174,19 @@ paths:
/networks/{id}/connect:
post:
summary: "Connect a container to a network"
description: "The network must be either a local-scoped network or a swarm-scoped network with the `attachable` option set. A network cannot be re-attached to a running container"
operationId: "NetworkConnect"
consumes:
- "application/json"
responses:
200:
description: "No error"
400:
description: "bad parameter"
schema:
$ref: "#/definitions/ErrorResponse"
403:
description: "Operation not supported for swarm scoped networks"
description: "Operation forbidden"
schema:
$ref: "#/definitions/ErrorResponse"
404:
@@ -10072,6 +10221,7 @@ paths:
IPAMConfig:
IPv4Address: "172.24.56.89"
IPv6Address: "2001:db8::5689"
MacAddress: "02:42:ac:12:05:02"
tags: ["Network"]
/networks/{id}/disconnect:
@@ -10393,6 +10543,12 @@ paths:
default if omitted.
required: true
type: "string"
- name: "force"
in: "query"
description: |
Force disable a plugin even if still in use.
required: false
type: "boolean"
tags: ["Plugin"]
/plugins/{name}/upgrade:
post:
@@ -11059,18 +11215,7 @@ paths:
201:
description: "no error"
schema:
type: "object"
title: "ServiceCreateResponse"
properties:
ID:
description: "The ID of the created service."
type: "string"
Warning:
description: "Optional warning message"
type: "string"
example:
ID: "ak7w3gjqoa3kuz8xcpnyy0pvl"
Warning: "unable to pin image doesnotexist:latest to digest: image library/doesnotexist:latest not found"
msg := "invalid restart policy: maximum retry count can only be used with 'on-failure'"
if policy.MaximumRetryCount < 0 {
msg += " and cannot be negative"
}
return &errInvalidParameter{fmt.Errorf(msg)}
}
return nil
case RestartPolicyOnFailure:
if policy.MaximumRetryCount < 0 {
return &errInvalidParameter{fmt.Errorf("invalid restart policy: maximum retry count cannot be negative")}
}
return nil
case "":
// Versions before v25.0.0 created an empty restart-policy "name" as
// default. Allow an empty name with "any" MaximumRetryCount for
// backward-compatibility.
return nil
default:
return &errInvalidParameter{fmt.Errorf("invalid restart policy: unknown policy '%s'; use one of '%s', '%s', '%s', or '%s'", policy.Name, RestartPolicyDisabled, RestartPolicyAlways, RestartPolicyOnFailure, RestartPolicyUnlessStopped)}
}
}
// LogMode is a type to define the available modes for logging
// These modes affect how logs are handled when log messages start piling up.