This error is returned when attempting to walk a descriptor that
*should* be an index or a manifest.
Without this the error is not very helpful sicne there's no way to tell
what triggered it.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Issue was caused by the changes here https://github.com/moby/moby/pull/45504
First released in v25.0.0-beta.1
Signed-off-by: Christopher Petito <47751006+krissetto@users.noreply.github.com>
Don't mutate the container's `Config.WorkingDir` permanently with a
cleaned path when creating a working directory.
Move the `filepath.Clean` to the `translateWorkingDir` instead.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The `normalizeWorkdir` function has two branches, one that returns a
result of `filepath.Join` which always returns a cleaned path, and
another one where the input string is returned unmodified.
To make these two outputs consistent, also clean the path in the second
branch.
This also makes the cleaning of the container workdir explicit in the
`normalizeWorkdir` function instead of relying on the
`SetupWorkingDirectory` to mutate it.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make the internal DNS resolver for Windows containers forward requests
to upsteam DNS servers when it cannot respond itself, rather than
returning SERVFAIL.
Windows containers are normally configured with the internal resolver
first for service discovery (container name lookup), then external
resolvers from '--dns' or the host's networking configuration.
When a tool like ping gets a SERVFAIL from the internal resolver, it
tries the other nameservers. But, nslookup does not, and with this
change it does not need to.
The internal resolver learns external server addresses from the
container's HNSEndpoint configuration, so it will use the same DNS
servers as processes in the container.
The internal resolver for Windows containers listens on the network's
gateway address, and each container may have a different set of external
DNS servers. So, the resolver uses the source address of the DNS request
to select external resolvers.
On Windows, daemon.json feature option 'windows-no-dns-proxy' can be used
to prevent the internal resolver from forwarding requests (restoring the
old behaviour).
Signed-off-by: Rob Murray <rob.murray@docker.com>
- deprecate Prestart hook
- deprecate kernel memory limits
Additions
- config: add idmap and ridmap mount options
- config.md: allow empty mappings for [r]idmap
- features-linux: Expose idmap information
- mount: Allow relative mount destinations on Linux
- features: add potentiallyUnsafeConfigAnnotations
- config: add support for org.opencontainers.image annotations
Minor fixes:
- config: improve bind mount and propagation doc
full diff: https://github.com/opencontainers/runtime-spec/compare/v1.1.0...v1.2.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds some nolint-comments for the deprecated kernel-memory options; we
deprecated these, but they could technically still be accepted by alternative
runtimes.
daemon/daemon_unix.go:108:3: SA1019: memory.Kernel is deprecated: kernel-memory limits are not supported in cgroups v2, and were obsoleted in [kernel v5.4]. This field should no longer be used, as it may be ignored by runtimes. (staticcheck)
memory.Kernel = &config.KernelMemory
^
daemon/update_linux.go:63:3: SA1019: memory.Kernel is deprecated: kernel-memory limits are not supported in cgroups v2, and were obsoleted in [kernel v5.4]. This field should no longer be used, as it may be ignored by runtimes. (staticcheck)
memory.Kernel = &resources.KernelMemory
^
Prestart hooks are deprecated, and more granular hooks should be used instead.
CreateRuntime are the closest equivalent, and executed in the same locations
as Prestart-hooks, but depending on what these hooks do, possibly one of the
other hooks could be used instead (such as CreateContainer or StartContainer).
As these hooks are still supported, this patch adds nolint comments, but adds
some TODOs to consider migrating to something else;
daemon/nvidia_linux.go:86:2: SA1019: s.Hooks.Prestart is deprecated: use [Hooks.CreateRuntime], [Hooks.CreateContainer], and [Hooks.StartContainer] instead, which allow more granular hook control during the create and start phase. (staticcheck)
s.Hooks.Prestart = append(s.Hooks.Prestart, specs.Hook{
^
daemon/oci_linux.go:76:5: SA1019: s.Hooks.Prestart is deprecated: use [Hooks.CreateRuntime], [Hooks.CreateContainer], and [Hooks.StartContainer] instead, which allow more granular hook control during the create and start phase. (staticcheck)
s.Hooks.Prestart = append(s.Hooks.Prestart, specs.Hook{
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
No IPAM IPv6 address is given to an interface in a network with
'--ipv6=false', but the kernel would assign a link-local address and,
in a macvlan/ipvlan network, the interface may get a SLAAC-assigned
address.
So, disable IPv6 on the interface to avoid that.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This reverts commit a77e147d32.
The ipvlan integration tests have been skipped in CI because of a check
intended to ensure the kernel has ipvlan support - which failed, but
seems to be unnecessary (probably because kernels have moved on).
Signed-off-by: Rob Murray <rob.murray@docker.com>
We document that an macvlan network with no parent interface is
equivalent to a '--internal' network. But, in this case, an macvlan
network was still configured with a gateway. So, DNS proxying would
be enabled in the internal resolver (and, if the host's resolver
was on a localhost address, requests to external resolvers from the
host's network namespace would succeed).
This change disables configuration of a gateway for a macvlan Endpoint
if no parent interface is specified.
(Note if a parent interface with no external network is supplied as
'-o parent=<dummy>', the gateway will still be set up. Documentation
will need to be updated to note that '--internal' should be used to
prevent DNS request forwarding in this case.)
Signed-off-by: Rob Murray <rob.murray@docker.com>
The internal DNS resolver should only forward requests to external
resolvers if the libnetwork.Sandbox served by the resolver has external
network access (so, no forwarding for '--internal' networks).
The test for external network access was whether the Sandbox had an
Endpoint with a gateway configured.
However, an ipvlan-l3 networks with external network access does not
have a gateway, it has a default route bound to an interface.
Also, we document that an ipvlan network with no parent interface is
equivalent to a '--internal' network. But, in this case, an ipvlan-l2
network was configured with a gateway. So, DNS proxying would be enabled
in the internal resolver (and, if the host's resolver was on a localhost
address, requests to external resolvers from the host's network
namespace would succeed).
So, this change adjusts the test for enabling DNS proxying to include
a check for '--internal' (as a shortcut) and, for non-internal networks,
checks for a default route as well as a gateway. It also disables
configuration of a gateway or a default route for an ipvlan Endpoint if
no parent interface is specified.
(Note if a parent interface with no external network is supplied as
'-o parent=<dummy>', the gateway/default route will still be set up
and external DNS proxying will be enabled. The network must be
configured as '--internal' to prevent that from happening.)
Signed-off-by: Rob Murray <rob.murray@docker.com>
go1.21.9 (released 2024-04-03) includes a security fix to the net/http
package, as well as bug fixes to the linker, and the go/types and
net/http packages. See the [Go 1.21.9 milestone](https://github.com/golang/go/issues?q=milestone%3AGo1.21.9+label%3ACherryPickApproved)
for more details.
These minor releases include 1 security fixes following the security policy:
- http2: close connections when receiving too many headers
Maintaining HPACK state requires that we parse and process all HEADERS
and CONTINUATION frames on a connection. When a request's headers exceed
MaxHeaderBytes, we don't allocate memory to store the excess headers but
we do parse them. This permits an attacker to cause an HTTP/2 endpoint
to read arbitrary amounts of header data, all associated with a request
which is going to be rejected. These headers can include Huffman-encoded
data which is significantly more expensive for the receiver to decode
than for an attacker to send.
Set a limit on the amount of excess header frames we will process before
closing a connection.
Thanks to Bartek Nowotarski (https://nowotarski.info/) for reporting this issue.
This is CVE-2023-45288 and Go issue https://go.dev/issue/65051.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.22.2
- https://github.com/golang/go/issues?q=milestone%3AGo1.21.9+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.21.8...go1.21.9
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
full diff: https://github.com/golang/net/compare/v0.22.0...v0.23.0
Includes a fix for CVE-2023-45288, which is also addressed in go1.22.2
and go1.21.9;
> http2: close connections when receiving too many headers
>
> Maintaining HPACK state requires that we parse and process
> all HEADERS and CONTINUATION frames on a connection.
> When a request's headers exceed MaxHeaderBytes, we don't
> allocate memory to store the excess headers but we do
> parse them. This permits an attacker to cause an HTTP/2
> endpoint to read arbitrary amounts of data, all associated
> with a request which is going to be rejected.
>
> Set a limit on the amount of excess header frames we
> will process before closing a connection.
>
> Thanks to Bartek Nowotarski for reporting this issue.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diffs changes relevant to vendored code:
- https://github.com/golang/net/compare/v0.18.0...v0.22.0
- websocket: add support for dialing with context
- http2: remove suspicious uint32->v conversion in frame code
- http2: send an error of FLOW_CONTROL_ERROR when exceed the maximum octets
- https://github.com/golang/crypto/compare/v0.17.0...v0.21.0
- internal/poly1305: drop Go 1.12 compatibility
- internal/poly1305: improve sum_ppc64le.s
- ocsp: don't use iota for externally defined constants
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
illumos is the opensource continuation of OpenSolaris after Oracle
closed to source it (again).
For example use see: https://github.com/openbao/openbao/pull/205.
Signed-off-by: Jasper Siepkes <siepkes@serviceplanet.nl>
This was brought up by bmitch that its not expected to have a platform
object in the config descriptor.
Also checked with tianon who agreed, its not _wrong_ but is unexpected
and doesn't neccessarily make sense to have it there.
Also, while technically incorrect, ECR is throwing an error when it sees
this.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This was using `errors.Wrap` when there was no error to wrap, meanwhile
we are supposed to be creating a new error.
Found this while investigating some log corruption issues and
unexpectedly getting a nil reader and a nil error from `getTailReader`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The NetworkMode "default" is now normalized into the value it
aliases ("bridge" on Linux and "nat" on Windows) by the
ContainerCreate endpoint, the legacy image builder, Swarm's
cluster executor and by the container restore codepath.
builder-next is left untouched as it already uses the normalized
value (ie. bridge).
Going forward, this will make maintenance easier as there's one
less NetworkMode to care about.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Partially reverts 0046b16 "daemon: set libnetwork sandbox key w/o OCI hook"
Running SetKey to store the OCI Sandbox key after task creation, rather
than from the OCI prestart hook, meant it happened after sysctl settings
were applied by the runtime - which was the intention, we wanted to
complete Sandbox configuration after IPv6 had been disabled by a sysctl
if that was going to happen.
But, it meant '--sysctl' options for a specfic network interface caused
container task creation to fail, because the interface is only moved into
the network namespace during SetKey.
This change restores the SetKey prestart hook, and regenerates config
files that depend on the container's support for IPv6 after the task has
been created. It also adds a regression test that makes sure it's possible
to set an interface-specfic sysctl.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Partially reverts 0046b16 "daemon: set libnetwork sandbox key w/o OCI hook"
Running SetKey to store the OCI Sandbox key after task creation, rather
than from the OCI prestart hook, meant it happened after sysctl settings
were applied by the runtime - which was the intention, we wanted to
complete Sandbox configuration after IPv6 had been disabled by a sysctl
if that was going to happen.
But, it meant '--sysctl' options for a specfic network interface caused
container task creation to fail, because the interface is only moved into
the network namespace during SetKey.
This change restores the SetKey prestart hook, and regenerates config
files that depend on the container's support for IPv6 after the task has
been created. It also adds a regression test that makes sure it's possible
to set an interface-specfic sysctl.
Signed-off-by: Rob Murray <rob.murray@docker.com>
The `identity.ChainIDs` call was accidentally removed in
b37ced2551.
This broke the shared size calculation for images with more than one
layer that were sharing the same compressed layer.
This was could be reproduced with:
```
$ docker pull docker.io/docker/desktop-kubernetes-coredns:v1.11.1
$ docker pull docker.io/docker/desktop-kubernetes-etcd:3.5.10-0
$ docker system df
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
After a535a65c4b the size reported by the
image list was changed to include all platforms of that image.
This made the "shared size" calculation consider all diff ids of all the
platforms available in the image which caused "snapshot not found"
errors when multiple images were sharing the same layer which wasn't
unpacked.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is better because every possible platform combination
does not need to be defined in the Dockerfile. If built
for platform where Delve is not supported then it is just
skipped.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Copy the swagger / OpenAPI file to the documentation. This is the API
version used by the upcoming v26.0.0 release.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Benchmark the `Images` implementation (image list) against an image
store with 10, 100 and 1000 random images. Currently the images are
single-platform only.
The images are generated randomly, but a fixed seed is used so the
actual testing data will be the same across different executions.
Because the content store is not a real containerd image store but a
local implementation, a small delay (500us) is added to each content
store method call. This is to simulate a real-world usage where each
containerd client call requires a gRPC call.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Commit 8921897e3b introduced the uses of `clear()`,
which requires go1.21, but Go is downgrading this file to go1.16 when used in
other projects (due to us not yet being a go module);
0.175 + xx-go build '-gcflags=' -ldflags '-X github.com/moby/buildkit/version.Version=b53a13e -X github.com/moby/buildkit/version.Revision=b53a13e4f5c8d7e82716615e0f23656893df89af -X github.com/moby/buildkit/version.Package=github.com/moby/buildkit -extldflags '"'"'-static'"'" -tags 'osusergo netgo static_build seccomp ' -o /usr/bin/buildkitd ./cmd/buildkitd
181.8 # github.com/docker/docker/libnetwork/internal/resolvconf
181.8 vendor/github.com/docker/docker/libnetwork/internal/resolvconf/resolvconf.go:509:2: clear requires go1.21 or later (-lang was set to go1.16; check go.mod)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
52a80b40e2 extracted the `imageSummary`
function but introduced a bug causing the whole caller function to
return if the image should be skipped.
`imageSummary` returns a nil error and nil image when the image doesn't
have any platform or all its platforms are not available locally.
In this case that particular image should be skipped, instead of failing
the whole image list operation.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't run filter function which would only run through the images
reading theirs config without checking any label anyway.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
commit c655b7dc78 added a check to make sure
the TMP_OUT variable was not set to an empty value, as such a situation would
perform an `rm -rf /**` during cleanup.
However, it was a bit too eager, because Makefile conditionals (`ifeq`) are
evaluated when parsing the Makefile, which happens _before_ the make target
is executed.
As a result `$@_TMP_OUT` was always empty when the `ifeq` was evaluated,
making it not possible to execute the `generate-files` target.
This patch changes the check to use a shell command to evaluate if the var
is set to an empty value.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fix `error mounting "/etc/hosts" to rootfs at "/etc/hosts": mount
/etc/hosts:/etc/hosts (via /proc/self/fd/6), flags: 0x5021: operation
not permitted`.
This error was introduced in 7d08d84b03
(`dockerd-rootless.sh: set rootlesskit --state-dir=DIR`) that changed
the filesystem of the state dir from /tmp to /run (in a typical setup).
Fix issue 47248
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This code is currently only used in the daemon, but is also needed in other
places. We should consider moving this code to github.com/moby/sys, so that
BuildKit can also use the same implementation instead of maintaining a fork;
moving it to internal allows us to reuse this code inside the repository, but
does not allow external consumers to depend on it (which we don't want as
it's not a permanent location).
As our code only uses this in linux files, I did not add a stub for other
platforms (but we may decide to do that in the moby/sys repository).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit cbc2a71c2 makes `connect` syscall fail fast when a container is
only attached to an internal network. Thanks to that, if such a
container tries to resolve an "external" domain, the embedded resolver
returns an error immediately instead of waiting for a timeout.
This commit makes sure the embedded resolver doesn't even try to forward
to upstream servers.
Co-authored-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
Adds an experimental `DOCKER_BUILDKIT_RUNC_COMMAND` variable that allows
to specify different runc-compatible binary to be used by the buildkit's
runc executor.
This allows runtimes like sysbox be used for the containers spawned by
buildkit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
full diffs:
- https://github.com/protocolbuffers/protobuf-go/compare/v1.31.0...v1.33.0
- https://github.com/golang/protobuf/compare/v1.5.3...v1.5.4
From the Go security announcement list;
> Version v1.33.0 of the google.golang.org/protobuf module fixes a bug in
> the google.golang.org/protobuf/encoding/protojson package which could cause
> the Unmarshal function to enter an infinite loop when handling some invalid
> inputs.
>
> This condition could only occur when unmarshaling into a message which contains
> a google.protobuf.Any value, or when the UnmarshalOptions.UnmarshalUnknown
> option is set. Unmarshal now correctly returns an error when handling these
> inputs.
>
> This is CVE-2024-24786.
In a follow-up post;
> A small correction: This vulnerability applies when the UnmarshalOptions.DiscardUnknown
> option is set (as well as when unmarshaling into any message which contains a
> google.protobuf.Any). There is no UnmarshalUnknown option.
>
> In addition, version 1.33.0 of google.golang.org/protobuf inadvertently
> introduced an incompatibility with the older github.com/golang/protobuf
> module. (https://github.com/golang/protobuf/issues/1596) Users of the older
> module should update to github.com/golang/protobuf@v1.5.4.
govulncheck results in our code:
govulncheck ./...
Scanning your code and 1221 packages across 204 dependent modules for known vulnerabilities...
=== Symbol Results ===
Vulnerability #1: GO-2024-2611
Infinite loop in JSON unmarshaling in google.golang.org/protobuf
More info: https://pkg.go.dev/vuln/GO-2024-2611
Module: google.golang.org/protobuf
Found in: google.golang.org/protobuf@v1.31.0
Fixed in: google.golang.org/protobuf@v1.33.0
Example traces found:
#1: daemon/logger/gcplogs/gcplogging.go:154:18: gcplogs.New calls logging.Client.Ping, which eventually calls json.Decoder.Peek
#2: daemon/logger/gcplogs/gcplogging.go:154:18: gcplogs.New calls logging.Client.Ping, which eventually calls json.Decoder.Read
#3: daemon/logger/gcplogs/gcplogging.go:154:18: gcplogs.New calls logging.Client.Ping, which eventually calls protojson.Unmarshal
Your code is affected by 1 vulnerability from 1 module.
This scan found no other vulnerabilities in packages you import or modules you
require.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Turn warnings into a deprecation notice and highlight that it will
prevent daemon startup in future releases.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- full diff: https://github.com/containerd/containerd/compare/v1.7.13...v1.7.14
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.14
Welcome to the v1.7.14 release of containerd!
The fourteenth patch release for containerd 1.7 contains various fixes and updates.
Highlights
- Update builds to use go 1.21.8
- Fix various timing issues with docker pusher
- Register imagePullThroughput and count with MiB
- Move high volume event logs to Trace level
Container Runtime Interface (CRI)
- Handle pod transition states gracefully while listing pod stats
Runtime
- Update runc-shim to process exec exits before init
Dependency Changes
- github.com/containerd/nri v0.4.0 -> v0.6.0
- github.com/containerd/ttrpc v1.2.2 -> v1.2.3
- google.golang.org/genproto/googleapis/rpc 782d3b101e98 -> cbb8c96f2d6d
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
With both rootless and live restore enabled, there's some race condition
which causes the container to be `Unmount`ed before the refcount is
restored.
This makes sure we don't underflow the refcount (uint64) when
decrementing it.
The root cause of this race condition still needs to be investigated and
fixed, but at least this unflakies the `TestLiveRestore`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use a separate `devcontainer` Dockerfile target, this allows to include
the `gopls` in the devcontainer so it doesn't have to be installed by
the Go vscode extension.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make sure the `ping` command used by `TestBridgeICC` actually has
the `-6` flag when it runs IPv6 test cases. Without this flag,
IPv6 connectivity isn't tested properly.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Currently this won't have any real effect because the platform matcher
matches all platform and is only used for sorting.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Move containers counting out of `singlePlatformImage` and count them
based on the `ImageManifest` property.
(also remove ChainIDs calculation as they're no longer used)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Avoid fetching `SnapshotService` from client every time. Fetch it once
and then store when creating the image service.
This also allows to pass custom snapshotter implementation for unit
testing.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use `image.Store` and `content.Store` stored in the ImageService struct
instead of fetching it every time from containerd client.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Both containerd and graphdriver image service use the same code to
create the cache - they only supply their own `cacheAdaptor` struct.
Extract the shared code to `cache.New`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Move image store backend specific code out of the cache code and move it
to a separate interface to allow using the same cache code with
containerd image store.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Rather than error out if the host's resolv.conf has a bad ndots option,
just ignore it. Still validate ndots supplied via '--dns-option' and
treat failure as an error.
Signed-off-by: Rob Murray <rob.murray@docker.com>
When this was called concurrently from the moby image
exporter there could be a data race where a layer was
written to the refs map when it was already there.
In that case the reference count got mixed up and on
release only one of these layers was actually released.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
When IPv6 is disabled in a container by, for example, using the --sysctl
option - an IPv6 address/gateway is still allocated. Don't attempt to
apply that config because doing so enables IPv6 on the interface.
Signed-off-by: Rob Murray <rob.murray@docker.com>
When configuring the internal DNS resolver - rather than keep IPv6
nameservers read from the host's resolv.conf in the container's
resolv.conf, treat them like IPv4 addresses and use them as upstream
resolvers.
For IPv6 nameservers, if there's a zone identifier in the address or
the container itself doesn't have IPv6 support, mark the upstream
addresses for use in the host's network namespace.
Signed-off-by: Rob Murray <rob.murray@docker.com>
RootlessKit will print hints if something is still unsatisfied.
e.g., `kernel.apparmor_restrict_unprivileged_userns` constraint
rootless-containers/rootlesskit@33c3e7ca6c
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
In de2447c, the creation of the 'lower' file was changed from using
os.Create to using ioutils.AtomicWriteFile, which ignores the system's
umask. This means that even though the requested permission in the
source code was always 0666, it was 0644 on systems with default
umask of 0022 prior to de2447c, so the move to AtomicFile potentially
increased the file's permissions.
This is not a security issue because the parent directory does not
allow writes into the file, but it can confuse security scanners on
Linux-based systems into giving false positives.
Signed-off-by: Jaroslav Jindrak <dzejrou@gmail.com>
The field will still be present in the response, but will always be
`false`.
Searching for `is-automated=true` will yield no results, while
`is-automated=false` will effectively be a no-op.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When using devcontainers in VSCode, install the Go extension
automatically in the container.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
While github.com/stretchr/testify is not used directly by any of the
repository code, it is a transitive dependency via Swarmkit and
therefore still easy to use without having to revendor. Add lint rules
to ban importing testify packages to make sure nobody does.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Apply command gotest.tools/v3/assert/cmd/gty-migrate-from-testify to the
cnmallocator package to be consistent with the assertion library used
elsewhere in moby.
Signed-off-by: Cory Snider <csnider@mirantis.com>
In a container-create API request, HostConfig.NetworkMode (the identity
of the "main" network) may be a name, id or short-id.
The configuration for that network, including preferred IP address etc,
may be keyed on network name or id - it need not match the NetworkMode.
So, when migrating the old container-wide MAC address to the new
per-endpoint field - it is not safe to create a new EndpointSettings
entry unless there is no possibility that it will duplicate settings
intended for the same network (because one of the duplicates will be
discarded later, dropping the settings it contains).
This change introduces a new API restriction, if the deprecated container
wide field is used in the new API, and EndpointsConfig is provided for
any network, the NetworkMode and key under which the EndpointsConfig is
store must be the same - no mixing of ids and names.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This message accidentally changed in ac2a028dcc
because my IDE's "refactor tool" was a bit over-enthusiastic. It also went and
updated the tests accordingly, so CI didn't catch this :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Moby imports Swarmkit; Swarmkit no longer imports Moby. In order to
accomplish this feat, Swarmkit has introduced a new plugin.Getter
interface so it could stop importing our pkg/plugingetter package. This
new interface is not entirely compatible with our
plugingetter.PluginGetter interface, necessitating a thin adapter.
Swarmkit had to jettison the CNM network allocator to stop having to
import libnetwork as the cnmallocator package is deeply tied to
libnetwork. Move the CNM network allocator into libnetwork, where it
belongs. The package had a short an uninteresting Git history in the
Swarmkit repository so no effort was made to retain history.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This patch disables pulling legacy (schema1 and schema 2, version 1) images by
default.
A `DOCKER_ENABLE_DEPRECATED_PULL_SCHEMA_1_IMAGE` environment-variable is
introduced to allow re-enabling this feature, aligning with the environment
variable used in containerd 2.0 (`CONTAINERD_ENABLE_DEPRECATED_PULL_SCHEMA_1_IMAGE`).
With this patch, attempts to pull a legacy image produces an error:
With graphdrivers:
docker pull docker:1.0
1.0: Pulling from library/docker
[DEPRECATION NOTICE] Docker Image Format v1, and Docker Image manifest version 2, schema 1 support will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format, or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
With the containerd image store enabled, output is slightly different
as it returns the error before printing the `1.0: pulling ...`:
docker pull docker:1.0
Error response from daemon: [DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
Using the "distribution" endpoint to resolve the digest for an image also
produces an error:
curl -v --unix-socket /var/run/docker.sock http://foo/distribution/docker.io/library/docker:1.0/json
* Trying /var/run/docker.sock:0...
* Connected to foo (/var/run/docker.sock) port 80 (#0)
> GET /distribution/docker.io/library/docker:1.0/json HTTP/1.1
> Host: foo
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 400 Bad Request
< Api-Version: 1.45
< Content-Type: application/json
< Docker-Experimental: false
< Ostype: linux
< Server: Docker/dev (linux)
< Date: Tue, 27 Feb 2024 16:09:42 GMT
< Content-Length: 354
<
{"message":"[DEPRECATION NOTICE] Docker Image Format v1, and Docker Image manifest version 2, schema 1 support will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format, or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/"}
* Connection #0 to host foo left intact
Starting the daemon with the `DOCKER_ENABLE_DEPRECATED_PULL_SCHEMA_1_IMAGE`
env-var set to a non-empty value allows pulling the image;
docker pull docker:1.0
[DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
b0a0e6710d13: Already exists
d193ad713811: Already exists
ba7268c3149b: Already exists
c862d82a67a2: Already exists
Digest: sha256:5e7081837926c7a40e58881bbebc52044a95a62a2ea52fb240db3fc539212fe5
Status: Image is up to date for docker:1.0
docker.io/library/docker:1.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When creating a new daemon in the `TestDaemonProxy`, reset the
`OTEL_EXPORTER_OTLP_ENDPOINT` to an empty value to disable OTEL
collection to avoid it hitting the proxy.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This should allow to enable host loopback by setting
DOCKERD_ROOTLESS_ROOTLESSKIT_DISABLE_HOST_LOOPBACK to false,
defaults true.
Signed-off-by: serhii.n <serhii.n@thescimus.com>
Don't use all `*.json` files blindly, take only these that are likely to
be reports from go test.
Also, use `find ... -exec` instead of piping results to `xargs`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
For current implementation of Checkpoint Restore (C/R) in docker, it
will write the checkpoint to content store. However, when restoring
libcontainerd uses .Digest().Encoded(), which will remove the info
of alg, leading to error.
Signed-off-by: huang-jl <1046678590@qq.com>
Buildkit added support for exporting metrics in:
7de2e4fb32
Explicitly set the protocol for exporting metrics like we do for the
traces. We need that because Buildkit defaults to grpc.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
30c069cb03
removed the `ResolveImageConfig` method in favor of more generic
`ResolveSourceMetadata` that can also support other things than images.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
e358792815
changed that field to a function and added an `OverrideResource`
function that allows to override it.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
StaticDirSource definition changed and can no longer be initialized from
the composite literal.
a80b48544c
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All other progress updates are emitted with truncated id.
```diff
$ docker pull --platform linux/amd64 alpine
Using default tag: latest
latest: Pulling from library/alpine
-sha256:4abcf20661432fb2d719aaf90656f55c287f8ca915dc1c92ec14ff61e67fbaf8: Pulling fs layer
+4abcf2066143: Download complete
Digest: sha256:c5b1261d6d3e43071626931fc004f70149baeba2c8ec672bd4f27761f8e1ad6b
Status: Image is up to date for alpine:latest
docker.io/library/alpine:latest
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't change the behavior for older clients and keep the same behavior.
Otherwise client can't opt-out (because `ReadOnlyNonRecursive` is
unsupported before 1.44).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Commit e6907243af applied a fix for situations
where the client was configured with API-version negotiation, but did not yet
negotiate a version.
However, the checkVersion() function that was implemented copied the semantics
of cli.NegotiateAPIVersion, which ignored connection failures with the
assumption that connection errors would still surface further down.
However, when using the result of a failed negotiation for NewVersionError,
an API version mismatch error would be produced, masking the actual connection
error.
This patch changes the signature of checkVersion to return unexpected errors,
including failures to connect to the API.
Before this patch:
docker -H unix:///no/such/socket.sock secret ls
"secret list" requires API version 1.25, but the Docker daemon API version is 1.24
With this patch applied:
docker -H unix:///no/such/socket.sock secret ls
Cannot connect to the Docker daemon at unix:///no/such/socket.sock. Is the docker daemon running?
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function has various errors that are returned when failing to make a
connection (due to permission issues, TLS mis-configuration, or failing to
resolve the TCP address).
The errConnectionFailed error is currently used as a special case when
processing Ping responses. The current code did not consistently treat
connection errors, and because of that could either absorb the error,
or process the empty response.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
NegotiateAPIVersion was ignoring errors returned by Ping. The intent here
was to handle API responses from a daemon that may be in an unhealthy state,
however this case is already handled by Ping itself.
Ping only returns an error when either failing to connect to the API (daemon
not running or permissions errors), or when failing to parse the API response.
Neither of those should be ignored in this code, or considered a successful
"ping", so update the code to return
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was added in 27ef09a46f, which changed
the Ping handling to ignore internal server errors. That case is tested in
TestPingFail, which verifies that we accept the Ping response if a 500
status code was received.
The TestPingWithError test was added to verify behavior if a protocol
(connection) error occurred; however the mock-client returned both a
response, and an error; the error returned would only happen if a connection
error occurred, which means that the server would not provide a reply.
Running the test also shows that returning a response is unexpected, and
ignored:
=== RUN TestPingWithError
2024/02/23 14:16:49 RoundTripper returned a response & error; ignoring response
2024/02/23 14:16:49 RoundTripper returned a response & error; ignoring response
--- PASS: TestPingWithError (0.00s)
PASS
This patch updates the test to remove the response.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't error out when mount source doesn't exist and mounts has
`CreateMountpoint` option enabled.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Any PR that is labeled with any `impact/*` label should have a
description for the changelog and an `area/*` label.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
A common pattern in libnetwork is to delete an object using
`DeleteAtomic`, ie. to check the optimistic lock, but put in a retry
loop to refresh the data and the version index used by the optimistic
lock.
This commit introduces a new `Delete` method to delete without
checking the optimistic lock. It focuses only on the few places where
it's obvious the calling code doesn't rely on the side-effects of the
retry loop (ie. refreshing the object to be deleted).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
I noticed that this log didn't use structured logs;
[resolver] failed to query DNS server: 10.115.11.146:53, query: ;google.com.\tIN\t A" error="read udp 172.19.0.2:46361->10.115.11.146:53: i/o timeout
[resolver] failed to query DNS server: 10.44.139.225:53, query: ;google.com.\tIN\t A" error="read udp 172.19.0.2:53991->10.44.139.225:53: i/o timeout
But other logs did;
DEBU[2024-02-20T15:48:51.026704088Z] [resolver] forwarding query client-addr="udp:172.19.0.2:39661" dns-server="udp:192.168.65.7:53" question=";google.com.\tIN\t A"
DEBU[2024-02-20T15:48:51.028331088Z] [resolver] forwarding query client-addr="udp:172.19.0.2:35163" dns-server="udp:192.168.65.7:53" question=";google.com.\tIN\t AAAA"
DEBU[2024-02-20T15:48:51.057329755Z] [resolver] received AAAA record "2a00:1450:400e:801::200e" for "google.com." from udp:192.168.65.7
DEBU[2024-02-20T15:48:51.057666880Z] [resolver] received A record "142.251.36.14" for "google.com." from udp:192.168.65.7
As we're already constructing a logger with these fields, we may as well use it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Allow to override the PAGER/GIT_PAGER variables inside the container.
Use `cat` as pager when running in Github Actions (to avoid things like
`git diff` stalling the CI).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't use OTEL tracing in this test because we're testing the HTTP proxy
behavior here and we don't want OTEL to interfere.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This will return a single entry for each name/value pair, and for now
all the "image specific" metadata (labels, config, size) should be
either "default platform" or "first platform we have locally" (which
then matches the logic for commands like `docker image inspect`, etc)
with everything else (just ID, maybe?) coming from the manifest
list/index.
That leaves room for the longer-term implementation to add new fields to
describe the _other_ images that are part of the manifest list/index.
Co-authored-by: Tianon Gravi <admwiggin@gmail.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
v1.33.0 is also available, but it would also cause
`github.com/aws/aws-sdk-go-v2` change from v1.24.1 to v1.25.0
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
DNS names were only set up for user-defined networks. On Linux, none
of the built-in networks (bridge/host/none) have built-in DNS, so they
don't need DNS names.
But, on Windows, the default network is "nat" and it does need the DNS
names.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This matches the prior behavior before 2a6ff3c24f.
This also updates the Swagger documentation for the current version to note that the field might be the empty string and what that means.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Archives being unpacked by Dockerfiles may have been created on other
OSes with different conventions and semantics for xattrs, making them
impossible to apply when extracting. Restore the old best-effort xattr
behaviour users have come to depend on in the classic builder.
The (archive.Archiver).UntarPath function does not allow the options
passed to Untar to be customized. It also happens to be a trivial
wrapper around the Untar function. Inline the function body and add the
option.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Update to the latest patch release, which contains changes from v0.13.5 to
remove the reference package from "github.com/docker/distribution", which
is now a separate module.
full diff: https://github.com/containerd/nydus-snapshotter/compare/v0.8.2...v0.13.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Non-swarm networks created before network-creation-time validation
was added in 25.0.0 continued working, because the checks are not
re-run.
But, swarm creates networks when needed (with 'agent=true'), to
ensure they exist on each agent - ignoring the NetworkNameError
that says the network already existed.
By ignoring validation errors on creation of a network with
agent=true, pre-existing swarm networks with IPAM config that would
fail the new checks will continue to work too.
New swarm (overlay) networks are still validated, because they are
initially created with 'agent=false'.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This spec is not directly relevant for the image spec, and the Docker
documentation no longer includes the actual specification.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prior to release 25.0.0, the bridge in an internal network was assigned
an IP address - making the internal network accessible from the host,
giving containers on the network access to anything listening on the
bridge's address (or INADDR_ANY on the host).
This change restores that behaviour. It does not restore the default
route that was configured in the container, because packets sent outside
the internal network's subnet have always been dropped. So, a 'connect()'
to an address outside the subnet will still fail fast.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Replace regex matching/replacement and re-reading of generated files
with a simple parser, and struct to remember and manipulate the file
content.
Annotate the generated file with a header comment saying the file is
generated, but can be modified, and a trailing comment describing how
the file was generated and listing external nameservers.
Always start with the host's resolv.conf file, whether generating config
for host networking, or with/without an internal resolver - rather than
editing a file previously generated for a different use-case.
Resolves an issue where rewrites of the generated file resulted in
default IPv6 nameservers being unnecessarily added to the config.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This const contains the minimum API version that can be supported by the
API server. The daemon is currently configured to use the same version,
but we may increment the _configured_ minimum version when deprecating
old API versions in future.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 08e4e88482 (Docker Engine v25.0.0)
deprecated API version v1.23 and lower, but older API versions could be
enabled through the DOCKER_MIN_API_VERSION environment variable.
This patch removes all support for API versions < v1.24.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.20 (Docker Engine v1.11.0) and older allowed a HostConfig to be passed
when starting a container. This feature was deprecated in API v1.21 (Docker
Engine v1.10.0) in 3e7405aea8, and removed in
API v1.23 (Docker Engine v1.12.0) in commit 0a8386c8be.
API v1.23 and older are deprecated, and this patch removes the feature.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 322e2a7d05 changed the format of errors
returned by the API to be in JSON format for API v1.24. Older versions of
the API returned errors in plain-text format.
API v1.23 and older are deprecated, so we can remove support for plain-text
error responses.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This endpoint was deprecated in API v1.20 (Docker Engine v1.8.0) in
commit db9cc91a9e, in favor of the
`PUT /containers/{id}/archive` and `HEAD /containers/{id}/archive`
endpoints, and disabled in API v1.24 (Docker Engine v1.12.0) through
commit 428328908d.
This patch removes the endpoint, and the associated `daemon.ContainerCopy`
method in the backend.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.21 (Docker Engine v1.9.0) enforces the request to have a JSON
content-type on exec start (see 45dc57f229).
An exception was added in 0b5e628e14 to
make this check conditional (supporting API < 1.21).
API v1.23 and older are deprecated, and this patch removes the feature.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.23 and older are deprecated, so we can remove the code to adjust
responses for API v1.20 and lower.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The TestInspectAPIContainerResponse mentioned that Windows does not
support API versions before v1.25.
While technically, no stable release existed for Windows with API versions
before that (see f811d5b128), API version
v1.24 was enabled in e4af39aeb3, to have
a consistend fallback version for API version negotiation.
This patch updates the test to reflect that change.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.23 and older are deprecated, so we can remove the code to adjust
responses for API v1.19 and lower.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.20 and up produces an error when signalling / killing a non-running
container (see c92377e300). Older API versions
allowed this, and an exception was added in 621e3d8587.
API v1.23 and older are deprecated, so we can remove this handling.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API versions before 1.19 allowed CpuShares that were greater than the maximum
or less than the minimum supported by the kernel, and relied on the kernel to
do the right thing.
Commit ed39fbeb2a introduced code to adjust the
CPU shares to be within the accepted range when using API version 1.18 or
lower.
API v1.23 and older are deprecated, so we can remove support for this
functionality.
Currently, there's no validation for CPU shares to be within an acceptable
range; a TODO was added to add validation for this option, and to use the
`linuxMinCPUShares` and `linuxMaxCPUShares` consts for this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "pull" option was added in API v1.16 (Docker Engine v1.4.0) in commit
054e57a622, which gated the option by API
version.
API v1.23 and older are deprecated, so we can remove the gate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "rm" option was made the default in API v1.12 (Docker Engine v1.0.0)
in commit b60d647172, and "force-rm" was
added in 667e2bd4ea.
API v1.23 and older are deprecated, so we can remove these gates.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "pause" flag was added in API v1.13 (Docker Engine v1.1.0), and is
enabled by default (see 17d870bed5).
API v1.23 and older are deprecated, so we can remove the version-gate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Inspect and history used two different ways to find the present images.
This made history fail in some cases where image inspect would work (if
a configuration of a manifest wasn't found in the content store).
With this change we now use the same logic for both inspect and history.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 9f6c532f59
futex: Add sys_futex_wake()
To complement sys_futex_waitv() add sys_futex_wake(). This syscall
implements what was previously known as FUTEX_WAKE_BITSET except it
uses 'unsigned long' for the bitmask and takes FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cb8c4312af
futex: Add sys_futex_wait()
To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
syscall implements what was previously known as FUTEX_WAIT_BITSET
except it uses 'unsigned long' for the value and bitmask arguments,
takes timespec and clockid_t arguments for the absolute timeout and
uses FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 0f4b5f9722
futex: Add sys_futex_requeue()
Finish off the 'simple' futex2 syscall group by adding
sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are
too numerous to fit into a regular syscall. As such, use struct
futex_waitv to pass the 'source' and 'destination' futexes to the
syscall.
This syscall implements what was previously known as FUTEX_CMP_REQUEUE
and uses {val, uaddr, flags} for source and {uaddr, flags} for
destination.
This design explicitly allows requeueing between different types of
futex by having a different flags word per uaddr.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: c35559f94e
x86/shstk: Introduce map_shadow_stack syscall
When operating with shadow stacks enabled, the kernel will automatically
allocate shadow stacks for new threads, however in some cases userspace
will need additional shadow stacks. The main example of this is the
ucontext family of functions, which require userspace allocating and
pivoting to userspace managed stacks.
Unlike most other user memory permissions, shadow stacks need to be
provisioned with special data in order to be useful. They need to be setup
with a restore token so that userspace can pivot to them via the RSTORSSP
instruction. But, the security design of shadow stacks is that they
should not be written to except in limited circumstances. This presents a
problem for userspace, as to how userspace can provision this special
data, without allowing for the shadow stack to be generally writable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 09da082b07
fs: Add fchmodat2()
On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].
There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.
The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
unlike existing fchmodat.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cf264e1329
NAME
cachestat - query the page cache statistics of a file.
SYNOPSIS
#include <sys/mman.h>
struct cachestat_range {
__u64 off;
__u64 len;
};
struct cachestat {
__u64 nr_cache;
__u64 nr_dirty;
__u64 nr_writeback;
__u64 nr_evicted;
__u64 nr_recently_evicted;
};
int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
struct cachestat *cstat, unsigned int flags);
DESCRIPTION
cachestat() queries the number of cached pages, number of dirty
pages, number of pages marked for writeback, number of evicted
pages, number of recently evicted pages, in the bytes range given by
`off` and `len`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syscall is gated by CAP_SYS_NICE, matching the profile in containerd.
containerd: a6e52c74fa
libseccomp: d83cb7ac25
kernel: c6018b4b25
mm/mempolicy: add set_mempolicy_home_node syscall
This syscall can be used to set a home node for the MPOL_BIND and
MPOL_PREFERRED_MANY memory policy. Users should use this syscall after
setting up a memory policy for the specified range as shown below.
mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
new_nodes->size + 1, 0);
sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
home_node, 0);
The syscall allows specifying a home node/preferred node from which
kernel will fulfill memory allocation requests first.
...
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The compatibility depends on whether `hyperv` or `process` container
isolation is used.
This fixes cache not being used when building images based on older
Windows versions on a newer Windows host.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Only print the tag when the received reference has a tag, if
we can't cast the received tag to a `reference.Tagged` then
skip printing the tag as it's likely a digest.
Fixes panic when trying to install a plugin from a reference
with a digest such as
`vieux/sshfs@sha256:1d3c3e42c12138da5ef7873b97f7f32cf99fb6edde75fa4f0bcf9ed277855811`
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Since 964ab7158c, we explicitly set the bridge MTU if it was specified.
Unfortunately, kernel <v4.17 have a check preventing us to manually set
the MTU to anything greater than 1500 if no links is attached to the
bridge, which is how we do things -- create the bridge, set its MTU and
later on, attach veths to it.
Relevant kernel commit: 804b854d37
As we still have to support CentOS/RHEL 7 (and their old v3.10 kernels)
for a few more months, we need to ignore EINVAL if the MTU is > 1500
(but <= 65535).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Commit 4f47013feb introduced a new validation step to make sure no
IPv6 subnet is configured on a network which has EnableIPv6=false.
Commit 5d5eeac310 then removed that validation step and automatically
enabled IPv6 for networks with a v6 subnet. But this specific commit
was reverted in c59e93a67b and now the error introduced by 4f47013feb
is re-introduced.
But it turns out some users expect a network created with an IPv6
subnet and EnableIPv6=false to actually have no IPv6 connectivity.
This restores that behavior.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Previous commit made getDBhandle a one-liner returning a struct
member -- making it useless. Inline it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This parameter was used to tell the boltdb kvstore not to open/close
the underlying boltdb db file before/after each get/put operation.
Since d21d0884ae, we've a single datastore instance shared by all
components that need it. That commit set `PersistConnection=true`.
We can now safely remove this param altogether, and remove all the
code that was opening and closing the db file before and after each
operation -- it's dead code!
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This test is non-representative of what we now do in libnetwork.
Since the ability of opening the same boltdb database multiple
times in parallel will be dropped in the next commit, just remove
this test.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Adds a test case for installing a plugin from a remote in the form
of `plugin-content-trust@sha256:d98f2f8061...`, which is currently
causing the daemon to panic, as we found while running the CLI e2e
tests:
```
docker plugin install registry:5000/plugin-content-trust@sha256:d98f2f806144bf4ba62d4ecaf78fec2f2fe350df5a001f6e3b491c393326aedb
```
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
The monitorDaemon() goroutine calls startContainerd() then blocks on
<-daemonWaitCh to wait for it to exit. The startContainerd() function
would (re)initialize the daemonWaitCh so a restarted containerd could be
waited on. This implementation was race-free because startContainerd()
would synchronously initialize the daemonWaitCh before returning. When
the call to start the managed containerd process was moved into the
waiter goroutine, the code to initialize the daemonWaitCh struct field
was also moved into the goroutine. This introduced a race condition.
Move the daemonWaitCh initialization to guarantee that it happens before
the startContainerd() call returns.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Containers attached to an 'internal' bridge network are unable to
communicate when the host is running firewalld.
Non-internal bridges are added to a trusted 'docker' firewalld zone, but
internal bridges were not.
DOCKER-ISOLATION iptables rules are still configured for an internal
network, they block traffic to/from addresses outside the network's subnet.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Do not set 'Config.MacAddress' in inspect output unless the MAC address
is configured.
Also, make sure it is filled in for a configured address on the default
network before the container is started (by translating the network name
from 'default' to 'config' so that the address lookup works).
Signed-off-by: Rob Murray <rob.murray@docker.com>
The API's EndpointConfig struct has a MacAddress field that's used for
both the configured address, and the current address (which may be generated).
A configured address must be restored when a container is restarted, but a
generated address must not.
The previous attempt to differentiate between the two, without adding a field
to the API's EndpointConfig that would show up in 'inspect' output, was a
field in the daemon's version of EndpointSettings, MACOperational. It did
not work, MACOperational was set to true when a configured address was
used. So, while it ensured addresses were regenerated, it failed to preserve
a configured address.
So, this change removes that code, and adds DesiredMacAddress to the wrapped
version of EndpointSettings, where it is persisted but does not appear in
'inspect' results. Its value is copied from MacAddress (the API field) when
a container is created.
Signed-off-by: Rob Murray <rob.murray@docker.com>
File paths can contain commas, particularly paths returned from
t.TempDir() in subtests which include commas in their names. There is
only one datastore provider and it only supports a single address, so
the only use of parsing the address is to break tests in mysterious
ways.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The bbolt library wants exclusive access to the boltdb file and uses
file locking to assure that is the case. The controller and each network
driver that needs persistent storage instantiates its own unique
datastore instance, backed by the same boltdb file. The boltdb kvstore
implementation works around multiple access to the same boltdb file by
aggressively closing the boltdb file between each transaction. This is
very inefficient. Have the controller pass its datastore instance into
the drivers and enable the PersistConnection option to disable closing
the boltdb between transactions.
Set data-dir in unit tests which instantiate libnetwork controllers so
they don't hang trying to lock the default boltdb database file.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The double quotes inside a single quoted string don't need to be
escaped.
Looks like different Powershell versions are treating this differently
and it started failing unexpectedly without any changes on our side.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- full diff: https://github.com/actions/setup-go/compare/v3.5.0...v5.0.0
v5
In scope of this release, we change Nodejs runtime from node16 to node20.
Moreover, we update some dependencies to the latest versions.
Besides, this release contains such changes as:
- Fix hosted tool cache usage on windows
- Improve documentation regarding dependencies caching
V4
The V4 edition of the action offers:
- Enabled caching by default
- The action will try to enable caching unless the cache input is explicitly
set to false.
Please see "Caching dependency files and build outputs" for more information:
https://github.com/actions/setup-go#caching-dependency-files-and-build-outputs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If a reader has caught up to the logger and is waiting for the next
message, it should stop waiting when the logger is closed. Otherwise
the reader will unnecessarily wait the full closedDrainTimeout for no
log messages to arrive.
This case was overlooked when the journald reader was recently
overhauled to be compatible with systemd 255, and the reader tests only
failed when a logical race happened to settle in such a way to exercise
the bugged code path. It was only after implicit flushing on close was
added to the journald test harness that the Follow tests would
repeatably fail due to this bug. (No new regression tests are needed.)
Signed-off-by: Cory Snider <csnider@mirantis.com>
The journald reader test harness injects an artificial asynchronous
delay into the logging pipeline: a logged message won't be written to
the journal until at least 150ms after the Log() call returns. If a test
returns while log messages are still in flight to be written, the logs
may attempt to be written after the TempDir has been cleaned up, leading
to spurious errors.
The logger read tests which interleave writing and reading have to
include explicit synchronization points to work reliably with this delay
in place. On the other hand, tests should not be required to sync the
logger explicitly before returning. Override the Close() method in the
test harness wrapper to wait for in-flight logs to be flushed to disk.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- Check the return value when logging messages
- Log the stream (stdout/stderr) and list of messages that were not read
- Wait until the logger is closed before returning early (panic/fatal)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Writing the systemd-journal-remote command output directly to os.Stdout
and os.Stderr makes it nearly impossible to tell which test case the
output is related to when the tests are not run in verbose mode. Extend
the journald sender fake to redirect output to the test log so they
interleave with the rest of the test output.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Go race detector was detecting a data race when running the
TestLogRead/Follow/Concurrent test against the journald logging driver.
The race was in the test harness, specifically syncLogger. The waitOn
field would be reassigned each time a log entry is sent to the journal,
which is not concurrency-safe. Make it concurrency-safe using the same
patterns that are used in the log follower implementation to synchronize
with the logger.
Signed-off-by: Cory Snider <csnider@mirantis.com>
When saving an image treat `image@sha256:abcdef...` the same as
`abcdef...`, this makes it:
- Not export the digested tag as the image name
- Not try to export all tags from the image repository
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Saving an image via digested reference, ID or truncated ID doesn't store
the image reference in the archive. This also causes the save code to
not add the image's manifest to the index.json.
This commit explicitly adds the untagged manifests to the index.json if
no tagged manifests were added.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
errDrainDone is a sentinel error which is never supposed to escape the
package. Consequently, it needs to be filtered out of returns all over
the place, adding boilerplate. Forgetting to filter out these errors
would be a logic bug which the compiler would not help us catch. Replace
it with boolean multi-valued returns as they can't be accidentally
ignored or propagated.
Signed-off-by: Cory Snider <csnider@mirantis.com>
While it doesn't really matter if the reader waits for an extra
arbitrary period beyond an arbitrary hardcoded timeout, it's also
trivial and cheap to implement, and nice to have.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The journald reader uses a timer to set an upper bound on how long to
wait for the final log message of a stopped container. However, the
timer channel is only received from in non-blocking select statements!
There isn't enough benefit of using a timer to offset the cost of having
to manage the timer resource. Setting a deadline and comparing the
current time is just as effective, without having to manage the
lifecycle of any runtime resources.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Synthesize a boot ID for journal entries fed into
systemd-journal-remote, as required by systemd 255.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Following logs with a non-negative tail when the container log is empty
is broken on the journald driver when used with systemd 255. Add tests
which cover this edge case to our loggertest suite.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Previously this was done indirectly - the `compare` function didn't
check the `ArgsEscaped`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Restrict cache candidates only to images that were built locally.
This doesn't affect builds using `--cache-from`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Store additional image property which makes it possible to distinguish
if image was built locally.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is a follow-up to 2cf230951f, adding
more directives to adjust for some new code added since:
Before this patch:
make -C ./internal/gocompat/
GO111MODULE=off go generate .
GO111MODULE=on go mod tidy
GO111MODULE=on go test -v
# github.com/docker/docker/internal/sliceutil
internal/sliceutil/sliceutil.go:3:12: type parameter requires go1.18 or later (-lang was set to go1.16; check go.mod)
internal/sliceutil/sliceutil.go:3:14: predeclared comparable requires go1.18 or later (-lang was set to go1.16; check go.mod)
internal/sliceutil/sliceutil.go:4:19: invalid map key type T (missing comparable constraint)
# github.com/docker/docker/libnetwork
libnetwork/endpoint.go:252:17: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
# github.com/docker/docker/daemon
daemon/container_operations.go:682:9: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
daemon/inspect.go:42:18: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
With this patch:
make -C ./internal/gocompat/
GO111MODULE=off go generate .
GO111MODULE=on go mod tidy
GO111MODULE=on go test -v
=== RUN TestModuleCompatibllity
main_test.go:321: all packages have the correct go version specified through //go:build
--- PASS: TestModuleCompatibllity (0.00s)
PASS
ok gocompat 0.031s
make: Leaving directory '/go/src/github.com/docker/docker/internal/gocompat'
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Functional programming for the win! Add a utility function to map the
values of a slice, along with a curried variant, to tide us over until
equivalent functionality gets added to the standard library
(https://go.dev/issue/61898)
Signed-off-by: Cory Snider <csnider@mirantis.com>
We need to isolate the images that we are remapping to a userns, we
can't mix them with "normal" images. In the graph driver case this means
we create a new root directory where we store the images and everything
else, in the containerd case we can use a new namespace.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
These types were deprecated in v25.0, and moved to api/types/container;
This patch removes the aliases for;
- api/types.ResizeOptions (deprecated in 95b92b1f97)
- api/types.ContainerAttachOptions (deprecated in 30f09b4a1a)
- api/types.ContainerCommitOptions (deprecated in 9498d897ab)
- api/types.ContainerRemoveOptions (deprecated in 0f77875220)
- api/types.ContainerStartOptions (deprecated in 7bce33eb0f)
- api/types.ContainerListOptions (deprecated in 9670d9364d)
- api/types.ContainerLogsOptions (deprecated in ebef4efb88)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in v25.0, and moved to api/types/swarm;
This patch removes the aliases for;
- api/types.ServiceUpdateResponse (deprecated in 5b3e6555a3)
- api/types.ServiceCreateResponse (deprecated in ec69501e94)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in 48cacbca24
(v25.0), and moved to api/types/image.
This patch removes the aliases for;
- api/types.ImageDeleteResponseItem
- api/types.ImageSummary
- api/types.ImageMetadata
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in b688af2226
(v25.0), and moved to api/types/checkpoint.
This patch removes the aliases for;
- api/types.CheckpointCreateOptions
- api/types.CheckpointListOptions
- api/types.CheckpointDeleteOptions
- api/types.Checkpoint
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in c90229ed9a
(v25.0), and moved to api/types/system.
This patch removes the aliases for;
- api/types.Info
- api/types.Commit
- api/types.PluginsInfo
- api/types.NetworkAddressPool
- api/types.Runtime
- api/types.SecurityOpt
- api/types.KeyValue
- api/types.DecodeSecurityOptions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
To prevent a circular import between api/types and api/types image,
the RequestPrivilegeFunc reference was not moved, but defined as
part of the PullOptions / PushOptions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 8b7af1d0f added some code to update the DNSNames of all
endpoints attached to a sandbox by loading a new instance of each
affected endpoints from the datastore through a call to
`Network.EndpointByID()`.
This method then calls `Network.getEndpointFromStore()`, that in
turn calls `store.GetObject()`, which then calls `cache.get()`,
which calls `o.CopyTo(kvObject)`. This effectively creates a fresh
new instance of an Endpoint. However, endpoints are already kept in
memory by Sandbox, meaning we now have two in-memory instances of
the same Endpoint.
As it turns out, libnetwork is built around the idea that no two objects
representing the same thing should leave in-memory, otherwise breaking
mutex locking and optimistic locking (as both instances will have a drifting
version tracking ID -- dbIndex in libnetwork parliance).
In this specific case, this bug materializes by container rename failing
when applied a second time for a given container. An integration test is
added to make sure this won't happen again.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
I made a mistake in the last commit - after resolving the IP from the
passed `addr` for CIFS it would still resolve the `device` part.
Apply only one name resolution
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Prior to 7a9b680a, the container short ID was added to the network
aliases only for custom networks. However, this logic wasn't preserved
in 6a2542d and now the cid is always added to the list of network
aliases.
This commit reintroduces the old logic.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- pass the cluster as an argument instead of manually setting it after
creating the router-options
- remove the "opts" variable, to prevent it accidentally being used (with
the assumption that's the value returned)
- use a struct-literal for the returned options.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 21e50b89c9 added a label on the buildkit
worker to advertise the host-gateway-ip. This option can be either set by the
user in the daemon config, or otherwise defaults to the gateway-ip.
If no value is set by the user, discovery of the gateway-ip happens when
initializing the network-controller (`NewDaemon`, `daemon.restore()`).
However d222bf097c changed how we handle the
daemon config. As a result, the `cli.Config` used when initializing the
builder only holds configuration information form the daemon config
(user-specified or defaults), but is not updated with information set
by `NewDaemon`.
This patch adds an accessor on the daemon to get the current daemon config.
An alternative could be to return the config by `NewDaemon` (which should
likely be a _copy_ of the config).
Before this patch:
docker buildx inspect default
Name: default
Driver: docker
Nodes:
Name: default
Endpoint: default
Status: running
Buildkit: v0.12.4+3b6880d2a00f
Platforms: linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
Labels:
org.mobyproject.buildkit.worker.moby.host-gateway-ip: <nil>
After this patch:
docker buildx inspect default
Name: default
Driver: docker
Nodes:
Name: default
Endpoint: default
Status: running
Buildkit: v0.12.4+3b6880d2a00f
Platforms: linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
Labels:
org.mobyproject.buildkit.worker.moby.host-gateway-ip: 172.18.0.1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 8ae94cafa5 added a DNS resolution
of the `device` part of the volume option.
The previous way to resolve the passed hostname was to use `addr`
option, which was handled by the same code path as the `nfs` mount type.
The issue is that `addr` is also an SMB module option handled by kernel
and passing a hostname as `addr` produces an invalid argument error.
To fix that, restore the old behavior to handle `addr` the same way as
before, and only perform the new DNS resolution of `device` if there is
no `addr` passed.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Update the version of compose used in CI to the latest version.
- full diff: docker/compose@v2.24.1...v2.24.2
- release notes: https://github.com/docker/compose/releases/tag/v2.24.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also fixes some potentially unclosed file-handles,
inlines some variables, and use consts for fixed
values.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also fixing a "defer in loop" warning, instead changing to use
sub-tests, and simplifying some code, using os.WriteFile() instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The names of extended attributes are not completely freeform. Attributes
are namespaced, and the kernel enforces (among other things) that only
attributes whose names are prefixed with a valid namespace are
permitted. The name of the attribute therefore needs to be known in
order to diagnose issues with lsetxattr. Include the name of the
extended attribute in the errors returned from the Lsetxattr and
Lgetxattr so users and us can more easily troubleshoot xattr-related
issues. Include the name in a separate rich-error field to provide code
handling the error enough information to determine whether or not the
failure can be ignored.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The `GetImageOpts` struct is used for options to be passed to the backend,
and are not used in client code. This struct currently is intended for internal
use only.
This patch moves the `GetImageOpts` struct to the backend package to prevent
it being imported in the client, and to make it more clear that this is part
of internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The MAC address of a running container was stored in the same place as
the configured address for a container.
When starting a stopped container, a generated address was treated as a
configured address. If that generated address (based on an IPAM-assigned
IP address) had been reused, the containers ended up with duplicate MAC
addresses.
So, remember whether the MAC address was explicitly configured, and
clear it if not.
Signed-off-by: Rob Murray <rob.murray@docker.com>
With containerd snapshotters enabled `docker run` currently fails when
creating a container from an image that doesn't have the default host
platform without an explicit `--platform` selection:
```
$ docker run image:amd64
Unable to find image 'asdf:amd64' locally
docker: Error response from daemon: pull access denied for asdf, repository does not exist or may require 'docker login'.
See 'docker run --help'.
```
This is confusing and the graphdriver behavior is much better here,
because it runs whatever platform the image has, but prints a warning:
```
$ docker run image:amd64
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
```
This commits changes the containerd snapshotter behavior to be the same
as the graphdriver. This doesn't affect container creation when platform
is specified explicitly.
```
$ docker run --rm --platform linux/arm64 asdf:amd64
Unable to find image 'asdf:amd64' locally
docker: Error response from daemon: pull access denied for asdf, repository does not exist or may require 'docker login'.
See 'docker run --help'.
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Order the layers in OCI manifest by their actual apply order. This is
required by the OCI image spec.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Since v25.0 (commit ff50388), we validate endpoint settings when
containers are created, instead of doing so when containers are started.
However, a container created prior to that release would still trigger
validation error at start-time. In such case, the API returns a 500
status code because the Go error isn't wrapped into an InvalidParameter
error. This is now fixed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This test was added in f301c5765a to test
inspect output for API > v1.21, however, it was pinned to API v1.21,
which is now deprecated.
Remove the fixed version, as the intent was to test "current" API versions
(API v1.21 and up),
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was added in f301c5765a to test
inspect output for API > v1.21, however, it was pinned to API v1.21,
which is now deprecated.
Remove the fixed version, as the intent was to test "current" API versions
(API v1.21 and up),
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was added in 75f6929b44, but pinned
to the API version that was current at the time (v1.20), which is now
deprecated.
Update the test to use the current API version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tables linked to deprecated API versions, and an up-to-date version of
the matrix is already included at https://docs.docker.com/engine/api/#api-version-matrix
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- add some asserts for unhandled errors
- use consts for fixed values, and slightly re-format Dockerfile contentt
- inline one-line Dockerfiles
- fix some vars to be properly camel-cased
- improve assert for error-types;
Before:
=== RUN TestBuildPlatformInvalid
build_test.go:685: assertion failed: expression is false: errdefs.IsInvalidParameter(err)
--- FAIL: TestBuildPlatformInvalid (0.01s)
FAIL
After:
=== RUN TestBuildPlatformInvalid
build_test.go:689: assertion failed: error is Error response from daemon: "foobar": unknown operating system or architecture: invalid argument (errdefs.errSystem), not errdefs.IsInvalidParameter
--- FAIL: TestBuildPlatformInvalid (0.01s)
FAIL
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This matcher was only used internally in the containerd implementation of
the image store. Un-export it, and make it a local utility in that package
to prevent external use.
This package was introduced in 1616a09b61
(v24.0), and there are no known external consumers of this package, so there
should be no need to deprecate / alias the old location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When resolving names in swarm mode, services with exposed ports are
connected to user overlay network, ingress network, and local (docker_gwbridge)
networks. Name resolution should prioritize returning the VIP/IPs on user
overlay network over ingress and local networks.
Sandbox.ResolveName implemented this by taking the list of endpoints,
splitting the list into 3 separate lists based on the type of network
that the endpoint was attached to (dynamic, ingress, local), and then
creating a new list, applying the networks in that order.
This patch refactors that logic to use a custom sorter (sort.Interface),
which makes the code more transparent, and prevents iterating over the
list of endpoints multiple times.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Permit container network attachments to set any static IP address within
the network's IPAM master pool, including when a subpool is configured.
Users have come to depend on being able to statically assign container
IP addresses which are guaranteed not to collide with automatically-
assigned container addresses.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This package was introduced in af59752712
as a utility package for devicemapper, which was removed in commit
dc11d2a2d8 (v25.0.0), and the package
was deprecated in bf692d47fb.
This patch removes the package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This flag was marked deprecated in commit 5a922dc16 (released in v24.0)
and to be removed in the next release.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Some configuration in a container depends on whether it has support for
IPv6 (including default entries for '::1' etc in '/etc/hosts').
Before this change, the container's support for IPv6 was determined by
whether it was connected to any IPv6-enabled networks. But, that can
change over time, it isn't a property of the container itself.
So, instead, detect IPv6 support by looking for '::1' on the container's
loopback interface. It will not be present if the kernel does not have
IPv6 support, or the user has disabled it in new namespaces by other
means.
Once IPv6 support has been determined for the container, its '/etc/hosts'
is re-generated accordingly.
The daemon no longer disables IPv6 on all interfaces during initialisation.
It now disables IPv6 only for interfaces that have not been assigned an
IPv6 address. (But, even if IPv6 is disabled for the container using the
sysctl 'net.ipv6.conf.all.disable_ipv6=1', interfaces connected to IPv6
networks still get IPv6 addresses that appear in the internal DNS. There's
more to-do!)
Signed-off-by: Rob Murray <rob.murray@docker.com>
All components of the path are locked before the check, and
released once the path is already mounted.
This makes it impossible to replace the mounted directory until it's
actually mounted in the container.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All subpath components are opened with openat, relative to the base
volume directory and checked against the volume escape.
The final file descriptor is mounted from the /proc/self/fd/<fd> to a
temporary mount point owned by the daemon and then passed to the
underlying container runtime.
Temporary mountpoint is removed after the container is started.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`VolumeOptions` now has a `Subpath` field which allows to specify a path
relative to the volume that should be mounted as a destination.
Symlinks are supported, but they cannot escape the base volume
directory.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
We constructed a "function level" logger, which was used once "as-is", but
also added additional Fields in a loop (for each resource), effectively
overwriting the previous one for each iteration. Adding additional
fields can result in some overhead, so let's construct a "logger" only for
inside the loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We have many "image" packages, so these vars easily conflict/shadow
imports. Let's rename them (and in some cases use a const) to
prevent that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For some time, when adding an interface with no IPv6 address (an
interface to a network that does not have IPv6 enabled), we've been
disabling IPv6 on that interface.
As part of a separate change, I'm removing that logic - there's nothing
wrong with having IPv6 enabled on an interface with no routable address.
The difference is that the kernel will assign a link-local address.
TestAddRemoveInterface does this...
- Assign an IPv6 link-local address to one end of a veth interface, and
add it to a namespace.
- Add a bridge with no assigned IPv6 address to the namespace.
- Remove the veth interface from the namespace.
- Put the veth interface back into the namespace, still with an
explicitly assigned IPv6 link local address.
When IPv6 is disabled on the bridge interface, the test passes.
But, when IPv6 is enabled, the bridge gets a kernel assigned link-local
address.
Then, when re-adding the veth interface, the test generates an error in
'osl/interface_linux.go:checkRouteConflict()'. The conflict is between
the explicitly assigned fe80::2 on the veth, and a route for fe80::/64
belonging to the bridge.
So, in preparation for not-disabling IPv6 on these interfaces, use a
unique-local address in the test instead of link-local.
I don't think that changes the intent of the test.
With the change to not-always disable IPv6, it is possible to repro the
problem with a real container, disconnect and re-connect a user-defined
network with '--subnet fe80::/64' while the container's connected to an
IPv4 network. So, strictly speaking, that will be a regression.
But, it's also possible to repro the problem in master, by disconnecting
and re-connecting the fe80::/64 network while another IPv6 network is
connected. So, I don't think it's a problem we need to address, perhaps
other than by prohibiting '--subnet fe80::/64'.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Message is different with containerd backend. The Linux test
`TestPullLinuxImageFailsOnLinux` was adjusted before, but we missed this
one.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All commonly used filesystems should use ref-counted mounter, so make it
the default instead of having to whitelist them.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Prior to this commit, a container running with `--net=host` had
`{"type":"network","path":"/var/run/docker/netns/default"}` in
the ``.linux.namespaces` field of the OCI Runtime Config,
but this wasn't needed.
Close issue 47100
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The actual divergence is due to differences in the snapshotter and
graphfilter mount behaviour on Windows, but the snapshotter behaviour is
better, so we deal with it here rather than changing the snapshotter
behaviour.
We're relying on the internals of containerd's Windows mount
implementation here. Unless this code flow is replaced, future work is
to move getBackingDeviceForContainerdMount into containerd's mount
implementation.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
That means 'null', not that we can call builder-next on Windows. If and
when we do get builder-next going, this will need to be solved properly
in some way.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The existing API ImageService.GetLayerFolders didn't have access to the
ID of the container, and once we have that, the snapshotter Mounts API
provides all the information we need here.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Needed for Diff on Windows. Don't remount it afterwards as the layer is
going to be released anyway.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is consistent with layerStore's CreateRWLayer behaviour.
Potentially this can be refactored to avoid creating the -init layer,
but as noted in layerStore's initMount, this name may be special, and
should be cleared-out all-at-once.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This change adds a TempDir function that ensures the correct permissions for
the fake-root user in rootless mode.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Now the state dir is set to `${XDG_RUNTIME_DIR}/dockerd-rootless`.
This is similar to `${XDG_RUNTIME_DIR}/containerd-rootless` used in nerdctl:
https://github.com/containerd/nerdctl/blob/v1.7.2/extras/rootless/containerd-rootless.sh#L35
Prior to this commit, the state dir was unset and a random dir under `/tmp` was used.
(e.g., `/tmp/rootlesskit1869901982`)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
XDG_RUNTIME_DIR will contain sockets so its path mustn't be too long.
Prior to this commit, it was set to very long path like
`/go/src/github.com/docker/docker/bundles/test-integration/TestDiskUsage/de4fb36576d7d/xdgrun`
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Consider only images that were built `FROM scratch` as valid candidates
for the `FROM scratch` + INSTRUCTION build step.
The images are marked as `FROM scratch` based by the classic builder
with a special label. It must be a new label instead of empty parent
label, because empty label values are not persisted.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
In order for the cache in the classic builder to work we need to:
- use the came comparison function as the graph drivers implementation
- save the container config when commiting the image
- use all images to search a 'FROM "scratch"' image
- load all images if `cacheFrom` is empty
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Protecting the environment relies on the shared state (containers,
images, etc) which might already be mutated by other tests if the test
opted in into the Parallel execution before Protect was called.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The health status and probe log of containers are not mission-criticial
data which must survive a crash. It is not worth prematrely wearing out
consumer-grade flash storage by overwriting and fsync()ing the container
config on after every probe. Update only the live Container object and
the ViewDB replica on every container health probe instead. It will
eventually get checkpointed along with some other state (or config)
change. Running containers will not be checkpointed on daemon shutdown
when live-restore is enabled, but it does not matter: the health status
and probe log will be zeroed out when the daemon starts back up.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The "builtin" port driver was marked as "Slow" in the row for the lxc-user-nic
network driver, while it was marked as "Fast" in other rows.
It had to be consistently marked as "Fast" regardless to the network driver.
It is still not as fast as rootful.
Follow-up to PR 47076
Fixes: b5a5ecf4a3
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
setupTest should be called before Parallel as it modifies the test
environment which might produce:
```
fatal error: concurrent map writes
goroutine 143 [running]:
github.com/docker/docker/testutil/environment.(*Execution).ProtectContainer(...)
/go/src/github.com/docker/docker/testutil/environment/protect.go:59
github.com/docker/docker/testutil/environment.ProtectContainers({0x12e8d98, 0xc00040e420}, {0x12f2878?, 0xc0004fc340}, 0xc0001fac00)
/go/src/github.com/docker/docker/testutil/environment/protect.go:68 +0xb1
github.com/docker/docker/testutil/environment.ProtectAll({0x12e8d98, 0xc00040e210}, {0x12f2878, 0xc0004fc340}, 0xc0001fac00)
/go/src/github.com/docker/docker/testutil/environment/protect.go:45 +0xf3
github.com/docker/docker/integration/image.setupTest(0xc0004fc340)
/go/src/github.com/docker/docker/integration/image/main_test.go:46 +0x59
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Save the unmodified manifest list to keep the image ID of the
multi-platform images when not all platforms are present.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`diffIDPaths` is not used and can be removed.
`savedConfig` stores if the config was already saved (ID of the image is
the ID of the config).
`savedLayers` stores if the layer (diff ID) was already saved.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
https://github.com/rootless-containers/rootlesskit/releases/tag/v2.0.0
=== Pasta ===
RootlessKit v2 adds the support for pasta (https://passt.top/passt/).
Pasta is similar to slirp4netns but its port forwarder achieves better
throughput than slirp4netns port driver.
It is still not faster than RootlessKit's `builtin` port driver, but unlike the
`builtin` port driver, pasta can retain source IP address information.
Network driver | Port driver | Net throughput | Port throughput | Src IP | No SUID | Note
---------------|----------------|----------------|-----------------|--------|---------|--------------------------------------------
slirp4netns | builtin | Slow | Fast ✅ | ❌ | ✅ | Default in typical setup
vpnkit | builtin | Slow | Fast ✅ | ❌ | ✅ | Default when slirp4netns is not installed
slirp4netns | slirp4netns | Slow | Slow | ✅ | ✅ |
**pasta** | **implicit** | Slow | Fast ✅ | ✅ | ✅ | Experimental
lxc-user-nic | builtin | Fast ✅ | Slow | ❌ | ❌ | Experimental
(bypass4netns) | (bypass4netns) | Fast ✅ | Fast ✅ | ✅ | ✅ | (Not integrated to RootlessKit)
=== Detach-netns ===
Aside from pasta, RootlessKit v2 also brings the support for
"detach-netns" mode, which leaves the runtime in the host network namespace to
eliminate the slirp overhead for pull/push and to allow accessing the "real"
127.0.0.1.
See containerd/nerdctl PR 2723 for how detach-netns is being adopted in
nerdctl v2.
Integrating detach-netns into Docker/Moby will need an extra work and will be
deferred to Docker v26 (or later).
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Update the TestDaemonRestartKilContainers integration test to assert
that a container's healthcheck status is always reset to the Starting
state after a daemon restart, even when the container is live-restored.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Layer size is the sum of the individual files count, not the tar
archive. Use the total bytes read returned by `io.Copy` to populate the
`Size` field.
Also set the digest to the actual digest of the tar archive.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Switch github.com/imdario/mergo to dario.cat/mergo v1.0.0, because
the module was renamed, and reached v1.0.0
full diff: https://github.com/imdario/mergo/compare/v0.3.13...v1.0.0
vendor: github.com/containerd/containerd v1.7.12
- full diff: https://github.com/containerd/containerd/compare/v1.7.11...v1.7.12
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.12
Welcome to the v1.7.12 release of containerd!
The twelfth patch release for containerd 1.7 contains various fixes and updates.
Notable Updates
- Fix on dialer function for Windows
- Improve `/etc/group` handling when appending groups
- Update shim pidfile permissions to 0644
- Update runc binary to v1.1.11
- Allow import and export to reference missing content
- Remove runc import
- Update Go version to 1.20.13
Deprecation Warnings
- Emit deprecation warning for `containerd.io/restart.logpath` label usage
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.11...v1.7.12
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.12
Welcome to the v1.7.12 release of containerd!
The twelfth patch release for containerd 1.7 contains various fixes and updates.
Notable Updates
- Fix on dialer function for Windows
- Improve `/etc/group` handling when appending groups
- Update shim pidfile permissions to 0644
- Update runc binary to v1.1.11
- Allow import and export to reference missing content
- Remove runc import
- Update Go version to 1.20.13
Deprecation Warnings
- Emit deprecation warning for `containerd.io/restart.logpath` label usage
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The new OCI-compatible archive export relies on the Descriptors returned
by the layer (`distribution.Describable` interface implementation).
The issue with that is that the `roLayer` and the `referencedCacheLayer`
types don't implement this interface. Implementing that interface for
them based on their `descriptor` doesn't work though, because that
descriptor is empty.
To workaround this issue, just create a new descriptor if the one
provided by the layer is empty.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Split task creation and start into two separate method calls in the
libcontainerd API. Clients now have the opportunity to inspect the
freshly-created task and customize its runtime environment before
starting execution of the user-specified binary.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The container may have been running without health probes for an
indeterminate amount of time. The container may have become unhealthy in
the interim. We should probe it sooner than in steady-state, while also
giving it some leeway to recover from e.g. timed-out connections. This
is easy to achieve by probing the container like a freshly-started one.
The original author of health-checks came to the same conclusion; the
health monitor was reinitialized on live-restored containers before
v17.11.0, when health monitoring of live-restored containers was
accidentally broken. Revert to the original behavior.
Signed-off-by: Cory Snider <csnider@mirantis.com>
commit 4f9db655ed moved looking up the
userland-proxy binary to early in the startup process, and introduced
a log-message if the binary was missing.
However, a side-effect of this was this message would also be printed
when running "--version";
dockerd --version
time="2024-01-09T09:18:53.705271292Z" level=warning msg="failed to lookup default userland-proxy binary" error="exec: \"docker-proxy\": executable file not found in $PATH"
Docker version v25.0.0-rc.1, build 9cebefa717
We should look if we can avoid this, but let's change the message to be
a debug message as a short-term workaround.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 0b1c1877c5 updated the version in
hack/dockerfile/install/rootlesskit.installer, but forgot to update the
version in Dockerfile.
Also updating both to use a tag, instead of commit. While it's good to pin by
an immutable reference, I think it's reasonably safe to use the tag, which is
easier to use, and what we do for other binaries, such as runc as well.
Full diff: https://github.com/rootless-containers/rootlesskit/compare/v1.1.0...v1.1.1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The only use is in `builder/builder-next/adapters/snapshot.EnsureLayer()`,
which always calls the function with an _empty_ `oldTarDataPath`;
7082aecd54/builder/builder-next/adapters/snapshot/layer.go (L81)
When called with an empty `oldTarDataPath`, this function was an alias for
`checksumForGraphIDNoTarsplit`, so let's make it that.
Note that this code was added in 500e77bad0, as
part of the migration from "v1" images to "v2" (content-addressable) images.
Given that the remaining code lives in a "migration" file, possibly more code
can be removed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ExecBackend.ContainerKill()` function was called before removing a build-
container.
This function is backed by `daemon.ContainerKill()` which, if no signal is passed,
performed a `daemon.Kill()`, using `SIGKILL` as signal. However, the
`ExecBackend.ContainerRm()` (backed by `daemonContainerRm()`), which is called
after this, is executed with the `ForceRemove` option set, which calls
`daemon.cleanupContainer()` with `ForceRemove` set, which also results in
`daemon.Kill()` being called:
1a0c15abbb/daemon/delete.go (L84-L95)
This makes the `ExecBackend.ContainerKill()` redundant, so removing this from
the interface.
While looking at this code, one (possible) race-condition was found in
`daemon.cleanupContainer()`, where `daemon.Kill()` could return a `errdefs.Conflict`
if the container was already stopped. An extra check was added for this case to
prevent `daemon.cleanupContainer()` from terminating early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prevent cleanup from terminating early when failing to remove a container;
- continue trying to remove remaining containers
- ignore errors due to containers that were not found
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was just a very thin wrapper for backend.ContainerRm(), and the
error it returned was not handled, so moving this code inline.
Moving it inline also allows differentiating the error message to
distinguish the "removing all intermediate containers" from "removing container"
(when cancelling a build).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The output var was used in a `defer`, but named `err` and shadowed in various
places. Rename the var to a more explicit name to make clear where it's used
and to prevent it being accidentally shadowed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes:
- NewSystemd handles UnitExists when starting units
- makefile fixes
- cgroups2: export memory max usage and swap max usage
- build(deps): bump github.com/cilium/ebpf from v0.9.1 to v0.11.0
- support psi
- feat: add Threads for cgroupv2
- Linux.Swap is defined as memory+swap combined, while in cgroup v2 swap is a separate value
- fix(): support re-enabling oom killer refs #307 by @kestrelcjx
full diff: https://github.com/containerd/cgroups/compare/v3.0.2...v3.0.3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
To make the version format in the `moby-bin` consistent with the
version we use in the release pipeline.
```diff
Server: Docker Engine - Community
Engine:
- Version: v25.0.0
+ Version: 25.0.0
...
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
If the daemon is configured to use a mirror for the default (Docker Hub)
registry, the endpoint did not fall back to querying the upstream if the mirror
did not contain the given reference.
For pull-through registry-mirrors, this was not a problem, as in that case the
registry would forward the request, but for other mirrors, no fallback would
happen. This was inconsistent with how "pulling" images handled this situation;
when pulling images, both the mirror and upstream would be tried.
This patch brings the daemon-side lookup of image-manifests on-par with the
client-side lookup (the GET /distribution endpoint) as used in API 1.30 and
higher.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the daemon is configured to use a mirror for the default (Docker Hub)
registry, the endpoint did not fall back to querying the upstream if the mirror
did not contain the given reference.
If the daemon is configured to use a mirror for the default (Docker Hub)
registry, did not fall back to querying the upstream if the mirror did not
contain the given reference.
For pull-through registry-mirrors, this was not a problem, as in that case the
registry would forward the request, but for other mirrors, no fallback would
happen. This was inconsistent with how "pulling" images handled this situation;
when pulling images, both the mirror and upstream would be tried.
This problem was caused by the logic used in GetRepository, which had an
optimization to only return the first registry it was successfully able to
configure (and connect to), with the assumption that the mirror either contained
all images used, or to be configured as a pull-through mirror.
This patch:
- Introduces a GetRepositories method, which returns all candidates (both
mirror(s) and upstream).
- Updates the endpoint to try all
Before this patch:
# the daemon is configured to use a mirror for Docker Hub
cat /etc/docker/daemon.json
{ "registry-mirrors": ["http://localhost:5000"]}
# start the mirror (empty registry, not configured as pull-through mirror)
docker run -d --name registry -p 127.0.0.1:5000:5000 registry:2
# querying the endpoint fails, because the image-manifest is not found in the mirror:
curl -s --unix-socket /var/run/docker.sock http://localhost/v1.43/distribution/docker.io/library/hello-world:latest/json
{
"message": "manifest unknown: manifest unknown"
}
With this patch applied:
# the daemon is configured to use a mirror for Docker Hub
cat /etc/docker/daemon.json
{ "registry-mirrors": ["http://localhost:5000"]}
# start the mirror (empty registry, not configured as pull-through mirror)
docker run -d --name registry -p 127.0.0.1:5000:5000 registry:2
# querying the endpoint succeeds (manifest is fetched from the upstream Docker Hub registry):
curl -s --unix-socket /var/run/docker.sock http://localhost/v1.43/distribution/docker.io/library/hello-world:latest/json | jq .
{
"Descriptor": {
"mediaType": "application/vnd.oci.image.index.v1+json",
"digest": "sha256:1b9844d846ce3a6a6af7013e999a373112c3c0450aca49e155ae444526a2c45e",
"size": 3849
},
"Platforms": [
{
"architecture": "amd64",
"os": "linux"
}
]
}
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A check was added to the bridge driver to detect when it was called to
create the default bridge nw whereas a stale default bridge already
existed. In such case, the bridge driver was deleting the stale network
before re-creating it. This check was introduced in docker/libnetwork@6b158eac6a
to fix an issue related to newly introduced live-restore.
However, since commit docker/docker@ecffb6d58c,
the daemon doesn't even try to create default networks if there're
active sandboxes (ie. due to live-restore).
Thus, now it's impossible for the default bridge network to be stale and
to exists when the driver's CreateNetwork() method is called. As such,
the check introduced in the first commit mentioned above is dead code
and can be safely removed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Turn subsequent `Close` calls into a no-op and produce a warning with an
optional stack trace (if debug mode is enabled).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This hopefully makes the test less flakey (or removes any flake that
would be caused by the test itself).
1. Adds tail of cluster daemon logs when there is a test failure so we
can more easily see what may be happening
2. Scans the daemon logs to check if the key is rotated before
restarting the daemon. This is a little hacky but a little better
than assuming it is done after a hard-coded 3 seconds.
3. Cleans up the `node ls` check such that it uses a poll function
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The Go 1.21.5 compiler has a bug: per-file language version override
directives do not take effect when instantiating generic functions which
have certain nontrivial type constraints. Consequently, a module-mode
project with Moby as a dependency may fail to compile when the compiler
incorrectly applies go1.16 semantics to the generic function call.
As the offending function is trivial and is only used in one place, work
around the issue by converting it to a concretely-typed function.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Copy the swagger / OpenAPI file to the documentation. This is the API
version used by the upcoming v25.0.0 release.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The following fields are never written and are now marked as deprecated:
- `HairpinMode`
- `LinkLocalIPv6Address`
- `LinkLocalIPv6PrefixLen`
- `SecondaryIPAddress`
- `SecondaryIPv6Addresses`
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This is the eleventh patch release in the 1.1.z release branch of runc.
It primarily fixes a few issues with runc's handling of containers that
are configured to join existing user namespaces, as well as improvements
to cgroupv2 support.
- Fix several issues with userns path handling.
- Support memory.peak and memory.swap.peak in cgroups v2.
Add swapOnlyUsage in MemoryStats. This field reports swap-only usage.
For cgroupv1, Usage and Failcnt are set by subtracting memory usage
from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage
are set.
- build(deps): bump github.com/cyphar/filepath-securejoin.
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.11
- full diff: https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is the eleventh patch release in the 1.1.z release branch of runc.
It primarily fixes a few issues with runc's handling of containers that
are configured to join existing user namespaces, as well as improvements
to cgroupv2 support.
- Fix several issues with userns path handling.
- Support memory.peak and memory.swap.peak in cgroups v2.
Add swapOnlyUsage in MemoryStats. This field reports swap-only usage.
For cgroupv1, Usage and Failcnt are set by subtracting memory usage
from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage
are set.
- build(deps): bump github.com/cyphar/filepath-securejoin.
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.11
- full diff: https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use stdlib's filepath.VolumeName to get the volume-name (if present) instead
of a self-crafted implementation, and unify the implementations for Windows
and Unix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This package provided utilities to obtain the apparmor_parser version, as well
as loading a profile.
Commit e3e715666f (included in v24.0.0 through
bfffb0974e) deprecated GetVersion, as it was no
longer used, which made LoadProfile the only utility remaining in this package.
LoadProfile appears to have no external consumers, and the only use in our code
is "profiles/apparmor".
This patch moves the remaining code (LoadProfile) to profiles/apparmor as a
non-exported function, and deletes the package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit e3e715666f (included in v24.0.0 through
bfffb0974e) deprecated GetVersion, as it was no
longer used.
This patch removes the deprecated utility, and inlines the remaining code into
the LoadProfile function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function never returned an error, and was not matching an interface, so
remove the error-return.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).
This was both inefficient (looking up the binary for each port), inconsistent
(when running in rootless-mode, the binary was looked-up once), as well as
inconvenient, because a missing binary, or a mis-configureed userland-proxy-path
would not be detected daeemon startup, and not produce an error until starting
the container;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
However, the container would still be created (but invalid);
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
869f41d7e94f nginx:alpine "/docker-entrypoint.…" 10 seconds ago Created romantic_wiles
This patch changes how the userland-proxy is configured;
- The path of the userland-proxy is now looked up / configured at daemon
startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled, an error is
produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
is now required to be set. It no longer looks up the executable, and
produces an error if no path was provided. While this change was not
required, it makes the daemon config the canonical source of truth, instead
of logic spread accross multiplee locations.
Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;
- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
preference in PATH)
With this patch:
Validating the config produces an error if the binary is not found:
dockerd --validate
WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
userland-proxy is enabled, but userland-proxy-path is not set
Disabling userland-proxy prints a warning, but validates as "OK":
dockerd --userland-proxy=false --validate
WARN[2023-12-29T11:38:30.752523879Z] ffailed to lookup default userland-proxy binary error="exec: \"docker-proxy\": executable file not found in $PATH"
configuration OK
Speficying a non-absolute path produces an error:
dockerd --userland-proxy-path=docker-proxy --validate
invalid userland-proxy-path: must be an absolute path: docker-proxy
Befor this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;
docker run -d -P nginx:alpine
4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.
Specifying an invalid userland-proxy-path produces an error as well:
dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory
mkdir -p /usr/local/bin/not-a-file
dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory
touch /usr/local/bin/not-an-executable
dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied
Same when using the daemon.json config-file;
echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary
dockerd --userland-proxy-path=hello --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The cleanup function never returns an error, so didn't add much value. This
patch removes the closure, and calls it inline to remove the extra
indirection, and removes the error which would never be returned.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The defer was set after the switch, but various code-paths inside the switch
could return with an error after the port was allocated / reserved, which
could result in those ports not being released.
This patch moves the defer into each individual branch of the switch to set
it immediately after succesfully reserving the port.
We can also remove a redundant ReleasePort from the cleanup function, as
it's only called if an error occurs, and the defers already take care of
that.
Note that the cleanup function was handling errors returned by ReleasePort,
but this function never returns an error, so it was fully redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prevent accidentally shadowing the error, which is used in a defer.
Also re-format the code to make it more clear we're not acting on
a locally-scoped "allocatedHostPort" variable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/klauspost/compress/compare/v1.17.2...v1.17.4
v1.17.4:
- huff0: Speed up symbol counting
- huff0: Remove byteReader
- gzhttp: Allow overriding decompression on transport
- gzhttp: Clamp compression level
- gzip: Error out if reserved bits are set
v1.17.3:
- fse: Fix max header size
- zstd: Improve better/best compression
- gzhttp: Fix missing content type on Close
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
All underlying jobs inherit from the status of all parent jobs
in the tree, not just the very parent. We need to apply the same
kind of special condition.
Signed-off-by: CrazyMax <1951866+crazy-max@users.noreply.github.com>
If IPv6 is enabled for a bridge network, by the time configuration
is applied, the bridge will always have an address. Assert that, by
raising an error when the configuration is validated.
Use that to simplify the logic used to calculate which addresses
should be assigned to a bridge. Also remove a redundant check in
setupGatewayIPv6() and the error associated with it.
Fix unit tests that enabled IPv6, but didn't supply an IPv6 IPAM
address/pool. Before this change, these tests passed but silently
left the bridge without an IPv6 address.
(The daemon already ensured there was an IPv6 address, this change
does not add a new restriction on config at that level.)
Signed-off-by: Rob Murray <rob.murray@docker.com>
Some checks in 'networkConfiguration.Validate()' were not running as
expected, they'd always pass - because 'parseNetworkOptions()' called
it before 'config.processIPAM()' had added IP addresses and gateways.
Signed-off-by: Rob Murray <rob.murray@docker.com>
No more concept of "anonymous endpoints". The equivalent is now an
endpoint with no DNSNames set.
Some of the code removed by this commit was mutating user-supplied
endpoint's Aliases to add container's short ID to that list. In order to
preserve backward compatibility for the ContainerInspect endpoint, this
commit also takes care of adding that short ID (and the container
hostname) to `EndpointSettings.Aliases` before returning the response.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The remapping in the commit code was in the wrong place, we would create
a diff and then remap the snapshot, but the descriptor created in
"CreateDiff" was still pointing to the old snapshot, we now remap the
snapshot before creating a diff. Also make sure we don't lose any
capabilities, they used to be lost after the chown.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
If the resolver's DNSBackend returns a name that cannot be marshaled
into a well-formed DNS message, the resolver will only discover this
when it attempts to write the reply message and it fails with an error.
No reply message is sent, leaving the client to wait out its timeout and
the user in the dark about what went wrong.
When writing the intended reply message fails, retry once with a
ServFail response to inform the client and user that the DNS query was
not resolved due to a problem with to the resolver, not the network.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The well-formedness of a DNS message is only checked when it is
serialized, through the (*dns.Msg).Pack() method. Add a call to Pack()
to our tstwriter mock to mirror the behaviour of the real
dns.ResponseWriter implementation. And fix tests which generated
ill-formed DNS query messages.
Signed-off-by: Cory Snider <csnider@mirantis.com>
They fail because exporting an image which targets a manifest list when
only one platform is available exports only the platform-specific
manifest so the ID of the loaded image is different (ID of the platform
manifest, not manifest list).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The semantics of an "anonymous" endpoint has always been weird: it was
set on endpoints which name shouldn't be taken into account when
inserting DNS records into libnetwork's `Controller.svcRecords` (and
into the NetworkDB). However, in that case the endpoint's aliases would
still be used to create DNS records; thus, making those "anonymous
endpoints" not so anonymous.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The `(*Endpoint).rename()` method is changed to only mutate `ep.name`
and let a new method `(*Endpoint).UpdateDNSNames()` handle DNS updates.
As a consequence, the rollback code that was part of
`(*Endpoint).rename()` is now removed, and DNS updates are now
rolled back by `ContainerRename`.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Instead of special-casing anonymous endpoints, use the list of DNS names
associated to the endpoint.
`(*Endpoint).isAnonymous()` has no more uses, so let's delete it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This new property will be empty if the daemon was upgraded with
live-restore enabled. To not break DNS resolutions for restored
containers, we need to populate dnsNames based on endpoint's myAliases &
anonymous properties.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Instead of special-casing anonymous endpoints in libnetwork, let the
daemon specify what (non fully qualified) DNS names should be associated
to container's endpoints.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
update the package, which contains a fix in the ssh package.
full diff: https://github.com/golang/crypto/compare/v0.16.0...v0.17.0
from the security mailing:
> Hello gophers,
>
> Version v0.17.0 of golang.org/x/crypto fixes a protocol weakness in the
> golang.org/x/crypto/ssh package that allowed a MITM attacker to compromise
> the integrity of the secure channel before it was established, allowing
> them to prevent transmission of a number of messages immediately after
> the secure channel was established without either side being aware.
>
> The impact of this attack is relatively limited, as it does not compromise
> confidentiality of the channel. Notably this attack would allow an attacker
> to prevent the transmission of the SSH2_MSG_EXT_INFO message, disabling a
> handful of newer security features.
>
> This protocol weakness was also fixed in OpenSSH 9.6.
>
> Thanks to Fabian Bäumer, Marcus Brinkmann, and Jörg Schwenk from Ruhr
> University Bochum for reporting this issue.
>
> This is CVE-2023-48795 and Go issue https://go.dev/issue/64784.
>
> Cheers,
> Roland on behalf of the Go team
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Ensure that when removing an image, an image is checked consistently
against the images with the same target digest. Add unit testing around
delete.
Signed-off-by: Derek McGowan <derek@mcg.dev>
Simplify the hijack process by just performing the http request/response
on the connection and returning the raw conn after success. The client
conn from httputil is deprecated and easily replaced.
Signed-off-by: Derek McGowan <derek@mcg.dev>
This new property is meant to replace myAliases and anonymous
properties.
The end goal is to get rid of both properties by letting the daemon
determine what (non fully qualified) DNS names should be associated to
them.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
They just happen to exist on a network that doesn't support DNS-based
service discovery (ie. no embedded DNS servers are started for them).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Calculate the IPv6 addreesses needed on a bridge, then reconcile them
with the addresses on an existing bridge by deleting then adding as
required.
(Previously, required addresses were added one-by-one, then unwanted
addresses were removed. This meant the daemon failed to start if, for
example, an existing bridge had address '2000:db8::/64' and the config
was changed to '2000:db8::/80'.)
IPv6 addresses are now calculated and applied in one go, so there's no
need for setupVerifyAndReconcile() to check the set of IPv6 addresses on
the bridge. And, it was guarded by !config.InhibitIPv4, which can't have
been right. So, removed its IPv6 parts, and added IPv4 to its name.
Link local addresses, the example given in the original ticket, are now
released when containers are stopped. Not releasing them meant that
when using an LL subnet on the default bridge, no container could be
started after a container was stopped (because the calculated address
could not be re-allocated). In non-default bridge networks using an
LL subnet, addresses leaked.
Linux always uses the standard 'fe80::/64' LL network. So, if a bridge
is configured with an LL subnet prefix that overlaps with it, a config
error is reported. Non-overlapping LL subnet prefixes are allowed.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Before this change `ParentId` was filled for images when calling the
`/images/json` (image list) endpoint but was not for the
`/images/<image>/json` (image inspect).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This command was originally added by ea7f555446
to test the code snippet put into libnet's README.md. Nothing compiles
this file and it doesn't add any value to the project. So better remove
it than maintaining it.
This commit also removes the code snippet from libnet's README.md for
the same reasons.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This repository is not yet a module (i.e., does not have a `go.mod`). This
is not problematic when building the code in GOPATH or "vendor" mode, but
when using the code as a module-dependency (in module-mode), different semantics
are applied since Go1.21, which switches Go _language versions_ on a per-module,
per-package, or even per-file base.
A condensed summary of that logic [is as follows][1]:
- For modules that have a go.mod containing a go version directive; that
version is considered a minimum _required_ version (starting with the
go1.19.13 and go1.20.8 patch releases: before those, it was only a
recommendation).
- For dependencies that don't have a go.mod (not a module), go language
version go1.16 is assumed.
- Likewise, for modules that have a go.mod, but the file does not have a
go version directive, go language version go1.16 is assumed.
- If a go.work file is present, but does not have a go version directive,
language version go1.17 is assumed.
When switching language versions, Go _downgrades_ the language version,
which means that language features (such as generics, and `any`) are not
available, and compilation fails. For example:
# github.com/docker/cli/cli/context/store
/go/pkg/mod/github.com/docker/cli@v25.0.0-beta.2+incompatible/cli/context/store/storeconfig.go:6:24: predeclared any requires go1.18 or later (-lang was set to go1.16; check go.mod)
/go/pkg/mod/github.com/docker/cli@v25.0.0-beta.2+incompatible/cli/context/store/store.go:74:12: predeclared any requires go1.18 or later (-lang was set to go1.16; check go.mod)
Note that these fallbacks are per-module, per-package, and can even be
per-file, so _(indirect) dependencies_ can still use modern language
features, as long as their respective go.mod has a version specified.
Unfortunately, these failures do not occur when building locally (using
vendor / GOPATH mode), but will affect consumers of the module.
Obviously, this situation is not ideal, and the ultimate solution is to
move to go modules (add a go.mod), but this comes with a non-insignificant
risk in other areas (due to our complex dependency tree).
We can revert to using go1.16 language features only, but this may be
limiting, and may still be problematic when (e.g.) matching signatures
of dependencies.
There is an escape hatch: adding a `//go:build` directive to files that
make use of go language features. From the [go toolchain docs][2]:
> The go line for each module sets the language version the compiler enforces
> when compiling packages in that module. The language version can be changed
> on a per-file basis by using a build constraint.
>
> For example, a module containing code that uses the Go 1.21 language version
> should have a `go.mod` file with a go line such as `go 1.21` or `go 1.21.3`.
> If a specific source file should be compiled only when using a newer Go
> toolchain, adding `//go:build go1.22` to that source file both ensures that
> only Go 1.22 and newer toolchains will compile the file and also changes
> the language version in that file to Go 1.22.
This patch adds `//go:build` directives to those files using recent additions
to the language. It's currently using go1.19 as version to match the version
in our "vendor.mod", but we can consider being more permissive ("any" requires
go1.18 or up), or more "optimistic" (force go1.21, which is the version we
currently use to build).
For completeness sake, note that any file _without_ a `//go:build` directive
will continue to use go1.16 language version when used as a module.
[1]: 58c28ba286/src/cmd/go/internal/gover/version.go (L9-L56)
[2]: https://go.dev/doc/toolchain
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These fields were an implementation detail of the classic image builder
and are empty when using buildkit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When the doc job is skipped, the dependent ones will be skipped
as well. To fix this issue we need to apply special conditions
to always run dependent jobs but not if canceled or failed.
Signed-off-by: CrazyMax <1951866+crazy-max@users.noreply.github.com>
build --squash is an experimental feature that is not implemented in the
containerd image store.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
We already have everything needed to work inside a container, with this
configuration file developing in moby is even easier: the IDE will ask
you if you want to run everything inside a container and set it up for
you. No need to know that you have to run "BIN_DIR=. make shell" any
more.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
No need to copy the parent label from the source dangling image, because
it will already be copied from the source image.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
A validation step was added to prevent the daemon from considering "logentries"
as a dynamically loaded plugin, causing it to continue trying to load the plugin;
WARN[2023-12-12T21:53:16.866857127Z] Unable to locate plugin: logentries, retrying in 1s
WARN[2023-12-12T21:53:17.868296836Z] Unable to locate plugin: logentries, retrying in 2s
WARN[2023-12-12T21:53:19.874259254Z] Unable to locate plugin: logentries, retrying in 4s
WARN[2023-12-12T21:53:23.879869881Z] Unable to locate plugin: logentries, retrying in 8s
But would ultimately be returned as an error to the user:
docker container create --name foo --log-driver=logentries nginx:alpine
Error response from daemon: error looking up logging plugin logentries: plugin "logentries" not found
With the additional validation step, an error is returned immediately:
docker container create --log-driver=logentries busybox
Error response from daemon: the logentries logging driver has been deprecated and removed
A migration step was added on container restore. Containers using the
"logentries" logging driver are migrated to use the "local" logging driver:
WARN[2023-12-12T22:38:53.108349297Z] migrated deprecated logentries logging driver container=4c9309fedce75d807340ea1820cc78dc5c774d7bfcae09f3744a91b84ce6e4f7 error="<nil>"
As an alternative to the validation step, I also considered using a "stub"
deprecation driver, however this would not result in an error when creating
the container, and only produce an error when starting:
docker container create --name foo --log-driver=logentries nginx:alpine
4c9309fedce75d807340ea1820cc78dc5c774d7bfcae09f3744a91b84ce6e4f7
docker start foo
Error response from daemon: failed to create task for container: failed to initialize logging driver: the logentries logging driver has been deprecated and removed
Error: failed to start containers: foo
For containers, this validation is added in the backend (daemon). For services,
this was not sufficient, as SwarmKit would try to schedule the task, which
caused a close loop;
docker service create --log-driver=logentries --name foo nginx:alpine
zo0lputagpzaua7cwga4lfmhp
overall progress: 0 out of 1 tasks
1/1: no suitable node (missing plugin on 1 node)
Operation continuing in background.
DEBU[2023-12-12T22:50:28.132732757Z] Calling GET /v1.43/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22zo0lputagpzaua7cwga4lfmhp%22%3Atrue%7D%7D
DEBU[2023-12-12T22:50:28.137961549Z] Calling GET /v1.43/nodes
DEBU[2023-12-12T22:50:28.340665007Z] Calling GET /v1.43/services/zo0lputagpzaua7cwga4lfmhp?insertDefaults=false
DEBU[2023-12-12T22:50:28.343437632Z] Calling GET /v1.43/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22zo0lputagpzaua7cwga4lfmhp%22%3Atrue%7D%7D
DEBU[2023-12-12T22:50:28.345201257Z] Calling GET /v1.43/nodes
So a validation was added in the service create and update endpoints;
docker service create --log-driver=logentries --name foo nginx:alpine
Error response from daemon: the logentries logging driver has been deprecated and removed
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The service was discontinued on November 15, 2022, so
remove mentions of this driver in the API docs.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The service was discontinued on November 15, 2022, so
remove mentions of this driver in the API docs.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Logentries service will be discontinued next week:
> Dear Logentries user,
>
> We have identified you as the owner of, or collaborator of, a Logentries account.
>
> The Logentries service will be discontinued on November 15th, 2022. This means that your Logentries account access will be removed and all your log data will be permanently deleted on this date.
>
> Next Steps
> If you are interested in an alternative Rapid7 log management solution, InsightOps will be available for purchase through December 16th, 2022. Please note, there is no support to migrate your existing Logentries account to InsightOps.
>
> Thank you for being a valued user of Logentries.
>
> Thank you,
> Rapid7 Customer Success
There is no reason to preserve this code in Moby as a result.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit switches our code to use semconv 1.21, which is the version matching
the OTEL modules, as well as the containerd code.
The BuildKit 0.12.x module currently uses an older version of the OTEL modules,
and uses the semconv 0.17 schema. Mixing schema-versions is problematic, but
we still want to consume BuildKit's "detect" package to wire-up other parts
of OTEL.
To align the versions in our code, this patch sets the BuildKit detect.Resource
with the correct semconv version.
It's worth noting that the BuildKit package has a custom "serviceNameDetector";
https://github.com/moby/buildkit/blob/v0.12.4/util/tracing/detect/detect.go#L153-L169
Whith is merged with OTEL's default resource:
https://github.com/moby/buildkit/blob/v0.12.4/util/tracing/detect/detect.go#L100-L107
There's no need to duplicate that code, as OTEL's `resource.Default()` already
provides this functionality:
- It uses fromEnv{} detector internally: https://github.com/open-telemetry/opentelemetry-go/blob/v1.19.0/sdk/resource/resource.go#L208
- fromEnv{} detector reads OTEL_SERVICE_NAME: https://github.com/open-telemetry/opentelemetry-go/blob/v1.19.0/sdk/resource/env.go#L53
This patch also removes uses of the httpconv package, which is no longer included
in semconv 1.21 and now an internal package. Removing the use of this package
means that hijacked connections will not have the HTTP attributes on the Moby
client span, which isn't ideal, but a limited loss that'd impact exec/attach.
The span itself will still exist, it just won't the additional attributes that
are added by that package.
Alternatively, the httpconv call COULD remain - it will not error and will send
syntactically valid spans but we would be mixing & matching semconv versions,
so won't be compliant.
Some parts of the httpconv package were preserved through a very minimal local
implementation; a variant of `httpconv.ClientStatus(resp.StatusCode))` is added
to set the span status (`span.SetStatus()`). The `httpconv` package has complex
logic for this, but mostly drills down to HTTP status range (1xx/2xx/3xx/4xx/5xx)
to determine if the status was successfull or non-successful (4xx/5xx).
The additional logic it provided was to validate actual status-codes, and to
convert "bogus" status codes in "success" ranges (1xx, 2xx) into an error. That
code seemed over-reaching (and not accounting for potential future _valid_
status codes). Let's assume we only get valid status codes.
- https://github.com/open-telemetry/opentelemetry-go/blob/v1.21.0/semconv/v1.17.0/httpconv/http.go#L85-L89
- https://github.com/open-telemetry/opentelemetry-go/blob/v1.21.0/semconv/internal/v2/http.go#L322-L330
- https://github.com/open-telemetry/opentelemetry-go/blob/v1.21.0/semconv/internal/v2/http.go#L356-L404
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upgrade to the latest OpenTelemetry libraries; this will unblock a lot of
downstream projects in the ecosystem to upgrade, as some of the parts here
were pre-1.0/unstable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
update the dependency to v0.2.4 to prevent scanners from flagging the
vulnerability (GHSA-6xv5-86q9-7xr8 / GO-2023-2048). Note that that vulnerability
only affects Windows, and is currently only used in runc/libcontainer, so should
not impact our use (as that code is Linux-only).
full diff: https://github.com/cyphar/filepath-securejoin/compare/v0.2.3...v0.2.4
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.10...v1.7.11
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.11
Welcome to the v1.7.11 release of containerd!
The eleventh patch release for containerd 1.7 contains various fixes and
updates including one security issue.
Notable Updates
- Fix Windows default path overwrite issue
- Update push to always inherit distribution sources from parent
- Update shim to use net dial for gRPC shim sockets
- Fix otel version incompatibility
- Fix Windows snapshotter blocking snapshot GC on remove failure
- Mask /sys/devices/virtual/powercap path in runtime spec and deny in
default apparmor profile [GHSA-7ww5-4wqc-m92c]
Deprecation Warnings
- Emit deprecation warning for AUFS snapshotter
- Emit deprecation warning for v1 runtime
- Emit deprecation warning for deprecated CRI configs
- Emit deprecation warning for CRI v1alpha1 usage
- Emit deprecation warning for CRIU config in CRI
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.9...v1.7.10
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.10
Welcome to the v1.7.10 release of containerd!
The tenth patch release for containerd 1.7 contains various fixes and
updates.
Notable Updates
- Enhance container image unpack client logs
- cri: fix using the pinned label to pin image
- fix: ImagePull should close http connection if there is no available data to read.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When reading logs, timestamps should always be presented in UTC. Unlike
the "json-file" and other logging drivers, the "local" logging driver
was using local time.
Thanks to Roman Valov for reporting this issue, and locating the bug.
Before this change:
echo $TZ
Europe/Amsterdam
docker run -d --log-driver=local nginx:alpine
fc166c6b2c35c871a13247dddd95de94f5796459e2130553eee91cac82766af3
docker logs --timestamps fc166c6b2c35c871a13247dddd95de94f5796459e2130553eee91cac82766af3
2023-12-08T18:16:56.291023422+01:00 /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2023-12-08T18:16:56.291056463+01:00 /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
2023-12-08T18:16:56.291890130+01:00 /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
...
With this patch:
echo $TZ
Europe/Amsterdam
docker run -d --log-driver=local nginx:alpine
14e780cce4c827ce7861d7bc3ccf28b21f6e460b9bfde5cd39effaa73a42b4d5
docker logs --timestamps 14e780cce4c827ce7861d7bc3ccf28b21f6e460b9bfde5cd39effaa73a42b4d5
2023-12-08T17:18:46.635967625Z /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2023-12-08T17:18:46.635989792Z /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
2023-12-08T17:18:46.636897417Z /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
...
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If no `dangling` filter is specified, prune should only delete dangling
images.
This wasn't visible by doing `docker image prune` because the CLI
explicitly sets this filter to true.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Isolate the prune effects by running the test in a separate daemon.
This minimizes the impact of/on other integration tests.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Config serialization performed by the graphdriver implementation
maintained the distinction between an empty array and having no Cmd set.
With containerd integration we serialize the OCI types directly that use
the `omitempty` option which doesn't persist that distinction.
Considering that both values should have exactly the same semantics (no
cmd being passed) it should be fine if in this case the Cmd would be
null instead of an empty array.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Acquire the mutex in the help handler to synchronize access to the
handlers map. While a trivial issue---a panic in the request handler if
the node joins a swarm at just the right time, which would only result
in an HTTP 500 response---it is also a trivial race condition to fix.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We don't need C-style callback functions which accept a void* context
parameter: Go has closures. Drop the unnecessary httpHandlerCustom type
and refactor the diagnostic server handler functions into closures which
capture whatever context they need implicitly.
If the node leaves and rejoins a swarm, the cluster agent and its
associated NetworkDB are discarded and replaced with new instances. Upon
rejoin, the agent registers its NetworkDB instance with the diagnostic
server. These handlers would all conflict with the handlers registered
by the previous NetworkDB instance. Attempting to register a second
handler on a http.ServeMux with the same pattern will panic, which the
diagnostic server would historically deal with by ignoring the duplicate
handler registration. Consequently, the first NetworkDB instance to be
registered would "stick" to the diagnostic server for the lifetime of
the process, even after it is replaced with another instance. Improve
duplicate-handler registration such that the most recently-registered
handler for a pattern is used for all subsequent requests.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Rewrite `.build-empty-images` shell script that produced special images
(emptyfs with no layers, and empty danglign image) to a Go functions
that construct the same archives in a temporary directory.
Use them to load these images on demand only in the tests that need
them.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This struct is intended for internal use only for the backend, and is
not intended to be used externally.
This moves the plugin-related `NetworkListConfig` types to the backend
package to prevent it being imported in the client, and to make it more
clear that this is part of internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These structs are intended for internal use only for the backend, and are
not intended to be used externally.
This moves the plugin-related `PluginRmConfig`, `PluginEnableConfig`, and
`PluginDisableConfig` types to the backend package to prevent them being
imported in the client, and to make it more clear that this is part of
internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.21.5 (released 2023-12-05) includes security fixes to the go command,
and the net/http and path/filepath packages, as well as bug fixes to the
compiler, the go command, the runtime, and the crypto/rand, net, os, and
syscall packages. See the Go 1.21.5 milestone on our issue tracker for
details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.21.5+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.21.4...go1.21.5
from the security mailing:
[security] Go 1.21.5 and Go 1.20.12 are released
Hello gophers,
We have just released Go versions 1.21.5 and 1.20.12, minor point releases.
These minor releases include 3 security fixes following the security policy:
- net/http: limit chunked data overhead
A malicious HTTP sender can use chunk extensions to cause a receiver
reading from a request or response body to read many more bytes from
the network than are in the body.
A malicious HTTP client can further exploit this to cause a server to
automatically read a large amount of data (up to about 1GiB) when a
handler fails to read the entire body of a request.
Chunk extensions are a little-used HTTP feature which permit including
additional metadata in a request or response body sent using the chunked
encoding. The net/http chunked encoding reader discards this metadata.
A sender can exploit this by inserting a large metadata segment with
each byte transferred. The chunk reader now produces an error if the
ratio of real body to encoded bytes grows too small.
Thanks to Bartek Nowotarski for reporting this issue.
This is CVE-2023-39326 and Go issue https://go.dev/issue/64433.
- cmd/go: go get may unexpectedly fallback to insecure git
Using go get to fetch a module with the ".git" suffix may unexpectedly
fallback to the insecure "git://" protocol if the module is unavailable
via the secure "https://" and "git+ssh://" protocols, even if GOINSECURE
is not set for said module. This only affects users who are not using
the module proxy and are fetching modules directly (i.e. GOPROXY=off).
Thanks to David Leadbeater for reporting this issue.
This is CVE-2023-45285 and Go issue https://go.dev/issue/63845.
- path/filepath: retain trailing \ when cleaning paths like \\?\c:\
Go 1.20.11 and Go 1.21.4 inadvertently changed the definition of the
volume name in Windows paths starting with \\?\, resulting in
filepath.Clean(\\?\c:\) returning \\?\c: rather than \\?\c:\ (among
other effects). The previous behavior has been restored.
This is an update to CVE-2023-45283 and Go issue https://go.dev/issue/64028.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.21.4 (released 2023-11-07) includes security fixes to the path/filepath
package, as well as bug fixes to the linker, the runtime, the compiler, and
the go/types, net/http, and runtime/cgo packages. See the Go 1.21.4 milestone
on our issue tracker for details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.21.4+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.21.3...go1.21.4
from the security mailing:
[security] Go 1.21.4 and Go 1.20.11 are released
Hello gophers,
We have just released Go versions 1.21.4 and 1.20.11, minor point releases.
These minor releases include 2 security fixes following the security policy:
- path/filepath: recognize `\??\` as a Root Local Device path prefix.
On Windows, a path beginning with `\??\` is a Root Local Device path equivalent
to a path beginning with `\\?\`. Paths with a `\??\` prefix may be used to
access arbitrary locations on the system. For example, the path `\??\c:\x`
is equivalent to the more common path c:\x.
The filepath package did not recognize paths with a `\??\` prefix as special.
Clean could convert a rooted path such as `\a\..\??\b` into
the root local device path `\??\b`. It will now convert this
path into `.\??\b`.
`IsAbs` did not report paths beginning with `\??\` as absolute.
It now does so.
VolumeName now reports the `\??\` prefix as a volume name.
`Join(`\`, `??`, `b`)` could convert a seemingly innocent
sequence of path elements into the root local device path
`\??\b`. It will now convert this to `\.\??\b`.
This is CVE-2023-45283 and https://go.dev/issue/63713.
- path/filepath: recognize device names with trailing spaces and superscripts
The `IsLocal` function did not correctly detect reserved names in some cases:
- reserved names followed by spaces, such as "COM1 ".
- "COM" or "LPT" followed by a superscript 1, 2, or 3.
`IsLocal` now correctly reports these names as non-local.
This is CVE-2023-45284 and https://go.dev/issue/63713.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon currently provides support for API versions all the way back
to v1.12, which is the version of the API that shipped with docker 1.0. On
Windows, the minimum supported version is v1.24.
Such old versions of the client are rare, and supporting older API versions
has accumulated significant amounts of code to remain backward-compatible
(which is largely untested, and a "best-effort" at most).
This patch updates the minimum API version to v1.24, which is the fallback
API version used when API-version negotiation fails. The intent is to start
deprecating older API versions, but no code is removed yet as part of this
patch, and a DOCKER_MIN_API_VERSION environment variable is added, which
allows overriding the minimum version (to allow restoring the behavior from
before this patch).
With this patch the daemon defaults to API v1.24 as minimum:
docker version
Client:
Version: 24.0.2
API version: 1.43
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:50:49 2023
OS/Arch: linux/arm64
Context: default
Server:
Engine:
Version: dev
API version: 1.44 (minimum version 1.24)
Go version: go1.21.3
Git commit: 0322a29b9ef8806aaa4b45dc9d9a2ebcf0244bf4
Built: Mon Dec 4 15:22:17 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: v1.7.9
GitCommit: 4f03e100cb967922bec7459a78d16ccbac9bb81d
runc:
Version: 1.1.10
GitCommit: v1.1.10-0-g18a0cb0
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Trying to use an older version of the API produces an error:
DOCKER_API_VERSION=1.23 docker version
Client:
Version: 24.0.2
API version: 1.23 (downgraded from 1.43)
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:50:49 2023
OS/Arch: linux/arm64
Context: default
Error response from daemon: client version 1.23 is too old. Minimum supported API version is 1.24, please upgrade your client to a newer version
To restore the previous minimum, users can start the daemon with the
DOCKER_MIN_API_VERSION environment variable set:
DOCKER_MIN_API_VERSION=1.12 dockerd
API 1.12 is the oldest supported API version on Linux;
docker version
Client:
Version: 24.0.2
API version: 1.43
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:50:49 2023
OS/Arch: linux/arm64
Context: default
Server:
Engine:
Version: dev
API version: 1.44 (minimum version 1.12)
Go version: go1.21.3
Git commit: 0322a29b9ef8806aaa4b45dc9d9a2ebcf0244bf4
Built: Mon Dec 4 15:22:17 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: v1.7.9
GitCommit: 4f03e100cb967922bec7459a78d16ccbac9bb81d
runc:
Version: 1.1.10
GitCommit: v1.1.10-0-g18a0cb0
docker-init:
Version: 0.19.0
GitCommit: de40ad0
When using the `DOCKER_MIN_API_VERSION` with a version of the API that
is not supported, an error is produced when starting the daemon;
DOCKER_MIN_API_VERSION=1.11 dockerd --validate
invalid DOCKER_MIN_API_VERSION: minimum supported API version is 1.12: 1.11
DOCKER_MIN_API_VERSION=1.45 dockerd --validate
invalid DOCKER_MIN_API_VERSION: maximum supported API version is 1.44: 1.45
Specifying a malformed API version also produces the same error;
DOCKER_MIN_API_VERSION=hello dockerd --validate
invalid DOCKER_MIN_API_VERSION: minimum supported API version is 1.12: hello
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ContainerCreateConfig` and `ContainerRmConfig` structs are used for
options to be passed to the backend, and are not used in client code.
Thess struct currently is intended for internal use only (for example, the
`AdjustCPUShares` is an internal implementation details to adjust the container's
config when older API versions are used).
Somewhat ironically, the signature of the Backend has a nicer UX than that
of the client's `ContainerCreate` signature (which expects all options to
be passed as separate arguments), so we may want to update that signature
to be closer to what the backend is using, but that can be left as a future
exercise.
This patch moves the `ContainerCreateConfig` and `ContainerRmConfig` structs
to the backend package to prevent it being imported in the client, and to make
it more clear that this is part of internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Follow-up to e72c4818c4, which updated the
Dockerfile to use Debian 12 "bookworm", but forgot to update the package
repository to use for the CRIU packages. Note that the criu stage is currently
not built by default (see d3d2823edf), so to
verify the stage, it needs to be built manually;
docker build --target=criu .
This patch adds an extra `criu --version` to the build, so that it's verified
to be "functional".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tests build new images, setupTest sets up the test cleanup
function that clears the test environment from created images,
containers, etc.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This removes various skips that accounted for running the integration tests
against older versions of the daemon before 20.10 (API version v1.41). Those
versions are EOL, and we don't run tests against them.
This reverts most of e440831802, and similar
PRs.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the minimum API version as advertised by the test-daemon, instead of the
hard-coded API version from code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was using API version 1.20 to test old behavior, but the actual change
in behavior was API v1.25; see commit 6d98e344c7
and 63b5a37203.
This updates the test to use API v1.24 to test the old behavior.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
TestKillDifferentUserContainer was migrated from integration-cli in
commit 0855922cd3. Before migration, it
was not using a specific API version, so we can assume "current"
API version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 59b83d8aae containerized these steps,
as they didn't work well on Debian Jessie:
> Because the `mount` here will sometimes fail when run in `debian:jessie`,
> which is what the environrment hosting the test suite is running if run
> from the `Makefile`.
> Also, why the heck not containerize it, all the things.
Follow-up commits, such as 228d74842f, and
1c5806cf57 updated the Debian distro, but
also updated this comment, losing the original context (the issue was
(originally) related to Debian Jessie).
This patch changes the test back to not use containers, which seems to
work fine (at least "it worked on my machine").
make TEST_IGNORE_CGROUP_CHECK=1 TEST_FILTER=TestDaemonNoSpaceLeftOnDeviceError DOCKER_GRAPHDRIVER=overlay2 test-integration
=== RUN TestDockerDaemonSuite/TestDaemonNoSpaceLeftOnDeviceError
check_test.go:589: [df36ad96a412b] daemon is not started
--- PASS: TestDockerDaemonSuite (5.12s)
--- PASS: TestDockerDaemonSuite/TestDaemonNoSpaceLeftOnDeviceError (5.12s)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was originally added in 8ec8564691,
at which time the upstream debian package repositories were not always
reliable, so using a mirror helped with CI stability and performance.
Debian's package repositories are a lot more reliable now, so there's no
longer a need to use a mirror.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When creating a CIFS volume, generate an error if the device URL
includes a port number, for example:
--opt device="//some.server.com:2345/thepath"
The port must be specified in the port option instead, for example:
--opt o=username=USERNAME,password=PASSWORD,vers=3,sec=ntlmsspi,port=1234
Signed-off-by: Rob Murray <rob.murray@docker.com>
Add the TaskStatus, PortStatus and ContainerStatus to api docs. TaskStatus was moved to the swagger definitions root from anonymous type definition, and PortStatus and Container Status are its dependencies.
Signed-off-by: Martin Jirku <martin@jirku.sk>
Move the initialization logic to the attachContext itself, so that
the container doesn't have to be aware about mutexes and other logic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Skip TestListDanglingImagesWithDigests which tests graphdriver
implementation specific behavior of `docker images --filter
dangling=true`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This removes various templating functions that were added for the
docker CLI. These are not needed for the dockerd binary, which does
not have subcommands or management commands.
Revert "Only hide commands if the env variable is set."
This reverts commit a7c8bcac2b.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We used DEBIAN_FRONTEND in some places to prevent installation of packages
from being blocked. However, debian bookworm now [includes a fix][1] for
situations like this (it was specifically reported for Docker situations <3),
so we can get rid of these.
Thanks to Tianon for noticing this, and for linking to the Debian ticket!
[1]: https://bugs.debian.org/929417
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Replace `time.Sleep` with a poll that checks if process no longer exists
to avoid possible race condition.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
I noticed this log being logged as an error, but the kill logic actually
proceeds after this (doing a "direct" kill instead). While usually containers
are expected to be exiting within the given timeout, I don't think this
needs to be logged as an error (an error is returned after we fail to
kill the container).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The auth service error response is not a part of the spec and containerd
doesn't parse it like the Docker's distribution does.
Check for containerd specific errors instead.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Debian Woodworm ships with a newer version of iptables, which caused two
tests to fail:
=== FAIL: amd64.integration-cli TestDockerDaemonSuite/TestDaemonICCLinkExpose (1.18s)
docker_cli_daemon_test.go:841: assertion failed: false (matched bool) != true (true bool): iptables output should have contained "DROP.*all.*ext-bridge6.*ext-bridge6", but was "Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)\n pkts bytes target prot opt in out source destination \n 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- * ext-bridge6 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED\n 0 0 DOCKER 0 -- * ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- ext-bridge6 !ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n 0 0 DROP 0 -- ext-bridge6 ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n"
--- FAIL: TestDockerDaemonSuite/TestDaemonICCLinkExpose (1.18s)
=== FAIL: amd64.integration-cli TestDockerDaemonSuite/TestDaemonICCPing (1.19s)
docker_cli_daemon_test.go:803: assertion failed: false (matched bool) != true (true bool): iptables output should have contained "DROP.*all.*ext-bridge5.*ext-bridge5", but was "Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)\n pkts bytes target prot opt in out source destination \n 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- * ext-bridge5 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED\n 0 0 DOCKER 0 -- * ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- ext-bridge5 !ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n 0 0 DROP 0 -- ext-bridge5 ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n"
--- FAIL: TestDockerDaemonSuite/TestDaemonICCPing (1.19s)
Both the `TestDaemonICCPing`, and `TestDaemonICCLinkExpose` test were introduced
in dd0666e64f. These tests called `iptables` with
the `-n` (`--numeric`) option, which prevents it from doing a reverse-DNS lookup
as an optimization.
However, the `-n` option did not have an effect to the `prot` column before
commit [da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa] (iptables < v1.8.9 or v1.8.8).
Newer versions, such as the iptables version shipping with Debian Woodworm do,
so we need to update the expected output for this version.
This patch removes the `-n` option, to keep the test more portable, also when
run non-containerized, and removes the use of regular expressions to check the
result, as these regular expressions were quite permissive (using `.*` wild-
card matching). Instead, we're getting the
With this change;
make DOCKER_GRAPHDRIVER=vfs TEST_FILTER=TestDaemonICC TEST_IGNORE_CGROUP_CHECK=1 test-integration
...
--- PASS: TestDockerDaemonSuite (139.11s)
--- PASS: TestDockerDaemonSuite/TestDaemonICCLinkExpose (54.62s)
--- PASS: TestDockerDaemonSuite/TestDaemonICCPing (84.48s)
[da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa]: https://git.netfilter.org/iptables/commit/?id=da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When live-restore is enabled, containers with autoremove enabled
shouldn't be forcibly killed when engine restarts.
They still should be removed if they exited while the engine was down
though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Change the repo name used as for an intermediate image so it doesn't
try to mount from the image pushed by `TestBuildMultiStageImplicitPull`.
Before this patch, this test failed because the distribution.source
labels are not cleared between tests and the busybox content still has
the distribution.source label pointing to the `dockercli/testf`
repository which is no longer present in the test registry.
So both `dockercli/busybox` and `dockercli/testf` are equally valid
mount candidates for `dockercli/crossrepopush` and containerd algorithm
just happens to select the last one.
This changes the repo name to not have the common repository component
(`dockercli`) with the `dockercli/testf` repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Starting with [6e0ed3d19c54603f0f7d628ea04b550151d8a262], the minimum
allowed size is now 300MB. Given that this is a sparse image, and
the size of the image is irrelevant to the test (we check for
limits defined through project-quotas, not the size of the
device itself), we can raise the size of this image.
[6e0ed3d19c54603f0f7d628ea04b550151d8a262]: https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/commit/?id=6e0ed3d19c54603f0f7d628ea04b550151d8a262
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On bookworm, AppArmor failed to start inside the container, which can be
seen at startup of the dev-container:
Created symlink /etc/systemd/system/systemd-firstboot.service → /dev/null.
Created symlink /etc/systemd/system/systemd-udevd.service → /dev/null.
Created symlink /etc/systemd/system/multi-user.target.wants/docker-entrypoint.service → /etc/systemd/system/docker-entrypoint.service.
hack/dind-systemd: starting /lib/systemd/systemd --show-status=false --unit=docker-entrypoint.target
systemd 252.17-1~deb12u1 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
Detected virtualization docker.
Detected architecture x86-64.
modprobe@configfs.service: Deactivated successfully.
modprobe@dm_mod.service: Deactivated successfully.
modprobe@drm.service: Deactivated successfully.
modprobe@efi_pstore.service: Deactivated successfully.
modprobe@fuse.service: Deactivated successfully.
modprobe@loop.service: Deactivated successfully.
apparmor.service: Starting requested but asserts failed.
proc-sys-fs-binfmt_misc.automount: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 49 (systemd-binfmt)
+ source /etc/docker-entrypoint-cmd
++ hack/make.sh dynbinary test-integration
When checking "aa-status", an error was printed that the filesystem was
not mounted:
aa-status
apparmor filesystem is not mounted.
apparmor module is loaded.
Checking if "local-fs.target" was loaded, that seemed to be the case;
systemctl status local-fs.target
● local-fs.target - Local File Systems
Loaded: loaded (/lib/systemd/system/local-fs.target; static)
Active: active since Mon 2023-11-27 10:48:38 UTC; 18s ago
Docs: man:systemd.special(7)
However, **on the host**, "/sys/kernel/security" has a mount, which was not
present inside the container:
mount | grep securityfs
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
Interestingly, on `debian:bullseye`, this was not the case either; no
`securityfs` mount was present inside the container, and apparmor actually
failed to start, but succeeded silently:
mount | grep securityfs
systemctl start apparmor
systemctl status apparmor
● apparmor.service - Load AppArmor profiles
Loaded: loaded (/lib/systemd/system/apparmor.service; enabled; vendor preset: enabled)
Active: active (exited) since Mon 2023-11-27 11:59:09 UTC; 44s ago
Docs: man:apparmor(7)
https://gitlab.com/apparmor/apparmor/wikis/home/
Process: 43 ExecStart=/lib/apparmor/apparmor.systemd reload (code=exited, status=0/SUCCESS)
Main PID: 43 (code=exited, status=0/SUCCESS)
CPU: 10ms
Nov 27 11:59:09 9519f89cade1 apparmor.systemd[43]: Not starting AppArmor in container
Same, using the `/etc/init.d/apparmor` script:
/etc/init.d/apparmor start
Starting apparmor (via systemctl): apparmor.service.
echo $?
0
And apparmor was not actually active:
aa-status
apparmor module is loaded.
apparmor filesystem is not mounted.
aa-enabled
Maybe - policy interface not available.
After further investigating, I found that the non-systemd dind script
had a mount for AppArmor, which was added in 31638ab2ad
The systemd variant was missing this mount, which may have gone unnoticed
because `debian:bullseye` was silently ignoring this when starting the
apparmor service.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
BaseFS is not serialized and is lost after an unclean shutdown. Unmount
method in the containerd image service implementation will not work
correctly in that case.
This patch will allow Unmount to restore the BaseFS if the target is
still mounted.
The reason it works with graphdrivers is that it doesn't directly
operate on BaseFS. It uses RWLayer, which is explicitly restored
immediately as soon as container is loaded.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is purely cosmetic - if a non-default MTU is configured, the bridge
will have the default MTU=1500 until a container's 'veth' is connected
and an MTU is set on the veth. That's a disconcerting, it looks like the
config has been ignored - so, set the bridge's MTU explicitly.
Fixes#37937
Signed-off-by: Rob Murray <rob.murray@docker.com>
The graphdriver implementation sets the ModTime of all image content to
match the `Created` time from the image config, whereas the containerd's
archive export code just leaves it empty (zero).
Adjust the test in the case where containerd integration is enabled to
check if config file ModTime is equal to zero (UNIX epoch) instead.
This behaviour is not a part of the Docker Image Specification and the
intention behind introducing it was to make the `docker save` produce
the same archive regardless of the time it was performed.
It would also be a bit problematic with the OCI archive layout which can
contain multiple images referencing the same content.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Also, err `e` is renamed into the more standard `err` as the defer
already uses `retErr` to avoid clashes (changed in f5a611a74).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
DNS config is a property of each adapter on Windows, thus we've a
dedicated `EndpointOption` for that.
The list of `EndpointOption` that should be applied to a given endpoint
is built by `buildCreateEndpointOptions`. This function contains a
seemingly flawed condition that adds the DNS config _iff_:
1. the network isn't internal ;
2. no ports are published / exposed through another sandbox endpoint ;
While 1. does make sense, there's actually no justification for 2.,
hence this commit remove this part of the condition.
This logic flaw has been made obvious by 0fd0e82, but it was originally
introduced by d1e0a78. Commit and PR comments don't mention why this is
done like so. Most probably, this was overlooked both by the original
author and the PR reviewers.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The `buildCreateEndpointOptions` does a lot of things to build the list
of `libnetwork.EndpointOption` from the `EndpointSettings` spec. To skip
ports-related options, an early return was put in the middle of that
function body.
Early returns are generally great, but put in the middle of a 150-loc
long function that does a lot, they're just a potential footgun. And I'm
the one who pulled the trigger in 052562f. Since this commit, generic
options won't be applied to endpoints if there's already one with
exposed/published ports. As a consequence, only the first endpoint can
have a user-defined MAC address right now.
Instead of moving up the code line that adds generic options, a better
change IMO is to move ports-related options, and the early-return gating
those options, to a dedicated func to make `buildCreateEndpointOptions`
slightly easier to read and reason about.
There was actually one oddity in the original
`buildCreateEndpointOptions`: the early-return also gates the addition
of `CreateOptionDNS`. These options are Windows-specific; a comment is
added to explain that. But the oddity is really: why are we checking if
an endpoint with exposed / published ports joined this sandbox to decide
whether we want to configure DNS server on the endpoint's adapter? Well,
this early-return was most probably overlooked by the original author
and by reviewers at the time these options were added (in commit d1e0a78)
Let's fix that in a follow-up commit.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The example still used the deprecated types.ContainerListOptions;
also slightly updated the example to show both stopped and running
containers, so that the example works even if no container is running.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The DirCopy() function in "graphdriver/copy/copy.go" has a special case for
skip file-attribute copying when making a hard link to an already-copied
file, if "copyMode == Hardlink". Do the same for copies of hard-links in
the source filesystem.
Significantly speeds up vfs's copy of a BusyBox filesystem (which
consists mainly of hard links to a single binary), making moby's
integration tests run more quickly and more reliably in a dev container.
Fixes#46810
Signed-off-by: Rob Murray <rob.murray@docker.com>
- full diff: https://github.com/opencontainers/runc/compare/v1.1.9...v1.1.10
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.10
This is the tenth (and most likely final) patch release in the 1.1.z
release branch of runc. It mainly fixes a few issues in cgroups, and a
umask-related issue in tmpcopyup.
- Add support for `hugetlb.<pagesize>.rsvd` limiting and accounting.
Fixes the issue of postgres failing when hugepage limits are set.
- Fixed permissions of a newly created directories to not depend on the value
of umask in tmpcopyup feature implementation.
- libcontainer: cgroup v1 GetStats now ignores missing `kmem.limit_in_bytes`
(fixes the compatibility with Linux kernel 6.1+).
- Fix a semi-arbitrary cgroup write bug when given a malicious hugetlb
configuration. This issue is not a security issue because it requires a
malicious config.json, which is outside of our threat model.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/opencontainers/runc/compare/v1.1.9...v1.1.10
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.10
This is the tenth (and most likely final) patch release in the 1.1.z
release branch of runc. It mainly fixes a few issues in cgroups, and a
umask-related issue in tmpcopyup.
- Add support for `hugetlb.<pagesize>.rsvd` limiting and accounting.
Fixes the issue of postgres failing when hugepage limits are set.
- Fixed permissions of a newly created directories to not depend on the value
of umask in tmpcopyup feature implementation.
- libcontainer: cgroup v1 GetStats now ignores missing `kmem.limit_in_bytes`
(fixes the compatibility with Linux kernel 6.1+).
- Fix a semi-arbitrary cgroup write bug when given a malicious hugetlb
configuration. This issue is not a security issue because it requires a
malicious config.json, which is outside of our threat model.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a strong type for the DNS IP-addresses so that we can use flags.IPSliceVar,
instead of implementing our own option-type and validation.
Behavior should be the same, although error-messages have slightly changed:
Before this patch:
dockerd --dns 1.1.1.1oooo --validate
Status: invalid argument "1.1.1.1oooo" for "--dns" flag: 1.1.1.1oooo is not an ip address
See 'dockerd --help'., Code: 125
cat /etc/docker/daemon.json
{"dns": ["1.1.1.1"]}
dockerd --dns 2.2.2.2 --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: dns: (from flag: [2.2.2.2], from file: [1.1.1.1])
cat /etc/docker/daemon.json
{"dns": ["1.1.1.1oooo"]}
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: 1.1.1.1ooooo is not an ip address
With this patch:
dockerd --dns 1.1.1.1oooo --validate
Status: invalid argument "1.1.1.1oooo" for "--dns" flag: invalid string being converted to IP address: 1.1.1.1oooo
See 'dockerd --help'., Code: 125
cat /etc/docker/daemon.json
{"dns": ["1.1.1.1"]}
dockerd --dns 2.2.2.2 --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: dns: (from flag: [2.2.2.2], from file: [1.1.1.1])
cat /etc/docker/daemon.json
{"dns": ["1.1.1.1oooo"]}
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid IP address: 1.1.1.1oooo
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- document accepted values
- add test-coverage for the function's behavior (including whitespace handling),
and use sub-tests.
- improve error-message to use uppercase for "IP", and to use a common prefix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I was trying to find out why `docker info` was sometimes slow so
plumbing a context through to propagate trace data through.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The forwarding database (fdb) of Linux VXLAN links are restricted to
entries with destination VXLAN tunnel endpoint (VTEP) address of a
single address family. Which address family is permitted is set when the
link is created and cannot be modified. The overlay network driver
creates VXLAN links such that the kernel only allows fdb entries to be
created with IPv4 destination VTEP addresses. If the Swarm is configured
with IPv6 advertise addresses, creating fdb entries for remote peers
fails with EAFNOSUPPORT (address family not supported by protocol).
Make overlay networks functional over IPv6 transport by configuring the
VXLAN links for IPv6 VTEPs if the local node's advertise address is an
IPv6 address. Make encrypted overlay networks secure over IPv6 transport
by applying the iptables rules to the ip6tables when appropriate.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Early return if the iface or its address is nil to make the whole
function slightly easier to read.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
It's used in various defers, but was using `err` as name, which can be
confusing, and increases the risk of accidentally shadowing the error.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
We remap the snapshot when we create a container, we have to to the
inverse when we commit the container into an image
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This test broke in 98323ac114.
This commit renamed WithMacAddress into WithContainerWideMacAddress.
This helper sets the MacAddress field in container.Config. However, API
v1.44 now ignores this field if the NetworkMode has no matching entry in
EndpointsConfig.
This fix uses the helper WithMacAddress and specify for which
EndpointConfig the MacAddress is specified.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
When server address is not provided with the auth configuration,
use the domain from the image provided with the auth.
Signed-off-by: Derek McGowan <derek@mcg.dev>
The workaround is no longer required. The bug has been fixed in stable
versions of all supported containerd branches.
This reverts commit fb7ec1555c.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Change the non-refcounted implementation to perform the mount using the
same identity and access right. They should be the same regardless if
we're refcounting or not.
This also allows to refactor refCountMounter into a mounter decorator.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This test was rewritten from an integration-cli test in commit
68d9beedbe, and originally implemented in
f4942ed864, which rewrote it from a unit-
test to an integration test.
Originally, it would check for the raw JSON response from the daemon, and
check for individual fields to be present in the output, but after commit
0fd5a65428, `client.Info()` was used, and
now the response is unmarshalled into a `system.Info`.
The remainder of the test remained the same in that rewrite, and as a
result were were now effectively testing if a `system.Info` struct,
when marshalled as JSON would show all the fields (surprise: it does).
TL;DR; the test would even pass with an empty `system.Info{}` struct,
which didn't provide much coverage, as it passed without a daemon:
func TestInfoAPI(t *testing.T) {
// always shown fields
stringsToCheck := []string{
"ID",
"Containers",
"ContainersRunning",
"ContainersPaused",
"ContainersStopped",
"Images",
"LoggingDriver",
"OperatingSystem",
"NCPU",
"OSType",
"Architecture",
"MemTotal",
"KernelVersion",
"Driver",
"ServerVersion",
"SecurityOptions",
}
out := fmt.Sprintf("%+v", system.Info{})
for _, linePrefix := range stringsToCheck {
assert.Check(t, is.Contains(out, linePrefix))
}
}
This patch makes the test _slightly_ better by checking if the fields
are non-empty. More work is needed on this test though; currently it
uses the (already running) daemon, so it's hard to check for specific
fields to be correct (withouth knowing state of the daemon), but it's
not unlikely that other tests (partially) cover some of that. A TODO
comment was added to look into that (we should probably combine some
tests to prevent overlap, and make it easier to spot "gaps" as well).
While working on this, also moving the `SystemTime` into this test,
because that field is (no longer) dependent on "debug" state
(It is was actually this change that led me down this rabbit-hole)
()_()
(-.-)
'(")(")'
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Following tests are implemented in this specific commit:
- Inter-container communications for internal and non-internal
bridge networks, over IPv4 and IPv6.
- Inter-container communications using IPv6 link-local addresses for
internal and non-internal bridge networks.
- Inter-network communications for internal and non-internal bridge
networks, over IPv4 and IPv6, are disallowed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This commit introduces a new integration test suite aimed at testing
networking features like inter-container communication, network
isolation, port mapping, etc... and how they interact with daemon-level
and network-level parameters.
So far, there's pretty much no tests making sure our networks are well
configured: 1. there're a few tests for port mapping, but they don't
cover all use cases ; 2. there're a few tests that check if a specific
iptables rule exist, but that doesn't prevent that specific iptables
rule to be wrong in the first place.
As we're planning to refactor how iptables rules are written, and change
some of them to fix known security issues, we need a way to test all
combinations of parameters. So far, this was done by hand, which is
particularly painful and time consuming. As such, this new test suite is
foundational to upcoming work.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
When the start interval is 0 we should treat that as unset.
This is especially important for older API versions where we reset the
value to 0.
Instead of using the default probe value we should be using the
configured `interval` value (which may be a default as well) which gives
us back the old behavior before support for start interval was added.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This syncs the seccomp profile with changes made to containerd's default
profile in [1].
The original containerd issue and PR mention:
> Security experts generally believe io_uring to be unsafe. In fact
> Google ChromeOS and Android have turned it off, plus all Google
> production servers turn it off. Based on the blog published by Google
> below it seems like a bunch of vulnerabilities related to io_uring can
> be exploited to breakout of the container.
>
> [2]
>
> Other security reaserchers also hold this opinion: see [3] for a
> blackhat presentation on io_uring exploits.
For the record, these syscalls were added to the allowlist in [4].
[1]: a48ddf4a20
[2]: https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html
[3]: https://i.blackhat.com/BH-US-23/Presentations/US-23-Lin-bad_io_uring.pdf
[4]: https://github.com/moby/moby/pull/39415
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This test is very weird, the Size in the manifests that it creates is
wrong, graph drivers only print a warning in that case but containerd
fails because it verifies more things. The media types are also wrong in
the containerd case, the manifest list forces the media type to be
"schema2.MediaTypeManifest" but in the containerd case the media type is
an OCI one.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The build target is not quoted and it makes it difficult for some
persons to see what the problem is.
By quoting it we emphasize that the target name is variable.
Signed-off-by: Frank Villaro-Dixon <frank.villarodixon@merkle.com>
I am finally convinced that, given two netip.Prefix values a and b, the
expression
a.Contains(b.Addr()) || b.Contains(a.Addr())
is functionally equivalent to
a.Overlaps(b)
The (netip.Prefix).Contains method works by masking the address with the
prefix's mask and testing whether the remaining most-significant bits
are equal to the same bits in the prefix. The (netip.Prefix).Overlaps
method works by masking the longer prefix to the length of the shorter
prefix and testing whether the remaining most-significant bits are
equal. This is equivalent to
shorterPrefix.Contains(longerPrefix.Addr()), therefore applying Contains
symmetrically to two prefixes will always yield the same result as
applying Overlaps to the two prefixes in either order.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This delete was originally added in b37fdc5dd1
and migrated from `deleteImages(repoName)` in commit 1e55ace875,
however, deleting `foobar-save-multi-images-test` (`foobar-save-multi-images-test:latest`)
always resulted in an error;
Error response from daemon: No such image: foobar-save-multi-images-test:latest
This patch removes the redundant image delete.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Shutting down containers on Windows can take a long time (with hyper-v),
causing this test to be flaky; seen failing on windows 2022;
=== FAIL: github.com/docker/docker/integration/image TestSaveRepoWithMultipleImages (23.16s)
save_test.go:104: timeout waiting for container to exit
Looking at the test, we run a container only to commit it, and the test
does not make changes to the container's filesystem; it only runs a container
with a custom command (`true`).
Instead of running the container, we can _create_ a container and commit it;
this simplifies the tests, and prevents having to wait for the container to
exit (before committing).
To verify:
make BIND_DIR=. DOCKER_GRAPHDRIVER=vfs TEST_FILTER=TestSaveRepoWithMultipleImages test-integration
INFO: Testing against a local daemon
=== RUN TestSaveRepoWithMultipleImages
--- PASS: TestSaveRepoWithMultipleImages (1.20s)
PASS
DONE 1 tests in 2.668s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When starting a daemon in debug mode (such as used in CI), many log-messages
are printed during startup. As a result, the log message indicating whether
graph-drivers or snapshotters are used may appear far separate from the
informational log about the daemon (and selected storage-driver).
The existing log-driver also unconditionally uses the legacy "graph-driver"
terminology, instead of the more generic "storage-driver".
This patch changes the log message shown during startup to use the generic
"graph-driver" as field, and adds a new field that indicates wheter we're
using snapshotters or graph-drivers.
Given that snapshotters will be the default at some point, an alternative
could be to include the _type_ of driver used, for example;
`io.containerd.snapshotter.v1`, which may continue to be relevant after
snapshotters become the default, and at which point (potentially) the
type of snapshotter becomes more relevant.
Before this change:
TEST_INTEGRATION_USE_SNAPSHOTTER=1 DOCKER_GRAPHDRIVER=overlayfs dockerd
...
INFO[2023-10-31T09:12:33.586269801Z] Starting daemon with containerd snapshotter integration enabled
INFO[2023-10-31T09:12:33.586322176Z] Loading containers: start.
INFO[2023-10-31T09:12:33.640514759Z] Loading containers: done.
INFO[2023-10-31T09:12:33.646498134Z] Docker daemon commit=dcf7287d647bcb515015e389df46ccf1e09855b7 graphdriver=overlayfs version=dev
INFO[2023-10-31T09:12:33.646706551Z] Daemon has completed initialization
INFO[2023-10-31T09:12:33.658840592Z] API listen on /var/run/docker.sock
With this change;
TEST_INTEGRATION_USE_SNAPSHOTTER=1 DOCKER_GRAPHDRIVER=overlayfs dockerd
...
INFO[2023-10-31T08:41:38.841155928Z] Starting daemon with containerd snapshotter integration enabled
INFO[2023-10-31T08:41:38.841207512Z] Loading containers: start.
INFO[2023-10-31T08:41:38.902461053Z] Loading containers: done.
INFO[2023-10-31T08:41:38.910535137Z] Docker daemon commit=dcf7287d647bcb515015e389df46ccf1e09855b7 containerd-snapshotter=true storage-driver=overlayfs version=dev
INFO[2023-10-31T08:41:38.910936803Z] Daemon has completed initialization
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These platforms are filled by default from containerd
introspection API and may not be normalized. Initializing
wrong platform in here results in incorrect platform
for BUILDPLATFORM and TARGETPLATFORM build-args for
Dockerfile frontend (and probably other side effects).
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Redirecting check-config.sh output to a file puts control character
output into that file, which isn't helpful for reading.
Disable colorized output if either
1. NO_COLOR environment is set to "1"
2. stdout is not a terminal.
Signed-off-by: Scott Moser <smoser@brickies.net>
In case of `docker push -a`, we need to return an error if there is no
image for the given repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't wrap the `no basic auth credentials` error from containerd and
return it as-is.
The error will look like:
```
failed to resolve reference "docker.io/library/aodkoakds:latest": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The 403 error might not only be raised in swarm operations. It is
also returned when the given container is already connected to the
network and is currently running. I noticed this when during the
following PR: https://github.com/containers/podman/pull/20365
Signed-off-by: Philipp Fruck <dev@p-fruck.de>
When writing a tar file with archive/tar, extended attributes in the
deprecated (tar.Header).Xattrs map take precedence over conflicting
'SCHILY.xattr' records in the (tar.Header).PAXRecords map. Update
package tarsum to follow the same precedence rules as archive/tar.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Adds test ensuring that additional groups set with `--group-add`
are kept on exec when container had `--user` set on run.
Regression test for https://github.com/moby/moby/issues/46712
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Add a new `com.docker.network.host_ipv6` bridge option to compliment
the existing `com.docker.network.host_ipv4` option. When set to an
IPv6 address, this causes the bridge to insert `SNAT` rules instead of
`MASQUERADE` rules (assuming `ip6tables` is enabled). `SNAT` makes it
possible for users to control the source IP address used for outgoing
connections.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
Kept `coci` import alias since we use it elsewhere,
maybe to prevent confusion with our own `oci` package.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Having a sandbox/container-wide MacAddress field makes little sense
since a container can be connected to multiple networks at the same
time. This field is an artefact of old times where a container could be
connected to a single network only.
As we now have a way to specify per-endpoint mac address, this field is
now deprecated.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prior to this commit, only container.Config had a MacAddress field and
it's used only for the first network the container connects to. It's a
relic of old times where custom networks were not supported.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- Merge BC conds for API < v1.42 together
- Merge BC conds for API < v1.44 together
- Re-order BC conds by API version
- Move pids-limit normalization after BC conds
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The same error is already returned by `(*Daemon).containerCreate()` but
since this function is also called by the cluster executor, the error
has to be duplicated.
Doing that allows to remove a nil check on container config in
`postContainersCreate`.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
containerd's `WithUser` function now resets this property, starting with
[3eda46af12b1deedab3d0802adb2e81cb3521950][1] (v1.7.0-beta.4), so we no
longer need this function.
[1]: 3eda46af12
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The github.com/opencontainers/runc/libcontainer/user package was moved
to a separate module. While there's still uses of the old module in
our code-base, runc itself is migrating to the new module, and deprecated
the old package (for runc 1.2).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit def549c8f6 passed through the context
to the daemon.ContainerStart function. As a result, restarting containers
no longer is an atomic operation, because a context cancellation could
interrupt the restart (between "stopping" and "(re)starting"), resulting
in the container being stopped, but not restarted.
Restarting a container, or more factually; making a successful request on
the `/containers/{id]/restart` endpoint, should be an atomic operation.
This patch uses a context.WithoutCancel for restart requests.
It's worth noting that daemon.containerStop already uses context.WithoutCancel,
so in that function, we'll be wrapping the context twice, but this should
likely not cause issues (just redundant for this code-path).
Before this patch, starting a container that bind-mounts the docker socket,
then restarting itself from within the container would cancel the restart
operation. The container would be stopped, but not started after that:
docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
docker exec myself sh -c 'docker restart myself'
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3a2a741c65ff docker:cli "docker-entrypoint.s…" 26 seconds ago Exited (128) 7 seconds ago myself
With this patch: the stop still cancels the exec, but does not cancel the
restart operation, and the container is started again:
docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
docker exec myself sh -c 'docker restart myself'
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4393a01f7c75 docker:cli "docker-entrypoint.s…" About a minute ago Up 4 seconds myself
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fix a silly bug in the implementation which had the effect of
len(h.Xattrs) blank entries being inserted in the middle of
orderedHeaders. Luckily this is not a load-bearing bug: empty headers
are ignored as the tarsum digest is computed by concatenating header
keys and values without any intervening delimiter.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing pkg/archive unit tests are primarily round-trip tests which
assert that pkg/archive produces tarballs which pkg/archive can unpack.
While these tests are effective at catching regressions in archiving or
unarchiving, they have a blind spot for regressions in compatibility
with the rest of the ecosystem. For example, a typo in the capabilities
extended attribute constant would result in subtly broken image layer
tarballs, but the existing tests would not catch the bug if both the
archiving and unarchiving implementations have the same typo.
Extend the test for archiving an overlay filesystem layer to assert that
the overlayfs style whiteouts (extended attributes and device files) are
transformed into AUFS-style whiteouts (magic file names).
Extend the test for archiving files with extended attributes to assert
that the extended attribute is encoded into the file's tar header in the
standard, interoperable format compatible with the rest of the
ecosystem.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Use context.WithoutCancel so that both the containerStop and
container.Wait can share the same parent context. This context is still
a "TODO", but can be wired up in future.
It's worth noting that daemon.containerStop already uses context.WithoutCancel,
so in that function, we'll be wrapping the context twice, but this should
likely not cause issues (just redundant for this code-path).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Follow-up to fc94ed0a86. Now that
f6e44bc0e8 added the compatcontext
package, we can start using context.WithoutCancel.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In the tagged case the error message when the image/tag is not found
should be "tag does not exist: ref"
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
While there is nothing inherently wrong with goto statements, their use
here is not helping with readability.
Signed-off-by: Cory Snider <csnider@mirantis.com>
ds.cache is never nil so the uncached code paths are unreachable in
practice. And given how many KVObject deep-copy implementations shallow
copy pointers and other reference-typed values, there is the distinct
possibility that disabling the datastore cache could break things.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The datastore cache only uses the reference to its datastore to get a
reference to the backing store. Modify the cache to take the backing
store reference directly so that methods on the datastore can't get
called, as that might result in infinite recursion between datastore and
cache methods.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Also removing some waitRun call, as they were not actually checked for
results, and the tests depended on that behavior (to get events about
the container starting etc).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When choosing the next image, don't reject images without the classic
builder parent label. The intention was to *prefer* images them instead
of making that a condition.
This fixes the ID not being filled for parent images that weren't built
with the classic builder.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The `Tags` slice of each history entry was filled with tags of parent
image. Change it to correctly assign the current image tags.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Check for accurate values that may contain content sizes unknown to the
usage test in the calculation. Avoid asserting using deep equals when
only the expected value range is known to the test.
Signed-off-by: Derek McGowan <derek@mcg.dev>
Add IP_NF_MANGLE to "Generally Required" kernel features, since it appears to be necessary for Docker Swarm to work.
Closes https://github.com/moby/moby/issues/46636
Signed-off-by: Stephan Henningsen <stephan-henningsen@users.noreply.github.com>
Inline the tortured logic for deciding when to skip updating the svc
records to give us a fighting chance at deciphering the logic behind the
logic and spotting logic bugs.
Update the service records synchronously. The only potential for issues
is if this change introduces deadlocks, which should be fixed by
restrucuting the mutexes rather than papering over the issue with
sketchy hacks like deferring the operation to a goroutine.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Its only remaining purpose is to elide removing the endpoint from the
service records if it was not previously added. Deleting the service
records is an idempotent operation so it is harmless to delete service
records which do not exist.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The service db entry for each network is deleted by
(*Controller).cleanupServiceDiscovery() when the network is deleted.
There is no need to also eagerly delete it whenever the network's
endpoint count drops to zero.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The logic to rename an endpoint includes code which would synchronize
the renamed service records to peers through the distributed datastore.
It would trigger the remote peers to pick up the rename by touching a
datastore object which remote peers would have subscribed to events on.
The code also asserts that the local peer is subscribed to updates on
the network associated with the endpoint, presumably as a proxy for
asserting that the remote peers would also be subscribed.
https://github.com/moby/libnetwork/pull/712
Libnetwork no longer has support for distributed datastores or
subscribing to datastore object updates, so this logic can be deleted.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The meaning of the (*Controller).isDistributedControl() method is not
immediately clear from the name, and it does not have any doc comment.
It returns true if and only if the controller is neither a manager node
nor an agent node -- that is, if the daemon is _not_ participating in a
Swarm cluster. The method name likely comes from the old abandoned
datastore-as-IPC control plane architecture for libnetwork. Refactor
c.isDistributedControl() -> !c.isSwarmNode()
to make it easier to understand code which consumes the method.
Signed-off-by: Cory Snider <csnider@mirantis.com>
server: prohibit more than MaxConcurrentStreams handlers from running at once
(CVE-2023-44487).
In addition to this change, applications should ensure they do not leave running
tasks behind related to the RPC before returning from method handlers, or should
enforce appropriate limits on any such work.
- https://github.com/grpc/grpc-go/compare/v1.56.2...v1.56.3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We support importing images for other platforms when
using the containerd image store, so we shouldn't validate
the image OS on import.
This commit also splits the test into two, so that we can
keep running the "success" import with a custom platform tests
running w/ c8d while skipping the "error/rejection" test cases.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
After a successful push, all pushed blobs should have a
distribution.source label pointing to the new registry.
Before this commit, the label was only appended to the top-level blob
(manifest or manifest list). Adjust this to also do that recursively to
its children.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use the distribution code to query the remote repository for tags and
pull them sequentially just like the non-c8d pull.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Add a function to return tags for the given repository reference. This
is needed to implement the `pull -a` (pull all tags) for containerd
which doesn't directly use distribution, but we need to somehow make an
API call to the registry to obtain the available tags.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When the default bridge is disabled by setting dockerd's `--bridge=none`
option, the daemon still creates a sandbox for containers with no
network attachment specified. In that case `NetworkDisabled` will be set
to true.
However, currently the `releaseNetwork` call will early return if
NetworkDisabled is true. Thus, these sandboxes won't be deleted until
the daemon is restarted. If a high number of such containers are
created, the daemon would then take few minutes to start.
See https://github.com/moby/moby/issues/42461.
Signed-off-by: payall4u <payall4u@qq.com>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This is a follow-up to 2216d3ca8d, which
implemented the StartInterval for health-checks, but did not add validation
for the minimum accepted interval;
> The time to wait between checks in nanoseconds during the start period.
> It should be 0 or at least 1000000 (1 ms). 0 means inherit.
This patch adds validation for the minimum accepted interval (1ms).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While updating the docker/docker dependency in BuildKit, I noticed that the
dependency tree showed _two_ separate versions of the semconv package;
BuildKit and containerd were using the v1.17.0 version and docker/docker was
using v1.7.0.
This patch updates the version we use to align with BuildKit and containerd.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Rename all variables/fields/map keys associated with the
`com.docker.network.host_ipv4` option from `HostIP` to `HostIPv4`.
Rationale:
* This makes the variable/field name consistent with the option
name.
* This makes the code more readable because it is clear that the
variable/field does not hold an IPv6 address. This will hopefully
avoid bugs like <https://github.com/moby/moby/issues/46445> in the
future.
* If IPv6 SNAT support is ever added, the names will be symmetric.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
Rather than pass an `iptables.IPVersion` value alongside every
`iptRule` parameter, embed the IP version in the `iptRule` struct.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
That field was only used to pass `-t nat` for NAT rules. Now `-t
<tableName>` (where `<tableName>` is one of the `iptables.Table`
values) is always passed, eliminating the need for `preArgs`.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
Pass the entire `*networkConfiguration` struct to
`setupIPTablesInternal` to simplify the function signature and improve
code readability.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
update buildkit to the latest code in the v0.12 branch:
full diff: f94ed7cec3...6560bb937e
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
This reverts commit 8777592397, which
turns out to break other test cases/the registry flow.
The correct place to handle missing credentials is instead
15bf23df09/remotes/docker/authorizer.go (L200).
Co-authored-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Use a unique parent view snapshot key for each diff request.
I considered using singleflight at first, but I realized it wouldn't
really be correct.
The diff can take some time, so there's a window of time between the
diff start and finish, where the file system can change.
These changes not always will be reflected in the running diff.
With singleflight, the second diff request which happened before the
previous diff was finished, would not include changes made to the
container filesystem after the first diff request has started.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Implement a behavior from the graphdriver's export where `docker save
something` (untagged reference) would export all images matching the
specified repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
On modern kernels this is an alias; however newer code has preferred
ctstate while older code has preferred the deprecated 'state' name.
Prefer the newer name for uniformity in the rules libnetwork creates,
and because some implementations/distributions of the xtables userland
tools may not support the legacy alias.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The image that this test pulls contains an error in the linux/amd64
manifest description, the reported size is 424 but the actual size is
524, making this test fail with containerd.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
go1.21.3 (released 2023-10-10) includes a security fix to the net/http package.
See the Go 1.21.3 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.21.3+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.21.2...go1.21.3
From the security mailing:
[security] Go 1.21.3 and Go 1.20.10 are released
Hello gophers,
We have just released Go versions 1.21.3 and 1.20.10, minor point releases.
These minor releases include 1 security fixes following the security policy:
- net/http: rapid stream resets can cause excessive work
A malicious HTTP/2 client which rapidly creates requests and
immediately resets them can cause excessive server resource consumption.
While the total number of requests is bounded to the
http2.Server.MaxConcurrentStreams setting, resetting an in-progress
request allows the attacker to create a new request while the existing
one is still executing.
HTTP/2 servers now bound the number of simultaneously executing
handler goroutines to the stream concurrency limit. New requests
arriving when at the limit (which can only happen after the client
has reset an existing, in-flight request) will be queued until a
handler exits. If the request queue grows too large, the server
will terminate the connection.
This issue is also fixed in golang.org/x/net/http2 v0.17.0,
for users manually configuring HTTP/2.
The default stream concurrency limit is 250 streams (requests)
per HTTP/2 connection. This value may be adjusted using the
golang.org/x/net/http2 package; see the Server.MaxConcurrentStreams
setting and the ConfigureServer function.
This is CVE-2023-39325 and Go issue https://go.dev/issue/63417.
This is also tracked by CVE-2023-44487.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.21.2 (released 2023-10-05) includes one security fixes to the cmd/go package,
as well as bug fixes to the compiler, the go command, the linker, the runtime,
and the runtime/metrics package. See the Go 1.21.2 milestone on our issue
tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.21.2+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.21.1...go1.21.2
From the security mailing:
[security] Go 1.21.2 and Go 1.20.9 are released
Hello gophers,
We have just released Go versions 1.21.2 and 1.20.9, minor point releases.
These minor releases include 1 security fixes following the security policy:
- cmd/go: line directives allows arbitrary execution during build
"//line" directives can be used to bypass the restrictions on "//go:cgo_"
directives, allowing blocked linker and compiler flags to be passed during
compliation. This can result in unexpected execution of arbitrary code when
running "go build". The line directive requires the absolute path of the file in
which the directive lives, which makes exploting this issue significantly more
complex.
This is CVE-2023-39323 and Go issue https://go.dev/issue/63211.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/golang/net/compare/v0.13.0...v0.17.0
This fixes the same CVE as go1.21.3 and go1.20.10;
- net/http: rapid stream resets can cause excessive work
A malicious HTTP/2 client which rapidly creates requests and
immediately resets them can cause excessive server resource consumption.
While the total number of requests is bounded to the
http2.Server.MaxConcurrentStreams setting, resetting an in-progress
request allows the attacker to create a new request while the existing
one is still executing.
HTTP/2 servers now bound the number of simultaneously executing
handler goroutines to the stream concurrency limit. New requests
arriving when at the limit (which can only happen after the client
has reset an existing, in-flight request) will be queued until a
handler exits. If the request queue grows too large, the server
will terminate the connection.
This issue is also fixed in golang.org/x/net/http2 v0.17.0,
for users manually configuring HTTP/2.
The default stream concurrency limit is 250 streams (requests)
per HTTP/2 connection. This value may be adjusted using the
golang.org/x/net/http2 package; see the Server.MaxConcurrentStreams
setting and the ConfigureServer function.
This is CVE-2023-39325 and Go issue https://go.dev/issue/63417.
This is also tracked by CVE-2023-44487.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.
This patch moves our own uses of the package to use the new module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Fix storageDriver gcs not registered in binaries
- reference: replace uses of deprecated function SplitHostname
- Dont parse errors as JSON unless Content-Type is set to JSON
- update to go1.20.8
- Set Content-Type header in registry client ReadFrom
- deprecate reference package, migrate to github.com/distribution/reference
- digestset: deprecate package in favor of go-digest/digestset
- Do not close HTTP request body in HTTP handler
full diff: https://github.com/distribution/distribution/compare/v2.8.2...v2.8.3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`docker build --squash` is an experimental feature which is not
implemented for containerd image store.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`docker.io` is present in the `IndexConfigs` so the `Mirrors` property
would get lost because a fresh `RegistryConfig` object was created.
Instead of creating a new object, reuse the existing one and just
mutate its fields.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Extract the distribution source label append into its own function and
make it not fail on any error, we do still log the error.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
These are not yet implemented with containerd snapshotters. We skip them
now because implementing this is not trivial with containerd.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
So far, internal networks were only isolated from the host by iptables
DROP rules. As a consequence, outbound connections from containers would
timeout instead of being "rejected" through an immediate ICMP dest/port
unreachable, a TCP RST or a failing `connect` syscall.
This was visible when internal containers were trying to resolve a
domain that don't match any container on the same network (be it a truly
"external" domain, or a container that don't exist/is dead). In that
case, the embedded resolver would try to forward DNS queries for the
different values of resolv.conf `search` option, making DNS resolution
slow to return an error, and the slowness being exacerbated by some libc
implementations.
This change makes `connect` syscall to return ENETUNREACH, and thus
solves the broader issue of failing fast when external connections are
attempted.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Path prefixes were originally disallowed in the `--registry-mirrors`
option because the /v1 endpoint was assumed to be at the root of the
URI. This is no longer the case in v2.
Close#36598
Signed-off-by: Régis Behmo <regis@behmo.com>
This change creates a few OTEL spans and plumb context through the DNS
resolver and DNS backends (ie. Sandbox and Network). This should help
better understand how much lock contention impacts performance, and
help debug issues related to DNS queries (we basically have no
visibility into what's happening here right now).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This isn't something that user should do, but technically the dangling
images exist in the image store and user can pass its name (`moby-dangling@digest`).
Change it so rmi now recognizes that it's actually a dangling image and
doesn't handle it like a regular tagged image.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- Pass empty containerd socket which forces the daemon to create a new
supervised containerd. Otherwise a global containerd daemon will be
used and the pulled image data will be stored in its data directory,
instead of the the newly specified `data-root` that has a limited
storage capacity.
- Don't try to use `vfs` snapshotter, instead use `native` which is
containerd's equivalent for `vfs`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Instead of passing a completely fresh context without any values, just
discard the cancellation.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This reverts commit a9fa147a92.
The commit is unfortunately broken as it is still using `providerHandle`
to write events but that handle is never actually set, so it is always
invalid. All logging fails.
Note: This is note a straight revert due to the change to
containerd/log.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
To match the graphdriver's push behavior which only shows the progress
for layers.
Exclude indexes, manifests and image configs from the push progress.
Don't explicitly check for `IsLayerType` to also handle other
potentially big blobs (like buildkit attestations).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This was mistakenly added to bklog.
Since this is getting attached to the standard logger, and bklog is
using the standard logger, we only need this added once.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Fix issue 46563 "Rootful-in-Rootless dind doesn't work since systemd v250 (due to oom score adj)"
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Before this commit, `doPack`, `doUnpack` and `doUnpackLayer` were not implemented for Darwin, causing build failure.
This change allows all non-Linux Unixes to use FreeBSD reexec-based pack/unpack implementation
See also: moby/buildkit#4059
See also: 8b843732b3
Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
This required changes to the download-URL, as downloads are now provided
using the full version (including the `.0` patch version);
curl -sI https://go.dev/dl/go1.21.windows-amd64.zip | grep 'location'
location: https://dl.google.com/go/go1.21.windows-amd64.zip
curl -sI https://dl.google.com/go/go1.21.windows-amd64.zip
HTTP/2 404
# ...
curl -sI https://dl.google.com/go/go1.21.0.windows-amd64.zip
HTTP/2 200
# ...
Unfortunately this also means that the GO_VERSION can no longer be set to
versions lower than 1.21.0 (without additional changes), because older
versions do NOT provide the `.0` version, and Go 1.21.0 and up, no longer
provides URLs _without_ the `.0` version.
Co-authored-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test was depending on the client constructing an error based on the
http-status code, and the client not reading the response body if the
response was not a JSON response.
This fix;
- adds the correct content-type headers in the response
- includes error-messages in the response
- adds additional tests to cover both the plain (non-JSON) and JSON
error responses, as well as an empty response.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was setting the content-type header after WriteHeader was
called, and the header was not sent because of that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use named return variables to make the function more self-describing
- rename variable for readability
- slightly optimize slice initialization, and keep linters happy
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Copy the implementation of `context.WithoutCancel` introduced in Go 1.21
to be able to use it when building with older versions.
This will use the stdlib directly when building with Go 1.21+.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This commit moves one-shot stats processing out of the publishing
channels, i.e. collect stats directly.
Also changes the method of getSystemCPUUsage() on Linux to return
number of online CPUs also.
Signed-off-by: Xinfeng Liu <XinfengLiu@icloud.com>
It was only used in a single place, and it was defined far away from
where it was used.
Move the code inline, so that it's clear at a glance what it's doing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The store field is only mutated by Controller.initStores(), which is
only called inside the cosntructor (libnetwork.New), so there should be
no need to protect the field with a mutex in non-exported functions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Controller.getNetworkFromStore() already returns a ErrNoSuchNetwork if
no network was found, so we don't need to convert the existing error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently, all traces coming from the API have an empty operation
string, which make them indistinguishable from each other without looking
at the logs of the root span, and prevent proper filtering on Jaeger UI.
With this change, traces get the route pattern as the operation string.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The reason it doesn't change with the graphdrivers is caused by an
implementation detail and the fact that the image is loaded into the
same daemon it was saved from.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Rewrite TestSaveMultipleNames and TestSaveSingleTag so that they don't
use legacy `repositories` file (which isn't present in the OCI
archives).
`docker save` output is now OCI compatible, so we don't need
to use the legacy file.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The input archive is in the old Docker format that's not OCI compatible
and is not supported by the containerd archive import:
```
17d1436ef796af2fc2210cc37c4672e5aa1b62cb08ac4b95dd15372321105a66/
17d1436ef796af2fc2210cc37c4672e5aa1b62cb08ac4b95dd15372321105a66/VERSION
17d1436ef796af2fc2210cc37c4672e5aa1b62cb08ac4b95dd15372321105a66/json
17d1436ef796af2fc2210cc37c4672e5aa1b62cb08ac4b95dd15372321105a66/layer.tar
25445a0fc5025c3917a0cd6e307d92322540e0da691614312ddea22511b71513/
25445a0fc5025c3917a0cd6e307d92322540e0da691614312ddea22511b71513/VERSION
25445a0fc5025c3917a0cd6e307d92322540e0da691614312ddea22511b71513/json
25445a0fc5025c3917a0cd6e307d92322540e0da691614312ddea22511b71513/layer.tar
9c7cb910d84346a3fbf3cc2be046f44bf0af7f11eb8db2ef1f45e93c1202faac/
9c7cb910d84346a3fbf3cc2be046f44bf0af7f11eb8db2ef1f45e93c1202faac/VERSION
9c7cb910d84346a3fbf3cc2be046f44bf0af7f11eb8db2ef1f45e93c1202faac/json
9c7cb910d84346a3fbf3cc2be046f44bf0af7f11eb8db2ef1f45e93c1202faac/layer.tar
repositories
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This exposes `ACTIONS_RUNTIME_TOKEN` and `ACTIONS_CACHE_URL`, which are
used to skip cache exporter tests, when combined with
a8789cbd4a
Co-authored-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Now that this is a generic, we can define a struct type at the package
level, and remove the casting logic necessary when we had to use
interface{}.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
SourcePolicy was accounted for in 330cf7ae7d
TODO: replace applySourcePolicies with BuildKit's implementation, which
is currently unexported.
Co-authored-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
With BuildKit 0.12, some existing types are now required to be wrapped
by new types:
* containerd's LeaseManager and ContentStore have to be a
(namespace-aware) BuildKit type since f044e0a946
* BuildKit's solver.CacheManager is used instead of
bboltstorage.CacheKeyStorage since 2b30693409
* The MaxAge config field is a bkconfig.Duration since e06c96274f
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The following changes were required:
* integration/build: progressui's signature changed in 6b8fbed01e
* builder-next: flightcontrol.Group has become a generic type in 8ffc03b8f0
* builder-next/executor: add github.com/moby/buildkit/executor/resources types, necessitated by 6e87e4b455
* builder-next: stub util/network/Namespace.Sample(), necessitated by 963f16179f
Co-authored-by: CrazyMax <crazy-max@users.noreply.github.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The DeepEqual ignore required in the daemon tests is a bit ugly, but it
works given the new protoc output.
We also have to ignore lints related to schema1 deprecations; these do
not apply as we must continue to support this schema version.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The current executor is only tested on Linux, so let's be honest about
that. Stubbing this correctly helps avoid incorrectly trying to call
into Linux-only code in e.g. libnetwork.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Simplify the lock/unlock cycle, and make the "lookupAlias" branch
more similar to the non-lookupAlias variant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Skip faster when we're looking for aliases. Also check for the list
of aliases to be empty, not just `nil` (although in practice it should
be equivalent).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use `nameOrAlias` for the name (or alias) to resolve
- use `lookupAlias` to indicate what the intent is; this function
is either looking up aliases or "regular" names. Ideally we would
split the function, but let's keep that for a future exercise.
- name the `ipv6Miss` output variable. The "ipv6 miss" logic is rather
confusing, and should probably be revisited, but let's start with
giving the variable a name to make it more apparent what it is.
- use `nw` for networks, which is the more common local name
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove some intermediate vars, or move them closer to where they're used.
- ResolveService: use strings.SplitN to limit number of elements. This
code is only used to validate the input, results are not used.
- ResolveService: return early instead of breaking the loop. This makes
it clearer from the code that were not returning anything (nil, nil).
- Controller.sandboxCleanup(): rename a var, and slight refactor of
error-handling.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was introduced in
0a79e67e4f
Make use of it throughout our log-format handling code, and convert back
to a string before we pass it to the containerd client.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
This function was used to check if the network is a multi-host, swarm-scoped
network. Part of this check involved a check whether the cluster-agent was
present.
In all places where this function was used, the next step after checking if
the network was "cluster eligible", was to get the agent, and (again) check
if it was not nil.
This patch rewrites the isClusterEligible utility into a clusterAgent utility,
which both checks if the network is cluster-eligible, and returns the agent
(if set). For convenience, an "ok" bool is added, which callers can use to
return early (although just checking for nilness would likely have been
sufficient).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes redundant nil-checks in Endpoint.deleteServiceInfoFromCluster
and Endpoint.addServiceInfoToCluster.
These functions return early if the network is not ["cluster eligible"][1],
and the function used for that (`Network.isClusterEligible`) requires the
[agent to not be `nil`][2].
This check moved around a few times ([3][3], [4][4]), but was originally
added in [libnetwork 1570][5] which, among others, tried to avoid a nil-pointer
exception reported in [moby 28712][6], which accessed the `Controller.agent`
[without locking][7]. That issue was addressed by adding locks, adding a
`Controller.getAgent` accessor, and updating deleteServiceInfoFromCluster
to use a local var. It also sprinkled this `nil` check to be on the safe
side, but as `Network.isClusterEligible` already checks for the agent
to not be `nil`, this should not be redundant.
[1]: 5b53ddfcdd/libnetwork/agent.go (L529-L534)
[2]: 5b53ddfcdd/libnetwork/agent.go (L688-L696)
[3]: f2307265c7
[4]: 6426d1e66f
[5]: 8dcf9960aa
[6]: https://github.com/moby/moby/issues/28712
[7]: 75fd88ba89/vendor/github.com/docker/libnetwork/agent.go (L452)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When graphdriver is not provided the graphdriver is looked up
from docker info, but without quotes it may fail and set the
graphdriver to an incorrect value.
Signed-off-by: Derek McGowan <derek@mcg.dev>
It's not set when containerd is used as an image store and buildkit
never sets it either, so let's skip this test if snapshotters are used
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
We try to perform API-version negotiation as lazy as possible (and only execute
when we are about to make an API request). However, some code requires API-version
dependent handling (to set options, or remove options based on the version of the
API we're using).
Currently this code depended on the caller code to perform API negotiation (or
to configure the API version) first, which may not happen, and because of that
we may be missing options (or set options that are not supported on older API
versions).
This patch:
- splits the code that triggered API-version negotiation to a separate
Client.checkVersion() function.
- updates NewVersionError to accept a context
- updates NewVersionError to perform API-version negotiation (if enabled)
- updates various Client functions to manually trigger API-version negotiation
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
moby and containerd have slightly different error messages when someone
tries to pull an image that doesn't contain the current platform,
instead of looking inside the error returned by containerd we match the
errors in the test related to what image backend we are using
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Diffing a container yielded some extra changes that come from the
files/directories that we mount inside the container (/etc/resolv.conf
for example). To avoid that we create an intermediate snapshot that has
these files, with this we can now diff the container fs with its parent
and only get the differences that were made inside the container.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Final progress messages were sent after the progress updater finished
which made the "Downloading" progress not being updated into "Download
complete".
Fix by sending the final messages after the progress has finished.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
It's still not "great", but implement a `newInterface()` constructor
to create a new Interface instance, instead of creating a partial
instance and applying "options" after the fact.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We're only using the results if the interface doesn't have an address
yet, so skip this step if we don't use it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Flatten some nested "if"-statements, and improve error.
Errors returned by this function are not handled, and only logged, so
make them more informative if debugging is needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
They were not consistently used, and the locations where they were
used were already "setters", so we may as well inline the code.
Also updating Namespace.Restore to keep the lock slightly longer,
instead of locking/unlocking for each property individually, although
we should consider to keep the long for the duration of the whole
function to make it more atomic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the mutex internal to the Namespace; locking/unlocking should not
be done externally, and this makes it easier to see where it's used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Interface.Remove() was directly accessing Namespace "internals", such
as locking/unlocking. Move the code from Interface.Remove() into the
Namespace instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We weren't checking for the asked platform in the case the image was a
manifest, only if it was a manifest list.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Makes it possible to pull `application/vnd.docker.distribution.manifest.v1+prettyjws`
legacy manifests.
They are not stored in their original form but are converted to the OCI
manifests.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
While this is not strictly necessary as the default OCI config masks this
path, it is possible that the user disabled path masking, passed their
own list, or is using a forked (or future) daemon version that has a
modified default config/allows changing the default config.
Add some defense-in-depth by also masking out this problematic hardware
device with the AppArmor LSM.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The ability to read these files may offer a power-based sidechannel
attack against any workloads running on the same kernel.
This was originally [CVE-2020-8694][1], which was fixed in
[949dd0104c496fa7c14991a23c03c62e44637e71][2] by restricting read access
to root. However, since many containers run as root, this is not
sufficient for our use case.
While untrusted code should ideally never be run, we can add some
defense in depth here by masking out the device class by default.
[Other mechanisms][3] to access this hardware exist, but they should not
be accessible to a container due to other safeguards in the
kernel/container stack (e.g. capabilities, perf paranoia).
[1]: https://nvd.nist.gov/vuln/detail/CVE-2020-8694
[2]: 949dd0104c
[3]: https://web.eece.maine.edu/~vweaver/projects/rapl/
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Return the number of containers that use an image if it was asked,
during a `docker system df` call for example.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This issue wasn't caught on ContainerCreate or NetworkConnect (when
container wasn't started yet).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Thus far, validation code would stop as soon as a bad value was found.
Now, we try to validate as much as we can, to return all errors to the
API client.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
So far, only a subset of NetworkingConfig was validated when calling
ContainerCreate. Other parameters would be validated when the container
was started. And the same goes for EndpointSettings on NetworkConnect.
This commit adds two validation steps:
1. Check if the IP addresses set in endpoint's IPAMConfig are valid,
when ContainerCreate and ConnectToNetwork is called ;
2. Check if the network allows static IP addresses, only on
ConnectToNetwork as we need the libnetwork's Network for that and it
might not exist until NetworkAttachment requests are sent to the
Swarm leader (which happens only when starting the container) ;
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Just return a regular error, because the API converts the error to
the expected ErrorResponse. Before/After produce the same API response:
curl -v --unix-socket /var/run/docker.sock 'http://localhost/v1.43/images/search?term=hello'
* Trying /var/run/docker.sock:0...
* Connected to localhost (/var/run/docker.sock) port 80 (#0)
> GET /v1.43/images/search?term=hello HTTP/1.1
> Host: localhost
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 500 Internal Server Error
< Api-Version: 1.44
< Content-Type: application/json
< Docker-Experimental: false
< Ostype: linux
< Server: Docker/dev (linux)
< Traceparent: 00-c38c2da5cf30305fcb66836a28e227bf-d16f4f7d2c7002a1-01
< Date: Mon, 18 Sep 2023 14:30:18 GMT
< Content-Length: 41
<
{"message":"Unexpected status code 409"}
* Connection #0 to host localhost left intact
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make `PullImage` accept `reference.Named` directly instead of
duplicating the parsing code for both graphdriver and containerd image
service implementations.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This test checks how the layer store works, so we don't need it when we
use containerd as image store
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This package was introduced in af59752712
as a utility package for devicemapper, which was removed in commit
dc11d2a2d8 (v25.0.0).
It looks like there's no external consumers of this package, so we should
consider removing it, but deprecating it first, just in case.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use local variables and remove some intermediate variables
- handle the events inside the switch itself; this makes all the
switch branches use the same logic, instead of "some" using
a `continue`, and others falling through to have the event handled
outside of the switch.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We were sending the "Pulling from ..." message too early, if the pull
progress wasn't able to resolve the image we wouldn't sent the error
back. Sending that first message would have flushed the output stream
and image_routes.go would return a nil error.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
full diff: https://github.com/containerd/containerd/compare/v1.6.22...v1.6.24
v1.6.24 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.6.23...v1.6.24
The twenty-fourth patch release for containerd 1.6 contains various fixes
and updates.
Notable Updates
- CRI: fix leaked shim caused by high IO pressure
- Update to go1.20.8
- Update runc to v1.1.9
- Backport: add configurable mount options to overlay snapshotter
- log: cleanups and improvements to decouple more from logrus
v1.6.23 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.6.22...v1.6.23
The twenty-third patch release for containerd 1.6 contains various fixes
and updates.
Notable Updates
- Add stable ABI support in windows platform matcher + update hcsshim tag
- cri: Don't use rel path for image volumes
- Upgrade GitHub actions packages in release workflow
- update to go1.19.12
- backport: ro option for userxattr mount check + cherry-pick: Fix ro mount option being passed
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add "The push referers to repository X" message which is present in the
push output when using the graphdrivers.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Update the version used in testing;
full diff: https://github.com/containerd/containerd/compare/v1.7.3...v1.7.6
v1.7.6 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.7.5...v1.7.6
The sixth patch release for containerd 1.7 contains various fixes and updates.
- Fix log package for clients overwriting the global logger
- Fix blockfile snapshotter copy on Darwin
- Add support for Linux usernames on non-Linux platforms
- Update Windows platform matcher to invoke stable ABI compability function
- Update Golang to 1.20.8
- Update push to inherit distribution sources from parent
v1.7.5 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.7.4...v1.7.5
The fifth patch release for containerd 1.7 fixes a versioning issue from
the previous release and includes some internal logging API changes.
v1.7.4 release notes:
full diff: https://github.com/containerd/containerd/compare/v1.7.3...v1.7.4
The fourth patch release for containerd 1.7 contains remote differ plugin support,
a new block file based snapshotter, and various fixes and updates.
Notable Updates
- Add blockfile snapshotter
- Add remote/proxy differ
- Update runc binary to v1.1.9
- Cri: Don't use rel path for image volumes
- Allow attaching to any combination of stdin/out/err
- Fix ro mount option being passed
- Fix leaked shim caused by high IO pressure
- Add configurable mount options to overlay snapshotter
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The API endpoint `/containers/create` accepts several EndpointsConfig
since v1.22 but the daemon would error out in such case. This check is
moved from the daemon to the api and is now applied only for API < 1.44,
effectively allowing the daemon to create containers connected to
several networks.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
When registry token is provided, the authorization header can be
directly applied to the registry request. No other type of
authorization will be attempted when the registry token is provided.
Signed-off-by: Derek McGowan <derek@mcg.dev>
We occassionally receive contributions to this script that are outside
its intended scope. Let's add a comment to the script that outlines
what it's meant for, and a link to a GitHub ticket with alternatives.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Synchronize the code to do the same thing as Exec.
reap doesn't need to be called before the start event was sent.
There's already a defer block which cleans up the process in case where
an error occurs.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Error check in defer block used wrong error variable which is always nil
if the flow reaches the defer. This caused the `newProcess.Kill` to be
never called if the subsequent attemp to attach to the stdio failed.
Although this only happens in Exec (as Start does overwrite the error),
this also adjusts the Start to also use the returned error to avoid this
kind of mistake in future changes.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
None of the code using this function was setting the value, so let's
simplify and remove the argument.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When the daemon process or the host running it is abruptly terminated,
the layer metadata file can become inconsistent on the file system.
Specifically, `link` and `lower` files may exist but be empty, leading
to overlay mounting errors during layer extraction, such as:
"failed to register layer: error creating overlay mount to <path>:
too many levels of symbolic links."
This commit introduces the use of `AtomicWriteFile` to ensure that the
layer metadata files contain correct data when they exist on the file system.
Signed-off-by: Mike <mike.sul@foundries.io>
Fixes#18864, #20648, #33561, #40901.
[This GH comment][1] makes clear network name uniqueness has never been
enforced due to the eventually consistent nature of Classic Swarm
datastores:
> there is no guaranteed way to check for duplicates across a cluster of
> docker hosts.
And this is further confirmed by other comments made by @mrjana in that
same issue, eg. [this one][2]:
> we want to adopt a schema which can pave the way in the future for a
> completely decentralized cluster of docker hosts (if scalability is
> needed).
This decentralized model is what Classic Swarm was trying to be. It's
been superseded since then by Docker Swarm, which has a centralized
control plane.
To circumvent this drawback, the `NetworkCreate` endpoint accepts a
`CheckDuplicate` flag. However it's not perfectly reliable as it won't
catch concurrent requests.
Due to this design decision, API clients like Compose have to implement
workarounds to make sure names are really unique (eg.
docker/compose#9585). And the daemon itself has seen a string of issues
due to that decision, including some that aren't fixed to this day (for
instance moby/moby#40901):
> The problem is, that if you specify a network for a container using
> the ID, it will add that network to the container but it will then
> change it to reference the network by using the name.
To summarize, this "feature" is broken, has no practical use and is a
source of pain for Docker users and API consumers. So let's just remove
it for _all_ API versions.
[1]: https://github.com/moby/moby/issues/18864#issuecomment-167201414
[2]: https://github.com/moby/moby/issues/18864#issuecomment-167202589
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Windows doesn't support "FROM scratch", and the platform was only used
for validation on other platforms if a platform was provided, so no need
to set defaults.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
strong-type the fields with the expected type, to make it more explicit
what we're expecting here.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
PR 4f47013feb added a validation step to `NetworkCreate` to ensure
no IPv6 subnet could be set on a network if its `EnableIPv6` parameter
is false.
Before that, the daemon was accepting such request but was doing nothing
with the IPv6 subnet.
This validation step is now deleted, and we automatically set
`EnableIPv6` if an IPv6 subnet was specified.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Use `ImageService.unpackImage` when we want to unpack an image and we
know the exact platform-manifest to be unpacked beforehand.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
DiffID is only a digest of the one tar layer and matches the snapshot ID
only for the first layer (DiffID = ChainID).
Instead of generating random ID as a key for rolayer, just use the
snapshot ID of the unpacked image content and use it later as a parent
for creating a new RWLayer.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
diffID is the digest of a tar archive containing changes to the parent
layer - rolayer doesn't have any changes to the parent.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
go1.20.8 (released 2023-09-06) includes two security fixes to the html/template
package, as well as bug fixes to the compiler, the go command, the runtime,
and the crypto/tls, go/types, net/http, and path/filepath packages. See the
Go 1.20.8 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.8+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.7...go1.20.8
From the security mailing:
[security] Go 1.21.1 and Go 1.20.8 are released
Hello gophers,
We have just released Go versions 1.21.1 and 1.20.8, minor point releases.
These minor releases include 4 security fixes following the security policy:
- cmd/go: go.mod toolchain directive allows arbitrary execution
The go.mod toolchain directive, introduced in Go 1.21, could be leveraged to
execute scripts and binaries relative to the root of the module when the "go"
command was executed within the module. This applies to modules downloaded using
the "go" command from the module proxy, as well as modules downloaded directly
using VCS software.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-39320 and Go issue https://go.dev/issue/62198.
- html/template: improper handling of HTML-like comments within script contexts
The html/template package did not properly handle HMTL-like "<!--" and "-->"
comment tokens, nor hashbang "#!" comment tokens, in <script> contexts. This may
cause the template parser to improperly interpret the contents of <script>
contexts, causing actions to be improperly escaped. This could be leveraged to
perform an XSS attack.
Thanks to Takeshi Kaneko (GMO Cybersecurity by Ierae, Inc.) for reporting this
issue.
This is CVE-2023-39318 and Go issue https://go.dev/issue/62196.
- html/template: improper handling of special tags within script contexts
The html/template package did not apply the proper rules for handling occurrences
of "<script", "<!--", and "</script" within JS literals in <script> contexts.
This may cause the template parser to improperly consider script contexts to be
terminated early, causing actions to be improperly escaped. This could be
leveraged to perform an XSS attack.
Thanks to Takeshi Kaneko (GMO Cybersecurity by Ierae, Inc.) for reporting this
issue.
This is CVE-2023-39319 and Go issue https://go.dev/issue/62197.
- crypto/tls: panic when processing post-handshake message on QUIC connections
Processing an incomplete post-handshake message for a QUIC connection caused a panic.
Thanks to Marten Seemann for reporting this issue.
This is CVE-2023-39321 and CVE-2023-39322 and Go issue https://go.dev/issue/62266.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
schema1 was deprecated a while ago, containerd fails to push to a
schema1 registry, let's just skip these tests for the containerd
integration
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Graph drivers create the parent directory with
rootPair().GID:CurrentIdentity().UID owner. This change brings these in
line
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Constants for both platform-specific and platform-independent networks
are added to the api/network package.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Before this commit, setting the `com.docker.network.host_ipv4` bridge
option when `enable_ipv6` is true and the experimental `ip6tables`
option is enabled would cause Docker to fail to create the network:
> failed to create network `test-network`: Error response from daemon:
> Failed to Setup IP tables: Unable to enable NAT rule: (iptables
> failed: `ip6tables --wait -t nat -I POSTROUTING -s fd01::/64 ! -o
> br-test -j SNAT --to-source 192.168.0.2`: ip6tables
> v1.8.7 (nf_tables): Bad IP address "192.168.0.2"
>
> Try `ip6tables -h` or `ip6tables --help` for more information.
> (exit status 2))
Fix this error by passing nil -- not the `host_ipv4` address -- when
creating the IPv6 rules.
Signed-off-by: Richard Hansen <rhansen@rhansen.org>
- store linkIndex in a local variable so that it can be reused
- remove / rename some intermediate vars that shadowed existing declaration
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This argument was originally added in libnetwork:
03f440667f
At the time, this argument was conditional, but currently it's always set
to "true", so let's remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The code ignores these errors, but will unconditionally print a warning;
> If the kernel deletion fails for the neighbor entry still remote it
> from the namespace cache. Otherwise if the neighbor moves back to the
> same host again, kernel update can fail.
Let's reduce noise if the neighbor wasn't found, to prevent logs like:
Aug 16 13:26:35 master1.local dockerd[4019880]: time="2023-08-16T13:26:35.186662370+02:00" level=warning msg="error while deleting neighbor entry" error="no such file or directory"
Aug 16 13:26:35 master1.local dockerd[4019880]: time="2023-08-16T13:26:35.366585939+02:00" level=warning msg="error while deleting neighbor entry" error="no such file or directory"
Aug 16 13:26:42 master1.local dockerd[4019880]: time="2023-08-16T13:26:42.366658513+02:00" level=warning msg="error while deleting neighbor entry" error="no such file or directory"
While changing this code, also slightly rephrase the code-comment, and
fix a typo ("remote -> remove").
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
libnetwork/osl: Namespace.DeleteNeighbor: rephrase code-comment
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
container.Run() should be a synchronous operation in normal circumstances;
the container is created and started, so polling after that for the
container to be in the "running" state should not be needed.
This should also prevent issues when a container (for whatever reason)
exited immediately after starting; in that case we would continue
polling for it to be running (which likely would never happen).
Let's skip the polling; if the container is not in the expected state
(i.e. exited), tests should fail as well.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
validateEndpoint was doing more than just validating; it was also implicitly
mutating the endpoint that was passed to it (by reference).
Given that validation only happend when constructing a new v1Endpoint, let's
merge these functions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon would pass an EndpointCreateOption to set the interface MAC
address if the network name and the provided network mode were matching.
Obviously, if the network mode is a network ID, it won't work. To make
things worse, the network mode is never normalized if it's a partial ID.
To fix that: 1. the condition under what the container's mac-address is
applied is updated to also match the full ID; 2. the network mode is
normalized to a full ID when it's only a partial one.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Without these compile flags, Delve is unable to report the value of some
variables and it's not possible to jump into inlined code.
As the contributing docs already mention that `DOCKER_DEBUG` should
disable "build optimizations", the env var is reused here instead of
introducing a new one.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
validateEndpoint uses `v1Endpoint.ping` to verify if the search API can
use a secure connection, and to fall back to basic auth. For Docker Hub,
we don't allow insecure connections, and `v1Endpoint.ping` will not connect
to Docker Hub (Docker Hub also does not implement the `_ping` endpoint,
so doing so would always fail).
Let's make it more clear that we don't do any validation, and return
early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We have the response available, which is an io.Reader, so we don't have
to read the entire response into memory before decoding.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was making a request to the `_ping` endpoint, which (if
implemented) would return a JSON response, which we unmarshal (the only
field we use from the response is the `Standalone` field).
However, if the response had a `X-Docker-Registry-Standalone`, that header
took precedence, and would overwrite the earlier `Standalone` value we
obtained from the JSON response.
This patch adds a fast-path for situations where the header is present,
in which case we can skip handling the JSON response altogether.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We don't really want the daemon to panic for this so let's log a warning
about max downloads and uploads
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
With containerd image store the images don't depend on each other even
if they share the same content and it's totally fine to delete the
"parent" image.
The skip is necessary because deleting the "parent" image does not
produce an error with the c8d image store and deleting the `busybox`
image breaks other tests.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- The `Version` field was not used for any purpose, other than a debug log
- The `X-Docker-Registry-Version` header was part of the registry v1 spec,
however, as we're not using the `Version` field, we don't need the
header for anything.
- The `X-Docker-Registry-Config` header was only set by the mock registry;
there's no code consuming it, so we don't need to mock it (even if an
actual v1 registry / search API would return it).
It's also worth noting that we never call the `_ping` endpoint when using
Docker Hub's search API, and Docker Hub does not even implement the `_ping`
endpoint;
curl -fsSL https://index.docker.io/_ping | head -n 4
<!DOCTYPE html>
<html lang="en">
<head>
<title>Docker</title>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In some cases, when the daemon launched by a test panics and quits, the
cleanup code would end with an error when trying to kill it by its pid.
In those cases the whole suite will end up waiting for the daemon that
we start in .integration-daemon-start to finish and we end up waiting 2
hours for the CI to cancel after a timeout.
Using process substitution makes the integration tests quit.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Images built by classic builder will have an additional label (in the
containerd image object, not image config) pointing to a parent of that
image.
This allows to differentiate intermediate images (dangling
images created as a result of a each Dockerfile instruction) from the
final images.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Reduce some of the boiler-plating, and by combining the tests, we skip
the testenv.Clean() in between each of the tests. Performance gain isn't
really measurable, but every bit should help :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
container.Run should be an synchronous operation; the container should
be running after the request was made (or produce an error). Simplify
these tests, and remove the redundant polling.
These were added as part of 8f800c9415,
but no such polls were in place before the refactor, and there's no
mention of these during review of the PR, so I assume these were just
added either as a "precaution", or a result of "copy/paste" from another
test.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was failing frequently on Windows, where the test was waiting
for the container to exit before continuing;
=== FAIL: github.com/docker/docker/integration/container TestResizeWhenContainerNotStarted (18.69s)
resize_test.go:58: timeout hit after 10s: waiting for container to be one of (exited), currently running
It looks like this test is merely validating that a container in any non-
running state should produce an error, so there's no need to run a container
(waiting for it to stop), and just "creating" a container (which would be
in `created` state) should work for this purpose.
Looking at 8f800c9415, I see `createSimpleContainer`
and `runSimpleContainer` utilities were added, so I'm even wondering if the
original intent was to use `createSimpleContainer` for this test.
While updating, also check if we get the expected error-type, instead of
only checking for the error-message.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implement a function that returns an error to replace existing uses of
the IsOSSupported utility, where callers had to produce the error after
checking.
The IsOSSupported function was used in combination with images, so implementing
a utility in "image" to prevent having to import pkg/system (which contains many
unrelated functions)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This wires up the integration tests to export spans to a jager instance.
After tests are finished it exports the data out of jaeger and uploads
as an artifact to the action run.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Integration tests will now configure clients to propagate traces as well
as create spans for all tests.
Some extra changes were needed (or desired for trace propagation) in the
test helpers to pass through tracing spans via context.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This uses otel standard environment variables to configure tracing in
the daemon.
It also adds support for propagating trace contexts in the client and
reading those from the API server.
See
https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/
for details on otel environment variables.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- check if we have to download layers and print the approriate message
- show the digest of the pulled manifest(list)
- skip pulling if we already have the right manifest
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The compressor is already closed a few lines below and there's no error
returns between so the defer is not needed.
Calling Close twice on a writerCloserWrapper is unsafe as it causes it
to put the same buffer to the pool multiple times.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When running a `docker cp` to copy files to/from a container, the
lookup of the `getent` executable happens within the container's
filesystem, so we cannot re-use the results.
Unfortunately, that also means we can't preserve the results for
any other uses of these functions, but probably the lookup should not
be "too" costly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This causes the test to have a saner error message when the `images
-q` returns multiple images separated by newline.
Before this the test would fail with `invalid reference format` when
parsing the multiline string as an image reference.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The test-file had a duplicate definition for ErrNotImplemented, which
caused an error in this package, and was not used otherwise, so we can
remove this file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Updated the description to clarify that this is the endpoint to use if
you want to pull an image.
Signed-off-by: David Karlsson <35727626+dvdksn@users.noreply.github.com>
This makes the c8d code which creates/reads OCI types not lose
Docker-specific features like ONBUILD or Healthcheck.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Parent is a graph-driver only field which is stored in the ImageStore.
It's not available when using containerd snapshotters.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
These were dependent on the DOCKER_ENGINE_GOARCH environment variable
but this var was no longer set. There was also some weird check to see
if the architecture is "windows" which doesn't make sense. Seeing how
nothing failed ever since the TIMEOUT was no longer platform-dependent
we can safely remove this check.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The test considered `Foo/bar` to be an invalid name, with the assumption
that it was `[docker.io]/Foo/bar`. However, this was incorrect, and the
test passed because the reference parsing had a bug; if the first element
(`Foo`) is not lowercase (so not a valid namespace / "path element"), then
it *should* be considered a domain (as uppercase domain names are valid).
The reference parser did not account for this, and running the test with
a version of the parser with a fix caused the test to fail:
=== Failed
=== FAIL: client TestImageTagInvalidSourceImageName/invalidRepo/FOO/bar (0.00s)
image_tag_test.go:54: assertion failed: expected error to contain "not a valid repository/tag", got "Error response from daemon: client should not have made an API call"
Error response from daemon: client should not have made an API call
=== FAIL: client TestImageTagInvalidSourceImageName (0.00s)
This patch removes the faulty test-case.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add back files at the old locations, as there may be external links
referencing the specification.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This merges the v1.2 specs to provide a single history of the
specification.
To view the combined history:
git log --follow image/spec/spec.md
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was added in b18ae8c9cc, which
was part of v1.12.0-rc1 and later, which used image spec v1.2.0.
This patch amends the v1.2 spec to include the missing field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was added in 9db5db1b94, which
was part of v1.10.0-rc1 and later, which used image spec v1.1.0.
It's worth noting that documentation for the v1.1.0 image spec was not
yet available until commit 4fa0eccd10,
which was included in v1.12.0-rc1 and up. The `ArgsEscaped` field was
also adopted by the OCI image spec since [v1.1.0-rc3][1], but considered
deprecated, and not recommended to be used.
This patch amends the v1.1 and v1.2 specifications to describe the field.
[1]: 59780aa569
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was added in commit 9f994c9646,
which was merged before the image-spec v1.0.0 was released (which happened
in commit 79910625f0).
This patch backfills the specifications to describe the property.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove some trailing commas, which made the JSON invalid (some of these
were fixed in the 1.2 spec, but not in older versions).
- synchronise some formatting / phrasing between versions, to make them
easier to compare.
- remove non-breaking spaces (`NBSP`) in example outputs, and replace
them with regular spaces.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While there's not much we can do if we failed to store a snapshot of the
container's state, let's log the error in case it happens in stad of discarding.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Daemon.handleContainerExit() returns an error if snapshotting the container's
state to disk fails. There's not much we can do with the error if it occurs,
but let's log the error if that happens, instead of discarding it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function either had to create a new StaticRoute, or add the destination
to the list of routes. Skip creating a StaticRoute struct if we're not
gonna use it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Tests are failing with this error:
E ValueError: scheme http+docker is invalid
Which is reported in docker-py in https://github.com/docker/docker-py/issues/1478.
Not sure what changed in the tests, but could be due to updated Python
version or dependencies, but let's skip it for now.
Test failure:
___________ AttachContainerTest.test_run_container_reading_socket_ws ___________
tests/integration/api_container_test.py:1245: in test_run_container_reading_socket_ws
pty_stdout = self.client.attach_socket(container, opts, ws=True)
docker/utils/decorators.py:19: in wrapped
return f(self, resource_id, *args, **kwargs)
docker/api/container.py:98: in attach_socket
return self._attach_websocket(container, params)
docker/utils/decorators.py:19: in wrapped
return f(self, resource_id, *args, **kwargs)
docker/api/client.py:312: in _attach_websocket
return self._create_websocket_connection(full_url)
docker/api/client.py:315: in _create_websocket_connection
return websocket.create_connection(url)
/usr/local/lib/python3.7/site-packages/websocket/_core.py:601: in create_connection
websock.connect(url, **options)
/usr/local/lib/python3.7/site-packages/websocket/_core.py:245: in connect
options.pop('socket', None))
/usr/local/lib/python3.7/site-packages/websocket/_http.py:117: in connect
hostname, port, resource, is_secure = parse_url(url)
/usr/local/lib/python3.7/site-packages/websocket/_url.py:62: in parse_url
raise ValueError("scheme %s is invalid" % scheme)
E ValueError: scheme http+docker is invalid
------- generated xml file: /src/bundles/test-docker-py/junit-report.xml -------
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was introduced in 1980deffae, which
changed the implementation, but forgot to update imports in these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There is no meaningful distinction between driverapi.Registerer and
drvregistry.DriverNotifyFunc. They are both used to register a network
driver with an interested party. They have the same function signature.
The only difference is that the latter could be satisfied by an
anonymous closure. However, in practice the only implementation of
drvregistry.DriverNotifyFunc is the
(*libnetwork.Controller).RegisterDriver method. This same method also
makes the libnetwork.Controller type satisfy the Registerer interface,
therefore the DriverNotifyFunc type is redundant. Change
drvregistry.Networks to notify a Registerer and drop the
DriverNotifyFunc type.
Signed-off-by: Cory Snider <csnider@mirantis.com>
On startup all local volumes were unmounted as a cleanup mechanism for
the non-clean exit of the last engine process.
This caused live-restored volumes that used special volume opt mount
flags to be broken. While the refcount was restored, the _data directory
was just unmounted, so all new containers mounting this volume would
just have the access to the empty _data directory instead of the real
volume.
With this patch, the mountpoint isn't unmounted. Instead, if the volume
is already mounted, just mark it as mounted, so the next time Mount is
called only the ref count is incremented, but no second attempt to mount
it is performed.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
A package only needs one "import" comment to enforce, so keeping
one in the go.doc.
It should be noted that even with that; in most cases, go will ignore
these comments (if go modules are used, even in "vendor" mode).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single test, and was not using any of
the gotest.tools features, so let's remove it as dependency.
With this, the package has no external dependencies (only stdlib).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Log a warning if we encounter an error when releasing leases. While it
may not have direct consequences, failing to release the lease should be
unexpected, so let's make them visible.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was used when the code supported both "v1" and "v2" registries.
We no longer support v1 registries, and the only v1 endpoint that's still
used is for the legacy "search" endpoint, which does not use the APIEndpoint
type.
As no code is using this field, and the value will always be set to "v2",
we can deprecated the Version field.
I'm keeping this field for 1 release, to give notice to any potential
external consumer, after which we can delete it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure that the content in the live-restored volume mounted in a new
container is the same as the content in the old container.
This checks if volume's _data directory doesn't get unmounted on
startup.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Explain that search is not supported on v2 endpoints, and include the
offending endpoint in the error-message.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
First, remove the loop over `apiVersions`. The `apiVersions` map has two
entries (`APIVersion1 => "v1"`, and `APIVersion2 => "v2"`), and `APIVersion1`
is skipped, which means that the loop effectively translates to;
if apiVersionStr == "v2" {
return "", invalidParamf("unsupported V1 version path %s", apiVersionStr)
}
Which leaves us with "anything else" being returned as-is.
This patch removes the loop, and replaces the remaining handling to check
for the "v2" suffix to produce an error, or to strip the "v1" suffix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Combine the two tests into a TestV1EndpointParse function, and rewrite
them to use gotest.tools for asserting.
Also changing the test-cases to use "https://", as the scheme doesn't
matter for this test, but using "http://" may trip-up some linters,
so let's avoid that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Define consts for the Actions we use for events, instead of "ad-hoc" strings.
Having these consts makes it easier to find where specific events are triggered,
makes the events less error-prone, and allows documenting each Action (if needed).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 70ad5b818f changed event.Type
to be a strong type, no longer an alias for string. for some reason,
this test passed on the PR, but failed later on;
=== Failed
=== FAIL: daemon/events TestLoadBufferedEventsOnlyFromPast (0.00s)
events_test.go:203: assertion failed: network (messages[0].Type events.Type) != network (string)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the error message slightly clearer on "what" part is not valid,
and provide suggestions on what are acceptable values.
Before this change:
docker create --restart=always:3 busybox
Error response from daemon: invalid restart policy: maximum retry count cannot be used with restart policy 'always'
docker create --restart=always:-1 busybox
Error response from daemon: invalid restart policy: maximum retry count cannot be used with restart policy 'always'
docker create --restart=unknown busybox
Error response from daemon: invalid restart policy 'unknown'
After this change:
docker create --restart=always:3 busybox
Error response from daemon: invalid restart policy: maximum retry count can only be used with 'on-failure'
docker create --restart=always:-1 busybox
Error response from daemon: invalid restart policy: maximum retry count can only be used with 'on-failure' and cannot be negative
docker create --restart=unknown busybox
Error response from daemon: invalid restart policy: unknown policy 'unknown'; use one of 'no', 'always', 'on-failure', or 'unless-stopped'
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests were testing the deprecated fields, instead of their non-deprecated
alternatives.
This patch adds a utility to verify that they match, and rewrites the tests
to check the non-deprecated fields instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- clean up "//import" comment, as test-files cannot be imported, and only
one "//import" comment is needed per package.
- remove some intermediate variables
- rewrite assertions to use gotest.tools
- use assert.Check()) (non-fatal) where possible
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was added in 247f4796d2, and
at the time was added as an alias for string;
> api/types/events: add "Type" type for event-type enum
>
> Currently just an alias for string, but we can change it to be an
> actual type.
Now that all code uses the defined types, we should be able to make
this an actual type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Updating the version of crun that we use in our tests to a version that
supports the "features" command (crun v1.8.6 and up). This should prevent
some warnings in our logs:
WARN[2023-08-26T17:05:35.042978552Z] Failed to run [/usr/local/bin/crun features]: "unknown command features\n" error="exit status 1"
full diff: https://github.com/containers/crun/compare/1.4.5...1.8.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add a fast-patch to some functions, to prevent locking/unlocking,
or other operations that would not be needed;
- Network.addDriverInfoToCluster
- Network.deleteDriverInfoFromCluster
- Network.addServiceInfoToCluster
- Network.deleteServiceInfoFromCluster
- Network.addDriverWatches
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- return early when failing to fetch the driver
- store network-ID and controller in a variable to prevent repeatedly
locking/unlocking. We don't expect the network's ID to change
during this operation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This method is not part of any interface, and identical to Endpoint.Iface,
but one returns an Interface-type (driverapi.InterfaceInfo) and the other
returns a concrete type (EndpointInterface).
Interface-matching should generally happen on the receiver side, and this
function was only used in a single location, and passed as argument to
Driver.CreateEndpoint, which already matches the interface by accepting
a driverapi.InterfaceInfo.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Interface-matching should generally happen on the receiver side, and this
function was only used in a single location, and passed as argument to
Driver.CreateEndpoint, which already matches the interface by accepting
a driverapi.InterfaceInfo.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Interface-matching should generally happen on the receiver side, and this
function was only used in a single location, and passed as argument to
Driver.CreateEndpoint, which already matches the interface by accepting
a driverapi.InterfaceInfo.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also swapping the order of arguments; putting the "attributes" arguments
last, so that variables can be more cleanly inlined.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The content of this file was removed in c0bc14e8dd,
and all it container since was the package name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The devicemapper graphdriver has been removed in commit
dc11d2a2d8, and we should
no longer need this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This check was added in 98fe4bd8f1, to check
whether dm_task_deferred_remove could be removed, and to conditionally set
the corresponding build-tags. Now that the devicemapper graphdriver has been
removed in dc11d2a2d8, we no longer need this.
This patch:
- removes uses of the (no longer used) `libdm`, `dlsym_deferred_remove`,
and `libdm_no_deferred_remove` build-tags.
- removes the `add_buildtag` utility, which is now unused.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This documentation moved to a different page, and the Go documentation
moved to the https://go.dev/ domain.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit ab35df454d removed most of the pre-go1.17
build-tags, but for some reason, "go fix" doesn't remove these, so removing
the remaining ones manually
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Copying relevant documentation from the EndpointInfo interface. We should
remove this interface, and the related Info() function, but it's currently
acting as a "gate" to prevent accessing the Endpoint's accessors without
making sure it's fully hydrated.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While working on this code, I noticed that there's currently an issue
with userns enabled. When userns is enabled, joining another container's
namespace must also join its user-namespace.
However, a container can only be in a single user namespace, so if a
container joins namespaces from multiple containers, latter user-namespaces
overwrite former ones.
We must add validation for this, but in the meantime, add notes / todo's.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Most error-message returned would already include "container" and the
container ID in the error-message (e.g. "container %s is not running"),
so there's no need to add a custom prefix for that.
- os.Stat returns a PathError, which already includes the operation ("stat"),
the path, and the underlying error that occurred.
And while updating, let's also fix the name to be proper camelCase :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function didn't need the whole container, only its ID, so let's
use that as argument. This also makes it consistent with getIpcContainer.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`Daemon.getPidContainer()` was wrapping the error-message with a message
("cannot join PID of a non running container") that did not reflect the
actual reason for the error; `Daemon.GetContainer()` could either return
an invalid parameter (invalid / empty identifier), or a "not found" error
if the specified container-ID could not be found.
In the latter case, we don't want to return a "not found" error through
the API, as this would indicate that the container we're _starting_ was
not found (which is not the case), so we need to convert the error into
an `errdefs.ErrInvalidParameter` (the container-ID specified for the PID
namespace is invalid if the container doesn't exist).
This logic is similar to what we do for IPC namespaces. which received
a similar fix in c3d7a0c603.
This patch updates the error-types, and moves them into the getIpcContainer
and getPidContainer container functions, both of which should return
an "invalid parameter" if the container was not found.
It's worth noting that, while `WithNamespaces()` may return an "invalid
parameter" error, the `start` endpoint itself may _not_ be. as outlined
in commit bf1fb97575, starting a container
that has an invalid configuration should be considered an internal server
error, and is not an invalid _request_. However, for uses other than
container "start", `WithNamespaces()` should return the correct error
to allow code to handle it accordingly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We were using a mixture of approaches for these; aligning them a bit
to all use switch statements.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 12485d62ee to save some
duplication, but was really over-engineered to save a few lines of code,
at the cost of hiding away what it does and also potentially returning
inconsistent errors (not addressed in this patch). Let's start with
inlining these.
This removes;
- Daemon.checkContainer
- daemon.containerIsRunning
- daemon.containerIsNotRestarting
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We used to have to clone and build the registry v2 but now that we have
updated the version we can directtly copy the binary from the official
distribution/distribution image.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Previous image created a new partially filled image.
This caused child images to lose their parent's layers.
Instead of creating a new object and trying to replace its fields, just
clone the original passed image and change its ID to the manifest
digest.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
There's only one implementation; let's use that.
Also fixing a linting issue;
libnetwork/osl/interface_linux.go:91:2: S1001: should use copy(to, from) instead of a loop (gosimple)
for i, iface := range n.iFaces {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
InterfaceOptions() returned an IfaceOptionSetter interface, which contained
"methods" that returned functional options. Such a construct could have made
sense if the functional options returned would (e.g.) be pre-propagated with
information from the Sandbox (network namespace), but none of that was the case.
There was only one implementation of IfaceOptionSetter (networkNamespace),
which happened to be the same as the only implementation of Sandbox, so remove
the interface as well, to help networkNamespace with its multi-personality
disorder.
This patch:
- removes Sandbox.Bridge() and makes it a regular function (WithIsBridge)
- removes Sandbox.Master() and makes it a regular function (WithMaster)
- removes Sandbox.MacAddress() and makes it a regular function (WithMACAddress)
- removes Sandbox.Address() and makes it a regular function (WithIPv4Address)
- removes Sandbox.AddressIPv6() and makes it a regular function (WithIPv6Address)
- removes Sandbox.LinkLocalAddresses() and makes it a regular function (WithLinkLocalAddresses)
- removes Sandbox.Routes() and makes it a regular function (WithRoutes)
- removes Sandbox.InterfaceOptions().
- removes the IfaceOptionSetter interface.
Note that the IfaceOption signature was changes as well to allow returning
an error. This is not currently used, but will be used for some options
in the near future, so adding that in preparation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
NeighborOptions() returned an NeighborOptionSetter interface, which
contained "methods" that returned functional options. Such a construct
could have made sense if the functional options returned would (e.g.)
be pre-propagated with information from the Sandbox (network namespace),
but none of that was the case.
There was only one implementation of NeighborOptionSetter (networkNamespace),
which happened to be the same as the only implementation of Sandbox, so
remove the interface as well, to help networkNamespace with its multi-personality
disorder.
This patch:
- removes Sandbox.LinkName() and makes it a regular function (WithLinkName)
- removes Sandbox.Family() and makes it a regular function (WithFamily)
- removes Sandbox.NeighborOptions().
- removes the NeighborOptionSetter interface
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
osl.NewSandbox() always returns a nil interface on Windows (and other non-Linux
platforms). This means that any code that these fields are always nil, and
any code using these fields must be considered Linux-only.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
osl.NewSandbox() always returns a nil interface on Windows (and other non-Linux
platforms). This means that any code that these fields are always nil, and
any code using these fields must be considered Linux-only;
- libnetwork/Controller.defOsSbox
- libnetwork/Sandbox.osSbox
Ideally, these fields would live in Linux-only files, but they're referenced
in various platform-neutral parts of the code, so let's start with moving
the initialization code to Linux-only files.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Copying the descriptions from the Sandbox, Info, NeighborOptionSetter,
and IfaceOptionSetter interfaces that it implements.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Check if firewalld is running before running the function, so that consumers
of the function don't have to check for the status.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test is currently failing with containerd-integration, which should
be looked into, but let's start with preventing it from panicking, to make
the test-failures less noisy;
--- FAIL: TestDiskUsage/after_container.Run (0.26s)
panic: runtime error: index out of range [0] with length 0 [recovered]
panic: runtime error: index out of range [0] with length 0
goroutine 280 [running]:
testing.tRunner.func1.2({0xb07a00, 0x40002006a8})
/usr/local/go/src/testing/testing.go:1526 +0x1c8
testing.tRunner.func1()
/usr/local/go/src/testing/testing.go:1529 +0x364
panic({0xb07a00, 0x40002006a8})
/usr/local/go/src/runtime/panic.go:884 +0x1f4
github.com/docker/docker/integration/system.TestDiskUsage.func3(0x0?, {0x0, {0x14ea4a8, 0x0, 0x0}, {0x14ea4a8, 0x0, 0x0}, {0x14ea4a8, 0x0, ...}, ...})
/go/src/github.com/docker/docker/integration/system/disk_usage_test.go:82 +0x7e4
github.com/docker/docker/integration/system.TestDiskUsage.func4(0x4000235c80?)
/go/src/github.com/docker/docker/integration/system/disk_usage_test.go:118 +0x8c
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also remove integration-cli: `DockerAPISuite.TestContainerAPIDeleteConflict`,
which was testing the same conditions as `TestRemoveContainerRunning` in
integration/container.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Saw this failure in a flaky test, and I wondered why we consider this an
error condition;
=== RUN TestKillWithStopSignalAndRestartPolicies
main_test.go:32: assertion failed: error is not nil: Error response from daemon: Could not kill running container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7, cannot remove - Container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7 is not running: failed to remove 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7
--- FAIL: TestKillWithStopSignalAndRestartPolicies (0.84s)
=== RUN TestKillWithStopSignalAndRestartPolicies/same-signal-disables-restart-policy
--- PASS: TestKillWithStopSignalAndRestartPolicies/same-signal-disables-restart-policy (0.42s)
=== RUN TestKillWithStopSignalAndRestartPolicies/different-signal-keep-restart-policy
--- PASS: TestKillWithStopSignalAndRestartPolicies/different-signal-keep-restart-policy (0.23s)
In the above;
1. `Error response from daemon: Could not kill running container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7`
2. `cannot remove - Container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7 is not running`
3. `failed to remove 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7`
So it looks like the removal fails because we couldn't kill the container
because it was already stopped, which may be a race condition where the first
check shows the container to be running (but may already be in process to be
removed or killed. In either case, we probably shouldn't fail the removal if
the container is already stopped.
This patch adds a `isNotRunning()` utility, so that we can ignore this case,
and proceed with the removal.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function never returns an error, so let's remove the error-return,
and give it a slightly more to-the-point name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Windows uses the container-iD as ID for sandboxes, so it's not needed to
generate an ID when running on Windows.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The BuildKit dockerignore package was integrated in the patternmatcher
repository / module. This patch updates our uses of the BuildKit package
with its new location.
A small local change was made to keep the format of the existing error message,
because the "ignorefile" package is slightly more agnostic in that respect
and doesn't include ".dockerignore" in the error message.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the lease doesn't exit (for example when creating the container
failed), just ignore the not found error.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Prior to moby/moby#44968, libnetwork would happily accept a ChildSubnet
with a bigger mask than its parent subnet. In such case, it was
producing IP addresses based on the parent subnet, and the child subnet
was not allocated from the address pool.
This commit automatically fixes invalid ChildSubnet for networks stored
in libnetwork's datastore.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Currently, IPAM config is never validated by the API. Some checks
are done by the CLI, but they're not exhaustive. And some of these
misconfigurations might be caught early by libnetwork (ie. when the
network is created), and others only surface when connecting a container
to a misconfigured network. In both cases, the API would return a 500.
Although the `NetworkCreate` endpoint might already return warnings,
these are never displayed by the CLI. As such, it was decided during a
maintainer's call to return validation errors _for all API versions_.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also move the validation function to live with the type definition,
which allows it to be used outside of the daemon as well.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the image for the wanted platform doesn't exist then the lease
doesn't exist either. Returning this error hides the real error, so
let's not return it.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The Controller.Sandboxes method was used by some SandboxWalkers. Now
that those have been removed, there are no longer any consumers of this
method, so let's remove it for now.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I had a CI run fail to "Upload reports":
Exponential backoff for retry #1. Waiting for 4565 milliseconds before continuing the upload at offset 0
Finished backoff for retry #1, continuing with upload
Total file count: 211 ---- Processed file #160 (75.8%)
...
Total file count: 211 ---- Processed file #164 (77.7%)
Total file count: 211 ---- Processed file #164 (77.7%)
Total file count: 211 ---- Processed file #164 (77.7%)
A 503 status code has been received, will attempt to retry the upload
##### Begin Diagnostic HTTP information #####
Status Code: 503
Status Message: Service Unavailable
Header Information: {
"content-length": "592",
"content-type": "application/json; charset=utf-8",
"date": "Mon, 21 Aug 2023 14:08:10 GMT",
"server": "Kestrel",
"cache-control": "no-store,no-cache",
"pragma": "no-cache",
"strict-transport-security": "max-age=2592000",
"x-tfs-processid": "b2fc902c-011a-48be-858d-c62e9c397cb6",
"activityid": "49a48b53-0411-4ff3-86a7-4528e3f71ba2",
"x-tfs-session": "49a48b53-0411-4ff3-86a7-4528e3f71ba2",
"x-vss-e2eid": "49a48b53-0411-4ff3-86a7-4528e3f71ba2",
"x-vss-senderdeploymentid": "63be6134-28d1-8c82-e969-91f4e88fcdec",
"x-frame-options": "SAMEORIGIN"
}
###### End Diagnostic HTTP information ######
Retry limit has been reached for chunk at offset 0 to https://pipelinesghubeus5.actions.githubusercontent.com/Y2huPMnV2RyiTvKoReSyXTCrcRyxUdSDRZYoZr0ONBvpl5e9Nu/_apis/resources/Containers/8331549?itemPath=integration-reports%2Fubuntu-22.04-systemd%2Fbundles%2Ftest-integration%2FTestInfoRegistryMirrors%2Fd20ac12e48cea%2Fdocker.log
Warning: Aborting upload for /tmp/reports/ubuntu-22.04-systemd/bundles/test-integration/TestInfoRegistryMirrors/d20ac12e48cea/docker.log due to failure
Error: aborting artifact upload
Total file count: 211 ---- Processed file #165 (78.1%)
A 503 status code has been received, will attempt to retry the upload
Exponential backoff for retry #1. Waiting for 5799 milliseconds before continuing the upload at offset 0
As a result, the "Download reports" continued retrying:
...
Total file count: 1004 ---- Processed file #436 (43.4%)
Total file count: 1004 ---- Processed file #436 (43.4%)
Total file count: 1004 ---- Processed file #436 (43.4%)
An error occurred while attempting to download a file
Error: Request timeout: /Y2huPMnV2RyiTvKoReSyXTCrcRyxUdSDRZYoZr0ONBvpl5e9Nu/_apis/resources/Containers/8331549?itemPath=integration-reports%2Fubuntu-20.04%2Fbundles%2Ftest-integration%2FTestCreateWithDuplicateNetworkNames%2Fd47798cc212d1%2Fdocker.log
at ClientRequest.<anonymous> (/home/runner/work/_actions/actions/download-artifact/v3/dist/index.js:3681:26)
at Object.onceWrapper (node:events:627:28)
at ClientRequest.emit (node:events:513:28)
at TLSSocket.emitRequestTimeout (node:_http_client:839:9)
at Object.onceWrapper (node:events:627:28)
at TLSSocket.emit (node:events:525:35)
at TLSSocket.Socket._onTimeout (node:net:550:8)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)
Exponential backoff for retry #1. Waiting for 5305 milliseconds before continuing the download
Total file count: 1004 ---- Processed file #436 (43.4%)
And, it looks like GitHub doesn't allow cancelling the job, possibly
because it is defined with `if: always()`?
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This functionality has been replaced with Controller.GetSandbox, and is
no longer used anywhere.
This patch removes:
- the Controller.WalkSandboxes method
- the SandboxContainerWalker SandboxWalker
- the SandboxWalker type
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Various parts of the code were using "walkers" to iterate over the
controller's sandboxes, and the only condition for all of them was
to find the sandbox for a given container-ID. Iterating over all
sandboxes was also sub-optimal, because on Windows, the ContainerID
is used as Sandbox-ID, which can be used to lookup the sandbox from
the "sandboxes" map on the controller.
This patch implements a GetSandbox method on the controller that
looks up the sandbox for a given container-ID, using the most optimal
approach (depending on the platform).
The new method can return errors for invalid (empty) container-IDs, and
a "not found" error to allow consumers to detect non-existing sandboxes,
or potentially invalid IDs.
This new method replaces the (non-exported) Daemon.getNetworkSandbox(),
which was only used internally, in favor of directly accessing the
controller's method.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was not exported so let's remove the abstraction to not make it look
like something more than it is.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
"Pay no attention to the implementation behind the curtain!"
There's only one implementation of the Sandbox interface, and only one implementation
of the Info interface, and they both happens to be implemented by the same type:
networkNamespace. Let's merge these interfaces.
And now that we know that there's one, and only one Info, we can drop the charade,
and relieve the Sandbox from its dual personality.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This better aligns to GHA/CI settings, and is in general a better
practice in the year 2023.
We also drop the 'unsupported' fallback for `git rev-parse` in the
Makefile; we have a better fallback behavior for an empty
DOCKER_GITCOMMIT in `hack/make.sh`.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
There are still messy special cases (e.g. DOCKER_GITCOMMIT vs VERSION),
but this makes things a little easier to follow, as we keep
GHA-specifics in the GHA files.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
This is something that stood out to me: removing the intermediate
container is part of a build step, but unlike the other output from
the build, wasn't indented (and prefixed with `--->`) to be shown
as part of the build.
This patch adds the `--->` prefix, to make it clearer what step the
removal was part of.
While at it, I also updated the message itself: this output is printed
_after_ the intermediate container has been removed, so we may as well
make it match reality, so I changed "removing" to "removed".
Before:
echo -e 'FROM busybox\nRUN echo hello > /dev/null\nRUN echo world > /dev/null\n' | DOCKER_BUILDKIT=0 docker build --no-cache -
Sending build context to Docker daemon 2.048kB
Step 1/3 : FROM busybox
---> a416a98b71e2
Step 2/3 : RUN echo hello > /dev/null
---> Running in a1a65b9365ac
Removing intermediate container a1a65b9365ac
---> 8c6b57ebebdd
Step 3/3 : RUN echo world > /dev/null
---> Running in 9fa977b763a5
Removing intermediate container 9fa977b763a5
---> 795c1f2fc7b9
Successfully built 795c1f2fc7b9
After:
echo -e 'FROM busybox\nRUN echo hello > /dev/null\nRUN echo world > /dev/null\n' | DOCKER_BUILDKIT=0 docker build --no-cache -
Sending build context to Docker daemon 2.048kB
Step 1/3 : FROM busybox
---> fc9db2894f4e
Step 2/3 : RUN echo hello > /dev/null
---> Running in 38d7c34c2178
---> Removed intermediate container 38d7c34c2178
---> 7c0dbc45111d
Step 3/3 : RUN echo world > /dev/null
---> Running in 629620285d4c
---> Removed intermediate container 629620285d4c
---> b92f70f2e57d
Successfully built b92f70f2e57d
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Slightly refactor Resolver.dialExtDNS:
- use net.JoinHostPort to properly format IPv6 addresses
- define a const for the default port, and avoid int -> string
conversion if no custom port is defined
- slightly simplify logic if the HostLoopback is used (at the cost of
duplicating one line); in that case we don't need to define the closure
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in 36fd9d02be
(libnetwork: ce6c6e8c35),
because there were multiple places where a DNS response was created,
which had to use the same options. However, new "common" options were
added since, and having it in a function separate from the other (also
common) options was just hiding logic, so let's remove it.
What the above probably _should_ have done was to create a common utility
to create a DNS response (as all other options are shared as well). This
was actually done in 0c22e1bd07 (libnetwork:
be3531759b),
which added a `createRespMsg` utility, but missed that it could be used
for both cases.
This patch:
- removes the setCommonFlags function
- uses createRespMsg instead to share common options
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Removes the deprecated consts, which moved to a separate "scope" package
in commit 6ec03d6745, and are no longer used;
- datastore.LocalScope
- datastore.GlobalScope
- datastore.SwarmScope
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
UnavailableError is now compatible with errdefs.UnavailableError. These
errors will now return a 503 instead of a 500.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
InvalidParameter is now compatible with errdefs.InvalidParameter. Thus,
these errors will now return a 400 status code instead of a 500.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Fix a failure to inspect image if any of its present manifest references
an image config which isn't present locally.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- un-export ZoneSettings, because it's only used internally
- make conversion to a "interface" slice a method on the struct
- remove the getDockerZoneSettings() function, and move the type-definition
close to where it's used, as it was only used in a single location
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test didn't make a lot of sense, because `checkRunning()` depends on
the `connection` package-var being set, which is done by `firewalldInit()`,
so would never be true on its own.
Add a small utility that opens its own D-Bus connection to verify if
firewalld is running, and otherwise skips the tests (preserving any
error in the process).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
DelInterfaceFirewalld returns an error if the interface to delete was
not found. Let's ignore cases where we were successfully able to get
the list of interfaces in the zone, but the interface was not part of
the zone.
This patch changes the error for these cases to an errdefs.ErrNotFound,
and updates IPTable.ProgramChain to ignore those errors.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As we have a hard time figuring out what moby/moby#46099 should look
like, this drop-in replacement will solve the initial formatting problem
we have. It's made internal such that we can remove it whenever we want
and unlike moby/moby#46099 doesn't require thoughtful API changes.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's used in various defers, but was using `err` as name, which can be
confusing, and increases the risk of accidentally shadowing the error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's used in various defers, but was using `err` as name, which can be
confusing, and increases the risk of accidentally shadowing the error.
This patch:
- introduces a `retErr` output variable, to be used in defer statements.
- explicitly changes some `err` uses to locally-scoped variables.
- moves some variable definitions closer to where they're used (where possible).
While working on this change, there was one point in the code where
error handling was ambiguous. I added a note for that, in case this
was not a bug:
> This code was previously assigning the error to the global "err"
> variable (before it was renamed to "retErr"), but in case of a
> "MaskableError" did not *return* the error:
> b325dcbff6/libnetwork/controller.go (L566-L573)
>
> Depending on code paths further down, that meant that this error
> was either overwritten by other errors (and thus not handled in
> defer statements) or handled (if no other code was overwriting it.
>
> I suspect this was a bug (but possible without effect), but it could
> have been intentional. This logic is confusing at least, and even
> more so combined with the handling in defer statements that check for
> both the "err" return AND "skipCfgEpCount":
> b325dcbff6/libnetwork/controller.go (L586-L602)
>
> To save future visitors some time to dig up history:
>
> - config-only networks were added in 25082206df
> - the special error-handling and "skipCfgEpcoung" was added in ddd22a8198
> - and updated in 87b082f365 to don't use string-matching
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There were quite some places where the type collided with variables
named `agent`. Let's rename the type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function has _four_ output variables of the same type, and several
defer statements that checked the error returned (but using the `err`
variable).
This patch names the return variables to make it clearer what's being
returned, and renames the error-return to `retErr` to make it clearer
where we're dealing with the returned error (and not any local err), to
prevent accidentally shadowing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There's nothing handling these results, and they're logged as debug-logs,
so we may as well remove the returned variables.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Both functions were generating debug logs if there was nothing to log.
The function already produces logs if things failed while deleting entries,
so these logs would only be printed if there was nothing to delete, so can
safely be discarded.
Before this change:
DEBU[2023-08-14T12:33:23.082052638Z] Revoking external connectivity on endpoint sweet_swirles (1519f9376a3abe7a1c981600c25e8df6bbd0a3bc3a074f1c2b3bcbad0438443b)
DEBU[2023-08-14T12:33:23.085782847Z] DeleteConntrackEntries purged ipv4:0, ipv6:0
DEBU[2023-08-14T12:33:23.085793847Z] DeleteConntrackEntriesByPort for udp ports purged ipv4:0, ipv6:0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This goroutine was added in c458bca6dc, and
looks for errors from the wait channel. If no error is returned, it attempts
to start the container, and *updates* the error if a failure happened while
doing so, so that the code below it can update the container's status, and
perform auto-remove (if set for the container).
However, due to the formatting of the code, it was easy to overlook that
the "err" variable was not local to the "if" statement.
This patch breaks up the if-statement in an attempt to make it clearer that
this is not a local "err" variable, and adds a code-comment explaining the
logic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The function was checking in a loop if networking for the container was
disabled. Change the function to return early, and to only set hooks
if one needs to be set.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's only called as part of the "libnetwork-setkey" re-exec, so un-exporting
it to make clear it's not for external use.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The basepath is only used on Linux, so no need to call it on other
platforms. SetBasePath was already stubbed out on other platforms,
but "osl" was still imported in various places where it was not actually
used, so trying to reduce imports to get a better picture of what parts
are used (and not used).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests were implicitly skipped through the `getTestEnv()` utility,
which made it hard to discover they were not ran on Windows.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it easier to spot if code is only used on Linux. Note that "all of"
the bridge driver is Linux-only.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Removing this type, because:
- containerNotModifiedError is not an actual error, and abstracting it away
was hiding some of these details. It also wasn't used as a sentinel error
anywhere, so doesn't have to be its own type.
- Defining a type just to toggle the error-message between "not running"
and "not stopped" felt a bit over-the-top, as each variant was only used once.
- So "it only had one job", and it didn't even do that right; it produced
capitalized error messages, which makes linters unhappy.
So, let's just inline what it does in the two places it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There's no need for this to be a closure; let's just make it a regular
function. While moving it out, also make some minor code-changes and
add some code-comments to describe the flow / intent, which may not
be trivial for people that are not familiar with these details.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The container rw layer may already be mounted, so it's not safe to use
it in another overlay mount. Use the ref counted mounter (which will
reuse the existing mount if it exists) to avoid that.
Also, mount the parent mounts (layers of the base image) in a read-only
mode.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
To prevent mounting the container rootfs in a rw mode if it's already
mounted. This can't use `mount.WithReadonlyTempMount` because the
archive code does a chroot with a pivot_root, which creates a new
directory in the rootfs.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Check that operations that could potentially perform overlayfs mounts
that could cause undefined behaviors.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Any error that occurs while creating the spec, even if it's the
result of an invalid container config, must be considered a System
error (internal server error), as it's not an error with the request
to start the container.
Invalid configuration in the config itself must be validated when
creating the container (creating its config), but some errors are
dependent on the current state, for example when starting a container
that shares a namespace with another container, and that container
is not running (or missing).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was only used for a single test, and it was very limited
in functionality as it only allowed for a certain error-string to be
matched.
Let's change it into a more generic function; a helper that allows a
container to be created from a `TestContainerConfig` (which can be
constructed using `NewTestConfig`) and that returns the response from
client.ContainerCreate(), so that any result from that can be tested,
leaving it up to the test to check the results.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Introduce a NewTestConfig utility, to allow using the available utilities
for constructing a config, and use them with the regular API client.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `client` variable was colliding with the `client` import. In some cases
the confusing `cli` name (it's not the "cli") was used. Given that such names
can easily start spreading (through copy/paste, or "code by example"), let's
make a one-time pass through all of them in this package to use the same name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `client` variable was colliding with the `client` import in various
files. While it didn't conflict in all files, there was inconsistency
in the naming, sometimes using the confusing `cli` name (it's not the
"cli"), and such names can easily start spreading (through copy/paste,
or "code by example").
Let's make a one-time pass through all of them in this package to use
the same name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In these cases, continuing after a non nil error will result in a nil
dereference in panic.
Change the `assert.Check` to `assert.NilError` to avoid that.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This test was testing the client-side validation, so might as well
move it there, and validate that the client invalidates before
trying to make an API call.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Attach the context to the request while we're creating it, instead of
creating the context first, and adding the context later.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Re-use the request, and change the method to GET instead of building
a new request "from scratch".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't exit immediately (due to `set -e` bash behavior) when grep returns
with a non-zero exit code. Use empty dirs instead and let it print
messages about all tests being filtered out.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
To avoid passing the `/` prefix in the -test.run to the integration test
suite, which for some reason executes all tests, but works fine with
integration-cli.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Previous check checked if ANY of the test directories isn't
integration-cli. This means it was true if TEST_FILTER matched multiple
tests from both integration and integration-cli suite.
Remove the grep `-v` inversion and replace it with a bash negation, so
it actually checks if there is no `integration-cli` in test dirs.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The mutex is only used on reads, but there's nothing protecting writes,
and it looks like nothing is mutating fields after creation, so let's
remove this altogether.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
No context in the commit that added it, but PR discussion shows that
the API was mostly exploratory, and it was 8 Years go, so let's not
head in that direction :) b646784859
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that we removed the interface, there's no need to cast the Network
to a NetworkInfo interface, so we can remove uses of the `Info()` method.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These errors aren't used in our repo and seem unused by the OSS
community (this was checked with Sourcegraph).
- ErrIpamInternalError has never been used
- ErrInvalidRequest is unused since moby/libnetwork@c85356efa
- ErrPoolNotFound has never been used
- ErrOverlapPool has never been used
- ErrNoAvailablePool has never been used
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
PR moby/moby#45759 is going to use the new `errors.Join` function to
return a list of validation errors.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Check the preferredPool first, as other checks could be doing more
(such as locking, or validating / parsing). Also adding a note, as
it's unclear why we're ignoring invalid pools here.
The "invalid" conditions was added in [libnetwork#1095][1], which
moved code to reduce os-specific dependencies in the ipam package,
but also introduced a types.IsIPNetValid() function, which considers
"0.0.0.0/0" invalid, and added it to the condition to return early.
Unfortunately review does not mention this change, so there's no
context why. Possibly this was done to prevent errors further down
the line (when checking for overlaps), but returning an error here
instead would likely have avoided that as well, so we can only guess.
To make this code slightly more transparent, this patch also inlines
the "types.IsIPNetValid" function, as it's not used anywhere else,
and inlining it makes it more visible.
[1]: 5ca79d6b87 (diff-bdcd879439d041827d334846f9aba01de6e3683ed8fdd01e63917dae6df23846)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was only run if no preferred pool was specified, however,
since [libnetwork#1162][2], the function would already return early
if a preferred pools was set (and the overlap check to be skipped),
so this was now just dead code.
[2]: 9cc3385f44
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function intentionally holds a lock / lease on address-pools to
prevent trying the same pool repeatedly.
Let's try to make this logic slightly more transparent, and prevent
defining defers in a loop. Releasing all the pools in a singe defer
also allows us to get the network-name once, which prevents locking
and unlocking the network for each iteration.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Both functions have multiple output vars with generic types, which made
it hard to grasp what's what.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it easier to consume, without first having to create an empty
PoolID.
Performance is the same:
BenchmarkPoolIDFromString-10 6100345 196.5 ns/op 112 B/op 3 allocs/op
BenchmarkPoolIDFromString-10 6252750 192.0 ns/op 112 B/op 3 allocs/op
Note that I opted not to change the return-type to a pointer, as that seems
to perform less;
BenchmarkPoolIDFromString-10 6252750 192.0 ns/op 112 B/op 3 allocs/op
BenchmarkPoolIDFromString-10 5288682 226.6 ns/op 192 B/op 4 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As this function may be called repeatedly to convert to/from a string,
it may be worth optimizing it a bit. Adding a minimal Benchmark for
it as well.
Before/after:
BenchmarkPoolIDToString-10 2842830 424.3 ns/op 232 B/op 12 allocs/op
BenchmarkPoolIDToString-10 7176738 166.8 ns/op 112 B/op 7 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
network.requestPoolHelper and Allocator.RequestPool have many args and
output vars with generic types. Add names for them to make it easier to
grasp what's what.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The options are unused, other than for debug-logging, which made it look
as if they were actually consumed anywhere, but they aren't.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it slightly more readable to see what's returned in each of
the code-paths. Also move validation of pool/subpool earlier in the
function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests were testing non-existing plugins, but therefore triggered
the retry-loop, which times out after 15-30 seconds. Add some options
to allow overriding this timeout during tests.
Before:
go test -v -run '^(TestGet|TestNewClientWithTimeout)$'
=== RUN TestGet
=== RUN TestGet/success
=== RUN TestGet/not_implemented
=== RUN TestGet/not_exists
WARN[0000] Unable to locate plugin: vegetable, retrying in 1s
WARN[0001] Unable to locate plugin: vegetable, retrying in 2s
WARN[0003] Unable to locate plugin: vegetable, retrying in 4s
WARN[0007] Unable to locate plugin: vegetable, retrying in 8s
--- PASS: TestGet (15.02s)
--- PASS: TestGet/success (0.00s)
--- PASS: TestGet/not_implemented (0.00s)
--- PASS: TestGet/not_exists (15.02s)
=== RUN TestNewClientWithTimeout
client_test.go:166: started remote plugin server listening on: http://127.0.0.1:36275
WARN[0015] Unable to connect to plugin: 127.0.0.1:36275/Test.Echo: Post "http://127.0.0.1:36275/Test.Echo": context deadline exceeded (Client.Timeout exceeded while awaiting headers), retrying in 1s
WARN[0017] Unable to connect to plugin: 127.0.0.1:36275/Test.Echo: Post "http://127.0.0.1:36275/Test.Echo": context deadline exceeded (Client.Timeout exceeded while awaiting headers), retrying in 2s
WARN[0019] Unable to connect to plugin: 127.0.0.1:36275/Test.Echo: Post "http://127.0.0.1:36275/Test.Echo": net/http: request canceled (Client.Timeout exceeded while awaiting headers), retrying in 4s
WARN[0024] Unable to connect to plugin: 127.0.0.1:36275/Test.Echo: Post "http://127.0.0.1:36275/Test.Echo": net/http: request canceled (Client.Timeout exceeded while awaiting headers), retrying in 8s
--- PASS: TestNewClientWithTimeout (17.64s)
PASS
ok github.com/docker/docker/pkg/plugins 32.664s
After:
go test -v -run '^(TestGet|TestNewClientWithTimeout)$'
=== RUN TestGet
=== RUN TestGet/success
=== RUN TestGet/not_implemented
=== RUN TestGet/not_exists
WARN[0000] Unable to locate plugin: this-plugin-does-not-exist, retrying in 1s
--- PASS: TestGet (1.00s)
--- PASS: TestGet/success (0.00s)
--- PASS: TestGet/not_implemented (0.00s)
--- PASS: TestGet/not_exists (1.00s)
=== RUN TestNewClientWithTimeout
client_test.go:167: started remote plugin server listening on: http://127.0.0.1:45973
--- PASS: TestNewClientWithTimeout (0.04s)
PASS
ok github.com/docker/docker/pkg/plugins 1.050s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If TEST_INTEGRATION_FAIL_FAST is not set, run the integration-cli tests
even if integration tests failed.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Collect a list of all the links we successfully enabled (if any), and
use a single defer to disable them.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The iptables package has types defined for these actions; use them directly
instead of creating a string only to convert it to a known value.
As the linkContainers() function is only used internally, and with fixed
values, we can also remove the validation, and InvalidIPTablesCfgError
error, which is now unused.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Partially revert commit 94b880f.
The CheckDuplicate field has been introduced in commit 2ab94e1. At that
time, this check was done in the network router. It was then moved to
the daemon package in commit 3ca2982. However, commit 94b880f duplicated
the logic into the network router for no apparent reason. Finally,
commit ab18718 made sure a 409 would be returned instead of a 500.
As this logic is first done by the daemon, the error -> warning
conversion can't happen because CheckDuplicate has to be true for the
daemon package to return an error. If it's false, the daemon proceed
with the network creation, set the Warning field of its return value and
return no error.
Thus, the CheckDuplicate logic in the api is removed and
libnetwork.NetworkNameError now implements the ErrConflict interface.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The MediaType was changed twice in;
- b3b7eb2723 ("application/vnd.docker.plugins.v1+json" -> "application/vnd.docker.plugins.v1.1+json")
- 54587d861d ("application/vnd.docker.plugins.v1.1+json" -> "application/vnd.docker.plugins.v1.2+json")
But the (integration) tests were still using the old version, so let's
use the VersionMimeType const that's defined, and use the updated version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The MediaType was changed twice in;
- b3b7eb2723 ("application/vnd.docker.plugins.v1+json" -> "application/vnd.docker.plugins.v1.1+json")
- 54587d861d ("application/vnd.docker.plugins.v1.1+json" -> "application/vnd.docker.plugins.v1.2+json")
But the (integration) tests were still using the old version, so let's
use the VersionMimeType const that's defined, and use the updated version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The MediaType was changed twice in;
- b3b7eb2723 ("application/vnd.docker.plugins.v1+json" -> "application/vnd.docker.plugins.v1.1+json")
- 54587d861d ("application/vnd.docker.plugins.v1.1+json" -> "application/vnd.docker.plugins.v1.2+json")
But the (integration) tests were still using the old version, so let's
use the VersionMimeType const that's defined, and use the updated version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
During review, it was decided to remove `LimitNOFILE` from `docker.service` to rely on the systemd v240 implicit default of `1024:524288`. On supported platforms with systemd prior to v240, packagers will patch the service with an explicit `LimitNOFILE=1024:524288`.
- `1024` soft limit is an implicit default, avoiding unexpected breakage. Software that needs a higher limit should request to raise the soft limit for its process.
- `524288` hard limit is an implicit default since systemd v240 and is adequate for most processes (_half of the historical limit from `fs.nr_open` of `1048576`_), while 4096 is the implicit default from the kernel (often too low). Individual containers can be started with `--ulimit` when a larger hard limit is required.
- The hard limit may not exceed `fs.nr_open` (_which a value of `infinity` will resolve to_). On most systems with systemd v240 or newer, this will resolve to an excessive size of 2^30 (over 1 billion).
- When set to `infinity` (usually as the soft limit) software may experience significantly increased resource usage, resulting in a performance regression or runtime failures that are difficult to troubleshoot.
- OpenRC current config approach lacks support for different soft/hard limits being set as it adjusts additional limits and `ulimit` does not support mixed usage of `-H` + `-S`. A soft limit of `524288` is not ideal, but 2^19 is much less overhead than 2^30, whilst a hard limit of 4096 would be problematic for Docker.
Signed-off-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>
Remove some intermediate vars, move vars closer to where they're used,
and introduce local var for `nw.Name()` to reduce some locking/unlocking in:
- `Daemon.allocateNetwork()`
- `Daemon.releaseNetwork()`
- `Daemon.connectToNetwork()`
- `Daemon.disconnectFromNetwork()`
- `Daemon.findAndAttachNetwork()`
Also un-wrapping some lines to make it slightly easier to read the conditions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Remove intermediate variable
- Optimize the order of checks in the condition; check for unmanaged containers
first, before getting information about cluster state and network information.
- Simplify the log messages, as the error would already contain the same
information about the network (name or ID) and container (ID), so would
print the network ID twice:
error detaching from network <ID>: could not find network attachment for container <ID> to network <name or ID>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The function was declaring an err variable which was shadowed. It was
intended for directly assigning to a struct field, but as this function
is directly mutating an existing object, and the err variable was declared
far away from its use, let's use an intermediate var for that to make it
slightly more atomic.
While at it, also combined two "if" branches.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
store network.Name() in a variable to reduce repeatedly locking/unlocking
of the network (although this is very, very minimal in the grand scheme
of things).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tests were made parallel to speed up the execution, but this
turned out to be flaky, because they mutate some shared state.
The tests use shared `storage` variable without any synchronization.
However, adding synchronization is not enough in all cases, some tests
register the same plugin, so they can't be run in parallel to each
other.
This commit adds the synchronization around `storage` variable
modification and removes parallel from the tests where it's not enough.
Before:
```
$ go test -race -v . -count 1
...
--- FAIL: TestGet (15.02s)
--- FAIL: TestGet/not_implemented (0.00s)
testing.go:1446: race detected during execution of test
testing.go:1446: race detected during execution of test
FAIL
FAIL github.com/docker/docker/pkg/plugins 17.655s
FAIL
```
After:
```
$ go test -race -v . -count 1
ok github.com/docker/docker/pkg/plugins 32.702s
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This function is called by `daemon.containerCreate()` which is already
wrapping errors coming from `verifyNetworkingConfig()` with
`errdefs.InvalidParameter()`. So `verifyNetworkingConfig()` should only
return standard errors.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
"HEAD" will still be used as a version if no DOCKER_COMMIT is provided
(for example when not running via `make`), but it won't prevent it being
set to the GITHUB_SHA variable when it's present.
This should fix `Git commit` reported by `docker version` for the
binaries generated by `moby-bin`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This code was initializing a new PortBinding, and creating a deep copy
for each binding. It's unclear what the intent was here, but at least
PortBinding.GetCopy() wasn't adding much value, as it created a new
PortBinding, [copying all values from the original][1], which includes
a [copy of IPAddresses in it][2]. Our original "template" did not have any
of that, so let's forego that, and just create new PortBindings as we go.
[1]: 454b6a7cf5/libnetwork/types/types.go (L110-L120)
[2]: 454b6a7cf5/libnetwork/types/types.go (L236-L244)
Benchmarking before/after;
BenchmarkPortBindingCopy-10 166752 6230 ns/op 1600 B/op 100 allocs/op
BenchmarkPortBindingNoCopy-10 226989 5056 ns/op 1600 B/op 100 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were not adding much, so just getting rid of them. Also added a
TODO to move this code to the type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move variables closer to where they're used instead of defining them all
at the start of the function.
Also removing some intermediate variables, unwrapped some lines, and combined
some checks to a single check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Outside of some tests, these options are the only code setting these fields,
so we can update them to set the value, instead of appending.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was created as a "method", but didn't use the Daemon in any
way, and all other options were checked inline, so let's not pretend this
function is more "special" than the other checks, and inline the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- store network.Name() in a variable to reduce repeatedly locking/unlocking
of the network (although this is very, very minimal in the grand scheme
of things).
- un-wrap long conditions
- ever so slightly optimise some conditions by changeing the order of checks.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was initializing a new PortBinding, and creating a deep copy
for each binding. It's unclear what the intent was here, but at least
PortBinding.GetCopy() wasn't adding much value, as it created a new
PortBinding, [copying all values from the original][1], which includes
a [copy of IPAddresses in it][2]. Our original "template" did not have any
of that, so let's forego that, and just create new PortBindings as we go.
[1]: 454b6a7cf5/libnetwork/types/types.go (L110-L120)
[2]: 454b6a7cf5/libnetwork/types/types.go (L236-L244)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were not adding much, so just getting rid of them. Also added a
TODO to move this code to the type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move variables closer to where they're used instead of defining them all
at the start of the function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`getPortMapInfo` does many things; it creates a copy of all the sandbox
endpoints, gets the driver, endpoints, and network from store, and creates
port-bindings for all exposed and mapped ports.
We should look if we can create a more minimal implementation for this
purpose, but in the meantime, let's prevent it being called if we don't
need it by making it the second condition in the check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- don't initialize slices; it's not needed to append to them
- store network-ID in a var to prevent repeated lock/unlocking in nw.ID()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When we added this deprecation warning, some registries had not yet
moved away from the deprecated specification, so we made the warning
conditional for pulling from Docker Hub.
That condition was added in 647dfe99a5,
which is over 4 Years ago, which should be time enough for images
and registries to have moved to current specifications.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Use the same warning for both "v1 in manifest-index" and bare "v1" images.
- Update URL to use a "/go/" redirect, which allows the docs team to more
easily redirect the URL to relevant docs (if things move).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The original code in container.Exec was potentially leaking the copy
goroutine when the context was cancelled or timed out. The new
`demultiplexStreams()` function won't return until the goroutine has
finished its work, and to ensure that it takes care of closing the
hijacked connection.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Includes a fix for CVE-2023-29409
go1.20.7 (released 2023-08-01) includes a security fix to the crypto/tls
package, as well as bug fixes to the assembler and the compiler. See the
Go 1.20.7 milestone on our issue tracker for details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.20.7+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.20.6...go1.20.7
From the mailing list announcement:
[security] Go 1.20.7 and Go 1.19.12 are released
Hello gophers,
We have just released Go versions 1.20.7 and 1.19.12, minor point releases.
These minor releases include 1 security fixes following the security policy:
- crypto/tls: restrict RSA keys in certificates to <= 8192 bits
Extremely large RSA keys in certificate chains can cause a client/server
to expend significant CPU time verifying signatures. Limit this by
restricting the size of RSA keys transmitted during handshakes to <=
8192 bits.
Based on a survey of publicly trusted RSA keys, there are currently only
three certificates in circulation with keys larger than this, and all
three appear to be test certificates that are not actively deployed. It
is possible there are larger keys in use in private PKIs, but we target
the web PKI, so causing breakage here in the interests of increasing the
default safety of users of crypto/tls seems reasonable.
Thanks to Mateusz Poliwczak for reporting this issue.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.20.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it slightly clearer what it does, as "resolve" may give the
impression it's doing more than just returning the TLS config configured
for the client.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
fallbackDial was only used in a single place, and it was defined far away
from where it's used, so let's inline it, so that it's clear at a glance
what we're doing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The is-automated field is being deprecated by Docker Hub's search API,
and will always be set to "false" in future.
This patch deprecates the field and related filter for the Engine's API.
In future, the `is-automated` filter will no longer yield any results
when searching for `is-automated=true`, and will be ignored when
searching for `is-automated=false`.
Given that this field is deprecated by an external API, the deprecation
will not be versioned, and will apply to any API version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I noticed this was always being skipped because of race conditions
checking the logs.
This change adds a log scanner which will look through the logs line by
line rather than allocating a big buffer.
Additionally it adds a `poll.Check` which we can use to actually wait
for the desired log entry.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
IPv4AddrNoMatchError and IPv6AddrNoMatchError are currently implementing
BadRequestError. They are returned in two cases, and none are due to a
bad user request:
- When calling daemon's CreateNetwork route, if the bridge's IPv4
address or none of the bridge's IPv6 addresses match what's requested.
If that happens, there's a big issue somewhere in libnetwork or the
kernel.
- When restoring a network, for the same reason. In that case, the
on-disk state drifted from the interface state.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This error can only be reached because of an error in our code, so it's
not a "bad user request". As it's never type asserted, no need to keep
it around.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This error is only used in defensive checks whereas the precondition is
already checked by caller. If we reach it, we messed something else. So
it's definitely not a BadRequest. Also, it's not type asserted anywhere,
so just inline it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
It was not used as a sentinel error, and didn't carry a specific type,
which made it a rather complex way to create an error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was added moved to the types package as part of a refactor
in 778e2a72b3
but the introduction of the sandbox API changed the existing API to
weak types (not using a plain string);
9a47be244a
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- InvalidIPTablesCfgError: implement InternalError instead of
BadRequestError. This error is returned when an invalid iptables
action is passed as argument (ie. none of -A, -I, or -D).
- ErrInvalidDriverConfig: don't implement BadRequestError. This is
returned when libnetwork controller initialization pass bad driver
config -- there's no call from an HTTP route.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- full diff: https://github.com/containerd/containerd/compare/v1.6.21...v1.6.22
- release notes: https://github.com/containerd/containerd/releases/tag/v1.6.22
---
Notable Updates
- RunC: Update runc binary to v1.1.8
- CRI: Fix `additionalGids`: it should fallback to `imageConfig.User`
when `securityContext.RunAsUser`, `RunAsUsername` are empty
- CRI: Write generated CNI config atomically
- Fix concurrent writes for `UpdateContainerStats`
- Make `checkContainerTimestamps` less strict on Windows
- Port-Forward: Correctly handle known errors
- Resolve `docker.NewResolver` race condition
- SecComp: Always allow `name_to_handle_at`
- Adding support to run hcsshim from local clone
- Pinned image support
- Runtime/V2/RunC: Handle early exits w/o big locks
- CRITool: Move up to CRI-TOOLS v1.27.0
- Fix cpu architecture detection issue on emulated ARM platform
- Task: Don't `close()` io before `cancel()`
- Fix panic when remote differ returns empty result
- Plugins: Notify readiness when registered plugins are ready
- Unwrap io errors in server connection receive error handling
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- go.mod: update dependencies and go version by
- Use Go1.20
- Fix couple of typos
- Added `WithStdout` and `WithStderr` helpers
- Moved `cmdOperators` handling from `RunCmd` to `StartCmd`
- Deprecate `assert.ErrorType`
- Remove outdated Dockerfile
- add godoc links
full diff: https://github.com/gotestyourself/gotest.tools/compare/v3.4.0...v3.5.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Follow-up to fca38bcd0a, which made the
Discover API optional for drivers to implement, but forgot to remove the
stubs from the Windows drivers, which didn't implement this API.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "Capability" type defines DataScope and ConnectivityScope fields,
but their value was set from consts in the datastore package, which
required importing that package and its dependencies for the consts
only.
This patch:
- Moves the consts to a separate "scope" package
- Adds aliases for the consts in the datastore package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
changes:
- specs-go: remove artifact prefixed annotations
- Switch from scratch to empty
- Remove artifact media type reference
- image-index: add artifactType to specs and schema
- Add artifactType to image index
- Apply version change from #1050
- Specify the content of the scratch blob
- Add language from artifacttype field to forbid allowlists of media types
- spec: clarify descriptor, align with de facto artifact usage
- Remove special guidance around wasm
- Update descriptor.go
- releases: use +dev as in-development suffix
- version: bump HEAD back to -dev
- image-index: add the subject field
full diff: https://github.com/opencontainers/image-spec/compare/v1.1.0-rc3...v1.1.0-rc4
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Most drivers do not implement this, so detect if a driver implements
the discoverAPI, and remove the implementation from drivers that do
not support it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.2...v1.7.3
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.3
----
Welcome to the v1.7.3 release of containerd!
The third patch release for containerd 1.7 contains various fixes and updates.
Notable Updates
- RunC: Update runc binary to v1.1.8
- CRI: Fix `additionalGids`: it should fallback to `imageConfig.User`
when `securityContext.RunAsUser`,`RunAsUsername` are empty
- CRI: write generated CNI config atomically
- Port-Forward: Correctly handle known errors
- Resolve docker.NewResolver race condition
- Fix `net.ipv4.ping_group_range` with userns
- Runtime/V2/RunC: handle early exits w/o big locks
- SecComp: always allow `name_to_handle_at`
- CRI: Windows Pod Stats: Add a check to skip stats for containers that
are not running
- Task: don't `close()` io before cancel()
- Remove CNI conf_template deprecation
- Fix issue for HPC pod metrics
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.7.1...v1.7.2
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.2
----
Welcome to the v1.7.2 release of containerd!
The second patch release for containerd 1.7 includes enhancements to CRI
sandbox mode, Windows snapshot mounting support, and CRI and container IO
bug fixes.
CRI/Sandbox Updates
- Publish sandbox events
- Make stats respect sandbox's platform
Other Notable Updates
- Mount snapshots on Windows
- Notify readiness when registered plugins are ready
- Fix `cio.Cancel()` should close pipes
- CDI: Use CRI `Config.CDIDevices` field for CDI injection
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Go 1.15.7 contained a security fix for CVE-2021-3115, which allowed arbitrary
code to be executed at build time when using cgo on Windows.
This issue was not limited to the go command itself, and could also affect binaries
that use `os.Command`, `os.LookPath`, etc.
From the related blogpost (https://blog.golang.org/path-security):
> Are your own programs affected?
>
> If you use exec.LookPath or exec.Command in your own programs, you only need to
> be concerned if you (or your users) run your program in a directory with untrusted
> contents. If so, then a subprocess could be started using an executable from dot
> instead of from a system directory. (Again, using an executable from dot happens
> always on Windows and only with uncommon PATH settings on Unix.)
>
> If you are concerned, then we’ve published the more restricted variant of os/exec
> as golang.org/x/sys/execabs. You can use it in your program by simply replacing
At time of the go1.15 release, the Go team considered changing the behavior of
`os.LookPath()` and `exec.LookPath()` to be a breaking change, and made the
behavior "opt-in" by providing the `golang.org/x/sys/execabs` package as a
replacement.
However, for the go1.19 release, this changed, and the default behavior of
`os.LookPath()` and `exec.LookPath()` was changed. From the release notes:
https://go.dev/doc/go1.19#os-exec-path
> Command and LookPath no longer allow results from a PATH search to be found
> relative to the current directory. This removes a common source of security
> problems but may also break existing programs that depend on using, say,
> exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe)
> in the current directory. See the os/exec package documentation for information
> about how best to update such programs.
>
> On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath
> environment variable, making it possible to disable the default implicit search
> of “.” in PATH lookups on Windows systems.
A result of this change was that registering the daemon as a Windows service
no longer worked when done from within the directory of the binary itself:
C:\> cd "Program Files\Docker\Docker\resources"
C:\Program Files\Docker\Docker\resources> dockerd --register-service
exec: "dockerd": cannot run executable found relative to current directory
Note that using an absolute path would work around the issue:
C:\Program Files\Docker\Docker>resources\dockerd.exe --register-service
This patch changes `registerService()` to use `os.Executable()`, instead of
depending on `os.Args[0]` and `exec.LookPath()` for resolving the absolute
path of the binary.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
IPv6 ipt rules are exactly the same as IPv4 rules, although both
protocol don't use the same networking model. This has bad consequences,
for instance: 1. the current v6 rules disallow Neighbor
Solication/Advertisement ; 2. multicast addresses can't be used ; 3.
link-local addresses are blocked too.
To solve this, this commit changes the following rules:
```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
```
into:
```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 ! -i br-21502e5b2c6c -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c ! -o br-21502e5b2c6c -j DROP
```
These rules only limit the traffic ingressing/egressing the bridge, but
not traffic between veth on the same bridge.
Note that, the Kernel takes care of dropping invalid IPv6 packets, eg.
loopback spoofing, thus these rules don't need to be more specific.
Solve #45460.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
refreshImage is the only function used as a reducer and it doesn't use
the `filter *listContext`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Aggregate same images into one object and add the list of tags pointing
to it to the RepoTags array
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
- use assert.Check to continue the test even if a check fails
- assert the total number of images returned, not only their RepoTags
- use subtests
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Refactor GetContainerLayerSize to calculate unpacked image size only by
following the snapshot parent tree directly instead of following it by
using diff ids from image config.
This works even if the original manifest/config used to create that
container is no longer present in the content store.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Check for generic `errdefs.NotFound` rather than specific error helper
struct when checking if the error is caused by the image not being
present.
It still works for `ErrImageDoesNotExist` because it
implements the NotFound errdefs interface too.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Now that all helper functions are updated, we can use a struct-literal
for this function, which makes it slightly easier to read.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Split the buildDetailedNetworkResources function into separate functions for
collecting container attachments (`buildContainerAttachments`) and service
attachments (`buildServiceAttachments`). This allows us to get rid of the
"verbose" bool, and makes the logic slightly more transparent.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Pass the endpoint and endpoint-info, instead of individual fields from the
endpoint.
- Remove redundant nil-check, as it's already checked on the call-side
in `buildDetailedNetworkResources`, which skips endpoints without info.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the function return the constructed network.IPAM instead of applying
it to a network struct, and rename it to "buildIPAMResources".
Rewrite the function itself:
- Use struct-literals where possible to make it slightly more readable.
- Use a boolean (hasIPv4Config, hasIPv6Config) for both IPv4 and IPv6 to
check whether the IPAM-info needs to be added. This makes the logic the
same for both, and makes the processing order-independent. This also
allows for the `network.IpamInfo()` call to be skipped if it's not needed.
- Change order of "ipv4 config / ipv4 info" and "ipv6 config / ipv4 info"
blocks to make it slightly clearer (and to allow skipping the forementioned
call to `network.IpamInfo()`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move the length-check into the function, and change the code to
be a basic type-case, as networkdb.PeerInfo and network.PeerInfo
are identical types.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This variable was only accessed from within LocalRegistry methods, but
due to being a package-level variable, tests had to deal with setting
and resetting it.
Move it to be a field scoped to the LocalRegistry. This simplifies the
tests, and to make this more transparent, also removing the "Setup()"
helper (which, wasn't marked as a t.Helper()).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The client's transport can only be set by newClientWithTransport, which
is not exported, and always uses a transport.HTTPTransport.
However, requestFactory is mocked in one of the tests, so keep the interface,
but make it a local, non-exported one.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The interface is not consumed anywhere, and only non-exported functions
produced one, so we can remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was exported, but never mutated outside of the package, and
effectively a rather "creative" way to define a method on LocalRegistry.
While un-exporting also store these paths in a field, instead of constructing
them on every call, as the results won't change during the lifecycle of the
LocalRegistry.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Split the exported SpecsPaths from the platform-specific implementations,
so that documentation can be maintained in a single location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since 0e5eaf8ee3, these implementations
were fully identical, so removing the duplicate, and move it to a
platform-agnostic file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use gotest.tools assertions
- use consts and struct-literals where possible
- use assert.Check instead of t.Fatal() where possible
- fix some unhandled errors
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was used by the postNetworkConnect() handler, but is handled
by the backend itself, starting with d63a5a1ff5.
Since that commit, this function was no longer used, so we can remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The rootChain variable that the Key function references is a
package-global slice. As the append() built-in may append to the slice's
backing array in place, it is theoretically possible for the temporary
slices in concurrent Key() calls to share the same backing array, which
would be a data race. Thankfully in my tests (on Go 1.20.6)
cap(rootChain) == len(rootChain)
held true, so in practice a new slice is always allocated and there is
no race. But that is a very brittle assumption to depend upon, which
could blow up in our faces at any time without warning. Rewrite the
implementation in a way which cannot lead to data races.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Before this change, integration test would fail fast and not execute all
test suites when one suite fails.
Change this behavior into opt-in enabled by TEST_INTEGRATION_FAIL_FAST
variable.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
It only had a single implementation, so let's remove the interface.
While changing, also renaming;
- datastore.DataStore -> datastore.Store
- datastore.NewDataStore -> datastore.New
- datastore.NewDataStoreFromConfig -> datastore.FromConfig
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were only used internally, and ErrConntrackNotConfigurable was not used
as a sentinel error anywhere. Remove ErrConntrackNotConfigurable, and change
IsConntrackProgrammable to return an error instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
arrangeUserFilterRule uses the package-level [`ctrl` variable][1], which
holds a reference to a controller instance. This variable is set by
[`setupArrangeUserFilterRule()`][2], which is called when initialization
a controller ([`libnetwork.New`][3]).
In normal circumstances, there would only be one controller, created during
daemon startup, and the instance of the controller would be the same as
the controller that `NewNetwork` is called from, but there's no protection
for the `ctrl` variable, and various integration tests create their own
controller instance.
The global `ctrl` var was introduced in [54e7900fb89b1aeeb188d935f29cf05514fd419b][4],
with the assumption that [only one controller could ever exist][5].
This patch tries to reduce uses of the `ctrl` variable, and as we're calling
this code from inside a method on a specific controller, we inline the code
and use that specific controller instead.
[1]: 37b908aa62/libnetwork/firewall_linux.go (L12)
[2]: 37b908aa62/libnetwork/firewall_linux.go (L14-L17)
[3]: 37b908aa62/libnetwork/controller.go (L163)
[4]: 54e7900fb8
[5]: https://github.com/moby/libnetwork/pull/2471#discussion_r343457183
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in libnetwork through 50964c9948
and, based on the name of the function and its signature, I think it
was meant to be a test. This patch refactors it to be one.
Changing it into a test made it slightly broken:
go test -v -run TestErrorInterfaces
=== RUN TestErrorInterfaces
errors_test.go:15: Failed to detect err network not found is of type BadRequestError. Got type: libnetwork.ErrNoSuchNetwork
errors_test.go:15: Failed to detect err endpoint not found is of type BadRequestError. Got type: libnetwork.ErrNoSuchEndpoint
errors_test.go:42: Failed to detect err unknown driver "" is of type ForbiddenError. Got type: libnetwork.NetworkTypeError
errors_test.go:42: Failed to detect err unknown network id is of type ForbiddenError. Got type: *libnetwork.UnknownNetworkError
errors_test.go:42: Failed to detect err unknown endpoint id is of type ForbiddenError. Got type: *libnetwork.UnknownEndpointError
--- FAIL: TestErrorInterfaces (0.00s)
FAIL
This was because some errors were tested twice, but for the wrong type
(`NetworkTypeError`, `UnknownNetworkError`, `UnknownEndpointError`).
Moving them to the right test left no test-cases for `types.ForbiddenError`,
so I added `ActiveContainerError` to not make that part of the code feel lonely.
Other failures were because some errors were changed from `types.BadRequestError`
to a `types.NotFoundError` error in commit ba012a703a,
so I moved those to the right part.
Before this patch:
go test -v -run TestErrorInterfaces
=== RUN TestErrorInterfaces
--- PASS: TestErrorInterfaces (0.00s)
PASS
ok github.com/docker/docker/libnetwork 0.013s
After this patch:
go test -v -run TestErrorInterfaces
=== RUN TestErrorInterfaces
--- PASS: TestErrorInterfaces (0.00s)
PASS
ok github.com/docker/docker/libnetwork 0.013s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit ffd75c2e0c updated this function to
set up the DOCKER-USER chain for both iptables and ip6tables, however the
function would return early if a failure happened (instead of continuing
with the next iptables version).
This patch extracts setting up the chain to a separate function, and updates
arrangeUserFilterRule to log the failure as a warning, but continue with
the next iptables version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These functions were mostly identical, except for iptables being enabled
by default (unless explicitly disabled by config).
Rewrite the function to a enabledIptablesVersions, which returns the list
of iptables-versions that are enabled for the controller. This prevents
having to acquire a lock twice, and simplifies arrangeUserFilterRule, which
can now just iterate over the enabled versions.
Also moving this function to a linux-only file, as other platforms don't have
the iptables types defined.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
"ro-non-recursive", "ro-force-recursive", and "rro" are
now removed from the legacy mount API.
CLI may still support them via the new mount API (if we want).
Follow-up to PR 45278
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Unfortunately also brings in golang.org/x/tools as a dependency, due to
go-winio using a "tools.go" file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove gotest.tools dependency as it was only used in one test,
and only for a trivial check
- use t.TempDir()
- rename vars that collided with package types
- don't use un-keyed structs
- explicitly ignore some errors to please linters
- use iotest.ErrReader
- TestReadCloserWrapperClose: verify reading works before closing :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.8
full diff: https://github.com/opencontainers/runc/compare/v1.1.7...v1.1.9
This is the eighth patch release of the 1.1.z release branch of runc.
The most notable change is the addition of RISC-V support, along with a
few bug fixes.
- Support riscv64.
- init: do not print environment variable value.
- libct: fix a race with systemd removal.
- tests/int: increase num retries for oom tests.
- man/runc: fixes.
- Fix tmpfs mode opts when dir already exists.
- docs/systemd: fix a broken link.
- ci/cirrus: enable some rootless tests on cs9.
- runc delete: call systemd's reset-failed.
- libct/cg/sd/v1: do not update non-frozen cgroup after frozen failed.
- CI: bump Fedora, Vagrant, bats.
- .codespellrc: update for 2.2.5.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- jsonpb: accept 'null' as a valid representation of NullValue in unmarshal
The canonical JSON representation for NullValue is JSON "null".
full diff: https://github.com/golang/protobuf/compare/v1.5.2...v1.5.3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- unmarshal does not return nil object when value is nil
- fixes "ctr tasks checkpoint returns invalid task checkpoint option for io.containerd.runc.v2: unknown"
full diff: https://github.com/containerd/typeurl/compare/v2.1.0...v2.1.1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change add leases for all the content that will be exported, once
the image(s) are exported the lease is removed, thus letting
containerd's GC to do its job if needed. This fixes the case where
someone would remove an image that is still being exported.
This fixes the TestAPIImagesSaveAndLoad cli integration test.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
When resolving a reference that is both a Named and Digested, it could
be resolved to an image that has the same digest, but completely
different repository name.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
If image name is already an untagged digested reference, don't produce
additional digested ref.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Post-f8c0d92a22bad004cb9cbb4db704495527521c42, BUILDKIT_REPO doesn't
really do what it claims to. Instead, don't allow overloading since the
import path for BuildKit is always the same, and make clear the
provenance of values when generating the final variable definitions.
We also better document the script, and follow some best practices for
both POSIX sh and Bash.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The imgSvcConfig is defined locally, and discarded if an error occurs,
so no need to use the intermediate vars here.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The interface is defined on the receiver-side, and returning concrete
types makes it more transparent what we're creating.
As these namespaced wrappers were not exported, let's inline them, so
that it's clear at a glance what it's doing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The containerdCli was somewhat confusing (is it the CLI?); let's rename
to make it match what it is :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
gotest.tools has an init() which registers a '-update' flag;
a80f057529/internal/source/update.go (L21-L23)
The quota helper contains a testhelpers file, which is meant for usage
in (integration) tests, but as it's in the same pacakge as production
code, would also trigger the gotest.tools init.
This patch removes the gotest.tools code from this file.
Before this patch:
$ (exec -a libnetwork-setkey "$(which dockerd)" -help)
Usage of libnetwork-setkey:
-exec-root string
docker exec root (default "/run/docker")
-update
update golden values
With this patch applied:
$ (exec -a libnetwork-setkey "$(which dockerd)" -help)
Usage of libnetwork-setkey:
-exec-root string
docker exec root (default "/run/docker")
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add `-f` to output nothing to tar if the curl fails, and `-S` to report
errors if they happen.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Use bind-mounts instead of a `COPY` for cli.sh, and use `COPY --link`
for rootlesskit's build stage.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Use a non-slash escape sequence to support mirrors with a path
component, and do not unconditionally replace the mirror in
Dockerfile.simple.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
This aligns `docker build` as invoked by the Makefile with both `docker
buildx bake` as invoked by the Makefile and directly by the user.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
- Fix copy on windows plus tests
- Fix follow symlinkResolver on Windows
- Implement proper renameFile on Windows
- Fix potential nil pointer dereference
- Use RunWithPrivileges
- Fix leaking file handle
- handle mkdir race for diskwriter
- walk: avoid stat()'ing files unnecessarily
- ci: fix freebsd workflow
- update to Go 1.20
full diff: fb433841cb...36ef4d8c0d
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
insecure-registries supports using CIDR notation, however, buildkit in
moby was not respecting these. We can update the RegistryHosts function
to support this by inserting the correct host into the lookup map if
it's explicitly marked as insecure.
Signed-off-by: Justin Chadwell <me@jedevc.com>
The RegistryHosts lookup function is used by both BuildKit and by the
containerd snapshotter. However, this function differs in behaviour from
the config parser for the RegistryConfig:
- The protocol for insecure registries is treated as significant by
RegistryHosts, while the RegistryConfig strips this information.
- RegistryConfig validates and deduplicates mirrors.
- RegistryConfig does not parse the insecure-registries as URLs, which
can lead to parsing opaque URLs as was possible by the RegistryHosts
function.
This patch updates the lookup function to ensure consistency.
Signed-off-by: Justin Chadwell <me@jedevc.com>
- remove some intermediate variables
- explicitly return "nil" if there's no error
- remove redundant check for response-headers being nil
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The driver-configurations are only set when creating a new controller,
using the `config.OptionDriverConfig()` option that can be passed to
`New()`, and used as "read-only" after that.
Taking away any other paths that set these options, the only type used
for per-driver options are a `map[string]interface{}`, so we can change
the type from `map[string]interface{}` to a `map[string]map[string]interface{}`,
(or its "modern" variant: `map[string]map[string]any`), so that it's
no longer needed to cast the type before use.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't fail early if we can still test more, and be slightly more strict
in what error we're looking for.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test already creates instances for each ip-version, so let's
re-use them. While changing, also use assert.Check to not fail early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
New() allows for driver-options to be passed using the config.OptionDriverConfig.
Update the test to not manually mutate the controller's configuration after
creating.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Not critical, but when used from ChainInfo, we had to construct an IPTable
based on the version of the ChainInfo, which then only used the version
we passed to get the right loopback.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make some variables local to the if-branches to be slightly more iodiomatic,
and to make clear it's only used in that branch.
Move the bestEffortLock locking later in IPtable.raw(), because that function'
could return before the lock was even needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that all consumers of these functions are passing non-empty values,
let's validate that no empty strings for either chain or table are passed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's only used internally, and it was last used in commit:
0220b06cd6
But moved into the iptables package in this commit:
998f3ce22c
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was not used for "Config", but for Networks and Endpoints.
Having this utility made it look like more than it was, and the related
test was effectively testing stdlib.
Abstracting the validation also was hiding that, while validation does
not allow "empty" names, it happily allows leading/trailing whitespace,
and does not remove that before creating networks or endpoints;
docker network create "bridge "
docker network create "bridge "
docker network create "bridge "
docker network create " bridge "
docker network create " bridge "
docker network create " bridge"
docker network ls --filter driver=bridge
NETWORK ID NAME DRIVER SCOPE
d4d53210f185 bridge bridge local
e9afba0d99de bridge bridge local
69fb7a7ba67c bridge bridge local
a452bf065403 bridge bridge local
49d96c59061d bridge bridge local
8eae1c4be12c bridge bridge local
86dd65b881b9 bridge bridge local
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the NewGeneric utility as it was not used anywhere, except for
in tests.
Also "modernize" the type, and use `any` instead of `interface{}`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Use a more modern approach to check error-types
- Touch-up grammar of the error-message
- Remove redundant "nil" check for errors, as it's never nil at that point.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
tlsconfig.Client() does various things, including reading certs and
checking them. So we may as well return early if we're not gonna be
able to use the config.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- don't use un-keyed structs
- use http consts where possible
- use errors.As instead of manually checking the error-type
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use Client.buildRequest instead of a local copy of the same logic so
that we're using the same logic, and there's less chance of diverging.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.20.6 (released 2023-07-11) includes a security fix to the net/http package,
as well as bug fixes to the compiler, cgo, the cover tool, the go command,
the runtime, and the crypto/ecdsa, go/build, go/printer, net/mail, and text/template
packages. See the Go 1.20.6 milestone on our issue tracker for details.
https://github.com/golang/go/issues?q=milestone%3AGo1.20.6+label%3ACherryPickApproved
Full diff: https://github.com/golang/go/compare/go1.20.5...go1.20.6
These minor releases include 1 security fixes following the security policy:
net/http: insufficient sanitization of Host header
The HTTP/1 client did not fully validate the contents of the Host header.
A maliciously crafted Host header could inject additional headers or entire
requests. The HTTP/1 client now refuses to send requests containing an
invalid Request.Host or Request.URL.Host value.
Thanks to Bartek Nowotarski for reporting this issue.
Includes security fixes for [CVE-2023-29406 ][1] and Go issue https://go.dev/issue/60374
[1]: https://github.com/advisories/GHSA-f8f7-69v5-w4vx
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For local communications (npipe://, unix://), the hostname is not used,
but we need valid and meaningful hostname.
The current code used the socket path as hostname, which gets rejected by
go1.20.6 and go1.19.11 because of a security fix for [CVE-2023-29406 ][1],
which was implemented in https://go.dev/issue/60374.
Prior versions go Go would clean the host header, and strip slashes in the
process, but go1.20.6 and go1.19.11 no longer do, and reject the host
header.
Before this patch, tests would fail on go1.20.6:
=== FAIL: pkg/authorization TestAuthZRequestPlugin (15.01s)
time="2023-07-12T12:53:45Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 1s"
time="2023-07-12T12:53:46Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 2s"
time="2023-07-12T12:53:48Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 4s"
time="2023-07-12T12:53:52Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 8s"
authz_unix_test.go:82: Failed to authorize request Post "http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq": http: invalid Host header
[1]: https://github.com/advisories/GHSA-f8f7-69v5-w4vx
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For local communications (npipe://, unix://), the hostname is not used,
but we need valid and meaningful hostname.
The current code used the client's `addr` as hostname in some cases, which
could contain the path for the unix-socket (`/var/run/docker.sock`), which
gets rejected by go1.20.6 and go1.19.11 because of a security fix for
[CVE-2023-29406 ][1], which was implemented in https://go.dev/issue/60374.
Prior versions go Go would clean the host header, and strip slashes in the
process, but go1.20.6 and go1.19.11 no longer do, and reject the host
header.
This patch introduces a `DummyHost` const, and uses this dummy host for
cases where we don't need an actual hostname.
Before this patch (using go1.20.6):
make GO_VERSION=1.20.6 TEST_FILTER=TestAttach test-integration
=== RUN TestAttachWithTTY
attach_test.go:46: assertion failed: error is not nil: http: invalid Host header
--- FAIL: TestAttachWithTTY (0.11s)
=== RUN TestAttachWithoutTTy
attach_test.go:46: assertion failed: error is not nil: http: invalid Host header
--- FAIL: TestAttachWithoutTTy (0.02s)
FAIL
With this patch applied:
make GO_VERSION=1.20.6 TEST_FILTER=TestAttach test-integration
INFO: Testing against a local daemon
=== RUN TestAttachWithTTY
--- PASS: TestAttachWithTTY (0.12s)
=== RUN TestAttachWithoutTTy
--- PASS: TestAttachWithoutTTy (0.02s)
PASS
[1]: https://github.com/advisories/GHSA-f8f7-69v5-w4vx
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Combine TestAttachWithTTY and TestAttachWithoutTTy to a single test using sub-tests
- Set up and tear-down the test-environment once
- Remove redundant client.ContainerRemove, as it's taken care of by testEnv.Clean()
- Run both tests in parallel
make TEST_FILTER=TestAttach DOCKER_GRAPHDRIVER=overlay2 TESTDEBUG=1 test-integration
Loaded image: busybox:latest
Loaded image: busybox:glibc
Loaded image: debian:bullseye-slim
Loaded image: hello-world:latest
Loaded image: arm32v7/hello-world:latest
INFO: Testing against a local daemon
=== RUN TestAttach
=== RUN TestAttach/without_TTY
=== PAUSE TestAttach/without_TTY
=== RUN TestAttach/with_TTY
=== PAUSE TestAttach/with_TTY
=== CONT TestAttach/without_TTY
=== CONT TestAttach/with_TTY
--- PASS: TestAttach (0.00s)
--- PASS: TestAttach/without_TTY (0.03s)
--- PASS: TestAttach/with_TTY (0.03s)
PASS
DONE 3 tests in 1.347s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Calling function returned from setupTest (which calls testEnv.Clean) in
a defer block inside a test that spawns parallel subtests caused the
cleanup function to be called before any of the subtest did anything.
Change the defer expressions to use `t.Cleanup` instead to call it only
after all subtests have also finished.
This only changes tests which have parallel subtests.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- use assert.Check() where possible to not fail early
- improve checks for error-types
- rename "testURL" var to be more descriptive, and use a const
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests are testing timeouts and take a long time to run. Run the tests
in parallel, so that the test-suite takes shorter to run.
Before:
ok github.com/docker/docker/pkg/plugins 34.013s
After:
ok github.com/docker/docker/pkg/plugins 17.945s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Refactor setupRemotePluginServer() to be a helper, and to spin up a test-
server for each test instead of sharing the same instance between tests.
This allows the tests to be run in parallel without stepping on each-other's
toes (tearing down the server).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's convenient to have in the dev container when debugging issues which
reproduce consistently when deploying containers through compose.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We can't upload the same file in a matrix so generate
metadata in prepare job instead. Also fixes wrong bake meta
file in merge job.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Use http.Header, which is more descriptive on intent, and we're already
importing the package in the client. Removing the "header" type also fixes
various locations where the type was shadowed by local variables named
"headers".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "version" header was added in c0afd9c873,
but used the wrong information to get the API version.
This issue was fixed in a9d20916c3, which switched
the API handler code to get the API version from the context. That change is part
of Docker Engine 20.10 (API v1.30 and up)
This patch updates the code to only set the header on APi v1.29 and older, as it's
not used by newer API versions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
With this change, the API will now return a 403 instead of a 500 when
trying to create an overlay network on a non-manager node.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The commit befff0e13f inadvertendly
disabled the error returned when trying to create an overlay network on
a node which is not part of a Swarm cluster.
Since commit e3708a89cc the overlay
netdriver returns the error: `no VNI provided`.
This commit reinstate the original error message by checking if the node
is a manager before calling libnetwork's `controller.NewNetwork()`.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This patch contains some optimizations I still had stashed when working
on eaa9494b71.
- Use the bytes package for handling the output of "lsof", instead of
converting to a string.
- Count the number of newlines in the output, instead of splitting the
output into a slice of strings. We're only interested in the number
of lines in the output.
- Use lsof's -F option to only print the file-descriptor for each line,
as we don't need other information.
- Use the -l, -n, and -P options to omit converting usernames, host names,
and port numbers.
From the [LSOF(8)][1] man-page:
-l This option inhibits the conversion of user ID numbers to
login names. It is also useful when login name lookup is
working improperly or slowly.
-n This option inhibits the conversion of network numbers to host
names for network files. Inhibiting conversion can make lsof run faster.
It is also useful when host name lookup is not working properly.
-P This option inhibits the conversion of port numbers to port names for network files.
Inhibiting the conversion can make lsof run a little faster.
It is also useful when host name lookup is not working properly.
Output looks something like;
lsof -lnP -Ff -p 39849
p39849
fcwd
ftxt
ftxt
f0
f1
f2
f3
f4
f5
f6
f7
f8
f9
f10
f11
Before/After:
BenchmarkGetTotalUsedFds-10 122 9479384 ns/op 10816 B/op 63 allocs/op
BenchmarkGetTotalUsedFds-10 154 7814697 ns/op 7257 B/op 60 allocs/op
[1]: https://opensource.apple.com/source/lsof/lsof-49/lsof/lsof.man.auto.html
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- return a errdefs.System if we fail to decode the registry's response
- use strconv.Itoa instead of fmt.Sprintf
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon sleeps for 15 seconds at start up when the API binds to a TCP
socket with no TLS certificate set. That's what the hack/make/run script
does, but it doesn't explicitly disable tls, thus we're experiencing
this annoying delay every time we use this script.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Golang map iteration order is not guaranteed, so in some cases the built slice has it's output of order as well. This means that testing for exact warning messages in docker build output would result in random test failures, making it more annoying for end-users to test against this functionality.
Signed-off-by: Jose Diaz-Gonzalez <email@josediazgonzalez.com>
...that Swarmkit no longer needs now that it has been migrated to use
the new-style driver registration APIs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The only remaining user is Swarmkit, which now has its own private copy
of the package tailored to its needs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The daemon.lazyInitializeVolume() function only handles restoring Volumes
if a Driver is specified. The Container's MountPoints field may also
contain other kind of mounts (e.g., bind-mounts). Those were ignored, and
don't return an error; 1d9c8619cd/daemon/volumes.go (L243-L252C2)
However, the prepareMountPoints() assumed each MountPoint was a volume,
and logged an informational message about the volume being restored;
1d9c8619cd/daemon/mounts.go (L18-L25)
This would panic if the MountPoint was not a volume;
github.com/docker/docker/daemon.(*Daemon).prepareMountPoints(0xc00054b7b8?, 0xc0007c2500)
/root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/mounts.go:24 +0x1c0
github.com/docker/docker/daemon.(*Daemon).restore.func5(0xc0007c2500, 0x0?)
/root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/daemon.go:552 +0x271
created by github.com/docker/docker/daemon.(*Daemon).restore
/root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/daemon.go:530 +0x8d8
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x564e9be4c7c0]
This issue was introduced in 647c2a6cdd
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently moby drops ep sets before the entrypoint is executed.
This does mean that with combination of no-new-privileges the
file capabilities stops working with non-root containers.
This is undesired as the usability of such containers is harmed
comparing to running root containers.
This commit therefore sets the effective/permitted set in order
to allow use of file capabilities or libcap(3)/prctl(2) respectively
with combination of no-new-privileges and without respectively.
For no-new-privileges the container will be able to obtain capabilities
that are requested.
Signed-off-by: Luboslav Pivarc <lpivarc@redhat.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
...which ignore the config argument. Notably, none of the network
drivers referenced by Swarmkit use config, which is good as Swarmkit
unconditionally passes nil for the config when registering drivers.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Albin is currently a curator, has been contributing for various years prior
to that, and has taken on the daunting task to work on Moby's networking stack.
Albin would be a great addition to our list of maintainers and to allow him
to perform his work in these areas in a more official capacity.
I nominated Albin as maintainer, and votes passed, so opening a PR to
make it official.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Kevin is a maintainer for BuildKit, Buildx, and Docker's official GitHub
actions (among others), has been our "in-house GitHub actions expert"
for a long time, and has made significant contributions to the integration
with BuildKit, and to improve our build pipeline(s).
Kevin would be a great addition to our list of maintainers and to allow him
to perform his work in these areas in a more official capacity.
I nominated Kevin as maintainer, and votes passed, so opening a PR to
make it official.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Laura has done significant work on the containerd integration, helping
triage and fixing bugs, both in this repository, containerd, and the
docker CLI, and would make a great addition to our list of maintainers.
I nominated Laura as maintainer, and votes passed, so opening a PR to
make it official.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds an additional interval to be used by healthchecks during the
start period.
Typically when a container is just starting you want to check if it is
ready more quickly than a typical healthcheck might run. Without this
users have to balance between running healthchecks to frequently vs
taking a very long time to mark a container as healthy for the first
time.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fixes a case where on Docker For Mac if you need to bind mount the
bundles dir (e.g. to get test results back).
The unix socket does not work over oxsfs, so instead we put it in a
tmpfs.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
1. On failed start tail the daemon logs
2. Exposes generic tailing functions to make test debugging simpler
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
setupBridgeNetFiltering:
- Indicate that the bridgeInterface argument is unused (but it's needed
to satisfy the signature).
- Return instead of nullifying the err. Still not great, but I thought it
was very slightly more logical thing to do.
checkBridgeNetFiltering:
- Remove unused argument, and scope ipVerName to the branch where it's
used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
initConnection was effectively just part of the constructor; ot was not
used elsewhere. Merge the two functions to simplify things a bit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This const was added in 8301dcc6d7, before
being moved to libnetwork, and moved back, but it was never used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove local bridgeName variable that shadowed the const, but
used the same value
- remove some redundant `var` declarations, and changed fixed
values to a const
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
None of these errors were string-matched anywhere, so let's change them
to be non-capitalized, as they should.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
looks like this error was added in 1cbdaebaa1,
and later moved to libnetwork in 44c96449c2
which also updated the description to something that doesn't match what
it means.
In either case, this error was never used as a special / sentinel error,
so we can just use a regular error return.
While at it, I also lower-cased the error-message; it's not string-matched
anywhere, so we can update it to make linters more happy.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- validate input variables before constructing the ChainInfo
- only construct the ChainInfo if things were successful
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Clarify that the argument to New is an exclusive upper bound.
Correct the documentation for SetAnyInRange: the end argument is
inclusive rather than exclusive.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The idm package wraps bitseq.Handle to provide an offset and
synchronization. bitseq.Handle wraps bitmap.Bitmap to provide
persistence in a datastore. As no datastore is passed and the offset is
zero, the idm.Idm instance is nothing more than a concurrency-safe
wrapper around a bitmap.Bitmap with differently-named methods. Switch
over to using bitmap.Bitmap directly, using the ovmanager driver's mutex
for concurrency control.
Hold the driver mutex for the entire duration that VXLANs are being
assigned to the new network. This makes allocating VXLANs for a network
an atomic operation.
Signed-off-by: Cory Snider <csnider@mirantis.com>
In the network.obtainVxlanID() method, the mutex only guards a local
variable and a function argument. Locking is therefore unnecessary.
The network.releaseVxlanID() method is only called in two contexts:
driver.NetworkAllocate(), where the network struct is a local variable
and network.releaseVxlanID() is only called in failure code-paths in
which the network does not escape; and driver.NetworkFree(), while the
driver mutex is held. Locking is therefore unnecessary.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Multiple daemons starting/running concurrently can collide with each
other when editing iptables rules. Most integration tests which opt into
parallelism and start daemons work around this problem by starting the
daemon with the --iptables=false option. However, some of the tests
neglect to pass the option when starting or restarting the daemon,
resulting in those tests being flaky.
Audit the integration tests which call t.Parallel() and (*Daemon).Stop()
and add --iptables=false arguments where needed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestClientWithRequestTimeout has been observed to flake in CI. The
timing in the test is quite tight, only giving the client a 10ms window
to time out, which could potentially be missed if the host is under
load and the goroutine scheduling is unlucky. Give the client a full
five seconds of grace to time out before failing the test.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Now that the MTU field was moved, this function only needs the BridgeConfig,
which contains all options for the default "bridge" network.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option is only used for the default bridge network; let's move the
field to that struct to make it clearer what it's used for.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The --mtu option is only used for the default "bridge" network on Linux.
On Windows, the flag is available, but ignored. As this option has been
available for a long time, and was always silently ignored, deprecating
or removing it would be a breaking change (and perhaps it's possible to
support it in future).
This patch:
- hides the option on Windows binaries
- logs a warning if the option is set to any non-zero value other than
the default on a Windows binary
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The OptionLocalKVProvider, OptionLocalKVProviderURL, and OptionLocalKVProviderConfig
options were only used in tests, so un-export them, and move them to the
test-files.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was implemented in dd4950f36d
which added a "key" field, but that field was never used anywhere, and
still appears unused.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `store.Watch()` was only used in `Controller.processEndpointCreate()`,
and skipped if the store was not "watchable" (`store.Watchable()`).
Whether a store is watchable depends on the store's datastore.scope;
local stores are not watchable;
func (ds *datastore) Watchable() bool {
return ds.scope != LocalScope
}
datastore is only initialized in two locations, and both locations set the
scope field to LocalScope:
datastore.newClient() (also called by datastore.NewDataStore()):
3e4c9d90cf/libnetwork/datastore/datastore.go (L213)
datastore.NewTestDataStore() (used in tests);
3e4c9d90cf/libnetwork/datastore/datastore_test.go (L14-L17)
Furthermore, the backing BoltDB kvstore does not implement the Watch()
method;
3e4c9d90cf/libnetwork/internal/kvstore/boltdb/boltdb.go (L464-L467)
Based on the above;
- our datastore is never Watchable()
- so datastore.Watch() is never used
This patch removes the Watchable(), Watch(), and RestartWatch() functions,
as well as the code handling watching.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The sequential field determined whether a lock was needed when storing
and retrieving data. This field was always set to true, with the exception
of NewTestDataStore() in the tests.
This field was added in a18e2f9965
to make locking optional for non-local scoped stores. Such stores are no
longer used, so we can remove this field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the code slightly more idiomatic; remove some "var" declarations,
remove some intermediate variables and redundant error-checks, and remove
the filePerm const.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This boolean was not used anywhere, so we can remove it. Also cleaning up
the implementation a bit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The WriteOptions struct was only used to set the "IsDir" option. This option
was added in d635a8e32b
and was only supported by the etcd libkv store.
The BoltDB store does not support this option, making the WriteOptions
struct fully redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The only remaining kvstore is BoltDB, which doesn't use TLS connections
or authentication, so we can remove these options.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use string-literal for reduce escaped quotes, which makes for easier grepping.
While at it, also changed http -> https to keep some linters at bay.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prevent potential suggestion when many concurrent requests happen on
the /info endpoint. It's worth noting that with this change,
requests to the endpoint while another request is still in flight
will share the results, hence might be slightly incorrect (for example,
the output includes SystemTime, which may now be incorrect).
Assuming that under normal circumstances, requests will still
happen fast enough to not be shared, this may not be a problem,
but we could decide to update specific fields to not be shared.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Got a linter warning on this one, and I don't think eventFilter() was
intentionally using a value (not pointer).
> Struct containerConfig has methods on both value and pointer receivers.
> Such usage is not recommended by the Go Documentation
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These aliases were not needed, and only used in a couple of places,
which made it inconsistent, so let's use the import without aliasing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
They're not used anywhere, so let's remove them; better to have
a compile error than a panic at runtime.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Add the field as a "deprecated" field in the API type.
- Don't error when failing to parse the options, but produce a warning
instead, because the client won't be able to fix issues in the daemon
configuration. This was unlikely to happen, as the daemon probably
would fail to start with an invalid config, but just in case.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was added in 8cb2229cd1 for
API version 1.28, but forgot to update the documentation and version
history.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field's documentation was still referring to the Swarm V1 API, which
is deprecated, and the link redirects to SwarmKit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ClusterStore` and `ClusterAdvertise` fields were deprecated in commit
616e64b42f (and would no longer be included in
the `/info` API response), and were fully removed in 24.0.0 through commit
68bf777ece
This patch removes the fields from the swagger file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ClusterStore` and `ClusterAdvertise` fields were deprecated in commit
616e64b42f (and would no longer be included in
the `/info` API response), and were fully removed in 24.0.0 through commit
68bf777ece
This patch removes the fields from the swagger file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ClusterStore` and `ClusterAdvertise` fields were deprecated in commit
616e64b42f (and would no longer be included in
the `/info` API response), and were fully removed in 24.0.0 through commit
68bf777ece
This patch removes the fields from the swagger file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is an indirect dependency for github.com/fluent/fluent-logger-golang,
which does not yet use a go.mod. Update the dependency to the latest patch
release, which contains some fixes, and updates for newer go versions;
full diff: https://github.com/tinylib/msgp/compare/v1.1.6...v1.1.8
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/cgroups/compare/v3.0.1...v3.0.2
relevant changes:
- cgroup2: only enable the cpuset controller if cpus or mems is specified
- cgroup1 delete: proceed to the next subsystem when a cgroup is not found
- Cgroup2: Reduce allocations for manager.Stat
- Improve performance by for pid stats (cgroups1) re-using readuint
- Reduce allocs in ReadUint64 by pre-allocating byte buffer
- cgroup2: rm/simplify some code
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If an image is only by id instead of its name, don't prune it
completely. but only untag it and create a dangling image for it.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This field was added in f0e6e135a8, and
from that change I suspect it was intended to store the default SELinux
mount-labels to be set on containers.
However, it was never used, so let's remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the intermediate variable, and move the option closer
to where it's used, as in some cases we created the variable,
but could return with an error before it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was not using the DriverCallback interface, and only
required the Registerer interface.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
CI failed sometimes if no daemon.json was present:
Run sudo rm /etc/docker/daemon.json
sudo rm /etc/docker/daemon.json
sudo service docker restart
docker version
docker info
shell: /usr/bin/bash -e {0}
env:
DESTDIR: ./build
BUILDKIT_REPO: moby/buildkit
BUILDKIT_TEST_DISABLE_FEATURES: cache_backend_azblob,cache_backend_s3,merge_diff
BUILDKIT_REF: 798ad6b0ce9f2fe86dfb2b0277e6770d0b545871
rm: cannot remove '/etc/docker/daemon.json': No such file or directory
Error: Process completed with exit code 1.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Linux 6.2 and up (commit [f1f1f2569901ec5b9d425f2e91c09a0e320768f3][1])
provides a fast path for the number of open files for the process.
From the [Linux docs][2]:
> The number of open files for the process is stored in 'size' member of
> `stat()` output for /proc/<pid>/fd for fast access.
[1]: f1f1f25699
[2]: https://docs.kernel.org/filesystems/proc.html#proc-pid-fd-list-of-symlinks-to-open-files
This patch adds a fast-path for Kernels that support this, and falls back
to the slow path if the Size fields is zero.
Comparing on a Fedora 38 (kernel 6.2.9-300.fc38.x86_64):
Before/After:
go test -bench ^BenchmarkGetTotalUsedFds$ -run ^$ ./pkg/fileutils/
BenchmarkGetTotalUsedFds 57264 18595 ns/op 408 B/op 10 allocs/op
BenchmarkGetTotalUsedFds 370392 3271 ns/op 40 B/op 3 allocs/op
Note that the slow path has 1 more file-descriptor, due to the open
file-handle for /proc/<pid>/fd during the calculation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use File.Readdirnames instead of os.ReadDir, as we're only interested in
the number of files, and results don't have to be sorted.
Before:
BenchmarkGetTotalUsedFds-5 149272 7896 ns/op 945 B/op 20 allocs/op
After:
BenchmarkGetTotalUsedFds-5 153517 7644 ns/op 408 B/op 10 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 8d56108ffb moved this function from
the generic (no build-tags) fileutils.go to a unix file, adding "freebsd"
to the build-tags.
This likely was a wrong assumption (as other files had freebsd build-tags).
FreeBSD's procfs does not mention `/proc/<pid>/fd` in the manpage, and
we don't test FreeBSD in CI, so let's drop it, and make this a Linux-only
file.
While updating also dropping the import-tag, as we're planning to move
this file internal to the daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update docker to support a '--log-format' option, which accepts either
'text' (default) or 'json'. Propagate the log format to containerd as
well, to ensure that everything will be logged consistently.
Signed-off-by: Philip K. Warren <pkwarren@gmail.com>
When live-restoring a container the volume driver needs be notified that
there is an active mount for the volume.
Before this change the count is zero until the container stops and the
uint64 overflows pretty much making it so the volume can never be
removed until another daemon restart.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
I think this may be missing a sudo (as all other operations do use
sudo to access daemon.json);
Run if [ ! -e /etc/docker/daemon.json ]; then
if [ ! -e /etc/docker/daemon.json ]; then
echo '{}' | tee /etc/docker/daemon.json >/dev/null
fi
DOCKERD_CONFIG=$(jq '.+{"experimental":true,"live-restore":true,"ipv6":true,"fixed-cidr-v6":"2001:db8:1::/64"}' /etc/docker/daemon.json)
sudo tee /etc/docker/daemon.json <<<"$DOCKERD_CONFIG" >/dev/null
sudo service docker restart
shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
env:
GO_VERSION: 1.20.5
GOTESTLIST_VERSION: v0.3.1
TESTSTAT_VERSION: v0.1.3
ITG_CLI_MATRIX_SIZE: 6
DOCKER_EXPERIMENTAL: 1
DOCKER_GRAPHDRIVER: overlay2
tee: /etc/docker/daemon.json: Permission denied
Error: Process completed with exit code 1.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Dockerfile in this repository performs many stages in parallel. If any of
those stages fails to build (which could be due to networking congestion),
other stages are also (forcibly?) terminated, which can cause an unclean
shutdown.
In some case, this can cause `git` to be terminated, leaving a `.lock` file
behind in the cache mount. Retrying the build now will fail, and the only
workaround is to clean the build-cache (which causes many stages to be
built again, potentially triggering the problem again).
> [dockercli-integration 3/3] RUN --mount=type=cache,id=dockercli-integration-git-linux/arm64/v8,target=./.git --mount=type=cache,target=/root/.cache/go-build,id=dockercli-integration-build-linux/arm64/v8 /download-or-build-cli.sh v17.06.2-ce https://github.com/docker/cli.git /build:
#0 1.575 fatal: Unable to create '/go/src/github.com/docker/cli/.git/shallow.lock': File exists.
#0 1.575
#0 1.575 Another git process seems to be running in this repository, e.g.
#0 1.575 an editor opened by 'git commit'. Please make sure all processes
#0 1.575 are terminated then try again. If it still fails, a git process
#0 1.575 may have crashed in this repository earlier:
#0 1.575 remove the file manually to continue.
This patch:
- Updates the Dockerfile to remove `.lock` files (`shallow.lock`, `index.lock`)
that may have been left behind from previous builds. I put this code in the
Dockerfile itself (not the script), as the script may be used in other
situations outside of the Dockerfile (for which we cannot guarantee no other
git session is active).
- Adds a `docker --version` step to the stage; this is mostly to verify the
build was successful (and to be consistent with other stages).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
re-enable the DCO check, which was temporarily disabled to migrate
old commits from github.com/docker/libkv
This reverts commit 7d7225fae6.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Ideally, this should actually do a lookup across images that have no parent, but I wasn't 100% sure how to accomplish that so I opted for the smaller change of having `FROM scratch` builds not be cached for now.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
The migrated history has some commits that missed a DCO:
These commits do not have a proper 'Signed-off-by:' marker:
- 3fa22634a617e2c52d2c5f061826e5107e27985f
- 9b11053e9147884c43c9a9d8ebfcd7bb9470e8b5
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A reduced set of the dependency, only taking the parts that are used. Taken from
upstream commit: dfacc563de
# install filter-repo (https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
brew install git-filter-repo
cd ~/projects
# create a temporary clone of docker
git clone https://github.com/docker/libkv.git temp_libkv
cd temp_libkv
# create branch to work with
git checkout -b migrate_libkv
# remove all code, except for the files we need; rename the remaining ones to their new target location
git filter-repo --force \
--path libkv.go \
--path store/store.go \
--path store/boltdb/boltdb.go \
--path-rename libkv.go:libnetwork/internal/kvstore/kvstore_manage.go \
--path-rename store/store.go:libnetwork/internal/kvstore/kvstore.go \
--path-rename store/boltdb/:libnetwork/internal/kvstore/boltdb/
# go to the target github.com/moby/moby repository
cd ~/projects/docker
# create a branch to work with
git checkout -b integrate_libkv
# add the temporary repository as an upstream and make sure it's up-to-date
git remote add temp_libkv ~/projects/temp_libkv
git fetch temp_libkv
# merge the upstream code, rewriting "pkg/symlink" to "symlink"
git merge --allow-unrelated-histories --signoff -S temp_libkv/migrate_libkv
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was only used in tests, and internally, and no longer
used since we switch to using os.UserHomeDir() from stdlib.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was last used in the pkg/mflag package, which was removed
in 14712f9ff0, and is no longer used in
libnetwork code since e6de8aec2f
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was added in f0e5b3d7d8 to
account for older versions of the engine (Docker EE LTS versions), which
did not yet provide the OSType field in Docker info, and had to be manually
set using the TEST_OSTYPE env-var.
This patch removes the field in favor of the equivalent in DaemonInfo. It's
more verbose, but also less ambiguous what information we're using (i.e.,
the platform the daemon is running on, not the local platform).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This env-var was added in f0e5b3d7d8 to
account for older versions of the engine (Docker EE LTS versions), which
did not yet provide the OSType field in Docker info.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before 4bafaa00aa, if the daemon was
killed while a container was running and the container shim is killed
before the daemon is restarted, such as if the host system is
hard-rebooted, the daemon would restore the container to the stopped
state and set the exit code to 255. The aforementioned commit introduced
a regression where the container's exit code would instead be set to 0.
Fix the regression so that the exit code is once against set to 255 on
restore.
Signed-off-by: Cory Snider <csnider@mirantis.com>
If an exec fails to start in such a way that containerd publishes an
exit event for it, daemon.ProcessEvent will race
daemon.ContainerExecStart in handling the failure. This race has been a
long-standing bug, which was mostly harmless until
4bafaa00aa. After that change, the daemon
would dereference a nil pointer and crash if ProcessEvent won the race.
Restore the status quo buggy behaviour by adding a check to skip the
dereference if execConfig.Process is nil.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We missed a case when parsing extra hosts from the dockerfile
frontend so the build fails.
To handle this case we need to set a dedicated worker label
that contains the host gateway IP so clients like Buildx
can just set the proper host:ip when parsing extra hosts
that contain the special string "host-gateway".
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
The stargz snapshotter cannot be re-mounted, so the reference-counted
path must be used.
Co-authored-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Commit 90de570cfa passed through the request
context to daemon.ContainerStop(). As a result, cancelling the context would
cancel the "graceful" stop of the container, and would proceed with forcefully
killing the container.
This patch partially reverts the changes from 90de570cfa
and breaks the context to prevent cancelling the context from cancelling the stop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The official Python images on Docker Hub switched to debian bookworm,
which is now the current stable version of Debian.
However, the location of the apt repository config file changed, which
causes the Dockerfile build to fail;
Loaded image: emptyfs:latest
Loaded image ID: sha256:0df1207206e5288f4a989a2f13d1f5b3c4e70467702c1d5d21dfc9f002b7bd43
INFO: Building docker-sdk-python3:5.0.3...
tests/Dockerfile:6
--------------------
5 | ARG APT_MIRROR
6 | >>> RUN sed -ri "s/(httpredir|deb).debian.org/${APT_MIRROR:-deb.debian.org}/g" /etc/apt/sources.list \
7 | >>> && sed -ri "s/(security).debian.org/${APT_MIRROR:-security.debian.org}/g" /etc/apt/sources.list
8 |
--------------------
ERROR: failed to solve: process "/bin/sh -c sed -ri \"s/(httpredir|deb).debian.org/${APT_MIRROR:-deb.debian.org}/g\" /etc/apt/sources.list && sed -ri \"s/(security).debian.org/${APT_MIRROR:-security.debian.org}/g\" /etc/apt/sources.list" did not complete successfully: exit code: 2
This needs to be fixed in docker-py, but in the meantime, we can pin to
the bullseye variant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This enables picking up OTLP tracing context for the gRPC
requests.
Also sets up the in-memory recorder that BuildKit History API
can use to store the traces associated with specific build
in a database after build completes.
This doesn't enable Jaeger tracing endpoints from env
but this can be easily enabled by adding another import if
maintainers want it.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
go1.20.5 (released 2023-06-06) includes four security fixes to the cmd/go and
runtime packages, as well as bug fixes to the compiler, the go command, the
runtime, and the crypto/rsa, net, and os packages. See the Go 1.20.5 milestone
on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.5+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.4...go1.20.5
These minor releases include 3 security fixes following the security policy:
- cmd/go: cgo code injection
The go command may generate unexpected code at build time when using cgo. This
may result in unexpected behavior when running a go program which uses cgo.
This may occur when running an untrusted module which contains directories with
newline characters in their names. Modules which are retrieved using the go command,
i.e. via "go get", are not affected (modules retrieved using GOPATH-mode, i.e.
GO111MODULE=off, may be affected).
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-29402 and Go issue https://go.dev/issue/60167.
- runtime: unexpected behavior of setuid/setgid binaries
The Go runtime didn't act any differently when a binary had the setuid/setgid
bit set. On Unix platforms, if a setuid/setgid binary was executed with standard
I/O file descriptors closed, opening any files could result in unexpected
content being read/written with elevated prilieges. Similarly if a setuid/setgid
program was terminated, either via panic or signal, it could leak the contents
of its registers.
Thanks to Vincent Dehors from Synacktiv for reporting this issue.
This is CVE-2023-29403 and Go issue https://go.dev/issue/60272.
- cmd/go: improper sanitization of LDFLAGS
The go command may execute arbitrary code at build time when using cgo. This may
occur when running "go get" on a malicious module, or when running any other
command which builds untrusted code. This is can by triggered by linker flags,
specified via a "#cgo LDFLAGS" directive.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-29404 and CVE-2023-29405 and Go issues https://go.dev/issue/60305 and https://go.dev/issue/60306.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon.generateNewName() already reserves the generated name, but its name
did not indicate it did. The daemon.registerName() assumed that the generated
name still had to be reserved, which could mean it would try to reserve the
same name again.
This patch renames daemon.generateNewName to daemon.generateAndReserveName
to make it clearer what it does, and updates registerName() to return early
if it successfully generated (and registered) the container name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The most notable change here is that the OCI's type uses a pointer for `Created`, which we probably should've been too, so most of these changes are accounting for that (and embedding our `Equal` implementation in the one single place it was used).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
This utility was only used in a single location (as part of `docker info`),
but the `pkg/rootless` package is imported in various locations, causing
rootlesskit to be a dependency for consumers of that package.
Move GetRootlessKitClient to the daemon code, which is the only location
it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For configured runtimes with a runtimeType other than
io.containerd.runc.v1, io.containerd.runc.v2 and
io.containerd.runhcs.v1, the only supported way to pass configuration is
through the generic containerd "runtimeoptions/v1".Options type. Add a
unit test case which verifies that the options set in the daemon config
are correctly unmarshaled into the daemon's in-memory runtime config,
and that the map keys for the daemon config align with the ones used
when configuring cri-containerd (PascalCase, not camelCase or
snake_case).
Signed-off-by: Cory Snider <csnider@mirantis.com>
When constructing the client, and setting the User-Agent, care must be
taken to apply the header in the right location, as custom headers can
be set in the CLI configuration, and merging these custom headers should
not override the User-Agent header.
This patch adds a dedicated `WithUserAgent()` option, which stores the
user-agent separate from other headers, centralizing the merging of
other headers, so that other parts of the (CLI) code don't have to be
concerned with merging them in the right order.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Convert CreateMountpoint, ReadOnlyNonRecursive, and ReadOnlyForceRecursive.
See moby/swarmkit PR 3134
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Some snapshotters (like overlayfs or zfs) can't mount the same
directories twice. For example if the same directroy is used as an upper
directory in two mounts the kernel will output this warning:
overlayfs: upperdir is in-use as upperdir/workdir of another mount, accessing files from both mounts will result in undefined behavior.
And indeed accessing the files from both mounts will result in an "No
such file or directory" error.
This change introduces reference counts for the mounts, if a directory
is already mounted the mount interface will only increment the mount
counter and return the mount target effectively making sure that the
filesystem doesn't end up in an undefined behavior.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Audit the OCI spec options used for Linux containers to ensure they are
less order-dependent. Ensure they don't assume that any pointer fields
are non-nil and that they don't unintentionally clobber mutations to the
spec applied by other options.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Many of the fields in LinuxResources struct are pointers to scalars for
some reason, presumably to differentiate between set-to-zero and unset
when unmarshaling from JSON, despite zero being outside the acceptable
range for the corresponding kernel tunables. When creating the OCI spec
for a container, the daemon sets the container's OCI spec CPUShares and
BlkioWeight parameters to zero when the corresponding Docker container
configuration values are zero, signifying unset, despite the minimum
acceptable value for CPUShares being two, and BlkioWeight ten. This has
gone unnoticed as runC does not distingiush set-to-zero from unset as it
also uses zero internally to represent unset for those fields. However,
kata-containers v3.2.0-alpha.3 tries to apply the explicit-zero resource
parameters to the container, exactly as instructed, and fails loudly.
The OCI runtime-spec is silent on how the runtime should handle the case
when those parameters are explicitly set to out-of-range values and
kata's behaviour is not unreasonable, so the daemon must therefore be in
the wrong.
Translate unset values in the Docker container's resources HostConfig to
omit the corresponding fields in the container's OCI spec when starting
and updating a container in order to maximize compatibility with
runtimes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Switch to using t.TempDir() instead of rolling our own.
Clean up mounts leaked by the tests as otherwise the tests fail due to
the leaked mounts because unlike the old cleanup code, t.TempDir()
cleanup does not ignore errors from os.RemoveAll.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Avoids invalidation of dev-systemd-true and dev-base when changing the
CLI version/repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't show `error: No such remote: 'origin'` error when building for the
first time and the cached git repository doesn't a remote yet.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Installs the buildx cli plugin in the container shell by default.
Previously user had to manually download the buildx binary to use
buildkit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use separate cli for integration-cli to allow use newer CLI for
interactive dev shell usage.
Both versions can be overriden with DOCKERCLI_VERSION or
DOCKERCLI_INTEGRATION_VERSION. Binary is downloaded from
download.docker.com if it's available, otherwise it's built from the
source.
For backwards compatibility DOCKER_CLI_PATH overrides BOTH clis.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
I missed the most important COPY in 637ca59375
Copying the source code into the dev-container does not depend on the parent
layers, so can use the --link option as well.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon has made a habit of mutating the DefaultRuntime and Runtimes
values in the Config struct to merge defaults. This would be fine if it
was a part of the regular configuration loading and merging process,
as is done with other config options. The trouble is it does so in
surprising places, such as in functions with 'verify' or 'validate' in
their name. It has been necessary in order to validate that the user has
not defined a custom runtime named "runc" which would shadow the
built-in runtime of the same name. Other daemon code depends on the
runtime named "runc" always being defined in the config, but merging it
with the user config at the same time as the other defaults are merged
would trip the validation. The root of the issue is that the daemon has
used the same config values for both validating the daemon runtime
configuration as supplied by the user and for keeping track of which
runtimes have been set up by the daemon. Now that a completely separate
value is used for the latter purpose, surprising contortions are no
longer required to make the validation work as intended.
Consolidate the validation of the runtimes config and merging of the
built-in runtimes into the daemon.setupRuntimes() function. Set the
result of merging the built-in runtimes config and default default
runtime on the returned runtimes struct, without back-propagating it
onto the config.Config argument.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.
Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.
Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.
Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Config reloading has interleaved validations and other fallible
operations with mutating the live daemon configuration. The daemon
configuration could be left in a partially-reloaded state if any of the
operations returns an error. Mutating a copy of the configuration and
atomically swapping the config struct on success is not currently an
option as config values are not copyable due to the presence of
sync.Mutex fields. Introduce a two-phase commit protocol to defer any
mutations of the daemon state until after all fallible operations have
succeeded.
Reload transactions are not yet entirely hermetic. The platform
reloading logic for custom runtimes on *nix could still leave the
directory of generated runtime wrapper scripts in an indeterminate state
if an error is encountered.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Historically, daemon.RegistryHosts() has returned a docker.RegistryHosts
callback function which closes over a point-in-time snapshot of the
daemon configuration. When constructing the BuildKit builder at daemon
startup, the return value of daemon.RegistryHosts() has been used.
Therefore the BuildKit builder would use the registry configuration as
it was at daemon startup for the life of the process, even if the
registry configuration is changed and the configuration reloaded.
Provide BuildKit with a RegistryHosts callback which reflects the
live daemon configuration after reloads so that registry operations
performed by BuildKit always use the same configuration as the rest of
the daemon.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Passing around a bare pointer to the map of configured features in order
to propagate to consumers changes to the configuration across reloads is
dangerous. Map operations are not atomic, so concurrently reading from
the map while it is being updated is a data race as there is no
synchronization. Use a getter function to retrieve the current features
map so the features can be retrieved race-free.
Remove the unused features argument from the build router.
Signed-off-by: Cory Snider <csnider@mirantis.com>
With this patch, the user-agent has information about the containerd-client
version and the storage-driver that's used when using the containerd-integration;
time="2023-06-01T11:27:07.959822887Z" level=info msg="listening on [::]:5000" go.version=go1.19.9 instance.id=53590f34-096a-4fd1-9c58-d3b8eb7e5092 service=registry version=2.8.2
...
172.18.0.1 - - [01/Jun/2023:11:30:12 +0000] "HEAD /v2/multifoo/blobs/sha256:c7ec7661263e5e597156f2281d97b160b91af56fa1fd2cc045061c7adac4babd HTTP/1.1" 404 157 "" "docker/dev go/go1.20.4 git-commit/8d67d0c1a8 kernel/5.15.49-linuxkit-pr os/linux arch/arm64 containerd-client/1.6.21+unknown storage-driver/overlayfs UpstreamClient(Docker-Client/24.0.2 \\(linux\\))"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was not really "inserting" anything, just formatting and appending.
Simplify this by changing this in to a `getUpstreamUserAgent()` function
which returns the upstream User-Agent (if any) into a `UpstreamClient()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a const for the characters to escape, instead of implementing
this as a generic escaping function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this, the client would report itself as containerd, and the containerd
version from the containerd go module:
time="2023-06-01T09:43:21.907359755Z" level=info msg="listening on [::]:5000" go.version=go1.19.9 instance.id=67b89d83-eac0-4f85-b36b-b1b18e80bde1 service=registry version=2.8.2
...
172.18.0.1 - - [01/Jun/2023:09:43:33 +0000] "HEAD /v2/multifoo/blobs/sha256:cb269d7c0c1ca22fb5a70342c3ed2196c57a825f94b3f0e5ce3aa8c55baee829 HTTP/1.1" 404 157 "" "containerd/1.6.21+unknown"
With this patch, the user-agent has the docker daemon information;
time="2023-06-01T11:27:07.959822887Z" level=info msg="listening on [::]:5000" go.version=go1.19.9 instance.id=53590f34-096a-4fd1-9c58-d3b8eb7e5092 service=registry version=2.8.2
...
172.18.0.1 - - [01/Jun/2023:11:27:20 +0000] "HEAD /v2/multifoo/blobs/sha256:c7ec7661263e5e597156f2281d97b160b91af56fa1fd2cc045061c7adac4babd HTTP/1.1" 404 157 "" "docker/dev go/go1.20.4 git-commit/8d67d0c1a8 kernel/5.15.49-linuxkit-pr os/linux arch/arm64 UpstreamClient(Docker-Client/24.0.2 \\(linux\\))"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Fix timeouts from very long raft messages
- fix: code optimization
- update dependencies
full diff: 75e92ce14f...01bb7a4139
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Build-cache for the build-stages themselves are already invalidated if the
base-images they're using is updated, and the COPY operations don't depend
on previous steps (as there's no overlap between artifacts copied).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The default implementation of the containerd.Image interface provided by
the containerd operates on the parent index/manifest list of the image
and the platform matcher.
This isn't convenient when a specific manifest is already known and it's
redundant to search the whole index for a manifest that matches the
given platform matcher. It can also result in a different manifest
picked up than expected when multiple manifests with the same platform
are present.
This introduces a walkImageManifests which walks the provided image and
calls a handler with a ImageManifest, which is a simple wrapper that
implements containerd.Image interfaces and performs all containerd.Image
operations against a platform specific manifest instead of the root
manifest list/index.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Resolver.setupIPTable() checks whether it needs to flush or create the
user chains used for NATing container DNS requests by testing for the
existence of the rules which jump to said user chains. Unfortunately it
does so using the IPTable.RawCombinedOutputNative() method, which
returns a non-nil error if the iptables command returns any output even
if the command exits with a zero status code. While that is fine with
iptables-legacy as it prints no output if the rule exists, iptables-nft
v1.8.7 prints some information about the rule. Consequently,
Resolver.setupIPTable() would incorrectly think that the rule does not
exist during container restore and attempt to create it. This happened
work work by coincidence before 8f5a9a741b
because the failure to create the already-existing table would be
ignored and the new NAT rules would be inserted before the stale rules
left in the table from when the container was last started/restored. Now
that failing to create the table is treated as a fatal error, the
incompatibility with iptables-nft is no longer hidden.
Switch to using IPTable.ExistsNative() to test for the existence of the
jump rules as it correctly only checks the iptables command's exit
status without regard for whether it outputs anything.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The method to restore a network namespace takes a collection of
interfaces to restore with the options to apply. The interface names are
structured data, tuples of (SrcName, DstPrefix) but for whatever reason
are being passed into Restore() serialized to strings. A refactor,
f0be4d126d, accidentally broke the
serialization by dropping the delimiter. Rather than fix the
serialization and leave the time-bomb for someone else to trip over,
pass the interface names as structured data.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Switching snapshotter implementations would result in an error when
preparing a snapshot, check that the image is indeed unpacked for the
current snapshot before trying to prepare a snapshot.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This type (as well as TarsumBackup), was used for the experimental --stream
support for the classic builder. This feature was removed in commit
6ca3ec88ae, which also removed uses of
the CachableSource type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Dockerfile.e2e is not used anymore. Integration tests run
through the main Dockerfile.
Also removes the daemon OS/Arch detection script that is not
necessary anymore. It was used to select the Dockerfile based
on the arch like Dockerfile.arm64 but we don't have those
anymore. Was also used to check referenced frozen images
in the Dockerfile.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Adds a Dockerfile and make targets to update and validate
generated files (proto, seccomp default profile)
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
This makes the output of `docker save` fully OCI compliant.
When using the containerd image store, this code is not used. That
exporter will just use containerd's export method and should give us the
output we want for multi-arch images.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In cases where an exec start failed the exec process will be nil even
though the channel to signal that the exec started was closed.
Ideally ExecConfig would get a nice refactor to handle this case better
(ie. it's not started so don't close that channel).
This is a minimal fix to prevent NPE. Luckilly this would only get
called by a client and only the http request goroutine gets the panic
(http lib recovers the panic).
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
While the VXLAN interface and the iptables rules to mark outgoing VXLAN
packets for encryption are configured to use the Swarm data path port,
the XFRM policies for actually applying the encryption are hardcoded to
match packets with destination port 4789/udp. Consequently, encrypted
overlay networks do not pass traffic when the Swarm is configured with
any other data path port: encryption is not applied to the outgoing
VXLAN packets and the destination host drops the received cleartext
packets. Use the configured data path port instead of hardcoding port
4789 in the XFRM policies.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This struct was never modified; let's just use consts for these.
Also remove the args return from detectContentType(), as it was
not used anywhere.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type (as well as TarsumBackup), was used for the experimental --stream
support for the classic builder. This feature was removed in commit
6ca3ec88ae, which also removed uses of
the CachableSource type.
As far as I could find, there's no external consumers of these types,
but let's deprecated it, to give potential users a heads-up that it
will be removed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It turns out that the unnecessary serialization removed in
b75246202a happened to work around a bug
in containerd. When many exec processes are started concurrently in the
same containerd task, it takes seconds to minutes for them all to start.
Add the workaround back in, only deliberately this time.
Signed-off-by: Cory Snider <csnider@mirantis.com>
`docker run -v /foo:/foo:ro` is now recursively read-only on kernel >= 5.12.
Automatically falls back to the legacy non-recursively read-only mount mode on kernel < 5.12.
Use `ro-non-recursive` to disable RRO.
Use `ro-force-recursive` or `rro` to explicitly enable RRO. (Fails on kernel < 5.12)
Fix issue 44978
Fix docker/for-linux issue 788
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Feature flags are one of the configuration items which can be reloaded
without restarting the daemon. Whether the daemon uses the containerd
snapshotter service or the legacy graph drivers is controlled by a
feature flag. However, much of the code which checks the snapshotter
feature flag assumes that the flag cannot change at runtime. Make it so
that the snapshotter setting can only be changed by restarting the
daemon, even if the flag state changes after a live configuration
reload.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Starting with go1.19, the Go runtime on Windows now supports the `netgo` build-
flag to use a native Go DNS resolver. Prior to that version, the build-flag
only had an effect on non-Windows platforms. When using the `netgo` build-flag,
the Windows's host resolver is not used, and as a result, custom entries in
`etc/hosts` are ignored, which is a change in behavior from binaries compiled
with older versions of the Go runtime.
From the go1.19 release notes: https://go.dev/doc/go1.19#net
> Resolver.PreferGo is now implemented on Windows and Plan 9. It previously
> only worked on Unix platforms. Combined with Dialer.Resolver and Resolver.Dial,
> it's now possible to write portable programs and be in control of all DNS name
> lookups when dialing.
>
> The net package now has initial support for the netgo build tag on Windows.
> When used, the package uses the Go DNS client (as used by Resolver.PreferGo)
> instead of asking Windows for DNS results. The upstream DNS server it discovers
> from Windows may not yet be correct with complex system network configurations,
> however.
Our Windows binaries are compiled with the "static" (`make/binary-daemon`)
script, which has the `netgo` option set by default. This patch unsets the
`netgo` option when cross-compiling for Windows.
Co-authored-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Docker with containerd integration emits "Exists" progress action when a
layer of the currently pulled image already exists. This is different
from the non-c8d Docker which emits "Already exists".
This makes both implementations consistent by emitting backwards
compatible "Already exists" action.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
To allow skipping integration tests that don't apply to the
containerd snapshotter.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This moves the blobs around so they follow the OCI spec.
Note that because docker reads paths from the manifest.json inside the
tar this is not a breaking change.
This does, however, remove the old layer "VERSION" file which had a big
"why is this even here" in the code comments. I suspect it does not
matter at all even for really old versions of Docker. In any case it is
a useless file for any even relatively modern version of Docker.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
TestProxyNXDOMAIN has proven to be susceptible to failing as a
consequence of unlocked threads being set to the wrong network
namespace. As the failure mode looks a lot like a bug in the test
itself, it seems prudent to add a check for mismatched namespaces to the
test so we will know for next time that the root cause lies elsewhere.
Signed-off-by: Cory Snider <csnider@mirantis.com>
osl.setIPv6 mistakenly captured the calling goroutine's thread's network
namespace instead of the network namespace of the thread getting its
namespace temporarily changed. As this function appears to only be
called from contexts in the process's initial network namespace, this
mistake would be of little consequence at runtime. The libnetwork unit
tests, on the other hand, unshare network namespaces so as not to
interfere with each other or the host's network namespace. But due to
this bug, the isolation backfires and the network namespace of
goroutines used by a test which are expected to be in the initial
network namespace can randomly become the isolated network namespace of
some other test. Symptoms include a loopback network server running in
one goroutine being inexplicably and randomly being unreachable by a
client in another goroutine.
Capture the original network namespace of the thread from the thread to
be tampered with, after locking the goroutine to the thread.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Swapping out the global logger on the fly is causing tests to flake out
by logging to a test's log output after the test function has returned.
Refactor Resolver to use a dependency-injected logger and the resolver
unit tests to inject a private logger instance into the Resolver under
test.
Signed-off-by: Cory Snider <csnider@mirantis.com>
tstwriter mocks the server-side connection between the resolver and the
container, not the resolver and the external DNS server, so returning
the external DNS server's address as w.LocalAddr() is technically
incorrect and misleading. Only the protocols need to match as the
resolver uses the client's choice of protocol to determine which
protocol to use when forwarding the query to the external DNS server.
While this change has no material impact on the tests, it makes the
tests slightly more comprehensible for the next person.
Signed-off-by: Cory Snider <csnider@mirantis.com>
commit dc11d2a2d8 removed the devicemapper
storage-driver, so these warnings are no longer relevant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 0abb8dec3f removed support for
running overlay/overlay2 on top of a backing filesystem without d_type
support, and turned it into a fatal error when starting the daemon,
so there's no need to generate warnings for this situation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/spf13/cobra/releases/tag/v1.7.0
Features
- Allow to preserve ordering of completions in bash, zsh, pwsh, & fish
- Add support for PowerShell 7.2+ in completions
- Allow sourcing zsh completion script
Bug fixes
- Don't remove flag values that match sub-command name
- Fix powershell completions not returning single word
- Remove masked template import variable name
- Correctly detect completions with dash in argument
Testing & CI/CD
- Deprecate Go 1.15 in CI
- Deprecate Go 1.16 in CI
- Add testing for Go 1.20 in CI
- Add tests to illustrate unknown flag bug
Maintenance
- Update main image to better handle dark backgrounds
- Fix stale.yaml mispellings
- Remove stale bot from GitHub actions
- Add makefile target for installing dependencies
- Add Sia to projects using Cobra
- Add Vitess and Arewefastyet to projects using cobra
- Fixup for Kubescape github org
- Fix route for GitHub workflows badge
- Fixup for GoDoc style documentation
- Various bash scripting improvements for completion
- Add Constellation to projects using Cobra
Documentation
- Add documentation about disabling completion descriptions
- Improve MarkFlagsMutuallyExclusive example in user guide
- Update shell_completions.md
- Update copywrite year
- Document suggested layout of subcommands
- Replace deprecated ExactValidArgs with MatchAll in doc
full diff: https://github.com/spf13/cobra/compare/v1.6.1...v1.7.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Extended attributes are set on files in container images for a reason.
Fail to unpack if extended attributes are present in a layer and setting
the attributes on the unpacked files fails for any reason.
Add an option to the vfs graph driver to opt into the old behaviour
where ENOTSUPP and EPERM errors encountered when setting extended
attributes are ignored. Make it abundantly clear to users and anyone
triaging their bug reports that they are shooting themselves in the
foot by enabling this option.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Our resolver is just a forwarder for external DNS so it should act like
it. Unless it's a server failure or refusal, take the response at face
value and forward it along to the client. RFC 8020 is only applicable to
caching recursive name servers and our resolver is neither caching nor
recursive.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This fixes a bug where, if a user pulls an image with a tag != `latest` and
a specific platform, we return an NotFound error for the wrong (`latest`) tag.
see: https://github.com/moby/moby/issues/45558
This bug was introduced in 779a5b3029
in the changes to `daemon/images/image_pull.go`, when we started returning the error from the call to
`GetImage` after the pull. We do this call, if pulling with a specified platform, to check if the platform
of the pulled image matches the requested platform (for cases with single-arch images).
However, when we call `GetImage` we're not passing the image tag, only name, so `GetImage` assumes `latest`
which breaks when the user has requested a different tag, since there might not be such an image in the store.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
The error returned by DecodeConfig was changed in
b6d58d749c and caused this to regress.
Allow empty request bodies for this endpoint once again.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The internal Client request methods which accept an object as a body use
nil to signal that the request should not have a body. But it is easy to
accidentally pass a typed-nil value as the object, e.g. if the object
comes from a function argument or struct field of a concrete type. The
result is that these requests will, surprisingly, have a JSON body of
`null`. Treat typed-nil pointers the same as untyped nils for the
purposes of determining whether or not the request should include a
body.
Stop assuming that POST requests should always have a body. POST /commit
does not require a body, for example.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Upstart has been EOL for 8 years and isn't used by any distributions we support any more.
Additionally, this removes the "cgroups v1" setup code because it's more reasonable now for us to expect something _else_ to have set up cgroups appropriately (especially cgroups v2).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
These changes add basic CDI integration to the docker daemon.
A cdi driver is added to handle cdi device requests. This
is gated by an experimental feature flag and is only supported on linux
This change also adds a CDISpecDirs (cdi-spec-dirs) option to the config.
This allows the default values of `/etc/cdi`, /var/run/cdi` to be overridden
which is useful for testing.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Fixing case where username may contain a backslash.
This case can happen for winbind/samba active directory domain users.
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
Use more meaningful variable name
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
Update contrib/dockerd-rootless-setuptool.sh
Co-authored-by: Akihiro Suda <suda.kyoto@gmail.com>
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
Use more meaningful variable name
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
Update contrib/dockerd-rootless-setuptool.sh
Co-authored-by: Akihiro Suda <suda.kyoto@gmail.com>
Signed-off-by: Jean-Michel Rouet <jean-michel.rouet@philips.com>
When the `ServerAddress` in the `AuthConfig` provided by the client is
empty, default to the default registry (registry-1.docker.io).
This makes the behaviour the same as with the containerd image store
integration disabled.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- use is.ErrorType
- replace uses of client.IsErrNotFound for errdefs.IsNotFound, as
the client no longer returns the old error-type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use is.ErrorType
- replace uses of client.IsErrNotFound for errdefs.IsNotFound, as
the client no longer returns the old error-type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use is.ErrorType
- replace uses of client.IsErrNotFound for errdefs.IsNotFound, as
the client no longer returns the old error-type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
None of the client will return the old error-types, so there's no need
to keep the compatibility code. We can consider deprecating this function
in favor of the errdefs equivalent this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that most uses of reexec have been replaced with non-reexec
solutions, most of the reexec.Init() calls peppered throughout the test
suites are unnecessary. Furthermore, most of the reexec.Init() calls in
test code neglects to check the return value to determine whether to
exit, which would result in the reexec'ed subprocesses proceeding to run
the tests, which would reexec another subprocess which would proceed to
run the tests, recursively. (That would explain why every reexec
callback used to unconditionally call os.Exit() instead of returning...)
Remove unneeded reexec.Init() calls from test and example code which no
longer needs it, and fix the reexec.Init() calls which are not inert to
exit after a reexec callback is invoked.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Our templates no longer contain version-specific rules, so this function
is no longer used. This patch deprecates it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 7008a51449 removed version-conditional
rules from the template, so we no longer need the apparmor_parser Version.
This patch removes the call to `aaparser.GetVersion()`
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 2e19a4d56b removed all other version-
conditional statements from the AppArmor template, but left this one in place.
These conditions were added in 8cf89245f5
to account for old versions of debian/ubuntu (apparmor_parser < 2.9)
that lacked some options;
> This allows us to use the apparmor profile we have in contrib/apparmor/
> and solves the problems where certain functions are not apparent on older
> versions of apparmor_parser on debian/ubuntu.
Those patches were from 2015/2016, and all currently supported distro
versions should now have more current versions than that. Looking at the
oldest supported versions;
Ubuntu 18.04 "Bionic":
apparmor_parser --version
AppArmor parser version 2.12
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2012 Canonical Ltd.
Debian 10 "Buster"
apparmor_parser --version
AppArmor parser version 2.13.2
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2018 Canonical Ltd.
This patch removes the remaining conditionals.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add `execDuration` field to the event attributes map. This is useful for tracking how long the container ran.
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Ports over all the previous image delete logic, such as:
- Introduce `prune` and `force` flags
- Introduce the concept of hard and soft image delete conflics, which represent:
- image referenced in multiple tags (soft conflict)
- image being used by a stopped container (soft conflict)
- image being used by a running container (hard conflict)
- Implement delete logic such as:
- if deleting by reference, and there are other references to the same image, just
delete the passed reference
- if deleting by reference, and there is only 1 reference and the image is being used
by a running container, throw an error if !force, or delete the reference and create
a dangling reference otherwise
- if deleting by imageID, and force is true, remove all tags (otherwise soft conflict)
- if imageID, check if stopped container is using the image (soft conflict), and
delete anyway if force
- if imageID was passed in, check if running container is using the image (hard conflict)
- if `prune` is true, and the image being deleted has dangling parents, remove them
This commit also implements logic to get image parents in c8d by comparing shared layers.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
This option was deprecated in 5a922dc162, which
is part of the v24.0.0 release, so we can remove it from master.
This patch;
- adds a check to ValidatePlatformConfig, and produces a fatal error
if oom-score-adjust is set
- removes the deprecated libcontainerd/supervisor.WithOOMScore
- removes the warning from docker info
With this patch:
dockerd --oom-score-adjust=-500 --validate
Flag --oom-score-adjust has been deprecated, and will be removed in the next release.
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.
And when using `daemon.json`:
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was deprecated in dbb48e4b29, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was deprecated in 818ee96219, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were deprecated in 9d5e754caa, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was deprecated in 9f3e5eead5, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were deprecated in 2d49080056, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was deprecated in c63ea32a17, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This const was deprecated in 5c78cbd3be, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field is deprecated since 1261fe69a3,
and will now be omitted on API v1.44 and up for the `GET /images/json`,
`GET /images/{id}/json`, and `GET /system/df` endpoints.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 1261fe69a3 deprecated the VirtualSize
field, but forgot to mention that it's also included in the /system/df
endpoint.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 24.0 branch was created, so changes in master/main should now be
targeting the next version of the API (1.44).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/containerd/containerd/releases/tag/v1.6.21
Notable Updates
- update runc binary to v1.1.7
- Remove entry for container from container store on error
- oci: partially restore comment on read-only mounts for uid/gid uses
- windows: Add ArgsEscaped support for CRI
- oci: Use WithReadonlyTempMount when adding users/groups
- archive: consistently respect value of WithSkipDockerManifest
full diff: https://github.com/containerd/containerd/compare/c0efc63d3907...v1.6.21
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- forward-port changes from 0ffaa6c785 to api/swagger.yaml (v1.44-dev)
- backports the changes to v1.43;
- Update container OOMKilled flag immediately 57d2d6ef62
- Add no-new-privileges to SecurityOptions returned by /info eb7738221c
- API: deprecate VirtualSize field for /images/json and /images/{id}/json 1261fe69a3
- api/types/container: create type for changes endpoint dbb48e4b29
- builder-next/prune: Handle "until" filter timestamps 54a125f677
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Treat copying extended attributes from a source filesystem which does
not support extended attributes as a no-op, same as if the file did not
possess the extended attribute. Only fail copying extended attributes if
the source file has the attribute and the destination filesystem does
not support xattrs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This special case was added in 3043c26419 as
a sentinel error (`AuthRequiredError`) to check whether authentication
is required (and to prompt the users to authenticate). A later refactor
(946bbee39a) removed the `AuthRequiredError`,
but kept the error-message and logic.
Starting with fcee6056dc, it looks like we
no longer depend on this specific error, so we can return the registry's
error message instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.20.4 (released 2023-05-02) includes three security fixes to the html/template
package, as well as bug fixes to the compiler, the runtime, and the crypto/subtle,
crypto/tls, net/http, and syscall packages. See the Go 1.20.4 milestone on our
issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.4+label%3ACherryPickApproved
release notes: https://go.dev/doc/devel/release#go1.20.4
full diff: https://github.com/golang/go/compare/go1.20.3...go1.20.4
from the announcement:
> These minor releases include 3 security fixes following the security policy:
>
> - html/template: improper sanitization of CSS values
>
> Angle brackets (`<>`) were not considered dangerous characters when inserted
> into CSS contexts. Templates containing multiple actions separated by a '/'
> character could result in unexpectedly closing the CSS context and allowing
> for injection of unexpected HMTL, if executed with untrusted input.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-24539 and Go issue https://go.dev/issue/59720.
>
> - html/template: improper handling of JavaScript whitespace
>
> Not all valid JavaScript whitespace characters were considered to be
> whitespace. Templates containing whitespace characters outside of the character
> set "\t\n\f\r\u0020\u2028\u2029" in JavaScript contexts that also contain
> actions may not be properly sanitized during execution.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-24540 and Go issue https://go.dev/issue/59721.
>
> - html/template: improper handling of empty HTML attributes
>
> Templates containing actions in unquoted HTML attributes (e.g. "attr={{.}}")
> executed with empty input could result in output that would have unexpected
> results when parsed due to HTML normalization rules. This may allow injection
> of arbitrary attributes into tags.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-29400 and Go issue https://go.dev/issue/59722.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
term: remove interrupt handler on termios
On termios platforms, interrupt signals are not generated in raw mode
terminals as the ISIG setting is not enabled. Remove interrupt handler
as it does nothing for raw mode and prevents other uses of INT signal
with this library.
This code seems to go back all the way to moby/moby#214 where signal
handling was improved for monolithic docker repository. Raw mode -ISIG
got reintroduced in moby/moby@3f63b87807, but the INT handler was left
behind.
full diff: abb19827d3...1aeaba8785
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runtime-spec/releases/tag/v1.1.0-rc.2
Additions
- config-linux: add support for rsvd hugetlb cgroup
- features: add features.md to formalize the runc features JSON
- config-linux: add support for time namespace
Minor fixes and documentation
- config-linux: clarify where device nodes can be created
- runtime: remove When serialized in JSON, the format MUST adhere to the following pattern
- Update CI to Go 1.20
- config: clarify Linux mount options
- config-linux: fix url error
- schema: fix schema for timeOffsets
- schema: remove duplicate keys
full diff: https://github.com/opencontainers/runtime-spec/compare/v1.1.0-rc.1...v1.1.0-rc.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon.CreateImageFromContainer() already constructs a new config by taking
the image config, applying custom options (`docker commit --change ..`) (if
any), and merging those with the containers' configuration, so there is
no need to merge options again.
e22758bfb2/daemon/commit.go (L152-L158)
This patch removes the merge logic from generateCommitImageConfig, and
removes the unused arguments and error-return.
Co-authored-by: Djordje Lukic <djordje.lukic@docker.com>
Co-authored-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The OCI image-spec now also provides ArgsEscaped for backward compatibility
with the option used by Docker.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Split these options to a separate struct, so that we can handle them in isolation.
- Change some tests to use subtests, and improve coverage
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- sandbox, endpoint changed in c71555f030, but
missed updating the stubs.
- add missing stub for Controller.cleanupServiceDiscovery()
- While at it also doing some minor (formatting) changes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function included a defer to close the net.Conn if an error occurred,
but the calling function (SetExternalKey()) also had a defer to close it
unconditionally.
Rewrite it to use json.NewEncoder(), which accepts a writer, and inline
the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's a no-op on Windows and other non-Linux, non-FreeBSD platforms,
so there's no need to register the re-exec.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Just print the error and os.Exit() instead, which makes it more
explicit that we're exiting, and there's no need to decorate the
error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Split the function into a "backing" function that returns an error, and the
re-exec entrypoint, which handles the error to provide a more idiomatic approach.
This was part of a larger change accross multiple re-exec functions (now removed).
For history's sake; here's the description for that;
The `reexec.Register()` function accepts reexec entrypoints, which are a `func()`
without return (matching a binary's `main()` function). As these functions cannot
return an error, it's the entrypoint's responsibility to handle any error, and to
indicate failures through `os.Exit()`.
I noticed that some of these entrypoint functions had `defer()` statements, but
called `os.Exit()` either explicitly or implicitly (e.g. through `logrus.Fatal()`).
defer statements are not executed if `os.Exit()` is called, which rendered these
statements useless.
While I doubt these were problematic (I expect files to be closed when the process
exists, and `runtime.LockOSThread()` to not have side-effects after exit), it also
didn't seem to "hurt" to call these as was expected by the function.
This patch rewrites some of the entrypoints to split them into a "backing function"
that can return an error (being slightly more iodiomatic Go) and an wrapper function
to act as entrypoint (which can handle the error and exit the executable).
To some extend, I'm wondering if we should change the signatures of the entrypoints
to return an error so that `reexec.Init()` can handle (or return) the errors, so
that logging can be handled more consistently (currently, some some use logrus,
some just print); this would also keep logging out of some packages, as well as
allows us to provide more metadata about the error (which reexec produced the
error for example).
A quick search showed that there's some external consumers of pkg/reexec, so I
kept this for a future discussion / exercise.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Create dangling images for imported images which don't have a name
annotation attached. Previously the content got loaded, but no image
referencing it was created which caused it to be garbage collected
immediately.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.7
full diff: https://github.com/opencontainers/runc/compare/v1.1.6...v1.1.7
This is the seventh patch release in the 1.1.z release of runc, and is
the last planned release of the 1.1.z series. It contains a fix for
cgroup device rules with systemd when handling device rules for devices
that don't exist (though for devices whose drivers don't correctly
register themselves in the kernel -- such as the NVIDIA devices -- the
full fix only works with systemd v240+).
- When used with systemd v240+, systemd cgroup drivers no longer skip
DeviceAllow rules if the device does not exist (a regression introduced
in runc 1.1.3). This fix also reverts the workaround added in runc 1.1.5,
removing an extra warning emitted by runc run/start.
- The source code now has a new file, runc.keyring, which contains the keys
used to sign runc releases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.7
full diff: https://github.com/opencontainers/runc/compare/v1.1.6...v1.1.7
This is the seventh patch release in the 1.1.z release of runc, and is
the last planned release of the 1.1.z series. It contains a fix for
cgroup device rules with systemd when handling device rules for devices
that don't exist (though for devices whose drivers don't correctly
register themselves in the kernel -- such as the NVIDIA devices -- the
full fix only works with systemd v240+).
- When used with systemd v240+, systemd cgroup drivers no longer skip
DeviceAllow rules if the device does not exist (a regression introduced
in runc 1.1.3). This fix also reverts the workaround added in runc 1.1.5,
removing an extra warning emitted by runc run/start.
- The source code now has a new file, runc.keyring, which contains the keys
used to sign runc releases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Verify the content to be equal, not "contains"; this output should be
predictable.
- Also verify the content returned by the function to match.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Looks like the intent is to exclude windows (which wouldn't have /etc/resolv.conf
nor systemd), but most tests would run fine elsewhere. This allows running the
tests on macOS for local testing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use t.TempDir() for convenience, and change some t.Fatal's to Errors,
so that all tests can run instead of failing early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test was assuming that the "source" file was always "/etc/resolv.conf",
but the `Get()` function uses `Path()` to find the location of resolv.conf,
which may be different.
While at it, also changed some `t.Fatalf()` to `t.Errorf()`, and renamed
some variables for clarity.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After my last change, I noticed that the hash is used as a []byte in most
cases (other than tests). This patch updates the type to use a []byte, which
(although unlikely very important) also improves performance:
Compared to the previous version:
benchstat new.txt new2.txt
name old time/op new time/op delta
HashData-10 128ns ± 1% 116ns ± 1% -9.77% (p=0.000 n=20+20)
name old alloc/op new alloc/op delta
HashData-10 208B ± 0% 88B ± 0% -57.69% (p=0.000 n=20+20)
name old allocs/op new allocs/op delta
HashData-10 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.000 n=20+20)
And compared to the original version:
benchstat old.txt new2.txt
name old time/op new time/op delta
HashData-10 201ns ± 1% 116ns ± 1% -42.39% (p=0.000 n=18+20)
name old alloc/op new alloc/op delta
HashData-10 416B ± 0% 88B ± 0% -78.85% (p=0.000 n=20+20)
name old allocs/op new allocs/op delta
HashData-10 6.00 ± 0% 2.00 ± 0% -66.67% (p=0.000 n=20+20)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The code seemed overly complicated, requiring a reader to be constructed,
where in all cases, the data was already available in a variable. This patch
simplifies the utility to not require a reader, which also makes it a bit
more performant:
go install golang.org/x/perf/cmd/benchstat@latest
GO111MODULE=off go test -run='^$' -bench=. -count=20 > old.txt
GO111MODULE=off go test -run='^$' -bench=. -count=20 > new.txt
benchstat old.txt new.txt
name old time/op new time/op delta
HashData-10 201ns ± 1% 128ns ± 1% -36.16% (p=0.000 n=18+20)
name old alloc/op new alloc/op delta
HashData-10 416B ± 0% 208B ± 0% -50.00% (p=0.000 n=20+20)
name old allocs/op new allocs/op delta
HashData-10 6.00 ± 0% 3.00 ± 0% -50.00% (p=0.000 n=20+20)
A small change was made in `Build()`, which previously returned the resolv.conf
data, even if the function failed to write it. In the new variation, `nil` is
consistently returned on failures.
Note that in various places, the hash is not even used, so we may be able to
simplify things more after this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As of Go 1.8, "net/http".Server provides facilities to close all
listeners, making the same facilities in server.Server redundant.
http.Server also improves upon server.Server by additionally providing a
facility to also wait for outstanding requests to complete after closing
all listeners. Leverage those facilities to give in-flight requests up
to five seconds to finish up after all containers have been shut down.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- Use logrus.Fields instead of multiple WithField
- Split one giant debug log into one log per image
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Logging through a dependency-injected interface value was a vestige of
when Trap was in pkg/signal to avoid importing logrus in a reusable
package: cc4da81128.
Now that Trap lives under cmd/dockerd, nobody will be importing this so
we no longer need to worry about minimizing the package's dependencies.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Always calling os.Exit() on clean shutdown may not always be desirable
as deferred functions are not run. Let the cleanup callback decide
whether or not to call os.Exit() itself. Allow the process to exit the
normal way, by returning from func main().
Simplify the trap.Trap implementation. The signal notifications are
buffered in a channel so there is little need to spawn a new goroutine
for each received signal. With all signals being handled in the same
goroutine, there are no longer any concurrency concerns around the
interrupt counter.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The image store sends events when a new image is created/tagged, using
it instead of the reference store makes sure we send the "tag" event
when a new image is built using buildx.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Don't panic when processing containers created under fork containerd
integration (this field was added in the upstream and didn't exist in
fork).
Co-authored-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The fix to ignore SIGPIPE signals was originally added in the Go 1.4
era. signal.Ignore was first added in Go 1.5.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Since cc19eba (backported to v23.0.4), the PreferredPool for docker0 is
set only when the user provides the bip config parameter or when the
default bridge already exist. That means, if a user provides the
fixed-cidr parameter on a fresh install or reboot their computer/server
without bip set, dockerd throw the following error when it starts:
> failed to start daemon: Error initializing network controller: Error
> creating default "bridge" network: failed to parse pool request for
> address space "LocalDefault" pool "" subpool "100.64.0.0/26": Invalid
> Address SubPool
See #45356.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This is a workaround to have buildinfo with deps embedded in the
binary. We need to create a go.mod file before building with
-modfile=vendor.mod, otherwise it fails with:
"-modfile cannot be used to set the module root directory."
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
This sets BuildKit version from the build information embedded
in running binary so we are aligned with the expected vendoring.
We iterate over all dependencies and find the BuildKit one
and set the right version. We also check if the module is
replaced and use it this case.
There is also additional checks if a pseudo version is
detected. See comments in code for more info.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
The slice which stores chain ids used for computing shared size was
mistakenly initialized with the length set instead of the capacity.
This caused a panic when iterating over it later and dereferncing nil
pointer from empty items.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The man page for sched_setaffinity(2) states the following about the pid
argument [1]:
> If pid is zero, then the mask of the calling thread is returned.
Thus the additional call to unix.Getpid can be omitted and pid = 0
passed to unix.SchedGetaffinity.
[1] https://man7.org/linux/man-pages/man2/sched_setaffinity.2.html#DESCRIPTION
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Use Linux BPF extensions to locate the offset of the VXLAN header within
the packet so that the same BPF program works with VXLAN packets
received over either IPv4 or IPv6.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Fixes `docker system prune --filter until=<timestamp>`.
`docker system prune` claims to support "until" filter for timestamps,
but it doesn't work because builder "until" filter only supports
duration.
Use the same filter parsing logic and then convert the timestamp to a
relative "keep-duration" supported by buildkit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Unconditionally checking for RT_GROUP_SCHED is harmful. It is one of
the options that you want inactive unless you know that you want it
active.
Systemd recommends to disable it [1], a rationale for doing so is
provided in
https://bugzilla.redhat.com/show_bug.cgi?id=1229700#c0.
The essence is that you can not simply enable RT_GROUP_SCHED, you also
have to assign budgets manually. If you do not assign budgets, then
your realtime scheduling will be affected.
If check-config.sh keeps recommending to enable this, without further
advice, then users will follow the recommendation and likely run into
issues.
Again, this is one of the options that you want inactive, unless you
know that you want to use it.
Related Gentoo bugs:
- https://bugs.gentoo.org/904264
- https://bugs.gentoo.org/606548
1: 39857544ee/README (L144-L150)
Signed-off-by: Florian Schmaus <flo@geekplace.eu>
This test does not apply when running with snapshotters enabled;
go test -v -run TestGetInspectData .
=== RUN TestGetInspectData
inspect_test.go:27: RWLayer of container inspect-me is unexpectedly nil
--- FAIL: TestGetInspectData (0.00s)
FAIL
FAIL github.com/docker/docker/daemon 0.049s
FAIL
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In versions of Docker before v1.10, this field was calculated from
the image itself and all of its parent images. Images are now stored
self-contained, and no longer use a parent-chain, making this field
an equivalent of the Size field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In versions of Docker before v1.10, this field was calculated from
the image itself and all of its parent images. Images are now stored
self-contained, and no longer use a parent-chain, making this field
an equivalent of the Size field.
For the containerd integration, the Size should be the sum of the
image's compressed / packaged and unpacked (snapshots) layers.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit removes iptables rules configured for secure overlay
networks when a network is deleted. Prior to this commit, only
CreateNetwork() was taking care of removing stale iptables rules.
If one of the iptables rule can't be removed, the erorr is logged but
it doesn't prevent network deletion.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
There's still some locations refering to AuFS;
- pkg/archive: I suspect most of that code is because the whiteout-files
are modelled after aufs (but possibly some code is only relevant to
images created with AuFS as storage driver; to be looked into).
- contrib/apparmor/template: likely some rules can be removed
- contrib/dockerize-disk.sh: very old contribution, and unlikely used
by anyone, but perhaps could be updated if we want to (or just removed).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update the version manually (we don't have automation for this yet), and
add a comment to vendor.mod to help users remind to update it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In the case of an error when calling snapshotter.Prepare we would return
nil. This change fixes that and returns the error from Prepare all the
time.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
It's not originally supported by image list, but we need it for `prune`
needs it, so `list` gets it for free.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.6
full diff: https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.6
This is the sixth patch release in the 1.1.z series of runc, which fixes
a series of cgroup-related issues.
Note that this release can no longer be built from sources using Go
1.16. Using a latest maintained Go 1.20.x or Go 1.19.x release is
recommended. Go 1.17 can still be used.
- systemd cgroup v1 and v2 drivers were deliberately ignoring UnitExist error
from systemd while trying to create a systemd unit, which in some scenarios
may result in a container not being added to the proper systemd unit and
cgroup.
- systemd cgroup v2 driver was incorrectly translating cpuset range from spec's
resources.cpu.cpus to systemd unit property (AllowedCPUs) in case of more
than 8 CPUs, resulting in the wrong AllowedCPUs setting.
- systemd cgroup v1 driver was prefixing container's cgroup path with the path
of PID 1 cgroup, resulting in inability to place PID 1 in a non-root cgroup.
- runc run/start may return "permission denied" error when starting a rootless
container when the file to be executed does not have executable bit set for
the user, not taking the CAP_DAC_OVERRIDE capability into account. This is
a regression in runc 1.1.4, as well as in Go 1.20 and 1.20.1
- cgroup v1 drivers are now aware of misc controller.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.6
full diff: https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.6
This is the sixth patch release in the 1.1.z series of runc, which fixes
a series of cgroup-related issues.
Note that this release can no longer be built from sources using Go
1.16. Using a latest maintained Go 1.20.x or Go 1.19.x release is
recommended. Go 1.17 can still be used.
- systemd cgroup v1 and v2 drivers were deliberately ignoring UnitExist error
from systemd while trying to create a systemd unit, which in some scenarios
may result in a container not being added to the proper systemd unit and
cgroup.
- systemd cgroup v2 driver was incorrectly translating cpuset range from spec's
resources.cpu.cpus to systemd unit property (AllowedCPUs) in case of more
than 8 CPUs, resulting in the wrong AllowedCPUs setting.
- systemd cgroup v1 driver was prefixing container's cgroup path with the path
of PID 1 cgroup, resulting in inability to place PID 1 in a non-root cgroup.
- runc run/start may return "permission denied" error when starting a rootless
container when the file to be executed does not have executable bit set for
the user, not taking the CAP_DAC_OVERRIDE capability into account. This is
a regression in runc 1.1.4, as well as in Go 1.20 and 1.20.1
- cgroup v1 drivers are now aware of misc controller.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Change return value in function signature and return fatal errors so
they can actually be reported to the caller instead of just being logged
to daemon log.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The `oom-score-adjust` option was added in a894aec8d8,
to prevent the daemon from being OOM-killed before other processes. This
option was mostly added as a "convenience", as running the daemon as a
systemd unit was not yet common.
Having the daemon set its own limits is not best-practice, and something
better handled by the process-manager starting the daemon.
Commit cf7a5be0f2 fixed this option to allow
disabling it, and 2b8e68ef06 removed the default
score adjust.
This patch deprecates the option altogether, recommending users to set these
limits through the process manager used, such as the "OOMScoreAdjust" option
in systemd units.
With this patch:
dockerd --oom-score-adjust=-500 --validate
Flag --oom-score-adjust has been deprecated, and will be removed in the next release.
configuration OK
echo '{"oom-score-adjust":-500}' > /etc/docker/daemon.json
dockerd
INFO[2023-04-12T21:34:51.133389627Z] Starting up
INFO[2023-04-12T21:34:51.135607544Z] containerd not running, starting managed containerd
WARN[2023-04-12T21:34:51.135629086Z] DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" option will be removed in the next release.
docker info
Client:
Context: default
Debug Mode: false
...
DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" option will be removed in the next release
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The call to an unsecure registry doesn't return an error saying that the
"server gave an HTTP response to an HTTPS client" but a
tls.RecordHeaderError saying that the "first record does not look like a
TLS handshake", this changeset looks for the right error for that case.
This fixes the http fallback when using an insecure registry
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The (*network).ipamRelease function nils out the network's IPAM info
fields, putting the network struct into an inconsistent state. The
network-restore startup code panics if it tries to restore a network
from a struct which has fewer IPAM config entries than IPAM info
entries. Therefore (*network).delete contains a critical section: by
persisting the network to the store after ipamRelease(), the datastore
will contain an inconsistent network until the deletion operation
completes and finishes deleting the network from the datastore. If for
any reason the deletion operation is interrupted between ipamRelease()
and deleteFromStore(), the daemon will crash on startup when it tries to
restore the network.
Updating the datastore after releasing the network's IPAM pools may have
served a purpose in the past, when a global datastore was used for
intra-cluster communication and the IPAM allocator had persistent global
state, but nowadays there is no global datastore and the IPAM allocator
has no persistent state whatsoever. Remove the vestigial datastore
update as it is no longer necessary and only serves to cause problems.
If the network deletion is interrupted before the network is deleted
from the datastore, the deletion will resume during the next daemon
startup, including releasing the IPAM pools.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The GetRepository method interacts directly with the registry, and does
not depend on the snapshotter, but is used for two purposes;
For the GET /distribution/{name:.*}/json route;
dd3b71d17c/api/server/router/distribution/backend.go (L11-L15)
And to satisfy the "executor.ImageBackend" interface as used by Swarm;
58c027ac8b/daemon/cluster/executor/backend.go (L77)
This patch removes the method from the ImageService interface, and instead
implements it through an composite struct that satisfies both interfaces,
and an ImageBackend() method is added to the daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
remove GetRepository from ImageService
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Prevent from descriptor leak
- Fixes optlen in getsockopt() for s390x
full diff: 9a39160e90...7ff4192f6f
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This enforces the github.com/containerd/containerd/errdefs package to
be aliased as "cerrdefs". Any other alias (or no alias used) results
in a linting failure:
integration/container/pause_test.go:9:2: import "github.com/containerd/containerd/errdefs" imported as "c8derrdefs" but must be "cerrdefs" according to config (importas)
c8derrdefs "github.com/containerd/containerd/errdefs"
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The signatures of functions in containerd's errdefs packages are very
similar to those in our own, and it's easy to accidentally use the wrong
package.
This patch uses a consistent alias for all occurrences of this import.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This const looks to only be there for "convenience", or _possibly_ was created
with future normalization or special handling in mind.
In either case, currently it is just a direct copy (alias) for runtime.GOOS,
and defining our own type for this gives the impression that it's more than
that. It's only used in a single place, and there's no external consumers, so
let's deprecate this const, and use runtime.GOOS instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These consts are only used internally, and never returned to the user.
Un-export to make it clear these are not for external consumption.
While looking at the code, I also noticed that we may be using the wrong
Windows API to collect this information (and found an implementation elsewhere
that does use the correct API). I did not yet update the code, in cases there
are specific reasons.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was not immediately clear why we were not using runtime.GOARCH for
these (with a conversion to other formats, such as x86_64). These docs
are based on comments that were posted when implementing this package;
- https://github.com/moby/moby/pull/13921#issuecomment-130106474
- https://github.com/moby/moby/pull/13921#issuecomment-140270124
Some links were now redirecting to a new location, so updated them to
not depend on the redirect.
While at it, also updated a call to logrus to use structured formatting
(WithError()).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change makes is possible to run `docker exec -u <UID> ...` when the
containerd integration is activated.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
- update various dependencies
- Windows: Support local drivers internal, l2bridge and nat
full diff: e28e8ba9bc...75e92ce14f
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When running hack/vendor.sh, I noticed this file was added to vendor.
I suspect this should've been part of 0233029d5a,
but the vendor check doesn't appear to be catching this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Linux kernel prior to v3.16 was not supporting netns for vxlan
interfaces. As such, moby/libnetwork#821 introduced a "host mode" to the
overlay driver. The related kernel fix is available for rhel7 users
since v7.2.
This mode could be forced through the use of the env var
_OVERLAY_HOST_MODE. However this env var has never been documented and
is not referenced in any blog post, so there's little chance many people
rely on it. Moreover, this host mode is deemed as an implementation
details by maintainers. As such, we can consider it dead and we can
remove it without a prior deprecation warning.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since 0fa873c, there's no function writing overlay networks to some
datastore. As such, overlay network struct doesn't need to implement
KVObject interface.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since a few commits, subnet's vni don't change during the lifetime of
the subnet struct, so there's no need to lock the network before
accessing it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since the previous commit, data from the local store are never read,
thus proving it was only used for Classic Swarm.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver in Swarm v2 mode doesn't support live-restore, ie.
the daemon won't even start if the node is part of a Swarm cluster and
live-restore is enabled. This feature was only used by Swarm Classic.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
VNI allocations made by the overlay driver were only used by Classic
Swarm. With Swarm v2 mode, the driver ovmanager is responsible of
allocating & releasing them.
Previously, vxlanIdm was initialized when a global store was available
but since 142b522, no global store can be instantiated. As such,
releaseVxlanID actually does actually nothing and iptables rules are
never removed.
The last line of dead code detected by golangci-lint is now gone.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Prior to 0fa873c, the serf-based event loop was started when a global
store was available. Since there's no more global store, this event loop
and all its associated code is dead.
Most dead code detected by golangci-lint in prior commits is now gone.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- LocalKVProvider, LocalKVProviderURL, LocalKVProviderConfig,
GlobalKVProvider, GlobalKVProviderURL and GlobalKVProviderConfig
are all unused since moby/libnetwork@be2b6962 (moby/libnetwork#908).
- GlobalKVClient is unused since 0fa873c and c8d2c6e.
- MakeKVProvider, MakeKVProviderURL and MakeKVProviderConfig are unused
since 96cfb076 (moby/moby#44683).
- MakeKVClient is unused since 142b5229 (moby/moby#44875).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver was creating a global store whenever
netlabel.GlobalKVClient was specified in its config argument. This
specific label is unused anymore since 142b522 (moby/moby#44875).
It was also creating a local store whenever netlabel.LocalKVClient was
specificed in its config argument. This store is unused since
moby/libnetwork@9e72136 (moby/libnetwork#1636).
Finally, the sync.Once properties are never used and thus can be
deleted.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver was creating a global store whenever
netlabel.GlobalKVClient was specified in its config argument. This
specific label is not used anymore since 142b522 (moby/moby#44875).
golangci-lint now detects dead code. This will be fixed in subsequent
commits.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This command was useful when overlay networks based on external KV store
was developed but is unused nowadays.
As the last reference to OverlayBindInterface and OverlayNeighborIP
netlabels are in the ovrouter cmd, they're removed too.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This only makes the containerd ImageService implementation respect
context cancellation though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Implement Children method for containerd image store which makes the
`ancestor` filter work for `docker ps`. Checking if image is a children
of other image is implemented by comparing their rootfs diffids because
containerd image store doesn't have a concept of image parentship like
the graphdriver store. The child is expected to have more layers than
the parent and should start with all parent layers.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Before:
```console
$ docker-rootless-setuptool.sh install
...
[INFO] Use CLI context "rootless"
Current context is now "rootless"
[INFO] Make sure the following environment variables are set (or add them to ~/.bashrc):
export PATH=/usr/local/bin:$PATH
Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///run/user/1001/docker.sock
```
After:
```console
$ docker-rootless-setuptool.sh install
...
[INFO] Using CLI context "rootless"
Current context is now "rootless"
[INFO] Make sure the following environment variable(s) are set (or add them to ~/.bashrc):
export PATH=/usr/local/bin:$PATH
[INFO] Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///run/user/1001/docker.sock
```
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Drop support for platforms which only have xt_u32 but not xt_bpf. No
attempt is made to clean up old xt_u32 iptables rules left over from a
previous daemon instance.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Encrypted overlay networks are unique in that they are the only kind of
network for which libnetwork programs an iptables rule to explicitly
accept incoming packets. No other network driver does this. The overlay
driver doesn't even do this for unencrypted networks!
Because the ACCEPT rule is appended to the end of INPUT table rather
than inserted at the front, the rule can be entirely inert on many
common configurations. For example, FirewallD programs an unconditional
REJECT rule at the end of the INPUT table, so any ACCEPT rules appended
after it have no effect. And on systems where the rule is effective, its
presence may subvert the administrator's intentions. In particular,
automatically appending the ACCEPT rule could allow incoming traffic
which the administrator was expecting to be dropped implicitly with a
default-DROP policy.
Let the administrator always have the final say in how incoming
encrypted overlay packets are filtered by no longer automatically
programming INPUT ACCEPT iptables rules for them.
Signed-off-by: Cory Snider <csnider@mirantis.com>
go1.20.3 (released 2023-04-04) includes security fixes to the go/parser,
html/template, mime/multipart, net/http, and net/textproto packages, as well
as bug fixes to the compiler, the linker, the runtime, and the time package.
See the Go 1.20.3 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.3+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.2...go1.20.3
Further details from the announcement on the mailing list:
We have just released Go versions 1.20.3 and 1.19.8, minor point releases.
These minor releases include 4 security fixes following the security policy:
- go/parser: infinite loop in parsing
Calling any of the Parse functions on Go source code which contains `//line`
directives with very large line numbers can cause an infinite loop due to
integer overflow.
Thanks to Philippe Antoine (Catena cyber) for reporting this issue.
This is CVE-2023-24537 and Go issue https://go.dev/issue/59180.
- html/template: backticks not treated as string delimiters
Templates did not properly consider backticks (`) as Javascript string
delimiters, and as such did not escape them as expected. Backticks are
used, since ES6, for JS template literals. If a template contained a Go
template action within a Javascript template literal, the contents of the
action could be used to terminate the literal, injecting arbitrary Javascript
code into the Go template.
As ES6 template literals are rather complex, and themselves can do string
interpolation, we've decided to simply disallow Go template actions from being
used inside of them (e.g. "var a = {{.}}"), since there is no obviously safe
way to allow this behavior. This takes the same approach as
github.com/google/safehtml. Template.Parse will now return an Error when it
encounters templates like this, with a currently unexported ErrorCode with a
value of 12. This ErrorCode will be exported in the next major release.
Users who rely on this behavior can re-enable it using the GODEBUG flag
jstmpllitinterp=1, with the caveat that backticks will now be escaped. This
should be used with caution.
Thanks to Sohom Datta, Manipal Institute of Technology, for reporting this issue.
This is CVE-2023-24538 and Go issue https://go.dev/issue/59234.
- net/http, net/textproto: denial of service from excessive memory allocation
HTTP and MIME header parsing could allocate large amounts of memory, even when
parsing small inputs.
Certain unusual patterns of input data could cause the common function used to
parse HTTP and MIME headers to allocate substantially more memory than
required to hold the parsed headers. An attacker can exploit this behavior to
cause an HTTP server to allocate large amounts of memory from a small request,
potentially leading to memory exhaustion and a denial of service.
Header parsing now correctly allocates only the memory required to hold parsed
headers.
Thanks to Jakob Ackermann (@das7pad) for discovering this issue.
This is CVE-2023-24534 and Go issue https://go.dev/issue/58975.
- net/http, net/textproto, mime/multipart: denial of service from excessive resource consumption
Multipart form parsing can consume large amounts of CPU and memory when
processing form inputs containing very large numbers of parts. This stems from
several causes:
mime/multipart.Reader.ReadForm limits the total memory a parsed multipart form
can consume. ReadForm could undercount the amount of memory consumed, leading
it to accept larger inputs than intended. Limiting total memory does not
account for increased pressure on the garbage collector from large numbers of
small allocations in forms with many parts. ReadForm could allocate a large
number of short-lived buffers, further increasing pressure on the garbage
collector. The combination of these factors can permit an attacker to cause an
program that parses multipart forms to consume large amounts of CPU and
memory, potentially resulting in a denial of service. This affects programs
that use mime/multipart.Reader.ReadForm, as well as form parsing in the
net/http package with the Request methods FormFile, FormValue,
ParseMultipartForm, and PostFormValue.
ReadForm now does a better job of estimating the memory consumption of parsed
forms, and performs many fewer short-lived allocations.
In addition, mime/multipart.Reader now imposes the following limits on the
size of parsed forms:
Forms parsed with ReadForm may contain no more than 1000 parts. This limit may
be adjusted with the environment variable GODEBUG=multipartmaxparts=. Form
parts parsed with NextPart and NextRawPart may contain no more than 10,000
header fields. In addition, forms parsed with ReadForm may contain no more
than 10,000 header fields across all parts. This limit may be adjusted with
the environment variable GODEBUG=multipartmaxheaders=.
Thanks to Jakob Ackermann for discovering this issue.
This is CVE-2023-24536 and Go issue https://go.dev/issue/59153.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While we currently do not provide an option to specify the snapshotter to use
for individual containers (we may want to add this option in future), currently
it already is possible to configure the snapshotter in the daemon configuration,
which could (likely) cause issues when changing and restarting the daemon.
This patch updates some code-paths that have the container available to use
the snapshotter that's configured for the container (instead of the default
snapshotter configured).
There are still code-paths to be looked into, and a tracking ticket as well as
some TODO's were added for those.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, the AWSLogs driver attempted to implement
non-blocking itself. Non-blocking is supposed to
implemented solely by the Docker RingBuffer that
wraps the log driver.
Please see issue and explanation here:
https://github.com/moby/moby/issues/45217
Signed-off-by: Wesley Pettit <wppttt@amazon.com>
Prevent the daemon from erroring out if daemon.json contains default
network options for network drivers aside from bridge. Configuring
defaults for the bridge driver previously worked by coincidence because
the unrelated CLI flag '--bridge' exists.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Closing stdin of a container or exec (a.k.a.: task or process) has been
somewhat broken ever since support for ContainerD 1.0 was introduced
back in Docker v17.11: the error returned from the CloseIO() call was
effectively ignored due to it being assigned to a local variable which
shadowed the intended variable. Serendipitously, that oversight
prevented a data race. In my recent refactor of libcontainerd, I
corrected the variable shadowing issue and introduced the aforementioned
data race in the process.
Avoid deadlocking when closing stdin without swallowing errors or
introducing data races by calling CloseIO() synchronously if the process
handle is available, falling back to an asynchronous close-and-log
strategy otherwise. This solution is inelegant and complex, but looks to
be the best that could be done without changing the libcontainerd API.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- use apiClient for api-clients to reduce shadowing (also more "accurate")
- use "ctr" instead of "container"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Attestation manifests have an OCI image media type, which makes them
being listed like they were a separate platform supported by the image.
Don't use `images.Platforms` and walk the manifest list ourselves
looking for all manifests that are an actual image manifest.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
commit 3c69b9f2c5 replaced these functions
and types with github.com/moby/patternmatcher. That commit has shipped with
docker 23.0, and BuildKit v0.11 no longer uses the old functions, so we can
remove these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 'Deprecated:' line in NewClient's doc comment was not in a new
paragraph, so GoDoc, linters, and IDEs were unaware that it was
deprecated. The package documentation also continued to reference
NewClient. Update the doc comments to finish documenting that NewClient
is deprecated.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Distribution source labels don't store port of the repository. If the
content was obtained from repository 172.17.0.2:5000 then its
corresponding label will have a key "containerd.io/distribution.source.172.17.0.2".
Fix the check in canBeMounted to ignore the :port part of the domain.
This also removes the check which prevented insecure repositories to use
cross-repo mount - the real cause was the mismatch in domain comparison.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Distribution source label can specify multiple repositories - in this
case value is a comma separated list of source repositories.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Previously the labels would be appended for content that was pushed
even if subsequent pushes of other content failed.
Change the behavior to only append the labels if the whole push
operation succeeded.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Handler is called in parallel and modifying a map without
synchronization is a race condition.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is a follow-up of 48ad9e1. This commit removed the function
ElectInterfaceAddresses from utils_linux.go but not their FreeBSD &
Windows counterpart. As these functions are never called, they can be
safely removed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The ExtDNS2 field was added in
aad1632c15
to migrate existing state from < 1.14 to a new type. As it's unlikely
that installations still have state from before 1.14, rename ExtDNS2
back to ExtDNS and drop the migration code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
- make jobs.Add accept a list of jobs, so that we don't have to
repeatedly lock/unlock the mutex
- rename some variables that collided with imports or types
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This implements `docker push` under containerd image store. When
pushing manifest lists that reference a content which is not present in
the local content store, it will attempt to perform the cross-repo mount
the content if possible.
Considering this scenario:
```bash
$ docker pull docker.io/library/busybox
```
This will download manifest list and only host platform-specific
manifest and blobs.
Note, tagging to a different repository (but still the same registry) and pushing:
```bash
$ docker tag docker.io/library/busybox docker.io/private-repo/mybusybox
$ docker push docker.io/private-repo/mybusybox
```
will result in error, because the neither we nor the target repository
doesn't have the manifests that the busybox manifest list references
(because manifests can't be cross-repo mounted).
If for some reason the manifests and configs for all other platforms
would be present in the content store, but only layer blobs were
missing, then the push would work, because the blobs can be cross-repo
mounted (only if we push to the same registry).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Push the reference parsing from repo and tag names into the api and pass
a reference object to the ImageService.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.5
diff: https://github.com/opencontainers/runc/compare/v1.1.4...v1.1.5
This is the fifth patch release in the 1.1.z series of runc, which fixes
three CVEs found in runc.
* CVE-2023-25809 is a vulnerability involving rootless containers where
(under specific configurations), the container would have write access
to the /sys/fs/cgroup/user.slice/... cgroup hierarchy. No other
hierarchies on the host were affected. This vulnerability was
discovered by Akihiro Suda.
<https://github.com/opencontainers/runc/security/advisories/GHSA-m8cg-xc2p-r3fc>
* CVE-2023-27561 was a regression which effectively re-introduced
CVE-2019-19921. This bug was present from v1.0.0-rc95 to v1.1.4. This
regression was discovered by @Beuc.
<https://github.com/advisories/GHSA-vpvm-3wq2-2wvm>
* CVE-2023-28642 is a variant of CVE-2023-27561 and was fixed by the same
patch. This variant of the above vulnerability was reported by Lei
Wang.
<https://github.com/opencontainers/runc/security/advisories/GHSA-g2j6-57v7-gm8c>
In addition, the following other fixes are included in this release:
* Fix the inability to use `/dev/null` when inside a container.
* Fix changing the ownership of host's `/dev/null` caused by fd redirection
(a regression in 1.1.1).
* Fix rare runc exec/enter unshare error on older kernels, including
CentOS < 7.7.
* nsexec: Check for errors in `write_log()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
no changes in vendored code, just keeping scanners happy :)
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.5
diff: https://github.com/opencontainers/runc/compare/v1.1.4...v1.1.5
This is the fifth patch release in the 1.1.z series of runc, which fixes
three CVEs found in runc.
* CVE-2023-25809 is a vulnerability involving rootless containers where
(under specific configurations), the container would have write access
to the /sys/fs/cgroup/user.slice/... cgroup hierarchy. No other
hierarchies on the host were affected. This vulnerability was
discovered by Akihiro Suda.
<https://github.com/opencontainers/runc/security/advisories/GHSA-m8cg-xc2p-r3fc>
* CVE-2023-27561 was a regression which effectively re-introduced
CVE-2019-19921. This bug was present from v1.0.0-rc95 to v1.1.4. This
regression was discovered by @Beuc.
<https://github.com/advisories/GHSA-vpvm-3wq2-2wvm>
* CVE-2023-28642 is a variant of CVE-2023-27561 and was fixed by the same
patch. This variant of the above vulnerability was reported by Lei
Wang.
<https://github.com/opencontainers/runc/security/advisories/GHSA-g2j6-57v7-gm8c>
In addition, the following other fixes are included in this release:
* Fix the inability to use `/dev/null` when inside a container.
* Fix changing the ownership of host's `/dev/null` caused by fd redirection
(a regression in 1.1.1).
* Fix rare runc exec/enter unshare error on older kernels, including
CentOS < 7.7.
* nsexec: Check for errors in `write_log()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously commit incorrectly used image config digest as an image id
for the new image which isn't consistent with the image target.
This changes it to use manifest digest.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't try to parse dangling images name (they have a non-canonical
format - `moby-dangling@sha256:...`) as a reference.
Log a warning if the image is not dangling and its name is not a valid
named reference.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Allow SetMatrix to be used as a value type with a ready-to-use zero
value. SetMatrix values are already non-copyable by virtue of having a
mutex field so there is no harm in allowing non-pointer values to be
used as local variables or struct fields. Any attempts to pass around
by-value copies, e.g. as function arguments, will be flagged by go vet.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The `docker-init` binary is not intended to be a user-facing command, and as such it is more appropriate for it to be found in `/usr/libexec` (or similar) than in `PATH` (see the FHS, especially https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html and https://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA).
This adjusts the logic for using that configuration option to take this into account and appropriately search for `docker-init` (or the user's configured alternative) in these directories before falling back to the existing `PATH` lookup behavior.
This behavior _used_ to exist for the old `dockerinit` binary (of a similar name and used in a similar way but for an alternative purpose), but that behavior was removed in 4357ed4a73 when that older `dockerinit` was also removed.
Most of this reasoning _also_ applies to `docker-proxy` (and various `containerd-xxx` binaries such as the shims), but this change does not affect those. It would be relatively straightforward to adapt `LookupInitPath` to be a more generic function such as `libexecLookupPath` or similar if we wanted to explore that.
See 14482589df/cli-plugins/manager/manager_unix.go for the related path list in the CLI which loads CLI plugins from a similar set of paths (with a similar rationale - plugin binaries are not typically intended to be run directly by users but rather invoked _via_ the CLI binary).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
This adds a function to the client package which can be used to create a
buildkit client from our moby client.
Example:
```go
package main
import (
"context"
"github.com/moby/moby/client"
bkclient "github.com/moby/buildkit/client"
)
func main() {
c := client.NewWithOpts()
bc, _ := bkclient.New(context.Background(), ""
client.BuildkitClientOpts(c),
)
// ...
}
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
FirewallD creates the root INPUT chain with a default-accept policy and
a terminal rule which rejects all packets not accepted by any prior
rule. Any subsequent rules appended to the chain are therefore inert.
The administrator would have to open the VXLAN UDP port to make overlay
networks work at all, which would result in all VXLAN traffic being
accepted and defeating our attempts to enforce encryption on encrypted
overlay networks.
Insert the rule to drop unencrypted VXLAN packets tagged for encrypted
overlay networks at the top of the INPUT chain so that enforcement of
mandatory encryption takes precedence over any accept rules configured
by the administrator. Continue to append the accept rule to the bottom
of the chain so as not to override any administrator-configured drop
rules.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The User-Agent includes the kernel version, which involves making a syscall
(and parsing the results) on Linux, and reading (plus parsing) the registry
on Windows. These operations are relatively costly, and we should not perform
those on every request that uses the User-Agent.
This patch adds a sync.Once so that we only perform these actions once for
the lifetime of the daemon's process.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
GRPC is logging a *lot* of garbage at info level.
This configures the GRPC logger such that it is only giving us logs when
at debug level and also adds a log field indicating where the logs are
coming from.
containerd is still currently spewing these same log messages and needs
a separate update.
Without this change `docker build` is extremely noisy in the daemon
logs.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Commit 3991faf464 moved search into the registry
package, which also made the `dockerversion` package a dependency for registry,
which brings additional (indirect) dependencies, such as `pkg/parsers/kernel`,
and `golang.org/x/sys/windows/registry`.
Client code, such as used in docker/cli may depend on the `registry` package,
but should not depend on those additional dependencies.
This patch moves setting the userAgent to the API router, and instead of
passing it as a separate argument, includes it into the "headers".
As these headers now not only contain the `X-Meta-...` headers, the variables
were renamed accordingly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use `exec.Command` created by this function instead of obtaining it from
daemon struct. This prevents a race condition where `daemon.Kill` is
called before the goroutine has the chance to call `cmd.Wait`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
TestDaemonRestartKillContainers test was always executing the last case
(`container created should not be restarted`) because the iterated
variables were not copied correctly.
Capture iterated values by value correctly and rename c to tc.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Some newer distros such as RHEL 9 have stopped making the xt_u32 kernel
module available with the kernels they ship. They do ship the xt_bpf
kernel module, which can do everything xt_u32 can and more. Add an
alternative implementation of the iptables match rule which uses xt_bpf
to implement exactly the same logic as the u32 filter using a BPF
program. Try programming the BPF-powered rules as a fallback when
programming the u32-powered rules fails.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The iptables rule clause used to match on the VNI of VXLAN datagrams
looks like line noise to the uninitiated. It doesn't help that the
expression is repeated twice and neither copy has any commentary.
DRY out the rule builder to a common function, and document what the
rule does and how it works.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The iptables rules which make encryption mandatory on an encrypted
overlay network are only programmed once there is a second node
participating in the network. This leaves single-node encrypted overlay
networks vulnerable to packet injection. Furthermore, failure to program
the rules is not treated as a fatal error.
Program the iptables rules to make encryption mandatory before creating
the VXLAN link to guarantee that there is no window of time where
incoming cleartext VXLAN packets for the network would be accepted, or
outgoing cleartext packets be transmitted. Only create the VXLAN link if
programming the rules succeeds to ensure that it fails closed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The overlay-network encryption code is woefully under-documented, which
is especially problematic as it operates on under-documented kernel
interfaces. Document what I have puzzled out of the implementation for
the benefit of the next poor soul to touch this code.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Use the utility introduced in 1bd486666b to
share the same implementation as similar options. The IPCModeContainer const
is left for now, but we need to consider what to do with these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit e7d75c8db7 fixed validation of "host"
mode values, but also introduced a regression for validating "container:"
mode PID-modes.
PID-mode implemented a stricter validation than the other options and, unlike
the other options, did not accept an empty container name/ID. This feature was
originally implemented in fb43ef649b, added some
some integration tests (but no coverage for this case), and the related changes
in the API types did not have unit-tests.
While a later change (d4aec5f0a6) added a test
for the `--pid=container:` (empty name) case, that test was later migrated to
the CLI repository, as it covered parsing the flag (and validating the result).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 1bd486666b refactored this code, but
it looks like I removed some changes in this part of the code when extracting
these changes from a branch I was working on, and the behavior did not match
the function's description (name to be empty if there is no "container:" prefix
Unfortunately, there was no test coverage for this in this repository, so we
didn't catch this.
This patch:
- fixes containerID() to not return a name/ID if no container: prefix is present
- adds test-coverage for TestCgroupSpec
- adds test-coverage for NetworkMode.ConnectedContainer
- updates some test-tables to remove duplicates, defaults, and use similar cases
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make if more explicit which test-cases should be valid, and make it the
first field, because the "valid" field is shared among all test-cases in
the test-table, and making it the first field makes it slightly easier
to distinguish valid from invalid cases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 6a516acb2e moved the MemInfo type and
ReadMemInfo() function into the pkg/sysinfo package. In an attempt to assist
consumers of these to migrate to the new location, an alias was added.
Unfortunately, the side effect of this alias is that pkg/system now depends
on pkg/sysinfo, which means that consumers of this (such as docker/cli) now
get all (indirect) dependencies of that package as dependency, which includes
many dependencies that should only be needed for the daemon / runtime;
- github.com/cilium/ebpf
- github.com/containerd/cgroups
- github.com/coreos/go-systemd/v22
- github.com/godbus/dbus/v5
- github.com/moby/sys/mountinfo
- github.com/opencontainers/runtime-spec
This patch moves the MemInfo related code to its own package. As the previous move
was not yet part of a release, we're not adding new aliases in pkg/sysinfo.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Funneling the peer operations into an unbuffered channel only serves to
achieve the same result as a mutex, using a lot more boilerplate and
indirection. Get rid of the boilerplate and unnecessary indirection by
using a mutex and calling the operations directly.
Signed-off-by: Cory Snider <csnider@mirantis.com>
idtools.MkdirAllAndChown on Windows does not chown directories, which makes
idtools.MkdirAllAndChown() just an alias for system.MkDirAll().
Also setting the filemode to `0`, as changing filemode is a no-op on Windows as
well; both of these changes should make it more transparent that no chown'ing,
nor changing filemode takes place.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Volumes created from the image config were not being pruned because the
volume service did not think they were anonymous since the code to
create passes along a generated name instead of letting the volume
service generate it.
This changes the code path to have the volume service generate the name
instead of doing it ahead of time.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Commit 3246db3755 added handling for removing
cluster volumes, but in some conditions, this resulted in errors not being
returned if the volume was in use;
docker swarm init
docker volume create foo
docker create -v foo:/foo busybox top
docker volume rm foo
This patch changes the logic for ignoring "local" volume errors if swarm
is enabled (and cluster volumes supported).
While working on this fix, I also discovered that Cluster.RemoveVolume()
did not handle the "force" option correctly; while swarm correctly handled
these, the cluster backend performs a lookup of the volume first (to obtain
its ID), which would fail if the volume didn't exist.
Before this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
volume_test.go:122: assertion failed: error is nil, not errdefs.IsConflict
volume_test.go:123: assertion failed: expected an error, got nil
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
volume_test.go:143: assertion failed: error is not nil: Error response from daemon: volume no_such_volume not found
--- FAIL: TestVolumesRemoveSwarmEnabled (1.57s)
--- FAIL: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- FAIL: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
FAIL
With this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
--- PASS: TestVolumesRemoveSwarmEnabled (1.53s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
PASS
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
SearchRegistryForImages does not make sense as part of the image
service interface. The implementation just wraps the search API of the
registry service to filter the results client-side. It has nothing to do
with local image storage, and the implementation of search does not need
to change when changing which backend (graph driver vs. containerd
snapshotter) is used for local image storage.
Filtering of the search results is an implementation detail: the
consumer of the results does not care which actor does the filtering so
long as the results are filtered as requested. Move filtering into the
exported API of the registry service to hide the implementation details.
Only one thing---the registry service implementation---would need to
change in order to support server-side filtering of search results if
Docker Hub or other registry servers were to add support for it to their
APIs.
Use a fake registry server in the search unit tests to avoid having to
mock out the registry API client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Includes a security fix for crypto/elliptic (CVE-2023-24532).
> go1.20.2 (released 2023-03-07) includes a security fix to the crypto/elliptic package,
> as well as bug fixes to the compiler, the covdata command, the linker, the runtime, and
> the crypto/ecdh, crypto/rsa, crypto/x509, os, and syscall packages.
> See the Go 1.20.2 milestone on our issue tracker for details.
https://go.dev/doc/devel/release#go1.20.minor
From the announcement:
> We have just released Go versions 1.20.2 and 1.19.7, minor point releases.
>
> These minor releases include 1 security fixes following the security policy:
>
> - crypto/elliptic: incorrect P-256 ScalarMult and ScalarBaseMult results
>
> The ScalarMult and ScalarBaseMult methods of the P256 Curve may return an
> incorrect result if called with some specific unreduced scalars (a scalar larger
> than the order of the curve).
>
> This does not impact usages of crypto/ecdsa or crypto/ecdh.
>
> This is CVE-2023-24532 and Go issue https://go.dev/issue/58647.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
5008409b5c introduced the usage of
`strings.Cut` to help parse listener addresses.
Part of that also made it error out if no addr is specified after the
protocol spec (e.g. `tcp://`).
Before the change a proto spec without an address just used the default
address for that proto.
e.g. `tcp://` would be `tcp://127.0.0.1:2375`, `unix://` would be
`unix:///var/run/docker.sock`.
Critically, socket activation (`fd://`) never has an address.
This change brings back the old behavior but keeps the usage of
`strings.Cut`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Set `dangling-name-prefix` exporter attribute to `moby-dangling` which
makes it create an containerd image even when user didn't provide any
name for the new image.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
In preparation for containerd v1.7 which migrates off gogo/protobuf
and changes the protobuf Any type to one that's not supported by our
vendored version of typeurl.
This fixes a compile error on usages of `typeurl.UnmarshalAny` when
upgrading to containerd v1.7.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- fix docker service create doesn't work when network and generic-resource are both attached
- Fix removing tasks when a jobs service is removed
- CSI: Allow NodePublishVolume even when plugin does not support staging
full diff: 904c221ac2...80a528a868
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Only use the image exporter in build if we don't use containerd
Without this "docker build" fails with:
Error response from daemon: exporter "image" could not be found
- let buildx know we support containerd snapshotter
- Pass the current snapshotter to the buildkit worker
If buildkit uses a different snapshotter we can't list the images any
more because we can't find the snapshot.
builder/builder-next: make ContainerdWorker a minimal wrapper
Note that this makes "Worker" a public field, so technically one could
overwrite it.
builder-next: reenable runc executor
Currently, without special CNI config the builder would
only create host network containers that is a security issue.
Using runc directly instead of shim is faster as well
as builder doesn’t need anything from shim. The overhead
of setting up network sandbox is much slower of course.
builder/builder-next: simplify options handling
Trying to simplify the logic;
- Use an early return if multiple outputs are provided
- Only construct the list of tags if we're using an image (or moby) exporter
- Combine some logic for snapshotter and non-snapshotter handling
Create a constant for the moby exporter
Pass a context when creating a router
The context has a 10 seconds timeout which should be more than enough to
get the answer from containerd.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Tonis Tiigi <tonistiigi@gmail.com>
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Stopping container on Windows can sometimes take longer than 10s which
caused this test to be flaky.
Increase the timeout to 75s when running this test on Windows.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This addresses the previous issue with the containerd store where, after a container is created, we can't deterministically resolve which image variant was used to run it (since we also don't store what platform the image was fetched for).
This is required for things like `docker commit`, and computing the containers layer size later, since we need to resolve the specific image variant.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Co-authored-by: Laura Brehm <laurabrehm@hey.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
When reading through some bug reports, I noticed that the error-message for
unsupported storage drivers is not very informative, as it does not include
the actual storage driver. Some of these errors are used as sentinel errors
internally, so improving the error returned by graphdriver.New() may need
some additional work, but this patch makes a start by including the name
of the graphdriver (if set) in the error-message.
Before this patch:
dockerd --storage-driver=foobar
...
failed to start daemon: error initializing graphdriver: driver not supported
With this patch:
dockerd --storage-driver=foobar
...
failed to start daemon: error initializing graphdriver: driver not supported: foobar
It's worth noting that there may be code "in the wild" that perform string-
matching on this error (e.g. [balena][1]), which is why I included the name as a separate "component"
in the output, to allow matching parts of the error.
[1]: 3d5c77a466/lib/preload.ts (L34-L35)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These variables collided with the "repository" and "store" types declared
in this package. Rename the variables colliding with "repository", and
rename the "store" type to "refStore".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Buildkit deprecated build information in v0.11 and will remove it in v0.12.
It's suggested to use provenance attestations instead.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The authorization.Middleware contains a sync.Mutex field, making it
non-copyable. Remove one of the barriers to allowing deep copies of
config.Config values.
Inject the middleware into Daemon as a constructor argument instead.
Signed-off-by: Cory Snider <csnider@mirantis.com>
`cp` variable is used later to populate the `info.Checkpoint` field
option used by Task creation.
Previous changes mistakenly changed assignment of the `cp` variable to
declaration of a new variable that's scoped only to the if block.
Restore the old assignment behavior.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
It's surprising that the method to begin serving requests is named Wait.
And it is unidiomatic: it is a synchronous call, but it sends its return
value to the channel passed in as an argument instead of just returning
the value. And ultimately it is just a trivial wrapper around serveAPI.
Export the ServeAPI method instead so callers can decide how to call and
synchronize around it.
Call ServeAPI synchronously on the main goroutine in cmd/dockerd. The
goroutine and channel which the Wait() API demanded are superfluous
after all. The notifyReady() call was always concurrent and asynchronous
with respect to serving the API (its implementation spawns a goroutine)
so it makes no difference whether it is called before ServeAPI() or
after `go ServeAPI()`.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- Retain pause.exe as entrypoint for default pause images
- wcow: support graceful termination of servercore containers
full diff: https://github.com/Microsoft/hcsshim/compare/v0.9.6...v0.9.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Server.cfg field is never referenced by any code in package
"./api/server". "./api/server".Config struct values are used by
DaemonCli code, but only to pass around configuration copied out of the
daemon config within the "./cmd/dockerd" package. Delete the
"./api/server".Config struct definition and refactor the "./cmd/dockerd"
package to pull configuration directly from cli.Config.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Deprecate `<none>:<none>` and `<none>@<none>` magic strings included in
`RepoTags` and `RepoDigests`.
Produce an empty arrays instead and leave the presentation of
untagged/dangling images up to the client.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The netip types can be used as map keys, unlike net.IP and friends,
which is a very useful property to have for this application.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The address spaces are orthogonal. There is no shared state between them
logically so there is no reason for them to share any in-memory data
structures. addrSpace is responsible for allocating subnets and
addresses, while Allocator is responsible for implementing the IPAM API.
Lower all the implementation details of allocation into addrSpace.
There is no longer a need to include the name of the address space in
the map keys for subnets now that each addrSpace holds its own state
independently from other addrSpaces. Remove the AddressSpace field from
the struct used for map keys within an addrSpace so that an addrSpace
does not need to know its own name.
Pool allocations were encoded in a tree structure, using parent
references and reference counters. This structure affords for pools
subdivided an arbitrary number of times to be modeled, in theory. In
practice, the max depth is only two: master pools and sub-pools. The
allocator data model has also been heavily influenced by the
requirements and limitations of Datastore persistence, which are no
longer applicable.
Address allocations are always associated with a master pool. Sub-pools
only serve to restrict the range of addresses within the master pool
which could be allocated from. Model pool allocations within an address
space as a two-level hierarchy of maps.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There is only one call site, in (*Allocator).RequestPool. The logic is
tightly coupled between the caller and callee, and having them separate
makes it harder to reason about.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Only two address spaces are supported: LocalDefault and GlobalDefault.
Support for non-default address spaces in the IPAM Allocator is
vestigial, from a time when IPAM state was stored in a persistent shared
datastore. There is no way to create non-default address spaces through
the IPAM API so there is no need to retain code to support the use of
such address spaces. Drop all pretense that more address spaces can
exist, to the extent that the IPAM API allows.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The two-phase commit dance serves no purpose with the current IPAM
allocator implementation. There are no fallible operations between the
call to aSpace.updatePoolDBOnAdd() and invoking the returned closure.
Allocate the subnet in the address space immediately when called and get
rid of the closure return.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Checking if the image creation failed due to IsAlreadyExists didn't use
the error from ImageService.Create.
Error from ImageService.Create was stored in a separate variable and
later IsAlreadyExists checked the standard `err` variable instead of the
`createErr`.
As there's no need to store the error in a separate variable - just
assign it to err variable and fix the check.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Kubernetes only permits RuntimeClass values which are valid lowercase
RFC 1123 labels, which disallows the period character. This prevents
cri-dockerd from being able to support configuring alternative shimv2
runtimes for a pod as shimv2 runtime names must contain at least one
period character. Add support for configuring named shimv2 runtimes in
daemon.json so that runtime names can be aliased to
Kubernetes-compatible names.
Allow options to be set on shimv2 runtimes in daemon.json.
The names of the new daemon runtime config fields have been selected to
correspond with the equivalent field names in cri-containerd's
configuration so that users can more easily follow documentation from
the runtime vendor written for cri-containerd and apply it to
daemon.json.
Signed-off-by: Cory Snider <csnider@mirantis.com>
PTR queries with domain names unknown to us are not necessarily invalid.
Act like a well-behaved middlebox and fall back to forwarding
externally, same as we do with the other query types.
Signed-off-by: Cory Snider <csnider@mirantis.com>
...for limiting concurrent external DNS requests with
"golang.org/x/sync/semaphore".Weighted. Replace the ad-hoc rate limiter
for when the concurrency limit is hit (which contains a data-race bug)
with "golang.org/x/time/rate".Sometimes.
Immediately retrying with the next server if the concurrency limit has
been hit just further compounds the problem. Wait on the semaphore and
refuse the query if it could not be acquired in a reasonable amount of
time.
Signed-off-by: Cory Snider <csnider@mirantis.com>
It handles figuring out the UDP receive buffer size and setting IO
timeouts, which simplifies our code. It is also more robust to receiving
UDP replies to earlier queries which timed out.
Log failures to perform a client exchange at level error so they are
more visible to operators and administrators.
Signed-off-by: Cory Snider <csnider@mirantis.com>
forwardExtDNS() will now continue with the next external DNS sever if
co.ReadMsg() returns (nil, nil). Previously it would abort resolving the
query and not reply to the container client. The implementation of
ReadMsg() in the currently- vendored version of miekg/dns cannot return
(nil, nil) so the difference is immaterial in practice.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*dns.Msg).Truncate() is more intelligent and standards-compliant about
truncating DNS response messages than our hand-rolled version. Fix a
silly fencepost error the max TCP message size: the limit is
dns.MaxMsgSize (65535), full stop.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The TC flag in a DNS message indicates that the sender had to
truncate it to fit within the length limit of the transmission channel.
It does NOT indicate that part of the message was lost before reaching
the recipient. Older versions of github.com/miekg/dns conflated the two
cases by returning ErrTruncated from ReadMsg() if the message was parsed
without error but had the TC flag set. The version of miekg/dns
currently vendored no longer returns an error when a well-formed DNS
message is received which has its TC flag set, but there was some
confusion on how to update libnetwork to deal with this behaviour
change. Truncated DNS replies are no longer different from any other
reply message: they are normal replies which do not need any special-
case handling to proxy back to the client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
go1.19.6 (released 2023-02-14) includes security fixes to the crypto/tls,
mime/multipart, net/http, and path/filepath packages, as well as bug fixes to
the go command, the linker, the runtime, and the crypto/x509, net/http, and
time packages. See the Go 1.19.6 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.19.6+label%3ACherryPickApproved
From the announcement on the security mailing:
We have just released Go versions 1.20.1 and 1.19.6, minor point releases.
These minor releases include 4 security fixes following the security policy:
- path/filepath: path traversal in filepath.Clean on Windows
On Windows, the filepath.Clean function could transform an invalid path such
as a/../c:/b into the valid path c:\b. This transformation of a relative (if
invalid) path into an absolute path could enable a directory traversal attack.
The filepath.Clean function will now transform this path into the relative
(but still invalid) path .\c:\b.
This is CVE-2022-41722 and Go issue https://go.dev/issue/57274.
- net/http, mime/multipart: denial of service from excessive resource
consumption
Multipart form parsing with mime/multipart.Reader.ReadForm can consume largely
unlimited amounts of memory and disk files. This also affects form parsing in
the net/http package with the Request methods FormFile, FormValue,
ParseMultipartForm, and PostFormValue.
ReadForm takes a maxMemory parameter, and is documented as storing "up to
maxMemory bytes +10MB (reserved for non-file parts) in memory". File parts
which cannot be stored in memory are stored on disk in temporary files. The
unconfigurable 10MB reserved for non-file parts is excessively large and can
potentially open a denial of service vector on its own. However, ReadForm did
not properly account for all memory consumed by a parsed form, such as map
ntry overhead, part names, and MIME headers, permitting a maliciously crafted
form to consume well over 10MB. In addition, ReadForm contained no limit on
the number of disk files created, permitting a relatively small request body
to create a large number of disk temporary files.
ReadForm now properly accounts for various forms of memory overhead, and
should now stay within its documented limit of 10MB + maxMemory bytes of
memory consumption. Users should still be aware that this limit is high and
may still be hazardous.
ReadForm now creates at most one on-disk temporary file, combining multiple
form parts into a single temporary file. The mime/multipart.File interface
type's documentation states, "If stored on disk, the File's underlying
concrete type will be an *os.File.". This is no longer the case when a form
contains more than one file part, due to this coalescing of parts into a
single file. The previous behavior of using distinct files for each form part
may be reenabled with the environment variable
GODEBUG=multipartfiles=distinct.
Users should be aware that multipart.ReadForm and the http.Request methods
that call it do not limit the amount of disk consumed by temporary files.
Callers can limit the size of form data with http.MaxBytesReader.
This is CVE-2022-41725 and Go issue https://go.dev/issue/58006.
- crypto/tls: large handshake records may cause panics
Both clients and servers may send large TLS handshake records which cause
servers and clients, respectively, to panic when attempting to construct
responses.
This affects all TLS 1.3 clients, TLS 1.2 clients which explicitly enable
session resumption (by setting Config.ClientSessionCache to a non-nil value),
and TLS 1.3 servers which request client certificates (by setting
Config.ClientAuth
> = RequestClientCert).
This is CVE-2022-41724 and Go issue https://go.dev/issue/58001.
- net/http: avoid quadratic complexity in HPACK decoding
A maliciously crafted HTTP/2 stream could cause excessive CPU consumption
in the HPACK decoder, sufficient to cause a denial of service from a small
number of small requests.
This issue is also fixed in golang.org/x/net/http2 v0.7.0, for users manually
configuring HTTP/2.
This is CVE-2022-41723 and Go issue https://go.dev/issue/57855.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
TestRequestReleaseAddressDuplicate gets flagged by go test -race because
the same err variable inside the test is assigned to from multiple
goroutines without synchronization, which obscures whether or not there
are any data races in the code under test.
Trouble is, the test _depends on_ the data race to exit the loop if an
error occurs inside a spawned goroutine. And the test contains a logical
concurrency bug (not flagged by the Go race detector) which can result
in false-positive test failures. Because a release operation is logged
after the IP is released, the other goroutine could reacquire the
address and log that it was reacquired before the release is logged.
Fix up the test so it is no longer subject to data races or
false-positive test failures, i.e. flakes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
If the resolver encounters an error before it attempts to forward the
request to external DNS, do not try to log information about the
external connection, because at this point `extConn` is `nil`. This
makes sure `dockerd` won't panic and crash from a nil pointer
dereference when it sees an invalid DNS query.
fixes#44979
Signed-off-by: er0k <er0k@er0k.net>
(cherry picked from commit 6c2637be11)
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This reverts commit ab3fa46502.
This fix was partial, and is not needed with the proper fix in
containerd.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
The errors are already returned to the client in the API response, so
logging them to the daemon log is redundant. Log the errors at level
Debug so as not to pollute the end-users' daemon logs with noise.
Refactor the logs to use structured fields. Add the request context to
the log entry so that logrus hooks could annotate the log entries with
contextual information about the API request in the hypothetical future.
Fixes#44997
Signed-off-by: Cory Snider <csnider@mirantis.com>
Go 1.20 made a change to the behaviour of package "os/exec" which was
not mentioned in the release notes:
2b8f214094
Attempts to execute a directory now return syscall.EISDIR instead of
syscall.EACCESS. Check for EISDIR errors from the runtime and fudge the
returned error message to maintain compatibility with existing versions
of docker/cli when using a version of runc compiled with Go 1.20+.
Signed-off-by: Cory Snider <csnider@mirantis.com>
maxDownloadAttempts maps to the daemon configuration flag
--max-download-attempts int
Set the max download attempts for each pull (default 5)
and the daemon configuration machinery interprets a value of 0 as "apply
the default value" and not a valid user value (config validation/
normalization bugs notwithstanding). The intention is clearly that this
configuration value should be an upper limit on the number of times the
daemon should try to download a particular layer before giving up. So it
is surprising to have the configuration value interpreted as a _retry_
limit. The daemon will make up to N+1 attempts to download a layer! This
also means users cannot disable retries even if they wanted to.
Fix the fencepost bug so that max attempts really means max attempts,
not max retries. And fix the fencepost bug with the retry-backoff delay
so that the first backoff is 5s, not 10s.
Signed-off-by: Cory Snider <csnider@mirantis.com>
"math/rand".Seed
- Migrate to using local RNG instances.
"archive/tar".TypeRegA
- The deprecated constant tar.TypeRegA is the same value as
tar.TypeReg and so is not needed at all.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This addresses the same CVE as is patched in go1.19.6. From that announcement:
> net/http: avoid quadratic complexity in HPACK decoding
>
> A maliciously crafted HTTP/2 stream could cause excessive CPU consumption
> in the HPACK decoder, sufficient to cause a denial of service from a small
> number of small requests.
>
> This issue is also fixed in golang.org/x/net/http2 v0.7.0, for users manually
> configuring HTTP/2.
>
> This is CVE-2022-41723 and Go issue https://go.dev/issue/57855.
full diff: https://github.com/golang/net/compare/v0.5.0...v0.7.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
DNS servers in the loopback address range should always be resolved in
the host network namespace when the servers are configured by reading
from the host's /etc/resolv.conf. The daemon mistakenly conflated the
presence of DNS options (docker run --dns-opt) with user-supplied DNS
servers, treating the list of servers loaded from the host as a user-
supplied list and attempting to resolve in the container's network
namespace. Correct this oversight so that loopback DNS servers are only
resolved in the container's network namespace when the user provides the
DNS server list, irrespective of other DNS configuration.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The per-network statistics counters are loaded and incremented without
any concurrency control. Use atomic integers to prevent data races
without having to add any synchronization.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The (*bitmap.Handle).String() method can be rather expensive to call. It
is all the more tragic when the expensively-constructed string is
immediately discarded because the log level is not high enough. Let
logrus stringify the arguments to debug logs so they are only
stringified when the log level is high enough.
# Before
ok github.com/docker/docker/libnetwork/ipam 10.159s
# After
ok github.com/docker/docker/libnetwork/ipam 2.484s
Signed-off-by: Cory Snider <csnider@mirantis.com>
These conditions were added in 8cf89245f5
to account for old versions of debian/ubuntu (apparmor_parser < 2.9)
that lacked some options;
> This allows us to use the apparmor profile we have in contrib/apparmor/
> and solves the problems where certain functions are not apparent on older
> versions of apparmor_parser on debian/ubuntu.
Those patches were from 2015/2016, and all currently supported distro
versions should now have more current versions than that. Looking at the
oldest supported versions;
Ubuntu 18.04 "Bionic":
apparmor_parser --version
AppArmor parser version 2.12
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2012 Canonical Ltd.
Debian 10 "Buster"
apparmor_parser --version
AppArmor parser version 2.13.2
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2018 Canonical Ltd.
This patch removes the conditionals.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These conditions were added in 8cf89245f5
to account for old versions of debian/ubuntu (apparmor_parser < 2.8.96)
that lacked some options;
> This allows us to use the apparmor profile we have in contrib/apparmor/
> and solves the problems where certain functions are not apparent on older
> versions of apparmor_parser on debian/ubuntu.
Those patches were from 2015/2016, and all currently supported distro
versions should now have more current versions than that. Looking at the
oldest supported versions;
Ubuntu 18.04 "Bionic":
apparmor_parser --version
AppArmor parser version 2.12
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2012 Canonical Ltd.
Debian 10 "Buster"
apparmor_parser --version
AppArmor parser version 2.13.2
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2018 Canonical Ltd.
This patch removes the conditionals.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implements image tagging under containerd image store.
If an image with this tag already exists, and there's no other image
with the same target, change its name. The name will have a special
format `moby-dangling@<digest>` which isn't a valid canonical reference
and doesn't resolve to any repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
TagImage is just a wrapper for TagImageWithReference which parses the
repo and tag into a reference. Change TagImageWithReference into
TagImage and move the responsibility of reference parsing to caller.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Adds overrides with specific tests suites in our tests
matrix so we can reduce build time significantly.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
`hostSupports` doesn't check if the apparmor_parser is available.
It's possible in some environments that the apparmor will be enabled but
the tool to load the profile is not available which will cause the
ensureDefaultAppArmorProfile to fail completely.
This patch checks if the apparmor_parser is available. Otherwise the
function returns early, but still logs a warning to the daemon log.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
c8d/daemon: Mount root and fill BaseFS
This fixes things that were broken due to nil BaseFS like `docker cp`
and running a container with workdir override.
This is more of a temporary hack than a real solution.
The correct fix would be to refactor the code to make BaseFS and LayerRW
an implementation detail of the old image store implementation and use
the temporary mounts for the c8d implementation instead.
That requires more work though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
daemon/images: Don't unset BaseFS
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
commit 043dbc05df temporarily switched to a
fork of BuildKit to workaround a failure in CI. These fixes have been
backported to the v0.11 branch in BuildKit, so we can switch back to upstream.
We can remove this override once we update vendor.mod to BuildKit v0.11.3.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
IPVLAN networks created on Moby v20.10 do not have the IpvlanFlag
configuration value persisted in the libnetwork database as that config
value did not exist before v23.0.0. Gracefully migrate configurations on
unmarshal to prevent type-assertion panics at daemon start after upgrade.
Fixes#44925
Signed-off-by: Cory Snider <csnider@mirantis.com>
CI is failing when bind-mounting source from the host into the dev-container;
fatal: detected dubious ownership in repository at '/go/src/github.com/docker/docker'
To add an exception for this directory, call:
git config --global --add safe.directory /go/src/github.com/docker/docker
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes the `docker save` and `docker load` work with the containerd
image store. The archive is both OCI and Docker compatible.
Saved archive will only contain content which is available locally. In
case the saved image is a multi-platform manifest list, the behavior
depends on the local availability of the content. This is to be
reconsidered when we have the `--platform` option in the CLI.
- If all manifests and their contents, referenced by the manifest list
are present, then the manifest-list is exported directly and the ID
will be the same.
- If only one platform manifest is present, only that manifest is
exported (the image id will change and will be the id of
platform-specific manifest, instead of the full manifest list).
- If multiple, but not all, platform manifests are available, a new
manifest list will be created which will be a subset of the original
manifest list.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Running `ctr` in `make shell` doesn't work by default - one has to
invoke it via `ctr --address /run/docker/containerd/containerd.sock -n
moby` or set the CONTAINERD_{ADDRESS,NAMESPACE} environment variables to
be able to just use `ctr`.
Considering ephemeral nature of these containers it's cumbersome to do
it every time manually - this commit sets the variables to the default
containerd socket address and makes it much easier to use ctr.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The Pid field of an exit event cannot be relied upon to differentiate
exits of the container's task from exits of other container processes,
i.e. execs. The Pid is reported by the runtime and is implementation-
defined so there is no guarantee that a task's pid is distinct from the
pids of any other process in the same container. In particular,
kata-containers reports the pid of the hypervisor for all exit events.
ContainerD guarantees that the process ID of a task is set to the
corresponding container ID, so use that invariant to distinguish task
exits from other process exits.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The ID of the task is known at the time that the FIFOs need to be
created (it's passed into the IO-creator callback, and is also the same
as the container ID) so there is no need to hardcode it to "init". Name
the FIFOs after the task ID to be consistent with the FIFO names of
exec'ed processes. Delete the now-unused InitProcessName constant so it
can never again be used in place of a task/process ID.
Signed-off-by: Cory Snider <csnider@mirantis.com>
ContainerD unconditionally sets the ID of a task to its container's ID.
Emulate this behaviour in the libcontainerd local_windows implementation
so that the daemon can use ProcessID == ContainerID (in libcontainerd
terminology) to identify that an exit event is for the container's task
and not for another process (i.e. an exec) in the same container.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The latest version of containerd-shim-runhcs-v1 (v0.10.0-rc.4) pulled in
with the bump to ContainerD v1.7.0-rc.3 had several changes to make it
more robust, which had the side effect of increasing the worst-case
amount of time it takes for a container to exit in the worst case.
Notably, the total timeout for shutting down a task increased from 30
seconds to 60! Increase the timeouts hardcoded in the daemon and
integration tests so that they don't give up too soon.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Notable Updates
- Fix push error propagation
- Fix slice append error with HugepageLimits for Linux
- Update default seccomp profile for PKU and CAP_SYS_NICE
- Fix overlayfs error when upperdirlabel option is set
full diff: https://github.com/containerd/containerd/compare/v1.6.15...v1.6.16
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The (*bitseq.Handle).Destroy() method deletes the persisted KVObject
from the datastore. This is a no-op on all the bitseq handles in package
ipam as they are not persisted in any datastore.
Signed-off-by: Cory Snider <csnider@mirantis.com>
That method was only referenced by ipam.Allocator, but as it no longer
stores any state persistently there is no possibility for it to load an
inconsistent bit-sequence from Docker 1.9.x.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The DatastoreConfig discovery type is unused. Remove the constant and
any resulting dead code. Today's biggest loser is the IPAM Allocator:
DatastoreConfig was the only type of discovery event it was listening
for, and there was no other place where a non-nil datastore could be
passed into the allocator. Strip out all the dead persistence code from
Allocator, leaving it as purely an in-memory implementation.
There is no more need to check the consistency of the allocator's
bit-sequences as there is no persistent storage for inconsistent bit
sequences to be loaded from.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Per the Interface Segregation Principle, network drivers should not have
to depend on GetPluginGetter methods they do not use. The remote network
driver is the only one which needs a PluginGetter, and it is already
special-cased in Controller so there is no sense warping the interfaces
to achieve a foolish consistency. Replace all other network drivers' Init
functions with Register functions which take a driverapi.Registerer
argument instead of a driverapi.DriverCallback. Add back in Init wrapper
functions for only the drivers which Swarmkit references so that
Swarmkit can continue to build.
Refactor the libnetwork Controller to use the new drvregistry.Networks
and drvregistry.IPAMs driver registries in place of the legacy ones.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There is no benefit to having a single registry for both IPAM drivers
and network drivers. IPAM drivers are registered in a separate namespace
from network drivers, have separate registration methods, separate
accessor methods and do not interact with network drivers within a
DrvRegistry in any way. The only point of commonality is
interface { GetPluginGetter() plugingetter.PluginGetter }
which is only used by the respective remote drivers and therefore should
be outside of the scope of a driver registry.
Create new, separate registry types for network drivers and IPAM
drivers, respectively. These types are "legacy-free". Neither type has
GetPluginGetter methods. The IPAMs registry does not have an
IPAMDefaultAddressSpaces method as that information can be queried
directly from the driver using its GetDefaultAddressSpaces method.
The Networks registry does not have an AddDriver method as that method
is a trivial wrapper around calling one of its arguments with its other
arguments.
Refactor DrvRegistry in terms of the new IPAMs and Networks registries
so that existing code in libnetwork and Swarmkit will continue to work.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The datastore arguments to the IPAM driver Init() functions are always
nil, even in Swarmkit. The only IPAM driver which consumed the
datastores was builtin; all others (null, remote, windowsipam) simply
ignored it. As the signatures of the IPAM driver init functions cannot
be changed without breaking the Swarmkit build, they have to be left
with the same signatures for the time being. Assert that nil datastores
are always passed into the builtin IPAM driver's init function so that
there is no ambiguity the datastores are no longer respected.
Add new Register functions for the IPAM drivers which are free from the
legacy baggage of the Init functions. (The legacy Init functions can be
removed once Swarmkit is migrated to using the Register functions.) As
the remote IPAM driver is the only one which depends on a PluginGetter,
pass it in explicitly as an argument to Register. The other IPAM drivers
should not be forced to depend on a GetPluginGetter() method they do not
use (Interface Segregation Principle).
Signed-off-by: Cory Snider <csnider@mirantis.com>
Without (*Controller).ReloadConfiguration, the only way to configure
datastore scopes would be by passing config.Options to libnetwork.New.
The only options defined which relate to datastore scopes are limited to
configuring the local-scope datastore. Furthermore, the default
datastore config only defines configuration for the local-scope
datastore. The local-scope datastore is therefore the only datastore
scope possible in libnetwork. Start removing code which is only
needed to support multiple datastore scopes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Repro steps:
- Run Docker Desktop
- Run `docker run busybox tail -f /dev/null`
- Run `pkill "Docker Desktop"
Expected:
An error message that indicates that Docker Desktop is shutting down.
Actual:
An error message that looks like this:
```
error waiting for container: invalid character 's' looking for beginning of value
```
here's an example:
https://github.com/docker/for-mac/issues/6575#issuecomment-1324879001
After this change, you get an error message like:
```
error waiting for container: copying response body from Docker: unexpected EOF
```
which is a bit more explicit.
Signed-off-by: Nick Santos <nick.santos@docker.com>
ConfigLocalScopeDefaultNetworks is now dead code, thank goodness! Make
sure it stays dead by deleting the function. Refactor package ipamutils
to simplify things given its newly-reduced (ahem) scope.
Signed-off-by: Cory Snider <csnider@mirantis.com>
ipam.Allocator is not a singleton, but it references mutable singleton
state. Address that deficiency by refactoring it to instead take the
predefined address spaces as constructor arguments. Unfortunately some
work is needed on the Swarmkit side before the mutable singleton state
can be completely eliminated from the IPAMs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The function references global shared, mutable state and is no longer
needed. Deleting it brings us one step closer to getting rid of that
pesky shared state.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The netutils.ElectInterfaceAddresses function is only used in one place
outside of tests: in the daemon, to configure the default bridge
network. The function is also messy to reason about as it references the
shared mutable state of ipamutils.PredefinedLocalScopeDefaultNetworks.
It uses the list of predefined default networks to always return an IPv4
address even if the named interface does not exist or does not have any
IPv4 addresses. This list happens to be the same as the one used to
initialize the address pool of the 'builtin' IPAM driver, though that is
far from obvious. (Start with "./libnetwork".initIPAMDrivers and trace
the dataflow of the addressPool value. Surprise! Global state is being
mutated using the value of other global mutable state.)
The daemon does not need the fallback behaviour of
ElectInterfaceAddresses. In fact, the daemon does not have to configure
an address pool for the network at all! libnetwork will acquire one of
the available address ranges from the network's IPAM driver when the
preferred-pool configuration is unset. It will do so using the same list
of address ranges and the exact same logic
(netutils.FindAvailableNetworks) as ElectInterfaceAddresses. So unless
the daemon needs to force the network to use a specific address range
because the bridge interface already exists, it can leave the details
up to libnetwork.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The pattern of parsing bool was repeated across multiple files and
caused the duplication of the invalidFilter error helper.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
These package-level variables were copied over from the Linux
implementation; drop them for clarity's sake.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
netlink offers the netlink.LinkNotFoundError type, which we can use with
errors.As() to detect a unused link name.
Additionally, early return if GenerateRandomName fails, as reading
random bytes should be a highly reliable operation, and otherwise the
error would be swallowed by the fall-through return.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
GenerateRandomName now uses length to represent the overall length of
the string; this will help future users avoid creating interface names
that are too long for the kernel to accept by mistake. The test coverage
is increased and cleaned up using gotest.tools.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
After the context is cancelled, update the progress for the last time.
This makes sure that the progress also includes finishing updates.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
*bitseq.Handle should not implement sync.Locker. The mutex is a private
implementation detail which external consumers should not be able to
manipulate.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The byte-slice temporary is fully overwritten on each loop iteration so
it can be safely reused to reduce GC pressure.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There is a solid bit-vector datatype hidden inside bitseq.Handle, but
it's obscured by all the intrusive datastore and KVObject nonsense.
It can be used without a datastore, but the datastore baggage limits its
utility inside and outside libnetwork. Extract the datatype goodness
into its own package which depends only on the standard library so it
can be used in more situations.
Signed-off-by: Cory Snider <csnider@mirantis.com>
...in preparation for separating the bit-sequence datatype from the
datastore persistence (KVObject) concerns. The new package's contents
are identical to those of package libnetwork/bitseq to assist in
reviewing the changes made on each side of the split.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This var was used for the cross target but it has been removed
in 8086f40123 so not necessary anymore
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Has been introduced in 232d59baeb to work around a bug with
"go build" but not required anymore since go 1.5: 4dab6d01f1
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
We already prefer ld for cross-building arm64 but that seems
not enough as native arm64 build also has a linker issue with lld
so we need to also prefer ld for native arm64 build.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Static binaries for dockerd are broken on armhf and armel (32-bit).
It seems to be an issue with GCC as building using clang solves
this issue. Also adds extra instruction to prefer ld for
cross-compiling arm64 in bullseye otherwise it doesn't link.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Disables user.Lookup() and net.LookupHost() in the init() function on Windows.
Any package that simply imports pkg/chrootarchive will panic on Windows
Nano Server, due to missing netapi32.dll. While docker itself is not
meant to run on Nano Server, binaries that may import this package and
run on Nano server, will fail even if they don't really use any of the
functionality in this package while running on Nano.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Current implementation in hack/make.sh overwrites PKG_CONFIG
if not defined and set it to pkg-config. When a build is invoked
using xx in our Dockerfile, it will set PKG_CONFIG to the right
value in go environments depending on the target architecture: 8015613ccc/base/xx-go (L75-L78)
Also needs to install dpkg-dev to use pkg-config when cross-building
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Build currently doesn't set the right name for target ARM
architecture through switches in CGO_CFLAGS and CGO_CXXFLAGS
when doing cross-compilation. This was previously fixed in https://github.com/moby/moby/pull/43474
Also removes the toolchain configuration. Following changes for
cross-compilation in https://github.com/moby/moby/pull/44546,
we forgot to remove the toolchain configuration that is
not used anymore as xx already sets correct cc/cxx envs already.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
This reverts commit 2d397beb00.
moby#44706 and moby#44805 were both merged, and both refactored the same
file. The combination broke the build, and was not detected in CI as
only the combination of the two, applied to the same parent commit,
caused the failure.
moby#44706 should be carried forward, based on the current master, in
order to resolve this conflict.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Embedded structs are part of the exported surface of a struct type.
Boxing a struct value into an interface value does not erase that;
any code could gain access to the embedded struct value with a simple
type assertion. The mutex is supposed to be a private implementation
detail, but *network implements sync.Locker because the mutex is
embedded. Change the mutex to an unexported field so *network no
longer spuriously implements the sync.Locker interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Embedded structs are part of the exported surface of a struct type.
Boxing a struct value into an interface value does not erase that;
any code could gain access to the embedded struct value with a simple
type assertion. The mutex is supposed to be a private implementation
detail, but *endpoint implements sync.Locker because the mutex is
embedded. Change the mutex to an unexported field so *endpoint no
longer spuriously implements the sync.Locker interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Basically every exported method which takes a libnetwork.Sandbox
argument asserts that the value's concrete type is *sandbox. Passing any
other implementation of the interface is a runtime error! This interface
is a footgun, and clearly not necessary. Export and use the concrete
type instead.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Embedded structs are part of the exported surface of a struct type.
Boxing a struct value into an interface value does not erase that;
any code could gain access to the embedded struct value with a simple
type assertion. The mutex is supposed to be a private implementation
detail, but *sandbox implements sync.Locker because the mutex is
embedded. Change the mutex to an unexported field so *sandbox no
longer spuriously implements the sync.Locker interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Embedded structs are part of the exported surface of a struct type.
Boxing a struct value into an interface value does not erase that;
any code could gain access to the embedded struct value with a simple
type assertion. The mutex is supposed to be a private implementation
detail, but *controller implements sync.Locker because the mutex is
embedded.
c, _ := libnetwork.New()
c.(sync.Locker).Lock()
Change the mutex to an unexported field so *controller no longer
spuriously implements the sync.Locker interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
unshare.Go() is not used as an existing network namespace needs to be
entered, not a new one created. Explicitly lock main() to the initial
thread so as not to depend on the side effects of importing the
internal/unshare package to achieve the same.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- The oldest kernel version currently supported is v3.10. Bridge
parameters can be set through netlink since v3.8 (see
torvalds/linux@25c71c7). As such, we don't need to fallback to sysfs to
set hairpin mode.
- `scanInterfaceStats()` is never called, so no need to keep it alive.
- Document why `default_pvid` is set through sysfs
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
containerd: Push progress
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When userland-proxy is turned off and on again, the iptables nat rule
doing hairpinning isn't properly removed. This fix makes sure this nat
rule is removed whenever the bridge is torn down or hairpinning is
disabled (through setting userland-proxy to true).
Unlike for ip masquerading and ICC, the `programChainRule()` call
setting up the "MASQ LOCAL HOST" rule has to be called unconditionally
because the hairpin parameter isn't restored from the driver store, but
always comes from the driver config.
For the "SKIP DNAT" rule, things are a bit different: this rule is
always deleted by `removeIPChains()` when the bridge driver is
initialized.
Fixes#44721.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
If the imported layer archive is uncompressed, it gets compressed with
gzip before writing to the content store.
Archives compressed with gzip and zstd are imported as-is.
Xz and bzip2 are recompressed into gzip.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This helps ensure that users are not surprised by unexpected tokens in
the JSON parser, or fallout later in the daemon.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is a pragmatic but impure choice, in order to better support the
default tools available on Windows Server, and reduce user confusion due
to otherwise inscrutable-to-the-uninitiated errors like the following:
> invalid character 'þ' looking for beginning of value
> invalid character 'ÿ' looking for beginning of value
While meaningful to those who are familiar with and are equipped to
diagnose encoding issues, these characters will be hidden when the file
is edited with a BOM-aware text editor, and further confuse the user.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
dockerd handles SIGQUIT by dumping all goroutine stacks to standard
error and exiting. In contrast, the Go runtime's default SIGQUIT
behaviour... dumps all goroutine stacks to standard error and exits.
The default SIGQUIT behaviour is implemented directly in the runtime's
signal handler, and so is both more robust to bugs in the Go runtime and
does not perturb the state of the process to anywhere near same degree
as dumping goroutine stacks from a user goroutine. The only notable
difference from a user's perspective is that the process exits with
status 2 instead of 128+SIGQUIT.
Signed-off-by: Cory Snider <csnider@mirantis.com>
synchronises some fixes between these API versions for the documentation,
including fixes from:
- 52a9f1689a
- 345346d7c6
- 18f85467e7
- 1557892c37
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
synchronises some fixes between these API versions for the documentation,
including fixes from:
- 18f85467e7
- 345346d7c6
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Trying to remove the "docker.io" domain from locations where it's not relevant.
In these cases, this domain was used as a "random" domain for testing or example
purposes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Trying to remove the "docker.io" domain from locations where it's not relevant.
In these cases, this domain was used as a "random" domain for testing or example
purposes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
[RFC 8259] allows for JSON implementations to optionally ignore a BOM
when it helps with interoperability; do so in Moby as Notepad (the only
text editor available out of the box in many versions of Windows Server)
insists on writing UTF-8 with a BOM.
[RFC 8259]: https://tools.ietf.org/html/rfc8259#section-8.1
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
We only need suitable UAPI headers now. They are available on kernel 4.7
and newer; out of the distributions currently in support that users
might be interested in, only Enterprise Linux 7 has too old a kernel
(3.10).
Users of Enterprise Linux 7 distros can compile using a newer platform,
disable the Btrfs graphdriver as documented in this file, or use newer
kernel headers on their older distro.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
By relying on the kernel UAPI (userspace API), we can drop a dependency
and simplify building Moby, while also ensuring that we are using a
stable/supported source of the C types and defines we need.
btrfs-progs mirrors the kernel headers, but the headers it ships with
are not the canonical source and as [we have seen before][44698], could
be subject to changes.
Depending on the canonical headers from the kernel both is more
idiomatic, and ensures we are protected by the kernel's promise to not
break userspace.
[44698]: https://github.com/moby/moby/issues/44698
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is actually quite meaningless as we are reporting the libbtrfs
version, but we do not use libbtrfs. We only use the kernel interface to
btrfs instead.
While we could report the version of the kernel headers in play, they're
rather all-or-nothing: they provide the structures and defines we need,
or they don't. As such, drop all version information as the host kernel
version is the only thing that matters.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
cmd.Wait is called twice from different goroutines which can cause the
test to hang completely. Fix by calling Wait only once and sending its
return value over a channel.
In TestLogsFollowGoroutinesWithStdout also added additional closes and
process kills to ensure that we don't leak anything in case test returns
early because of failed test assertion.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is a squashed version of various PRs (or related code-changes)
to implement image inspect with the containerd-integration;
- add support for image inspect
- introduce GetImageOpts to manage image inspect data in backend
- GetImage to return image tags with details
- list images matching digest to discover all tags
- Add ExposedPorts and Volumes to the image returned
- Refactor resolving/getting images
- Return the image ID on inspect
- consider digest and ignore tag when both are set
- docker run --platform
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make it possible to add `-race` to the BUILDFLAGS without making the
build fail with error:
"-buildmode=pie not supported when -race is enabled"
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Conntrack entries are created for UDP flows even if there's nowhere to
route these packets (ie. no listening socket and no NAT rules to
apply). Moreover, iptables NAT rules are evaluated by netfilter only
when creating a new conntrack entry.
When Docker adds NAT rules, netfilter will ignore them for any packet
matching a pre-existing conntrack entry. In such case, when
dockerd runs with userland proxy enabled, packets got routed to it and
the main symptom will be bad source IP address (as shown by #44688).
If the publishing container is run through Docker Swarm or in
"standalone" Docker but with no userland proxy, affected packets will
be dropped (eg. routed to nowhere).
As such, Docker needs to flush all conntrack entries for published UDP
ports to make sure NAT rules are correctly applied to all packets.
- Fixes#44688
- Fixes#8795
- Fixes#16720
- Fixes#7540
- Fixesmoby/libnetwork#2423
- and probably more.
As a precautionary measure, those conntrack entries are also flushed
when revoking external connectivity to avoid those entries to be reused
when a new sandbox is created (although the kernel should already
prevent such case).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The CreatedAt date was determined from the volume's `_data`
directory (`/var/lib/docker/volumes/<volumename>/_data`).
However, when initializing a volume, this directory is updated,
causing the date to change.
Instead of using the `_data` directory, use its parent directory,
which is not updated afterwards, and should reflect the time that
the volume was created.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The image store's used are an interface, so there's no guarantee
that implementations don't wrap the errors. Make sure to catch
such cases by using errors.Is.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We still need a stage that build binaries and extra tools as well for
docker-ce-packaging repo: ff110508ff/static/Makefile (L41-L57)
This could be removed if we create a package for each project
like it's done in docker-packaging repo: https://github.com/docker/packaging/tree/main/pkg
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Better support for cross compilation so we can fully rely
on `--platform` flag of buildx for a seamless integration.
This removes unnecessary extra cross logic in the Dockerfile,
DOCKER_CROSSPLATFORMS and CROSS vars and some hack scripts as well.
Non-sandboxed build invocation is still supported and dev stages
in the Dockerfile have been updated accordingly.
Bake definition and GitHub Actions workflows have been updated
accordingly as well.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
This function was suppressing errors coming from ConfigureTransport, with the
assumption that it will only be used with the Default configuration, and only return
errors for invalid / unsupported schemes (e.g., using "npipe" on a Linux environment);
d109e429dd/vendor/github.com/docker/go-connections/sockets/sockets_unix.go (L27-L29)
Those errors won't happen when the default is passed, so this is mostly theoretical.
Let's return the error instead (which should always be nil), just to be on the save
side, and to make sure that any other use of this function will return errors that
occurred, which may also be when parsing proxy environment variables;
d109e429dd/vendor/github.com/docker/go-connections/sockets/sockets.go (L29)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code below is run when restoring all images (which can be "many"),
constructing the "logrus.WithFields" is deliberately not "DRY", as the
logger is only used for error-cases, and we don't want to do allocations
if we don't need it. A "f" type-alias was added to make it ever so slightly
more DRY, but that's just for convenience :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Having this function hides what it's doing, which is just to type-cast
to an image.ID (which is a digest). Using a cast is more transparent,
so deprecating this function in favor of a regular typecast.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Simplify the error message so that we don't have to distinguish between static-
and non-static builds. Also update the link to the storage-driver section to
use a "/go/" redirect in the docs, as the anchor link was no longer correct.
Using a "/go/" redirect makes sure the link remains functional if docs is moving
around.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in b86e3bee5a to
work around an issue in os/user.Current(), which SEGFAULTS when compiling
statically with cgo enabled (see golang/go#13470).
We hit similar issues in other parts, and contributed a "osusergo" build-
tag in https://go-review.googlesource.com/c/go/+/330753. The "osusergo"
build tag must be set when compiling static binaries with cgo enabled.
If that build-tag is set, the cgo implementation for user.Current() won't
be used, and a pure-go implementation is used instead;
https://github.com/golang/go/blob/go1.19.4/src/os/user/cgo_lookup_unix.go#L5
With the above in place, we no longer need this workaround, and can remove
the ensureHomeIfIAmStatic() function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The IPCMode type was added in 497fc8876e, and from
that patch, the intent was to allow `host` (without `:`), `""` (empty, default)
or `container:<container ID>`, but the `Valid()` function seems to be too relaxed
and accepting both `:`, as well as `host:<anything>`. No unit-tests were added
in that patch, and integration-tests only tested for valid values.
Later on, `PidMode`, and `UTSMode` were added in 23feaaa240
and f2e5207fc9, both of which were implemented as
a straight copy of the `IPCMode` implementation, copying the same bug.
Finally, commit d4aec5f0a6 implemented unit-tests
for these types, but testing for the wrong behavior of the implementation.
This patch updates the validation to correctly invalidate `host[:<anything>]`
and empty (`:`) types.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It only had a single implementation, so we may as well remove the added
complexity of defining it as an interface.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After further looking at the code, it appears that the default exit-code
for unknown (other) errors is 128 (set in `defer`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These matches were overwriting the previous "match", so reversing the
order in which they're tried so that we can return early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Rename the variable make it more visible where it's used, as there's were
other "err" variables masking it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
They were just small wrappers arround cli.Args(), and the abstraction
made one wonder if they were doing some "magic" things, but they weren't,
so just inlining the `cli.Args()` makes it more transparent what's executed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Remove `WaitRestart()` as it was no longer used
- Un-export `WaitForInspectResult()` as it was only used internally, and we want
to reduce uses of these utilities.
- Inline `appendDocker()` into `Docker()` as it was the only place it was used,
and the name was incorrect anyway (should've been named `prependXX`).
- Simplify `Args()`, as it was first splitting the slice (into `command` and `args`),
only to join them again into a single slice (in `icmd.Command()`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were moved to api/types/container in 7ac4232e70,
but the unit-tests for them were not moved. This patch moves the unit-tests back together
with the types.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
iptables -C flag was introduced in v1.4.11, which was released ten
years ago. Thus, there're no more Linux distributions supported by
Docker using this version. As such, this commit removes the old way of
checking if an iptables rule exists (by using substring matching).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The former was doing some checks and logging warnings, whereas
the latter was doing the same checks but to set some internal variables.
As both are called only once and from the same place, there're now
merged together.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
These were only used in a single location, and in a rather bad way;
replace them with strings.Cut() which should be all we need for this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- config collided with import
- cap collided with a built-in
- c collided with the "controller" receiver
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was forked from libcontainer (now runc) in
fb6dd9766e
From the description of this code:
> THIS CODE DOES NOT COMMUNICATE WITH KERNEL VIA RTNETLINK INTERFACE
> IT IS HERE FOR BACKWARDS COMPATIBILITY WITH OLDER LINUX KERNELS
> WHICH SHIP WITH OLDER NOT ENTIRELY FUNCTIONAL VERSION OF NETLINK
That comment was added as part of a refactor in;
4fe2c7a4db
Digging deeper into the code, it describes:
> This is more backward-compatible than netlink.NetworkSetMaster and
> works on RHEL 6.
That comment (and code) moved around a few times;
- moved into the libcontainer pkg: 6158ccad97
- moved within the networkdriver pkg: 4cdcea2047
- moved into the networkdriver pkg: 90494600d3
Ultimately leading to 7a94cdf8ed, which implemented
this:
> create the bridge device with ioctl
>
> On RHEL 6, creation of a bridge device with netlink fails. Use the more
> backward-compatible ioctl instead. This fixes networking on RHEL 6.
So from that information, it looks indeed to support RHEL 6, and Ubuntu 12.04
which are both EOL, and we haven't supported for a long time, so probably time
to remove this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fixes a (theoretical?) panic if ID would be shorter than 12
characters. Also trim the ID _after_ cutting off the suffix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We're looking for a specific prefix, so remove the prefix instead. Also remove
redundant error-wrapping, as `os.Open()` already provides details in the error
returned;
open /no/such/file: no such file or directory
open /etc/os-release: permission denied
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a single exported implementation, so that we can maintain the
GoDoc string in one place, and use non-exported functions for the
actual implementation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types and functions are more closely related to the functionality
provided by pkg/systeminfo, and used in conjunction with the other functions
in that package, so moving them there.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a single exported implementation, so that we can maintain the
GoDoc string in one place, and use non-exported functions for the
actual implementation (which were already in place).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility wasn't very related to all other utilities in pkg/ioutils.
Moving it to longpath to also make it more clear what it does.
It looks like there's only a single (public) external consumer of this
utility, and only used in a test, and it's not 100% clear if it was
intentional to use our package, of if it was a case of "I actually meant
`io/ioutil.MkdirTemp`" so we could consider skipping the alias.
While moving the package, I also renamed `TempDir` to `MkdirTemp`, which
is the signature it matches in "os" from stdlib.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
No changes in vendored code, but removes some indirect dependencies.
full diff: b17f02f0a0...0da442b278
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since b58de39ca7, this option was now only used
to produce a fatal error when starting the daemon. That change is in the 23.0
release, so we can remove it from the master branch.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was added in 677a6b3506, and named
"common", because at the time, the "docker" and "dockerd" (daemon) code
were still in the same repository, and shared this type. Renaming it, now
that's no longer the case.
As there are no external consumers of this type, I'm not adding an alias.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The configDir (and "DOCKER_CONFIG" environment variable) is now only used
for the default location for TLS certificates to secure the daemon API.
It is a leftover from when the "docker" and "dockerd" CLI shared the same
binary, allowing the DOCKER_CONFIG environment variable to set the location
for certificates to be used by both.
This patch merges it into cmd/dockerd, which is where it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The cli package is a leftover from when the "docker" and "dockerd" cli
were both maintained in this repository; the only consumer of this is
now the dockerd CLI, so we can move this code (it should not be imported
by anyone).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Moby is not a Go module; to prevent anyone from mistakenly trying to
convert it to one before we are ready, introduce a check (usable in CI
and locally) for a go.mod file.
This is preferable to trying to .gitignore the file as we can ensure
that a mistakenly created go.mod is surfaced by Git-based tooling and is
less likely to surprise a contributor.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
To make the local build environment more correct and consistent, we
should never leave an uncommitted go.mod in the tree; however, we need a
go.mod for certain commands to work properly. Use a wrapper script to
create and destroy the go.mod as needed instead of potentially changing
tooling behavior by leaving it.
If a go.mod already exists, this script will warn and call the wrapped
command with GO111MODULE=on.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(*Daemon).registerLinks() calling the WriteHostConfig() method of its
container argument is a vestigial behaviour. In the distant past,
registerLinks() would persist the container links in an SQLite database
and drop the link config from the container's persisted HostConfig. This
changed in Docker v1.10 (#16032) which migrated away from SQLite and
began using the link config in the container's HostConfig as the
persistent source of truth. registerLinks() no longer mutates the
HostConfig at all so persisting the HostConfig to disk falls outside of
its scope of responsibilities.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*Container).CheckpointTo() upserts a snapshot of the container to the
daemon's in-memory ViewDB and also persists the snapshot to disk. It
does not register the live container object with the daemon's container
store, however. The ViewDB and container store are used as the source of
truth for different operations, so having a container registered in one
but not the other can result in inconsistencies. In particular, the List
Containers API uses the ViewDB as its source of truth and the Container
Inspect API uses the container store.
The (*Daemon).setHostConfig() method is called fairly early in the
process of creating a container, long before the container is registered
in the daemon's container store. Due to a rogue CheckpointTo() call
inside setHostConfig(), there is a window of time where a container can
be included in a List Containers API response but "not exist" according
to the Container Inspect API and similar endpoints which operate on a
particular container. Remove the rogue call so that the caller has full
control over when the container is checkpointed and update callers to
checkpoint explicitly. No changes to (*Daemon).create() are needed as it
checkpoints the fully-created container via (*Daemon).Register().
Fixes#44512.
Signed-off-by: Cory Snider <csnider@mirantis.com>
GetContainer() would return (nil, nil) when looking up a container
if the container was inserted into the containersReplica ViewDB but not
the containers Store at the time of the lookup. Callers which reasonably
assume that the returned err == nil implies returned container != nil
would dereference a nil pointer and panic. Change GetContainer() so that
it always returns a container or an error.
Signed-off-by: Cory Snider <csnider@mirantis.com>
the non-exported "daemon.createNetwork" already returns nil if there's
an error, so no need to check the error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a dependency of github.com/fluent/fluent-logger-golang, which
currently does not provide a go.mod, but tests against the latest
versions of its dependencies.
Updating this dependency to the latest version.
Notable changes:
- all: implement omitempty
- fix: JSON encoder may produce invalid utf-8 when provided invalid utf-8 message pack string.
- added Unwrap method to errWrapped plus tests; switched travis to go 1.14
- CopyToJSON: fix bitSize for floats
- Add Reader/Writer constructors with custom buffer
- Add missing bin header functions
- msgp/unsafe: bring code in line with unsafe guidelines
- msgp/msgp: fix ReadMapKeyZC (fix "Fail to decode string encoded as bin type")
full diff: https://github.com/tinylib/msgp/compare/v1.1.0...v1.1.6
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is an (indirect) dependency of github.com/fluent/fluent-logger-golang,
which currently does not provide a go.mod, but tests against the latest
versions of its dependencies.
Updating this dependency to the latest version.
full diff: https://github.com/philhofer/fwd/compare/v1.0.0...v1.1.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows differentiating how the detailed data is collected between
the containerd-integration code and the existing implementation.
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The List Images API endpoint has accepted multiple values for the
`since` and `before` filter predicates, but thanks to Go's randomizing
of map iteration order, it would pick an arbitrary image to compare
created timestamps against. In other words, the behaviour was undefined.
Change these filter predicates to have well-defined semantics: the
logical AND of all values for each of the respective predicates. As
timestamps are a totally-ordered relation, this is exactly equivalent to
applying the newest and oldest creation timestamps for the `since` and
`before` predicates, respectively.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Also using `bytes.TrimSuffix()`, which is slightly more readable, and
makes sure we're only stripping the null terminator.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
no changes in vendored code, but containerd v1.6.12 is a security release,
so updating, to prevent scanners marking the dependency to have a vulnerability.
full diff: https://github.com/containerd/containerd/compare/v1.6.11...v1.6.12
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
golang.org/x/net contains a fix for CVE-2022-41717, which was addressed
in stdlib in go1.19.4 and go1.18.9;
> net/http: limit canonical header cache by bytes, not entries
>
> An attacker can cause excessive memory growth in a Go server accepting
> HTTP/2 requests.
>
> HTTP/2 server connections contain a cache of HTTP header keys sent by
> the client. While the total number of entries in this cache is capped,
> an attacker sending very large keys can cause the server to allocate
> approximately 64 MiB per open connection.
>
> This issue is also fixed in golang.org/x/net/http2 v0.4.0,
> for users manually configuring HTTP/2.
full diff: https://github.com/golang/net/compare/v0.2.0...v0.4.0
other dependency updates (due to circular dependencies):
- golang.org/x/sys v0.3.0: https://github.com/golang/sys/compare/v0.2.0...v0.3.0
- golang.org/x/text v0.5.0: https://github.com/golang/text/compare/v0.4.0...v0.5.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Fix nil pointer deference for Windows containers in CRI plugin
- Fix lease labels unexpectedly overwriting expiration
- Fix for simultaneous diff creation using the same parent snapshot
full diff: https://github.com/containerd/containerd/v1.6.10...v1.6.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Includes security fixes for net/http (CVE-2022-41717, CVE-2022-41720),
and os (CVE-2022-41720).
These minor releases include 2 security fixes following the security policy:
- os, net/http: avoid escapes from os.DirFS and http.Dir on Windows
The os.DirFS function and http.Dir type provide access to a tree of files
rooted at a given directory. These functions permitted access to Windows
device files under that root. For example, os.DirFS("C:/tmp").Open("COM1")
would open the COM1 device.
Both os.DirFS and http.Dir only provide read-only filesystem access.
In addition, on Windows, an os.DirFS for the directory \(the root of the
current drive) can permit a maliciously crafted path to escape from the
drive and access any path on the system.
The behavior of os.DirFS("") has changed. Previously, an empty root was
treated equivalently to "/", so os.DirFS("").Open("tmp") would open the
path "/tmp". This now returns an error.
This is CVE-2022-41720 and Go issue https://go.dev/issue/56694.
- net/http: limit canonical header cache by bytes, not entries
An attacker can cause excessive memory growth in a Go server accepting
HTTP/2 requests.
HTTP/2 server connections contain a cache of HTTP header keys sent by
the client. While the total number of entries in this cache is capped,
an attacker sending very large keys can cause the server to allocate
approximately 64 MiB per open connection.
This issue is also fixed in golang.org/x/net/http2 vX.Y.Z, for users
manually configuring HTTP/2.
Thanks to Josselin Costanzi for reporting this issue.
This is CVE-2022-41717 and Go issue https://go.dev/issue/56350.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.19.4
And the milestone on the issue tracker:
https://github.com/golang/go/issues?q=milestone%3AGo1.19.4+label%3ACherryPickApproved
Full diff: https://github.com/golang/go/compare/go1.19.3...go1.19.4
The golang.org/x/net fix is in 1e63c2f08a
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Kevin has been doing a lot of work helping modernize our CI, and helping
with BuildKit-related changes. Being our "resident GitHub actions expert",
I think it make sense to add him to the maintainers list as a curator (for
starters), to grant him permissions on the repository.
Perhaps we should nominate him to become a maintainer (with a focus on his
area of expertise) :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This addresses a regression introduced in 407e3a4552,
which turned out to be "too strict", as there's old images that use, for example;
docker pull python:3.5.1-alpine
3.5.1-alpine: Pulling from library/python
unsupported media type application/octet-stream
Before 407e3a4552, such mediatypes were accepted;
docker pull python:3.5.1-alpine
3.5.1-alpine: Pulling from library/python
e110a4a17941: Pull complete
30dac23631f0: Pull complete
202fc3980a36: Pull complete
Digest: sha256:f88925c97b9709dd6da0cb2f811726da9d724464e9be17a964c70f067d2aa64a
Status: Downloaded newer image for python:3.5.1-alpine
docker.io/library/python:3.5.1-alpine
This patch copies the additional media-types, using the list of types that
were added in a215e15cb1, which fixed a
similar issue.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syncs the seccomp-profile with the latest changes in containerd's
profile, applying the same changes as 17a9324035
Some background from the associated ticket:
> We want to use vsock for guest-host communication on KubeVirt
> (https://github.com/kubevirt/kubevirt). In KubeVirt we run VMs in pods.
>
> However since anyone can just connect from any pod to any VM with the
> default seccomp settings, we cannot limit connection attempts to our
> privileged node-agent.
>
> ### Describe the solution you'd like
> We want to deny the `socket` syscall for the `AF_VSOCK` family by default.
>
> I see in [1] and [2] that AF_VSOCK was actually already blocked for some
> time, but that got reverted since some architectures support the `socketcall`
> syscall which can't be restricted properly. However we are mostly interested
> in `arm64` and `amd64` where limiting `socket` would probably be enough.
>
> ### Additional context
> I know that in theory we could use our own seccomp profiles, but we would want
> to provide security for as many users as possible which use KubeVirt, and there
> it would be very helpful if this protection could be added by being part of the
> DefaultRuntime profile to easily ensure that it is active for all pods [3].
>
> Impact on existing workloads: It is unlikely that this will disturb any existing
> workload, becuase VSOCK is almost exclusively used for host-guest commmunication.
> However if someone would still use it: Privileged pods would still be able to
> use `socket` for `AF_VSOCK`, custom seccomp policies could be applied too.
> Further it was already blocked for quite some time and the blockade got lifted
> due to reasons not related to AF_VSOCK.
>
> The PR in KubeVirt which adds VSOCK support for additional context: [4]
>
> [1]: https://github.com/moby/moby/pull/29076#commitcomment-21831387
> [2]: dcf2632945
> [3]: https://kubernetes.io/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads
> [4]: https://github.com/kubevirt/kubevirt/pull/8546
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit allows to remove dependency on the mutable version armon/go-radix.
The go-immutable-radix package is better maintained.
It is likely that a bit more memory will be used when using the
immutable version, though discarded nodes are being reused in a pool.
These changes happen when networks are added/removed or nodes come and
go in a cluster, so we are still talking about a relatively low
frequency event.
The major changes compared to the old radix are when modifying (insert
or delete) a tree, and those are pretty self-contained: we replace the
entire immutable tree under a lock.
Signed-off-by: Tibor Vass <teabee89@gmail.com>
This makes the `ImageList` function to add `shared-size=1` to the url
query when user caller sets the SharedSize.
SharedSize support was introduced in API version 1.42. This field was
added to the options struct, but client wasn't adjusted.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
We only need the content here, not the checksum, so simplifying the code by
just using os.ReadFile().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We only need the content here, not the checksum, so simplifying the code by
just using os.ReadFile().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some of these options required parsing the resolv.conf file, but the function
could return an error further down; this patch moves the parsing closer to
where their results are used (which may not happen if we're encountering an
error before).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We only need the content here, not the checksum, so simplifying the code by
just using os.ReadFile().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The existing code was always parsing the host's resolv.conf to read
the nameservers, searchdomain and options, but those options were
only needed if these options were not configured on the sandbox.
This patch reverses the logic to only parse the resolv.conf if
no options are present in the sandbox configuration.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We only need the content here, not the checksum, so simplifying the
code by just using os.ReadFile().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This one is a "bit" fuzzy, as it may not be _directly_ related to `archive`,
but it's always used _in combination_ with the archive package, so moving it
there.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This patch:
- Deprecates pkg/system.DefaultPathEnv
- Moves the implementation inside oci
- Adds TODOs to align the default in the Builder with the one used elsewhere
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The singleflight function was capturing the context.Context of the first
caller that invoked the `singleflight.Do`. This could cause all
concurrent calls to be cancelled when the first request is cancelled.
singleflight calls were also moved from the ImageService to Daemon, to
avoid having to implement this logic in both graphdriver and containerd
based image services.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is only used for tests, and the key is not verified anymore, so
instead of creating a key and storing it, we can just use an ad-hoc
one.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Turned out that the loadOrCreateTrustKey() utility was doing exactly the
same as libtrust.LoadOrCreateTrustKey(), so making it a thin wrapped. I kept
the tests to verify the behavior, but we could remove them as we only need this
for our integration tests.
The storage location for the generated key was changed (again as we only need
this for some integration tests), so we can remove the TrustKeyPath from the
config.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The migration code is in the 22.06 branch, and if we don't migrate
the only side-effect is the daemon's ID being regenerated (as a
UUID).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This `err` is special (as described at the top of the function), but due
to its name is easy to overlook, which risks the chance of inadvertently
shadowing it.
This patch renames the variable to reduce the chance of this happening.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Sam is on my team, and we started to do weekly triage sessions to
clean up the backlog. Adding him, so that he can help with doing
triage without my assistance :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The reference.ParseNormalizedNamed() utility already returns a Named
reference, but we're interested in wether the digest has a digest, so
check for that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Current Dockerfile downloads vpnkit for both linux/amd64
and linux/arm64 platforms even if target platform does not
match. This change will download vpnkit only if target
platform matches, otherwise it will just use a dummy scratch
stage.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
no significant changes in vendored code, other than updating build-tags
for go1.17, but removes some dependencies from the module, which can
help with future updates;
full diff: 3f7ff695ad...abb19827d3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While this replace was needed in swarmkit itself, it looks like
it doesn't cause issues when removed in this repository, so
let's remove it here.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously we had to use a replace rule, as later versions of this
module resulted in a panic. This issue was fixed in:
f30034d788
Which means we can remove the replace rule, and update the dependency.
No new release was tagged yet, so sticking to a "commit" for now.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Raspberry Pi allows to start system under overlayfs.
Docker is successfully fallbacks to fuse-overlay but not starting
because of the `Error starting daemon: rename /var/lib/docker/runtimes /var/lib/docker/runtimes-old: invalid cross-device link` error
It's happening because `rename` is not supported by overlayfs.
After manually removing directory `runtimes` docker starts and works successfully
Signed-off-by: Illia Antypenko <ilya@antipenko.pp.ua>
using TARGETVARIANT in frozen-images stage implies changes in
`download-frozen-image-v2.sh` script to add support for variants
so we are able to build against more platforms.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
The golang.org/x/ projects are now doing tagged releases.
Some notable changes:
- authhandler: Add support for PKCE
- Introduce new AuthenticationError type returned by errWrappingTokenSource.Token
- Add support to set JWT Audience in JWTConfigFromJSON()
- google/internal: Add AWS Session Token to Metadata Requests
- go.mod: update vulnerable net library
- google: add support for "impersonated_service_account" credential type.
- google/externalaccount: add support for workforce pool credentials
full diff: https://github.com/golang/oauth2/compare/2bc19b11175f...v0.1.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The file will still be available in Git history; we should drop it
however as it is misleading and obsolete.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is a partial revert of 7ca0cb7ffa, which
switched from os/exec to the golang.org/x/sys/execabs package to mitigate
security issues (mainly on Windows) with lookups resolving to binaries in the
current directory.
from the go1.19 release notes https://go.dev/doc/go1.19#os-exec-path
> ## PATH lookups
>
> Command and LookPath no longer allow results from a PATH search to be found
> relative to the current directory. This removes a common source of security
> problems but may also break existing programs that depend on using, say,
> exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe) in
> the current directory. See the os/exec package documentation for information
> about how best to update such programs.
>
> On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath
> environment variable, making it possible to disable the default implicit search
> of “.” in PATH lookups on Windows systems.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a partial revert of 7ca0cb7ffa, which
switched from os/exec to the golang.org/x/sys/execabs package to mitigate
security issues (mainly on Windows) with lookups resolving to binaries in the
current directory.
from the go1.19 release notes https://go.dev/doc/go1.19#os-exec-path
> ## PATH lookups
>
> Command and LookPath no longer allow results from a PATH search to be found
> relative to the current directory. This removes a common source of security
> problems but may also break existing programs that depend on using, say,
> exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe) in
> the current directory. See the os/exec package documentation for information
> about how best to update such programs.
>
> On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath
> environment variable, making it possible to disable the default implicit search
> of “.” in PATH lookups on Windows systems.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that this function is only ever called from contexts where a
*testing.T value is available, several improvements can be made to it.
Refactor it to be a test helper function so that callers do not need to
check and fail the test themselves. Leverage t.TempDir() so that the
temporary file is cleaned up automatically. Change it to return a single
config.Option to get better ergonomics at call sites.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestCreateParallel, which was ostensibly added as a regression test for
race conditions inside the bridge driver, contains a race condition. The
getIPv4Data() calls race the network configuration and so will sometimes
see the existing address assignments return IP address ranges which do
not conflict with them. While normally a good thing, the test asserts
that exactly one of the 100 networks is successfully created. Pass the
same IPAM data when attempting to create every network to ensure that
the address ranges conflict.
Signed-off-by: Cory Snider <csnider@mirantis.com>
addresses() would incorrectly return all IP addresses assigned to any
interface in the network namespace if exists() is false. This went
unnoticed as the unit test covering this case tested the method inside a
clean new network namespace, which had no interfaces brought up and
therefore no IP addresses assigned. Modifying
testutils.SetupTestOSContext() to bring up the loopback interface 'lo'
resulted in the loopback addresses 127.0.0.1 and [::1] being assigned to
the loopback interface, causing addresses() to return the loopback
addresses and TestAddressesEmptyInterface to start failing. Fix the
implementation of addresses() so that it only ever returns addresses for
the bridge interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
portallocator.PortAllocator holds persistent state on port allocations
in the network namespace. It normal operation it is used as a singleton,
which is a problem in unit tests as every test runs in a clean network
namespace. The singleton "default" PortAllocator remembers all the port
allocations made in other tests---in other network namespaces---and
can result in spurious test failures. Refactor the bridge driver so that
tests can construct driver instances with unique PortAllocators.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The global netlink handle ns.NlHandle() is indirectly cached for the
life of the process by the netutils.CheckRouteOverlaps() function. This
caching behaviour is a problem for the libnetwork unit tests as the
global netlink handle changes every time testutils.SetupTestOSContext()
is called, i.e. at the start of nearly every test case. Route overlaps
can be checked for in the wrong network namespace, causing spurious test
failures e.g. when running the same test twice in a row with -count=2.
Stop the netlink handle from being cached by shadowing the package-scope
variable with a function-scoped one.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestParallel has been written in an unusual style which relies on the
testing package's intra-test parallelism feature and lots of global
state to test one thing using three cooperating parallel tests. It is
complicated to reason about and quite brittle. For example, the command
go test -run TestParallel1 ./libnetwork
would deadlock, waiting until the test timeout for TestParallel2 and
TestParallel3 to run. And the test would be skipped if the
'-test.parallel' flag was less than three, either explicitly or
implicitly (default: GOMAXPROCS).
Overhaul TestParallel to address the aforementioned deficiencies and
get rid of mutable global state.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Reusing the same "OS context" (read: network namespace) and
NetworkController across multiple tests risks tests interfering with
each other, or worse: _depending on_ other tests to set up
preconditions. Construct a new controller for every test which needs
one, and run every test which mutates or inspects the host environment
in a fresh OS context.
The only outlier is runParallelTests. It is the only remaining test
implementation which references the "global" package-scoped controller,
so the global controller instance is effectively private to that one
test.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Sharing a single NetworkController instance between multiple tests
makes it possible for tests to interfere with each other. As a first
step towards giving each test its own private controller instance, make
explicit which controller createTestNetwork() creates the test network
on.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There are a handful of libnetwork tests which need to have multiple
concurrent goroutines inside the same test OS context (network
namespace), including some platform-agnostic tests. Provide test
utilities for spawning goroutines inside an OS context and
setting/restoring an existing goroutine's OS context to abstract away
platform differences and so that each test does not need to reinvent the
wheel.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The SetupTestOSContext calls were made conditional in
https://github.com/moby/libnetwork/pull/148 to work around limitations
in runtime.LockOSThread() which existed before Go 1.10. This workaround
is no longer necessary now that runtime.UnlockOSThread() needs to be
called an equal number of times before the goroutine is unlocked from
the OS thread.
Unfortunately some tests break when SetupTestOSContext is not skipped.
(Evidently this code path has not been exercised in a long time.) A
newly-created network namespace is very barebones: it contains a
loopback interface in the down state and little else. Even pinging
localhost does not work inside of a brand new namespace. Set the
loopback interface to up during namespace setup so that tests which
need to use the loopback interface can do so.
Signed-off-by: Cory Snider <csnider@mirantis.com>
If the GenericData option is missing from the map, the bridge driver
would skip configuration. At first glance this seems fine: the driver
defaults are to not configure anything. But it also skips over
initializing the persistent storage, which is configured through other
option keys in the map. Fix this oversight by removing the early return.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Each call to datastore.DefaultScopes() would modify and return the same
map. Consequently, some of the config for every NetworkController in the
process would be mutated each time one is constructed or reconfigured.
This behaviour is unexpected, unintended, and undesirable. Stop
accidentally sharing configuration by changing DefaultScopes() to return
a distinct map on each call.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*networkNamespace).InvokeFunc() cleaned up the state of the locked
thread by setting its network namespace to the netns of the goroutine
which called InvokeFunc(). If InvokeFunc() was to be called after the
caller had modified its thread's network namespace, InvokeFunc() would
incorrectly "restore" the state of its goroutine thread to the wrong
namespace, violating the invariant that unlocked threads are fungible.
Change the implementation to restore the thread's netns to the netns
that particular thread had before InvokeFunc() modified it.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Both of these were deprecated in 55f675811a,
but the format of the GoDoc comments didn't follow the correct format, which
caused them not being picked up by tools as "deprecated".
This patch updates uses in the codebase to use the alternatives.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
opencontainers/go-digest is a 1:1 copy of the one in distribution. It's no
longer used in distribution itself, so may be removed there at some point.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This update:
- removes support for go1.11
- removes the use of "golang.org/x/crypto/ed25519", which is now part of stdlib:
> Beginning with Go 1.13, the functionality of this package was moved to the
> standard library as crypto/ed25519. This package only acts as a compatibility
> wrapper.
Note that this is not the latest release; version v1.1.44 introduced a tools.go
file, which added golang.org/x/tools to the dependency tree (but only used for
"go:generate") see commit:
df84acab71
full diff: https://github.com/miekg/dns/compare/v1.1.27...v1.1.43
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When running inside a container, testns == origns. Consequently, closing
testns causes the deferred netns.Set(origns) call to fail. Stop closing
the aliased original namespace handle.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestResolvConfHost and TestParallel both depended on a network named
"testhost" already existing in the libnetwork controller. Only TestHost
created that network, so the aforementioned tests would fail unless run
after TestHost. Fix those tests so they can be run in any order.
Signed-off-by: Cory Snider <csnider@mirantis.com>
While this was convenient for our use, it's somewhat unexpected for a function
that writes a file to also create all parent directories; even more because
this function may be executed as root.
This patch makes the package more "safe" to use as a generic package by removing
this functionality, and leaving it up to the caller to create parent directories,
if needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
unix.Kill() does not produce an error for PID 0, -1. As a result, checking
process.Alive() would return "true" for both 0 and -1 on macOS (and previously
on Linux as well).
Let's shortcut these values to consider them "not alive", to prevent someone
trying to kill them.
A basic test was added to check the behavior.
Given that the intent of these functions is to handle single processes, this patch
also prevents 0 and negative values to be used.
From KILL(2): https://man7.org/linux/man-pages/man2/kill.2.html
If pid is positive, then signal sig is sent to the process with
the ID specified by pid.
If pid equals 0, then sig is sent to every process in the process
group of the calling process.
If pid equals -1, then sig is sent to every process for which the
calling process has permission to send signals, except for
process 1 (init), but see below.
If pid is less than -1, then sig is sent to every process in the
process group whose ID is -pid.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Using the implementation from pkg/pidfile for windows, as that implementation
looks to be handling more cases to check if a process is still alive (or to be
considered alive).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the file doesn't exist, the process isn't running, so we should be able
to ignore that.
Also remove an intermediate variable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 48dd89375d...6341884e5f
Pulls in a set of fixes to SwarmKit's nascent Cluster Volumes support
discovered during subsequent development and testing.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Deleting a containerd task whose status is Created fails with a
"precondition failed" error. This is because (aside from Windows)
a process is spawned when the task is created, and deleting the task
while the process is running would leak the process if it was allowed.
libcontainerd and the containerd plugin executor mistakenly try to clean
up from a failed start by deleting the created task, which will always
fail with the aforementined error. Change them to pass the
`WithProcessKill` delete option so the cleanup has a chance to succeed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Currently an attempt to pull a reference which resolves to an OCI
artifact (Helm chart for example), results in a bit unrelated error
message `invalid rootfs in image configuration`.
This provides a more meaningful error in case a user attempts to
download a media type which isn't image related.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This fallback is used when we filter the manifest list by the user-provided platform and find no matches such that we match the previous Docker behavior (before it supported variant matching). This has been deprecated long enough that I think it's time we finally stop supporting this weird fallback, especially since it makes for buggy behavior like `docker pull --platform linux/arm/v5 alpine:3.16` leading to a `linux/arm/v6` image being pulled (I specified a variant, every manifest list entry specifies a variant, so clearly the only behavior I as a user could reasonably expect is an error that `linux/arm/v5` is not supported, but instead I get an explicitly incompatible image despite doing everything I as a user can to prevent that situation).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
This debug message already includes a full platform string, so this ends up being something like `linux/arm/v7/amd64` in the end result.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
On Windows, syscall.StartProcess and os/exec.Cmd did not properly
check for invalid environment variable values. A malicious
environment variable value could exploit this behavior to set a
value for a different environment variable. For example, the
environment variable string "A=B\x00C=D" set the variables "A=B" and
"C=D".
Thanks to RyotaK (https://twitter.com/ryotkak) for reporting this
issue.
This is CVE-2022-41716 and Go issue https://go.dev/issue/56284.
This Go release also fixes https://github.com/golang/go/issues/56309, a
runtime bug which can cause random memory corruption when a goroutine
exits with runtime.LockOSThread() set. This fix is necessary to unblock
work to replace certain uses of pkg/reexec with unshared OS threads.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The `execCmd()` utility was a basic wrapper around `exec.Command()`. Inlining it
makes the code more transparent.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the example actually do something, and include the output, so that it
shows up in the documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some of these tests are failing (but not enabled in CI), but the current output
doesn't provide any details on the failure, so this patch is just to improve the
test output to allow debugging the actual failure.
Before this, tests would fail like:
make BIND_DIR=. TEST_FILTER=TestPluginInstallImage test-integration
...
=== FAIL: amd64.integration-cli TestDockerPluginSuite/TestPluginInstallImage (15.22s)
docker_cli_plugins_test.go:220: assertion failed: expression is false: strings.Contains(out, `Encountered remote "application/vnd.docker.container.image.v1+json"(image) when fetching`)
--- FAIL: TestDockerPluginSuite/TestPluginInstallImage (15.22s)
With this patch, tests provide more useful output:
make BIND_DIR=. TEST_FILTER=TestPluginInstallImage test-integration
...
=== FAIL: amd64.integration-cli TestDockerPluginSuite/TestPluginInstallImage (1.15s)
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json"
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json" digest="sha256:bee151d3fef5c1f787e7846efe4fa42b25a02db4e7543e54e8c679cf19d78598"
mediatype=application/vnd.docker.plugin.v1+json size=522
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json"
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json" digest="sha256:bee151d3fef5c1f787e7846efe4fa42b25a02db4e7543e54e8c679cf19d78598"
mediatype=application/vnd.docker.plugin.v1+json size=522
docker_cli_plugins_test.go:221: assertion failed: string "Error response from daemon: application/vnd.docker.distribution.manifest.v1+prettyjws not supported\n" does not contain "Encountered remote
\"application/vnd.docker.container.image.v1+json\"(image) when fetching"
--- FAIL: TestDockerPluginSuite/TestPluginInstallImage (1.15s)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The ParseLink() function has special handling for legacy formats;
> This is kept because we can actually get a HostConfig with links
> from an already created container and the format is not `foo:bar`
> but `/foo:/c1/bar`
This patch adds a test-case for this format. While updating, also switching
to use gotest.tools assertions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The new daemon.containerFSView type covers all the use-cases on Linux
with a much more intuitive API, but is not portable to Windows.
Discourage people from using the old and busted functions in new Linux
code by excluding them entirely from Linux builds.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Mounting a container's volumes under its rootfs directory inside the
host mount namespace causes problems with cross-namespace mount
propagation when /var/lib/docker is bind-mounted into the container as a
volume. The mount event propagates into the container's mount namespace,
overmounting the volume, but the propagated unmount events do not fully
reverse the effect. Each archive operation causes the mount table in the
container's mount namespace to grow larger and larger, until the kernel
limiton the number of mounts in a namespace is hit. The only solution to
this issue which is not subject to race conditions or other blocker
caveats is to avoid mounting volumes into the container's rootfs
directory in the host mount namespace in the first place.
Mount the container volumes inside an unshared mount namespace to
prevent any mount events from propagating into any other mount
namespace. Greatly simplify the archiving implementations by also
chrooting into the container rootfs to sidestep the need to resolve
paths in the host.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing archive implementation is not easy to reason about by
reading the source. Prepare to rewrite it by covering more edge cases in
tests. The new test cases were determined by black-box characterizing
the existing behaviour.
Signed-off-by: Cory Snider <csnider@mirantis.com>
It is only applicable to Windows so it does not need to be called from
platform-generic code. Fix locking in the Windows implementation.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Linux implementation needs to diverge significantly from the Windows
one in order to fix platform-specific bugs. Cut the generic
implementation out of daemon/archive.go and paste identical, verbatim
copies of that implementation into daemon/archive_{windows,linux}.go to
make it easier to compare the progression of changes to the respective
implementations through Git history.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This change was introduced early in the development of rootless support,
before all the kinks were worked out and rootlesskit was built. The
author was testing the daemon by inside a user namespace set up by runc,
observed that the unshare(2) syscall was returning EPERM, and assumed
that it was a fundamental limitation of user namespaces. Seeing as the
kernel documentation (of today) disagrees with that assessment and that
unshare demonstrably works inside user namespaces, I can only assume
that the EPERM was due to a quirk of their test environment, such as a
seccomp filter set up by runc blocking the unshare syscall.
https://github.com/moby/moby/pull/20902#issuecomment-236409406
Mount namespaces are necessary to address #38995 and #43390. Revert the
special-casing so those issues can also be fixed for rootless daemons.
This reverts commit dc950567c1.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Unshare the thread's file system attributes and, if applicable, mount
namespace so that the chroot operation does not affect the rest of the
process.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The applyLayer implementation in pkg/chrootarchive has to set the TMPDIR
environment variable so that archive.UnpackLayer() can successfully
create the whiteout-file temp directory. Change UnpackLayer to create
the temporary directory under the destination path so that environment
variables do not need to be touched.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This fix tries to address issues raised in #44346.
The max-concurrent-downloads and max-concurrent-uploads limits are applied for the whole engine and not for each pull/push command.
Signed-off-by: Luis Henrique Mulinari <luis.mulinari@gmail.com>
Aside from unconditionally unlocking the OS thread even if restoring the
thread's network namespace fails, func (*networkNamespace).InvokeFunc()
correctly implements invoking a function inside a network namespace.
This is far from obvious, however. func InitOSContext() does much of the
heavy lifting but in a bizarre fashion: it restores the initial network
namespace before it is changed in the first place, and the cleanup
function it returns does not restore the network namespace at all! The
InvokeFunc() implementation has to restore the network namespace
explicitly by deferring a call to ns.SetNamespace().
func InitOSContext() is a leaky abstraction taped to a footgun. On the
one hand, it defensively resets the current thread's network namespace,
which has the potential to fix up the thread state if other buggy code
had failed to maintain the invariant that an OS thread must be locked to
a goroutine unless it is interchangeable with a "clean" thread as
spawned by the Go runtime. On the other hand, it _facilitates_ writing
buggy code which fails to maintain the aforementioned invariant because
the cleanup function it returns unlocks the thread from the goroutine
unconditionally while neglecting to restore the thread's network
namespace! It is quite scary to need a function which fixes up threads'
network namespaces after the fact as an arbitrary number of goroutines
could have been scheduled onto a "dirty" thread and run non-libnetwork
code before the thread's namespace is fixed up. Any number of
(not-so-)subtle misbehaviours could result if an unfortunate goroutine
is scheduled onto a "dirty" thread. The whole repository has been
audited to ensure that the aforementioned invariant is never violated,
making after-the-fact fixing up of thread network namespaces redundant.
Make InitOSContext() a no-op on Linux and inline the thread-locking into
the function (singular) which previously relied on it to do so.
func ns.SetNamespace() is of similarly dubious utility. It intermixes
capturing the initial network namespace and restoring the thread's
network namespace, which could result in threads getting put into the
wrong network namespace if the wrong thread is the first to call it.
Delete it entirely; functions which need to manipulate a thread's
network namespace are better served by being explicit about capturing
and restoring the thread's namespace.
Rewrite InvokeFunc() to invoke the closure inside a goroutine to enable
a graceful and safe recovery if the thread's network namespace could not
be restored. Avoid any potential race conditions due to changing the
main thread's network namespace by preventing the aforementioned
goroutines from being eligible to be scheduled onto the main thread.
Signed-off-by: Cory Snider <csnider@mirantis.com>
func (*network) watchMiss() correctly locks its goroutine to an OS
thread before changing the thread's network namespace, but neglects to
restore the thread's network namespace before unlocking. Fix this
oversight by unlocking iff the thread's network namespace is
successfully restored.
Prevent the watchMiss goroutine from being locked to the main thread to
avoid the issues which would arise if such a situation was to occur.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The parallel tests were unconditionally unlocking the test case
goroutine from the OS thread, irrespective of whether the thread's
network namespace was successfully restored. This was not a problem in
practice as the unpaired calls to runtime.LockOSThread() peppered
through the test case would have prevented the goroutine from being
unlocked. Unlock the goroutine from the thread iff the thread's network
namespace is successfully restored.
Signed-off-by: Cory Snider <csnider@mirantis.com>
testutils.SetupTestOSContext() sets the calling thread's network
namespace but neglected to restore it on teardown. This was not a
problem in practice as it called runtime.LockOSThread() twice but
runtime.UnlockOSThread() only once, so the tampered threads would be
terminated by the runtime when the test case returned and replaced with
a clean thread. Correct the utility so it restores the thread's network
namespace during teardown and unlocks the goroutine from the thread on
success.
Remove unnecessary runtime.LockOSThread() calls peppering test cases
which leverage testutils.SetupTestOSContext().
Signed-off-by: Cory Snider <csnider@mirantis.com>
cmd.Environ() is new in go1.19, and not needed for this specific case.
Without this, trying to use this package in code that uses go1.18 will fail;
builder/remotecontext/git/gitutils.go:216:23: cmd.Environ undefined (type *exec.Cmd has no field or method Environ)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- On Windows, we don't build and run a local test registry (we're not running
docker-in-docker), so we need to skip this test.
- On rootless, networking doesn't support this (currently)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is accomplished by storing the distribution source in the content
labels. If the distribution source is not found then we check to the
registry to see if the digest exists in the repo, if it does exist then
the puller will use it.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
GitHub uses these parameters to construct a name; removing the ./ prefix
to make them more readable (and add them back where it's used)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Setting cmd.Env overrides the default of passing through the parent
process' environment, which works out fine most of the time, except when
it doesn't. For whatever reason, leaving out all the environment causes
git-for-windows sh.exe subprocesses to enter an infinite loop of
access violations during Cygwin initialization in certain environments
(specifically, our very own dev container image).
Signed-off-by: Cory Snider <csnider@mirantis.com>
While it is undesirable for the system or user git config to be used
when the daemon clones a Git repo, it could break workflows if it was
unconditionally applied to docker/cli as well.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Prevent git commands we run from reading the user or system
configuration, or cloning submodules from the local filesystem.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Keep It Simple! Set the working directory for git commands by...setting
the git process's working directory. Git commands can be run in the
parent process's working directory by passing the empty string.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Make the test more debuggable by logging all git command output and
running each table-driven test case as a subtest.
Signed-off-by: Cory Snider <csnider@mirantis.com>
`system.MkdirAll()` is a special version of os.Mkdir to handle creating directories
using Windows volume paths (`"\\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}"`).
This may be important when `MkdirAll` is used, which traverses all parent paths to
create them if missing (ultimately landing on the "volume" path).
The daemon.NewDaemon() function used `system.MkdirAll()` in various places where
a subdirectory within `daemon.Root` was created. This appeared to be mostly out
of convenience (to not have to handle `os.ErrExist` errors). The `daemon.Root`
directory should already be set up in these locations, and should be set up with
correct permissions. Using `system.MkdirAll()` would potentially mask errors if
the root directory is missing, and instead set up parent directories (possibly
with incorrect permissions).
Because of the above, this patch changes `system.MkdirAll` to `os.Mkdir`. As we
are changing these lines, this patch also changes the legacy octal notation
(`0700`) to the now preferred `0o700`.
One location continues to use `system.MkdirAll`, as the temp-directory may be
configured to be outside of `daemon.Root`, but a redundant `os.Stat(realTmp)`
was removed, as `system.MkdirAll` is expected to handle this.
As we are changing these lines, this patch also changes the legacy octal notation
(`0700`) to the now preferred `0o700`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it more transparent that it's unused for Linux,
and we don't pass "root", which has no relation with the
path on Linux.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows us to run CI with the containerd snapshotter enabled, without
patching the daemon.json, or changing how tests set up daemon flags.
A warning log is added during startup, to inform if this variable is set,
as it should only be used for our integration tests.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We were discarding the underlying error, which made it impossible for
callers to detect (e.g.) an os.ErrNotExist.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use t.TempDir() to make sure we're testing from a clean state
- improve checks for errors to have the correct error-type where possible
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
system.MkdirAll is a special version of os.Mkdir to handle creating directories
using Windows volume paths ("\\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}").
This may be important when MkdirAll is used, which traverses all parent paths to
create them if missing (ultimately landing on the "volume" path).
Commit 62f648b061 introduced the system.MkdirAll
calls, as a change was made in applyLayer() for Windows to use Windows volume
paths as an alternative for chroot (which is not supported on Windows). Later
iteractions changed this to regular Windows long-paths (`\\?\<path>`) in
230cfc6ed2, and 9b648dfac6.
Such paths are handled by the `os` package.
However, in these tests, the parent path already exists (all paths created are
a direct subdirectory within `tmpDir`). It looks like `MkdirAll` here is used
out of convenience to not have to handle `os.ErrExist` errors. As all these
tests are running in a fresh temporary directory, there should be no need to
handle those, and it's actually desirable to produce an error in that case, as
the directory already existing would be unexpected.
Because of the above, this test changes `system.MkdirAll` to `os.Mkdir`. As we
are changing these lines, this patch also changes the legacy octal notation
(`0700`) to the now preferred `0o700`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Introduced in 3ac6394b80, which makes no mention
of a reason for extracting to the same directory as we created the archive from,
so I assume this was a copy/paste mistake and the path was meant to be "dest",
not "src".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, Docker Hub was excluded when configuring "allow-nondistributable-artifacts".
With the updated policy announced by Microsoft, we can remove this restriction;
https://techcommunity.microsoft.com/t5/containers/announcing-windows-container-base-image-redistribution-rights/ba-p/3645201
There are plans to deprecated support for foreign layers altogether in the OCI,
and we should consider to make this option the default, but as that requires
deprecating the option (and possibly keeping an "opt-out" option), we can look
at that separately.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The implementation of CanAccess() is very rudimentary, and should
not be used for anything other than a basic check (and maybe not
even for that). It's only used in a single location in the daemon,
so move it there, and un-export it to not encourage others to use
it out of context.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type felt really redundant; `pidfile.New()` takes the path of the file to
create as an argument, so this is already known. The only thing the PIDFile
type provided was a `Remove()` method, which was just calling `os.Remove()` on
the path of the file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use bytes.TrimSpace instead of using the strings package, which is
more performant, and allows us to skip the intermediate variable.
Also combined some "if" statements to reduce cyclomatic complexity.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's ok to ignore if the file doesn't exist, or if the file doesn't
have a PID in it, but we should produce an error if the file exists,
but we're unable to read it for other reasons.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The same attribute was generated for each path that was created, but always
the same, so instead of generating it in each iteration, generate it once,
and pass it to our mkdirall() implementation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The regex only matched volume paths without a trailing path-separator. In cases
where a path would be passed with a trailing path-separator, it would depend on
further code in mkdirall to strip the trailing slash, then to perform the regex
again in the next iteration.
While regexes aren't ideal, we're already executing this one, so we may as well
use it to match those situations as well (instead of executing it twice), to
allow us to return early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Ideally, we would construct this lazily, but adding a function and a
sync.Once felt like a bit "too much".
Also updated the GoDoc for some functions to better describe what they do.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The pkg-imports validation prevents reusable library packages from
depending on the whole daemon, accidentally or intentionally. The
allowlist is overly restrictive as it also prevents us from reusing code
in both pkg/ and daemon/ unless that code is also made into a reusable
library package under pkg/. Allow pkg/ packages to import internal/
packages which do not transitively depend on disallowed packages.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Previously we waited for 60 seconds after the service faults to restart
it. However, there isn't much benefit to waiting this long. We expect
15 seconds to be a more reasonable delay.
Co-Authored-by: Kevin Parsons <kevpar@microsoft.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Building off insights from the great work Cory Snider has been doing,
this replaces a reexec with a much lower overhead implementation which
performs the `Chddir` in a new goroutine that is locked to a specific
thread with CLONE_FS unshared.
The thread is thrown away afterwards and the Chdir does effectively the
same thing as what the reexec was being used for.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
golang.org/x/sys/windows now implements this, so we can use that
instead of a local implementation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `IsAnInteractiveSession` was deprecated, and `IsWindowsService` is marked
as the recommended replacement.
For details, see 280f808b4a
> CL 244958 includes isWindowsService function that determines if a
> process is running as a service. The code of the function is based on
> public .Net implementation.
>
> IsAnInteractiveSession function implements similar functionality, but
> is based on an old Stackoverflow post., which is not as authoritative
> as code written by Microsoft for their official product.
>
> This change copies CL 244958 isWindowsService function into svc package
> and makes it public. The intention is that future users will prefer
> IsWindowsService to IsAnInteractiveSession.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go-winio now defines this function, so we can consume that.
Note that there's a difference between the old implementation and the original
one (added in 1cb9e9b44e). The old implementation
had special handling for win32 error codes, which was removed in the go-winio
implementation in 0966e1ad56
As `go-winio.GetFileSystemType()` calls `filepath.VolumeName(path)` internally,
this patch also removes the `string(home[0])`, which is redundant, and could
potentially panic if an empty string would be passed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 955c1f881a (Docker v17.12.0) replaced
detection of support for multiple lowerdirs (as required by overlay2) to not
depend on the kernel version. The `overlay2.override_kernel_check` was still
used to print a warning that older kernel versions may not have full support.
After this, commit e226aea280 (Docker v20.10.0,
backported to v19.03.7) removed uses of the `overlay2.override_kernel_check`
option altogether, but we were still parsing it.
This patch changes the `parseOptions()` function to not parse the option,
printing a deprecation warning instead. We should change this to be an error,
but the `overlay2.override_kernel_check` option was not deprecated in the
documentation, so keeping it around for one more release.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These consts were defined locally, but are now defined in golang.org/x/sys, so
we can use those.
Also added some documentation about how this function works, taking the description
from the GetExitCodeProcess function (processthreadsapi.h) API reference:
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getexitcodeprocess
> The GetExitCodeProcess function returns a valid error code defined by the
> application only after the thread terminates. Therefore, an application should
> not use `STILL_ACTIVE` (259) as an error code (`STILL_ACTIVE` is a macro for
> `STATUS_PENDING` (minwinbase.h)). If a thread returns `STILL_ACTIVE` (259) as
> an error code, then applications that test for that value could interpret it
> to mean that the thread is still running, and continue to test for the
> completion of the thread after the thread has terminated, which could put
> the application into an infinite loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Most of the package was using stdlib's errors package, so replacing two calls
to pkg/errors with stdlib. Also fixing capitalization of error strings.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On unix, it's an alias for os.MkdirAll, so remove its use to be
more transparent what's being used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Merge the accessible() function into CanAccess, and check world-
readable permissions first, before checking owner and group.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the IoctlRetInt, IoctlSetInt and IoctlLoopSetStatus64 helper
functions defined in the golang.org/x/sys/unix package instead of
manually wrapping these using a locally defined function.
Inspired by 3cc3d8a560
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tests were effectively doing "subtests", using comments to describe each,
however;
- due to the use of `t.Fatal()` would terminate before completing all "subtests"
- The error returned by the function being tested (`Chtimes`), was not checked,
and the test used "indirect" checks to verify if it worked correctly. Adding
assertions to check if the function didn't produce an error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Removing the "Linux" suffix from one test, which should probably be
rewritten to be run on "unix", to provide test-coverage for those
implementations.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
With t.TempDir(), some of the test-utilities became so small that
it was more transparent to inline them. This also helps separating
concenrs, as we're in the process of thinning out and decoupling
some packages.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It looks like this function was converting the time (`windows.NsecToTimespec()`),
only to convert it back (`windows.TimespecToNsec()`). This became clear when
moving the lines together:
```go
ctimespec := windows.NsecToTimespec(ctime.UnixNano())
c := windows.NsecToFiletime(windows.TimespecToNsec(ctimespec))
```
And looking at the Golang code, it looks like they're indeed the exact reverse:
```go
func TimespecToNsec(ts Timespec) int64 { return int64(ts.Sec)*1e9 + int64(ts.Nsec) }
func NsecToTimespec(nsec int64) (ts Timespec) {
ts.Sec = nsec / 1e9
ts.Nsec = nsec % 1e9
return
}
```
While modifying this code, also renaming the `e` variable to a more common `err`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This more closely matches to how it's used everywhere. Also move the comment
describing "what" ChTimes() does inside its GoDoc.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code caused me some head-scratches, and initially I wondered
if this was a bug, but it looks to be intentional to set nsec, not
sec, as time.Unix() internally divides nsec, and sets sec accordingly;
https://github.com/golang/go/blob/go1.19.2/src/time/time.go#L1364-L1380
// Unix returns the local Time corresponding to the given Unix time,
// sec seconds and nsec nanoseconds since January 1, 1970 UTC.
// It is valid to pass nsec outside the range [0, 999999999].
// Not all sec values have a corresponding time value. One such
// value is 1<<63-1 (the largest int64 value).
func Unix(sec int64, nsec int64) Time {
if nsec < 0 || nsec >= 1e9 {
n := nsec / 1e9
sec += n
nsec -= n * 1e9
if nsec < 0 {
nsec += 1e9
sec--
}
}
return unixTime(sec, int32(nsec))
}
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was moved to a separate file in fe5b34ba88,
but it's unclear why it was moved (as this file is not excluded on Windows).
Moving the code back into the chtimes file, to move it closer to where it's used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a couple of places, and in most places shouldn't be used
as those locations were in unix/linux-only files, so didn't need the wrapper.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal
will be sent to the process when the OS thread on which cmd.Start() was
executed dies. The runtime terminates an OS thread when a goroutine
exits after being wired to the thread with runtime.LockOSThread(). If
other goroutines are allowed to be scheduled onto a thread which called
cmd.Start(), an unrelated goroutine could cause the thread to be
terminated and prematurely signal the command. See
https://github.com/golang/go/issues/27505 for more information.
Prevent started subprocesses with Pdeathsig from getting signaled
prematurely by wiring the starting goroutine to the OS thread until the
subprocess has exited. No other goroutines can be scheduled onto a
locked thread so it will remain alive until unlocked or the daemon
process exits.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This utility was only used in a single place, and had no external consumers.
Move it to where it's used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The pkg/fsutils package was forked in containerd, and later moved to
containerd/continuity/fs. As we're moving more bits to containerd, let's also
use the same implementation to reduce code-duplication and to prevent them from
diverging.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was added in 442b45628e as part of
user-namespaces, and first used in 44e1023a93 to
set up the daemon root, and move the existing content;
44e1023a93/daemon/daemon_experimental.go (L68-L71)
A later iteration no longer _moved_ the existing root directory, and removed the
use of `directory.MoveToSubdir()` e8532023f2
It looks like there's no external consumers of this utility, so we should be
save to remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- separate exported function from implementation, to allow for GoDoc to be
maintained in a single location.
- don't use named return variables (no "bare" return, and potentially shadowing
variables)
- reverse the `os.IsNotExist(err) && d != dir` condition, putting the "lighter"
`d != dir` first.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I noticed the comment above this code, but didn't see a corresponding type-cast.
Looking at this file's history, I found that these were removed as part of
2f5f0af3fd, which looks to have overlooked some
deliberate type-casts.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds a new filter argument to the volume prune endpoint "all".
When this is not set, or it is a false-y value, then only anonymous
volumes are considered for pruning.
When `all` is set to a truth-y value, you get the old behavior.
This is an API change, but I think one that is what most people would
want.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the mailing list:
We have just released Go versions 1.19.2 and 1.18.7, minor point releases.
These minor releases include 3 security fixes following the security policy:
- archive/tar: unbounded memory consumption when reading headers
Reader.Read did not set a limit on the maximum size of file headers.
A maliciously crafted archive could cause Read to allocate unbounded
amounts of memory, potentially causing resource exhaustion or panics.
Reader.Read now limits the maximum size of header blocks to 1 MiB.
Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting this issue.
This is CVE-2022-2879 and Go issue https://go.dev/issue/54853.
- net/http/httputil: ReverseProxy should not forward unparseable query parameters
Requests forwarded by ReverseProxy included the raw query parameters from the
inbound request, including unparseable parameters rejected by net/http. This
could permit query parameter smuggling when a Go proxy forwards a parameter
with an unparseable value.
ReverseProxy will now sanitize the query parameters in the forwarded query
when the outbound request's Form field is set after the ReverseProxy.Director
function returns, indicating that the proxy has parsed the query parameters.
Proxies which do not parse query parameters continue to forward the original
query parameters unchanged.
Thanks to Gal Goldstein (Security Researcher, Oxeye) and
Daniel Abeles (Head of Research, Oxeye) for reporting this issue.
This is CVE-2022-2880 and Go issue https://go.dev/issue/54663.
- regexp/syntax: limit memory used by parsing regexps
The parsed regexp representation is linear in the size of the input,
but in some cases the constant factor can be as high as 40,000,
making relatively small regexps consume much larger amounts of memory.
Each regexp being parsed is now limited to a 256 MB memory footprint.
Regular expressions whose representation would use more space than that
are now rejected. Normal use of regular expressions is unaffected.
Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting this issue.
This is CVE-2022-41715 and Go issue https://go.dev/issue/55949.
View the release notes for more information: https://go.dev/doc/devel/release#go1.19.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7b153b9e28 updated the main
swagger file, but didn't update the v1.42 version used for the
documentation as it wasn't created yet at the time.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this change, the awslogs collectBatch and processEvent
function documentation still referenced the batchPublishFrequency
constant which was removed in favor of the configurable log stream
forceFlushInterval member.
Signed-off-by: Austin Vazquez <macedonv@amazon.com>
Before this change restarting the daemon in live-restore with running
containers + a restart policy meant that volume refs were not restored.
This specifically happens when the container is still running *and*
there is a restart policy that would make sure the container was running
again on restart.
The bug allows volumes to be removed even though containers are
referencing them. 😱
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This package was moved to a separate repository, using the steps below:
# install filter-repo (https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
brew install git-filter-repo
cd ~/projects
# create a temporary clone of docker
git clone https://github.com/docker/docker.git moby_pubsub_temp
cd moby_pubsub_temp
# for reference
git rev-parse HEAD
# --> 572ca799db
# remove all code, except for pkg/pubsub, license, and notice, and rename pkg/pubsub to /
git filter-repo --path pkg/pubsub/ --path LICENSE --path NOTICE --path-rename pkg/pubsub/:
# remove canonical imports
git revert -s -S 585ff0ebbe6bc25b801a0e0087dd5353099cb72e
# initialize module
go mod init github.com/moby/pubsub
go mod tidy
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Managed containerd processes are executed with SysProcAttr.Pdeathsig set
to syscall.SIGKILL so that the managed containerd is automatically
killed along with the daemon. At least, that is the intention. In
practice, the signal is sent to the process when the creating _OS
thread_ dies! If a goroutine exits while locked to an OS thread, the Go
runtime will terminate the thread. If that thread happens to be the
same thread which the subprocess was started from, the subprocess will
be signaled. Prevent the journald driver from sometimes unintentionally
killing child processes by ensuring that all runtime.LockOSThread()
calls are paired with runtime.UnlockOSThread().
Signed-off-by: Cory Snider <csnider@mirantis.com>
The `docker` CLI currently doesn't handle situations where the current context
(as defined in `~/.docker/config.json`) is invalid or doesn't exist. As loading
(and checking) the context happens during initialization of the CLI, this
prevents `docker context` commands from being used, which makes it complicated
to fix the situation. For example, running `docker context use <correct context>`
would fail, which makes it not possible to update the `~/.docker/config.json`,
unless doing so manually.
For example, given the following `~/.docker/config.json`:
```json
{
"currentContext": "nosuchcontext"
}
```
All of the commands below fail:
```bash
docker context inspect rootless
Current context "nosuchcontext" is not found on the file system, please check your config file at /Users/thajeztah/.docker/config.json
docker context rm --force rootless
Current context "nosuchcontext" is not found on the file system, please check your config file at /Users/thajeztah/.docker/config.json
docker context use default
Current context "nosuchcontext" is not found on the file system, please check your config file at /Users/thajeztah/.docker/config.json
```
While these things should be fixed, this patch updates the script to switch
the context using the `--context` flag; this flag is taken into account when
initializing the CLI, so that having an invalid context configured won't
block `docker context` commands from being executed. Given that all `context`
commands are local operations, "any" context can be used (it doesn't need to
make a connection with the daemon).
With this patch, those commands can now be run (and won't fail for the wrong
reason);
```bash
docker --context=default context inspect -f "{{.Name}}" rootless
rootless
docker --context=default context inspect -f "{{.Name}}" rootless-doesnt-exist
context "rootless-doesnt-exist" does not exist
```
One other issue may also cause things to fail during uninstall; trying to remove
a context that doesn't exist will fail (even with the `-f` / `--force` option
set);
```bash
docker --context=default context rm blablabla
Error: context "blablabla": not found
```
While this is "ok" in most circumstances, it also means that (potentially) the
current context is not reset to "default", so this patch adds an explicit
`docker context use`, as well as unsetting the `DOCKER_HOST` and `DOCKER_CONTEXT`
environment variables.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
runconfig/config_test.go:23:46: empty-lines: extra empty line at the start of a block (revive)
runconfig/config_test.go:75:55: empty-lines: extra empty line at the start of a block (revive)
oci/devices_linux.go:57:34: empty-lines: extra empty line at the start of a block (revive)
oci/devices_linux.go:60:69: empty-lines: extra empty line at the start of a block (revive)
image/fs_test.go:53:38: empty-lines: extra empty line at the end of a block (revive)
image/tarexport/save.go:88:29: empty-lines: extra empty line at the end of a block (revive)
layer/layer_unix_test.go:21:34: empty-lines: extra empty line at the end of a block (revive)
distribution/xfer/download.go:302:9: empty-lines: extra empty line at the end of a block (revive)
distribution/manifest_test.go:154:99: empty-lines: extra empty line at the end of a block (revive)
distribution/manifest_test.go:329:52: empty-lines: extra empty line at the end of a block (revive)
distribution/manifest_test.go:354:59: empty-lines: extra empty line at the end of a block (revive)
registry/config_test.go:323:42: empty-lines: extra empty line at the end of a block (revive)
registry/config_test.go:350:33: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
cmd/dockerd/trap/trap_linux_test.go:29:29: empty-lines: extra empty line at the end of a block (revive)
cmd/dockerd/daemon.go:327:35: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
client/events.go:19:115: empty-lines: extra empty line at the start of a block (revive)
client/events_test.go:60:31: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
api/server/router/build/build_routes.go:239:32: empty-lines: extra empty line at the start of a block (revive)
api/server/middleware/version.go:45:241: empty-lines: extra empty line at the end of a block (revive)
api/server/router/swarm/helpers_test.go:11:44: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
opts/address_pools_test.go:7:39: empty-lines: extra empty line at the end of a block (revive)
opts/opts_test.go:12:42: empty-lines: extra empty line at the end of a block (revive)
opts/opts_test.go:60:49: empty-lines: extra empty line at the end of a block (revive)
opts/opts_test.go:253:37: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/network/filter_test.go:174:19: empty-lines: extra empty line at the end of a block (revive)
daemon/restart.go:17:116: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_linux_test.go:255:41: empty-lines: extra empty line at the end of a block (revive)
daemon/reload_test.go:340:58: empty-lines: extra empty line at the end of a block (revive)
daemon/oci_linux.go:495:101: empty-lines: extra empty line at the end of a block (revive)
daemon/seccomp_linux_test.go:17:36: empty-lines: extra empty line at the start of a block (revive)
daemon/container_operations.go:560:73: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_unix.go:558:76: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_unix.go:1092:64: empty-lines: extra empty line at the start of a block (revive)
daemon/container_operations.go:587:24: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:807:18: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:813:42: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:872:72: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/images/image_squash.go:17:71: empty-lines: extra empty line at the start of a block (revive)
daemon/images/store.go:128:27: empty-lines: extra empty line at the end of a block (revive)
daemon/images/image_list.go:154:55: empty-lines: extra empty line at the start of a block (revive)
daemon/images/image_delete.go:135:13: empty-lines: extra empty line at the end of a block (revive)
daemon/images/image_search.go:25:64: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/logger/loggertest/logreader.go:58:43: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/ring_test.go:119:34: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:37:12: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:41:44: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:170:9: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/loggerutils/sharedtemp_test.go:152:43: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/loggerutils/sharedtemp.go:124:117: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/syslog/syslog.go:249:87: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/graphdriver/aufs/aufs.go:239:80: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:249:27: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/testutil.go:271:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:179:32: empty-block: this block is empty, you can remove it (revive)
daemon/graphdriver/zfs/zfs.go:375:48: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/overlay/overlay.go:248:89: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:636:21: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1150:70: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1613:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1645:65: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/btrfs/btrfs.go:53:101: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1944:89: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/cluster/convert/service.go:96:34: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:169:44: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:470:30: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/container.go:224:23: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/convert/network.go:109:14: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:537:27: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:247:19: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:252:41: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:256:12: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:289:80: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/executor/container/health_test.go:18:37: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/executor/container/adapter.go:437:68: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
plugin/v2/settable_test.go:24:29: empty-lines: extra empty line at the end of a block (revive)
plugin/manager_linux.go:96:6: empty-lines: extra empty line at the end of a block (revive)
plugin/backend_linux.go:373:16: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
volume/mounts/parser_test.go:42:39: empty-lines: extra empty line at the end of a block (revive)
volume/mounts/windows_parser.go:129:24: empty-lines: extra empty line at the end of a block (revive)
volume/local/local_test.go:16:35: empty-lines: extra empty line at the end of a block (revive)
volume/local/local_unix.go:145:3: early-return: if c {...} else {... return } can be simplified to if !c { ... return } ... (revive)
volume/service/service_test.go:18:38: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
testutil/fixtures/load/frozen.go:141:99: empty-lines: extra empty line at the end of a block (revive)
testutil/daemon/plugin.go:56:129: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
integration/config/config_test.go:106:31: empty-lines: extra empty line at the end of a block (revive)
integration/secret/secret_test.go:106:31: empty-lines: extra empty line at the end of a block (revive)
integration/network/service_test.go:58:50: empty-lines: extra empty line at the end of a block (revive)
integration/network/service_test.go:401:58: empty-lines: extra empty line at the end of a block (revive)
integration/system/event_test.go:30:38: empty-lines: extra empty line at the end of a block (revive)
integration/plugin/logging/read_test.go:19:41: empty-lines: extra empty line at the end of a block (revive)
integration/service/list_test.go:30:48: empty-lines: extra empty line at the end of a block (revive)
integration/service/create_test.go:400:46: empty-lines: extra empty line at the start of a block (revive)
integration/container/logs_test.go:156:42: empty-lines: extra empty line at the end of a block (revive)
integration/container/daemon_linux_test.go:135:44: empty-lines: extra empty line at the end of a block (revive)
integration/container/restart_test.go:160:62: empty-lines: extra empty line at the end of a block (revive)
integration/container/wait_test.go:181:47: empty-lines: extra empty line at the end of a block (revive)
integration/container/restart_test.go:116:30: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
builder/remotecontext/detect_test.go:64:66: empty-lines: extra empty line at the end of a block (revive)
builder/remotecontext/detect_test.go:78:46: empty-lines: extra empty line at the end of a block (revive)
builder/remotecontext/detect_test.go:91:51: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/internals_test.go:95:38: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/copy.go:86:112: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/dispatchers_test.go:286:39: empty-lines: extra empty line at the start of a block (revive)
builder/dockerfile/builder.go:280:38: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/dispatchers.go:66:85: empty-lines: extra empty line at the start of a block (revive)
builder/dockerfile/dispatchers.go:559:85: empty-lines: extra empty line at the start of a block (revive)
builder/builder-next/adapters/localinlinecache/inlinecache.go:26:183: empty-lines: extra empty line at the start of a block (revive)
builder/builder-next/adapters/containerimage/pull.go:441:9: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
integration-cli/docker_cli_pull_test.go:55:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_exec_test.go:46:64: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_service_health_test.go:86:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_images_test.go:128:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_node_test.go:79:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_health_test.go:51:57: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_health_test.go:159:73: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_swarm_unix_test.go:60:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_inspect_test.go:30:33: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_build_test.go:429:71: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_attach_unix_test.go:19:78: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_build_test.go:470:70: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_history_test.go:29:64: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_links_test.go:93:86: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:33:61: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_links_test.go:145:78: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:114:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_attach_test.go:226:153: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_by_digest_test.go:239:71: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_create_test.go:135:49: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:143:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:181:71: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_inspect_test.go:72:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:98:77: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:144:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_rmi_test.go:63:2: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:199:79: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_rmi_test.go:69:2: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:300:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_prune_unix_test.go:35:25: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_events_unix_test.go:393:60: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_events_unix_test.go:441:71: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_ps_test.go:33:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_ps_test.go:559:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_events_test.go:117:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:547:74: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_api_containers_test.go:1054:84: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:1076:87: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:1232:72: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_api_containers_test.go:1801:21: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_network_unix_test.go:58:95: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_network_unix_test.go:750:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_network_unix_test.go:765:76: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_swarm_test.go:617:100: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_swarm_test.go:892:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_daemon_test.go:119:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_daemon_test.go:981:68: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_daemon_test.go:1951:87: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_run_test.go:83:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_run_test.go:357:72: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_build_test.go:89:83: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:114:83: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:183:80: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:290:71: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:314:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:331:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:366:76: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:403:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:648:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:708:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:938:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1018:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1097:2: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1182:62: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1244:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1524:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1546:80: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1716:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1730:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:2162:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:2270:71: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:2288:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3206:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3392:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3433:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3678:76: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3732:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3759:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3802:61: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3898:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4107:9: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4791:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4821:73: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4854:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:5341:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:5593:81: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:2145:11: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also renamed variables that collided with import
api/types/strslice/strslice_test.go:36:41: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/directory/directory.go:9:49: empty-lines: extra empty line at the start of a block (revive)
pkg/pubsub/publisher.go:8:48: empty-lines: extra empty line at the start of a block (revive)
pkg/loopback/attach_loopback.go:96:69: empty-lines: extra empty line at the start of a block (revive)
pkg/devicemapper/devmapper_wrapper.go:136:48: empty-lines: extra empty line at the start of a block (revive)
pkg/devicemapper/devmapper.go:391:35: empty-lines: extra empty line at the end of a block (revive)
pkg/devicemapper/devmapper.go:676:35: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/changes_posix_test.go:15:38: empty-lines: extra empty line at the end of a block (revive)
pkg/devicemapper/devmapper.go:241:51: empty-lines: extra empty line at the start of a block (revive)
pkg/fileutils/fileutils_test.go:17:47: empty-lines: extra empty line at the end of a block (revive)
pkg/fileutils/fileutils_test.go:34:48: empty-lines: extra empty line at the end of a block (revive)
pkg/fileutils/fileutils_test.go:318:32: empty-lines: extra empty line at the end of a block (revive)
pkg/tailfile/tailfile.go:171:6: empty-lines: extra empty line at the end of a block (revive)
pkg/tarsum/fileinfosums_test.go:16:41: empty-lines: extra empty line at the end of a block (revive)
pkg/tarsum/tarsum_test.go:198:42: empty-lines: extra empty line at the start of a block (revive)
pkg/tarsum/tarsum_test.go:294:25: empty-lines: extra empty line at the start of a block (revive)
pkg/tarsum/tarsum_test.go:407:34: empty-lines: extra empty line at the end of a block (revive)
pkg/ioutils/fswriters_test.go:52:45: empty-lines: extra empty line at the end of a block (revive)
pkg/ioutils/writers_test.go:24:39: empty-lines: extra empty line at the end of a block (revive)
pkg/ioutils/bytespipe_test.go:78:26: empty-lines: extra empty line at the end of a block (revive)
pkg/sysinfo/sysinfo_linux_test.go:13:37: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive_linux_test.go:57:64: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/changes.go:248:72: empty-lines: extra empty line at the start of a block (revive)
pkg/archive/changes_posix_test.go:15:38: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/copy.go:248:124: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/diff_test.go:198:44: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive.go:304:12: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive.go:749:37: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive.go:812:81: empty-lines: extra empty line at the start of a block (revive)
pkg/archive/copy_unix_test.go:347:34: empty-lines: extra empty line at the end of a block (revive)
pkg/system/path.go:11:39: empty-lines: extra empty line at the end of a block (revive)
pkg/system/meminfo_linux.go:29:21: empty-lines: extra empty line at the end of a block (revive)
pkg/plugins/plugins.go:135:32: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/response.go:71:48: empty-lines: extra empty line at the start of a block (revive)
pkg/authorization/api_test.go:18:51: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/middleware_test.go:23:44: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/middleware_unix_test.go:17:46: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/api_test.go:57:45: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/response.go:83:50: empty-lines: extra empty line at the start of a block (revive)
pkg/authorization/api_test.go:66:47: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/middleware_unix_test.go:45:48: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/response.go:145:75: empty-lines: extra empty line at the start of a block (revive)
pkg/authorization/middleware_unix_test.go:56:51: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and the ErrExtractPointNotDirectory was
not checked for, or used as a sentinel error.
This error was introduced in c32dde5baa. It was
never used as a sentinel error, but from that commit, it looks like it was added
as a package variable to mirror already existing errors defined at the package
level.
This patch removes the exported variable, and replaces the error with an
errdefs.InvalidParameter(), so that the API also returns the correct (400)
status code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and the ErrVolumeReadonly was not checked
for, or used as a sentinel error.
This error was introduced in c32dde5baa. It was
never used as a sentinel error, but from that commit, it looks like it was added
as a package variable to mirror already existing errors defined at the package
level.
This patch removes the exported variable, and replaces the error with an
errdefs.InvalidParameter(), so that the API also returns the correct (400)
status code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and the ErrRootFSReadOnly was not checked
for, or used as a sentinel error.
This error was introduced in c32dde5baa, originally
named `ErrContainerRootfsReadonly`. It was never used as a sentinel error, but
from that commit, it looks like it was added as a package variable to mirror
the coding style of already existing errors defined at the package level.
This patch removes the exported variable, and replaces the error with an
errdefs.InvalidParameter(), so that the API also returns the correct (400)
status code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was just string-matching error output, and no longer had a
direct connection with ErrContainerRootfsReadonly / ErrVolumeReadonly.
Moving it inline better shows what it's actually checking for.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The getPortMapInfo var was introduced in f198dfd856,
and (from looking at that patch) looks to have been as a quick and dirty workaround
for the `container` argument colliding with the `container` import.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The pkg/signal package was moved to github.com/moby/sys/signal in
28409ca6c7. The DefaultStopSignal const was
deprecated in e53f65a916, and the DumpStacks
function was moved to pkg/stack in ea5c94cdb9,
all of which are included in the 22.x release, so we can safely remove these
from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were moved, and deprecated in f19ef20a44 and
4caf68f4f6, which are part of the 22.x release, so
we can safely remove these from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These functions were moved to github.com/moby/sys/sequential, and the
stubs were added in 509f19f611, which is
part of the 22.x release, so we can safely remove these from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fixes an inifinite loop in mkdirAs(), used by `MkdirAllAndChown`,
`MkdirAndChown`, and `MkdirAllAndChownNew`, as well as directories being
chown'd multiple times when relative paths are used.
The for loop in this function was incorrectly assuming that;
1. `filepath.Dir()` would always return the parent directory of any given path
2. traversing any given path to ultimately result in "/"
While this is correct for absolute and "cleaned" paths, both assumptions are
incorrect in some variations of "path";
1. for paths with a trailing path-separator ("some/path/"), or dot ("."),
`filepath.Dir()` considers the (implicit) "." to be a location _within_ the
directory, and returns "some/path" as ("parent") directory. This resulted
in the path itself to be included _twice_ in the list of paths to chown.
2. for relative paths ("./some-path", "../some-path"), "traversing" the path
would never end in "/", causing the for loop to run indefinitely:
```go
// walk back to "/" looking for directories which do not exist
// and add them to the paths array for chown after creation
dirPath := path
for {
dirPath = filepath.Dir(dirPath)
if dirPath == "/" {
break
}
if _, err := os.Stat(dirPath); err != nil && os.IsNotExist(err) {
paths = append(paths, dirPath)
}
}
```
A _partial_ mitigation for this would be to use `filepath.Clean()` before using
the path (while `filepath.Dir()` _does_ call `filepath.Clean()`, it only does so
_after_ some processing, so only cleans the result). Doing so would prevent the
double chown from happening, but would not prevent the "final" path to be "."
or ".." (in the relative path case), still causing an infinite loop, or
additional checks for "." / ".." to be needed.
| path | filepath.Dir(path) | filepath.Dir(filepath.Clean(path)) |
|----------------|--------------------|------------------------------------|
| some-path | . | . |
| ./some-path | . | . |
| ../some-path | .. | .. |
| some/path/ | some/path | some |
| ./some/path/ | some/path | some |
| ../some/path/ | ../some/path | ../some |
| some/path/. | some/path | some |
| ./some/path/. | some/path | some |
| ../some/path/. | ../some/path | ../some |
| /some/path/ | /some/path | /some |
| /some/path/. | /some/path | /some |
Instead, this patch adds a `filepath.Abs()` to the function, so make sure that
paths are both cleaned, and not resulting in an infinite loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
libnetwork/etchosts/etchosts_test.go:167:54: empty-lines: extra empty line at the end of a block (revive)
libnetwork/osl/route_linux.go:185:74: empty-lines: extra empty line at the start of a block (revive)
libnetwork/osl/sandbox_linux_test.go:323:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/bitseq/sequence.go:412:48: empty-lines: extra empty line at the start of a block (revive)
libnetwork/datastore/datastore_test.go:67:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/datastore/mock_store.go:34:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld.go:202:44: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld_test.go:76:36: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:256:67: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:303:128: empty-lines: extra empty line at the start of a block (revive)
libnetwork/networkdb/cluster.go:183:72: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipams/null/null_test.go:44:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/macvlan/macvlan_store.go:45:52: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1058:39: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/port_mapping.go:88:111: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/link.go:26:90: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ipv6_test.go:17:34: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ip_tables.go:392:4: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/bridge.go:804:50: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_serf.go:183:29: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_utils.go:81:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/peerdb.go:172:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:209:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:344:89: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:436:63: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/overlay.go:183:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/encryption.go:69:28: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/ov_network.go:563:81: empty-lines: extra empty line at the start of a block (revive)
libnetwork/default_gateway.go:32:43: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:184:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/endpoint.go:161:55: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store.go:320:33: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store_linux_test.go:11:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox.go:571:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:317:246: empty-lines: extra empty line at the start of a block (revive)
libnetwork/endpoint.go:550:17: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox_dns_unix.go:213:106: empty-lines: extra empty line at the start of a block (revive)
libnetwork/controller.go:676:85: empty-lines: extra empty line at the end of a block (revive)
libnetwork/agent.go:876:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/resolver.go:324:69: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1153:92: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1955:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/network.go:2235:9: empty-lines: extra empty line at the start of a block (revive)
libnetwork/libnetwork_internal_test.go:336:26: empty-lines: extra empty line at the start of a block (revive)
libnetwork/resolver_test.go:76:35: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:303:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:985:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1263:37: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was duplicated in two places -- factor it out, add
documentation, and move magic numbers into a constant.
Additionally, use the same permissions (0755) in both code paths, and
ensure that the ID map is used in both code paths.
Co-authored-by: Vasiliy Ulyanov <vulyanov@suse.de>
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
It was unclear what the distinction was between these configuration
structs, so merging them to simplify.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was used for testing purposes when libnetwork was in a separate repo, using
the dnet utility, which was removed in 7266a956a8.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Libnetwork configuration files were only used as part of integration tests using
the dnet utility, which was removed in 7266a956a8
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the "deadcode", "structcheck", and "varcheck" linters, as they are
deprecated:
WARN [runner] The linter 'deadcode' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [runner] The linter 'structcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [runner] The linter 'varcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [linters context] structcheck is disabled because of generics. You can track the evolution of the generics support by following the https://github.com/golangci/golangci-lint/issues/2649.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The TODO comment was in regards to allowing graphdriver plugins to
provide their own ContainerFS implementations. The ContainerFS interface
has been removed from Moby, so there is no longer anything which needs
to be figured out.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Now that the type of Container.BaseFS has been reverted to a string,
values can never implement the extractor or archiver interfaces. Rip out
the dead code to support archiving and unarchiving through those
interfcaes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Driver abstraction was needed for Linux Containers on Windows,
support for which has since been removed.
There is no direct equivalent to Lchmod() in the standard library so
continue to use the containerd/continuity version.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- dbus: add Connected methods to check connections status
- dbus: add support for querying unit by PID
- dbus: implement support for cgroup freezer APIs
- journal: remove implicit initialization
- login1: add methods to get session/user properties
- login1: add context-aware ListSessions and ListUsers methods
full diff: https://github.com/github.com/coreos/go-systemd/compare/v22.3.2...v22.4.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Had this stashed as part of some other changes, so thought I'd push
as a PR; with this patch:
=== RUN TestNewClientWithOpsFromEnv/invalid_cert_path
=== RUN TestNewClientWithOpsFromEnv/default_api_version_with_cert_path
=== RUN TestNewClientWithOpsFromEnv/default_api_version_with_cert_path_and_tls_verify
=== RUN TestNewClientWithOpsFromEnv/default_api_version_with_cert_path_and_host
=== RUN TestNewClientWithOpsFromEnv/invalid_docker_host
=== RUN TestNewClientWithOpsFromEnv/invalid_docker_host,_with_good_format
=== RUN TestNewClientWithOpsFromEnv/override_api_version
--- PASS: TestNewClientWithOpsFromEnv (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/default_api_version (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/invalid_cert_path (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/default_api_version_with_cert_path (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/default_api_version_with_cert_path_and_tls_verify (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/default_api_version_with_cert_path_and_host (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/invalid_docker_host (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/invalid_docker_host,_with_good_format (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/override_api_version (0.00s)
PASS
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that we can pass any custom containerd shim to dockerd there is need
for this check. Without this it becomes possible to use wasm shims for
example with images that have "wasi" as the OS.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
These functions were used in 63a7ccdd23, which was
part of Docker v1.5.0 and v1.6.0, but removed in Docker v1.7.0 when the network
stack was replaced with libnetwork in d18919e304.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After discussing in the maintainers meeting, we concluded that Slowloris attacks
are not a real risk other than potentially having some additional goroutines
lingering around, so setting a long timeout to satisfy the linter, and to at
least have "some" timeout.
libnetwork/diagnostic/server.go:96:10: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
srv := &http.Server{
Addr: net.JoinHostPort(ip, strconv.Itoa(port)),
Handler: s,
}
api/server/server.go:60:10: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
srv: &http.Server{
Addr: addr,
},
daemon/metrics_unix.go:34:13: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
if err := http.Serve(l, mux); err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
^
cmd/dockerd/metrics.go:27:13: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
if err := http.Serve(l, mux); err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These interfaces were added in aacddda89d, with
no clear motivation, other than "Also hide ViewDB behind an interface".
This patch removes the interface in favor of using a concrete implementation;
There's currently only one implementation of this interface, and if we would
decide to change to an alternative implementation, we could define relevant
interfaces on the receiver side.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also switching to use arm64, as all amd64 stages have moved to GitHub actions,
so using arm64 allows the same machine to be used for tests after the DCO check
completed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was missing to be removed from Jenkinsfile when we moved
to GHA for unit and integration tests.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
The 1.16 `io/fs` compatibility code was being built on 1.18 and 1.19.
Drop it completely as 1.16 is long EOL, and additionally drop 1.17 as it
has been EOL for a month and 1.18 is both the minimum Go supported by
the 20.10 branch, as well as a very easy jump from 1.17.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
The OCI image spec is considering to change the Image struct and embedding the
Platform type (see opencontainers/image-spec#959) in the go implementation.
Moby currently uses some struct-literals to propagate the platform fields,
which will break once those changes in the OCI spec are merged.
Ideally (once that change arrives) we would update the code to set the Platform
information as a whole, instead of assigning related fields individually, but
in some cases in the code, image platform information is only partially set
(for example, OSVersion and OSFeatures are not preserved in all cases). This
may be on purpose, so needs to be reviewed.
This patch keeps the current behavior (assigning only specific fields), but
removes the use of struct-literals to make the code compatible with the
upcoming changes in the image-spec module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure we use the same alias everywhere for easier finding,
and to prevent accidentally introducing duplicate imports with
different aliases for the same package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- prefer error over panic where possible
- ContainerChanges is not implemented by snapshotter-based ImageService
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Tagger was introduced in 0296797f0f, as part
of a refactor, but was never used outside of the package itself. The commit
also didn't explain why this was changed into a Type with a constructor, as all
the constructor appears to be used for is to sanitize and validate the tags.
This patch removes the `Tagger` struct and its constructor, and instead just
uses a function to do the same.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the mailing list:
We have just released Go versions 1.19.1 and 1.18.6, minor point releases.
These minor releases include 2 security fixes following the security policy:
- net/http: handle server errors after sending GOAWAY
A closing HTTP/2 server connection could hang forever waiting for a clean
shutdown that was preempted by a subsequent fatal error. This failure mode
could be exploited to cause a denial of service.
Thanks to Bahruz Jabiyev, Tommaso Innocenti, Anthony Gavazzi, Steven Sprecher,
and Kaan Onarlioglu for reporting this.
This is CVE-2022-27664 and Go issue https://go.dev/issue/54658.
- net/url: JoinPath does not strip relative path components in all circumstances
JoinPath and URL.JoinPath would not remove `../` path components appended to a
relative path. For example, `JoinPath("https://go.dev", "../go")` returned the
URL `https://go.dev/../go`, despite the JoinPath documentation stating that
`../` path elements are cleaned from the result.
Thanks to q0jt for reporting this issue.
This is CVE-2022-32190 and Go issue https://go.dev/issue/54385.
Release notes:
go1.19.1 (released 2022-09-06) includes security fixes to the net/http and
net/url packages, as well as bug fixes to the compiler, the go command, the pprof
command, the linker, the runtime, and the crypto/tls and crypto/x509 packages.
See the Go 1.19.1 milestone on the issue tracker for details.
https://github.com/golang/go/issues?q=milestone%3AGo1.19.1+label%3ACherryPickApproved
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the mailing list:
We have just released Go versions 1.19.1 and 1.18.6, minor point releases.
These minor releases include 2 security fixes following the security policy:
- net/http: handle server errors after sending GOAWAY
A closing HTTP/2 server connection could hang forever waiting for a clean
shutdown that was preempted by a subsequent fatal error. This failure mode
could be exploited to cause a denial of service.
Thanks to Bahruz Jabiyev, Tommaso Innocenti, Anthony Gavazzi, Steven Sprecher,
and Kaan Onarlioglu for reporting this.
This is CVE-2022-27664 and Go issue https://go.dev/issue/54658.
- net/url: JoinPath does not strip relative path components in all circumstances
JoinPath and URL.JoinPath would not remove `../` path components appended to a
relative path. For example, `JoinPath("https://go.dev", "../go")` returned the
URL `https://go.dev/../go`, despite the JoinPath documentation stating that
`../` path elements are cleaned from the result.
Thanks to q0jt for reporting this issue.
This is CVE-2022-32190 and Go issue https://go.dev/issue/54385.
Release notes:
go1.18.6 (released 2022-09-06) includes security fixes to the net/http package,
as well as bug fixes to the compiler, the go command, the pprof command, the
runtime, and the crypto/tls, encoding/xml, and net packages. See the Go 1.18.6
milestone on the issue tracker for details;
https://github.com/golang/go/issues?q=milestone%3AGo1.18.6+label%3ACherryPickApproved
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The wrapper sets the default namespace in the context if none is
provided, this is needed because we are calling these services directly
and not trough GRPC that has an interceptor to set the default namespace
to all calls.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
While the name generator has been frozen for new additions in 624b3cfbe8,
this person has become controversial. Our intent is for this list to be inclusive
and non-controversial.
This patch removes the name from the list.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
1. Commit 1a22418f9f changed permissions to `0700`
on Windows, or more factually, it removed `rw` (`chmod g-rw,o-rw`) and added
executable bits (`chmod u+x`).
2. This was too restrictive, and b7dc9040f0 changed
permissions to only remove the group- and world-writable bits to give read and
execute access to everyone, but setting execute permissions for everyone.
3. However, this also removed the non-permission bits, so 41eb61d5c2
updated the code to preserve those, and keep parity with Linux.
This changes it back to `2.`. I wonder (_think_) _permission_ bits (read, write)
can be portable, except for the _executable_ bit (which is not present on Windows).
The alternative could be to keep the permission bits, and only set the executable
bit (`perm | 0111`) for everyone (equivalent of `chmod +x`), but that likely would
be a breaking change.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The fillGo18FileTypeBits func was added in 1a451d9a7b
to keep the tar headers consistent with headers created with go1.8 and older.
go1.8 and older incorrectly preserved all file-mode bits, including file-type,
instead of stripping those bits and only preserving the _permission_ bits, as
defined in;
- the GNU tar spec: https://www.gnu.org/software/tar/manual/html_node/Standard.html
- and POSIX: http://pubs.opengroup.org/onlinepubs/009695399/basedefs/tar.h.html
We decided at the time to copy the "wrong" behavior to prevent a cache-bust and
to keep the archives identical, however:
- It's not matching the standards, which causes differences between our tar
implementation and the standard tar implementations, as well as implementations
in other languages, such as Python (see docker/compose#883).
- BuildKit does not implement this hack.
- We don't _need_ this extra information (as it's already preserved in the
type header; https://pkg.go.dev/archive/tar#pkg-constants
In short; let's remove this hack.
This reverts commit 1a451d9a7b.
This reverts commit 41eb61d5c2.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
integration-cli/docker_cli_daemon_test.go:545:54: host:port in url should be constructed with net.JoinHostPort and not directly with fmt.Sprintf (nosprintfhostport)
cmdArgs = append(cmdArgs, "--tls=false", "--host", fmt.Sprintf("tcp://%s:%s", l.daemon, l.port))
^
opts/hosts_test.go:35:31: host:port in url should be constructed with net.JoinHostPort and not directly with fmt.Sprintf (nosprintfhostport)
"tcp://:5555": fmt.Sprintf("tcp://%s:5555", DefaultHTTPHost),
^
opts/hosts_test.go:91:30: host:port in url should be constructed with net.JoinHostPort and not directly with fmt.Sprintf (nosprintfhostport)
":5555": fmt.Sprintf("tcp://%s:5555", DefaultHTTPHost),
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Updating test-code only; set ReadHeaderTimeout for some, or suppress the linter
error for others.
contrib/httpserver/server.go:11:12: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
log.Panic(http.ListenAndServe(":80", nil))
^
integration/plugin/logging/cmd/close_on_start/main.go:42:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: mux,
}
integration/plugin/logging/cmd/discard/main.go:17:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: mux,
}
integration/plugin/logging/cmd/dummy/main.go:14:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: http.NewServeMux(),
}
integration/plugin/volumes/cmd/dummy/main.go:14:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: http.NewServeMux(),
}
testutil/fixtures/plugin/basic/basic.go:25:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: http.NewServeMux(),
}
volume/testutils/testutils.go:170:5: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
go http.Serve(l, mux)
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The linter falsely detects this as using "math/rand":
libnetwork/networkdb/cluster.go:721:14: G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
val, err := rand.Int(rand.Reader, big.NewInt(int64(n)))
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that CanonicalTarNameForPath is an alias for filepath.ToSlash, they were
mostly redundant, and only testing Go's stdlib. Coverage for filepath.ToSlash is
provided through TestCanonicalTarName, which does a superset of CanonicalTarNameForPath,
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
filepath.ToSlash is already a no-op on non-Windows platforms, so there's no
need to provide multiple implementations.
We could consider deprecating this function, but it's used in the CLI, and
perhaps it's still useful to have a canonical location to perform this normalization.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Migrating these functions to allow them being shared between moby, docker/cli,
and containerd, and to allow using them without importing all of sys / system,
which (in containerd) also depends on hcsshim and more.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
see https://github.com/koalaman/shellcheck/wiki/SC2155
Looking at how these were used, I don't think we even need to
export them, so removing that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
validate other YAML files, such as the ones used in the documentation,
and GitHub actions workflows, to prevent issues such as;
- 30295c1750
- 8e8d9a3650
With this patch:
hack/validate/yamllint
Congratulations! yamllint config file formatted correctly
Congratulations! YAML files are formatted correctly
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Suppresses warnings like:
LANG=C.UTF-8 yamllint -c hack/validate/yamllint.yaml -f parsable .github/workflows/*.yml
.github/workflows/ci.yml:7:1: [warning] truthy value should be one of [false, true] (truthy)
.github/workflows/windows.yml:7:1: [warning] truthy value should be one of [false, true] (truthy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before:
10030:81 error line too long (89 > 80 characters) (line-length)
After:
api/swagger.yaml:10030:81: [error] line too long (89 > 80 characters) (line-length)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't make the file hidden, and add .yaml extension, so that editors
pick up the right formatting :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This interface is used as part of an exported function's signature,
so exporting the interface as well for callers to know what the argument
must have implemented.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
filepath.IsAbs() will short-circuit on Linux/Unix, so having a single
implementation should not affect those platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prevent new health check probes from racing the task deletion. This may
have been a root cause of containers taking so long to stop on Windows.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We have integration tests which assert the invariant that a
GET /containers/{id}/json response lists only IDs of execs which are in
the Running state, according to GET /exec/{id}/json. The invariant could
be violated if those requests were to race the handling of the exec's
task-exit event. The coarse-grained locking of the container ExecStore
when starting an exec task was accidentally synchronizing
(*Daemon).ProcessEvent and (*Daemon).ContainerExecInspect to it just
enough to make it improbable for the integration tests to catch the
invariant violation on execs which exit immediately. Removing the
unnecessary locking made the underlying race condition more likely for
the tests to hit.
Maintain the invariant by deleting the exec from its container's
ExecCommands before clearing its Running flag. Additionally, fix other
potential data races with execs by ensuring that the ExecConfig lock is
held whenever a mutable field is read from or written to.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Modifying the builtin Windows runtime to send the exited event
immediately upon the container's init process exiting, without first
waiting for the Compute System to shut down, perturbed the timings
enough to make TestWaitConditions flaky on that platform. Make
TestWaitConditions timing-independent by having the container wait
for input on STDIN before exiting.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Attempting to delete the directory while another goroutine is
concurrently executing a CheckpointTo() can fail on Windows due to file
locking. As all callers of CheckpointTo() are required to hold the
container lock, holding the lock while deleting the directory ensures
that there will be no interference.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing logic to handle container ID conflicts when attempting to
create a plugin container is not nearly as robust as the implementation
in daemon for user containers. Extract and refine the logic from daemon
and use it in the plugin executor.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The containerd client is very chatty at the best of times. Because the
libcontained API is stateless and references containers and processes by
string ID for every method call, the implementation is essentially
forced to use the containerd client in a way which amplifies the number
of redundant RPCs invoked to perform any operation. The libcontainerd
remote implementation has to reload the containerd container, task
and/or process metadata for nearly every operation. This in turn
amplifies the number of context switches between dockerd and containerd
to perform any container operation or handle a containerd event,
increasing the load on the system which could otherwise be allocated to
workloads.
Overhaul the libcontainerd interface to reduce the impedance mismatch
with the containerd client so that the containerd client can be used
more efficiently. Split the API out into container, task and process
interfaces which the consumer is expected to retain so that
libcontainerd can retain state---especially the analogous containerd
client objects---without having to manage any state-store inside the
libcontainerd client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The OOMKilled flag on a container's state has historically behaved
rather unintuitively: it is updated on container exit to reflect whether
or not any process within the container has been OOM-killed during the
preceding run of the container. The OOMKilled flag would be set to true
when the container exits if any process within the container---including
execs---was OOM-killed at any time while the container was running,
whether or not the OOM-kill was the cause of the container exiting. The
flag is "sticky," persisting through the next start of the container;
only being cleared once the container exits without any processes having
been OOM-killed that run.
Alter the behavior of the OOMKilled flag such that it signals whether
any process in the container had been OOM-killed since the most recent
start of the container. Set the flag immediately upon any process being
OOM-killed, and clear it when the container transitions to the "running"
state.
There is an ulterior motive for this change. It reduces the amount of
state the libcontainerd client needs to keep track of and clean up on
container exit. It's one less place the client could leak memory if a
container was to be deleted without going through libcontainerd.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The daemon.containerd.Exec call does not access or mutate the
container's ExecCommands store in any way, and locking the exec config
is sufficient to synchronize with the event-processing loop. Locking
the ExecCommands store while starting the exec process only serves to
block unrelated operations on the container for an extended period of
time.
Convert the Store struct's mutex to an unexported field to prevent this
from regressing in the future.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Add an integration test to verify that health checks are killed on
timeout and that the output is captured.
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Just a minor cleanup; I went looking where these options were used, and
these occurrences set then to their default value. Removing these
assignments makes it easier to find where they're actually used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was deprecated in edac92409a, which
was part of 18.09 and up, so should be safe by now to remove this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and only a "convenience" type,
not used to detect a specific error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were deprecated in ee230d8fdd,
which is in the 22.06 branch, so we can safely remove it from
master to have them removed in the release after that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Terminating the exec process when the context is canceled has been
broken since Docker v17.11 so nobody has been able to depend upon that
behaviour in five years of releases. We are thus free from backwards-
compatibility constraints.
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Treat (storage/graph)Driver as snapshotter
Also moved some layerStore related initialization to the non-c8d case
because otherwise they get treated as a graphdriver plugins.
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since runtimes can now just be containerd shims, we need to check if the
reference is possibly a containerd shim.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Update the profile to make use of CAP_BPF and CAP_PERFMON capabilities. Prior to
kernel 5.8, bpf and perf_event_open required CAP_SYS_ADMIN. This change enables
finer control of the privilege setting, thus allowing us to run certain system
tracing tools with minimal privileges.
Based on the original patch from Henry Wang in the containerd repository.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `-g` / `--graph` options were soft deprecated in favor of `--data-root` in
261ef1fa27 (v17.05.0) and at the time considered
to not be removed. However, with the move towards containerd snapshotters, having
these options around adds additional complexity to handle fallbacks for deprecated
(and hidden) flags, so completing the deprecation.
With this patch:
dockerd --graph=/var/lib/docker --validate
Flag --graph has been deprecated, Use --data-root instead
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
mkdir -p /etc/docker
echo '{"graph":"/var/lib/docker"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used as an intermediate variable to store what's returned
by layerstore.DriverName() / ImageService.StorageDriver()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the function name more generic, as it's no longer used only
for graphdrivers but also for snapshotters.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Makes sure that tests use a config struct that's more representative
to how it's used in the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Makes sure that tests use a config struct that's more representative
to how it's used in the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This centralizes more defaults, to be part of the config struct that's
created, instead of interweaving the defaults with other code in various
places.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the information stored as part of the container for the error-message,
instead of querying the current storage driver from the daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The check was accounting for old containers that did not have a storage-driver
set in their config, and was added in 4908d7f81d
for docker v0.7.0-rc6 - nearly 9 Years ago, so very likely nobody is still
depending on this ;-)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 0cba7740d4, as part of
the LCOW implementation. LCOW support has been removed, so we can remove
this check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently only provides the existing "platform" option, but more
options will be added in follow-ups.
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The legacy v1 is not supported by the containerd import
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- return early if we're expecting a system-containerd
- rename `initContainerD` to `initContainerd` ':)
- remove .Config to reduce verbosity
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Change the log-level for messages about starting the managed containerd instance
to be the same as for the main API. And remove a redundant debug-log.
With this patch:
dockerd
INFO[2022-08-11T11:46:32.573299176Z] Starting up
INFO[2022-08-11T11:46:32.574304409Z] containerd not running, starting managed containerd
INFO[2022-08-11T11:46:32.575289181Z] started new containerd process address=/var/run/docker/containerd/containerd.sock module=libcontainerd pid=5370
cmd/dockerd: initContainerD(): clean-up some logs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
the `--log-level` flag overrides whatever is in the containerd configuration file;
f033f6ff85/cmd/containerd/command/main.go (L339-L352)
Given that we set that flag when we start the containerd binary, there is no need
to write it both to the generated config-file and pass it as flag.
This patch also slightly changes the behavior; as both dockerd and containerd use
"info" as default log-level, don't set the log-level if it's the default.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Adding a remote.configFile to store the location instead of re-constructing its
location each time. Also fixing a minor inconsistency in the error formats.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Adding a remote.pidFile to store the location instead of re-constructing its
location each time. Also performing a small refactor to use `strconv.Itoa`
instead of `fmt.Sprintf`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Containerd, like dockerd has a OOMScore configuration option to adjust its own
OOM score. In dockerd, this option was added when default installations were not
yet running the daemon as a systemd unit, which made it more complicated to set
the score, and adding a daemon option was convenient.
A binary adjusting its own score has been frowned upon, as it's more logical to
make that the responsibility of the process manager _starting_ the daemon, which
is what we did for dockerd in 21578530d7.
There have been discussions on deprecating the daemon flag for dockerd, and
similar discussions have been happening for containerd.
This patch changes how we set the OOM score for the containerd child process,
and to have dockerd (supervisor) set the OOM score, as it's acting as process
manager in this case (performing a role similar to systemd otherwise).
With this patch, the score is still adjusted as usual, but not written to the
containerd configuration file;
dockerd --oom-score-adjust=-123
cat /proc/$(pidof containerd)/oom_score_adj
-123
As a follow-up, we may consider to adjust the containerd OOM score based on the
daemon's own score instead of on the `cli.OOMScoreAdjust` configuration so that
we will also adjust the score in situations where dockerd's OOM score was set
through other ways (systemd or manually adjusting the cgroup). A TODO was added
for this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Consider Address() (Config.GRPC.Addres) to be the source of truth for
the location of the containerd socket.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This RWMutex was added in 9c4570a958, and used in
the `remote.Client()` method. Commit dd2e19ebd5
split the code for client and daemon, but did not remove the mutex.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- properly use filepath.Join() for Windows paths
- use getDaemonConfDir() to get the path for storing daemon.json
- return error instead of logrus.Fatal(). The logrus.Fatal() was a left-over
from when Cobra was not used, and appears to have been missed in commit
fb83394714, which did the conversion to Cobra.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were deprecated in 098a44c07f, which is
in the 22.06 branch, and no longer in use since e05f614267
so we can remove them from the master branch.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 6068d1894d...48dd89375d
Finishes off the work to change references to cluster volumes in the API
from using "csi" as the magic word to "cluster". This reflects that the
volumes are "cluster volumes", not "csi volumes".
Notably, there is no change to the plugin definitions being "csinode"
and "csicontroller". This terminology is appropriate with regards to
plugins because it accurates reflects what the plugin is.
Signed-off-by: Drew Erny <derny@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
rename some variables that collided with imports or (upcoming)
changes, e.g. `ctx`, which is commonly used for `context.Context`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
My IDE was complaining about some things;
- fix inconsistent receiver name (i vs s)
- fix some variables that collided with imports
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/containerd/v1.6.6...v1.6.7
Welcome to the v1.6.7 release of containerd!
The seventh patch release for containerd 1.6 contains various fixes,
includes a new version of runc and adds support for ppc64le and riscv64
(requires unreleased runc 1.2) builds.
Notable Updates
- Update runc to v1.1.3
- Seccomp: Allow clock_settime64 with CAP_SYS_TIME
- Fix WWW-Authenticate parsing
- Support RISC-V 64 and ppc64le builds
- Windows: Update hcsshim to v0.9.4 to fix regression with HostProcess stats
- Windows: Fix shim logs going to panic.log file
- Allow ptrace(2) by default for kernels >= 4.8
See the changelog for complete list of changes
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/containerd/v1.6.6...v1.6.7
Welcome to the v1.6.7 release of containerd!
The seventh patch release for containerd 1.6 contains various fixes,
includes a new version of runc and adds support for ppc64le and riscv64
(requires unreleased runc 1.2) builds.
Notable Updates
- Update runc to v1.1.3
- Seccomp: Allow clock_settime64 with CAP_SYS_TIME
- Fix WWW-Authenticate parsing
- Support RISC-V 64 and ppc64le builds
- Windows: Update hcsshim to v0.9.4 to fix regression with HostProcess stats
- Windows: Fix shim logs going to panic.log file
- Allow ptrace(2) by default for kernels >= 4.8
See the changelog for complete list of changes
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update Go runtime to 1.18.5 to address CVE-2022-32189.
Full diff: https://github.com/golang/go/compare/go1.18.4...go1.18.5
--------------------------------------------------------
From the security announcement:
https://groups.google.com/g/golang-announce/c/YqYYG87xB10
We have just released Go versions 1.18.5 and 1.17.13, minor point
releases.
These minor releases include 1 security fixes following the security
policy:
encoding/gob & math/big: decoding big.Float and big.Rat can panic
Decoding big.Float and big.Rat types can panic if the encoded message is
too short.
This is CVE-2022-32189 and Go issue https://go.dev/issue/53871.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.18.5
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The existing implementation used a `nil` value for the CRI plugin's configuration
to indicate that the plugin had to be disabled. Effectively, the `Plugins` value
was only used as an intermediate step, only to be removed later on, and to instead
add the given plugin to `DisabledPlugins` in the containerd configuration.
This patch removes the intermediate step; as a result we also don't need to mask
the containerd `Plugins` field, which was added to allow serializing the toml.
A code comment was added as well to explain why we're (currently) disabling the
CRI plugin by default, which may help future visitors of the code to determin
if that default is still needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes the `WithRemoteAddr()`, `WithRemoteAddrUser()`, `WithDebugAddress()`,
and `WithMetricsAddress()` options, added in ddae20c032,
but most of them were never used, and `WithRemoteAddr()` no longer in use since
dd2e19ebd5.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some error conditions returned a non-typed error, which would be returned
as a 500 status by the API. This patch;
- Updates such errors to return an errdefs.InvalidParameter type
- Introduces a locally defined `invalidParam{}` type for convenience.
- Updates some error-strings to match Go conventions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This has been around for a long time - since v17.04 (API v1.28)
but was never documented.
It allows removing a plugin even if it's still in use.
Signed-off-by: Milas Bowman <milas.bowman@docker.com>
Commit 7a9cb29fb9 added a new "platform" query-
parameter to the `POST /containers/create` endpoint, but did not update the
swagger file and documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7a9cb29fb9 added a new "platform" query-
parameter to the `POST /containers/create` endpoint, but did not update the
swagger file and documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7a9cb29fb9 added a new "platform" query-
parameter to the `POST /containers/create` endpoint, but did not update the
swagger file and documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Making the api types more focused per API type, and the general
api/types package somewhat smaller.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- TestServiceLogsCompleteness runs service with command to write 6 log
lines but as command exits immediately, service is restarted and 6 more
lines are printed in logs, which confuses the checker.Equals(6) check.
- TestServiceLogsSince runs service with command to write 3 log lines,
and service restart can also affect it's checks.
Let's change from `tail` which exits immediately to `tail -f` which
hangs forever, this way we would not confuse checks with more log lines
when expected.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Contrary to popular belief, the OCI Runtime specification does not
specify the command-line API for runtimes. Looking at containerd's
architecture from the lens of the OCI Runtime spec, the _shim_ is the
OCI Runtime and runC is "just" an implementation detail of the
io.containerd.runc.v2 runtime. When one configures a non-default runtime
in Docker, what they're really doing is instructing Docker to create
containers using the io.containerd.runc.v2 runtime with a configuration
option telling the runtime that the runC binary is at some non-default
path. Consequently, only OCI runtimes which are compatible with the
io.containerd.runc.v2 shim, such as crun, can be used in this manner.
Other OCI runtimes, including kata-containers v2, come with their own
containerd shim and are not compatible with io.containerd.runc.v2.
As Docker has not historically provided a way to select a non-default
runtime which requires its own shim, runtimes such as kata-containers v2
could not be used with Docker.
Allow other containerd shims to be used with Docker; no daemon
configuration required. If the daemon is instructed to create a
container with a runtime name which does not match any of the configured
or stock runtimes, it passes the name along to containerd verbatim. A
user can start a container with the kata-containers runtime, for
example, simply by calling
docker run --runtime io.containerd.kata.v2
Runtime names which containerd would interpret as a path to an arbitrary
binary are disallowed. While handy for development and testing it is not
strictly necessary and would allow anyone with Engine API access to
trivially execute any binary on the host as root, so we have decided it
would be safest for our users if it was not allowed.
It is not yet possible to set an alternative containerd shim as the
default runtime; it can only be configured per-container.
Signed-off-by: Cory Snider <csnider@mirantis.com>
doCopyXattrs() never reached due to copyXattrs boolean being false, as
a result file capabilities not being copied.
moved copyXattr() out of doCopyXattrs()
Signed-off-by: Illo Abdulrahim <abdulrahim.illo@nokia.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implement --follow entirely correctly for the journald log reader, such
that it exits immediately upon reading back the last log message written
to the journal before the logger was closed. The impossibility of doing
so has been slightly exaggerated.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Fix journald and logfile-powered (jsonfile, local) log readers
incorrectly filtering out messages with timestamps < Since which were
preceded by a message with a timestamp >= Since.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Careful management of the journal read pointer is sufficient to ensure
that no entry is read more than once.
Unit test the journald logger without requiring a running journald by
using the systemd-journal-remote command to write arbitrary entries to
journal files.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Wrap the libsystemd journal reading functionality in a more idiomatic Go
API and refactor the journald logging driver's ReadLogs implementation
to use the wrapper. Rewrite the parts of the ReadLogs implementation in
Go which were previously implemented in C as part of the cgo preamble.
Separating the business logic from the cgo minutiae should hopefully
make the code more accessible to a wider audience of developers for
reviewing the code and contributing improvements.
The structure of the ReadLogs implementation is retained with few
modifications. Any ignored errors were also ignored before the refactor;
the explicit error return values afforded by the sdjournal wrapper makes
this more obvious.
The package github.com/coreos/go-systemd/v22/sdjournal also provides a
more idiomatic Go wrapper around libsystemd. It is unsuitable for our
needs as it does not expose wrappers for the sd_journal_process and
sd_journal_get_fd functions.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Ensure the package can be imported, no matter the build constratints, by
adding an unconstrained doc.go containing a package statement.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This test is skipped on Windows anyway.
Also add a short explanation why emptyfs image was chosen.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Reformat/rename some bits to better align with the old implementation for
easier comparing, and add some TODOs for tracking issues on remaining work.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In network node change test, the expected behavior is focused on how many nodes
left in networkDB, besides timing issues, things would also go tricky for a
leave-then-join sequence, if the check (counting the nodes) happened before the
first "leave" event, then the testcase actually miss its target and report PASS
without verifying its final result; if the check happened after the 'leave' event,
but before the 'join' event, the test would report FAIL unnecessary;
This code change would check both the db changes and the node count, it would
report PASS only when networkdb has indeed changed and the node count is expected.
Signed-off-by: David Wang <00107082@163.com>
Not all filters are implemented yet, so make sure an error
is returned if a not-yet implemented filter is used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 6cdc4ba6cd in 2016, likely
because at the time we were still building for CentOS 6 and Ubuntu 14.04.
All currently supported distros appear to be on _at least_ 219 now, so it looks
safe to remove this;
```bash
docker run -it --rm centos:7
yum install -y systemd-devel
pkg-config 'libsystemd >= 209' && echo "OK" || echo "KO"
OK
pkg-config --print-provides 'libsystemd'
libsystemd = 219
pkg-config --print-provides 'libsystemd-journal'
libsystemd-journal = 219
```
And on a `debian:buster` (old stable)
```bash
docker run -it --rm debian:buster
apt-get update && apt-get install -y libsystemd-dev pkg-config
pkg-config 'libsystemd >= 209' && echo "OK" || echo "KO"
OK
pkg-config --print-provides 'libsystemd'
libsystemd = 241
pkg-config --print-provides 'libsystemd-journal'
Package libsystemd-journal was not found in the pkg-config search path.
Perhaps you should add the directory containing `libsystemd-journal.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libsystemd-journal' found
```
OpenSUSE leap (I think that's built for s390x)
```bash
docker run -it --rm docker.io/opensuse/leap:15
zypper install -y systemd-devel
pkg-config 'libsystemd >= 209' && echo "OK" || echo "KO"
OK
pkg-config --print-provides 'libsystemd'
libsystemd = 246
pkg-config --print-provides 'libsystemd-journal'
Package libsystemd-journal was not found in the pkg-config search path.
Perhaps you should add the directory containing `libsystemd-journal.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libsystemd-journal' found
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was introduced in 906b979b88, which changed
a `goto` to a `break`, but afaics, the intent was still to break out of the loop.
(linter didn't catch this before because it didn't have the right build-tag set)
daemon/logger/journald/read.go:238:4: SA4011: ineffective break statement. Did you mean to break out of the outer loop? (staticcheck)
break // won't be able to write anything anymore
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this change there was a race condition between State.Wait reading
the exit code from State and the State being changed instantly after the
change which ended the State.Wait.
Now, each State.Wait has its own channel which is used to transmit the
desired StateStatus at the time the state transitions to the awaited
one. Wait no longer reads the status by itself so there is no race.
The issue caused the `docker run --restart=always ...' to sometimes exit
with 0 exit code, because the process was already restarted by the time
State.Wait got the chance to read the exit code.
Test run
--------
Before:
```
$ go test -count 1 -run TestCorrectStateWaitResultAfterRestart .
--- FAIL: TestCorrectStateWaitResultAfterRestart (0.00s)
state_test.go:198: expected exit code 10, got 0
FAIL
FAIL github.com/docker/docker/container 0.011s
FAIL
```
After:
```
$ go test -count 1 -run TestCorrectStateWaitResultAfterRestart .
ok github.com/docker/docker/container 0.011s
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This caused a race condition where AutoRemove could be restored before
container was considered for restart and made autoremove containers
impossible to restart.
```
$ make DOCKER_GRAPHDRIVER=vfs BIND_DIR=. TEST_FILTER='TestContainerWithAutoRemoveCanBeRestarted' TESTFLAGS='-test.count 1' test-integration
...
=== RUN TestContainerWithAutoRemoveCanBeRestarted
=== RUN TestContainerWithAutoRemoveCanBeRestarted/kill
=== RUN TestContainerWithAutoRemoveCanBeRestarted/stop
--- PASS: TestContainerWithAutoRemoveCanBeRestarted (1.61s)
--- PASS: TestContainerWithAutoRemoveCanBeRestarted/kill (0.70s)
--- PASS: TestContainerWithAutoRemoveCanBeRestarted/stop (0.86s)
PASS
DONE 3 tests in 3.062s
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This splits the ImageService methods to separate files, to closer
match the existing implementation, and to reduce the amount of code
per file, making it easier to read, and to reduce merge conflicts if
new functionality is added.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We use "specs" as alias in most places; rename the alias here accordingly
to prevent confusiong and reduce the risk of introducing duplicate imports.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Cory has actively participated in the project for many months, assisted in several
security advisories, code review, and triage, and (in short) already acted a
maintainer for some time (thank you!).
I nominated Cory as a maintainer per e-mail, and we reached quorum, so opening
this pull request to (should he choose to accept it) be added as a maintainer.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Initial pull/ls works
Build is deactivated if the feature is active
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The correct formatting for machine-readable comments is;
//<some alphanumeric identifier>:<options>[,<option>...][ // comment]
Which basically means:
- MUST NOT have a space before `<identifier>` (e.g. `nolint`)
- Identified MUST be alphanumeric
- MUST be followed by a colon
- MUST be followed by at least one `<option>`
- Optionally additional `<options>` (comma-separated)
- Optionally followed by a comment
Any other format will not be considered a machine-readable comment by `gofmt`,
and thus formatted as a regular comment. Note that this also means that a
`//nolint` (without anything after it) is considered invalid, same for `//#nosec`
(starts with a `#`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.18.4 (released 2022-07-12) includes security fixes to the compress/gzip,
encoding/gob, encoding/xml, go/parser, io/fs, net/http, and path/filepath
packages, as well as bug fixes to the compiler, the go command, the linker,
the runtime, and the runtime/metrics package. See the Go 1.18.4 milestone on the
issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.18.4+label%3ACherryPickApproved
This update addresses:
CVE-2022-1705, CVE-2022-1962, CVE-2022-28131, CVE-2022-30630, CVE-2022-30631,
CVE-2022-30632, CVE-2022-30633, CVE-2022-30635, and CVE-2022-32148.
Full diff: https://github.com/golang/go/compare/go1.18.3...go1.18.4
From the security announcement;
https://groups.google.com/g/golang-announce/c/nqrv9fbR0zE
We have just released Go versions 1.18.4 and 1.17.12, minor point releases. These
minor releases include 9 security fixes following the security policy:
- net/http: improper sanitization of Transfer-Encoding header
The HTTP/1 client accepted some invalid Transfer-Encoding headers as indicating
a "chunked" encoding. This could potentially allow for request smuggling, but
only if combined with an intermediate server that also improperly failed to
reject the header as invalid.
This is CVE-2022-1705 and https://go.dev/issue/53188.
- When `httputil.ReverseProxy.ServeHTTP` was called with a `Request.Header` map
containing a nil value for the X-Forwarded-For header, ReverseProxy would set
the client IP as the value of the X-Forwarded-For header, contrary to its
documentation. In the more usual case where a Director function set the
X-Forwarded-For header value to nil, ReverseProxy would leave the header
unmodified as expected.
This is https://go.dev/issue/53423 and CVE-2022-32148.
Thanks to Christian Mehlmauer for reporting this issue.
- compress/gzip: stack exhaustion in Reader.Read
Calling Reader.Read on an archive containing a large number of concatenated
0-length compressed files can cause a panic due to stack exhaustion.
This is CVE-2022-30631 and Go issue https://go.dev/issue/53168.
- encoding/xml: stack exhaustion in Unmarshal
Calling Unmarshal on a XML document into a Go struct which has a nested field
that uses the any field tag can cause a panic due to stack exhaustion.
This is CVE-2022-30633 and Go issue https://go.dev/issue/53611.
- encoding/xml: stack exhaustion in Decoder.Skip
Calling Decoder.Skip when parsing a deeply nested XML document can cause a
panic due to stack exhaustion. The Go Security team discovered this issue, and
it was independently reported by Juho Nurminen of Mattermost.
This is CVE-2022-28131 and Go issue https://go.dev/issue/53614.
- encoding/gob: stack exhaustion in Decoder.Decode
Calling Decoder.Decode on a message which contains deeply nested structures
can cause a panic due to stack exhaustion.
This is CVE-2022-30635 and Go issue https://go.dev/issue/53615.
- path/filepath: stack exhaustion in Glob
Calling Glob on a path which contains a large number of path separators can
cause a panic due to stack exhaustion.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2022-30632 and Go issue https://go.dev/issue/53416.
- io/fs: stack exhaustion in Glob
Calling Glob on a path which contains a large number of path separators can
cause a panic due to stack exhaustion.
This is CVE-2022-30630 and Go issue https://go.dev/issue/53415.
- go/parser: stack exhaustion in all Parse* functions
Calling any of the Parse functions on Go source code which contains deeply
nested types or declarations can cause a panic due to stack exhaustion.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2022-1962 and Go issue https://go.dev/issue/53616.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes;
- VolumeCreateBody (alias for CreateOptions)
- VolumeListOKBody (alias for ListResponse)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes;
- ContainerCreateCreatedBody (alias for CreateResponse)
- ContainerWaitOKBody (alias for WaitResponse)
- ContainerWaitOKBodyError (alias for WaitExitError)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 22.06 branch was created, so changes in master/main should now be
targeting the next version of the API (1.43).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add pkey_alloc(2), pkey_free(2) and pkey_mprotect(2) in seccomp default profile.
pkey_alloc(2), pkey_free(2) and pkey_mprotect(2) can only configure
the calling process's own memory, so they are existing "safe for everyone" syscalls.
close issue: #43481
Signed-off-by: zhubojun <bojun.zhu@foxmail.com>
- Update IsErrNotFound() to check for the current type before falling back to
detecting the deprecated type.
- Remove unauthorizedError and notImplementedError types, which were not used.
- IsErrPluginPermissionDenied() was added in 7c36a1af03,
but not used at the time, and still appears to be unused.
- Deprecate IsErrUnauthorized in favor of errdefs.IsUnauthorized()
- Deprecate IsErrNotImplemented in favor of errdefs,IsNotImplemented()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Older versions of Go don't format comments, so committing this as
a separate commit, so that we can already make these changes before
we upgrade to Go 1.19.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was caught by goimports;
goimports -w $(find . -type f -name '*.go'| grep -v "/vendor/")
CI doesn't run on these platforms, so didn't catch it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This uses the newer github issues forms to create bug reports and
feature requests.
It also makes it harder for people to accidentally open a blank issue
since there are required fields that must be filled in.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Inlining the string makes the code more grep'able; renaming the
const to "driverName" to reflect the remaining uses of it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Inlining the string makes the code more grep'able; renaming the
const to "driverName" to reflect the remaining uses of it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was effectively a constructor, but through some indirection; make it a
regular function, which is a bit more idiomatic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was effectively a constructor, but through some indirection; make it a
regular function, which is a bit more idiomatic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/opencontainers/runc/compare/v1.1.2...v1.1.3
This is the third release of the 1.1.z series of runc, and contains
various minor improvements and bugfixes.
- Our seccomp `-ENOSYS` stub now correctly handles multiplexed syscalls on
s390 and s390x. This solves the issue where syscalls the host kernel did not
support would return `-EPERM` despite the existence of the `-ENOSYS` stub
code (this was due to how s390x does syscall multiplexing).
- Retry on dbus disconnect logic in libcontainer/cgroups/systemd now works as
intended; this fix does not affect runc binary itself but is important for
libcontainer users such as Kubernetes.
- Inability to compile with recent clang due to an issue with duplicate
constants in libseccomp-golang.
- When using systemd cgroup driver, skip adding device paths that don't exist,
to stop systemd from emitting warnings about those paths.
- Socket activation was failing when more than 3 sockets were used.
- Various CI fixes.
- Allow to bind mount `/proc/sys/kernel/ns_last_pid` to inside container.
- runc static binaries are now linked against libseccomp v2.5.4.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
iptables package has a function `detectIptables()` called to initialize
some local variables. Since v20.10.0, it first looks for iptables bin,
then ip6tables and finally it checks what iptables flags are available
(including -C). It early exits when ip6tables isn't available, and
doesn't execute the last check.
To remove port mappings (eg. when a container stops/dies), Docker
first checks if those NAT rules exist and then deletes them. However, in
the particular case where there's no ip6tables bin available, iptables
`-C` flag is considered unavailable and thus it looks for NAT rules by
using some substring matching. This substring matching then fails
because `iptables -t nat -S POSTROUTING` dumps rules in a slighly format
than what's expected.
For instance, here's what `iptables -t nat -S POSTROUTING` dumps:
```
-A POSTROUTING -s 172.18.0.2/32 -d 172.18.0.2/32 -p tcp -m tcp --dport 9999 -j MASQUERADE
```
And here's what Docker looks for:
```
POSTROUTING -p tcp -s 172.18.0.2 -d 172.18.0.2 --dport 9999 -j MASQUERADE
```
Because of that, those rules are considered non-existant by Docker and
thus never deleted. To fix that, this change reorders the code in
`detectIptables()`.
Fixes#42127.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Starting with the 22.06 release, buildx is the default client for
docker build, which uses BuildKit as builder.
This patch changes the default builder version as advertised by
the daemon to "2" (BuildKit), so that pre-22.06 CLIs with BuildKit
support (but no buildx installed) also default to using BuildKit
when interacting with a 22.06 (or up) daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
updates the "logentries" dependency;
- checking error when calling output
- Support Go Modules
full diff: 7a984a84b5...fc06dab2ca
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While doing a boltdb operation and if the bucket is not found
we should not return a boltdb specific bucket not found error
because this causes leaky abstraction where in the user of libkv
needs to know about boltdb and import boltdb dependencies
neither of which is desirable. Replaced all the bucket not found
errors with the more generic `store.ErrKeyNotFound` error which
is more appropriate.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When using AtomicPut with 'previous' set at nil, it interprets
that the Key should be created with the AtomicPut. Instead of
returning a generic error, we return store.ErrKeyExists if the
key exists in the store during the operation.
Signed-off-by: Alexandre Beslic <abronan@docker.com>
Currently boltdb uses a handle which can be accessed
concurrently from multiple go routines and all of them
trying to open and close the boldb db handle which can
cause havoc. Use a mutex to serialize db access and
handle access.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This commit migrates the old 'go-etcd' client, which is deprecated
to the new 'coreos/etcd/client'.
One notable change is the ability to specify an 'IsDir' parameter
to the 'Put' call. This allows to circumvent the limitations of etcd
regarding the key/directory distinction while setting up Watches on
a directory. A conservative measure to set up a watch that should be
used the same way for etcd/consul/zookeeper is to enforce the 'IsDir'
parameter with 'WriteOptions' on 'Put' to avoid the 'NotANode' error
thrown by etcd on Watch call. Consul and zookeeper are not using the
'IsDir' parameter.
Signed-off-by: Alexandre Beslic <abronan@docker.com>
AtomicPut can now be used to Compare-and-Swap against the state
where the key doesn't yet exist. E.g. a race where two clients
create the same key: one succeeds, the other fails.
Pass nil for the previous argument of AtomicPut for this
behavior. Before, this would cause an error.
Implements this change for all three backends.
run:echo "::error::PR title suggests targetting the ${{ steps.title_branch.outputs.branch }} branch, but is opened against ${{ github.event.pull_request.base.ref }}" && exit 1
DOCKER_GITCOMMIT:=$(shell git rev-parse --short HEAD ||echo unsupported)
DOCKER_GITCOMMIT:=$(shell git rev-parse HEAD)
exportDOCKER_GITCOMMIT
# allow overriding the repository and branch that validation scripts are running
@ -20,6 +16,9 @@ export VALIDATE_REPO
exportVALIDATE_BRANCH
exportVALIDATE_ORIGIN_BRANCH
exportPAGER
exportGIT_PAGER
# env vars passed through directly to Docker's build scripts
# to allow things like `make KEEPBUNDLE=1 binary` easily
# `project/PACKAGERS.md` have some limited documentation of some of these
@ -28,7 +27,7 @@ export VALIDATE_ORIGIN_BRANCH
# option of "go build". For example, a built-in graphdriver priority list
# can be changed during build time like this:
#
# make DOCKER_LDFLAGS="-X github.com/docker/docker/daemon/graphdriver.priority=overlay2,devicemapper" dynbinary
# make DOCKER_LDFLAGS="-X github.com/docker/docker/daemon/graphdriver.priority=overlay2,zfs" dynbinary
#
DOCKER_ENVS:=\
-e BUILDFLAGS \
@ -40,6 +39,10 @@ DOCKER_ENVS := \
-e DOCKER_BUILDKIT \
-e DOCKER_BASH_COMPLETION_PATH \
-e DOCKER_CLI_PATH \
-e DOCKERCLI_VERSION \
-e DOCKERCLI_REPOSITORY \
-e DOCKERCLI_INTEGRATION_VERSION \
-e DOCKERCLI_INTEGRATION_REPOSITORY \
-e DOCKER_DEBUG \
-e DOCKER_EXPERIMENTAL \
-e DOCKER_GITCOMMIT \
@ -56,9 +59,11 @@ DOCKER_ENVS := \
-e GITHUB_ACTIONS \
-e TEST_FORCE_VALIDATE \
-e TEST_INTEGRATION_DIR \
-e TEST_INTEGRATION_USE_SNAPSHOTTER \
-e TEST_INTEGRATION_FAIL_FAST \
-e TEST_SKIP_INTEGRATION \
-e TEST_SKIP_INTEGRATION_CLI \
-e TEST_IGNORE_CGROUP_CHECK \
-e TESTCOVERAGE \
-e TESTDEBUG \
-e TESTDIRS \
@ -74,7 +79,12 @@ DOCKER_ENVS := \
-e PLATFORM \
-e DEFAULT_PRODUCT_LICENSE \
-e PRODUCT \
-e PACKAGER_NAME
-e PACKAGER_NAME \
-e PAGER \
-e GIT_PAGER \
-e OTEL_EXPORTER_OTLP_ENDPOINT \
-e OTEL_EXPORTER_OTLP_PROTOCOL \
-e OTEL_SERVICE_NAME
# note: we _cannot_ add "-e DOCKER_BUILDTAGS" here because even if it's unset in the shell, that would shadow the "ENV DOCKER_BUILDTAGS" set in our Dockerfile, which is very important for our official builds
# to allow `make BIND_DIR=. shell` or `make BIND_DIR= test`
@ -14,7 +14,7 @@ Moby is an open project guided by strong principles, aiming to be modular, flexi
It is open to the community to help set its direction.
- Modular: the project includes lots of components that have well-defined functions and APIs that work together.
- Batteries included but swappable: Moby includes enough components to build fully featured container system, but its modular architecture ensures that most of the components can be swapped by different implementations.
- Batteries included but swappable: Moby includes enough components to build fully featured container systems, but its modular architecture ensures that most of the components can be swapped by different implementations.
- Usable security: Moby provides secure defaults without compromising usability.
- Developer focused: The APIs are intended to be functional and useful to build powerful tools.
They are not necessarily intended as end user tools but as components aimed at developers.
@ -37,6 +37,6 @@ There is hopefully enough example material in the file for you to copy a similar
When you make edits to `swagger.yaml`, you may want to check the generated API documentation to ensure it renders correctly.
Run `make swagger-docs` and a preview will be running at `http://localhost`. Some of the styling may be incorrect, but you'll be able to ensure that it is generating the correct documentation.
Run `make swagger-docs` and a preview will be running at `http://localhost:9000`. Some of the styling may be incorrect, but you'll be able to ensure that it is generating the correct documentation.
The production documentation is generated by vendoring `swagger.yaml` into [docker/docker.github.io](https://github.com/docker/docker.github.io).
expectedErr:fmt.Sprintf("invalid API version: the minimum API version (%s) is higher than the default version (%s)",api.DefaultVersion,api.MinSupportedAPIVersion),
},
{
doc:"invalid default too low",
defaultVersion:"0.1",
minVersion:api.MinSupportedAPIVersion,
expectedErr:fmt.Sprintf("invalid default API version (0.1): must be between %s and %s",api.MinSupportedAPIVersion,api.DefaultVersion),
},
{
doc:"invalid default too high",
defaultVersion:"9999.9999",
minVersion:api.DefaultVersion,
expectedErr:fmt.Sprintf("invalid default API version (9999.9999): must be between %s and %s",api.MinSupportedAPIVersion,api.DefaultVersion),
},
{
doc:"invalid minimum too low",
defaultVersion:api.MinSupportedAPIVersion,
minVersion:"0.1",
expectedErr:fmt.Sprintf("invalid minimum API version (0.1): must be between %s and %s",api.MinSupportedAPIVersion,api.DefaultVersion),
},
{
doc:"invalid minimum too high",
defaultVersion:api.DefaultVersion,
minVersion:"9999.9999",
expectedErr:fmt.Sprintf("invalid minimum API version (9999.9999): must be between %s and %s",api.MinSupportedAPIVersion,api.DefaultVersion),
errString:"client version 0.1 is too old. Minimum supported API version is 1.2.0, please upgrade your client to a newer version",
errString:fmt.Sprintf("client version 0.1 is too old. Minimum supported API version is %s, please upgrade your client to a newer version",api.MinSupportedAPIVersion),
},
{
reqVersion:"9999.9999",
errString:"client version 9999.9999 is too new. Maximum supported API version is 1.10.0",
errString:fmt.Sprintf("client version 9999.9999 is too new. Maximum supported API version is %s",api.DefaultVersion),
// There is existing endpoint config - if it's not indexed by NetworkMode.Name(), we
// can't tell which network the container-wide settings was intended for. NetworkMode,
// the keys in EndpointsConfig and the NetworkID in EndpointsConfig may mix network
// name/id/short-id. It's not safe to create EndpointsConfig under the NetworkMode
// name to store the container-wide MAC address, because that may result in two sets
// of EndpointsConfig for the same network and one set will be discarded later. So,
// reject the request ...
ep,ok:=networkingConfig.EndpointsConfig[nwName]
if!ok{
return"",errdefs.InvalidParameter(errors.New("if a container-wide MAC address is supplied, HostConfig.NetworkMode must match the identity of a network in NetworkSettings.Networks"))
}
// ep is the endpoint that needs the container-wide MAC address; migrate the address
// to it, or bail out if there's a mismatch.
ifep.MacAddress==""{
ep.MacAddress=deprecatedMacAddress
}elseifep.MacAddress!=deprecatedMacAddress{
return"",errdefs.InvalidParameter(errors.New("the container-wide MAC address must match the endpoint-specific MAC address for the main network, or be left empty"))
}
}
}
warning="The container-wide MacAddress field is now deprecated. It should be specified in EndpointsConfig instead."
config.MacAddress=""//nolint:staticcheck // ignore SA1019: field is deprecated, but still used on API < v1.44.
stream:=grpc.StreamInterceptor(grpc_middleware.ChainStreamServer(otelgrpc.StreamServerInterceptor(),grpcerrors.StreamServerInterceptor))//nolint:staticcheck // TODO(thaJeztah): ignore SA1019 for deprecated options: see https://github.com/moby/moby/issues/47437
withTrace:=otelgrpc.UnaryServerInterceptor()//nolint:staticcheck // TODO(thaJeztah): ignore SA1019 for deprecated options: see https://github.com/moby/moby/issues/47437
the URL is not supported by the daemon, a HTTP `400 Bad Request` error message
is returned.
If you omit the version-prefix, the current version of the API (v1.42) is used.
For example, calling `/info` is the same as calling `/v1.42/info`. Using the
If you omit the version-prefix, the current version of the API (v1.45) is used.
For example, calling `/info` is the same as calling `/v1.45/info`. Using the
API without a version-prefix is deprecated and will be removed in a future release.
Engine releases in the near future should support this version of the API,
@ -388,6 +388,20 @@ definitions:
description:"Create mount point on host if missing"
type:"boolean"
default:false
ReadOnlyNonRecursive:
description:|
Make the mount non-recursively read-only, but still leave the mount recursive
(unless NonRecursive is set to `true` in conjunction).
Addded in v1.44, before that version all read-only mounts were
non-recursive by default. To match the previous behaviour this
will default to `true` for clients on versions prior to v1.44.
type:"boolean"
default:false
ReadOnlyForceRecursive:
description:"Raise an error if the mount cannot be made recursively read-only."
type:"boolean"
default:false
VolumeOptions:
description:"Optional configuration for the `volume` type."
type:"object"
@ -413,6 +427,10 @@ definitions:
type:"object"
additionalProperties:
type:"string"
Subpath:
description:"Source path inside the volume. Must be relative without any back traversals."
type:"string"
example:"dir-inside-volume/subdirectory"
TmpfsOptions:
description:"Optional configuration for the `tmpfs` type."
type:"object"
@ -794,6 +812,12 @@ definitions:
1000000(1 ms). 0 means inherit.
type:"integer"
format:"int64"
StartInterval:
description:|
The time to wait between checks in nanoseconds during the start period.
It should be 0 or at least 1000000 (1 ms). 0 means inherit.
type:"integer"
format:"int64"
Health:
description:|
@ -976,6 +1000,13 @@ definitions:
items:
type:"integer"
minimum:0
Annotations:
type:"object"
description:|
Arbitrary non-identifying metadata attached to container and
provided to the runtime when the container is started.
additionalProperties:
type:"string"
# Applicable to UNIX platforms
CapAdd:
@ -1122,6 +1153,7 @@ definitions:
remapping option is enabled.
ShmSize:
type:"integer"
format:"int64"
description:|
Size of `/dev/shm` in bytes. If omitted, the system uses 64MB.
minimum:0
@ -1289,7 +1321,10 @@ definitions:
type:"boolean"
x-nullable:true
MacAddress:
description:"MAC address of the container."
description:|
MAC address of the container.
Deprecated:this field is deprecated in API v1.44 and up. Use EndpointSettings.MacAddress instead.
type:"string"
x-nullable:true
OnBuild:
@ -1339,16 +1374,16 @@ definitions:
EndpointsConfig:
description:|
A mapping of network name to endpoint configuration for that network.
The endpoint configuration can be left empty to connect to that
network with no particular endpoint configuration.
type:"object"
additionalProperties:
$ref:"#/definitions/EndpointSettings"
example:
# putting an example here, instead of using the example values from
# /definitions/EndpointSettings, because containers/create currently
# does not support attaching to multiple networks, so the example request
# would be confusing if it showed that multiple networks can be contained
# in the EndpointsConfig.
# TODO remove once we support multiple networks on container create (see https://github.com/moby/moby/blob/07e6b843594e061f82baa5fa23c2ff7d536c2a05/daemon/create.go#L323)
# /definitions/EndpointSettings, because EndpointSettings contains
# operational data returned when inspecting a container that we don't
# accept here.
EndpointsConfig:
isolated_nw:
IPAMConfig:
@ -1357,19 +1392,22 @@ definitions:
LinkLocalIPs:
- "169.254.34.68"
- "fe80::3468"
MacAddress:"02:42:ac:12:05:02"
Links:
- "container_1"
- "container_2"
Aliases:
- "server_x"
- "server_y"
database_nw:{}
NetworkSettings:
description:"NetworkSettings exposes the network settings in the API"
type:"object"
properties:
Bridge:
description:Name of the network's bridge (for example, `docker0`).
description:|
Name of the default bridge interface when dockerd's --bridge flag is set.
type:"string"
example:"docker0"
SandboxID:
@ -1379,34 +1417,40 @@ definitions:
HairpinMode:
description:|
Indicates if hairpin NAT should be enabled on the virtual interface.
Deprecated:This field is never set and will be removed in a future release.
type:"boolean"
example:false
LinkLocalIPv6Address:
description:IPv6 unicast address using the link-local prefix.
description:|
IPv6 unicast address using the link-local prefix.
Deprecated:This field is never set and will be removed in a future release.
type:"string"
example:"fe80::42:acff:fe11:1"
example:""
LinkLocalIPv6PrefixLen:
description:Prefix length of the IPv6 unicast address.
description:|
Prefix length of the IPv6 unicast address.
Deprecated:This field is never set and will be removed in a future release.
type:"integer"
example:"64"
example:""
Ports:
$ref:"#/definitions/PortMap"
SandboxKey:
description:SandboxKey identifies the sandbox
description:SandboxKey is the full path of the netns handle
type:"string"
example:"/var/run/docker/netns/8ab54b426c38"
# TODO is SecondaryIPAddresses actually used?
SecondaryIPAddresses:
description:""
description:"Deprecated: This field is never set and will be removed in a future release."
type:"array"
items:
$ref:"#/definitions/Address"
x-nullable:true
# TODO is SecondaryIPv6Addresses actually used?
SecondaryIPv6Addresses:
description:""
description:"Deprecated: This field is never set and will be removed in a future release."
Returns which files in a container's filesystem have been added, deleted,
or modified. The `Kind` of modification can be one of:
- `0`:Modified
- `1`:Added
- `2`:Deleted
- `0`:Modified ("C")
- `1`:Added ("A")
- `2`:Deleted ("D")
operationId:"ContainerChanges"
produces:["application/json"]
responses:
@ -6886,22 +7048,7 @@ paths:
schema:
type:"array"
items:
type:"object"
x-go-name:"ContainerChangeResponseItem"
title:"ContainerChangeResponseItem"
description:"change item in response to ContainerChanges operation"
required:[Path, Kind]
properties:
Path:
description:"Path to file that has changed"
type:"string"
x-nullable:false
Kind:
description:"Kind of change"
type:"integer"
format:"uint8"
enum:[0,1,2]
x-nullable:false
$ref:"#/definitions/FilesystemChange"
examples:
application/json:
- Path:"/dev"
@ -8009,6 +8156,7 @@ paths:
- `label=key` or `label="key=value"` of an image label
- `reference`=(`<image-name>[:<tag>]`)
- `since`=(`<image-name>[:<tag>]`, `<image id>` or `<image@digest>`)
- `until=<timestamp>`
type:"string"
- name:"shared-size"
in:"query"
@ -8191,6 +8339,16 @@ paths:
description:"BuildKit output configuration"
type:"string"
default:""
- name:"version"
in:"query"
type:"string"
default:"1"
enum:["1","2"]
description:|
Version of the builder backend to use.
- `1` is the first generation classic (deprecated) builder in the Docker daemon (default)
- `2` is [BuildKit](https://github.com/moby/buildkit)
responses:
200:
description:"no error"
@ -8228,7 +8386,7 @@ paths:
Available filters:
- `until=<duration>`:duration relative to daemon's time, during which build cache was not used, in Go's duration format (e.g., '24h')
- `until=<timestamp>` remove cache older than `<timestamp>`. The `<timestamp>` can be Unix timestamps, date formatted timestamps, or Go duration strings (e.g. `10m`, `1h30m`) computed relative to the daemon's local time.
- `id=<id>`
- `parent=<id>`
- `type=<string>`
@ -8260,7 +8418,7 @@ paths:
/images/create:
post:
summary:"Create an image"
description:"Create an image by either pulling it from a registry or importing it."
description:"Pull or import an image."
operationId:"ImageCreate"
consumes:
- "text/plain"
@ -8611,28 +8769,35 @@ paths:
is_official:
type:"boolean"
is_automated:
description:|
Whether this repository has automated builds enabled.
<p><br /></p>
> **Deprecated**:This field is deprecated and will always be "false".
type:"boolean"
example:false
name:
type:"string"
star_count:
type:"integer"
examples:
application/json:
- description:""
is_official:false
- description:"A minimal Docker image based on Alpine Linux with a complete package index and only 5 MB in size!"
is_official:true
is_automated:false
name:"wma55/u1210sshd"
star_count:0
- description:""
is_official:false
name:"alpine"
star_count:10093
- description:"Busybox base image."
is_official:true
is_automated:false
name:"jdswinbank/sshd"
star_count:0
- description:""
is_official:false
name:"Busybox base image."
star_count:3037
- description:"The PostgreSQL object-relational database system provides reliability and data integrity."
is_official:true
is_automated:false
name:"vgauthier/sshd"
star_count:0
name:"postgres"
star_count:12408
500:
description:"Server error"
schema:
@ -8652,7 +8817,6 @@ paths:
description:|
A JSON encoded value of the filters (a `map[string][]string`) to process on the images list. Available filters:
- `is-automated=(true|false)`
- `is-official=(true|false)`
- `stars=<number>` Matches images that has at least 'number' stars.
description:"operation not supported for pre-defined networks"
description:|
Forbidden operation. This happens when trying to create a network named after a pre-defined network,
or when trying to create an overlay network on a daemon which is not part of a Swarm cluster.
schema:
$ref:"#/definitions/ErrorResponse"
404:
@ -9937,13 +10106,7 @@ paths:
type:"string"
CheckDuplicate:
description:|
Check for networks with duplicate names. Since Network is
primarily keyed based on a random ID and not on the name, and
network name is strictly a user-friendly alias to the network
which is uniquely identified using ID, there is no guaranteed
way to check for duplicates. CheckDuplicate is there to provide
a best effort checking of any networks which has the same name
but it is not guaranteed to catch all name collisions.
Deprecated:CheckDuplicate is now always enabled.
type:"boolean"
Driver:
description:"Name of the network driver plugin to use."
@ -10011,14 +10174,19 @@ paths:
/networks/{id}/connect:
post:
summary:"Connect a container to a network"
description:"The network must be either a local-scoped network or a swarm-scoped network with the `attachable` option set. A network cannot be re-attached to a running container"
operationId:"NetworkConnect"
consumes:
- "application/json"
responses:
200:
description:"No error"
400:
description:"bad parameter"
schema:
$ref:"#/definitions/ErrorResponse"
403:
description:"Operation not supported for swarm scoped networks"
description:"Operation forbidden"
schema:
$ref:"#/definitions/ErrorResponse"
404:
@ -10053,6 +10221,7 @@ paths:
IPAMConfig:
IPv4Address:"172.24.56.89"
IPv6Address:"2001:db8::5689"
MacAddress:"02:42:ac:12:05:02"
tags:["Network"]
/networks/{id}/disconnect:
@ -10374,6 +10543,12 @@ paths:
default if omitted.
required:true
type:"string"
- name:"force"
in:"query"
description:|
Force disable a plugin even if still in use.
required:false
type:"boolean"
tags:["Plugin"]
/plugins/{name}/upgrade:
post:
@ -11040,18 +11215,7 @@ paths:
201:
description:"no error"
schema:
type:"object"
title:"ServiceCreateResponse"
properties:
ID:
description:"The ID of the created service."
type:"string"
Warning:
description:"Optional warning message"
type:"string"
example:
ID:"ak7w3gjqoa3kuz8xcpnyy0pvl"
Warning:"unable to pin image doesnotexist:latest to digest: image library/doesnotexist:latest not found"
msg:="invalid restart policy: maximum retry count can only be used with 'on-failure'"
ifpolicy.MaximumRetryCount<0{
msg+=" and cannot be negative"
}
return&errInvalidParameter{fmt.Errorf(msg)}
}
returnnil
caseRestartPolicyOnFailure:
ifpolicy.MaximumRetryCount<0{
return&errInvalidParameter{fmt.Errorf("invalid restart policy: maximum retry count cannot be negative")}
}
returnnil
case"":
// Versions before v25.0.0 created an empty restart-policy "name" as
// default. Allow an empty name with "any" MaximumRetryCount for
// backward-compatibility.
returnnil
default:
return&errInvalidParameter{fmt.Errorf("invalid restart policy: unknown policy '%s'; use one of '%s', '%s', '%s', or '%s'",policy.Name,RestartPolicyDisabled,RestartPolicyAlways,RestartPolicyOnFailure,RestartPolicyUnlessStopped)}
}
}
// LogMode is a type to define the available modes for logging
// These modes affect how logs are handled when log messages start piling up.
typeLogModestring
// Available logging modes
const(
LogModeUnsetLogMode=""
LogModeBlockingLogMode="blocking"
LogModeNonBlockLogMode="non-blocking"
)
// LogConfig represents the logging configuration of the container.