This error is returned when attempting to walk a descriptor that
*should* be an index or a manifest.
Without this the error is not very helpful since there's no way to tell
what triggered it.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Issue was caused by the changes here https://github.com/moby/moby/pull/45504
First released in v25.0.0-beta.1
Signed-off-by: Christopher Petito <47751006+krissetto@users.noreply.github.com>
Don't mutate the container's `Config.WorkingDir` permanently with a
cleaned path when creating a working directory.
Move the `filepath.Clean` to the `translateWorkingDir` instead.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The `normalizeWorkdir` function has two branches, one that returns a
result of `filepath.Join` which always returns a cleaned path, and
another one where the input string is returned unmodified.
To make these two outputs consistent, also clean the path in the second
branch.
This also makes the cleaning of the container workdir explicit in the
`normalizeWorkdir` function instead of relying on the
`SetupWorkingDirectory` to mutate it.
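A minimal sketch of the two branches, to illustrate why the second one needs an explicit clean. The signature and the absolute-path check are assumptions made for illustration (Unix-style paths); the real `normalizeWorkdir` may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// normalizeWorkdir is a hypothetical, simplified stand-in for the function
// described above. It assumes Unix-style paths.
func normalizeWorkdir(current, requested string) string {
	if !filepath.IsAbs(requested) {
		// filepath.Join always returns a cleaned path.
		return filepath.Join(string(filepath.Separator), current, requested)
	}
	// Clean explicitly so both branches return a path in the same form.
	return filepath.Clean(requested)
}

func main() {
	fmt.Println(normalizeWorkdir("/app", "sub/../bin"))
	fmt.Println(normalizeWorkdir("/app", "/srv//www/"))
}
```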
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make the internal DNS resolver for Windows containers forward requests
to upstream DNS servers when it cannot respond itself, rather than
returning SERVFAIL.
Windows containers are normally configured with the internal resolver
first for service discovery (container name lookup), then external
resolvers from '--dns' or the host's networking configuration.
When a tool like ping gets a SERVFAIL from the internal resolver, it
tries the other nameservers. But, nslookup does not, and with this
change it does not need to.
The internal resolver learns external server addresses from the
container's HNSEndpoint configuration, so it will use the same DNS
servers as processes in the container.
The internal resolver for Windows containers listens on the network's
gateway address, and each container may have a different set of external
DNS servers. So, the resolver uses the source address of the DNS request
to select external resolvers.
On Windows, daemon.json feature option 'windows-no-dns-proxy' can be used
to prevent the internal resolver from forwarding requests (restoring the
old behaviour).
Signed-off-by: Rob Murray <rob.murray@docker.com>
- deprecate Prestart hook
- deprecate kernel memory limits
Additions
- config: add idmap and ridmap mount options
- config.md: allow empty mappings for [r]idmap
- features-linux: Expose idmap information
- mount: Allow relative mount destinations on Linux
- features: add potentiallyUnsafeConfigAnnotations
- config: add support for org.opencontainers.image annotations
Minor fixes:
- config: improve bind mount and propagation doc
full diff: https://github.com/opencontainers/runtime-spec/compare/v1.1.0...v1.2.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds some nolint-comments for the deprecated kernel-memory options; we
deprecated these, but they could technically still be accepted by alternative
runtimes.
daemon/daemon_unix.go:108:3: SA1019: memory.Kernel is deprecated: kernel-memory limits are not supported in cgroups v2, and were obsoleted in [kernel v5.4]. This field should no longer be used, as it may be ignored by runtimes. (staticcheck)
memory.Kernel = &config.KernelMemory
^
daemon/update_linux.go:63:3: SA1019: memory.Kernel is deprecated: kernel-memory limits are not supported in cgroups v2, and were obsoleted in [kernel v5.4]. This field should no longer be used, as it may be ignored by runtimes. (staticcheck)
memory.Kernel = &resources.KernelMemory
^
Prestart hooks are deprecated, and more granular hooks should be used instead.
CreateRuntime hooks are the closest equivalent, and are executed in the same locations
as Prestart-hooks, but depending on what these hooks do, possibly one of the
other hooks could be used instead (such as CreateContainer or StartContainer).
As these hooks are still supported, this patch adds nolint comments, but adds
some TODOs to consider migrating to something else;
daemon/nvidia_linux.go:86:2: SA1019: s.Hooks.Prestart is deprecated: use [Hooks.CreateRuntime], [Hooks.CreateContainer], and [Hooks.StartContainer] instead, which allow more granular hook control during the create and start phase. (staticcheck)
s.Hooks.Prestart = append(s.Hooks.Prestart, specs.Hook{
^
daemon/oci_linux.go:76:5: SA1019: s.Hooks.Prestart is deprecated: use [Hooks.CreateRuntime], [Hooks.CreateContainer], and [Hooks.StartContainer] instead, which allow more granular hook control during the create and start phase. (staticcheck)
s.Hooks.Prestart = append(s.Hooks.Prestart, specs.Hook{
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
No IPAM IPv6 address is given to an interface in a network with
'--ipv6=false', but the kernel would assign a link-local address and,
in a macvlan/ipvlan network, the interface may get a SLAAC-assigned
address.
So, disable IPv6 on the interface to avoid that.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This reverts commit a77e147d32.
The ipvlan integration tests have been skipped in CI because of a check
intended to ensure the kernel has ipvlan support - which failed, but
seems to be unnecessary (probably because kernels have moved on).
Signed-off-by: Rob Murray <rob.murray@docker.com>
We document that a macvlan network with no parent interface is
equivalent to a '--internal' network. But, in this case, a macvlan
network was still configured with a gateway. So, DNS proxying would
be enabled in the internal resolver (and, if the host's resolver
was on a localhost address, requests to external resolvers from the
host's network namespace would succeed).
This change disables configuration of a gateway for a macvlan Endpoint
if no parent interface is specified.
(Note if a parent interface with no external network is supplied as
'-o parent=<dummy>', the gateway will still be set up. Documentation
will need to be updated to note that '--internal' should be used to
prevent DNS request forwarding in this case.)
Signed-off-by: Rob Murray <rob.murray@docker.com>
The internal DNS resolver should only forward requests to external
resolvers if the libnetwork.Sandbox served by the resolver has external
network access (so, no forwarding for '--internal' networks).
The test for external network access was whether the Sandbox had an
Endpoint with a gateway configured.
However, an ipvlan-l3 network with external network access does not
have a gateway; it has a default route bound to an interface.
Also, we document that an ipvlan network with no parent interface is
equivalent to a '--internal' network. But, in this case, an ipvlan-l2
network was configured with a gateway. So, DNS proxying would be enabled
in the internal resolver (and, if the host's resolver was on a localhost
address, requests to external resolvers from the host's network
namespace would succeed).
So, this change adjusts the test for enabling DNS proxying to include
a check for '--internal' (as a shortcut) and, for non-internal networks,
checks for a default route as well as a gateway. It also disables
configuration of a gateway or a default route for an ipvlan Endpoint if
no parent interface is specified.
(Note if a parent interface with no external network is supplied as
'-o parent=<dummy>', the gateway/default route will still be set up
and external DNS proxying will be enabled. The network must be
configured as '--internal' to prevent that from happening.)
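The adjusted test can be sketched roughly as follows. The `endpoint` type and `shouldForwardDNS` function are hypothetical names used only to illustrate the decision; they are not the libnetwork types.

```go
package main

import "fmt"

// endpoint is a hypothetical, reduced view of a Sandbox's Endpoint.
type endpoint struct {
	hasGateway      bool
	hasDefaultRoute bool // e.g. ipvlan-l3 with a parent interface
}

// shouldForwardDNS reports whether the internal resolver may forward
// requests to external resolvers.
func shouldForwardDNS(internal bool, eps []endpoint) bool {
	if internal {
		// Shortcut: '--internal' networks never forward.
		return false
	}
	for _, ep := range eps {
		if ep.hasGateway || ep.hasDefaultRoute {
			return true
		}
	}
	return false
}

func main() {
	ipvlanL3 := []endpoint{{hasDefaultRoute: true}}
	noParent := []endpoint{{}}
	fmt.Println(shouldForwardDNS(false, ipvlanL3), shouldForwardDNS(false, noParent))
}
```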
Signed-off-by: Rob Murray <rob.murray@docker.com>
go1.21.9 (released 2024-04-03) includes a security fix to the net/http
package, as well as bug fixes to the linker, and the go/types and
net/http packages. See the [Go 1.21.9 milestone](https://github.com/golang/go/issues?q=milestone%3AGo1.21.9+label%3ACherryPickApproved)
for more details.
These minor releases include 1 security fix following the security policy:
- http2: close connections when receiving too many headers
Maintaining HPACK state requires that we parse and process all HEADERS
and CONTINUATION frames on a connection. When a request's headers exceed
MaxHeaderBytes, we don't allocate memory to store the excess headers but
we do parse them. This permits an attacker to cause an HTTP/2 endpoint
to read arbitrary amounts of header data, all associated with a request
which is going to be rejected. These headers can include Huffman-encoded
data which is significantly more expensive for the receiver to decode
than for an attacker to send.
Set a limit on the amount of excess header frames we will process before
closing a connection.
Thanks to Bartek Nowotarski (https://nowotarski.info/) for reporting this issue.
This is CVE-2023-45288 and Go issue https://go.dev/issue/65051.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.22.2
- https://github.com/golang/go/issues?q=milestone%3AGo1.21.9+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.21.8...go1.21.9
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
full diff: https://github.com/golang/net/compare/v0.22.0...v0.23.0
Includes a fix for CVE-2023-45288, which is also addressed in go1.22.2
and go1.21.9;
> http2: close connections when receiving too many headers
>
> Maintaining HPACK state requires that we parse and process
> all HEADERS and CONTINUATION frames on a connection.
> When a request's headers exceed MaxHeaderBytes, we don't
> allocate memory to store the excess headers but we do
> parse them. This permits an attacker to cause an HTTP/2
> endpoint to read arbitrary amounts of data, all associated
> with a request which is going to be rejected.
>
> Set a limit on the amount of excess header frames we
> will process before closing a connection.
>
> Thanks to Bartek Nowotarski for reporting this issue.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diffs and changes relevant to vendored code:
- https://github.com/golang/net/compare/v0.18.0...v0.22.0
- websocket: add support for dialing with context
- http2: remove suspicious uint32->v conversion in frame code
- http2: send an error of FLOW_CONTROL_ERROR when exceed the maximum octets
- https://github.com/golang/crypto/compare/v0.17.0...v0.21.0
- internal/poly1305: drop Go 1.12 compatibility
- internal/poly1305: improve sum_ppc64le.s
- ocsp: don't use iota for externally defined constants
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
illumos is the open-source continuation of OpenSolaris after Oracle
made it closed source (again).
For an example use, see: https://github.com/openbao/openbao/pull/205.
Signed-off-by: Jasper Siepkes <siepkes@serviceplanet.nl>
This was brought up by bmitch: it's not expected to have a platform
object in the config descriptor.
Also checked with tianon, who agreed: it's not _wrong_, but it is unexpected
and doesn't necessarily make sense to have it there.
Also, while technically incorrect, ECR is throwing an error when it sees
this.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This was using `errors.Wrap` when there was no error to wrap, whereas
we are supposed to be creating a new error.
Found this while investigating some log corruption issues and
unexpectedly getting a nil reader and a nil error from `getTailReader`.
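To illustrate the failure mode: `errors.Wrap` from github.com/pkg/errors returns nil when the error passed to it is nil, so the caller sees neither a value nor an error. The sketch below mimics that behaviour with a local `wrap` helper so it only needs the standard library; `openTailBuggy` and `openTailFixed` are hypothetical stand-ins, not the real `getTailReader`.

```go
package main

import (
	"errors"
	"fmt"
)

// wrap mimics the documented behaviour of errors.Wrap from
// github.com/pkg/errors: when err is nil, the result is nil too.
func wrap(err error, msg string) error {
	if err == nil {
		return nil
	}
	return fmt.Errorf("%s: %w", msg, err)
}

// openTailBuggy wraps a nil error, so a failed lookup returns ("", nil).
func openTailBuggy(found bool) (string, error) {
	if !found {
		var err error // nothing failed earlier; there is no error to wrap
		return "", wrap(err, "tail reader not found")
	}
	return "reader", nil
}

// openTailFixed creates a new error instead.
func openTailFixed(found bool) (string, error) {
	if !found {
		return "", errors.New("tail reader not found")
	}
	return "reader", nil
}

func main() {
	_, errBuggy := openTailBuggy(false)
	_, errFixed := openTailFixed(false)
	fmt.Println(errBuggy == nil, errFixed == nil)
}
```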
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The NetworkMode "default" is now normalized into the value it
aliases ("bridge" on Linux and "nat" on Windows) by the
ContainerCreate endpoint, the legacy image builder, Swarm's
cluster executor and by the container restore codepath.
builder-next is left untouched as it already uses the normalized
value (ie. bridge).
Going forward, this will make maintenance easier as there's one
less NetworkMode to care about.
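The normalization itself amounts to something like the following sketch. The function name and the explicit `goos` parameter are assumptions made so the example can be run on any platform; only the "bridge" and "nat" values come from the description above.

```go
package main

import (
	"fmt"
	"runtime"
)

// normalizeNetworkMode is a hypothetical helper: it replaces the "default"
// alias with the platform's concrete network mode and leaves any other
// value untouched.
func normalizeNetworkMode(mode, goos string) string {
	if mode != "default" {
		return mode
	}
	if goos == "windows" {
		return "nat"
	}
	return "bridge"
}

func main() {
	fmt.Println(normalizeNetworkMode("default", runtime.GOOS))
	fmt.Println(normalizeNetworkMode("host", runtime.GOOS))
}
```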
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Partially reverts 0046b16 "daemon: set libnetwork sandbox key w/o OCI hook"
Running SetKey to store the OCI Sandbox key after task creation, rather
than from the OCI prestart hook, meant it happened after sysctl settings
were applied by the runtime - which was the intention, we wanted to
complete Sandbox configuration after IPv6 had been disabled by a sysctl
if that was going to happen.
But, it meant '--sysctl' options for a specific network interface caused
container task creation to fail, because the interface is only moved into
the network namespace during SetKey.
This change restores the SetKey prestart hook, and regenerates config
files that depend on the container's support for IPv6 after the task has
been created. It also adds a regression test that makes sure it's possible
to set an interface-specific sysctl.
Signed-off-by: Rob Murray <rob.murray@docker.com>
The `identity.ChainIDs` call was accidentally removed in
b37ced2551.
This broke the shared size calculation for images with more than one
layer that were sharing the same compressed layer.
This could be reproduced with:
```
$ docker pull docker.io/docker/desktop-kubernetes-coredns:v1.11.1
$ docker pull docker.io/docker/desktop-kubernetes-etcd:3.5.10-0
$ docker system df
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
After a535a65c4b the size reported by the
image list was changed to include all platforms of that image.
This made the "shared size" calculation consider all diff ids of all the
platforms available in the image which caused "snapshot not found"
errors when multiple images were sharing the same layer which wasn't
unpacked.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is better because every possible platform combination
does not need to be defined in the Dockerfile. If built
for a platform where Delve is not supported, then it is just
skipped.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Copy the swagger / OpenAPI file to the documentation. This is the API
version used by the upcoming v26.0.0 release.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Benchmark the `Images` implementation (image list) against an image
store with 10, 100 and 1000 random images. Currently the images are
single-platform only.
The images are generated randomly, but a fixed seed is used so the
actual testing data will be the same across different executions.
Because the content store is not a real containerd image store but a
local implementation, a small delay (500us) is added to each content
store method call. This is to simulate a real-world usage where each
containerd client call requires a gRPC call.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Commit 8921897e3b introduced the use of `clear()`,
which requires go1.21, but Go is downgrading this file to go1.16 when used in
other projects (due to us not yet being a go module);
0.175 + xx-go build '-gcflags=' -ldflags '-X github.com/moby/buildkit/version.Version=b53a13e -X github.com/moby/buildkit/version.Revision=b53a13e4f5c8d7e82716615e0f23656893df89af -X github.com/moby/buildkit/version.Package=github.com/moby/buildkit -extldflags '"'"'-static'"'" -tags 'osusergo netgo static_build seccomp ' -o /usr/bin/buildkitd ./cmd/buildkitd
181.8 # github.com/docker/docker/libnetwork/internal/resolvconf
181.8 vendor/github.com/docker/docker/libnetwork/internal/resolvconf/resolvconf.go:509:2: clear requires go1.21 or later (-lang was set to go1.16; check go.mod)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
52a80b40e2 extracted the `imageSummary`
function but introduced a bug causing the whole caller function to
return if the image should be skipped.
`imageSummary` returns a nil error and nil image when the image doesn't
have any platform or all its platforms are not available locally.
In this case that particular image should be skipped, instead of failing
the whole image list operation.
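A reduced sketch of the intended control flow: a nil summary with a nil error means "skip this image", not "stop listing". The types and the `available` flag are hypothetical simplifications of the real code.

```go
package main

import "fmt"

// summary is a hypothetical, reduced image summary.
type summary struct{ ID string }

// imageSummary returns (nil, nil) when the image has no locally
// available platform.
func imageSummary(id string, available bool) (*summary, error) {
	if !available {
		return nil, nil
	}
	return &summary{ID: id}, nil
}

// listImages skips images without a summary instead of returning early.
func listImages(images map[string]bool, order []string) ([]*summary, error) {
	var out []*summary
	for _, id := range order {
		s, err := imageSummary(id, images[id])
		if err != nil {
			return nil, err
		}
		if s == nil {
			continue // skip this image only; keep listing the rest
		}
		out = append(out, s)
	}
	return out, nil
}

func main() {
	images := map[string]bool{"a": true, "b": false, "c": true}
	got, err := listImages(images, []string{"a", "b", "c"})
	fmt.Println(len(got), err)
}
```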
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't run the filter function, which would only iterate over the images,
reading their config without checking any label anyway.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
commit c655b7dc78 added a check to make sure
the TMP_OUT variable was not set to an empty value, as such a situation would
perform an `rm -rf /**` during cleanup.
However, it was a bit too eager, because Makefile conditionals (`ifeq`) are
evaluated when parsing the Makefile, which happens _before_ the make target
is executed.
As a result `$@_TMP_OUT` was always empty when the `ifeq` was evaluated,
making it not possible to execute the `generate-files` target.
This patch changes the check to use a shell command to evaluate if the var
is set to an empty value.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fix `error mounting "/etc/hosts" to rootfs at "/etc/hosts": mount
/etc/hosts:/etc/hosts (via /proc/self/fd/6), flags: 0x5021: operation
not permitted`.
This error was introduced in 7d08d84b03
(`dockerd-rootless.sh: set rootlesskit --state-dir=DIR`) that changed
the filesystem of the state dir from /tmp to /run (in a typical setup).
Fix issue 47248
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This code is currently only used in the daemon, but is also needed in other
places. We should consider moving this code to github.com/moby/sys, so that
BuildKit can also use the same implementation instead of maintaining a fork;
moving it to internal allows us to reuse this code inside the repository, but
does not allow external consumers to depend on it (which we don't want as
it's not a permanent location).
As our code only uses this in linux files, I did not add a stub for other
platforms (but we may decide to do that in the moby/sys repository).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit cbc2a71c2 makes `connect` syscall fail fast when a container is
only attached to an internal network. Thanks to that, if such a
container tries to resolve an "external" domain, the embedded resolver
returns an error immediately instead of waiting for a timeout.
This commit makes sure the embedded resolver doesn't even try to forward
to upstream servers.
Co-authored-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
Adds an experimental `DOCKER_BUILDKIT_RUNC_COMMAND` variable that allows
specifying a different runc-compatible binary to be used by BuildKit's
runc executor.
This allows runtimes like sysbox to be used for the containers spawned by
BuildKit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
full diffs:
- https://github.com/protocolbuffers/protobuf-go/compare/v1.31.0...v1.33.0
- https://github.com/golang/protobuf/compare/v1.5.3...v1.5.4
From the Go security announcement list;
> Version v1.33.0 of the google.golang.org/protobuf module fixes a bug in
> the google.golang.org/protobuf/encoding/protojson package which could cause
> the Unmarshal function to enter an infinite loop when handling some invalid
> inputs.
>
> This condition could only occur when unmarshaling into a message which contains
> a google.protobuf.Any value, or when the UnmarshalOptions.UnmarshalUnknown
> option is set. Unmarshal now correctly returns an error when handling these
> inputs.
>
> This is CVE-2024-24786.
In a follow-up post;
> A small correction: This vulnerability applies when the UnmarshalOptions.DiscardUnknown
> option is set (as well as when unmarshaling into any message which contains a
> google.protobuf.Any). There is no UnmarshalUnknown option.
>
> In addition, version 1.33.0 of google.golang.org/protobuf inadvertently
> introduced an incompatibility with the older github.com/golang/protobuf
> module. (https://github.com/golang/protobuf/issues/1596) Users of the older
> module should update to github.com/golang/protobuf@v1.5.4.
govulncheck results in our code:
govulncheck ./...
Scanning your code and 1221 packages across 204 dependent modules for known vulnerabilities...
=== Symbol Results ===
Vulnerability #1: GO-2024-2611
Infinite loop in JSON unmarshaling in google.golang.org/protobuf
More info: https://pkg.go.dev/vuln/GO-2024-2611
Module: google.golang.org/protobuf
Found in: google.golang.org/protobuf@v1.31.0
Fixed in: google.golang.org/protobuf@v1.33.0
Example traces found:
#1: daemon/logger/gcplogs/gcplogging.go:154:18: gcplogs.New calls logging.Client.Ping, which eventually calls json.Decoder.Peek
#2: daemon/logger/gcplogs/gcplogging.go:154:18: gcplogs.New calls logging.Client.Ping, which eventually calls json.Decoder.Read
#3: daemon/logger/gcplogs/gcplogging.go:154:18: gcplogs.New calls logging.Client.Ping, which eventually calls protojson.Unmarshal
Your code is affected by 1 vulnerability from 1 module.
This scan found no other vulnerabilities in packages you import or modules you
require.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Turn warnings into a deprecation notice and highlight that it will
prevent daemon startup in future releases.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- full diff: https://github.com/containerd/containerd/compare/v1.7.13...v1.7.14
- release notes: https://github.com/containerd/containerd/releases/tag/v1.7.14
Welcome to the v1.7.14 release of containerd!
The fourteenth patch release for containerd 1.7 contains various fixes and updates.
Highlights
- Update builds to use go 1.21.8
- Fix various timing issues with docker pusher
- Register imagePullThroughput and count with MiB
- Move high volume event logs to Trace level
Container Runtime Interface (CRI)
- Handle pod transition states gracefully while listing pod stats
Runtime
- Update runc-shim to process exec exits before init
Dependency Changes
- github.com/containerd/nri v0.4.0 -> v0.6.0
- github.com/containerd/ttrpc v1.2.2 -> v1.2.3
- google.golang.org/genproto/googleapis/rpc 782d3b101e98 -> cbb8c96f2d6d
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
With both rootless and live restore enabled, there's some race condition
which causes the container to be `Unmount`ed before the refcount is
restored.
This makes sure we don't underflow the refcount (uint64) when
decrementing it.
The root cause of this race condition still needs to be investigated and
fixed, but at least this makes `TestLiveRestore` less flaky.
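The guard can be pictured as in the sketch below: decrementing an unsigned counter that is already zero would wrap around to the maximum uint64 value, so the decrement is skipped in that case. The `refCounter` type is a hypothetical simplification, not the daemon's actual reference counter.

```go
package main

import (
	"fmt"
	"sync"
)

// refCounter is a hypothetical, minimal reference counter keyed by ID.
type refCounter struct {
	mu     sync.Mutex
	counts map[string]uint64
}

// decrement lowers the count for id and returns the new value. A count of
// zero is left as-is so the uint64 cannot underflow.
func (c *refCounter) decrement(id string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.counts[id] == 0 {
		return 0
	}
	c.counts[id]--
	return c.counts[id]
}

func main() {
	c := &refCounter{counts: map[string]uint64{"ctr": 1}}
	fmt.Println(c.decrement("ctr"), c.decrement("ctr"))
}
```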
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use a separate `devcontainer` Dockerfile target; this allows including
`gopls` in the devcontainer so it doesn't have to be installed by
the Go vscode extension.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make sure the `ping` command used by `TestBridgeICC` actually has
the `-6` flag when it runs IPv6 test cases. Without this flag,
IPv6 connectivity isn't tested properly.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Currently this won't have any real effect because the platform matcher
matches all platforms and is only used for sorting.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Move containers counting out of `singlePlatformImage` and count them
based on the `ImageManifest` property.
(also remove ChainIDs calculation as they're no longer used)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Avoid fetching `SnapshotService` from client every time. Fetch it once
and then store when creating the image service.
This also allows passing a custom snapshotter implementation for unit
testing.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Use the `image.Store` and `content.Store` stored in the ImageService struct
instead of fetching them from the containerd client every time.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Both containerd and graphdriver image service use the same code to
create the cache - they only supply their own `cacheAdaptor` struct.
Extract the shared code to `cache.New`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Move image store backend specific code out of the cache code and move it
to a separate interface to allow using the same cache code with
containerd image store.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Rather than error out if the host's resolv.conf has a bad ndots option,
just ignore it. Still validate ndots supplied via '--dns-option' and
treat failure as an error.
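A sketch of the two treatments, under stated assumptions: the helper names are hypothetical, options are plain `ndots:N` strings, and user-supplied options take precedence over the host's. The real parsing code may be structured differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNdots reports whether opt is an ndots option and, if so, its value.
func parseNdots(opt string) (n int, isNdots bool, err error) {
	if !strings.HasPrefix(opt, "ndots:") {
		return 0, false, nil
	}
	n, err = strconv.Atoi(strings.TrimPrefix(opt, "ndots:"))
	if err != nil || n < 0 {
		return 0, true, fmt.Errorf("invalid ndots option %q", opt)
	}
	return n, true, nil
}

// pickNdots validates user options strictly, but silently ignores a bad
// ndots value that came from the host's resolv.conf.
func pickNdots(hostOpts, userOpts []string) (int, bool, error) {
	for _, o := range userOpts {
		n, is, err := parseNdots(o)
		if err != nil {
			return 0, false, err // bad '--dns-option' is an error
		}
		if is {
			return n, true, nil
		}
	}
	for _, o := range hostOpts {
		n, is, err := parseNdots(o)
		if err != nil {
			continue // bad host option: just ignore it
		}
		if is {
			return n, true, nil
		}
	}
	return 0, false, nil
}

func main() {
	fmt.Println(pickNdots([]string{"ndots:oops"}, nil))
	fmt.Println(pickNdots(nil, []string{"ndots:oops"}))
}
```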
Signed-off-by: Rob Murray <rob.murray@docker.com>
When this was called concurrently from the moby image
exporter there could be a data race where a layer was
written to the refs map when it was already there.
In that case the reference count got mixed up and on
release only one of these layers was actually released.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
When IPv6 is disabled in a container by, for example, using the --sysctl
option - an IPv6 address/gateway is still allocated. Don't attempt to
apply that config because doing so enables IPv6 on the interface.
Signed-off-by: Rob Murray <rob.murray@docker.com>
When configuring the internal DNS resolver - rather than keep IPv6
nameservers read from the host's resolv.conf in the container's
resolv.conf, treat them like IPv4 addresses and use them as upstream
resolvers.
For IPv6 nameservers, if there's a zone identifier in the address or
the container itself doesn't have IPv6 support, mark the upstream
addresses for use in the host's network namespace.
Signed-off-by: Rob Murray <rob.murray@docker.com>
RootlessKit will print hints if something is still unsatisfied.
e.g., `kernel.apparmor_restrict_unprivileged_userns` constraint
rootless-containers/rootlesskit@33c3e7ca6c
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
In de2447c, the creation of the 'lower' file was changed from using
os.Create to using ioutils.AtomicWriteFile, which ignores the system's
umask. This means that even though the requested permission in the
source code was always 0666, it was 0644 on systems with default
umask of 0022 prior to de2447c, so the move to AtomicWriteFile potentially
increased the file's permissions.
This is not a security issue because the parent directory does not
allow writes into the file, but it can confuse security scanners on
Linux-based systems into giving false positives.
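The permission arithmetic from the first paragraph, as a small worked example. `effectiveMode` is a hypothetical helper that only models how a umask clears bits from the requested mode; it does not create any file.

```go
package main

import (
	"fmt"
	"os"
)

// effectiveMode clears the umask bits from the requested mode, which is
// what the kernel does for files created with os.Create.
func effectiveMode(requested, umask os.FileMode) os.FileMode {
	return requested &^ umask
}

func main() {
	// With the default umask, the requested 0666 ends up as 0644.
	fmt.Printf("%04o\n", uint32(effectiveMode(0o666, 0o022))) // 0644
	// Ignoring the umask keeps the full 0666.
	fmt.Printf("%04o\n", uint32(effectiveMode(0o666, 0))) // 0666
}
```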
Signed-off-by: Jaroslav Jindrak <dzejrou@gmail.com>
The field will still be present in the response, but will always be
`false`.
Searching for `is-automated=true` will yield no results, while
`is-automated=false` will effectively be a no-op.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When using devcontainers in VSCode, install the Go extension
automatically in the container.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
While github.com/stretchr/testify is not used directly by any of the
repository code, it is a transitive dependency via Swarmkit and
therefore still easy to use without having to revendor. Add lint rules
to ban importing testify packages to make sure nobody does.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Apply command gotest.tools/v3/assert/cmd/gty-migrate-from-testify to the
cnmallocator package to be consistent with the assertion library used
elsewhere in moby.
Signed-off-by: Cory Snider <csnider@mirantis.com>
In a container-create API request, HostConfig.NetworkMode (the identity
of the "main" network) may be a name, id or short-id.
The configuration for that network, including preferred IP address etc,
may be keyed on network name or id - it need not match the NetworkMode.
So, when migrating the old container-wide MAC address to the new
per-endpoint field - it is not safe to create a new EndpointSettings
entry unless there is no possibility that it will duplicate settings
intended for the same network (because one of the duplicates will be
discarded later, dropping the settings it contains).
This change introduces a new API restriction: if the deprecated
container-wide field is used in the new API, and EndpointsConfig is provided
for any network, the NetworkMode and the key under which the EndpointsConfig
is stored must be the same - no mixing of ids and names.
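The restriction can be sketched as the check below. The function name, parameters, and error text are hypothetical; the map stands in for EndpointsConfig, keyed by whatever name or id the client used.

```go
package main

import (
	"errors"
	"fmt"
)

// checkMacMigration is a hypothetical validation: when the container-wide
// MAC address is set and endpoint settings exist, one of them must be keyed
// by exactly the same string as networkMode.
func checkMacMigration(networkMode, containerMAC string, endpoints map[string]string) error {
	if containerMAC == "" || len(endpoints) == 0 {
		// Nothing to migrate, or no settings that could be duplicated.
		return nil
	}
	if _, ok := endpoints[networkMode]; !ok {
		return errors.New("NetworkMode must match the key of an endpoint config")
	}
	return nil
}

func main() {
	eps := map[string]string{"mynet": "preferred-ip"}
	fmt.Println(checkMacMigration("mynet", "02:42:ac:11:00:02", eps))
	fmt.Println(checkMacMigration("a1b2c3", "02:42:ac:11:00:02", eps))
}
```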
Signed-off-by: Rob Murray <rob.murray@docker.com>
This message accidentally changed in ac2a028dcc
because my IDE's "refactor tool" was a bit over-enthusiastic. It also went and
updated the tests accordingly, so CI didn't catch this :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Moby imports Swarmkit; Swarmkit no longer imports Moby. In order to
accomplish this feat, Swarmkit has introduced a new plugin.Getter
interface so it could stop importing our pkg/plugingetter package. This
new interface is not entirely compatible with our
plugingetter.PluginGetter interface, necessitating a thin adapter.
Swarmkit had to jettison the CNM network allocator to stop having to
import libnetwork as the cnmallocator package is deeply tied to
libnetwork. Move the CNM network allocator into libnetwork, where it
belongs. The package had a short and uninteresting Git history in the
Swarmkit repository so no effort was made to retain history.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This patch disables pulling legacy (schema1 and schema 2, version 1) images by
default.
A `DOCKER_ENABLE_DEPRECATED_PULL_SCHEMA_1_IMAGE` environment-variable is
introduced to allow re-enabling this feature, aligning with the environment
variable used in containerd 2.0 (`CONTAINERD_ENABLE_DEPRECATED_PULL_SCHEMA_1_IMAGE`).
With this patch, attempts to pull a legacy image produce an error:
With graphdrivers:
docker pull docker:1.0
1.0: Pulling from library/docker
[DEPRECATION NOTICE] Docker Image Format v1, and Docker Image manifest version 2, schema 1 support will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format, or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
With the containerd image store enabled, output is slightly different
as it returns the error before printing the `1.0: pulling ...`:
docker pull docker:1.0
Error response from daemon: [DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
Using the "distribution" endpoint to resolve the digest for an image also
produces an error:
curl -v --unix-socket /var/run/docker.sock http://foo/distribution/docker.io/library/docker:1.0/json
* Trying /var/run/docker.sock:0...
* Connected to foo (/var/run/docker.sock) port 80 (#0)
> GET /distribution/docker.io/library/docker:1.0/json HTTP/1.1
> Host: foo
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 400 Bad Request
< Api-Version: 1.45
< Content-Type: application/json
< Docker-Experimental: false
< Ostype: linux
< Server: Docker/dev (linux)
< Date: Tue, 27 Feb 2024 16:09:42 GMT
< Content-Length: 354
<
{"message":"[DEPRECATION NOTICE] Docker Image Format v1, and Docker Image manifest version 2, schema 1 support will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format, or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/"}
* Connection #0 to host foo left intact
Starting the daemon with the `DOCKER_ENABLE_DEPRECATED_PULL_SCHEMA_1_IMAGE`
env-var set to a non-empty value allows pulling the image;
docker pull docker:1.0
[DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of docker.io/library/docker:1.0 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
b0a0e6710d13: Already exists
d193ad713811: Already exists
ba7268c3149b: Already exists
c862d82a67a2: Already exists
Digest: sha256:5e7081837926c7a40e58881bbebc52044a95a62a2ea52fb240db3fc539212fe5
Status: Image is up to date for docker:1.0
docker.io/library/docker:1.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When creating a new daemon in the `TestDaemonProxy`, reset the
`OTEL_EXPORTER_OTLP_ENDPOINT` to an empty value to disable OTEL
collection to avoid it hitting the proxy.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Allow enabling host loopback by setting
DOCKERD_ROOTLESS_ROOTLESSKIT_DISABLE_HOST_LOOPBACK to false
(the default is true).
Signed-off-by: serhii.n <serhii.n@thescimus.com>
Don't use all `*.json` files blindly, take only these that are likely to
be reports from go test.
Also, use `find ... -exec` instead of piping results to `xargs`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The current implementation of Checkpoint/Restore (C/R) in Docker
writes the checkpoint to the content store. However, when restoring,
libcontainerd uses .Digest().Encoded(), which drops the algorithm
prefix from the digest, leading to an error.
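A minimal sketch of why dropping the algorithm matters. The `encoded`
helper below is a hypothetical stand-in that only mimics stripping the
`<algorithm>:` prefix from a digest string; it is not the go-digest API,
and the digest value is just an example:

```go
package main

import (
	"fmt"
	"strings"
)

// encoded is a hypothetical helper that mimics dropping the algorithm
// prefix from a digest string such as "sha256:<hex>".
func encoded(dgst string) string {
	_, hex, found := strings.Cut(dgst, ":")
	if !found {
		// No algorithm prefix present; return the input unchanged.
		return dgst
	}
	return hex
}

func main() {
	full := "sha256:5e7081837926c7a40e58881bbebc52044a95a62a2ea52fb240db3fc539212fe5"
	// The full form keeps the algorithm, so a store keyed by digest can
	// look it up; the encoded form is only the hex part.
	fmt.Println("full:   ", full)
	fmt.Println("encoded:", encoded(full))
}
```

A lookup keyed by the full digest would presumably not match the
encoded form, which is the kind of mismatch described above.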
Signed-off-by: huang-jl <1046678590@qq.com>
Buildkit added support for exporting metrics in:
7de2e4fb32
Explicitly set the protocol for exporting metrics like we do for the
traces. We need that because Buildkit defaults to grpc.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
30c069cb03
removed the `ResolveImageConfig` method in favor of the more generic
`ResolveSourceMetadata`, which can also support things other than images.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
e358792815
changed that field to a function and added an `OverrideResource`
function that allows overriding it.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The StaticDirSource definition changed, and it can no longer be
initialized from a composite literal.
a80b48544c
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All other progress updates are emitted with truncated id.
```diff
$ docker pull --platform linux/amd64 alpine
Using default tag: latest
latest: Pulling from library/alpine
-sha256:4abcf20661432fb2d719aaf90656f55c287f8ca915dc1c92ec14ff61e67fbaf8: Pulling fs layer
+4abcf2066143: Download complete
Digest: sha256:c5b1261d6d3e43071626931fc004f70149baeba2c8ec672bd4f27761f8e1ad6b
Status: Image is up to date for alpine:latest
docker.io/library/alpine:latest
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Keep the existing behavior for older clients. Otherwise the client
can't opt out (because `ReadOnlyNonRecursive` is unsupported before
1.44).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Commit e6907243af applied a fix for situations
where the client was configured with API-version negotiation, but did not yet
negotiate a version.
However, the checkVersion() function that was implemented copied the semantics
of cli.NegotiateAPIVersion, which ignored connection failures with the
assumption that connection errors would still surface further down.
However, when using the result of a failed negotiation for NewVersionError,
an API version mismatch error would be produced, masking the actual connection
error.
This patch changes the signature of checkVersion to return unexpected errors,
including failures to connect to the API.
Before this patch:
docker -H unix:///no/such/socket.sock secret ls
"secret list" requires API version 1.25, but the Docker daemon API version is 1.24
With this patch applied:
docker -H unix:///no/such/socket.sock secret ls
Cannot connect to the Docker daemon at unix:///no/such/socket.sock. Is the docker daemon running?
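The difference can be sketched as follows. All names except
`checkVersion` are hypothetical (`pinger`, `failingPing`,
`negotiateIgnoringErrors`, `errConnectionFailed`, `fallbackVersion`), and
the signature shown is an assumption for illustration, not the client's
actual code:

```go
package main

import (
	"errors"
	"fmt"
)

// errConnectionFailed is a hypothetical stand-in for a connection failure.
var errConnectionFailed = errors.New("cannot connect to the Docker daemon")

// fallbackVersion is the version assumed when no negotiation took place.
const fallbackVersion = "1.24"

// pinger is a hypothetical function that reports the daemon's API version.
type pinger func() (string, error)

// failingPing simulates a daemon that cannot be reached.
func failingPing() (string, error) { return "", errConnectionFailed }

// negotiateIgnoringErrors models the old semantics: a failed ping is
// swallowed and the fallback version is used, which later surfaces as
// a misleading version-mismatch error.
func negotiateIgnoringErrors(ping pinger) string {
	v, err := ping()
	if err != nil {
		return fallbackVersion
	}
	return v
}

// checkVersion models the new semantics: unexpected errors, including
// connection failures, are returned to the caller.
func checkVersion(ping pinger) (string, error) {
	v, err := ping()
	if err != nil {
		return "", fmt.Errorf("negotiating API version: %w", err)
	}
	return v, nil
}

func main() {
	fmt.Println("old:", negotiateIgnoringErrors(failingPing))
	_, err := checkVersion(failingPing)
	fmt.Println("new:", err)
}
```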
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function has various errors that are returned when failing to make a
connection (due to permission issues, TLS mis-configuration, or failing to
resolve the TCP address).
The errConnectionFailed error is currently used as a special case when
processing Ping responses. The current code did not consistently treat
connection errors, and because of that could either absorb the error,
or process the empty response.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
NegotiateAPIVersion was ignoring errors returned by Ping. The intent here
was to handle API responses from a daemon that may be in an unhealthy state,
however this case is already handled by Ping itself.
Ping only returns an error when either failing to connect to the API (daemon
not running or permissions errors), or when failing to parse the API response.
Neither of those should be ignored in this code, or considered a successful
"ping", so update the code to return the error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was added in 27ef09a46f, which changed
the Ping handling to ignore internal server errors. That case is tested in
TestPingFail, which verifies that we accept the Ping response if a 500
status code was received.
The TestPingWithError test was added to verify behavior if a protocol
(connection) error occurred; however the mock-client returned both a
response, and an error; the error returned would only happen if a connection
error occurred, which means that the server would not provide a reply.
Running the test also shows that returning a response is unexpected, and
ignored:
=== RUN TestPingWithError
2024/02/23 14:16:49 RoundTripper returned a response & error; ignoring response
2024/02/23 14:16:49 RoundTripper returned a response & error; ignoring response
--- PASS: TestPingWithError (0.00s)
PASS
This patch updates the test to remove the response.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't error out when the mount source doesn't exist and the mount has
the `CreateMountpoint` option enabled.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Any PR that is labeled with any `impact/*` label should have a
description for the changelog and an `area/*` label.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
A common pattern in libnetwork is to delete an object using
`DeleteAtomic`, i.e. to check the optimistic lock, but put in a retry
loop to refresh the data and the version index used by the optimistic
lock.
This commit introduces a new `Delete` method to delete without
checking the optimistic lock. It focuses only on the few places where
it's obvious the calling code doesn't rely on the side-effects of the
retry loop (i.e. refreshing the object to be deleted).
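A minimal in-memory sketch of the two delete flavours. The `store` and
`entry` types and the `errModified` error are hypothetical and only
illustrate the idea of an optimistic lock based on a version index; they
are not libnetwork's datastore:

```go
package main

import (
	"errors"
	"fmt"
)

// errModified is a hypothetical error for a failed optimistic-lock check.
var errModified = errors.New("key modified")

// entry pairs a value with the version index used by the optimistic lock.
type entry struct {
	value string
	index uint64
}

// store is a hypothetical in-memory key/value store.
type store struct{ data map[string]entry }

// deleteAtomic deletes key only if the caller's index matches the stored
// one; a caller holding a stale index has to refresh and retry.
func (s *store) deleteAtomic(key string, index uint64) error {
	e, ok := s.data[key]
	if !ok {
		return nil
	}
	if e.index != index {
		return errModified
	}
	delete(s.data, key)
	return nil
}

// delete removes key without checking the optimistic lock, so no retry
// loop is needed.
func (s *store) delete(key string) { delete(s.data, key) }

func main() {
	s := &store{data: map[string]entry{"net1": {value: "cfg", index: 3}}}
	fmt.Println("stale deleteAtomic:", s.deleteAtomic("net1", 2))
	s.delete("net1")
	fmt.Println("remaining entries:", len(s.data))
}
```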
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
I noticed that this log didn't use structured logs;
[resolver] failed to query DNS server: 10.115.11.146:53, query: ;google.com.\tIN\t A" error="read udp 172.19.0.2:46361->10.115.11.146:53: i/o timeout
[resolver] failed to query DNS server: 10.44.139.225:53, query: ;google.com.\tIN\t A" error="read udp 172.19.0.2:53991->10.44.139.225:53: i/o timeout
But other logs did;
DEBU[2024-02-20T15:48:51.026704088Z] [resolver] forwarding query client-addr="udp:172.19.0.2:39661" dns-server="udp:192.168.65.7:53" question=";google.com.\tIN\t A"
DEBU[2024-02-20T15:48:51.028331088Z] [resolver] forwarding query client-addr="udp:172.19.0.2:35163" dns-server="udp:192.168.65.7:53" question=";google.com.\tIN\t AAAA"
DEBU[2024-02-20T15:48:51.057329755Z] [resolver] received AAAA record "2a00:1450:400e:801::200e" for "google.com." from udp:192.168.65.7
DEBU[2024-02-20T15:48:51.057666880Z] [resolver] received A record "142.251.36.14" for "google.com." from udp:192.168.65.7
As we're already constructing a logger with these fields, we may as well use it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Allow overriding the PAGER/GIT_PAGER variables inside the container.
Use `cat` as the pager when running in GitHub Actions (to avoid things
like `git diff` stalling the CI).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't use OTEL tracing in this test because we're testing the HTTP proxy
behavior here and we don't want OTEL to interfere.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This will return a single entry for each name/value pair, and for now
all the "image specific" metadata (labels, config, size) should be
either "default platform" or "first platform we have locally" (which
then matches the logic for commands like `docker image inspect`, etc)
with everything else (just ID, maybe?) coming from the manifest
list/index.
That leaves room for the longer-term implementation to add new fields to
describe the _other_ images that are part of the manifest list/index.
Co-authored-by: Tianon Gravi <admwiggin@gmail.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
v1.33.0 is also available, but it would also cause
`github.com/aws/aws-sdk-go-v2` to change from v1.24.1 to v1.25.0
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
DNS names were only set up for user-defined networks. On Linux, none
of the built-in networks (bridge/host/none) have built-in DNS, so they
don't need DNS names.
But, on Windows, the default network is "nat" and it does need the DNS
names.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This matches the prior behavior before 2a6ff3c24f.
This also updates the Swagger documentation for the current version to note that the field might be the empty string and what that means.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Archives being unpacked by Dockerfiles may have been created on other
OSes with different conventions and semantics for xattrs, making them
impossible to apply when extracting. Restore the old best-effort xattr
behaviour users have come to depend on in the classic builder.
The (archive.Archiver).UntarPath function does not allow the options
passed to Untar to be customized. It also happens to be a trivial
wrapper around the Untar function. Inline the function body and add the
option.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Update to the latest patch release, which contains changes from v0.13.5 to
remove the reference package from "github.com/docker/distribution", which
is now a separate module.
full diff: https://github.com/containerd/nydus-snapshotter/compare/v0.8.2...v0.13.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Non-swarm networks created before network-creation-time validation
was added in 25.0.0 continued working, because the checks are not
re-run.
But, swarm creates networks when needed (with 'agent=true'), to
ensure they exist on each agent - ignoring the NetworkNameError
that says the network already existed.
By ignoring validation errors on creation of a network with
agent=true, pre-existing swarm networks with IPAM config that would
fail the new checks will continue to work too.
New swarm (overlay) networks are still validated, because they are
initially created with 'agent=false'.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This spec is not directly relevant for the image spec, and the Docker
documentation no longer includes the actual specification.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prior to release 25.0.0, the bridge in an internal network was assigned
an IP address - making the internal network accessible from the host,
giving containers on the network access to anything listening on the
bridge's address (or INADDR_ANY on the host).
This change restores that behaviour. It does not restore the default
route that was configured in the container, because packets sent outside
the internal network's subnet have always been dropped. So, a 'connect()'
to an address outside the subnet will still fail fast.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Replace regex matching/replacement and re-reading of generated files
with a simple parser, and struct to remember and manipulate the file
content.
Annotate the generated file with a header comment saying the file is
generated, but can be modified, and a trailing comment describing how
the file was generated and listing external nameservers.
Always start with the host's resolv.conf file, whether generating config
for host networking, or with/without an internal resolver - rather than
editing a file previously generated for a different use-case.
Resolves an issue where rewrites of the generated file resulted in
default IPv6 nameservers being unnecessarily added to the config.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This const contains the minimum API version that can be supported by the
API server. The daemon is currently configured to use the same version,
but we may increment the _configured_ minimum version when deprecating
old API versions in future.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 08e4e88482 (Docker Engine v25.0.0)
deprecated API version v1.23 and lower, but older API versions could be
enabled through the DOCKER_MIN_API_VERSION environment variable.
This patch removes all support for API versions < v1.24.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.20 (Docker Engine v1.11.0) and older allowed a HostConfig to be passed
when starting a container. This feature was deprecated in API v1.21 (Docker
Engine v1.10.0) in 3e7405aea8, and removed in
API v1.23 (Docker Engine v1.12.0) in commit 0a8386c8be.
API v1.23 and older are deprecated, and this patch removes the feature.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 322e2a7d05 changed the format of errors
returned by the API to be in JSON format for API v1.24. Older versions of
the API returned errors in plain-text format.
API v1.23 and older are deprecated, so we can remove support for plain-text
error responses.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This endpoint was deprecated in API v1.20 (Docker Engine v1.8.0) in
commit db9cc91a9e, in favor of the
`PUT /containers/{id}/archive` and `HEAD /containers/{id}/archive`
endpoints, and disabled in API v1.24 (Docker Engine v1.12.0) through
commit 428328908d.
This patch removes the endpoint, and the associated `daemon.ContainerCopy`
method in the backend.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.21 (Docker Engine v1.9.0) enforces the request to have a JSON
content-type on exec start (see 45dc57f229).
An exception was added in 0b5e628e14 to
make this check conditional (supporting API < 1.21).
API v1.23 and older are deprecated, and this patch removes the feature.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.23 and older are deprecated, so we can remove the code to adjust
responses for API v1.20 and lower.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The TestInspectAPIContainerResponse mentioned that Windows does not
support API versions before v1.25.
While technically, no stable release existed for Windows with API versions
before that (see f811d5b128), API version
v1.24 was enabled in e4af39aeb3, to have
a consistent fallback version for API version negotiation.
This patch updates the test to reflect that change.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.23 and older are deprecated, so we can remove the code to adjust
responses for API v1.19 and lower.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API v1.20 and up produces an error when signalling / killing a non-running
container (see c92377e300). Older API versions
allowed this, and an exception was added in 621e3d8587.
API v1.23 and older are deprecated, so we can remove this handling.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
API versions before 1.19 allowed CpuShares that were greater than the maximum
or less than the minimum supported by the kernel, and relied on the kernel to
do the right thing.
Commit ed39fbeb2a introduced code to adjust the
CPU shares to be within the accepted range when using API version 1.18 or
lower.
API v1.23 and older are deprecated, so we can remove support for this
functionality.
Currently, there's no validation for CPU shares to be within an acceptable
range; a TODO was added to add validation for this option, and to use the
`linuxMinCPUShares` and `linuxMaxCPUShares` consts for this.
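A sketch of what such validation might look like. The constant names
come from the text above; their values (2 and 262144) and the
`validateCPUShares` helper are assumptions for illustration, and treating
0 as "unset" is likewise an assumption:

```go
package main

import "fmt"

// Assumed limits for CPU shares; the values are an assumption here.
const (
	linuxMinCPUShares = 2
	linuxMaxCPUShares = 262144
)

// validateCPUShares is a hypothetical validator that rejects values
// outside the accepted range instead of silently adjusting them.
func validateCPUShares(shares int64) error {
	if shares == 0 {
		// Assumed to mean "not set"; nothing to validate.
		return nil
	}
	if shares < linuxMinCPUShares || shares > linuxMaxCPUShares {
		return fmt.Errorf("invalid CPU shares %d: must be between %d and %d",
			shares, linuxMinCPUShares, linuxMaxCPUShares)
	}
	return nil
}

func main() {
	fmt.Println(validateCPUShares(1024))
	fmt.Println(validateCPUShares(1))
}
```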
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "pull" option was added in API v1.16 (Docker Engine v1.4.0) in commit
054e57a622, which gated the option by API
version.
API v1.23 and older are deprecated, so we can remove the gate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "rm" option was made the default in API v1.12 (Docker Engine v1.0.0)
in commit b60d647172, and "force-rm" was
added in 667e2bd4ea.
API v1.23 and older are deprecated, so we can remove these gates.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "pause" flag was added in API v1.13 (Docker Engine v1.1.0), and is
enabled by default (see 17d870bed5).
API v1.23 and older are deprecated, so we can remove the version-gate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Inspect and history used two different ways to find the present images.
This made history fail in some cases where image inspect would work (if
a configuration of a manifest wasn't found in the content store).
With this change we now use the same logic for both inspect and history.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 9f6c532f59
futex: Add sys_futex_wake()
To complement sys_futex_waitv() add sys_futex_wake(). This syscall
implements what was previously known as FUTEX_WAKE_BITSET except it
uses 'unsigned long' for the bitmask and takes FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cb8c4312af
futex: Add sys_futex_wait()
To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
syscall implements what was previously known as FUTEX_WAIT_BITSET
except it uses 'unsigned long' for the value and bitmask arguments,
takes timespec and clockid_t arguments for the absolute timeout and
uses FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 0f4b5f9722
futex: Add sys_futex_requeue()
Finish off the 'simple' futex2 syscall group by adding
sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are
too numerous to fit into a regular syscall. As such, use struct
futex_waitv to pass the 'source' and 'destination' futexes to the
syscall.
This syscall implements what was previously known as FUTEX_CMP_REQUEUE
and uses {val, uaddr, flags} for source and {uaddr, flags} for
destination.
This design explicitly allows requeueing between different types of
futex by having a different flags word per uaddr.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: c35559f94e
x86/shstk: Introduce map_shadow_stack syscall
When operating with shadow stacks enabled, the kernel will automatically
allocate shadow stacks for new threads, however in some cases userspace
will need additional shadow stacks. The main example of this is the
ucontext family of functions, which require userspace allocating and
pivoting to userspace managed stacks.
Unlike most other user memory permissions, shadow stacks need to be
provisioned with special data in order to be useful. They need to be setup
with a restore token so that userspace can pivot to them via the RSTORSSP
instruction. But, the security design of shadow stacks is that they
should not be written to except in limited circumstances. This presents a
problem for userspace, as to how userspace can provision this special
data, without allowing for the shadow stack to be generally writable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 09da082b07
fs: Add fchmodat2()
On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].
There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.
The new fchmodat2() syscall allows passing the AT_SYMLINK_NOFOLLOW flag,
unlike the existing fchmodat.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cf264e1329
NAME
cachestat - query the page cache statistics of a file.
SYNOPSIS
#include <sys/mman.h>
struct cachestat_range {
__u64 off;
__u64 len;
};
struct cachestat {
__u64 nr_cache;
__u64 nr_dirty;
__u64 nr_writeback;
__u64 nr_evicted;
__u64 nr_recently_evicted;
};
int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
struct cachestat *cstat, unsigned int flags);
DESCRIPTION
cachestat() queries the number of cached pages, number of dirty
pages, number of pages marked for writeback, number of evicted
pages, number of recently evicted pages, in the bytes range given by
`off` and `len`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syscall is gated by CAP_SYS_NICE, matching the profile in containerd.
containerd: a6e52c74fa
libseccomp: d83cb7ac25
kernel: c6018b4b25
mm/mempolicy: add set_mempolicy_home_node syscall
This syscall can be used to set a home node for the MPOL_BIND and
MPOL_PREFERRED_MANY memory policy. Users should use this syscall after
setting up a memory policy for the specified range as shown below.
mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
new_nodes->size + 1, 0);
sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
home_node, 0);
The syscall allows specifying a home node/preferred node from which
kernel will fulfill memory allocation requests first.
...
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The compatibility depends on whether `hyperv` or `process` container
isolation is used.
This fixes cache not being used when building images based on older
Windows versions on a newer Windows host.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Only print the tag when the received reference has a tag, if
we can't cast the received tag to a `reference.Tagged` then
skip printing the tag as it's likely a digest.
Fixes panic when trying to install a plugin from a reference
with a digest such as
`vieux/sshfs@sha256:1d3c3e42c12138da5ef7873b97f7f32cf99fb6edde75fa4f0bcf9ed277855811`
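The guarded type assertion can be sketched like this. The `named` and
`tagged` interfaces and the two reference types are hypothetical
stand-ins for the reference package's types, used only to keep the
example self-contained:

```go
package main

import "fmt"

// named and tagged are hypothetical stand-ins for reference interfaces.
type named interface{ Name() string }
type tagged interface{ Tag() string }

// taggedRef is a reference such as "vieux/sshfs:latest".
type taggedRef struct{ name, tag string }

func (r taggedRef) Name() string { return r.name }
func (r taggedRef) Tag() string  { return r.tag }

// digestedRef is a reference such as "vieux/sshfs@sha256:...".
type digestedRef struct{ name, digest string }

func (r digestedRef) Name() string { return r.name }

// describe prints the tag only when the reference actually has one.
// The two-value assertion avoids the panic an unchecked assertion
// would cause for a digested reference.
func describe(ref named) string {
	if t, ok := ref.(tagged); ok {
		return ref.Name() + ":" + t.Tag()
	}
	return ref.Name()
}

func main() {
	fmt.Println(describe(taggedRef{name: "vieux/sshfs", tag: "latest"}))
	fmt.Println(describe(digestedRef{name: "vieux/sshfs", digest: "sha256:1d3c3e42"}))
}
```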
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Since 964ab7158c, we explicitly set the bridge MTU if it was specified.
Unfortunately, kernels older than v4.17 have a check that prevents us
from manually setting the MTU to anything greater than 1500 if no link
is attached to the bridge, which is how we do things -- create the
bridge, set its MTU and later on, attach veths to it.
Relevant kernel commit: 804b854d37
As we still have to support CentOS/RHEL 7 (and their old v3.10 kernels)
for a few more months, we need to ignore EINVAL if the MTU is > 1500
(but <= 65535).
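The error filtering described above could look roughly like the sketch
below. The `filterMTUError` helper and the `errInvalid` variable are
hypothetical; only the EINVAL check and the 1500/65535 bounds come from
the text:

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// errInvalid is a convenience alias for EINVAL, used in this sketch only.
var errInvalid error = syscall.EINVAL

// filterMTUError is a hypothetical helper: it drops EINVAL when the
// requested MTU is in the range that old kernels refuse to set on a
// bridge without attached links, and returns every other error as-is.
func filterMTUError(mtu int, err error) error {
	if err == nil {
		return nil
	}
	if errors.Is(err, syscall.EINVAL) && mtu > 1500 && mtu <= 65535 {
		return nil
	}
	return err
}

func main() {
	fmt.Println(filterMTUError(9000, errInvalid))  // ignored
	fmt.Println(filterMTUError(70000, errInvalid)) // still reported
}
```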
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Commit 4f47013feb introduced a new validation step to make sure no
IPv6 subnet is configured on a network which has EnableIPv6=false.
Commit 5d5eeac310 then removed that validation step and automatically
enabled IPv6 for networks with a v6 subnet. But this specific commit
was reverted in c59e93a67b and now the error introduced by 4f47013feb
is re-introduced.
But it turns out some users expect a network created with an IPv6
subnet and EnableIPv6=false to actually have no IPv6 connectivity.
This restores that behavior.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Previous commit made getDBhandle a one-liner returning a struct
member -- making it useless. Inline it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This parameter was used to tell the boltdb kvstore not to open/close
the underlying boltdb db file before/after each get/put operation.
Since d21d0884ae, we've a single datastore instance shared by all
components that need it. That commit set `PersistConnection=true`.
We can now safely remove this param altogether, and remove all the
code that was opening and closing the db file before and after each
operation -- it's dead code!
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This test is non-representative of what we now do in libnetwork.
Since the ability to open the same boltdb database multiple
times in parallel will be dropped in the next commit, just remove
this test.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Adds a test case for installing a plugin from a remote in the form
of `plugin-content-trust@sha256:d98f2f8061...`, which is currently
causing the daemon to panic, as we found while running the CLI e2e
tests:
```
docker plugin install registry:5000/plugin-content-trust@sha256:d98f2f806144bf4ba62d4ecaf78fec2f2fe350df5a001f6e3b491c393326aedb
```
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
The monitorDaemon() goroutine calls startContainerd() then blocks on
<-daemonWaitCh to wait for it to exit. The startContainerd() function
would (re)initialize the daemonWaitCh so a restarted containerd could be
waited on. This implementation was race-free because startContainerd()
would synchronously initialize the daemonWaitCh before returning. When
the call to start the managed containerd process was moved into the
waiter goroutine, the code to initialize the daemonWaitCh struct field
was also moved into the goroutine. This introduced a race condition.
Move the daemonWaitCh initialization to guarantee that it happens before
the startContainerd() call returns.
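A simplified sketch of the fixed ordering. The `remote` type and the
`run` callback are hypothetical; the point is only that the channel
field is assigned synchronously, before the goroutine is started and
before the function returns:

```go
package main

import "fmt"

// remote is a hypothetical stand-in for the supervisor struct.
type remote struct {
	daemonWaitCh chan struct{}
}

// startContainerd initializes daemonWaitCh before returning, so the
// caller can safely block on it. Assigning the field inside the
// goroutine instead would race with the caller's read.
func (r *remote) startContainerd(run func()) {
	ch := make(chan struct{})
	r.daemonWaitCh = ch
	go func() {
		defer close(ch) // signal that the managed process exited
		run()
	}()
}

// monitorDaemon starts the process and waits for it to exit.
func (r *remote) monitorDaemon(run func()) {
	r.startContainerd(run)
	<-r.daemonWaitCh
}

func main() {
	r := &remote{}
	exited := false
	r.monitorDaemon(func() { exited = true })
	fmt.Println("process exited:", exited)
}
```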
Signed-off-by: Cory Snider <csnider@mirantis.com>
Containers attached to an 'internal' bridge network are unable to
communicate when the host is running firewalld.
Non-internal bridges are added to a trusted 'docker' firewalld zone, but
internal bridges were not.
DOCKER-ISOLATION iptables rules are still configured for an internal
network; they block traffic to/from addresses outside the network's subnet.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Do not set 'Config.MacAddress' in inspect output unless the MAC address
is configured.
Also, make sure it is filled in for a configured address on the default
network before the container is started (by translating the network name
from 'default' to 'config' so that the address lookup works).
Signed-off-by: Rob Murray <rob.murray@docker.com>
The API's EndpointConfig struct has a MacAddress field that's used for
both the configured address, and the current address (which may be generated).
A configured address must be restored when a container is restarted, but a
generated address must not.
The previous attempt to differentiate between the two, without adding a field
to the API's EndpointConfig that would show up in 'inspect' output, was a
field in the daemon's version of EndpointSettings, MACOperational. It did
not work: MACOperational was set to true when a configured address was
used. So, while it ensured addresses were regenerated, it failed to preserve
a configured address.
So, this change removes that code, and adds DesiredMacAddress to the wrapped
version of EndpointSettings, where it is persisted but does not appear in
'inspect' results. Its value is copied from MacAddress (the API field) when
a container is created.
Signed-off-by: Rob Murray <rob.murray@docker.com>
File paths can contain commas, particularly paths returned from
t.TempDir() in subtests which include commas in their names. There is
only one datastore provider and it only supports a single address, so
the only use of parsing the address is to break tests in mysterious
ways.
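To illustrate the failure mode: the `parseAddrs` helper below is
hypothetical and merely assumes the address was split on commas; the
path is a made-up example of a temporary directory for a subtest whose
name contains a comma:

```go
package main

import (
	"fmt"
	"strings"
)

// parseAddrs is a hypothetical helper that treats the address as a
// comma-separated list, as assumed for this illustration.
func parseAddrs(addr string) []string {
	return strings.Split(addr, ",")
}

func main() {
	// A single file path that happens to contain a comma.
	path := "/tmp/TestStore/case_a,b/001/local-kv.db"
	parts := parseAddrs(path)
	// The one path is broken into two bogus "addresses".
	fmt.Println(len(parts), parts)
}
```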
Signed-off-by: Cory Snider <csnider@mirantis.com>
The bbolt library wants exclusive access to the boltdb file and uses
file locking to assure that is the case. The controller and each network
driver that needs persistent storage instantiates its own unique
datastore instance, backed by the same boltdb file. The boltdb kvstore
implementation works around multiple access to the same boltdb file by
aggressively closing the boltdb file between each transaction. This is
very inefficient. Have the controller pass its datastore instance into
the drivers and enable the PersistConnection option to disable closing
the boltdb between transactions.
Set data-dir in unit tests which instantiate libnetwork controllers so
they don't hang trying to lock the default boltdb database file.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The double quotes inside a single quoted string don't need to be
escaped.
Looks like different PowerShell versions are treating this differently
and it started failing unexpectedly without any changes on our side.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- full diff: https://github.com/actions/setup-go/compare/v3.5.0...v5.0.0
v5
In scope of this release, we change Nodejs runtime from node16 to node20.
Moreover, we update some dependencies to the latest versions.
Besides, this release contains such changes as:
- Fix hosted tool cache usage on windows
- Improve documentation regarding dependencies caching
V4
The V4 edition of the action offers:
- Enabled caching by default
- The action will try to enable caching unless the cache input is explicitly
set to false.
Please see "Caching dependency files and build outputs" for more information:
https://github.com/actions/setup-go#caching-dependency-files-and-build-outputs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If a reader has caught up to the logger and is waiting for the next
message, it should stop waiting when the logger is closed. Otherwise
the reader will unnecessarily wait the full closedDrainTimeout for no
log messages to arrive.
This case was overlooked when the journald reader was recently
overhauled to be compatible with systemd 255, and the reader tests only
failed when a logical race happened to settle in such a way to exercise
the bugged code path. It was only after implicit flushing on close was
added to the journald test harness that the Follow tests would
repeatably fail due to this bug. (No new regression tests are needed.)
Signed-off-by: Cory Snider <csnider@mirantis.com>
The journald reader test harness injects an artificial asynchronous
delay into the logging pipeline: a logged message won't be written to
the journal until at least 150ms after the Log() call returns. If a test
returns while log messages are still in flight to be written, the logs
may attempt to be written after the TempDir has been cleaned up, leading
to spurious errors.
The logger read tests which interleave writing and reading have to
include explicit synchronization points to work reliably with this delay
in place. On the other hand, tests should not be required to sync the
logger explicitly before returning. Override the Close() method in the
test harness wrapper to wait for in-flight logs to be flushed to disk.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- Check the return value when logging messages
- Log the stream (stdout/stderr) and list of messages that were not read
- Wait until the logger is closed before returning early (panic/fatal)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Writing the systemd-journal-remote command output directly to os.Stdout
and os.Stderr makes it nearly impossible to tell which test case the
output is related to when the tests are not run in verbose mode. Extend
the journald sender fake to redirect output to the test log so they
interleave with the rest of the test output.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Go race detector was detecting a data race when running the
TestLogRead/Follow/Concurrent test against the journald logging driver.
The race was in the test harness, specifically syncLogger. The waitOn
field would be reassigned each time a log entry is sent to the journal,
which is not concurrency-safe. Make it concurrency-safe using the same
patterns that are used in the log follower implementation to synchronize
with the logger.
Signed-off-by: Cory Snider <csnider@mirantis.com>
When saving an image, treat `image@sha256:abcdef...` the same as
`abcdef...`. This makes it:
- Not export the digested tag as the image name
- Not try to export all tags from the image repository
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Saving an image via digested reference, ID or truncated ID doesn't store
the image reference in the archive. This also causes the save code to
not add the image's manifest to the index.json.
This commit explicitly adds the untagged manifests to the index.json if
no tagged manifests were added.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
errDrainDone is a sentinel error which is never supposed to escape the
package. Consequently, it needs to be filtered out of returns all over
the place, adding boilerplate. Forgetting to filter out these errors
would be a logic bug which the compiler would not help us catch. Replace
it with boolean multi-valued returns as they can't be accidentally
ignored or propagated.
Signed-off-by: Cory Snider <csnider@mirantis.com>
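As an illustration of the errDrainDone change described above, here is a minimal sketch of replacing a sentinel error with a boolean return value. The names `errDone`, `next` and `drain` are hypothetical and are not taken from the journald reader; the point is only that the "done" signal travels in its own return value, so the error return carries nothing but real failures.

```go
package main

import (
	"errors"
	"fmt"
)

// errDone mirrors the old sentinel style (hypothetical name): every caller
// has to remember to filter it out before returning an error upwards.
var errDone = errors.New("drain done")

// nextWithSentinel signals "nothing left" through the error return.
func nextWithSentinel(q []string, i int) (string, error) {
	if i >= len(q) {
		return "", errDone
	}
	return q[i], nil
}

// next signals "nothing left" through a separate boolean, so the error
// return is reserved for real failures and cannot leak a sentinel.
func next(q []string, i int) (msg string, done bool, err error) {
	if i >= len(q) {
		return "", true, nil
	}
	return q[i], false, nil
}

// drain reads messages until next reports done or fails.
func drain(q []string) ([]string, error) {
	var out []string
	for i := 0; ; i++ {
		msg, done, err := next(q, i)
		if err != nil {
			return out, err
		}
		if done {
			return out, nil
		}
		out = append(out, msg)
	}
}

func main() {
	msgs, err := drain([]string{"a", "b"})
	fmt.Println(msgs, err)
}
```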
While it doesn't really matter if the reader waits for an extra
arbitrary period beyond an arbitrary hardcoded timeout, it's also
trivial and cheap to implement, and nice to have.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The journald reader uses a timer to set an upper bound on how long to
wait for the final log message of a stopped container. However, the
timer channel is only received from in non-blocking select statements!
There isn't enough benefit of using a timer to offset the cost of having
to manage the timer resource. Setting a deadline and comparing the
current time is just as effective, without having to manage the
lifecycle of any runtime resources.
Signed-off-by: Cory Snider <csnider@mirantis.com>
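A rough sketch of the deadline-based approach from the timer commit above, assuming a simple polling loop. `waitUntil` and its parameters are hypothetical and do not reflect the journald reader's actual structure. No timer is allocated, so there is nothing to stop or drain afterwards.

```go
package main

import (
	"fmt"
	"time"
)

// waitUntil polls ready until it returns true or the deadline has passed.
// The deadline is a plain time.Time value, so there is no runtime resource
// whose lifecycle has to be managed.
func waitUntil(ready func() bool, timeout, interval time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for {
		if ready() {
			return true
		}
		if !time.Now().Before(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	calls := 0
	ok := waitUntil(func() bool { calls++; return calls >= 3 }, time.Second, time.Millisecond)
	fmt.Println(ok, calls)
}
```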
Synthesize a boot ID for journal entries fed into
systemd-journal-remote, as required by systemd 255.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Following logs with a non-negative tail when the container log is empty
is broken on the journald driver when used with systemd 255. Add tests
which cover this edge case to our loggertest suite.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Previously this was done indirectly - the `compare` function didn't
check the `ArgsEscaped`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Restrict cache candidates only to images that were built locally.
This doesn't affect builds using `--cache-from`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Store an additional image property that makes it possible to tell
whether an image was built locally.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is a follow-up to 2cf230951f, adding
more directives to adjust for some new code added since:
Before this patch:
make -C ./internal/gocompat/
GO111MODULE=off go generate .
GO111MODULE=on go mod tidy
GO111MODULE=on go test -v
# github.com/docker/docker/internal/sliceutil
internal/sliceutil/sliceutil.go:3:12: type parameter requires go1.18 or later (-lang was set to go1.16; check go.mod)
internal/sliceutil/sliceutil.go:3:14: predeclared comparable requires go1.18 or later (-lang was set to go1.16; check go.mod)
internal/sliceutil/sliceutil.go:4:19: invalid map key type T (missing comparable constraint)
# github.com/docker/docker/libnetwork
libnetwork/endpoint.go:252:17: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
# github.com/docker/docker/daemon
daemon/container_operations.go:682:9: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
daemon/inspect.go:42:18: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
With this patch:
make -C ./internal/gocompat/
GO111MODULE=off go generate .
GO111MODULE=on go mod tidy
GO111MODULE=on go test -v
=== RUN TestModuleCompatibllity
main_test.go:321: all packages have the correct go version specified through //go:build
--- PASS: TestModuleCompatibllity (0.00s)
PASS
ok gocompat 0.031s
make: Leaving directory '/go/src/github.com/docker/docker/internal/gocompat'
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Functional programming for the win! Add a utility function to map the
values of a slice, along with a curried variant, to tide us over until
equivalent functionality gets added to the standard library
(https://go.dev/issue/61898)
Signed-off-by: Cory Snider <csnider@mirantis.com>
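For readers unfamiliar with the pattern, the slice-mapping utility described above could look roughly like the sketch below. The names `Map` and `Mapper` and their signatures are assumptions made for illustration and may differ from what the commit actually added.

```go
package main

import (
	"fmt"
	"strconv"
)

// Map applies fn to every element of s and returns the results in order.
func Map[In, Out any](s []In, fn func(In) Out) []Out {
	out := make([]Out, len(s))
	for i, v := range s {
		out[i] = fn(v)
	}
	return out
}

// Mapper is the curried variant: it binds fn once and returns a function
// that can be applied to any number of slices.
func Mapper[In, Out any](fn func(In) Out) func([]In) []Out {
	return func(s []In) []Out {
		return Map(s, fn)
	}
}

func main() {
	fmt.Println(Map([]int{1, 2, 3}, strconv.Itoa))

	double := Mapper(func(n int) int { return n * 2 })
	fmt.Println(double([]int{1, 2, 3}))
}
```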
We need to isolate the images that we are remapping to a userns, we
can't mix them with "normal" images. In the graph driver case this means
we create a new root directory where we store the images and everything
else, in the containerd case we can use a new namespace.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
These types were deprecated in v25.0, and moved to api/types/container;
This patch removes the aliases for;
- api/types.ResizeOptions (deprecated in 95b92b1f97)
- api/types.ContainerAttachOptions (deprecated in 30f09b4a1a)
- api/types.ContainerCommitOptions (deprecated in 9498d897ab)
- api/types.ContainerRemoveOptions (deprecated in 0f77875220)
- api/types.ContainerStartOptions (deprecated in 7bce33eb0f)
- api/types.ContainerListOptions (deprecated in 9670d9364d)
- api/types.ContainerLogsOptions (deprecated in ebef4efb88)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in v25.0, and moved to api/types/swarm;
This patch removes the aliases for;
- api/types.ServiceUpdateResponse (deprecated in 5b3e6555a3)
- api/types.ServiceCreateResponse (deprecated in ec69501e94)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in 48cacbca24
(v25.0), and moved to api/types/image.
This patch removes the aliases for;
- api/types.ImageDeleteResponseItem
- api/types.ImageSummary
- api/types.ImageMetadata
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in b688af2226
(v25.0), and moved to api/types/checkpoint.
This patch removes the aliases for;
- api/types.CheckpointCreateOptions
- api/types.CheckpointListOptions
- api/types.CheckpointDeleteOptions
- api/types.Checkpoint
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were deprecated in c90229ed9a
(v25.0), and moved to api/types/system.
This patch removes the aliases for;
- api/types.Info
- api/types.Commit
- api/types.PluginsInfo
- api/types.NetworkAddressPool
- api/types.Runtime
- api/types.SecurityOpt
- api/types.KeyValue
- api/types.DecodeSecurityOptions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
To prevent a circular import between api/types and api/types/image,
the RequestPrivilegeFunc reference was not moved, but defined as
part of the PullOptions / PushOptions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 8b7af1d0f added some code to update the DNSNames of all
endpoints attached to a sandbox by loading a new instance of each
affected endpoint from the datastore through a call to
`Network.EndpointByID()`.
This method then calls `Network.getEndpointFromStore()`, that in
turn calls `store.GetObject()`, which then calls `cache.get()`,
which calls `o.CopyTo(kvObject)`. This effectively creates a fresh
new instance of an Endpoint. However, endpoints are already kept in
memory by Sandbox, meaning we now have two in-memory instances of
the same Endpoint.
As it turns out, libnetwork is built around the idea that no two objects
representing the same thing should live in memory at the same time; otherwise,
mutex locking and optimistic locking break (as both instances will have a drifting
version tracking ID -- dbIndex in libnetwork parlance).
In this specific case, this bug materializes by container rename failing
when applied a second time for a given container. An integration test is
added to make sure this won't happen again.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
I made a mistake in the last commit - after resolving the IP from the
passed `addr` for CIFS it would still resolve the `device` part.
Apply only one name resolution
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Prior to 7a9b680a, the container short ID was added to the network
aliases only for custom networks. However, this logic wasn't preserved
in 6a2542d and now the cid is always added to the list of network
aliases.
This commit reintroduces the old logic.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- pass the cluster as an argument instead of manually setting it after
creating the router-options
- remove the "opts" variable, to prevent it accidentally being used (with
the assumption that's the value returned)
- use a struct-literal for the returned options.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 21e50b89c9 added a label on the buildkit
worker to advertise the host-gateway-ip. This option can be either set by the
user in the daemon config, or otherwise defaults to the gateway-ip.
If no value is set by the user, discovery of the gateway-ip happens when
initializing the network-controller (`NewDaemon`, `daemon.restore()`).
However d222bf097c changed how we handle the
daemon config. As a result, the `cli.Config` used when initializing the
builder only holds configuration information from the daemon config
(user-specified or defaults), but is not updated with information set
by `NewDaemon`.
This patch adds an accessor on the daemon to get the current daemon config.
An alternative could be to return the config by `NewDaemon` (which should
likely be a _copy_ of the config).
Before this patch:
docker buildx inspect default
Name: default
Driver: docker
Nodes:
Name: default
Endpoint: default
Status: running
Buildkit: v0.12.4+3b6880d2a00f
Platforms: linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
Labels:
org.mobyproject.buildkit.worker.moby.host-gateway-ip: <nil>
After this patch:
docker buildx inspect default
Name: default
Driver: docker
Nodes:
Name: default
Endpoint: default
Status: running
Buildkit: v0.12.4+3b6880d2a00f
Platforms: linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
Labels:
org.mobyproject.buildkit.worker.moby.host-gateway-ip: 172.18.0.1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 8ae94cafa5 added a DNS resolution
of the `device` part of the volume option.
The previous way to resolve the passed hostname was to use `addr`
option, which was handled by the same code path as the `nfs` mount type.
The issue is that `addr` is also an SMB module option handled by kernel
and passing a hostname as `addr` produces an invalid argument error.
To fix that, restore the old behavior to handle `addr` the same way as
before, and only perform the new DNS resolution of `device` if there is
no `addr` passed.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Update the version of compose used in CI to the latest version.
- full diff: docker/compose@v2.24.1...v2.24.2
- release notes: https://github.com/docker/compose/releases/tag/v2.24.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also fixes some potentially unclosed file-handles,
inlines some variables, and uses consts for fixed
values.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also fix a "defer in loop" warning by switching to sub-tests, and
simplify some code by using os.WriteFile() instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The names of extended attributes are not completely freeform. Attributes
are namespaced, and the kernel enforces (among other things) that only
attributes whose names are prefixed with a valid namespace are
permitted. The name of the attribute therefore needs to be known in
order to diagnose issues with lsetxattr. Include the name of the
extended attribute in the errors returned from Lsetxattr and
Lgetxattr so that we and our users can more easily troubleshoot xattr-related
issues. Include the name in a separate rich-error field to provide code
handling the error enough information to determine whether or not the
failure can be ignored.
Signed-off-by: Cory Snider <csnider@mirantis.com>
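To illustrate the "separate rich-error field" idea from the xattr commit above, here is a minimal sketch. The `XattrError` type, the `setXattr` stand-in and the `ignorable` policy are all hypothetical; the real package may use different names and fields.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"syscall"
)

// XattrError is a hypothetical rich error that carries the attribute name
// in its own field, so callers do not have to parse the message.
type XattrError struct {
	Op   string
	Attr string
	Path string
	Err  error
}

func (e *XattrError) Error() string {
	return e.Op + " " + e.Attr + " " + e.Path + ": " + e.Err.Error()
}

func (e *XattrError) Unwrap() error { return e.Err }

// setXattr stands in for a real lsetxattr wrapper; for the purposes of this
// sketch it always fails with EPERM.
func setXattr(path, attr string) error {
	return &XattrError{Op: "lsetxattr", Attr: attr, Path: path, Err: syscall.EPERM}
}

// ignorable shows one possible policy (an assumption for this sketch, not
// the daemon's real policy): tolerate permission failures only for
// attributes in the "trusted." namespace.
func ignorable(err error) bool {
	var xerr *XattrError
	if !errors.As(err, &xerr) {
		return false
	}
	return errors.Is(xerr, syscall.EPERM) && strings.HasPrefix(xerr.Attr, "trusted.")
}

func main() {
	err := setXattr("/tmp/f", "trusted.overlay.opaque")
	fmt.Println(err, ignorable(err))
}
```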
The `GetImageOpts` struct is used for options to be passed to the backend,
and is not used in client code. This struct is currently intended for internal
use only.
This patch moves the `GetImageOpts` struct to the backend package to prevent
it being imported in the client, and to make it more clear that this is part
of internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The MAC address of a running container was stored in the same place as
the configured address for a container.
When starting a stopped container, a generated address was treated as a
configured address. If that generated address (based on an IPAM-assigned
IP address) had been reused, the containers ended up with duplicate MAC
addresses.
So, remember whether the MAC address was explicitly configured, and
clear it if not.
Signed-off-by: Rob Murray <rob.murray@docker.com>
With containerd snapshotters enabled `docker run` currently fails when
creating a container from an image that doesn't have the default host
platform without an explicit `--platform` selection:
```
$ docker run image:amd64
Unable to find image 'asdf:amd64' locally
docker: Error response from daemon: pull access denied for asdf, repository does not exist or may require 'docker login'.
See 'docker run --help'.
```
This is confusing and the graphdriver behavior is much better here,
because it runs whatever platform the image has, but prints a warning:
```
$ docker run image:amd64
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
```
This commit changes the containerd snapshotter behavior to be the same
as the graphdriver. This doesn't affect container creation when platform
is specified explicitly.
```
$ docker run --rm --platform linux/arm64 asdf:amd64
Unable to find image 'asdf:amd64' locally
docker: Error response from daemon: pull access denied for asdf, repository does not exist or may require 'docker login'.
See 'docker run --help'.
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Order the layers in OCI manifest by their actual apply order. This is
required by the OCI image spec.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Since v25.0 (commit ff50388), we validate endpoint settings when
containers are created, instead of doing so when containers are started.
However, a container created prior to that release would still trigger
a validation error at start time. In such a case, the API returns a 500
status code because the Go error isn't wrapped into an InvalidParameter
error. This is now fixed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
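The effect of wrapping a validation error, as described in the endpoint-settings commit above, can be sketched as follows. The `invalidParameter` type and `statusFor` function are simplified stand-ins written for this example, not the daemon's actual error-handling code.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errValidation stands in for a plain Go error returned by validation.
var errValidation = errors.New("invalid endpoint settings")

// invalidParameter marks an error as caused by bad client input
// (hypothetical, simplified marker type).
type invalidParameter struct{ error }

func (invalidParameter) InvalidParameter() {}

func (e invalidParameter) Unwrap() error { return e.error }

// statusFor maps an error to an HTTP status code: errors carrying the
// InvalidParameter marker become 400, everything else falls through to 500.
func statusFor(err error) int {
	var marker interface{ InvalidParameter() }
	if errors.As(err, &marker) {
		return http.StatusBadRequest
	}
	return http.StatusInternalServerError
}

func main() {
	fmt.Println(statusFor(errValidation))                   // unwrapped error
	fmt.Println(statusFor(invalidParameter{errValidation})) // wrapped error
}
```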
This test was added in f301c5765a to test
inspect output for API > v1.21, however, it was pinned to API v1.21,
which is now deprecated.
Remove the fixed version, as the intent was to test "current" API versions
(API v1.21 and up).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was added in f301c5765a to test
inspect output for API > v1.21, however, it was pinned to API v1.21,
which is now deprecated.
Remove the fixed version, as the intent was to test "current" API versions
(API v1.21 and up).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was added in 75f6929b44, but pinned
to the API version that was current at the time (v1.20), which is now
deprecated.
Update the test to use the current API version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tables linked to deprecated API versions, and an up-to-date version of
the matrix is already included at https://docs.docker.com/engine/api/#api-version-matrix
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- add some asserts for unhandled errors
- use consts for fixed values, and slightly re-format Dockerfile content
- inline one-line Dockerfiles
- fix some vars to be properly camel-cased
- improve assert for error-types;
Before:
=== RUN TestBuildPlatformInvalid
build_test.go:685: assertion failed: expression is false: errdefs.IsInvalidParameter(err)
--- FAIL: TestBuildPlatformInvalid (0.01s)
FAIL
After:
=== RUN TestBuildPlatformInvalid
build_test.go:689: assertion failed: error is Error response from daemon: "foobar": unknown operating system or architecture: invalid argument (errdefs.errSystem), not errdefs.IsInvalidParameter
--- FAIL: TestBuildPlatformInvalid (0.01s)
FAIL
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This matcher was only used internally in the containerd implementation of
the image store. Un-export it, and make it a local utility in that package
to prevent external use.
This package was introduced in 1616a09b61
(v24.0), and there are no known external consumers of this package, so there
should be no need to deprecate / alias the old location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When resolving names in swarm mode, services with exposed ports are
connected to the user overlay network, the ingress network, and local (docker_gwbridge)
networks. Name resolution should prioritize returning the VIP/IPs on the user
overlay network over those on ingress and local networks.
Sandbox.ResolveName implemented this by taking the list of endpoints,
splitting the list into 3 separate lists based on the type of network
that the endpoint was attached to (dynamic, ingress, local), and then
creating a new list, applying the networks in that order.
This patch refactors that logic to use a custom sorter (sort.Interface),
which makes the code more transparent, and prevents iterating over the
list of endpoints multiple times.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
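A sketch of the custom-sorter idea from the Sandbox.ResolveName commit above, assuming a simplified endpoint type. The `endpoint`, `netKind` and `byNetworkType` names are hypothetical, and the real code inspects network properties rather than a precomputed kind. A stable sort is used here so that endpoints on the same kind of network keep their relative order, as they would with the three-list approach.

```go
package main

import (
	"fmt"
	"sort"
)

// netKind orders networks by resolution priority: lower values come first.
type netKind int

const (
	kindOverlay netKind = iota // user overlay (dynamic) network
	kindIngress                // ingress network
	kindLocal                  // local network such as docker_gwbridge
)

// endpoint is a simplified stand-in for a sandbox endpoint.
type endpoint struct {
	name string
	kind netKind
}

// byNetworkType implements sort.Interface over a slice of endpoints.
type byNetworkType []endpoint

func (b byNetworkType) Len() int           { return len(b) }
func (b byNetworkType) Less(i, j int) bool { return b[i].kind < b[j].kind }
func (b byNetworkType) Swap(i, j int)      { b[i], b[j] = b[j], b[i] }

// prioritize returns a sorted copy of eps in a single pass over the list,
// leaving the input untouched.
func prioritize(eps []endpoint) []endpoint {
	out := append([]endpoint(nil), eps...)
	sort.Stable(byNetworkType(out))
	return out
}

func main() {
	eps := []endpoint{
		{"gwbridge", kindLocal},
		{"ingress", kindIngress},
		{"app-a", kindOverlay},
		{"app-b", kindOverlay},
	}
	for _, ep := range prioritize(eps) {
		fmt.Println(ep.name)
	}
}
```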
Permit container network attachments to set any static IP address within
the network's IPAM master pool, including when a subpool is configured.
Users have come to depend on being able to statically assign container
IP addresses which are guaranteed not to collide with automatically-
assigned container addresses.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This package was introduced in af59752712
as a utility package for devicemapper, which was removed in commit
dc11d2a2d8 (v25.0.0), and the package
was deprecated in bf692d47fb.
This patch removes the package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This flag was marked deprecated in commit 5a922dc16 (released in v24.0)
and to be removed in the next release.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Some configuration in a container depends on whether it has support for
IPv6 (including default entries for '::1' etc in '/etc/hosts').
Before this change, the container's support for IPv6 was determined by
whether it was connected to any IPv6-enabled networks. But, that can
change over time, it isn't a property of the container itself.
So, instead, detect IPv6 support by looking for '::1' on the container's
loopback interface. It will not be present if the kernel does not have
IPv6 support, or the user has disabled it in new namespaces by other
means.
Once IPv6 support has been determined for the container, its '/etc/hosts'
is re-generated accordingly.
The daemon no longer disables IPv6 on all interfaces during initialisation.
It now disables IPv6 only for interfaces that have not been assigned an
IPv6 address. (But, even if IPv6 is disabled for the container using the
sysctl 'net.ipv6.conf.all.disable_ipv6=1', interfaces connected to IPv6
networks still get IPv6 addresses that appear in the internal DNS. There's
more to-do!)
Signed-off-by: Rob Murray <rob.murray@docker.com>
All components of the path are locked before the check, and
released once the path is already mounted.
This makes it impossible to replace the mounted directory until it's
actually mounted in the container.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All subpath components are opened with openat, relative to the base
volume directory and checked against the volume escape.
The final file descriptor is mounted from the /proc/self/fd/<fd> to a
temporary mount point owned by the daemon and then passed to the
underlying container runtime.
The temporary mount point is removed after the container is started.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`VolumeOptions` now has a `Subpath` field which allows specifying a path
relative to the volume that should be mounted as a destination.
Symlinks are supported, but they cannot escape the base volume
directory.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
We constructed a "function level" logger, which was used once "as-is", but
also added additional Fields in a loop (for each resource), effectively
overwriting the previous one for each iteration. Adding additional
fields can result in some overhead, so let's construct a "logger" only for
inside the loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We have many "image" packages, so these vars easily conflict/shadow
imports. Let's rename them (and in some cases use a const) to
prevent that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For some time, when adding an interface with no IPv6 address (an
interface to a network that does not have IPv6 enabled), we've been
disabling IPv6 on that interface.
As part of a separate change, I'm removing that logic - there's nothing
wrong with having IPv6 enabled on an interface with no routable address.
The difference is that the kernel will assign a link-local address.
TestAddRemoveInterface does this...
- Assign an IPv6 link-local address to one end of a veth interface, and
add it to a namespace.
- Add a bridge with no assigned IPv6 address to the namespace.
- Remove the veth interface from the namespace.
- Put the veth interface back into the namespace, still with an
explicitly assigned IPv6 link local address.
When IPv6 is disabled on the bridge interface, the test passes.
But, when IPv6 is enabled, the bridge gets a kernel assigned link-local
address.
Then, when re-adding the veth interface, the test generates an error in
'osl/interface_linux.go:checkRouteConflict()'. The conflict is between
the explicitly assigned fe80::2 on the veth, and a route for fe80::/64
belonging to the bridge.
So, in preparation for not-disabling IPv6 on these interfaces, use a
unique-local address in the test instead of link-local.
I don't think that changes the intent of the test.
With the change to not-always disable IPv6, it is possible to repro the
problem with a real container, disconnect and re-connect a user-defined
network with '--subnet fe80::/64' while the container's connected to an
IPv4 network. So, strictly speaking, that will be a regression.
But, it's also possible to repro the problem in master, by disconnecting
and re-connecting the fe80::/64 network while another IPv6 network is
connected. So, I don't think it's a problem we need to address, perhaps
other than by prohibiting '--subnet fe80::/64'.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Turn subsequent `Close` calls into a no-op and produce a warning with an
optional stack trace (if debug mode is enabled).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
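The idempotent `Close` behavior described above might be sketched like this. The `resource` type, its `release` callback and the `verbose` flag are hypothetical stand-ins; the real type and its debug-mode check are not shown in the commit message.

```go
package main

import (
	"fmt"
	"log"
	"runtime/debug"
	"sync"
)

// resource is a hypothetical closable object.
type resource struct {
	mu      sync.Mutex
	closed  bool
	verbose bool         // stands in for "debug mode is enabled"
	release func() error // performs the actual cleanup
}

// Close releases the resource once. Later calls are no-ops that log a
// warning, plus a stack trace when verbose is set.
func (r *resource) Close() error {
	r.mu.Lock()
	defer r.mu.Unlock()

	if r.closed {
		log.Println("warning: Close called more than once")
		if r.verbose {
			log.Printf("stack trace:\n%s", debug.Stack())
		}
		return nil
	}
	r.closed = true
	return r.release()
}

func main() {
	releases := 0
	r := &resource{release: func() error { releases++; return nil }}
	_ = r.Close()
	_ = r.Close() // no-op, logs a warning
	fmt.Println("release calls:", releases)
}
```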
This hopefully makes the test less flakey (or removes any flake that
would be caused by the test itself).
1. Adds tail of cluster daemon logs when there is a test failure so we
can more easily see what may be happening
2. Scans the daemon logs to check if the key is rotated before
restarting the daemon. This is a little hacky but a little better
than assuming it is done after a hard-coded 3 seconds.
3. Cleans up the `node ls` check such that it uses a poll function
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2024-01-04 00:18:58 +00:00
1879 changed files with 135861 additions and 53503 deletions
run: echo "::error::PR title suggests targetting the ${{ steps.title_branch.outputs.branch }} branch, but is opened against ${{ github.event.pull_request.base.ref }}" && exit 1
		expectedErr: fmt.Sprintf("invalid API version: the minimum API version (%s) is higher than the default version (%s)", api.DefaultVersion, api.MinSupportedAPIVersion),
	},
	{
		doc: "invalid default too low",
		defaultVersion: "0.1",
		minVersion: api.MinSupportedAPIVersion,
		expectedErr: fmt.Sprintf("invalid default API version (0.1): must be between %s and %s", api.MinSupportedAPIVersion, api.DefaultVersion),
	},
	{
		doc: "invalid default too high",
		defaultVersion: "9999.9999",
		minVersion: api.DefaultVersion,
		expectedErr: fmt.Sprintf("invalid default API version (9999.9999): must be between %s and %s", api.MinSupportedAPIVersion, api.DefaultVersion),
	},
	{
		doc: "invalid minimum too low",
		defaultVersion: api.MinSupportedAPIVersion,
		minVersion: "0.1",
		expectedErr: fmt.Sprintf("invalid minimum API version (0.1): must be between %s and %s", api.MinSupportedAPIVersion, api.DefaultVersion),
	},
	{
		doc: "invalid minimum too high",
		defaultVersion: api.DefaultVersion,
		minVersion: "9999.9999",
		expectedErr: fmt.Sprintf("invalid minimum API version (9999.9999): must be between %s and %s", api.MinSupportedAPIVersion, api.DefaultVersion),
		errString: "client version 0.1 is too old. Minimum supported API version is 1.2.0, please upgrade your client to a newer version",
		errString: fmt.Sprintf("client version 0.1 is too old. Minimum supported API version is %s, please upgrade your client to a newer version", api.MinSupportedAPIVersion),
	},
	{
		reqVersion: "9999.9999",
		errString: "client version 9999.9999 is too new. Maximum supported API version is 1.10.0",
		errString: fmt.Sprintf("client version 9999.9999 is too new. Maximum supported API version is %s", api.DefaultVersion),
return "", errdefs.InvalidParameter(errors.New("the container-wide MAC address should match the endpoint-specific MAC address for the main network or should be left empty"))
// If there's no endpoint config, create a place to store the configured address.
// There is existing endpoint config - if it's not indexed by NetworkMode.Name(), we
// can't tell which network the container-wide settings was intended for. NetworkMode,
// the keys in EndpointsConfig and the NetworkID in EndpointsConfig may mix network
// name/id/short-id. It's not safe to create EndpointsConfig under the NetworkMode
// name to store the container-wide MAC address, because that may result in two sets
// of EndpointsConfig for the same network and one set will be discarded later. So,
// reject the request ...
ep, ok := networkingConfig.EndpointsConfig[nwName]
if !ok {
	return "", errdefs.InvalidParameter(errors.New("if a container-wide MAC address is supplied, HostConfig.NetworkMode must match the identity of a network in NetworkSettings.Networks"))
}
// ep is the endpoint that needs the container-wide MAC address; migrate the address
// to it, or bail out if there's a mismatch.
if ep.MacAddress == "" {
	ep.MacAddress = deprecatedMacAddress
} else if ep.MacAddress != deprecatedMacAddress {
	return "", errdefs.InvalidParameter(errors.New("the container-wide MAC address must match the endpoint-specific MAC address for the main network, or be left empty"))
}
}
}
warning = "The container-wide MacAddress field is now deprecated. It should be specified in EndpointsConfig instead."
stream := grpc.StreamInterceptor(grpc_middleware.ChainStreamServer(otelgrpc.StreamServerInterceptor(), grpcerrors.StreamServerInterceptor)) //nolint:staticcheck // TODO(thaJeztah): ignore SA1019 for deprecated options: see https://github.com/moby/moby/issues/47437
withTrace := otelgrpc.UnaryServerInterceptor() //nolint:staticcheck // TODO(thaJeztah): ignore SA1019 for deprecated options: see https://github.com/moby/moby/issues/47437
This package includes types for legacy API versions. The stable version of the API types lives in `api/types/*.go`.
Consider moving a type here when you need to keep backwards compatibility in the API. These legacy types are organized by the latest API version they appear in. For instance, types in the `v1p19` package are valid for API versions below or equal to `1.19`. Types in the `v1p20` package are valid for the API version `1.20`, since the versions below that will use the legacy types in `v1p19`.
## Package name conventions
The package name convention is to use `v` as a prefix for the version number and `p` (patch) as a separator. We use this nomenclature due to a few restrictions in the Go package name convention:
1. We cannot use `.` because it's interpreted by the language, think of `v1.20.CallFunction`.
2. We cannot use `_` because golint complains about it. The code is actually valid, but it probably looks weirder: `v1_20.CallFunction`.
For instance, if you want to modify a type that was available in the version `1.21` of the API but it will have different fields in the version `1.22`, you want to create a new package under `api/types/versions/v1p21`.
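
As a small illustration of the naming convention, the sketch below derives a package name from an API version string. The `packageName` helper is hypothetical and exists only for this example; it is not part of the `api/types/versions` package.

```go
package main

import (
	"fmt"
	"strings"
)

// packageName converts an API version such as "1.21" into the package name
// used by the convention above: a "v" prefix, with "p" replacing each ".".
func packageName(version string) string {
	return "v" + strings.ReplaceAll(version, ".", "p")
}

func main() {
	fmt.Println(packageName("1.21")) // prints "v1p21"
}
```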