Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 9f6c532f59
futex: Add sys_futex_wake()
To complement sys_futex_waitv() add sys_futex_wake(). This syscall
implements what was previously known as FUTEX_WAKE_BITSET except it
uses 'unsigned long' for the bitmask and takes FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d69729e053)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cb8c4312af
futex: Add sys_futex_wait()
To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
syscall implements what was previously known as FUTEX_WAIT_BITSET
except it uses 'unsigned long' for the value and bitmask arguments,
takes timespec and clockid_t arguments for the absolute timeout and
uses FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 10d344d176)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 0f4b5f9722
futex: Add sys_futex_requeue()
Finish off the 'simple' futex2 syscall group by adding
sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are
too numerous to fit into a regular syscall. As such, use struct
futex_waitv to pass the 'source' and 'destination' futexes to the
syscall.
This syscall implements what was previously known as FUTEX_CMP_REQUEUE
and uses {val, uaddr, flags} for source and {uaddr, flags} for
destination.
This design explicitly allows requeueing between different types of
futex by having a different flags word per uaddr.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit df57a080b6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: c35559f94e
x86/shstk: Introduce map_shadow_stack syscall
When operating with shadow stacks enabled, the kernel will automatically
allocate shadow stacks for new threads, however in some cases userspace
will need additional shadow stacks. The main example of this is the
ucontext family of functions, which require userspace allocating and
pivoting to userspace managed stacks.
Unlike most other user memory permissions, shadow stacks need to be
provisioned with special data in order to be useful. They need to be setup
with a restore token so that userspace can pivot to them via the RSTORSSP
instruction. But, the security design of shadow stacks is that they
should not be written to except in limited circumstances. This presents a
problem for userspace, as to how userspace can provision this special
data, without allowing for the shadow stack to be generally writable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8826f402f9)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 09da082b07
fs: Add fchmodat2()
On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].
There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.
The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
unlike existing fchmodat.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6f242f1a28)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cf264e1329
NAME
cachestat - query the page cache statistics of a file.
SYNOPSIS
#include <sys/mman.h>
struct cachestat_range {
__u64 off;
__u64 len;
};
struct cachestat {
__u64 nr_cache;
__u64 nr_dirty;
__u64 nr_writeback;
__u64 nr_evicted;
__u64 nr_recently_evicted;
};
int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
struct cachestat *cstat, unsigned int flags);
DESCRIPTION
cachestat() queries the number of cached pages, number of dirty
pages, number of pages marked for writeback, number of evicted
pages, number of recently evicted pages, in the bytes range given by
`off` and `len`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4d0d5ee10d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syscall is gated by CAP_SYS_NICE, matching the profile in containerd.
containerd: a6e52c74fa
libseccomp: d83cb7ac25
kernel: c6018b4b25
mm/mempolicy: add set_mempolicy_home_node syscall
This syscall can be used to set a home node for the MPOL_BIND and
MPOL_PREFERRED_MANY memory policy. Users should use this syscall after
setting up a memory policy for the specified range as shown below.
mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
new_nodes->size + 1, 0);
sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
home_node, 0);
The syscall allows specifying a home node/preferred node from which
kernel will fulfill memory allocation requests first.
...
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1251982cf7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Only print the tag when the received reference has a tag, if
we can't cast the received tag to a `reference.Tagged` then
skip printing the tag as it's likely a digest.
Fixes panic when trying to install a plugin from a reference
with a digest such as
`vieux/sshfs@sha256:1d3c3e42c12138da5ef7873b97f7f32cf99fb6edde75fa4f0bcf9ed277855811`
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Adds a test case for installing a plugin from a remote in the form
of `plugin-content-trust@sha256:d98f2f8061...`, which is currently
causing the daemon to panic, as we found while running the CLI e2e
tests:
```
docker plugin install registry:5000/plugin-content-trust@sha256:d98f2f806144bf4ba62d4ecaf78fec2f2fe350df5a001f6e3b491c393326aedb
```
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The monitorDaemon() goroutine calls startContainerd() then blocks on
<-daemonWaitCh to wait for it to exit. The startContainerd() function
would (re)initialize the daemonWaitCh so a restarted containerd could be
waited on. This implementation was race-free because startContainerd()
would synchronously initialize the daemonWaitCh before returning. When
the call to start the managed containerd process was moved into the
waiter goroutine, the code to initialize the daemonWaitCh struct field
was also moved into the goroutine. This introduced a race condition.
Move the daemonWaitCh initialization to guarantee that it happens before
the startContainerd() call returns.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit dd20bf4862)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The double quotes inside a single quoted string don't need to be
escaped.
Looks like different Powershell versions are treating this differently
and it started failing unexpectedly without any changes on our side.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit ecb217cf69)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Previously this was done indirectly - the `compare` function didn't
check the `ArgsEscaped`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 96d461d27e)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make sure the cache candidate platform matches the requested.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 877ebbe038)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Restrict cache candidates only to images that were built locally.
This doesn't affect builds using `--cache-from`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 96ac22768a)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Store additional image property which makes it possible to distinguish
if image was built locally.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit c6156dc51b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Add checks for some image config fields that were missing.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 537348763f)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Turn subsequent `Close` calls into a no-op and produce a warning with an
optional stack trace (if debug mode is enabled).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 585d74bad1)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is a follow-up to 2cf230951f, adding
more directives to adjust for some new code added since:
Before this patch:
make -C ./internal/gocompat/
GO111MODULE=off go generate .
GO111MODULE=on go mod tidy
GO111MODULE=on go test -v
# github.com/docker/docker/internal/sliceutil
internal/sliceutil/sliceutil.go:3:12: type parameter requires go1.18 or later (-lang was set to go1.16; check go.mod)
internal/sliceutil/sliceutil.go:3:14: predeclared comparable requires go1.18 or later (-lang was set to go1.16; check go.mod)
internal/sliceutil/sliceutil.go:4:19: invalid map key type T (missing comparable constraint)
# github.com/docker/docker/libnetwork
libnetwork/endpoint.go:252:17: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
# github.com/docker/docker/daemon
daemon/container_operations.go:682:9: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
daemon/inspect.go:42:18: implicit function instantiation requires go1.18 or later (-lang was set to go1.16; check go.mod)
With this patch:
make -C ./internal/gocompat/
GO111MODULE=off go generate .
GO111MODULE=on go mod tidy
GO111MODULE=on go test -v
=== RUN TestModuleCompatibllity
main_test.go:321: all packages have the correct go version specified through //go:build
--- PASS: TestModuleCompatibllity (0.00s)
PASS
ok gocompat 0.031s
make: Leaving directory '/go/src/github.com/docker/docker/internal/gocompat'
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit bd4ff31775)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This repository is not yet a module (i.e., does not have a `go.mod`). This
is not problematic when building the code in GOPATH or "vendor" mode, but
when using the code as a module-dependency (in module-mode), different semantics
are applied since Go1.21, which switches Go _language versions_ on a per-module,
per-package, or even per-file base.
A condensed summary of that logic [is as follows][1]:
- For modules that have a go.mod containing a go version directive; that
version is considered a minimum _required_ version (starting with the
go1.19.13 and go1.20.8 patch releases: before those, it was only a
recommendation).
- For dependencies that don't have a go.mod (not a module), go language
version go1.16 is assumed.
- Likewise, for modules that have a go.mod, but the file does not have a
go version directive, go language version go1.16 is assumed.
- If a go.work file is present, but does not have a go version directive,
language version go1.17 is assumed.
When switching language versions, Go _downgrades_ the language version,
which means that language features (such as generics, and `any`) are not
available, and compilation fails. For example:
# github.com/docker/cli/cli/context/store
/go/pkg/mod/github.com/docker/cli@v25.0.0-beta.2+incompatible/cli/context/store/storeconfig.go:6:24: predeclared any requires go1.18 or later (-lang was set to go1.16; check go.mod)
/go/pkg/mod/github.com/docker/cli@v25.0.0-beta.2+incompatible/cli/context/store/store.go:74:12: predeclared any requires go1.18 or later (-lang was set to go1.16; check go.mod)
Note that these fallbacks are per-module, per-package, and can even be
per-file, so _(indirect) dependencies_ can still use modern language
features, as long as their respective go.mod has a version specified.
Unfortunately, these failures do not occur when building locally (using
vendor / GOPATH mode), but will affect consumers of the module.
Obviously, this situation is not ideal, and the ultimate solution is to
move to go modules (add a go.mod), but this comes with a non-insignificant
risk in other areas (due to our complex dependency tree).
We can revert to using go1.16 language features only, but this may be
limiting, and may still be problematic when (e.g.) matching signatures
of dependencies.
There is an escape hatch: adding a `//go:build` directive to files that
make use of go language features. From the [go toolchain docs][2]:
> The go line for each module sets the language version the compiler enforces
> when compiling packages in that module. The language version can be changed
> on a per-file basis by using a build constraint.
>
> For example, a module containing code that uses the Go 1.21 language version
> should have a `go.mod` file with a go line such as `go 1.21` or `go 1.21.3`.
> If a specific source file should be compiled only when using a newer Go
> toolchain, adding `//go:build go1.22` to that source file both ensures that
> only Go 1.22 and newer toolchains will compile the file and also changes
> the language version in that file to Go 1.22.
This patch adds `//go:build` directives to those files using recent additions
to the language. It's currently using go1.19 as version to match the version
in our "vendor.mod", but we can consider being more permissive ("any" requires
go1.18 or up), or more "optimistic" (force go1.21, which is the version we
currently use to build).
For completeness sake, note that any file _without_ a `//go:build` directive
will continue to use go1.16 language version when used as a module.
[1]: 58c28ba286/src/cmd/go/internal/gover/version.go (L9-L56)
[2]: https://go.dev/doc/toolchain
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 2cf230951f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is the eleventh patch release in the 1.1.z release branch of runc.
It primarily fixes a few issues with runc's handling of containers that
are configured to join existing user namespaces, as well as improvements
to cgroupv2 support.
- Fix several issues with userns path handling.
- Support memory.peak and memory.swap.peak in cgroups v2.
Add swapOnlyUsage in MemoryStats. This field reports swap-only usage.
For cgroupv1, Usage and Failcnt are set by subtracting memory usage
from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage
are set.
- build(deps): bump github.com/cyphar/filepath-securejoin.
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.11
- full diff: https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit fc8fcf85a2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When running a `docker cp` to copy files to/from a container, the
lookup of the `getent` executable happens within the container's
filesystem, so we cannot re-use the results.
Unfortunately, that also means we can't preserve the results for
any other uses of these functions, but probably the lookup should not
be "too" costly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit b5376c7cec)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This hopefully makes the test less flakey (or removes any flake that
would be caused by the test itself).
1. Adds tail of cluster daemon logs when there is a test failure so we
can more easily see what may be happening
2. Scans the daemon logs to check if the key is rotated before
restarting the daemon. This is a little hacky but a little better
than assuming it is done after a hard-coded 3 seconds.
3. Cleans up the `node ls` check such that it uses a poll function
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit fbdc02534a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
All underlying jobs inherit from the status of all parent jobs
in the tree, not just the very parent. We need to apply the same
kind of special condition.
Signed-off-by: CrazyMax <1951866+crazy-max@users.noreply.github.com>
(cherry picked from commit 0252a6f475)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Starting with [6e0ed3d19c54603f0f7d628ea04b550151d8a262], the minimum
allowed size is now 300MB. Given that this is a sparse image, and
the size of the image is irrelevant to the test (we check for
limits defined through project-quotas, not the size of the
device itself), we can raise the size of this image.
[6e0ed3d19c54603f0f7d628ea04b550151d8a262]: https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/commit/?id=6e0ed3d19c54603f0f7d628ea04b550151d8a262
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 9709b7e458)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In some cases, when the daemon launched by a test panics and quits, the
cleanup code would end with an error when trying to kill it by its pid.
In those cases the whole suite will end up waiting for the daemon that
we start in .integration-daemon-start to finish and we end up waiting 2
hours for the CI to cancel after a timeout.
Using process substitution makes the integration tests quit.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
(cherry picked from commit 3d8b8dc09a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test considered `Foo/bar` to be an invalid name, with the assumption
that it was `[docker.io]/Foo/bar`. However, this was incorrect, and the
test passed because the reference parsing had a bug; if the first element
(`Foo`) is not lowercase (so not a valid namespace / "path element"), then
it *should* be considered a domain (as uppercase domain names are valid).
The reference parser did not account for this, and running the test with
a version of the parser with a fix caused the test to fail:
=== Failed
=== FAIL: client TestImageTagInvalidSourceImageName/invalidRepo/FOO/bar (0.00s)
image_tag_test.go:54: assertion failed: expected error to contain "not a valid repository/tag", got "Error response from daemon: client should not have made an API call"
Error response from daemon: client should not have made an API call
=== FAIL: client TestImageTagInvalidSourceImageName (0.00s)
This patch removes the faulty test-case.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit c243efb0cd)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was testing the client-side validation, so might as well
move it there, and validate that the client invalidates before
trying to make an API call.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 3d3ce9812f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 71da8c13e1)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We no longer have any arm (not 64) CI.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
(cherry picked from commit 159b168eea)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: d3e6c1360f...435cb77e36
The 0.11 branch of buildkit defaults to go1.19 (EOL), and
Alpine 3.17 (EOL).
We already set GO_VERSION to override the go version to
use go1.20, but the Dockerfile also has a ALPINE_VERSION
build-arg, so let's override that as well to prevent the
build from failing:
Dockerfile:39
--------------------
37 |
38 | # go base image
39 | >>> FROM --platform=$BUILDPLATFORM golang:${GO_VERSION}-alpine${ALPINE_VERSION} AS golatest
40 |
41 | # git stage is used for checking out remote repository sources
--------------------
ERROR: failed to solve: golang:1.20.13-alpine3.17: docker.io/library/golang:1.20.13-alpine3.17: not found
Error: Process completed with exit code 1.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>