The platform comparison was backported from the branch that vendors
containerd 1.7.
In this branch the vendored containerd version is older and doesn't have
the same comparison logic for Windows specific OSVersion.
Require both major and minor components of Windows OSVersion to match.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit b3888ed899)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 9f6c532f59
futex: Add sys_futex_wake()
To complement sys_futex_waitv() add sys_futex_wake(). This syscall
implements what was previously known as FUTEX_WAKE_BITSET except it
uses 'unsigned long' for the bitmask and takes FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d69729e053)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cb8c4312af
futex: Add sys_futex_wait()
To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
syscall implements what was previously known as FUTEX_WAIT_BITSET
except it uses 'unsigned long' for the value and bitmask arguments,
takes timespec and clockid_t arguments for the absolute timeout and
uses FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 10d344d176)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 0f4b5f9722
futex: Add sys_futex_requeue()
Finish off the 'simple' futex2 syscall group by adding
sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are
too numerous to fit into a regular syscall. As such, use struct
futex_waitv to pass the 'source' and 'destination' futexes to the
syscall.
This syscall implements what was previously known as FUTEX_CMP_REQUEUE
and uses {val, uaddr, flags} for source and {uaddr, flags} for
destination.
This design explicitly allows requeueing between different types of
futex by having a different flags word per uaddr.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit df57a080b6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: c35559f94e
x86/shstk: Introduce map_shadow_stack syscall
When operating with shadow stacks enabled, the kernel will automatically
allocate shadow stacks for new threads, however in some cases userspace
will need additional shadow stacks. The main example of this is the
ucontext family of functions, which require userspace allocating and
pivoting to userspace managed stacks.
Unlike most other user memory permissions, shadow stacks need to be
provisioned with special data in order to be useful. They need to be setup
with a restore token so that userspace can pivot to them via the RSTORSSP
instruction. But, the security design of shadow stacks is that they
should not be written to except in limited circumstances. This presents a
problem for userspace, as to how userspace can provision this special
data, without allowing for the shadow stack to be generally writable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8826f402f9)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 09da082b07
fs: Add fchmodat2()
On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].
There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.
The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
unlike existing fchmodat.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6f242f1a28)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cf264e1329
NAME
cachestat - query the page cache statistics of a file.
SYNOPSIS
#include <sys/mman.h>
struct cachestat_range {
__u64 off;
__u64 len;
};
struct cachestat {
__u64 nr_cache;
__u64 nr_dirty;
__u64 nr_writeback;
__u64 nr_evicted;
__u64 nr_recently_evicted;
};
int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
struct cachestat *cstat, unsigned int flags);
DESCRIPTION
cachestat() queries the number of cached pages, number of dirty
pages, number of pages marked for writeback, number of evicted
pages, number of recently evicted pages, in the bytes range given by
`off` and `len`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4d0d5ee10d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syscall is gated by CAP_SYS_NICE, matching the profile in containerd.
containerd: a6e52c74fa
libseccomp: d83cb7ac25
kernel: c6018b4b25
mm/mempolicy: add set_mempolicy_home_node syscall
This syscall can be used to set a home node for the MPOL_BIND and
MPOL_PREFERRED_MANY memory policy. Users should use this syscall after
setting up a memory policy for the specified range as shown below.
mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
new_nodes->size + 1, 0);
sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
home_node, 0);
The syscall allows specifying a home node/preferred node from which
kernel will fulfill memory allocation requests first.
...
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1251982cf7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 2c01d53d96)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The compatibility depends on whether `hyperv` or `process` container
isolation is used.
This fixes cache not being used when building images based on older
Windows versions on a newer Windows host.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 91ea04089b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Only print the tag when the received reference has a tag, if
we can't cast the received tag to a `reference.Tagged` then
skip printing the tag as it's likely a digest.
Fixes panic when trying to install a plugin from a reference
with a digest such as
`vieux/sshfs@sha256:1d3c3e42c12138da5ef7873b97f7f32cf99fb6edde75fa4f0bcf9ed277855811`
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Adds a test case for installing a plugin from a remote in the form
of `plugin-content-trust@sha256:d98f2f8061...`, which is currently
causing the daemon to panic, as we found while running the CLI e2e
tests:
```
docker plugin install registry:5000/plugin-content-trust@sha256:d98f2f806144bf4ba62d4ecaf78fec2f2fe350df5a001f6e3b491c393326aedb
```
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update the containerd binary that's used in CI to align with the version used
for Linux. This was missed in d6abda0710.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The double quotes inside a single quoted string don't need to be
escaped.
Looks like different Powershell versions are treating this differently
and it started failing unexpectedly without any changes on our side.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit ecb217cf69)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Previously this was done indirectly - the `compare` function didn't
check the `ArgsEscaped`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 96d461d27e)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 44e6f3da60)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make sure the cache candidate platform matches the requested.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 877ebbe038)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 8a19bb7193)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Restrict cache candidates only to images that were built locally.
This doesn't affect builds using `--cache-from`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 96ac22768a)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 17af50f46b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Store additional image property which makes it possible to distinguish
if image was built locally.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit c6156dc51b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit ffb63c0bae)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Add checks for some image config fields that were missing.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 537348763f)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 593b754d8f)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Turn subsequent `Close` calls into a no-op and produce a warning with an
optional stack trace (if debug mode is enabled).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 585d74bad1)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
go1.20.12 (released 2023-12-05) includes security fixes to the go command,
and the net/http and path/filepath packages, as well as bug fixes to the
compiler and the go command. See the Go 1.20.12 milestone on our issue
tracker for details.
- https://github.com/golang/go/issues?q=milestone%3AGo1.20.12+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.20.11...go1.20.12
from the security mailing:
[security] Go 1.21.5 and Go 1.20.12 are released
Hello gophers,
We have just released Go versions 1.21.5 and 1.20.12, minor point releases.
These minor releases include 3 security fixes following the security policy:
- net/http: limit chunked data overhead
A malicious HTTP sender can use chunk extensions to cause a receiver
reading from a request or response body to read many more bytes from
the network than are in the body.
A malicious HTTP client can further exploit this to cause a server to
automatically read a large amount of data (up to about 1GiB) when a
handler fails to read the entire body of a request.
Chunk extensions are a little-used HTTP feature which permit including
additional metadata in a request or response body sent using the chunked
encoding. The net/http chunked encoding reader discards this metadata.
A sender can exploit this by inserting a large metadata segment with
each byte transferred. The chunk reader now produces an error if the
ratio of real body to encoded bytes grows too small.
Thanks to Bartek Nowotarski for reporting this issue.
This is CVE-2023-39326 and Go issue https://go.dev/issue/64433.
- cmd/go: go get may unexpectedly fallback to insecure git
Using go get to fetch a module with the ".git" suffix may unexpectedly
fallback to the insecure "git://" protocol if the module is unavailable
via the secure "https://" and "git+ssh://" protocols, even if GOINSECURE
is not set for said module. This only affects users who are not using
the module proxy and are fetching modules directly (i.e. GOPROXY=off).
Thanks to David Leadbeater for reporting this issue.
This is CVE-2023-45285 and Go issue https://go.dev/issue/63845.
- path/filepath: retain trailing \ when cleaning paths like \\?\c:\
Go 1.20.11 and Go 1.21.4 inadvertently changed the definition of the
volume name in Windows paths starting with \\?\, resulting in
filepath.Clean(\\?\c:\) returning \\?\c: rather than \\?\c:\ (among
other effects). The previous behavior has been restored.
This is an update to CVE-2023-45283 and Go issue https://go.dev/issue/64028.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.20.11 (released 2023-11-07) includes security fixes to the path/filepath
package, as well as bug fixes to the linker and the net/http package. See the
Go 1.20.11 milestone on our issue tracker for details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.20.11+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.20.10...go1.20.11
from the security mailing:
[security] Go 1.21.4 and Go 1.20.11 are released
Hello gophers,
We have just released Go versions 1.21.4 and 1.20.11, minor point releases.
These minor releases include 2 security fixes following the security policy:
- path/filepath: recognize `\??\` as a Root Local Device path prefix.
On Windows, a path beginning with `\??\` is a Root Local Device path equivalent
to a path beginning with `\\?\`. Paths with a `\??\` prefix may be used to
access arbitrary locations on the system. For example, the path `\??\c:\x`
is equivalent to the more common path c:\x.
The filepath package did not recognize paths with a `\??\` prefix as special.
Clean could convert a rooted path such as `\a\..\??\b` into
the root local device path `\??\b`. It will now convert this
path into `.\??\b`.
`IsAbs` did not report paths beginning with `\??\` as absolute.
It now does so.
VolumeName now reports the `\??\` prefix as a volume name.
`Join(`\`, `??`, `b`)` could convert a seemingly innocent
sequence of path elements into the root local device path
`\??\b`. It will now convert this to `\.\??\b`.
This is CVE-2023-45283 and https://go.dev/issue/63713.
- path/filepath: recognize device names with trailing spaces and superscripts
The `IsLocal` function did not correctly detect reserved names in some cases:
- reserved names followed by spaces, such as "COM1 ".
- "COM" or "LPT" followed by a superscript 1, 2, or 3.
`IsLocal` now correctly reports these names as non-local.
This is CVE-2023-45284 and https://go.dev/issue/63713.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Attempting to delete the directory while another goroutine is
concurrently executing a CheckpointTo() can fail on Windows due to file
locking. As all callers of CheckpointTo() are required to hold the
container lock, holding the lock while deleting the directory ensures
that there will be no interference.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 18e322bc7c)
Signed-off-by: Cory Snider <csnider@mirantis.com>
When reading logs, timestamps should always be presented in UTC. Unlike
the "json-file" and other logging drivers, the "local" logging driver
was using local time.
Thanks to Roman Valov for reporting this issue, and locating the bug.
Before this change:
echo $TZ
Europe/Amsterdam
docker run -d --log-driver=local nginx:alpine
fc166c6b2c35c871a13247dddd95de94f5796459e2130553eee91cac82766af3
docker logs --timestamps fc166c6b2c35c871a13247dddd95de94f5796459e2130553eee91cac82766af3
2023-12-08T18:16:56.291023422+01:00 /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2023-12-08T18:16:56.291056463+01:00 /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
2023-12-08T18:16:56.291890130+01:00 /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
...
With this patch:
echo $TZ
Europe/Amsterdam
docker run -d --log-driver=local nginx:alpine
14e780cce4c827ce7861d7bc3ccf28b21f6e460b9bfde5cd39effaa73a42b4d5
docker logs --timestamps 14e780cce4c827ce7861d7bc3ccf28b21f6e460b9bfde5cd39effaa73a42b4d5
2023-12-08T17:18:46.635967625Z /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2023-12-08T17:18:46.635989792Z /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
2023-12-08T17:18:46.636897417Z /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
...
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit afe281964d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is the eleventh patch release in the 1.1.z release branch of runc.
It primarily fixes a few issues with runc's handling of containers that
are configured to join existing user namespaces, as well as improvements
to cgroupv2 support.
- Fix several issues with userns path handling.
- Support memory.peak and memory.swap.peak in cgroups v2.
Add swapOnlyUsage in MemoryStats. This field reports swap-only usage.
For cgroupv1, Usage and Failcnt are set by subtracting memory usage
from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage
are set.
- build(deps): bump github.com/cyphar/filepath-securejoin.
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.11
- full diff: https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 5fa4cfcabf)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>