Commit graph

45209 commits

Author SHA1 Message Date
Paweł Gronowski
ab34280ac3
Merge pull request #47699 from vvoland/v23.0-47658
[23.0 backport] Fix cases where we are wrapping a nil error
2024-04-09 13:55:11 +02:00
Brian Goff
8f67ed81aa
Fix cases where we are wrapping a nil error
This was using `errors.Wrap` when there was no error to wrap, meanwhile
we are supposed to be creating a new error.

Found this while investigating some log corruption issues and
unexpectedly getting a nil reader and a nil error from `getTailReader`.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 0a48d26fbc)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-04-09 10:21:36 +02:00
Bjorn Neergaard
bed0abf9ca
Merge pull request #47593 from corhere/backport-23.0/libnet-resolver-nxdomain
[23.0 backport] libnet: Don't forward to upstream resolvers on internal nw
2024-03-19 12:18:12 -06:00
Albin Kerouanton
f4657eae7d libnet: Don't forward to upstream resolvers on internal nw
This commit makes sure the embedded resolver doesn't try to forward to
upstream servers when a container is only attached to an internal
network.

Co-authored-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
(cherry picked from commit 790c3039d0)
Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-03-19 12:50:35 -04:00
Albin Kerouanton
a379e026c9 integration: Add a new networking integration test suite
This commit introduces a new integration test suite aimed at testing
networking features like inter-container communication, network
isolation, port mapping, etc... and how they interact with daemon-level
and network-level parameters.

So far, there's pretty much no tests making sure our networks are well
configured: 1. there're a few tests for port mapping, but they don't
cover all use cases ; 2. there're a few tests that check if a specific
iptables rule exist, but that doesn't prevent that specific iptables
rule to be wrong in the first place.

As we're planning to refactor how iptables rules are written, and change
some of them to fix known security issues, we need a way to test all
combinations of parameters. So far, this was done by hand, which is
particularly painful and time consuming. As such, this new test suite is
foundational to upcoming work.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit 409ea700c7)
Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-03-19 12:50:35 -04:00
Albin Kerouanton
673020119f integration: Add RunAttach helper
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit 5bd8aa5246)
Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-03-18 16:38:35 -04:00
Sebastiaan van Stijn
1725cc5f92
Merge pull request #47535 from vvoland/v23.0-47530
[23.0 backport] volume: Don't decrement refcount below 0
2024-03-11 15:58:46 +01:00
Paweł Gronowski
d1aa20efb7
volume: Don't decrement refcount below 0
With both rootless and live restore enabled, there's some race condition
which causes the container to be `Unmount`ed before the refcount is
restored.

This makes sure we don't underflow the refcount (uint64) when
decrementing it.

The root cause of this race condition still needs to be investigated and
fixed, but at least this unflakies the `TestLiveRestore`.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 294fc9762e)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-03-08 12:56:41 +01:00
Sebastiaan van Stijn
e2ccfc19cc
Merge pull request #47529 from vvoland/v23.0-47523
[23.0 backport] builder-next: fix missing lock in ensurelayer
2024-03-07 13:24:55 +01:00
Tonis Tiigi
e35f6fbd08
builder-next: fix missing lock in ensurelayer
When this was called concurrently from the moby image
exporter there could be a data race where a layer was
written to the refs map when it was already there.

In that case the reference count got mixed up and on
release only one of these layers was actually released.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(cherry picked from commit 37545cc644)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-03-07 12:33:26 +01:00
Sebastiaan van Stijn
64b5835e95
Merge pull request #47515 from vvoland/v23.0-47498
[23.0 backport] daemon: overlay2: remove world writable permission from the lower file
2024-03-06 15:19:26 +01:00
Jaroslav Jindrak
f2954d7622
daemon: overlay2: remove world writable permission from the lower file
In de2447c, the creation of the 'lower' file was changed from using
os.Create to using ioutils.AtomicWriteFile, which ignores the system's
umask. This means that even though the requested permission in the
source code was always 0666, it was 0644 on systems with default
umask of 0022 prior to de2447c, so the move to AtomicFile potentially
increased the file's permissions.

This is not a security issue because the parent directory does not
allow writes into the file, but it can confuse security scanners on
Linux-based systems into giving false positives.

Signed-off-by: Jaroslav Jindrak <dzejrou@gmail.com>
(cherry picked from commit cadb124ab6)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-03-06 13:14:18 +01:00
Sebastiaan van Stijn
548f37a132
Merge pull request #47339 from vvoland/cache-fix-older-windows-23
[23.0 backport] image/cache: Ignore Build and Revision on Windows
2024-02-16 17:00:24 +01:00
Akihiro Suda
5cc674870c
Merge pull request #47346 from thaJeztah/23.0_backport_seccomp_updates
[23.0 backport] profiles/seccomp: add syscalls for kernel v5.17 - v6.6, match containerd's profile
2024-02-07 19:47:54 +09:00
Paweł Gronowski
d10756f356
image/cache: Require Major and Minor match for Windows OSVersion
The platform comparison was backported from the branch that vendors
containerd 1.7.

In this branch the vendored containerd version is older and doesn't have
the same comparison logic for Windows specific OSVersion.

Require both major and minor components of Windows OSVersion to match.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit b3888ed899)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-02-06 17:52:16 +01:00
Sebastiaan van Stijn
20b867b732
seccomp: add futex_wake syscall (kernel v6.7, libseccomp v2.5.5)
Add this syscall to match the profile in containerd

containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 9f6c532f59

    futex: Add sys_futex_wake()

    To complement sys_futex_waitv() add sys_futex_wake(). This syscall
    implements what was previously known as FUTEX_WAKE_BITSET except it
    uses 'unsigned long' for the bitmask and takes FUTEX2 flags.

    The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d69729e053)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-06 15:29:53 +01:00
Sebastiaan van Stijn
fe0619e49f
seccomp: add futex_wait syscall (kernel v6.7, libseccomp v2.5.5)
Add this syscall to match the profile in containerd

containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cb8c4312af

    futex: Add sys_futex_wait()

    To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
    syscall implements what was previously known as FUTEX_WAIT_BITSET
    except it uses 'unsigned long' for the value and bitmask arguments,
    takes timespec and clockid_t arguments for the absolute timeout and
    uses FUTEX2 flags.

    The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 10d344d176)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-06 15:29:53 +01:00
Sebastiaan van Stijn
68e7b988b1
seccomp: add futex_requeue syscall (kernel v6.7, libseccomp v2.5.5)
Add this syscall to match the profile in containerd

containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 0f4b5f9722

    futex: Add sys_futex_requeue()

    Finish off the 'simple' futex2 syscall group by adding
    sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are
    too numerous to fit into a regular syscall. As such, use struct
    futex_waitv to pass the 'source' and 'destination' futexes to the
    syscall.

    This syscall implements what was previously known as FUTEX_CMP_REQUEUE
    and uses {val, uaddr, flags} for source and {uaddr, flags} for
    destination.

    This design explicitly allows requeueing between different types of
    futex by having a different flags word per uaddr.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit df57a080b6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-06 15:29:53 +01:00
Sebastiaan van Stijn
26d766450c
seccomp: add map_shadow_stack syscall (kernel v6.6, libseccomp v2.5.5)
Add this syscall to match the profile in containerd

containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: c35559f94e

    x86/shstk: Introduce map_shadow_stack syscall

    When operating with shadow stacks enabled, the kernel will automatically
    allocate shadow stacks for new threads, however in some cases userspace
    will need additional shadow stacks. The main example of this is the
    ucontext family of functions, which require userspace allocating and
    pivoting to userspace managed stacks.

    Unlike most other user memory permissions, shadow stacks need to be
    provisioned with special data in order to be useful. They need to be setup
    with a restore token so that userspace can pivot to them via the RSTORSSP
    instruction. But, the security design of shadow stacks is that they
    should not be written to except in limited circumstances. This presents a
    problem for userspace, as to how userspace can provision this special
    data, without allowing for the shadow stack to be generally writable.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8826f402f9)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-06 15:29:53 +01:00
Sebastiaan van Stijn
c98179d3c7
seccomp: add fchmodat2 syscall (kernel v6.6, libseccomp v2.5.5)
Add this syscall to match the profile in containerd

containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 09da082b07

    fs: Add fchmodat2()

    On the userspace side fchmodat(3) is implemented as a wrapper
    function which implements the POSIX-specified interface. This
    interface differs from the underlying kernel system call, which does not
    have a flags argument. Most implementations require procfs [1][2].

    There doesn't appear to be a good userspace workaround for this issue
    but the implementation in the kernel is pretty straight-forward.

    The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
    unlike existing fchmodat.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6f242f1a28)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-06 15:29:52 +01:00
Sebastiaan van Stijn
573ebdba6e
seccomp: add cachestat syscall (kernel v6.5, libseccomp v2.5.5)
Add this syscall to match the profile in containerd

containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cf264e1329

    NAME
        cachestat - query the page cache statistics of a file.

    SYNOPSIS
        #include <sys/mman.h>

        struct cachestat_range {
            __u64 off;
            __u64 len;
        };

        struct cachestat {
            __u64 nr_cache;
            __u64 nr_dirty;
            __u64 nr_writeback;
            __u64 nr_evicted;
            __u64 nr_recently_evicted;
        };

        int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
            struct cachestat *cstat, unsigned int flags);

    DESCRIPTION
        cachestat() queries the number of cached pages, number of dirty
        pages, number of pages marked for writeback, number of evicted
        pages, number of recently evicted pages, in the bytes range given by
        `off` and `len`.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4d0d5ee10d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-06 15:29:52 +01:00
Sebastiaan van Stijn
b9f01eea45
seccomp: add set_mempolicy_home_node syscall (kernel v5.17, libseccomp v2.5.4)
This syscall is gated by CAP_SYS_NICE, matching the profile in containerd.

containerd: a6e52c74fa
libseccomp: d83cb7ac25
kernel: c6018b4b25

    mm/mempolicy: add set_mempolicy_home_node syscall
    This syscall can be used to set a home node for the MPOL_BIND and
    MPOL_PREFERRED_MANY memory policy.  Users should use this syscall after
    setting up a memory policy for the specified range as shown below.

      mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
            new_nodes->size + 1, 0);
      sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
                    home_node, 0);

    The syscall allows specifying a home node/preferred node from which
    kernel will fulfill memory allocation requests first.
    ...

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1251982cf7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-06 15:29:52 +01:00
Paweł Gronowski
a51e65b334
image/cache: Use Platform from ocispec
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 2c01d53d96)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-02-06 14:30:18 +01:00
Paweł Gronowski
347be7f928
image/cache: Ignore Build and Revision on Windows
The compatibility depends on whether `hyperv` or `process` container
isolation is used.
This fixes cache not being used when building images based on older
Windows versions on a newer Windows host.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 91ea04089b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-02-06 13:22:43 +01:00
Laura Brehm
58a26e4fda
Merge pull request #47325 from thaJeztah/23.0_backport_plugin-install-digest
[23.0 backport] plugins: Fix panic when fetching by digest
2024-02-05 13:47:13 +00:00
Laura Brehm
d5c9f26e3d
plugins: fix panic installing from repo w/ digest
Only print the tag when the received reference has a tag, if
we can't cast the received tag to a `reference.Tagged` then
skip printing the tag as it's likely a digest.

Fixes panic when trying to install a plugin from a reference
with a digest such as
`vieux/sshfs@sha256:1d3c3e42c12138da5ef7873b97f7f32cf99fb6edde75fa4f0bcf9ed277855811`

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-05 10:12:40 +01:00
Laura Brehm
2dc6b50863
tests: add plugin install test w/ digest
Adds a test case for installing a plugin from a remote in the form
of `plugin-content-trust@sha256:d98f2f8061...`, which is currently
causing the daemon to panic, as we found while running the CLI e2e
tests:

```
docker plugin install registry:5000/plugin-content-trust@sha256:d98f2f806144bf4ba62d4ecaf78fec2f2fe350df5a001f6e3b491c393326aedb
```

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-05 10:12:34 +01:00
Sebastiaan van Stijn
79de67d989
Merge pull request from GHSA-xw73-rw38-6vjc
[23.0 backport] image/cache: Restrict cache candidates to locally built images
2024-02-01 01:12:23 +01:00
Sebastiaan van Stijn
43e3116aa7
Merge pull request #47284 from thaJeztah/23.0_bump_containerd_binary_1.6.28
[23.0] update containerd binary to v1.6.28
2024-02-01 00:33:37 +01:00
Sebastiaan van Stijn
b523cec06b
Merge pull request #47271 from thaJeztah/23.0_backport_bump_runc_binary_1.1.12
[23.0 backport] update runc binary to v1.1.12
2024-02-01 00:08:16 +01:00
Sebastiaan van Stijn
045dfe9e09
update containerd binary to v1.6.28
Update the containerd binary that's used in CI

- full diff: https://github.com/containerd/containerd/compare/v1.6.27...v1.6.28
- release notes: https://github.com/containerd/containerd/releases/tag/v1.6.28

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-31 23:42:13 +01:00
Sebastiaan van Stijn
1c867c02ef
Merge pull request #47261 from thaJeztah/23.0_bump_containerd_windows_binary_1.6.27
[23.0] Dockerfile: update containerd binary for windows to 1.6.27
2024-01-31 23:41:19 +01:00
Sebastiaan van Stijn
e6ff860b96
Merge pull request #47282 from aepifanov/23.0_backport_bump_runc_1.1.12
[23.0 backport] vendor: github.com/opencontainers/runc v1.1.12
2024-01-31 23:40:13 +01:00
Andrey Epifanov
76894512ab
vendor: github.com/opencontainers/runc v1.1.12
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.12
- full diff: https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.12

Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
2024-01-31 13:22:09 -08:00
Andrey Epifanov
e23be1cc51
vendor: github.com/cyphar/filepath-securejoin v0.2.4
full diff: https://github.com/cyphar/filepath-securejoin/compare/v0.2.3..v0.2.4

Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
2024-01-31 13:19:19 -08:00
Sebastiaan van Stijn
46cfa2e34c
Dockerfile: update containerd binary for windows to 1.6.27
Update the containerd binary that's used in CI to align with the version used
for Linux. This was missed in d6abda0710.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-31 21:43:42 +01:00
Sebastiaan van Stijn
5fbbb68dad
Merge pull request #47267 from vvoland/ci-fix-makeps1-templatefail-23
[23.0 backport] hack/make.ps1: Fix go list pattern
2024-01-31 21:36:36 +01:00
Sebastiaan van Stijn
448e7bed7d
update runc binary to v1.1.12
Update the runc binary that's used in CI and for the static packages, which
includes a fix for [CVE-2024-21626].

- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.12
- full diff: https://github.com/opencontainers/runc/compare/v1.1.11...v1.1.12

[CVE-2024-21626]: https://github.com/opencontainers/runc/security/advisories/GHSA-xr7r-f8xq-vfvv

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 44bf407d4d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-31 21:06:43 +01:00
Paweł Gronowski
4bb779015a
hack/make.ps1: Fix go list pattern
The double quotes inside a single quoted string don't need to be
escaped.
Looks like different Powershell versions are treating this differently
and it started failing unexpectedly without any changes on our side.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit ecb217cf69)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-31 19:58:08 +01:00
Sebastiaan van Stijn
42d92e8903
Merge pull request #47226 from thaJeztah/23.0_update_containerd_binary
[23.0] Dockerfile: update containerd binary to v1.6.27
2024-01-25 21:40:50 +01:00
Paweł Gronowski
d5de9f7779
builder/windows: Don't set ArgsEscaped for RUN cache probe
Previously this was done indirectly - the `compare` function didn't
check the `ArgsEscaped`.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 96d461d27e)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 44e6f3da60)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-25 16:33:36 +01:00
Paweł Gronowski
bb03b5b86e
image/cache: Check image platform
Make sure the cache candidate platform matches the requested.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 877ebbe038)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 8a19bb7193)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-25 16:33:33 +01:00
Sebastiaan van Stijn
d6abda0710
Dockerfile: update containerd binary to v1.6.27
Update the binary version used in CI and for the static packages.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-25 16:31:10 +01:00
Paweł Gronowski
75c70b08b5
image/cache: Restrict cache candidates to locally built images
Restrict cache candidates only to images that were built locally.
This doesn't affect builds using `--cache-from`.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 96ac22768a)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 17af50f46b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-25 16:23:08 +01:00
Paweł Gronowski
8617cd0570
daemon/imageStore: Mark images built locally
Store additional image property which makes it possible to distinguish
if image was built locally.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit c6156dc51b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit ffb63c0bae)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-25 16:23:06 +01:00
Paweł Gronowski
e9e21b6bf1
image/cache: Compare all config fields
Add checks for some image config fields that were missing.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 537348763f)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 593b754d8f)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-25 16:23:05 +01:00
Sebastiaan van Stijn
d363536dca
Merge pull request #47223 from vvoland/pkg-pools-close-noop-23
[23.0 backport] pkg/ioutils: Make subsequent Close attempts noop
2024-01-25 16:16:18 +01:00
Paweł Gronowski
43c2952a31
pkg/ioutils: Make subsequent Close attempts noop
Turn subsequent `Close` calls into a no-op and produce a warning with an
optional stack trace (if debug mode is enabled).

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 585d74bad1)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-25 15:23:28 +01:00
Sebastiaan van Stijn
c0ee655462
Merge pull request #46900 from thaJeztah/23.0_update_golang_1.20.12
[23.0] update to go1.20.13
2024-01-20 14:52:48 +01:00
Sebastiaan van Stijn
751d19fd3c
update to go1.20.13
go1.20.13 (released 2024-01-09) includes fixes to the runtime and the crypto/tls
package. See the Go 1.20.13 milestone on our issue tracker for details:

- https://github.com/golang/go/issues?q=milestone%3AGo1.20.13+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.20.12...go1.20.13

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-19 12:12:23 +01:00