Commit graph

46376 commits

Author SHA1 Message Date
Paweł Gronowski
016ad9b3e8
c8d/prune: Handle containers started from image id
If an image is only by id instead of its name, don't prune it
completely. but only untag it and create a dangling image for it.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit e638351ef9)
Resolved conflicts:
	daemon/containerd/image_prune.go
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-30 10:34:09 -06:00
Paweł Gronowski
87778af711
c8d/prune: Exclude dangling tag of the images used by containers
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit a93298d4db)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-30 10:31:47 -06:00
Paweł Gronowski
8bf037b246
c8d/softDelete: Deep copy Labels
So we don't override the original Labels in the passed image object.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit a6d5db3f9b)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-30 10:31:45 -06:00
Paweł Gronowski
8afe75ffa9
c8d/softDelete: Extract ensureDanglingImage
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 2b0655a71a)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-30 10:31:44 -06:00
Paweł Gronowski
e2bade43e7
testutil/environment: Add GetTestDanglingImageId
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit a96e6044cc)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-30 10:31:43 -06:00
Sebastiaan van Stijn
e0091d6616
c8d: ImageService.softImageDelete: rename var that collided with import
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f17c9e4aeb)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-30 10:31:34 -06:00
Sebastiaan van Stijn
42f3f7ed86
c8d: ImageService.softImageDelete: use OCI and containerd constants
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit df5deab20b)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-30 10:31:33 -06:00
Sebastiaan van Stijn
aace62f6d3
pkg/fileutils: GetTotalUsedFds(): use fast-path for Kernel 6.2 and up
Linux 6.2 and up (commit [f1f1f2569901ec5b9d425f2e91c09a0e320768f3][1])
provides a fast path for the number of open files for the process.

From the [Linux docs][2]:

> The number of open files for the process is stored in 'size' member of
> `stat()` output for /proc/<pid>/fd for fast access.

[1]: f1f1f25699
[2]: https://docs.kernel.org/filesystems/proc.html#proc-pid-fd-list-of-symlinks-to-open-files

This patch adds a fast-path for Kernels that support this, and falls back
to the slow path if the Size fields is zero.

Comparing on a Fedora 38 (kernel 6.2.9-300.fc38.x86_64):

Before/After:

    go test -bench ^BenchmarkGetTotalUsedFds$ -run ^$ ./pkg/fileutils/
    BenchmarkGetTotalUsedFds        57264     18595 ns/op     408 B/op      10 allocs/op
    BenchmarkGetTotalUsedFds       370392      3271 ns/op      40 B/op       3 allocs/op

Note that the slow path has 1 more file-descriptor, due to the open
file-handle for /proc/<pid>/fd during the calculation.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit ec79d0fc05)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-30 18:30:34 +02:00
Sebastiaan van Stijn
bb50485dfd
pkg/fileutils: GetTotalUsedFds: reduce allocations
Use File.Readdirnames instead of os.ReadDir, as we're only interested in
the number of files, and results don't have to be sorted.

Before:

    BenchmarkGetTotalUsedFds-5   	  149272	      7896 ns/op	     945 B/op	      20 allocs/op

After:

    BenchmarkGetTotalUsedFds-5   	  153517	      7644 ns/op	     408 B/op	      10 allocs/op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit eaa9494b71)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-30 18:30:24 +02:00
Sebastiaan van Stijn
5dcea89ce1
pkg/fileutils: add BenchmarkGetTotalUsedFds
go test -bench ^BenchmarkGetTotalUsedFds$ -run ^$ ./pkg/fileutils/
    goos: linux
    goarch: arm64
    pkg: github.com/docker/docker/pkg/fileutils
    BenchmarkGetTotalUsedFds-5   	  149272	      7896 ns/op	     945 B/op	      20 allocs/op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 03390be5fa)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-30 18:29:27 +02:00
Sebastiaan van Stijn
01eb4835c9
pkg/fileutils: GetTotalUsedFds(): don't pretend to support FreeBSD
Commit 8d56108ffb moved this function from
the generic (no build-tags) fileutils.go to a unix file, adding "freebsd"
to the build-tags.

This likely was a wrong assumption (as other files had freebsd build-tags).
FreeBSD's procfs does not mention `/proc/<pid>/fd` in the manpage, and
we don't test FreeBSD in CI, so let's drop it, and make this a Linux-only
file.

While updating also dropping the import-tag, as we're planning to move
this file internal to the daemon.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 252e94f499)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-30 18:29:17 +02:00
Sebastiaan van Stijn
cd44aba8db
[24.0] pkg/fileutils: switch to use containerd log pkg
(very) partial backport of 74da6a6363
and ab35df454d

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-30 18:28:55 +02:00
Bjorn Neergaard
2435d75b89
Merge pull request #45849 from neersighted/backport/45816/24.0
[24.0 backport] c8d/images: handle images without manifests for default platform
2023-06-30 06:20:57 -06:00
Sebastiaan van Stijn
80d1e863f5
Merge pull request #45853 from thaJeztah/24.0_backport_gha_fix_missing_daemonjson
[24.0 backport] gha: don't fail if no daemon.json is present
2023-06-30 11:02:40 +02:00
Sebastiaan van Stijn
ee29fd944b
gha: don't fail if no daemon.json is present
CI failed sometimes if no daemon.json was present:

    Run sudo rm /etc/docker/daemon.json
    sudo rm /etc/docker/daemon.json
    sudo service docker restart
    docker version
    docker info
    shell: /usr/bin/bash -e {0}
    env:
    DESTDIR: ./build
    BUILDKIT_REPO: moby/buildkit
    BUILDKIT_TEST_DISABLE_FEATURES: cache_backend_azblob,cache_backend_s3,merge_diff
    BUILDKIT_REF: 798ad6b0ce9f2fe86dfb2b0277e6770d0b545871
    rm: cannot remove '/etc/docker/daemon.json': No such file or directory
    Error: Process completed with exit code 1.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 264dbad43a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-30 01:52:49 +02:00
Laura Brehm
b8ee9a7829
c8d/images: handle images without manifests for default platform
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
(cherry picked from commit 6d3bcd8017)
Resolved conflicts:
	daemon/containerd/image.go
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-29 13:43:19 -06:00
Sebastiaan van Stijn
d9e097e328
vendor: github.com/opencontainers/image-spec v1.1.0-rc3
full diff: https://github.com/opencontainers/image-spec/compare/3a7f492d3f1b...v1.1.0-rc3

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit b42e367045)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-29 13:43:18 -06:00
Brian Goff
2bef272269
Merge pull request #45824 from thaJeztah/24.0_backport_fix_live_restore_local_vol_mounts
[24.0 backport] Restore active mount counts on live-restore
2023-06-28 14:27:11 -07:00
Sebastiaan van Stijn
3f9d07570a
Merge pull request #45833 from neersighted/backport/45766/24.0
[24.0 backport] seccomp: always allow name_to_handle_at(2)
2023-06-28 18:34:09 +02:00
Bjorn Neergaard
806849eb62
seccomp: add name_to_handle_at to allowlist
Based on the analysis on [the previous PR][1].

  [1]: https://github.com/moby/moby/pull/45766#pullrequestreview-1493908145

Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
(cherry picked from commit b335e3d305)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-28 05:46:56 -06:00
Brian Goff
c24c37bd8a
Restore active mount counts on live-restore
When live-restoring a container the volume driver needs be notified that
there is an active mount for the volume.
Before this change the count is zero until the container stops and the
uint64 overflows pretty much making it so the volume can never be
removed until another daemon restart.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 647c2a6cdd)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-28 09:45:07 +02:00
Vitor Anjos
c306276ab1
remove name_to_handle_at(2) from filtered syscalls
Signed-off-by: Vitor Anjos <bartier@users.noreply.github.com>
(cherry picked from commit fdc9b7cceb)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
2023-06-27 13:20:23 -06:00
Sebastiaan van Stijn
6eb4d7f33b
Merge pull request #45829 from thaJeztah/24.0_backport_sudo_tee
[24.0 backport] gha: Setup Runner: add missing sudo
2023-06-27 16:36:43 +02:00
Sebastiaan van Stijn
186eb805f6
Merge pull request #45827 from thaJeztah/24.0_backport_dockerfile_more_resilient
[24.0 backport] Dockerfile: make cli stages more resilient against unclean termination
2023-06-27 16:09:15 +02:00
Sebastiaan van Stijn
d5e31e03b6
gha: Setup Runner: add missing sudo
I think this may be missing a sudo (as all other operations do use
sudo to access daemon.json);

    Run if [ ! -e /etc/docker/daemon.json ]; then
      if [ ! -e /etc/docker/daemon.json ]; then
       echo '{}' | tee /etc/docker/daemon.json >/dev/null
      fi
      DOCKERD_CONFIG=$(jq '.+{"experimental":true,"live-restore":true,"ipv6":true,"fixed-cidr-v6":"2001:db8:1::/64"}' /etc/docker/daemon.json)
      sudo tee /etc/docker/daemon.json <<<"$DOCKERD_CONFIG" >/dev/null
      sudo service docker restart
      shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
      env:
        GO_VERSION: 1.20.5
        GOTESTLIST_VERSION: v0.3.1
        TESTSTAT_VERSION: v0.1.3
        ITG_CLI_MATRIX_SIZE: 6
        DOCKER_EXPERIMENTAL: 1
        DOCKER_GRAPHDRIVER: overlay2
    tee: /etc/docker/daemon.json: Permission denied
    Error: Process completed with exit code 1.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d8bc5828cd)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-27 15:40:46 +02:00
Sebastiaan van Stijn
85ad299668
Dockerfile: make cli stages more resilient against unclean termination
The Dockerfile in this repository performs many stages in parallel. If any of
those stages fails to build (which could be due to networking congestion),
other stages are also (forcibly?) terminated, which can cause an unclean
shutdown.

In some case, this can cause `git` to be terminated, leaving a `.lock` file
behind in the cache mount. Retrying the build now will fail, and the only
workaround is to clean the build-cache (which causes many stages to be
built again, potentially triggering the problem again).

     > [dockercli-integration 3/3] RUN --mount=type=cache,id=dockercli-integration-git-linux/arm64/v8,target=./.git     --mount=type=cache,target=/root/.cache/go-build,id=dockercli-integration-build-linux/arm64/v8     /download-or-build-cli.sh v17.06.2-ce https://github.com/docker/cli.git /build:
    #0 1.575 fatal: Unable to create '/go/src/github.com/docker/cli/.git/shallow.lock': File exists.
    #0 1.575
    #0 1.575 Another git process seems to be running in this repository, e.g.
    #0 1.575 an editor opened by 'git commit'. Please make sure all processes
    #0 1.575 are terminated then try again. If it still fails, a git process
    #0 1.575 may have crashed in this repository earlier:
    #0 1.575 remove the file manually to continue.

This patch:

- Updates the Dockerfile to remove `.lock` files (`shallow.lock`, `index.lock`)
  that may have been left behind from previous builds. I put this code in the
  Dockerfile itself (not the script), as the script may be used in other
  situations outside of the Dockerfile (for which we cannot guarantee no other
  git session is active).
- Adds a `docker --version` step to the stage; this is mostly to verify the
  build was successful (and to be consistent with other stages).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 9f6dbbc7ea)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-27 14:48:39 +02:00
Sebastiaan van Stijn
4735ce7ff2
Merge pull request #45822 from thaJeztah/24.0_backport_containerd-from-scratch
[24.0 backport] Skip cache lookup for "FROM scratch" in containerd
2023-06-27 14:14:50 +02:00
Tianon Gravi
e84365f967
Skip cache lookup for "FROM scratch" in containerd
Ideally, this should actually do a lookup across images that have no parent, but I wasn't 100% sure how to accomplish that so I opted for the smaller change of having `FROM scratch` builds not be cached for now.

Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
(cherry picked from commit 1741771b67)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-27 11:51:44 +02:00
Sebastiaan van Stijn
5899e935d4
Merge pull request #45813 from thaJeztah/24.0_backport_no_homedir
[24.0 backport] integration-cli: don't use pkg/homedir in test
2023-06-26 15:11:21 +02:00
Bjorn Neergaard
4d5f1d6bbc
Merge pull request #45811 from thaJeztah/24.0_backport_update_buildx_0.11
[24.0 backport] Dockerfile: update buildx to v0.11.0
2023-06-26 06:05:58 -06:00
Sebastiaan van Stijn
96534f015d
integration-cli: don't use pkg/homedir in test
I'm considering deprecating the "Key()" utility, as it was only
used in tests.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0215a62d5b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-26 13:42:18 +02:00
Sebastiaan van Stijn
49e24566d0
Merge pull request #45810 from thaJeztah/24.0_backport_fix-missing-csi-topology
[24.0 backport] Fix missing Topology in NodeCSIInfo
2023-06-26 13:04:42 +02:00
Sebastiaan van Stijn
6424ae830b
Dockerfile: update buildx to v0.11.0
Update the version of buildx we use in the dev-container to v0.11.0;
https://github.com/docker/buildx/releases/tag/v0.11.0

Full diff: https://github.com/docker/buildx/compare/v0.10.5..v0.11.0

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4d831949a7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-26 12:10:59 +02:00
Drew Erny
6055b07292
Fix missing Topology in NodeCSIInfo
Added code to correctly retrieve and convert the Topology from the gRPC
Swarm Node.

Signed-off-by: Drew Erny <derny@mirantis.com>
(cherry picked from commit cdb1293eea)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-26 11:01:28 +02:00
Sebastiaan van Stijn
98518e0734
Merge pull request #45801 from thaJeztah/24.0_backport_fix-45788-restore-exit-status
[24.0 backport] daemon: fix restoring container with missing task
2023-06-23 22:20:53 +02:00
Cory Snider
2f379ecfd6
daemon: fix restoring container with missing task
Before 4bafaa00aa, if the daemon was
killed while a container was running and the container shim is killed
before the daemon is restarted, such as if the host system is
hard-rebooted, the daemon would restore the container to the stopped
state and set the exit code to 255. The aforementioned commit introduced
a regression where the container's exit code would instead be set to 0.
Fix the regression so that the exit code is once against set to 255 on
restore.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 165dfd6c3e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-23 18:21:29 +02:00
Sebastiaan van Stijn
8b61625a5e
Merge pull request #45790 from crazy-max/24.0_backport_fix-host-gateway
[24.0 backport] builder: pass host-gateway IP as worker label
2023-06-23 09:34:21 +02:00
Bjorn Neergaard
575d03df66
Merge pull request #45798 from thaJeztah/24.0_backport_fix-health-probe-double-unlock
[24.0 backport] daemon: fix double-unlock in health check probe
2023-06-22 18:38:06 -06:00
Bjorn Neergaard
a13eea29fb
Merge pull request #45794 from thaJeztah/24.0_backport_fix_45770_processevent_nil_check
[24.0 backport] daemon: fix panic on failed exec start
2023-06-22 18:17:59 -06:00
Cory Snider
136893e33b
daemon: fix double-unlock in health check probe
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 786c9adaa2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-23 00:59:41 +02:00
Cory Snider
290fc0440c
daemon: fix panic on failed exec start
If an exec fails to start in such a way that containerd publishes an
exit event for it, daemon.ProcessEvent will race
daemon.ContainerExecStart in handling the failure. This race has been a
long-standing bug, which was mostly harmless until
4bafaa00aa. After that change, the daemon
would dereference a nil pointer and crash if ProcessEvent won the race.
Restore the status quo buggy behaviour by adding a check to skip the
dereference if execConfig.Process is nil.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 3b28a24e97)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-22 23:25:44 +02:00
Sebastiaan van Stijn
0556ba23a4
daemon: handleContainerExit(): use logrus.WithFields
Use `WithFields()` instead of chaining multiple `WithField()` calls.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit de363f1404)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-22 23:25:38 +02:00
CrazyMax
35a29c7328
builder: pass host-gateway IP as worker label
We missed a case when parsing extra hosts from the dockerfile
frontend so the build fails.

To handle this case we need to set a dedicated worker label
that contains the host gateway IP so clients like Buildx
can just set the proper host:ip when parsing extra hosts
that contain the special string "host-gateway".

Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 21e50b89c9)
2023-06-22 16:23:40 +02:00
Sebastiaan van Stijn
6bca2bf3bf
Merge pull request #45746 from thaJeztah/24.0_backport_fix_zeroes_in_linux_resources
[24.0 backport] daemon: stop setting container resources to zero
2023-06-22 14:07:35 +02:00
Cory Snider
210c4d6f4b
daemon: ensure OCI options play nicely together
Audit the OCI spec options used for Linux containers to ensure they are
less order-dependent. Ensure they don't assume that any pointer fields
are non-nil and that they don't unintentionally clobber mutations to the
spec applied by other options.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 8a094fe609)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-21 22:16:28 +02:00
Cory Snider
f50cb0c7bd
daemon: stop setting container resources to zero
Many of the fields in LinuxResources struct are pointers to scalars for
some reason, presumably to differentiate between set-to-zero and unset
when unmarshaling from JSON, despite zero being outside the acceptable
range for the corresponding kernel tunables. When creating the OCI spec
for a container, the daemon sets the container's OCI spec CPUShares and
BlkioWeight parameters to zero when the corresponding Docker container
configuration values are zero, signifying unset, despite the minimum
acceptable value for CPUShares being two, and BlkioWeight ten. This has
gone unnoticed as runC does not distingiush set-to-zero from unset as it
also uses zero internally to represent unset for those fields. However,
kata-containers v3.2.0-alpha.3 tries to apply the explicit-zero resource
parameters to the container, exactly as instructed, and fails loudly.
The OCI runtime-spec is silent on how the runtime should handle the case
when those parameters are explicitly set to out-of-range values and
kata's behaviour is not unreasonable, so the daemon must therefore be in
the wrong.

Translate unset values in the Docker container's resources HostConfig to
omit the corresponding fields in the container's OCI spec when starting
and updating a container in order to maximize compatibility with
runtimes.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit dea870f4ea)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-21 22:16:22 +02:00
Cory Snider
0a6a5a9140
daemon: modernize oci_linux_test.go
Switch to using t.TempDir() instead of rolling our own.

Clean up mounts leaked by the tests as otherwise the tests fail due to
the leaked mounts because unlike the old cleanup code, t.TempDir()
cleanup does not ignore errors from os.RemoveAll.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 9ff169ccf4)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-21 22:14:47 +02:00
Sebastiaan van Stijn
f3743766e9
Merge pull request #45785 from vvoland/busybox-5007-24 2023-06-21 22:13:49 +02:00
Akihiro Suda
e6a7df0e00
Merge pull request #45786 from neersighted/backport/45781/24.0
[24.0 backport] c8d: mark stargz as requiring reference-counted mounts
2023-06-22 01:17:07 +09:00
Akihiro Suda
d3c5b613ac
Merge pull request #45703 from thaJeztah/24.0_backport_bump_swarmkit
[24.0 backport] vendor: github.com/moby/swarmkit/v2 v2.0.0-20230531205928-01bb7a41396b
2023-06-22 00:38:40 +09:00