full diff: 48dd89375d...6341884e5f
Pulls in a set of fixes to SwarmKit's nascent Cluster Volumes support
discovered during subsequent development and testing.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Deleting a containerd task whose status is Created fails with a
"precondition failed" error. This is because (aside from Windows)
a process is spawned when the task is created, and deleting the task
while the process is running would leak the process if it was allowed.
libcontainerd and the containerd plugin executor mistakenly try to clean
up from a failed start by deleting the created task, which will always
fail with the aforementined error. Change them to pass the
`WithProcessKill` delete option so the cleanup has a chance to succeed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Currently an attempt to pull a reference which resolves to an OCI
artifact (Helm chart for example), results in a bit unrelated error
message `invalid rootfs in image configuration`.
This provides a more meaningful error in case a user attempts to
download a media type which isn't image related.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This fallback is used when we filter the manifest list by the user-provided platform and find no matches such that we match the previous Docker behavior (before it supported variant matching). This has been deprecated long enough that I think it's time we finally stop supporting this weird fallback, especially since it makes for buggy behavior like `docker pull --platform linux/arm/v5 alpine:3.16` leading to a `linux/arm/v6` image being pulled (I specified a variant, every manifest list entry specifies a variant, so clearly the only behavior I as a user could reasonably expect is an error that `linux/arm/v5` is not supported, but instead I get an explicitly incompatible image despite doing everything I as a user can to prevent that situation).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
This debug message already includes a full platform string, so this ends up being something like `linux/arm/v7/amd64` in the end result.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
On Windows, syscall.StartProcess and os/exec.Cmd did not properly
check for invalid environment variable values. A malicious
environment variable value could exploit this behavior to set a
value for a different environment variable. For example, the
environment variable string "A=B\x00C=D" set the variables "A=B" and
"C=D".
Thanks to RyotaK (https://twitter.com/ryotkak) for reporting this
issue.
This is CVE-2022-41716 and Go issue https://go.dev/issue/56284.
This Go release also fixes https://github.com/golang/go/issues/56309, a
runtime bug which can cause random memory corruption when a goroutine
exits with runtime.LockOSThread() set. This fix is necessary to unblock
work to replace certain uses of pkg/reexec with unshared OS threads.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The `execCmd()` utility was a basic wrapper around `exec.Command()`. Inlining it
makes the code more transparent.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the example actually do something, and include the output, so that it
shows up in the documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some of these tests are failing (but not enabled in CI), but the current output
doesn't provide any details on the failure, so this patch is just to improve the
test output to allow debugging the actual failure.
Before this, tests would fail like:
make BIND_DIR=. TEST_FILTER=TestPluginInstallImage test-integration
...
=== FAIL: amd64.integration-cli TestDockerPluginSuite/TestPluginInstallImage (15.22s)
docker_cli_plugins_test.go:220: assertion failed: expression is false: strings.Contains(out, `Encountered remote "application/vnd.docker.container.image.v1+json"(image) when fetching`)
--- FAIL: TestDockerPluginSuite/TestPluginInstallImage (15.22s)
With this patch, tests provide more useful output:
make BIND_DIR=. TEST_FILTER=TestPluginInstallImage test-integration
...
=== FAIL: amd64.integration-cli TestDockerPluginSuite/TestPluginInstallImage (1.15s)
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json"
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json" digest="sha256:bee151d3fef5c1f787e7846efe4fa42b25a02db4e7543e54e8c679cf19d78598"
mediatype=application/vnd.docker.plugin.v1+json size=522
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json"
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json" digest="sha256:bee151d3fef5c1f787e7846efe4fa42b25a02db4e7543e54e8c679cf19d78598"
mediatype=application/vnd.docker.plugin.v1+json size=522
docker_cli_plugins_test.go:221: assertion failed: string "Error response from daemon: application/vnd.docker.distribution.manifest.v1+prettyjws not supported\n" does not contain "Encountered remote
\"application/vnd.docker.container.image.v1+json\"(image) when fetching"
--- FAIL: TestDockerPluginSuite/TestPluginInstallImage (1.15s)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The ParseLink() function has special handling for legacy formats;
> This is kept because we can actually get a HostConfig with links
> from an already created container and the format is not `foo:bar`
> but `/foo:/c1/bar`
This patch adds a test-case for this format. While updating, also switching
to use gotest.tools assertions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The new daemon.containerFSView type covers all the use-cases on Linux
with a much more intuitive API, but is not portable to Windows.
Discourage people from using the old and busted functions in new Linux
code by excluding them entirely from Linux builds.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Mounting a container's volumes under its rootfs directory inside the
host mount namespace causes problems with cross-namespace mount
propagation when /var/lib/docker is bind-mounted into the container as a
volume. The mount event propagates into the container's mount namespace,
overmounting the volume, but the propagated unmount events do not fully
reverse the effect. Each archive operation causes the mount table in the
container's mount namespace to grow larger and larger, until the kernel
limiton the number of mounts in a namespace is hit. The only solution to
this issue which is not subject to race conditions or other blocker
caveats is to avoid mounting volumes into the container's rootfs
directory in the host mount namespace in the first place.
Mount the container volumes inside an unshared mount namespace to
prevent any mount events from propagating into any other mount
namespace. Greatly simplify the archiving implementations by also
chrooting into the container rootfs to sidestep the need to resolve
paths in the host.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing archive implementation is not easy to reason about by
reading the source. Prepare to rewrite it by covering more edge cases in
tests. The new test cases were determined by black-box characterizing
the existing behaviour.
Signed-off-by: Cory Snider <csnider@mirantis.com>
It is only applicable to Windows so it does not need to be called from
platform-generic code. Fix locking in the Windows implementation.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Linux implementation needs to diverge significantly from the Windows
one in order to fix platform-specific bugs. Cut the generic
implementation out of daemon/archive.go and paste identical, verbatim
copies of that implementation into daemon/archive_{windows,linux}.go to
make it easier to compare the progression of changes to the respective
implementations through Git history.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This change was introduced early in the development of rootless support,
before all the kinks were worked out and rootlesskit was built. The
author was testing the daemon by inside a user namespace set up by runc,
observed that the unshare(2) syscall was returning EPERM, and assumed
that it was a fundamental limitation of user namespaces. Seeing as the
kernel documentation (of today) disagrees with that assessment and that
unshare demonstrably works inside user namespaces, I can only assume
that the EPERM was due to a quirk of their test environment, such as a
seccomp filter set up by runc blocking the unshare syscall.
https://github.com/moby/moby/pull/20902#issuecomment-236409406
Mount namespaces are necessary to address #38995 and #43390. Revert the
special-casing so those issues can also be fixed for rootless daemons.
This reverts commit dc950567c1.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Unshare the thread's file system attributes and, if applicable, mount
namespace so that the chroot operation does not affect the rest of the
process.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The applyLayer implementation in pkg/chrootarchive has to set the TMPDIR
environment variable so that archive.UnpackLayer() can successfully
create the whiteout-file temp directory. Change UnpackLayer to create
the temporary directory under the destination path so that environment
variables do not need to be touched.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This fix tries to address issues raised in #44346.
The max-concurrent-downloads and max-concurrent-uploads limits are applied for the whole engine and not for each pull/push command.
Signed-off-by: Luis Henrique Mulinari <luis.mulinari@gmail.com>