Commit graph

45099 commits

Author SHA1 Message Date
Sebastiaan van Stijn
aeafa2a28f
Merge pull request #44363 from luismulinari/fix_max_concurrent_downloads_uploads_docs
Fix the max-concurrent-downloads and max-concurrent-uploads configs documentation
2022-10-28 21:17:24 -04:00
Cory Snider
ad4073edc1 daemon: fix docs for config-default constants
Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-28 15:52:57 -04:00
Cory Snider
dcd6c1d2e2 container: make path resolution fns Windows-only
The new daemon.containerFSView type covers all the use-cases on Linux
with a much more intuitive API, but is not portable to Windows.
Discourage people from using the old and busted functions in new Linux
code by excluding them entirely from Linux builds.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-27 12:52:14 -04:00
Cory Snider
2bdc7fb0a1 daemon: archive in a dedicated mount namespace
Mounting a container's volumes under its rootfs directory inside the
host mount namespace causes problems with cross-namespace mount
propagation when /var/lib/docker is bind-mounted into the container as a
volume. The mount event propagates into the container's mount namespace,
overmounting the volume, but the propagated unmount events do not fully
reverse the effect. Each archive operation causes the mount table in the
container's mount namespace to grow larger and larger, until the kernel
limit on the number of mounts in a namespace is hit. The only solution to
this issue which is not subject to race conditions or other blocker
caveats is to avoid mounting volumes into the container's rootfs
directory in the host mount namespace in the first place.

Mount the container volumes inside an unshared mount namespace to
prevent any mount events from propagating into any other mount
namespace. Greatly simplify the archiving implementations by also
chrooting into the container rootfs to sidestep the need to resolve
paths in the host.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-27 12:52:14 -04:00
Cory Snider
7d23c50599 integration: test more copy edge-cases
The existing archive implementation is not easy to reason about by
reading the source. Prepare to rewrite it by covering more edge cases in
tests. The new test cases were determined by black-box characterizing
the existing behaviour.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-26 12:06:31 -04:00
Cory Snider
6750d1bac8 daemon: drop Windows-only code from archive_unix.go
Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-26 12:06:31 -04:00
Cory Snider
4fd91c3f37 daemon: refactor isOnlineFSOperationPermitted
It is only applicable to Windows so it does not need to be called from
platform-generic code. Fix locking in the Windows implementation.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-26 12:06:31 -04:00
Cory Snider
84cbe29d5b daemon: dupe the archive implementation
The Linux implementation needs to diverge significantly from the Windows
one in order to fix platform-specific bugs. Cut the generic
implementation out of daemon/archive.go and paste identical, verbatim
copies of that implementation into daemon/archive_{windows,linux}.go to
make it easier to compare the progression of changes to the respective
implementations through Git history.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-26 12:06:31 -04:00
Cory Snider
60ee6f739f Add reusable chroot and unshare utilities
Refactor pkg/chrootarchive in terms of those utilities.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-26 12:06:31 -04:00
Cory Snider
317d3d10b8 Revert "Use real chroot if daemon is running in a user namespace"
This change was introduced early in the development of rootless support,
before all the kinks were worked out and rootlesskit was built. The
author was testing the daemon inside a user namespace set up by runc,
observed that the unshare(2) syscall was returning EPERM, and assumed
that it was a fundamental limitation of user namespaces. Seeing as the
kernel documentation (of today) disagrees with that assessment and that
unshare demonstrably works inside user namespaces, I can only assume
that the EPERM was due to a quirk of their test environment, such as a
seccomp filter set up by runc blocking the unshare syscall.
https://github.com/moby/moby/pull/20902#issuecomment-236409406

Mount namespaces are necessary to address #38995 and #43390. Revert the
special-casing so those issues can also be fixed for rootless daemons.

This reverts commit dc950567c1.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-26 12:05:20 -04:00
Cory Snider
5de229644f pkg/chrootarchive: stop reexec'ing before chroot
Unshare the thread's file system attributes and, if applicable, mount
namespace so that the chroot operation does not affect the rest of the
process.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-26 12:05:13 -04:00
Cory Snider
f2f884a92f pkg/archive: create whiteout temp dir under dest
The applyLayer implementation in pkg/chrootarchive has to set the TMPDIR
environment variable so that archive.UnpackLayer() can successfully
create the whiteout-file temp directory. Change UnpackLayer to create
the temporary directory under the destination path so that environment
variables do not need to be touched.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-26 12:04:37 -04:00
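
The idea in the commit above (create the temporary directory under the destination instead of relying on TMPDIR) can be sketched with the standard library alone. This is a minimal illustration, not the code from pkg/archive: the function names and the "scratch-" prefix are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// makeScratchDir creates a temporary directory directly under dest, so the
// caller never has to consult or modify TMPDIR. The "scratch-" prefix is a
// hypothetical name chosen for this sketch.
func makeScratchDir(dest string) (string, error) {
	return os.MkdirTemp(dest, "scratch-")
}

// scratchIsUnderDest unpacks nothing; it only checks where the scratch
// directory lands relative to a throwaway destination directory.
func scratchIsUnderDest() (bool, error) {
	dest, err := os.MkdirTemp("", "dest-")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dest)

	tmp, err := makeScratchDir(dest)
	if err != nil {
		return false, err
	}
	return filepath.Dir(tmp) == dest, nil
}

func main() {
	ok, err := scratchIsUnderDest()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("scratch dir is under dest:", ok)
}
```

Passing an explicit parent directory to os.MkdirTemp keeps the decision local to the call, which is why no environment variable needs to be set beforehand.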
Cory Snider
1f32e3c95d Add integration test for #38995, #43390
Modify the DinD entrypoint scripts to make the issue reproducible inside
a DinD container.

Co-authored-by: Bjorn Neergaard <bneergaard@mirantis.com>
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-26 12:04:37 -04:00
Luis Henrique Mulinari
6c0aa5b00a
Fix the max-concurrent-downloads and max-concurrent-uploads configs documentation
This fix tries to address issues raised in #44346.
The max-concurrent-downloads and max-concurrent-uploads limits are applied for the whole engine and not for each pull/push command.

Signed-off-by: Luis Henrique Mulinari <luis.mulinari@gmail.com>
2022-10-26 11:10:00 +01:00
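
To illustrate the difference between an engine-wide limit and a per-command limit, here is a rough sketch in which every simulated pull shares one limiter. It is not the daemon's implementation; the limiter type, the function names, and the numbers used are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// limiter caps the number of transfers in flight across every caller that
// shares it (hypothetical type for this sketch).
type limiter chan struct{}

func newLimiter(n int) limiter { return make(limiter, n) }

// do blocks until a slot is free, runs task, then releases the slot.
func (l limiter) do(task func()) {
	l <- struct{}{}
	defer func() { <-l }()
	task()
}

// peakConcurrency starts pulls*layers transfers that all share one limiter
// and returns the highest number that were in flight at the same time.
func peakConcurrency(limit, pulls, layers int) int64 {
	l := newLimiter(limit)
	var inFlight, peak int64
	var wg sync.WaitGroup
	for p := 0; p < pulls; p++ {
		for i := 0; i < layers; i++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				l.do(func() {
					n := atomic.AddInt64(&inFlight, 1)
					// Record a new peak if this transfer raised it.
					for {
						old := atomic.LoadInt64(&peak)
						if n <= old || atomic.CompareAndSwapInt64(&peak, old, n) {
							break
						}
					}
					atomic.AddInt64(&inFlight, -1)
				})
			}()
		}
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	// Four simultaneous pulls of five layers each never exceed the shared limit.
	fmt.Println("peak transfers in flight:", peakConcurrency(3, 4, 5))
}
```

Because the limiter is created once and shared, adding more simultaneous pulls does not raise the ceiling, which matches the behaviour the corrected documentation describes.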
Sebastiaan van Stijn
542c735926
Merge pull request #44256 from thaJeztah/redundant_sprintfs
replace redundant fmt.Sprintf() with strconv
2022-10-25 16:48:15 -04:00
Brian Goff
6c5ca9779b
Merge pull request #44310 from thaJeztah/daemon_getPluginExecRoot
daemon: getPluginExecRoot(): pass config
2022-10-25 11:52:35 -07:00
Brian Goff
7b1245dc7f
Merge pull request #44224 from dperny/cluster-volumes-update
Fix force-remove for cluster volumes
2022-10-25 11:13:43 -07:00
Cory Snider
22529b81f8 libnetwork: drop InitOSContext()
The function is a no-op on all platforms.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-25 13:35:44 -04:00
Cory Snider
7fc29c1435 libnetwork/osl: clean up Linux InvokeFunc()
Aside from unconditionally unlocking the OS thread even if restoring the
thread's network namespace fails, func (*networkNamespace).InvokeFunc()
correctly implements invoking a function inside a network namespace.
This is far from obvious, however. func InitOSContext() does much of the
heavy lifting but in a bizarre fashion: it restores the initial network
namespace before it is changed in the first place, and the cleanup
function it returns does not restore the network namespace at all! The
InvokeFunc() implementation has to restore the network namespace
explicitly by deferring a call to ns.SetNamespace().

func InitOSContext() is a leaky abstraction taped to a footgun. On the
one hand, it defensively resets the current thread's network namespace,
which has the potential to fix up the thread state if other buggy code
had failed to maintain the invariant that an OS thread must be locked to
a goroutine unless it is interchangeable with a "clean" thread as
spawned by the Go runtime. On the other hand, it _facilitates_ writing
buggy code which fails to maintain the aforementioned invariant because
the cleanup function it returns unlocks the thread from the goroutine
unconditionally while neglecting to restore the thread's network
namespace! It is quite scary to need a function which fixes up threads'
network namespaces after the fact as an arbitrary number of goroutines
could have been scheduled onto a "dirty" thread and run non-libnetwork
code before the thread's namespace is fixed up. Any number of
(not-so-)subtle misbehaviours could result if an unfortunate goroutine
is scheduled onto a "dirty" thread. The whole repository has been
audited to ensure that the aforementioned invariant is never violated,
making after-the-fact fixing up of thread network namespaces redundant.
Make InitOSContext() a no-op on Linux and inline the thread-locking into
the function (singular) which previously relied on it to do so.

func ns.SetNamespace() is of similarly dubious utility. It intermixes
capturing the initial network namespace and restoring the thread's
network namespace, which could result in threads getting put into the
wrong network namespace if the wrong thread is the first to call it.
Delete it entirely; functions which need to manipulate a thread's
network namespace are better served by being explicit about capturing
and restoring the thread's namespace.

Rewrite InvokeFunc() to invoke the closure inside a goroutine to enable
a graceful and safe recovery if the thread's network namespace could not
be restored. Avoid any potential race conditions due to changing the
main thread's network namespace by preventing the aforementioned
goroutines from being eligible to be scheduled onto the main thread.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-25 13:35:44 -04:00
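
The pattern described above (run the closure on a dedicated goroutine, and unlock the OS thread only if its state was restored) can be sketched generically. This is a simplified illustration rather than the libnetwork code: enter and restore are hypothetical callbacks standing in for switching and restoring the network namespace, and enter is assumed to be all-or-nothing. The sketch does not cover keeping such goroutines off the main thread.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// errRestore is a sample failure used by the demonstration in main.
var errRestore = errors.New("restore failed")

// invokeLocked runs fn on a fresh goroutine that is locked to its OS thread.
// enter modifies per-thread state and restore undoes it; both are
// hypothetical stand-ins for namespace switching.
func invokeLocked(enter func() error, fn func(), restore func() error) error {
	done := make(chan error, 1)
	go func() {
		runtime.LockOSThread()
		if err := enter(); err != nil {
			// Assumption: a failed enter changed nothing, so the thread
			// is still clean and can be returned to the scheduler.
			runtime.UnlockOSThread()
			done <- err
			return
		}
		fn()
		if err := restore(); err != nil {
			// Stay locked: when a locked goroutine exits, the Go runtime
			// terminates the thread instead of reusing it.
			done <- fmt.Errorf("thread state not restored: %w", err)
			return
		}
		runtime.UnlockOSThread()
		done <- nil
	}()
	return <-done
}

func main() {
	err := invokeLocked(
		func() error { return nil },
		func() { fmt.Println("running with modified thread state") },
		func() error { return errRestore },
	)
	fmt.Println("result:", err)
}
```

Running the closure on its own goroutine means a failed restore costs one discarded thread, while the caller's goroutine and thread are never touched.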
Cory Snider
d1e3705c1a libnet/d/overlay: restore thread netns
func (*network) watchMiss() correctly locks its goroutine to an OS
thread before changing the thread's network namespace, but neglects to
restore the thread's network namespace before unlocking. Fix this
oversight by unlocking iff the thread's network namespace is
successfully restored.

Prevent the watchMiss goroutine from being locked to the main thread to
avoid the issues which would arise if such a situation were to occur.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-25 13:35:44 -04:00
Cory Snider
3e2f0c7a39 libnetwork: fixup thread locking in Linux tests
The parallel tests were unconditionally unlocking the test case
goroutine from the OS thread, irrespective of whether the thread's
network namespace was successfully restored. This was not a problem in
practice as the unpaired calls to runtime.LockOSThread() peppered
through the test case would have prevented the goroutine from being
unlocked. Unlock the goroutine from the thread iff the thread's network
namespace is successfully restored.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-25 13:35:44 -04:00
Brian Goff
ada6ddc794
Merge pull request #44306 from thaJeztah/chrootarchive_mkdir
pkg/chrootarchive: replace system.MkdirAll for os.Mkdir, use t.TempDir()
2022-10-25 09:29:19 -07:00
Sebastiaan van Stijn
a4ce46e06c
Merge pull request #44354 from thaJeztah/vendor_containerd_1.6.9
vendor: github.com/containerd/containerd v1.6.9
2022-10-24 17:00:25 -04:00
Tianon Gravi
ba31a9645c
Merge pull request #44299 from crazy-max/busybox-w32-img
integration: download busybox-w32 from GitHub Release
2022-10-24 20:07:04 +00:00
Cory Snider
afa41b16ea libnetwork/testutils: restore netns on teardown
testutils.SetupTestOSContext() sets the calling thread's network
namespace but neglected to restore it on teardown. This was not a
problem in practice as it called runtime.LockOSThread() twice but
runtime.UnlockOSThread() only once, so the tampered threads would be
terminated by the runtime when the test case returned and replaced with
a clean thread. Correct the utility so it restores the thread's network
namespace during teardown and unlocks the goroutine from the thread on
success.

Remove unnecessary runtime.LockOSThread() calls peppering test cases
which leverage testutils.SetupTestOSContext().

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-24 15:37:46 -04:00
Brian Goff
caeb591fa3
Merge pull request #44351 from thaJeztah/update_containerd_binary
update containerd binary to v1.6.9
2022-10-24 11:56:43 -07:00
Sebastiaan van Stijn
04dc007c76
vendor: github.com/containerd/containerd v1.6.9
release notes: https://github.com/containerd/containerd/releases/tag/v1.6.9

full diff: https://github.com/containerd/containerd/compare/v1.6.8...v1.6.9

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-24 14:17:46 -04:00
Sebastiaan van Stijn
ac79a02ace
update containerd binary to v1.6.9
release notes: https://github.com/containerd/containerd/releases/tag/v1.6.9

full diff: containerd/containerd@v1.6.8...v1.6.9

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-24 13:52:01 -04:00
CrazyMax
4f1d1422de
integration: download busybox-w32 from GitHub Release
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
2022-10-24 19:11:16 +02:00
Sebastiaan van Stijn
40b3fc727d
Merge pull request #44257 from tockn/master
fix typo
2022-10-23 00:07:40 +02:00
Sebastiaan van Stijn
fffa94787c
Merge pull request #44344 from thaJeztah/go1.18_compat
builder/remotecontext/git: allow building on go1.18
2022-10-21 19:38:54 +02:00
Sebastiaan van Stijn
4fdc1bb1fb
builder/remotecontext/git: allow building on go1.18
cmd.Environ() is new in go1.19, and not needed for this specific case.
Without this, trying to use this package in code that uses go1.18 will fail:

    builder/remotecontext/git/gitutils.go:216:23: cmd.Environ undefined (type *exec.Cmd has no field or method Environ)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-21 17:41:41 +02:00
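
One way to stay compatible with go1.18, assuming the command's Env has not been set yet, is to start from os.Environ() rather than (*exec.Cmd).Environ(). The sketch below is illustrative only; commandWithEnv and hasEnv are hypothetical helpers, not functions from gitutils.go.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// commandWithEnv builds a command whose environment is the parent's
// environment plus extra. os.Environ() exists in every Go release, whereas
// (*exec.Cmd).Environ() requires go1.19.
func commandWithEnv(extra []string, name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.Env = append(os.Environ(), extra...)
	return cmd
}

// hasEnv reports whether env contains the exact entry kv ("KEY=value").
func hasEnv(env []string, kv string) bool {
	for _, e := range env {
		if e == kv {
			return true
		}
	}
	return false
}

func main() {
	// The command is only constructed here, never started.
	cmd := commandWithEnv([]string{"EXAMPLE_VAR=1"}, "git", "version")
	fmt.Println("EXAMPLE_VAR set:", hasEnv(cmd.Env, "EXAMPLE_VAR=1"))
}
```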
Sebastiaan van Stijn
43b8dffb83
Merge pull request #44327 from thaJeztah/ghsa-ambiguous-pull-by-digest_master
Validate digest in repo for pull by digest
2022-10-21 14:19:55 +02:00
Sebastiaan van Stijn
b9921a5560
Merge pull request #44273 from thaJeztah/use_walkdir
use filepath.WalkDir instead of filepath.Walk
2022-10-21 02:28:56 +02:00
Sebastiaan van Stijn
08735b4aa8
Merge pull request #44324 from corhere/fix-git-file-leak
builder: Isolate Git from local system
2022-10-21 02:11:33 +02:00
Sebastiaan van Stijn
64cb636b06
Merge pull request #44337 from thaJeztah/buildkit_skip_unit
gha: buildkit: remove "skip-integration-tests" from matrix
2022-10-21 01:59:41 +02:00
Sebastiaan van Stijn
4f43cb660a
skip TestImagePullStoredfDigestForOtherRepo() on Windows and rootless
- On Windows, we don't build and run a local test registry (we're not running
  docker-in-docker), so we need to skip this test.
- On rootless, networking doesn't support this (currently)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-21 01:48:59 +02:00
Brian Goff
27530efedb
Validate digest in repo for pull by digest
This is accomplished by storing the distribution source in the content
labels. If the distribution source is not found, we check the registry to
see whether the digest exists in the repo; if it does, the puller will
use it.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-21 01:48:59 +02:00
Sebastiaan van Stijn
92eca900b0
Revert "testutil/registry: remove unused WithStdout(), WithStErr() opts"
This reverts commit 1f21c4dd05.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-21 01:48:56 +02:00
Sebastiaan van Stijn
c93c9bca8e
Merge pull request #44336 from thaJeztah/buildkit_testskips
gha: update buildkit to v0.10.5-6-ge27c8e24 to skip some tests
2022-10-21 01:47:32 +02:00
Sebastiaan van Stijn
0f2956ab5d
Merge pull request #44302 from thaJeztah/sys_windows
pkg/system: optimize and refactor MkdirAllWithACL()
2022-10-21 00:36:58 +02:00
Sebastiaan van Stijn
413f66f1a3
Merge pull request #44308 from thaJeztah/add_DOCKER_INTEGRATION_USE_SNAPSHOTTER
daemon: add TEST_INTEGRATION_USE_SNAPSHOTTER for CI
2022-10-21 00:22:20 +02:00
Sebastiaan van Stijn
201fdf67ac
gha: update buildkit to v0.10.5-6-ge27c8e24 to skip some tests
full diff: https://github.com/moby/buildkit/compare/v0.10.5...v0.10.5-6-ge27c8e24

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-20 23:49:26 +02:00
Sebastiaan van Stijn
0760c6f4e1
gha: buildkit: make checks more readable
GitHub uses these parameters to construct a name; removing the ./ prefix
to make them more readable (and add them back where it's used)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-20 23:18:44 +02:00
Sebastiaan van Stijn
cfa2f9a2f2
gha: buildkit: remove "skip-integration-tests" from matrix
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-20 23:17:55 +02:00
Cory Snider
67d010bd2c builder: add missing doc comment
Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-20 16:47:18 -04:00
Cory Snider
94672c89cc builder: fix running git commands on Windows
Setting cmd.Env overrides the default of passing through the parent
process' environment, which works out fine most of the time, except when
it doesn't. For whatever reason, leaving out all the environment causes
git-for-windows sh.exe subprocesses to enter an infinite loop of
access violations during Cygwin initialization in certain environments
(specifically, our very own dev container image).

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-20 16:47:18 -04:00
Cory Snider
61acc9939f builder: make git config isolation opt-in
While it is undesirable for the system or user git config to be used
when the daemon clones a Git repo, it could break workflows if it was
unconditionally applied to docker/cli as well.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-20 16:47:18 -04:00
Cory Snider
72119f5d9b builder: isolate git from local system
Prevent git commands we run from reading the user or system
configuration, or cloning submodules from the local filesystem.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-20 16:47:18 -04:00
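
As a rough sketch of what such isolation could look like, the helper below appends a few environment variables to a base environment. The variable choices come from Git's own documentation and are assumptions about a reasonable setup, not a copy of the commit; isolatedGitEnv is a hypothetical name, the list is not claimed to be exhaustive, and HOME=/dev/null only makes sense on Unix-like systems.

```go
package main

import "fmt"

// isolatedGitEnv returns a copy of base followed by variables intended to
// keep Git away from system and user configuration. When a key appears more
// than once in exec.Cmd.Env, os/exec uses the last value, so appending
// overrides whatever base contained.
func isolatedGitEnv(base []string) []string {
	env := make([]string, 0, len(base)+3)
	env = append(env, base...)
	return append(env,
		"GIT_CONFIG_NOSYSTEM=1",    // skip the system-wide gitconfig
		"HOME=/dev/null",           // leave no readable per-user gitconfig
		"GIT_PROTOCOL_FROM_USER=0", // refuse protocols whose policy is "user"
	)
}

func main() {
	for _, kv := range isolatedGitEnv([]string{"PATH=/usr/bin", "HOME=/root"}) {
		fmt.Println(kv)
	}
}
```

Copying base before appending avoids mutating a slice the caller may still be using.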
Cory Snider
0f7b0897cc builder: explicitly set CWD for all git commands
Keep It Simple! Set the working directory for git commands by...setting
the git process's working directory. Git commands can be run in the
parent process's working directory by passing the empty string.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-20 16:47:18 -04:00
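
The approach described above maps directly onto the Dir field of exec.Cmd. A minimal sketch, with gitIn as a hypothetical helper name:

```go
package main

import (
	"fmt"
	"os/exec"
)

// gitIn builds a git command that runs in dir. An empty dir leaves cmd.Dir
// unset, in which case the child runs in the calling process's current
// directory.
func gitIn(dir string, args ...string) *exec.Cmd {
	cmd := exec.Command("git", args...)
	cmd.Dir = dir
	return cmd
}

func main() {
	// The commands are only constructed here, never started.
	fmt.Printf("%q\n", gitIn("/tmp/repo", "status").Dir)
	fmt.Printf("%q\n", gitIn("", "version").Dir)
}
```

Setting the working directory on the process keeps path handling out of the git arguments themselves.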