Commit 7a1618ced3 regresses running Docker
in user namespaces. The new check for whether quota are supported calls
NewControl() which in turn calls makeBackingFsDev() which tries to
mknod(). Skip quota tests when we detect that we are running in a user
namespace and return ErrQuotaNotSupported to the caller. This just
restores the status quo.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Add a way to specify a custom graphdriver priority list
during build. This can be done with something like
go build -ldflags "-X github.com/docker/docker/daemon/graphdriver.priority=overlay2,devicemapper"
As ldflags are already used by the engine build process, and it seems
that only one (last) `-ldflags` argument is taken into account by go,
an envoronment variable `DOCKER_LDFLAGS` is introduced in order to
be able to append some text to `-ldflags`. With this in place,
using the feature becomes
make DOCKER_LDFLAGS="-X github.com/docker/docker/daemon/graphdriver.priority=overlay2,devicemapper" dynbinary
The idea behind this is, the priority list might be different
for different distros, so vendors are now able to change it
without patching the source code.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Make it possible to disable overlay and overlay2 separately.
With this commit, we now have `exclude_graphdriver_overlay` and
`exclude_graphdriver_overlay2` build tags for the engine, which
is in line with any other graph driver.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
From https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt:
> The lower filesystem can be any filesystem supported by Linux and does
> not need to be writable. The lower filesystem can even be another
> overlayfs. The upper filesystem will normally be writable and if it
> is it must support the creation of trusted.* extended attributes, and
> must provide valid d_type in readdir responses, so NFS is not suitable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In order to avoid reverting our fix for mount leakage in devicemapper,
add a test which checks that devicemapper's Get() and Put() cycle can
survive having a command running in an rprivate mount propagation setup
in-between. While this is quite rudimentary, it should be sufficient.
We have to skip this test for pre-3.18 kernels.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This patch adds the capability for the VFS graphdriver to use
XFS project quotas. It reuses the existing quota management
code that was created by overlay2 on XFS.
It doesn't rely on a filesystem whitelist, but instead
the quota-capability detection code.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This adds a mechanism (read-only) to check for project quota support
in a standard way. This mechanism is leveraged by the tests, which
test for the following:
1. Can we get a quota controller?
2. Can we set the quota for a particular directory?
3. Is the quota being over-enforced?
4. Is the quota being under-enforced?
5. Can we retrieve the quota?
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This changeset allows Docker's VFS, and Overlay to take advantage of
Linux's zerocopy APIs.
The copy function first tries to use the ficlone ioctl. Reason being:
- they do not allow partial success (aka short writes)
- clones are expected to be a fast metadata operation
See: http://oss.sgi.com/archives/xfs/2015-12/msg00356.html
If the clone fails, we fall back to copy_file_range, which internally
may fall back to splice, which has an upper limit on the size
of copy it can perform. Given that, we have to loop until the copy
is done.
For a given dirCopy operation, if the clone fails, we will not try
it again during any other file copy. Same is true with copy_file_range.
If all else fails, we fall back to traditional copy.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Signed-off-by: John Howard <jhoward@microsoft.com>
This PR has the API changes described in https://github.com/moby/moby/issues/34617.
Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded
OCI Image-spec `Platform` structure.
In addition, it renames (almost all) uses of a string variable platform (and associated)
methods/functions to os. This makes it much clearer to disambiguate with the swarm
"platform" which is really os/arch. This is a stepping stone to getting the daemon towards
fully multi-platform/arch-aware, and makes it clear when "operating system" is being
referred to rather than "platform" which is misleadingly used - sometimes in the swarm
meaning, but more often as just the operating system.
When use overlay2 as the graphdriver and the kernel enable
`CONFIG_OVERLAY_FS_REDIRECT_DIR=y`, rename a dir in lower layer
will has a xattr to redirct its dir to source dir. This make the
image layer unportable. This patch fallback to use naive diff driver
when kernel enable CONFIG_OVERLAY_FS_REDIRECT_DIR
Signed-off-by: Lei Jitang <leijitang@huawei.com>
The change in 7a7357dae1 inadvertently
changed the `defer` error code into a no-op. This restores its behavior
prior to that code change, and also introduces a little more error
logging.
Signed-off-by: Euan Kemp <euan.kemp@coreos.com>
It was causing the error message to be
'overlay' is not supported over <unknown>
instead of
'overlay' is not supported over ecryptfs
Signed-off-by: Iago López Galeiras <iago@kinvolk.io>
This commit reverts a hunk of commit 2f5f0af3f ("Add unconvert linter")
and adds a hint for unconvert linter to ignore excessive conversion as
it is required on 32-bit platforms (e.g. armhf).
The exact error on armhf is this:
19:06:45 ---> Making bundle: dynbinary (in bundles/17.06.0-dev/dynbinary)
19:06:48 Building: bundles/17.06.0-dev/dynbinary-daemon/dockerd-17.06.0-dev
19:10:58 # github.com/docker/docker/daemon/graphdriver/overlay
19:10:58 daemon/graphdriver/overlay/copy.go:161: cannot use stat.Atim.Sec (type int32) as type int64 in argument to time.Unix
19:10:58 daemon/graphdriver/overlay/copy.go:161: cannot use stat.Atim.Nsec (type int32) as type int64 in argument to time.Unix
19:10:58 daemon/graphdriver/overlay/copy.go:162: cannot use stat.Mtim.Sec (type int32) as type int64 in argument to time.Unix
19:10:58 daemon/graphdriver/overlay/copy.go:162: cannot use stat.Mtim.Nsec (type int32) as type int64 in argument to time.Unix
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Instead of providing a generic message listing all possible reasons
why xfs is not available on the system, let's be specific.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
If mount fails, the reason might be right there in the kernel log ring buffer.
Let's include it in the error message, it might be of great help.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since the update to Debian Stretch, devmapper unit test fails. One
reason is, the combination of somewhat old (less than 3.16) kernel and
relatively new xfsprogs leads to creating a filesystem which is not supported
by the kernel:
> [12206.467518] XFS (dm-1): Superblock has unknown read-only compatible features (0x1) enabled.
> [12206.472046] XFS (dm-1): Attempted to mount read-only compatible filesystem read-write.
> Filesystem can only be safely mounted read only.
> [12206.472079] XFS (dm-1): SB validate failed with error 22.
Ideally, that would be automatically and implicitly handled by xfsprogs.
In real life, we have to take care about it here. Sigh.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Static build with devmapper is impossible now since libudev is required
and no static version of libudev is available (as static libraries are
not supported by systemd which udev is part of).
This should not hurt anyone as "[t]he primary user of static builds
is the Editions, and docker in docker via the containers, and none
of those use device mapper".
Also, since the need for static libdevmapper is gone, there is no need
to self-compile libdevmapper -- let's use the one from Debian Stretch.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Make sure to call C.free on C string allocated using C.CString in every
exit path.
C.CString allocates memory in the C heap using malloc. It is the callers
responsibility to free them. See
https://golang.org/cmd/cgo/#hdr-Go_references_to_C for details.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.
Signed-off-by: Akash Gupta <akagup@microsoft.com>
This commit reverts a hunk of commit 2f5f0af3f ("Add unconvert linter")
and adds a hint for unconvert linter to ignore excessive conversion as
it is required on 32-bit platforms (e.g. armhf).
The exact error on armhf is this:
19:06:45 ---> Making bundle: dynbinary (in bundles/17.06.0-dev/dynbinary)
19:06:48 Building: bundles/17.06.0-dev/dynbinary-daemon/dockerd-17.06.0-dev
19:10:58 # github.com/docker/docker/daemon/graphdriver/overlay
19:10:58 daemon/graphdriver/overlay/copy.go:161: cannot use stat.Atim.Sec (type int32) as type int64 in argument to time.Unix
19:10:58 daemon/graphdriver/overlay/copy.go:161: cannot use stat.Atim.Nsec (type int32) as type int64 in argument to time.Unix
19:10:58 daemon/graphdriver/overlay/copy.go:162: cannot use stat.Mtim.Sec (type int32) as type int64 in argument to time.Unix
19:10:58 daemon/graphdriver/overlay/copy.go:162: cannot use stat.Mtim.Nsec (type int32) as type int64 in argument to time.Unix
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
libdm currently has a fairly substantial DoS bug that makes certain
operations fail on a libdm device if the device has active references
through mountpoints. This is a significant problem with the advent of
mount namespaces and MS_PRIVATE, and can cause certain --volume mounts
to cause libdm to no longer be able to remove containers:
% docker run -d --name testA busybox top
% docker run -d --name testB -v /var/lib/docker:/docker busybox top
% docker rm -f testA
[fails on libdm with dm_task_run errors.]
This also solves the problem of unprivileged users being able to DoS
docker by using unprivileged mount namespaces to preseve mounts that
Docker has dropped.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
In d42dbdd3d4 the code was re-arranged to
better report errors, and ignore non-errors.
In doing so we removed a deferred remove of the AUFS diff path, but did
not replace it with a non-deferred one.
This fixes the issue and makes the code a bit more readable.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
I was able to successfully use device mapper autoconfig feature
(commit 5ef07d79c) but it stopped working after a reboot.
Investigation shown that the dm device was not activated because of
a missing binary, that is not used during initial setup, but every
following time. Here's an error shown when trying to manually activate
the device:
> kir@kd:~/go/src/github.com/docker/docker$ sudo lvchange -a y /dev/docker/thinpool
> /usr/sbin/thin_check: execvp failed: No such file or directory
> Check of pool docker/thinpool failed (status:2). Manual repair required!
Surely, there is no solution to this other than to have a package that
provides the thin_check binary installed beforehand. Due to the fact
the issue revealed itself way later than DM setup was performed, it was
somewhat harder to investigate.
With this in mind, let's check for binary presense before setting up DM,
refusing to proceed if the binary is not there, saving a user from later
frustration.
While at it, eliminate repeated binary checking code. The downside is
that the binary lookup is happening more than once now -- I think the
clarity of code overweights this minor de-optimization.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>