Commit graph

1061 commits

Author SHA1 Message Date
bin liu
b00a67be6e Fix typos in daemon
Signed-off-by: bin liu <liubin0329@gmail.com>
2018-02-10 19:42:54 +08:00
WANG Chao
9015a05606 graphdriver: Fix RefCounter memory leak
Signed-off-by: WANG Chao <chao.wang@ucloud.cn>
2018-02-09 10:26:06 +08:00
Brian Goff
0e5eaf8ee3 Ensure plugin returns correctly scoped paths
Before this change, volume management was relying on the fact that
everything the plugin mounts is visible on the host within the plugin's
rootfs. In practice this caused some issues with mount leaks, so we
changed the behavior such that mounts are not visible on the plugin's
rootfs, but available outside of it, which breaks volume management.

To fix the issue, allow the plugin to scope the path correctly rather
than assuming that everything is visible in `p.Rootfs`.
In practice this is just scoping the `PropagatedMount` paths to the
correct host path.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-02-07 15:48:27 -05:00
Brian Goff
a53930a04f Plugins perform propagated mount in runtime spec
Setting up the mounts on the host increases chances of mount leakage and
makes for more cleanup after the plugin has stopped.
With this change all mounts for the plugin are performed by the
container runtime and automatically cleaned up when the container exits.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-02-07 15:48:27 -05:00
Brian Goff
2fe4f888be Do not recursive unmount on cleanup of zfs/btrfs
This was added in #36047 just as a way to make sure the tree is fully
unmounted on shutdown.

For ZFS this could be a breaking change since there was no unmount before.
Someone could have setup the zfs tree themselves. It would be better, if
we really do want the cleanup to actually the unpacked layers checking
for mounts rather than a blind recursive unmount of the root.

BTRFS does not use mounts and does not need to unmount anyway.
These was only an unmount to begin with because for some reason the
btrfs tree was being moutned with `private` propagation.

For the other graphdrivers that still have a recursive unmount here...
these were already being unmounted and performing the recursive unmount
shouldn't break anything. If anyone had anything mounted at the
graphdriver location it would have been unmounted on shutdown anyway.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-02-07 15:08:17 -05:00
Daniel Nephin
4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
Yong Tang
03a1df9536
Merge pull request #36114 from Microsoft/jjh/fixdeadlock-lcowdriver-hotremove
LCOW: Graphdriver fix deadlock in hotRemoveVHDs
2018-01-29 09:57:43 -08:00
Michael Crosby
2c05aefc99
Merge pull request #36047 from cpuguy83/graphdriver_improvements
Do not make graphdriver homes private mounts.
2018-01-26 13:54:30 -05:00
John Howard
a44fcd3d27 LCOW: Graphdriver fix deadlock
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-01-26 08:57:52 -08:00
Sebastiaan van Stijn
a8d0e36d03
Merge pull request #36052 from Microsoft/jjh/no-overlay-off-only-one-disk
LCOW: Regular mount if only one layer
2018-01-25 15:46:16 -08:00
Vincent Demeester
3c9d023af3
Merge pull request #36051 from Microsoft/jjh/remotefs-read-return-error
LCOW remotefs - return error in Read() implementation
2018-01-19 11:27:13 -08:00
John Howard
6112ad6e7d LCOW remotefs - return error in Read() implementation
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-01-18 17:46:58 -08:00
John Howard
420dc4eeb4 LCOW: Regular mount if only one layer
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-01-18 12:01:58 -08:00
John Howard
afd305c4b5 LCOW: Refactor to multiple layer-stores based on feedback
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-01-18 08:31:05 -08:00
Brian Goff
9803272f2d Do not make graphdriver homes private mounts.
The idea behind making the graphdrivers private is to prevent leaking
mounts into other namespaces.
Unfortunately this is not really what happens.

There is one case where this does work, and that is when the namespace
was created before the daemon's namespace.
However with systemd each system servie winds up with it's own mount
namespace. This causes a race betwen daemon startup and other system
services as to if the mount is actually private.

This also means there is a negative impact when other system services
are started while the daemon is running.

Basically there are too many things that the daemon does not have
control over (nor should it) to be able to protect against these kinds
of leakages. One thing is certain, setting the graphdriver roots to
private disconnects the mount ns heirarchy preventing propagation of
unmounts... new mounts are of course not propagated either, but the
behavior is racey (or just bad in the case of restarting services)... so
it's better to just be able to keep mount propagation in tact.

It also does not protect situations like `-v
/var/lib/docker:/var/lib/docker` where all mounts are recursively bound
into the container anyway.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-01-18 09:34:00 -05:00
John Howard
141b9a7471 LCOW: Fix OpenFile parameters
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-01-17 07:58:18 -08:00
Drew Hubl
27b002f4a0 Improve zfs init log message for zfs
Signed-off-by: Drew Hubl <drew.hubl@gmail.com>
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-01-16 21:42:05 -05:00
Yong Tang
c36274da83
Merge pull request #35638 from cpuguy83/error_helpers2
Add helpers to create errdef errors
2018-01-15 10:56:46 -08:00
Sebastiaan van Stijn
b4a6313969
Golint: remove redundant ifs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-01-15 00:42:25 +01:00
Brian Goff
d453fe35b9 Move api/errdefs to errdefs
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-01-11 21:21:43 -05:00
Daniel Nephin
e3a6323419 Cleanup graphdriver/quota test
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-01-08 13:41:45 -05:00
Tõnis Tiigi
daded8da91
Merge pull request #35674 from kolyshkin/zfs-rmdir
zfs: fix ebusy on umount etc
2017-12-28 15:35:19 -08:00
Yong Tang
3e1df952b7
Merge pull request #35827 from kolyshkin/vfs-quota
Fix VFS vs quota regression
2017-12-22 12:32:03 -06:00
Kir Kolyshkin
1e8a087850 vfs gd: ignore quota setup errors
This is a fix to regression in vfs graph driver introduced by
commit 7a1618ced3 ("add quota support to VFS graphdriver").

On some filesystems, vfs fails to init with the following error:

> Error starting daemon: error initializing graphdriver: Failed to mknod
> /go/src/github.com/docker/docker/bundles/test-integration/d6bcf6de610e9/root/vfs/backingFsBlockDev:
> function not implemented

As quota is not essential for vfs, let's ignore (but log as a warning) any error
from quota init.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-12-19 13:55:04 -08:00
Kir Kolyshkin
2dd39b7841 projectquota: treat ENOSYS as quota unsupported
If mknod() returns ENOSYS, it most probably means quota is not supported
here, so return the appropriate error.

This is a conservative* fix to regression in vfs graph driver introduced
by commit 7a1618ced3 ("add quota support to VFS graphdriver").
On some filesystems, vfs fails to init with the following error:

> Error starting daemon: error initializing graphdriver: Failed to mknod
> /go/src/github.com/docker/docker/bundles/test-integration/d6bcf6de610e9/root/vfs/backingFsBlockDev:
> function not implemented

Reported-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-12-18 10:32:36 -08:00
Sebastiaan van Stijn
6ed1163c98
Remove redundant build-tags
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-12-18 17:41:53 +01:00
Phil Estes
05b8d59015
Fix overlay2 storage driver inside a user namespace
The overlay2 driver was not setting up the archive.TarOptions field
properly like other storage backend routes to "applyTarLayer"
functionality. The InUserNS field is populated now for overlay2 using
the same query function used by the other storage drivers.

Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
2017-12-13 23:38:22 -05:00
Vivek Goyal
f728d74ac5 devmapper: Log fstype and mount options during mount error
Right now we only log source and destination (and demsg) if mount operation
fails. fstype and mount options are available easily. It probably is a good
idea to log these as well. Especially sometimes failures can happen due to
mount options.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2017-12-11 13:27:14 -05:00
Victor Vieux
bd8a9c25ee
Merge pull request #35514 from thaJeztah/disable-overlay-without-d_type
Remove support for overlay/overlay2 without d_type
2017-12-06 14:13:58 -08:00
Jihyun Hwang
518c50c9b2 fixed typo (reliablity -> reliability)
Signed-off-by: Jihyun Hwang <jhhwang@telcoware.com>
2017-12-06 09:51:53 +09:00
Sebastiaan van Stijn
0a4e793a3d
Allow existing setups to continue using d_type
Even though it's highly discouraged, there are existing
installs that are running overlay/overlay2 on filesystems
without d_type support.

This patch allows the daemon to start in such cases, instead of
refusing to start without an option to override.

For fresh installs, backing filesystems without d_type support
will still cause the overlay/overlay2 drivers to be marked as
"unsupported", and skipped during the automatic selection.

This feature is only to keep backward compatibility, but
will be removed at some point.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-12-04 18:41:25 -08:00
Yong Tang
4047cede65
Merge pull request #35537 from sargun/vfs-use-copy_file_range
Have VFS graphdriver use accelerated in-kernel copy
2017-12-04 19:34:56 -06:00
Sebastiaan van Stijn
0abb8dec3f
Remove support for overlay/overlay2 without d_type
Support for running overlay/overlay2 on a backing filesystem
without d_type support (most likely: xfs, as ext4 supports
this by default), was deprecated for some time.

Running without d_type support is problematic, and can
lead to difficult to debug issues ("invalid argument" errors,
or unable to remove files from the container's filesystem).

This patch turns the warning that was previously printed
into an "unsupported" error, so that the overlay/overlay2
drivers are not automatically selected when detecting supported
storage drivers.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-12-04 17:10:20 -08:00
Sebastiaan van Stijn
f9c8fa305e
Perform fsmagic detection on driver's home-dir if it exists
The fsmagic check was always performed on "data-root" (`/var/lib/docker`),
not on the storage-driver's home directory (e.g. `/var/lib/docker/<somedriver>`).

This caused detection to be done on the wrong filesystem in situations
where `/var/lib/docker/<somedriver>` was a mount, and a different
filesystem than `/var/lib/docker` itself.

This patch checks if the storage-driver's home directory exists, and only
falls back to `/var/lib/docker` if it doesn't exist.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-12-04 17:10:07 -08:00
Kir Kolyshkin
a450a575a6 zfs: fix ebusy on umount
This commit is a set of fixes and improvement for zfs graph driver,
in particular:

1. Remove mount point after umount in `Get()` error path, as well
   as in `Put()` (with `MNT_DETACH` flag). This should solve "failed
   to remove root filesystem for <ID> .... dataset is busy" error
   reported in Moby issue 35642.

To reproduce the issue:

   - start dockerd with zfs
   - docker run -d --name c1 --rm busybox top
   - docker run -d --name c2 --rm busybox top
   - docker stop c1
   - docker rm c1

Output when the bug is present:
```
Error response from daemon: driver "zfs" failed to remove root
filesystem for XXX : exit status 1: "/sbin/zfs zfs destroy -r
scratch/docker/YYY" => cannot destroy 'scratch/docker/YYY':
dataset is busy
```

Output when the bug is fixed:
```
Error: No such container: c1
```
(as the container has been successfully autoremoved on stop)

2. Fix/improve error handling in `Get()` -- do not try to umount
   if `refcount` > 0

3. Simplifies unmount in `Get()`. Specifically, remove call to
   `graphdriver.Mounted()` (which checks if fs is mounted using
   `statfs()` and check for fs type) and `mount.Unmount()` (which
   parses `/proc/self/mountinfo`). Calling `unix.Unmount()` is
   simple and sufficient.

4. Add unmounting of driver's home to `Cleanup()`.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-12-01 13:46:47 -08:00
Sargun Dhillon
77a2bc3e5b Fix setting mtimes on directories
Previously, the code would set the mtime on the directories before
creating files in the directory itself. This was problematic
because it resulted in the mtimes on the directories being
incorrectly set. This change makes it so that the mtime is
set only _after_ all of the files have been created.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2017-12-01 09:12:43 -08:00
Michael Crosby
72e45fd54e
Merge pull request #35618 from kolyshkin/mkdir-all
Fix MkdirAll* and its usage
2017-11-30 11:19:29 -05:00
Tõnis Tiigi
bdd9668b48
Merge pull request #35483 from thaJeztah/disallow-nfs-backing-for-overlay
Disallow overlay/overlay2 on top of NFS
2017-11-29 19:24:58 -08:00
Victor Vieux
11e07e7da6
Merge pull request #35527 from thaJeztah/feature-detect-overlay2
Detect overlay2 support on pre-4.0 kernels
2017-11-29 13:51:26 -08:00
Akihiro Suda
09eb7bcc36
Merge pull request #34948 from euank/public-mounts
Fix EBUSY errors under overlayfs and v4.13+ kernels
2017-11-29 12:48:39 +09:00
Sargun Dhillon
0eac562281 Fix bug, where copy_file_range was still calling legacy copy
There was a small issue here, where it copied the data using
traditional mechanisms, even when copy_file_range was successful.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2017-11-28 14:59:56 -08:00
Sargun Dhillon
d2b71b2660 Have VFS graphdriver use accelerated in-kernel copy
This change makes the VFS graphdriver use the kernel-accelerated
(copy_file_range) mechanism of copying files, which is able to
leverage reflinks.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2017-11-28 14:59:56 -08:00
Sargun Dhillon
b467f8b2ef Fix copying hardlinks in graphdriver/copy
Previously, graphdriver/copy would improperly copy hardlinks as just regular
files. This patch changes that behaviour, and instead the code now keeps
track of inode numbers, and if it sees the same inode number again
during the copy loop, it hardlinks it, instead of copying it.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2017-11-28 14:59:56 -08:00
Sebastiaan van Stijn
955c1f881a
Detect overlay2 support on pre-4.0 kernels
The overlay2 storage-driver requires multiple lower dir
support for overlayFs. Support for this feature was added
in kernel 4.x, but some distros (RHEL 7.4, CentOS 7.4) ship with
an older kernel with this feature backported.

This patch adds feature-detection for multiple lower dirs,
and will perform this feature-detection on pre-4.x kernels
with overlayFS support.

With this patch applied, daemons running on a kernel
with multiple lower dir support will now select "overlay2"
as storage-driver, instead of falling back to "overlay".

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-11-28 13:55:33 -08:00
Victor Vieux
9ae697167c
Merge pull request #35528 from thaJeztah/ignore-empty-graphdirs
Skip empty directories on prior graphdriver detection
2017-11-27 17:37:39 -08:00
Kir Kolyshkin
516010e92d Simplify/fix MkdirAll usage
This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt to `EEXIST`/`IsExist`:

 - for `Mkdir()`, `IsExist` error should (usually) be ignored
   (unless you want to make sure directory was not there before)
   as it means "the destination directory was already there"

 - for `MkdirAll()`, `IsExist` error should NEVER be ignored.

Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.

Also, there are a couple of cases then IsExist is handled as
"directory already exist" which is wrong. As a result, some code
that never worked as intended is now removed.

NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).

For more details, a quote from my runc commit 6f82d4b (July 2015):

    TL;DR: check for IsExist(err) after a failed MkdirAll() is both
    redundant and wrong -- so two reasons to remove it.

    Quoting MkdirAll documentation:

    > MkdirAll creates a directory named path, along with any necessary
    > parents, and returns nil, or else returns an error. If path
    > is already a directory, MkdirAll does nothing and returns nil.

    This means two things:

    1. If a directory to be created already exists, no error is
    returned.

    2. If the error returned is IsExist (EEXIST), it means there exists
    a non-directory with the same name as MkdirAll need to use for
    directory. Example: we want to MkdirAll("a/b"), but file "a"
    (or "a/b") already exists, so MkdirAll fails.

    The above is a theory, based on quoted documentation and my UNIX
    knowledge.

    3. In practice, though, current MkdirAll implementation [1] returns
    ENOTDIR in most of cases described in #2, with the exception when
    there is a race between MkdirAll and someone else creating the
    last component of MkdirAll argument as a file. In this very case
    MkdirAll() will indeed return EEXIST.

    Because of #1, IsExist check after MkdirAll is not needed.

    Because of #2 and #3, ignoring IsExist error is just plain wrong,
    as directory we require is not created. It's cleaner to report
    the error now.

    Note this error is all over the tree, I guess due to copy-paste,
    or trying to follow the same usage pattern as for Mkdir(),
    or some not quite correct examples on the Internet.

    [1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-11-27 17:32:12 -08:00
Euan Kemp
af0d589623 graphdriver/overlay{,2}: remove 'merged' on umount
This removes and recreates the merged dir with each umount/mount
respectively.
This is done to make the impact of leaking mountpoints have less
user-visible impact.

It's fairly easy to accidentally leak mountpoints (even if moby doesn't,
other tools on linux like 'unshare' are quite able to incidentally do
so).

As of recently, overlayfs reacts to these mounts being leaked (see

One trick to force an unmount is to remove the mounted directory and
recreate it. Devicemapper now does this, overlay can follow suit.

Signed-off-by: Euan Kemp <euan.kemp@coreos.com>
2017-11-22 14:32:30 -08:00
Euan Kemp
1e214c0952 graphdriver/overlay: minor doc comment cleanup
Signed-off-by: Euan Kemp <euan.kemp@coreos.com>
2017-11-22 14:17:08 -08:00
Sebastiaan van Stijn
1262c57714
Skip empty directories on prior graphdriver detection
When starting the daemon, the `/var/lib/docker` directory
is scanned for existing directories, so that the previously
selected graphdriver will automatically be used.

In some situations, empty directories are present (those
directories can be created during feature detection of
graph-drivers), in which case the daemon refuses to start.

This patch improves detection, and skips empty directories,
so that leftover directories don't cause the daemon to
fail.

Before this change:

    $ mkdir /var/lib/docker /var/lib/docker/aufs /var/lib/docker/overlay2
    $ dockerd
    ...
    Error starting daemon: error initializing graphdriver: /var/lib/docker contains several valid graphdrivers: overlay2, aufs; Please cleanup or explicitly choose storage driver (-s <DRIVER>)

With this patch applied:

    $ mkdir /var/lib/docker /var/lib/docker/aufs /var/lib/docker/overlay2
    $ dockerd
    ...
    INFO[2017-11-16T17:26:43.207739140Z] Docker daemon                                 commit=ab90bc296 graphdriver(s)=overlay2 version=dev
    INFO[2017-11-16T17:26:43.208033095Z] Daemon has completed initialization

And on restart (prior graphdriver is still picked up):

    $ dockerd
    ...
    INFO[2017-11-16T17:27:52.260361465Z] [graphdriver] using prior storage driver: overlay2

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-11-21 15:42:04 +01:00
Sebastiaan van Stijn
38b3af567f
Remove deprecated MkdirAllAs(), MkdirAs()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-11-21 13:53:54 +01:00