Commit graph

87 commits

Author SHA1 Message Date
Sebastiaan van Stijn
5ca758199d
replace pkg/locker with github.com/moby/locker
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-09-10 22:15:40 +02:00
Kir Kolyshkin
39048cf656 Really switch to moby/sys/mount*
Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential
outside users.

This commit was generated by the following bash script:

```
set -e -u -o pipefail

for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do
	sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \
		-e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \
		$file
	goimports -w $file
done
```

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-03-20 09:46:25 -07:00
Sebastiaan van Stijn
07ff4f1de8
goimports: fix imports
Format the source according to latest goimports.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-18 12:56:54 +02:00
Kir Kolyshkin
6533136961 pkg/mount: wrap mount/umount errors
The errors returned from Mount and Unmount functions are raw
syscall.Errno errors (like EPERM or EINVAL), which provides
no context about what has happened and why.

Similar to os.PathError type, introduce mount.Error type
with some context. The error messages will now look like this:

> mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted

or

> mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted

Before this patch, it was just

> operation not permitted

[v2: add Cause()]
[v3: rename MountError to Error, document Cause()]
[v4: fixes; audited all users]
[v5: make Error type private; changes after @cpuguy83 reviews]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-12-10 20:07:02 -08:00
Salahuddin Khan
763d839261 Add ADD/COPY --chown flag support to Windows
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.

NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.

The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>

On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.

Signed-off-by: Salahuddin Khan <salah@docker.com>
2018-08-13 21:59:11 -07:00
Sebastiaan van Stijn
f23c00d870
Various code-cleanup
remove unnescessary import aliases, brackets, and so on.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-05-23 17:50:54 +02:00
Alejandro González Hevia
9392838150 Standardized log messages accross the different storage drivers.
Now all of the storage drivers use the field "storage-driver" in their log
messages, which is set to name of the respective driver.
Storage drivers changed:
- Aufs
- Btrfs
- Devicemapper
- Overlay
- Overlay 2
- Zfs

Signed-off-by: Alejandro GonzÃlez Hevia <alejandrgh11@gmail.com>
2018-03-27 14:37:30 +02:00
Kir Kolyshkin
9d00aedebc devmapper cleanup: improve error msg
1. Make sure it's clear the error is from unmount.

2. Simplify the code a bit to make it more readable.

[v2: use errors.Wrap]
[v3: use errors.Wrapf]
[v4: lowercase the error message]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-03-05 10:08:56 -08:00
Kir Kolyshkin
732dd9b848 devmapper/Remove(): use Rmdir, ignore errors
1. Replace EnsureRemoveAll() with Rmdir(), as here we are removing
   the container's mount point, which is already properly unmounted
   and is therefore an empty directory.

2. Ignore the Rmdir() error (but log it unless it's ENOENT). This
   is a mount point, currently unmounted (i.e. an empty directory),
   and an older kernel can return EBUSY if e.g. the mount was
   leaked to other mount namespaces.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-03-02 18:10:57 -08:00
Daniel Nephin
4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
Brian Goff
9803272f2d Do not make graphdriver homes private mounts.
The idea behind making the graphdrivers private is to prevent leaking
mounts into other namespaces.
Unfortunately this is not really what happens.

There is one case where this does work, and that is when the namespace
was created before the daemon's namespace.
However with systemd each system servie winds up with it's own mount
namespace. This causes a race betwen daemon startup and other system
services as to if the mount is actually private.

This also means there is a negative impact when other system services
are started while the daemon is running.

Basically there are too many things that the daemon does not have
control over (nor should it) to be able to protect against these kinds
of leakages. One thing is certain, setting the graphdriver roots to
private disconnects the mount ns heirarchy preventing propagation of
unmounts... new mounts are of course not propagated either, but the
behavior is racey (or just bad in the case of restarting services)... so
it's better to just be able to keep mount propagation in tact.

It also does not protect situations like `-v
/var/lib/docker:/var/lib/docker` where all mounts are recursively bound
into the container anyway.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-01-18 09:34:00 -05:00
Sebastiaan van Stijn
b4a6313969
Golint: remove redundant ifs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-01-15 00:42:25 +01:00
Kir Kolyshkin
516010e92d Simplify/fix MkdirAll usage
This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt to `EEXIST`/`IsExist`:

 - for `Mkdir()`, `IsExist` error should (usually) be ignored
   (unless you want to make sure directory was not there before)
   as it means "the destination directory was already there"

 - for `MkdirAll()`, `IsExist` error should NEVER be ignored.

Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.

Also, there are a couple of cases then IsExist is handled as
"directory already exist" which is wrong. As a result, some code
that never worked as intended is now removed.

NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).

For more details, a quote from my runc commit 6f82d4b (July 2015):

    TL;DR: check for IsExist(err) after a failed MkdirAll() is both
    redundant and wrong -- so two reasons to remove it.

    Quoting MkdirAll documentation:

    > MkdirAll creates a directory named path, along with any necessary
    > parents, and returns nil, or else returns an error. If path
    > is already a directory, MkdirAll does nothing and returns nil.

    This means two things:

    1. If a directory to be created already exists, no error is
    returned.

    2. If the error returned is IsExist (EEXIST), it means there exists
    a non-directory with the same name as MkdirAll need to use for
    directory. Example: we want to MkdirAll("a/b"), but file "a"
    (or "a/b") already exists, so MkdirAll fails.

    The above is a theory, based on quoted documentation and my UNIX
    knowledge.

    3. In practice, though, current MkdirAll implementation [1] returns
    ENOTDIR in most of cases described in #2, with the exception when
    there is a race between MkdirAll and someone else creating the
    last component of MkdirAll argument as a file. In this very case
    MkdirAll() will indeed return EEXIST.

    Because of #1, IsExist check after MkdirAll is not needed.

    Because of #2 and #3, ignoring IsExist error is just plain wrong,
    as directory we require is not created. It's cleaner to report
    the error now.

    Note this error is all over the tree, I guess due to copy-paste,
    or trying to follow the same usage pattern as for Mkdir(),
    or some not quite correct examples on the Internet.

    [1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-11-27 17:32:12 -08:00
Sebastiaan van Stijn
38b3af567f
Remove deprecated MkdirAllAs(), MkdirAs()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-11-21 13:53:54 +01:00
Vincent Demeester
bbc4f78ba9
Merge pull request #34573 from cyphar/dm-dos-prevention-remove-mountpoint
devicemapper: remove container rootfs mountPath after umount
2017-11-08 17:08:07 +01:00
Sebastiaan van Stijn
8f702de9b7
Improve devicemapper driver-status output
Do not print "Data file" and "Metadata file" if they're
not used, and sort/group output.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-10-27 10:12:39 +02:00
Akash Gupta
7a7357dae1 LCOW: Implemented support for docker cp + build
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.

Signed-off-by: Akash Gupta <akagup@microsoft.com>
2017-09-14 12:07:52 -07:00
Daniel Nephin
f7f101d57e Add gosimple linter
Update gometalinter

Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-09-12 12:09:59 -04:00
Aleksa Sarai
92e45b81e0
devicemapper: remove container rootfs mountPath after umount
libdm currently has a fairly substantial DoS bug that makes certain
operations fail on a libdm device if the device has active references
through mountpoints. This is a significant problem with the advent of
mount namespaces and MS_PRIVATE, and can cause certain --volume mounts
to cause libdm to no longer be able to remove containers:

  % docker run -d --name testA busybox top
  % docker run -d --name testB -v /var/lib/docker:/docker busybox top
  % docker rm -f testA
  [fails on libdm with dm_task_run errors.]

This also solves the problem of unprivileged users being able to DoS
docker by using unprivileged mount namespaces to preseve mounts that
Docker has dropped.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-09-06 20:11:01 +10:00
Derek McGowan
1009e6a40b
Update logrus to v1.0.1
Fixes case sensitivity issue

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2017-07-31 13:16:46 -07:00
Brian Goff
54dcbab25e Do not remove containers from memory on error
Before this, if `forceRemove` is set the container data will be removed
no matter what, including if there are issues with removing container
on-disk state (rw layer, container root).

In practice this causes a lot of issues with leaked data sitting on
disk that users are not able to clean up themselves.
This is particularly a problem while the `EBUSY` errors on remove are so
prevalent. So for now let's not keep this behavior.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2017-05-05 17:02:04 -04:00
Tonis Tiigi
fc1cf1911b Add more locking to storage drivers
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2017-02-17 15:50:25 -08:00
Antonio Murdaca
8bcbc6711f
devmapper: fixup error formatting
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-01-09 16:58:09 +01:00
Vivek Goyal
39bdf601f6 devmapper: Return device id in error message
I often get complains that container removal failed and users got following
error message.

"Driver devicemapper failed to remove root filesystem 18a69ba82aaf7a039ce7d44156215012d703001643079775190ac7dd6c6acf56:Device is Busy"

This error message talks about container id but does not give any info
about which particular device id is busy. Most likely device is mounted
in some other mount namespace and if one knows the device id, they
can try to do some debugging figuring which process and which mount
namespace is keeping the device busy and how did we reach that stage.

Without that information, it becomes almost impossible to debug the
problem.

So to improve the debuggability, when device removal fails, also return
device id in error message. Now new message looks as follows.

"Driver devicemapper failed to remove root filesystem 18a69ba82aaf7a039ce7d44156215012d703001643079775190ac7dd6c6acf56: Failed to remove device dbc15bdf9994a17c613d8ef9e924f3cffbf67f91e4f709295c901ad628377991:Device is Busy"

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2017-01-05 15:02:21 -05:00
Vivek Goyal
b937aa8e69 Pass all graphdriver create() parameters in a struct
This allows for easy extension of adding more parameters to existing
parameters list. Otherwise adding a single parameter changes code
at so many places.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-11-09 15:59:58 -05:00
Shishir Mahajan
09d0720e2f Fixes Issue # 22992: docker commit failing.
1) docker create / run / start: this would create a snapshot device and mounts it onto the filesystem.
So the first time GET operation is called. it will create the rootfs directory and return the path to rootfs
2) Now when I do docker commit. It will call the GET operation second time. This time the refcount will check
that the count > 1 (count=2). so the rootfs already exists, it will just return the path to rootfs.

Earlier it was just returning the mp: /var/lib/docker/devicemapper/mnt/{ID} and hence the inconsistent paths error.

Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
2016-05-27 14:35:46 -04:00
Michael Crosby
5b6b8df0c1 Add reference counting to aufs
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-23 15:57:23 -07:00
Michael Crosby
1ba05cdb6a Add fast path for fsmagic supported drivers
For things that we can check if they are mounted by using their fsmagic
we should use that and for others do it the slow way.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-23 15:57:23 -07:00
Michael Crosby
009ee16bef Restore ref count
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-23 15:57:23 -07:00
Brian Goff
227c83826a Merge pull request #21945 from rhvgoyal/export-min-free-space
Export Mininum Thin Pool Free Space through docker info
2016-05-02 20:20:08 -04:00
Brian Goff
7342060b07 Add refcounts to graphdrivers that use fsdiff
This makes sure fsdiff doesn't try to unmount things that shouldn't be.

**Note**: This is intended as a temporary solution to have as minor a
change as possible for 1.11.1. A bigger change will be required in order
to support container re-attach.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-04-21 12:19:57 -04:00
Vivek Goyal
55a9b8123d Export Mininum Thin Pool Free Space through docker info
Right now there is no way to know what's the minimum free space threshold
daemon is applying. It would be good to export it through docker info and
then user knows what's the current value. Also this could be useful to
higher level management tools which can look at this value and setup their
own internal thresholds for image garbage collection etc.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-04-21 15:42:23 +00:00
Stefan J. Wernli
ef5bfad321 Adding readOnly parameter to graphdriver Create method
Since the layer store was introduced, the level above the graphdriver
now differentiates between read/write and read-only layers.  This
distinction is useful for graphdrivers that need to take special steps
when creating a layer based on whether it is read-only or not.
Adding this parameter allows the graphdrivers to differentiate, which
in the case of the Windows graphdriver, removes our dependence on parsing
the id of the parent for "-init" in order to infer this information.

This will also set the stage for unblocking some of the layer store
unit tests in the next preview build of Windows.

Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
2016-04-06 13:52:53 -07:00
Shishir Mahajan
b16decfccf CLI flag for docker create(run) to change block device size.
Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
2016-03-28 10:05:18 -04:00
Brian Goff
65d79e3e5e Move layer mount refcounts to mountedLayer
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.

In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-03-23 14:42:52 -07:00
Tonis Tiigi
e91de9fb9d Revert "Move layer mount refcounts to mountedLayer"
This reverts commit 563d0711f8.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-03-23 00:33:02 -07:00
Brian Goff
563d0711f8 Move layer mount refcounts to mountedLayer
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.

In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-03-22 11:36:28 -04:00
David Calavera
4fef42ba20 Replace pkg/units with docker/go-units.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-12-16 12:26:49 -05:00
Chris Dituri
0aa6ace6e6 Make daemon/graphdriver/devmapper log messages with a common, consistent prefix.
Closes #16667

Uses the prefix "devmapper:" for all the fmt and logrus error, debug, and info messages.

Signed-off-by: Chris Dituri <csdituri@gmail.com>
2015-12-14 21:35:13 -06:00
Justas Brazauskas
927b334ebf Fix typos found across repository
Signed-off-by: Justas Brazauskas <brazauskasjustas@gmail.com>
2015-12-13 18:04:12 +02:00
Antonio Murdaca
037cbcec98 devmapper: remove unused var
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2015-12-10 08:28:02 +01:00
Liu Hua
f7bdb97357 Fix Put without Get in devicemapper
Signed-off-by: Liu Hua <sdu.liu@huawei.com>
2015-12-03 22:22:25 +08:00
Michael Crosby
1ecb9a40db Merge pull request #17974 from anusha-ragunathan/fsMagic
Fix devmapper backend in docker info
2015-11-17 11:44:48 -08:00
Anusha Ragunathan
fdc2641c2b Fix devmapper backend in docker info
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
2015-11-13 21:05:47 -08:00
Dan Walsh
1716d497a4 Relabel BTRFS Content on container Creation
This change will allow us to run SELinux in a container with
BTRFS back end.  We continue to work on fixing the kernel/BTRFS
but this change will allow SELinux Security separation on BTRFS.

It basically relabels the content on container creation.

Just relabling -init directory in BTRFS use case. Everything looks like it
works. I don't believe tar/achive stores the SELinux labels, so we are good
as far as docker commit.

Tested Speed on startup with BTRFS on top of loopback directory. BTRFS
not on loopback should get even better perfomance on startup time.  The
more inodes inside of the container image will increase the relabel time.

This patch will give people who care more about security the option of
runnin BTRFS with SELinux.  Those who don't want to take the slow down
can disable SELinux either in individual containers or for all containers
by continuing to disable SELinux in the daemon.

Without relabel:

> time docker run --security-opt label:disable fedora echo test
test

real    0m0.918s
user    0m0.009s
sys    0m0.026s

With Relabel

test

real    0m1.942s
user    0m0.007s
sys    0m0.030s

Signed-off-by: Dan Walsh <dwalsh@redhat.com>

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2015-11-11 14:49:27 -05:00
Vincent Demeester
5ecbc9747f Merge pull request #16303 from coolljt0725/add_docker_info_show_base_size
Add docker info show base filesystem size of container/image when use devicemapper
2015-10-13 14:43:52 +02:00
Lei Jitang
5c374c7137 Add docker info show base filesystem size of container/image when use devicemapper
Signed-off-by: Lei Jitang <leijitang@huawei.com>
2015-10-10 22:52:05 +08:00
Phil Estes
442b45628e Add user namespace (mapping) support to the Docker engine
Adds support for the daemon to handle user namespace maps as a
per-daemon setting.

Support for handling uid/gid mapping is added to the builder,
archive/unarchive packages and functions, all graphdrivers (except
Windows), and the test suite is updated to handle user namespace daemon
rootgraph changes.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2015-10-09 17:47:37 -04:00
Vivek Goyal
d295dc6652 devmapper: Keep track of number of deleted devices
Keep track of number of deleted devices and export this information through
"docker info".

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-10-06 17:37:21 -04:00
Vivek Goyal
d929589c1f devmapper: Implement deferred deletion functionality
Finally here is the patch to implement deferred deletion functionality.
Deferred deleted devices are marked as "Deleted" in device meta file. 

First we try to delete the device and only if deletion fails and user has
enabled deferred deletion, device is marked for deferred deletion.

When docker starts up again, we go through list of deleted devices and
try to delete these again.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-10-06 17:37:21 -04:00