This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt to `EEXIST`/`IsExist`:
- for `Mkdir()`, `IsExist` error should (usually) be ignored
(unless you want to make sure directory was not there before)
as it means "the destination directory was already there"
- for `MkdirAll()`, `IsExist` error should NEVER be ignored.
Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.
Also, there are a couple of cases then IsExist is handled as
"directory already exist" which is wrong. As a result, some code
that never worked as intended is now removed.
NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).
For more details, a quote from my runc commit 6f82d4b (July 2015):
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is
returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll need to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Before this, if a volume exists in a driver but not in the local cache,
the store would just return a bare volume. This means that if a user
supplied options or labels, they will not get stored.
Instead only return early if we have the volume stored locally. Note
this could still have an issue with labels/opts passed in by the user
differing from what is stored, however this isn't really a new problem.
This fixes a problem where if there is a shared storage backend between
two docker nodes, a create on one node will have labels stored and a
create on the other node will not.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In some circumstances we were not properly releasing plugin references,
leading to failures in removing a plugin with no way to recover other
than restarting the daemon.
1. If volume create fails (in the driver)
2. If a driver validation fails (should be rare)
3. If trying to get a plugin that does not match the passed in capability
Ideally the test for 1 and 2 would just be a unit test, however the
plugin interfaces are too complicated as `plugingetter` relies on
github.com/pkg/plugin/Client (a concrete type), which will require
spinning up services from within the unit test... it just wouldn't be a
unit test at this point.
I attempted to refactor this a bit, but since both libnetwork and
swarmkit are reliant on `plugingetter` as well, this would not work.
This really requires a re-write of the lower-level plugin management to
decouple these pieces.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When using a volume via the `Binds` API, a shared selinux label is
automatically set.
The `Mounts` API is not setting this, which makes volumes specified via
the mounts API useless when selinux is enabled.
This fix adopts the same selinux label for volumes on the mounts API as on
binds.
Note in the case of both the `Binds` API and the `Mounts` API, the
selinux label is only applied when the volume driver is the `local`
driver.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Current insider builds of Windows have support for mounting individual
named pipe servers from the host to the guest. This allows, for example,
exposing the docker engine's named pipe to a container.
This change allows the user to request such a mount via the normal bind
mount syntax in the CLI:
docker run -v \\.\pipe\docker_engine:\\.\pipe\docker_engine <args>
Signed-off-by: John Starks <jostarks@microsoft.com>
[1.12.x] Fix issue where volume metadata was not removed
(cherry picked from commit 7613b23a58)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Conflicts:
volume/store/store.go
volume/store/store_test.go
If a container mount the socket the daemon is listening on into
container while the daemon is being shutdown, the socket will
not exist on the host, then daemon will assume it's a directory
and create it on the host, this will cause the daemon can't start
next time.
fix issue https://github.com/moby/moby/issues/30348
To reproduce this issue, you can add following code
```
--- a/daemon/oci_linux.go
+++ b/daemon/oci_linux.go
@@ -8,6 +8,7 @@ import (
"sort"
"strconv"
"strings"
+ "time"
"github.com/Sirupsen/logrus"
"github.com/docker/docker/container"
@@ -666,7 +667,8 @@ func (daemon *Daemon) createSpec(c *container.Container) (*libcontainerd.Spec, e
if err := daemon.setupIpcDirs(c); err != nil {
return nil, err
}
-
+ fmt.Printf("===please stop the daemon===\n")
+ time.Sleep(time.Second * 2)
ms, err := daemon.setupMounts(c)
if err != nil {
return nil, err
```
step1 run a container which has `--restart always` and `-v /var/run/docker.sock:/sock`
```
$ docker run -ti --restart always -v /var/run/docker.sock:/sock busybox
/ #
```
step2 exit the the container
```
/ # exit
```
and kill the daemon when you see
```
===please stop the daemon===
```
in the daemon log
The daemon can't restart again and fail with `can't create unix socket /var/run/docker.sock: is a directory`.
Signed-off-by: Lei Jitang <leijitang@huawei.com>
Closes#32663 by adding CreatedAt field when volume is created.
Displaying CreatedAt value when volume is inspected
Adding tests to verfiy the new field is correctly populated
Signed-off-by: Marianna <mtesselh@gmail.com>
Moving CreatedAt tests from the CLI
Moving the tests added for the newly added CreatedAt field for Volume, from CLI to API tests
Signed-off-by: Marianna <mtesselh@gmail.com>
This makes sure that multiple users of MountPoint pointer can
mount/unmount without affecting each other.
Before this PR, if you run a container (stay running), then do `docker
cp`, when the `docker cp` is done the MountPoint is mutated such that
when the container stops the volume driver will not get an Unmount
request. Effectively there would be two mounts with only one unmount.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When there is an error unmounting a local volume, it is still possible
to call `Remove()` on the volume causing removal of the mounted
resources which is generally not desirable.
This ensures that resources are unmounted before attempting removal.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Until and unless user has specified a propagation property for volume, they
should default to "rprivate" and it should be passed to runc.
We can't make it conditional on HasPropagation(). GetPropagation() returns
default of rprivate if noting was passed in by user.
If we don't pass "rprivate" to runc, then bind mount could be shared even
if user did not ask for it. For example, mount two volumes in a container.
One is "shared" while other's propagation is not specified by caller. If
both volume has same source mount point of "shared", then second volume
will also be shared inside container (instead of being private).
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This adds 'consistency' mode flags to the mount command line argument.
Initially, the valid 'consistency' flags are 'consistent', 'cached',
'delegated', and 'default'.
Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: Jeremy Yallop <yallop@docker.com>
There was no validation for `docker run --tmpfs foo`.
In this PR, only two obvious rules are implemented:
- path must be absolute
- path must not be "/"
We should add more rules carefully.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This patch fixed below 4 types of code line
1. Remove unnecessary variable assignment
2. Use variables declaration instead of explicit initial zero value
3. Change variable name to underbar when variable not used
4. Add erro check and return for ignored error
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
Currently local volumes and other volumes that support SELinux do
not get labeled correctly. This patch will allow a user to specify
:Z or :z when mounting a volume and have it fix the label of the newly
created volume.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Go style calls for mixed caps instead of all caps:
https://golang.org/doc/effective_go.html#mixed-caps
Change LOOKUP, ACQUIRE, and RELEASE to Lookup, Acquire, and Release.
This vendors a fork of libnetwork for now, to deal with a cyclic
dependency issue. The change will be upstream to libnetwork once this is
merged.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Move plugins to shared distribution stack with images.
Create immutable plugin config that matches schema2 requirements.
Ensure data being pushed is same as pulled/created.
Store distribution artifacts in a blobstore.
Run init layer setup for every plugin start.
Fix breakouts from unsafe file accesses.
Add support for `docker plugin install --alias`
Uses normalized references for default names to avoid collisions when using default hosts/tags.
Some refactoring of the plugin manager to support the change, like removing the singleton manager and adding manager config struct.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
bolt k/v pairs are only valid for the life of a transaction.
This means the memory that the k/v pair is referencing may be invalid if
it is accessed outside of the transaction.
This can potentially cause a panic.
For reference: https://godoc.org/github.com/boltdb/bolt#hdr-Caveats
To fix this issue, unmarshal the stored data into volume meta before
closing the transaction.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Previously, it was comparing against the driver name passed in by the
caller. This could lead to subtle issues when using plugins, like
"plugin" vs. "plugin:latest".
Also, remove "conflict:" prefix to improve the error message.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Ensures all known volumes (known b/c they are persisted to disk) have
their volume drivers refcounted properly.
In testing this, I found an issue with `--live-restore` (required since
currently the provided volume plugin doesn't keep state on restart)
where restorted plugins did not have a plugin client loaded causing a
panic when trying to use the plugin.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Adds 2 new methods to v2 plugin `Acquire` and `Release` which allow
refcounting directly at the plugin level instead of just the store.
Since a graphdriver is initialized exactly once, and is really managed
by a separate object, it didn't really seem right to call
`getter.Get()` to refcount graphdriver plugins.
On shutdown it was particularly weird where we'd either need to keep a
driver reference in daemon, or keep a reference to the pluggin getter in
the layer store, and even then still store extra details on if the
graphdriver is a plugin or not.
Instead the plugin proxy itself will handle calling the neccessary
refcounting methods directly on the plugin object.
Also adds a new interface in `plugingetter` to account for these new
functions which are not going to be implemented by v1 plugins.
Changes terms `plugingetter.CREATE` and `plugingetter.REMOVE` to
`ACQUIRE` and `RELEASE` respectively, which seems to be better
adjectives for what we're doing.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Legacy plugins expect host-relative paths (such as for Volume.Mount).
However, a containerized plugin cannot respond with a host-relative
path. Therefore, this commit modifies new volume plugins' paths in Mount
and List to prepend the container's rootfs path.
This introduces a new PropagatedMount field in the Plugin Config.
When it is set for volume plugins, RootfsPropagation is set to rshared
and the path specified by PropagatedMount is bind-mounted with rshared
prior to launching the container. This is so that the daemon code can
access the paths returned by the plugin from the host mount namespace.
Signed-off-by: Tibor Vass <tibor@docker.com>
The current implementation of getRefs is a bit fragile. It returns a
slice to callers without copying its contents, and assumes the contents
will not be modified elsewhere.
Also, the current implementation of Dereference requires copying the
slice of references, excluding the one we wish to remove.
To improve both of these things, change refs to be a map of maps.
Deleting an item becomes trivial, and returning a slice of references
necessitates copying from the map.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Fix issue where out-of-band deletions and then a `docker volume create`
on the same driver caused volume to not be re-created in the driver but
return as created since it was stored in the cache.
Previous fix only worked if the driver names did not match.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix tries to address the issue raised in 28769 where
checkpoint name was not checked before passing to containerd.
As a result, it was possible to use a special checkpoint name
to get outside of the container's directory.
This fix add restriction `[a-zA-Z0-9][a-zA-Z0-9_.-]+` (`RestrictedNamePattern`).
This is the same as container name restriction.
This fix fixes 28769.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Instead of converting nicely typed service mounts into untyped `Binds`
when creating containers, use the new `Mounts` API which is a 1-1
mapping between service mounts and container mounts.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
For some reason, `go vet` and `go fmt` validate does not capture
several issues.
The following was the output of `go vet`:
```
ubuntu@ubuntu:~/docker$ go vet ./... 2>&1 | grep -v ^vendor | grep -v '^exit status 1$'
cli/command/formatter/container_test.go:393: possible formatting directive in Log call
volume/volume_test.go:257: arg mp.RW for printf verb %s of wrong type: bool
```
The following was the output of `go fmt -s`:
```
ubuntu@ubuntu:~/docker$ gofmt -s -l . | grep -v ^vendor
cli/command/stack/list.go
daemon/commit.go
```
Fixed above issues with `go vet` and `go fmt -s`
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Fixes#22564
When an error occurs on mount, there should not be any call later to
unmount. This can throw off refcounting in the underlying driver
unexpectedly.
Consider these two cases:
```
$ docker run -v foo:/bar busybox true
```
```
$ docker run -v foo:/bar -w /foo busybox true
```
In the first case, if mounting `foo` fails, the volume driver will not
get a call to unmount (this is the incorrect behavior).
In the second case, the volume driver will not get a call to unmount
(correct behavior).
This occurs because in the first case, `/bar` does not exist in the
container, and as such there is no call to `volume.Mount()` during the
`create` phase. It will error out during the `start` phase.
In the second case `/bar` is created before dealing with the volume
because of the `-w`. Because of this, when the volume is being setup
docker will try to copy the image path contents in the volume, in which
case it will attempt to mount the volume and fail. This happens during
the `create` phase. This makes it so the container will not be created
(or at least fully created) and the user gets the error on `create`
instead of `start`. The error handling is different in these two phases.
Changed to only send `unmount` if the volume is mounted.
While investigating the cause of the reported issue I found some odd
behavior in unmount calls so I've cleaned those up a bit here as well.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When a conflict is found in the volume cache, check with the driver if
that volume still actually exists.
If the volume doesn't exist, purge it from the cache and allow the
create to happen.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix tries to address the issue raised in 25545 where
volume options at the creation time is not showed up
in `docker volume inspect`.
This fix adds the field `Options` in `Volume` type and
persist the options in volume db so that `volume inspect`
could display the options.
This fix adds a couple of test cases to cover the changes.
This fix fixes 25545.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Found a couple of places where pretty low level errors were never being
wrapped with any sort of context.
For example, if you try to create a local volume using some bad mount
options, the kernel will return `invalid argument` when we try to mount
it at container start.
What would happen is a user would `docker run` with this volume and get
an error like `Error response from daemon: invalid argument`.
This uses github.com/pkg/errors to provide some context to the error
message without masking the original error.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
As part of making graphdrivers support pluginv2, a PluginGetter
interface was necessary for cleaner separation and avoiding import
cycles.
This commit creates a PluginGetter interface and makes pluginStore
implement it. Then the pluginStore object is created in the daemon
(rather than by the plugin manager) and passed to plugin init as
well as to the different subsystems (eg. graphdrivers, volumedrivers).
A side effect of this change was that some code was moved out of
experimental. This is good, since plugin support will be stable soon.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
`Mounts` allows users to specify in a much safer way the volumes they
want to use in the container.
This replaces `Binds` and `Volumes`, which both still exist, but
`Mounts` and `Binds`/`Volumes` are exclussive.
The CLI will continue to use `Binds` and `Volumes` due to concerns with
parsing the volume specs on the client side and cross-platform support
(for now).
The new API follows exactly the services mount API.
Example usage of `Mounts`:
```
$ curl -XPOST localhost:2375/containers/create -d '{
"Image": "alpine:latest",
"HostConfig": {
"Mounts": [{
"Type": "Volume",
"Target": "/foo"
},{
"Type": "bind",
"Source": "/var/run/docker.sock",
"Target": "/var/run/docker.sock",
},{
"Type": "volume",
"Name": "important_data",
"Target": "/var/data",
"ReadOnly": true,
"VolumeOptions": {
"DriverConfig": {
Name: "awesomeStorage",
Options: {"size": "10m"},
Labels: {"some":"label"}
}
}]
}
}'
```
There are currently 2 types of mounts:
- **bind**: Paths on the host that get mounted into the
container. Paths must exist prior to creating the container.
- **volume**: Volumes that persist after the
container is removed.
Not all fields are available in each type, and validation is done to
ensure these fields aren't mixed up between types.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
daemon/events/testutils: rename eventstestutils to testutils
volume/testutils: rename volumetestutils to testutils
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>