This is not for the sake of test to run faster of course;
this is to simplify the code as well as have some more
testing for mount.SingleEntryFilter().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There is no need to parse mount table and iterate through the list of
mounts, and then call Unmount() which again parses the mount table and
iterates through the list of mounts.
It is totally OK to call Unmount() unconditionally.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Functions `GetMounts()` and `parseMountTable()` return all the entries
as read and parsed from /proc/self/mountinfo. In many cases the caller
is only interested only one or a few entries, not all of them.
One good example is `Mounted()` function, which looks for a specific
entry only. Another example is `RecursiveUnmount()` which is only
interested in mount under a specific path.
This commit adds `filter` argument to `GetMounts()` to implement
two things:
1. filter out entries a caller is not interested in
2. stop processing if a caller is found what it wanted
`nil` can be passed to get a backward-compatible behavior, i.e. return
all the entries.
A few filters are implemented:
- `PrefixFilter`: filters out all entries not under `prefix`
- `SingleEntryFilter`: looks for a specific entry
Finally, `Mounted()` is modified to use `SingleEntryFilter()`, and
`RecursiveUnmount()` is using `PrefixFilter()`.
Unit tests are added to check filters are working.
[v2: ditch NoFilter, use nil]
[v3: ditch GetMountsFiltered()]
[v4: add unit test for filters]
[v5: switch to gotestyourself]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This moves the platform specific stuff in a separate package and keeps
the `volume` package and the defined interfaces light to import.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Instead of using a global store for volume drivers, scope the driver
store to the caller (e.g. the volume store). This makes testing much
simpler.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Changes Details:
--------------
Fixes: #36395
Refactoring the code to do the following:
1. Add the method `errBindSourceDoesNotExist` inside `validate.go` to be in-line with the rest of error message
2. Utilised the new method inside `linux_parser.go`, `windows_parser.go` and `validate_test.go`
3. Change the format from `bind mount source path: '%s' does not exist` to `bind mount source path does not exist: %s`
4. Reflected the format change into the 2 unit tests, namely: `volume_test.go` and `validate_test.go`
5. Reflected the format change into `docker_api_containers_test.go` integration test
Signed-off-by: Amr Gawish <amr.gawish@gmail.com>
Before this change, volume management was relying on the fact that
everything the plugin mounts is visible on the host within the plugin's
rootfs. In practice this caused some issues with mount leaks, so we
changed the behavior such that mounts are not visible on the plugin's
rootfs, but available outside of it, which breaks volume management.
To fix the issue, allow the plugin to scope the path correctly rather
than assuming that everything is visible in `p.Rootfs`.
In practice this is just scoping the `PropagatedMount` paths to the
correct host path.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Both lcow_parser.go and linux_parser.go are duplicating the error:
"invalid specification: destination can't be '/'"
This commit creates a new error called "ErrVolumeTargetIsRoot"
that is used by both linux_parser and lcow_parser and remove
the duplication in the code.
Signed-off-by: Boaz Shuster <ripcurld.github@gmail.com>
Instead of having to create a bunch of custom error types that are doing
nothing but wrapping another error in sub-packages, use a common helper
to create errors of the requested type.
e.g. instead of re-implementing this over and over:
```go
type notFoundError struct {
cause error
}
func(e notFoundError) Error() string {
return e.cause.Error()
}
func(e notFoundError) NotFound() {}
func(e notFoundError) Cause() error {
return e.cause
}
```
Packages can instead just do:
```
errdefs.NotFound(err)
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Validation of Mounts was only performed on container _creation_, not on
container _start_. As a result, if the host-path no longer existed
when the container was started, a directory was created in the given
location.
This is the wrong behavior, because when using the `Mounts` API, host paths
should never be created, and an error should be produced instead.
This patch adds a validation step on container start, and produces an
error if the host path is not found.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt to `EEXIST`/`IsExist`:
- for `Mkdir()`, `IsExist` error should (usually) be ignored
(unless you want to make sure directory was not there before)
as it means "the destination directory was already there"
- for `MkdirAll()`, `IsExist` error should NEVER be ignored.
Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.
Also, there are a couple of cases then IsExist is handled as
"directory already exist" which is wrong. As a result, some code
that never worked as intended is now removed.
NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).
For more details, a quote from my runc commit 6f82d4b (July 2015):
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is
returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll need to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Before this, if a volume exists in a driver but not in the local cache,
the store would just return a bare volume. This means that if a user
supplied options or labels, they will not get stored.
Instead only return early if we have the volume stored locally. Note
this could still have an issue with labels/opts passed in by the user
differing from what is stored, however this isn't really a new problem.
This fixes a problem where if there is a shared storage backend between
two docker nodes, a create on one node will have labels stored and a
create on the other node will not.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In some circumstances we were not properly releasing plugin references,
leading to failures in removing a plugin with no way to recover other
than restarting the daemon.
1. If volume create fails (in the driver)
2. If a driver validation fails (should be rare)
3. If trying to get a plugin that does not match the passed in capability
Ideally the test for 1 and 2 would just be a unit test, however the
plugin interfaces are too complicated as `plugingetter` relies on
github.com/pkg/plugin/Client (a concrete type), which will require
spinning up services from within the unit test... it just wouldn't be a
unit test at this point.
I attempted to refactor this a bit, but since both libnetwork and
swarmkit are reliant on `plugingetter` as well, this would not work.
This really requires a re-write of the lower-level plugin management to
decouple these pieces.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When using a volume via the `Binds` API, a shared selinux label is
automatically set.
The `Mounts` API is not setting this, which makes volumes specified via
the mounts API useless when selinux is enabled.
This fix adopts the same selinux label for volumes on the mounts API as on
binds.
Note in the case of both the `Binds` API and the `Mounts` API, the
selinux label is only applied when the volume driver is the `local`
driver.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Current insider builds of Windows have support for mounting individual
named pipe servers from the host to the guest. This allows, for example,
exposing the docker engine's named pipe to a container.
This change allows the user to request such a mount via the normal bind
mount syntax in the CLI:
docker run -v \\.\pipe\docker_engine:\\.\pipe\docker_engine <args>
Signed-off-by: John Starks <jostarks@microsoft.com>
[1.12.x] Fix issue where volume metadata was not removed
(cherry picked from commit 7613b23a58)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Conflicts:
volume/store/store.go
volume/store/store_test.go
If a container mount the socket the daemon is listening on into
container while the daemon is being shutdown, the socket will
not exist on the host, then daemon will assume it's a directory
and create it on the host, this will cause the daemon can't start
next time.
fix issue https://github.com/moby/moby/issues/30348
To reproduce this issue, you can add following code
```
--- a/daemon/oci_linux.go
+++ b/daemon/oci_linux.go
@@ -8,6 +8,7 @@ import (
"sort"
"strconv"
"strings"
+ "time"
"github.com/Sirupsen/logrus"
"github.com/docker/docker/container"
@@ -666,7 +667,8 @@ func (daemon *Daemon) createSpec(c *container.Container) (*libcontainerd.Spec, e
if err := daemon.setupIpcDirs(c); err != nil {
return nil, err
}
-
+ fmt.Printf("===please stop the daemon===\n")
+ time.Sleep(time.Second * 2)
ms, err := daemon.setupMounts(c)
if err != nil {
return nil, err
```
step1 run a container which has `--restart always` and `-v /var/run/docker.sock:/sock`
```
$ docker run -ti --restart always -v /var/run/docker.sock:/sock busybox
/ #
```
step2 exit the the container
```
/ # exit
```
and kill the daemon when you see
```
===please stop the daemon===
```
in the daemon log
The daemon can't restart again and fail with `can't create unix socket /var/run/docker.sock: is a directory`.
Signed-off-by: Lei Jitang <leijitang@huawei.com>
Closes#32663 by adding CreatedAt field when volume is created.
Displaying CreatedAt value when volume is inspected
Adding tests to verfiy the new field is correctly populated
Signed-off-by: Marianna <mtesselh@gmail.com>
Moving CreatedAt tests from the CLI
Moving the tests added for the newly added CreatedAt field for Volume, from CLI to API tests
Signed-off-by: Marianna <mtesselh@gmail.com>
This makes sure that multiple users of MountPoint pointer can
mount/unmount without affecting each other.
Before this PR, if you run a container (stay running), then do `docker
cp`, when the `docker cp` is done the MountPoint is mutated such that
when the container stops the volume driver will not get an Unmount
request. Effectively there would be two mounts with only one unmount.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When there is an error unmounting a local volume, it is still possible
to call `Remove()` on the volume causing removal of the mounted
resources which is generally not desirable.
This ensures that resources are unmounted before attempting removal.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Until and unless user has specified a propagation property for volume, they
should default to "rprivate" and it should be passed to runc.
We can't make it conditional on HasPropagation(). GetPropagation() returns
default of rprivate if noting was passed in by user.
If we don't pass "rprivate" to runc, then bind mount could be shared even
if user did not ask for it. For example, mount two volumes in a container.
One is "shared" while other's propagation is not specified by caller. If
both volume has same source mount point of "shared", then second volume
will also be shared inside container (instead of being private).
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This adds 'consistency' mode flags to the mount command line argument.
Initially, the valid 'consistency' flags are 'consistent', 'cached',
'delegated', and 'default'.
Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: Jeremy Yallop <yallop@docker.com>
There was no validation for `docker run --tmpfs foo`.
In this PR, only two obvious rules are implemented:
- path must be absolute
- path must not be "/"
We should add more rules carefully.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This patch fixed below 4 types of code line
1. Remove unnecessary variable assignment
2. Use variables declaration instead of explicit initial zero value
3. Change variable name to underbar when variable not used
4. Add erro check and return for ignored error
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
Currently local volumes and other volumes that support SELinux do
not get labeled correctly. This patch will allow a user to specify
:Z or :z when mounting a volume and have it fix the label of the newly
created volume.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Go style calls for mixed caps instead of all caps:
https://golang.org/doc/effective_go.html#mixed-caps
Change LOOKUP, ACQUIRE, and RELEASE to Lookup, Acquire, and Release.
This vendors a fork of libnetwork for now, to deal with a cyclic
dependency issue. The change will be upstream to libnetwork once this is
merged.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Move plugins to shared distribution stack with images.
Create immutable plugin config that matches schema2 requirements.
Ensure data being pushed is same as pulled/created.
Store distribution artifacts in a blobstore.
Run init layer setup for every plugin start.
Fix breakouts from unsafe file accesses.
Add support for `docker plugin install --alias`
Uses normalized references for default names to avoid collisions when using default hosts/tags.
Some refactoring of the plugin manager to support the change, like removing the singleton manager and adding manager config struct.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
bolt k/v pairs are only valid for the life of a transaction.
This means the memory that the k/v pair is referencing may be invalid if
it is accessed outside of the transaction.
This can potentially cause a panic.
For reference: https://godoc.org/github.com/boltdb/bolt#hdr-Caveats
To fix this issue, unmarshal the stored data into volume meta before
closing the transaction.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Previously, it was comparing against the driver name passed in by the
caller. This could lead to subtle issues when using plugins, like
"plugin" vs. "plugin:latest".
Also, remove "conflict:" prefix to improve the error message.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Ensures all known volumes (known b/c they are persisted to disk) have
their volume drivers refcounted properly.
In testing this, I found an issue with `--live-restore` (required since
currently the provided volume plugin doesn't keep state on restart)
where restorted plugins did not have a plugin client loaded causing a
panic when trying to use the plugin.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Adds 2 new methods to v2 plugin `Acquire` and `Release` which allow
refcounting directly at the plugin level instead of just the store.
Since a graphdriver is initialized exactly once, and is really managed
by a separate object, it didn't really seem right to call
`getter.Get()` to refcount graphdriver plugins.
On shutdown it was particularly weird where we'd either need to keep a
driver reference in daemon, or keep a reference to the pluggin getter in
the layer store, and even then still store extra details on if the
graphdriver is a plugin or not.
Instead the plugin proxy itself will handle calling the neccessary
refcounting methods directly on the plugin object.
Also adds a new interface in `plugingetter` to account for these new
functions which are not going to be implemented by v1 plugins.
Changes terms `plugingetter.CREATE` and `plugingetter.REMOVE` to
`ACQUIRE` and `RELEASE` respectively, which seems to be better
adjectives for what we're doing.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Legacy plugins expect host-relative paths (such as for Volume.Mount).
However, a containerized plugin cannot respond with a host-relative
path. Therefore, this commit modifies new volume plugins' paths in Mount
and List to prepend the container's rootfs path.
This introduces a new PropagatedMount field in the Plugin Config.
When it is set for volume plugins, RootfsPropagation is set to rshared
and the path specified by PropagatedMount is bind-mounted with rshared
prior to launching the container. This is so that the daemon code can
access the paths returned by the plugin from the host mount namespace.
Signed-off-by: Tibor Vass <tibor@docker.com>
The current implementation of getRefs is a bit fragile. It returns a
slice to callers without copying its contents, and assumes the contents
will not be modified elsewhere.
Also, the current implementation of Dereference requires copying the
slice of references, excluding the one we wish to remove.
To improve both of these things, change refs to be a map of maps.
Deleting an item becomes trivial, and returning a slice of references
necessitates copying from the map.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Fix issue where out-of-band deletions and then a `docker volume create`
on the same driver caused volume to not be re-created in the driver but
return as created since it was stored in the cache.
Previous fix only worked if the driver names did not match.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix tries to address the issue raised in 28769 where
checkpoint name was not checked before passing to containerd.
As a result, it was possible to use a special checkpoint name
to get outside of the container's directory.
This fix add restriction `[a-zA-Z0-9][a-zA-Z0-9_.-]+` (`RestrictedNamePattern`).
This is the same as container name restriction.
This fix fixes 28769.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Instead of converting nicely typed service mounts into untyped `Binds`
when creating containers, use the new `Mounts` API which is a 1-1
mapping between service mounts and container mounts.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>