Various dirs in /var/lib/docker contain data that needs to be mounted
into a container. For this reason, these dirs are set to be owned by the
remapped root user, otherwise there can be permissions issues.
However, this uneccessarily exposes these dirs to an unprivileged user
on the host.
Instead, set the ownership of these dirs to the real root (or rather the
UID/GID of dockerd) with 0701 permissions, which allows the remapped
root to enter the directories but not read/write to them.
The remapped root needs to enter these dirs so the container's rootfs
can be configured... e.g. to mount /etc/resolve.conf.
This prevents an unprivileged user from having read/write access to
these dirs on the host.
The flip side of this is now any user can enter these directories.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential
outside users.
This commit was generated by the following bash script:
```
set -e -u -o pipefail
for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do
sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \
-e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \
$file
goimports -w $file
done
```
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using `errors.Errorf()` passes the error with the stack trace for
debugging purposes.
Also using `errdefs.InvalidParameter` for Windows, so that the API
will return a 4xx status, instead of a 5xx, and added tests for
both validations.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Description:
When using local volume option such as size=10G, type=tmpfs, if we provide wrong options, we could create volume successfully.
But when we are ready to use it, it will fail to start container by failing to mount the local volume(invalid option).
We should check the options at when we create it.
Signed-off-by: Wentao Zhang <zhangwentao234@huawei.com>
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The errors returned from Mount and Unmount functions are raw
syscall.Errno errors (like EPERM or EINVAL), which provides
no context about what has happened and why.
Similar to os.PathError type, introduce mount.Error type
with some context. The error messages will now look like this:
> mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted
or
> mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted
Before this patch, it was just
> operation not permitted
[v2: add Cause()]
[v3: rename MountError to Error, document Cause()]
[v4: fixes; audited all users]
[v5: make Error type private; changes after @cpuguy83 reviews]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
There is no need to parse mount table and iterate through the list of
mounts, and then call Unmount() which again parses the mount table and
iterates through the list of mounts.
It is totally OK to call Unmount() unconditionally.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Functions `GetMounts()` and `parseMountTable()` return all the entries
as read and parsed from /proc/self/mountinfo. In many cases the caller
is only interested only one or a few entries, not all of them.
One good example is `Mounted()` function, which looks for a specific
entry only. Another example is `RecursiveUnmount()` which is only
interested in mount under a specific path.
This commit adds `filter` argument to `GetMounts()` to implement
two things:
1. filter out entries a caller is not interested in
2. stop processing if a caller is found what it wanted
`nil` can be passed to get a backward-compatible behavior, i.e. return
all the entries.
A few filters are implemented:
- `PrefixFilter`: filters out all entries not under `prefix`
- `SingleEntryFilter`: looks for a specific entry
Finally, `Mounted()` is modified to use `SingleEntryFilter()`, and
`RecursiveUnmount()` is using `PrefixFilter()`.
Unit tests are added to check filters are working.
[v2: ditch NoFilter, use nil]
[v3: ditch GetMountsFiltered()]
[v4: add unit test for filters]
[v5: switch to gotestyourself]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Instead of having to create a bunch of custom error types that are doing
nothing but wrapping another error in sub-packages, use a common helper
to create errors of the requested type.
e.g. instead of re-implementing this over and over:
```go
type notFoundError struct {
cause error
}
func(e notFoundError) Error() string {
return e.cause.Error()
}
func(e notFoundError) NotFound() {}
func(e notFoundError) Cause() error {
return e.cause
}
```
Packages can instead just do:
```
errdefs.NotFound(err)
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt to `EEXIST`/`IsExist`:
- for `Mkdir()`, `IsExist` error should (usually) be ignored
(unless you want to make sure directory was not there before)
as it means "the destination directory was already there"
- for `MkdirAll()`, `IsExist` error should NEVER be ignored.
Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.
Also, there are a couple of cases then IsExist is handled as
"directory already exist" which is wrong. As a result, some code
that never worked as intended is now removed.
NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).
For more details, a quote from my runc commit 6f82d4b (July 2015):
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is
returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll need to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When using a volume via the `Binds` API, a shared selinux label is
automatically set.
The `Mounts` API is not setting this, which makes volumes specified via
the mounts API useless when selinux is enabled.
This fix adopts the same selinux label for volumes on the mounts API as on
binds.
Note in the case of both the `Binds` API and the `Mounts` API, the
selinux label is only applied when the volume driver is the `local`
driver.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When there is an error unmounting a local volume, it is still possible
to call `Remove()` on the volume causing removal of the mounted
resources which is generally not desirable.
This ensures that resources are unmounted before attempting removal.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix tries to address the issue raised in 28769 where
checkpoint name was not checked before passing to containerd.
As a result, it was possible to use a special checkpoint name
to get outside of the container's directory.
This fix add restriction `[a-zA-Z0-9][a-zA-Z0-9_.-]+` (`RestrictedNamePattern`).
This is the same as container name restriction.
This fix fixes 28769.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Found a couple of places where pretty low level errors were never being
wrapped with any sort of context.
For example, if you try to create a local volume using some bad mount
options, the kernel will return `invalid argument` when we try to mount
it at container start.
What would happen is a user would `docker run` with this volume and get
an error like `Error response from daemon: invalid argument`.
This uses github.com/pkg/errors to provide some context to the error
message without masking the original error.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When the daemon is started, it looks at all the volumes and checks to
see if any of them have mount options persisted to disk, and loads them
from disk if it does.
In some cases a volume will be created with an empty map causing the
options file to be persisted and volume options set to a non-nil value
on daemon restart... this causes problems later when the driver checks
for a non-nil value to determine if it should try and mount with the
persisted volume options.
Ensures 2 things:
1. Instead of only checking nilness for the opts map, use `len` to make
sure it is not an empty map, which we don't really need to persit.
2. An empty (or nulled) opts.json will not inadvertnatly set volume
options on daemon restart.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
this improves the error message if a user tries to
create a volume with a single-character name:
Before this change:
docker volume create --name a
Error response from daemon: create a: "a" includes invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed
After this change:
docker volume create --name a
Error response from daemon: create a: volume name is too short, names should be at least two alphanumeric characters
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On daemon restart the local volume driver will read options that it
persisted to disk, however it was reading an incorrect path, causing
volume options to be silently ignored after a daemon restart.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This is similar to network scopes where a volume can either be `local`
or `global`. A `global` volume is one that exists across the entire
cluster where as a `local` volume exists on a single engine.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This generates an ID string for calls to Mount/Unmount, allowing drivers
to differentiate between two callers of `Mount` and `Unmount`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The `Status` field is a `map[string]interface{}` which allows the driver to pass
back low-level details about the underlying volume.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Allows users to submit options similar to the `mount` command when
creating a volume with the `local` volume driver.
For example:
```go
$ docker volume create -d local --opt type=nfs --opt device=myNfsServer:/data --opt o=noatime,nosuid
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Moving all strings to the errors package wasn't a good idea after all.
Our custom implementation of Go errors predates everything that's nice
and good about working with errors in Go. Take as an example what we
have to do to get an error message:
```go
func GetErrorMessage(err error) string {
switch err.(type) {
case errcode.Error:
e, _ := err.(errcode.Error)
return e.Message
case errcode.ErrorCode:
ec, _ := err.(errcode.ErrorCode)
return ec.Message()
default:
return err.Error()
}
}
```
This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake.
Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors.
Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API:
```go
switch err.(type) {
case errcode.ErrorCode:
daError, _ := err.(errcode.ErrorCode)
statusCode = daError.Descriptor().HTTPStatusCode
errMsg = daError.Message()
case errcode.Error:
// For reference, if you're looking for a particular error
// then you can do something like :
// import ( derr "github.com/docker/docker/errors" )
// if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... }
daError, _ := err.(errcode.Error)
statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode
errMsg = daError.Message
default:
// This part of will be removed once we've
// converted everything over to use the errcode package
// FIXME: this is brittle and should not be necessary.
// If we need to differentiate between different possible error types,
// we should create appropriate error types with clearly defined meaning
errStr := strings.ToLower(err.Error())
for keyword, status := range map[string]int{
"not found": http.StatusNotFound,
"no such": http.StatusNotFound,
"bad parameter": http.StatusBadRequest,
"conflict": http.StatusConflict,
"impossible": http.StatusNotAcceptable,
"wrong login/password": http.StatusUnauthorized,
"hasn't been activated": http.StatusForbidden,
} {
if strings.Contains(errStr, keyword) {
statusCode = status
break
}
}
}
```
You can notice two things in that code:
1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are.
2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation.
This change removes all our status errors from the errors package and puts them back in their specific contexts.
IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages.
It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface:
```go
type errorWithStatus interface {
HTTPErrorStatusCode() int
}
```
This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method.
I included helper functions to generate errors that use custom status code in `errors/errors.go`.
By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it.
Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors
Signed-off-by: David Calavera <david.calavera@gmail.com>
Makes `docker volume ls` and `docker volume inspect` ask the volume
drivers rather than only using what is cached locally.
Previously in order to use a volume from an external driver, one would
either have to use `docker volume create` or have a container that is
already using that volume for it to be visible to the other volume
API's.
For keeping uniqueness of volume names in the daemon, names are bound to
a driver on a first come first serve basis. If two drivers have a volume
with the same name, the first one is chosen, and a warning is logged
about the second one.
Adds 2 new methods to the plugin API, `List` and `Get`.
If a plugin does not implement these endpoints, a user will not be able
to find the specified volumes as well requests go through the drivers.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>