Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Since the commit d88fe447df ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Reuse existing structures and rely on json serialization to deep copy
Container objects.
Also consolidate all "save" operations on container.CheckpointTo, which
now both saves a serialized json to disk, and replicates state to the
ACID in-memory store.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Replicate relevant mutations to the in-memory ACID store. Readers will
then be able to query container state without locking.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
After running the test suite with the race detector enabled I found
these gems that need to be fixed.
This is just round one, sadly lost my test results after I built the
binary to test this... (whoops)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
Added an apparmorEnabled boolean in the Daemon struct to indicate if AppArmor is enabled or not. It is set in NewDaemon using sysInfo information.
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
gofmt'd
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
change the function name to something more adequate and changed the behaviour to show empty value on an apparmor disabled system.
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
go fmt
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
Fixes#22564
When an error occurs on mount, there should not be any call later to
unmount. This can throw off refcounting in the underlying driver
unexpectedly.
Consider these two cases:
```
$ docker run -v foo:/bar busybox true
```
```
$ docker run -v foo:/bar -w /foo busybox true
```
In the first case, if mounting `foo` fails, the volume driver will not
get a call to unmount (this is the incorrect behavior).
In the second case, the volume driver will not get a call to unmount
(correct behavior).
This occurs because in the first case, `/bar` does not exist in the
container, and as such there is no call to `volume.Mount()` during the
`create` phase. It will error out during the `start` phase.
In the second case `/bar` is created before dealing with the volume
because of the `-w`. Because of this, when the volume is being setup
docker will try to copy the image path contents in the volume, in which
case it will attempt to mount the volume and fail. This happens during
the `create` phase. This makes it so the container will not be created
(or at least fully created) and the user gets the error on `create`
instead of `start`. The error handling is different in these two phases.
Changed to only send `unmount` if the volume is mounted.
While investigating the cause of the reported issue I found some odd
behavior in unmount calls so I've cleaned those up a bit here as well.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This adds a metrics packages that creates additional metrics. Add the
metrics endpoint to the docker api server under `/metrics`.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add metrics to daemon package
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
api: use standard way for metrics route
Also add "type" query parameter
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Convert timers to ms
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Before this, container's auto-removal after exit is done in a goroutine,
this commit will get ContainerRm out of the goroutine.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
`--rm` is a client side flag which caused lots of problems:
1. if client lost connection to daemon, including client crash or be
killed, there's no way to clean garbage container.
2. if docker stop a `--rm` container, this container won't be
autoremoved.
3. if docker daemon restart, container is also left over.
4. bug: `docker run --rm busybox fakecmd` will exit without cleanup.
In a word, client side `--rm` flag isn't sufficient for garbage
collection. Move the `--rm` flag to daemon will be more reasonable.
What this commit do is:
1. implement a `--rm` on daemon side, adding one flag `AutoRemove` into
HostConfig.
2. Allow `run --rm -d`, no conflicting `--rm` and `-d` any more,
auto-remove can work on detach mode.
3. `docker restart` a `--rm` container will succeed, the container won't
be autoremoved.
This commit will help a lot for daemon to do garbage collection for
temporary containers.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
In order to keep a little bit of "sanity" on the API side, validate
hostname only starting from v1.24 API version.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Currently `start` will hide some errors and throw a consolidated error,
which will make it hard to debug because developer can't find the
original error.
This commit allow daemon to log original errors first.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
If contaner start fail of (say) "command not found", the container
actually didn't start at all, we shouldn't log start and die event for
it, because that doesnt actually happen.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
When container is automatically restarted based on restart policy,
docker events can't get "start" event but only get "die" event, this is
not consistent with previous behavior. This commit will add "start"
event back.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
- Refactor generic and path based cleanup functions into a single function.
- Include aufs and zfs mounts in the mounts cleanup.
- Containers that receive exit event on restore don't require manual cleanup.
- Make missing sandbox id message a warning because currently sandboxes are always cleared on startup. libnetwork#975
- Don't unmount volumes for containers that don't have base path. Shouldn't be needed after #21372
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
so that the user knows what's not in the container but should be.
Its not always easy for the user to know what exact command is being run
when the 'docker run' is embedded deep in something else, like a Makefile.
Saw this while dealing with the containerd migration.
Signed-off-by: Doug Davis <dug@us.ibm.com>
Attach can hang forever if there is no data to send. This PR adds notification
of Attach goroutine about container stop.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Correct creation of a non-existing WORKDIR during docker build to use
remapped root uid/gid on mkdir
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
Moving all strings to the errors package wasn't a good idea after all.
Our custom implementation of Go errors predates everything that's nice
and good about working with errors in Go. Take as an example what we
have to do to get an error message:
```go
func GetErrorMessage(err error) string {
switch err.(type) {
case errcode.Error:
e, _ := err.(errcode.Error)
return e.Message
case errcode.ErrorCode:
ec, _ := err.(errcode.ErrorCode)
return ec.Message()
default:
return err.Error()
}
}
```
This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake.
Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors.
Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API:
```go
switch err.(type) {
case errcode.ErrorCode:
daError, _ := err.(errcode.ErrorCode)
statusCode = daError.Descriptor().HTTPStatusCode
errMsg = daError.Message()
case errcode.Error:
// For reference, if you're looking for a particular error
// then you can do something like :
// import ( derr "github.com/docker/docker/errors" )
// if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... }
daError, _ := err.(errcode.Error)
statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode
errMsg = daError.Message
default:
// This part of will be removed once we've
// converted everything over to use the errcode package
// FIXME: this is brittle and should not be necessary.
// If we need to differentiate between different possible error types,
// we should create appropriate error types with clearly defined meaning
errStr := strings.ToLower(err.Error())
for keyword, status := range map[string]int{
"not found": http.StatusNotFound,
"no such": http.StatusNotFound,
"bad parameter": http.StatusBadRequest,
"conflict": http.StatusConflict,
"impossible": http.StatusNotAcceptable,
"wrong login/password": http.StatusUnauthorized,
"hasn't been activated": http.StatusForbidden,
} {
if strings.Contains(errStr, keyword) {
statusCode = status
break
}
}
}
```
You can notice two things in that code:
1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are.
2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation.
This change removes all our status errors from the errors package and puts them back in their specific contexts.
IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages.
It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface:
```go
type errorWithStatus interface {
HTTPErrorStatusCode() int
}
```
This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method.
I included helper functions to generate errors that use custom status code in `errors/errors.go`.
By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it.
Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors
Signed-off-by: David Calavera <david.calavera@gmail.com>
Add `--restart` flag for `update` command, so we can change restart
policy for a container no matter it's running or stopped.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
When a container is created it is registered before the mount is created. This can lead to mount does not exist errors when inspecting between create and mount.
Fixes#18753
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Don't involve code waiting for blocking channel in locked critical
section because it has potential risk of hanging forever.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
- Make the API client library completely standalone.
- Move windows partition isolation detection to the client, so the
driver doesn't use external types.
Signed-off-by: David Calavera <david.calavera@gmail.com>
`adaptContainerSettings` is growing up, new it's only called
when create. It'll be a problem that old containers will never
have chance to adapt the latest rule. `HostConfig` of these
containers will be obsoleted.
Add this calling to start to avoid problems like #18550 and
avoid such backward compatability in the future.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Container needs to be locked when updating the fields, and
this PR also remove the redundant `parseSecurityOpt` since
it'll be done in `setHostConfig`.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
So other packages don't need to import the daemon package when they
want to use this struct.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Tibor Vass <tibor@docker.com>
It will Tar up contents of child directory onto tmpfs if mounted over
This patch will use the new PreMount and PostMount hooks to "tar"
up the contents of the base image on top of tmpfs mount points.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Remove double reference between containers and exec configurations by
keeping only the container id.
Signed-off-by: David Calavera <david.calavera@gmail.com>
This change will allow us to run SELinux in a container with
BTRFS back end. We continue to work on fixing the kernel/BTRFS
but this change will allow SELinux Security separation on BTRFS.
It basically relabels the content on container creation.
Just relabling -init directory in BTRFS use case. Everything looks like it
works. I don't believe tar/achive stores the SELinux labels, so we are good
as far as docker commit.
Tested Speed on startup with BTRFS on top of loopback directory. BTRFS
not on loopback should get even better perfomance on startup time. The
more inodes inside of the container image will increase the relabel time.
This patch will give people who care more about security the option of
runnin BTRFS with SELinux. Those who don't want to take the slow down
can disable SELinux either in individual containers or for all containers
by continuing to disable SELinux in the daemon.
Without relabel:
> time docker run --security-opt label:disable fedora echo test
test
real 0m0.918s
user 0m0.009s
sys 0m0.026s
With Relabel
test
real 0m1.942s
user 0m0.007s
sys 0m0.030s
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
The purpose of this PR is for users to distinguish Docker errors from
contained command errors.
This PR modifies 'docker run' exit codes to follow the chroot standard
for exit codes.
Exit status:
125 if 'docker run' itself fails
126 if contained command cannot be invoked
127 if contained command cannot be found
the exit status otherwise
Signed-off-by: Sally O'Malley <somalley@redhat.com>
Side effects:
- Decouple daemon and container to start containers.
- Decouple daemon and container to copy files.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Although having a request ID available throughout the codebase is very
valuable, the impact of requiring a Context as an argument to every
function in the codepath of an API request, is too significant and was
not properly understood at the time of the review.
Furthermore, mixing API-layer code with non-API-layer code makes the
latter usable only by API-layer code (one that has a notion of Context).
This reverts commit de41640435, reversing
changes made to 7daeecd42d.
Signed-off-by: Tibor Vass <tibor@docker.com>
Conflicts:
api/server/container.go
builder/internals.go
daemon/container_unix.go
daemon/create.go
This PR adds a "request ID" to each event generated, the 'docker events'
stream now looks like this:
```
2015-09-10T15:02:50.000000000-07:00 [reqid: c01e3534ddca] de7c5d4ca927253cf4e978ee9c4545161e406e9b5a14617efb52c658b249174a: (from ubuntu) create
```
Note the `[reqID: c01e3534ddca]` part, that's new.
Each HTTP request will generate its own unique ID. So, if you do a
`docker build` you'll see a series of events all with the same reqID.
This allow for log processing tools to determine which events are all related
to the same http request.
I didn't propigate the context to all possible funcs in the daemon,
I decided to just do the ones that needed it in order to get the reqID
into the events. I'd like to have people review this direction first, and
if we're ok with it then I'll make sure we're consistent about when
we pass around the context - IOW, make sure that all funcs at the same level
have a context passed in even if they don't call the log funcs - this will
ensure we're consistent w/o passing it around for all calls unnecessarily.
ping @icecrime @calavera @crosbymichael
Signed-off-by: Doug Davis <dug@us.ibm.com>
- some method names were changed to have a 'Locking' suffix, as the
downcased versions already existed, and the existing functions simply
had locks around the already downcased version.
- deleting unused functions
- package comment
- magic numbers replaced by golang constants
- comments all over
Signed-off-by: Morgan Bauer <mbauer@us.ibm.com>
It may happen that host system settings are changed while the daemon is running.
This will cause errors at libcontainer level when starting a container with a
particular hostConfig (e.g. hostConfig with memory swappiness but the memory
cgroup was umounted).
This patch adds an hostConfig check on container start to prevent the daemon
from even calling libcontainer with the wrong configuration as we're already
doing on container's creation).
Signed-off-by: Antonio Murdaca <runcom@linux.com>
(cherry picked from commit 0d2628cdf1)
Move some calls to container.LogEvent down lower so that there's
less of a chance of them being missed. Also add a few more events
that appear to have been missed.
Added testcases for new events: commit, copy, resize, attach, rename, top
Signed-off-by: Doug Davis <dug@us.ibm.com>
These settings need to be in the HostConfig so that they are not
committed to an image and cannot introduce a security issue.
We can safely move this field from the Config to the HostConfig
without any regressions because these settings are consumed at container
created and used to populate fields on the Container struct. Because of
this, existing settings will be honored for containers already created
on a daemon with custom security settings and prevent values being
consumed via an Image.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Conflicts:
daemon/create.go
changing config to hostConfig was required to fix the
build