Since the commit d88fe447df ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Delete needs to release names related to a container even if that
container isn't present in the db. However, slightly overzealous error
checking causes the transaction to get rolled back. Ignore the error
from Delete on the container itself, since it may not be present.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Currently, names are maintained by a separate system called "registrar".
This means there is no way to atomically snapshot the state of
containers and the names associated with them.
We can add this atomicity and simplify the code by storing name
associations in the memdb. This removes the need for pkg/registrar, and
makes snapshots a lot less expensive because they no longer need to copy
all the names. This change also avoids some problematic behavior from
pkg/registrar where it returns slices which may be modified later on.
Note that while this change makes the *snapshotting* atomic, it doesn't
yet do anything to make sure containers are named at the same time that
they are added to the database. We can do that by adding a transactional
interface, either as a followup, or as part of this PR.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Do not change pause state when restoring container's
status, or status in docker will be different with
status in runc.
Signed-off-by: Fengtu Wang <wangfengtu@huawei.com>
Description:
1. start a container with restart=always.
`docker run -d --restart=always ubuntu sleep 3`
2. container init process exits.
3. use `docker pause <id>` to pause this container.
if the pause action is before cgroup data is removed and after the init process died.
`Pause` operation will success to write cgroup data, but actually do not freeze any process.
And then docker received pause event and stateExit event from
containerd, the docker state will be Running(paused), but the container
is free running.
Then we can not remove it, stop it , pause it and unpause it.
Signed-off-by: Wentao Zhang <zhangwentao234@huawei.com>
If a container doesn't exist in the memdb, First will return nil, not an
error. This should be checked for before using the result.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Reuse existing structures and rely on json serialization to deep copy
Container objects.
Also consolidate all "save" operations on container.CheckpointTo, which
now both saves a serialized json to disk, and replicates state to the
ACID in-memory store.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Replicate relevant mutations to the in-memory ACID store. Readers will
then be able to query container state without locking.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
The Solaris version (previously daemon/inspect_solaris.go) was
apparently missing some fields that should be available on that
platform.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Doing a chown/chmod automatically can cause `EPERM` in some cases (e.g.
with an NFS mount). Currently Docker will always call chown+chmod on a
volume path unless `:nocopy` is passed in, but we don't need to make
these calls if the perms and ownership already match and potentially
avoid an uneccessary `EPERM`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The use of pools.Copy avoids io.Copy's internal buffer allocation.
This commit replaces io.Copy with pools.Copy to avoid the allocation of
buffers in io.Copy.
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
Commit abd72d4008 added
a "FIXME" comment to the container "State", mentioning
that a container cannot be both "Running" and "Paused".
This comment was incorrect, because containers on
Linux actually _must_ be running in order to be
paused.
This patch adds additional information both in a
comment, and in the API documentation to clarify
that these booleans are not mutually exclusive.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The commit adds capability to accept csv parameters
for network option in service create/update commands.The change
includes name,alias driver options specific to the network.
With this the following will be supported
docker service create --name web --network name=docknet,alias=web1,driver-opt=field1=value1 nginx
docker service create --name web --network docknet nginx
docker service update web --network-add name=docknet,alias=web1,driver-opt=field1=value1
docker service update web --network-rm docknet
Signed-off-by: Abhinandan Prativadi <abhi@docker.com>
This patch adds the untilRemoved option to the ContainerWait API which
allows the client to wait until the container is not only exited but
also removed.
This patch also adds some more CLI integration tests for waiting for a
created container and waiting with the new --until-removed flag.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Handle detach sequence in CLI
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Update Container Wait Conditions
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Apply container wait changes to API 1.30
The set of changes to the containerWait API missed the cut for the
Docker 17.05 release (API version 1.29). This patch bumps the version
checks to use 1.30 instead.
This patch also makes a minor update to a testfile which was added to
the builder/dockerfile package.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Remove wait changes from CLI
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address minor nits on wait changes
- Changed the name of the tty Proxy wrapper to `escapeProxy`
- Removed the unnecessary Error() method on container.State
- Fixes a typo in comment (repeated word)
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Use router.WithCancel in the containerWait handler
This handler previously added this functionality manually but now uses
the existing wrapper which does it for us.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Add WaitCondition constants to api/types/container
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address more ContainerWait review comments
- Update ContainerWait backend interface to not return pointer values
for container.StateStatus type.
- Updated container state's Wait() method comments to clarify that a
context MUST be used for cancelling the request, setting timeouts,
and to avoid goroutine leaks.
- Removed unnecessary buffering when making channels in the client's
ContainerWait methods.
- Renamed result and error channels in client's ContainerWait methods
to clarify that only a single result or error value would be sent
on the channel.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Move container.WaitCondition type to separate file
... to avoid conflict with swagger-generated code for API response
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address more ContainerWait review comments
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
This patch consolidates the two WaitStop and WaitWithContext methods
on the container.State type. Now there is a single method, Wait, which
takes a context and a bool specifying whether to wait for not just a
container exit but also removal.
The behavior has been changed slightly so that a wait call during a
Created state will not return immediately but instead wait for the
container to be started and then exited.
The interface has been changed to no longer block, but instead returns
a channel on which the caller can receive a *StateStatus value which
indicates the ExitCode or an error if there was one (like a context
timeout or state transition error).
These changes have been propagated through the rest of the deamon to
preserve all other existing behavior.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
This makes sure that multiple users of MountPoint pointer can
mount/unmount without affecting each other.
Before this PR, if you run a container (stay running), then do `docker
cp`, when the `docker cp` is done the MountPoint is mutated such that
when the container stops the volume driver will not get an Unmount
request. Effectively there would be two mounts with only one unmount.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix tries to address the issue raised in 31032 where it was
not possible to specify `--cpus` for `docker update`.
This fix adds `--cpus` support for `docker update`. In case both
`--cpus` and `--cpu-period/--cpu-quota` have been specified,
an error will be returned.
Related docs has been updated.
Integration tests have been added.
This fix fixes 31032.
This fix is related to 27921, 27958.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
Working directory processing was handled differently for Hyper-V and Windows-Server containers, as annotated in the builder documentation (updated in this PR). For Hyper-V containers, the working directory set by WORKDIR was not created. This PR makes Hyper-V containers work the same as Windows Server containers (and the same as Linux).
Example (only applies to Hyper-V containers, so not reproducible under CI environment)
Dockerfile:
FROM microsoft/nanoserver
WORKDIR c:\installer
ENV GOROOT=c:\installer
ADD go.exe .
RUN go --help
Running on Windows Server 2016, using docker master without this change, but with daemon set to --exec-opt isolation=hyperv as it would be for Client operating systems.
PS E:\go\src\github.com\docker\docker> dockerd -g c:\control --exec-opt isolation=hyperv
time="2017-02-01T15:48:09.657286100-08:00" level=info msg="Windows default isolation mode: hyperv"
time="2017-02-01T15:48:09.662720900-08:00" level=info msg="[graphdriver] using prior storage driver: windowsfilter"
time="2017-02-01T15:48:10.011588000-08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
time="2017-02-01T15:48:10.016655800-08:00" level=info msg="Loading containers: start."
time="2017-02-01T15:48:10.460820000-08:00" level=info msg="Loading containers: done."
time="2017-02-01T15:48:10.509859600-08:00" level=info msg="Daemon has completed initialization"
time="2017-02-01T15:48:10.509859600-08:00" level=info msg="Docker daemon" commit=3c64061 graphdriver=windowsfilter version=1.14.0-dev
First with no explicit isolation:
PS E:\docker\build\unifyworkdir> docker build --no-cache .
Sending build context to Docker daemon 10.1 MB
Step 1/5 : FROM microsoft/nanoserver
---> 89b8556cb9ca
Step 2/5 : WORKDIR c:\installer
---> 7e0f41d08204
Removing intermediate container 236c7802042a
Step 3/5 : ENV GOROOT c:\installer
---> Running in 8ea5237183c1
---> 394b70435261
Removing intermediate container 8ea5237183c1
Step 4/5 : ADD go.exe .
---> e47401a1745c
Removing intermediate container 88dcc28e74b1
Step 5/5 : RUN go --help
---> Running in efe90e1b6b8b
container efe90e1b6b8b76586abc5c1dc0e2797b75adc26517c48733d90651e767c8463b encountered an error during CreateProcess: failure in a Windows system call: The directory name is invalid. (0x10b) extra info: {"ApplicationName":"","CommandLine":"cmd /S /C go --help","User":"","WorkingDirectory":"C:\\installer","Environment":{"GOROOT":"c:\\installer"},"EmulateConsole":false,"CreateStdInPipe":true,"CreateStdOutPipe":true,"CreateStdErrPipe":true,"ConsoleSize":[0,0]}
PS E:\docker\build\unifyworkdir>
Then forcing process isolation:
PS E:\docker\build\unifyworkdir> docker build --isolation=process --no-cache .
Sending build context to Docker daemon 10.1 MB
Step 1/5 : FROM microsoft/nanoserver
---> 89b8556cb9ca
Step 2/5 : WORKDIR c:\installer
---> 350c955980c8
Removing intermediate container 8339c1e9250c
Step 3/5 : ENV GOROOT c:\installer
---> Running in bde511c5e3e0
---> b8820063b5b6
Removing intermediate container bde511c5e3e0
Step 4/5 : ADD go.exe .
---> e4ac32f8902b
Removing intermediate container d586e8492eda
Step 5/5 : RUN go --help
---> Running in 9e1aa235af5f
Cannot mkdir: C:\installer is not a directory
PS E:\docker\build\unifyworkdir>
Now compare the same results after this PR. Again, first with no explicit isolation (defaulting to Hyper-V containers as that's what the daemon it set to) - note it now succeeds 😄
PS E:\docker\build\unifyworkdir> docker build --no-cache .
Sending build context to Docker daemon 10.1 MB
Step 1/5 : FROM microsoft/nanoserver
---> 89b8556cb9ca
Step 2/5 : WORKDIR c:\installer
---> 4f319f301c69
Removing intermediate container 61b9c0b1ff6f
Step 3/5 : ENV GOROOT c:\installer
---> Running in c464a1d612d8
---> 96a26ab9a7b5
Removing intermediate container c464a1d612d8
Step 4/5 : ADD go.exe .
---> 0290d61faf57
Removing intermediate container dc5a085fffe3
Step 5/5 : RUN go --help
---> Running in 60bd56042ff8
Go is a tool for managing Go source code.
Usage:
go command [arguments]
The commands are:
build compile packages and dependencies
clean remove object files
doc show documentation for package or symbol
env print Go environment information
fix run go tool fix on packages
fmt run gofmt on package sources
generate generate Go files by processing source
get download and install packages and dependencies
install compile and install packages and dependencies
list list packages
run compile and run Go program
test test packages
tool run specified go tool
version print Go version
vet run go tool vet on packages
Use "go help [command]" for more information about a command.
Additional help topics:
c calling between Go and C
buildmode description of build modes
filetype file types
gopath GOPATH environment variable
environment environment variables
importpath import path syntax
packages description of package lists
testflag description of testing flags
testfunc description of testing functions
Use "go help [topic]" for more information about that topic.
The command 'cmd /S /C go --help' returned a non-zero code: 2
And the same with forcing process isolation. Also works 😄
PS E:\docker\build\unifyworkdir> docker build --isolation=process --no-cache .
Sending build context to Docker daemon 10.1 MB
Step 1/5 : FROM microsoft/nanoserver
---> 89b8556cb9ca
Step 2/5 : WORKDIR c:\installer
---> f423b9cc3e78
Removing intermediate container 41330c88893d
Step 3/5 : ENV GOROOT c:\installer
---> Running in 0b99a2d7bf19
---> e051144bf8ec
Removing intermediate container 0b99a2d7bf19
Step 4/5 : ADD go.exe .
---> 7072e32b7c37
Removing intermediate container a7a97aa37fd1
Step 5/5 : RUN go --help
---> Running in 7097438a54e5
Go is a tool for managing Go source code.
Usage:
go command [arguments]
The commands are:
build compile packages and dependencies
clean remove object files
doc show documentation for package or symbol
env print Go environment information
fix run go tool fix on packages
fmt run gofmt on package sources
generate generate Go files by processing source
get download and install packages and dependencies
install compile and install packages and dependencies
list list packages
run compile and run Go program
test test packages
tool run specified go tool
version print Go version
vet run go tool vet on packages
Use "go help [command]" for more information about a command.
Additional help topics:
c calling between Go and C
buildmode description of build modes
filetype file types
gopath GOPATH environment variable
environment environment variables
importpath import path syntax
packages description of package lists
testflag description of testing flags
testfunc description of testing functions
Use "go help [topic]" for more information about that topic.
The command 'cmd /S /C go --help' returned a non-zero code: 2
PS E:\docker\build\unifyworkdir>
This allows the user to set a logging mode to "blocking" (default), or
"non-blocking", which uses the ring buffer as a proxy to the real log
driver.
This allows a container to never be blocked on stdio at the cost of
dropping log messages.
Introduces 2 new log-opts that works for all drivers, `log-mode` and
`log-size`. `log-mode` takes a value of "blocking", or "non-blocking"
I chose not to implement this as a bool since it is difficult to
determine if the mode was set to false vs just not set... especially
difficult when merging the default daemon config with the container config.
`log-size` takes a size string, e.g. `2MB`, which sets the max size
of the ring buffer. When the max size is reached, it will start
dropping log messages.
```
BenchmarkRingLoggerThroughputNoReceiver-8 2000000000 36.2 ns/op 856.35 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputWithReceiverDelay0-8 300000000 156 ns/op 198.48 MB/s 32 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay1-8 2000000000 36.1 ns/op 857.80 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay10-8 1000000000 36.2 ns/op 856.53 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay50-8 2000000000 34.7 ns/op 894.65 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay100-8 2000000000 35.1 ns/op 883.91 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay300-8 1000000000 35.9 ns/op 863.90 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay500-8 2000000000 35.8 ns/op 866.88 MB/s 0 B/op 0 allocs/op
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix fixes issue raised in 29492 where it was not
possible to specify a default `--default-shm-size` in daemon
configuration for each `docker run``.
The flag `--default-shm-size` which is reloadable, has been
added to the daemon configuation.
Related docs has been updated.
This fix fixes 29492.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
`copyEscapable` is a copy/paste of io.Copy with some added handling for
checking for the attach escape sequence.
This removes the copy/paste and uses `io.Copy` directly. To be able to
do this, it now implements an `io.Reader` which proxies to the main
reader but looks for the escape sequence.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This cleans up attach a little bit, and moves it out of the container
package.
Really `AttachStream` is a method on `*stream.Config`, so moved if from
a package level function to one bound to `Config`.
In addition, uses a config struct rather than passing around tons and
tons of arguments.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
… or could be in `opts` package. Having `runconfig/opts` and `opts`
doesn't really make sense and make it difficult to know where to put
some code.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
use secret store interface instead of embedded secret data into container
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
`StreamConfig` carries with it a dep on libcontainerd, which is used by
other projects, but libcontainerd doesn't compile on all platforms, so
move it to `github.com/docker/docker/container/stream`
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Fixes#22564
When an error occurs on mount, there should not be any call later to
unmount. This can throw off refcounting in the underlying driver
unexpectedly.
Consider these two cases:
```
$ docker run -v foo:/bar busybox true
```
```
$ docker run -v foo:/bar -w /foo busybox true
```
In the first case, if mounting `foo` fails, the volume driver will not
get a call to unmount (this is the incorrect behavior).
In the second case, the volume driver will not get a call to unmount
(correct behavior).
This occurs because in the first case, `/bar` does not exist in the
container, and as such there is no call to `volume.Mount()` during the
`create` phase. It will error out during the `start` phase.
In the second case `/bar` is created before dealing with the volume
because of the `-w`. Because of this, when the volume is being setup
docker will try to copy the image path contents in the volume, in which
case it will attempt to mount the volume and fail. This happens during
the `create` phase. This makes it so the container will not be created
(or at least fully created) and the user gets the error on `create`
instead of `start`. The error handling is different in these two phases.
Changed to only send `unmount` if the volume is mounted.
While investigating the cause of the reported issue I found some odd
behavior in unmount calls so I've cleaned those up a bit here as well.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- use /secrets for swarm secret create route
- do not specify omitempty for secret and secret reference
- simplify lookup for secret ids
- do not use pointer for secret grpc conversion
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
- fix lint issues
- use errors pkg for wrapping errors
- cleanup on error when setting up secrets mount
- fix erroneous import
- remove unneeded switch for secret reference mode
- return single mount for secrets instead of slice
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
This fix tries to add a flag `--stop-timeout` to specify the timeout value
(in seconds) for the container to stop before SIGKILL is issued. If stop timeout
is not specified then the default timeout (10s) is used.
Additional test cases have been added to cover the change.
This fix is related to #22471. Another pull request will add `--shutdown-timeout`
to daemon for #22471.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
"VolumeDriver.Mount" is being called on container start.
Make the symmetric call on container stop.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
This removes the SetStoppedLocking, and
SetRestartingLocking functions, which
were not used anywhere.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`Mounts` allows users to specify in a much safer way the volumes they
want to use in the container.
This replaces `Binds` and `Volumes`, which both still exist, but
`Mounts` and `Binds`/`Volumes` are exclussive.
The CLI will continue to use `Binds` and `Volumes` due to concerns with
parsing the volume specs on the client side and cross-platform support
(for now).
The new API follows exactly the services mount API.
Example usage of `Mounts`:
```
$ curl -XPOST localhost:2375/containers/create -d '{
"Image": "alpine:latest",
"HostConfig": {
"Mounts": [{
"Type": "Volume",
"Target": "/foo"
},{
"Type": "bind",
"Source": "/var/run/docker.sock",
"Target": "/var/run/docker.sock",
},{
"Type": "volume",
"Name": "important_data",
"Target": "/var/data",
"ReadOnly": true,
"VolumeOptions": {
"DriverConfig": {
Name: "awesomeStorage",
Options: {"size": "10m"},
Labels: {"some":"label"}
}
}]
}
}'
```
There are currently 2 types of mounts:
- **bind**: Paths on the host that get mounted into the
container. Paths must exist prior to creating the container.
- **volume**: Volumes that persist after the
container is removed.
Not all fields are available in each type, and validation is done to
ensure these fields aren't mixed up between types.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This PR adds support for running regular containers to be connected to
swarm mode multi-host network so that:
- containers connected to the same network across the cluster can
discover and connect to each other.
- Get access to services(and their associated loadbalancers)
connected to the same network
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
"--restart" and "--rm" are conflict options, if a container is started
with AutoRemove flag, we should forbid the update action for its Restart
Policy.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
The memory should always be smaller than memoryswap,
we should error out with message that user know how
to do rather than just an invalid argument error if
user update the memory limit bigger than already set
memory swap.
Signed-off-by: Lei Jitang <leijitang@huawei.com>
As described in our ROADMAP.md, introduce new Swarm management API
endpoints relying on swarmkit to deploy services. It currently vendors
docker/engine-api changes.
This PR is fully backward compatible (joining a Swarm is an optional
feature of the Engine, and existing commands are not impacted).
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.
This fix fixes#23459.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
If we attach to a running container and stream is closed afterwards, we
can never be sure if the container is stopped or detached. Adding a new
type of `detach` event can explicitly notify client that container is
detached, so client will know that there's no need to wait for its exit
code and it can move forward to next step now.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
This PR adds support for user-defined health-check probes for Docker
containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus
some corresponding "docker run" options. It can be used with a restart policy
to automatically restart a container if the check fails.
The `HEALTHCHECK` instruction has two forms:
* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)
The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.
When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.
The options that can appear before `CMD` are:
* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)
The health check will first run **interval** seconds after the container is
started, and then again **interval** seconds after each previous check completes.
If a single run of the check takes longer than **timeout** seconds then the check
is considered to have failed.
It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.
There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.
The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).
The command's exit status indicates the health status of the container.
The possible values are:
- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly
If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.
For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:
HEALTHCHECK --interval=5m --timeout=3s \
CMD curl -f http://localhost/ || exit 1
To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).
When the health status of a container changes, a `health_status` event is
generated with the new status. The health status is also displayed in the
`docker ps` output.
Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
SELinux labeling should be disabled when using --privileged mode
/etc/hosts, /etc/resolv.conf, /etc/hostname should not be relabeled if they
are volume mounted into the container.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Currently, using a custom detach key with an invalid sequence, eats a
part of the sequence, making it weird and difficult to enter some key
sequence.
This fixes by keeping the input read when trying to see if it's the key
sequence or not, and "writing" then is the key sequence is not the right
one, preserving the initial input.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Rework memoryStore so that filters and apply run
on a cloned list of containers after the lock has
been released. This avoids possible deadlocks when
these filter/apply callbacks take locks for a
container.
Fixes#22732
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
This fix tries to address the issue raised in #22358 where syslog's
message tag always starts with `docker/` and can not be removed
by changing the log tag templates.
The issue is that syslog driver hardcodes `path.Base(os.Args[0])`
as the prefix, which is the binary file name of the daemon (`dockerd`).
This could be an issue for certain situations (e.g., #22358) where
user may prefer not to have a dedicated prefix in syslog messages.
There is no way to override this behavior in the current verison of
the docker.
This fix tries to address this issue without making changes in the
default behavior of the syslog driver. An additional
`{{.DaemonName}}` has been introduced in the syslog tag. This is
assigned as the `docker` when daemon starts. The default log tag
template has also been changed from
`path.Base(os.Args[0]) + "/{{.ID}}"` to `{{.DaemonName}}/{{.ID}}`.
Therefore, there is no behavior changes when log-tag is not provided.
In order to be consistent, the default log tag for fluentd has been
changed from `docker.{{.ID}}` to `{{DaemonName}}.{{.ID}}` as well.
The documentation for log-tag has been updated to reflect this change.
Additional test cases have been added to cover changes in this fix.
This fix fixes#22358.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
We need to have labels applied even if a container is running in privileged
mode. On an tightly locked down SELinux system, this will cause running
without labels will cause SELinux to block privileged mode containers.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
This generates an ID string for calls to Mount/Unmount, allowing drivers
to differentiate between two callers of `Mount` and `Unmount`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Remove function `WaitRunning` because it's actually not necessary, also
remove wait channel for state "running" to avoid mixed use of the state
wait channel.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Restore the 1.10 logic that will reset the restart manager's timeout or
backoff delay if a container executes longer than 10s reguardless of
exit status or policy.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Currently if you restart docker daemon, all the containers with restart
policy `on-failure` regardless of its `RestartCount` will be started,
this will make daemon cost more extra time for restart.
This commit will stop these containers to do unnecessary start on
daemon's restart.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
This allows a user to specify explicitly to enable
automatic copying of data from the container path to the volume path.
This does not change the default behavior of automatically copying, but
does allow a user to disable it at runtime.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
Signed-off-by: John Starks <jostarks@microsoft.com>
Signed-off-by: Darren Stahl <darst@microsoft.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
This allows users to provide a FQDN as hostname or to use distinct hostname and
domainname parts. Depends on https://github.com/docker/libnetwork/pull/950
Signed-off-by: Tim Hockin <thockin@google.com>
Attach can hang forever if there is no data to send. This PR adds notification
of Attach goroutine about container stop.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Correct creation of a non-existing WORKDIR during docker build to use
remapped root uid/gid on mkdir
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
Moving all strings to the errors package wasn't a good idea after all.
Our custom implementation of Go errors predates everything that's nice
and good about working with errors in Go. Take as an example what we
have to do to get an error message:
```go
func GetErrorMessage(err error) string {
switch err.(type) {
case errcode.Error:
e, _ := err.(errcode.Error)
return e.Message
case errcode.ErrorCode:
ec, _ := err.(errcode.ErrorCode)
return ec.Message()
default:
return err.Error()
}
}
```
This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake.
Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors.
Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API:
```go
switch err.(type) {
case errcode.ErrorCode:
daError, _ := err.(errcode.ErrorCode)
statusCode = daError.Descriptor().HTTPStatusCode
errMsg = daError.Message()
case errcode.Error:
// For reference, if you're looking for a particular error
// then you can do something like :
// import ( derr "github.com/docker/docker/errors" )
// if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... }
daError, _ := err.(errcode.Error)
statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode
errMsg = daError.Message
default:
// This part of will be removed once we've
// converted everything over to use the errcode package
// FIXME: this is brittle and should not be necessary.
// If we need to differentiate between different possible error types,
// we should create appropriate error types with clearly defined meaning
errStr := strings.ToLower(err.Error())
for keyword, status := range map[string]int{
"not found": http.StatusNotFound,
"no such": http.StatusNotFound,
"bad parameter": http.StatusBadRequest,
"conflict": http.StatusConflict,
"impossible": http.StatusNotAcceptable,
"wrong login/password": http.StatusUnauthorized,
"hasn't been activated": http.StatusForbidden,
} {
if strings.Contains(errStr, keyword) {
statusCode = status
break
}
}
}
```
You can notice two things in that code:
1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are.
2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation.
This change removes all our status errors from the errors package and puts them back in their specific contexts.
IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages.
It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface:
```go
type errorWithStatus interface {
HTTPErrorStatusCode() int
}
```
This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method.
I included helper functions to generate errors that use custom status code in `errors/errors.go`.
By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it.
Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors
Signed-off-by: David Calavera <david.calavera@gmail.com>
Add `--restart` flag for `update` command, so we can change restart
policy for a container no matter it's running or stopped.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Currently, when running a container with --ipc=host, if /dev/mqueue is
a standard directory on the hos the daemon will bind mount it allowing
the container to create/modify files on the host.
This commit forces /dev/mqueue to always be of type mqueue except when
the user explicitely requested something to be bind mounted to
/dev/mqueue.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>