Since the commit d88fe447df ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This adds the new `CreatedAt` field to the API version history
and updates some examples to show this information.
The `CreatedAt` field was implemented in a46f757c40
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit c79c16910c
inadvertently put these API changes under API 1.31,
but they were added in API 1.30.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Enables other subsystems to watch actions for a plugin(s).
This will be used specifically for implementing plugins on swarm where a
swarm controller needs to watch the state of a plugin.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
COmmit 0307fe1a0b added
a new `DataPathAddr` property to the swarm/init and swarm/join
endpoints. This property was not yet added to the
documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fix tries to add a `scope` in the query of `/networks/<id>`
(`NetworkInspect`) so that in case of duplicate network names,
it is possible to locate the network ID based on the network
scope (`local`, 'swarm', or `global`).
Multiple networks might exist in different scopes, which is a legitimate case.
For example, a network name `foo` might exists locally and in swarm network.
However, before this PR it was not possible to query a network name `foo`
in a specific scope like swarm.
This fix fixes the issue by allowing a `scope` query in `/networks/<id>`.
Additional test cases have been added to unit tests and integration tests.
This fix is related to docker/cli#167, moby/moby#30897, moby/moby#33561, moby/moby#30242
This fix fixesdocker/cli#167
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
With the Moby/Docker split, no decisions have been
made yet how, and when to bump the API version.
Although these decisions should not be lead
by Docker releases, I'm bumping the API version
to not complicate things for now; after this bump
we should make a plan how to handle this in future
(for example, using SemVer for the REST api, and
bump with every change).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This patch adds the untilRemoved option to the ContainerWait API which
allows the client to wait until the container is not only exited but
also removed.
This patch also adds some more CLI integration tests for waiting for a
created container and waiting with the new --until-removed flag.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Handle detach sequence in CLI
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Update Container Wait Conditions
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Apply container wait changes to API 1.30
The set of changes to the containerWait API missed the cut for the
Docker 17.05 release (API version 1.29). This patch bumps the version
checks to use 1.30 instead.
This patch also makes a minor update to a testfile which was added to
the builder/dockerfile package.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Remove wait changes from CLI
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address minor nits on wait changes
- Changed the name of the tty Proxy wrapper to `escapeProxy`
- Removed the unnecessary Error() method on container.State
- Fixes a typo in comment (repeated word)
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Use router.WithCancel in the containerWait handler
This handler previously added this functionality manually but now uses
the existing wrapper which does it for us.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Add WaitCondition constants to api/types/container
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address more ContainerWait review comments
- Update ContainerWait backend interface to not return pointer values
for container.StateStatus type.
- Updated container state's Wait() method comments to clarify that a
context MUST be used for cancelling the request, setting timeouts,
and to avoid goroutine leaks.
- Removed unnecessary buffering when making channels in the client's
ContainerWait methods.
- Renamed result and error channels in client's ContainerWait methods
to clarify that only a single result or error value would be sent
on the channel.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Move container.WaitCondition type to separate file
... to avoid conflict with swagger-generated code for API response
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address more ContainerWait review comments
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
in the Docker REST APIs when viewing or updating the swarm spec info, and
also propagate the desired CA key in the Docker REST APIs when updating
swarm spec info only (it is not available for viewing).
Signed-off-by: Ying Li <ying.li@docker.com>
objects into the REST API responses. In the CLI, display only
whether the nodes' TLS info matches the cluster's TLS info, or
whether the node needs cert rotation.
Signed-off-by: Ying Li <ying.li@docker.com>
This is synonymous with `docker run --cidfile=FILE` and writes the digest of
the newly built image to the named file. This is intended to be used by build
systems which want to avoid tagging (perhaps because they are in CI or
otherwise want to avoid fixed names which can clash) by enabling e.g. Makefile
constructs like:
image.id: Dockerfile
docker build --iidfile=image.id .
do-some-more-stuff: image.id
do-stuff-with <image.id
Currently the only way to achieve this is to use `docker build -q` and capture
the stdout, but at the expense of losing the build output.
In non-silent mode (without `-q`) with API >= v1.29 the caller will now see a
`JSONMessage` with the `Aux` field containing a `types.BuildResult` in the
output stream for each image/layer produced during the build, with the final
one being the end product. Having all of the intermediate images might be
interesting in some cases.
In silent mode (with `-q`) there is no change, on success the only output will
be the resulting image digest as it was previosuly.
There was no wrapper to just output an Aux section without enclosing it in a
Progress, so add one here.
Added some tests to integration cli tests.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
The `Log` field for plugins was added to `/info` in
17abacb894 but the swagger spec was not
updated.
This just updates the spec to match reality.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix tries to address the issue raised in 31032 where it was
not possible to specify `--cpus` for `docker update`.
This fix adds `--cpus` support for `docker update`. In case both
`--cpus` and `--cpu-period/--cpu-quota` have been specified,
an error will be returned.
Related docs has been updated.
Integration tests have been added.
This fix fixes 31032.
This fix is related to 27921, 27958.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix tries to address the request in 31324 by adding
`--filter scope=swarm|local` for `docker network ls`.
As `docker network ls` has a `SCOPE` column by default,
it is natural to add the support of `--filter scope=swarm|local`.
This fix adds the `scope=swarm|local` support for
`docker network ls --filter`.
Related docs has been updated.
Additional unit test cases have been added.
This fix fixes 31324.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix tries to address the request in 31325 by adding
`--filter mode=global|replicated` to `docker service ls`.
As `docker service ls` has a `MODE` column by default, it is natural
to support `--filter mode=global|replicated` for `docker service ls`.
There are multiple ways to address the issue. One way is to pass
the filter of mode to SwarmKit, another way is to process the filter
of mode in the daemon.
This fix process the filter in the daemon.
Related docs has been updated.
An integration test has been added.
This fix fixes 31325.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
In https://github.com/torvalds/linux/commit/5ca3726 (released in v4.7-rc1) the
content of the `cpuacct.usage_percpu` file in sysfs was changed to include both
online and offline cpus. This broke the arithmetic in the stats helpers used by
`docker stats`, since it was using the length of the PerCPUUsage array as a
proxy for the number of online CPUs.
Add current number of online CPUs to types.StatsJSON and use it in the
calculation.
Keep a fallback to `len(v.CPUStats.CPUUsage.PercpuUsage)` so this code
continues to work when talking to an older daemon. An old client talking to a
new daemon will ignore the new field and behave as before.
Fixes#28941.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>