This adds both a daemon-wide flag and a container creation property:
- Set the `CgroupnsMode: "host|private"` HostConfig property at
container creation time to control what cgroup namespace the container
is created in
- Set the `--default-cgroupns-mode=host|private` daemon flag to control
what cgroup namespace containers are created in by default
- Set the default if the daemon flag is unset to "host", for backward
compatibility
- Default to CgroupnsMode: "host" for client versions < 1.40
Signed-off-by: Rob Gulewich <rgulewich@netflix.com>
This should eliminate a bunch of new (go-1.11 related) validation
errors telling that the code is not formatted with `gofmt -s`.
No functional change, just whitespace (i.e.
`git show --ignore-space-change` shows nothing).
Patch generated with:
> git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Since the commit d88fe447df ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
After running the test suite with the race detector enabled I found
these gems that need to be fixed.
This is just round one, sadly lost my test results after I built the
binary to test this... (whoops)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
… or could be in `opts` package. Having `runconfig/opts` and `opts`
doesn't really make sense and make it difficult to know where to put
some code.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
This reverts 26103. 26103 was trying to make it so that if someone did:
docker build --build-arg FOO .
and FOO wasn't set as an env var then it would pick-up FOO from the
Dockerfile's ARG cmd. However, it went too far and removed the ability
to specify a build arg w/o any value. Meaning it required the --build-arg
param to always be in the form "name=value", and not just "name".
This PR does the right fix - it allows just "name" and it'll grab the value
from the env vars if set. If "name" isn't set in the env then it still needs
to send "name" to the server so that a warning can be printed about an
unused --build-arg. And this is why buildArgs in the options is now a
*string instead of just a string - 'nil' == mentioned but no value.
Closes#29084
Signed-off-by: Doug Davis <dug@us.ibm.com>
Daemon still does validation and errors out on incorrect options.
Fixes an issue where non-Linux clients attempting to pass tmpfs options
on `docker run` to a Linux daemon will incorrectly error out.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
`StreamConfig` carries with it a dep on libcontainerd, which is used by
other projects, but libcontainerd doesn't compile on all platforms, so
move it to `github.com/docker/docker/container/stream`
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix is a follow up to #27567 based on:
https://github.com/docker/docker/pull/27567#issuecomment-259295055
In #27567, `--dns-options` has been added to `service create/update`,
together with `--dns` and `--dns-search`. The `--dns-opt` was used
in `docker run`.
This fix add `--dns-option` (not `--dns-options`) to `docker run/create`, and hide
`--dns-opt`. It is still possible to use `--dns-opt` with
`docker run/create`, though it will not show up in help output.
This fix change `--dns-options`to --dns-option` for `docker service create`
and `docker service update`.
This fix also updates the docs and bash/zsh completion scripts.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
update cobra and use Tags
Signed-off-by: Victor Vieux <vieux@docker.com>
allow client to talk to an older server
Signed-off-by: Victor Vieux <vieux@docker.com>
This fix is part of the fix for issue 25099. In 25099, if an env
has a empty name, then `docker run` will throw out an error:
```
ubuntu@ubuntu:~/docker$ docker run -e =A busybox true
docker: Error response from daemon: invalid header field value "oci runtime error:
container_linux.go:247: starting container process caused \"process_linux.go:295:
setting oom score for ready process caused \\\"write /proc/83582/oom_score_adj:
invalid argument\\\"\"\n".
```
This fix validates the Env in the container spec before it is sent
to containerd/runc.
Integration tests have been created to cover the changes.
This fix is part of fix for 25099 (not complete yet, non-utf case
may require a fix in `runc`).
This fix is related to 25300.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix tries to address the proposal raised in 27921 and add
`--cpus` flag for `docker run/create`.
Basically, `--cpus` will allow user to specify a number (possibly partial)
about how many CPUs the container will use. For example, on a 2-CPU system
`--cpus 1.5` means the container will take 75% (1.5/2) of the CPU share.
This fix adds a `NanoCPUs` field to `HostConfig` since swarmkit alreay
have a concept of NanoCPUs for tasks. The `--cpus` flag will translate
the number into reused `NanoCPUs` to be consistent.
This fix adds integration tests to cover the changes.
Related docs (`docker run` and Remote APIs) have been updated.
This fix fixes 27921.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix tries to address the issue raised in 27969 where
duplicate identical bind mounts for `docker run` caused additional volumes
to be created.
The reason was that in `runconfig`, if duplicate identical bind mounts
have been specified, the `copts.volumes.Delete(bind)` will not truly
delete the second entry from the slice. (Only the first entry is deleted).
This fix fixes the issue.
An integration test has been added to cover the changes
This fix fixes 27969.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>