This adds both a daemon-wide flag and a container creation property:
- Set the `CgroupnsMode: "host|private"` HostConfig property at
container creation time to control what cgroup namespace the container
is created in
- Set the `--default-cgroupns-mode=host|private` daemon flag to control
what cgroup namespace containers are created in by default
- Set the default if the daemon flag is unset to "host", for backward
compatibility
- Default to CgroupnsMode: "host" for client versions < 1.40
Signed-off-by: Rob Gulewich <rgulewich@netflix.com>
This is enabled for all containers that are not run with --privileged,
if the kernel supports it.
Fixes#38332
Signed-off-by: Rob Gulewich <rgulewich@netflix.com>
This test case requires not just daemon >= 1.40, but also
client API >= 1.40. In case older client is used, we'll
get failure from the very first check:
> ipcmode_linux_test.go:313: assertion failed: shareable (string) != private (string)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Older versions of the daemon would concatenate hostname and
domainname, so hostname "foobar" and domainname "baz.cyphar.com"
would produce `foobar.baz.cyphar.com` as hostname.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Create a new container for each subtest, so that individual
subtests are self-contained, and there's no need to execute
them in the exact order, or resetting the container in between.
This makes the test slower (6.54s vs 3.43s), but reduced the
difference by using `network=host`, which made a substantial
difference (without `network=host`, the test took more than
twice as long: 13.96s).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Don't set `PidsLimit` when creating a container and
no limit was set (or the limit was set to "unlimited")
- Don't set `PidsLimit` if the host does not have pids-limit
support (previously "unlimited" was set).
- Do not generate a warning if the host does not have pids-limit
support, but pids-limit was set to unlimited (having no
limit set, or the limit set to "unlimited" is equivalent,
so no warning is nescessary in that case).
- When updating a container, convert `0`, and `-1` to
"unlimited" (`0`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This changes the default ipc mode of daemon/engine to be private,
meaning the containers will not have their /dev/shm bind-mounted
from the host by default. The benefits of doing this are:
1. No leaked mounts. Eliminate a possibility to leak mounts into
other namespaces (and therefore unfortunate errors like "Unable to
remove filesystem for <ID>: remove /var/lib/docker/containers/<ID>/shm:
device or resource busy").
2. Working checkpoint/restore. Make `docker checkpoint`
not lose the contents of `/dev/shm`, but save it to
the dump, and be restored back upon `docker start --checkpoint`
(currently it is lost -- while CRIU handles tmpfs mounts,
the "shareable" mount is seen as external to container,
and thus rightfully ignored).
3. Better security. Currently any container is opened to share
its /dev/shm with any other container.
Obviously, this change will break the following usage scenario:
$ docker run -d --name donor busybox top
$ docker run --rm -it --ipc container:donor busybox sh
Error response from daemon: linux spec namespaces: can't join IPC
of container <ID>: non-shareable IPC (hint: use IpcMode:shareable
for the donor container)
The soution, as hinted by the (amended) error message, is to
explicitly enable donor sharing by using --ipc shareable:
$ docker run -d --name donor --ipc shareable busybox top
Compatibility notes:
1. This only applies to containers created _after_ this change.
Existing containers are not affected and will work fine
as their ipc mode is stored in HostConfig.
2. Old backward compatible behavior ("shareable" containers
by default) can be enabled by either using
`--default-ipc-mode shareable` daemon command line option,
or by adding a `"default-ipc-mode": "shareable"`
line in `/etc/docker/daemon.json` configuration file.
3. If an older client (API < 1.40) is used, a "shareable" container
is created. A test to check that is added.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Move the test case from integration-cli to integration.
The test logic itself has not changed, except these
two things:
* the new test sets default-ipc-mode via command line
rather than via daemon.json (less code);
* the new test uses current API version.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Older API clients did not use a pointer for `PidsLimit`, so
API requests would always send `0`, resulting in any previous
value to be reset after an update:
Before this patch:
(using a 17.06 Docker CLI):
```bash
docker run -dit --name test --pids-limit=16 busybox
docker container inspect --format '{{json .HostConfig.PidsLimit}}' test
16
docker container update --memory=100M --memory-swap=200M test
docker container inspect --format '{{json .HostConfig.PidsLimit}}' test
0
docker container exec test cat /sys/fs/cgroup/pids/pids.max
max
```
With this patch applied:
(using a 17.06 Docker CLI):
```bash
docker run -dit --name test --pids-limit=16 busybox
docker container inspect --format '{{json .HostConfig.PidsLimit}}' test
16
docker container update --memory=100M --memory-swap=200M test
docker container inspect --format '{{json .HostConfig.PidsLimit}}' test
16
docker container exec test cat /sys/fs/cgroup/pids/pids.max
16
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Add support for exact list of capabilities, support only OCI model
- Support OCI model on CapAdd and CapDrop but remain backward compatibility
- Create variable locally instead of declaring it at the top
- Use const for magic "ALL" value
- Rename `cap` variable as it overlaps with `cap()` built-in
- Normalize and validate capabilities before use
- Move validation for conflicting options to validateHostConfig()
- TweakCapabilities: simplify logic to calculate capabilities
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A recent CI run shows a (seemingly random) failure from this test:
> 00:14:37.289 --- FAIL: TestRenameAnonymousContainer (1.75s)
> 00:14:37.289 rename_test.go:169: assertion failed: 0 (int) != 1 (inspect.State.ExitCode int): container baac251d5a1cb2221ffedf6f10acbad166b90e3549601e96d908e76762675a81 exited with the wrong exitcode: {ContainerJSONBase:0xc0007a4840 Mounts:[] Config:0xc000714500 NetworkSettings:0xc000235b00}
Apparently, printing the whole `inspect` value does not make any sense.
Let's output `inspect.State.Error` instead, maybe it will help to
figure out what is going on here.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
A client is already created in testenv.New(), so we can just
as well use that one, instead of creating a new client.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`testEnv` is a package-level variable, so protecting / restoring
`testEnv` in parallel will result in "concurrent map write" errors.
This patch removes `t.Parallel()` from tests that use this
functionality (through `defer setupTest(t)()`).
Note that _subtests_ can still be run in parallel, as the defer
will be called after all subtests have completed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fix tries to address the issue raised in 37038 where
there were no memory.kernelTCP support for linux.
This fix add MemoryKernelTCP to HostConfig, and pass
the config to runtime-spec.
Additional test case has been added.
This fix fixes 37038.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This allows non-recursive bind-mount, i.e. mount(2) with "bind" rather than "rbind".
Swarm-mode will be supported in a separate PR because of mutual vendoring.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
- Add windows CI entrypoint script.
Signed-off-by: John Howard <jhoward@microsoft.com>
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
This adds MaskedPaths and ReadOnlyPaths options to HostConfig for containers so
that a user can override the default values.
When the value sent through the API is nil the default is used.
Otherwise the default is overridden.
Adds integration tests for MaskedPaths and ReadonlyPaths.
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
1. As daemon.ContainerStop() documentation says,
> If a negative number of seconds is given, ContainerStop
> will wait for a graceful termination.
but since commit cfdf84d5d0 (PR #32237) this is no longer the case.
This happens because `context.WithTimeout(ctx, timeout)` is implemented
as `WithDeadline(ctx, time.Now().Add(timeout))`, resulting in a deadline
which is in the past.
To fix, don't use WithDeadline() if the timeout is negative.
2. Add a test case to validate the correct behavior and
as a means to prevent a similar regression in the future.
3. Fix/improve daemon.ContainerStop() and client.ContainerStop()
description for clarity and completeness.
4. Fix/improve DefaultStopTimeout description.
Fixes: cfdf84d5d0 ("Update Container Wait")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We really need to run those on the CI too at some point.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Signed-off-by: Tibor Vass <tibor@docker.com>
This fix consists of some improvement in restart_test.go
by replacing Fatal with assert, so that they are consistent
with other tests in integration/container.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix converts some `client.ContainerCreate` to `container.Create`,
and removes some unneeded `name` fields when test containers are created.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
… and do not use the `docker` cli in it. One of the reason of this
move is to not make `integration` package using legacy
`integration-cli` package.
Next move will be to support swarm within this package *and* provide
some helper function using the api (compared to the one using cli in
`integration-cli/daemon` package).
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
This test case checks that a container created before start
of the currently running dockerd can be exported (as reported
in #36561). To satisfy this condition, either a pre-existing
container is required, or a daemon restart after container
creation.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
I am not quite sure why but this test is sometimes failing like this:
> 15:21:41 --- FAIL: TestLinksEtcHostsContentMatch (0.53s)
> 15:21:41 assertions.go:226:
>
> Error Trace: links_linux_test.go:46
> 15:21:41
> Error: Not equal:
> 15:21:41
> expected: "127.0.0.1\tlocalhost\n::1\tlocalhost
> ip6-localhost
> ip6-loopback\nfe00::0\tip6-localnet\nff00::0\tip6-mcastprefix\nff02::1\tip6-allnodes\nff02::2\tip6-allrouters\n172.17.0.2\tf53feb6df161\n"
> 15:21:41
> received: ""
To eliminate some possible failures (like ignoring stderr from `cat` or
its exit code), let's use container.Exec() to read a file from a container.
Fixes: e6bd20edcb ("Migrate some integration-cli test to api tests")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
As mentioned in commit 9e31938, test cases that use t.Parallel()
and start a docker daemon might step on each other toes as they
try to configure iptables during startup, resulting in flaky tests.
To avoid this, --iptables=false should be used while starting daemon.
Fixes: eaa5192856 ("Make container resource mounts unbindable")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This fix addresses `expected` vs `actual` in integration tests
so that they match `assert.Equal(t, expected, actual)`
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix adds several improvement:
1. No need for explicit ContainerRemove as it has been handled in setupTest()
2. Added `container.WithImage` helper function and used it in commit tests.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
The canonical import comment was added some time ago, though several
newly added files do not have the comment. This fix adds the missing
canonical import comment to files in integration tests
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix is a minor enhancement to replace several ContainerCreate with
helper funcs of `container.Create` in tests.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix moves helper functions containerIsStopped and
containerIsInState to integration/internal/container,
so that they could be used outside of integration/container.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
1. Use integration/internal/exec, removing the getContainerSysFSValue().
2. Avoid repeating magic numbers, use a variable for those.
3. Fix order of arguments to assert.Equal (first "expected", then "actual").
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
An implementation of exec in TestUpdateCPUQUota had a few issues,
including resource leaking and calling both ContainerExecAttach and
ContainerExecRun. The last one makes the test flaky:
update_linux_test.go:136: expected cgroup value 20000, got: Error: Exec
command f923baf709525f6b38f6511126addc5d9bb88fb477eeca1c22440551090fa2bb
is already running
Fix by using the integration/internal/exec package.
While at it, use require/assert to further improve code readability.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When tailing a container log, if the log file is empty it will cause the
log stream to abort with an unexpected `EOF`.
Note that this only applies to the "current" log file as rotated files
cannot be empty.
This fix just skips adding the "current" file the log tail if it is
empty.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix is a sync up with 36266 so that relevant api tests
use the newly added container.Run/Create in helper package
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix is a follow up to 36266 to update some api tests
to use the newly added container helper package.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
To help creating/running/… containers using the client for test integration.
This should make test more readable and reduce duplication a bit.
Usage example
```
// Create a default container named foo
id1 := container.Create(t, ctx, client, container.WithName("foo"))
// Run a default container with a custom command
id2 := container.Run(t, ctx, client, container.WithCmd("echo", "hello world"))
```
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Both names have no real sense, but one allows to make sure these packages
aren't used outside of `integration`.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
When the daemon restores containers on daemon restart, it syncs up with
containerd to determine the existing state. For stopped containers it
then removes the container metadata from containerd.
In some cases this is not handled properly and causes an error when
someone attempts to start that container again.
In particular, this case is just a bad error check.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix migrates TestCreateTmpfsMountsTarget test to api test,
and removed integration-cli/docker_cli_create_unix_test.go
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
By default, if a user requests a bind mount it uses private propagation.
When the source path is a path within the daemon root this, along with
some other propagation values that the user can use, causes issues when
the daemon tries to remove a mountpoint because a container will then
have a private reference to that mount which prevents removal.
Unmouting with MNT_DETATCH can help this scenario on newer kernels, but
ultimately this is just covering up the problem and doesn't actually
free up the underlying resources until all references are destroyed.
This change does essentially 2 things:
1. Change the default propagation when unspecified to `rslave` when the
source path is within the daemon root path or a parent of the daemon
root (because everything is using rbinds).
2. Creates a validation error on create when the user tries to specify
an unacceptable propagation mode for these paths...
basically the only two acceptable modes are `rslave` and `rshared`.
In cases where we have used the new default propagation but the
underlying filesystem is not setup to handle it (fs must hvae at least
rshared propagation) instead of erroring out like we normally would,
this falls back to the old default mode of `private`, which preserves
backwards compatibility.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix is a follow up to 36198 by adding description
to TestContainerNetworkMountsNoChown so that it is clear
about the purpose of the test for ownership.
This fix is related to comment in 36198.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
As there is already a runSimpleContainer, I think it makes
sense to combine runSimpleContainer with runContainer for rename test
to reduce code duplication.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix migrates TestContainersAPINetworkMountsNoChown from
integration-cli to api tests in integration.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
It's a common scenario for admins and/or monitoring applications to
mount in the daemon root dir into a container. When doing so all mounts
get coppied into the container, often with private references.
This can prevent removal of a container due to the various mounts that
must be configured before a container is started (for example, for
shared /dev/shm, or secrets) being leaked into another namespace,
usually with private references.
This is particularly problematic on older kernels (e.g. RHEL < 7.4)
where a mount may be active in another namespace and attempting to
remove a mountpoint which is active in another namespace fails.
This change moves all container resource mounts into a common directory
so that the directory can be made unbindable.
What this does is prevents sub-mounts of this new directory from leaking
into other namespaces when mounted with `rbind`... which is how all
binds are handled for containers.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Fixes an issue where if cpu quota/period is sent via the update API, the
values are updated in the stored container data but not actually sent to
the running container.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Using parallel tests is nice, however it can cause an issue with
multiple daemons trying to make changes to iptables at the same time
which causes flakey tests.
This just disables iptables for the set of tests since it is not
required.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix adds a test case for 35333: Devicemapper: ignore Nodata errors when delete thin device
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
When the containerd 1.0 runtime changes were made, we inadvertantly
removed the functionality where any running containers are killed on
startup when not using live-restore.
This change restores that behavior.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
split all non-cli portions into a new internal/test/environment package
Set a test environment on packages instead of creating new ones.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
This adds a new package `integration` where `engine` integration tests
should live. Those integration tests should not depends on any `cli`
components (except from the `dockerd` daemon for now — to actually
start a daemon).
Signed-off-by: Vincent Demeester <vincent@sbr.pm>