Some configuration in a container depends on whether it has support for
IPv6 (including default entries for '::1' etc in '/etc/hosts').
Before this change, the container's support for IPv6 was determined by
whether it was connected to any IPv6-enabled networks. But, that can
change over time, it isn't a property of the container itself.
So, instead, detect IPv6 support by looking for '::1' on the container's
loopback interface. It will not be present if the kernel does not have
IPv6 support, or the user has disabled it in new namespaces by other
means.
Once IPv6 support has been determined for the container, its '/etc/hosts'
is re-generated accordingly.
The daemon no longer disables IPv6 on all interfaces during initialisation.
It now disables IPv6 only for interfaces that have not been assigned an
IPv6 address. (But, even if IPv6 is disabled for the container using the
sysctl 'net.ipv6.conf.all.disable_ipv6=1', interfaces connected to IPv6
networks still get IPv6 addresses that appear in the internal DNS. There's
more to-do!)
Signed-off-by: Rob Murray <rob.murray@docker.com>
`VolumeOptions` now has a `Subpath` field which allows to specify a path
relative to the volume that should be mounted as a destination.
Symlinks are supported, but they cannot escape the base volume
directory.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
We have many "image" packages, so these vars easily conflict/shadow
imports. Let's rename them (and in some cases use a const) to
prevent that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change adds a TempDir function that ensures the correct permissions for
the fake-root user in rootless mode.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
setupTest should be called before Parallel as it modifies the test
environment which might produce:
```
fatal error: concurrent map writes
goroutine 143 [running]:
github.com/docker/docker/testutil/environment.(*Execution).ProtectContainer(...)
/go/src/github.com/docker/docker/testutil/environment/protect.go:59
github.com/docker/docker/testutil/environment.ProtectContainers({0x12e8d98, 0xc00040e420}, {0x12f2878?, 0xc0004fc340}, 0xc0001fac00)
/go/src/github.com/docker/docker/testutil/environment/protect.go:68 +0xb1
github.com/docker/docker/testutil/environment.ProtectAll({0x12e8d98, 0xc00040e210}, {0x12f2878, 0xc0004fc340}, 0xc0001fac00)
/go/src/github.com/docker/docker/testutil/environment/protect.go:45 +0xf3
github.com/docker/docker/integration/image.setupTest(0xc0004fc340)
/go/src/github.com/docker/docker/integration/image/main_test.go:46 +0x59
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Update the TestDaemonRestartKilContainers integration test to assert
that a container's healthcheck status is always reset to the Starting
state after a daemon restart, even when the container is live-restored.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Layer size is the sum of the individual files count, not the tar
archive. Use the total bytes read returned by `io.Copy` to populate the
`Size` field.
Also set the digest to the actual digest of the tar archive.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The new OCI-compatible archive export relies on the Descriptors returned
by the layer (`distribution.Describable` interface implementation).
The issue with that is that the `roLayer` and the `referencedCacheLayer`
types don't implement this interface. Implementing that interface for
them based on their `descriptor` doesn't work though, because that
descriptor is empty.
To workaround this issue, just create a new descriptor if the one
provided by the layer is empty.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
No more concept of "anonymous endpoints". The equivalent is now an
endpoint with no DNSNames set.
Some of the code removed by this commit was mutating user-supplied
endpoint's Aliases to add container's short ID to that list. In order to
preserve backward compatibility for the ContainerInspect endpoint, this
commit also takes care of adding that short ID (and the container
hostname) to `EndpointSettings.Aliases` before returning the response.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Calculate the IPv6 addreesses needed on a bridge, then reconcile them
with the addresses on an existing bridge by deleting then adding as
required.
(Previously, required addresses were added one-by-one, then unwanted
addresses were removed. This meant the daemon failed to start if, for
example, an existing bridge had address '2000:db8::/64' and the config
was changed to '2000:db8::/80'.)
IPv6 addresses are now calculated and applied in one go, so there's no
need for setupVerifyAndReconcile() to check the set of IPv6 addresses on
the bridge. And, it was guarded by !config.InhibitIPv4, which can't have
been right. So, removed its IPv6 parts, and added IPv4 to its name.
Link local addresses, the example given in the original ticket, are now
released when containers are stopped. Not releasing them meant that
when using an LL subnet on the default bridge, no container could be
started after a container was stopped (because the calculated address
could not be re-allocated). In non-default bridge networks using an
LL subnet, addresses leaked.
Linux always uses the standard 'fe80::/64' LL network. So, if a bridge
is configured with an LL subnet prefix that overlaps with it, a config
error is reported. Non-overlapping LL subnet prefixes are allowed.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Isolate the prune effects by running the test in a separate daemon.
This minimizes the impact of/on other integration tests.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Rewrite `.build-empty-images` shell script that produced special images
(emptyfs with no layers, and empty danglign image) to a Go functions
that construct the same archives in a temporary directory.
Use them to load these images on demand only in the tests that need
them.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
These tests build new images, setupTest sets up the test cleanup
function that clears the test environment from created images,
containers, etc.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This removes various skips that accounted for running the integration tests
against older versions of the daemon before 20.10 (API version v1.41). Those
versions are EOL, and we don't run tests against them.
This reverts most of e440831802, and similar
PRs.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was using API version 1.20 to test old behavior, but the actual change
in behavior was API v1.25; see commit 6d98e344c7
and 63b5a37203.
This updates the test to use API v1.24 to test the old behavior.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
TestKillDifferentUserContainer was migrated from integration-cli in
commit 0855922cd3. Before migration, it
was not using a specific API version, so we can assume "current"
API version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Replace `time.Sleep` with a poll that checks if process no longer exists
to avoid possible race condition.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When live-restore is enabled, containers with autoremove enabled
shouldn't be forcibly killed when engine restarts.
They still should be removed if they exited while the engine was down
though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is purely cosmetic - if a non-default MTU is configured, the bridge
will have the default MTU=1500 until a container's 'veth' is connected
and an MTU is set on the veth. That's a disconcerting, it looks like the
config has been ignored - so, set the bridge's MTU explicitly.
Fixes#37937
Signed-off-by: Rob Murray <rob.murray@docker.com>
The graphdriver implementation sets the ModTime of all image content to
match the `Created` time from the image config, whereas the containerd's
archive export code just leaves it empty (zero).
Adjust the test in the case where containerd integration is enabled to
check if config file ModTime is equal to zero (UNIX epoch) instead.
This behaviour is not a part of the Docker Image Specification and the
intention behind introducing it was to make the `docker save` produce
the same archive regardless of the time it was performed.
It would also be a bit problematic with the OCI archive layout which can
contain multiple images referencing the same content.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This test broke in 98323ac114.
This commit renamed WithMacAddress into WithContainerWideMacAddress.
This helper sets the MacAddress field in container.Config. However, API
v1.44 now ignores this field if the NetworkMode has no matching entry in
EndpointsConfig.
This fix uses the helper WithMacAddress and specify for which
EndpointConfig the MacAddress is specified.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This test was rewritten from an integration-cli test in commit
68d9beedbe, and originally implemented in
f4942ed864, which rewrote it from a unit-
test to an integration test.
Originally, it would check for the raw JSON response from the daemon, and
check for individual fields to be present in the output, but after commit
0fd5a65428, `client.Info()` was used, and
now the response is unmarshalled into a `system.Info`.
The remainder of the test remained the same in that rewrite, and as a
result were were now effectively testing if a `system.Info` struct,
when marshalled as JSON would show all the fields (surprise: it does).
TL;DR; the test would even pass with an empty `system.Info{}` struct,
which didn't provide much coverage, as it passed without a daemon:
func TestInfoAPI(t *testing.T) {
// always shown fields
stringsToCheck := []string{
"ID",
"Containers",
"ContainersRunning",
"ContainersPaused",
"ContainersStopped",
"Images",
"LoggingDriver",
"OperatingSystem",
"NCPU",
"OSType",
"Architecture",
"MemTotal",
"KernelVersion",
"Driver",
"ServerVersion",
"SecurityOptions",
}
out := fmt.Sprintf("%+v", system.Info{})
for _, linePrefix := range stringsToCheck {
assert.Check(t, is.Contains(out, linePrefix))
}
}
This patch makes the test _slightly_ better by checking if the fields
are non-empty. More work is needed on this test though; currently it
uses the (already running) daemon, so it's hard to check for specific
fields to be correct (withouth knowing state of the daemon), but it's
not unlikely that other tests (partially) cover some of that. A TODO
comment was added to look into that (we should probably combine some
tests to prevent overlap, and make it easier to spot "gaps" as well).
While working on this, also moving the `SystemTime` into this test,
because that field is (no longer) dependent on "debug" state
(It is was actually this change that led me down this rabbit-hole)
()_()
(-.-)
'(")(")'
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Following tests are implemented in this specific commit:
- Inter-container communications for internal and non-internal
bridge networks, over IPv4 and IPv6.
- Inter-container communications using IPv6 link-local addresses for
internal and non-internal bridge networks.
- Inter-network communications for internal and non-internal bridge
networks, over IPv4 and IPv6, are disallowed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This commit introduces a new integration test suite aimed at testing
networking features like inter-container communication, network
isolation, port mapping, etc... and how they interact with daemon-level
and network-level parameters.
So far, there's pretty much no tests making sure our networks are well
configured: 1. there're a few tests for port mapping, but they don't
cover all use cases ; 2. there're a few tests that check if a specific
iptables rule exist, but that doesn't prevent that specific iptables
rule to be wrong in the first place.
As we're planning to refactor how iptables rules are written, and change
some of them to fix known security issues, we need a way to test all
combinations of parameters. So far, this was done by hand, which is
particularly painful and time consuming. As such, this new test suite is
foundational to upcoming work.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This delete was originally added in b37fdc5dd1
and migrated from `deleteImages(repoName)` in commit 1e55ace875,
however, deleting `foobar-save-multi-images-test` (`foobar-save-multi-images-test:latest`)
always resulted in an error;
Error response from daemon: No such image: foobar-save-multi-images-test:latest
This patch removes the redundant image delete.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Shutting down containers on Windows can take a long time (with hyper-v),
causing this test to be flaky; seen failing on windows 2022;
=== FAIL: github.com/docker/docker/integration/image TestSaveRepoWithMultipleImages (23.16s)
save_test.go:104: timeout waiting for container to exit
Looking at the test, we run a container only to commit it, and the test
does not make changes to the container's filesystem; it only runs a container
with a custom command (`true`).
Instead of running the container, we can _create_ a container and commit it;
this simplifies the tests, and prevents having to wait for the container to
exit (before committing).
To verify:
make BIND_DIR=. DOCKER_GRAPHDRIVER=vfs TEST_FILTER=TestSaveRepoWithMultipleImages test-integration
INFO: Testing against a local daemon
=== RUN TestSaveRepoWithMultipleImages
--- PASS: TestSaveRepoWithMultipleImages (1.20s)
PASS
DONE 1 tests in 2.668s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Adds test ensuring that additional groups set with `--group-add`
are kept on exec when container had `--user` set on run.
Regression test for https://github.com/moby/moby/issues/46712
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Having a sandbox/container-wide MacAddress field makes little sense
since a container can be connected to multiple networks at the same
time. This field is an artefact of old times where a container could be
connected to a single network only.
As we now have a way to specify per-endpoint mac address, this field is
now deprecated.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prior to this commit, only container.Config had a MacAddress field and
it's used only for the first network the container connects to. It's a
relic of old times where custom networks were not supported.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
commit def549c8f6 passed through the context
to the daemon.ContainerStart function. As a result, restarting containers
no longer is an atomic operation, because a context cancellation could
interrupt the restart (between "stopping" and "(re)starting"), resulting
in the container being stopped, but not restarted.
Restarting a container, or more factually; making a successful request on
the `/containers/{id]/restart` endpoint, should be an atomic operation.
This patch uses a context.WithoutCancel for restart requests.
It's worth noting that daemon.containerStop already uses context.WithoutCancel,
so in that function, we'll be wrapping the context twice, but this should
likely not cause issues (just redundant for this code-path).
Before this patch, starting a container that bind-mounts the docker socket,
then restarting itself from within the container would cancel the restart
operation. The container would be stopped, but not started after that:
docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
docker exec myself sh -c 'docker restart myself'
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3a2a741c65ff docker:cli "docker-entrypoint.s…" 26 seconds ago Exited (128) 7 seconds ago myself
With this patch: the stop still cancels the exec, but does not cancel the
restart operation, and the container is started again:
docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
docker exec myself sh -c 'docker restart myself'
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4393a01f7c75 docker:cli "docker-entrypoint.s…" About a minute ago Up 4 seconds myself
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Check for accurate values that may contain content sizes unknown to the
usage test in the calculation. Avoid asserting using deep equals when
only the expected value range is known to the test.
Signed-off-by: Derek McGowan <derek@mcg.dev>
We support importing images for other platforms when
using the containerd image store, so we shouldn't validate
the image OS on import.
This commit also splits the test into two, so that we can
keep running the "success" import with a custom platform tests
running w/ c8d while skipping the "error/rejection" test cases.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
This is a follow-up to 2216d3ca8d, which
implemented the StartInterval for health-checks, but did not add validation
for the minimum accepted interval;
> The time to wait between checks in nanoseconds during the start period.
> It should be 0 or at least 1000000 (1 ms). 0 means inherit.
This patch adds validation for the minimum accepted interval (1ms).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implement a behavior from the graphdriver's export where `docker save
something` (untagged reference) would export all images matching the
specified repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`docker build --squash` is an experimental feature which is not
implemented for containerd image store.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The following changes were required:
* integration/build: progressui's signature changed in 6b8fbed01e
* builder-next: flightcontrol.Group has become a generic type in 8ffc03b8f0
* builder-next/executor: add github.com/moby/buildkit/executor/resources types, necessitated by 6e87e4b455
* builder-next: stub util/network/Namespace.Sample(), necessitated by 963f16179f
Co-authored-by: CrazyMax <crazy-max@users.noreply.github.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Diffing a container yielded some extra changes that come from the
files/directories that we mount inside the container (/etc/resolv.conf
for example). To avoid that we create an intermediate snapshot that has
these files, with this we can now diff the container fs with its parent
and only get the differences that were made inside the container.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This test checks how the layer store works, so we don't need it when we
use containerd as image store
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The API endpoint `/containers/create` accepts several EndpointsConfig
since v1.22 but the daemon would error out in such case. This check is
moved from the daemon to the api and is now applied only for API < 1.44,
effectively allowing the daemon to create containers connected to
several networks.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Fixes#18864, #20648, #33561, #40901.
[This GH comment][1] makes clear network name uniqueness has never been
enforced due to the eventually consistent nature of Classic Swarm
datastores:
> there is no guaranteed way to check for duplicates across a cluster of
> docker hosts.
And this is further confirmed by other comments made by @mrjana in that
same issue, eg. [this one][2]:
> we want to adopt a schema which can pave the way in the future for a
> completely decentralized cluster of docker hosts (if scalability is
> needed).
This decentralized model is what Classic Swarm was trying to be. It's
been superseded since then by Docker Swarm, which has a centralized
control plane.
To circumvent this drawback, the `NetworkCreate` endpoint accepts a
`CheckDuplicate` flag. However it's not perfectly reliable as it won't
catch concurrent requests.
Due to this design decision, API clients like Compose have to implement
workarounds to make sure names are really unique (eg.
docker/compose#9585). And the daemon itself has seen a string of issues
due to that decision, including some that aren't fixed to this day (for
instance moby/moby#40901):
> The problem is, that if you specify a network for a container using
> the ID, it will add that network to the container but it will then
> change it to reference the network by using the name.
To summarize, this "feature" is broken, has no practical use and is a
source of pain for Docker users and API consumers. So let's just remove
it for _all_ API versions.
[1]: https://github.com/moby/moby/issues/18864#issuecomment-167201414
[2]: https://github.com/moby/moby/issues/18864#issuecomment-167202589
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
container.Run() should be a synchronous operation in normal circumstances;
the container is created and started, so polling after that for the
container to be in the "running" state should not be needed.
This should also prevent issues when a container (for whatever reason)
exited immediately after starting; in that case we would continue
polling for it to be running (which likely would never happen).
Let's skip the polling; if the container is not in the expected state
(i.e. exited), tests should fail as well.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon would pass an EndpointCreateOption to set the interface MAC
address if the network name and the provided network mode were matching.
Obviously, if the network mode is a network ID, it won't work. To make
things worse, the network mode is never normalized if it's a partial ID.
To fix that: 1. the condition under what the container's mac-address is
applied is updated to also match the full ID; 2. the network mode is
normalized to a full ID when it's only a partial one.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Reduce some of the boiler-plating, and by combining the tests, we skip
the testenv.Clean() in between each of the tests. Performance gain isn't
really measurable, but every bit should help :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
container.Run should be an synchronous operation; the container should
be running after the request was made (or produce an error). Simplify
these tests, and remove the redundant polling.
These were added as part of 8f800c9415,
but no such polls were in place before the refactor, and there's no
mention of these during review of the PR, so I assume these were just
added either as a "precaution", or a result of "copy/paste" from another
test.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was failing frequently on Windows, where the test was waiting
for the container to exit before continuing;
=== FAIL: github.com/docker/docker/integration/container TestResizeWhenContainerNotStarted (18.69s)
resize_test.go:58: timeout hit after 10s: waiting for container to be one of (exited), currently running
It looks like this test is merely validating that a container in any non-
running state should produce an error, so there's no need to run a container
(waiting for it to stop), and just "creating" a container (which would be
in `created` state) should work for this purpose.
Looking at 8f800c9415, I see `createSimpleContainer`
and `runSimpleContainer` utilities were added, so I'm even wondering if the
original intent was to use `createSimpleContainer` for this test.
While updating, also check if we get the expected error-type, instead of
only checking for the error-message.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Integration tests will now configure clients to propagate traces as well
as create spans for all tests.
Some extra changes were needed (or desired for trace propagation) in the
test helpers to pass through tracing spans via context.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This uses otel standard environment variables to configure tracing in
the daemon.
It also adds support for propagating trace contexts in the client and
reading those from the API server.
See
https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/
for details on otel environment variables.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Make sure that the content in the live-restored volume mounted in a new
container is the same as the content in the old container.
This checks if volume's _data directory doesn't get unmounted on
startup.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Define consts for the Actions we use for events, instead of "ad-hoc" strings.
Having these consts makes it easier to find where specific events are triggered,
makes the events less error-prone, and allows documenting each Action (if needed).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was added in 247f4796d2, and
at the time was added as an alias for string;
> api/types/events: add "Type" type for event-type enum
>
> Currently just an alias for string, but we can change it to be an
> actual type.
Now that all code uses the defined types, we should be able to make
this an actual type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`Daemon.getPidContainer()` was wrapping the error-message with a message
("cannot join PID of a non running container") that did not reflect the
actual reason for the error; `Daemon.GetContainer()` could either return
an invalid parameter (invalid / empty identifier), or a "not found" error
if the specified container-ID could not be found.
In the latter case, we don't want to return a "not found" error through
the API, as this would indicate that the container we're _starting_ was
not found (which is not the case), so we need to convert the error into
an `errdefs.ErrInvalidParameter` (the container-ID specified for the PID
namespace is invalid if the container doesn't exist).
This logic is similar to what we do for IPC namespaces. which received
a similar fix in c3d7a0c603.
This patch updates the error-types, and moves them into the getIpcContainer
and getPidContainer container functions, both of which should return
an "invalid parameter" if the container was not found.
It's worth noting that, while `WithNamespaces()` may return an "invalid
parameter" error, the `start` endpoint itself may _not_ be. as outlined
in commit bf1fb97575, starting a container
that has an invalid configuration should be considered an internal server
error, and is not an invalid _request_. However, for uses other than
container "start", `WithNamespaces()` should return the correct error
to allow code to handle it accordingly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>