We used DEBIAN_FRONTEND in some places to prevent installation of packages
from being blocked. However, debian bookworm now [includes a fix][1] for
situations like this (it was specifically reported for Docker situations <3),
so we can get rid of these.
Thanks to Tianon for noticing this, and for linking to the Debian ticket!
[1]: https://bugs.debian.org/929417
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Replace `time.Sleep` with a poll that checks if process no longer exists
to avoid possible race condition.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
I noticed this log being logged as an error, but the kill logic actually
proceeds after this (doing a "direct" kill instead). While usually containers
are expected to be exiting within the given timeout, I don't think this
needs to be logged as an error (an error is returned after we fail to
kill the container).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The auth service error response is not a part of the spec and containerd
doesn't parse it like the Docker's distribution does.
Check for containerd specific errors instead.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Debian Woodworm ships with a newer version of iptables, which caused two
tests to fail:
=== FAIL: amd64.integration-cli TestDockerDaemonSuite/TestDaemonICCLinkExpose (1.18s)
docker_cli_daemon_test.go:841: assertion failed: false (matched bool) != true (true bool): iptables output should have contained "DROP.*all.*ext-bridge6.*ext-bridge6", but was "Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)\n pkts bytes target prot opt in out source destination \n 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- * ext-bridge6 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED\n 0 0 DOCKER 0 -- * ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- ext-bridge6 !ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n 0 0 DROP 0 -- ext-bridge6 ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n"
--- FAIL: TestDockerDaemonSuite/TestDaemonICCLinkExpose (1.18s)
=== FAIL: amd64.integration-cli TestDockerDaemonSuite/TestDaemonICCPing (1.19s)
docker_cli_daemon_test.go:803: assertion failed: false (matched bool) != true (true bool): iptables output should have contained "DROP.*all.*ext-bridge5.*ext-bridge5", but was "Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)\n pkts bytes target prot opt in out source destination \n 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- * ext-bridge5 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED\n 0 0 DOCKER 0 -- * ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- ext-bridge5 !ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n 0 0 DROP 0 -- ext-bridge5 ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n"
--- FAIL: TestDockerDaemonSuite/TestDaemonICCPing (1.19s)
Both the `TestDaemonICCPing`, and `TestDaemonICCLinkExpose` test were introduced
in dd0666e64f. These tests called `iptables` with
the `-n` (`--numeric`) option, which prevents it from doing a reverse-DNS lookup
as an optimization.
However, the `-n` option did not have an effect to the `prot` column before
commit [da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa] (iptables < v1.8.9 or v1.8.8).
Newer versions, such as the iptables version shipping with Debian Woodworm do,
so we need to update the expected output for this version.
This patch removes the `-n` option, to keep the test more portable, also when
run non-containerized, and removes the use of regular expressions to check the
result, as these regular expressions were quite permissive (using `.*` wild-
card matching). Instead, we're getting the
With this change;
make DOCKER_GRAPHDRIVER=vfs TEST_FILTER=TestDaemonICC TEST_IGNORE_CGROUP_CHECK=1 test-integration
...
--- PASS: TestDockerDaemonSuite (139.11s)
--- PASS: TestDockerDaemonSuite/TestDaemonICCLinkExpose (54.62s)
--- PASS: TestDockerDaemonSuite/TestDaemonICCPing (84.48s)
[da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa]: https://git.netfilter.org/iptables/commit/?id=da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When live-restore is enabled, containers with autoremove enabled
shouldn't be forcibly killed when engine restarts.
They still should be removed if they exited while the engine was down
though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Change the repo name used as for an intermediate image so it doesn't
try to mount from the image pushed by `TestBuildMultiStageImplicitPull`.
Before this patch, this test failed because the distribution.source
labels are not cleared between tests and the busybox content still has
the distribution.source label pointing to the `dockercli/testf`
repository which is no longer present in the test registry.
So both `dockercli/busybox` and `dockercli/testf` are equally valid
mount candidates for `dockercli/crossrepopush` and containerd algorithm
just happens to select the last one.
This changes the repo name to not have the common repository component
(`dockercli`) with the `dockercli/testf` repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Starting with [6e0ed3d19c54603f0f7d628ea04b550151d8a262], the minimum
allowed size is now 300MB. Given that this is a sparse image, and
the size of the image is irrelevant to the test (we check for
limits defined through project-quotas, not the size of the
device itself), we can raise the size of this image.
[6e0ed3d19c54603f0f7d628ea04b550151d8a262]: https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/commit/?id=6e0ed3d19c54603f0f7d628ea04b550151d8a262
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On bookworm, AppArmor failed to start inside the container, which can be
seen at startup of the dev-container:
Created symlink /etc/systemd/system/systemd-firstboot.service → /dev/null.
Created symlink /etc/systemd/system/systemd-udevd.service → /dev/null.
Created symlink /etc/systemd/system/multi-user.target.wants/docker-entrypoint.service → /etc/systemd/system/docker-entrypoint.service.
hack/dind-systemd: starting /lib/systemd/systemd --show-status=false --unit=docker-entrypoint.target
systemd 252.17-1~deb12u1 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
Detected virtualization docker.
Detected architecture x86-64.
modprobe@configfs.service: Deactivated successfully.
modprobe@dm_mod.service: Deactivated successfully.
modprobe@drm.service: Deactivated successfully.
modprobe@efi_pstore.service: Deactivated successfully.
modprobe@fuse.service: Deactivated successfully.
modprobe@loop.service: Deactivated successfully.
apparmor.service: Starting requested but asserts failed.
proc-sys-fs-binfmt_misc.automount: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 49 (systemd-binfmt)
+ source /etc/docker-entrypoint-cmd
++ hack/make.sh dynbinary test-integration
When checking "aa-status", an error was printed that the filesystem was
not mounted:
aa-status
apparmor filesystem is not mounted.
apparmor module is loaded.
Checking if "local-fs.target" was loaded, that seemed to be the case;
systemctl status local-fs.target
● local-fs.target - Local File Systems
Loaded: loaded (/lib/systemd/system/local-fs.target; static)
Active: active since Mon 2023-11-27 10:48:38 UTC; 18s ago
Docs: man:systemd.special(7)
However, **on the host**, "/sys/kernel/security" has a mount, which was not
present inside the container:
mount | grep securityfs
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
Interestingly, on `debian:bullseye`, this was not the case either; no
`securityfs` mount was present inside the container, and apparmor actually
failed to start, but succeeded silently:
mount | grep securityfs
systemctl start apparmor
systemctl status apparmor
● apparmor.service - Load AppArmor profiles
Loaded: loaded (/lib/systemd/system/apparmor.service; enabled; vendor preset: enabled)
Active: active (exited) since Mon 2023-11-27 11:59:09 UTC; 44s ago
Docs: man:apparmor(7)
https://gitlab.com/apparmor/apparmor/wikis/home/
Process: 43 ExecStart=/lib/apparmor/apparmor.systemd reload (code=exited, status=0/SUCCESS)
Main PID: 43 (code=exited, status=0/SUCCESS)
CPU: 10ms
Nov 27 11:59:09 9519f89cade1 apparmor.systemd[43]: Not starting AppArmor in container
Same, using the `/etc/init.d/apparmor` script:
/etc/init.d/apparmor start
Starting apparmor (via systemctl): apparmor.service.
echo $?
0
And apparmor was not actually active:
aa-status
apparmor module is loaded.
apparmor filesystem is not mounted.
aa-enabled
Maybe - policy interface not available.
After further investigating, I found that the non-systemd dind script
had a mount for AppArmor, which was added in 31638ab2ad
The systemd variant was missing this mount, which may have gone unnoticed
because `debian:bullseye` was silently ignoring this when starting the
apparmor service.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
BaseFS is not serialized and is lost after an unclean shutdown. Unmount
method in the containerd image service implementation will not work
correctly in that case.
This patch will allow Unmount to restore the BaseFS if the target is
still mounted.
The reason it works with graphdrivers is that it doesn't directly
operate on BaseFS. It uses RWLayer, which is explicitly restored
immediately as soon as container is loaded.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is purely cosmetic - if a non-default MTU is configured, the bridge
will have the default MTU=1500 until a container's 'veth' is connected
and an MTU is set on the veth. That's a disconcerting, it looks like the
config has been ignored - so, set the bridge's MTU explicitly.
Fixes#37937
Signed-off-by: Rob Murray <rob.murray@docker.com>
The graphdriver implementation sets the ModTime of all image content to
match the `Created` time from the image config, whereas the containerd's
archive export code just leaves it empty (zero).
Adjust the test in the case where containerd integration is enabled to
check if config file ModTime is equal to zero (UNIX epoch) instead.
This behaviour is not a part of the Docker Image Specification and the
intention behind introducing it was to make the `docker save` produce
the same archive regardless of the time it was performed.
It would also be a bit problematic with the OCI archive layout which can
contain multiple images referencing the same content.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Also, err `e` is renamed into the more standard `err` as the defer
already uses `retErr` to avoid clashes (changed in f5a611a74).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
DNS config is a property of each adapter on Windows, thus we've a
dedicated `EndpointOption` for that.
The list of `EndpointOption` that should be applied to a given endpoint
is built by `buildCreateEndpointOptions`. This function contains a
seemingly flawed condition that adds the DNS config _iff_:
1. the network isn't internal ;
2. no ports are published / exposed through another sandbox endpoint ;
While 1. does make sense, there's actually no justification for 2.,
hence this commit remove this part of the condition.
This logic flaw has been made obvious by 0fd0e82, but it was originally
introduced by d1e0a78. Commit and PR comments don't mention why this is
done like so. Most probably, this was overlooked both by the original
author and the PR reviewers.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The `buildCreateEndpointOptions` does a lot of things to build the list
of `libnetwork.EndpointOption` from the `EndpointSettings` spec. To skip
ports-related options, an early return was put in the middle of that
function body.
Early returns are generally great, but put in the middle of a 150-loc
long function that does a lot, they're just a potential footgun. And I'm
the one who pulled the trigger in 052562f. Since this commit, generic
options won't be applied to endpoints if there's already one with
exposed/published ports. As a consequence, only the first endpoint can
have a user-defined MAC address right now.
Instead of moving up the code line that adds generic options, a better
change IMO is to move ports-related options, and the early-return gating
those options, to a dedicated func to make `buildCreateEndpointOptions`
slightly easier to read and reason about.
There was actually one oddity in the original
`buildCreateEndpointOptions`: the early-return also gates the addition
of `CreateOptionDNS`. These options are Windows-specific; a comment is
added to explain that. But the oddity is really: why are we checking if
an endpoint with exposed / published ports joined this sandbox to decide
whether we want to configure DNS server on the endpoint's adapter? Well,
this early-return was most probably overlooked by the original author
and by reviewers at the time these options were added (in commit d1e0a78)
Let's fix that in a follow-up commit.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>