Some newer distros such as RHEL 9 have stopped making the xt_u32 kernel
module available with the kernels they ship. They do ship the xt_bpf
kernel module, which can do everything xt_u32 can and more. Add an
alternative implementation of the iptables match rule which uses xt_bpf
to implement exactly the same logic as the u32 filter using a BPF
program. Try programming the BPF-powered rules as a fallback when
programming the u32-powered rules fails.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The iptables rule clause used to match on the VNI of VXLAN datagrams
looks like line noise to the uninitiated. It doesn't help that the
expression is repeated twice and neither copy has any commentary.
DRY out the rule builder to a common function, and document what the
rule does and how it works.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The iptables rules which make encryption mandatory on an encrypted
overlay network are only programmed once there is a second node
participating in the network. This leaves single-node encrypted overlay
networks vulnerable to packet injection. Furthermore, failure to program
the rules is not treated as a fatal error.
Program the iptables rules to make encryption mandatory before creating
the VXLAN link to guarantee that there is no window of time where
incoming cleartext VXLAN packets for the network would be accepted, or
outgoing cleartext packets be transmitted. Only create the VXLAN link if
programming the rules succeeds to ensure that it fails closed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The overlay-network encryption code is woefully under-documented, which
is especially problematic as it operates on under-documented kernel
interfaces. Document what I have puzzled out of the implementation for
the benefit of the next poor soul to touch this code.
Signed-off-by: Cory Snider <csnider@mirantis.com>
idtools.MkdirAllAndChown on Windows does not chown directories, which makes
idtools.MkdirAllAndChown() just an alias for system.MkDirAll().
Also setting the filemode to `0`, as changing filemode is a no-op on Windows as
well; both of these changes should make it more transparent that no chown'ing,
nor changing filemode takes place.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Volumes created from the image config were not being pruned because the
volume service did not think they were anonymous since the code to
create passes along a generated name instead of letting the volume
service generate it.
This changes the code path to have the volume service generate the name
instead of doing it ahead of time.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Commit 3246db3755 added handling for removing
cluster volumes, but in some conditions, this resulted in errors not being
returned if the volume was in use;
docker swarm init
docker volume create foo
docker create -v foo:/foo busybox top
docker volume rm foo
This patch changes the logic for ignoring "local" volume errors if swarm
is enabled (and cluster volumes supported).
While working on this fix, I also discovered that Cluster.RemoveVolume()
did not handle the "force" option correctly; while swarm correctly handled
these, the cluster backend performs a lookup of the volume first (to obtain
its ID), which would fail if the volume didn't exist.
Before this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
volume_test.go:122: assertion failed: error is nil, not errdefs.IsConflict
volume_test.go:123: assertion failed: expected an error, got nil
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
volume_test.go:143: assertion failed: error is not nil: Error response from daemon: volume no_such_volume not found
--- FAIL: TestVolumesRemoveSwarmEnabled (1.57s)
--- FAIL: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- FAIL: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
FAIL
With this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
--- PASS: TestVolumesRemoveSwarmEnabled (1.53s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
PASS
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
SearchRegistryForImages does not make sense as part of the image
service interface. The implementation just wraps the search API of the
registry service to filter the results client-side. It has nothing to do
with local image storage, and the implementation of search does not need
to change when changing which backend (graph driver vs. containerd
snapshotter) is used for local image storage.
Filtering of the search results is an implementation detail: the
consumer of the results does not care which actor does the filtering so
long as the results are filtered as requested. Move filtering into the
exported API of the registry service to hide the implementation details.
Only one thing---the registry service implementation---would need to
change in order to support server-side filtering of search results if
Docker Hub or other registry servers were to add support for it to their
APIs.
Use a fake registry server in the search unit tests to avoid having to
mock out the registry API client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Includes a security fix for crypto/elliptic (CVE-2023-24532).
> go1.20.2 (released 2023-03-07) includes a security fix to the crypto/elliptic package,
> as well as bug fixes to the compiler, the covdata command, the linker, the runtime, and
> the crypto/ecdh, crypto/rsa, crypto/x509, os, and syscall packages.
> See the Go 1.20.2 milestone on our issue tracker for details.
https://go.dev/doc/devel/release#go1.20.minor
From the announcement:
> We have just released Go versions 1.20.2 and 1.19.7, minor point releases.
>
> These minor releases include 1 security fixes following the security policy:
>
> - crypto/elliptic: incorrect P-256 ScalarMult and ScalarBaseMult results
>
> The ScalarMult and ScalarBaseMult methods of the P256 Curve may return an
> incorrect result if called with some specific unreduced scalars (a scalar larger
> than the order of the curve).
>
> This does not impact usages of crypto/ecdsa or crypto/ecdh.
>
> This is CVE-2023-24532 and Go issue https://go.dev/issue/58647.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
5008409b5c introduced the usage of
`strings.Cut` to help parse listener addresses.
Part of that also made it error out if no addr is specified after the
protocol spec (e.g. `tcp://`).
Before the change a proto spec without an address just used the default
address for that proto.
e.g. `tcp://` would be `tcp://127.0.0.1:2375`, `unix://` would be
`unix:///var/run/docker.sock`.
Critically, socket activation (`fd://`) never has an address.
This change brings back the old behavior but keeps the usage of
`strings.Cut`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Set `dangling-name-prefix` exporter attribute to `moby-dangling` which
makes it create an containerd image even when user didn't provide any
name for the new image.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
In preparation for containerd v1.7 which migrates off gogo/protobuf
and changes the protobuf Any type to one that's not supported by our
vendored version of typeurl.
This fixes a compile error on usages of `typeurl.UnmarshalAny` when
upgrading to containerd v1.7.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>