Commit graph

160 commits

Author SHA1 Message Date
Rob Murray
571af915d5 Don't enforce new validation rules for existing networks
Non-swarm networks created before network-creation-time validation
was added in 25.0.0 continued working, because the checks are not
re-run.

But, swarm creates networks when needed (with 'agent=true'), to
ensure they exist on each agent - ignoring the NetworkNameError
that says the network already existed.

By ignoring validation errors on creation of a network with
agent=true, pre-existing swarm networks with IPAM config that would
fail the new checks will continue to work too.

New swarm (overlay) networks are still validated, because they are
initially created with 'agent=false'.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-02-09 11:56:46 +00:00
Albin Kerouanton
c59e93a67b Revert "daemon: automatically set network EnableIPv6 if needed"
This reverts commit 5d5eeac310.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2024-02-02 10:34:26 +01:00
Rob Murray
dae33031e0 Only restore a configured MAC addr on restart.
The API's EndpointConfig struct has a MacAddress field that's used for
both the configured address, and the current address (which may be generated).

A configured address must be restored when a container is restarted, but a
generated address must not.

The previous attempt to differentiate between the two, without adding a field
to the API's EndpointConfig that would show up in 'inspect' output, was a
field in the daemon's version of EndpointSettings, MACOperational. It did
not work, MACOperational was set to true when a configured address was
used. So, while it ensured addresses were regenerated, it failed to preserve
a configured address.

So, this change removes that code, and adds DesiredMacAddress to the wrapped
version of EndpointSettings, where it is persisted but does not appear in
'inspect' results. Its value is copied from MacAddress (the API field) when
a container is created.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-02-01 09:55:54 +00:00
Albin Kerouanton
7a9b680a9c
libnet: remove Endpoint.myAliases
This property is now unused, let's get rid of it.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-12-19 10:20:38 +01:00
Albin Kerouanton
523b907359
daemon: no more IsAnonymousEndpoint
The semantics of an "anonymous" endpoint has always been weird: it was
set on endpoints which name shouldn't be taken into account when
inserting DNS records into libnetwork's `Controller.svcRecords` (and
into the NetworkDB). However, in that case the endpoint's aliases would
still be used to create DNS records; thus, making those "anonymous
endpoints" not so anonymous.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-12-19 10:20:38 +01:00
Albin Kerouanton
ab8968437b
daemon: build the list of endpoint's DNS names
Instead of special-casing anonymous endpoints in libnetwork, let the
daemon specify what (non fully qualified) DNS names should be associated
to container's endpoints.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-12-19 10:16:04 +01:00
Albin Kerouanton
1eb0751803
daemon: endpoints on default nw aren't anonymous
They just happen to exist on a network that doesn't support DNS-based
service discovery (ie. no embedded DNS servers are started for them).

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-12-18 18:38:25 +01:00
Sebastiaan van Stijn
7cb1efebec
api/types: move NetworkListConfig to api/types/backend
This struct is intended for internal use only for the backend, and is
not intended to be used externally.

This moves the plugin-related `NetworkListConfig` types to the backend
package to prevent it being imported in the client, and to make it more
clear that this is part of internal APIs, and not public-facing.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-06 02:21:21 +01:00
Albin Kerouanton
d2865f1e8a
daemon: win: set DNS config on all adapters
DNS config is a property of each adapter on Windows, thus we've a
dedicated `EndpointOption` for that.

The list of `EndpointOption` that should be applied to a given endpoint
is built by `buildCreateEndpointOptions`. This function contains a
seemingly flawed condition that adds the DNS config _iff_:

1. the network isn't internal ;
2. no ports are published / exposed through another sandbox endpoint ;

While 1. does make sense, there's actually no justification for 2.,
hence this commit remove this part of the condition.

This logic flaw has been made obvious by 0fd0e82, but it was originally
introduced by d1e0a78. Commit and PR comments don't mention why this is
done like so. Most probably, this was overlooked both by the original
author and the PR reviewers.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-11-23 18:40:58 +01:00
Albin Kerouanton
0fd0e8255f
daemon: build ports-related ep options in a dedicated func
The `buildCreateEndpointOptions` does a lot of things to build the list
of `libnetwork.EndpointOption` from the `EndpointSettings` spec. To skip
ports-related options, an early return was put in the middle of that
function body.

Early returns are generally great, but put in the middle of a 150-loc
long function that does a lot, they're just a potential footgun. And I'm
the one who pulled the trigger in 052562f. Since this commit, generic
options won't be applied to endpoints if there's already one with
exposed/published ports. As a consequence, only the first endpoint can
have a user-defined MAC address right now.

Instead of moving up the code line that adds generic options, a better
change IMO is to move ports-related options, and the early-return gating
those options, to a dedicated func to make `buildCreateEndpointOptions`
slightly easier to read and reason about.

There was actually one oddity in the original
`buildCreateEndpointOptions`: the early-return also gates the addition
of `CreateOptionDNS`. These options are Windows-specific; a comment is
added to explain that. But the oddity is really: why are we checking if
an endpoint with exposed / published ports joined this sandbox to decide
whether we want to configure DNS server on the endpoint's adapter? Well,
this early-return was most probably overlooked by the original author
and by reviewers at the time these options were added (in commit d1e0a78)

Let's fix that in a follow-up commit.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-11-23 16:26:01 +01:00
Brian Goff
677d41aa3b Plumb context through info endpoint
I was trying to find out why `docker info` was sometimes slow so
plumbing a context through to propagate trace data through.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-11-10 20:09:25 +00:00
Albin Kerouanton
ee9f0ed895
api: Deprecate ContainerConfig.MacAddress
Having a sandbox/container-wide MacAddress field makes little sense
since a container can be connected to multiple networks at the same
time. This field is an artefact of old times where a container could be
connected to a single network only.

As we now have a way to specify per-endpoint mac address, this field is
now deprecated.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-25 22:55:59 +02:00
Albin Kerouanton
052562ffd5
api: Add a field MacAddress to EndpointSettings
Prior to this commit, only container.Config had a MacAddress field and
it's used only for the first network the container connects to. It's a
relic of old times where custom networks were not supported.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-10-25 22:52:26 +02:00
Sebastiaan van Stijn
cff4f20c44
migrate to github.com/containerd/log v0.1.0
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.

This patch moves our own uses of the package to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-11 17:52:23 +02:00
Sebastiaan van Stijn
3197160114
daemon: Daemon.SetNetworkBootstrapKeys: make error-handling idiomatic
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-09-27 12:08:28 +02:00
Sebastiaan van Stijn
39b2bf51ca
Merge pull request #46406 from akerouanton/issue-46404
daemon: fix under what conditions container's mac-address is applied
2023-09-13 23:35:07 +02:00
Albin Kerouanton
78479b1915
libnet: Make sure network names are unique
Fixes #18864, #20648, #33561, #40901.

[This GH comment][1] makes clear network name uniqueness has never been
enforced due to the eventually consistent nature of Classic Swarm
datastores:

> there is no guaranteed way to check for duplicates across a cluster of
> docker hosts.

And this is further confirmed by other comments made by @mrjana in that
same issue, eg. [this one][2]:

> we want to adopt a schema which can pave the way in the future for a
> completely decentralized cluster of docker hosts (if scalability is
> needed).

This decentralized model is what Classic Swarm was trying to be. It's
been superseded since then by Docker Swarm, which has a centralized
control plane.

To circumvent this drawback, the `NetworkCreate` endpoint accepts a
`CheckDuplicate` flag. However it's not perfectly reliable as it won't
catch concurrent requests.

Due to this design decision, API clients like Compose have to implement
workarounds to make sure names are really unique (eg.
docker/compose#9585). And the daemon itself has seen a string of issues
due to that decision, including some that aren't fixed to this day (for
instance moby/moby#40901):

> The problem is, that if you specify a network for a container using
> the ID, it will add that network to the container but it will then
> change it to reference the network by using the name.

To summarize, this "feature" is broken, has no practical use and is a
source of pain for Docker users and API consumers. So let's just remove
it for _all_ API versions.

[1]: https://github.com/moby/moby/issues/18864#issuecomment-167201414
[2]: https://github.com/moby/moby/issues/18864#issuecomment-167202589

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-09-12 10:40:13 +02:00
Albin Kerouanton
5d5eeac310
daemon: automatically set network EnableIPv6 if needed
PR 4f47013feb added a validation step to `NetworkCreate` to ensure
no IPv6 subnet could be set on a network if its `EnableIPv6` parameter
is false.

Before that, the daemon was accepting such request but was doing nothing
with the IPv6 subnet.

This validation step is now deleted, and we automatically set
`EnableIPv6` if an IPv6 subnet was specified.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-09-11 20:53:29 +02:00
Albin Kerouanton
6cc6682f5f
daemon: fix under what conditions container's mac-address is applied
The daemon would pass an EndpointCreateOption to set the interface MAC
address if the network name and the provided network mode were matching.
Obviously, if the network mode is a network ID, it won't work. To make
things worse, the network mode is never normalized if it's a partial ID.

To fix that: 1. the condition under what the container's mac-address is
applied is updated to also match the full ID; 2. the network mode is
normalized to a full ID when it's only a partial one.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-09-08 18:15:00 +02:00
Sebastiaan van Stijn
0f871f8cb7
api/types/events: define "Action" type and consts
Define consts for the Actions we use for events, instead of "ad-hoc" strings.
Having these consts makes it easier to find where specific events are triggered,
makes the events less error-prone, and allows documenting each Action (if needed).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-29 00:38:08 +02:00
Albin Kerouanton
4f47013feb
api: Validate IPAM config before creating a network
Currently, IPAM config is never validated by the API. Some checks
are done by the CLI, but they're not exhaustive. And some of these
misconfigurations might be caught early by libnetwork (ie. when the
network is created), and others only surface when connecting a container
to a misconfigured network. In both cases, the API would return a 500.

Although the `NetworkCreate` endpoint might already return warnings,
these are never displayed by the CLI. As such, it was decided during a
maintainer's call to return validation errors _for all API versions_.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-22 17:11:54 +02:00
Sebastiaan van Stijn
74354043ff
remove uses of libnetwork/Network.Info()
Now that we removed the interface, there's no need to cast the Network
to a NetworkInfo interface, so we can remove uses of the `Info()` method.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-08 22:05:30 +02:00
Albin Kerouanton
fd4ec26313
daemon/network.go: Fix error capitalization
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-08-08 17:40:19 +02:00
Albin Kerouanton
64e01c627f
daemon/network.go: Remove github.com/pkg/errors pkg
PR moby/moby#45759 is going to use the new `errors.Join` function  to
return a list of validation errors.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-08-08 17:39:53 +02:00
Sebastiaan van Stijn
d1aa1979b4
daemon: buildEndpointInfo: minor refactor
store network.Name() in a variable to reduce repeatedly locking/unlocking
of the network (although this is very, very minimal in the grand scheme
of things).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-05 10:43:45 +02:00
Sebastiaan van Stijn
07f2df69c7
daemon: buildCreateEndpointOptions: minor nits
- store network.Name() in a variable to reduce repeatedly locking/unlocking
  of the network (although this is very, very minimal in the grand scheme
  of things).
- un-wrap long conditions
- ever so slightly optimise some conditions by changeing the order of checks.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 16:12:36 +02:00
Sebastiaan van Stijn
5158a33f15
daemon: buildCreateEndpointOptions: use range when looping
Makes the code slightly more readable.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 14:40:44 +02:00
Sebastiaan van Stijn
1c6dae1291
daemon: buildCreateEndpointOptions: don't use PortBinding.GetCopy()
This code was initializing a new PortBinding, and creating a deep copy
for each binding. It's unclear what the intent was here, but at least
PortBinding.GetCopy() wasn't adding much value, as it created a new
PortBinding, [copying all values from the original][1], which includes
a [copy of IPAddresses in it][2]. Our original "template" did not have any
of that, so let's forego that, and just create new PortBindings as we go.

[1]: 454b6a7cf5/libnetwork/types/types.go (L110-L120)
[2]: 454b6a7cf5/libnetwork/types/types.go (L236-L244)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 14:40:44 +02:00
Sebastiaan van Stijn
cc79024761
daemon: buildCreateEndpointOptions: remove intermediate vars
These were not adding much, so just getting rid of them. Also added a
TODO to move this code to the type.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 14:40:43 +02:00
Sebastiaan van Stijn
45de99aa06
daemon: buildCreateEndpointOptions: don't parse empty vip
Also keep network.ID() in a local variable to prevent locking the network
twice.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 14:40:43 +02:00
Sebastiaan van Stijn
7d429125d2
daemon: buildCreateEndpointOptions: move vars to where they're used
Move variables closer to where they're used instead of defining them all
at the start of the function.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 14:40:43 +02:00
Sebastiaan van Stijn
6ce92aa523
daemon: buildCreateEndpointOptions: skip getPortMapInfo() if not needed
`getPortMapInfo` does many things; it creates a copy of all the sandbox
endpoints, gets the driver, endpoints, and network from store, and creates
port-bindings for all exposed and mapped ports.

We should look if we can create a more minimal implementation for this
purpose, but in the meantime, let's prevent it being called if we don't
need it by making it the second condition in the check.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 14:40:43 +02:00
Sebastiaan van Stijn
9e9a17950a
daemon: FindNetwork: minor cleanups
- don't initialize slices; it's not needed to append to them
- store network-ID in a var to prevent repeated lock/unlocking in nw.ID()

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 14:40:43 +02:00
Sebastiaan van Stijn
0eea8d69b2
Merge pull request #46052 from thaJeztah/refactor_buildNetworkResource
daemon: refactor buildNetworkResource
2023-08-02 14:40:16 +02:00
Sebastiaan van Stijn
5e2a1195d7
swap logrus types for their containerd/logs aliases
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-01 13:02:55 +02:00
Sebastiaan van Stijn
f3b76bc1da
daemon: refactor buildNetworkResource to use a struct-literal
Now that all helper functions are updated, we can use a struct-literal
for this function, which makes it slightly easier to read.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-26 13:03:22 +02:00
Sebastiaan van Stijn
a8c0b05052
daemon: split buildDetailedNetworkResources into two functions
Split the buildDetailedNetworkResources function into separate functions for
collecting container attachments (`buildContainerAttachments`) and service
attachments (`buildServiceAttachments`). This allows us to get rid of the
"verbose" bool, and makes the logic slightly more transparent.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-26 13:03:12 +02:00
Sebastiaan van Stijn
8caf974dcd
daemon: refactor buildEndpointResource
- Pass the endpoint and endpoint-info, instead of individual fields from the
  endpoint.
- Remove redundant nil-check, as it's already checked on the call-side
  in `buildDetailedNetworkResources`, which skips endpoints without info.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-26 12:06:19 +02:00
Sebastiaan van Stijn
437ced91ec
daemon: refactor buildIpamResources
Make the function return the constructed network.IPAM instead of applying
it to a network struct, and rename it to "buildIPAMResources".

Rewrite the function itself:

- Use struct-literals where possible to make it slightly more readable.
- Use a boolean (hasIPv4Config, hasIPv6Config) for both IPv4 and IPv6 to
  check whether the IPAM-info needs to be added. This makes the logic the
  same for both, and makes the processing order-independent. This also
  allows for the `network.IpamInfo()` call to be skipped if it's not needed.
- Change order of "ipv4 config / ipv4 info" and "ipv6 config / ipv4 info"
  blocks to make it slightly clearer (and to allow skipping the forementioned
  call to `network.IpamInfo()`).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-26 12:06:19 +02:00
Sebastiaan van Stijn
ca5ac19ea4
daemon: refactor buildPeerInfoResources
Move the length-check into the function, and change the code to
be a basic type-case, as networkdb.PeerInfo and network.PeerInfo
are identical types.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-26 12:06:17 +02:00
Sebastiaan van Stijn
64c6f72988
libnetwork: remove Network interface
There's only one implementation; drop the interface and use the
concrete type instead.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-22 11:56:41 +02:00
Albin Kerouanton
d29240d9eb
libnet: Return a 403 when overlay network isn't allowed
With this change, the API will now return a 403 instead of a 500 when
trying to create an overlay network on a non-manager node.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-07-11 12:41:24 +02:00
Albin Kerouanton
21dcbada2d
libnet: Return proper error when overlay network can't be created
The commit befff0e13f inadvertendly
disabled the error returned when trying to create an overlay network on
a node which is not part of a Swarm cluster.

Since commit e3708a89cc the overlay
netdriver returns the error: `no VNI provided`.

This commit reinstate the original error message by checking if the node
is a manager before calling libnetwork's `controller.NewNetwork()`.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-07-11 12:40:55 +02:00
Sebastiaan van Stijn
02815416bb
daemon: use string-literals for easier grep'ing
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-05 12:27:00 +02:00
Sebastiaan van Stijn
210932b3bf
daemon: format code with gofumpt
Formatting the code with https://github.com/mvdan/gofumpt

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-29 00:33:03 +02:00
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Alex Stockinger
91c2b12205 Make default options for newly created networks configurable
Signed-off-by: Alex Stockinger <alex@atomicjar.com>
Co-authored-by: Sergei Egorov <bsideup@gmail.com>
Co-authored-by: Cory Snider <corhere@gmail.com>
2023-03-01 07:58:26 +01:00
Cory Snider
befff0e13f libnetwork: remove more datastore scope plumbing
Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-01-26 17:56:40 -05:00