Commit graph

3350 commits

Author SHA1 Message Date
Paweł Gronowski
e829cca0ee
Merge pull request #47584 from robmry/upstream_dns_windows
Windows DNS resolver forwarding
2024-04-19 11:34:50 +02:00
Sebastiaan van Stijn
82d8f8d6e6
Merge pull request from GHSA-x84c-p2g9-rqv9
Disable IPv6 for endpoints in '--ipv6=false' networks.
2024-04-18 17:50:35 +02:00
Rob Murray
6c68be24a2 Windows DNS resolver forwarding
Make the internal DNS resolver for Windows containers forward requests
to upsteam DNS servers when it cannot respond itself, rather than
returning SERVFAIL.

Windows containers are normally configured with the internal resolver
first for service discovery (container name lookup), then external
resolvers from '--dns' or the host's networking configuration.

When a tool like ping gets a SERVFAIL from the internal resolver, it
tries the other nameservers. But, nslookup does not, and with this
change it does not need to.

The internal resolver learns external server addresses from the
container's HNSEndpoint configuration, so it will use the same DNS
servers as processes in the container.

The internal resolver for Windows containers listens on the network's
gateway address, and each container may have a different set of external
DNS servers. So, the resolver uses the source address of the DNS request
to select external resolvers.

On Windows, daemon.json feature option 'windows-no-dns-proxy' can be used
to prevent the internal resolver from forwarding requests (restoring the
old behaviour).

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-04-16 18:57:28 +01:00
Rob Murray
f07644e17e Add netiputil.AddrPortFromNet()
Co-authored-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-04-15 14:51:20 +01:00
Victor Toni
f51e18f58e Fix typo
Signed-off-by: Victor Toni <victor.toni@gmail.com>
2024-04-11 00:19:05 +02:00
Rob Murray
57dd56726a Disable IPv6 for endpoints in '--ipv6=false' networks.
No IPAM IPv6 address is given to an interface in a network with
'--ipv6=false', but the kernel would assign a link-local address and,
in a macvlan/ipvlan network, the interface may get a SLAAC-assigned
address.

So, disable IPv6 on the interface to avoid that.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-04-10 17:11:20 +01:00
Rob Murray
cd7240f6d9 Stop macvlan with no parent from using ext-dns
We document that an macvlan network with no parent interface is
equivalent to a '--internal' network. But, in this case, an macvlan
network was still configured with a gateway. So, DNS proxying would
be enabled in the internal resolver (and, if the host's resolver
was on a localhost address, requests to external resolvers from the
host's network namespace would succeed).

This change disables configuration of a gateway for a macvlan Endpoint
if no parent interface is specified.

(Note if a parent interface with no external network is supplied as
'-o parent=<dummy>', the gateway will still be set up. Documentation
will need to be updated to note that '--internal' should be used to
prevent DNS request forwarding in this case.)

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-04-10 08:51:00 +01:00
Rob Murray
17b8631545 Enable DNS proxying for ipvlan-l3
The internal DNS resolver should only forward requests to external
resolvers if the libnetwork.Sandbox served by the resolver has external
network access (so, no forwarding for '--internal' networks).

The test for external network access was whether the Sandbox had an
Endpoint with a gateway configured.

However, an ipvlan-l3 networks with external network access does not
have a gateway, it has a default route bound to an interface.

Also, we document that an ipvlan network with no parent interface is
equivalent to a '--internal' network. But, in this case, an ipvlan-l2
network was configured with a gateway. So, DNS proxying would be enabled
in the internal resolver (and, if the host's resolver was on a localhost
address, requests to external resolvers from the host's network
namespace would succeed).

So, this change adjusts the test for enabling DNS proxying to include
a check for '--internal' (as a shortcut) and, for non-internal networks,
checks for a default route as well as a gateway. It also disables
configuration of a gateway or a default route for an ipvlan Endpoint if
no parent interface is specified.

(Note if a parent interface with no external network is supplied as
'-o parent=<dummy>', the gateway/default route will still be set up
and external DNS proxying will be enabled. The network must be
configured as '--internal' to prevent that from happening.)

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-04-10 08:50:57 +01:00
Rob Murray
fde80fe2e7 Restore the SetKey prestart hook.
Partially reverts 0046b16 "daemon: set libnetwork sandbox key w/o OCI hook"

Running SetKey to store the OCI Sandbox key after task creation, rather
than from the OCI prestart hook, meant it happened after sysctl settings
were applied by the runtime - which was the intention, we wanted to
complete Sandbox configuration after IPv6 had been disabled by a sysctl
if that was going to happen.

But, it meant '--sysctl' options for a specfic network interface caused
container task creation to fail, because the interface is only moved into
the network namespace during SetKey.

This change restores the SetKey prestart hook, and regenerates config
files that depend on the container's support for IPv6 after the task has
been created. It also adds a regression test that makes sure it's possible
to set an interface-specfic sysctl.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-03-25 19:35:55 +00:00
George Ma
14a8fac092 chore: fix mismatched function names in godoc
Signed-off-by: George Ma <mayangang@outlook.com>
2024-03-22 16:24:31 +08:00
Brian Goff
59c5059081
Merge pull request #47443 from corhere/cnmallocator/lift-n-shift
Vendor dependency cycle-free swarmkit
2024-03-21 12:29:46 -07:00
Bjorn Neergaard
641e341eed
Merge pull request #47538 from robmry/libnet-resolver-nxdomain
libnet: Don't forward to upstream resolvers on internal nw
2024-03-18 11:22:59 -06:00
Sebastiaan van Stijn
4ff655f4b8
resolvconf: add //go:build directives to prevent downgrading to go1.16 language
Commit 8921897e3b introduced the uses of `clear()`,
which requires go1.21, but Go is downgrading this file to go1.16 when used in
other projects (due to us not yet being a go module);

    0.175 + xx-go build '-gcflags=' -ldflags '-X github.com/moby/buildkit/version.Version=b53a13e -X github.com/moby/buildkit/version.Revision=b53a13e4f5c8d7e82716615e0f23656893df89af -X github.com/moby/buildkit/version.Package=github.com/moby/buildkit -extldflags '"'"'-static'"'" -tags 'osusergo netgo static_build seccomp ' -o /usr/bin/buildkitd ./cmd/buildkitd
    181.8 # github.com/docker/docker/libnetwork/internal/resolvconf
    181.8 vendor/github.com/docker/docker/libnetwork/internal/resolvconf/resolvconf.go:509:2: clear requires go1.21 or later (-lang was set to go1.16; check go.mod)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-03-18 12:28:21 +01:00
Albin Kerouanton
790c3039d0 libnet: Don't forward to upstream resolvers on internal nw
Commit cbc2a71c2 makes `connect` syscall fail fast when a container is
only attached to an internal network. Thanks to that, if such a
container tries to resolve an "external" domain, the embedded resolver
returns an error immediately instead of waiting for a timeout.

This commit makes sure the embedded resolver doesn't even try to forward
to upstream servers.

Co-authored-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-03-14 17:46:48 +00:00
Sebastiaan van Stijn
1abf17c779
Merge pull request #47512 from robmry/46329_internal_resolver_ipv6_upstream
Add IPv6 nameserver to the internal DNS's upstreams.
2024-03-07 21:21:12 +01:00
Paweł Gronowski
608d77d740
Merge pull request #47497 from robmry/resolvconf_fixes
Fix 'resolv.conf' parsing issues
2024-03-07 13:05:10 +01:00
Paweł Gronowski
66adfc729a
Merge pull request #47521 from robmry/no_ipv6_addr_when_ipv6_disabled
Don't configure IPv6 addr/gw when IPv6 disabled.
2024-03-07 12:57:27 +01:00
Sebastiaan van Stijn
f5a5e3f203
golangci-lint: enable dupword linter
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-03-07 11:44:27 +01:00
Sebastiaan van Stijn
4adc40ac40
fix duplicate words (dupwords)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-03-07 10:57:03 +01:00
Rob Murray
8921897e3b Ignore bad ndots in host resolv.conf
Rather than error out if the host's resolv.conf has a bad ndots option,
just ignore it. Still validate ndots supplied via '--dns-option' and
treat failure as an error.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-03-07 09:27:34 +00:00
Rob Murray
ef5295cda4 Don't configure IPv6 addr/gw when IPv6 disabled.
When IPv6 is disabled in a container by, for example, using the --sysctl
option - an IPv6 address/gateway is still allocated. Don't attempt to
apply that config because doing so enables IPv6 on the interface.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-03-06 18:32:31 +00:00
Rob Murray
4e8d9a4522 Add IPv6 nameserver to the internal DNS's upstreams.
When configuring the internal DNS resolver - rather than keep IPv6
nameservers read from the host's resolv.conf in the container's
resolv.conf, treat them like IPv4 addresses and use them as upstream
resolvers.

For IPv6 nameservers, if there's a zone identifier in the address or
the container itself doesn't have IPv6 support, mark the upstream
addresses for use in the host's network namespace.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-03-06 10:47:18 +00:00
Rob Murray
f04f69e366 Accumulate resolv.conf options
If there are multiple "options" lines, keep the options from all of
them.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-03-01 16:59:28 +00:00
Rob Murray
7f69142aa0 resolv.conf comments have '#' or ';' in the first column
When a '#' or ';' appears anywhere else, it's not a comment marker.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-03-01 16:58:04 +00:00
Sebastiaan van Stijn
97a5435d33
Merge pull request #47477 from robmry/resolvconf_gocompat
Remove slices.Clone() calls to avoid Go bug
2024-03-01 17:28:01 +01:00
Rob Murray
91d9307738 Replace uses of slices.Clone()
Avoid https://github.com/golang/go/issues/64759

Co-authored-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-03-01 15:27:29 +00:00
Filipe Pina
ef681124ca
fix typo in error message
Signed-off-by: Filipe Pina <hzlu1ot0@duck.com>
2024-02-29 23:27:00 +00:00
Cory Snider
4f30a930ad libn/cnmallocator: migrate tests to gotest.tools/v3
Apply command gotest.tools/v3/assert/cmd/gty-migrate-from-testify to the
cnmallocator package to be consistent with the assertion library used
elsewhere in moby.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-02-29 16:14:02 -05:00
Sebastiaan van Stijn
6c3b3523c9
Merge pull request #47041 from robmry/46968_refactor_resolvconf
Refactor 'resolv.conf' generation.
2024-02-29 09:33:55 +01:00
Cory Snider
7b0ab1011c Vendor dependency cycle-free swarmkit
Moby imports Swarmkit; Swarmkit no longer imports Moby. In order to
accomplish this feat, Swarmkit has introduced a new plugin.Getter
interface so it could stop importing our pkg/plugingetter package. This
new interface is not entirely compatible with our
plugingetter.PluginGetter interface, necessitating a thin adapter.

Swarmkit had to jettison the CNM network allocator to stop having to
import libnetwork as the cnmallocator package is deeply tied to
libnetwork. Move the CNM network allocator into libnetwork, where it
belongs. The package had a short an uninteresting Git history in the
Swarmkit repository so no effort was made to retain history.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-02-28 09:46:45 -05:00
Albin Kerouanton
83c02f7a11 libnet/ds: remove extra space in error msg
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2024-02-22 18:49:28 +01:00
Sebastiaan van Stijn
7d081179e9
Merge pull request #47422 from akerouanton/libnet-ds-DeleteIdempotent
libnet: Replace DeleteAtomic in retry loops with Delete
2024-02-22 17:24:05 +01:00
Albin Kerouanton
cbd45e83cf libnet: Replace DeleteAtomic in retry loops with DeleteIdempotent
A common pattern in libnetwork is to delete an object using
`DeleteAtomic`, ie. to check the optimistic lock, but put in a retry
loop to refresh the data and the version index used by the optimistic
lock.

This commit introduces a new `Delete` method to delete without
checking the optimistic lock. It focuses only on the few places where
it's obvious the calling code doesn't rely on the side-effects of the
retry loop (ie. refreshing the object to be deleted).

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2024-02-22 08:22:09 +01:00
Sebastiaan van Stijn
d9e082ff54
libnetwork: resolve: use structured logs for DNS error
I noticed that this log didn't use structured logs;

    [resolver] failed to query DNS server: 10.115.11.146:53, query: ;google.com.\tIN\t A" error="read udp 172.19.0.2:46361->10.115.11.146:53: i/o timeout
    [resolver] failed to query DNS server: 10.44.139.225:53, query: ;google.com.\tIN\t A" error="read udp 172.19.0.2:53991->10.44.139.225:53: i/o timeout

But other logs did;

    DEBU[2024-02-20T15:48:51.026704088Z] [resolver] forwarding query                   client-addr="udp:172.19.0.2:39661" dns-server="udp:192.168.65.7:53" question=";google.com.\tIN\t A"
    DEBU[2024-02-20T15:48:51.028331088Z] [resolver] forwarding query                   client-addr="udp:172.19.0.2:35163" dns-server="udp:192.168.65.7:53" question=";google.com.\tIN\t AAAA"
    DEBU[2024-02-20T15:48:51.057329755Z] [resolver] received AAAA record "2a00:1450:400e:801::200e" for "google.com." from udp:192.168.65.7
    DEBU[2024-02-20T15:48:51.057666880Z] [resolver] received A record "142.251.36.14" for "google.com." from udp:192.168.65.7

As we're already constructing a logger with these fields, we may as well use it.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-20 17:01:06 +01:00
Rob Murray
419f5a6372 Make 'internal' bridge networks accessible from host
Prior to release 25.0.0, the bridge in an internal network was assigned
an IP address - making the internal network accessible from the host,
giving containers on the network access to anything listening on the
bridge's address (or INADDR_ANY on the host).

This change restores that behaviour. It does not restore the default
route that was configured in the container, because packets sent outside
the internal network's subnet have always been dropped. So, a 'connect()'
to an address outside the subnet will still fail fast.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-02-07 19:12:10 +00:00
Rob Murray
beb97f7fdf Refactor 'resolv.conf' generation.
Replace regex matching/replacement and re-reading of generated files
with a simple parser, and struct to remember and manipulate the file
content.

Annotate the generated file with a header comment saying the file is
generated, but can be modified, and a trailing comment describing how
the file was generated and listing external nameservers.

Always start with the host's resolv.conf file, whether generating config
for host networking, or with/without an internal resolver - rather than
editing a file previously generated for a different use-case.

Resolves an issue where rewrites of the generated file resulted in
default IPv6 nameservers being unnecessarily added to the config.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-02-06 22:26:12 +00:00
Sebastiaan van Stijn
0616b4190e
Merge pull request #47309 from akerouanton/libnet-bridge-mtu-ignore-einval
libnet: bridge: ignore EINVAL when configuring bridge MTU
2024-02-03 11:36:19 +01:00
Albin Kerouanton
89470a7114 libnet: bridge: ignore EINVAL when configuring bridge MTU
Since 964ab7158c, we explicitly set the bridge MTU if it was specified.
Unfortunately, kernel <v4.17 have a check preventing us to manually set
the MTU to anything greater than 1500 if no links is attached to the
bridge, which is how we do things -- create the bridge, set its MTU and
later on, attach veths to it.

Relevant kernel commit: 804b854d37

As we still have to support CentOS/RHEL 7 (and their old v3.10 kernels)
for a few more months, we need to ignore EINVAL if the MTU is > 1500
(but <= 65535).

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2024-02-02 19:32:45 +01:00
Sebastiaan van Stijn
701dd989f1
Merge pull request #47302 from akerouanton/libnet-ds-PersistConnection
libnet: boltdb: remove PersistConnection
2024-02-02 19:05:09 +01:00
Albin Kerouanton
83af50aee3 libnet: boltdb: inline getDBhandle()
Previous commit made getDBhandle a one-liner returning a struct
member -- making it useless. Inline it.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2024-02-02 09:19:07 +01:00
Albin Kerouanton
4d7c11c208 libnet: boltdb: remove PersistConnection
This parameter was used to tell the boltdb kvstore not to open/close
the underlying boltdb db file before/after each get/put operation.

Since d21d0884ae, we've a single datastore instance shared by all
components that need it. That commit set `PersistConnection=true`.
We can now safely remove this param altogether, and remove all the
code that was opening and closing the db file before and after each
operation -- it's dead code!

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2024-02-02 09:19:07 +01:00
Albin Kerouanton
8070a9aa66 libnet: drop TestMultipleControllersWithSameStore
This test is non-representative of what we now do in libnetwork.
Since the ability of opening the same boltdb database multiple
times in parallel will be dropped in the next commit, just remove
this test.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2024-02-02 09:19:07 +01:00
Albin Kerouanton
025967efd0
Merge pull request #47293 from robmry/47229-internal-bridge-firewalld
Add internal n/w bridge to firewalld docker zone
2024-02-02 08:36:27 +01:00
Albin Kerouanton
add2c4c79b
Merge pull request #47285 from corhere/libn/one-datastore-to-rule-them-all
libnetwork: share a single datastore with drivers
2024-02-02 08:03:01 +01:00
Sebastiaan van Stijn
8604cc400d
Merge pull request #47242 from robmry/remove_etchosts_build_unused_params
Remove unused params from etchosts.Build()
2024-02-02 01:09:10 +01:00
Rob Murray
2cc627932a Add internal n/w bridge to firewalld docker zone
Containers attached to an 'internal' bridge network are unable to
communicate when the host is running firewalld.

Non-internal bridges are added to a trusted 'docker' firewalld zone, but
internal bridges were not.

DOCKER-ISOLATION iptables rules are still configured for an internal
network, they block traffic to/from addresses outside the network's subnet.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-02-01 11:49:53 +00:00
Cory Snider
2200c0137f libnetwork/datastore: don't parse file path
File paths can contain commas, particularly paths returned from
t.TempDir() in subtests which include commas in their names. There is
only one datastore provider and it only supports a single address, so
the only use of parsing the address is to break tests in mysterious
ways.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-01-31 21:26:28 -05:00
Cory Snider
d21d0884ae libnetwork: share a single datastore with drivers
The bbolt library wants exclusive access to the boltdb file and uses
file locking to assure that is the case. The controller and each network
driver that needs persistent storage instantiates its own unique
datastore instance, backed by the same boltdb file. The boltdb kvstore
implementation works around multiple access to the same boltdb file by
aggressively closing the boltdb file between each transaction. This is
very inefficient. Have the controller pass its datastore instance into
the drivers and enable the PersistConnection option to disable closing
the boltdb between transactions.

Set data-dir in unit tests which instantiate libnetwork controllers so
they don't hang trying to lock the default boltdb database file.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-01-31 21:08:34 -05:00
Sebastiaan van Stijn
f472dda2e9
Merge pull request #47236 from akerouanton/remove-sb-leave-options-param
libnet: remove arg `options` from (*Endpoint).Leave()
2024-01-30 16:57:36 +01:00
Rob Murray
2ddec74d59 Remove unused params from etchosts.Build()
Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-01-29 15:37:08 +00:00