beenull/moby

Author	SHA1	Message	Date
Paweł Gronowski	e829cca0ee	Merge pull request #47584 from robmry/upstream_dns_windows Windows DNS resolver forwarding	2024-04-19 11:34:50 +02:00
Rob Murray	6c68be24a2	Windows DNS resolver forwarding Make the internal DNS resolver for Windows containers forward requests to upstream DNS servers when it cannot respond itself, rather than returning SERVFAIL. Windows containers are normally configured with the internal resolver first for service discovery (container name lookup), then external resolvers from '--dns' or the host's networking configuration. When a tool like ping gets a SERVFAIL from the internal resolver, it tries the other nameservers. But, nslookup does not, and with this change it does not need to. The internal resolver learns external server addresses from the container's HNSEndpoint configuration, so it will use the same DNS servers as processes in the container. The internal resolver for Windows containers listens on the network's gateway address, and each container may have a different set of external DNS servers. So, the resolver uses the source address of the DNS request to select external resolvers. On Windows, daemon.json feature option 'windows-no-dns-proxy' can be used to prevent the internal resolver from forwarding requests (restoring the old behaviour). Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-04-16 18:57:28 +01:00
Rob Murray	57dd56726a	Disable IPv6 for endpoints in '--ipv6=false' networks. No IPAM IPv6 address is given to an interface in a network with '--ipv6=false', but the kernel would assign a link-local address and, in a macvlan/ipvlan network, the interface may get a SLAAC-assigned address. So, disable IPv6 on the interface to avoid that. Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-04-10 17:11:20 +01:00
Rob Murray	d8b768149b	Move dummy DNS server to integration/internal/network Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-04-04 12:02:22 +01:00
Rob Murray	fde80fe2e7	Restore the SetKey prestart hook. Partially reverts `0046b16` "daemon: set libnetwork sandbox key w/o OCI hook" Running SetKey to store the OCI Sandbox key after task creation, rather than from the OCI prestart hook, meant it happened after sysctl settings were applied by the runtime - which was the intention, we wanted to complete Sandbox configuration after IPv6 had been disabled by a sysctl if that was going to happen. But, it meant '--sysctl' options for a specific network interface caused container task creation to fail, because the interface is only moved into the network namespace during SetKey. This change restores the SetKey prestart hook, and regenerates config files that depend on the container's support for IPv6 after the task has been created. It also adds a regression test that makes sure it's possible to set an interface-specific sysctl. Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-03-25 19:35:55 +00:00
Bjorn Neergaard	641e341eed	Merge pull request #47538 from robmry/libnet-resolver-nxdomain libnet: Don't forward to upstream resolvers on internal nw	2024-03-18 11:22:59 -06:00
Albin Kerouanton	790c3039d0	libnet: Don't forward to upstream resolvers on internal nw Commit `cbc2a71c2` makes `connect` syscall fail fast when a container is only attached to an internal network. Thanks to that, if such a container tries to resolve an "external" domain, the embedded resolver returns an error immediately instead of waiting for a timeout. This commit makes sure the embedded resolver doesn't even try to forward to upstream servers. Co-authored-by: Albin Kerouanton <albinker@gmail.com> Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-03-14 17:46:48 +00:00
Sebastiaan van Stijn	0fb845858d	Merge pull request #47505 from akerouanton/fix-TestBridgeICC-ipv6 inte/networking: ping with -6 specified when needed	2024-03-08 18:33:46 +01:00
Albin Kerouanton	5a009cdd5b	inte/networking: add isIPv6 flag Make sure the `ping` command used by `TestBridgeICC` actually has the `-6` flag when it runs IPv6 test cases. Without this flag, IPv6 connectivity isn't tested properly. Signed-off-by: Albin Kerouanton <albinker@gmail.com>	2024-03-07 17:55:53 +01:00
Rob Murray	ef5295cda4	Don't configure IPv6 addr/gw when IPv6 disabled. When IPv6 is disabled in a container by, for example, using the --sysctl option - an IPv6 address/gateway is still allocated. Don't attempt to apply that config because doing so enables IPv6 on the interface. Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-03-06 18:32:31 +00:00
Albin Kerouanton	7c7e453255	Merge pull request #47474 from robmry/47441_mac_addr_config_migration Don't create endpoint config for MAC addr config migration	2024-03-06 11:04:17 +01:00
Albin Kerouanton	21835a5696	inte/networking: rename linkLocal flag into isLinkLocal Signed-off-by: Albin Kerouanton <albinker@gmail.com>	2024-03-06 00:16:08 +01:00
Sebastiaan van Stijn	137a9d6a4c	Merge pull request #47395 from robmry/47370_windows_natnw_dns_test Test DNS on Windows 'nat' networks	2024-03-01 13:02:52 +01:00
Rob Murray	a580544d82	Don't create endpoint config for MAC addr config migration In a container-create API request, HostConfig.NetworkMode (the identity of the "main" network) may be a name, id or short-id. The configuration for that network, including preferred IP address etc, may be keyed on network name or id - it need not match the NetworkMode. So, when migrating the old container-wide MAC address to the new per-endpoint field - it is not safe to create a new EndpointSettings entry unless there is no possibility that it will duplicate settings intended for the same network (because one of the duplicates will be discarded later, dropping the settings it contains). This change introduces a new API restriction, if the deprecated container wide field is used in the new API, and EndpointsConfig is provided for any network, the NetworkMode and key under which the EndpointsConfig is stored must be the same - no mixing of ids and names. Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-02-29 17:02:19 +00:00
Sebastiaan van Stijn	6c3b3523c9	Merge pull request #47041 from robmry/46968_refactor_resolvconf Refactor 'resolv.conf' generation.	2024-02-29 09:33:55 +01:00
Rob Murray	9083c2f10d	Test DNS on Windows 'nat' networks Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-02-27 11:40:11 +00:00
Rob Murray	419f5a6372	Make 'internal' bridge networks accessible from host Prior to release 25.0.0, the bridge in an internal network was assigned an IP address - making the internal network accessible from the host, giving containers on the network access to anything listening on the bridge's address (or INADDR_ANY on the host). This change restores that behaviour. It does not restore the default route that was configured in the container, because packets sent outside the internal network's subnet have always been dropped. So, a 'connect()' to an address outside the subnet will still fail fast. Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-02-07 19:12:10 +00:00
Rob Murray	beb97f7fdf	Refactor 'resolv.conf' generation. Replace regex matching/replacement and re-reading of generated files with a simple parser, and struct to remember and manipulate the file content. Annotate the generated file with a header comment saying the file is generated, but can be modified, and a trailing comment describing how the file was generated and listing external nameservers. Always start with the host's resolv.conf file, whether generating config for host networking, or with/without an internal resolver - rather than editing a file previously generated for a different use-case. Resolves an issue where rewrites of the generated file resulted in default IPv6 nameservers being unnecessarily added to the config. Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-02-06 22:26:12 +00:00
Albin Kerouanton	ca683c1c77	Merge pull request #47233 from robmry/47146-duplicate_mac_addrs2 Only restore a configured MAC addr on restart.	2024-02-02 09:08:17 +01:00
Rob Murray	8c64b85fb9	No inspect 'Config.MacAddress' unless configured. Do not set 'Config.MacAddress' in inspect output unless the MAC address is configured. Also, make sure it is filled in for a configured address on the default network before the container is started (by translating the network name from 'default' to 'config' so that the address lookup works). Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-02-01 09:57:35 +00:00
Rob Murray	dae33031e0	Only restore a configured MAC addr on restart. The API's EndpointConfig struct has a MacAddress field that's used for both the configured address, and the current address (which may be generated). A configured address must be restored when a container is restarted, but a generated address must not. The previous attempt to differentiate between the two, without adding a field to the API's EndpointConfig that would show up in 'inspect' output, was a field in the daemon's version of EndpointSettings, MACOperational. It did not work, MACOperational was set to true when a configured address was used. So, while it ensured addresses were regenerated, it failed to preserve a configured address. So, this change removes that code, and adds DesiredMacAddress to the wrapped version of EndpointSettings, where it is persisted but does not appear in 'inspect' results. Its value is copied from MacAddress (the API field) when a container is created. Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-02-01 09:55:54 +00:00
Albin Kerouanton	794f7127ef	Merge pull request #47062 from robmry/35954-default_ipv6_enabled Detect IPv6 support in containers, generate '/etc/hosts' accordingly.	2024-01-29 16:31:35 +01:00
Rob Murray	cd53b7380c	Remove generated MAC addresses on restart. The MAC address of a running container was stored in the same place as the configured address for a container. When starting a stopped container, a generated address was treated as a configured address. If that generated address (based on an IPAM-assigned IP address) had been reused, the containers ended up with duplicate MAC addresses. So, remember whether the MAC address was explicitly configured, and clear it if not. Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-01-22 17:52:20 +00:00
Rob Murray	a8f7c5ee48	Detect IPv6 support in containers. Some configuration in a container depends on whether it has support for IPv6 (including default entries for '::1' etc in '/etc/hosts'). Before this change, the container's support for IPv6 was determined by whether it was connected to any IPv6-enabled networks. But, that can change over time, it isn't a property of the container itself. So, instead, detect IPv6 support by looking for '::1' on the container's loopback interface. It will not be present if the kernel does not have IPv6 support, or the user has disabled it in new namespaces by other means. Once IPv6 support has been determined for the container, its '/etc/hosts' is re-generated accordingly. The daemon no longer disables IPv6 on all interfaces during initialisation. It now disables IPv6 only for interfaces that have not been assigned an IPv6 address. (But, even if IPv6 is disabled for the container using the sysctl 'net.ipv6.conf.all.disable_ipv6=1', interfaces connected to IPv6 networks still get IPv6 addresses that appear in the internal DNS. There's more to-do!) Signed-off-by: Rob Murray <rob.murray@docker.com>	2024-01-19 20:24:07 +00:00
Rob Murray	27f3abd893	Allow overlapping change in bridge's IPv6 network. Calculate the IPv6 addresses needed on a bridge, then reconcile them with the addresses on an existing bridge by deleting then adding as required. (Previously, required addresses were added one-by-one, then unwanted addresses were removed. This meant the daemon failed to start if, for example, an existing bridge had address '2000:db8::/64' and the config was changed to '2000:db8::/80'.) IPv6 addresses are now calculated and applied in one go, so there's no need for setupVerifyAndReconcile() to check the set of IPv6 addresses on the bridge. And, it was guarded by !config.InhibitIPv4, which can't have been right. So, removed its IPv6 parts, and added IPv4 to its name. Link local addresses, the example given in the original ticket, are now released when containers are stopped. Not releasing them meant that when using an LL subnet on the default bridge, no container could be started after a container was stopped (because the calculated address could not be re-allocated). In non-default bridge networks using an LL subnet, addresses leaked. Linux always uses the standard 'fe80::/64' LL network. So, if a bridge is configured with an LL subnet prefix that overlaps with it, a config error is reported. Non-overlapping LL subnet prefixes are allowed. Signed-off-by: Rob Murray <rob.murray@docker.com>	2023-12-18 16:10:41 +00:00
Sebastiaan van Stijn	58785c2932	integration/networking: fix TestBridgeICC This test broke in `98323ac114`. This commit renamed WithMacAddress into WithContainerWideMacAddress. This helper sets the MacAddress field in container.Config. However, API v1.44 now ignores this field if the NetworkMode has no matching entry in EndpointsConfig. This fix uses the helper WithMacAddress and specifies for which EndpointConfig the MacAddress is set. Signed-off-by: Sebastiaan van Stijn <github@gone.nl> Signed-off-by: Albin Kerouanton <albinker@gmail.com>	2023-11-08 10:23:24 +01:00
Albin Kerouanton	c1ab6eda4b	integration/networking: Test bridge ICC and INC Following tests are implemented in this specific commit: - Inter-container communications for internal and non-internal bridge networks, over IPv4 and IPv6. - Inter-container communications using IPv6 link-local addresses for internal and non-internal bridge networks. - Inter-network communications for internal and non-internal bridge networks, over IPv4 and IPv6, are disallowed. Signed-off-by: Albin Kerouanton <albinker@gmail.com>	2023-11-03 09:58:50 +01:00
Albin Kerouanton	409ea700c7	integration: Add a new networking integration test suite This commit introduces a new integration test suite aimed at testing networking features like inter-container communication, network isolation, port mapping, etc... and how they interact with daemon-level and network-level parameters. So far, there's pretty much no tests making sure our networks are well configured: 1. there're a few tests for port mapping, but they don't cover all use cases ; 2. there're a few tests that check if a specific iptables rule exist, but that doesn't prevent that specific iptables rule to be wrong in the first place. As we're planning to refactor how iptables rules are written, and change some of them to fix known security issues, we need a way to test all combinations of parameters. So far, this was done by hand, which is particularly painful and time consuming. As such, this new test suite is foundational to upcoming work. Signed-off-by: Albin Kerouanton <albinker@gmail.com>	2023-11-03 09:58:50 +01:00

28 commits