Commit cbc2a71c2 makes `connect` syscall fail fast when a container is
only attached to an internal network. Thanks to that, if such a
container tries to resolve an "external" domain, the embedded resolver
returns an error immediately instead of waiting for a timeout.
This commit makes sure the embedded resolver doesn't even try to forward
to upstream servers.
Co-authored-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
Make sure the `ping` command used by `TestBridgeICC` actually has
the `-6` flag when it runs IPv6 test cases. Without this flag,
IPv6 connectivity isn't tested properly.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
When IPv6 is disabled in a container by, for example, using the --sysctl
option - an IPv6 address/gateway is still allocated. Don't attempt to
apply that config because doing so enables IPv6 on the interface.
Signed-off-by: Rob Murray <rob.murray@docker.com>
In a container-create API request, HostConfig.NetworkMode (the identity
of the "main" network) may be a name, id or short-id.
The configuration for that network, including preferred IP address etc,
may be keyed on network name or id - it need not match the NetworkMode.
So, when migrating the old container-wide MAC address to the new
per-endpoint field - it is not safe to create a new EndpointSettings
entry unless there is no possibility that it will duplicate settings
intended for the same network (because one of the duplicates will be
discarded later, dropping the settings it contains).
This change introduces a new API restriction, if the deprecated container
wide field is used in the new API, and EndpointsConfig is provided for
any network, the NetworkMode and key under which the EndpointsConfig is
store must be the same - no mixing of ids and names.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Prior to release 25.0.0, the bridge in an internal network was assigned
an IP address - making the internal network accessible from the host,
giving containers on the network access to anything listening on the
bridge's address (or INADDR_ANY on the host).
This change restores that behaviour. It does not restore the default
route that was configured in the container, because packets sent outside
the internal network's subnet have always been dropped. So, a 'connect()'
to an address outside the subnet will still fail fast.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Replace regex matching/replacement and re-reading of generated files
with a simple parser, and struct to remember and manipulate the file
content.
Annotate the generated file with a header comment saying the file is
generated, but can be modified, and a trailing comment describing how
the file was generated and listing external nameservers.
Always start with the host's resolv.conf file, whether generating config
for host networking, or with/without an internal resolver - rather than
editing a file previously generated for a different use-case.
Resolves an issue where rewrites of the generated file resulted in
default IPv6 nameservers being unnecessarily added to the config.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Do not set 'Config.MacAddress' in inspect output unless the MAC address
is configured.
Also, make sure it is filled in for a configured address on the default
network before the container is started (by translating the network name
from 'default' to 'config' so that the address lookup works).
Signed-off-by: Rob Murray <rob.murray@docker.com>
The API's EndpointConfig struct has a MacAddress field that's used for
both the configured address, and the current address (which may be generated).
A configured address must be restored when a container is restarted, but a
generated address must not.
The previous attempt to differentiate between the two, without adding a field
to the API's EndpointConfig that would show up in 'inspect' output, was a
field in the daemon's version of EndpointSettings, MACOperational. It did
not work, MACOperational was set to true when a configured address was
used. So, while it ensured addresses were regenerated, it failed to preserve
a configured address.
So, this change removes that code, and adds DesiredMacAddress to the wrapped
version of EndpointSettings, where it is persisted but does not appear in
'inspect' results. Its value is copied from MacAddress (the API field) when
a container is created.
Signed-off-by: Rob Murray <rob.murray@docker.com>
The MAC address of a running container was stored in the same place as
the configured address for a container.
When starting a stopped container, a generated address was treated as a
configured address. If that generated address (based on an IPAM-assigned
IP address) had been reused, the containers ended up with duplicate MAC
addresses.
So, remember whether the MAC address was explicitly configured, and
clear it if not.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Some configuration in a container depends on whether it has support for
IPv6 (including default entries for '::1' etc in '/etc/hosts').
Before this change, the container's support for IPv6 was determined by
whether it was connected to any IPv6-enabled networks. But, that can
change over time, it isn't a property of the container itself.
So, instead, detect IPv6 support by looking for '::1' on the container's
loopback interface. It will not be present if the kernel does not have
IPv6 support, or the user has disabled it in new namespaces by other
means.
Once IPv6 support has been determined for the container, its '/etc/hosts'
is re-generated accordingly.
The daemon no longer disables IPv6 on all interfaces during initialisation.
It now disables IPv6 only for interfaces that have not been assigned an
IPv6 address. (But, even if IPv6 is disabled for the container using the
sysctl 'net.ipv6.conf.all.disable_ipv6=1', interfaces connected to IPv6
networks still get IPv6 addresses that appear in the internal DNS. There's
more to-do!)
Signed-off-by: Rob Murray <rob.murray@docker.com>
Calculate the IPv6 addreesses needed on a bridge, then reconcile them
with the addresses on an existing bridge by deleting then adding as
required.
(Previously, required addresses were added one-by-one, then unwanted
addresses were removed. This meant the daemon failed to start if, for
example, an existing bridge had address '2000:db8::/64' and the config
was changed to '2000:db8::/80'.)
IPv6 addresses are now calculated and applied in one go, so there's no
need for setupVerifyAndReconcile() to check the set of IPv6 addresses on
the bridge. And, it was guarded by !config.InhibitIPv4, which can't have
been right. So, removed its IPv6 parts, and added IPv4 to its name.
Link local addresses, the example given in the original ticket, are now
released when containers are stopped. Not releasing them meant that
when using an LL subnet on the default bridge, no container could be
started after a container was stopped (because the calculated address
could not be re-allocated). In non-default bridge networks using an
LL subnet, addresses leaked.
Linux always uses the standard 'fe80::/64' LL network. So, if a bridge
is configured with an LL subnet prefix that overlaps with it, a config
error is reported. Non-overlapping LL subnet prefixes are allowed.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This test broke in 98323ac114.
This commit renamed WithMacAddress into WithContainerWideMacAddress.
This helper sets the MacAddress field in container.Config. However, API
v1.44 now ignores this field if the NetworkMode has no matching entry in
EndpointsConfig.
This fix uses the helper WithMacAddress and specify for which
EndpointConfig the MacAddress is specified.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Following tests are implemented in this specific commit:
- Inter-container communications for internal and non-internal
bridge networks, over IPv4 and IPv6.
- Inter-container communications using IPv6 link-local addresses for
internal and non-internal bridge networks.
- Inter-network communications for internal and non-internal bridge
networks, over IPv4 and IPv6, are disallowed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This commit introduces a new integration test suite aimed at testing
networking features like inter-container communication, network
isolation, port mapping, etc... and how they interact with daemon-level
and network-level parameters.
So far, there's pretty much no tests making sure our networks are well
configured: 1. there're a few tests for port mapping, but they don't
cover all use cases ; 2. there're a few tests that check if a specific
iptables rule exist, but that doesn't prevent that specific iptables
rule to be wrong in the first place.
As we're planning to refactor how iptables rules are written, and change
some of them to fix known security issues, we need a way to test all
combinations of parameters. So far, this was done by hand, which is
particularly painful and time consuming. As such, this new test suite is
foundational to upcoming work.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>