The overlay-network encryption code is woefully under-documented, which
is especially problematic as it operates on under-documented kernel
interfaces. Document what I have puzzled out of the implementation for
the benefit of the next poor soul to touch this code.
Signed-off-by: Cory Snider <csnider@mirantis.com>
IPVLAN networks created on Moby v20.10 do not have the IpvlanFlag
configuration value persisted in the libnetwork database as that config
value did not exist before v23.0.0. Gracefully migrate configurations on
unmarshal to prevent type-assertion panics at daemon start after upgrade.
Fixes#44925
Signed-off-by: Cory Snider <csnider@mirantis.com>
The DatastoreConfig discovery type is unused. Remove the constant and
any resulting dead code. Today's biggest loser is the IPAM Allocator:
DatastoreConfig was the only type of discovery event it was listening
for, and there was no other place where a non-nil datastore could be
passed into the allocator. Strip out all the dead persistence code from
Allocator, leaving it as purely an in-memory implementation.
There is no more need to check the consistency of the allocator's
bit-sequences as there is no persistent storage for inconsistent bit
sequences to be loaded from.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Per the Interface Segregation Principle, network drivers should not have
to depend on GetPluginGetter methods they do not use. The remote network
driver is the only one which needs a PluginGetter, and it is already
special-cased in Controller so there is no sense warping the interfaces
to achieve a foolish consistency. Replace all other network drivers' Init
functions with Register functions which take a driverapi.Registerer
argument instead of a driverapi.DriverCallback. Add back in Init wrapper
functions for only the drivers which Swarmkit references so that
Swarmkit can continue to build.
Refactor the libnetwork Controller to use the new drvregistry.Networks
and drvregistry.IPAMs driver registries in place of the legacy ones.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The function references global shared, mutable state and is no longer
needed. Deleting it brings us one step closer to getting rid of that
pesky shared state.
Signed-off-by: Cory Snider <csnider@mirantis.com>
These package-level variables were copied over from the Linux
implementation; drop them for clarity's sake.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
GenerateRandomName now uses length to represent the overall length of
the string; this will help future users avoid creating interface names
that are too long for the kernel to accept by mistake. The test coverage
is increased and cleaned up using gotest.tools.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
- The oldest kernel version currently supported is v3.10. Bridge
parameters can be set through netlink since v3.8 (see
torvalds/linux@25c71c7). As such, we don't need to fallback to sysfs to
set hairpin mode.
- `scanInterfaceStats()` is never called, so no need to keep it alive.
- Document why `default_pvid` is set through sysfs
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
When userland-proxy is turned off and on again, the iptables nat rule
doing hairpinning isn't properly removed. This fix makes sure this nat
rule is removed whenever the bridge is torn down or hairpinning is
disabled (through setting userland-proxy to true).
Unlike for ip masquerading and ICC, the `programChainRule()` call
setting up the "MASQ LOCAL HOST" rule has to be called unconditionally
because the hairpin parameter isn't restored from the driver store, but
always comes from the driver config.
For the "SKIP DNAT" rule, things are a bit different: this rule is
always deleted by `removeIPChains()` when the bridge driver is
initialized.
Fixes#44721.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Conntrack entries are created for UDP flows even if there's nowhere to
route these packets (ie. no listening socket and no NAT rules to
apply). Moreover, iptables NAT rules are evaluated by netfilter only
when creating a new conntrack entry.
When Docker adds NAT rules, netfilter will ignore them for any packet
matching a pre-existing conntrack entry. In such case, when
dockerd runs with userland proxy enabled, packets got routed to it and
the main symptom will be bad source IP address (as shown by #44688).
If the publishing container is run through Docker Swarm or in
"standalone" Docker but with no userland proxy, affected packets will
be dropped (eg. routed to nowhere).
As such, Docker needs to flush all conntrack entries for published UDP
ports to make sure NAT rules are correctly applied to all packets.
- Fixes#44688
- Fixes#8795
- Fixes#16720
- Fixes#7540
- Fixesmoby/libnetwork#2423
- and probably more.
As a precautionary measure, those conntrack entries are also flushed
when revoking external connectivity to avoid those entries to be reused
when a new sandbox is created (although the kernel should already
prevent such case).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This code was forked from libcontainer (now runc) in
fb6dd9766e
From the description of this code:
> THIS CODE DOES NOT COMMUNICATE WITH KERNEL VIA RTNETLINK INTERFACE
> IT IS HERE FOR BACKWARDS COMPATIBILITY WITH OLDER LINUX KERNELS
> WHICH SHIP WITH OLDER NOT ENTIRELY FUNCTIONAL VERSION OF NETLINK
That comment was added as part of a refactor in;
4fe2c7a4db
Digging deeper into the code, it describes:
> This is more backward-compatible than netlink.NetworkSetMaster and
> works on RHEL 6.
That comment (and code) moved around a few times;
- moved into the libcontainer pkg: 6158ccad97
- moved within the networkdriver pkg: 4cdcea2047
- moved into the networkdriver pkg: 90494600d3
Ultimately leading to 7a94cdf8ed, which implemented
this:
> create the bridge device with ioctl
>
> On RHEL 6, creation of a bridge device with netlink fails. Use the more
> backward-compatible ioctl instead. This fixes networking on RHEL 6.
So from that information, it looks indeed to support RHEL 6, and Ubuntu 12.04
which are both EOL, and we haven't supported for a long time, so probably time
to remove this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We only need the content here, not the checksum, so simplifying the code by
just using os.ReadFile().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
TestCreateParallel, which was ostensibly added as a regression test for
race conditions inside the bridge driver, contains a race condition. The
getIPv4Data() calls race the network configuration and so will sometimes
see the existing address assignments return IP address ranges which do
not conflict with them. While normally a good thing, the test asserts
that exactly one of the 100 networks is successfully created. Pass the
same IPAM data when attempting to create every network to ensure that
the address ranges conflict.
Signed-off-by: Cory Snider <csnider@mirantis.com>
addresses() would incorrectly return all IP addresses assigned to any
interface in the network namespace if exists() is false. This went
unnoticed as the unit test covering this case tested the method inside a
clean new network namespace, which had no interfaces brought up and
therefore no IP addresses assigned. Modifying
testutils.SetupTestOSContext() to bring up the loopback interface 'lo'
resulted in the loopback addresses 127.0.0.1 and [::1] being assigned to
the loopback interface, causing addresses() to return the loopback
addresses and TestAddressesEmptyInterface to start failing. Fix the
implementation of addresses() so that it only ever returns addresses for
the bridge interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
portallocator.PortAllocator holds persistent state on port allocations
in the network namespace. It normal operation it is used as a singleton,
which is a problem in unit tests as every test runs in a clean network
namespace. The singleton "default" PortAllocator remembers all the port
allocations made in other tests---in other network namespaces---and
can result in spurious test failures. Refactor the bridge driver so that
tests can construct driver instances with unique PortAllocators.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There are a handful of libnetwork tests which need to have multiple
concurrent goroutines inside the same test OS context (network
namespace), including some platform-agnostic tests. Provide test
utilities for spawning goroutines inside an OS context and
setting/restoring an existing goroutine's OS context to abstract away
platform differences and so that each test does not need to reinvent the
wheel.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The SetupTestOSContext calls were made conditional in
https://github.com/moby/libnetwork/pull/148 to work around limitations
in runtime.LockOSThread() which existed before Go 1.10. This workaround
is no longer necessary now that runtime.UnlockOSThread() needs to be
called an equal number of times before the goroutine is unlocked from
the OS thread.
Unfortunately some tests break when SetupTestOSContext is not skipped.
(Evidently this code path has not been exercised in a long time.) A
newly-created network namespace is very barebones: it contains a
loopback interface in the down state and little else. Even pinging
localhost does not work inside of a brand new namespace. Set the
loopback interface to up during namespace setup so that tests which
need to use the loopback interface can do so.
Signed-off-by: Cory Snider <csnider@mirantis.com>
If the GenericData option is missing from the map, the bridge driver
would skip configuration. At first glance this seems fine: the driver
defaults are to not configure anything. But it also skips over
initializing the persistent storage, which is configured through other
option keys in the map. Fix this oversight by removing the early return.
Signed-off-by: Cory Snider <csnider@mirantis.com>
func (*network) watchMiss() correctly locks its goroutine to an OS
thread before changing the thread's network namespace, but neglects to
restore the thread's network namespace before unlocking. Fix this
oversight by unlocking iff the thread's network namespace is
successfully restored.
Prevent the watchMiss goroutine from being locked to the main thread to
avoid the issues which would arise if such a situation was to occur.
Signed-off-by: Cory Snider <csnider@mirantis.com>
libnetwork/etchosts/etchosts_test.go:167:54: empty-lines: extra empty line at the end of a block (revive)
libnetwork/osl/route_linux.go:185:74: empty-lines: extra empty line at the start of a block (revive)
libnetwork/osl/sandbox_linux_test.go:323:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/bitseq/sequence.go:412:48: empty-lines: extra empty line at the start of a block (revive)
libnetwork/datastore/datastore_test.go:67:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/datastore/mock_store.go:34:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld.go:202:44: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld_test.go:76:36: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:256:67: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:303:128: empty-lines: extra empty line at the start of a block (revive)
libnetwork/networkdb/cluster.go:183:72: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipams/null/null_test.go:44:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/macvlan/macvlan_store.go:45:52: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1058:39: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/port_mapping.go:88:111: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/link.go:26:90: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ipv6_test.go:17:34: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ip_tables.go:392:4: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/bridge.go:804:50: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_serf.go:183:29: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_utils.go:81:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/peerdb.go:172:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:209:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:344:89: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:436:63: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/overlay.go:183:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/encryption.go:69:28: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/ov_network.go:563:81: empty-lines: extra empty line at the start of a block (revive)
libnetwork/default_gateway.go:32:43: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:184:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/endpoint.go:161:55: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store.go:320:33: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store_linux_test.go:11:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox.go:571:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:317:246: empty-lines: extra empty line at the start of a block (revive)
libnetwork/endpoint.go:550:17: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox_dns_unix.go:213:106: empty-lines: extra empty line at the start of a block (revive)
libnetwork/controller.go:676:85: empty-lines: extra empty line at the end of a block (revive)
libnetwork/agent.go:876:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/resolver.go:324:69: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1153:92: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1955:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/network.go:2235:9: empty-lines: extra empty line at the start of a block (revive)
libnetwork/libnetwork_internal_test.go:336:26: empty-lines: extra empty line at the start of a block (revive)
libnetwork/resolver_test.go:76:35: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:303:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:985:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1263:37: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Older versions of Go don't format comments, so committing this as
a separate commit, so that we can already make these changes before
we upgrade to Go 1.19.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Inlining the string makes the code more grep'able; renaming the
const to "driverName" to reflect the remaining uses of it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Inlining the string makes the code more grep'able; renaming the
const to "driverName" to reflect the remaining uses of it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was effectively a constructor, but through some indirection; make it a
regular function, which is a bit more idiomatic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was effectively a constructor, but through some indirection; make it a
regular function, which is a bit more idiomatic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>