unshare.Go() is not used as an existing network namespace needs to be
entered, not a new one created. Explicitly lock main() to the initial
thread so as not to depend on the side effects of importing the
internal/unshare package to achieve the same.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- The oldest kernel version currently supported is v3.10. Bridge
parameters can be set through netlink since v3.8 (see
torvalds/linux@25c71c7). As such, we don't need to fallback to sysfs to
set hairpin mode.
- `scanInterfaceStats()` is never called, so no need to keep it alive.
- Document why `default_pvid` is set through sysfs
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(*networkNamespace).InvokeFunc() cleaned up the state of the locked
thread by setting its network namespace to the netns of the goroutine
which called InvokeFunc(). If InvokeFunc() was to be called after the
caller had modified its thread's network namespace, InvokeFunc() would
incorrectly "restore" the state of its goroutine thread to the wrong
namespace, violating the invariant that unlocked threads are fungible.
Change the implementation to restore the thread's netns to the netns
that particular thread had before InvokeFunc() modified it.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Aside from unconditionally unlocking the OS thread even if restoring the
thread's network namespace fails, func (*networkNamespace).InvokeFunc()
correctly implements invoking a function inside a network namespace.
This is far from obvious, however. func InitOSContext() does much of the
heavy lifting but in a bizarre fashion: it restores the initial network
namespace before it is changed in the first place, and the cleanup
function it returns does not restore the network namespace at all! The
InvokeFunc() implementation has to restore the network namespace
explicitly by deferring a call to ns.SetNamespace().
func InitOSContext() is a leaky abstraction taped to a footgun. On the
one hand, it defensively resets the current thread's network namespace,
which has the potential to fix up the thread state if other buggy code
had failed to maintain the invariant that an OS thread must be locked to
a goroutine unless it is interchangeable with a "clean" thread as
spawned by the Go runtime. On the other hand, it _facilitates_ writing
buggy code which fails to maintain the aforementioned invariant because
the cleanup function it returns unlocks the thread from the goroutine
unconditionally while neglecting to restore the thread's network
namespace! It is quite scary to need a function which fixes up threads'
network namespaces after the fact as an arbitrary number of goroutines
could have been scheduled onto a "dirty" thread and run non-libnetwork
code before the thread's namespace is fixed up. Any number of
(not-so-)subtle misbehaviours could result if an unfortunate goroutine
is scheduled onto a "dirty" thread. The whole repository has been
audited to ensure that the aforementioned invariant is never violated,
making after-the-fact fixing up of thread network namespaces redundant.
Make InitOSContext() a no-op on Linux and inline the thread-locking into
the function (singular) which previously relied on it to do so.
func ns.SetNamespace() is of similarly dubious utility. It intermixes
capturing the initial network namespace and restoring the thread's
network namespace, which could result in threads getting put into the
wrong network namespace if the wrong thread is the first to call it.
Delete it entirely; functions which need to manipulate a thread's
network namespace are better served by being explicit about capturing
and restoring the thread's namespace.
Rewrite InvokeFunc() to invoke the closure inside a goroutine to enable
a graceful and safe recovery if the thread's network namespace could not
be restored. Avoid any potential race conditions due to changing the
main thread's network namespace by preventing the aforementioned
goroutines from being eligible to be scheduled onto the main thread.
Signed-off-by: Cory Snider <csnider@mirantis.com>
testutils.SetupTestOSContext() sets the calling thread's network
namespace but neglected to restore it on teardown. This was not a
problem in practice as it called runtime.LockOSThread() twice but
runtime.UnlockOSThread() only once, so the tampered threads would be
terminated by the runtime when the test case returned and replaced with
a clean thread. Correct the utility so it restores the thread's network
namespace during teardown and unlocks the goroutine from the thread on
success.
Remove unnecessary runtime.LockOSThread() calls peppering test cases
which leverage testutils.SetupTestOSContext().
Signed-off-by: Cory Snider <csnider@mirantis.com>
libnetwork/etchosts/etchosts_test.go:167:54: empty-lines: extra empty line at the end of a block (revive)
libnetwork/osl/route_linux.go:185:74: empty-lines: extra empty line at the start of a block (revive)
libnetwork/osl/sandbox_linux_test.go:323:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/bitseq/sequence.go:412:48: empty-lines: extra empty line at the start of a block (revive)
libnetwork/datastore/datastore_test.go:67:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/datastore/mock_store.go:34:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld.go:202:44: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld_test.go:76:36: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:256:67: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:303:128: empty-lines: extra empty line at the start of a block (revive)
libnetwork/networkdb/cluster.go:183:72: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipams/null/null_test.go:44:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/macvlan/macvlan_store.go:45:52: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1058:39: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/port_mapping.go:88:111: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/link.go:26:90: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ipv6_test.go:17:34: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ip_tables.go:392:4: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/bridge.go:804:50: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_serf.go:183:29: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_utils.go:81:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/peerdb.go:172:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:209:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:344:89: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:436:63: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/overlay.go:183:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/encryption.go:69:28: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/ov_network.go:563:81: empty-lines: extra empty line at the start of a block (revive)
libnetwork/default_gateway.go:32:43: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:184:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/endpoint.go:161:55: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store.go:320:33: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store_linux_test.go:11:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox.go:571:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:317:246: empty-lines: extra empty line at the start of a block (revive)
libnetwork/endpoint.go:550:17: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox_dns_unix.go:213:106: empty-lines: extra empty line at the start of a block (revive)
libnetwork/controller.go:676:85: empty-lines: extra empty line at the end of a block (revive)
libnetwork/agent.go:876:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/resolver.go:324:69: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1153:92: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1955:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/network.go:2235:9: empty-lines: extra empty line at the start of a block (revive)
libnetwork/libnetwork_internal_test.go:336:26: empty-lines: extra empty line at the start of a block (revive)
libnetwork/resolver_test.go:76:35: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:303:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:985:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1263:37: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
bump netlink to 1.2.1
change usages of netlink handle .Delete() to Close()
remove superfluous replace in vendor.mod
make requires of github.com/Azure/go-ansiterm direct
Signed-off-by: Martin Braun <braun@neuroforge.de>
relates to #35082, moby/libnetwork#2491
Previously, values for expire_quiescent_template, conn_reuse_mode,
and expire_nodest_conn were set only system-wide. Also apply them
for new lb_* and ingress_sbox sandboxes, so they are appropriately
propagated
Signed-off-by: Ryan Barry <rbarry@mirantis.com>
strings.ReplaceAll(s, old, new) is a wrapper function for
strings.Replace(s, old, new, -1). But strings.ReplaceAll is more
readable and removes the hardcoded -1.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
The io/ioutil package has been deprecated in Go 1.16. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Perhaps the testutils package in the past had an `init()` function to set up
specific things, but it no longer has. so these imports were doing nothing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Further improving load balancer performance by expiring
connections to servers with weights set to 0.
Signed-off-by: Andrew Kim <taeyeonkim90@gmail.com>
IPVS module used for swarm load balancer had a performance issue
under a high load situation. conn_reuse_mode=0 sysctl variable can
be set to handle the high load situation by reusing existing
connection entries in the IPVS table.
Under a high load, IPVS module was dropping tcp SYN packets whenever
a port reuse is detected with a connection in TIME_WAIT status forcing
clients to re-initiate tcp connections after request timeout events.
By setting conn_reuse_mode=0, IPVS module avoids special handling of
existing entries in the IPVS connection table.
Along with expire_nodest_conn=1, swarm load balancer can handle
a high load of requests and forward connections to newly joining
backend services.
Signed-off-by: Andrew Kim <taeyeonkim90@gmail.com>
This reverts commit 9f58c475940fb0c0d4b69de0af7787b62a40481f.
This commit is causing TestCreateParallel to be flaky
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
This reverts commit 94af1e5af2.
The reason to revert this is, that TestCreateParallel is
continously failing and breaking the CI
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Modify the loadbalancing for east-west traffic to use direct routing
rather than NAT and update tasks to use direct service return under
linux. This avoids hiding the source address of the sender and improves
the performance in single-client/single-server tests.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Go 1.10 fixed the problem related to thread and namespaces.
Details:
2595fe7fb6
In few words there is no more the possibility to have a go routine
running on a thread that is another namespace.
In this commit some cleanup is done and the method SetNamespace is
being removed. This will save tons of setns syscall, that were happening
way too frequently possibily to make sure that each operation was being
done in the host namespace.
I suspect that also all the drivers not running in a different
namespace would be able to drop also the lock of the OS Thread but
will address it in a different commit
Removed useless LockOSThreads around
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Refactor the ostweaks file to allows a more easy reuse
Add a method on the osl.Sandbox interface to allow setting
knobs on the sandbox
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
New load balancing code will require ability to add aliases to
load-balncer sandboxes. So this broadens the OSL interface to allow
adding aliases to any interface, along with the facility to get the
loopback interface's name based on the OS.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
In sandbox creation we disable IPV6 config. But this causes problem in live-restore case
where all IPV6 configs are wiped out on running container. Hence extra check has been added
take care of this issue.
Signed-off-by: selansen <elango.siva@docker.com>
Solaris support for Docker will likely not reach completion,
so removing these files as they are not in use and not
maintained.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In case of IP reuse locally there was a race condition
that was leaving the overlay namespace with wrong configuration
causing connectivity issues.
This commit introduces the use of setMatrix to handle the transient
state and make sure that the proper configuration is maintained
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Refreshed the PR: https://github.com/docker/libnetwork/pull/1585
Addressed comments suggesting to remove the IPAlias logic not anymore used
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Deletion of the dynamic mac is expected to work only if there was active
traffic with that endpoint and a dynamic entry exists. It can also age
out. Hence the mac removal failing is not error. Removing it to make the
debugging easier when parsing the logs.
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
- Disable ipv6 on all interface by default at sandbox creation.
Enable IPv6 per interface basis if the interface has an IPv6
address. In case sandbox has an IPv6 interface, also enable
IPv6 on loopback interface.
Signed-off-by: Alessandro Boch <aboch@docker.com>
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.
Also fix import name to use original project name 'logrus' instead of
'log'
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
When stale delete notifications are received, we still need to make sure
to purge sandbox neighbor cache because these stale deletes are most
typically out of order delete notifications and if an add for the
peermac was received before the delete of the old peermac,vtep pair then
we process that and replace the kernel state but the old neighbor state
in the sandbox cache remains. That needs to be purged when we finally
get the out of order delete notification.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If xfrm modules cannot be loaded:
- Create netlink.Handle only for ROUTE socket
- Reject local join on overlay secure network
Signed-off-by: Alessandro Boch <aboch@docker.com>
Currently ipam/ipamutils has a bunch of dependencies
in osl and netlink which makes the ipam/ipamutils harder
to use independently with other applications. This PR
modularizes ipam/ipamutils into a standalone package
with no OS level dependencies.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If we encounter an error setting an interface's IPv4 or IPv6 address,
log the addresses we tried to use using the %v specifier rather than %q.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
- Attempt the veth delete only after both ends
are moved into the default network namespace.
Which is after both driver.Leave() and
sandbox.clearNetworkResources() are called.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Add support for overlay networking in older kernels.
Following were done to achieve this:
+ Create the vxlan network in host namespace.
+ This may create conflicts with other private
networks so check for conflicts and fail a
join if there is any conflict.
+ Add iptable based filtering to only allow
subnet bridges in the same network to forward
traffic while different network bridges will
not be able to forward b/w each other. Also
block traffic to overlay network originating
from the host itself.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Consistently with what it does for IP addresses, libnetwork
will also program the container interface's MAC address with
the value set by network driver in InterfaceInfo.
Signed-off-by: Alessandro Boch <aboch@docker.com>