bump netlink to 1.2.1
change usages of netlink handle .Delete() to Close()
remove superfluous replace in vendor.mod
make requires of github.com/Azure/go-ansiterm direct
Signed-off-by: Martin Braun <braun@neuroforge.de>
Operations performed on overlay network sandboxes are handled by
dispatching operations send through a channel. This allows for
asynchronous operations to be performed which, since they are
not called from within another function, are able to operate in
an idempotent manner with a known/measurable starting state from
which an identical series of iterative actions can be performed.
However, it was possible in some cases for an operation dispatched
from this channel to write a message back to the channel in the
case of joining a network when a sufficient volume of sandboxes
were operated on.
A goroutine which is simultaneously reading and writing to an
unbuffered channel can deadlock if it sends a message to a channel
then waits for it to be consumed and completed, since the only
available goroutine is more or less "talking to itself". In order
to break this deadlock, in the observed race, a goroutine is now
created to send the message to the channel.
Signed-off-by: Martin Dojcak <martin.dojcak@lablabs.io>
Signed-off-by: Ryan Barry <rbarry@mirantis.com>
The io/ioutil package has been deprecated in Go 1.16. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Perhaps the testutils package in the past had an `init()` function to set up
specific things, but it no longer has. so these imports were doing nothing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were purposefully ignored before but this goes ahead and "fixes"
most of them.
Note that none of the things gosec flagged are problematic, just
quieting the linter here.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
full diff: https://github.com/vishvananda/netlink/compare/v1.0.0...v1.1.0
also updated moby/ipvs, which is compatible with this version of netlink,
and update vishvananda/netns to current master (which added go.mod)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Issue - "index out of range" panic in drivers/overlay/encryption.go:539
due to a mismatch in indices between curKeys and spis due to
case where updateKeys might bail out due to an error and
not update the spis
Fix - Reconfigure keys when there is a key update failure
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Changes included:
- Allow index specification at link creation time
- replace syscall with golang.org/x/sys/unix
- related: Use IFF_MULTI_QUEUE from x/sys/unix to define TUNTAP_MULTI_QUEUE
- related: Use IFLA_* constants from x/sys/unix
- Fix index out of range when no metadata for gretap
- added encapsulation attributes for Iptun and Sittun to support SIT tunnels
- Expose xfrm state's statistics
- Support invert in ip rules
- Support LWTUNNEL_ENCAP_SEG6
- Support setting and retrieving route MTU/AdvMSS
- Fix CalcRtable array parameter bug
- added support for Foo-over-UDP netlink calls
- Support num{tx,rx}queues and udp6zerocsum{tx,rx}
- tuntap: Add multiqueue support
- Retrieve VLAN ID when listing neighbour
- Fix LinkAdd for sit tunnel on 3.10 kernel
- Add support for managing source MACVLANs
- Two functions: one for adding bond slave, one for getting veth peer index
- Eliminate cgo from netlink
- Don't overwrite the XDP file descriptor with flags
- Fix reference to BPF instructions (on Kernel 4.13)
- Add Matchall filter
- Send IFA_CACHEINFO when setting up addresses
- Support IPv6 GRE Tun and Tap
- Add List option to RouteSubscribeWithOptions, AddrSubscribeWithOptions, and LinkSubscribeWithOptions
- Add Fq and Fq_Codel Qdisc support
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Minor changes following review of the engine pull request
for this feature;
- Remove the name of the function from the error message
as it's not a debug message.
- Add the valid range to the error message, so that a
user has sufficient information to address the problem.
- Update GoDoc for the function to describe the default
port, and valid port-ranges.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This PR chnages allow user to configure VxLAN UDP
port number. By default we use 4789 port number. But this commit
will allow user to configure port number during swarm init.
VxLAN port can't be modified after swarm init.
Signed-off-by: selansen <elango.siva@docker.com>
Internal directory is designed to contain libraries
that are exclusively used by this project
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Instead of using "sync.Once" to determine whether to initialize a
network sandbox or subnet sandbox, we use a traditional mutex +
initialization boolean. This is because the initialization state isn't
truly a once-and-done condition. Rather, libnetwork destroys network
and subnet sandboxes when the last endpoint leaves them. The use of
sync.Once in this kind of scenario requires, therefore, re-initializing
the Once which is impoissible. So the approach that libnetwork
currently takes is to use a pointer to a Once and redirect that pointer
to a new Once on reset. This leads to nasty race conditions.
In addition to refactoring the locking, this patch merges the functions
joinSandbox(), and joinSubnetSandbox(). This makes the code both cleaner
and it also holds the network and subnet locks through the series of
read-modify-writes avoiding further potential races. This does reduce
the potential parallelism which could be applied should there be many
joins coming in on many different subnets in the same overlay network.
However, this should be an extremely minor performance hit for a very
obscure case.
One important pattern in this commit is that it is crucial to avoid
sending peerDB messages while holding a driver or network lock. The
changes herein defer such (asynchronous) notifications until after
release of such locks. This prevents deadlocks where the peerDB
blocks acquiring said locks while the network method blocks trying
to send to the peerDB's channel.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
The previous code used string slices to limit the length of certain
fields like endpoint or sandbox IDs. This assumes that these strings
are at least as long as the slice length. Unfortunately, some sandbox
IDs can be smaller than 7 characters. This fix addresses this issue
by systematically converting format string calls that were taking
fixed-slice arguments to use a precision specifier in the string format
itself. From the golang fmt package documentation:
For strings, byte slices and byte arrays, however, precision limits
the length of the input to be formatted (not the size of the output),
truncating if necessary. Normally it is measured in runes, but for
these types when formatted with the %x or %X format it is measured
in bytes.
This nicely fits the desired behavior: it will limit the number of
runes considered for string interpolation to the precision value.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Refactor the ostweaks file to allows a more easy reuse
Add a method on the osl.Sandbox interface to allow setting
knobs on the sandbox
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
agent.pb.go is unchanged, but the files in networkdb and drivers
are slightly different when regenerated using the current versions
of protoc and gogoproto. This is probably because agent.pb.go
was last regenerated quite recently, in February 2018, whereas
networkdb.pb.go and overlay/overlay.pb.go were last changed in 2017,
and windows/overlay/overlay.pb.go was last changed in 2016.
Signed-off-by: Euan Harris <euan.harris@docker.com>
Commit b64997ea prevented data corruption due to simultaneous
driver.CreateNetwork()/driver.DeleteNetwork() by holding the network
lock through the read/modify part of the operation. However, part of
the DeleteNetwork operation entails sending a message to the peerDB to
tell that goroutine to flush entries on deletion. This can lead to a
deadlock where:
* driver.DeleteNetwork() starts and acquires driver.Lock()
* peerDB receives some other request (e.g. EventNotify) and blocks
on driver.Lock()
* driver.DeleteNetwork() attempts a peerDB flush and blocks waiting
on the synchronous peerDB operation channel
This patch fixes the issue by deferring the peerDB flush operation until
after DeleteNetwork() unlocks driver.Lock(). Commit b64997ea only
modified CreateNetwork() and DeleteNetwork() and the critical section
that driver.Lock() protects in CreateNetwork() does not perform any
peerDB notifications or other locks of driver data structures. So this
solution should be a complete fix for any regressions introduced in
b64997ea.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Multiple simultaneous updates here would leave the driver in a very
inconsistent state. The disadvantage to this change is that it requires
holding the driver lock while reprogramming the keys.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
This one is probably not critical. The worst that seems like could
happen would be if 2 deletes occur at the same time (one of which
should be an error):
1. network gets read from the map by delete-1
2. network gets read from the map by delete-2
3. delete-1 releases the network VNI
4. network create arrives at the driver and allocates the now free VNI
5. delete-2 releases the network VNI (error: it's been reallocated!)
6. both networks remove the VNI from the map
Part 6 could also become an issue if there were a simultaneous create
for the network at the same time. This leads to the modification of
the NewNetwork() method which now checks for an existing network before
adding it to the map.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
The overlay network driver is not properly using it's mutexes or
sync.Onces. It made the classic mistake of not holding a lock through
various read-modify-write operations. This can result in inconsistent
state storage leading to more catastrophic issues.
This patch attempts to maintain the previous semantics while holding the
driver lock through operations that are read-modify-write of the
driver's network state.
One example of this race would be if two goroutines tried to invoke
d.network() after the network ID was removed from the table. Both would
try to reinstall it causing the "once" to get reinitialized twice
without any lock protection. This could then lead to the "once" getting
invoked twice on the same network. Furthermore, the changes to one of
these network structures gets effectively discarded. It's also the
case, that because there would be two simultaneous instances of the
network, the various network Lock() invocations would be meaningless for
race prevention.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
The netlink deserialize is fetching information from the link.
This require the go routine to be in the correct namespace to
succeed
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Swarm mode does not really have anymore a use for the watchMiss.
Peer entries are configured at configuration time.
If the gcthresh denies the insertion the peerAdd will fail.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In case the file descriptor of the netlink socket is closed
the recvfrom is not returning. This may create deadlock conditions.
The current solution is to make sure that all the netlink socket used
have a proper timeout set on them to have the possibility to return
Added test to emulate the watchMiss condition
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Avoid error logs in case of local peer case, there is no need for deleteNeighbor
Avoid the network leave to readvertise already deleted entries to upper layer
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In case of IP reuse locally there was a race condition
that was leaving the overlay namespace with wrong configuration
causing connectivity issues.
This commit introduces the use of setMatrix to handle the transient
state and make sure that the proper configuration is maintained
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>