Commit graph

218 commits

Author SHA1 Message Date
Martin Braun
5edfd6d081 bump netlink library
bump netlink to 1.2.1
change usages of netlink handle .Delete() to Close()
remove superfluous replace in vendor.mod
make requires of github.com/Azure/go-ansiterm direct

Signed-off-by: Martin Braun <braun@neuroforge.de>
2022-06-16 22:25:33 +02:00
Martin Dojcak
feab0cca9f libnetwork/overlay:fix join sandbox deadlock
Operations performed on overlay network sandboxes are handled by
dispatching operations sent through a channel. This allows for
asynchronous operations to be performed which, since they are
not called from within another function, are able to operate in
an idempotent manner with a known/measurable starting state from
which an identical series of iterative actions can be performed.

However, it was possible in some cases for an operation dispatched
from this channel to write a message back to the channel in the
case of joining a network when a sufficient volume of sandboxes
were operated on.

A goroutine which is simultaneously reading and writing to an
unbuffered channel can deadlock if it sends a message to a channel
then waits for it to be consumed and completed, since the only
available goroutine is more or less "talking to itself". In order
to break this deadlock, in the observed race, a goroutine is now
created to send the message to the channel.
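
A minimal sketch of the pattern described above, not the actual overlay
driver code: a single dispatcher goroutine drains an unbuffered channel,
and an operation that needs to enqueue a follow-up does so from a new
goroutine instead of sending directly. The names op, runDispatcher and
joinAll are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// op is a hypothetical unit of work handled by the dispatcher.
type op func()

// runDispatcher processes ops one at a time on a single goroutine and
// returns a channel that is closed once the ops channel has been drained.
func runDispatcher(ops chan op) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for f := range ops {
			f()
		}
	}()
	return done
}

// joinAll dispatches n operations, each of which enqueues one follow-up
// operation on the same channel. It returns how many follow-ups ran.
func joinAll(n int) int {
	ops := make(chan op) // unbuffered: a send blocks until it is received
	done := runDispatcher(ops)

	var wg sync.WaitGroup
	followUps := 0 // only touched by the dispatcher goroutine

	for i := 0; i < n; i++ {
		wg.Add(1)
		ops <- func() {
			// A direct "ops <- ..." here would block forever, because the
			// only receiver is the goroutine currently running this op.
			// Sending from a new goroutine lets the dispatcher return to
			// its receive loop.
			go func() {
				ops <- func() {
					followUps++
					wg.Done()
				}
			}()
		}
	}

	wg.Wait()  // every follow-up has been received and executed
	close(ops) // no senders remain, so closing is safe
	<-done
	return followUps
}

func main() {
	fmt.Println("follow-ups executed:", joinAll(100))
}
```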

Signed-off-by: Martin Dojcak <martin.dojcak@lablabs.io>
Signed-off-by: Ryan Barry <rbarry@mirantis.com>
2022-03-22 11:15:14 -04:00
Sebastiaan van Stijn
25594c33b9 libnetwork: replace consul with boltdb in test
Based on randomLocalStore() in libnetwork/ipam/allocator_test.go

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-06 18:45:07 +01:00
Eng Zer Jun
c55a4ac779 refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated in Go 1.16. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.
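
For illustration, a short sketch of the most common replacements, with
the former io/ioutil name noted next to each call. The helper names
roundTrip and readAll are hypothetical and do not come from the commit.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes data to a file in a fresh temporary directory and
// reads it back, using only the Go 1.16+ replacements.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "ioutil-example-") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "data.txt")
	if err := os.WriteFile(path, []byte(data), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}
	b, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// readAll drains a reader into a string.
func readAll(r io.Reader) (string, error) {
	b, err := io.ReadAll(r) // was ioutil.ReadAll
	return string(b), err
}

func main() {
	s, err := roundTrip("hello")
	fmt.Println(s, err)

	t, err := readAll(strings.NewReader("from a reader"))
	fmt.Println(t, err)

	// was ioutil.Discard
	n, err := io.Copy(io.Discard, strings.NewReader("ignored"))
	fmt.Println(n, err)
}
```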

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-08-27 14:56:57 +08:00
Sebastiaan van Stijn
686be57d0a Update to Go 1.17.0, and gofmt with Go 1.17
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-24 23:33:27 +02:00
Sebastiaan van Stijn
427ad30c05 libnetwork: remove unused "testutils" imports
Perhaps the testutils package once had an `init()` function to set up
specific things, but it no longer does, so these imports were doing nothing.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-18 14:20:37 +02:00
Brian Goff
116f200737 Fix gosec complaints in libnetwork
These were purposefully ignored before but this goes ahead and "fixes"
most of them.
Note that none of the things gosec flagged are problematic, just
quieting the linter here.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-25 18:02:03 +02:00
Brian Goff
4b981436fe Fixup libnetwork lint errors
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 23:48:32 +00:00
Brian Goff
00b2c13a1b Fix some windows issues in libnetwork tests
Fix build constraints for linux-only network drivers

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 23:48:23 +00:00
Brian Goff
a0a473125b Fix libnetwork imports
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 21:51:23 +00:00
Arko Dasgupta
dc6cbb55b4 Merge pull request #2572 from bboehmke/ipv6_nat
Enable IPv6 NAT (rebase of #2023)
2020-10-29 14:13:58 -07:00
Sebastiaan van Stijn
3e1e9e878c vendor: gotest.tools v3.0.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-09-12 03:22:18 +02:00
Benjamin Böhmke
34f4706174 added TODOs for open IPv6 point
Signed-off-by: Benjamin Böhmke <benjamin@boehmke.net>
2020-07-23 16:52:40 +02:00
Billy Ridgway
8dbb5b5a7d Implement NAT IPv6 to fix the issue https://github.com/moby/moby/issues/25407
Signed-off-by: Billy Ridgway <wrridgwa@us.ibm.com>
Signed-off-by: Benjamin Böhmke <benjamin@boehmke.net>
2020-07-19 16:16:51 +02:00
Sebastiaan van Stijn
d846c2b1ab vendor: update vishvananda/netlink v1.1.0
full diff: https://github.com/vishvananda/netlink/compare/v1.0.0...v1.1.0

also updated moby/ipvs, which is compatible with this version of netlink,
and update vishvananda/netns to current master (which added go.mod)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-03-12 18:25:54 +01:00
Arko Dasgupta
cd864b50a2 Fix panic in drivers/overlay/encryption.go
Issue - "index out of range" panic in drivers/overlay/encryption.go:539
caused by a mismatch in indices between curKeys and spis, which arises
when updateKeys bails out on an error without updating the spis

Fix - Reconfigure keys when there is a key update failure

Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
2019-10-31 12:59:41 -07:00
Euan Harris
0275b007c6 vet: Fix composite literal uses unkeyed fields warnings
Signed-off-by: Euan Harris <euan.harris@docker.com>
2019-06-26 16:50:56 +01:00
Sebastiaan van Stijn
d152888722 Bump vishvananda/netlink to 1.0.0
Changes included:

- Allow index specification at link creation time
- replace syscall with golang.org/x/sys/unix
  - related: Use IFF_MULTI_QUEUE from x/sys/unix to define TUNTAP_MULTI_QUEUE
  - related: Use IFLA_* constants from x/sys/unix
- Fix index out of range when no metadata for gretap
- added encapsulation attributes for Iptun and Sittun to support SIT tunnels
- Expose xfrm state's statistics
- Support invert in ip rules
- Support LWTUNNEL_ENCAP_SEG6
- Support setting and retrieving route MTU/AdvMSS
- Fix CalcRtable array parameter bug
- added support for Foo-over-UDP netlink calls
- Support num{tx,rx}queues and udp6zerocsum{tx,rx}
- tuntap: Add multiqueue support
- Retrieve VLAN ID when listing neighbour
- Fix LinkAdd for sit tunnel on 3.10 kernel
- Add support for managing source MACVLANs
- Two functions: one for adding bond slave, one for getting veth peer index
- Eliminate cgo from netlink
- Don't overwrite the XDP file descriptor with flags
- Fix reference to BPF instructions (on Kernel 4.13)
- Add Matchall filter
- Send IFA_CACHEINFO when setting up addresses
- Support IPv6 GRE Tun and Tap
- Add List option to RouteSubscribeWithOptions, AddrSubscribeWithOptions, and LinkSubscribeWithOptions
- Add Fq and Fq_Codel Qdisc support

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-06-24 14:56:49 +02:00
Sebastiaan van Stijn
d7f397c236 Touch-up error-message and godoc for ConfigVXLANUDPPort
Minor changes following review of the engine pull request
for this feature;

- Remove the name of the function from the error message
  as it's not a debug message.
- Add the valid range to the error message, so that a
  user has sufficient information to address the problem.
- Update GoDoc for the function to describe the default
  port, and valid port-ranges.
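
A hedged sketch of what such an error message could look like. Only the
default port 4789 comes from this log; the bounds 1024 and 49151, the
convention that 0 selects the default, and the name resolvePort are
assumptions made for the example, not the function's confirmed behavior.

```go
package main

import "fmt"

const (
	defaultPort = 4789  // default VXLAN UDP port
	minPort     = 1024  // assumed lower bound, for illustration only
	maxPort     = 49151 // assumed upper bound, for illustration only
)

// resolvePort returns the port to use for VXLAN traffic. In this sketch a
// requested value of 0 selects the default (an assumption).
func resolvePort(requested uint32) (uint32, error) {
	if requested == 0 {
		return defaultPort, nil
	}
	if requested < minPort || requested > maxPort {
		// The message states the valid range and the rejected value, and
		// deliberately omits the function name.
		return 0, fmt.Errorf("VXLAN UDP port number is not in valid range (%d-%d): %d",
			minPort, maxPort, requested)
	}
	return requested, nil
}

func main() {
	for _, p := range []uint32{0, 4790, 80} {
		port, err := resolvePort(p)
		fmt.Println(port, err)
	}
}
```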

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-01-23 10:56:40 +01:00
Sebastiaan van Stijn
38c8a3f84d Use sync.RWMutex for VXLANUDPPort
Looks like concurrent reads should be possible, so use
a RWMutex instead of Mutex.
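
As a rough illustration of the idea (the type and method names here are
hypothetical, not the driver's own): readers take the shared read lock
and can proceed concurrently, while the rare writer takes the exclusive
lock.

```go
package main

import (
	"fmt"
	"sync"
)

const defaultVXLANUDPPort = 4789

// portConfig guards a port number that is read often and written rarely.
type portConfig struct {
	mu   sync.RWMutex
	port uint32
}

func newPortConfig() *portConfig {
	return &portConfig{port: defaultVXLANUDPPort}
}

// get takes the read lock, so any number of readers may run concurrently.
func (c *portConfig) get() uint32 {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.port
}

// set takes the write lock, which excludes both readers and other writers.
func (c *portConfig) set(p uint32) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.port = p
}

func main() {
	c := newPortConfig()

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = c.get() // concurrent reads do not block each other
		}()
	}
	wg.Wait()

	c.set(5000)
	fmt.Println(c.get())
}
```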

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-11-22 21:29:20 +01:00
selansen
56ca280b27 VXLAN port configuration - late review comments update
Some review comments came in very late after merging
#2282. This PR addresses those review comments.

Signed-off-by: selansen <elango.siva@docker.com>
2018-11-14 13:26:56 -05:00
selansen
077ccabc45 VXLAN UDP Port configuration support
These changes allow the user to configure the VXLAN UDP
port number. By default, port 4789 is used, but this commit
allows the user to configure the port number during swarm init.
The VXLAN port can't be modified after swarm init.

Signed-off-by: selansen <elango.siva@docker.com>
2018-11-01 15:20:30 -04:00
Flavio Crisciani
204ce3e31d Create internal directory
Internal directory is designed to contain libraries
that are exclusively used by this project

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2018-07-16 17:34:20 -07:00
Josh Soref
a06f1b2c4e Spelling fixes
* addresses
* assigned
* at least
* attachments
* auxiliary
* available
* cleanup
* communicate
* communications
* configuration
* connection
* connectivity
* destination
* encountered
* endpoint
* example
* existing
* expansion
* expected
* external
* forwarded
* gateway
* implementations
* implemented
* initialize
* internally
* loses
* message
* network
* occurred
* operational
* origin
* overlapping
* reaper
* redirector
* release
* representation
* resolver
* retrieve
* returns
* sanbdox
* sequence
* succesful
* synchronizing
* update
* validates

Signed-off-by: Josh Soref <jsoref@gmail.com>
2018-07-12 12:54:44 -07:00
Flavio Crisciani
0f593ae92b Merge pull request #2146 from ctelfer/fix-overlay-vxlan-races
Fix overlay vxlan races
2018-07-11 10:41:46 -07:00
Chris Telfer
4e6580c4c1 Refactor locking for join/leave to avoid race
Instead of using "sync.Once" to determine whether to initialize a
network sandbox or subnet sandbox, we use a traditional mutex +
initialization boolean.  This is because the initialization state isn't
truly a once-and-done condition.  Rather, libnetwork destroys network
and subnet sandboxes when the last endpoint leaves them.  The use of
sync.Once in this kind of scenario requires, therefore, re-initializing
the Once which is impossible.  So the approach that libnetwork
currently takes is to use a pointer to a Once and redirect that pointer
to a new Once on reset.  This leads to nasty race conditions.
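
A minimal sketch of the mutex plus initialization boolean approach,
assuming a simplified sandbox that only counts endpoints. The type,
its fields and the inits counter are hypothetical and exist only to make
the re-initialization visible.

```go
package main

import (
	"fmt"
	"sync"
)

// sandbox is initialized on first join and torn down when the last
// endpoint leaves, so initialization can legitimately happen many times.
type sandbox struct {
	mu          sync.Mutex
	initialized bool
	endpoints   int
	inits       int // number of times initialization ran (for the demo)
}

// join initializes the sandbox if needed and registers one endpoint.
// The check and the initialization happen under the same lock.
func (s *sandbox) join() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.initialized {
		s.inits++ // stand-in for the real setup work
		s.initialized = true
	}
	s.endpoints++
}

// leave removes one endpoint and resets the sandbox when none remain.
// Resetting is just clearing a flag; no sync.Once has to be replaced.
func (s *sandbox) leave() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.endpoints == 0 {
		return
	}
	s.endpoints--
	if s.endpoints == 0 {
		s.initialized = false
	}
}

func main() {
	s := &sandbox{}
	s.join()
	s.join()
	s.leave()
	s.leave() // last endpoint gone: sandbox is reset
	s.join()  // initialization runs again
	fmt.Println("initializations:", s.inits)
}
```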

In addition to refactoring the locking, this patch merges the functions
joinSandbox() and joinSubnetSandbox(). This makes the code cleaner and
also holds the network and subnet locks through the series of
read-modify-writes, avoiding further potential races.  This does reduce
the potential parallelism which could be applied should there be many
joins coming in on many different subnets in the same overlay network.
However, this should be an extremely minor performance hit for a very
obscure case.

One important pattern in this commit is that it is crucial to avoid
sending peerDB messages while holding a driver or network lock.  The
changes herein defer such (asynchronous) notifications until after
release of such locks.  This prevents deadlocks where the peerDB
blocks acquiring said locks while the network method blocks trying
to send to the peerDB's channel.

Signed-off-by: Chris Telfer <ctelfer@docker.com>
2018-07-10 12:13:39 -04:00
Santhosh Manohar
5fdfa8c52c Cleanup interfaces properly when vxlan plumbing fails
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
Signed-off-by: Chris Telfer <ctelfer@docker.com>
2018-07-10 10:33:46 -04:00
Vincent Demeester
06d471d186 Migrate to gotest.tools :)
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
2018-07-06 11:01:37 -07:00
Chris Telfer
06922d2d81 Use fmt precision to limit string length
The previous code used string slices to limit the length of certain
fields like endpoint or sandbox IDs.  This assumes that these strings
are at least as long as the slice length.  Unfortunately, some sandbox
IDs can be smaller than 7 characters.   This fix addresses this issue
by systematically converting format string calls that were taking
fixed-slice arguments to use a precision specifier in the string format
itself.  From the golang fmt package documentation:

    For strings, byte slices and byte arrays, however, precision limits
    the length of the input to be formatted (not the size of the output),
    truncating if necessary. Normally it is measured in runes, but for
    these types when formatted with the %x or %X format it is measured
    in bytes.

This nicely fits the desired behavior: it will limit the number of
runes considered for string interpolation to the precision value.
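
A small example of the difference, using a hypothetical helper named
shortID. Slicing with id[:7] would panic for an ID shorter than 7 bytes,
whereas the precision specifier simply uses what is there.

```go
package main

import "fmt"

// shortID returns at most the first 7 runes of id. Unlike id[:7], it does
// not panic when id is shorter than 7 characters.
func shortID(id string) string {
	return fmt.Sprintf("%.7s", id)
}

func main() {
	fmt.Println(shortID("0123456789abcdef")) // long ID is truncated
	fmt.Println(shortID("abc"))              // short ID passes through
	fmt.Println(shortID(""))                 // empty ID is fine too
}
```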

Signed-off-by: Chris Telfer <ctelfer@docker.com>
2018-07-05 17:44:04 -04:00
Flavio Crisciani
7fc1795cdf Allow setting generic knobs on the Sandbox
Refactor the ostweaks file to allow easier reuse
Add a method on the osl.Sandbox interface to allow setting
knobs on the sandbox

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2018-06-28 16:14:08 -07:00
Euan Harris
96c7cba64c networkdb, drivers: Regenerate protocol buffers
agent.pb.go is unchanged, but the files in networkdb and drivers
are slightly different when regenerated using the current versions
of protoc and gogoproto.    This is probably because agent.pb.go
was last regenerated quite recently, in February 2018, whereas
networkdb.pb.go and overlay/overlay.pb.go were last changed in 2017,
and windows/overlay/overlay.pb.go was last changed in 2016.

Signed-off-by: Euan Harris <euan.harris@docker.com>
2018-06-22 15:03:12 +01:00
Chris Telfer
f0c86fb56e Fix deadlock introduced in b64997ea
Commit b64997ea prevented data corruption due to simultaneous
driver.CreateNetwork()/driver.DeleteNetwork() by holding the network
lock through the read/modify part of the operation.  However, part of
the DeleteNetwork operation entails sending a message to the peerDB to
tell that goroutine to flush entries on deletion.  This can lead to a
deadlock where:
  * driver.DeleteNetwork() starts and acquires driver.Lock()
  * peerDB receives some other request (e.g. EventNotify) and blocks
    on driver.Lock()
  * driver.DeleteNetwork() attempts a peerDB flush and blocks waiting
    on the synchronous peerDB operation channel

This patch fixes the issue by deferring the peerDB flush operation until
after DeleteNetwork() unlocks driver.Lock().   Commit b64997ea only
modified CreateNetwork() and DeleteNetwork() and the critical section
that driver.Lock() protects in CreateNetwork() does not perform any
peerDB notifications or other locks of driver data structures.  So this
solution should be a complete fix for any regressions introduced in
b64997ea.
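
The shape of the fix can be sketched as follows, under simplifying
assumptions: the driver type, its fields and the peerOps channel are
hypothetical stand-ins, and the "flush" only increments a counter. The
point is that the lock is released before the synchronous send.

```go
package main

import (
	"fmt"
	"sync"
)

// driver is a hypothetical stand-in for the overlay driver.
type driver struct {
	mu       sync.Mutex
	networks map[string]bool
	flushed  int
	peerOps  chan func() // unbuffered: a send blocks until it is received
}

// newDriver starts a single goroutine that plays the role of the peerDB.
func newDriver() *driver {
	d := &driver{networks: map[string]bool{}, peerOps: make(chan func())}
	go func() {
		for op := range d.peerOps {
			op()
		}
	}()
	return d
}

func (d *driver) addNetwork(id string) {
	d.mu.Lock()
	d.networks[id] = true
	d.mu.Unlock()
}

// deleteNetwork removes the network under the lock, then asks the peerDB
// goroutine to flush only after the lock has been released.
func (d *driver) deleteNetwork(id string) bool {
	d.mu.Lock()
	found := d.networks[id]
	delete(d.networks, id)
	d.mu.Unlock() // release before talking to the peerDB goroutine

	if !found {
		return false
	}

	done := make(chan struct{})
	d.peerOps <- func() {
		// peerDB operations may need the driver lock. If deleteNetwork
		// still held it while waiting on done, this would never proceed.
		d.mu.Lock()
		d.flushed++
		d.mu.Unlock()
		close(done)
	}
	<-done // wait for the flush to complete
	return true
}

func main() {
	d := newDriver()
	d.addNetwork("n1")
	fmt.Println(d.deleteNetwork("n1"), d.deleteNetwork("n1"))
}
```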

Signed-off-by: Chris Telfer <ctelfer@docker.com>
2018-06-08 14:17:51 -04:00
Chris Telfer
c97bb41620 Remove race in encrypted overlay key update
Multiple simultaneous updates here would leave the driver in a very
inconsistent state.  The disadvantage to this change is that it requires
holding the driver lock while reprogramming the keys.

Signed-off-by: Chris Telfer <ctelfer@docker.com>
2018-05-01 17:41:47 -04:00
Chris Telfer
40b55d2336 Remove race condition from ovnmanager
This one is probably not critical.  The worst that seems likely to
happen is two deletes occurring at the same time (one of which
should be an error):
  1. network gets read from the map by delete-1
  2. network gets read from the map by delete-2
  3. delete-1 releases the network VNI
  4. network create arrives at the driver and allocates the now free VNI
  5. delete-2 releases the network VNI (error: it's been reallocated!)
  6. both networks remove the VNI from the map

Part 6 could also become an issue if there were a simultaneous create
for the network at the same time.  This leads to the modification of
the NewNetwork() method which now checks for an existing network before
adding it to the map.

Signed-off-by: Chris Telfer <ctelfer@docker.com>
2018-05-01 17:41:42 -04:00
Chris Telfer
b64997ea82 Fix race conditions in overlay network driver
The overlay network driver is not properly using its mutexes or
sync.Onces.  It made the classic mistake of not holding a lock through
various read-modify-write operations.  This can result in inconsistent
state storage leading to more catastrophic issues.

This patch attempts to maintain the previous semantics while holding the
driver lock through operations that are read-modify-write of the
driver's network state.

One example of this race would be if two goroutines tried to invoke
d.network() after the network ID was removed from the table.  Both would
try to reinstall it causing the "once" to get reinitialized twice
without any lock protection.  This could then lead to the "once" getting
invoked twice on the same network.  Furthermore, the changes to one of
these network structures get effectively discarded.  Also, because
there would be two simultaneous instances of the
network, the various network Lock() invocations would be meaningless for
race prevention.
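
To make the read-modify-write point concrete, here is a simplified
sketch rather than the driver's real code: the lookup and the insert
happen under one lock acquisition, so concurrent callers always end up
sharing a single instance. All names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

type network struct {
	id string
}

// driver is a hypothetical stand-in for the overlay driver's state.
type driver struct {
	mu       sync.Mutex
	networks map[string]*network
	created  int // how many network objects were constructed
}

// network returns the entry for id, creating it if it is missing. The
// lock is held across the read and the write, so two callers can never
// both observe "missing" and install separate instances.
func (d *driver) network(id string) *network {
	d.mu.Lock()
	defer d.mu.Unlock()
	if n, ok := d.networks[id]; ok {
		return n
	}
	n := &network{id: id}
	d.networks[id] = n
	d.created++
	return n
}

// concurrentLookups runs n simultaneous lookups of the same ID and
// returns how many network objects were created as a result.
func concurrentLookups(n int) int {
	d := &driver{networks: map[string]*network{}}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			d.network("net-1")
		}()
	}
	wg.Wait()
	return d.created
}

func main() {
	fmt.Println("networks created:", concurrentLookups(100))
}
```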

Signed-off-by: Chris Telfer <ctelfer@docker.com>
2018-05-01 17:17:27 -04:00
Jim Carroll
bab08251c0 Allow for larger preset property values, do not override
Signed-off-by: Jim Carroll <jim.carroll@docker.com>
2018-04-11 13:09:02 -05:00
ZhiPeng Lu
9ba57c93b8 Add warning message for the failure of deleting link device
Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
2018-03-06 16:37:45 +08:00
ZhiPeng Lu
83d1ce9fb5 fix for #1333, calling LinkDel to delete link device when the err of LinkByName is NULL
Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
2018-02-28 16:57:39 +08:00
Flavio Crisciani
e975f3caa0 Fix watchMiss thread context
The netlink deserialization fetches information from the link.
This requires the goroutine to be in the correct namespace to
succeed.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2018-01-10 13:50:49 -08:00
Flavio Crisciani
dd47466a4d Remove watchMiss for swarm mode
Swarm mode no longer really has a use for watchMiss.
Peer entries are configured at configuration time.
If the gcthresh denies the insertion, the peerAdd will fail.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2018-01-05 14:51:43 -08:00
Flavio Crisciani
6736b223ec Set socket timeout on netlink sockets
If the file descriptor of the netlink socket is closed,
recvfrom does not return. This may create deadlock conditions.
The current solution is to make sure that all netlink sockets in use
have a proper timeout set on them so that the call can return.

Added test to emulate the watchMiss condition

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-12-04 09:40:27 -08:00
Sebastiaan van Stijn
01688ba253 Fix typo in overlay log message
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-10-07 01:55:30 +02:00
Flavio Crisciani
1fe48e8608 Fix IPMask marshalling
Fix marshalling and add test

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-10-03 19:38:54 -07:00
Flavio Crisciani
1bb664f689 Merge pull request #1788 from abhi/ipam_alloc
Serializing bitseq alloc
2017-10-03 13:30:34 -07:00
Abhinandan Prativadi
3d44975995 Adding a unit case to verify rollover
Signed-off-by: Abhinandan Prativadi <abhi@docker.com>
2017-10-03 12:15:34 -07:00
Flavio Crisciani
ad577a25fe Changed ipMask to string
Avoid error logs in the local peer case, where there is no need for deleteNeighbor
Prevent network leave from re-advertising already deleted entries to the upper layer

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-10-02 17:29:18 -07:00
Flavio Crisciani
181115b350 Addressing code review comments
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-10-02 11:12:36 -07:00
Flavio Crisciani
2bad0fbedf log for miss notification
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-10-02 11:12:36 -07:00
Flavio Crisciani
3e7b6c9cb0 flush peerdb entries on network delete
peerDB was never being flushed on network delete,
leaving behind stale entries

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-10-02 11:12:35 -07:00
Flavio Crisciani
711d033757 Handle IP reuse in overlay
In case of IP reuse locally there was a race condition
that was leaving the overlay namespace with wrong configuration
causing connectivity issues.
This commit introduces the use of setMatrix to handle the transient
state and make sure that the proper configuration is maintained

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-10-02 11:12:33 -07:00