Commit graph

79 commits

Author SHA1 Message Date
Flavio Crisciani
d6440c9139 optimize the rebroadcast for failure case
Previously, when a node failed, all the nodes would bump the Lamport time of all their
entries. This meant that if a node flapped, there would be a storm of updates for all the entries.
This commit, building on the previous logic, guarantees that only the node that rejoins
readvertises its own entries; the other nodes do not need to advertise again.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-08-01 14:08:54 -07:00
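For context on the "Lamport time" mentioned in d6440c9139, here is a minimal sketch of a Lamport clock. The type and method names (`lamportClock`, `Increment`, `Witness`) are assumptions for illustration and are not taken from the commit. It shows only the general technique: a local change bumps the counter, and seeing a peer's time moves the counter past it.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// lamportClock is a hypothetical, illustrative logical clock.
type lamportClock struct {
	counter uint64
}

// Time returns the current logical time.
func (c *lamportClock) Time() uint64 {
	return atomic.LoadUint64(&c.counter)
}

// Increment bumps the clock for a local event and returns the new time.
func (c *lamportClock) Increment() uint64 {
	return atomic.AddUint64(&c.counter, 1)
}

// Witness moves the clock past a time observed from a peer, if needed.
func (c *lamportClock) Witness(seen uint64) {
	for {
		cur := atomic.LoadUint64(&c.counter)
		if seen < cur {
			return // already ahead of the observed time
		}
		if atomic.CompareAndSwapUint64(&c.counter, cur, seen+1) {
			return
		}
	}
}

func main() {
	var c lamportClock
	c.Increment()  // local update
	c.Witness(10)  // a peer advertised time 10
	fmt.Println(c.Time())
}
```

Under this scheme every bump produces a newer version of an entry that peers must process, which is why having all nodes bump all entries on each flap is costly.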
Flavio Crisciani
a3ecb8902a fix join/leave
join/leave fixes:
 - when a node leaves a network, it deletes all the other nodes' entries but keeps track of its
   own, so that other nodes that are still TCP syncing will learn that they were deleted (a node that
   has not yet received the network leave may still attempt a TCP sync)

add network reapTime, which was not being set locally

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-08-01 14:08:45 -07:00
Flavio Crisciani
e77c245e45 2x faster to converge
- Reintroduced the Invalidate
- Optimized the rebroadcast logic

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-08-01 13:47:18 -07:00
Flavio Crisciani
585964bf32 NetworkDB testing infra
- Diagnose framework that exposes REST API for db interaction
- Dockerfile to build the test image
- Periodic print of stats regarding queue size
- Client and server side for integration with testkit
- Added write-delete-leave-join
- Added test write-delete-wait-leave-join
- Added write-wait-leave-join

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-07-27 08:50:43 -07:00
Flavio Crisciani
60b5add4af NetworkDB allow setting PacketSize
- Introduce the possibility to specify the max buffer length
  in NetworkDB. This allows using the whole MTU of
  the interface

- Add queue stats per network; they can be handy to identify the
  node's throughput per network and to spot an imbalance between
  nodes that can point to an MTU misconfiguration

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-07-26 13:44:33 -07:00
Flavio Crisciani
051a0d5ce9 NetworkDB incorrect number of entries in networkNodes
A rapid (within the networkReapTime of 30min) network leave/join
can corrupt the list of nodes per network with multiple copies
of the same node.
The fix makes sure that each node is present only once.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-07-18 16:57:49 -07:00
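The invariant described in 051a0d5ce9 (each node appears only once in a network's node list) can be sketched as below. The function name `addNetworkNode` and the use of a plain string slice are assumptions for illustration, not the code from the commit.

```go
package main

import "fmt"

// addNetworkNode appends nodeName to nodes only if it is not already
// present, so a rapid leave/join cannot create duplicate entries.
func addNetworkNode(nodes []string, nodeName string) []string {
	for _, n := range nodes {
		if n == nodeName {
			return nodes // already listed, nothing to do
		}
	}
	return append(nodes, nodeName)
}

func main() {
	nodes := []string{"node-a", "node-b"}
	nodes = addNetworkNode(nodes, "node-b") // rejoin within the reap window
	nodes = addNetworkNode(nodes, "node-c") // genuinely new node
	fmt.Println(nodes)
}
```

A linear scan is enough for a sketch; a set keyed by node name would serve the same purpose for large clusters.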
Sebastiaan van Stijn
3dd1fb1217 Make node join event logging less noisy
Commit ca9a768d80 added a number of debugging messages for node join/leave
events.

This patch checks if a node already was listed,
and otherwise skips the logging to make the logs a bit
less noisy.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-07-10 17:25:14 -07:00
Santhosh Manohar
6bd57f977d Fix go generate for protobuf
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-07-05 16:31:12 -07:00
Flavio Crisciani
39d2204896 Service discovery logic rework
Changed the ipMap to SetMatrix to allow transient states
Merged addSvc and deleteSvc into a single method
Updated the data structure for backends to allow storing all the information needed
to clean up properly during cleanupServiceBindings
Removed the enable/disable Service logic that was racing with the sbLeave/sbJoin logic
Added some debug logs to track further race conditions

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-06-11 20:49:29 -07:00
Madhu Venugopal
78a910ee17 Merge pull request #1787 from fcrisciani/goroutine_leak
Fix leak of handleTableEvents
2017-06-06 13:17:17 -07:00
Madhu Venugopal
59994bbb15 Merge pull request #1775 from sanimej/gossip
Handle single manager reload by having workers reconnect
2017-05-31 14:57:34 -07:00
Santhosh Manohar
ca9a768d80 Handle single manager reload by having workers reconnect
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-05-31 14:36:23 -07:00
Flavio Crisciani
6d768ef73c Fix leak of handleTableEvents
The channel ch.C is never closed.
Added a listener on ch.Done() to guarantee
that the goroutine exits once the event channel
is closed

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-31 11:04:19 -07:00
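The pattern described in 6d768ef73c, selecting on a done channel alongside an event channel that is never closed, can be sketched as follows. The `watch` type is a hypothetical stand-in for the object behind `ch.C` and `ch.Done()`; only those two names come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// watch is a hypothetical stand-in: C carries events and is never closed,
// while the channel returned by Done is closed when the watch is closed.
type watch struct {
	C    chan string
	done chan struct{}
	once sync.Once
}

func newWatch() *watch {
	return &watch{C: make(chan string), done: make(chan struct{})}
}

func (w *watch) Done() <-chan struct{} { return w.done }

func (w *watch) Close() { w.once.Do(func() { close(w.done) }) }

// handleEvents processes events until the watch is closed. Without the
// Done case, the loop would block on w.C forever and the goroutine would leak.
func handleEvents(w *watch, handle func(string)) {
	for {
		select {
		case ev := <-w.C:
			handle(ev)
		case <-w.Done():
			return
		}
	}
}

func main() {
	w := newWatch()
	finished := make(chan struct{})
	go func() {
		handleEvents(w, func(ev string) { fmt.Println("event:", ev) })
		close(finished)
	}()
	w.C <- "create"
	w.Close()
	<-finished
	fmt.Println("handler exited")
}
```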
Flavio Crisciani
f585f33042 Node failure timeout fix
The time to keep a failed node in the failed node list
was originally supposed to be 24h.

If a node leaves explicitly, it will be removed from the list of nodes
and put into the leftNodes list. This way the NotifyLeave event won't
insert it into the retry list.
NOTE: if the event is lost instead, the behavior will be the same as for a failed node.

If a node fails, the NotifyLeave will insert it into the failedNodes
list with a reapTime of 24h. This means that the node will be checked
for 24h before being completely forgotten. The current check runs every
1 second and is done by the reconnectNode function.
The failed node list is updated every 2h instead.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-22 17:19:31 -07:00
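The reaping behavior described in f585f33042 can be illustrated with a minimal sketch. The 24h and 2h values come from the commit message, but the names `failedNode`, `nodeReapInterval`, `nodeReapPeriod` and `reapFailedNodes`, as well as the countdown approach, are assumptions and may differ from the actual implementation.

```go
package main

import (
	"fmt"
	"time"
)

const (
	nodeReapInterval = 24 * time.Hour // how long a failed node is remembered
	nodeReapPeriod   = 2 * time.Hour  // how often the failed node list is updated
)

// failedNode is a hypothetical record for a node that failed.
type failedNode struct {
	name     string
	reapTime time.Duration // time left before the node is forgotten
}

// reapFailedNodes subtracts one period from every node's remaining time and
// removes the nodes whose time has run out.
func reapFailedNodes(nodes map[string]*failedNode, period time.Duration) {
	for name, n := range nodes {
		if n.reapTime > period {
			n.reapTime -= period
			continue
		}
		delete(nodes, name) // deleting during range is safe in Go
	}
}

func main() {
	failed := map[string]*failedNode{
		"node-a": {name: "node-a", reapTime: nodeReapInterval},
	}
	passes := 0
	for len(failed) > 0 {
		reapFailedNodes(failed, nodeReapPeriod)
		passes++
	}
	fmt.Println("passes until forgotten:", passes)
}
```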
Santhosh Manohar
06c3489bb8 retry once on a bulk sync failure
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-05-11 21:13:18 -07:00
Flavio Crisciani
da9ac65ea6 Remove explicit set of memberlist protocol
Memberlist does a full validation of the protocol version (min, current, max)
among all the nodes of the cluster.
The previous code was setting the protocol version to the max version,
which made upgrades incompatible.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-08 16:58:53 -07:00
Madhu Venugopal
1624c61ef2 Merge pull request #1727 from sanimej/cphard
control-plane hardening: Avoid nDB stale entries
2017-04-25 11:04:13 -07:00
Santhosh Manohar
1693144ae2 Merge pull request #1713 from aboch/nse
On clusterLeave, notify only if there are peers
2017-04-23 16:31:46 -07:00
Alessandro Boch
1323730eca On send node events, notify only if there are peers
- Otherwise the operation will unnecessarily block
  for five seconds.
- This is particularly noticeable on graceful
  shutdown of the daemon in a one-node cluster.

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-04-21 10:19:08 -07:00
Santhosh Manohar
102f9d230d Avoid nDB stale entries because of intermittent nw issues.
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-04-19 14:01:28 -07:00
Santhosh Manohar
69ad7ef244 control-plane hardening: cleanup local state on peer leaving a network
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-03-31 01:49:03 -07:00
Santhosh Manohar
539888412b Merge pull request #1689 from aboch/inv
Do not invalidate table event messages
2017-03-16 13:47:01 -07:00
Alessandro Boch
9c3c86a931 Do not invalidate table event messages
- Do not run the risk of suppressing meaningful messages
  for the rest of the cluster, as many services depend
  on them, like the service records and the distributed
  load balancers.

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-03-16 00:49:58 -07:00
Alessandro Boch
4b306ee83d Fix panic in networkdb test code
fatal error: concurrent map read and map write

goroutine 264 [running]:
runtime.throw(0x90043c, 0x21)
	/usr/local/go/src/runtime/panic.go:566 +0x95 fp=0xc4203d1d68 sp=0xc4203d1d48
runtime.mapaccess2_faststr(0x86df20, 0xc4203f5470, 0xc42044afc0, 0x5, 0xc4203d1e40, 0x4ed6b8)
	/usr/local/go/src/runtime/hashmap_fast.go:306 +0x52b fp=0xc4203d1dc8 sp=0xc4203d1d68
github.com/docker/libnetwork/networkdb.(*NetworkDB).verifyNodeExistence(0xc42007e160, 0xc42008a240, 0xc42044afc0, 0x5, 0x1)
	/go/src/github.com/docker/libnetwork/networkdb/networkdb_test.go:58 +0x6c fp=0xc4203d1e50 sp=0xc4203d1dc8

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-03-15 23:26:32 -07:00
Santhosh Manohar
bfab379411 swarm mode network inspect should provide cluster-wide task details
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-03-10 19:12:00 -08:00
Madhu Venugopal
bb560a1f44 Generating node discovery events to the drivers from networkdb
With the introduction of networkdb, the node discovery events were not
sent to the drivers. This commit generates the node discovery events and
sends them to the drivers interested in them.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2017-02-01 17:54:51 -08:00
Alessandro Boch
595246bdfb Merge pull request #1568 from likel/refactor
Remove unnecessary string formats
2016-12-29 12:18:06 -08:00
Santhosh Manohar
176088a742 Merge pull request #968 from aboch/ed6
Control IPv6 on container's interface
2016-12-22 18:15:15 -08:00
Santhosh Manohar
0c2b4b267c Check for node's presence in networkDB's node map before accessing.
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-12-05 00:58:59 -08:00
Madhu Venugopal
224a73d60b Merge pull request #1576 from daehyeok/misspell
Fixed misspelling
2016-12-02 16:02:23 -08:00
Aaron Lehmann
bb8b9a6040 networkdb: Properly format memberlist logs
Right now, items logged by memberlist end up as a complete log line
embedded inside another log line, like the following:

    Nov 22 16:34:16 hostname dockerd: time="2016-11-22T16:34:16.802103258-08:00" level=info msg="2016/11/22 16:34:16 [INFO] memberlist: Marking xyz-1d1ec2dfa053 as failed, suspect timeout reached\n"

This has two time and date stamps, and an escaped newline inside the
"msg" field of the outer log message.

To fix this, define a custom logger that only prints the message itself.
Capture this message in logWriter, strip off the log level (added
directly by memberlist), and route to the appropriate logrus method.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-12-01 19:08:07 -08:00
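The approach described in bb8b9a6040 can be sketched with the standard library alone. The `logWriter` name comes from the commit message; `splitLevel`, the `emit` callback (a stand-in for routing to logrus), and every prefix other than the `[INFO]` tag visible in the sample line are assumptions.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// splitLevel strips a leading "[LEVEL] " tag and the trailing newline, and
// returns the level together with the bare message.
func splitLevel(line string) (level, msg string) {
	line = strings.TrimSuffix(line, "\n")
	prefixes := []struct{ tag, level string }{
		{"[DEBUG] ", "debug"},
		{"[INFO] ", "info"},
		{"[WARN] ", "warn"},
		{"[ERR] ", "error"},
	}
	for _, p := range prefixes {
		if strings.HasPrefix(line, p.tag) {
			return p.level, strings.TrimPrefix(line, p.tag)
		}
	}
	return "info", line // no recognized tag: default to info
}

// logWriter captures each line written by a logger and hands it to emit.
type logWriter struct {
	emit func(level, msg string)
}

func (w *logWriter) Write(p []byte) (int, error) {
	level, msg := splitLevel(string(p))
	w.emit(level, msg)
	return len(p), nil
}

func main() {
	w := &logWriter{emit: func(level, msg string) {
		fmt.Printf("level=%s msg=%q\n", level, msg)
	}}
	// Flags set to 0: the logger prints only the message, with no timestamp.
	logger := log.New(w, "", 0)
	logger.Printf("[INFO] memberlist: Marking xyz-1d1ec2dfa053 as failed, suspect timeout reached")
}
```

Creating the logger with flags set to 0 is what removes the second timestamp, and trimming the trailing newline avoids the escaped `\n` in the outer message.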
Alessandro Boch
fac86cf69a Add missing locks in agent and service code
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-11-29 13:58:06 -08:00
Daehyeok Mun
f89d6b0073 Fixed misspelling
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
2016-11-28 11:46:52 -07:00
Alessandro Boch
f195563a4e Control IPv6 on container's interface
- Disable IPv6 on all interfaces by default at sandbox creation.
  Enable IPv6 on a per-interface basis if the interface has an IPv6
  address. In case the sandbox has an IPv6 interface, also enable
  IPv6 on the loopback interface.

Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-11-22 15:38:24 -08:00
Ke Li
23ac56fdd0 Remove unnecessary string formats
Signed-off-by: Ke Li <kel@splunk.com>
2016-11-22 09:29:53 +08:00
Santhosh Manohar
27500b1e35 Separate service LB & SD from network plumbing
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-11-17 13:09:14 -08:00
Victor Vieux
236dc57a9e fix unsafe access on arm
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-11-10 23:05:11 -08:00
Santhosh Manohar
31dd4362a8 Merge pull request #1542 from allencloud/change-reapNode-interval
update reapNode interval
2016-11-08 11:14:23 -08:00
allencloud
0b4f68390d remove unused mConfig
Signed-off-by: allencloud <allen.sun@daocloud.io>
2016-11-08 18:18:55 +08:00
allencloud
99f84ff5a7 update reapNode interval
Signed-off-by: allencloud <allen.sun@daocloud.io>
2016-11-08 15:28:42 +08:00
Alessandro Boch
c5ca82daf4 Merge pull request #1519 from sanimej/newlb
Add sandbox API for task insertion to service LB and service discovery
2016-11-03 13:31:46 -07:00
Santhosh Manohar
c52c8ca6eb Add NetworkDB API to fetch the per network peer (gossip cluster) list
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-11-02 13:58:15 -07:00
Santhosh Manohar
a7e1718800 Add sandbox API for task insertion to service LB and service discovery
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-10-25 05:41:44 -07:00
Santhosh Manohar
e98b152bac Reap failed nodes after 24 hours
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-10-20 11:24:04 -07:00
Alessandro Boch
6b74a8d479 Merge pull request #1476 from sanimej/time
Use monotonic clock source to reap networkDB entries
2016-10-20 07:30:41 -07:00
Santhosh Manohar
0a2537eea3 Use monotonic clock for reaping networkDB entries
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-10-19 22:30:47 -07:00
Alexander Morozov
c772d14e58 networkdb: fix race in deleteNetwork
There are multiple places that read from that slice (e.g. bulkSync).

Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
2016-10-12 08:42:05 -07:00
Alexander Morozov
03088ace1b networkdb: fix race in access to nodes len
Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
2016-10-04 12:19:25 -07:00
Jana Radhakrishnan
f649d5ae61 Do not hold ack channel in ack table after closing
Once the bulksync ack channel is closed, remove it from the ack table
right away. There is no reason to keep it in the ack table and later
delete it in the ack waiter. The ack waiter already holds a reference to the
channel on which it is waiting.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-10-03 09:50:02 -07:00
Jana Radhakrishnan
22c322dded Avoid returning early on agent join failures
When a gossip join failure happens, do not return early in the call chain,
because a join failure is most likely transient and the retry logic
built into networkdb is going to retry and succeed. Returning early
prevents the initialization of the ingress network/sandbox, which
causes a problem even after the gossip join succeeds on retry.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-27 08:36:10 -07:00