0ct0pu5/moby

Author	SHA1	Message	Date
Sebastiaan van Stijn	cd381aea56	libnetwork: fix empty-lines (revive) libnetwork/etchosts/etchosts_test.go:167:54: empty-lines: extra empty line at the end of a block (revive) libnetwork/osl/route_linux.go:185:74: empty-lines: extra empty line at the start of a block (revive) libnetwork/osl/sandbox_linux_test.go:323:36: empty-lines: extra empty line at the start of a block (revive) libnetwork/bitseq/sequence.go:412:48: empty-lines: extra empty line at the start of a block (revive) libnetwork/datastore/datastore_test.go:67:46: empty-lines: extra empty line at the end of a block (revive) libnetwork/datastore/mock_store.go:34:60: empty-lines: extra empty line at the end of a block (revive) libnetwork/iptables/firewalld.go:202:44: empty-lines: extra empty line at the end of a block (revive) libnetwork/iptables/firewalld_test.go:76:36: empty-lines: extra empty line at the end of a block (revive) libnetwork/iptables/iptables.go:256:67: empty-lines: extra empty line at the end of a block (revive) libnetwork/iptables/iptables.go:303:128: empty-lines: extra empty line at the start of a block (revive) libnetwork/networkdb/cluster.go:183:72: empty-lines: extra empty line at the end of a block (revive) libnetwork/ipams/null/null_test.go:44:38: empty-lines: extra empty line at the end of a block (revive) libnetwork/drivers/macvlan/macvlan_store.go:45:52: empty-lines: extra empty line at the end of a block (revive) libnetwork/ipam/allocator_test.go:1058:39: empty-lines: extra empty line at the start of a block (revive) libnetwork/drivers/bridge/port_mapping.go:88:111: empty-lines: extra empty line at the end of a block (revive) libnetwork/drivers/bridge/link.go:26:90: empty-lines: extra empty line at the end of a block (revive) libnetwork/drivers/bridge/setup_ipv6_test.go:17:34: empty-lines: extra empty line at the end of a block (revive) libnetwork/drivers/bridge/setup_ip_tables.go:392:4: empty-lines: extra empty line at the start of a block (revive) 
libnetwork/drivers/bridge/bridge.go:804:50: empty-lines: extra empty line at the start of a block (revive) libnetwork/drivers/overlay/ov_serf.go:183:29: empty-lines: extra empty line at the start of a block (revive) libnetwork/drivers/overlay/ov_utils.go:81:64: empty-lines: extra empty line at the end of a block (revive) libnetwork/drivers/overlay/peerdb.go:172:67: empty-lines: extra empty line at the start of a block (revive) libnetwork/drivers/overlay/peerdb.go:209:67: empty-lines: extra empty line at the start of a block (revive) libnetwork/drivers/overlay/peerdb.go:344:89: empty-lines: extra empty line at the start of a block (revive) libnetwork/drivers/overlay/peerdb.go:436:63: empty-lines: extra empty line at the start of a block (revive) libnetwork/drivers/overlay/overlay.go:183:36: empty-lines: extra empty line at the start of a block (revive) libnetwork/drivers/overlay/encryption.go:69:28: empty-lines: extra empty line at the end of a block (revive) libnetwork/drivers/overlay/ov_network.go:563:81: empty-lines: extra empty line at the start of a block (revive) libnetwork/default_gateway.go:32:43: empty-lines: extra empty line at the start of a block (revive) libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the start of a block (revive) libnetwork/service_common.go:184:64: empty-lines: extra empty line at the end of a block (revive) libnetwork/endpoint.go:161:55: empty-lines: extra empty line at the end of a block (revive) libnetwork/store.go:320:33: empty-lines: extra empty line at the end of a block (revive) libnetwork/store_linux_test.go:11:38: empty-lines: extra empty line at the end of a block (revive) libnetwork/sandbox.go:571:36: empty-lines: extra empty line at the start of a block (revive) libnetwork/service_common.go:317:246: empty-lines: extra empty line at the start of a block (revive) libnetwork/endpoint.go:550:17: empty-lines: extra empty line at the end of a block (revive) libnetwork/sandbox_dns_unix.go:213:106: empty-lines: 
extra empty line at the start of a block (revive) libnetwork/controller.go:676:85: empty-lines: extra empty line at the end of a block (revive) libnetwork/agent.go:876:60: empty-lines: extra empty line at the end of a block (revive) libnetwork/resolver.go:324:69: empty-lines: extra empty line at the end of a block (revive) libnetwork/network.go:1153:92: empty-lines: extra empty line at the end of a block (revive) libnetwork/network.go:1955:67: empty-lines: extra empty line at the start of a block (revive) libnetwork/network.go:2235:9: empty-lines: extra empty line at the start of a block (revive) libnetwork/libnetwork_internal_test.go:336:26: empty-lines: extra empty line at the start of a block (revive) libnetwork/resolver_test.go:76:35: empty-lines: extra empty line at the end of a block (revive) libnetwork/libnetwork_test.go:303:38: empty-lines: extra empty line at the end of a block (revive) libnetwork/libnetwork_test.go:985:46: empty-lines: extra empty line at the end of a block (revive) libnetwork/ipam/allocator_test.go:1263:37: empty-lines: extra empty line at the start of a block (revive) libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the end of a block (revive) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-09-26 19:21:58 +02:00
Sebastiaan van Stijn	561a010161	linting: suppress false positive for G404 (gosec) The linter falsely detects this as using "math/rand": libnetwork/networkdb/cluster.go:721:14: G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec) val, err := rand.Int(rand.Reader, big.NewInt(int64(n))) ^ Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-09-04 15:36:49 +02:00
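For context on the G404 false positive above: the flagged call already uses `crypto/rand`, so only the linter needs quieting. The following is a minimal sketch, not the code from the commit. The helper name `randomOffset` is hypothetical, and the `//nolint:gosec` directive is an assumption about how such a suppression is commonly written with golangci-lint.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
	"math/big"
)

// randomOffset is a hypothetical helper that returns a uniformly
// distributed integer in [0, n) drawn from crypto/rand.
func randomOffset(n int) (int, error) {
	if n <= 0 {
		// rand.Int panics when max <= 0, so reject that up front.
		return 0, errors.New("n must be positive")
	}
	// Assumed suppression style: the call uses crypto/rand, not math/rand.
	val, err := rand.Int(rand.Reader, big.NewInt(int64(n))) //nolint:gosec // G404 false positive
	if err != nil {
		return 0, err
	}
	return int(val.Int64()), nil
}

func main() {
	v, err := randomOffset(10)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("random offset in [0, 10):", v)
}
```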
Sebastiaan van Stijn	b9c8eca468	libnetwork/networkdb: remove some redundant fmt.Sprintf()'s Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-02-15 12:56:23 +01:00
frobnicaty	d78b883576	Fix grammar for "does not exist" as opposed to "does not exists" Signed-off-by: frobnicaty <92033765+frobnicaty@users.noreply.github.com>	2021-12-03 15:50:13 +00:00
Roman Volosatovs	d7a2635537	libnetwork: make rejoin intervals configurable This allows the rejoin intervals to be chosen according to the context within which the component is used, and, in particular, this allows lower intervals to be used within TestNetworkDBIslands test. Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>	2021-07-12 19:25:49 +02:00
Brian Goff	116f200737	Fix gosec complaints in libnetwork These were purposefully ignored before but this goes ahead and "fixes" most of them. Note that none of the things gosec flagged are problematic, just quieting the linter here. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2021-06-25 18:02:03 +02:00
Brian Goff	4b981436fe	Fixup libnetwork lint errors Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2021-06-01 23:48:32 +00:00
Flavio Crisciani	2b1e45c682	Merge pull request #2238 from talex5/networkdb-docs Add NetworkDB docs	2019-03-14 16:05:31 -07:00
Flavio Crisciani	151f42aeaa	Fix possible nil pointer exception It is possible that the node is not yet present in the node list map. In this case just print a warning and return. The next iteration would be fine Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2019-01-22 17:07:15 -08:00
Thomas Leonard	05c05ea5e9	Add NetworkDB docs This is based on reading the code in the `networkdb` directory. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>	2018-08-08 13:35:11 +01:00
Josh Soref	a06f1b2c4e	Spelling fixes * addresses * assigned * at least * attachments * auxiliary * available * cleanup * communicate * communications * configuration * connection * connectivity * destination * encountered * endpoint * example * existing * expansion * expected * external * forwarded * gateway * implementations * implemented * initialize * internally * loses * message * network * occurred * operational * origin * overlapping * reaper * redirector * release * representation * resolver * retrieve * returns * sanbdox * sequence * succesful * synchronizing * update * validates Signed-off-by: Josh Soref <jsoref@gmail.com>	2018-07-12 12:54:44 -07:00
Flavio Crisciani	b0a0059237	Merge pull request #2216 from fcrisciani/netdb-qlen-issue NetworkDB qlen optimization	2018-07-05 15:02:58 -07:00
Chris Telfer	06922d2d81	Use fmt precision to limit string length The previous code used string slices to limit the length of certain fields like endpoint or sandbox IDs. This assumes that these strings are at least as long as the slice length. Unfortunately, some sandbox IDs can be smaller than 7 characters. This fix addresses this issue by systematically converting format string calls that were taking fixed-slice arguments to use a precision specifier in the string format itself. From the golang fmt package documentation: For strings, byte slices and byte arrays, however, precision limits the length of the input to be formatted (not the size of the output), truncating if necessary. Normally it is measured in runes, but for these types when formatted with the %x or %X format it is measured in bytes. This nicely fits the desired behavior: it will limit the number of runes considered for string interpolation to the precision value. Signed-off-by: Chris Telfer <ctelfer@docker.com>	2018-07-05 17:44:04 -04:00
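The difference described in the commit above can be sketched as follows. This is an illustration rather than the patched libnetwork code: the helper name `shortID` and the sample IDs are hypothetical, and the length 7 is taken from the commit message.

```go
package main

import "fmt"

// shortID is a hypothetical helper that formats at most the first
// 7 runes of id. Unlike id[:7], the precision specifier does not
// panic when id is shorter than 7 characters.
func shortID(id string) string {
	return fmt.Sprintf("%.7s", id)
}

func main() {
	fmt.Println(shortID("0123456789abcdef")) // prints "0123456"
	fmt.Println(shortID("abc"))              // prints "abc"

	// The slice-based form fails at runtime for the short ID:
	//   _ = "abc"[:7] // panic: slice bounds out of range
}
```

With the precision in the format string, call sites no longer need a length check before logging an ID.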
Flavio Crisciani	55e4cc7262	Optimize networkDB queue Added some optimizations to reduce the messages in the queue: 1) on join network the node execute a tcp sync with all the nodes that it is aware part of the specific network. During this time before the node was redistributing all the entries. This meant that if the network had 10K entries the queue of the joining node will jump to 10K. The fix adds a flag on the network that would avoid to insert any entry in the queue till the sync happens. Note that right now the flag is set in a best effort way, there is no real check if at least one of the nodes succeed. 2) limit the number of messages to redistribute coming from a TCP sync. Introduced a threshold that limit the number of messages that are propagated, this will disable this optimization in case of heavy load. Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2018-07-02 16:59:45 -07:00
Flavio Crisciani	500d9f4515	Adjust corner case for reconnect logic Previous logic was not accounting that each node is in the node list so the bootstrap nodes won't retry to reconnect because they will always find themselves in the node map Added test that validate the gossip island condition Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2018-06-21 18:04:55 -07:00
Dani Louca	96472cdaea	Adding a recovery mechanism for a split gossip cluster Signed-off-by: Dani Louca <dani.louca@docker.com>	2018-04-23 14:18:46 -04:00
Flavio Crisciani	f0fcb0bbe6	Fixed race on quick node fail/join The previous logic was not properly handling the case of a node that was failing and joining back in a short period of time. The issue was in the handling of the network messages. When a node joins it sync with other nodes, these are passing the whole list of nodes that at best of their knowledge are part of a network. At this point if the node receives that node A is part of the network it saves it before having received the notification that node A is actually alive (coming from memberlist). If node A failed the source node will receive the notification while the new joined node won't because memberlist never advertise node A as available. In this case the new node will never purge node A from its state but also worse, will accept any table notification where node A is the owner and so will end up in an out of sync state with the rest of the cluster. This commit contains also some code cleanup around the area of node management Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-11-27 14:38:06 -08:00
Flavio Crisciani	7fbaf6de2c	Add test to confirm garbage collection - Create a test to verify that a node that joins in an async way is not going to extend the life of a already deleted object Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-10-23 09:58:57 +02:00
Flavio Crisciani	8c31217a44	NetworkDB create NodeID for cluster nodes Separate the hostname from the node identifier. All the messages that are exchanged on the network are containing a nodeName field that today was hostname-uniqueid. Now being encoded as strings in the protobuf without any length restriction they play a role in the efficiency of the protocol itself. If the hostname is very long the overhead will increase and will degrade the performance of the database itself that each single cycle by default allows 1400 bytes payload Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-09-26 10:48:04 -07:00
Flavio Crisciani	a4e64d05c1	Avoid alignment of reapNetwork and tableEntries Make sure that the network is garbage collected after the entries. Entries to be deleted requires that the network is present. Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-09-22 10:57:47 -07:00
Flavio Crisciani	053a534ab1	Changed ReapTable logic - Changed the loop per network. Previous implementation was taking a ReadLock to update the reapTime but now with the residualReapTime also the bulkSync is using the same ReadLock creating possible issues in concurrent read and update of the value. The new logic fetches the list of networks and proceed to the cleanup network by network locking the database and releasing it after each network. This should ensure a fair locking avoiding to keep the database blocked for too much time. Note: The ticker does not guarantee that the reap logic runs precisely every reapTimePeriod, actually documentation says that if the routine is too long will skip ticks. In case of slowdown of the process itself it is possible that the lifetime of the deleted entries increases, it still should not be a huge problem because now the residual reaptime is propagated among all the nodes a slower node will let the deleted entry being repropagate multiple times but the state will still remain consistent. Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-09-21 09:37:47 -07:00
Flavio Crisciani	2d2a2bc568	Fix reapTime logic in NetworkDB - Added remainingReapTime field in the table event. Without it a node that did not have a state for the element was marking the element for deletion setting the max reapTime. This was creating the possibility to keep the entry being resync between nodes forever avoiding the purpose of the reap time itself. - On broadcast of the table event the node owner was rewritten with the local node name, this was not correct because the owner should continue to remain the original one of the message Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-09-21 09:37:37 -07:00
Derek McGowan	710e0664c4	Update logrus to v1.0.1 Fix case sensitivity issue Update docker and runc vendors Signed-off-by: Derek McGowan <derek@mcgstyle.net>	2017-08-07 11:20:47 -07:00
Flavio Crisciani	d6440c9139	optimize the rebroadcast for failure case Before when a node was failing, all the nodes would bump the lamport time of all their entries. This means that if a node flap, there will be a storm of update of all the entries. This commit on the base of the previous logic guarantees that only the node that joins back will readvertise its own entries, the other nodes won't need to advertise again. Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-08-01 14:08:54 -07:00
Flavio Crisciani	60b5add4af	NetworkDB allow setting PacketSize - Introduce the possibility to specify the max buffer length in network DB. This will allow to use the whole MTU limit of the interface - Add queue stats per network, it can be handy to identify the node's throughput per network and identify unbalance between nodes that can point to an MTU misconfiguration Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-07-26 13:44:33 -07:00
Flavio Crisciani	051a0d5ce9	NetworkDB incorrect number of entries in networkNodes A rapid (within networkReapTime 30min) leave/join network can corrupt the list of nodes per network with multiple copies of the same nodes. The fix makes sure that each node is present only once Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-07-18 16:57:49 -07:00
Santhosh Manohar	ca9a768d80	Handle single manager reload by having workers reconnect Signed-off-by: Santhosh Manohar <santhosh@docker.com>	2017-05-31 14:36:23 -07:00
Santhosh Manohar	06c3489bb8	retry once on a bulk sync failure Signed-off-by: Santhosh Manohar <santhosh@docker.com>	2017-05-11 21:13:18 -07:00
Flavio Crisciani	da9ac65ea6	Remove explicit set of memberlist protocol Memberlist does a full validation of the protocol version (min, current, max) among all the nodes of the cluster. The previous code was setting the protocol version to max version. That made the upgrade incompatible. Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2017-05-08 16:58:53 -07:00
Santhosh Manohar	102f9d230d	Avoid nDB stale entries because of intermittent nw issues. Signed-off-by: Santhosh Manohar <santhosh@docker.com>	2017-04-19 14:01:28 -07:00
Aaron Lehmann	bb8b9a6040	networkdb: Properly format memberlist logs Right now, items logged by memberlist end up as a complete log line embedded inside another log line, like the following: Nov 22 16:34:16 hostname dockerd: time="2016-11-22T16:34:16.802103258-08:00" level=info msg="2016/11/22 16:34:16 [INFO] memberlist: Marking xyz-1d1ec2dfa053 as failed, suspect timeout reached\n" This has two time and date stamps, and an escaped newline inside the "msg" field of the outer log message. To fix this, define a custom logger that only prints the message itself. Capture this message in logWriter, strip off the log level (added directly by memberlist), and route to the appropriate logrus method. Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>	2016-12-01 19:08:07 -08:00
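The log-level stripping step described in the commit above can be sketched with the standard library alone. This is a hedged illustration, not the `logWriter` from the commit: the function name `splitLevel` is hypothetical, and the set of bracketed prefixes (`[DEBUG]`, `[INFO]`, `[WARN]`, `[ERR]`) is an assumption about what memberlist emits. Routing to logrus is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLevel is a hypothetical helper that removes a leading
// "[LEVEL] " prefix and the trailing newline from a log line.
// It returns the level name ("" if none matched) and the bare message.
func splitLevel(line string) (level, msg string) {
	line = strings.TrimSuffix(line, "\n")
	// Assumed set of level tags used by the embedded logger.
	for _, lvl := range []string{"DEBUG", "INFO", "WARN", "ERR"} {
		prefix := "[" + lvl + "] "
		if strings.HasPrefix(line, prefix) {
			return lvl, strings.TrimPrefix(line, prefix)
		}
	}
	return "", line
}

func main() {
	level, msg := splitLevel("[INFO] memberlist: Marking xyz-1d1ec2dfa053 as failed, suspect timeout reached\n")
	fmt.Printf("level=%s msg=%q\n", level, msg)
}
```

A writer built on such a helper could then dispatch on the returned level to the matching method of the outer logger, so that each message is logged once with a single timestamp.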
Alessandro Boch	fac86cf69a	Add missing locks in agent and service code Signed-off-by: Alessandro Boch <aboch@docker.com>	2016-11-29 13:58:06 -08:00
Santhosh Manohar	31dd4362a8	Merge pull request #1542 from allencloud/change-reapNode-interval update reapNode interval	2016-11-08 11:14:23 -08:00
allencloud	0b4f68390d	remove unused mConfig Signed-off-by: allencloud <allen.sun@daocloud.io>	2016-11-08 18:18:55 +08:00
allencloud	99f84ff5a7	update reapNode interval Signed-off-by: allencloud <allen.sun@daocloud.io>	2016-11-08 15:28:42 +08:00
Santhosh Manohar	e98b152bac	Reap failed nodes after 24 hours Signed-off-by: Santhosh Manohar <santhosh@docker.com>	2016-10-20 11:24:04 -07:00
Santhosh Manohar	0a2537eea3	Use monotonic clock for reaping networkDB entries Signed-off-by: Santhosh Manohar <santhosh@docker.com>	2016-10-19 22:30:47 -07:00
Alexander Morozov	03088ace1b	networkdb: fix race in access to nodes len Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>	2016-10-04 12:19:25 -07:00
Jana Radhakrishnan	f649d5ae61	Do not hold ack channel in ack table after closing Once the bulksync ack channel is closed remove it from the ack table right away. There is no reason to keep it in the ack table and later delete it in the ack waiter. Ack waiter anyways has reference to the channel on which it is waiting. Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>	2016-10-03 09:50:02 -07:00
Jana Radhakrishnan	22c322dded	Avoid returning early on agent join failures When a gossip join failure happens do not return early in the call chain because a join failure is most likely transient and the retry logic built in the networkdb is going to retry and succeed. Returning early makes the initialization of ingress network/sandbox to not happen which causes a problem even after the gossip join on retry is successful. Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>	2016-09-27 08:36:10 -07:00
Jana Radhakrishnan	7b905d3c63	Purge stale nodes with same prefix and IP Since the node name randomization fix, we need to make sure that we purge the old node with the same prefix and same IP from the nodes database if it is still present. This causes unnecessary reconnect attempts. Also added a change to avoid unnecessary update of local lamport time and only do it if we are ready to do a push pull on a join. Join should happen only when the node is bootstrapped or when trying to reconnect with a failed node. Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>	2016-09-23 14:48:54 -07:00
Madhu Venugopal	d1f6eb1812	Allow the memberlist shutdown even if networkdb leave fails Signed-off-by: Madhu Venugopal <madhu@docker.com>	2016-09-23 05:19:07 -07:00
Jana Radhakrishnan	b0a7084c05	Honor user provided listen address for gossip If user provided a non-zero listen address, honor that and bind only to that address. Right now it is not honored and we always bind to all ip addresses in the host. Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>	2016-09-22 11:41:57 -07:00
Jana Radhakrishnan	5f5dad3c02	Recover from transient gossip failures Currently if there is any transient gossip failure in any node the recovery process depends on other nodes propagating the information indirectly. In cases if these transient failures affects all the nodes that this node has in its memberlist then this node will be permanently cutoff from the gossip channel. Added node state management code in networkdb to address these problems by trying to rejoin the cluster via the failed nodes when there is a failure. This also necessitates the need to add new messages called node event messages to differentiate between node leave and node failure. Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>	2016-09-19 15:58:14 -07:00
Jana Radhakrishnan	2bead02c87	Ignore delete events for non-existent entries In networkdb we should ignore delete events for entries which don't exist in the db. This is always true because if the entry did not exist then the entry has been removed way earlier and got purged after the reap timer and this notification is very stale. Also there were duplicate delete notifications being sent to the clients. One when the actual delete event was received from gossip and later when the entry was getting reaped. The second notification is unnecessary and may cause issues with the clients if they are not coded for idempotency. Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>	2016-08-18 13:57:24 -07:00
Santhosh Manohar	2bab9b6bdb	Cleanup networkdb state when the network is deleted locally Signed-off-by: Santhosh Manohar <santhosh@docker.com>	2016-08-10 12:44:05 -07:00
Madhu Venugopal	6368406c26	Adding Advertise-addr support With this change, all the auto-detection of the addresses are removed from libnetwork and the caller takes the responsibility to have a proper advertise-addr in various scenarios (including externally facing public advertise-addr with an internal facing private listen-addr) Signed-off-by: Madhu Venugopal <madhu@docker.com>	2016-07-21 02:44:25 -07:00
Alexander Morozov	af3158ecdb	networkdb: do nothing in bulkSync if nodes is empty This patch allows getting rid of annoying debug message. Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>	2016-07-11 09:11:07 -07:00
Jana Radhakrishnan	8936daab5e	Retain deleted entries for longer time When deleting entries or when learning about deleted entries remember them for a longer time to avoid excessive delete duplicates in the gossip cluster. Also added code changes to ignore event messages originated from the source node so that it doesn't get added into the rebroadcast queue. Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>	2016-06-30 18:24:13 -07:00
Santhosh Manohar	929921a640	Add debugs for key change events in networkdb Signed-off-by: Santhosh Manohar <santhosh@docker.com>	2016-06-14 03:13:48 -07:00

61 commits