Commit graph

61 commits

Author SHA1 Message Date
Sebastiaan van Stijn
cd381aea56
libnetwork: fix empty-lines (revive)
    libnetwork/etchosts/etchosts_test.go:167:54: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/osl/route_linux.go:185:74: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/osl/sandbox_linux_test.go:323:36: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/bitseq/sequence.go:412:48: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/datastore/datastore_test.go:67:46: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/datastore/mock_store.go:34:60: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/iptables/firewalld.go:202:44: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/iptables/firewalld_test.go:76:36: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/iptables/iptables.go:256:67: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/iptables/iptables.go:303:128: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/networkdb/cluster.go:183:72: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/ipams/null/null_test.go:44:38: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/drivers/macvlan/macvlan_store.go:45:52: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/ipam/allocator_test.go:1058:39: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/drivers/bridge/port_mapping.go:88:111: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/drivers/bridge/link.go:26:90: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/drivers/bridge/setup_ipv6_test.go:17:34: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/drivers/bridge/setup_ip_tables.go:392:4: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/drivers/bridge/bridge.go:804:50: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/drivers/overlay/ov_serf.go:183:29: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/drivers/overlay/ov_utils.go:81:64: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/drivers/overlay/peerdb.go:172:67: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/drivers/overlay/peerdb.go:209:67: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/drivers/overlay/peerdb.go:344:89: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/drivers/overlay/peerdb.go:436:63: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/drivers/overlay/overlay.go:183:36: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/drivers/overlay/encryption.go:69:28: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/drivers/overlay/ov_network.go:563:81: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/default_gateway.go:32:43: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/service_common.go:184:64: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/endpoint.go:161:55: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/store.go:320:33: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/store_linux_test.go:11:38: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/sandbox.go:571:36: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/service_common.go:317:246: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/endpoint.go:550:17: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/sandbox_dns_unix.go:213:106: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/controller.go:676:85: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/agent.go:876:60: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/resolver.go:324:69: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/network.go:1153:92: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/network.go:1955:67: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/network.go:2235:9: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/libnetwork_internal_test.go:336:26: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/resolver_test.go:76:35: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/libnetwork_test.go:303:38: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/libnetwork_test.go:985:46: empty-lines: extra empty line at the end of a block (revive)
    libnetwork/ipam/allocator_test.go:1263:37: empty-lines: extra empty line at the start of a block (revive)
    libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the end of a block (revive)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-09-26 19:21:58 +02:00
Sebastiaan van Stijn
561a010161
linting: suppress false positive for G404 (gosec)
The linter falsely detects this as using "math/rand":

    libnetwork/networkdb/cluster.go:721:14: G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
       val, err := rand.Int(rand.Reader, big.NewInt(int64(n)))
                   ^
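
For context, a minimal sketch of the flagged pattern. Here `rand` is `crypto/rand`, whose `rand.Int` draws from `rand.Reader`, so the weak-generator warning does not apply. The function name `randomOffset` is hypothetical, and the `#nosec` annotation is one common way to silence gosec, not necessarily the one used in this commit.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// randomOffset returns a uniformly distributed integer in [0, n).
// The name is hypothetical; n must be positive, since rand.Int panics otherwise.
func randomOffset(n int) (int, error) {
	val, err := rand.Int(rand.Reader, big.NewInt(int64(n))) // #nosec G404 -- crypto/rand, not math/rand
	if err != nil {
		return 0, err
	}
	return int(val.Int64()), nil
}

func main() {
	off, err := randomOffset(10)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("offset in [0, 10):", off)
}
```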

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-09-04 15:36:49 +02:00
Sebastiaan van Stijn
b9c8eca468
libnetwork/networkdb: remove some redundant fmt.Sprintf()'s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-15 12:56:23 +01:00
frobnicaty
d78b883576 Fix grammar for "does not exist"
as opposed to "does not exists"

Signed-off-by: frobnicaty <92033765+frobnicaty@users.noreply.github.com>
2021-12-03 15:50:13 +00:00
Roman Volosatovs
d7a2635537
libnetwork: make rejoin intervals configurable
This allows the rejoin intervals to be chosen according to the context
within which the component is used, and, in particular, this allows
lower intervals to be used within TestNetworkDBIslands test.
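
A rough sketch of what configurable rejoin intervals could look like. The struct, field names, and durations are assumptions made for illustration and are not taken from the commit.

```go
package main

import (
	"fmt"
	"time"
)

// Config holds the rejoin timing knobs. Field names are hypothetical.
type Config struct {
	RejoinClusterDuration time.Duration // how long one rejoin attempt may run
	RejoinClusterInterval time.Duration // how often a rejoin is attempted
}

// DefaultConfig returns values meant for normal operation (illustrative numbers).
func DefaultConfig() *Config {
	return &Config{
		RejoinClusterDuration: 10 * time.Second,
		RejoinClusterInterval: 60 * time.Second,
	}
}

// testConfig shortens the intervals so an islands-style test converges quickly.
func testConfig() *Config {
	c := DefaultConfig()
	c.RejoinClusterDuration = 1 * time.Second
	c.RejoinClusterInterval = 5 * time.Second
	return c
}

func main() {
	fmt.Println("default interval:", DefaultConfig().RejoinClusterInterval)
	fmt.Println("test interval:   ", testConfig().RejoinClusterInterval)
}
```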

Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
2021-07-12 19:25:49 +02:00
Brian Goff
116f200737
Fix gosec complaints in libnetwork
These were purposefully ignored before but this goes ahead and "fixes"
most of them.
Note that none of the things gosec flagged are problematic, just
quieting the linter here.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-25 18:02:03 +02:00
Brian Goff
4b981436fe Fixup libnetwork lint errors
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 23:48:32 +00:00
Flavio Crisciani
2b1e45c682 Merge pull request #2238 from talex5/networkdb-docs
Add NetworkDB docs
2019-03-14 16:05:31 -07:00
Flavio Crisciani
151f42aeaa Fix possible nil pointer exception
It is possible that the node is not yet present in
the node list map. In this case just print a warning
and return. The next iteration would be fine.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2019-01-22 17:07:15 -08:00
Thomas Leonard
05c05ea5e9 Add NetworkDB docs
This is based on reading the code in the `networkdb` directory.

Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
2018-08-08 13:35:11 +01:00
Josh Soref
a06f1b2c4e Spelling fixes
* addresses
* assigned
* at least
* attachments
* auxiliary
* available
* cleanup
* communicate
* communications
* configuration
* connection
* connectivity
* destination
* encountered
* endpoint
* example
* existing
* expansion
* expected
* external
* forwarded
* gateway
* implementations
* implemented
* initialize
* internally
* loses
* message
* network
* occurred
* operational
* origin
* overlapping
* reaper
* redirector
* release
* representation
* resolver
* retrieve
* returns
* sandbox
* sequence
* successful
* synchronizing
* update
* validates

Signed-off-by: Josh Soref <jsoref@gmail.com>
2018-07-12 12:54:44 -07:00
Flavio Crisciani
b0a0059237 Merge pull request #2216 from fcrisciani/netdb-qlen-issue
NetworkDB qlen optimization
2018-07-05 15:02:58 -07:00
Chris Telfer
06922d2d81 Use fmt precision to limit string length
The previous code used string slices to limit the length of certain
fields like endpoint or sandbox IDs.  This assumes that these strings
are at least as long as the slice length.  Unfortunately, some sandbox
IDs can be smaller than 7 characters.   This fix addresses this issue
by systematically converting format string calls that were taking
fixed-slice arguments to use a precision specifier in the string format
itself.  From the golang fmt package documentation:

    For strings, byte slices and byte arrays, however, precision limits
    the length of the input to be formatted (not the size of the output),
    truncating if necessary. Normally it is measured in runes, but for
    these types when formatted with the %x or %X format it is measured
    in bytes.

This nicely fits the desired behavior: it will limit the number of
runes considered for string interpolation to the precision value.
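
The difference can be sketched as follows. The helper name `shortID` is hypothetical; the `%.7s` verb is standard `fmt` behavior.

```go
package main

import "fmt"

// shortID formats at most the first 7 runes of id without risking a
// slice-bounds panic on short input. The helper name is hypothetical.
func shortID(id string) string {
	return fmt.Sprintf("%.7s", id)
}

func main() {
	fmt.Println(shortID("0123456789abcdef")) // long ID is truncated
	fmt.Println(shortID("abc"))              // short ID is printed whole
	// By contrast, slicing a 3-character ID with id[:7] would panic
	// with a slice bounds error at run time.
}
```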

Signed-off-by: Chris Telfer <ctelfer@docker.com>
2018-07-05 17:44:04 -04:00
Flavio Crisciani
55e4cc7262 Optimize networkDB queue
Added some optimizations to reduce the messages in the queue:
1) On joining a network, the node executes a TCP sync with all the nodes that
it knows to be part of that network. Previously, the node redistributed all
the entries during this time, so if the network had 10K entries the queue of
the joining node would jump to 10K. The fix adds a flag on the network that
prevents any entry from being inserted in the queue until the sync happens.
Note that right now the flag is set on a best-effort basis; there is no real
check that at least one of the nodes succeeded.
2) Limit the number of messages to redistribute coming from a TCP sync.
Introduced a threshold that limits the number of messages that are
propagated, which disables this optimization in case of heavy load.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2018-07-02 16:59:45 -07:00
Flavio Crisciani
500d9f4515 Adjust corner case for reconnect logic
The previous logic did not account for the fact that each node is
in its own node list, so the bootstrap nodes would never retry
to reconnect because they would always find themselves
in the node map.
Added a test that validates the gossip island condition.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2018-06-21 18:04:55 -07:00
Dani Louca
96472cdaea Adding a recovery mechanism for a split gossip cluster
Signed-off-by: Dani Louca <dani.louca@docker.com>
2018-04-23 14:18:46 -04:00
Flavio Crisciani
f0fcb0bbe6 Fixed race on quick node fail/join
The previous logic did not properly handle the case of a node
that fails and joins back within a short period of time.
The issue was in the handling of the network messages.
When a node joins, it syncs with other nodes, which pass it
the whole list of nodes that, to the best of their knowledge, are part
of a network. At this point, if the node learns that node A is part
of the network, it saves it before having received the notification
that node A is actually alive (coming from memberlist).
If node A has failed, the source node will receive the notification
while the newly joined node won't, because memberlist never advertised
node A as available. In this case the new node will never purge
node A from its state and, worse, will accept any table notification
where node A is the owner, and so will end up out of sync
with the rest of the cluster.

This commit also contains some code cleanup around the area of node
management.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-11-27 14:38:06 -08:00
Flavio Crisciani
7fbaf6de2c Add test to confirm garbage collection
- Create a test to verify that a node that joins
  in an async way is not going to extend the life
  of an already deleted object

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-10-23 09:58:57 +02:00
Flavio Crisciani
8c31217a44 NetworkDB create NodeID for cluster nodes
Separate the hostname from the node identifier. All the messages
exchanged on the network contain a nodeName field
that until now was hostname-uniqueid. Being encoded as strings in
the protobuf without any length restriction, these names affect
the efficiency of the protocol itself. If the hostname is very long,
the overhead increases and degrades the performance of
the database itself, which by default allows a 1400-byte
payload per cycle.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-09-26 10:48:04 -07:00
Flavio Crisciani
a4e64d05c1 Avoid alignment of reapNetwork and tableEntries
Make sure that the network is garbage collected after
the entries. Deleting entries requires that the network
is present.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-09-22 10:57:47 -07:00
Flavio Crisciani
053a534ab1 Changed ReapTable logic
- Changed the loop per network. The previous implementation took a
  ReadLock to update the reapTime, but now, with the residualReapTime,
  the bulkSync also uses the same ReadLock, creating possible
  issues with concurrent reads and updates of the value.
  The new logic fetches the list of networks and proceeds with the
  cleanup network by network, locking the database and releasing it
  after each network. This should ensure fair locking and avoid
  keeping the database blocked for too long.

  Note: The ticker does not guarantee that the reap logic runs
  precisely every reapTimePeriod; the documentation says that
  if the routine takes too long, ticks are skipped. If the process
  itself slows down, the lifetime of the deleted entries may
  increase. This should still not be a huge problem: because the
  residual reap time is now propagated among all the nodes,
  a slower node will let a deleted entry be repropagated multiple
  times, but the state will still remain consistent.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-09-21 09:37:47 -07:00
Flavio Crisciani
2d2a2bc568 Fix reapTime logic in NetworkDB
- Added remainingReapTime field in the table event.
  Without it, a node that did not have a state for the element
  marked the element for deletion with the max reapTime.
  This made it possible for the entry to be resynced
  between nodes forever, defeating the purpose of the reap time
  itself.

- On broadcast of the table event, the node owner was rewritten
  with the local node name. This was not correct, because the owner
  should remain the original one of the message.
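
The first point can be sketched as below. All type and field names, the 30-minute maximum, and the fallback for a missing value are assumptions made for illustration; this is not the actual NetworkDB code.

```go
package main

import (
	"fmt"
	"time"
)

const reapEntryInterval = 30 * time.Minute // illustrative maximum tombstone lifetime

// tableEvent is a hypothetical, trimmed-down delete event.
type tableEvent struct {
	Key             string
	ResidualReapSec int32 // seconds the sender still intends to keep the tombstone
}

// entry is a hypothetical local record of a deleted element.
type entry struct {
	deleting bool
	reapTime time.Duration
}

// applyDelete records a tombstone using the sender's remaining reap time,
// so the tombstone's lifetime shrinks as it travels instead of resetting
// to the maximum on every node that had no prior state.
func applyDelete(db map[string]*entry, ev tableEvent) {
	reap := time.Duration(ev.ResidualReapSec) * time.Second
	if reap <= 0 || reap > reapEntryInterval {
		reap = reapEntryInterval // assumed fallback when the field is missing or invalid
	}
	db[ev.Key] = &entry{deleting: true, reapTime: reap}
}

func main() {
	db := map[string]*entry{}
	applyDelete(db, tableEvent{Key: "ep1", ResidualReapSec: 60})
	fmt.Println("ep1 will be reaped in", db["ep1"].reapTime)
}
```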

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-09-21 09:37:37 -07:00
Derek McGowan
710e0664c4 Update logrus to v1.0.1
Fix case sensitivity issue
Update docker and runc vendors

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2017-08-07 11:20:47 -07:00
Flavio Crisciani
d6440c9139 optimize the rebroadcast for failure case
Previously, when a node failed, all the nodes would bump the Lamport time of all their
entries. This meant that if a node flapped, there would be a storm of updates for all the entries.
Building on the previous logic, this commit guarantees that only the node that joins back
readvertises its own entries; the other nodes do not need to advertise again.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-08-01 14:08:54 -07:00
Flavio Crisciani
60b5add4af NetworkDB allow setting PacketSize
- Introduce the possibility to specify the max buffer length
  in NetworkDB. This allows using the whole MTU of
  the interface.

- Add queue stats per network. They can be handy to identify the
  node's throughput per network and to spot imbalances between
  nodes that can point to an MTU misconfiguration.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-07-26 13:44:33 -07:00
Flavio Crisciani
051a0d5ce9 NetworkDB incorrect number of entries in networkNodes
A rapid (within networkReapTime 30min) leave/join network
can corrupt the list of nodes per network with multiple copies
of the same nodes.
The fix makes sure that each node is present only once
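
A minimal sketch of the uniqueness check, assuming the per-network node list is a slice keyed by network ID. The names and data layout are hypothetical.

```go
package main

import "fmt"

// addNetworkNode appends nodeName to the node list of network nid only if
// it is not already present, so a quick leave/join cannot create duplicates.
func addNetworkNode(networkNodes map[string][]string, nid, nodeName string) {
	for _, n := range networkNodes[nid] {
		if n == nodeName {
			return // already listed, nothing to do
		}
	}
	networkNodes[nid] = append(networkNodes[nid], nodeName)
}

func main() {
	nodes := map[string][]string{}
	addNetworkNode(nodes, "net1", "node-a")
	addNetworkNode(nodes, "net1", "node-a") // rapid rejoin, ignored
	addNetworkNode(nodes, "net1", "node-b")
	fmt.Println(nodes["net1"])
}
```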

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-07-18 16:57:49 -07:00
Santhosh Manohar
ca9a768d80 Handle single manager reload by having workers reconnect
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-05-31 14:36:23 -07:00
Santhosh Manohar
06c3489bb8 retry once on a bulk sync failure
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-05-11 21:13:18 -07:00
Flavio Crisciani
da9ac65ea6 Remove explicit set of memberlist protocol
Memberlist does a full validation of the protocol version (min, current, max)
among all the nodes of the cluster.
The previous code set the protocol version to the max version,
which made the upgrade incompatible.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-08 16:58:53 -07:00
Santhosh Manohar
102f9d230d Avoid nDB stale entries because of intermittent nw issues.
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-04-19 14:01:28 -07:00
Aaron Lehmann
bb8b9a6040 networkdb: Properly format memberlist logs
Right now, items logged by memberlist end up as a complete log line
embedded inside another log line, like the following:

    Nov 22 16:34:16 hostname dockerd: time="2016-11-22T16:34:16.802103258-08:00" level=info msg="2016/11/22 16:34:16 [INFO] memberlist: Marking xyz-1d1ec2dfa053 as failed, suspect timeout reached\n"

This has two time and date stamps, and an escaped newline inside the
"msg" field of the outer log message.

To fix this, define a custom logger that only prints the message itself.
Capture this message in logWriter, strip off the log level (added
directly by memberlist), and route to the appropriate logrus method.
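
The approach can be sketched with the standard library alone. The type `levelWriter` and its `emit` callback are hypothetical stand-ins: the commit routes to logrus methods instead, and the set of level tags checked here is an assumption.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// levelWriter is a hypothetical stand-in for the logWriter described above.
// emit receives the parsed level and the bare message; real code would call
// the matching method of the structured logger instead.
type levelWriter struct {
	emit func(level, msg string)
}

// Write strips the "[LEVEL] " prefix and the trailing newline, then forwards
// the message together with the level it found.
func (w *levelWriter) Write(p []byte) (int, error) {
	msg := strings.TrimSuffix(string(p), "\n")
	level := "INFO" // default when no recognized prefix is present
	for _, l := range []string{"DEBUG", "INFO", "WARN", "ERR"} {
		prefix := "[" + l + "] "
		if strings.HasPrefix(msg, prefix) {
			level, msg = l, strings.TrimPrefix(msg, prefix)
			break
		}
	}
	w.emit(level, msg)
	return len(p), nil
}

func main() {
	w := &levelWriter{emit: func(level, msg string) {
		fmt.Printf("level=%s msg=%q\n", strings.ToLower(level), msg)
	}}
	// Empty prefix and zero flags: this logger adds no timestamp of its own.
	logger := log.New(w, "", 0)
	logger.Printf("[INFO] memberlist: Marking xyz-1d1ec2dfa053 as failed, suspect timeout reached")
}
```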

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-12-01 19:08:07 -08:00
Alessandro Boch
fac86cf69a Add missing locks in agent and service code
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-11-29 13:58:06 -08:00
Santhosh Manohar
31dd4362a8 Merge pull request #1542 from allencloud/change-reapNode-interval
update reapNode interval
2016-11-08 11:14:23 -08:00
allencloud
0b4f68390d remove unused mConfig
Signed-off-by: allencloud <allen.sun@daocloud.io>
2016-11-08 18:18:55 +08:00
allencloud
99f84ff5a7 update reapNode interval
Signed-off-by: allencloud <allen.sun@daocloud.io>
2016-11-08 15:28:42 +08:00
Santhosh Manohar
e98b152bac Reap failed nodes after 24 hours
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-10-20 11:24:04 -07:00
Santhosh Manohar
0a2537eea3 Use monotonic clock for reaping networkDB entries
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-10-19 22:30:47 -07:00
Alexander Morozov
03088ace1b networkdb: fix race in access to nodes len
Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
2016-10-04 12:19:25 -07:00
Jana Radhakrishnan
f649d5ae61 Do not hold ack channel in ack table after closing
Once the bulksync ack channel is closed, remove it from the ack table
right away. There is no reason to keep it in the ack table and later
delete it in the ack waiter. The ack waiter has a reference to the
channel on which it is waiting anyway.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-10-03 09:50:02 -07:00
Jana Radhakrishnan
22c322dded Avoid returning early on agent join failures
When a gossip join failure happens, do not return early in the call chain,
because a join failure is most likely transient and the retry logic
built into networkdb is going to retry and succeed. Returning early
prevents the initialization of the ingress network/sandbox, which
causes a problem even after the gossip join succeeds on retry.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-27 08:36:10 -07:00
Jana Radhakrishnan
7b905d3c63 Purge stale nodes with same prefix and IP
Since the node name randomization fix, we need to make sure that we
purge the old node with the same prefix and same IP from the nodes
database if it is still present. A stale entry causes unnecessary
reconnect attempts.

Also added a change to avoid unnecessary updates of the local Lamport time
and only do it if we are ready to do a push pull on a join. Join should
happen only when the node is bootstrapped or when trying to reconnect
with a failed node.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-23 14:48:54 -07:00
Madhu Venugopal
d1f6eb1812 Allow the memberlist shutdown even if networkdb leave fails
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-09-23 05:19:07 -07:00
Jana Radhakrishnan
b0a7084c05 Honor user provided listen address for gossip
If the user provided a non-zero listen address, honor that and bind only to
that address. Right now it is not honored and we always bind to all IP
addresses on the host.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-22 11:41:57 -07:00
Jana Radhakrishnan
5f5dad3c02 Recover from transient gossip failures
Currently, if there is any transient gossip failure in any node, the
recovery process depends on other nodes propagating the information
indirectly. If these transient failures affect all the nodes
that this node has in its memberlist, then this node will be permanently
cut off from the gossip channel. Added node state management code in
networkdb to address these problems by trying to rejoin the cluster via
the failed nodes when there is a failure. This also necessitates
new messages, called node event messages, to differentiate
between node leave and node failure.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-19 15:58:14 -07:00
Jana Radhakrishnan
2bead02c87 Ignore delete events for non-existent entries
In networkdb we should ignore delete events for entries which don't
exist in the db. This is always safe, because if the entry does not exist
then it was removed much earlier and purged after the
reap timer, so this notification is very stale.

Also there were duplicate delete notifications being sent to the
clients. One when the actual delete event was received from gossip and
later when the entry was getting reaped. The second notification is
unnecessary and may cause issues with the clients if they are not
coded for idempotency.
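
Both changes can be sketched together as follows. The types and function names are hypothetical, and treating an already tombstoned entry as stale is an assumption of this sketch.

```go
package main

import "fmt"

// entry is a hypothetical table record.
type entry struct {
	value    string
	deleting bool // tombstone awaiting reap
}

// handleDelete applies a gossip delete event. It returns false and does
// nothing when the key is unknown or already tombstoned, treating the
// event as stale.
func handleDelete(db map[string]*entry, key string, notify func(key string)) bool {
	e, ok := db[key]
	if !ok || e.deleting {
		return false
	}
	e.deleting = true
	notify(key) // the only delete notification clients receive
	return true
}

// reap purges tombstones silently; clients were already notified.
func reap(db map[string]*entry) {
	for k, e := range db {
		if e.deleting {
			delete(db, k)
		}
	}
}

func main() {
	db := map[string]*entry{"ep1": {value: "10.0.0.2"}}
	notify := func(key string) { fmt.Println("deleted:", key) }

	fmt.Println(handleDelete(db, "ep1", notify))     // applied, one notification
	fmt.Println(handleDelete(db, "missing", notify)) // stale event, ignored
	reap(db)
	fmt.Println("entries left:", len(db))
}
```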

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-08-18 13:57:24 -07:00
Santhosh Manohar
2bab9b6bdb Cleanup networkdb state when the network is deleted locally
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-08-10 12:44:05 -07:00
Madhu Venugopal
6368406c26 Adding Advertise-addr support
With this change, all the auto-detection of addresses is removed
from libnetwork and the caller takes the responsibility to have a proper
advertise-addr in various scenarios (including an externally facing public
advertise-addr with an internally facing private listen-addr).

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-07-21 02:44:25 -07:00
Alexander Morozov
af3158ecdb networkdb: do nothing in bulkSync if nodes is empty
This patch gets rid of an annoying debug message.

Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
2016-07-11 09:11:07 -07:00
Jana Radhakrishnan
8936daab5e Retain deleted entries for longer time
When deleting entries or when learning about deleted entries, remember
them for a longer time to avoid excessive delete duplicates in the
gossip cluster. Also added code changes to ignore event messages
that originated from the source node so that they don't get added to the
rebroadcast queue.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-30 18:24:13 -07:00
Santhosh Manohar
929921a640 Add debugs for key change events in networkdb
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-14 03:13:48 -07:00