Commit graph

96 commits

Author SHA1 Message Date
Flavio Crisciani
39d2204896 Service discovery logic rework
changed the ipMap to SetMatrix to allow transient states
Compacted the addSvc and deleteSvc into a one single method
Updated the datastructure for backends to allow storing all the information needed
to cleanup properly during the cleanupServiceBindings
Removed the enable/disable Service logic that was racing with sbLeave/sbJoin logic
Add some debug logs to track further race conditions

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-06-11 20:49:29 -07:00
Flavio Crisciani
6d768ef73c Fix leak of handleTableEvents
The channel ch.C is never closed.
Added the listen of the ch.Done() to guarantee
that the goroutine is exiting once the event channel
is closed

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-31 11:04:19 -07:00
Flavio Crisciani
34ce7c7e6a Revert "Move Cluster provider back to Moby"
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-25 10:47:02 -07:00
Flavio Crisciani
627da8bf04 Moved the cluster provider to Moby
Moved the cluster provider interface definition from
libnetwork to moby

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-24 11:28:23 -07:00
Alessandro Boch
254d082cc3 Add ConnectivityScope capability for network drivers along with scope network option
- It specifies whether the network driver can
  provide containers connectivity across hosts.
- As of now, the data scope of the driver was
  being overloaded with this notion.
- The driver scope information is still valid
  and it defines whether the data allocation
  of the network resources can be done globally
  or only locally.
- With the scope network option, user can now
  force a network as swarm scoped
  regardless of the driver data scope.
- In case the network is configured as swarm scoped,
  and the network driver is multihost capable,
  a network DB instance will be launched for it.

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-05-12 17:16:34 -07:00
Flavio Crisciani
a2bf0b35d6 Fix for swarm/libnetwork init race condition
This change cleans up the SetClusterProvider method.
Swarm calls the SetClusterProvider to pass to libnetwork the pointer
of the provider from which libnetwork can fetch all the information to
initialize the internal agent.

The method can be and is called multiple times passing the same value,
with the previous logic that was erroneusly spawning multiple go routines that
were making possiblea race between an agentInit and an agentClose.

The new logic aims to disallow it by checking for the provider passed and
ensuring that if the provider is already present there is nothing to do because
there is already an active go routine that is ready to process cluster events.
Moreover a patch on moby side takes care of clearing up the Cluster Events
dispacthing using only 1 channel to handle all the events types.
This will also guarantee in order event handling because now all the events are
piped into one single channel.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-04 15:35:28 -07:00
Flavio Crisciani
552c16dc92 Fix for remote addr parsing
Fix initialization of starting vector

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-04-28 09:10:29 -07:00
Flavio Crisciani
3d7bc23901 Change GetRemoteAddr to return all managers
Change in the provider interface to let the provider
return the whole list of managers.
This will allow the netwrok db to have multiple choice
to establish the first adjacencies

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-04-27 16:58:42 -07:00
Alessandro Boch
8711829092 Merge pull request #1719 from fcrisciani/data_path
Add the datapath-addr in libnetwork
2017-04-24 13:55:24 -07:00
Alessandro Boch
46ebc9613e agentSetup to first check if clusterProvider is nil
- concurrent swarm join and daemon stop seen in
  integration tests may cause agentSetup to access
  a nil clusterProvider, resulting in a panic

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-04-18 11:34:05 -07:00
Flavio Crisciani
a0e0231909 Add the data-path-addr
During configuration in SWARM mode is now possible to pass an additional
parameter --data-path-addr <ip|interface>.
The information is going to be used to configure which is the interface
that is going to be used for the data path for global scope drivers.
Up to now the only driver really using this extra parameter is the
overlay driver.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-04-14 16:52:40 -07:00
Alessandro Boch
18098ab1c8 Add AgentStopWait method
- to signal when the networking cluster agent is stopped

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-04-05 11:13:56 -07:00
Santhosh Manohar
bfab379411 swarm mode network inspect should provide cluser-wide task details
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-03-10 19:12:00 -08:00
Alessandro Boch
d718efd92f Add anonymous container alias to service record on attachable network
- Currently when a non-named container with network aliases
  is connected to a swarm attachable network, its aliases are
  not added to the service records.
  This is not in line with what we do when connecting to
  a local scope network, or to a kv-store based overlay network.

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-03-02 12:28:39 -08:00
Madhu Venugopal
bb560a1f44 Generating node discovery events to the drivers from networkdb
With the introduction of networkdb, the node discovery events were not
sent to the drivers. This commit generates the node discovery events and
sents it to the drivers interested in it.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2017-02-01 17:54:51 -08:00
Alessandro Boch
fac86cf69a Add missing locks in agent and service code
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-11-29 13:58:06 -08:00
Santhosh Manohar
27500b1e35 Separate service LB & SD from network plumbing
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-11-17 13:09:14 -08:00
Alessandro Boch
efc25da851 Allow concurrent calls to agentClose
- This fixes a panic in memberlist.Leave() because called
  after memberlist.shutdown = false
  It happens because of two interlocking calls to NetworkDB.clusterLeave()
  It is easily reproducible with two back-to-back calls
  to docker swarm init && docker swarm leave --force
  While the first clusterLeave() is waiting for sendNodeEvent(NodeEventTypeLeave)
  to timeout (5 sec) a second clusterLeave() is called. The second clusterLeave()
  will end up invoking memberlist.Leave() after the previous call already did
  the same, therefore after memberlist.shutdown was set false.
- The fix is to have agentClose() acquire the agent instance and reset the
  agent pointer right away under lock. Then execute the closing/leave functions
  on the agent instance.

Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-11-01 14:51:08 -07:00
Jana Radhakrishnan
22c322dded Avoid returning early on agent join failures
When a gossip join failure happens do not return early in the call chain
because a join failure is most likely transient and the retry logic
built in the networkdb is going to retry and succeed. Returning early
makes the initialization of ingress network/sandbox to not happen which
causes a problem even after the gossip join on retry is successful.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-27 08:36:10 -07:00
Jana Radhakrishnan
b0a7084c05 Honor user provided listen address for gossip
If user provided a non-zero listen address, honor that and bind only to
that address. Right now it is not honored and we always bind to all ip
addresses in the host.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-22 11:41:57 -07:00
Alessandro Boch
8653b72786 Lock agent access in addDriverWatches
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-09-20 14:18:49 -07:00
Santhosh Manohar
5b632d752c Make nodenames unique in Gossip cluster
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-09-19 09:57:23 -07:00
Jana Radhakrishnan
b29ba21551 Avoid double close of agentInitDone
Avoid by reinitializing the channel immediately after closing the
channel within a lock. Also change the wait code to cache the channel in
stack be retrieving it from controller and wait on the stack copy of the
channel.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-08-24 14:00:36 -07:00
Jana Radhakrishnan
8a1092fe78 Notify agentInitDone after joining the cluster
Currently the initDone notification is provided immediately after
initializing the cluster. This may be fine for the first manager. But
for all subsequent nodes which join the cluster we need to wait until
the node completes the joining to the gossip cluster inorder to
synchronize the gossip network clock with other nodes. If we don't have
uptodate clock the updates that this node provides to the cluster may be
discarded by the other nodes if they have entries which are yet to be
reaped but have a better clock.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-08-19 17:57:58 -07:00
Santhosh Manohar
6e965c03ad Reset the encryption keys on swarm leave
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-08-16 17:37:33 -07:00
Madhu Venugopal
c7d98e0081 Merge pull request #1382 from mrjana/overlay
Fix spurious overlay errors
2016-08-11 11:38:57 +05:30
Jana Radhakrishnan
004e56a4d1 Fix spurious overlay errors
Fixed certain spurious overlay errors which were not errors at all but
showing up everytime service tasks are started in the engine.

Also added a check to make sure a delete is valid by checking the
incoming endpoint id wih the one in peerdb just to make sure if the
delete from gossip is not stale.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-08-08 11:55:06 -07:00
Santhosh Manohar
ab02b015ef Remove unused key handling functions
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-08-05 04:46:01 -07:00
Jana Radhakrishnan
0030332e4e Merge pull request #1372 from sanimej/gossip
Add container short-id as an alias for swarm mode tasks
2016-08-03 17:27:49 -07:00
Santhosh Manohar
b54a4b5936 Add container short-id as an alias for swarm mode tasks
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-08-02 20:28:33 -07:00
Aaron Lehmann
3f542419ac Check size of keys slice
If not enough keys are provided to SetKeys, this may cause a panic. This
should not cause problems with the current integration in Docker 1.12.0,
but the panic might happen loading data created by an earlier version,
or data that is corrupted somehow. Add a length check to be defensive.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-08-02 19:07:43 -07:00
Madhu Venugopal
6368406c26 Adding Advertise-addr support
With this change, all the auto-detection of the addresses are removed
from libnetwork and the caller takes the responsibilty to have a proper
advertise-addr in various scenarios (including externally facing public
advertise-addr with an internal facing private listen-addr)

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-07-21 02:44:25 -07:00
Alessandro Boch
d0192db0cd On agent init, re-join on existing cluster networks
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-07-12 17:35:32 -07:00
Santhosh Manohar
ec17841ea4 Switch overlay encryption to use IPSec susbsystem keys
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-15 04:10:23 -07:00
Santhosh Manohar
8ded762a0b Update key handling logic to process keyring with 3 keys
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-11 04:50:25 -07:00
Jana Radhakrishnan
acac7ee812 Add service alias support
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-14 16:40:54 -07:00
Madhu Venugopal
64d8c5f87f Resolve host-name before trying the interface-name in agent bind
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-06-12 10:08:26 -07:00
Alessandro Boch
93b5073a7d Overlay driver to support network layer encryption
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-06-08 23:38:55 -07:00
Santhosh Manohar
c4d5bbad7a Use controller methods for handling the encyrption keys from agent
instead of the Provider interface methods.

Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-05 00:47:30 -07:00
Santhosh Manohar
b2b87577d4 Add support for encrypting gossip traffic
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-04 03:55:14 -07:00
Madhu Venugopal
9054ac2b48 Provide a way for libnetwork to make use of Agent mode functionalities
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-06-05 18:41:21 -07:00
Jana Radhakrishnan
0f89c9b7bc Add ingress load balancer
Ingress load balancer is achieved via a service sandbox which acts as
the proxy to translate incoming node port requests and mapping that to a
service entry. Once the right service is identified, the same internal
loadbalancer implementation is used to load balance to the right backend
instance.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-04 20:38:32 -07:00
Jana Radhakrishnan
d05adebf30 Add loadbalancer support
This PR adds support for loadbalancing across a group of endpoints that
share the same service configuration as passed in by
`OptionService`. The loadbalancer is implemented using ipvs with just
round robin scheduling supported for now.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-05-26 13:05:58 -07:00
Jana Radhakrishnan
b1e5178bc3 Convert endpoint gossip to use protobuf
Endpoint gossip will use protobuf so that we can make changes in a
backward compatible way.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-05-17 19:05:06 -07:00
Jana Radhakrishnan
ffdceda255 Add service support
Add a notion of service in libnetwork so that a group of endpoints
which form a service can be treated as such so that service level
features can be added on top. Initially as part of this PR the support
to assign a name to the said service is added which results in DNS
queries to the service name to return all the IPs of the backing
endpoints so that DNS RR behavior on the service name can be achieved.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-05-05 16:47:05 -07:00
Jana Radhakrishnan
0580043718 Add libnetwork agent mode support
libnetwork agent mode is a mode where libnetwork can act as a local
agent for network and discovery plumbing alone while the state
management is done elsewhere. This completes the support for making
libnetwork and its associated drivers to be completely independent of a
k/v store(if needed) and work purely based on the state information
passed along by some some external controller or manager. This does not
mean that libnetwork support for decentralized state management via a
k/v store is removed.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-05-02 18:19:32 -07:00