- Now that docker has the code to release the ingress
network, have docker do the release on cluster leave
and on graceful daemon shutdown.
This is a cleaner approach, in line with cleanup being
triggered by whoever created the resource, and it avoids
races on ingress network removal as revealed by the
docker tests.
Signed-off-by: Alessandro Boch <aboch@docker.com>
With the introduction of networkdb, the node discovery events were not
being sent to the drivers. This commit generates the node discovery
events and sends them to the drivers interested in them.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This fixes logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when a formatting string
is not present.
Also fix the import name to use the original project name 'logrus'
instead of 'log'.
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
1. Base work was done by msabansal and nwoodmsft
from: https://github.com/msabansal/docker/tree/overlay
2. reorganized under drivers/windows/overlay and rebased to
libnetwork master
3. Porting overlay common fixes to windows driver
* 46f525c
* ba8714e
* 6368406
4. Windows Service Discovery changes for swarm-mode
5. Renaming the default Windows IPAM drivers to "windows"
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Signed-off-by: msabansal <sabansal@microsoft.com>
Signed-off-by: nwoodmsft <Nicholas.Wood@microsoft.com>
As part of daemon init, network and ipam drivers are passed a
pluginstore object that implements the plugin/getter interface. Use this
interface's methods in libnetwork to interact with network plugins. This
interface provides the new and improved pluginv2 functionality and falls
back to pluginv1 (legacy) if necessary.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
When dynamic networks are created and there is a race in creating the
same network from two different tasks, one of them will fail while the
other will succeed. For service tasks this is not a big problem because
they will be rescheduled. But for attachment tasks it is a problem,
since they won't get recreated and the whole connection fails. Fix this
by serializing network creation for networks with the same id and
checking whether the id is already present after coming out of the
wait.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
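A minimal sketch of the serialization described above, assuming a per-ID in-flight map guarded by a mutex; the type and method names (creationGuard, createNetwork) are illustrative, not libnetwork's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// creationGuard serializes create attempts for a given network ID so that
// concurrent attachment tasks racing on the same network do not both try
// to create it. Names here are illustrative only.
type creationGuard struct {
	mu       sync.Mutex
	inFlight map[string]chan struct{} // network ID -> done channel
	networks map[string]bool          // stand-in for the real network store
}

func newCreationGuard() *creationGuard {
	return &creationGuard{
		inFlight: make(map[string]chan struct{}),
		networks: make(map[string]bool),
	}
}

// createNetwork waits for any in-flight creation of the same ID, then
// re-checks whether the network already exists before creating it.
func (g *creationGuard) createNetwork(id string) error {
	var ch chan struct{}
	for {
		g.mu.Lock()
		if g.networks[id] {
			g.mu.Unlock()
			return nil // someone else created it while we waited
		}
		if waitCh, busy := g.inFlight[id]; busy {
			g.mu.Unlock()
			<-waitCh // wait for the other creator, then loop and re-check
			continue
		}
		ch = make(chan struct{})
		g.inFlight[id] = ch
		g.mu.Unlock()
		break
	}

	// ... the actual network creation would happen here, outside the lock ...

	g.mu.Lock()
	g.networks[id] = true
	close(ch) // wake any tasks waiting on this ID
	delete(g.inFlight, id)
	g.mu.Unlock()
	return nil
}

func main() {
	g := newCreationGuard()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); _ = g.createNetwork("net1") }()
	}
	wg.Wait()
	fmt.Println("created:", g.networks["net1"])
}
```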
This also allows published services to be accessible from containers on
bridge networks on the host.
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
Avoid this by reinitializing the channel immediately after closing the
channel within a lock. Also change the wait code to cache the channel
on the stack, by retrieving it from the controller, and wait on the
stack copy of the channel.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
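A sketch of the close-and-reinitialize pattern this refers to, with illustrative names (controller, signal, wait) rather than the real ones:

```go
package main

import (
	"sync"
	"time"
)

// controller: the done channel is closed and immediately re-created under
// the lock, and waiters cache a copy of the channel on their stack before
// blocking on it.
type controller struct {
	mu   sync.Mutex
	done chan struct{}
}

func newController() *controller {
	return &controller{done: make(chan struct{})}
}

// signal closes the current channel to wake waiters and re-initializes it
// in the same critical section, so later waiters never see a closed channel.
func (c *controller) signal() {
	c.mu.Lock()
	close(c.done)
	c.done = make(chan struct{})
	c.mu.Unlock()
}

// wait caches the channel on the stack while holding the lock, then blocks
// on that copy; a concurrent signal/re-init cannot swap it underneath us.
func (c *controller) wait() {
	c.mu.Lock()
	ch := c.done
	c.mu.Unlock()
	<-ch
}

func main() {
	c := newController()
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); c.wait() }()
	}
	done := make(chan struct{})
	go func() { wg.Wait(); close(done) }()
	for {
		select {
		case <-done:
			return
		default:
			c.signal() // wake whoever is blocked on the current channel
			time.Sleep(time.Millisecond)
		}
	}
}
```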
When leaving the entire gossip cluster or when leaving a network
specific gossip cluster, we may not have had a chance to clean up
service bindings by way of gossip updates, due to premature closure of
the gossip channel. Make sure to clean up all service bindings since we are not
participating in the cluster any more.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When a node leaves the swarm cluster, we should cleanup the ingress
network and sandbox. This makes sure that the next time the node
joins the swarm it will be able to update the cluster with the right
information.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
While trying to update loadbalancer state, index the service on both id
and portconfig. From the libnetwork point of view a service is not just
defined by its id but also by the ports it exposes. When a service
updates its ports, its id remains the same but its portconfigs change;
this should be treated as a new service in libnetwork in order to
ensure proper cleanup of the old LB state and creation of the new LB
state.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
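One way to key LB state by both the service id and its port configuration is to hash them together; the types below are simplified stand-ins for the swarm/libnetwork structures, not the actual ones:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// portConfig is a simplified stand-in for the swarm port configuration.
type portConfig struct {
	Protocol      string
	TargetPort    uint32
	PublishedPort uint32
}

// serviceKey derives a load-balancer key from both the service ID and its
// port configuration, so a port update on the same service ID is treated
// as a different service for LB bookkeeping.
func serviceKey(serviceID string, ports []portConfig) string {
	sorted := append([]portConfig(nil), ports...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].PublishedPort < sorted[j].PublishedPort
	})
	h := sha256.New()
	fmt.Fprint(h, serviceID)
	for _, p := range sorted {
		fmt.Fprintf(h, "|%s:%d:%d", p.Protocol, p.TargetPort, p.PublishedPort)
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	old := serviceKey("svc1", []portConfig{{"tcp", 80, 8080}})
	updated := serviceKey("svc1", []portConfig{{"tcp", 80, 9090}})
	fmt.Println(old != updated) // true: same id, different ports -> new LB state
}
```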
When adding a loadbalancer to a sandbox, the sandbox may have a valid
namespace but it might not have populated all the dependent network
resources yet. In that case do not populate that endpoint's loadbalancer
into that sandbox yet. The loadbalancer will be populated into the
sandbox when it is done populating all the dependent network resources.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If the IPAM pools are not reserved before resource cleanup happens then
the resource release will not happen correctly.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When leaving a cluster the agentInitDone should be re-initialized so
that it is usable when a new cluster is initialized.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
An agent initialization wait method is added so that callers of
controller methods which depend on agent initialization being complete
can wait on it.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
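A rough sketch of such a wait helper built on a channel, tying in the re-initialization mentioned in the previous commit; all names are illustrative and not libnetwork's exact ones:

```go
package main

import "sync"

// controller sketches a "wait for agent init" helper: callers that need the
// agent block on a channel that is closed once the agent is initialized.
type controller struct {
	mu            sync.Mutex
	agentInitDone chan struct{}
}

func newController() *controller {
	return &controller{agentInitDone: make(chan struct{})}
}

// agentInitWait blocks until agent initialization completes.
func (c *controller) agentInitWait() {
	c.mu.Lock()
	ch := c.agentInitDone
	c.mu.Unlock()
	if ch != nil {
		<-ch
	}
}

// agentInitComplete is called when the agent finishes initializing.
func (c *controller) agentInitComplete() {
	c.mu.Lock()
	if c.agentInitDone != nil {
		close(c.agentInitDone)
		c.agentInitDone = nil
	}
	c.mu.Unlock()
}

// clusterLeave re-creates the channel so a later cluster init is waitable
// again, matching the re-initialization described in the previous commit.
func (c *controller) clusterLeave() {
	c.mu.Lock()
	c.agentInitDone = make(chan struct{})
	c.mu.Unlock()
}

func main() {
	c := newController()
	go c.agentInitComplete()
	c.agentInitWait()
	c.clusterLeave()
}
```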
Ingress load balancing is achieved via a service sandbox which acts as
the proxy to translate incoming node port requests and map them to a
service entry. Once the right service is identified, the same internal
loadbalancer implementation is used to load balance to the right backend
instance.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Also restore the older behavior where the overlap check is not run
  when a preferred pool is specified; it got broken by recent changes.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Add a notion of service in libnetwork so that a group of endpoints
which form a service can be treated as such, and service-level
features can be added on top. Initially, as part of this PR, support
is added for assigning a name to the said service, so that DNS
queries for the service name return all the IPs of the backing
endpoints and DNS RR behavior on the service name can be achieved.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
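A toy illustration of the idea (not the actual libnetwork resolver): a service name mapped to several backing endpoint IPs, where a resolver can return the whole set or rotate through it for RR behavior:

```go
package main

import (
	"fmt"
	"net"
	"sync/atomic"
)

// service is a simplified sketch: a name backed by multiple endpoint IPs.
type service struct {
	name string
	ips  []net.IP
	next uint64
}

// lookupAll returns every backing IP, which is roughly what answering a DNS
// query for the service name with all A records amounts to.
func (s *service) lookupAll() []net.IP {
	return append([]net.IP(nil), s.ips...)
}

// pick rotates through the backends, illustrating round-robin selection.
func (s *service) pick() net.IP {
	n := atomic.AddUint64(&s.next, 1)
	return s.ips[int(n-1)%len(s.ips)]
}

func main() {
	s := &service{name: "web", ips: []net.IP{
		net.ParseIP("10.0.0.2"), net.ParseIP("10.0.0.3"),
	}}
	fmt.Println(s.lookupAll())
	fmt.Println(s.pick(), s.pick(), s.pick())
}
```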
libnetwork agent mode is a mode where libnetwork can act as a local
agent for network and discovery plumbing alone while the state
management is done elsewhere. This completes the support for making
libnetwork and its associated drivers completely independent of a
k/v store (if needed) and able to work purely based on the state
information passed along by some external controller or manager. This does not
mean that libnetwork support for decentralized state management via a
k/v store is removed.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
With the introduction of driver-generic gossip in libnetwork it is no
longer necessary for drivers to run their own gossip protocol (as the
overlay driver currently does); they can instead rely on the gossip
instance run centrally in libnetwork. In order to achieve this, certain
enhancements to the driver api are needed. This api aims to provide
those enhancements.
The new api provides a way for drivers to register interest in table
names of their choice, by returning a list of table names of interest
in the response to CreateNetwork. By doing that they will get notified,
via the newly added EventNotify call, whenever a CRUD operation happens
on the tables of their interest.
Drivers themselves can add entries to any table during a Join call by
invoking the AddTableEntry method any number of times during the Join
call. These entries' lifetime is the same as the endpoint's. As soon
as the container leaves the endpoint, the entries added by the driver
during that endpoint's Join call will be automatically removed by
libnetwork. This action may trigger a deletion notification to all
driver instances in the cluster which have registered interest in that
table.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
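An illustrative reduction of the driver api additions described above, mirroring only the commit's description (tables of interest returned from CreateNetwork, EventNotify for CRUD events, AddTableEntry during Join); the real driverapi signatures and types differ:

```go
package main

// EventType identifies the kind of CRUD operation on a table entry.
type EventType int

const (
	Create EventType = iota
	Update
	Delete
)

// JoinInfo is what the driver uses during Join to publish table entries
// whose lifetime is tied to the endpoint.
type JoinInfo interface {
	AddTableEntry(tableName, key string, value []byte) error
}

// Driver captures the two sides of the contract: CreateNetwork returns the
// table names the driver is interested in, and EventNotify delivers CRUD
// events on those tables from other nodes in the cluster.
type Driver interface {
	CreateNetwork(nid string, options map[string]interface{}) (tablesOfInterest []string, err error)
	EventNotify(event EventType, nid, tableName, key string, value []byte)
	Join(nid, eid string, info JoinInfo) error
}

func main() {}
```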
Currently the libnetwork function `NewNetwork` does not allow the
caller to pass a network ID; it is always generated internally.
This is sufficient for engine use, but it doesn't satisfy the needs
of libnetwork being used as an independent library in programs other
than the engine. This enhancement is one of the many needed to
facilitate a generic libnetwork.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently driver management logic is tightly coupled with
libnetwork package and that makes it very difficult to
modularize it and use it separately. This PR modularizes
the driver management logic by creating a driver registry
package.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
With the current implementation, a config reload event causes all the
datastores to reinitialize, and that impacts objects with Persist=false
such as the none and host networks.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- ... on ungraceful shutdown during network create
- Allow forceful deletion of network
- On network delete, first mark the network for deletion
- On controller creation, first forcibly remove any network
  that is marked for deletion.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Move DiscoverNew() and DiscoverDelete() methods into the new interface
- Add DatastoreUpdate notification
- Now this interface can be implemented by any drivers, not only network drivers
Signed-off-by: Alessandro Boch <aboch@docker.com>
Stale sandboxes and endpoints are cleaned up during controller init.
Since we reuse the exact same code path for sandbox and endpoint
delete, the cleanup tries to load the plugin, and this causes daemon
startup timeouts since the external plugin containers can't be loaded
at that time. Since the cleanup is actually performed on the libnetwork
core states, we can force-delete the sandbox and endpoint even if the
driver is not loaded.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- So that a DHCP-based plugin can express that it needs
  the endpoint MAC address when requested for an IP address.
- In such a case libnetwork will allocate one if not already
  provided by the user.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Remove from contract predefined errors which are no longer
valid (ex. ErrInvalidIpamService, ErrInvalidIpamConfigService)
- Do not use network driver error for ipam load failure in controller.go
- Bitseq to expose two well-known errors (no more bit available, bit is already set)
- Default ipam to report proper well-known error on RequestAddress()
based on bitseq returned error
- Default ipam errors to comply with types error interface
Signed-off-by: Alessandro Boch <aboch@docker.com>
At times, when a checkpointed sandbox from the store cannot be
cleaned up properly, we still retain the sandbox both in
the store and in memory. But this stored sandbox may not
contain important configuration information from docker.
So when docker requests a new sandbox, instead of using
the stored one as is, reconcile the sandbox state from the store with
the configuration information provided by docker. To do this,
mark the sandbox from the store as a stub and never reveal it to
external searches. When docker requests a new sandbox, update
the stub sandbox and clear the stub flag.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
There is a race in the os sandbox sharing code where two containers
sharing the os sandbox may both try to recreate it, which can result
in the os sandbox being destroyed and recreated. Since os sandbox
sharing happens only for the default sandbox, refactor the code to
create the os sandbox inside a `sync.Once` so that it happens exactly
once and gets reused by other containers. Also disable deleting this
os sandbox.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
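A compact sketch of the `sync.Once` approach for the shared default os sandbox; the path and names are made up for illustration:

```go
package main

import (
	"fmt"
	"sync"
)

// osSandbox stands in for the OS-level network namespace wrapper.
type osSandbox struct{ path string }

var (
	defaultOnce    sync.Once
	defaultSandbox *osSandbox
	defaultErr     error
)

// getDefaultOSSandbox creates the shared sandbox on first use only,
// eliminating the create/destroy race between two sharing containers.
// It is never deleted, matching the commit above.
func getDefaultOSSandbox() (*osSandbox, error) {
	defaultOnce.Do(func() {
		// Real code would create the network namespace here.
		defaultSandbox = &osSandbox{path: "/var/run/docker/netns/default"}
	})
	return defaultSandbox, defaultErr
}

func main() {
	s1, _ := getDefaultOSSandbox()
	s2, _ := getDefaultOSSandbox()
	fmt.Println(s1 == s2) // true: both containers reuse the same sandbox
}
```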
Since we share the host sandbox with many containers we
need to serialize creation of the sandbox. Otherwise
container starts may see the namespace path in inconsistent
state.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
On bootup, clean up all dangling local
endpoints since they are not needed anymore.
The only way they can occur is when the process
went down ungracefully after an endpoint was
created but before the join was successful.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently the endpoint count is maintained as part of the
network object, and the endpoint count gets updated
frequently while the rest of the network is quite stable.
Because of the frequent updates to the endpoint count, the
network object gets marshalled and unmarshalled
frequently. This causes a lot of churn and transient
memory usage. Fix this by creating a separate object for the
endpoint count so that only that object gets updated.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
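In sketch form the split looks roughly like this: the stable network object and the frequently updated endpoint count live as separate store objects, so bumping the count only marshals the small counter. Field and type names are illustrative, not libnetwork's exact ones:

```go
package main

import "encoding/json"

// network is the stable object; it is marshalled rarely.
type network struct {
	ID      string            `json:"id"`
	Name    string            `json:"name"`
	Options map[string]string `json:"options"`
}

// endpointCnt is the frequently updated object, stored separately.
type endpointCnt struct {
	NetworkID string `json:"network_id"`
	Count     int    `json:"count"`
}

// incEndpointCount marshals only the counter object for the store update.
func incEndpointCount(c *endpointCnt) ([]byte, error) {
	c.Count++
	return json.Marshal(c)
}

func main() {
	n := network{ID: "n1", Name: "ingress"}
	c := &endpointCnt{NetworkID: n.ID}
	if _, err := incEndpointCount(c); err != nil {
		panic(err)
	}
}
```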
Currently when docker exits ungracefully it may leave
dangling sandboxes which may hold onto precious network
resources. Added checkpoint state for sandboxes which
on boot up will be used to clean up the sandboxes and
network resources.
On bootup the remaining dangling state in the checkpoint
is read and cleaned up before accepting any new
network allocation requests.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Always-on watching of networks and endpoints can
affect scalability of the cluster beyond a few nodes.
Remove proactive watching and watch only the objects
you are interested in.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
* Integrated the hostdiscovery package with the new Docker Discovery
* Integrated the hostdiscovery package with libnetwork core
* Removed the libnetwork_discovery tag
* Introduced driver apis for discovery events
* Moved the overlay driver to make use of the discovery events
* Using the Docker Discovery service
* Changed integration-tests to make use of the new discovery
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- DisableBridgeCreation is misleading; change it to DefaultBridge
- Don't fail the init if the localstore cannot be initialized
- Added a convenience function to get the endpoint for a container
Signed-off-by: Madhu Venugopal <madhu@docker.com>
1. Don't save localscope endpoints to localstore for now.
2. Add common function updateToStore/deleteFromStore to store KVObjects.
3. Merge `getNetworksFromGlobalStore` and `getNetworksFromLocalStore`
4. Add `n.isGlobalScoped` before `n.watchEndpoints` in `addNetwork`
5. Fix integration-tests
6. Fix test failure in drivers/remote/driver_test.go
7. Restore the network to the store if deleteNetwork failed
Currently the driver configuration is pushed through a separate
api. This makes driver configuration possible at any arbitrary
time, which unnecessarily complicates the driver implementation.
More importantly, the driver does not get access to its
configuration before it does the handshake with libnetwork.
This makes the internal drivers a little different from
external plugins, which can get their configuration before the
handshake with libnetwork.
This PR attempts to fix that mismatch between internal drivers and
external plugins.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Convenience API which detaches the sandbox from
  all endpoints, resets and reapplies config options,
  sets up discovery files, and reattaches to the endpoints.
  No change to the osl sandbox in use.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Maps 1 to 1 with the container's networking stack
- It holds the container-specific nw options which
  before were incorrectly owned by Endpoint.
- Sandbox creation is no longer coupled with Endpoint Join;
  sandbox and endpoint now have separate lifecycles.
- LeaveAll naturally replaced by Sandbox.Delete
- Some pkg and file renaming in order to have a clear
  mapping between structure name and entity ("sandbox")
- Revisited hosts and resolv.conf handling
- Removed from JoinInfo interface capability of setting hosts and resolv.conf paths
- Changed etchosts.Build() to first write the search domains and then the nameservers
Signed-off-by: Alessandro Boch <aboch@docker.com>
In that commit, AtomicPutCreate takes previous = nil to atomically create keys
that don't exist. We need a create operation that is atomic in order to prevent
races between multiple libnetworks creating the same object.
Previously, we just created new KVs with an index of 0 and wrote them to the
datastore. Consul accepts this behaviour and interprets an index of 0 as
non-existing, but other data backends do not.
- Add Exists() to the KV interface. SetIndex() should also modify a KV so
that it exists.
- Call SetIndex() from within the GetObject() method on DataStore interface.
- This ensures objects have the updated values for exists and index.
- Add SetValue() to the KV interface. This allows implementers to define
  their own methods to marshal and unmarshal (as bitseq and allocator do).
- Update existing users of the DataStore (endpoint, network, bitseq,
allocator, ov_network) to new interfaces.
- Fix UTs.
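An illustrative reduction of the KV object contract after these additions; the real libnetwork datastore interfaces are larger and differ in detail:

```go
package main

// KVObject is a minimal sketch of the stored-object contract described
// above: Exists and SetIndex so backends that do not treat index 0 as
// "new" still work, and SetValue so types like bitseq and the allocator
// can do their own (un)marshalling.
type KVObject interface {
	Key() []string
	Value() []byte
	SetValue(b []byte) error // custom unmarshalling hook
	Index() uint64
	SetIndex(idx uint64) // also marks the object as existing
	Exists() bool
}

// baseObject shows a typical embedding that tracks index and existence.
type baseObject struct {
	dbIndex  uint64
	dbExists bool
}

func (b *baseObject) Index() uint64 { return b.dbIndex }

// SetIndex records the store index and flips the existence flag, which is
// what GetObject would do after a successful read.
func (b *baseObject) SetIndex(idx uint64) {
	b.dbIndex = idx
	b.dbExists = true
}

func (b *baseObject) Exists() bool { return b.dbExists }

func main() {
	var b baseObject
	b.SetIndex(7)
	_ = b.Exists()
}
```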
Currently a container can join one endpoint when it is started.
More endpoints can be attached at a later point in time. But
when that happens, the attachment should only have meaning
as long as the container is alive; the attachment should
lose its meaning when the container goes away. Currently there
is no way for the container management code to tell libnetwork
to detach the container from all attached endpoints. This PR
provides an additional API, `LeaveAll`, which adds this
functionality.
To facilitate this and make the sandbox lifecycle consistent,
some slight changes have been made to the behavior of the sandbox
management code. The sandbox is no longer destroyed when the
last endpoint is detached from the container. Instead the sandbox
is kept alive and can only be destroyed with a `LeaveAll` call.
This gives the container management code better control of the
sandbox lifecycle, and the sandbox doesn't get destroyed out from
under the container while it is still using it.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Add minimal service discovery support using service names or
service names qualified with the network name. This is achieved
by populating the container's /etc/hosts file with the
appropriate entries.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Add support for in-tree driver configuration labels
by pushing the labels which have the appropriate
prefix to the corresponding driver.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
1. Replaced the --net option for service UI with the SERVICE.[NETWORK] format
2. Making use of the default network/driver backend support
3. NetworkName and NetworkType from the UI/API can be empty strings,
   in which case they will be replaced with DefaultNetwork and DefaultDriver
As per the design goals, we wanted to keep libnetwork core free of
handling defaults. Rather, the clients (docker & dnet) must handle the
defaultness of these entities.
Also, since there is no API to get these default values from the
backend, the UI will not handle the default values either. Hence,
handling this specific case falls under the responsibility of the API layer.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Since the Config is a read-only entity, the Config() method returned a
value instead of a pointer. In case the config is nil, we should return
an empty config.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
The configuration format for the docker runtime is based on daemon
flags, hence the libnetwork configuration is adjusted to accommodate it
by moving the TOML-based configuration to the dnet tool.
Also changed the controller configuration to be passed via options.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Currently the store makes use of a static isReservedNetwork check to
decide whether a network needs to be stored in the distributed store or
not. But it is better if the check is not static and is instead
determined by the capability of the driver that backs the network.
Hence introduce a new capability mechanism through which a driver can
express its capabilities during registration. Making use of the first
such capability: Scope. This can be expanded in the future for more such cases.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
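A sketch of how such a capability-based decision could look; the actual libnetwork types and registration flow differ, and the names here are illustrative:

```go
package main

import "fmt"

// Capability is what a driver declares when it registers; here only the
// scope is modeled, as in the commit above.
type Capability struct {
	Scope string // "local" or "global"
}

// driverTable records each registered driver together with its capability.
type driverTable map[string]Capability

// registerDriver stores the driver's declared capability at registration.
func (t driverTable) registerDriver(name string, c Capability) {
	t[name] = c
}

// isDistributed replaces the old static reserved-network check: the answer
// now comes from the capability of the driver backing the network.
func (t driverTable) isDistributed(driverName string) bool {
	return t[driverName].Scope == "global"
}

func main() {
	t := driverTable{}
	t.registerDriver("bridge", Capability{Scope: "local"})
	t.registerDriver("overlay", Capability{Scope: "global"})
	fmt.Println(t.isDistributed("overlay")) // true: store in the distributed store
}
```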
* Removed network from being marshalled (it is part of the key anyways)
* Reworked the watch function to handle container-id on endpoints
* Included ContainerInfo to be marshalled which needs to be synchronized
* Resolved multiple race issues by introducing data locks
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Right now the namespace paths are cleaned up every
garbage collection period. But if the daemon is restarted
before all the namespace paths of removed containers are
garbage collected they will remain there forever. The fix
is to provide a GC() api so that garbage collection can be
triggered immediately.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When an endpoint is joined by a container it may
optionally pass a priority to resolve resource
conflicts inside the sandbox when more than one
endpoint provides the same kind of resource. If the
priority is the same for two endpoints with
conflicting resources then the endpoint network names
are used to resolve the conflict.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Now that Endpoint interface has the Info method there is no
need to return container data as a return value in the Join
method. Removed the return value and fixed all the callers.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>