This PR adds support for load balancing across a group of endpoints
that share the same service configuration, as passed in by
`OptionService`. The load balancer is implemented using IPVS, with
only round-robin scheduling supported for now.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
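For illustration, here is a minimal sketch of how an IPVS round-robin
virtual service can be programmed with the `ipvs` package; the VIP,
port, and backend IPs are hypothetical, and this is not the PR's
actual wiring:

```go
package main

import (
	"log"
	"net"
	"syscall"

	"github.com/docker/libnetwork/ipvs"
)

func main() {
	// Open a handle to IPVS in the current network namespace.
	h, err := ipvs.New("")
	if err != nil {
		log.Fatal(err)
	}

	// One virtual service per load-balanced service, scheduled
	// round-robin ("rr"), the only scheduler supported for now.
	svc := &ipvs.Service{
		AddressFamily: syscall.AF_INET,
		Protocol:      syscall.IPPROTO_TCP,
		Address:       net.ParseIP("10.0.0.10"), // hypothetical service VIP
		Port:          80,
		SchedName:     ipvs.RoundRobin,
	}
	if err := h.NewService(svc); err != nil {
		log.Fatal(err)
	}

	// Each backing endpoint becomes an IPVS destination.
	for _, ip := range []string{"10.0.0.2", "10.0.0.3"} {
		dst := &ipvs.Destination{
			AddressFamily: syscall.AF_INET,
			Address:       net.ParseIP(ip),
			Port:          80,
			Weight:        1,
		}
		if err := h.NewDestination(svc, dst); err != nil {
			log.Fatal(err)
		}
	}
}
```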
Add a notion of service in libnetwork, so that a group of endpoints
which form a service can be treated as one and service-level features
can be added on top. As a first step, this PR adds support for
assigning a name to such a service, so that DNS queries for the
service name return the IPs of all backing endpoints and DNS RR
behavior on the service name is achieved.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
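From a container's point of view the behavior looks like the
following sketch; `mysvc` is a hypothetical service name:

```go
package main

import (
	"fmt"
	"log"
	"net"
)

func main() {
	// Resolving the service name returns the IPs of all backing
	// endpoints, so clients get DNS round-robin behavior for free.
	ips, err := net.LookupIP("mysvc") // hypothetical service name
	if err != nil {
		log.Fatal(err)
	}
	for _, ip := range ips {
		fmt.Println(ip.String())
	}
}
```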
Currently ipam/ipamutils has a number of dependencies
on osl and netlink, which makes ipam/ipamutils harder
to use independently in other applications. This PR
modularizes ipam/ipamutils into a standalone package
with no OS-level dependencies.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- On sandbox delete, the leave and delete of each
endpoint is performed, regardless of whether the endpoint
is the gw network endpoint. This endpoint is already
removed automatically in endpoint.sbLeave() by
sb.clearDefaultGW() when the sandbox is marked for
deletion.
- Also restoring the original behavior where disconnecting
from an overlay network (when it is the only connected network)
also disconnects from the default gw network.
- Also do not let an internal network dictate that the container
does not need external connectivity. Before this fix, if a container
was connected to an overlay and an internal network, it might not
get attached to the default gw network.
- needDefaultGw() now takes into account whether the sandbox
is marked for deletion (see the sketch after this message)
Signed-off-by: Alessandro Boch <aboch@docker.com>
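A simplified sketch of the adjusted decision, with made-up type and
field names standing in for libnetwork's actual sandbox, endpoint,
and network types:

```go
package main

import "fmt"

// Illustrative stand-ins for libnetwork's types.
type network struct {
	internal      bool
	needsExternal bool // e.g. overlay networks need the default gw
}

type endpoint struct{ nw *network }

type sandbox struct {
	inDelete  bool
	endpoints []*endpoint
}

func (sb *sandbox) needDefaultGW() bool {
	// A sandbox marked for deletion never needs the gw network.
	if sb.inDelete {
		return false
	}
	for _, ep := range sb.endpoints {
		// An internal network is skipped, but it can no longer
		// veto connectivity that another network requires.
		if ep.nw.internal {
			continue
		}
		if ep.nw.needsExternal {
			return true
		}
	}
	return false
}

func main() {
	ov := &network{needsExternal: true} // an overlay network
	in := &network{internal: true}      // an internal network
	sb := &sandbox{endpoints: []*endpoint{{nw: ov}, {nw: in}}}
	fmt.Println(sb.needDefaultGW()) // true: internal net no longer vetoes
}
```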
- Attempt the veth delete only after both ends
are moved into the default network namespace,
which is after both driver.Leave() and
sandbox.clearNetworkResources() are called.
Signed-off-by: Alessandro Boch <aboch@docker.com>
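A minimal sketch of that ordering with the vishvananda/netlink
package; `veth0` is a hypothetical interface name, and the real code
path differs:

```go
package main

import (
	"log"

	"github.com/vishvananda/netlink"
)

// deleteVeth must only run after driver.Leave() and
// sandbox.clearNetworkResources() have moved both veth ends back
// into the default network namespace; deleting earlier would race
// with the move of the peer end.
func deleteVeth(name string) error {
	link, err := netlink.LinkByName(name)
	if err != nil {
		return err
	}
	return netlink.LinkDel(link)
}

func main() {
	if err := deleteVeth("veth0"); err != nil {
		log.Println("veth cleanup:", err)
	}
}
```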
By removing the need to clear the default gateway during sbJoin and
sbLeave to account for another bridge network, the default-gw endpoint
will stay with the container, which also helps retain the container's
properties.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Stale sandboxes and endpoints are cleaned up during controller init.
Since we reuse the exact same code path for sandbox and endpoint
delete, the cleanup tries to load the plugin, and this causes daemon
startup timeouts since the external plugin containers can't be loaded
at that time. Since the cleanup only acts on libnetwork core state,
we can force-delete sandboxes and endpoints even if the driver is not
loaded.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
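The control flow is roughly the following sketch; the names and
signatures are hypothetical, not libnetwork's actual ones:

```go
package cleanup

import "errors"

var errDriverNotFound = errors.New("driver plugin not loaded")

// Illustrative stand-in for an endpoint whose driver plugin may not
// be reachable during controller init.
type endpoint struct{ driverLoaded bool }

func (ep *endpoint) driverLeave() error {
	if !ep.driverLoaded {
		return errDriverNotFound
	}
	return nil
}

// delete removes libnetwork core state. With force set, as during
// the controller-init cleanup, a missing plugin no longer aborts
// the deletion (and no longer stalls daemon startup).
func (ep *endpoint) delete(force bool) error {
	if err := ep.driverLeave(); err != nil && !force {
		return err
	}
	// Fall through: remove the endpoint from the store even when
	// the driver could not be reached.
	return nil
}
```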
- Consistent with what it does for IP addresses, libnetwork
will also program the container interface's MAC address with
the value set by the network driver in InterfaceInfo.
Signed-off-by: Alessandro Boch <aboch@docker.com>
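A minimal sketch of programming the MAC with the vishvananda/netlink
package; the interface name and address are hypothetical:

```go
package main

import (
	"log"
	"net"

	"github.com/vishvananda/netlink"
)

func main() {
	// The value the network driver set in InterfaceInfo.
	mac, err := net.ParseMAC("02:42:ac:11:00:02")
	if err != nil {
		log.Fatal(err)
	}
	link, err := netlink.LinkByName("eth0")
	if err != nil {
		log.Fatal(err)
	}
	// Mirror what libnetwork already does for IP addresses.
	if err := netlink.LinkSetHardwareAddr(link, mac); err != nil {
		log.Fatal(err)
	}
}
```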
The first issue is an ordering problem: the sandbox-attached
version of the endpoint object should be pushed
to the watch database first, so that any other endpoint create
which is in progress can make use of it immediately to update
its container's hosts file. Only after that should the current
container retrieve the service records from the
service database and update its own hosts file. With the previous
order there is a small time window in which another endpoint create
will find this endpoint while it doesn't yet have the sandbox
context, even though the svc record population from the svc db has
already happened, so that container will entirely miss the service
record of the newly created endpoint.

The second issue is rebuilding the /etc/hosts file from scratch
during endpoint join; this may sometimes happen after the service
record for another endpoint has already been added to the container's
file. Obviously this rebuild wipes out the service record which
was just added. Removed the rebuilding of the /etc/hosts file during
endpoint join. The initial population of /etc/hosts should only
happen at sandbox creation time. During endpoint join, only the
backward-compatible self ip -> hostname entry is added, as just
another record.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
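The join path now reduces to appending a single record, roughly like
this sketch (path and values are illustrative):

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// addHostsRecord appends one record instead of regenerating the
// whole file, so records added concurrently for other endpoints
// are not wiped out.
func addHostsRecord(path, ip, hostname string) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = fmt.Fprintf(f, "%s\t%s\n", ip, hostname)
	return err
}

func main() {
	// The backward-compatible self ip -> hostname entry.
	if err := addHostsRecord("/etc/hosts", "172.17.0.2", "mycontainer"); err != nil {
		log.Fatal(err)
	}
}
```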
It is sufficient to check only whether the network is available
in the store to decide whether to retain a
stale sandbox. If the endpoints are not available, then
there is no point in retaining the sandbox anyway. This
fixes some extreme corner cases where the daemon goes down
right in the middle of sandbox cleanup.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
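The retention check then reduces to something like this sketch, with
hypothetical names for the store and checkpoint types:

```go
package cleanup

// Illustrative stand-ins for the libnetwork store and the
// checkpointed sandbox state.
type store interface {
	networkExists(id string) bool
}

type sandboxCheckpoint struct {
	networkIDs []string // networks the stale sandbox was attached to
}

// retain keeps the checkpointed sandbox only if at least one of its
// networks is still present in the store; checking the endpoints as
// well adds nothing, since without them the sandbox is useless anyway.
func retain(s store, sb *sandboxCheckpoint) bool {
	for _, nid := range sb.networkIDs {
		if s.networkExists(nid) {
			return true
		}
	}
	return false
}
```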
At times, when a checkpointed sandbox from the store cannot be
cleaned up properly, we still retain the sandbox both in
the store and in memory. But this stored sandbox may not
contain important configuration information from docker.
So when docker requests a new sandbox, instead of using
it as is, reconcile the sandbox state from the store with
the configuration information provided by docker. To do this,
mark the sandbox from the store as a stub and never reveal it to
external searches. When docker requests a new sandbox, update
the stub sandbox and clear the stub flag.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
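A rough sketch of the stub-and-reconcile flow, with hypothetical
field and function names:

```go
package sandbox

// Sandbox is an illustrative stand-in for libnetwork's sandbox.
type Sandbox struct {
	id     string
	isStub bool // checkpointed state that must not be exposed
	config map[string]string
}

var sandboxes = map[string]*Sandbox{}

// lookup hides stub sandboxes from external searches.
func lookup(id string) *Sandbox {
	if sb, ok := sandboxes[id]; ok && !sb.isStub {
		return sb
	}
	return nil
}

// newSandbox reconciles a checkpointed stub with the fresh
// configuration docker provides, then clears the stub flag.
func newSandbox(id string, config map[string]string) *Sandbox {
	sb, ok := sandboxes[id]
	if !ok {
		sb = &Sandbox{id: id}
		sandboxes[id] = sb
	}
	sb.config = config // docker's config wins over stale store state
	sb.isStub = false
	return sb
}
```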
There is a race in the OS sandbox sharing code where two containers
which are sharing the OS sandbox try to recreate it, which
might result in the OS sandbox being destroyed and recreated. Since
OS sandbox sharing happens only for the default sandbox, refactored
the code to create the OS sandbox exactly once inside a `sync.Once`
so that it gets reused by other containers. Also disabled
deleting this OS sandbox.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
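The pattern is the standard `sync.Once` idiom, sketched below with a
placeholder sandbox type:

```go
package main

import (
	"fmt"
	"sync"
)

type osSandbox struct{ key string } // placeholder for the real OS sandbox

var (
	defaultOnce    sync.Once
	defaultSandbox *osSandbox
)

// getDefaultOSSandbox creates the default OS sandbox exactly once,
// no matter how many containers join it concurrently, removing the
// destroy-and-recreate race.
func getDefaultOSSandbox() *osSandbox {
	defaultOnce.Do(func() {
		defaultSandbox = &osSandbox{key: "default"}
	})
	return defaultSandbox // shared, and never deleted
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Printf("%p\n", getDefaultOSSandbox()) // same pointer every time
		}()
	}
	wg.Wait()
}
```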
For ungraceful daemon restarts, libnetwork has sandbox cleanup logic
to remove any stale and dangling resources. But if the store is down
during the daemon restart, the cleanup logic is not able to perform a
complete cleanup; in such cases the sandbox used to be removed
anyway. With this fix, we retain the sandbox if the store is down and
the endpoint couldn't be cleaned up. When the container is later
restarted in the docker daemon, we will perform a sandbox cleanup and
that will complete the cleanup round.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Introduced a path-level lock to synchronize writes
to /etc/hosts. A path-level cache is maintained
so that synchronization happens only at the file level.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
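A minimal sketch of such a path-keyed lock cache (names are
illustrative):

```go
package main

import "sync"

var (
	cacheMu   sync.Mutex
	pathLocks = map[string]*sync.Mutex{}
)

// lockForPath returns the mutex guarding a given file path, creating
// and caching it on first use, so synchronization happens only at
// the file level.
func lockForPath(path string) *sync.Mutex {
	cacheMu.Lock()
	defer cacheMu.Unlock()
	mu, ok := pathLocks[path]
	if !ok {
		mu = &sync.Mutex{}
		pathLocks[path] = mu
	}
	return mu
}

// updateHosts serializes writers to the same hosts file while
// letting writes to different files proceed in parallel.
func updateHosts(path string, write func()) {
	mu := lockForPath(path)
	mu.Lock()
	defer mu.Unlock()
	write()
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			updateHosts("/tmp/hosts-demo", func() {}) // hypothetical path
		}()
	}
	wg.Wait()
}
```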