+# Depends on binaries because vet will silently fail if it cannot load compiled imports
+vet: ## run go vet
+ @echo "🐳 $@"
+ @test -z "$$(go vet ${PACKAGES} 2>&1 | grep -v 'constant [0-9]* not a string in call to Errorf' | egrep -v '(timestamp_test.go|duration_test.go|exit status 1)' | tee /dev/stderr)"
+Libnetwork provides a native Go implementation for connecting containers
+
+The goal of libnetwork is to deliver a robust Container Network Model that provides a consistent programming interface and the required network abstractions for applications.
+
+#### Design
+Please refer to the [design](docs/design.md) for more information.
+
+#### Using libnetwork
+
+There are many networking solutions available to suit a broad range of use-cases. libnetwork uses a driver / plugin model to support all of these solutions while abstracting the complexity of the driver implementations by exposing a simple and consistent Network Model to users.
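+
+To make the model concrete, here is a minimal Go sketch of how a consumer drives libnetwork, condensed from the upstream README sample. It is illustrative only: exact signatures (for example the network-ID argument to `NewNetwork`) vary between libnetwork versions, and error handling is reduced to panics for brevity.
+
+```go
+package main
+
+import (
+    "fmt"
+
+    "github.com/docker/docker/pkg/reexec"
+    "github.com/docker/libnetwork"
+    "github.com/docker/libnetwork/config"
+    "github.com/docker/libnetwork/netlabel"
+    "github.com/docker/libnetwork/options"
+)
+
+func main() {
+    if reexec.Init() {
+        return
+    }
+
+    // Select the built-in bridge driver and create a controller instance.
+    networkType := "bridge"
+    genericOption := make(map[string]interface{})
+    genericOption[netlabel.GenericData] = options.Generic{}
+    controller, err := libnetwork.New(config.OptionDriverConfig(networkType, genericOption))
+    if err != nil {
+        panic(err)
+    }
+
+    // Create a network bound to the chosen driver.
+    network, err := controller.NewNetwork(networkType, "network1", "")
+    if err != nil {
+        panic(err)
+    }
+
+    // Create an endpoint (service) on the network.
+    ep, err := network.CreateEndpoint("Endpoint1")
+    if err != nil {
+        panic(err)
+    }
+
+    // Create a sandbox for the container and join the endpoint to it.
+    sbx, err := controller.NewSandbox("container1",
+        libnetwork.OptionHostname("test"),
+        libnetwork.OptionDomainname("docker.io"))
+    if err != nil {
+        panic(err)
+    }
+    if err := ep.Join(sbx); err != nil {
+        panic(err)
+    }
+    fmt.Println("container1 joined network1 via Endpoint1")
+}
+```
+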
+ flag "github.com/docker/libnetwork/client/mflag"
+)
+
+var (
+ i int
+ str string
+ b, b2, h bool
+)
+
+func init() {
+ flag.Bool([]string{"#hp", "#-help"}, false, "display the help")
+ flag.BoolVar(&b, []string{"b", "#bal", "#bol", "-bal"}, false, "a simple bool")
+ flag.BoolVar(&b, []string{"g", "#gil"}, false, "a simple bool")
+ flag.BoolVar(&b2, []string{"#-bool"}, false, "a simple bool")
+ flag.IntVar(&i, []string{"-integer", "-number"}, -1, "a simple integer")
+ flag.StringVar(&str, []string{"s", "#hidden", "-string"}, "", "a simple string") //-s -hidden and --string will work, but -hidden won't be in the usage
+ flag.BoolVar(&h, []string{"h", "#help", "-help"}, false, "display the help")
+ flag.StringVar(&str, []string{"mode"}, "mode1", "set the mode\nmode1: use the mode1\nmode2: use the mode2\nmode3: use the mode3")
+ logrus.Warnf("The element with key:%s does not belong to any node on this network", v.Key)
+ orphanKeys = append(orphanKeys, v.Key)
+ }
+ if _, ok := clusterPeers[v.Owner]; !ok {
+ logrus.Warnf("The element with key:%s does not belong to any node on this cluster", v.Key)
+ }
+ }
+
+ if len(orphanKeys) > 0 && remediate {
+ logrus.Warnf("The following keys:%v results as orphan, do you want to proceed with the deletion (this operation is irreversible)? [Yes/No]", orphanKeys)
+ssd is a troubleshooting utility for Docker swarm networks.
+
+### control-plane and datapath consistency check on a node
+ssd checks for the consistency between docker network control-plane (from the docker daemon in-memory state) and kernel data path programming. Currently the tool checks only for the consistency of the Load balancer (implemented using IPVS).
+
+In a three node swarm cluster, ssd status for an overlay network `ov2` which has three services running, each replicated to 3 instances:
+
+````bash
+Verifying LB programming for containers on network ingress
+Verifying container Ingress...
+service web... OK
+````
+
+ssd checks the required iptables programming to direct an incoming packet with the `<host ip>:<published port>` to the right `<backend ip>:<target port>`.
+
+### control-plane consistency check across nodes in a cluster
+
+Docker networking uses a gossip protocol to synchronize networking state across nodes in a cluster. ssd's `gossip-consistency` command verifies if the state maintained by all the nodes is consistent.
+
+In a three node cluster with services running on an overlay network `ov2`, the `gossip-consistency` check shows a hash digest of the control-plane state for the network `ov2` from all the cluster nodes. If the values do not match, `docker network inspect --verbose` on the individual nodes can help in identifying what the specific difference is.
+This document describes how libnetwork has been designed in order to achieve this.
+Requirements for individual releases can be found on the [Project Page](https://github.com/docker/libnetwork/wiki).
+
+Many of the design decisions are inspired by the learnings from the Docker networking design as of Docker v1.6.
+Please refer to this [Docker v1.6 Design](legacy.md) document for more information on networking design as of Docker v1.6.
+
+## Goal
+
+The libnetwork project follows the Docker and Linux philosophy of developing small, highly modular and composable tools that work well independently.
+Libnetwork aims to satisfy that composable need for Networking in Containers.
+
+## The Container Network Model
+
+Libnetwork implements the Container Network Model (CNM), which formalizes the steps required to provide networking for containers while providing an abstraction that can be used to support multiple network drivers. The CNM is built on 3 main components, described below.
+
+
+
+**Sandbox**
+
+A Sandbox contains the configuration of a container's network stack.
+This includes management of the container's interfaces, routing table and DNS settings.
+An implementation of a Sandbox could be a Linux Network Namespace, a FreeBSD Jail or other similar concept.
+A Sandbox may contain *many* endpoints from *multiple* networks.
+
+**Endpoint**
+
+An Endpoint joins a Sandbox to a Network.
+An implementation of an Endpoint could be a `veth` pair, an Open vSwitch internal port or similar.
+An Endpoint can belong to only one network and it can belong to only one Sandbox, if connected.
+
+**Network**
+
+A Network is a group of Endpoints that are able to communicate with each other directly.
+An implementation of a Network could be a Linux bridge, a VLAN, etc.
+Networks consist of *many* endpoints.
+
+## CNM Objects
+
+**NetworkController**
+`NetworkController` object provides the entry-point into libnetwork that exposes simple APIs for users (such as Docker Engine) to allocate and manage Networks. libnetwork supports multiple active drivers (both inbuilt and remote). `NetworkController` allows the user to bind a particular driver to a given network.
+
+**Driver**
+`Driver` is not a user-visible object, but drivers provide the actual network implementation. `NetworkController` provides an API to configure a driver with driver-specific options/labels that are transparent to libnetwork but can be handled by the drivers directly. Drivers can be both inbuilt (such as Bridge, Host, None & overlay) and remote (from plugin providers) to satisfy various use cases & deployment scenarios. At this point, the Driver owns a network and is responsible for managing the network (including IPAM, etc.). This can be improved in the future by having multiple drivers participating in handling various network management functionalities.
+
+**Network**
+`Network` object is an implementation of the `CNM : Network` as defined above. `NetworkController` provides APIs to create and manage `Network` object. Whenever a `Network` is created or updated, the corresponding `Driver` will be notified of the event. LibNetwork treats `Network` objects at an abstract level to provide connectivity between a group of endpoints that belong to the same network and isolation from the rest. The `Driver` performs the actual work of providing the required connectivity and isolation. The connectivity can be within the same host or across multiple hosts. Hence `Network` has a global scope within a cluster.
+
+**Endpoint**
+`Endpoint` represents a Service Endpoint. It provides the connectivity for services exposed by a container in a network with other services provided by other containers in the network. `Network` object provides APIs to create and manage an endpoint. An endpoint can be attached to only one network. `Endpoint` creation calls are made to the corresponding `Driver` which is responsible for allocating resources for the corresponding `Sandbox`. Since `Endpoint` represents a Service and not necessarily a particular container, `Endpoint` has a global scope within a cluster.
+
+**Sandbox**
+`Sandbox` object represents a container's network configuration such as IP address, MAC address, routes, and DNS entries. A `Sandbox` object is created when the user requests to create an endpoint on a network. The `Driver` that handles the `Network` is responsible for allocating the required network resources (such as the IP address) and passing the info called `SandboxInfo` back to libnetwork. libnetwork will make use of OS specific constructs (example: netns for Linux) to populate the network configuration into the container that is represented by the `Sandbox`. A `Sandbox` can have multiple endpoints attached to different networks. Since `Sandbox` is associated with a particular container in a given host, it has a local scope that represents the Host that the Container belongs to.
+
+**CNM Attributes**
+
+***Options***
+`Options` provides a generic and flexible mechanism to pass `Driver` specific configuration options from the user to the `Driver` directly. `Options` are just key-value pairs of data with `key` represented by a string and `value` represented by a generic object (such as a Go `interface{}`). Libnetwork will operate on the `Options` ONLY if the `key` matches any of the well-known `Labels` defined in the `net-labels` package. `Options` also encompasses `Labels` as explained below. `Options` are generally NOT end-user visible (in UI), while `Labels` are.
+
+***Labels***
+`Labels` are very similar to `Options` and are in fact just a subset of `Options`. `Labels` are typically end-user visible and are represented in the UI explicitly using the `--labels` option. They are passed from the UI to the `Driver` so that the `Driver` can make use of them and perform any driver-specific operation (such as choosing the subnet to allocate IP addresses from in a Network).
+
+## CNM Lifecycle
+
+Consumers of the CNM, like Docker, interact through the CNM Objects and their APIs to network the containers that they manage.
+
+1. `Drivers` register with `NetworkController`. Built-in drivers register inside of libnetwork, while remote drivers register with libnetwork via the Plugin mechanism (*plugin-mechanism is WIP*). Each `driver` handles a particular `networkType`.
+
+2. `NetworkController` object is created using `libnetwork.New()` API to manage the allocation of Networks and optionally configure a `Driver` with driver specific `Options`.
+
+3. `Network` is created using the controller's `NewNetwork()` API by providing a `name` and `networkType`. `networkType` parameter helps to choose a corresponding `Driver` and binds the created `Network` to that `Driver`. From this point, any operation on `Network` will be handled by that `Driver`.
+
+4. `controller.NewNetwork()` API also takes in an optional `options` parameter which carries driver-specific options and `Labels`, which the Drivers can make use of for their purposes.
+
+5. `network.CreateEndpoint()` can be called to create a new Endpoint in a given network. This API also accepts an optional `options` parameter which drivers can make use of. These 'options' carry both well-known labels and driver-specific labels. Drivers will in turn be called with `driver.CreateEndpoint` and can choose to reserve IPv4/IPv6 addresses when an `Endpoint` is created in a `Network`. The `Driver` will assign these addresses using the `InterfaceInfo` interface defined in the `driverapi`. The IPv4/IPv6 addresses are needed to complete the endpoint as a service definition, along with the ports the endpoint exposes, since essentially a service endpoint is nothing but a network address and the port number that the application container is listening on.
+
+6. `endpoint.Join()` can be used to attach a container to an `Endpoint`. The Join operation will create a `Sandbox` if it doesn't exist already for that container. The Drivers can make use of the Sandbox Key to identify multiple endpoints attached to a same container. This API also accepts optional `options` parameter which drivers can make use of.
+ * Though it is not a direct design issue of LibNetwork, it is highly encouraged that users such as `Docker` call `endpoint.Join()` during the Container's `Start()` lifecycle, which is invoked *before* the container is made operational. As part of the Docker integration, this will be taken care of.
+ * A FAQ on the `endpoint.Join()` API is: why do we need one API to create an Endpoint and another to join it?
+   - The answer is that an Endpoint represents a Service which may or may not be backed by a Container. When an Endpoint is created, its resources are reserved so that any container can get attached to the endpoint later and get a consistent networking behaviour.
+
+7. `endpoint.Leave()` can be invoked when a container is stopped. The `Driver` can clean up the state that it allocated during the `Join()` call. LibNetwork will delete the `Sandbox` when the last referencing endpoint leaves the network. But LibNetwork keeps hold of the IP addresses as long as the endpoint is still present; they will be reused when the container (or any other container) joins again. This ensures that the endpoint's resources are reused when containers are stopped and started again.
+
+8. `endpoint.Delete()` is used to delete an endpoint from a network. This results in deleting an endpoint and cleaning up the cached `sandbox.Info`.
+
+9. `network.Delete()` is used to delete a network. LibNetwork will not allow the delete to proceed if there are any existing endpoints attached to the Network.
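+
+A hedged Go sketch of the teardown half of this lifecycle (steps 7-9); method names follow the lifecycle above, but exact signatures differ between libnetwork releases (for example, `Endpoint.Delete` takes a force flag in some versions), so treat this as illustrative.
+
+```go
+package lifecycle
+
+import "github.com/docker/libnetwork"
+
+// teardown mirrors lifecycle steps 7-9 above.
+func teardown(nw libnetwork.Network, ep libnetwork.Endpoint, sbx libnetwork.Sandbox) error {
+    // Step 7: detach the container; the driver cleans up Join-time state.
+    if err := ep.Leave(sbx); err != nil {
+        return err
+    }
+    // Step 8: delete the endpoint and clean up the cached sandbox info.
+    if err := ep.Delete(false); err != nil { // force flag is version-dependent
+        return err
+    }
+    // Step 9: delete the network; this fails while endpoints are still attached.
+    return nw.Delete()
+}
+```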
+
+
+## Implementation Details
+
+### Networks & Endpoints
+
+LibNetwork's Network and Endpoint APIs are primarily for managing the corresponding Objects and book-keeping them to provide a level of abstraction as required by the CNM. It delegates the actual implementation to the drivers which realize the functionality as promised in the CNM. For more information on these details, please see [the drivers section](#drivers)
+
+### Sandbox
+
+Libnetwork provides a framework to implement a Sandbox on multiple operating systems. Currently we have implemented a Sandbox for Linux using `namespace_linux.go` and `configure_linux.go` in the `sandbox` package.
+This creates a Network Namespace for each sandbox which is uniquely identified by a path on the host filesystem.
+Netlink calls are used to move interfaces from the global namespace to the Sandbox namespace.
+Netlink is also used to manage the routing table in the namespace.
+
+## Drivers
+
+### API
+
+Drivers are essentially an extension of libnetwork and provide the actual implementation for all of the LibNetwork APIs defined above. Hence there is a 1-1 correspondence for all the `Network` and `Endpoint` APIs, which includes:
+* `driver.Config`
+* `driver.CreateNetwork`
+* `driver.DeleteNetwork`
+* `driver.CreateEndpoint`
+* `driver.DeleteEndpoint`
+* `driver.Join`
+* `driver.Leave`
+
+These Driver facing APIs make use of unique identifiers (`networkid`,`endpointid`,...) instead of names (as seen in user-facing APIs).
+
+The APIs are still work in progress and there can be changes to these based on the driver requirements especially when it comes to Multi-host networking.
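+
+For orientation, here is a deliberately simplified Go sketch of that driver-facing surface. It is not the real `driverapi.Driver` interface (which carries richer parameter types such as interface and join info); only the identifiers and options discussed in this document are shown.
+
+```go
+package drivers
+
+// Driver is a reduced sketch of the driver-facing APIs listed above.
+// Parameters are limited to the unique identifiers mentioned in this
+// document plus a generic options map.
+type Driver interface {
+    Config(options map[string]interface{}) error
+    CreateNetwork(networkid string, options map[string]interface{}) error
+    DeleteNetwork(networkid string) error
+    CreateEndpoint(networkid, endpointid string, options map[string]interface{}) error
+    DeleteEndpoint(networkid, endpointid string) error
+    Join(networkid, endpointid, sandboxKey string, options map[string]interface{}) error
+    Leave(networkid, endpointid string) error
+}
+```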
+
+### Driver semantics
+
+ * `Driver.CreateEndpoint`
+
+This method is passed an interface `EndpointInfo`, with methods `Interface` and `AddInterface`.
+
+If the value returned by `Interface` is non-nil, the driver is expected to make use of the interface information therein (e.g., treating the address or addresses as statically supplied), and must return an error if it cannot. If the value is `nil`, the driver should allocate exactly one _fresh_ interface and use `AddInterface` to record it, or return an error if it cannot.
+
+It is forbidden to use `AddInterface` if `Interface` is non-nil.
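+
+The contract can be sketched as follows in Go. `EndpointInfo`, `Interface`, and `AddInterface` are the names used above; the `Iface` fields, the helper logic, and the placeholder address are invented for illustration and do not match the concrete `driverapi` types.
+
+```go
+package drivers
+
+import "errors"
+
+// Iface stands in for the interface data (address, MAC); fields are illustrative.
+type Iface struct {
+    Address string
+    MAC     string
+}
+
+// EndpointInfo mirrors the contract described above.
+type EndpointInfo interface {
+    Interface() *Iface         // statically supplied interface, or nil
+    AddInterface(*Iface) error // record a driver-allocated interface
+}
+
+type driver struct{}
+
+// CreateEndpoint honours a supplied interface or allocates exactly one
+// fresh interface -- never both.
+func (d *driver) CreateEndpoint(networkid, endpointid string, epInfo EndpointInfo) error {
+    if iface := epInfo.Interface(); iface != nil {
+        if iface.Address == "" {
+            return errors.New("cannot honour statically supplied interface")
+        }
+        return nil // use the supplied address data as-is
+    }
+    // Nothing supplied: allocate a fresh interface and record it.
+    fresh := &Iface{Address: "10.0.0.2/24"} // placeholder allocation
+    return epInfo.AddInterface(fresh)
+}
+```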
+
+## Implementations
+
+Libnetwork includes the following driver packages:
+
+- null
+- bridge
+- overlay
+- remote
+
+### Null
+
+The null driver is a `noop` implementation of the driver API, used only in cases where no networking is desired. It provides backward compatibility with Docker's `--net=none` option.
+
+### Bridge
+
+The `bridge` driver provides a Linux-specific bridging implementation based on the Linux Bridge.
+For more details, please [see the Bridge Driver documentation](bridge.md).
+
+### Overlay
+
+The `overlay` driver implements networking that can span multiple hosts using overlay network encapsulations such as VXLAN.
+For more details on its design, please see the [Overlay Driver Design](overlay.md).
+
+### Remote
+
+The `remote` package does not provide a driver, but provides a means of supporting drivers over a remote transport.
+This allows a driver to be written in a language of your choice.
+For further details, please see the [Remote Driver Design](remote.md).
+During the Network and Endpoints lifecycle, the CNM model controls the IP address assignment for network and endpoint interfaces via the IPAM driver(s).
+Libnetwork has a default, built-in IPAM driver and allows third party IPAM drivers to be dynamically plugged. On network creation, the user can specify which IPAM driver libnetwork needs to use for the network's IP address management. This document explains the APIs with which the IPAM driver needs to comply, and the corresponding HTTPS request/response body relevant for remote drivers.
+
+
+## Remote IPAM driver
+
+Along the same lines as remote network driver registration (see [remote.md](./remote.md) for more details), libnetwork initializes the `ipams.remote` package with the `Init()` function. It passes an `ipamapi.Callback` as a parameter, which implements `RegisterIpamDriver()`. The remote driver package uses this interface to register remote drivers with libnetwork's `NetworkController`, by supplying it in a `plugins.Handle` callback. The remote drivers register and communicate with libnetwork via the Docker plugin package. The `ipams.remote` package provides the proxy for the remote driver processes.
+
+
+## Protocol
+
+The communication protocol is the same as for the remote network driver.
+
+## Handshake
+
+During driver registration, libnetwork will query the remote driver for the default local and global address space strings, and for the driver capabilities.
+More detailed information can be found in the respective section in this document.
+
+## Datastore Requirements
+
+It is the remote driver's responsibility to manage its database.
+
+## Ipam Contract
+
+The remote IPAM driver must serve the following requests:
+
+- **GetDefaultAddressSpaces**
+
+- **RequestPool**
+
+- **ReleasePool**
+
+- **Request address**
+
+- **Release address**
+
+
+The following sections explain each of the above requests' semantics, when they are called during the network/endpoint lifecycle, and the corresponding payload for remote driver HTTP requests/responses.
+
+
+## IPAM Configuration and flow
+
+A libnetwork user can provide IPAM related configuration when creating a network, via the `NetworkOptionIpam` setter function.
+The caller has to provide the IPAM driver name and may provide the address space and a list of `IpamConf` structures for IPv4 and a list for IPv6. The IPAM driver name is the only mandatory field. If not provided, network creation will fail.
+
+In the list of configurations, each element has the following form:
+
+```go
+// IpamConf contains all the ipam related configurations for a network
+type IpamConf struct {
+ // The master address pool for containers and network interfaces
+ PreferredPool string
+ // A subset of the master pool. If specified,
+ // this becomes the container pool
+ SubPool string
+ // Input options for IPAM Driver (optional)
+ Options map[string]string
+ // Preferred Network Gateway address (optional)
+ Gateway string
+ // Auxiliary addresses for network driver. Must be within the master pool.
+ // libnetwork will reserve them if they fall into the container pool
+ AuxAddresses map[string]string
+}
+```
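+
+A hedged sketch of how such a configuration is passed at network-creation time. The `NetworkOptionIpam` signature and the controller's `NewNetwork` arguments shown here follow recent libnetwork versions and may differ in older ones; the driver name, pool, and auxiliary addresses are example values.
+
+```go
+package example
+
+import (
+    "github.com/docker/libnetwork"
+    "github.com/docker/libnetwork/ipamapi"
+)
+
+// createWithIpam creates a network whose IPv4 addressing is managed by the
+// default IPAM driver, using the IpamConf fields described above.
+func createWithIpam(controller libnetwork.NetworkController) (libnetwork.Network, error) {
+    ipamV4 := []*libnetwork.IpamConf{{
+        PreferredPool: "192.168.100.0/24",   // master pool
+        SubPool:       "192.168.100.128/25", // container pool
+        Gateway:       "192.168.100.1",      // preferred gateway
+        AuxAddresses:  map[string]string{"reserved1": "192.168.100.2"},
+    }}
+    return controller.NewNetwork("bridge", "net1", "",
+        libnetwork.NetworkOptionIpam(ipamapi.DefaultIPAM, "", ipamV4, nil, nil))
+}
+```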
+
+On network creation, libnetwork will iterate the list and perform the following requests to the IPAM driver:
+
+1. Request the address pool and pass the options along via `RequestPool()`.
+2. Request the network gateway address if specified. Otherwise request any address from the pool to be used as network gateway. This is done via `RequestAddress()`.
+3. Request each of the specified auxiliary addresses via `RequestAddress()`.
+
+If the list of IPv4 configurations is empty, libnetwork will automatically add one empty `IpamConf` structure. This will cause libnetwork to request from the IPAM driver an IPv4 address pool of the driver's choice on the configured address space, if specified, or on the IPAM driver's default address space otherwise. If the IPAM driver is not able to provide an address pool, network creation will fail.
+If the list of IPv6 configurations is empty, libnetwork will not take any action.
+The data retrieved from the IPAM driver during the execution of points 1) to 3) will be stored in the network structure as a list of `IpamInfo` structures for IPv4 and a list for IPv6.
+
+On endpoint creation, libnetwork will iterate over the list of configs and perform the following operation:
+
+1. Request an IPv4 address from the IPv4 pool and assign it to the endpoint interface IPv4 address. If successful, stop iterating.
+2. Request an IPv6 address from the IPv6 pool (if exists) and assign it to the endpoint interface IPv6 address. If successful, stop iterating.
+
+Endpoint creation will fail if any of the above operations does not succeed.
+
+On endpoint deletion, libnetwork will perform the following operations:
+
+1. Release the endpoint interface IPv4 address
+2. Release the endpoint interface IPv6 address if present
+
+On network deletion, libnetwork will iterate the list of `IpamData` structures and perform the following requests to the IPAM driver:
+
+1. Release the network gateway address via `ReleaseAddress()`
+2. Release each of the auxiliary addresses via `ReleaseAddress()`
+3. Release the pool via `ReleasePool()`
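+
+The creation and deletion sequences above can be collapsed into one hedged Go sketch. The `Ipam` interface below paraphrases the contract with plain strings; the real `ipamapi` interface uses richer types (`net.IPNet`, option maps) and also includes `GetDefaultAddressSpaces`, which is omitted here.
+
+```go
+package example
+
+// Ipam paraphrases four of the five calls in the contract above.
+type Ipam interface {
+    RequestPool(addressSpace, pool, subPool string, options map[string]string, v6 bool) (poolID, allocatedPool string, err error)
+    RequestAddress(poolID, prefAddress string, options map[string]string) (address string, err error)
+    ReleaseAddress(poolID, address string) error
+    ReleasePool(poolID string) error
+}
+
+// networkIpamLifecycle walks the documented order of calls for a single
+// IpamConf entry: pool, gateway, auxiliary addresses on create, and the
+// reverse on delete.
+func networkIpamLifecycle(d Ipam, cfgPool, cfgGateway string, aux []string) error {
+    poolID, _, err := d.RequestPool("LocalDefault", cfgPool, "", nil, false)
+    if err != nil {
+        return err
+    }
+    gw, err := d.RequestAddress(poolID, cfgGateway, nil) // empty cfgGateway lets the driver choose
+    if err != nil {
+        return err
+    }
+    allocated := make([]string, 0, len(aux))
+    for _, a := range aux {
+        addr, err := d.RequestAddress(poolID, a, nil)
+        if err != nil {
+            return err
+        }
+        allocated = append(allocated, addr)
+    }
+    // ... the network lives here; endpoints request and release addresses
+    // from poolID as described in the endpoint sections above ...
+    for _, a := range allocated {
+        if err := d.ReleaseAddress(poolID, a); err != nil {
+            return err
+        }
+    }
+    if err := d.ReleaseAddress(poolID, gw); err != nil {
+        return err
+    }
+    return d.ReleasePool(poolID)
+}
+```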
+
+### GetDefaultAddressSpaces
+
+GetDefaultAddressSpaces returns the default local and global address space names for this IPAM. An address space is a set of non-overlapping address pools isolated from other address spaces' pools. In other words, the same pool can exist in N different address spaces. An address space naturally maps to a tenant name.
+In libnetwork, the meaning associated with the `local` or `global` address space is that a local address space doesn't need to be synchronized across the
+cluster whereas the global address space does. Unless specified otherwise in the IPAM configuration, libnetwork will request address pools from the default local or default global address space based on the scope of the network being created. For example, if not specified otherwise in the configuration, libnetwork will request an address pool from the default local address space for a bridge network, and from the default global address space for an overlay network.
+
+During registration, the remote driver will receive a POST message to the URL `/IpamDriver.GetDefaultAddressSpaces` with no payload. The driver's response should have the form:
+
+
+ {
+ "LocalDefaultAddressSpace": string
+ "GlobalDefaultAddressSpace": string
+ }
+
+
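+
+A minimal sketch of how a remote IPAM plugin might answer this request, using plain `net/http`; plugin discovery/activation (the `/Plugin.Activate` handshake and the Unix socket or spec file under the Docker plugin directories) is omitted, and the listener address is an arbitrary example.
+
+```go
+package main
+
+import (
+    "encoding/json"
+    "log"
+    "net/http"
+)
+
+// Response body for /IpamDriver.GetDefaultAddressSpaces, mirroring the
+// fields documented above.
+type defaultAddressSpacesResponse struct {
+    LocalDefaultAddressSpace  string
+    GlobalDefaultAddressSpace string
+}
+
+func main() {
+    http.HandleFunc("/IpamDriver.GetDefaultAddressSpaces", func(w http.ResponseWriter, r *http.Request) {
+        // Example address space names; a real driver would report its own.
+        json.NewEncoder(w).Encode(defaultAddressSpacesResponse{
+            LocalDefaultAddressSpace:  "ExampleLocal",
+            GlobalDefaultAddressSpace: "ExampleGlobal",
+        })
+    })
+    log.Fatal(http.ListenAndServe("127.0.0.1:9999", nil))
+}
+```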
+
+### RequestPool
+
+This API is for registering an address pool with the IPAM driver. Multiple identical calls must return the same result.
+It is the IPAM driver's responsibility to keep a reference count for the pool.
+For this API, the remote driver will receive a POST message to the URL `/IpamDriver.RequestPool` with the following payload:
+
+ {
+ "AddressSpace": string
+ "Pool": string
+ "SubPool": string
+ "Options": map[string]string
+ "V6": bool
+ }
+
+
+Where:
+
+ * `AddressSpace` the IP address space. It denotes a set of non-overlapping pools.
+ * `Pool` The IPv4 or IPv6 address pool in CIDR format
+ * `SubPool` An optional subset of the address pool, an ip range in CIDR format
+ * `Options` A map of IPAM driver specific options
+ * `V6` Whether an IPAM self-chosen pool should be IPv6
+
+AddressSpace is the only mandatory field. If no `Pool` is specified, the IPAM driver may return a self-chosen address pool; in that case, the `V6` flag must be set if the caller wants an IPAM-chosen IPv6 pool. A request with an empty `Pool` and a non-empty `SubPool` should be rejected as invalid.
+When no `Pool` is specified, IPAM will allocate one of its default pools, and the `V6` flag should be set if the network needs IPv6 addresses to be allocated.
+
+A successful response is in the form:
+
+
+ {
+ "PoolID": string
+ "Pool": string
+ "Data": map[string]string
+ }
+
+
+Where:
+
+* `PoolID` is an identifier for this pool. Identical pools must have the same pool ID.
+* `Pool` is the pool in CIDR format
+* `Data` is the IPAM driver supplied metadata for this pool
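+
+For remote drivers written in Go, the payloads above might be mirrored with plain structs like the hypothetical ones below; field names follow the JSON shown in this section.
+
+```go
+package example
+
+// RequestPoolRequest mirrors the documented request payload.
+type RequestPoolRequest struct {
+    AddressSpace string
+    Pool         string
+    SubPool      string
+    Options      map[string]string
+    V6           bool
+}
+
+// RequestPoolResponse mirrors the documented successful response.
+type RequestPoolResponse struct {
+    PoolID string
+    Pool   string
+    Data   map[string]string
+}
+```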
+
+
+### ReleasePool
+
+This API is for releasing a previously registered address pool.
+
+```go
+ReleasePool(poolID string) error
+```
+
+For this API, the remote driver will receive a POST message to the URL `/IpamDriver.ReleasePool` with the following payload:
+
+    {
+        "PoolID": string
+    }
+
+
+### RequestAddress
+
+This API is for reserving an IP address.
+
+For this API, the remote driver will receive a POST message to the URL `/IpamDriver.RequestAddress` with the following payload:
+
+ {
+ "PoolID": string
+ "Address": string
+ "Options": map[string]string
+ }
+
+Where:
+
+* `PoolID` is the pool identifier
+* `Address` is the required address in regular IP form (A.B.C.D). If this address cannot be satisfied, the request fails. If empty, the IPAM driver chooses any available address on the pool
+* `Options` are IPAM driver specific options
+
+
+A successful response is in the form:
+
+
+ {
+ "Address": string
+ "Data": map[string]string
+ }
+
+
+Where:
+
+* `Address` is the allocated address in CIDR format (A.B.C.D/MM)
+* `Data` is some IPAM driver specific metadata
+
+### ReleaseAddress
+
+This API is for releasing an IP address.
+
+For this API, the remote driver will receive a POST message to the URL `/IpamDriver.ReleaseAddress` with the following payload:
+
+ {
+ "PoolID": string
+ "Address": string
+ }
+
+Where:
+
+* `PoolID` is the pool identifier
+* `Address` is the IP address to release
+
+
+
+### GetCapabilities
+
+During the driver registration, libnetwork will query the driver about its capabilities. It is not mandatory for the driver to support this URL endpoint. If the driver does not support it, registration will succeed, with empty capabilities automatically added to the internal driver handle.
+
+During registration, the remote driver will receive a POST message to the URL `/IpamDriver.GetCapabilities` with no payload. The driver's response should have the form:
+
+
+ {
+ "RequiresMACAddress": bool
+ "RequiresRequestReplay": bool
+ }
+
+
+
+## Capabilities
+
+Capabilities are requirements and features that the remote IPAM driver can express during registration with libnetwork.
+As of now libnetwork accepts the following capabilities:
+
+### RequiresMACAddress
+
+It is a boolean value which tells libnetwork whether the ipam driver needs to know the interface MAC address in order to properly process the `RequestAddress()` call.
+If true, on `CreateEndpoint()` request, libnetwork will generate a random MAC address for the endpoint (if an explicit MAC address was not already provided by the user) and pass it to `RequestAddress()` when requesting the IP address inside the options map. The key will be the `netlabel.MacAddress` constant: `"com.docker.network.endpoint.macaddress"`.
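+
+As a small hedged illustration, an IPAM driver that advertises this capability could read the MAC address out of the `RequestAddress()` options map as sketched below; the key string is the `netlabel.MacAddress` constant quoted above, and the function name is invented.
+
+```go
+package example
+
+// macFromOptions extracts the MAC address that libnetwork passes in the
+// RequestAddress options when RequiresMACAddress is advertised.
+func macFromOptions(opts map[string]string) (mac string, ok bool) {
+    mac, ok = opts["com.docker.network.endpoint.macaddress"] // netlabel.MacAddress
+    return mac, ok
+}
+```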
+
+### RequiresRequestReplay
+
+It is a boolean value which tells libnetwork whether the ipam driver needs to receive the replay of the `RequestPool()` and `RequestAddress()` requests on daemon reload. When libnetwork controller is initializing, it retrieves from local store the list of current local scope networks and, if this capability flag is set, it allows the IPAM driver to reconstruct the database of pools by replaying the `RequestPool()` requests for each pool and the `RequestAddress()` for each network gateway owned by the local networks. This can be useful to ipam drivers which decide not to persist the pools allocated to local scope networks.
+
+
+## Appendix
+
+A Go extension for the IPAM remote API is available at [docker/go-plugins-helpers/ipam](https://github.com/docker/go-plugins-helpers/tree/master/ipam)
+This document provides a TL;DR version of https://docs.docker.com/v1.6/articles/networking/.
+If you are interested in the detailed operational design, please refer to that link.
+
+## Docker Networking design as of Docker v1.6
+
+Prior to libnetwork, Docker Networking was handled in both Docker Engine and libcontainer.
+Docker Engine makes use of the Bridge Driver to provide a single-host networking solution with the help of the Linux bridge and iptables.
+Docker Engine provides simple configurations such as `--link`, `--expose`,... to enable container connectivity within the same host by abstracting away networking configuration completely from the Containers.
+For external connectivity, it relied upon NAT & port-mapping.
+
+Docker Engine was responsible for providing the configuration for the container's networking stack.
+
+Libcontainer would then use this information to create the necessary networking devices and move them into a network namespace.
+This namespace would then be used when the container is started.
+The Macvlan driver provides operators the ability to integrate Docker networking in a simple and lightweight fashion into the underlying network. Macvlan is supported by the Linux kernel and is a well known Linux network type. The Macvlan built-in driver does not require any port mapping and supports VLAN trunking (Virtual Local Area Network). VLANs are a traditional method of network virtualization and layer 2 datapath isolation that is prevalent in some form or fashion in most data centers.
+
+The Linux implementation is considered lightweight because it eliminates the need for using a Linux bridge for isolating containers on the Docker host. The VLAN driver requires full access to the underlying host making it suitable for Enterprise data centers that have administrative access to the host.
+
+Instead of attaching container network interfaces to a Docker host Linux bridge for a network, the driver simply connects the container interface to the Docker host Ethernet interface (or sub-interface). Each network is attached to a unique parent interface. Containers in a network share a common broadcast domain and intra-network connectivity is permitted. Two separate networks will each have a unique parent interface, and that parent is what enforces datapath isolation between two networks. In order for inter-network communications to occur, an IP router, external to the Docker host, is required to route between the two networks by hairpinning into the physical network and then back to the Docker host. While hairpinning traffic can be less efficient than east/west traffic staying local to the host, there is often more complexity associated with disaggregating services to the host. It can be practical for some users to leverage existing network services, such as firewalls and load balancers, that already exist in a data center architecture.
+
+When using traditional Linux bridges there are two common techniques to get traffic out of a container and into the physical network and vice versa. The first method to connect containers to the underlying network is to use iptables rules which perform a NAT translation from a bridge that represents the Docker network to the physical Ethernet connection such as `eth0`. The upside of the iptables approach used by the Docker built-in bridge driver is that the NIC does not have to be in promiscuous mode. The second bridge driver method is to move a host's external Ethernet connection into the bridge. Moving the host Ethernet connection can at times be unforgiving. Common mistakes, such as cutting oneself off from the host, or worse, creating bridging loops that can cripple a VLAN throughout a data center, can open a network design up to potential risks as the infrastructure grows.
+
+Connecting containers without any NATing is where the VLAN drivers excel. Rather than having to manage a bridge for each Docker network, containers are connected directly to a `parent` interface such as `eth0` that attaches the container to the same broadcast domain as the parent interface. A simple example is if a host's `eth0` is on the network `192.168.1.0/24` with a gateway of `192.168.1.1`, then a Macvlan Docker network can start containers on the addresses `192.168.1.2 - 192.168.1.254`. Containers use the same network as the parent `-o parent` that is specified in the `docker network create` command.
+
+There are positive performance implications as a result of bypassing the Linux bridge, and the simplicity of fewer moving parts is also attractive. Macvlan containers are easy to troubleshoot. The actual MAC and IP address of the container is bridged into the upstream network, making a problematic application easy for operators to trace from the network. Existing underlay network management and monitoring tools remain relevant.
+
+### Pre-Requisites
+
+- The examples on this page are all single host and require Docker v1.12 or greater running on Linux.
+
+- Any examples using a sub-interface like `eth0.10` can be replaced with `eth0` or any other valid parent interface on the Docker host. Sub-interfaces with a `.` are dynamically created. The parent `-o parent` interface parameter can also be left out of the `docker network create` altogether, and the driver will create a `dummy` Linux type interface that will enable local host connectivity to perform the examples.
+
+- Kernel requirements:
+
+  - To check your current kernel version, use `uname -r`.
+  - Macvlan requires Linux kernel v3.9–3.19 or 4.0+.
+
+### MacVlan Bridge Mode Example Usage
+
+- Macvlan driver networks are attached to a parent Docker host interface. Examples are a physical interface such as `eth0`, a sub-interface for 802.1q VLAN tagging like `eth0.10` (`.10` representing VLAN `10`) or even bonded `bond0` host adapters which bundle two Ethernet interfaces into a single logical interface and provide diversity in the server connection.
+
+- The specified gateway is external to the host and is expected to be provided by the network infrastructure. If a gateway is not specified using the `--gateway` parameter, then Libnetwork will infer the first usable address of the subnet, as illustrated in the sketch after this list. For example, if a network's subnet is `--subnet 10.1.100.0/24` and no gateway is specified, Libnetwork will assign a gateway of `10.1.100.1` to the container. A second example: a subnet of `--subnet 10.1.100.128/25` would receive a gateway of `10.1.100.129`.
+
+- Containers on separate networks cannot reach one another without an external process routing between the two networks/subnets.
+
+- Macvlan Bridge mode Docker networks are isolated from one another, and only one network can be attached to a parent interface at a time. There is a theoretical limit of 4,094 sub-interfaces per host adapter that a Docker network could be attached to.
+
+- The driver limits one network per parent interface. The driver does however accommodate secondary subnets to be allocated in a single Docker network for a multi-subnet requirement. The upstream router is responsible for proxy-arping between the two subnets.
+
+- Any Macvlan container sharing the same subnet can communicate via IP with any other container in the same subnet without a gateway. It is important to note that the parent interface will go into promiscuous mode when a container is attached to it, since each container has a unique MAC address. Alternatively, Ipvlan, which is currently an experimental driver, uses the same MAC address as the parent interface, thus precluding the need for the parent to be promiscuous.
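+
+The gateway-inference rule mentioned above ("first usable address of the subnet") can be illustrated with a short Go sketch; this is not Libnetwork's IPAM code, just the arithmetic, and it ignores IPv6 and octet carries for brevity.
+
+```go
+package main
+
+import (
+    "fmt"
+    "net"
+)
+
+// firstUsable returns the first usable address of an IPv4 subnet
+// (network address + 1), the value Libnetwork infers as the gateway
+// when --gateway is not supplied.
+func firstUsable(cidr string) (net.IP, error) {
+    _, ipnet, err := net.ParseCIDR(cidr)
+    if err != nil {
+        return nil, err
+    }
+    gw := make(net.IP, len(ipnet.IP.To4()))
+    copy(gw, ipnet.IP.To4())
+    gw[len(gw)-1]++ // e.g. 10.1.100.0 -> 10.1.100.1
+    return gw, nil
+}
+
+func main() {
+    for _, s := range []string{"10.1.100.0/24", "10.1.100.128/25"} {
+        gw, _ := firstUsable(s)
+        fmt.Printf("%s -> gateway %s\n", s, gw) // 10.1.100.1 and 10.1.100.129
+    }
+}
+```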
+
+In the following example, `eth0` on the docker host has an IP on the `172.16.86.0/24` network and a default gateway of `172.16.86.1`. The gateway is an external router with an address of `172.16.86.1`. An IP address is not required on the Docker host interface `eth0` in `bridge` mode; it merely needs to be on the proper upstream network to get forwarded by a network switch or network router.
+**Note** The Docker network subnet specified needs to match the network of the Docker host's parent interface for external communications. For example, use the same subnet and gateway of the Docker host ethernet interface specified by the `-o parent=` option. The parent interface is not required to have an IP address assigned to it, since this is simply L2 flooding and learning.
+
+- The parent interface used in this example is `eth0` and it is on the subnet `172.16.86.0/24`. The containers in the `docker network` will also need to be on this same subnet as the parent `-o parent=`. The gateway is an external router on the network.
+
+- Libnetwork driver types are specified with the `-d <driver_name>` option. In this case `-d macvlan`
+
+- The parent interface `-o parent=eth0` is configured as follows:
+
+```
+ip addr show eth0
+3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
+ inet 172.16.86.250/24 brd 172.16.86.255 scope global eth0
+```
+
+Create the macvlan network and run a couple of containers attached to it:
+
+```
+# Macvlan (-o macvlan_mode= Defaults to Bridge mode if not specified)
+docker network create -d macvlan \
+ --subnet=172.16.86.0/24 \
+ --gateway=172.16.86.1 \
+ -o parent=eth0 pub_net
+
+# Run a container on the new network specifying the --ip address.
+docker run --net=pub_net --ip=172.16.86.10 -itd alpine /bin/sh
+
+# Start a second container and ping the first
+docker run --net=pub_net -it --rm alpine /bin/sh
+ping -c 4 172.16.86.10
+
+```
+
+Take a look at the container's IP and routing table:
+
+```
+
+ip a show eth0
+ eth0@if3: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN
+# NOTE: the containers can NOT ping the underlying host interfaces as
+# they are intentionally filtered by Linux for additional isolation.
+# In this case the containers cannot ping the -o parent=172.16.86.250
+```
+
+
+Users can explicitly specify the `bridge` mode option `-o macvlan_mode=bridge` or leave the mode option out since the most common mode of `bridge` is the driver default.
+
+While the `eth0` interface does not need to have an IP address, it is not uncommon for it to have one. That address can be excluded from the default built-in IPAM by using the `--aux-address=x.x.x.x` argument, which blacklists the specified address from being handed out to containers by the built-in Libnetwork IPAM.
+
+- The following is the same network example as above, but blacklisting the `-o parent=eth0` address from being handed out to a container.
+
+```
+docker network create -d macvlan \
+ --subnet=172.16.86.0/24 \
+ --gateway=172.16.86.1 \
+ --aux-address="exclude_host=172.16.86.250" \
+ -o parent=eth0 pub_net
+```
+
+Another option for specifying what subpool or range of usable addresses is used by the default Docker IPAM driver is to use the argument `--ip-range=`. This instructs the driver to allocate container addresses from the specified range, rather than the broader range from the `--subnet=` argument.
+
+- The network created in the following example allocates addresses beginning at `192.168.32.128` and increments upwards from there.
+
+```
+docker network create -d macvlan \
+ --subnet=192.168.32.0/24 \
+ --ip-range=192.168.32.128/25 \
+ --gateway=192.168.32.254 \
+ -o parent=eth0 macnet32
+
+# Start a container and verify the address is 192.168.32.128
+docker run --net=macnet32 -it --rm alpine /bin/sh
+```
+
+The network can then be deleted with:
+
+```
+docker network rm <network_name or id>
+```
+
+- **Note:** Linux Macvlan interface types are not able to ping or communicate with the default namespace IP address. For example, if you create a container and try to ping the Docker host's `eth0` it will **not** work. That traffic is explicitly filtered by the kernel to offer additional provider isolation and security. This is a common gotcha when a user first uses those Linux interface types since it is natural to ping local addresses when testing.
+
+For more on Docker networking commands see: [Working with Docker network commands](https://docs.docker.com/engine/userguide/networking/work-with-networks/)
+
+### Macvlan 802.1q Trunk Bridge Mode Example Usage
+
+VLANs have long been a primary means of virtualizing data center networks and are still in virtually all existing networks today. VLANs work by tagging a Layer-2 isolation domain with a 12-bit identifier ranging from 1-4094. The VLAN tag is inserted into a packet header that enables a logical grouping of a single subnet or multiple subnets of IPv4 and/or IPv6. It is very common for network operators to separate traffic using VLANs based on a subnet(s) function or security profile such as `web`, `db` or any other isolation requirements.
+
+It is very common to have a compute host requirement of running multiple virtual networks concurrently on a host. Linux networking has long supported VLAN tagging, also known by its standard 802.1Q, for maintaining datapath isolation between networks. The Ethernet link connected to a Docker host can be configured to support the 802.1q VLAN IDs by creating Linux sub-interfaces, each sub-interface being allocated a unique VLAN ID.
+Trunking 802.1q to a Linux host is notoriously painful for operations. It requires configuration file changes in order to be persistent through a reboot. If a bridge is involved, a physical NIC needs to be moved into the bridge and the bridge then gets the IP address. This has led to many a stranded server, since the risk of cutting off access through misconfiguration is relatively high.
+
+Like all of the Docker network drivers, the overarching goal is to alleviate the operational pains of managing network resources. To that end, when a network is given a sub-interface as the parent that does not exist, the driver creates the VLAN-tagged interface while creating the network. If the sub-interface already exists, it is simply used as is.
+
+In the case of a host reboot, instead of needing to modify often complex network configuration files the driver will recreate all network links when the Docker daemon restarts. The driver tracks if it created the VLAN tagged sub-interface originally with the network create and will **only** recreate the sub-interface after a restart if it created the link in the first place.
+
+The same holds true if the network is deleted with `docker network rm`. If the driver created the sub-interface with `docker network create`, it will remove the sub-interface link for the operator.
+
+If the user doesn't want Docker to create and delete the `-o parent` sub-interface, they can simply pass an interface that already exists as the parent link. Parent interfaces such as `eth0` are not deleted; only interfaces that are slave links are.
+
+For the driver to add/delete the vlan sub-interfaces, the format needs to be `-o parent=interface_name.vlan_tag`.
+
+For example: `-o parent=eth0.50` denotes a parent interface of `eth0` with a slave of `eth0.50` tagged with vlan id `50`. The equivalent `ip link` command would be `ip link add link eth0 name eth0.50 type vlan id 50`.
+
+Replace `macvlan` with `ipvlan` in the `-d` driver argument to create ipvlan 802.1q trunks instead.
+
+**Vlan ID 50**
+
+In the next example, the network is tagged and isolated by the Docker host. A parent of `eth0.50` will tag the Ethernet traffic with the vlan id `50` specified by the parent nomenclature `-o parent=eth0.50`. Other naming formats can be used, but the links need to be added and deleted manually using `ip link` or Linux configuration files. As long as the `-o parent` exists, anything can be used if compliant with Linux netlink.
+
+```
+# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged
+docker network create -d macvlan \
+ --subnet=192.168.50.0/24 \
+ --gateway=192.168.50.1 \
+ -o parent=eth0.50 macvlan50
+
+# In two separate terminals, start a Docker container and the containers can now ping one another.
+docker run --net=macvlan50 -it --name macvlan_test5 --rm alpine /bin/sh
+docker run --net=macvlan50 -it --name macvlan_test6 --rm alpine /bin/sh
+```
+
+**Vlan ID 60**
+
+In the second network, tagged and isolated by the Docker host, `eth0.60` is the parent interface tagged with vlan id `60` specified with `-o parent=eth0.60`. The `macvlan_mode=` defaults to `macvlan_mode=bridge`. It can also be explicitly set with the same result, as shown in the next example.
+
+```
+# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged.
+docker network create -d macvlan \
+ --subnet=192.168.60.0/24 \
+ --gateway=192.168.60.1 \
+  -o parent=eth0.60 \
+ -o macvlan_mode=bridge macvlan60
+
+# In two separate terminals, start a Docker container and the containers can now ping one another.
+docker run --net=macvlan60 -it --name macvlan_test7 --rm alpine /bin/sh
+docker run --net=macvlan60 -it --name macvlan_test8 --rm alpine /bin/sh
+```
+
+The same as the example before, except there is an additional subnet bound to the network that the user can choose to provision containers on. In Macvlan bridge mode, containers can only ping one another if they are on the same subnet/broadcast domain, unless there is an external router that routes the traffic (answers ARP, etc.) between the two subnets. Multiple subnets assigned to a network require a gateway external to the host that falls within the subnet range to hairpin the traffic back to the host.
+
+
+```
+docker network create -d macvlan \
+ --subnet=10.1.20.0/24 --subnet=10.1.10.0/24 \
+ --gateway=10.1.20.1 --gateway=10.1.10.1 \
+ -o parent=eth0.101 mcv101
+
+# View Links after to network create `ip link`
+$ ip link
+
+# Test 10.1.20.0/24 connectivity
+docker run --net=mcv101 --ip=10.1.20.9 -itd alpine /bin/sh
+# Run ip links again and verify the links are cleaned up
+ip link
+```
+
+Hosts on the same VLAN are typically on the same subnet and almost always are grouped together based on their security policy. In most scenarios, a multi-tier application is tiered into different subnets because the security profile of each process requires some form of isolation. For example, hosting your credit card processing on the same virtual network as the front-end web-server would be a regulatory compliance issue, and would also circumvent the long-standing best practice of layered defense-in-depth architectures. VLANs, or the equivalent VNI (Virtual Network Identifier) when using the built-in Overlay driver, are the first step in isolating tenant traffic.
+
+
+
+
+### Dual Stack IPv4 IPv6 Macvlan Bridge Mode
+
+The following specifies both v4 and v6 addresses. An address from each family will be assigned to each container. You can specify either family type explicitly or allow the Libnetwork IPAM to assign them from the subnet pool.
+
+*Note on IPv6:* When declaring a v6 subnet with a `docker network create`, the flag `--ipv6` is required along with the subnet (in the following example `--subnet=2001:db8:abc8::/64`). Similar to IPv4 functionality, if an IPv6 `--gateway` is not specified, the first usable address in the v6 subnet is inferred and assigned as the gateway for the broadcast domain.
+
+The following example creates a network with multiple IPv4 and IPv6 subnets. The network is attached to a sub-interface of `eth0.218`. By specifying `eth0.218` as the parent, the driver will create the sub-interface (if it does not already exist) and tag all traffic for containers in the network with a VLAN ID of 218. The physical switch port on the ToR (top of rack) network port needs to have 802.1Q trunking enabled for communications in and out of the host to work.
+```
+# Container output (truncated) showing the assigned IPv6 address
+    inet6 2001:db8:abc8::50/64 scope global flags 02
+       valid_lft forever preferred_lft forever
+```
+
+The next example demonstrates how default gateways are inferred if the `--gateway` option is not specified for a subnet in the `docker network create ...` command. If the gateway is not specified, the first usable address in the subnet is selected. It also demonstrates how `--ip-range` and `--aux-address` are used in conjunction to exclude address assignments within a network and reserve sub-pools of usable addresses within a network's subnet. All traffic is untagged since `eth0` is used rather than a sub-interface.
+
+```
+docker network create -d macvlan \
+ --subnet=192.168.136.0/24 \
+ --subnet=192.168.138.0/24 \
+ --ipv6 --subnet=fd11::/64 \
+ --ip-range=192.168.136.0/25 \
+ --ip-range=192.168.138.0/25 \
+ --aux-address="reserved1=fd11::2" \
+ --aux-address="reserved2=192.168.136.2" \
+ --aux-address="reserved3=192.168.138.2" \
+ -o parent=eth0 mcv0
+
+docker run --net=mcv0 -it --rm alpine /bin/sh
+```
+
+Next is the output from a running container provisioned on the example network named `mcv0`.
+
+```
+# Container eth0 output (the fe80::42:c0ff:fea8:8803/64 address is the local link addr)
+ip address show eth0
+100: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN
+# IPv6 routing table from within the container (the second v6 addresses is the local link addr)
+$ ip -6 route
+fd11::/64 dev eth0 metric 256
+fe80::/64 dev eth0 metric 256
+default via fd11::1 dev eth0 metric 1024
+```
+
+- After the examples, ``docker rm -f `docker ps -qa` `` can be used to remove all existing containers on the host, both running and stopped.
+
+A key takeaway is that operators have the ability to map their physical network into their virtual network for integrating containers into their environment with no operational overhauls required. NetOps simply drops an 802.1q trunk into the Docker host. That virtual link would be the `-o parent=` passed in the network creation. For untagged (non-VLAN) links, it is as simple as `-o parent=eth0`; for 802.1q trunks with VLAN IDs, each network gets mapped to the corresponding VLAN/subnet from the network.
+
+An example being, NetOps provides VLAN ID and the associated subnets for VLANs being passed on the Ethernet link to the Docker host server. Those values are simply plugged into the `docker network create` commands when provisioning the Docker networks. These are persistent configurations that are applied every time the Docker engine starts which alleviates having to manage often complex configuration files. The network interfaces can also be managed manually by being pre-created and docker networking will never modify them, simply use them as parent interfaces. Example mappings from NetOps to Docker network commands are as follows:
+If a user does not want the driver to create the vlan sub-interface, it simply needs to exist prior to the `docker network create`. Sub-interface naming that is not `interface.vlan_id` is also honored in the `-o parent=` option, as long as the interface exists and is up.
+
+Manually created links can be named anything you want; all that matters is that they exist when the network is created. Manually created links do not get deleted, regardless of their name, when the network is deleted with `docker network rm`.
+
+```
+# create a new sub-interface tied to dot1q vlan 40
+ip link add link eth0 name eth0.40 type vlan id 40
+
+# enable the new sub-interface
+ip link set eth0.40 up
+
+# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged
+docker network create -d macvlan \
+ --subnet=192.168.40.0/24 \
+ --gateway=192.168.40.1 \
+ -o parent=eth0.40 macvlan40
+
+# in two separate terminals, start a Docker container and the containers can now ping one another.
+docker run --net=macvlan40 -it --name mcv_test5 --rm alpine /bin/sh
+docker run --net=macvlan40 -it --name mcv_test6 --rm alpine /bin/sh
+```
+
+**Example:** Vlan sub-interface manually created with any name:
+
+```
+# create a new sub interface tied to dot1q vlan 40
+ip link add link eth0 name foo type vlan id 40
+
+# enable the new sub-interface
+ip link set foo up
+
+# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged
+docker network create -d macvlan \
+    --subnet=192.168.40.0/24 \
+    --gateway=192.168.40.1 \
+    -o parent=foo macvlan40
+
+# in two separate terminals, start a Docker container and the containers can now ping one another.
+docker run --net=macvlan40 -it --name mcv_test5 --rm alpine /bin/sh
+docker run --net=macvlan40 -it --name mcv_test6 --rm alpine /bin/sh
+```
+
+Manually created links can be cleaned up with:
+
+```
+ip link del foo
+```
+
+As with all of the Libnetwork drivers, networks of various driver types can be mixed and matched. This even applies to 3rd party ecosystem drivers that can be run in parallel with built-in drivers for maximum flexibility to the user.
+This document describes docker networking in bridge and overlay mode delivered via libnetwork. Libnetwork uses iptables extensively to configure NATting and forwarding rules. [https://wiki.archlinux.org/index.php/iptables](https://wiki.archlinux.org/index.php/iptables) provides a good introduction to iptables and its default chains. More details may be found in [http://ipset.netfilter.org/iptables-extensions.man.html](http://ipset.netfilter.org/iptables-extensions.man.html)
+The above diagram illustrates the network topology when a container is instantiated by the docker engine with the network mode set to bridge. In this case, libnetwork does the following:
+
+
+
+1. Creates a new network namespace container NS for this container
+2. Creates a veth-pair, attaching one end to the docker0 bridge in the host NS, and moves the other end to the new container NS.
+3. In the new NS, assigns an IP address from the docker0 subnet, and sets the default route gateway to the docker0 IP address.
+
+This completes the network setup for a container running in bridge mode. Outbound traffic from the container flows through routing (container NS) -> veth-pair -> docker0 bridge (host NS) -> docker0 interface (host NS) -> routing (host NS) -> eth0 (host NS) and out of the host. Inbound traffic to the container flows in the reverse direction.
+
+Note that the container's assigned IP address (172.17.0.2 in the above example) is on the docker0 subnet and is not visible externally to the host. For this reason, a default masquerading rule is added to the nat iptable's POSTROUTING chain in the host NS at docker engine initialization time. It states that if a request has gone through the routing stage and its srcIP is within the docker0 subnet (172.17.0.0/16), the request must have originated from a docker container, and therefore its srcIP is replaced with the IP of the outbound interface determined by routing. In the above diagram, eth0's IP 172.31.2.1 is used as the replacement IP. In other words, masquerade is the same as SNAT with the replacement srcIP set to the outbound interface's IP.
+
+If the container backs a service and listens on a targetPort in the container NS, it must also have a corresponding publishedPort in the host NS to receive the request and forward it to the container. Two rules are created in the host NS for this purpose:
+
+
+
+1. In the nat iptable, a DOCKER (nat) chain is inserted into the PREROUTING chain, and a DNAT rule for the published port (for example, "DNAT tcp any any dport:45999") is added to the DOCKER chain. It DNATs any traffic arriving at eth0 of the host NS on that port to dstIP=172.17.0.2 and dstPort=80, so that the DNATted request becomes routable to the backend container listening on port 80.
+2. In the filter iptable, a DOCKER (filter) chain is inserted into the FORWARD chain, and a rule such as "ACCEPT tcp any containerIP dport:targetPort" is added. This allows the request that was DNATted in 1) to be forwarded to the container.
+
+
+# Swarm/Overlay Mode
+
+Libnetwork uses a completely different set of namespaces, bridges, and iptables rules to forward container traffic in swarm/overlay mode.
+As depicted in the above diagram, when a host joins a swarm cluster, the docker engine creates the following network topology.
+
+Initial Setup (before any services are created)
+
+
+
+1. In the host NS, creates a docker_gwbridge bridge and assigns a subnet range to it (172.18.0.1/16 in the above diagram). This subnet is local and does not leak outside of the host.
+2. In the host NS, adds a masquerading rule to the nat iptable (5) POSTROUTING chain for any request with a srcIP within the 172.18.0.0/16 subnet.
+3. Creates a new network namespace, ingress_sbox NS, and creates two veth-pairs: eth1 connects to the docker_gwbridge bridge with the fixed IP 172.18.0.2, and the other end (eth0) connects to the ingress NS bridge br0. eth0 is assigned an IP address, in this example 10.255.0.2.
+4. In the ingress_sbox NS, adds to the nat iptable (2) PREROUTING chain a rule that SNATs and redirects service requests to IPVS for load-balancing. For instance,
+
+    "SNAT all -- anywhere 10.255.0.0/16 ipvs to:10.255.0.2"
+
+5. In the ingress_sbox NS, adds to the nat iptable (2) POSTROUTING and OUTPUT chains rules that allow DNS lookups to be redirected to/from the docker engine, as required by swarm service discovery.
+6. Creates a new network namespace, ingress NS, and creates a bridge br0 with two links: one attaches to the eth0 interface in the ingress_sbox NS, and the other to a vxlan interface that in essence makes bridge br0 span all hosts in the same swarm cluster. Each eth0 interface in ingress_sbox, each container instance of a service, and each service itself is given a unique IP 10.255.xx.xx and is attached to br0, so that within the same swarm cluster, services, container instances of services, and the ingress_sbox's eth0 are all connected via bridge br0.
+
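+A minimal sketch of inspecting this initial topology on a swarm node (namespace names, paths, and addresses follow the example above and may differ on your host):
+
+    # the docker engine keeps its network namespaces here
+    $ ls /var/run/docker/netns
+    1-<ingress-net-id>  ingress_sbox
+    # look at the interfaces and nat rules inside ingress_sbox
+    $ sudo nsenter --net=/var/run/docker/netns/ingress_sbox ip -4 addr show
+    $ sudo nsenter --net=/var/run/docker/netns/ingress_sbox iptables -t nat -S
+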
+When a service is created, say with targetPort=80 and publishedPort=30000, the following are added to the existing network topology.
+
+Service Setup
+
+
+
+1. A container backing the service has its own namespace, container NS. Two veth pairs are created: eth1 is attached to docker_gwbridge in host NS and is given an IP from the docker_gwbridge subnet (172.18.0.2 in this example); eth0 is attached to br0 in ingress NS and is given an IP of 10.255.0.5 in this example.
+2. In container NS, adds rules to the filter iptable (3) INPUT and OUTPUT chains that only allow targetPort=80 traffic.
+3. In container NS, adds a rule to the nat iptable (4) PREROUTING chain that changes publishedPort to targetPort, for instance a REDIRECT rule such as `REDIRECT tcp -- anywhere 10.255.0.5 tcp dpt:30000 redir ports 80`.
+4. In container NS, adds rules to the nat iptable (4) INPUT/OUTPUT chains that allow DNS lookups in this container to be redirected to the docker engine.
+5. In host NS, inserts a DOCKER-INGRESS chain into the filter iptable (5) FORWARD chain, and adds rules that allow service requests to port 30000 and their replies,
+
+    i.e. `ACCEPT tcp -- anywhere anywhere tcp dpt:30000` and
+
+
+    `ACCEPT tcp -- anywhere anywhere state RELATED,ESTABLISHED tcp spt:30000`
+
+6. In host NS, inserts a (different) DOCKER-INGRESS chain into the nat iptable (6) PREROUTING chain, and adds a rule to DNAT service requests to the ingress_sbox NS eth1 IP (172.18.0.2), i.e. `DNAT tcp -- anywhere anywhere tcp dpt:30000 to:172.18.0.2:30000`.
+7. In ingress_sbox NS, adds to the mangle iptable (1) PREROUTING chain a rule to mark service requests, i.e.
+
+    `MARK tcp -- anywhere anywhere tcp dpt:30000 MARK set 0x100`
+
+8. In ingress_sbox NS, adds to the nat iptable (2) POSTROUTING chain a rule to SNAT the request's srcIP to eth0's IP (10.255.0.2) and forward it to IPVS for load balancing, i.e.
+
+    `SNAT all -- 0.0.0.0/0 10.255.0.0/16 ipvs to:10.255.0.2`
+
+9. In ingress_sbox NS, configures the IPVS load-balancing policy for marked traffic, i.e.
+
+    FWM  256 rr
+      -> 10.255.0.5:0    Masq    1    0    0
+      -> 10.255.0.7:0    Masq    1    0    0
+      -> 10.255.0.8:0    Masq    1    0    0
+
+    Here each 10.255.0.x represents the IP address of a container instance backing the service (an inspection sketch follows this list).
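+
+A minimal way to look at this IPVS configuration and the associated traffic marks from the host, assuming the `ipvsadm` and `nsenter` tools are installed (namespace names, the mark value, and the published port follow the example above):
+
+    # list the load-balancing policy inside the ingress_sbox namespace
+    $ sudo nsenter --net=/var/run/docker/netns/ingress_sbox ipvsadm -ln
+    # list the mangle rules that mark traffic for publishedPort 30000
+    $ sudo nsenter --net=/var/run/docker/netns/ingress_sbox iptables -t mangle -S PREROUTING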
+
+
+
+## Service Traffic Flow
+
+This section describes the traffic flow of a request and its reply to/from a service with publishedPort=30000 and targetPort=80.
+
+
+
+1. A request arrives at eth0 in host NS with dstIP=172.31.2.1, dstPort=30000, srcIP=CLIENT_IP, srcPort=CLIENT_PORT. Before routing, it first goes through the nat rule in service setup (6), which DNATs the request to dstIP=172.18.0.2. During routing it passes the FORWARD rule in service setup (5), which allows requests with dstPort=30000 through. Routing then forwards the request to docker_gwbridge, and in turn ...
+2. The request arrives at eth1 in ingress_sbox NS with dstIP=172.18.0.2, dstPort=30000, srcIP=CLIENT_IP, srcPort=CLIENT_PORT. Before routing, the request is marked by the mangle iptable rule in service setup (7). After routing, it is SNATed to eth0's IP 10.255.0.2 and forwarded to IPVS for load balancing by the nat iptable rule in service setup (8). The IPVS policy in setup (9) picks one container instance of the service, 10.255.0.5 in this example, and DNATs the request to it.
+3. The request arrives at br0 in ingress NS with dstIP=10.255.0.5, dstPort=30000, srcIP=10.255.0.2, srcPort=EPHEMERAL_PORT. For simplicity, we assume the container instance 10.255.0.5 is on the local host, so br0 simply forwards the request. Note that since br0 spans all hosts in the cluster via vxlan, with all service instances attached to it, picking a remote or a local container instance does not change the routing configuration.
+4. The request arrives at eth0 of container NS with dstIP=10.255.0.5, dstPort=30000, srcIP=10.255.0.2 (eth0's IP in ingress_sbox NS), srcPort=EPHEMERAL_PORT. Before routing, its dstPort is changed to 80 via the nat rule in service setup (3), and after routing it is allowed to reach the local process by the INPUT rule in service setup (2). The process listening on tcp:80 receives a request with dstIP=10.255.0.5, dstPort=80, srcIP=10.255.0.2, srcPort=EPHEMERAL_PORT.
+5. The process replies. The reply has dstIP=10.255.0.2, dstPort=EPHEMERAL_PORT, srcPort=80, with the srcIP not yet determined. It passes the filter rule in the OUTPUT chain of service setup (2). Routing determines the outbound interface is eth0 and sets srcIP=10.255.0.5, and the nat rule in service setup (3) "un-DNATs" srcPort=80 back to 30000.
+6. The reply arrives at br0 in ingress NS with dstIP=10.255.0.2, dstPort=EPHEMERAL_PORT, srcIP=10.255.0.5, srcPort=30000, which is duly forwarded to ...
+7. The eth0 interface in ingress_sbox NS. The reply first goes through the IPVS LB, which "un-DNATs" the srcIP from 10.255.0.5 back to 172.18.0.2; the nat rule in service setup (8) then "un-SNATs" the dstIP from 10.255.0.2 back to CLIENT_IP and the dstPort from EPHEMERAL_PORT back to CLIENT_PORT.
+8. The reply arrives at the docker_gwbridge interface of host NS with dstIP=CLIENT_IP, dstPort=CLIENT_PORT, srcIP=172.18.0.2, srcPort=30000. The reply then reverses the DNAT from service setup (6), with srcIP changed to 172.31.2.1, and is forwarded out of the eth0 interface, completing the traffic flow (see the connection-tracking sketch after this list). From an external view, the request enters the host with dstIP=172.31.2.1, dstPort=30000, srcIP=CLIENT_IP, srcPort=CLIENT_PORT; and the reply exits with dstIP=CLIENT_IP, dstPort=CLIENT_PORT, srcIP=172.31.2.1, srcPort=30000.
+
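+A hedged way to observe these NAT translations while a request is in flight is connection tracking, assuming the `conntrack` tool is installed (the published port and namespace name follow the example above):
+
+    # tracked connections for the published port in the host namespace
+    $ sudo conntrack -L -p tcp | grep 30000
+    # and inside the ingress_sbox namespace
+    $ sudo nsenter --net=/var/run/docker/netns/ingress_sbox conntrack -L -p tcp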
+
+## Other Flows
+
+**Northbound traffic originating from a container instance, for example, ping [www.cnn.com](www.cnn.com):**
+
+The traffic flow is exactly the same as in bridge mode, except that it goes via docker_gwbridge in host NS and the traffic is masqueraded by the nat rule in initial setup (2).
+
+**DNS traffic**
+
+DNS lookups from a container instance are redirected to the docker engine for service discovery: inside the container the resolver points at the engine's embedded DNS server, and the nat rules added in service setup (4) forward those queries to the docker engine and its replies back to the container.
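+
+A quick, hedged check from a container attached to an overlay (or any user-defined) network; the container name is a placeholder:
+
+    # the container's resolver points at the embedded DNS server
+    $ docker exec <container> cat /etc/resolv.conf
+    nameserver 127.0.0.11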
+
+
+# Other IPTable Chain and Rules
+
+This section covers other iptable chains and rules created and/or managed by the docker engine/libnetwork.
+
+**DOCKER-USER**: inserted as the first rule in the FORWARD chain of the filter iptable in host NS, so that users can independently manage traffic that may or may not be related to docker containers.
+
+**DOCKER-ISOLATION-STAGE-1** / **-2**: inserted into the FORWARD chain of the filter iptable in host NS. Together they isolate docker bridge networks from one another: STAGE-1 matches traffic leaving a docker bridge and jumps to STAGE-2, which drops traffic destined for a different docker bridge.
+- A persistent database that stores the network configuration requested by the user. This is typically the SwarmKit managers' raft store.
+- A non-persistent peer-to-peer gossip-based database that keeps track of the current runtime state. This is NetworkDB.
+
+NetworkDB is based on the [SWIM][] protocol, which is implemented by the [memberlist][] library.
+`memberlist` manages cluster membership (nodes can join and leave), as well as message encryption.
+Members of the cluster send each other ping messages from time to time, allowing the cluster to detect when a node has become unavailable.
+
+The information held by each node in NetworkDB is:
+
+- The set of nodes currently in the cluster (plus nodes that have recently left or failed).
+- For each peer node, the set of networks to which that node is connected.
+- For each of the node's currently-in-use networks, a set of named tables of key/value pairs.
+ Note that nodes only keep track of tables for networks to which they belong.
+
+Updates spread through the cluster from node to node, and nodes may have inconsistent views at any given time.
+They will eventually converge (quickly, if the network is operating well).
+Nodes look up information using their local networkdb instance. Queries are not sent to remote nodes.
+
+NetworkDB does not impose any structure on the tables; they are just maps from `string` keys to `[]byte` values.
+Other components in libnetwork use the tables for their own purposes.
+For example, there are tables for service discovery and load balancing,
+and the [overlay](overlay.md) driver uses NetworkDB to store routing information.
+Updates to a network's tables are only shared between nodes that are on that network.
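+
+While the tables themselves are internal to libnetwork, a hedged way to see some of the state they drive on a swarm node is the verbose form of network inspect (`my-overlay` below is a placeholder network name):
+
+    # shows service-discovery / load-balancing state derived from NetworkDB tables
+    $ docker network inspect --verbose my-overlay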
+
+All libnetwork nodes join the gossip cluster.
+To do this, they need the IP address and port of at least one other member of the cluster.
+In the case of a SwarmKit cluster, for example, each Docker engine will use the IP addresses of the swarm managers as the initial join addresses.
+The `Join` method can be used to update these bootstrap IPs if they change while the system is running.
+
+When joining the cluster, the new node will initially synchronise its cluster-wide state (known nodes and networks, but not tables) with at least one other node.
+The state will be mostly kept up-to-date by small UDP gossip messages, but each node will also periodically perform a push-pull TCP sync with another random node.
+In a push-pull sync, the initiator sends all of its cluster-wide state to the target, and the target then sends all of its own state back in response.
+
+Once part of the gossip cluster, a node will also send a `NodeEventTypeJoin` message, which is a custom message defined by NetworkDB.
+This is not actually needed now, but keeping it is useful for backwards compatibility with nodes running previous versions.
+
+While a node is active in the cluster, it can join and leave networks.
+When a node wants to join a network, it will send a `NetworkEventTypeJoin` message via gossip to the whole cluster.
+It will also perform a bulk-sync of the network-specific state (the tables) with every other node on the network being joined.
+This will allow it to get all the network-specific information quickly.
+The tables will mostly be kept up-to-date by UDP gossip messages between the nodes on that network, but
+each node in the network will also periodically do a full TCP bulk sync of the tables with another random node on the same network.
+
+Note that there are two similar, but separate, gossip-and-periodic-sync mechanisms here:
+
+1. memberlist-provided gossip and push-pull sync of cluster-wide state, involving all nodes in the cluster.
+2. networkdb-provided gossip and bulk sync of network tables, for each network, involving just those nodes in that network.
+
+When a node wishes to leave a network, it will send a `NetworkEventTypeLeave` via gossip. It will then delete the network's table data.
+When a node hears that another node is leaving a network, it deletes all table entries belonging to the leaving node.
+Deleting an entry in this case means marking it for deletion for a while, so that we can detect and ignore any older events that may arrive about it.
+
+When a node wishes to leave the cluster, it will send a `NodeEventTypeLeave` message via gossip.
+Nodes receiving this will mark the node as "left".
+The leaving node will then send a memberlist leave message too.
+If we receive the memberlist leave message without first getting the `NodeEventTypeLeave` one, we mark the node as failed (for a while).
+Every node periodically attempts to reconnect to failed nodes, and will do a push-pull sync of cluster-wide state on success.
+On success we also send the node a `NodeEventTypeJoin` and then do a bulk sync of network-specific state for all networks that we have in common.
+**host-2**: Start a container that publishes a service svc2 in the network dev that is managed by the overlay driver.
+
+```
+$ docker run -i -t --publish-service=svc2.dev.overlay debian
+root@d217828eb876:/# ping svc1
+PING svc1 (172.21.0.16): 56 data bytes
+64 bytes from 172.21.0.16: icmp_seq=0 ttl=64 time=0.706 ms
+64 bytes from 172.21.0.16: icmp_seq=1 ttl=64 time=0.687 ms
+64 bytes from 172.21.0.16: icmp_seq=2 ttl=64 time=0.841 ms
+```
+### Detailed Setup
+
+You can also set up networks and services and then attach a running container to them.
+
+**host-1**:
+
+```
+docker network create -d overlay prod
+docker network ls
+docker network info prod
+docker service publish db1.prod
+cid=$(docker run -itd -p 8000:8000 ubuntu)
+docker service attach $cid db1.prod
+```
+
+**host-2**:
+
+```
+docker network ls
+docker network info prod
+docker service publish db2.prod
+cid=$(docker run -itd -p 8000:8000 ubuntu)
+docker service attach $cid db2.prod
+```
+
+Once the containers are started, the containers on `host-1` and `host-2` should be able to ping one another by IP address, service name, or \<service name>.\<network name>.
+
+
+View information about the networks and services using the `ls` and `info` subcommands, for example (output omitted):
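+
+    docker network ls
+    docker network info prod
+    docker service ls
+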
+The `drivers.remote` package provides the integration point for dynamically-registered drivers. Unlike the other driver packages, it does not provide a single implementation of a driver; rather, it provides a proxy for remote driver processes, which are registered and communicate with LibNetwork via the Docker plugin package.
+
+For the semantics of driver methods, which correspond to the protocol below, please see the [overall design](design.md).
+
+## LibNetwork integration with the Docker `plugins` package
+
+When LibNetwork initializes the `drivers.remote` package with the `Init()` function, it passes a `DriverCallback` as a parameter, which implements `RegisterDriver()`. The remote driver package uses this interface to register remote drivers with LibNetwork's `NetworkController`, by supplying it in a `plugins.Handle` callback.
+
+The callback is invoked when a driver is loaded with the `plugins.Get` API call. How that comes about is out of scope here (but it might be, for instance, when that driver is mentioned by the user).
+
+This design ensures that the details of the driver registration mechanism are owned by the remote driver package, and that none of the driver layer is exposed to the north of LibNetwork.
+
+## Implementation
+
+The remote driver implementation uses a `plugins.Client` to communicate with the remote driver process. The `driverapi.Driver` methods are implemented as RPCs over the plugin client.
+
+The payloads of these RPCs are mostly direct translations into JSON of the arguments given to the method. There are some exceptions to account for the use of the interfaces `InterfaceInfo` and `JoinInfo`, and data types that do not serialise to JSON well (e.g., `net.IPNet`). The protocol is detailed below under "Protocol".
+
+## Usage
+
+A remote driver proxy follows all the rules of any other in-built driver and has exactly the same `Driver` interface exposed. LibNetwork will also support driver-specific `options` and user-supplied `labels` which may influence the behaviour of a remote driver process.
+
+## Protocol
+
+The remote driver protocol is a set of RPCs, issued as HTTP POSTs with JSON payloads. The proxy issues requests, and the remote driver process is expected to respond usually with a JSON payload of its own, although in some cases these are empty maps.
+
+### Errors
+
+If the remote process cannot decode, or otherwise detects a syntactic problem with the HTTP request or payload, it must respond with an HTTP error status (4xx or 5xx).
+
+If the remote process HTTP server receives a request for an unknown URI, it should respond with the HTTP status code `404 Not Found`. This allows LibNetwork to detect when a remote driver does not yet implement a newly added method, and therefore not to treat the request as failed.
+
+If the remote process can decode the request, but cannot complete the operation, it must send a response in the form
+
+ {
+ "Err": string
+ }
+
+The string value supplied may appear in logs, so should not include confidential information.
+
+### Handshake
+
+When loaded, a remote driver process receives an HTTP POST on the URL `/Plugin.Activate` with no payload. It must respond with a manifest of the form
+
+ {
+ "Implements": ["NetworkDriver"]
+ }
+
+Other entries in the list value are allowed; `"NetworkDriver"` indicates that the plugin should be registered with LibNetwork as a driver.
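+
+A hedged way to exercise the handshake by hand against a driver listening on a unix socket (the socket path and plugin name here are assumptions; plugins may also listen on TCP):
+
+    $ curl -s -XPOST --unix-socket /run/docker/plugins/mynetdriver.sock http://localhost/Plugin.Activate
+    {"Implements": ["NetworkDriver"]}
+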
+
+### Set capability
+
+After Handshake, the remote driver will receive another POST message to the URL `/NetworkDriver.GetCapabilities` with no payload. The driver's response should have the form:
+
+ {
+    "Scope": "local",
+    "ConnectivityScope": "global"
+ }
+
+The value of "Scope" should be either "local" or "global", which indicates whether the resource allocations for this driver's network can be done only locally to the node or globally across the cluster of nodes. Any other value will fail the driver's registration and return an error to the caller.
+Similarly, the value of "ConnectivityScope" should be either "local" or "global", which indicates whether the driver's network can provide connectivity only locally to this node or globally across the cluster of nodes. If the value is missing, libnetwork will set it to the value of "Scope".
+
+### Create network
+
+When the proxy is asked to create a network, the remote process shall receive a POST to the URL `/NetworkDriver.CreateNetwork` of the form
+
+ {
+ "NetworkID": string,
+ "IPv4Data" : [
+ {
+ "AddressSpace": string,
+ "Pool": ipv4-cidr-string,
+ "Gateway" : ipv4-cidr-string,
+ "AuxAddresses": {
+ "<identifier1>" : "<ipv4-address1>",
+ "<identifier2>" : "<ipv4-address2>",
+ ...
+ }
+ },
+ ],
+ "IPv6Data" : [
+ {
+ "AddressSpace": string,
+ "Pool": ipv6-cidr-string,
+ "Gateway" : ipv6-cidr-string,
+ "AuxAddresses": {
+ "<identifier1>" : "<ipv6-address1>",
+ "<identifier2>" : "<ipv6-address2>",
+ ...
+ }
+ },
+ ],
+ "Options": {
+ ...
+ }
+ }
+
+* `NetworkID` value is generated by LibNetwork and represents a unique network.
+* `Options` value is the arbitrary map given to the proxy by LibNetwork.
+* `IPv4Data` and `IPv6Data` are the ip-addressing data configured by the user and managed by the IPAM driver. The network driver is expected to honor the ip-addressing data supplied by the IPAM driver. The data include:
+  * `AddressSpace` : A unique string that represents an isolated space for IP addressing.
+  * `Pool` : A range of IP addresses represented in CIDR format (address/mask). Since the IPAM driver is responsible for allocating container ip-addresses, the network driver can make use of this information for network plumbing purposes.
+  * `Gateway` : Optionally, the IPAM driver may provide a Gateway IP address in CIDR format for the subnet represented by the Pool. The network driver can make use of this information for network plumbing purposes.
+  * `AuxAddresses` : A list of pre-allocated ip-addresses with an associated identifier, as provided by the user, to assist the network driver if it requires specific ip-addresses for its operation.
+
+The response indicating success is empty:
+
+ {}
+
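+A hedged example of issuing this call by hand (socket path, plugin name, and the IDs/addresses in the payload are all illustrative):
+
+    $ curl -s -XPOST --unix-socket /run/docker/plugins/mynetdriver.sock \
+        -H 'Content-Type: application/json' \
+        -d '{"NetworkID":"dummy-network-id","IPv4Data":[{"AddressSpace":"LocalDefault","Pool":"172.28.0.0/16","Gateway":"172.28.0.1/16"}],"Options":{}}' \
+        http://localhost/NetworkDriver.CreateNetwork
+    {}
+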
+### Delete network
+
+When a network owned by the remote driver is deleted, the remote process shall receive a POST to the URL `/NetworkDriver.DeleteNetwork` of the form
+
+ {
+ "NetworkID": string
+ }
+
+The success response is empty:
+
+ {}
+
+### Create endpoint
+
+When the proxy is asked to create an endpoint, the remote process shall receive a POST to the URL `/NetworkDriver.CreateEndpoint` of the form
+
+ {
+ "NetworkID": string,
+ "EndpointID": string,
+ "Options": {
+ ...
+ },
+ "Interface": {
+ "Address": string,
+ "AddressIPv6": string,
+ "MacAddress": string
+ }
+ }
+
+The `NetworkID` is the generated identifier for the network to which the endpoint belongs; the `EndpointID` is a generated identifier for the endpoint.
+
+`Options` is an arbitrary map as supplied to the proxy.
+
+The `Interface` value is of the form given. The fields in the `Interface` may be empty; and the `Interface` itself may be empty. If supplied, `Address` is an IPv4 address and subnet in CIDR notation; e.g., `"192.168.34.12/16"`. If supplied, `AddressIPv6` is an IPv6 address and subnet in CIDR notation. `MacAddress` is a MAC address as a string; e.g., `"6e:75:32:60:44:c9"`.
+
+A success response is of the form
+
+ {
+ "Interface": {
+ "Address": string,
+ "AddressIPv6": string,
+ "MacAddress": string
+ }
+ }
+
+with values in the `Interface` as above. As far as the value of `Interface` is concerned, `MacAddress` and either or both of `Address` and `AddressIPv6` must be given.
+
+If the remote process was supplied a non-empty value in `Interface`, it must respond with an empty `Interface` value. LibNetwork will treat it as an error if it supplies a non-empty value and receives a non-empty value back, and roll back the operation.
+
+### Endpoint operational info
+
+The proxy may be asked for "operational info" on an endpoint. When this happens, the remote process shall receive a POST to `/NetworkDriver.EndpointOperInfo` of the form
+
+ {
+ "NetworkID": string,
+ "EndpointID": string
+ }
+
+where `NetworkID` and `EndpointID` have meanings as above. It must send a response of the form
+
+ {
+ "Value": { ... }
+ }
+
+where the value of the `Value` field is an arbitrary (possibly empty) map.
+
+### Delete endpoint
+
+When an endpoint is deleted, the remote process shall receive a POST to the URL `/NetworkDriver.DeleteEndpoint` with a body of the form
+
+ {
+ "NetworkID": string,
+ "EndpointID": string
+ }
+
+where `NetworkID` and `EndpointID` have meanings as above. A success response is empty:
+
+ {}
+
+### Join
+
+When a sandbox is given an endpoint, the remote process shall receive a POST to the URL `/NetworkDriver.Join` of the form
+
+ {
+ "NetworkID": string,
+ "EndpointID": string,
+ "SandboxKey": string,
+ "Options": { ... }
+ }
+
+The `NetworkID` and `EndpointID` have meanings as above. The `SandboxKey` identifies the sandbox. `Options` is an arbitrary map as supplied to the proxy.
+
+The response must have the form
+
+ {
+ "InterfaceName": {
+    "SrcName": string,
+    "DstPrefix": string
+ },
+ "Gateway": string,
+ "GatewayIPv6": string,
+ "StaticRoutes": [{
+ "Destination": string,
+ "RouteType": int,
+ "NextHop": string,
+ }, ...]
+ }
+
+`Gateway` is optional and if supplied is an IP address as a string; e.g., `"192.168.0.1"`. `GatewayIPv6` is optional and if supplied is an IPv6 address as a string; e.g., `"fe80::7809:baff:fec6:7744"`.
+
+The entries in `InterfaceName` represent actual OS level interfaces that should be moved by LibNetwork into the sandbox; the `SrcName` is the name of the OS level interface that the remote process created, and the `DstPrefix` is a prefix for the name the OS level interface should have after it has been moved into the sandbox (LibNetwork will append an index to make sure the actual name does not collide with others).
+
+The entries in `"StaticRoutes"` represent routes that should be added to an interface once it has been moved into the sandbox. Since there may be zero or more routes for an interface, unlike the interface name they can be supplied in any order.
+
+Routes are either given a `RouteType` of `0` and a value for `NextHop`; or, a `RouteType` of `1` and no value for `NextHop`, meaning a connected route.
+
+If no gateway and no default static route is set by the driver in the Join response, LibNetwork will add an additional interface to the sandbox connecting to a default gateway network (a bridge network named *docker_gwbridge*) and program the default gateway into the sandbox accordingly, pointing to the interface address of the bridge *docker_gwbridge*.
+
+### Leave
+
+If the proxy is asked to remove an endpoint from a sandbox, the remote process shall receive a POST to the URL `/NetworkDriver.Leave` of the form
+
+ {
+ "NetworkID": string,
+ "EndpointID": string
+ }
+
+where `NetworkID` and `EndpointID` have meanings as above. The success response is empty:
+
+ {}
+
+### DiscoverNew Notification
+
+LibNetwork listens to inbuilt docker discovery notifications and passes them along to the interested drivers.
+
+When the proxy receives a DiscoverNew notification, the remote process shall receive a POST to the URL `/NetworkDriver.DiscoverNew` of the form
+
+ {
+ "DiscoveryType": int,
+ "DiscoveryData": {
+ ...
+ }
+ }
+
+`DiscoveryType` represents the discovery type. Each discovery type is represented by a number.
+`DiscoveryData` carries the discovery data, the structure of which is determined by the `DiscoveryType`.
+
+The response indicating success is empty:
+
+ {}
+
+* Node Discovery
+
+Node Discovery is represented by a `DiscoveryType` value of `1` and the corresponding `DiscoveryData` will carry Node discovery data.
+
+ {
+ "DiscoveryType": int,
+ "DiscoveryData": {
+    "Address" : string,
+    "self" : bool
+ }
+ }
+
+### DiscoverDelete Notification
+
+When the proxy receives a DiscoverDelete notification, the remote process shall receive a POST to the URL `/NetworkDriver.DiscoverDelete` of the form
+
+ {
+ "DiscoveryType": int,
+ "DiscoveryData": {
+ ...
+ }
+ }
+
+`DiscoveryType` represents the discovery type. Each discovery type is represented by a number.
+`DiscoveryData` carries the discovery data, the structure of which is determined by the `DiscoveryType`.
+
+The response indicating success is empty:
+
+ {}
+
+* Node Discovery
+
+Similar to the DiscoverNew call, Node Discovery is represented by a `DiscoveryType` value of `1` and the corresponding `DiscoveryData` will carry Node discovery data to be deleted.
+This documentation highlights how to use Vagrant to start a three-node setup to test Docker networking.
+
+## Pre-requisites
+
+This was tested on:
+
+- Vagrant 1.7.2
+- VirtualBox 4.3.26
+
+## Machine Setup
+
+The Vagrantfile provided will start three virtual machines. One will act as a Consul server, and the other two will act as Docker hosts.
+The experimental version of Docker is installed.
+
+- `consul-server` is the Consul server node, based on Ubuntu 14.04, with IP 192.168.33.10
+- `net-1` is the first Docker host, based on Ubuntu 14.10, with IP 192.168.33.11
+- `net-2` is the second Docker host, based on Ubuntu 14.10, with IP 192.168.33.12
+
+## Getting Started
+
+Clone this repo, change to the `docs` directory and let Vagrant do the work.
+
+ $ vagrant up
+ $ vagrant status
+ Current machine states:
+
+ consul-server running (virtualbox)
+ net-1 running (virtualbox)
+ net-2 running (virtualbox)
+
+You are now ready to SSH to the Docker hosts and start containers.
+
+ $ vagrant ssh net-1
+ vagrant@net-1:~$ docker version
+ Client version: 1.8.0-dev
+ ...<snip>...
+
+Check that Docker network is functional by listing the default networks:
+
+ vagrant@net-1:~$ docker network ls
+ NETWORK ID NAME TYPE
+ 4275f8b3a821 none null
+ 80eba28ed4a7 host host
+ 64322973b4aa bridge bridge
+
+No services have been published so far, so `docker service ls` returns an empty list:
+
+ $ docker service ls
+ SERVICE ID NAME NETWORK CONTAINER
+
+Start a container and check the content of `/etc/hosts`.
+
+ $ docker run -it --rm ubuntu:14.04 bash
+ root@df479e660658:/# cat /etc/hosts
+ 172.21.0.3 df479e660658
+ 127.0.0.1 localhost
+ ::1 localhost ip6-localhost ip6-loopback
+ fe00::0 ip6-localnet
+ ff00::0 ip6-mcastprefix
+ ff02::1 ip6-allnodes
+ ff02::2 ip6-allrouters
+ 172.21.0.3 distracted_bohr
+ 172.21.0.3 distracted_bohr.multihost
+
+In a separate terminal on `net-1`, list the networks again. You will see that the _multihost_ overlay now appears.
+The overlay network _multihost_ is your default network. This was set up by the Docker daemon during the Vagrant provisioning. Check `/etc/default/docker` to see the options that were set.
+
+ vagrant@net-1:~$ docker network ls
+ NETWORK ID NAME TYPE
+ 4275f8b3a821 none null
+ 80eba28ed4a7 host host
+ 64322973b4aa bridge bridge
+ b5c9f05f1f8f multihost overlay
+
+Now in a separate terminal, SSH to `net-2` and check the networks and services. The networks will be the same, and the default network will also be _multihost_ of type overlay. But the service list will show the container started on `net-1`:
+Start a container on `net-2` and check the `/etc/hosts`.
+
+ vagrant@net-2:~$ docker run -ti --rm ubuntu:14.04 bash
+ root@2ac726b4ce60:/# cat /etc/hosts
+ 172.21.0.4 2ac726b4ce60
+ 127.0.0.1 localhost
+ ::1 localhost ip6-localhost ip6-loopback
+ fe00::0 ip6-localnet
+ ff00::0 ip6-mcastprefix
+ ff02::1 ip6-allnodes
+ ff02::2 ip6-allrouters
+ 172.21.0.3 distracted_bohr
+ 172.21.0.3 distracted_bohr.multihost
+ 172.21.0.4 modest_curie
+ 172.21.0.4 modest_curie.multihost
+
+You will see not only the container that you just started on `net-2` but also the container that you started earlier on `net-1`.
+And of course you will be able to ping each container.
+
+## Creating a Non Default Overlay Network
+
+In the previous test we started containers with the regular options `-ti --rm`, and these containers were placed automatically in the default network, which was set to be the _multihost_ network of type overlay.
+
+But you could create your own overlay network and start containers in it. Let's create a new overlay network.
+On one of your Docker hosts, `net-1` or `net-2`, run the command shown below (the network name `foobar` is assumed here, to match the `bar.foobar.overlay` service used later):
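+
+    $ docker network create -d overlay foobar
+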
+The second host will automatically see this network as well. To start a container on this new network, simply use the `--publish-service` option of `docker run` like so:
+
+ $ docker run -it --rm --publish-service=bar.foobar.overlay ubuntu:14.04 bash
+
+Note that you could directly start a container on a new overlay using the `--publish-service` option, and it will create the network automatically.
+
+Check the docker services now:
+
+ $ docker service ls
+ SERVICE ID NAME NETWORK CONTAINER
+ b1ffdbfb1ac6 bar foobar 6635a3822135
+
+Repeat the getting started steps, by starting another container in this new overlay on the other host, check the `/etc/hosts` file and try to ping each container.
+
+## A look at the interfaces
+
+This new Docker multihost networking is made possible via VXLAN tunnels and the use of network namespaces.
+Check the [design](design.md) documentation for all the details. But to explore these concepts a bit, nothing beats an example.
+
+With a running container in one overlay, check the network namespace; one way to find it is via the container's `SandboxKey`, as shown below (output is illustrative and may differ by Docker version):
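+
+    $ docker inspect --format '{{ .NetworkSettings.SandboxKey }}' 6635a3822135
+    /var/run/docker/netns/6635a3822135
+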
+This is a non-default location for network namespaces, which might confuse things a bit. So let's become root, head over to the directory that contains the network namespaces of the containers, and check the interfaces:
+
+ $ sudo su
+ root@net-2:/home/vagrant# cd /var/run/docker/
+ root@net-2:/var/run/docker# ls netns
+ 6635a3822135
+ 8805e22ad6e2
+
+To be able to check the interfaces in those network namespaces using the `ip` command, just create a symlink for `netns` that points to `/var/run/docker/netns`: