The homedir package was only used to print default values for
flags that contained paths inside the user's home-directory in
a slightly nicer way (replace `/users/home` with `~`).
Given that this is not critical, we can replace this with golang's
function, which does not depend on libcontainer.
There's still one use of the homedir package in docker/docker/opts,
which is used by the dnet binary (but only requires the homedir
package when running in rootless mode)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
All distros that are supported by Docker now have at least
kernel version 3.10, so this check should no longer be needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
All distros that are supported by Docker now have at least
kernel version 3.10, so this check should no longer be needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, failing to disable IPv6 router advertisement prevented the daemon to
start.
An issue was reported by a user that started docker using `systemd-nspawn "machine"`,
which produced an error;
failed to start daemon: Error initializing network controller:
Error creating default "bridge" network: libnetwork:
Unable to disable IPv6 router advertisement:
open /proc/sys/net/ipv6/conf/docker0/accept_ra: read-only file system
This patch changes the error to a log-message instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The homedir package was only used to print default values for
flags that contained paths inside the user's home-directory in
a slightly nicer way (replace `/users/home` with `~`).
Given that this is not critical, we can replace this with golang's
function, which does not depend on libcontainer.
There's still one use of the homedir package in docker/docker/opts,
which is used by the dnet binary (but only requires the homedir
package when running in rootless mode)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Samuel Karp <skarp@amazon.com>
(cherry picked from commit 9489546c44d94d37337191c263879a7ac075a331)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fix 'failed to get network during CreateEndpoint' during container starting.
Change the error type to `libnetwork.ErrNoSuchNetwork`, so `Start()` in `daemon/cluster/executor/container/controller.go` will recreate the network.
Signed-off-by: Xinfeng Liu <xinfeng.liu@gmail.com>
This function always returned `nil`, so we can remove the error
return, and update other functions that were handling errors.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 2a480d515e updated the DNS library
and updated the error handling.
Due to changes in the library, we now had to check the response itself
to check if the response was truncated (Truncated DNS replies should
be sent to the client so that the client can retry over TCP).
However, 1e02aae252 added an incorrect
`nil` check to fix a panic, which ignored situations where
an error was returned, but no response (for example, if we failed
to connect to the DNS server).
In that situation, the error would be ignored, and further down we
would consider the connection to have been succesfull, but the DNS
server not returning a result.
After a "successful" lookup (but no results), we break the loop,
and don't attempt lookups in other DNS servers.
Versions before 1e02aae252 would produce:
Name To resolve: bbc.co.uk.
[resolver] query bbc.co.uk. (A) from 172.21.0.2:36181, forwarding to udp:192.168.5.1
[resolver] read from DNS server failed, read udp 172.21.0.2:36181->192.168.5.1:53: i/o timeout
[resolver] query bbc.co.uk. (A) from 172.21.0.2:38582, forwarding to udp:8.8.8.8
[resolver] received A record "151.101.0.81" for "bbc.co.uk." from udp:8.8.8.8
[resolver] received A record "151.101.192.81" for "bbc.co.uk." from udp:8.8.8.8
[resolver] received A record "151.101.64.81" for "bbc.co.uk." from udp:8.8.8.8
[resolver] received A record "151.101.128.81" for "bbc.co.uk." from udp:8.8.8.8
Versions after that commit would ignore the error, and stop further lookups:
Name To resolve: bbc.co.uk.
[resolver] query bbc.co.uk. (A) from 172.21.0.2:59870, forwarding to udp:192.168.5.1
[resolver] external DNS udp:192.168.5.1 returned empty response for "bbc.co.uk."
This patch updates the logic to handle the error to log the error (and continue with the next DNS):
- if an error is returned, and no response was received
- if an error is returned, but it was not related to a truncated response
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Tibor Vass <tibor@docker.com>
If firewalld is running, create a new docker zone and
add the docker interfaces to the docker zone to allow
container networking for distros with firewalld enabled
Fixes: https://github.com/moby/libnetwork/issues/2496
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
full diff: https://github.com/moby/ipvs/compare/v1.0.0...v1.0.1
- Fix compatibility issue on older kernels (< 3.18) where the address
family attribute for destination servers do not exist
- Fix the stats attribute check when parsing destination addresses
- NetlinkSocketsTimeout should be a constant
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This PR carryforwards https://github.com/moby/libnetwork/pull/2239
and incorporates the suggestions in comments to fix the NPE and
potential NPEs due to a null value returned by ep.Iface()
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Under certain conditions it appears that the DNS response and returned
error can be nil. When this happens, checking resp.Truncated results in
a nil panic so we must first check that the response is not nil before
checking if a truncated response was received.
See moby/moby#40715
Signed-off-by: Sam Whited <sam@samwhited.com>
full diff: https://github.com/vishvananda/netlink/compare/v1.0.0...v1.1.0
also updated moby/ipvs, which is compatible with this version of netlink,
and update vishvananda/netns to current master (which added go.mod)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The ipvs package was moved to a separate repo.
The ipvs package is a fairly generic set of helpers for managing IPVS.
The ipvs package is used by docker swarm and kubernetes.
Because we want to merge libnetwork back into the moby/moby codebase
while also not creating more dependencies for other projects on
moby/moby itself, it was decided that the best path for ipvs is to live
on it's own since there are no other ties to libnetwork.
Ref: https://github.com/moby/libnetwork/issues/2522
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
https://github.com/docker/libnetwork/pull/2419 and
https://github.com/docker/libnetwork/pull/2407
attempted to seperate out empty parent and internal for
macvlan and ipvlan networks
However it didnt pass the integration tests in moby
https://github.com/moby/moby/pull/40596 and exposed some
more plumbing that needed to be done to make sure
we separate the two things
If the -o parent is empty we create a dummylink
and if internal is set we dont add a default gateway
and make sure north-south communication cannot take place
(only east-west / container-container can)
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Deleting a network sandbox on Linux implicitly clears OS (ipvs) load
balancer state. Deleting an HNS network on Windows by contrast does not
inherently remove its corresponding VFP load balancers. The method to
remove load balancers belongs to the network and so must be called prior
to or while deleting a network. This commit reverts one line from
ea2fa20859, reintroducing a call to
explicitly remove backend load balancers during network removal.
Signed-off-by: Trapier Marshall <tmarshall@mirantis.com>
Debian Buster is now the current "stable", and will be the default
baseimage for Golang images going forward.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Using dummy interface allows communication beween containers only if
they are running on the same node in swarm.
Signed-off-by: Pavel Matěja <pavel@verotel.cz>
Using dummy interface allows communication beween containers only if
they are running on the same node in swam.
Signed-off-by: Pavel Matěja <pavel@verotel.cz>
Since docker container can be connected to combination of several
internal and external networks change of default gateway of the internal
ones breaks communication via the external ones.
This fixes only ipvlan network type
Signed-off-by: Pavel Matěja <pavel@verotel.cz>
Further improving load balancer performance by expiring
connections to servers with weights set to 0.
Signed-off-by: Andrew Kim <taeyeonkim90@gmail.com>
IPVS module used for swarm load balancer had a performance issue
under a high load situation. conn_reuse_mode=0 sysctl variable can
be set to handle the high load situation by reusing existing
connection entries in the IPVS table.
Under a high load, IPVS module was dropping tcp SYN packets whenever
a port reuse is detected with a connection in TIME_WAIT status forcing
clients to re-initiate tcp connections after request timeout events.
By setting conn_reuse_mode=0, IPVS module avoids special handling of
existing entries in the IPVS connection table.
Along with expire_nodest_conn=1, swarm load balancer can handle
a high load of requests and forward connections to newly joining
backend services.
Signed-off-by: Andrew Kim <taeyeonkim90@gmail.com>
This fix addresses https://docker.atlassian.net/browse/ENGCORE-1115
Expected behaviors upon docker engine restarts:
1. IPTableEnable=true, DOCKER-USER chain present
-- no change to DOCKER-USER chain
2. IPTableEnable=true, DOCKER-USER chain not present
-- DOCKER-USER chain created and inserted top of FORWARD
chain.
3. IPTableEnable=false, DOCKER-USER chain present
-- no change to DOCKER-USER chain
the rational is that DOCKER-USER is populated
and may be used by end-user for purpose other than
filtering docker container traffic. Thus even if
IPTableEnable=false, docker engine does not touch
pre-existing DOCKER-USER chain.
4. IPTableEnable=false, DOCKER-USER chain not present
-- DOCKER-USER chain is not created.
Signed-off-by: Su Wang <su.wang@docker.com>
Issue - "index out of range" panic in drivers/overlay/encryption.go:539
due to a mismatch in indices between curKeys and spis due to
case where updateKeys might bail out due to an error and
not update the spis
Fix - Reconfigure keys when there is a key update failure
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Golang 1.12.12
-------------------------------
full diff: https://github.com/golang/go/compare/go1.12.11...go1.12.12
go1.12.12 (released 2019/10/17) includes fixes to the go command, runtime,
syscall and net packages. See the Go 1.12.12 milestone on our issue tracker for
details.
https://github.com/golang/go/issues?q=milestone%3AGo1.12.12
Golang 1.12.11 (CVE-2019-17596)
-------------------------------
full diff: https://github.com/golang/go/compare/go1.12.10...go1.12.11
go1.12.11 (released 2019/10/17) includes security fixes to the crypto/dsa
package. See the Go 1.12.11 milestone on our issue tracker for details.
https://github.com/golang/go/issues?q=milestone%3AGo1.12.11
[security] Go 1.13.2 and Go 1.12.11 are released
Hi gophers,
We have just released Go 1.13.2 and Go 1.12.11 to address a recently reported
security issue. We recommend that all affected users update to one of these
releases (if you're not sure which, choose Go 1.13.2).
Invalid DSA public keys can cause a panic in dsa.Verify. In particular, using
crypto/x509.Verify on a crafted X.509 certificate chain can lead to a panic,
even if the certificates don't chain to a trusted root. The chain can be
delivered via a crypto/tls connection to a client, or to a server that accepts
and verifies client certificates. net/http clients can be made to crash by an
HTTPS server, while net/http servers that accept client certificates will
recover the panic and are unaffected.
Moreover, an application might crash invoking
crypto/x509.(*CertificateRequest).CheckSignature on an X.509 certificate
request, parsing a golang.org/x/crypto/openpgp Entity, or during a
golang.org/x/crypto/otr conversation. Finally, a golang.org/x/crypto/ssh client
can panic due to a malformed host key, while a server could panic if either
PublicKeyCallback accepts a malformed public key, or if IsUserAuthority accepts
a certificate with a malformed public key.
The issue is CVE-2019-17596 and Go issue golang.org/issue/34960.
Thanks to Daniel Mandragona for discovering and reporting this issue. We'd also
like to thank regilero for a previous disclosure of CVE-2019-16276.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also reduce the allowed port range as the total number of containers
per host is typically less than 1K.
This change helps in scenarios where there are other services on
the same host that uses ephemeral ports in iptables manipulation.
The workflow requires changes in docker engine (
https://github.com/moby/moby/pull/40055) and this change. It
works as follows:
1. user can now specified to docker engine an option
--published-port-range="50000-60000" as cmdline argument or
in daemon.json.
2. docker engine read and pass this info to libnetwork via
config.go:OptionDynamicPortRange.
3. libnetwork uses this range to allocate dynamic port henceforth.
4. --published-port-range can be set either via SIGHUP or
restart docker engine
5. if --published-port-range is not set by user, a OS specific
default range is used for dynamic port allocation.
Linux: 49153-60999, Windows: 60000-65000
6 if --published-port-range is invalid, that is, the range
given is outside of allowed default range, no change takes place.
libnetwork will continue to use old/existing port range for
dynamic port allocation.
Signed-off-by: Su Wang <su.wang@docker.com>
Reverts 141b53c77a (PR #2450)
Fallout from changing the forwarding default policy to deny was greater than anticipated.
Signed-off-by: Euan Harris <euan.harris@docker.com>
Fixed these tests :
1.TestNetworkDBIslands
Addresses : https://github.com/docker/libnetwork/issues/2402
2.TestNetworkDBCRUDMediumCluster
Addresses : https://github.com/docker/libnetwork/issues/2401
By :
1. Importing gotest.tools/poll to use poll.WaitOn
Above function can be used to check a condition at regular intervals
until a timeout is reached
2. Replacing Sleep with poll.WaitOn
2. Adding closeNetworkDBInstances to close remaining DBs
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
The output of "bridge fdb show" command invoked under a network
namespace is unpredicable. Sometime it returns empty, and sometime
non-stop rolling output. This perhaps is a bug in kernel
and/or iproute2 implementation. To work around, display fdb for
each bridge.
Signed-off-by: Su Wang <su.wang@docker.com>
This commit allows a user to specify a Host IP via the
com.docker.network.host_ipv4 label which is used as the
Source IP during SNAT for bridge networks .
The use case is for hosts with multiple interfaces and
this label can dictate which IP will be used as Source IP
for North-South traffic
In the absence of this label, MASQUERADE is used which picks the Source IP
based on Next Hop from the Route Table
Addresses: https://github.com/moby/moby/issues/30053
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
In windows HNS manages IPAM. If the user does not specify a subnet, HNS will choose one
for them. However, in order for the IPAM to show up in the output of "docker inspect",
we need to update the network IPAMv4Config field.
Signed-off-by: Pradip Dhara <pradipd@microsoft.com>
This reverts commit 9f58c475940fb0c0d4b69de0af7787b62a40481f.
This commit is causing TestCreateParallel to be flaky
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
This reverts commit 94af1e5af2.
The reason to revert this is, that TestCreateParallel is
continously failing and breaking the CI
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
This allows overriding the version of Go without making modifications in the
source code, which can be useful to test against multiple versions.
For example:
make GO_VERSION=1.13beta1 build
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit carries forward the work done in
https://github.com/docker/libnetwork/pull/2295
and fixes two things
1. Allows macvlan and ipvlan to be restored properly
after dockerd or the system is restarted
2. Makes sure the refcount for the configOnly network
is not incremented for the above case so this network
can be deleted after all the associated ConfigFrom networks
are deleted
Addresses: https://github.com/docker/libnetwork/issues/1743
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Since docker container can be connected to combination of several
internal and external networks change of default gateway of the internal
ones breaks communication via the external ones.
This fixes only macvlan network type
Signed-off-by: Pavel Matěja <pavel@verotel.cz>
This commit updates the vendored ishidawataru/sctp and adapts its used
types.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
moby/moby commit b27f70d45 wraps the ErrNotFound error returned when
a plugin cannot be found, to include a backtrace. This changes the
type of the error, so contoller.loadIPAMDriver no longer converts it
to a libnetwork plugin.NotFoundError.
This is a similar patch as was merged in 9b114971e5
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Changes included:
- Allow index specification at link creation time
- replace syscall with golang.org/x/sys/unix
- related: Use IFF_MULTI_QUEUE from x/sys/unix to define TUNTAP_MULTI_QUEUE
- related: Use IFLA_* constants from x/sys/unix
- Fix index out of range when no metadata for gretap
- added encapsulation attributes for Iptun and Sittun to support SIT tunnels
- Expose xfrm state's statistics
- Support invert in ip rules
- Support LWTUNNEL_ENCAP_SEG6
- Support setting and retrieving route MTU/AdvMSS
- Fix CalcRtable array parameter bug
- added support for Foo-over-UDP netlink calls
- Support num{tx,rx}queues and udp6zerocsum{tx,rx}
- tuntap: Add multiqueue support
- Retrieve VLAN ID when listing neighbour
- Fix LinkAdd for sit tunnel on 3.10 kernel
- Add support for managing source MACVLANs
- Two functions: one for adding bond slave, one for getting veth peer index
- Eliminate cgo from netlink
- Don't overwrite the XDP file descriptor with flags
- Fix reference to BPF instructions (on Kernel 4.13)
- Add Matchall filter
- Send IFA_CACHEINFO when setting up addresses
- Support IPv6 GRE Tun and Tap
- Add List option to RouteSubscribeWithOptions, AddrSubscribeWithOptions, and LinkSubscribeWithOptions
- Add Fq and Fq_Codel Qdisc support
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
RFC434 states that DNS Servers should be case insensitive
This commit makes sure that all DNS queries will be translated
to lower ASCII characters and all svcRecords will be saved in
lower case to abide by the RFC
Relates to https://github.com/moby/moby/issues/21169
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
This test was producing error messages due to missing endpoints
in the plugin API;
```
=== RUN TestValidRemoteDriver
ERRO[0039] error getting capability for valid-network-driver due to NetworkDriver.GetCapabilities: 404 page not found
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
systemd and udev in their default configuration attempt to set a
persistent MAC address for network interfaces that don't have one
already [systemd-def-link]. We set the address only after creating the
interface, so there is a race between us and udev. There are several
outcomes (that actually occur, this race is very much not a theoretical
one):
* We set the address before udev gets to the networking rules, so udev
sees `/sys/devices/virtual/net/docker0/addr_assign_type = 3`
(NET_ADDR_SET). This means there's no need to assign a different
address and everything is fine.
* udev reads `/sys/devices/virtual/net/docker0/addr_assign_type` before
we set the address, gets `1` (NET_ADDR_RANDOM), and proceeds to
generate and set a persistent address.
Old versions of udev (pre-v242, i.e. without [udev-patch]) would then
fail to generate an address, spit out "Could not generate persistent
MAC address for docker0: No such file or directory" (see [udev-issue],
and everything would be probably fine as well.
Current version of udev (with [udev-patch]) will generate an address
just fine and then race us setting it. As udev does more work than we,
the most probable outcome is that udev will overwrite the address we
set and possibly cause some trouble later on.
On a clean Debian Buster (from Vagrant) VM with systemd/udev 242 from
Debian Experimental, `docker network create net1` up to `net7` resulted
in 3 bridges having a 02:42: address and 4 bridges having a seemingly
random (actually generated from interface name) address. With systemd
241, the result would be all bridges having a 02:42:, but some "Could
not generate persistent MAC address for" messages in the log.
The fix is to revert the MAC address setting fix from 6901ea51dc,
as it is no longer necessary with current netlink [netlink-addr-add],
and set the address atomically when creating the bridge interface, not
after that.
[systemd-def-link]: a166cd3aac/network/99-default.link
[udev-patch]: 6d36464065
[udev-issue]: https://github.com/systemd/systemd/issues/3374
[netlink-addr-add]: 7d9b424492
...
Do note that a similar race happens when creating veth devices as well.
I wasn't able to reproduce getting a wrong (non-02:42:) address,
possibly because the address is set by docker later, maybe only after
the interface is moved to another network namespace (but I'm just
guessing here). Still, different timings result in various error
messages being logged ("link_config: could not get ethtool features for
vethd9c938e" and the like) depending on when the interface disappears
from the primary network namespace. I'm not sure how to fix this and I
don't intend to dig deeper into this.
Signed-off-by: Tomas Janousek <tomi@nomi.cz>
The endpoint count for --config-only networks
was being incremented even when the respective --config-from
inherited network failed to create a network
This was due to a variable shadowing problem with err causing
the deferred function to not execute correctly.
Using the same err variable across the entire function fixes
the issue
Fixes: moby/moby#35101
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Starting from 18.09, there's a per node LB for each overlay
network, this change adds the check to node LB.
This change should not break on older docker versions.
Signed-off-by: Xinfeng Liu <xinfeng.liu@gmail.com>
For overlay, l2bridge, and l2tunnel, if the user does not specify a host port, windows driver will select a random port for them. This matches linux behavior.
For ics and nat networks the windows OS will choose the port.
Signed-off-by: Pradip Dhara <pradipd@microsoft.com>