Some configuration in a container depends on whether it has support for
IPv6 (including default entries for '::1' etc in '/etc/hosts').
Before this change, the container's support for IPv6 was determined by
whether it was connected to any IPv6-enabled networks. But, that can
change over time, it isn't a property of the container itself.
So, instead, detect IPv6 support by looking for '::1' on the container's
loopback interface. It will not be present if the kernel does not have
IPv6 support, or the user has disabled it in new namespaces by other
means.
Once IPv6 support has been determined for the container, its '/etc/hosts'
is re-generated accordingly.
The daemon no longer disables IPv6 on all interfaces during initialisation.
It now disables IPv6 only for interfaces that have not been assigned an
IPv6 address. (But, even if IPv6 is disabled for the container using the
sysctl 'net.ipv6.conf.all.disable_ipv6=1', interfaces connected to IPv6
networks still get IPv6 addresses that appear in the internal DNS. There's
more to-do!)
Signed-off-by: Rob Murray <rob.murray@docker.com>
All components of the path are locked before the check, and
released once the path is already mounted.
This makes it impossible to replace the mounted directory until it's
actually mounted in the container.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All subpath components are opened with openat, relative to the base
volume directory and checked against the volume escape.
The final file descriptor is mounted from the /proc/self/fd/<fd> to a
temporary mount point owned by the daemon and then passed to the
underlying container runtime.
Temporary mountpoint is removed after the container is started.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`VolumeOptions` now has a `Subpath` field which allows to specify a path
relative to the volume that should be mounted as a destination.
Symlinks are supported, but they cannot escape the base volume
directory.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
We constructed a "function level" logger, which was used once "as-is", but
also added additional Fields in a loop (for each resource), effectively
overwriting the previous one for each iteration. Adding additional
fields can result in some overhead, so let's construct a "logger" only for
inside the loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We have many "image" packages, so these vars easily conflict/shadow
imports. Let's rename them (and in some cases use a const) to
prevent that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For some time, when adding an interface with no IPv6 address (an
interface to a network that does not have IPv6 enabled), we've been
disabling IPv6 on that interface.
As part of a separate change, I'm removing that logic - there's nothing
wrong with having IPv6 enabled on an interface with no routable address.
The difference is that the kernel will assign a link-local address.
TestAddRemoveInterface does this...
- Assign an IPv6 link-local address to one end of a veth interface, and
add it to a namespace.
- Add a bridge with no assigned IPv6 address to the namespace.
- Remove the veth interface from the namespace.
- Put the veth interface back into the namespace, still with an
explicitly assigned IPv6 link local address.
When IPv6 is disabled on the bridge interface, the test passes.
But, when IPv6 is enabled, the bridge gets a kernel assigned link-local
address.
Then, when re-adding the veth interface, the test generates an error in
'osl/interface_linux.go:checkRouteConflict()'. The conflict is between
the explicitly assigned fe80::2 on the veth, and a route for fe80::/64
belonging to the bridge.
So, in preparation for not-disabling IPv6 on these interfaces, use a
unique-local address in the test instead of link-local.
I don't think that changes the intent of the test.
With the change to not-always disable IPv6, it is possible to repro the
problem with a real container, disconnect and re-connect a user-defined
network with '--subnet fe80::/64' while the container's connected to an
IPv4 network. So, strictly speaking, that will be a regression.
But, it's also possible to repro the problem in master, by disconnecting
and re-connecting the fe80::/64 network while another IPv6 network is
connected. So, I don't think it's a problem we need to address, perhaps
other than by prohibiting '--subnet fe80::/64'.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Message is different with containerd backend. The Linux test
`TestPullLinuxImageFailsOnLinux` was adjusted before, but we missed this
one.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
All commonly used filesystems should use ref-counted mounter, so make it
the default instead of having to whitelist them.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Prior to this commit, a container running with `--net=host` had
`{"type":"network","path":"/var/run/docker/netns/default"}` in
the ``.linux.namespaces` field of the OCI Runtime Config,
but this wasn't needed.
Close issue 47100
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The actual divergence is due to differences in the snapshotter and
graphfilter mount behaviour on Windows, but the snapshotter behaviour is
better, so we deal with it here rather than changing the snapshotter
behaviour.
We're relying on the internals of containerd's Windows mount
implementation here. Unless this code flow is replaced, future work is
to move getBackingDeviceForContainerdMount into containerd's mount
implementation.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
That means 'null', not that we can call builder-next on Windows. If and
when we do get builder-next going, this will need to be solved properly
in some way.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The existing API ImageService.GetLayerFolders didn't have access to the
ID of the container, and once we have that, the snapshotter Mounts API
provides all the information we need here.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Needed for Diff on Windows. Don't remount it afterwards as the layer is
going to be released anyway.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is consistent with layerStore's CreateRWLayer behaviour.
Potentially this can be refactored to avoid creating the -init layer,
but as noted in layerStore's initMount, this name may be special, and
should be cleared-out all-at-once.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This change adds a TempDir function that ensures the correct permissions for
the fake-root user in rootless mode.
Signed-off-by: Evan Lezar <elezar@nvidia.com>