Commit graph

30 commits

Author SHA1 Message Date
Cory Snider
d21d0884ae libnetwork: share a single datastore with drivers
The bbolt library wants exclusive access to the boltdb file and uses
file locking to assure that is the case. The controller and each network
driver that needs persistent storage instantiates its own unique
datastore instance, backed by the same boltdb file. The boltdb kvstore
implementation works around multiple access to the same boltdb file by
aggressively closing the boltdb file between each transaction. This is
very inefficient. Have the controller pass its datastore instance into
the drivers and enable the PersistConnection option to disable closing
the boltdb between transactions.

Set data-dir in unit tests which instantiate libnetwork controllers so
they don't hang trying to lock the default boltdb database file.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-01-31 21:08:34 -05:00
Paweł Gronowski
f07387466a
daemon/oci: Extract side effects from withMounts
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-19 17:27:16 +01:00
Sebastiaan van Stijn
210932b3bf
daemon: format code with gofumpt
Formatting the code with https://github.com/mvdan/gofumpt

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-29 00:33:03 +02:00
Cory Snider
dea870f4ea daemon: stop setting container resources to zero
Many of the fields in LinuxResources struct are pointers to scalars for
some reason, presumably to differentiate between set-to-zero and unset
when unmarshaling from JSON, despite zero being outside the acceptable
range for the corresponding kernel tunables. When creating the OCI spec
for a container, the daemon sets the container's OCI spec CPUShares and
BlkioWeight parameters to zero when the corresponding Docker container
configuration values are zero, signifying unset, despite the minimum
acceptable value for CPUShares being two, and BlkioWeight ten. This has
gone unnoticed as runC does not distingiush set-to-zero from unset as it
also uses zero internally to represent unset for those fields. However,
kata-containers v3.2.0-alpha.3 tries to apply the explicit-zero resource
parameters to the container, exactly as instructed, and fails loudly.
The OCI runtime-spec is silent on how the runtime should handle the case
when those parameters are explicitly set to out-of-range values and
kata's behaviour is not unreasonable, so the daemon must therefore be in
the wrong.

Translate unset values in the Docker container's resources HostConfig to
omit the corresponding fields in the container's OCI spec when starting
and updating a container in order to maximize compatibility with
runtimes.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-06 12:13:05 -04:00
Cory Snider
9ff169ccf4 daemon: modernize oci_linux_test.go
Switch to using t.TempDir() instead of rolling our own.

Clean up mounts leaked by the tests as otherwise the tests fail due to
the leaked mounts because unlike the old cleanup code, t.TempDir()
cleanup does not ignore errors from os.RemoveAll.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-05 18:30:30 -04:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Sebastiaan van Stijn
a82c434447
daemon: setupFakeDaemon(): add fakeImageService
To prevent a panic happening when running tests:

    === FAIL: daemon TestTmpfsDevShmNoDupMount (0.01s)
    panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x120 pc=0x261a373]

    goroutine 134 [running]:
    testing.tRunner.func1.2({0x28baf20, 0x3ea8000})
        /usr/local/go/src/testing/testing.go:1526 +0x24e
    testing.tRunner.func1()
        /usr/local/go/src/testing/testing.go:1529 +0x39f
    panic({0x28baf20, 0x3ea8000})
        /usr/local/go/src/runtime/panic.go:884 +0x213
    github.com/docker/docker/daemon.(*Daemon).createSpec(0xc0006e0000, {0x2ea5588, 0xc00012a008}, 0xc0003b5900)
        /go/src/github.com/docker/docker/daemon/oci_linux.go:1060 +0xf33
    github.com/docker/docker/daemon.TestTmpfsDevShmNoDupMount(0xc000b781a0?)
        /go/src/github.com/docker/docker/daemon/oci_linux_test.go:77 +0x20a
    testing.tRunner(0xc000b78340, 0x2c74210)
        /usr/local/go/src/testing/testing.go:1576 +0x10b
    created by testing.(*T).Run
        /usr/local/go/src/testing/testing.go:1629 +0x3ea

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-04-18 15:02:41 +02:00
Nicolas De Loof
def549c8f6
imageservice: Add context to various methods
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2022-11-03 12:22:40 +01:00
Cory Snider
e332c41e9d pkg/containerfs: alias ContainerFS to string
Drop the constructor and redundant string() type-casts.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-09-23 16:56:52 -04:00
Cory Snider
098a44c07f Finish refactor of UID/GID usage to a new struct
Finish the refactor which was partially completed with commit
34536c498d, passing around IdentityMapping structs instead of pairs of
[]IDMap slices.

Existing code which uses []IDMap relies on zero-valued fields to be
valid, empty mappings. So in order to successfully finish the
refactoring without introducing bugs, their replacement therefore also
needs to have a useful zero value which represents an empty mapping.
Change IdentityMapping to be a pass-by-value type so that there are no
nil pointers to worry about.

The functionality provided by the deprecated NewIDMappingsFromMaps
function is required by unit tests to to construct arbitrary
IdentityMapping values. And the daemon will always need to access the
mappings to pass them to the Linux kernel. Accommodate these use cases
by exporting the struct fields instead. BuildKit currently depends on
the UIDs and GIDs methods so we cannot get rid of them yet.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-03-14 16:28:57 -04:00
Sebastiaan van Stijn
a826ca3aef
daemon.WithCommonOptions() fix detection of user-namespaces
Commit dae652e2e5 added support for non-privileged
containers to use ICMP_PROTO (used for `ping`). This option cannot be set for
containers that have user-namespaces enabled.

However, the detection looks to be incorrect; HostConfig.UsernsMode was added
in 6993e891d1 / ee2183881b,
and the property only has meaning if the daemon is running with user namespaces
enabled. In other situations, the property has no meaning.
As a result of the above, the sysctl would only be set for containers running
with UsernsMode=host on a daemon running with user-namespaces enabled.

This patch adds a check if the daemon has user-namespaces enabled (RemappedRoot
having a non-empty value), or if the daemon is running inside a user namespace
(e.g. rootless mode) to fix the detection.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-30 19:48:29 +02:00
Akihiro Suda
9e7bbdb9ba
Merge pull request #40084 from thaJeztah/hostconfig_const_cleanup
api/types: hostconfig: add some constants/enums and minor code cleanup
2021-08-28 00:21:31 +09:00
Eng Zer Jun
c55a4ac779
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated in Go 1.16. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-08-27 14:56:57 +08:00
Sebastiaan van Stijn
98f0f0dd87
api/types: hostconfig: define consts for IpcMode
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-06 19:05:51 +02:00
Brian Goff
4b981436fe Fixup libnetwork lint errors
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 23:48:32 +00:00
Brian Goff
a0a473125b Fix libnetwork imports
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 21:51:23 +00:00
Arnaud Rebillout
6349b32e1b daemon/oci_linux_test: Skip privileged tests when non-root
These tests fail when run by a non-root user

  === RUN   TestTmpfsDevShmNoDupMount
      oci_linux_test.go:29: assertion failed: error is not nil: mkdir /var/lib/docker: permission denied
  --- FAIL: TestTmpfsDevShmNoDupMount (0.00s)
  === RUN   TestIpcPrivateVsReadonly
      oci_linux_test.go:29: assertion failed: error is not nil: mkdir /var/lib/docker: permission denied
  --- FAIL: TestIpcPrivateVsReadonly (0.00s)
  === RUN   TestSysctlOverride
      oci_linux_test.go:29: assertion failed: error is not nil: mkdir /var/lib/docker: permission denied
  --- FAIL: TestSysctlOverride (0.00s)
  === RUN   TestSysctlOverrideHost
      oci_linux_test.go:29: assertion failed: error is not nil: mkdir /var/lib/docker: permission denied
  --- FAIL: TestSysctlOverrideHost (0.00s)

Signed-off-by: Arnaud Rebillout <elboulangero@gmail.com>
2020-12-15 09:47:44 +07:00
Justin Cormack
dae652e2e5
Add default sysctls to allow ping sockets and privileged ports with no capabilities
Currently default capability CAP_NET_RAW allows users to open ICMP echo
sockets, and CAP_NET_BIND_SERVICE allows binding to ports under 1024.
Both of these are safe operations, and Linux now provides ways that
these can be set, per container, to be allowed without any capabilties
for non root users. Enable these by default. Users can revert to the
previous behaviour by overriding the sysctl values explicitly.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2020-06-04 18:11:08 +01:00
Sebastiaan van Stijn
eeef12f469
daemon: address some minor linting issues and nits
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:17 +02:00
Sebastiaan van Stijn
9f0b3f5609
bump gotest.tools v3.0.1 for compatibility with Go 1.14
full diff: https://github.com/gotestyourself/gotest.tools/compare/v2.3.0...v3.0.1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 00:06:42 +01:00
Aleksa Sarai
f38ac72bca
oci: add integration tests for kernel.domainname configuration
This also includes a few refactors of oci_linux_test.go.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2018-11-30 19:44:50 +11:00
Lihua Tang
8df0b2de54 Fix typos in comment
Signed-off-by: Lihua Tang <lhtang@alauda.io>
2018-09-07 13:17:42 +08:00
Salahuddin Khan
763d839261 Add ADD/COPY --chown flag support to Windows
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.

NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.

The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>

On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.

Signed-off-by: Salahuddin Khan <salah@docker.com>
2018-08-13 21:59:11 -07:00
Vincent Demeester
3845728524
Update tests to use gotest.tools 👼
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
2018-06-13 09:04:30 +02:00
Kir Kolyshkin
d8fd6137a1 daemon.getSourceMount(): fix for / mount point
A recent optimization in getSourceMount() made it return an error
in case when the found mount point is "/". This prevented bind-mounted
volumes from working in such cases.

A (rather trivial but adeqate) unit test case is added.

Fixes: 871c957242 ("getSourceMount(): simplify")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-05-10 12:53:37 -07:00
Daniel Nephin
6be0f70983 Automated migration using
gty-migrate-from-testify --ignore-build-tags

Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-03-16 11:03:43 -04:00
Kir Kolyshkin
33dd562e3a daemon/oci_linux_test: add TestIpcPrivateVsReadonly
The test case checks that in case of IpcMode: private and
ReadonlyRootfs: true (as in "docker run --ipc private --read-only")
the resulting /dev/shm mount is NOT made read-only.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-03-08 14:04:03 -08:00
Daniel Nephin
4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
Kir Kolyshkin
1861abdc4a Fix "duplicate mount point" when --tmpfs /dev/shm is used
This is a fix to the following issue:

$ docker run --tmpfs /dev/shm busybox sh
docker: Error response from daemon: linux mounts: Duplicate mount point '/dev/shm'.

In current code (daemon.createSpec()), tmpfs mount from --tmpfs is added
to list of mounts (`ms`), when the mount from IpcMounts() is added.
While IpcMounts() is checking for existing mounts first, it does that
by using container.HasMountFor() function which only checks container.Mounts
but not container.Tmpfs.

Ultimately, the solution is to get rid of container.Tmpfs (moving its
data to container.Mounts). Current workaround is to add checking
of container.Tmpfs into container.HasMountFor().

A unit test case is included.

Unfortunately we can't call daemon.createSpec() from a unit test,
as the code relies a lot on various daemon structures to be initialized
properly, and it is hard to achieve. Therefore, we minimally mimick
the code flow of daemon.createSpec() -- barely enough to reproduce
the issue.

https://github.com/moby/moby/issues/35455

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-11-20 18:48:27 -08:00