Commit graph

1015 commits

Author SHA1 Message Date
Djordje Lukic
b4ffe3a9fb Move the inspect code away from the image service
The LoopkupImage method is only used by the inspect image route and
returns an api/type struct. The depenency to api/types of the
daemon/images package is wrong, the daemon doesn't need to know about
the api types.

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2022-06-22 15:08:55 +02:00
Sebastiaan van Stijn
b241e2008e
daemon.NewDaemon(): fix network feature detection on first start
Commit 483aa6294b introduced a regression, causing
spurious warnings to be shown when starting a daemon for the first time after
a fresh install:

    docker info
    ...
    WARNING: IPv4 forwarding is disabled
    WARNING: bridge-nf-call-iptables is disabled
    WARNING: bridge-nf-call-ip6tables is disabled

The information shown is incorrect, as checking the corresponding options on
the system, shows that these options are available:

    cat /proc/sys/net/ipv4/ip_forward
    1
    cat /proc/sys/net/bridge/bridge-nf-call-iptables
    1
    cat /proc/sys/net/bridge/bridge-nf-call-ip6tables
    1

The reason this is failing is because the daemon itself reconfigures those
options during networking initialization in `configureIPForwarding()`;
cf4595265e/libnetwork/drivers/bridge/setup_ip_forwarding.go (L14-L25)

Network initialization happens in the `daemon.restore()` function within `daemon.NewDaemon()`:
cf4595265e/daemon/daemon.go (L475-L478)

However, 483aa6294b moved detection of features
earlier in the `daemon.NewDaemon()` function, and collects the system information
(`d.RawSysInfo()`) before we enter `daemon.restore()`;
cf4595265e/daemon/daemon.go (L1008-L1011)

For optimization (collecting the system information comes at a cost), those
results are cached on the daemon, and will only be performed once (using a
`sync.Once`).

This patch:

- introduces a `getSysInfo()` utility, which collects system information without
  caching the results
- uses `getSysInfo()` to collect the preliminary information needed at that
  point in the daemon's lifecycle.
- moves printing warnings to the end of `daemon.NewDaemon()`, after all information
  can be read correctly.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-06-03 17:54:43 +02:00
Djordje Lukic
70dc392bfa
Use hashicorp/go-memdb instead of truncindex
memdb already knows how to search by prefix so there is no need to keep
a separate list of container ids in the truncindex

Benchmarks:

$ go test -benchmem -run=^$ -count 5 -tags linux -bench ^BenchmarkDBGetByPrefix100$ github.com/docker/docker/container
goos: linux
goarch: amd64
pkg: github.com/docker/docker/container
cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
BenchmarkDBGetByPrefix100-6        16018             73935 ns/op           33888 B/op       1100 allocs/op
BenchmarkDBGetByPrefix100-6        16502             73150 ns/op           33888 B/op       1100 allocs/op
BenchmarkDBGetByPrefix100-6        16218             74014 ns/op           33856 B/op       1100 allocs/op
BenchmarkDBGetByPrefix100-6        15733             73370 ns/op           33792 B/op       1100 allocs/op
BenchmarkDBGetByPrefix100-6        16432             72546 ns/op           33744 B/op       1100 allocs/op
PASS
ok      github.com/docker/docker/container      9.752s

$ go test -benchmem -run=^$ -count 5 -tags linux -bench ^BenchmarkTruncIndexGet100$ github.com/docker/docker/pkg/truncindex
goos: linux
goarch: amd64
pkg: github.com/docker/docker/pkg/truncindex
cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
BenchmarkTruncIndexGet100-6        16862             73732 ns/op           44776 B/op       1173 allocs/op
BenchmarkTruncIndexGet100-6        16832             73629 ns/op           45184 B/op       1179 allocs/op
BenchmarkTruncIndexGet100-6        17214             73571 ns/op           45160 B/op       1178 allocs/op
BenchmarkTruncIndexGet100-6        16113             71680 ns/op           45360 B/op       1182 allocs/op
BenchmarkTruncIndexGet100-6        16676             71246 ns/op           45056 B/op       1184 allocs/op
PASS
ok      github.com/docker/docker/pkg/truncindex 9.759s

$ go test -benchmem -run=^$ -count 5 -tags linux -bench ^BenchmarkDBGetByPrefix500$ github.com/docker/docker/container
goos: linux
goarch: amd64
pkg: github.com/docker/docker/container
cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
BenchmarkDBGetByPrefix500-6         1539            753541 ns/op          169381 B/op       5500 allocs/op
BenchmarkDBGetByPrefix500-6         1624            749975 ns/op          169458 B/op       5500 allocs/op
BenchmarkDBGetByPrefix500-6         1635            761222 ns/op          169298 B/op       5500 allocs/op
BenchmarkDBGetByPrefix500-6         1693            727856 ns/op          169297 B/op       5500 allocs/op
BenchmarkDBGetByPrefix500-6         1874            710813 ns/op          169570 B/op       5500 allocs/op
PASS
ok      github.com/docker/docker/container      6.711s

$ go test -benchmem -run=^$ -count 5 -tags linux -bench ^BenchmarkTruncIndexGet500$ github.com/docker/docker/pkg/truncindex
goos: linux
goarch: amd64
pkg: github.com/docker/docker/pkg/truncindex
cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
BenchmarkTruncIndexGet500-6         1934            780328 ns/op          224073 B/op       5929 allocs/op
BenchmarkTruncIndexGet500-6         1713            713935 ns/op          225011 B/op       5937 allocs/op
BenchmarkTruncIndexGet500-6         1780            702847 ns/op          224090 B/op       5943 allocs/op
BenchmarkTruncIndexGet500-6         1736            711086 ns/op          224027 B/op       5929 allocs/op
BenchmarkTruncIndexGet500-6         2448            508694 ns/op          222322 B/op       5914 allocs/op
PASS
ok      github.com/docker/docker/pkg/truncindex 6.877s

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2022-05-20 18:22:21 +02:00
Brian Goff
4e025b54d5 Remove mount spec backport
This was added in 1.13 to "upgrade" old mount specs to the new format.
This is no longer needed.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2022-05-13 23:14:43 +00:00
Sebastiaan van Stijn
3228dbaaa9
Merge pull request #43555 from thaJeztah/separate_engine_id
daemon: separate daemon ID from trust-key, and disable generating
2022-05-10 14:27:42 +02:00
Sebastiaan van Stijn
6b4696e18d
Merge pull request #43544 from thaJeztah/daemon_fix_hosts_validation_step1h
daemon/config: remove uses of pointers for ints
2022-05-06 17:52:52 +02:00
Sebastiaan van Stijn
070da63310
daemon: only create trust-key if DOCKER_ALLOW_SCHEMA1_PUSH_DONOTUSE is set
The libtrust trust-key is only used for pushing legacy image manifests;
pushing these images has been deprecated, and we only need to be able
to push them in our CI.

This patch disables generating the trust-key (and related paths) unless
the DOCKER_ALLOW_SCHEMA1_PUSH_DONOTUSE env-var is set (which we do in
our CI).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-04 20:18:08 +02:00
Sebastiaan van Stijn
bb1208639b
daemon: separate daemon ID from trust-key
This change is in preparation of deprecating support for old manifests.
Currently the daemon's ID is based on the trust-key ID, which will be
removed once we fully deprecate support for old manifests (the trust
key is currently only used in tests).

This patch:

- looks if a trust-key is present; if so, it migrates the trust-key
  ID to the new "engine-id" file within the daemon's root.
- if no trust-key is present (so in case it's a "fresh" install), we
  generate a UUID instead and use that as ID.

The migration is to prevent engines from getting a new ID on upgrades;
while we don't provide any guarantees on the engine's ID, users may
expect the ID to be "stable" (not change) between upgrades.

A test has been added, which can be ran with;

    make DOCKER_GRAPHDRIVER=vfs TEST_FILTER='TestConfigDaemonID' test-integration

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-04 20:17:18 +02:00
Sebastiaan van Stijn
e62382d014
daemon/config: remove uses of pointers for ints
Use the default (0) value to indicate "not set", which simplifies
working with these configuration options, preventing the need to
use intermediate variables etc.

While changing this code, also making some small cleanups, such
as replacing "fmt.Sprintf()" for "strconv" variants.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-29 09:39:34 +02:00
Sebastiaan van Stijn
dbd575ef91
daemon: daemon.initNetworkController(): dont return the controller
This method returned the network controller, only to set it on the daemon.

While making this change, also;

- update some error messages to be in the correct format
- use errors.Wrap() where possible
- extract configuring networks into a separate function to make the flow
  slightly easier to follow.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-29 09:08:49 +02:00
Sebastiaan van Stijn
3b56c0663d
daemon: daemon.networkOptions(): don't pass Config as argument
This is a method on the daemon, which itself holds the Config, so
there's no need to pass the same configuration as an argument.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-23 23:34:13 +02:00
Sebastiaan van Stijn
90de570cfa
backend: add StopOptions to ContainerRestart and ContainerStop
While we're modifying the interface, also add a context to both.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-20 21:29:30 +02:00
Sebastiaan van Stijn
5edf9acf9c
daemon: move default stop-timeout to containerStop()
This avoids having to determine what the default is in various
parts of the code. If no custom timeout is passed (nil), the
default will be used.

Also remove the named return variable from cleanupContainer(),
as it wasn't used.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-20 21:29:15 +02:00
Sebastiaan van Stijn
690a6fddf9
daemon: move default namespaces to daemon/config
Keeping the defaults in a single location, which also reduces
the list of imports needed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-17 13:10:57 +02:00
Sebastiaan van Stijn
0a3336fd7d
Merge pull request #43366 from corhere/finish-identitymapping-refactor
Finish refactor of UID/GID usage to a new struct
2022-03-25 14:51:05 +01:00
CrazyMax
a2aaf4cc83
vendor buildkit v0.10.0
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
2022-03-22 18:51:27 +01:00
Sebastiaan van Stijn
273dca4e3c
registry: remove unused error return from HostCertsDir()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-03-17 17:09:13 +01:00
Cory Snider
098a44c07f Finish refactor of UID/GID usage to a new struct
Finish the refactor which was partially completed with commit
34536c498d, passing around IdentityMapping structs instead of pairs of
[]IDMap slices.

Existing code which uses []IDMap relies on zero-valued fields to be
valid, empty mappings. So in order to successfully finish the
refactoring without introducing bugs, their replacement therefore also
needs to have a useful zero value which represents an empty mapping.
Change IdentityMapping to be a pass-by-value type so that there are no
nil pointers to worry about.

The functionality provided by the deprecated NewIDMappingsFromMaps
function is required by unit tests to to construct arbitrary
IdentityMapping values. And the daemon will always need to access the
mappings to pass them to the Linux kernel. Accommodate these use cases
by exporting the struct fields instead. BuildKit currently depends on
the UIDs and GIDs methods so we cannot get rid of them yet.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-03-14 16:28:57 -04:00
Cory Snider
b36fb04e03 vendor: github.com/containerd/containerd v1.6.1
Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-03-10 17:48:10 -05:00
Sebastiaan van Stijn
85f1bfc6f7
Merge pull request #43255 from thaJeztah/imageservice_nologs
daemon/images: ImageService.Cleanup(): return error instead of logging
2022-03-05 21:23:38 +01:00
Sebastiaan van Stijn
ac2cd5a8f2
daemon: unexport Daemon.ID and Daemon.RegistryService
These are used internally only, and set by daemon.NewDaemon(). If they're
used externally, we should add an accessor added (which may be something
we want to do for daemon.registryService (which should be its own backend)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-03-02 22:19:22 +01:00
Akihiro Suda
54d35c071d
Merge pull request #43130 from thaJeztah/daemon_cache_sysinfo
daemon: load and cache sysInfo on initialization
2022-02-18 13:46:15 +09:00
Sebastiaan van Stijn
79cad59d97
daemon/images: ImageService.Cleanup(): return error instead of logging
This makes the function a bit more idiomatic, and leaves it to the caller to
decide wether or not the error can be ignored.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-17 22:04:03 +01:00
Brian Goff
047d58f007
Merge pull request #43187 from thaJeztah/remove_lcow_checks
Remove various leftover LCOW checks
2022-02-17 11:22:19 -08:00
Sebastiaan van Stijn
b36d896fce
layer: remove OS from layerstore
This was added in commits fc21bf280b and
0380fbff37 in support of LCOW, but was
now always set to runtime.GOOS.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-25 15:23:23 +01:00
Sebastiaan van Stijn
5c870b421a
daemon/images.NewImageService() don't print debug logs
These logs were meant to be logged when starting the daemon. Moving the logs
to the daemon startup code (which also prints similar messages) instead of
having the images service log them.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-24 15:55:45 +01:00
Sebastiaan van Stijn
483aa6294b
daemon: load and cache sysInfo on initialization
The `daemon.RawSysInfo()` function can be a heavy operation, as it collects
information about all cgroups on the host, networking, AppArmor, Seccomp, etc.

While looking at our code, I noticed that various parts in the code call this
function, potentially even _multiple times_ per container, for example, it is
called from:

- `verifyPlatformContainerSettings()`
- `oci.WithCgroups()` if the daemon has `cpu-rt-period` or `cpu-rt-runtime` configured
- in `ContainerDecoder.DecodeConfig()`, which is called on boith `container create` and `container commit`

Given that this information is not expected to change during the daemon's
lifecycle, and various information coming from this (such as seccomp and
apparmor status) was already cached, we may as well load it once, and cache
the results in the daemon instance.

This patch updates `daemon.RawSysInfo()` to use a `sync.Once()` so that
it's only executed once for the daemon's lifecycle.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-12 18:28:15 +01:00
Sebastiaan van Stijn
9492354782
daemon: remove daemon.discoveryWatcher
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-06 18:28:22 +01:00
Sebastiaan van Stijn
ff2a5301b8
daemon: remove discovery-related config handling
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-06 18:28:17 +01:00
Akihiro Suda
d116e12c6d
Merge pull request #42726 from thaJeztah/daemon_simplify_nwconfig
daemon: simplify networking config
2021-11-12 01:19:07 +09:00
Sebastiaan van Stijn
a80c450fb3
Merge pull request #41935 from alexisries/Issue-41871-Restore-healthcheck-at-dockerd-restart
Resume healthcheck when daemon restarts
2021-10-15 12:46:43 +02:00
Alexis Ries
9f39889dee Fixes #41871: Update daemon/daemon.go: resume healthcheck on restore
Call updateHealthMonitor for alive non-paused containers

Signed-off-by: Alexis Ries <alexis.ries.ext@orange.com>
2021-10-07 21:23:27 +02:00
Brian Goff
03f1c3d78f
Lock down docker root dir perms.
Do not use 0701 perms.
0701 dir perms allows anyone to traverse the docker dir.
It happens to allow any user to execute, as an example, suid binaries
from image rootfs dirs because it allows traversal AND critically
container users need to be able to do execute things.

0701 on lower directories also happens to allow any user to modify
     things in, for instance, the overlay upper dir which neccessarily
     has 0755 permissions.

This changes to use 0710 which allows users in the group to traverse.
In userns mode the UID owner is (real) root and the GID is the remapped
root's GID.

This prevents anyone but the remapped root to traverse our directories
(which is required for userns with runc).

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit ef7237442147441a7cadcda0600be1186d81ac73)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 93ac040bf0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-05 09:57:00 +02:00
Brian Goff
7ccf750daa Allow switching Windows runtimes.
This adds support for 2 runtimes on Windows, one that uses the built-in
HCSv1 integration and another which uses containerd with the runhcs
shim.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-09-23 17:44:04 +00:00
Eng Zer Jun
c55a4ac779
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated in Go 1.16. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-08-27 14:56:57 +08:00
Roman Volosatovs
135cec5d4d
daemon,volume: share disk usage computations
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
2021-08-09 19:59:39 +02:00
Roman Volosatovs
5adc29ffe2
daemon: sort imports according to gofmt
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
2021-08-09 19:59:37 +02:00
Sebastiaan van Stijn
e8e278c44f
daemon: simplify networking config
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-09 11:15:49 +02:00
Sebastiaan van Stijn
9b795c3e50
pkg/sysinfo.New(), daemon.RawSysInfo(): remove "quiet" argument
The "quiet" argument was only used in a single place (at daemon startup), and
every other use had to pass "false" to prevent this function from logging
warnings.

Now that SysInfo contains the warnings that occurred when collecting the
system information, we can make leave it up to the caller to use those
warnings (and log them if wanted).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-14 23:10:07 +02:00
Sebastiaan van Stijn
472f21b923
replace uses of deprecated containerd/sys.RunningInUserNS()
This utility was moved to a separate package, which has no
dependencies.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-18 11:01:24 +02:00
Sebastiaan van Stijn
3eb1257698
revendor BuildKit (master branch)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-16 01:17:48 +02:00
Sebastiaan van Stijn
dc7cbb9b33
remove layerstore indexing by OS (used for LCOW)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 17:49:11 +02:00
Sebastiaan van Stijn
076d9c6037
daemon: remove graphdriver indexing by OS (used for LCOW)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-10 10:17:26 +02:00
Akihiro Suda
0ad2293d0e
Merge pull request #41656 from thaJeztah/unexport_things 2021-06-08 12:07:40 +09:00
Brian Goff
4b981436fe Fixup libnetwork lint errors
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 23:48:32 +00:00
Brian Goff
a0a473125b Fix libnetwork imports
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 21:51:23 +00:00
Sebastiaan van Stijn
34b854f965
daemon: un-export ModifyRootKeyLimit()
it's only used internally, so no need to export

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-05-31 19:06:14 +02:00
Sebastiaan van Stijn
0f3b94a5c7
daemon: remove migration code from docker 1.11 to 1.12
This code was added in 391441c28b, to fix
upgrades from docker 1.11 to 1.12 with existing containers.

Given that any container after 1.12 should have the correct configuration
already, it should be safe to assume this upgrade logic is no longer needed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-22 11:36:43 +01:00
Brian Goff
7f5e39bd4f
Use real root with 0701 perms
Various dirs in /var/lib/docker contain data that needs to be mounted
into a container. For this reason, these dirs are set to be owned by the
remapped root user, otherwise there can be permissions issues.
However, this uneccessarily exposes these dirs to an unprivileged user
on the host.

Instead, set the ownership of these dirs to the real root (or rather the
UID/GID of dockerd) with 0701 permissions, which allows the remapped
root to enter the directories but not read/write to them.
The remapped root needs to enter these dirs so the container's rootfs
can be configured... e.g. to mount /etc/resolve.conf.

This prevents an unprivileged user from having read/write access to
these dirs on the host.
The flip side of this is now any user can enter these directories.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit e908cc3901)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-02 13:01:25 +01:00
Brian Goff
4b5aa28f24
Do not set DOCKER_TMP to be owned by remapped root
The remapped root does not need access to this dir.
Having this owned by the remapped root opens the host up to an
uprivileged user on the host being able to escalate privileges.

While it would not be normal for the remapped UID to be used outside of
the container context, it could happen.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit bfedd27259)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-02 13:01:22 +01:00