After the context is cancelled, update the progress for the last time.
This makes sure that the progress also includes finishing updates.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Basically every exported method which takes a libnetwork.Sandbox
argument asserts that the value's concrete type is *sandbox. Passing any
other implementation of the interface is a runtime error! This interface
is a footgun, and clearly not necessary. Export and use the concrete
type instead.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
containerd: Push progress
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
If the imported layer archive is uncompressed, it gets compressed with
gzip before writing to the content store.
Archives compressed with gzip and zstd are imported as-is.
Xz and bzip2 are recompressed into gzip.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This helps ensure that users are not surprised by unexpected tokens in
the JSON parser, or fallout later in the daemon.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is a pragmatic but impure choice, in order to better support the
default tools available on Windows Server, and reduce user confusion due
to otherwise inscrutable-to-the-uninitiated errors like the following:
> invalid character 'þ' looking for beginning of value
> invalid character 'ÿ' looking for beginning of value
While meaningful to those who are familiar with and are equipped to
diagnose encoding issues, these characters will be hidden when the file
is edited with a BOM-aware text editor, and further confuse the user.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
[RFC 8259] allows for JSON implementations to optionally ignore a BOM
when it helps with interoperability; do so in Moby as Notepad (the only
text editor available out of the box in many versions of Windows Server)
insists on writing UTF-8 with a BOM.
[RFC 8259]: https://tools.ietf.org/html/rfc8259#section-8.1
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
By relying on the kernel UAPI (userspace API), we can drop a dependency
and simplify building Moby, while also ensuring that we are using a
stable/supported source of the C types and defines we need.
btrfs-progs mirrors the kernel headers, but the headers it ships with
are not the canonical source and as [we have seen before][44698], could
be subject to changes.
Depending on the canonical headers from the kernel both is more
idiomatic, and ensures we are protected by the kernel's promise to not
break userspace.
[44698]: https://github.com/moby/moby/issues/44698
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is actually quite meaningless as we are reporting the libbtrfs
version, but we do not use libbtrfs. We only use the kernel interface to
btrfs instead.
While we could report the version of the kernel headers in play, they're
rather all-or-nothing: they provide the structures and defines we need,
or they don't. As such, drop all version information as the host kernel
version is the only thing that matters.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is a squashed version of various PRs (or related code-changes)
to implement image inspect with the containerd-integration;
- add support for image inspect
- introduce GetImageOpts to manage image inspect data in backend
- GetImage to return image tags with details
- list images matching digest to discover all tags
- Add ExposedPorts and Volumes to the image returned
- Refactor resolving/getting images
- Return the image ID on inspect
- consider digest and ignore tag when both are set
- docker run --platform
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The image store's used are an interface, so there's no guarantee
that implementations don't wrap the errors. Make sure to catch
such cases by using errors.Is.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Having this function hides what it's doing, which is just to type-cast
to an image.ID (which is a digest). Using a cast is more transparent,
so deprecating this function in favor of a regular typecast.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Simplify the error message so that we don't have to distinguish between static-
and non-static builds. Also update the link to the storage-driver section to
use a "/go/" redirect in the docs, as the anchor link was no longer correct.
Using a "/go/" redirect makes sure the link remains functional if docs is moving
around.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in b86e3bee5a to
work around an issue in os/user.Current(), which SEGFAULTS when compiling
statically with cgo enabled (see golang/go#13470).
We hit similar issues in other parts, and contributed a "osusergo" build-
tag in https://go-review.googlesource.com/c/go/+/330753. The "osusergo"
build tag must be set when compiling static binaries with cgo enabled.
If that build-tag is set, the cgo implementation for user.Current() won't
be used, and a pure-go implementation is used instead;
https://github.com/golang/go/blob/go1.19.4/src/os/user/cgo_lookup_unix.go#L5
With the above in place, we no longer need this workaround, and can remove
the ensureHomeIfIAmStatic() function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After further looking at the code, it appears that the default exit-code
for unknown (other) errors is 128 (set in `defer`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These matches were overwriting the previous "match", so reversing the
order in which they're tried so that we can return early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Rename the variable make it more visible where it's used, as there's were
other "err" variables masking it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fixes a (theoretical?) panic if ID would be shorter than 12
characters. Also trim the ID _after_ cutting off the suffix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types and functions are more closely related to the functionality
provided by pkg/systeminfo, and used in conjunction with the other functions
in that package, so moving them there.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since b58de39ca7, this option was now only used
to produce a fatal error when starting the daemon. That change is in the 23.0
release, so we can remove it from the master branch.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was added in 677a6b3506, and named
"common", because at the time, the "docker" and "dockerd" (daemon) code
were still in the same repository, and shared this type. Renaming it, now
that's no longer the case.
As there are no external consumers of this type, I'm not adding an alias.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(*Daemon).registerLinks() calling the WriteHostConfig() method of its
container argument is a vestigial behaviour. In the distant past,
registerLinks() would persist the container links in an SQLite database
and drop the link config from the container's persisted HostConfig. This
changed in Docker v1.10 (#16032) which migrated away from SQLite and
began using the link config in the container's HostConfig as the
persistent source of truth. registerLinks() no longer mutates the
HostConfig at all so persisting the HostConfig to disk falls outside of
its scope of responsibilities.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*Container).CheckpointTo() upserts a snapshot of the container to the
daemon's in-memory ViewDB and also persists the snapshot to disk. It
does not register the live container object with the daemon's container
store, however. The ViewDB and container store are used as the source of
truth for different operations, so having a container registered in one
but not the other can result in inconsistencies. In particular, the List
Containers API uses the ViewDB as its source of truth and the Container
Inspect API uses the container store.
The (*Daemon).setHostConfig() method is called fairly early in the
process of creating a container, long before the container is registered
in the daemon's container store. Due to a rogue CheckpointTo() call
inside setHostConfig(), there is a window of time where a container can
be included in a List Containers API response but "not exist" according
to the Container Inspect API and similar endpoints which operate on a
particular container. Remove the rogue call so that the caller has full
control over when the container is checkpointed and update callers to
checkpoint explicitly. No changes to (*Daemon).create() are needed as it
checkpoints the fully-created container via (*Daemon).Register().
Fixes#44512.
Signed-off-by: Cory Snider <csnider@mirantis.com>
GetContainer() would return (nil, nil) when looking up a container
if the container was inserted into the containersReplica ViewDB but not
the containers Store at the time of the lookup. Callers which reasonably
assume that the returned err == nil implies returned container != nil
would dereference a nil pointer and panic. Change GetContainer() so that
it always returns a container or an error.
Signed-off-by: Cory Snider <csnider@mirantis.com>
the non-exported "daemon.createNetwork" already returns nil if there's
an error, so no need to check the error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows differentiating how the detailed data is collected between
the containerd-integration code and the existing implementation.
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The List Images API endpoint has accepted multiple values for the
`since` and `before` filter predicates, but thanks to Go's randomizing
of map iteration order, it would pick an arbitrary image to compare
created timestamps against. In other words, the behaviour was undefined.
Change these filter predicates to have well-defined semantics: the
logical AND of all values for each of the respective predicates. As
timestamps are a totally-ordered relation, this is exactly equivalent to
applying the newest and oldest creation timestamps for the `since` and
`before` predicates, respectively.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This one is a "bit" fuzzy, as it may not be _directly_ related to `archive`,
but it's always used _in combination_ with the archive package, so moving it
there.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The singleflight function was capturing the context.Context of the first
caller that invoked the `singleflight.Do`. This could cause all
concurrent calls to be cancelled when the first request is cancelled.
singleflight calls were also moved from the ImageService to Daemon, to
avoid having to implement this logic in both graphdriver and containerd
based image services.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is only used for tests, and the key is not verified anymore, so
instead of creating a key and storing it, we can just use an ad-hoc
one.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Turned out that the loadOrCreateTrustKey() utility was doing exactly the
same as libtrust.LoadOrCreateTrustKey(), so making it a thin wrapped. I kept
the tests to verify the behavior, but we could remove them as we only need this
for our integration tests.
The storage location for the generated key was changed (again as we only need
this for some integration tests), so we can remove the TrustKeyPath from the
config.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The migration code is in the 22.06 branch, and if we don't migrate
the only side-effect is the daemon's ID being regenerated (as a
UUID).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Raspberry Pi allows to start system under overlayfs.
Docker is successfully fallbacks to fuse-overlay but not starting
because of the `Error starting daemon: rename /var/lib/docker/runtimes /var/lib/docker/runtimes-old: invalid cross-device link` error
It's happening because `rename` is not supported by overlayfs.
After manually removing directory `runtimes` docker starts and works successfully
Signed-off-by: Illia Antypenko <ilya@antipenko.pp.ua>
Both of these were deprecated in 55f675811a,
but the format of the GoDoc comments didn't follow the correct format, which
caused them not being picked up by tools as "deprecated".
This patch updates uses in the codebase to use the alternatives.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the file doesn't exist, the process isn't running, so we should be able
to ignore that.
Also remove an intermediate variable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Mounting a container's volumes under its rootfs directory inside the
host mount namespace causes problems with cross-namespace mount
propagation when /var/lib/docker is bind-mounted into the container as a
volume. The mount event propagates into the container's mount namespace,
overmounting the volume, but the propagated unmount events do not fully
reverse the effect. Each archive operation causes the mount table in the
container's mount namespace to grow larger and larger, until the kernel
limiton the number of mounts in a namespace is hit. The only solution to
this issue which is not subject to race conditions or other blocker
caveats is to avoid mounting volumes into the container's rootfs
directory in the host mount namespace in the first place.
Mount the container volumes inside an unshared mount namespace to
prevent any mount events from propagating into any other mount
namespace. Greatly simplify the archiving implementations by also
chrooting into the container rootfs to sidestep the need to resolve
paths in the host.
Signed-off-by: Cory Snider <csnider@mirantis.com>
It is only applicable to Windows so it does not need to be called from
platform-generic code. Fix locking in the Windows implementation.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Linux implementation needs to diverge significantly from the Windows
one in order to fix platform-specific bugs. Cut the generic
implementation out of daemon/archive.go and paste identical, verbatim
copies of that implementation into daemon/archive_{windows,linux}.go to
make it easier to compare the progression of changes to the respective
implementations through Git history.
Signed-off-by: Cory Snider <csnider@mirantis.com>
`system.MkdirAll()` is a special version of os.Mkdir to handle creating directories
using Windows volume paths (`"\\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}"`).
This may be important when `MkdirAll` is used, which traverses all parent paths to
create them if missing (ultimately landing on the "volume" path).
The daemon.NewDaemon() function used `system.MkdirAll()` in various places where
a subdirectory within `daemon.Root` was created. This appeared to be mostly out
of convenience (to not have to handle `os.ErrExist` errors). The `daemon.Root`
directory should already be set up in these locations, and should be set up with
correct permissions. Using `system.MkdirAll()` would potentially mask errors if
the root directory is missing, and instead set up parent directories (possibly
with incorrect permissions).
Because of the above, this patch changes `system.MkdirAll` to `os.Mkdir`. As we
are changing these lines, this patch also changes the legacy octal notation
(`0700`) to the now preferred `0o700`.
One location continues to use `system.MkdirAll`, as the temp-directory may be
configured to be outside of `daemon.Root`, but a redundant `os.Stat(realTmp)`
was removed, as `system.MkdirAll` is expected to handle this.
As we are changing these lines, this patch also changes the legacy octal notation
(`0700`) to the now preferred `0o700`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it more transparent that it's unused for Linux,
and we don't pass "root", which has no relation with the
path on Linux.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows us to run CI with the containerd snapshotter enabled, without
patching the daemon.json, or changing how tests set up daemon flags.
A warning log is added during startup, to inform if this variable is set,
as it should only be used for our integration tests.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The implementation of CanAccess() is very rudimentary, and should
not be used for anything other than a basic check (and maybe not
even for that). It's only used in a single location in the daemon,
so move it there, and un-export it to not encourage others to use
it out of context.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Building off insights from the great work Cory Snider has been doing,
this replaces a reexec with a much lower overhead implementation which
performs the `Chddir` in a new goroutine that is locked to a specific
thread with CLONE_FS unshared.
The thread is thrown away afterwards and the Chdir does effectively the
same thing as what the reexec was being used for.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
go-winio now defines this function, so we can consume that.
Note that there's a difference between the old implementation and the original
one (added in 1cb9e9b44e). The old implementation
had special handling for win32 error codes, which was removed in the go-winio
implementation in 0966e1ad56
As `go-winio.GetFileSystemType()` calls `filepath.VolumeName(path)` internally,
this patch also removes the `string(home[0])`, which is redundant, and could
potentially panic if an empty string would be passed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 955c1f881a (Docker v17.12.0) replaced
detection of support for multiple lowerdirs (as required by overlay2) to not
depend on the kernel version. The `overlay2.override_kernel_check` was still
used to print a warning that older kernel versions may not have full support.
After this, commit e226aea280 (Docker v20.10.0,
backported to v19.03.7) removed uses of the `overlay2.override_kernel_check`
option altogether, but we were still parsing it.
This patch changes the `parseOptions()` function to not parse the option,
printing a deprecation warning instead. We should change this to be an error,
but the `overlay2.override_kernel_check` option was not deprecated in the
documentation, so keeping it around for one more release.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal
will be sent to the process when the OS thread on which cmd.Start() was
executed dies. The runtime terminates an OS thread when a goroutine
exits after being wired to the thread with runtime.LockOSThread(). If
other goroutines are allowed to be scheduled onto a thread which called
cmd.Start(), an unrelated goroutine could cause the thread to be
terminated and prematurely signal the command. See
https://github.com/golang/go/issues/27505 for more information.
Prevent started subprocesses with Pdeathsig from getting signaled
prematurely by wiring the starting goroutine to the OS thread until the
subprocess has exited. No other goroutines can be scheduled onto a
locked thread so it will remain alive until unlocked or the daemon
process exits.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The pkg/fsutils package was forked in containerd, and later moved to
containerd/continuity/fs. As we're moving more bits to containerd, let's also
use the same implementation to reduce code-duplication and to prevent them from
diverging.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this change, the awslogs collectBatch and processEvent
function documentation still referenced the batchPublishFrequency
constant which was removed in favor of the configurable log stream
forceFlushInterval member.
Signed-off-by: Austin Vazquez <macedonv@amazon.com>
Before this change restarting the daemon in live-restore with running
containers + a restart policy meant that volume refs were not restored.
This specifically happens when the container is still running *and*
there is a restart policy that would make sure the container was running
again on restart.
The bug allows volumes to be removed even though containers are
referencing them. 😱
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This package was moved to a separate repository, using the steps below:
# install filter-repo (https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
brew install git-filter-repo
cd ~/projects
# create a temporary clone of docker
git clone https://github.com/docker/docker.git moby_pubsub_temp
cd moby_pubsub_temp
# for reference
git rev-parse HEAD
# --> 572ca799db
# remove all code, except for pkg/pubsub, license, and notice, and rename pkg/pubsub to /
git filter-repo --path pkg/pubsub/ --path LICENSE --path NOTICE --path-rename pkg/pubsub/:
# remove canonical imports
git revert -s -S 585ff0ebbe6bc25b801a0e0087dd5353099cb72e
# initialize module
go mod init github.com/moby/pubsub
go mod tidy
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Managed containerd processes are executed with SysProcAttr.Pdeathsig set
to syscall.SIGKILL so that the managed containerd is automatically
killed along with the daemon. At least, that is the intention. In
practice, the signal is sent to the process when the creating _OS
thread_ dies! If a goroutine exits while locked to an OS thread, the Go
runtime will terminate the thread. If that thread happens to be the
same thread which the subprocess was started from, the subprocess will
be signaled. Prevent the journald driver from sometimes unintentionally
killing child processes by ensuring that all runtime.LockOSThread()
calls are paired with runtime.UnlockOSThread().
Signed-off-by: Cory Snider <csnider@mirantis.com>
daemon/network/filter_test.go:174:19: empty-lines: extra empty line at the end of a block (revive)
daemon/restart.go:17:116: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_linux_test.go:255:41: empty-lines: extra empty line at the end of a block (revive)
daemon/reload_test.go:340:58: empty-lines: extra empty line at the end of a block (revive)
daemon/oci_linux.go:495:101: empty-lines: extra empty line at the end of a block (revive)
daemon/seccomp_linux_test.go:17:36: empty-lines: extra empty line at the start of a block (revive)
daemon/container_operations.go:560:73: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_unix.go:558:76: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_unix.go:1092:64: empty-lines: extra empty line at the start of a block (revive)
daemon/container_operations.go:587:24: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:807:18: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:813:42: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:872:72: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/images/image_squash.go:17:71: empty-lines: extra empty line at the start of a block (revive)
daemon/images/store.go:128:27: empty-lines: extra empty line at the end of a block (revive)
daemon/images/image_list.go:154:55: empty-lines: extra empty line at the start of a block (revive)
daemon/images/image_delete.go:135:13: empty-lines: extra empty line at the end of a block (revive)
daemon/images/image_search.go:25:64: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/logger/loggertest/logreader.go:58:43: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/ring_test.go:119:34: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:37:12: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:41:44: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:170:9: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/loggerutils/sharedtemp_test.go:152:43: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/loggerutils/sharedtemp.go:124:117: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/syslog/syslog.go:249:87: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/graphdriver/aufs/aufs.go:239:80: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:249:27: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/testutil.go:271:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:179:32: empty-block: this block is empty, you can remove it (revive)
daemon/graphdriver/zfs/zfs.go:375:48: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/overlay/overlay.go:248:89: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:636:21: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1150:70: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1613:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1645:65: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/btrfs/btrfs.go:53:101: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1944:89: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/cluster/convert/service.go:96:34: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:169:44: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:470:30: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/container.go:224:23: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/convert/network.go:109:14: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:537:27: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:247:19: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:252:41: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:256:12: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:289:80: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/executor/container/health_test.go:18:37: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/executor/container/adapter.go:437:68: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and the ErrExtractPointNotDirectory was
not checked for, or used as a sentinel error.
This error was introduced in c32dde5baa. It was
never used as a sentinel error, but from that commit, it looks like it was added
as a package variable to mirror already existing errors defined at the package
level.
This patch removes the exported variable, and replaces the error with an
errdefs.InvalidParameter(), so that the API also returns the correct (400)
status code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and the ErrVolumeReadonly was not checked
for, or used as a sentinel error.
This error was introduced in c32dde5baa. It was
never used as a sentinel error, but from that commit, it looks like it was added
as a package variable to mirror already existing errors defined at the package
level.
This patch removes the exported variable, and replaces the error with an
errdefs.InvalidParameter(), so that the API also returns the correct (400)
status code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and the ErrRootFSReadOnly was not checked
for, or used as a sentinel error.
This error was introduced in c32dde5baa, originally
named `ErrContainerRootfsReadonly`. It was never used as a sentinel error, but
from that commit, it looks like it was added as a package variable to mirror
the coding style of already existing errors defined at the package level.
This patch removes the exported variable, and replaces the error with an
errdefs.InvalidParameter(), so that the API also returns the correct (400)
status code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The getPortMapInfo var was introduced in f198dfd856,
and (from looking at that patch) looks to have been as a quick and dirty workaround
for the `container` argument colliding with the `container` import.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was unclear what the distinction was between these configuration
structs, so merging them to simplify.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the "deadcode", "structcheck", and "varcheck" linters, as they are
deprecated:
WARN [runner] The linter 'deadcode' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [runner] The linter 'structcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [runner] The linter 'varcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [linters context] structcheck is disabled because of generics. You can track the evolution of the generics support by following the https://github.com/golangci/golangci-lint/issues/2649.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that the type of Container.BaseFS has been reverted to a string,
values can never implement the extractor or archiver interfaces. Rip out
the dead code to support archiving and unarchiving through those
interfcaes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Driver abstraction was needed for Linux Containers on Windows,
support for which has since been removed.
There is no direct equivalent to Lchmod() in the standard library so
continue to use the containerd/continuity version.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Now that we can pass any custom containerd shim to dockerd there is need
for this check. Without this it becomes possible to use wasm shims for
example with images that have "wasi" as the OS.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
After discussing in the maintainers meeting, we concluded that Slowloris attacks
are not a real risk other than potentially having some additional goroutines
lingering around, so setting a long timeout to satisfy the linter, and to at
least have "some" timeout.
libnetwork/diagnostic/server.go:96:10: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
srv := &http.Server{
Addr: net.JoinHostPort(ip, strconv.Itoa(port)),
Handler: s,
}
api/server/server.go:60:10: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
srv: &http.Server{
Addr: addr,
},
daemon/metrics_unix.go:34:13: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
if err := http.Serve(l, mux); err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
^
cmd/dockerd/metrics.go:27:13: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
if err := http.Serve(l, mux); err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These interfaces were added in aacddda89d, with
no clear motivation, other than "Also hide ViewDB behind an interface".
This patch removes the interface in favor of using a concrete implementation;
There's currently only one implementation of this interface, and if we would
decide to change to an alternative implementation, we could define relevant
interfaces on the receiver side.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure we use the same alias everywhere for easier finding,
and to prevent accidentally introducing duplicate imports with
different aliases for the same package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- prefer error over panic where possible
- ContainerChanges is not implemented by snapshotter-based ImageService
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The wrapper sets the default namespace in the context if none is
provided, this is needed because we are calling these services directly
and not trough GRPC that has an interceptor to set the default namespace
to all calls.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Prevent new health check probes from racing the task deletion. This may
have been a root cause of containers taking so long to stop on Windows.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We have integration tests which assert the invariant that a
GET /containers/{id}/json response lists only IDs of execs which are in
the Running state, according to GET /exec/{id}/json. The invariant could
be violated if those requests were to race the handling of the exec's
task-exit event. The coarse-grained locking of the container ExecStore
when starting an exec task was accidentally synchronizing
(*Daemon).ProcessEvent and (*Daemon).ContainerExecInspect to it just
enough to make it improbable for the integration tests to catch the
invariant violation on execs which exit immediately. Removing the
unnecessary locking made the underlying race condition more likely for
the tests to hit.
Maintain the invariant by deleting the exec from its container's
ExecCommands before clearing its Running flag. Additionally, fix other
potential data races with execs by ensuring that the ExecConfig lock is
held whenever a mutable field is read from or written to.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Attempting to delete the directory while another goroutine is
concurrently executing a CheckpointTo() can fail on Windows due to file
locking. As all callers of CheckpointTo() are required to hold the
container lock, holding the lock while deleting the directory ensures
that there will be no interference.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing logic to handle container ID conflicts when attempting to
create a plugin container is not nearly as robust as the implementation
in daemon for user containers. Extract and refine the logic from daemon
and use it in the plugin executor.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The containerd client is very chatty at the best of times. Because the
libcontained API is stateless and references containers and processes by
string ID for every method call, the implementation is essentially
forced to use the containerd client in a way which amplifies the number
of redundant RPCs invoked to perform any operation. The libcontainerd
remote implementation has to reload the containerd container, task
and/or process metadata for nearly every operation. This in turn
amplifies the number of context switches between dockerd and containerd
to perform any container operation or handle a containerd event,
increasing the load on the system which could otherwise be allocated to
workloads.
Overhaul the libcontainerd interface to reduce the impedance mismatch
with the containerd client so that the containerd client can be used
more efficiently. Split the API out into container, task and process
interfaces which the consumer is expected to retain so that
libcontainerd can retain state---especially the analogous containerd
client objects---without having to manage any state-store inside the
libcontainerd client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The OOMKilled flag on a container's state has historically behaved
rather unintuitively: it is updated on container exit to reflect whether
or not any process within the container has been OOM-killed during the
preceding run of the container. The OOMKilled flag would be set to true
when the container exits if any process within the container---including
execs---was OOM-killed at any time while the container was running,
whether or not the OOM-kill was the cause of the container exiting. The
flag is "sticky," persisting through the next start of the container;
only being cleared once the container exits without any processes having
been OOM-killed that run.
Alter the behavior of the OOMKilled flag such that it signals whether
any process in the container had been OOM-killed since the most recent
start of the container. Set the flag immediately upon any process being
OOM-killed, and clear it when the container transitions to the "running"
state.
There is an ulterior motive for this change. It reduces the amount of
state the libcontainerd client needs to keep track of and clean up on
container exit. It's one less place the client could leak memory if a
container was to be deleted without going through libcontainerd.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The daemon.containerd.Exec call does not access or mutate the
container's ExecCommands store in any way, and locking the exec config
is sufficient to synchronize with the event-processing loop. Locking
the ExecCommands store while starting the exec process only serves to
block unrelated operations on the container for an extended period of
time.
Convert the Store struct's mutex to an unexported field to prevent this
from regressing in the future.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Add an integration test to verify that health checks are killed on
timeout and that the output is captured.
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Just a minor cleanup; I went looking where these options were used, and
these occurrences set then to their default value. Removing these
assignments makes it easier to find where they're actually used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Terminating the exec process when the context is canceled has been
broken since Docker v17.11 so nobody has been able to depend upon that
behaviour in five years of releases. We are thus free from backwards-
compatibility constraints.
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Treat (storage/graph)Driver as snapshotter
Also moved some layerStore related initialization to the non-c8d case
because otherwise they get treated as a graphdriver plugins.
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since runtimes can now just be containerd shims, we need to check if the
reference is possibly a containerd shim.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The `-g` / `--graph` options were soft deprecated in favor of `--data-root` in
261ef1fa27 (v17.05.0) and at the time considered
to not be removed. However, with the move towards containerd snapshotters, having
these options around adds additional complexity to handle fallbacks for deprecated
(and hidden) flags, so completing the deprecation.
With this patch:
dockerd --graph=/var/lib/docker --validate
Flag --graph has been deprecated, Use --data-root instead
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
mkdir -p /etc/docker
echo '{"graph":"/var/lib/docker"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used as an intermediate variable to store what's returned
by layerstore.DriverName() / ImageService.StorageDriver()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the function name more generic, as it's no longer used only
for graphdrivers but also for snapshotters.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Makes sure that tests use a config struct that's more representative
to how it's used in the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This centralizes more defaults, to be part of the config struct that's
created, instead of interweaving the defaults with other code in various
places.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the information stored as part of the container for the error-message,
instead of querying the current storage driver from the daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The check was accounting for old containers that did not have a storage-driver
set in their config, and was added in 4908d7f81d
for docker v0.7.0-rc6 - nearly 9 Years ago, so very likely nobody is still
depending on this ;-)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 0cba7740d4, as part of
the LCOW implementation. LCOW support has been removed, so we can remove
this check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently only provides the existing "platform" option, but more
options will be added in follow-ups.
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 6068d1894d...48dd89375d
Finishes off the work to change references to cluster volumes in the API
from using "csi" as the magic word to "cluster". This reflects that the
volumes are "cluster volumes", not "csi volumes".
Notably, there is no change to the plugin definitions being "csinode"
and "csicontroller". This terminology is appropriate with regards to
plugins because it accurates reflects what the plugin is.
Signed-off-by: Drew Erny <derny@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
rename some variables that collided with imports or (upcoming)
changes, e.g. `ctx`, which is commonly used for `context.Context`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Contrary to popular belief, the OCI Runtime specification does not
specify the command-line API for runtimes. Looking at containerd's
architecture from the lens of the OCI Runtime spec, the _shim_ is the
OCI Runtime and runC is "just" an implementation detail of the
io.containerd.runc.v2 runtime. When one configures a non-default runtime
in Docker, what they're really doing is instructing Docker to create
containers using the io.containerd.runc.v2 runtime with a configuration
option telling the runtime that the runC binary is at some non-default
path. Consequently, only OCI runtimes which are compatible with the
io.containerd.runc.v2 shim, such as crun, can be used in this manner.
Other OCI runtimes, including kata-containers v2, come with their own
containerd shim and are not compatible with io.containerd.runc.v2.
As Docker has not historically provided a way to select a non-default
runtime which requires its own shim, runtimes such as kata-containers v2
could not be used with Docker.
Allow other containerd shims to be used with Docker; no daemon
configuration required. If the daemon is instructed to create a
container with a runtime name which does not match any of the configured
or stock runtimes, it passes the name along to containerd verbatim. A
user can start a container with the kata-containers runtime, for
example, simply by calling
docker run --runtime io.containerd.kata.v2
Runtime names which containerd would interpret as a path to an arbitrary
binary are disallowed. While handy for development and testing it is not
strictly necessary and would allow anyone with Engine API access to
trivially execute any binary on the host as root, so we have decided it
would be safest for our users if it was not allowed.
It is not yet possible to set an alternative containerd shim as the
default runtime; it can only be configured per-container.
Signed-off-by: Cory Snider <csnider@mirantis.com>
doCopyXattrs() never reached due to copyXattrs boolean being false, as
a result file capabilities not being copied.
moved copyXattr() out of doCopyXattrs()
Signed-off-by: Illo Abdulrahim <abdulrahim.illo@nokia.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implement --follow entirely correctly for the journald log reader, such
that it exits immediately upon reading back the last log message written
to the journal before the logger was closed. The impossibility of doing
so has been slightly exaggerated.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Fix journald and logfile-powered (jsonfile, local) log readers
incorrectly filtering out messages with timestamps < Since which were
preceded by a message with a timestamp >= Since.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Careful management of the journal read pointer is sufficient to ensure
that no entry is read more than once.
Unit test the journald logger without requiring a running journald by
using the systemd-journal-remote command to write arbitrary entries to
journal files.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Wrap the libsystemd journal reading functionality in a more idiomatic Go
API and refactor the journald logging driver's ReadLogs implementation
to use the wrapper. Rewrite the parts of the ReadLogs implementation in
Go which were previously implemented in C as part of the cgo preamble.
Separating the business logic from the cgo minutiae should hopefully
make the code more accessible to a wider audience of developers for
reviewing the code and contributing improvements.
The structure of the ReadLogs implementation is retained with few
modifications. Any ignored errors were also ignored before the refactor;
the explicit error return values afforded by the sdjournal wrapper makes
this more obvious.
The package github.com/coreos/go-systemd/v22/sdjournal also provides a
more idiomatic Go wrapper around libsystemd. It is unsuitable for our
needs as it does not expose wrappers for the sd_journal_process and
sd_journal_get_fd functions.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Ensure the package can be imported, no matter the build constratints, by
adding an unconstrained doc.go containing a package statement.
Signed-off-by: Cory Snider <csnider@mirantis.com>