This changes mounts.NewParser() to create a parser for the current operatingsystem,
instead of one specific to a (possibly non-matching, in case of LCOW) OS.
With the OS-specific handling being removed, the "OS" parameter is also removed
from `daemon.verifyContainerSettings()`, and various other container-related
functions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The LCOW implementation in dockerd has been deprecated in favor of re-implementation
in containerd (in progress). Microsoft started removing the LCOW V1 code from the
build dependencies we use in Microsoft/opengcs (soon to be part of Microsoft/hcshhim),
which means that we need to start removing this code.
This first step removes the lcow graphdriver, the LCOW initialization code, and
some LCOW-related utilities.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Needed for runc >= 1.0.0-rc94.
See runc issue 2928.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When writing container's `hostconfig.json`, permissions were set to 0644 (world-
readable). While this is not a security concern (as the `/var/lib/docker/containers`
directory has `0700` or `0701` permissions), there is no real need to have these
permissions, as this file is only accessed by the daemon.
Looking at history for file permissions;
- 06b53e3fc7 (first implementation) used `0666` (world-writable)
- cf1a6c08fa refactored the code, and removed explicit permissions
- ea3cbd3274 introduced atomic writes, and brought back the `0666` permissions
- 3ec8fed747 removed world-writable bits, but kept world-readable
This patch updates the permissions to `0600`, matching what's used for `config.v2.json`,
which was updated in ae52cea3ab, but forgot to update
`hostconfig.json`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 12c7541f1f updated the
opencontainers/selinux dependency to v1.3.1, which had a breaking
change in the errors that were returned.
Before v1.3.1, the "raw" `syscall.ENOTSUP` was returned if the
underlying filesystem did not support xattrs, but later versions
wrapped the error, which caused our detection to fail.
This patch uses `errors.Is()` to check for the underlying error.
This requires github.com/pkg/errors v0.9.1 or above (older versions
could use `errors.Cause()`, but are not compatible with "native"
wrapping of errors in Go 1.13 and up, and could potentially cause
these errors to not being detected again.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since we don't need the actual split values, instead of calling
`strings.Split`, which allocates new slices on each call, use
`strings.Index`.
This significantly reduces the allocations required when doing env value
replacements.
Additionally, pre-allocate the env var slice, even if we allocate a
little more than we need, it keeps us from having to do multiple
allocations while appending.
```
benchmark old ns/op new ns/op delta
BenchmarkReplaceOrAppendEnvValues/0-8 486 313 -35.60%
BenchmarkReplaceOrAppendEnvValues/100-8 10553 1535 -85.45%
BenchmarkReplaceOrAppendEnvValues/1000-8 94275 12758 -86.47%
BenchmarkReplaceOrAppendEnvValues/10000-8 1161268 129269 -88.87%
benchmark old allocs new allocs delta
BenchmarkReplaceOrAppendEnvValues/0-8 5 2 -60.00%
BenchmarkReplaceOrAppendEnvValues/100-8 110 0 -100.00%
BenchmarkReplaceOrAppendEnvValues/1000-8 1013 0 -100.00%
BenchmarkReplaceOrAppendEnvValues/10000-8 10022 0 -100.00%
benchmark old bytes new bytes delta
BenchmarkReplaceOrAppendEnvValues/0-8 192 24 -87.50%
BenchmarkReplaceOrAppendEnvValues/100-8 7360 0 -100.00%
BenchmarkReplaceOrAppendEnvValues/1000-8 64832 0 -100.00%
BenchmarkReplaceOrAppendEnvValues/10000-8 1146049 0 -100.00%
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential
outside users.
This commit was generated by the following bash script:
```
set -e -u -o pipefail
for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do
sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \
-e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \
$file
goimports -w $file
done
```
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Configuration over the API per container is intentionally left out for
the time being, but is supported to configure the default from the
daemon config.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit cbecf48bc352e680a5390a7ca9cff53098cd16d7)
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This supplements any log driver which does not support reads with a
custom read implementation that uses a local file cache.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit d675e2bf2b75865915c7a4552e00802feeb0847f)
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Format the source according to latest goimports.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This made my IDE unhappy; `ConfigFilePath` is an exported function, so
it makes sense to use the same signature for both Linux and Windows.
This patch also adds error handling (same as on Linux), even though the
current implementation will never return an error (it's good practice
to handle errors, so I assumed this would be the right approach)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
also renamed the non-windows variant of this file to be
consistent with other files in this package
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When cleaning up IPC mounts, the daemon could log a warning if the IPC mount was not found;
```
cleanup: failed to unmount IPC: umount /var/lib/docker/containers/90f408e26e205d30676655a08504dddc0d17f5713c1dd4654cf67ded7d3bbb63/mounts/shm, flags: 0x2: no such file or directory"
```
These warnings are safe to ignore, but can cause some confusion; `container.UnmountIpcMount()`
already attempted to suppress these warnings, however, `mount.Unmount()` returns a `mountError`,
which nests the original error, therefore detecting failed.
This parch uses `errors.Cause()` to get the _underlying_ error to detect if it's a "is not exist".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is the second part to
https://github.com/containerd/containerd/pull/3361 and will help process
delete not block forever when the process exists but the I/O was
inherited by a subprocess that lives on.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
pborman/uuid and google/uuid used to be different versions of
the same package, but now pborman/uuid is a compatibility wrapper
around google/uuid, maintained by the same person.
Clean up some of the usage as the functions differ slightly.
Not yet removed some uses of pborman/uuid in vendored code but
I have PRs in process for these.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
`time.After` keeps a timer running until the specified duration is
completed. It also allocates a new timer on each call. This can wind up
leaving lots of uneccessary timers running in the background that are
not needed and consume resources.
Instead of `time.After`, use `time.NewTimer` so the timer can actually
be stopped.
In some of these cases it's not a big deal since the duraiton is really
short, but in others it is much worse.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This import got lost after commit 56cc56b0fa
was merged, likely because the PR was built against an outdated
master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The errors returned from Mount and Unmount functions are raw
syscall.Errno errors (like EPERM or EINVAL), which provides
no context about what has happened and why.
Similar to os.PathError type, introduce mount.Error type
with some context. The error messages will now look like this:
> mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted
or
> mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted
Before this patch, it was just
> operation not permitted
[v2: add Cause()]
[v3: rename MountError to Error, document Cause()]
[v4: fixes; audited all users]
[v5: make Error type private; changes after @cpuguy83 reviews]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
As standard mount.Unmount does what we need, let's use it.
In addition, this adds ignoring "not mounted" condition, which
was previously implemented (see PR#33329, commit cfa2591d3f)
via a very expensive call to mount.Mounted().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This allows non-recursive bind-mount, i.e. mount(2) with "bind" rather than "rbind".
Swarm-mode will be supported in a separate PR because of mutual vendoring.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This driver uses protobuf to store log messages and has better defaults
for log file handling (e.g. compression and file rotation enabled by
default).
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
With a full attach, each attach was leaking 4 goroutines.
This updates attach to use errgroup instead of the hodge-podge of
waitgroups and channels.
In addition, the detach event was never being sent.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
These network operations really don't have anything to do with the
container but rather are setting up the networking.
Ideally these wouldn't get shoved into the daemon package, but doing
something else (e.g. extract a network service into a new package) but
there's a lot more work to do in that regard.
In reality, this probably simplifies some of that work as it moves all
the network operations to the same place.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
1. As daemon.ContainerStop() documentation says,
> If a negative number of seconds is given, ContainerStop
> will wait for a graceful termination.
but since commit cfdf84d5d0 (PR #32237) this is no longer the case.
This happens because `context.WithTimeout(ctx, timeout)` is implemented
as `WithDeadline(ctx, time.Now().Add(timeout))`, resulting in a deadline
which is in the past.
To fix, don't use WithDeadline() if the timeout is negative.
2. Add a test case to validate the correct behavior and
as a means to prevent a similar regression in the future.
3. Fix/improve daemon.ContainerStop() and client.ContainerStop()
description for clarity and completeness.
4. Fix/improve DefaultStopTimeout description.
Fixes: cfdf84d5d0 ("Update Container Wait")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since Go 1.7, context is a standard package. Since Go 1.9, everything
that is provided by "x/net/context" is a couple of type aliases to
types in "context".
Many vendored packages still use x/net/context, so vendor entry remains
for now.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The unit test is checking that setting of non-default StopTimeout
works, but it checked the value of StopSignal instead.
Amazingly, the test was working since the default StopSignal is SIGTERM,
which has the numeric value of 15.
Fixes: commit e66d21089 ("Add config parameter to change ...")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This moves the platform specific stuff in a separate package and keeps
the `volume` package and the defined interfaces light to import.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Commit 7a7357dae1 ("LCOW: Implemented support for docker cp + build")
changed `container.BaseFS` from being a string (that could be empty but
can't lead to nil pointer dereference) to containerfs.ContainerFS,
which could be be `nil` and so nil dereference is at least theoretically
possible, which leads to panic (i.e. engine crashes).
Such a panic can be avoided by carefully analysing the source code in all
the places that dereference a variable, to make the variable can't be nil.
Practically, this analisys are impossible as code is constantly
evolving.
Still, we need to avoid panics and crashes. A good way to do so is to
explicitly check that a variable is non-nil, returning an error
otherwise. Even in case such a check looks absolutely redundant,
further changes to the code might make it useful, and having an
extra check is not a big price to pay to avoid a panic.
This commit adds such checks for all the places where it is not obvious
that container.BaseFS is not nil (which in this case means we do not
call daemon.Mount() a few lines earlier).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
While debugging #32838, it was found (https://github.com/moby/moby/issues/32838#issuecomment-356005845) that the utility VM in some circumstances was crashing. Unfortunately, this was silently thrown away, and as far as the build step (also applies to docker run) was concerned, the exit code was zero and the error was thrown away. Windows containers operate differently to containers on Linux, and there can be legitimate system errors during container shutdown after the init process exits. This PR handles this and passes the error all the way back to the client, and correctly causes a build step running a container which hits a system error to fail, rather than blindly trying to keep going, assuming all is good, and get a subsequent failure on a commit.
With this change, assuming an error occurs, here's an example of a failure which previous was reported as a commit error:
```
The command 'powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue'; Install-WindowsFeature -Name Web-App-Dev ; Install-WindowsFeature -Name ADLDS; Install-WindowsFeature -Name Web-Mgmt-Compat; Install-WindowsFeature -Name Web-Mgmt-Service; Install-WindowsFeature -Name Web-Metabase; Install-WindowsFeature -Name Web-Lgcy-Scripting; Install-WindowsFeature -Name Web-WMI; Install-WindowsFeature -Name Web-WHC; Install-WindowsFeature -Name Web-Scripting-Tools; Install-WindowsFeature -Name Web-Net-Ext45; Install-WindowsFeature -Name Web-ASP; Install-WindowsFeature -Name Web-ISAPI-Ext; Install-WindowsFeature -Name Web-ISAPI-Filter; Install-WindowsFeature -Name Web-Default-Doc; Install-WindowsFeature -Name Web-Dir-Browsing; Install-WindowsFeature -Name Web-Http-Errors; Install-WindowsFeature -Name Web-Static-Content; Install-WindowsFeature -Name Web-Http-Redirect; Install-WindowsFeature -Name Web-DAV-Publishing; Install-WindowsFeature -Name Web-Health; Install-WindowsFeature -Name Web-Http-Logging; Install-WindowsFeature -Name Web-Custom-Logging; Install-WindowsFeature -Name Web-Log-Libraries; Install-WindowsFeature -Name Web-Request-Monitor; Install-WindowsFeature -Name Web-Http-Tracing; Install-WindowsFeature -Name Web-Stat-Compression; Install-WindowsFeature -Name Web-Dyn-Compression; Install-WindowsFeature -Name Web-Security; Install-WindowsFeature -Name Web-Windows-Auth; Install-WindowsFeature -Name Web-Basic-Auth; Install-WindowsFeature -Name Web-Url-Auth; Install-WindowsFeature -Name Web-WebSockets; Install-WindowsFeature -Name Web-AppInit; Install-WindowsFeature -Name NET-WCF-HTTP-Activation45; Install-WindowsFeature -Name NET-WCF-Pipe-Activation45; Install-WindowsFeature -Name NET-WCF-TCP-Activation45;' returned a non-zero code: 4294967295: container shutdown failed: container ba9c65054d42d4830fb25ef55e4ab3287550345aa1a2bb265df4e5bfcd79c78a encountered an error during WaitTimeout: failure in a Windows system call: The compute system exited unexpectedly. (0xc0370106)
```
Without this change, it would be incorrectly reported such as in this comment: https://github.com/moby/moby/issues/32838#issuecomment-309621097
```
Step 3/8 : ADD buildtools C:/buildtools
re-exec error: exit status 1: output: time="2017-06-20T11:37:38+10:00" level=error msg="hcsshim::ImportLayer failed in Win32: The system cannot find the path specified. (0x3) layerId=\\\\?\\C:\\ProgramData\\docker\\windowsfilter\\b41d28c95f98368b73fc192cb9205700e21
6691495c1f9ac79b9b04ec4923ea2 flavour=1 folder=C:\\Windows\\TEMP\\hcs232661915"
hcsshim::ImportLayer failed in Win32: The system cannot find the path specified. (0x3) layerId=\\?\C:\ProgramData\docker\windowsfilter\b41d28c95f98368b73fc192cb9205700e216691495c1f9ac79b9b04ec4923ea2 flavour=1 folder=C:\Windows\TEMP\hcs232661915
```
This fixes an issue where the container LogPath was empty when the
non-blocking logging mode was enabled. This change sets the LogPath on
the container as soon as the path is generated, instead of setting the
LogPath on a logger struct and then attempting to pull it off that
logger at a later point. That attempt to pull the LogPath off the logger
was error prone since it assumed that the logger would only ever be a
single type.
Prior to this change docker inspect returned an empty string for
LogPath. This caused issues with tools that rely on docker inspect
output to discover container logs, e.g. Kubernetes.
This commit also removes some LogPath methods that are now unnecessary
and are never invoked.
Signed-off-by: junzhe and mnussbaum <code@getbraintree.com>
On unix, merge secrets/configs handling. This is important because
configs can contain secrets (via templating) and potentially a config
could just simply have secret information "by accident" from the user.
This just make sure that configs are as secure as secrets and de-dups a
lot of code.
Generally this makes everything simpler and configs more secure.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
It's a common scenario for admins and/or monitoring applications to
mount in the daemon root dir into a container. When doing so all mounts
get coppied into the container, often with private references.
This can prevent removal of a container due to the various mounts that
must be configured before a container is started (for example, for
shared /dev/shm, or secrets) being leaked into another namespace,
usually with private references.
This is particularly problematic on older kernels (e.g. RHEL < 7.4)
where a mount may be active in another namespace and attempting to
remove a mountpoint which is active in another namespace fails.
This change moves all container resource mounts into a common directory
so that the directory can be made unbindable.
What this does is prevents sub-mounts of this new directory from leaking
into other namespaces when mounted with `rbind`... which is how all
binds are handled for containers.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix tries to address the issue raised in 35920 where the filter
of `docker ps` with `health=starting` always returns nothing.
The issue was that in container view, the human readable string (`HealthString()` => `Health.String()`)
of health status was used. In case of starting it is `"health: starting"`.
However, the filter still uses `starting` so no match returned.
This fix fixes the issue by using `container.Health.Status()` instead so that it matches
the string (`starting`) passed by filter.
This fix fixes 35920.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a fix to the following issue:
$ docker run --tmpfs /dev/shm busybox sh
docker: Error response from daemon: linux mounts: Duplicate mount point '/dev/shm'.
In current code (daemon.createSpec()), tmpfs mount from --tmpfs is added
to list of mounts (`ms`), when the mount from IpcMounts() is added.
While IpcMounts() is checking for existing mounts first, it does that
by using container.HasMountFor() function which only checks container.Mounts
but not container.Tmpfs.
Ultimately, the solution is to get rid of container.Tmpfs (moving its
data to container.Mounts). Current workaround is to add checking
of container.Tmpfs into container.HasMountFor().
A unit test case is included.
Unfortunately we can't call daemon.createSpec() from a unit test,
as the code relies a lot on various daemon structures to be initialized
properly, and it is hard to achieve. Therefore, we minimally mimick
the code flow of daemon.createSpec() -- barely enough to reproduce
the issue.
https://github.com/moby/moby/issues/35455
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The code in question looks up mounts two times: first by using
HasMountFor(), and then directly by looking in container.MountPoints.
There is no need to do it twice.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Adds a mutex to protect the status, as well. When running the race
detector with the unit test, we can see that the Status field is written
without holding this lock. Adding a mutex to read and set status
addresses the issue.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
While this code was likely called from a single thread before, we have
now seen panics, indicating that it could be called in parallel. This
change adds a mutex to protect opening and closing of the channel. There
may be another root cause associated with this panic, such as something
that led to the calling of this in parallel, as this code is old and we
had seen this condition until recently.
This fix is by no means a permanent fix. Typically, bugs like this
indicate misplaced channel ownership. In idiomatic uses, the channel
should have a particular "owner" that coordinates sending and closure.
In this case, the owner of the channel is unclear, so it gets opened
lazily. Synchronizing this access is a decent solution, but a refactor
may yield better results.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Currently, if a container removal has failed for some reason,
any client waiting for removal (e.g. `docker run --rm`) is
stuck, waiting for removal to succeed while it has failed already.
For more details and the reproducer, please check
https://github.com/moby/moby/issues/34945
This commit addresses that by allowing `ContainerWait()` with
`container.WaitCondition == "removed"` argument to return an
error in case of removal failure. The `ContainerWaitOKBody`
stucture returned to a client is amended with a pointer to `struct Error`,
containing an error message string, and the `Client.ContainerWait()`
is modified to return the error, if any, to the client.
Note that this feature is only available for API version >= 1.34.
In order for the old clients to be unstuck, we just close the connection
without writing anything -- this causes client's error.
Now, docker-cli would need a separate commit to bump the API to 1.34
and to show an error returned, if any.
[v2: recreate the waitRemove channel after closing]
[v3: document; keep legacy behavior for older clients]
[v4: convert Error from string to pointer to a struct]
[v5: don't emulate old behavior, send empty response in error case]
[v6: rename legacy* vars to include version suffix]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The shutdown timeout for containers in insufficient on Windows. If the daemon is shutting down, and a container takes longer than expected to shut down, this can cause the container to remain in a bad state after restart, and never be able to start again. Increasing the timeout makes this less likely to occur.
Signed-off-by: Darren Stahl <darst@microsoft.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
This PR has the API changes described in https://github.com/moby/moby/issues/34617.
Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded
OCI Image-spec `Platform` structure.
In addition, it renames (almost all) uses of a string variable platform (and associated)
methods/functions to os. This makes it much clearer to disambiguate with the swarm
"platform" which is really os/arch. This is a stepping stone to getting the daemon towards
fully multi-platform/arch-aware, and makes it clear when "operating system" is being
referred to rather than "platform" which is misleadingly used - sometimes in the swarm
meaning, but more often as just the operating system.
Take an extra reference to rwlayer while the container is being
committed or exported to avoid the removal of that layer.
Also add some checks before commit/export.
Signed-off-by: Yuanhong Peng <pengyuanhong@huawei.com>
The promise package represents a simple enough concurrency pattern that
replicating it in place is sufficient. To end the propagation of this
package, it has been removed and the uses have been inlined.
While this code could likely be refactored to be simpler without the
package, the changes have been minimized to reduce the possibility of
defects. Someone else may want to do further refactoring to remove
closures and reduce the number of goroutines in use.
Signed-off-by: Stephen J Day <stephen.day@docker.com>