Signed-off-by: John Howard <jhoward@microsoft.com>
Some permissions corrections here. Also needs re-vendor of go-winio.
- Create the layer folder directory as standard, not with SDDL. It will inherit permissions from the data-root correctly.
- Apply the VM Group SID access to layer.vhd
Permissions after this changes
Data root:
```
PS C:\> icacls test
test BUILTIN\Administrators:(OI)(CI)(F)
NT AUTHORITY\SYSTEM:(OI)(CI)(F)
```
lcow subdirectory under dataroot
```
PS C:\> icacls test\lcow
test\lcow BUILTIN\Administrators:(I)(OI)(CI)(F)
NT AUTHORITY\SYSTEM:(I)(OI)(CI)(F)
```
layer.vhd in a layer folder for LCOW
```
.\test\lcow\c33923d21c9621fea2f990a8778f469ecdbdc57fd9ca682565d1fa86fadd5d95\layer.vhd NT VIRTUAL MACHINE\Virtual Machines:(R)
BUILTIN\Administrators:(I)(F)
NT AUTHORITY\SYSTEM:(I)(F)
```
And showing working
```
PS C:\> docker-ci-zap -folder=c:\test
INFO: Zapped successfully
PS C:\> docker run --rm alpine echo hello
Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
8e402f1a9c57: Pull complete
Digest: sha256:644fcb1a676b5165371437feaa922943aaf7afcfa8bfee4472f6860aad1ef2a0
Status: Downloaded newer image for alpine:latest
hello
```
Signed-off-by: John Howard <jhoward@microsoft.com>
Also fixes https://github.com/moby/moby/issues/22874
This commit is a pre-requisite to moving moby/moby on Windows to using
Containerd for its runtime.
The reason for this is that the interface between moby and containerd
for the runtime is an OCI spec which must be unambigious.
It is the responsibility of the runtime (runhcs in the case of
containerd on Windows) to ensure that arguments are escaped prior
to calling into HCS and onwards to the Win32 CreateProcess call.
Previously, the builder was always escaping arguments which has
led to several bugs in moby. Because the local runtime in
libcontainerd had context of whether or not arguments were escaped,
it was possible to hack around in daemon/oci_windows.go with
knowledge of the context of the call (from builder or not).
With a remote runtime, this is not possible as there's rightly
no context of the caller passed across in the OCI spec. Put another
way, as I put above, the OCI spec must be unambigious.
The other previous limitation (which leads to various subtle bugs)
is that moby is coded entirely from a Linux-centric point of view.
Unfortunately, Windows != Linux. Windows CreateProcess uses a
command line, not an array of arguments. And it has very specific
rules about how to escape a command line. Some interesting reading
links about this are:
https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/https://stackoverflow.com/questions/31838469/how-do-i-convert-argv-to-lpcommandline-parameter-of-createprocesshttps://docs.microsoft.com/en-us/cpp/cpp/parsing-cpp-command-line-arguments?view=vs-2017
For this reason, the OCI spec has recently been updated to cater
for more natural syntax by including a CommandLine option in
Process.
What does this commit do?
Primary objective is to ensure that the built OCI spec is unambigious.
It changes the builder so that `ArgsEscaped` as commited in a
layer is only controlled by the use of CMD or ENTRYPOINT.
Subsequently, when calling in to create a container from the builder,
if follows a different path to both `docker run` and `docker create`
using the added `ContainerCreateIgnoreImagesArgsEscaped`. This allows
a RUN from the builder to control how to escape in the OCI spec.
It changes the builder so that when shell form is used for RUN,
CMD or ENTRYPOINT, it builds (for WCOW) a more natural command line
using the original as put by the user in the dockerfile, not
the parsed version as a set of args which loses fidelity.
This command line is put into args[0] and `ArgsEscaped` is set
to true for CMD or ENTRYPOINT. A RUN statement does not commit
`ArgsEscaped` to the commited layer regardless or whether shell
or exec form were used.
Signed-off-by: John Howard <jhoward@microsoft.com>
This is the first step in refactoring moby (dockerd) to use containerd on Windows.
Similar to the current model in Linux, this adds the option to enable it for runtime.
It does not switch the graphdriver to containerd snapshotters.
- Refactors libcontainerd to a series of subpackages so that either a
"local" containerd (1) or a "remote" (2) containerd can be loaded as opposed
to conditional compile as "local" for Windows and "remote" for Linux.
- Updates libcontainerd such that Windows has an option to allow the use of a
"remote" containerd. Here, it communicates over a named pipe using GRPC.
This is currently guarded behind the experimental flag, an environment variable,
and the providing of a pipename to connect to containerd.
- Infrastructure pieces such as under pkg/system to have helper functions for
determining whether containerd is being used.
(1) "local" containerd is what the daemon on Windows has used since inception.
It's not really containerd at all - it's simply local invocation of HCS APIs
directly in-process from the daemon through the Microsoft/hcsshim library.
(2) "remote" containerd is what docker on Linux uses for it's runtime. It means
that there is a separate containerd service running, and docker communicates over
GRPC to it.
To try this out, you will need to start with something like the following:
Window 1:
containerd --log-level debug
Window 2:
$env:DOCKER_WINDOWS_CONTAINERD=1
dockerd --experimental -D --containerd \\.\pipe\containerd-containerd
You will need the following binary from github.com/containerd/containerd in your path:
- containerd.exe
You will need the following binaries from github.com/Microsoft/hcsshim in your path:
- runhcs.exe
- containerd-shim-runhcs-v1.exe
For LCOW, it will require and initrd.img and kernel in `C:\Program Files\Linux Containers`.
This is no different to the current requirements. However, you may need updated binaries,
particularly initrd.img built from Microsoft/opengcs as (at the time of writing), Linuxkit
binaries are somewhat out of date.
Note that containerd and hcsshim for HCS v2 APIs do not yet support all the required
functionality needed for docker. This will come in time - this is a baby (although large)
step to migrating Docker on Windows to containerd.
Note that the HCS v2 APIs are only called on RS5+ builds. RS1..RS4 will still use
HCS v1 APIs as the v2 APIs were not fully developed enough on these builds to be usable.
This abstraction is done in HCSShim. (Referring specifically to runtime)
Note the LCOW graphdriver still uses HCS v1 APIs regardless.
Note also that this does not migrate docker to use containerd snapshotters
rather than graphdrivers. This needs to be done in conjunction with Linux also
doing the same switch.
syscall.Stat (and Lstat), unlike functions from os pkg,
return "raw" errors (like EPERM or EINVAL), and those are
propagated up the function call stack unchanged, and gets
logged and/or returned to the user as is.
Wrap those into os.PathError{} so the error message will
at least have function name and file name.
Note we use Capitalized function names to distinguish
between functions in os and ours.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
Addresses https://github.com/moby/moby/pull/35089#issuecomment-367802698.
This change enables the daemon to automatically select an image under LCOW
that can be used if the API doesn't specify an explicit platform.
For example:
FROM supertest2014/nyan
ADD Dockerfile /
And docker build . will download the linux image (not a multi-manifest image)
And similarly docker pull ubuntu will match linux/amd64
Signed-off-by: John Howard <jhoward@microsoft.com>
The re-coalesces the daemon stores which were split as part of the
original LCOW implementation.
This is part of the work discussed in https://github.com/moby/moby/issues/34617,
in particular see the document linked to in that issue.
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Standard golang's `os.MkdirAll()` function returns "not a directory" error
in case a directory to be created already exists but is not a directory
(e.g. a file). Our own `idtools.MkdirAs*()` functions do not replicate
the behavior.
This is a bug since all `Mkdir()`-like functions are expected to ensure
the required directory exists and is indeed a directory, and return an
error otherwise.
As the code is using our in-house `system.Stat()` call returning a type
which is incompatible with that of golang's `os.Stat()`, I had to amend
the `system` package with `IsDir()`.
A test case is also provided.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Update golang.org/x/sys to 8dbc5d05d6edcc104950cc299a1ce6641235bc86 in
order to get the Major, Minor and Mkdev functions for every unix-like
OS. Use them instead of the locally defined versions which currently use
the Linux specific device major/minor encoding.
This means that the device number should now be properly encoded on e.g.
Darwin, FreeBSD or Solaris.
Also, the SIGUNUSED constant was removed from golang.org/x/sys/unix in
https://go-review.googlesource.com/61771 as it is also removed from the
respective glibc headers.
Remove it from signal.SignalMap as well after the golang.org/x/sys
re-vendoring.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: John Howard <jhoward@microsoft.com>
This PR has the API changes described in https://github.com/moby/moby/issues/34617.
Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded
OCI Image-spec `Platform` structure.
In addition, it renames (almost all) uses of a string variable platform (and associated)
methods/functions to os. This makes it much clearer to disambiguate with the swarm
"platform" which is really os/arch. This is a stepping stone to getting the daemon towards
fully multi-platform/arch-aware, and makes it clear when "operating system" is being
referred to rather than "platform" which is misleadingly used - sometimes in the swarm
meaning, but more often as just the operating system.
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.
Signed-off-by: Akash Gupta <akagup@microsoft.com>
Use CreateEvent, OpenEvent (which both map to the respective *EventW
function) and PulseEvent from golang.org/x/sys instead of local copies.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Use the symlink xattr syscall wrappers Lgetxattr and Lsetxattr from
x/sys/unix (introduced in golang/sys@b90f89a) instead of providing own
wrappers. Leave the functionality of system.Lgetxattr intact with
respect to the retry with a larger buffer, but switch it to use
unix.Lgetxattr. Also leave system.Lsetxattr intact (even though it's
just a wrapper around the corresponding function from unix) in order to
keep moby building for !linux.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Changes most references of syscall to golang.org/x/sys/
Ones aren't changes include, Errno, Signal and SysProcAttr
as they haven't been implemented in /x/sys/.
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
[s390x] switch utsname from unsigned to signed
per 33267e036f
char in s390x in the /x/sys/unix package is now signed, so
change the buildtags
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
Before this, if `forceRemove` is set the container data will be removed
no matter what, including if there are issues with removing container
on-disk state (rw layer, container root).
In practice this causes a lot of issues with leaked data sitting on
disk that users are not able to clean up themselves.
This is particularly a problem while the `EBUSY` errors on remove are so
prevalent. So for now let's not keep this behavior.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>