Commit graph

60 commits

Author SHA1 Message Date
Cory Snider
2bdc7fb0a1 daemon: archive in a dedicated mount namespace
Mounting a container's volumes under its rootfs directory inside the
host mount namespace causes problems with cross-namespace mount
propagation when /var/lib/docker is bind-mounted into the container as a
volume. The mount event propagates into the container's mount namespace,
overmounting the volume, but the propagated unmount events do not fully
reverse the effect. Each archive operation causes the mount table in the
container's mount namespace to grow larger and larger, until the kernel
limiton the number of mounts in a namespace is hit. The only solution to
this issue which is not subject to race conditions or other blocker
caveats is to avoid mounting volumes into the container's rootfs
directory in the host mount namespace in the first place.

Mount the container volumes inside an unshared mount namespace to
prevent any mount events from propagating into any other mount
namespace. Greatly simplify the archiving implementations by also
chrooting into the container rootfs to sidestep the need to resolve
paths in the host.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-10-27 12:52:14 -04:00
Sebastiaan van Stijn
686be57d0a
Update to Go 1.17.0, and gofmt with Go 1.17
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-24 23:33:27 +02:00
Sebastiaan van Stijn
eb14d936bf
daemon: rename variables that collide with imported package names
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:23 +02:00
Kir Kolyshkin
39048cf656 Really switch to moby/sys/mount*
Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential
outside users.

This commit was generated by the following bash script:

```
set -e -u -o pipefail

for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do
	sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \
		-e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \
		$file
	goimports -w $file
done
```

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-03-20 09:46:25 -07:00
Kir Kolyshkin
a6773f69f2 daemon/mountVolumes(): eliminate MakeRPrivate call
It is sufficient to add "rprivate" to mount flags.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-04-09 12:58:38 -07:00
Kir Kolyshkin
4e65b17ac4 daemon/mountVolumes: no need to specify fstype
For bind mounts, fstype argument to mount(2) is ignored.
Usual convention is either empty string or "none".

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-04-09 12:58:19 -07:00
Akihiro Suda
596cdffb9f mount: add BindOptions.NonRecursive (API v1.40)
This allows non-recursive bind-mount, i.e. mount(2) with "bind" rather than "rbind".

Swarm-mode will be supported in a separate PR because of mutual vendoring.

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2018-11-06 17:51:58 +09:00
Salahuddin Khan
763d839261 Add ADD/COPY --chown flag support to Windows
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.

NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.

The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>

On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.

Signed-off-by: Salahuddin Khan <salah@docker.com>
2018-08-13 21:59:11 -07:00
Brian Goff
6a70fd222b Move mount parsing to separate package.
This moves the platform specific stuff in a separate package and keeps
the `volume` package and the defined interfaces light to import.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-04-19 06:35:54 -04:00
Brian Goff
0023abbad3 Remove old/uneeded volume migration from vers 1.7
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-04-17 14:06:53 -04:00
Daniel Nephin
4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
Victor Vieux
462d79165f
Merge pull request #34224 from estesp/no-chown-nwfiles-outside-metadata
Only chown network files within container metadata
2017-11-02 15:00:42 -07:00
Yong Tang
4785f1a7ab Remove solaris build tag and `contrib/mkimage/solaris
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2017-11-02 00:01:46 +00:00
Phil Estes
42716dcf5c
Only chown network files within container metadata
If the user specifies a mountpath from the host, we should not be
attempting to chown files outside the daemon's metadata directory
(represented by `daemon.repository` at init time).

This forces users who want to use user namespaces to handle the
ownership needs of any external files mounted as network files
(/etc/resolv.conf, /etc/hosts, /etc/hostname) separately from the
daemon. In all other volume/bind mount situations we have taken this
same line--we don't chown host file content.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
2017-11-01 10:14:01 -04:00
Fabio Kung
76d96418b1 avoid saving container state to disk before daemon.Register
Migrate legacy volumes (Daemon.verifyVolumesInfo) before containers are
registered on the Daemon, so state on disk is not overwritten and legacy
fields lost during registration.

Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
2017-06-23 07:52:34 -07:00
Fabio Kung
edad52707c save deep copies of Container in the replica store
Reuse existing structures and rely on json serialization to deep copy
Container objects.

Also consolidate all "save" operations on container.CheckpointTo, which
now both saves a serialized json to disk, and replicates state to the
ACID in-memory store.

Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
2017-06-23 07:52:33 -07:00
Fabio Kung
201a37f7a1 verifyVolumesInfo needs a container lock
It operates on containers that have already been registered on the
daemon, and are visible to other goroutines.

Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
2017-06-23 07:52:33 -07:00
Daniel Nephin
93fbdb69ac Remove error return from RootPair
There is no case which would resolve in this error. The root user always exists, and if the id maps are empty, the default value of 0 is correct.

Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-06-07 11:45:33 -04:00
Daniel Nephin
09cd96c5ad Partial refactor of UID/GID usage to use a unified struct.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-06-07 11:44:33 -04:00
Lei Jitang
7318eba5b2 Don't create source directory while the daemon is being shutdown, fix #30348
If a container mount the socket the daemon is listening on into
container while the daemon is being shutdown, the socket will
not exist on the host, then daemon will assume it's a directory
and create it on the host, this will cause the daemon can't start
next time.

fix issue https://github.com/moby/moby/issues/30348

To reproduce this issue, you can add following code

```
--- a/daemon/oci_linux.go
+++ b/daemon/oci_linux.go
@@ -8,6 +8,7 @@ import (
        "sort"
        "strconv"
        "strings"
+       "time"

        "github.com/Sirupsen/logrus"
        "github.com/docker/docker/container"
@@ -666,7 +667,8 @@ func (daemon *Daemon) createSpec(c *container.Container) (*libcontainerd.Spec, e
        if err := daemon.setupIpcDirs(c); err != nil {
                return nil, err
        }
-
+       fmt.Printf("===please stop the daemon===\n")
+       time.Sleep(time.Second * 2)
        ms, err := daemon.setupMounts(c)
        if err != nil {
                return nil, err

```

step1 run a container which has `--restart always` and `-v /var/run/docker.sock:/sock`
```
$ docker run -ti --restart always -v /var/run/docker.sock:/sock busybox
/ #

```
step2 exit the the container
```
/ # exit
```
and kill the daemon when you see
```
===please stop the daemon===
```
in the daemon log

The daemon can't restart again and fail with `can't create unix socket /var/run/docker.sock: is a directory`.

Signed-off-by: Lei Jitang <leijitang@huawei.com>
2017-05-30 22:59:51 -04:00
zhukj
cf3ef262b1 close the file
Signed-off-by: zhukj <zhu.kunjia@zte.com.cn>
2016-11-21 19:56:01 +08:00
Amit Krishnan
934328d8ea Add functional support for Docker sub commands on Solaris
Signed-off-by: Amit Krishnan <krish.amit@gmail.com>

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2016-11-07 09:06:34 -08:00
Akihiro Suda
18768fdc2e api: add TypeTmpfs to api/types/mount
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2016-10-28 08:38:32 +00:00
Brian Goff
fc7b904dce Add new HostConfig field, Mounts.
`Mounts` allows users to specify in a much safer way the volumes they
want to use in the container.
This replaces `Binds` and `Volumes`, which both still exist, but
`Mounts` and `Binds`/`Volumes` are exclussive.
The CLI will continue to use `Binds` and `Volumes` due to concerns with
parsing the volume specs on the client side and cross-platform support
(for now).

The new API follows exactly the services mount API.

Example usage of `Mounts`:

```
$ curl -XPOST localhost:2375/containers/create -d '{
  "Image": "alpine:latest",
  "HostConfig": {
    "Mounts": [{
      "Type": "Volume",
      "Target": "/foo"
      },{
      "Type": "bind",
      "Source": "/var/run/docker.sock",
      "Target": "/var/run/docker.sock",
      },{
      "Type": "volume",
      "Name": "important_data",
      "Target": "/var/data",
      "ReadOnly": true,
      "VolumeOptions": {
	"DriverConfig": {
	  Name: "awesomeStorage",
	  Options: {"size": "10m"},
	  Labels: {"some":"label"}
	}
      }]
    }
}'
```

There are currently 2 types of mounts:

  - **bind**: Paths on the host that get mounted into the
    container. Paths must exist prior to creating the container.
  - **volume**: Volumes that persist after the
    container is removed.

Not all fields are available in each type, and validation is done to
ensure these fields aren't mixed up between types.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-09-13 09:55:35 -04:00
Tõnis Tiigi
a6daa94e3e Merge pull request #26342 from cpuguy83/20079_restore_volume_migrate
restore migrating pre-1.7.0 volumes
2016-09-07 10:56:07 -07:00
Brian Goff
dc712b9249 restore migrating pre-1.7.0 volumes
This was removed in a clean-up
(060f4ae617) but should not have been.
Fixes issues with volumes when upgrading from pre-1.7.0 daemons.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-09-06 17:17:47 -04:00
Antonis Kalipetis
72d8a77d52
Make host directory mounts use idtools.MkdirAllNewAs
This makes sure that:
1. Already existing directories are left untouched
2. Newly created directories are chowned to the correct root UID/GID in case of user namespaces
3. All parent directories still get created with host root UID/GID

Fix #21738

Signed-off-by: Antonis Kalipetis <akalipetis@gmail.com>
2016-09-05 12:46:57 +03:00
Brian Goff
6d98e344c7 revendor engine-api
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-08-16 14:16:12 -04:00
Antonio Murdaca
756f6cef4a daemon: allow tmpfs to trump over VOLUME(s)
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-06-15 16:01:51 +02:00
Dan Walsh
322cc99c69 Need to create bind mount volume if it does not exist.
In order to be consistent on creation of volumes for bind mounts
we need to create the source directory if it does not exist and the
user specified he wants it relabeled.

Can not do this lower down the stack, since we are not passing in the
mode fields.

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2016-06-02 07:14:17 -04:00
Tonis Tiigi
9c4570a958 Replace execdrivers with containerd implementation
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
2016-03-18 13:38:32 -07:00
Jessica Frazelle
8dd88afb5b
remove dead code
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-03-16 19:15:14 -07:00
David Calavera
aab3596397 Remove duplicated lazy volume initialization.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-01-13 11:22:31 -05:00
Darren Shepherd
2aa673aed7 Lazy initialize Volume on container Mount object
Currently on daemon start volumes are "created" which involves invoking
a volume driver if needed.  If this process fails the mount is left in a
bad state in which there is no source or Volume set.  This now becomes
an unrecoverable state in which that container can not be started.  The
only way to fix is to restart the daemon and hopefully you don't get
another error on startup.

This change moves "createVolume" to be done at container start.  If the
start fails it leaves it in the state in which you can try another
start.  If the second start can contact the volume driver everything
will recover fine.

Signed-off-by: Darren Shepherd <darren@rancher.com>
2016-01-12 17:19:59 -05:00
David Calavera
9d12d09300 Add volume events.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-12-30 17:39:33 -05:00
Vivek Goyal
a2dc4f79f2 Add capability to specify mount propagation per volume
Allow passing mount propagation option shared, slave, or private as volume
property.

For example.
docker run -ti -v /root/mnt-source:/root/mnt-dest:slave fedora bash

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-12-14 10:39:53 -05:00
David Calavera
6bb0d1816a Move Container to its own package.
So other packages don't need to import the daemon package when they
want to use this struct.

Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Tibor Vass <tibor@docker.com>
2015-12-03 17:39:49 +01:00
David Calavera
060f4ae617 Remove the container initializers per platform.
By removing deprecated volume structures, now that windows mount volumes we don't need a initializer per platform.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-11-18 08:41:46 -05:00
David Calavera
63efc12070 Remove further references to the daemon within containers.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-11-04 12:28:54 -05:00
John Howard
a7e686a779 Windows: Add volume support
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-10-22 10:42:53 -07:00
Phil Estes
79240b9eaf Correct mismatched function names (UID() and Gid())
All the go-lint work forced any existing "Uid" -> "UID", but seems to
not have the same rules for Gid, so stat package has calls UID() and
Gid().

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2015-10-12 10:58:33 -04:00
Phil Estes
442b45628e Add user namespace (mapping) support to the Docker engine
Adds support for the daemon to handle user namespace maps as a
per-daemon setting.

Support for handling uid/gid mapping is added to the builder,
archive/unarchive packages and functions, all graphdrivers (except
Windows), and the test suite is updated to handle user namespace daemon
rootgraph changes.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2015-10-09 17:47:37 -04:00
Tibor Vass
b08f071e18 Revert "Merge pull request #16228 from duglin/ContextualizeEvents"
Although having a request ID available throughout the codebase is very
valuable, the impact of requiring a Context as an argument to every
function in the codepath of an API request, is too significant and was
not properly understood at the time of the review.

Furthermore, mixing API-layer code with non-API-layer code makes the
latter usable only by API-layer code (one that has a notion of Context).

This reverts commit de41640435, reversing
changes made to 7daeecd42d.

Signed-off-by: Tibor Vass <tibor@docker.com>

Conflicts:
	api/server/container.go
	builder/internals.go
	daemon/container_unix.go
	daemon/create.go
2015-09-29 14:26:51 -04:00
Doug Davis
26b1064967 Add context.RequestID to event stream
This PR adds a "request ID" to each event generated, the 'docker events'
stream now looks like this:

```
2015-09-10T15:02:50.000000000-07:00 [reqid: c01e3534ddca] de7c5d4ca927253cf4e978ee9c4545161e406e9b5a14617efb52c658b249174a: (from ubuntu) create
```
Note the `[reqID: c01e3534ddca]` part, that's new.

Each HTTP request will generate its own unique ID. So, if you do a
`docker build` you'll see a series of events all with the same reqID.
This allow for log processing tools to determine which events are all related
to the same http request.

I didn't propigate the context to all possible funcs in the daemon,
I decided to just do the ones that needed it in order to get the reqID
into the events. I'd like to have people review this direction first, and
if we're ok with it then I'll make sure we're consistent about when
we pass around the context - IOW, make sure that all funcs at the same level
have a context passed in even if they don't call the log funcs - this will
ensure we're consistent w/o passing it around for all calls unnecessarily.

ping @icecrime @calavera @crosbymichael

Signed-off-by: Doug Davis <dug@us.ibm.com>
2015-09-24 11:56:37 -07:00
Doug Davis
a283a30fb0 Move api/errors/ to errors/
Per @calavera's suggestion: https://github.com/docker/docker/pull/16355#issuecomment-141139220

Signed-off-by: Doug Davis <dug@us.ibm.com>
2015-09-17 11:54:14 -07:00
Doug Davis
f7d4b4fe2b Convert some "daemon" static error strings to the new errocode package format
Signed-off-by: Doug Davis <dug@us.ibm.com>
2015-09-16 16:16:42 -07:00
David Calavera
55a601e3f1 Vendor libcontainer v0.0.4
Noteworthy changes:

- Add Prestart/Poststop hook support
- Fix bug finding cgroup mount directory
- Add OomScoreAdj as a container configuration option
- Ensure the cleanup jobs in the deferrer are executed on error
- Don't make modifications to /dev when it is bind mounted

Other changes in runc:

https://github.com/opencontainers/runc/compare/v0.0.3...v0.0.4

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-09-11 16:17:59 -04:00
Antonio Murdaca
587823af27 daemon: remove unused function params
Signed-off-by: Antonio Murdaca <runcom@linux.com>
2015-09-09 22:37:46 +02:00
Brian Goff
9ca4aa4797 Merge pull request #15798 from calavera/volume_driver_host_config
Move VolumeDriver to HostConfig to make containers portable.
2015-09-08 22:05:40 -04:00
Jessie Frazelle
7c667f9d6e Merge pull request #15999 from cpuguy83/15994_ext_volume_bind
Set bind driver after volume is created
2015-09-04 09:47:10 -07:00