Commit graph

81 commits

Author SHA1 Message Date
Mrunal Patel
fb43ef649b Add support for --pid=container:<id>
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-05-17 13:49:05 -04:00
Yong Tang
632b314b23 Relative symlinks don't work with --device argument
This fix tries to address the issue raised in #22271 where
relative symlinks don't work with --device argument.

Previously, symlink support in --device was implemented (#20684)
with `os.Readlink()`, which does not resolve the link if its
target is a relative path. In this fix, `filepath.EvalSymlinks()`
is used instead, which resolves relative paths
correctly.

An additional test case has been added to the existing
`TestRunDeviceSymlink` to cover changes in this fix.

This fix is related to #13840, #20684, and #22271.
This fix fixes #22271.
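
The difference between the two calls can be sketched as follows. This is a minimal illustration, not the code from the fix: the helper name `compareResolution` and the temporary file layout are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// compareResolution (hypothetical helper) creates a relative symlink in a
// temporary directory and returns what os.Readlink and
// filepath.EvalSymlinks each report for it.
func compareResolution() (raw, resolved string, err error) {
	dir, err := os.MkdirTemp("", "device-symlink")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	// A regular file stands in for the device node.
	if err := os.WriteFile(filepath.Join(dir, "target"), nil, 0o600); err != nil {
		return "", "", err
	}

	// The link stores only the relative name "target".
	link := filepath.Join(dir, "link")
	if err := os.Symlink("target", link); err != nil {
		return "", "", err
	}

	// Readlink returns the stored link text, unresolved.
	if raw, err = os.Readlink(link); err != nil {
		return "", "", err
	}
	// EvalSymlinks follows the link relative to its own directory.
	if resolved, err = filepath.EvalSymlinks(link); err != nil {
		return "", "", err
	}
	return raw, resolved, nil
}

func main() {
	raw, resolved, err := compareResolution()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("os.Readlink:          ", raw)
	fmt.Println("filepath.EvalSymlinks:", resolved)
}
```

`os.Readlink` hands back the bare relative name, which is only meaningful relative to the link's directory, while `filepath.EvalSymlinks` returns a path that can be used directly.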

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2016-04-25 07:22:56 -07:00
Brian Goff
8adc8c3a68 Merge pull request #21901 from mavenugo/sid
Add container's short-id as default network alias
2016-04-19 08:16:41 -04:00
Madhu Venugopal
ea531f061d Add container's short-id as default network alias
link feature in docker0 bridge by default provides short-id as a
container alias. With built-in SD feature, providing a container
short-id as a network alias will fill that gap.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-04-18 14:45:16 -07:00
Vivek Goyal
cacd400777 Mount volumes rprivate for archival and other use cases
People have reported the following problem.

- docker run -ti --name=foo -v /dev/:/dev/ fedora bash
- docker cp foo:/bin/bash /tmp

Once the cp operation is complete, /dev/pts is unmounted on the host. /dev/pts
is a submount of /dev/. This is completely unexpected. The following is the
reason for this behavior.

containerArchivePath() calls mountVolumes(), which goes through all the mount
points of a container and mounts them in the daemon mount namespace in the
/var/lib/docker/devicemapper/mnt/<containerid>/rootfs dir. Once we have
extracted the data required, these are unmounted using UnmountVolumes().

Mounts are done using recursive bind (rbind), and they are unmounted using the
lazy unmount option on the top level mount (detachMounted()). That means if
there are submounts under the top level mount, the unmount events will
propagate, and if they were "shared" mounts with the host, the submounts on
the host will be unmounted as well.

For example, try the following.

- Prepare a parent and child mount point.
  $ mkdir /root/foo
  $ mount --bind /root/foo /root/foo 
  $ mount --make-rshared /root/foo
  
- Prepare a child mount 

  $ mkdir /root/foo/foo1
  $ mount --bind /root/foo/foo1 /root/foo/foo1
 
- Bind mount foo at bar

  $ mkdir /root/bar
  $ mount --rbind /root/foo /root/bar
  
- Now lazy unmount /root/bar and it will unmount /root/foo/foo1 as well.

  $ umount -l /root/bar

This is unintended. We just wanted to unmount /root/bar and anything
underneath, but had no intention of unmounting anything on the source.

So far this was not a problem, as the docker daemon was running in a separate
mount namespace where all propagation was "slave". That means any unmounts
in the docker daemon namespace did not propagate to the host namespace.

But now we are running the docker daemon in the host namespace, so that it is
possible to mount some volumes "shared" with a container: if the container
mounts something, it propagates to the host namespace as well.

Given mountVolumes() seems to be doing only temporary mounts to read some
data, there does not seem to be a need to mount these shared/slave. Just
mount these private so that on unmount, nothing propagates and does not
have unintended consequences. 
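
A minimal sketch of the idea in Go, assuming Linux and root privileges. The function name `mountRPrivate` is hypothetical and this is not the daemon's actual code path; it only shows a recursive bind mount followed by a switch to recursive private propagation.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mountRPrivate (hypothetical helper) recursively bind mounts source at
// target, then marks the new mount tree private so that a later unmount
// of target does not propagate back to the source.
func mountRPrivate(source, target string) error {
	// Step 1: recursive bind mount (the "rbind" from the commit message).
	if err := syscall.Mount(source, target, "", syscall.MS_BIND|syscall.MS_REC, ""); err != nil {
		return fmt.Errorf("rbind %s -> %s: %v", source, target, err)
	}
	// Step 2: make target and everything below it private ("rprivate").
	if err := syscall.Mount("", target, "", syscall.MS_PRIVATE|syscall.MS_REC, ""); err != nil {
		// Best-effort cleanup of the bind mount created in step 1.
		_ = syscall.Unmount(target, syscall.MNT_DETACH)
		return fmt.Errorf("make-rprivate %s: %v", target, err)
	}
	return nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: rprivate SOURCE TARGET")
		return
	}
	if err := mountRPrivate(os.Args[1], os.Args[2]); err != nil {
		fmt.Println("error:", err)
	}
}
```

With the mount made private this way, a later lazy unmount of the target (`umount -l`) should leave the submounts under the source untouched.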

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-04-15 14:03:11 +00:00
Alexander Morozov
5ee8652a21 all: remove some unused funcs and variables
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2016-04-06 10:40:01 -07:00
Tonis Tiigi
ee61235880 Fix setting cgroup permission to user/privileged devices
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-03-24 14:16:33 -07:00
Tonis Tiigi
9c4570a958 Replace execdrivers with containerd implementation
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
2016-03-18 13:38:32 -07:00
Liron Levin
6993e891d1 Run privileged containers when userns are specified
Following #19995 and #17409 this PR enables skipping userns re-mapping
when creating a container (or when executing a command). Thus, enabling
privileged containers running side by side with userns remapped
containers.

The feature is enabled by specifying `--userns=host`, which will not
remap the user if userns are applied. If this flag is not specified,
the existing behavior (which blocks specific privileged operation)
remains.

Signed-off-by: Liron Levin <liron@twistlock.com>
2016-03-14 17:09:25 +02:00
msabansal
e8026d8a98 Windows libnetwork integration
Signed-off-by: msabansal <sabansal@microsoft.com>
2016-03-09 20:33:21 -08:00
Alessandro Boch
b8a5fb76ea Add Exposed ports and port-mapping configs to Sandbox
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-03-09 14:07:23 -08:00
David Calavera
dd32445ecc Merge pull request #18697 from jfrazelle/pids-cgroup
Add PIDs cgroup support to Docker
2016-03-08 14:03:36 -08:00
Brian Goff
dc702b6c6b Merge pull request #20727 from mrunalp/no_new_priv
Add support for NoNewPrivileges in docker
2016-03-08 14:26:15 -05:00
Jessica Frazelle
69cf03700f pids limit support
update bash completion for pids limit

update check config for kernel

add docs for pids limit

add pids stats

add stats to docker client

Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-03-08 07:55:01 -08:00
Mrunal Patel
74bb1ce9e9 Add support for NoNewPrivileges in docker
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>

Add tests for no-new-privileges

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>

Update documentation for no-new-privileges

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2016-03-07 09:47:02 -08:00
David Calavera
1a729c3dd8 Do not wait for container on stop if the process doesn't exist.
This fixes an issue that caused the client to hang forever if the
process died before the code reached the exit of the `Kill` function.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-03-04 16:00:58 -05:00
Brian Goff
d883002fac Merge pull request #20684 from yongtang/13840-follow-symlink
Follow symlink for --device argument.
2016-03-01 12:44:10 -05:00
Yong Tang
7ed569efdc Follow symlink for --device argument.
Fixes: #13840

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2016-03-01 07:16:19 +00:00
Qiang Huang
53b0d62683 Vendor engine-api to 70d266e96080e3c3d63c55a4d8659e00ac1f7e6c
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-02-29 19:28:37 +08:00
David Calavera
a793564b25 Remove static errors from errors package.
Moving all strings to the errors package wasn't a good idea after all.

Our custom implementation of Go errors predates everything that's nice
and good about working with errors in Go. Take as an example what we
have to do to get an error message:

```go
func GetErrorMessage(err error) string {
	switch err.(type) {
	case errcode.Error:
		e, _ := err.(errcode.Error)
		return e.Message

	case errcode.ErrorCode:
		ec, _ := err.(errcode.ErrorCode)
		return ec.Message()

	default:
		return err.Error()
	}
}
```

This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake.

Our custom implementation also makes it very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign of how little we know about the errors we return. I also found several error usages where the number of arguments differed from the parameters declared in the error, another clear example of how difficult it is to reason about errors.

Moreover, our custom implementation didn't really make it easier for people to return a custom HTTP status code depending on the error. Again, it's hard to reason about when to set custom codes and how. Take as an example what we have to do to extract the message and status code from an error before returning a response from the API:

```go
	switch err.(type) {
	case errcode.ErrorCode:
		daError, _ := err.(errcode.ErrorCode)
		statusCode = daError.Descriptor().HTTPStatusCode
		errMsg = daError.Message()

	case errcode.Error:
		// For reference, if you're looking for a particular error
		// then you can do something like :
		//   import ( derr "github.com/docker/docker/errors" )
		//   if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... }

		daError, _ := err.(errcode.Error)
		statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode
		errMsg = daError.Message

	default:
		// This part of will be removed once we've
		// converted everything over to use the errcode package

		// FIXME: this is brittle and should not be necessary.
		// If we need to differentiate between different possible error types,
		// we should create appropriate error types with clearly defined meaning
		errStr := strings.ToLower(err.Error())
		for keyword, status := range map[string]int{
			"not found":             http.StatusNotFound,
			"no such":               http.StatusNotFound,
			"bad parameter":         http.StatusBadRequest,
			"conflict":              http.StatusConflict,
			"impossible":            http.StatusNotAcceptable,
			"wrong login/password":  http.StatusUnauthorized,
			"hasn't been activated": http.StatusForbidden,
		} {
			if strings.Contains(errStr, keyword) {
				statusCode = status
				break
			}
		}
	}
```

You can notice two things in that code:

1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are.
2. At no point did we manage to remove that `switch` statement, which was the original reason for using our custom implementation.

This change removes all our static errors from the errors package and puts them back in their specific contexts.
It puts the messages back with their contexts. That way, we know right away when errors are used and how to generate their messages.
It uses custom interfaces to reason about errors. Errors that need to respond with a custom status code MUST implement this simple interface:

```go
type errorWithStatus interface {
	HTTPErrorStatusCode() int
}
```

This interface is very straightforward to implement. It also preserves the real behavior of Go errors: getting the message is as simple as calling the `Error()` method.

I included helper functions to generate errors that use custom status code in `errors/errors.go`.

By doing this, we remove the hard dependency we have everywhere on our custom errors package. Yes, you can use it as a helper to generate errors, but it's still very easy to generate errors without it.

Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors

Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-02-26 15:49:09 -05:00
Kenfe-Mickael Laventure
f7d4abdc00 Prevent mqueue from implicitly becoming a bind mount with --ipc=host
Currently, when running a container with --ipc=host, if /dev/mqueue is
a standard directory on the host, the daemon will bind mount it, allowing
the container to create/modify files on the host.

This commit forces /dev/mqueue to always be of type mqueue except when
the user explicitly requested something to be bind mounted to
/dev/mqueue.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2016-02-09 14:16:08 -08:00
Dan Walsh
ba38d58659 Make mqueue container specific
mqueue can not be mounted on the host os and then shared into the container.
There is only one mqueue per mount namespace, so current code ends up leaking
the /dev/mqueue from the host into ALL containers.  Since SELinux changes the
label of the mqueue, only the last container is able to use the mqueue, all
other containers will get a permission denied.  If you don't have SELinux protections,
sharing of the /dev/mqueue allows one container to interact in potentially hostile
ways with other containers.

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2016-02-05 16:50:35 +01:00
Zhang Wei
3c0a91d227 Fix error for restarting container
Fix error message for `--net container:b` and `--ipc container:b`,
container `b` is a restarting container.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2016-02-04 20:14:50 +08:00
Alexander Morozov
3b80b1947c Merge pull request #19943 from aboch/ec
Store endpoint config on network connect to a stopped container
2016-02-03 11:06:35 -08:00
Lei Jitang
09a33b5f60 Check nil before set resource.OomKillDisable
Signed-off-by: Lei Jitang <leijitang@huawei.com>
2016-02-03 04:31:00 -05:00
Alessandro Boch
9b63e4e7f5 Store endpoint config on network connect to a stopped container
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-02-02 17:25:44 -08:00
Arnaud Porterie
269a6d7d36 Merge pull request #19705 from mavenugo/18222
Vendor libnetwork v0.6.0-rc4 & corresponding changes in engine for port-map sandbox handling.
2016-01-26 09:16:57 -08:00
Aleksa Sarai
4357ed4a73 *: purge dockerinit from source code
dockerinit has been around for a very long time. It was originally used
as a way for us to do configuration for LXC containers once the
container had started. LXC is no longer supported, and /.dockerinit has
been dead code for quite a while. This removes all code and references
in code to dockerinit.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-01-26 23:47:02 +11:00
Madhu Venugopal
e38463b277 Move port-mapping ownership closer to Sandbox (from Endpoint)
https://github.com/docker/libnetwork/pull/810 provides the more complete
solution for moving the Port-mapping ownership away from endpoint and
into Sandbox. But this PR makes the best use of the existing libnetwork
design and gets a step closer to the goal.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-01-26 03:59:03 -08:00
Vincent Demeester
141a301dca Merge pull request #19154 from hqhq/hq_verify_cgroupparent
Verify cgroup-parent name for systemd cgroup
2016-01-26 11:44:31 +01:00
Sebastiaan van Stijn
5b0183e91c Merge pull request #19683 from calavera/network_config_file
Allow network configuration via daemon config file.
2016-01-25 18:59:20 -08:00
David Calavera
c539be8833 Allow network configuration via daemon config file.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-01-25 18:54:56 -05:00
Alessandro Boch
733245b2e7 Save endpoint config only if endpoint creation succeeds
- Currently it is being saved up front...

Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-01-25 13:43:32 -08:00
Qiang Huang
5ce5a8e966 Verify cgroup-parent name for systemd cgroup
Fixes: #17126

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-01-22 21:17:23 -05:00
Alessandro Boch
3b0d36dbc1 Move ErrUnsupportedNetwork* checks to updateNetworkConfig() func
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-01-21 10:56:01 -08:00
Lei Jitang
6025517b68 Fix #19477, clean up the ports when release network
Signed-off-by: Lei Jitang <leijitang@huawei.com>
2016-01-20 20:09:11 -05:00
Madhu Venugopal
35dbce109b nil ptr check for endpointsettings when used with older clients
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-01-18 17:15:59 -08:00
David Calavera
73a5393bf3 Merge pull request #19242 from mavenugo/nsalias
Network scoped alias support
2016-01-14 10:58:51 -08:00
Tibor Vass
f292e90b8d Merge pull request #19226 from coolljt0725/remove_dup_check
Remove duplication checking for the existence of endpoint to speed up container starting
2016-01-14 12:24:11 -05:00
Madhu Venugopal
dda513ef65 Network scoped alias support
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-01-14 08:44:41 -08:00
Madhu Venugopal
b464f1d78c Forced endpoint cleanup
docker's network disconnect api now supports `Force` option which can be
used to force cleanup an endpoint from any host in the cluster.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-01-13 21:28:52 -08:00
Tibor Vass
46eb470039 Merge pull request #19267 from mavenugo/vin-ln
Vendor libnetwork v0.5.4
2016-01-13 07:09:58 -05:00
Madhu Venugopal
8edbd10349 Updating to the new ep.Delete API
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-01-12 20:42:37 -08:00
Madhu Venugopal
e221b8a3d6 Support --link for user-defined networks
This brings in the container-local alias functionality for containers
connected to user-defined networks.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-01-12 13:38:48 -08:00
Jess Frazelle
c1582f20cc Merge pull request #19243 from calavera/engine_api_0_2
Vendor engine-api 0.2
2016-01-12 13:11:39 -08:00
Jess Frazelle
293b3767c8 Merge pull request #19245 from jfrazelle/seccomp-kernel-check
check seccomp is configured in the kernel
2016-01-12 11:33:27 -08:00
Qiang Huang
f4a687334b Change OomKillDisable to be pointer
It's like `MemorySwappiness`, the default value has specific
meaning (default false means enable oom kill).

We need to change it to pointer so we can update it after
container is created.
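
The reasoning can be sketched in Go as follows. The `Resources` struct here is a hypothetical, trimmed-down stand-in for the real host config type, and `mergeUpdate` and `oomKillDisabled` are illustrative helpers, not functions from the codebase.

```go
package main

import "fmt"

// Resources is a hypothetical stand-in holding only the field discussed
// above. A nil pointer means "not specified", which a plain bool cannot
// express because its zero value (false) already has a meaning.
type Resources struct {
	OomKillDisable *bool
}

// mergeUpdate applies update onto current, touching OomKillDisable only
// when the update actually specifies it.
func mergeUpdate(current *Resources, update Resources) {
	if update.OomKillDisable != nil {
		v := *update.OomKillDisable
		current.OomKillDisable = &v
	}
}

// oomKillDisabled reports the effective setting; the nil check comes
// first so that an unset field reads as the default (OOM kill enabled).
func oomKillDisabled(r Resources) bool {
	return r.OomKillDisable != nil && *r.OomKillDisable
}

func main() {
	yes, no := true, false
	current := Resources{OomKillDisable: &yes}

	// An update that does not mention the field leaves it unchanged.
	mergeUpdate(&current, Resources{})
	fmt.Println("after empty update:", oomKillDisabled(current))

	// An explicit false is distinguishable from "unset" and is applied.
	mergeUpdate(&current, Resources{OomKillDisable: &no})
	fmt.Println("after explicit false:", oomKillDisabled(current))
}
```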

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
(cherry picked from commit 9c2ea42329)

Conflicts:
	vendor/src/github.com/docker/engine-api/types/container/host_config.go
2016-01-12 13:19:17 -05:00
Jessica Frazelle
40d5ced9d0 check seccomp is configured in the kernel
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-01-12 09:45:21 -08:00
Sebastiaan van Stijn
301627c677 Merge pull request #18906 from coolljt0725/connect_to_created
Support network connect/disconnect to stopped container
2016-01-12 07:06:31 -08:00
Lei Jitang
79d4f0f56e Add docker network connect/disconnect to non-running container
Signed-off-by: Lei Jitang <leijitang@huawei.com>
2016-01-11 20:13:39 -05:00