Commit graph

89 commits

Author SHA1 Message Date
Sebastiaan van Stijn
10ce0d84c2
pkg/sysinfo.New() move v1 code to a newV1() function
This makes it clearer that this code is the cgroups v1 equivalent of newV2().

Also moves the "options" handling to newV2() because it's currently only used
for cgroupsv2.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-14 16:36:56 +02:00
Sebastiaan van Stijn
472f21b923
replace uses of deprecated containerd/sys.RunningInUserNS()
This utility was moved to a separate package, which has no
dependencies.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-18 11:01:24 +02:00
Jakub Guzik
7ef6ece774 Fix setting swaplimit=true without checking
Signed-off-by: Jakub Guzik <jakubmguzik@gmail.com>
2021-06-03 22:54:10 +02:00
Sebastiaan van Stijn
6458f750e1
use containerd/cgroups to detect cgroups v2
libcontainer does not guarantee a stable API, and is not intended
for external consumers.

this patch replaces some uses of libcontainer/cgroups with
containerd/cgroups.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-11-09 15:00:32 +01:00
Brian Goff
df7031b669 Memoize seccomp value for SysInfo
As it turns out, we call this function every time someone calls `docker
info`, every time a contianer is created, and every time a container is
started.
Certainly this should be refactored as a whole, but for now, memoize the
seccomp value.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-09-11 22:48:46 +00:00
Kir Kolyshkin
afbeaf6f29 pkg/sysinfo: rm duplicates
The CPU CFS cgroup-aware scheduler is one single kernel feature, not
two, so it does not make sense to have two separate booleans
(CPUCfsQuota and CPUCfsPeriod). Merge these into CPUCfs.

Same for CPU realtime.

For compatibility reasons, /info stays the same for now.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-06-26 16:19:52 -07:00
Sebastiaan van Stijn
66bb1c4644
pkg/sysinfo: use containerd/sys to detect UserNamespaces
The implementation in libcontainer/system is quite complicated,
and we only use it to detect if user-namespaces are enabled.

In addition, the implementation in containerd uses a sync.Once,
so that detection (and reading/parsing `/proc/self/uid_map`) is
only performed once.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-06-15 13:07:48 +02:00
Kir Kolyshkin
d5da7e5330 pkg/sysinfo/sysinfo_linux.go: fix some comments
Some were misleading or vague, some were plain wrong.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-05-22 13:13:27 -07:00
Kir Kolyshkin
f02a53d6b9 pkg/sysinfo.applyPIDSCgroupInfo: optimize
For some reason, commit 69cf03700f chose not to use information
already fetched, and called cgroups.FindCgroupMountpoint() instead.
This is not a cheap call, as it has to parse the whole nine yards
of /proc/self/mountinfo, and the info which it tries to get (whether
the pids controller is present) is already available from cgMounts map.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-05-22 13:13:27 -07:00
Akihiro Suda
f350b53241 cgroup2: implement docker info
ref: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-17 07:20:01 +09:00
Sebastiaan van Stijn
9f0b3f5609
bump gotest.tools v3.0.1 for compatibility with Go 1.14
full diff: https://github.com/gotestyourself/gotest.tools/compare/v2.3.0...v3.0.1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 00:06:42 +01:00
Akihiro Suda
409bbdc321 cgroup2: enable resource limitation
enable resource limitation by disabling cgroup v1 warnings

resource limitation still doesn't work with rootless mode (even with systemd mode)

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-01 02:58:40 +09:00
Tobias Klauser
ba8a15694a Use functions from x/sys/unix to get number of CPUs on Linux
Use Getpid and SchedGetaffinity from golang.org/x/sys/unix to get the
number of CPUs in numCPU on Linux.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2019-06-18 19:26:56 +02:00
Rob Gulewich
256eb04d69 Start containers in their own cgroup namespaces
This is enabled for all containers that are not run with --privileged,
if the kernel supports it.

Fixes #38332

Signed-off-by: Rob Gulewich <rgulewich@netflix.com>
2019-05-07 10:22:16 -07:00
Sebastiaan van Stijn
53460047e4
Refactor pkg/sysinfo
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-02-04 00:38:12 +01:00
Akihiro Suda
ec87479b7e allow running dockerd in an unprivileged user namespace (rootless mode)
Please refer to `docs/rootless.md`.

TLDR:
 * Make sure `/etc/subuid` and `/etc/subgid` contain the entry for you
 * `dockerd-rootless.sh --experimental`
 * `docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...`

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2019-02-04 00:24:27 +09:00
Andrew Hsu
78045a5419
use empty string as cgroup path to grab first find
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-07 18:44:00 +01:00
Yong Tang
f023816608 Add memory.kernelTCP support for linux
This fix tries to address the issue raised in 37038 where
there were no memory.kernelTCP support for linux.

This fix add MemoryKernelTCP to HostConfig, and pass
the config to runtime-spec.

Additional test case has been added.

This fix fixes 37038.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-11-26 21:03:08 +00:00
Justin Cormack
f8e876d761
Fix denial of service with large numbers in cpuset-cpus and cpuset-mems
Using a value such as `--cpuset-mems=1-9223372036854775807` would cause
`dockerd` to run out of memory allocating a map of the values in the
validation code. Set limits to the normal limit of the number of CPUs,
and improve the error handling.

Reported by Huawei PSIRT.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-10-05 15:09:02 +02:00
Vincent Demeester
3845728524
Update tests to use gotest.tools 👼
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
2018-06-13 09:04:30 +02:00
Daniel Nephin
6be0f70983 Automated migration using
gty-migrate-from-testify --ignore-build-tags

Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-03-16 11:03:43 -04:00
Daniel Nephin
4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
Sebastiaan van Stijn
6ed1163c98
Remove redundant build-tags
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-12-18 17:41:53 +01:00
Yong Tang
4785f1a7ab Remove solaris build tag and `contrib/mkimage/solaris
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2017-11-02 00:01:46 +00:00
Michael Crosby
5a9b5f10cf Remove solaris files
For obvious reasons that it is not really supported now.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-10-24 15:39:34 -04:00
Derek McGowan
1009e6a40b
Update logrus to v1.0.1
Fixes case sensitivity issue

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2017-07-31 13:16:46 -07:00
Tobias Klauser
6c9d715a8c sysinfo: use Prctl() from x/sys/unix
Use unix.Prctl() instead of manually reimplementing it using
unix.RawSyscall. Also use unix.SECCOMP_MODE_FILTER instead of locally
defining it.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2017-07-17 10:37:42 +02:00
Christopher Jones
069fdc8a08
[project] change syscall to /x/sys/unix|windows
Changes most references of syscall to golang.org/x/sys/
Ones aren't changes include, Errno, Signal and SysProcAttr
as they haven't been implemented in /x/sys/.

Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>

[s390x] switch utsname from unsigned to signed

per 33267e036f
char in s390x in the /x/sys/unix package is now signed, so
change the buildtags

Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
2017-07-11 08:00:32 -04:00
Naveed Jamil
6181bd7a7c Increased Unit Test Coverage for PKG/SYSINFO
Signed-off-by: Naveed Jamil <naveed.jamil@tenpearls.com>
2017-06-09 12:06:12 +05:00
Brian Goff
91e0141784 Merge pull request #33485 from tpot/33484-fix-file-perm-octal
Use octal values for file mode in filenotify poller and sysinfo_linux tests
2017-06-02 11:15:42 -04:00
Tim Potter
c35ea6b2cd Use octal values for file mode in filenotify poller and sysinfo_linux tests
Closes #33484.

Signed-off-by: Tim Potter <tpot@hpe.com>
2017-06-02 12:27:10 +10:00
Tim Potter
cd457e7885 Fix incorrect assert message in TestReadProcBool
Closes #33480.

Signed-off-by: Tim Potter <tpot@hpe.com>
2017-06-02 10:15:58 +10:00
Doug Davis
ff42a2eb41 Only show global warnings once
Upon each container create I'm seeing these warning **every** time in the
daemon output:
```
WARN[0002] Your kernel does not support swap memory limit
WARN[0002] Your kernel does not support cgroup rt period
WARN[0002] Your kernel does not support cgroup rt runtime
```
Showing them for each container.create() fills up the logs and encourages
people to ignore the output being generated - which means its less likely
they'll see real issues when they happen.  In short, I don't think we
need to show these warnings more than once, so let's only show these
warnings at daemon start-up time.

Signed-off-by: Doug Davis <dug@us.ibm.com>
2016-11-30 10:11:42 -08:00
Darren Stahl
22c83c567f Swap usage of LazyDLL and LoadDLL to LazySystemDLL.
Signed-off-by: Darren Stahl <darst@microsoft.com>
2016-11-22 14:57:11 -08:00
Erik St. Martin
56f77d5ade Implementing support for --cpu-rt-period and --cpu-rt-runtime so that
containers may specify these cgroup values at runtime. This will allow
processes to change their priority to real-time within the container
when CONFIG_RT_GROUP_SCHED is enabled in the kernel. See #22380.

Also added sanity checks for the new --cpu-rt-runtime and --cpu-rt-period
flags to ensure that that the kernel supports these features and that
runtime is not greater than period.

Daemon will support a --cpu-rt-runtime flag to initialize the parent
cgroup on startup, this prevents the administrator from alotting runtime
to docker after each restart.

There are additional checks that could be added but maybe too far? Check
parent cgroups to ensure values are <= parent, inspecting rtprio ulimit
and issuing a warning.

Signed-off-by: Erik St. Martin <alakriti@gmail.com>
2016-10-26 11:33:06 -04:00
Kenfe-Mickael Laventure
7e12c3bb99 Update containerd and runc
containerd: 837e8c5e1cad013ed57f5c2090c8591c10cbbdae
runc: 02f8fa7863dd3f82909a73e2061897828460d52f

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2016-10-05 14:47:15 -07:00
allencloud
bd11a269bc make more pkgs support darwin
Signed-off-by: allencloud <allen.sun@daocloud.io>
2016-08-10 22:56:05 +08:00
Yong Tang
f3d8658fff Fix s390x build failure by removing golang.org/x/sys
This fix tries to fix the issue raised in #24168 where
golang.org/x/sys causes s390x build failure.

This fix removed the import of "golang.org/x/sys/unix".

This fix fixes #24168.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2016-06-30 06:32:32 -07:00
Yong Tang
3707a76921 Fix wrong CPU count after CPU hot-plugging on Windows
This fix tries to fix wrong CPU count after CPU hot-plugging.
On windows, GetProcessAffinityMask has been used to probe the
number of CPUs in real time.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2016-06-25 22:20:29 -07:00
Yong Tang
8b2383f5c1 Fix wrong CPU count after CPU hot-plugging
This fix tries to address issues raised in #23768 where the CPU count
is not updated after cpu ho-plugging.

This fix follows the suggestion from #23768 and replace go's `runtime.NumCPU()`
with `sysconf(_SC_NPROCESSORS_ONLN)` so that correct CPU count could
be obtained even after CPU hot-plugging.

This fix is tested manually, as is suggested in #23768.

This fix fixes #23768.

The NumCPU() in Linux is based on @wmark 's implementation.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2016-06-25 20:48:36 -07:00
Antonio Murdaca
44ccbb317c *: fix logrus.Warn[f]
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-06-11 19:42:38 +02:00
Amit Krishnan
86d8758e2b Get the Docker Engine to build clean on Solaris
Signed-off-by: Amit Krishnan <krish.amit@gmail.com>
2016-05-23 16:37:12 -07:00
Jessica Frazelle
69cf03700f
pids limit support
update bash commpletion for pids limit

update check config for kernel

add docs for pids limit

add pids stats

add stats to docker client

Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-03-08 07:55:01 -08:00
Christy Perez
5b3fc7aab2 Match case for variables in sysinfo pkg
I noticied an inconsistency when reviewing docker/pull/20692.

Changing Ip to IP and Nf to NF.

More info: The golang folks recommend that you keep the initials consistent:
https://github.com/golang/go/wiki/CodeReviewComments#initialisms.

Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>
2016-03-01 10:37:05 -06:00
Alexander Morozov
781a33b6e7 Reuse subsystems mountpoints between checks
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2016-01-20 19:20:59 -08:00
Jessica Frazelle
40d5ced9d0
check seccomp is configured in the kernel
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-01-12 09:45:21 -08:00
Ma Shimiao
843084b08b Add support for blkio read/write iops device
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2015-12-21 09:14:49 +08:00
Justas Brazauskas
927b334ebf Fix typos found across repository
Signed-off-by: Justas Brazauskas <brazauskasjustas@gmail.com>
2015-12-13 18:04:12 +02:00
Ma Shimiao
3f15a055e5 Add support for blkio read/write bps device
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2015-12-04 09:26:03 +08:00
Ma Shimiao
0fbfa1449d Add support for blkio.weight_device
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2015-11-11 23:06:36 +08:00