Please refer to `docs/rootless.md`.
TLDR:
* Make sure `/etc/subuid` and `/etc/subgid` contain the entry for you
* `dockerd-rootless.sh --experimental`
* `docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...`
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This fix tries to address the issue raised in 37038 where
there were no memory.kernelTCP support for linux.
This fix add MemoryKernelTCP to HostConfig, and pass
the config to runtime-spec.
Additional test case has been added.
This fix fixes 37038.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Using a value such as `--cpuset-mems=1-9223372036854775807` would cause
`dockerd` to run out of memory allocating a map of the values in the
validation code. Set limits to the normal limit of the number of CPUs,
and improve the error handling.
Reported by Huawei PSIRT.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use unix.Prctl() instead of manually reimplementing it using
unix.RawSyscall. Also use unix.SECCOMP_MODE_FILTER instead of locally
defining it.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Changes most references of syscall to golang.org/x/sys/
Ones aren't changes include, Errno, Signal and SysProcAttr
as they haven't been implemented in /x/sys/.
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
[s390x] switch utsname from unsigned to signed
per 33267e036f
char in s390x in the /x/sys/unix package is now signed, so
change the buildtags
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
Upon each container create I'm seeing these warning **every** time in the
daemon output:
```
WARN[0002] Your kernel does not support swap memory limit
WARN[0002] Your kernel does not support cgroup rt period
WARN[0002] Your kernel does not support cgroup rt runtime
```
Showing them for each container.create() fills up the logs and encourages
people to ignore the output being generated - which means its less likely
they'll see real issues when they happen. In short, I don't think we
need to show these warnings more than once, so let's only show these
warnings at daemon start-up time.
Signed-off-by: Doug Davis <dug@us.ibm.com>
containers may specify these cgroup values at runtime. This will allow
processes to change their priority to real-time within the container
when CONFIG_RT_GROUP_SCHED is enabled in the kernel. See #22380.
Also added sanity checks for the new --cpu-rt-runtime and --cpu-rt-period
flags to ensure that that the kernel supports these features and that
runtime is not greater than period.
Daemon will support a --cpu-rt-runtime flag to initialize the parent
cgroup on startup, this prevents the administrator from alotting runtime
to docker after each restart.
There are additional checks that could be added but maybe too far? Check
parent cgroups to ensure values are <= parent, inspecting rtprio ulimit
and issuing a warning.
Signed-off-by: Erik St. Martin <alakriti@gmail.com>
This fix tries to fix the issue raised in #24168 where
golang.org/x/sys causes s390x build failure.
This fix removed the import of "golang.org/x/sys/unix".
This fix fixes#24168.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix tries to fix wrong CPU count after CPU hot-plugging.
On windows, GetProcessAffinityMask has been used to probe the
number of CPUs in real time.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix tries to address issues raised in #23768 where the CPU count
is not updated after cpu ho-plugging.
This fix follows the suggestion from #23768 and replace go's `runtime.NumCPU()`
with `sysconf(_SC_NPROCESSORS_ONLN)` so that correct CPU count could
be obtained even after CPU hot-plugging.
This fix is tested manually, as is suggested in #23768.
This fix fixes#23768.
The NumCPU() in Linux is based on @wmark 's implementation.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
I noticied an inconsistency when reviewing docker/pull/20692.
Changing Ip to IP and Nf to NF.
More info: The golang folks recommend that you keep the initials consistent:
https://github.com/golang/go/wiki/CodeReviewComments#initialisms.
Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>
Before this patch libcontainer badly errored out with `invalid
argument` or `numerical result out of range` while trying to write
to cpuset.cpus or cpuset.mems with an invalid value provided.
This patch adds validation to --cpuset-cpus and --cpuset-mems flag along with
validation based on system's available cpus/mems before starting a container.
Signed-off-by: Antonio Murdaca <runcom@linux.com>
sysinfo struct was initialized at daemon startup to make sure
kernel configs such as device cgroup are present and error out if not.
The struct was embedded in daemon struct making impossible to detect
if some system config is changed at daemon runtime (i.e. someone
umount the memory cgroup). This leads to container's starts failure if
some config is changed at daemon runtime.
This patch moves sysinfo out of daemon and initilize and check it when
needed (daemon startup, containers creation, contaienrs startup for
now).
Signed-off-by: Antonio Murdaca <runcom@linux.com>
(cherry picked from commit 472b6f66e0)
Carried: #14015
If kernel is compiled with CONFIG_FAIR_GROUP_SCHED disabled cpu.shares
doesn't exist.
If kernel is compiled with CONFIG_CFQ_GROUP_IOSCHED disabled blkio.weight
doesn't exist.
If kernel is compiled with CONFIG_CPUSETS disabled cpuset won't be
supported.
We need to handle these conditions by checking sysinfo and verifying them.
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Replaced github.com/docker/libcontainer with
github.com/opencontainers/runc/libcontaier.
Also I moved AppArmor profile generation to docker.
Main idea of this update is to fix mounting cgroups inside containers.
After updating docker on CI we can even remove dind.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Memory swappiness option takes 0-100, and helps to tune swappiness
behavior per container.
For example, When a lower value of swappiness is chosen
the container will see minimum major faults. When no value is
specified for memory-swappiness in docker UI, it is inherited from
parent cgroup. (generally 60 unless it is changed).
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
The cleanup to sysinfo package introduced a regression.
If memory cgroup isn't supported and --memory is specified when
starting a container, we should return info instead of nil in
checkCgroupMem(), otherwise we'll access a nil pointer.
Signed-off-by: Zefan Li <lizefan@huawei.com>