During review, it was decided to remove `LimitNOFILE` from `docker.service` to rely on the systemd v240 implicit default of `1024:524288`. On supported platforms with systemd prior to v240, packagers will patch the service with an explicit `LimitNOFILE=1024:524288`.
- `1024` soft limit is an implicit default, avoiding unexpected breakage. Software that needs a higher limit should request to raise the soft limit for its process.
- `524288` hard limit is an implicit default since systemd v240 and is adequate for most processes (_half of the historical limit from `fs.nr_open` of `1048576`_), while 4096 is the implicit default from the kernel (often too low). Individual containers can be started with `--ulimit` when a larger hard limit is required.
- The hard limit may not exceed `fs.nr_open` (_which a value of `infinity` will resolve to_). On most systems with systemd v240 or newer, this will resolve to an excessive size of 2^30 (over 1 billion).
- When set to `infinity` (usually as the soft limit) software may experience significantly increased resource usage, resulting in a performance regression or runtime failures that are difficult to troubleshoot.
- The current OpenRC config approach cannot set different soft and hard limits, as it adjusts additional limits and `ulimit` does not support mixed usage of `-H` + `-S`. A soft limit of `524288` is not ideal, but 2^19 is much less overhead than 2^30, whilst a hard limit of 4096 would be problematic for Docker.
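For packagers on systemd prior to v240, the explicit limit described above could be carried as a drop-in. This is only a sketch; the drop-in path is a hypothetical example and not part of this change:

```ini
# e.g. /etc/systemd/system/docker.service.d/limit-nofile.conf (hypothetical path)
[Service]
# soft:hard, matching the implicit default of systemd v240 and newer
LimitNOFILE=1024:524288
```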
Signed-off-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>
Upstart has been EOL for 8 years and isn't used by any distributions we support any more.
Additionally, this removes the "cgroups v1" setup code because it's more reasonable now for us to expect something _else_ to have set up cgroups appropriately (especially cgroups v2).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Single-board computers and embedded systems might have a clock that is extremely out of sync with reality.
Adding this target ensures docker is only started after a somewhat realistic clock has been set.
More information about the time-set.target can be found here: https://www.freedesktop.org/software/systemd/man/systemd.special.html#time-sync.target
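As a rough sketch, the ordering could look like the fragment below. Any other entries already present in the unit's `After=` list are omitted here:

```ini
[Unit]
# Ordering only: start docker after the clock has been set,
# without pulling the target in by itself
After=time-set.target
```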
Signed-off-by: Michael Kuehn <micha@kuehn.io>
Per the systemd.unit documentation:
> If this unit gets activated, the units listed will be activated as well. If one of the other units fails to activate, and an ordering dependency After= on the failing unit is set, this unit will not be started. Besides, with or without specifying After=, this unit will be stopped if one of the other units is explicitly stopped.
>
> Often, it is a better choice to use Wants= instead of Requires= in order to achieve a system that is more robust when dealing with failing services.
This should also be generally "safe" given we added `--containerd=/run/containerd/containerd.sock` to the flags we pass to `dockerd`.
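A minimal sketch of the resulting dependency, assuming the ordering on containerd is kept as it was:

```ini
[Unit]
# Still start after containerd, but do not stop docker
# when containerd is stopped or fails
After=containerd.service
Wants=containerd.service
```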
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Signed-off-by: Anca Iordache <anca.iordache@docker.com>
This unit file was created when we packaged rpms without the
socket activation unit, but that's no longer the case.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This reverts commit 0ca7456e52,
which caused the docker service to not start, or to start with
a delay, under certain conditions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This reverts commit a65c65d801,
which caused the docker service to not start, or to start with
a delay, under certain conditions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
relates to https://github.com/docker/for-linux/issues/678
When using the BindsTo directive, Docker is permanently stopped by systemd
when containerd is temporarily killed and restarted.
Using `Requires` achieves mostly the same, but defines a weaker dependency:
https://www.freedesktop.org/software/systemd/man/systemd.unit.html#Requires=
> Requires=
>
> .. If this unit gets activated, the units listed will be activated as well.
> If one of the other units fails to activate, and an ordering dependency
> After= on the failing unit is set, this unit will not be started. Besides,
> with or without specifying After=, this unit will be stopped if one of the
> other units is explicitly stopped.
We may want to look into using `Wants=` instead of `Requires=`, because
that allows docker to continue running if containerd is restarted, quoting
the systemd documentation:
> Often, it is a better choice to use Wants= instead of Requires= in order
> to achieve a system that is more robust when dealing with failing services.
Given that docker will likely still fail if the containerd socket is not
present, startup will fail if containerd is not running, but if containerd
is restarted, the docker daemon may be able to try reconnecting.
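For illustration, the change amounts to something like the following, assuming the existing `After=` ordering on containerd stays in place:

```ini
[Unit]
After=containerd.service
# Previously BindsTo=containerd.service, which stopped docker for good
# whenever containerd went away, even briefly
Requires=containerd.service
```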
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
dockerd currently sets the oom-score-adjust itself. This functionality
was added when we did not yet run dockerd as a systemd service.
Now that we do, it's better to instead have systemd handle this.
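A sketch of what this looks like in the unit file. The value shown is an assumption for illustration; any negative score makes the kernel OOM killer less likely to pick the daemon:

```ini
[Service]
# Assumed value; valid range is -1000 to 1000
OOMScoreAdjust=-500
```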
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add multi-user.target to the After= list in docker.service so that multi-user.target does not wait for docker.service (and consequently wait for network-online.target).
Signed-off-by: Isaiah Grace <irgkenya4@gmail.com>
We were not really using these, and they haven't been
updated in a long time. If needed, we can add people to
the CODEOWNERS file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
PartOf deactivates the socket whenever the service gets deactivated.
The socket unit however should be active nevertheless, so that the
docker service can be started again through socket activation.
Based on the original patch in upstream moby/moby by Max Harmathy.
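Roughly, the socket unit then looks like the sketch below, with no `PartOf=docker.service` in `[Unit]`. The socket path, mode, and group are assumptions for illustration:

```ini
[Unit]
Description=Docker Socket for the API
# No PartOf=docker.service here, so stopping the service
# leaves the socket listening for the next activation

[Socket]
ListenStream=/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target
```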
Co-authored-by: Max Harmathy <max.harmathy@web.de>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
* Use rc_ulimit for ulimit constraints
* Synchronize ulimit settings to systemd's
* Add support for reload command
* Add support for retry settings for docker stop/restart
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
containerd is now running as a separate service, and should
no longer be started as a managed child-process of dockerd.
The dockerd service already specifies that it should be started
`After` the containerd.service, but there is still a race
condition, where containerd is started, but its socket is not yet
created.
In that situation, `dockerd` detects that the containerd socket
is missing, and will start a new instance of containerd (as a
managed child-process), which causes live-restore to fail.
This patch explicitly sets the `--containerd` daemon option.
If this option is set, `dockerd` will not start a new instance
of containerd.
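A minimal sketch of the resulting command line. The binary path and the `-H fd://` flag are assumptions about the rest of the unit:

```ini
[Service]
# With --containerd set, dockerd connects to the given socket
# instead of starting its own managed containerd
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
```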
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Without this, docker.socket would not start by default when starting
docker.service, leading to failures to start.
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Removes the systemd drop-in unit file for socket activation and instead
prefers socket activation by default for both RHEL-based and Debian-based
distributions.
Socket activation for RHEL-based distributions was tested on CentOS 7 and Fedora 28.
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Set the PATH to what appears to be the standard on latest Ubuntu (18.04)
and Debian (9), fixing the following two issues:
1. PATH did not contain /bin (leading to ContainerTop/ps not working
on newer distros, among other things).
2. $PATH can't be specified in Environment directives in .service files.
While at it, also:
3. Remove the comment about RPM as it looks misleading on deb-based
systems.
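As an illustration of point 2, the directive has to spell out the full value, since systemd does not expand `$PATH` there. The exact list of directories below is an assumption:

```ini
[Service]
# Literal value; something like PATH=/bin:$PATH would not be expanded
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
```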
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Removes the need for the offline installer to install the shim process
and instead installs the shim process as part of the packaging.
It may be easier in the future to just package the shim process on its own,
but that will come after this 18.09 release.
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229
(6bf0f408e4).
Both the old and new locations are accepted by systemd 229 and up, so the old location is used
to make them work with either version of systemd.
StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230
(f0367da7d1).
Both the old and new names are accepted by systemd 230 and up, so the old name is used to make
this option work with either version of systemd.
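A sketch of the compatible form, using the old section and the old option name. The values are assumptions for illustration:

```ini
[Service]
# Old location ([Service]) and old name (StartLimitInterval),
# accepted by both older and newer systemd
StartLimitBurst=3
StartLimitInterval=60s
```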
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds support for reloading the docker daemon
(SIGHUP) so that changes in '/etc/docker/daemon.json'
can be loaded at runtime by reloading the service
through systemd ('systemctl reload docker')
Before this change, systemd would output an error
that "reloading" is not supported for the docker
service:
    systemctl reload docker
    Failed to reload docker.service: Job type reload is not applicable for unit docker.service.
After this change, the docker daemon can be reloaded
through 'systemctl reload docker', which reloads
the configuration:
    journalctl -f -u docker.service
    May 02 03:49:20 testing systemd[1]: Reloading Docker Application Container Engine.
    May 02 03:49:20 testing docker[28496]: time="2016-05-02T03:49:20.143964103-04:00" level=info msg="Got signal to reload configuration, reloading from: /etc/docker/daemon.json"
    May 02 03:49:20 testing systemd[1]: Reloaded Docker Application Container Engine.
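One common way to wire this up is sketched below. The path to `kill` is an assumption and may differ per distribution:

```ini
[Service]
# systemd substitutes $MAINPID with the PID of the main daemon process
ExecReload=/bin/kill -s HUP $MAINPID
```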
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Change the kill mode to process so that, when the daemon is shut down,
systemd only kills the docker daemon and not the container processes.
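In unit-file terms, a minimal sketch:

```ini
[Service]
# Only the main process is killed on stop; other processes
# in the unit's cgroup are left running
KillMode=process
```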
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We need to add Delegate=yes to docker's service file so that it can
manage the cgroups of the processes that it launches without systemd
interfering with them and moving the processes after it is reloaded.
Delegate=
Turns on delegation of further resource control partitioning to
processes of the unit. For unprivileged services (i.e. those
using the User= setting), this allows processes to create a
subhierarchy beneath its control group path. For privileged
services and scopes, this ensures the processes will have all
control group controllers enabled.
This is the proper fix for issue moby/moby#20152
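A minimal sketch of the setting:

```ini
[Service]
# Let docker manage the cgroups of the processes it launches
Delegate=yes
```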
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Systemd sets a default of 512 tasks, which is far
too low to run many containers.
Note that TasksMax is only supported on systemd 226
and above.
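A sketch of lifting the limit. The value `infinity` is an assumption here; a large explicit number would work as well:

```ini
[Service]
# Only understood by systemd 226 and above
TasksMax=infinity
```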
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There is a not-insignificant performance overhead for all containers (if
containerd is a child of Docker, which is the current setup) if systemd
sets rlimits on the main Docker daemon process (because the limits
propagate to all children).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Set LimitCORE=infinity to ensure complete core dumps are created,
which allows extraction of as much information as possible.
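A minimal sketch of the setting:

```ini
[Service]
# No size cap on core dumps written by the daemon
LimitCORE=infinity
```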
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
(cherry picked from commit 51879873897afe298cbb736acef34b5a0b500424)
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
Old versions of things on CentOS 7 strike again!
infinity is not a thing for TimeoutSec on systemd < 229
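One spelling that should work on both older and newer systemd is sketched below, assuming `0` is read as "no timeout" by the versions in question:

```ini
[Service]
# 0 disables the timeout logic; "infinity" is only accepted on systemd 229 and up
TimeoutSec=0
```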
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
PartOf deactivates the socket whenever the service gets deactivated. The socket unit however should be active nevertheless.
Signed-off-by: Max Harmathy <max.harmathy@web.de>