Signed-off-by: John Howard <jhoward@microsoft.com>
This ensures that any compute processes in HCS are cleanedup
during daemon restore. Note Windows cannot (currently) reconnect
to containers on restore.
Instead of a timeout the context is cancelled on error to ensure
proper cleanup of the associated fifos' goroutines.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Keeping the current behavior for exec, i.e., inheriting
variables from main process. New variables will be added
to current ones. If there's already a variable with that
name it will be overwritten.
Example of usage: docker exec -it -e TERM=vt100 <container> top
Closes#24355.
Signed-off-by: Jonh Wendell <jonh.wendell@redhat.com>
The code that handles waiting for the servicing container to complete correctly grabs the exit code and logs a failure, but doesn't return that failure to the caller, mistakenly causing servicing operations to look successful when they really failed during processing.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
During the recent OCI changes, I mistakenly thought LayerFolderPath is only needed for Windows Server containers (isolation=process) and not for Hyper-V Containers, but it turns out it is also required for servicing containers used to finish installing updates. Since the servicing containers need to reuse the container's create options, this change makes it so that LayerFolderPath is always filled in for all containers as part of constructing the create options.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Microsoft will be distributing non-base layers that have utility VM image
updates. Update libcontainerd to use the top-most utility VM image that is
available in the image chain when launching Hyper-V-isolated container.
Signed-off-by: John Starks <jostarks@microsoft.com>
New grpc uses failfast by default, but that code was written with other
default in mind, so just preserve it for now.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Fix issue where environment variables with embedded equals signs were
being dropped and not passed to the container.
Fixes#26178.
Signed-off-by: Matt Richardson <matt.richardson@octopus.com>
There exists a race in container servicing on Windows where, during normal operation, the container will begin to shut itself down while docker calls shutdown explicitly. If the former succeeds just as the latter is attempting to communicate with the container to request the shutdown, an error comes back that can cause the servicing to incorrectly register as a failure. Instead, we just wait for the servicing container to shutdown on it's own, using a reasonable timeout to allow for merging in the updates.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
When there is no event for the container it can happen because of a
crash and the container state on the persistent disk will have a
mismatch between what was in `/run` ( machine crash ).
This situation will create an unkillable container in docker because
containerd does not see it and it is not running but docker thinks it is
and you cannot tell it anything different.
This fixes the issue by checking if containerd has the container running
if we do not have an event instead of just returning.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This was preventing the "exit" event to be correctly processed during
the restore process without live-restore enabled.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
This version introduces the following:
- uses nanosecond timestamps for event
- ensure events are sent once their effect is "live"
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
This adds an `--oom-score-adjust` flag to the daemon so that the value
provided can be set for the docker daemon's process. The default value
for the flag is -500. This will allow the docker daemon to have a
less chance of being killed before containers do. The default value for
processes is 0 with a min/max of -1000/1000.
-500 is a good middle ground because it is less than the default for
most processes and still not -1000 which basically means never kill this
process in an OOM condition on the host machine. The only processes on
my machine that have a score less than -500 are dbus at -900 and sshd
and xfce( my window manager ) at -1000. I don't think docker should be
set lower, by default, than dbus or sshd so that is why I chose -500.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
The Windows TP5 image is not compatible with the Hyper-V isolated
container clone feature. Detect old images and pass a flag specifying that
clone should not be enabled.
Signed-off-by: John Starks <jostarks@microsoft.com>
Right now, if we hit an error retrieving the exit code in HCS process.ExitCode, we return that 0 and that error. Golang convention says that if an error is returned the other values should not be used, but the caller of ExitCode in libcontainerd has to fall through if an error is received. Rather than return a success exit code in that failure case, we should return -1 to indicate a generic failure.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
This flags enables full support of daemonless containers in docker. It
ensures that docker does not stop containers on shutdown or restore and
properly reconnects to the container when restarted.
This is not the default because of backwards compat but should be the
desired outcome for people running containers in prod.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This change adjusts the calling pattern for servcing containers to use waiting on the process instead of expecting start to block. This is safer, as it avoids timeouts in the start code path for the potentially expensive update operation.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
This PR adds support for user-defined health-check probes for Docker
containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus
some corresponding "docker run" options. It can be used with a restart policy
to automatically restart a container if the check fails.
The `HEALTHCHECK` instruction has two forms:
* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)
The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.
When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.
The options that can appear before `CMD` are:
* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)
The health check will first run **interval** seconds after the container is
started, and then again **interval** seconds after each previous check completes.
If a single run of the check takes longer than **timeout** seconds then the check
is considered to have failed.
It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.
There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.
The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).
The command's exit status indicates the health status of the container.
The possible values are:
- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly
If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.
For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:
HEALTHCHECK --interval=5m --timeout=3s \
CMD curl -f http://localhost/ || exit 1
To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).
When the health status of a container changes, a `health_status` event is
generated with the new status. The health status is also displayed in the
`docker ps` output.
Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A previous change added a TTY fixup for stdin on older Windows versions to
work around a Windows issue with backspace/delete behavior. This change
used the OS version to determine whether to activate the behavior.
However, the Windows bug is actually in the image, not the OS, so it
should have used the image's OS version.
This ensures that a Server TP5 container running on Windows 10 will have
reasonable console behavior.
Signed-off-by: John Starks <jostarks@microsoft.com>
This test is not applicable anymore now that containers are not stopped
when the daemon is restored.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
In Windows containers in TP5, DEL is interpreted as the delete key, but
Linux generally interprets it as backspace. This prevents backspace from
working when using a Linux terminal or the native console terminal
emulation in Windows.
To work around this, translate DEL to BS in Windows containers stdin when
TTY is enabled. Do this only for builds that do not have the fix in
Windows itself.
Signed-off-by: John Starks <jostarks@microsoft.com>
Removing the call to Shutdown from within Signal in order to rely on waitExit handling the exit of the process.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
This change enables the workflow of finishing installing Windows OS updates in the container after it has completed running, via a special servicing container.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Restore the 1.10 logic that will reset the restart manager's timeout or
backoff delay if a container executes longer than 10s reguardless of
exit status or policy.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This avoid an extra bind mount within /var/run/docker/libcontainerd
This should resolve situations where a container having the host
/var/run bound prevents other containers from being cleanly removed
(e.g. #21969).
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
RPC connection closing error will be reported every time we shutdown
daemon, this error is expected, so we should remove this error to avoid
confusion to user.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Don't throw "restartmanager canceled" error for no restart policy container
and add the container id to the warning message if a container has restart policy
and has been canceled.
Signed-off-by: Lei Jitang <leijitang@huawei.com>
Currently if you restart docker daemon, all the containers with restart
policy `on-failure` regardless of its `RestartCount` will be started,
this will make daemon cost more extra time for restart.
This commit will stop these containers to do unnecessary start on
daemon's restart.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
When user try to restart a restarting container, docker client report
error: "container is already active", and container will be stopped
instead be restarted which is seriously wrong.
What's more critical is that when user try to start this container
again, it will always fail.
This error can also be reproduced with a `docker stop`+`docker start`.
And this commit will fix the bug.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
- Refactor generic and path based cleanup functions into a single function.
- Include aufs and zfs mounts in the mounts cleanup.
- Containers that receive exit event on restore don't require manual cleanup.
- Make missing sandbox id message a warning because currently sandboxes are always cleared on startup. libnetwork#975
- Don't unmount volumes for containers that don't have base path. Shouldn't be needed after #21372
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Its useful to have containerd logs as part of docker.
Containerd metrics are too chatty, so set interval to 0.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
runc expects a systemd cgroupsPath to be in slice:scopePrefix:containerName
format and the "--systemd-cgroup" option to be set. Update docker accordingly.
Fixes 21475
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
Signed-off-by: John Starks <jostarks@microsoft.com>
Signed-off-by: Darren Stahl <darst@microsoft.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>