daemon: reinit health monitor on live-restore

The container may have been running without health probes for an
indeterminate amount of time. The container may have become unhealthy in
the interim. We should probe it sooner than in steady-state, while also
giving it some leeway to recover from e.g. timed-out connections. This
is easy to achieve by probing the container like a freshly-started one.
The original author of health-checks came to the same conclusion; the
health monitor was reinitialized on live-restored containers before
v17.11.0, when health monitoring of live-restored containers was
accidentally broken. Revert to the original behavior.

Signed-off-by: Cory Snider <csnider@mirantis.com>
Author: Cory Snider <csnider@mirantis.com>
Date:   2024-01-08 19:32:21 -05:00
Parent: 6b1baf8dd2
Commit: 0e62dbadcd
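
What follows is a minimal, self-contained sketch of why re-initializing the monitor differs from merely updating it. The types, fields, and function bodies are hypothetical stand-ins, not the daemon's actual implementation (the real initHealthMonitor and updateHealthMonitor operate on the daemon's container state); the sketch only illustrates how resetting the health state makes a live-restored container get the start-up probing schedule and grace period again.

// Sketch only: hypothetical types standing in for the daemon's internals,
// to illustrate why re-initializing the monitor (rather than merely
// updating it) treats a live-restored container like a freshly started one.
package main

import (
	"fmt"
	"time"
)

// healthState is a simplified stand-in for a container's health-check state.
type healthState struct {
	Status        string // "starting", "healthy", or "unhealthy"
	FailingStreak int    // consecutive probe failures
}

// probeConfig is a simplified stand-in for the health-check configuration.
type probeConfig struct {
	Interval      time.Duration // steady-state time between probes
	StartInterval time.Duration // (assumed) shorter interval used while starting
	StartPeriod   time.Duration // grace period during which failures don't count
}

// initHealthMonitor models re-initializing: the health state is reset as if
// the container had just started, so it is probed on the start-up schedule
// (sooner than steady-state) and gets the start-period grace to recover.
func initHealthMonitor(h *healthState, cfg probeConfig) {
	h.Status = "starting"
	h.FailingStreak = 0
	fmt.Printf("probing every %v for up to %v, then every %v\n",
		cfg.StartInterval, cfg.StartPeriod, cfg.Interval)
}

// updateHealthMonitor models merely resuming the probe loop: the container
// keeps whatever status and failing streak it had before the daemon went
// away, even though no probes ran in the interim.
func updateHealthMonitor(h *healthState, cfg probeConfig) {
	fmt.Printf("resuming steady-state probes every %v with status=%q streak=%d\n",
		cfg.Interval, h.Status, h.FailingStreak)
}

func main() {
	h := &healthState{Status: "healthy"}
	cfg := probeConfig{
		Interval:      30 * time.Second,
		StartInterval: 5 * time.Second,
		StartPeriod:   60 * time.Second,
	}

	updateHealthMonitor(h, cfg) // pre-change behavior on live-restore
	initHealthMonitor(h, cfg)   // post-change behavior on live-restore
}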


@@ -442,7 +442,7 @@ func (daemon *Daemon) restore(cfg *configStore) error {
c.Lock()
c.Paused = false
daemon.setStateCounter(c)
-	daemon.updateHealthMonitor(c)
+	daemon.initHealthMonitor(c)
if err := c.CheckpointTo(daemon.containersReplica); err != nil {
baseLogger.WithError(err).Error("failed to update paused container state")
}
@@ -451,7 +451,7 @@ func (daemon *Daemon) restore(cfg *configStore) error {
case !c.IsPaused() && alive:
logger(c).Debug("restoring healthcheck")
c.Lock()
-	daemon.updateHealthMonitor(c)
+	daemon.initHealthMonitor(c)
c.Unlock()
}
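
As a rough, after-the-fact check of the new behavior, one could inspect a live-restored container over the Docker API and confirm its health status has been reset to "starting" instead of carrying over the pre-restart value. The snippet below is a hypothetical verification aid, not part of this change; it assumes a daemon running with live-restore enabled and a container that defines a HEALTHCHECK, and uses the Docker Go SDK (github.com/docker/docker/client).

// Rough check: after the daemon restarts with live-restore, a container with
// a healthcheck should report "starting" again until its probes pass, rather
// than keeping its stale pre-restart status.
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/docker/docker/client"
)

func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: health-status <container>")
	}
	containerID := os.Args[1] // ID or name of the live-restored container

	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	info, err := cli.ContainerInspect(context.Background(), containerID)
	if err != nil {
		log.Fatal(err)
	}
	if info.State == nil || info.State.Health == nil {
		log.Fatal("container has no healthcheck configured")
	}

	// With this change, a just-restored container reports "starting" until its
	// probes pass again; before the fix it kept its last pre-restart status.
	fmt.Printf("health status: %s (failing streak: %d)\n",
		info.State.Health.Status, info.State.Health.FailingStreak)
}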