Fix racey TestHealthKillContainer

Before this change if you assume that things work the way the test
expects them to (it does not, but lets assume for now) we aren't really
testing anything because we are testing that a container is healthy
before and after we send a signal. This will give false positives even
if there is a bug in the underlying code. Sending a signal can take any
amount of time to cause a container to exit or to trigger healthchecks
to stop or whatever.

Now lets remove the assumption that things are working as expected,
because they are not.
In this case, `top` (which is what is running in the container) is
actually exiting when it receives `USR1`.
This totally invalidates the test.

We need more control and knowledge as to what is happening in the
container to properly test this.
This change introduces a custom script which traps `USR1` and flips the
health status each time the signal is received.
We then send the signal twice so that we know the change has occurred
and check that the value has flipped so that we know the change has
actually occurred.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 27ba755f70)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit is contained in:
Brian Goff 2021-10-21 19:27:07 +00:00 committed by Sebastiaan van Stijn
parent 8ab80e4a20
commit acb4f263b3
No known key found for this signature in database
GPG key ID: 76698F39D527CE8C

View file

@ -43,8 +43,32 @@ func TestHealthKillContainer(t *testing.T) {
client := testEnv.APIClient()
id := container.Run(ctx, t, client, func(c *container.TestContainerConfig) {
cmd := `
# Set the initial HEALTH value so the healthcheck passes
HEALTH="1"
echo $HEALTH > /health
# Any time doHealth is run we flip the value
# This lets us use kill signals to determine when healtchecks have run.
doHealth() {
case "$HEALTH" in
"0")
HEALTH="1"
;;
"1")
HEALTH="0"
;;
esac
echo $HEALTH > /health
}
trap 'doHealth' USR1
while true; do sleep 1; done
`
c.Config.Cmd = []string{"/bin/sh", "-c", cmd}
c.Config.Healthcheck = &containertypes.HealthConfig{
Test: []string{"CMD-SHELL", "sleep 1"},
Test: []string{"CMD-SHELL", `[ "$(cat /health)" = "1" ]`},
Interval: time.Second,
Retries: 5,
}
@ -57,6 +81,13 @@ func TestHealthKillContainer(t *testing.T) {
err := client.ContainerKill(ctx, id, "SIGUSR1")
assert.NilError(t, err)
ctxPoll, cancel = context.WithTimeout(ctx, 30*time.Second)
defer cancel()
poll.WaitOn(t, pollForHealthStatus(ctxPoll, client, id, "unhealthy"), poll.WithDelay(100*time.Millisecond))
err = client.ContainerKill(ctx, id, "SIGUSR1")
assert.NilError(t, err)
ctxPoll, cancel = context.WithTimeout(ctx, 30*time.Second)
defer cancel()
poll.WaitOn(t, pollForHealthStatus(ctxPoll, client, id, "healthy"), poll.WithDelay(100*time.Millisecond))