Jelajahi Sumber

Delete stale containerd object on start failure

containerd has two objects with regard to containers.
There is a "container" object which is metadata and a "task" which is
manging the actual runtime state.

When docker starts a container, it creartes both the container metadata
and the task at the same time. So when a container exits, docker deletes
both of these objects as well.

This ensures that if, on start, when we go to create the container metadata object
in containerd, if there is an error due to a name conflict that we go
ahead and clean that up and try again.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 5ba30cd1dc6000ee53b34f628cbff91d7f6d7231)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Brian Goff 6 tahun lalu
induk
melakukan
1d0353548a
1 mengubah file dengan 15 tambahan dan 2 penghapusan
  1. 15 2
      daemon/start.go

+ 15 - 2
daemon/start.go

@@ -176,9 +176,22 @@ func (daemon *Daemon) containerStart(container *container.Container, checkpoint
 		return err
 	}
 
-	err = daemon.containerd.Create(context.Background(), container.ID, spec, createOptions)
+	ctx := context.TODO()
+
+	err = daemon.containerd.Create(ctx, container.ID, spec, createOptions)
 	if err != nil {
-		return translateContainerdStartErr(container.Path, container.SetExitCode, err)
+		if errdefs.IsConflict(err) {
+			logrus.WithError(err).WithField("container", container.ID).Error("Container not cleaned up from containerd from previous run")
+			// best effort to clean up old container object
+			daemon.containerd.DeleteTask(ctx, container.ID)
+			if err := daemon.containerd.Delete(ctx, container.ID); err != nil && !errdefs.IsNotFound(err) {
+				logrus.WithError(err).WithField("container", container.ID).Error("Error cleaning up stale containerd container object")
+			}
+			err = daemon.containerd.Create(ctx, container.ID, spec, createOptions)
+		}
+		if err != nil {
+			return translateContainerdStartErr(container.Path, container.SetExitCode, err)
+		}
 	}
 
 	// TODO(mlaventure): we need to specify checkpoint options here