libcontainerd: work around exec start bug in c8d

It turns out that the unnecessary serialization removed in
b75246202a happened to work around a bug
in containerd. When many exec processes are started concurrently in the
same containerd task, it takes seconds to minutes for them all to start.
Add the workaround back in, only deliberately this time.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit fb7ec1555c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Cory Snider 2023-05-25 16:00:29 -04:00 committed by Sebastiaan van Stijn
parent 37bc639704
commit 7a4ea19803

@@ -60,6 +60,10 @@ type container struct {
 type task struct {
 	containerd.Task
 	ctr *container
+	// Workaround for https://github.com/containerd/containerd/issues/8557.
+	// See also https://github.com/moby/moby/issues/45595.
+	serializeExecStartsWorkaround sync.Mutex
 }
 type process struct {
@@ -296,7 +300,12 @@ func (t *task) Exec(ctx context.Context, processID string, spec *specs.Process,
 	// the stdin of exec process will be created after p.Start in containerd
 	defer func() { stdinCloseSync <- p }()
-	if err = p.Start(ctx); err != nil {
+	err = func() error {
+		t.serializeExecStartsWorkaround.Lock()
+		defer t.serializeExecStartsWorkaround.Unlock()
+		return p.Start(ctx)
+	}()
+	if err != nil {
 		// use new context for cleanup because old one may be cancelled by user, but leave a timeout to make sure
 		// we are not waiting forever if containerd is unresponsive or to work around fifo cancelling issues in
 		// older containerd-shim