libcontainerd: work around exec start bug in c8d

It turns out that the unnecessary serialization removed in
b75246202a happened to work around a bug
in containerd. When many exec processes are started concurrently in the
same containerd task, it takes seconds to minutes for them all to start.
Add the workaround back in, only deliberately this time.

Signed-off-by: Cory Snider <csnider@mirantis.com>
Cory Snider 2023-05-25 16:00:29 -04:00
parent d5dc675d37
commit fb7ec1555c

@@ -60,6 +60,10 @@ type container struct {
 type task struct {
 	containerd.Task
 	ctr *container
+	// Workaround for https://github.com/containerd/containerd/issues/8557.
+	// See also https://github.com/moby/moby/issues/45595.
+	serializeExecStartsWorkaround sync.Mutex
 }
 type process struct {
@@ -296,7 +300,12 @@ func (t *task) Exec(ctx context.Context, processID string, spec *specs.Process,
 	// the stdin of exec process will be created after p.Start in containerd
 	defer func() { stdinCloseSync <- p }()
-	if err = p.Start(ctx); err != nil {
+	err = func() error {
+		t.serializeExecStartsWorkaround.Lock()
+		defer t.serializeExecStartsWorkaround.Unlock()
+		return p.Start(ctx)
+	}()
+	if err != nil {
 		// use new context for cleanup because old one may be cancelled by user, but leave a timeout to make sure
 		// we are not waiting forever if containerd is unresponsive or to work around fifo cancelling issues in
 		// older containerd-shim
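
The locking pattern in this change can be tried in isolation. The following is a minimal sketch, not the moby code: the `starter` interface, `fakeProcess`, `startExec`, and `maxConcurrentStarts` are hypothetical names introduced for illustration, and the one-millisecond sleep merely stands in for a slow `Start` call. It assumes Go 1.19 or later for `atomic.Int32`. The sketch launches several goroutines against one task and records how many `Start` calls overlap; with the per-task mutex held around `Start`, the peak should be one.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// starter is a hypothetical stand-in for anything with a Start method,
// such as an exec process handle.
type starter interface {
	Start(ctx context.Context) error
}

// task holds one mutex per task, so execs in different tasks do not
// block one another. The field name is shortened for this sketch.
type task struct {
	serializeExecStarts sync.Mutex
}

// startExec wraps Start in a locked scope, mirroring the closure in the diff:
// the lock is released as soon as Start returns, before any error handling.
func (t *task) startExec(ctx context.Context, p starter) error {
	t.serializeExecStarts.Lock()
	defer t.serializeExecStarts.Unlock()
	return p.Start(ctx)
}

// fakeProcess (hypothetical) tracks how many Start calls run at once.
type fakeProcess struct {
	inFlight *atomic.Int32
	maxSeen  *atomic.Int32
}

func (p fakeProcess) Start(ctx context.Context) error {
	n := p.inFlight.Add(1)
	defer p.inFlight.Add(-1)
	// Raise maxSeen to n if n is a new peak.
	for {
		m := p.maxSeen.Load()
		if n <= m || p.maxSeen.CompareAndSwap(m, n) {
			break
		}
	}
	time.Sleep(time.Millisecond) // simulate work inside Start
	return nil
}

// maxConcurrentStarts starts n fake execs on one task concurrently and
// returns the peak number of overlapping Start calls.
func maxConcurrentStarts(n int) int32 {
	var (
		t        task
		inFlight atomic.Int32
		maxSeen  atomic.Int32
		wg       sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = t.startExec(context.Background(), fakeProcess{&inFlight, &maxSeen})
		}()
	}
	wg.Wait()
	return maxSeen.Load()
}

func main() {
	fmt.Println("peak concurrent Start calls:", maxConcurrentStarts(16))
}
```

Wrapping the call in a small function, as the diff does with a closure, keeps the critical section limited to `Start` itself, so the cleanup path that follows an error runs without the lock held.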