The global netlink handle ns.NlHandle() is indirectly cached for the
life of the process by the netutils.CheckRouteOverlaps() function. This
caching behaviour is a problem for the libnetwork unit tests as the
global netlink handle changes every time testutils.SetupTestOSContext()
is called, i.e. at the start of nearly every test case. Route overlaps
can be checked for in the wrong network namespace, causing spurious test
failures e.g. when running the same test twice in a row with -count=2.
Stop the netlink handle from being cached by shadowing the package-scope
variable with a function-scoped one.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestParallel has been written in an unusual style which relies on the
testing package's intra-test parallelism feature and lots of global
state to test one thing using three cooperating parallel tests. It is
complicated to reason about and quite brittle. For example, the command
go test -run TestParallel1 ./libnetwork
would deadlock, waiting until the test timeout for TestParallel2 and
TestParallel3 to run. And the test would be skipped if the
'-test.parallel' flag was less than three, either explicitly or
implicitly (default: GOMAXPROCS).
Overhaul TestParallel to address the aforementioned deficiencies and
get rid of mutable global state.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Reusing the same "OS context" (read: network namespace) and
NetworkController across multiple tests risks tests interfering with
each other, or worse: _depending on_ other tests to set up
preconditions. Construct a new controller for every test which needs
one, and run every test which mutates or inspects the host environment
in a fresh OS context.
The only outlier is runParallelTests. It is the only remaining test
implementation which references the "global" package-scoped controller,
so the global controller instance is effectively private to that one
test.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Sharing a single NetworkController instance between multiple tests
makes it possible for tests to interfere with each other. As a first
step towards giving each test its own private controller instance, make
explicit which controller createTestNetwork() creates the test network
on.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There are a handful of libnetwork tests which need to have multiple
concurrent goroutines inside the same test OS context (network
namespace), including some platform-agnostic tests. Provide test
utilities for spawning goroutines inside an OS context and
setting/restoring an existing goroutine's OS context to abstract away
platform differences and so that each test does not need to reinvent the
wheel.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The SetupTestOSContext calls were made conditional in
https://github.com/moby/libnetwork/pull/148 to work around limitations
in runtime.LockOSThread() which existed before Go 1.10. This workaround
is no longer necessary now that runtime.UnlockOSThread() needs to be
called an equal number of times before the goroutine is unlocked from
the OS thread.
Unfortunately some tests break when SetupTestOSContext is not skipped.
(Evidently this code path has not been exercised in a long time.) A
newly-created network namespace is very barebones: it contains a
loopback interface in the down state and little else. Even pinging
localhost does not work inside of a brand new namespace. Set the
loopback interface to up during namespace setup so that tests which
need to use the loopback interface can do so.
Signed-off-by: Cory Snider <csnider@mirantis.com>
If the GenericData option is missing from the map, the bridge driver
would skip configuration. At first glance this seems fine: the driver
defaults are to not configure anything. But it also skips over
initializing the persistent storage, which is configured through other
option keys in the map. Fix this oversight by removing the early return.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Each call to datastore.DefaultScopes() would modify and return the same
map. Consequently, some of the config for every NetworkController in the
process would be mutated each time one is constructed or reconfigured.
This behaviour is unexpected, unintended, and undesirable. Stop
accidentally sharing configuration by changing DefaultScopes() to return
a distinct map on each call.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*networkNamespace).InvokeFunc() cleaned up the state of the locked
thread by setting its network namespace to the netns of the goroutine
which called InvokeFunc(). If InvokeFunc() was to be called after the
caller had modified its thread's network namespace, InvokeFunc() would
incorrectly "restore" the state of its goroutine thread to the wrong
namespace, violating the invariant that unlocked threads are fungible.
Change the implementation to restore the thread's netns to the netns
that particular thread had before InvokeFunc() modified it.
Signed-off-by: Cory Snider <csnider@mirantis.com>
When running inside a container, testns == origns. Consequently, closing
testns causes the deferred netns.Set(origns) call to fail. Stop closing
the aliased original namespace handle.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestResolvConfHost and TestParallel both depended on a network named
"testhost" already existing in the libnetwork controller. Only TestHost
created that network, so the aforementioned tests would fail unless run
after TestHost. Fix those tests so they can be run in any order.
Signed-off-by: Cory Snider <csnider@mirantis.com>
While this was convenient for our use, it's somewhat unexpected for a function
that writes a file to also create all parent directories; even more because
this function may be executed as root.
This patch makes the package more "safe" to use as a generic package by removing
this functionality, and leaving it up to the caller to create parent directories,
if needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
unix.Kill() does not produce an error for PID 0, -1. As a result, checking
process.Alive() would return "true" for both 0 and -1 on macOS (and previously
on Linux as well).
Let's shortcut these values to consider them "not alive", to prevent someone
trying to kill them.
A basic test was added to check the behavior.
Given that the intent of these functions is to handle single processes, this patch
also prevents 0 and negative values to be used.
From KILL(2): https://man7.org/linux/man-pages/man2/kill.2.html
If pid is positive, then signal sig is sent to the process with
the ID specified by pid.
If pid equals 0, then sig is sent to every process in the process
group of the calling process.
If pid equals -1, then sig is sent to every process for which the
calling process has permission to send signals, except for
process 1 (init), but see below.
If pid is less than -1, then sig is sent to every process in the
process group whose ID is -pid.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Using the implementation from pkg/pidfile for windows, as that implementation
looks to be handling more cases to check if a process is still alive (or to be
considered alive).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the file doesn't exist, the process isn't running, so we should be able
to ignore that.
Also remove an intermediate variable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 48dd89375d...6341884e5f
Pulls in a set of fixes to SwarmKit's nascent Cluster Volumes support
discovered during subsequent development and testing.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Deleting a containerd task whose status is Created fails with a
"precondition failed" error. This is because (aside from Windows)
a process is spawned when the task is created, and deleting the task
while the process is running would leak the process if it was allowed.
libcontainerd and the containerd plugin executor mistakenly try to clean
up from a failed start by deleting the created task, which will always
fail with the aforementined error. Change them to pass the
`WithProcessKill` delete option so the cleanup has a chance to succeed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Currently an attempt to pull a reference which resolves to an OCI
artifact (Helm chart for example), results in a bit unrelated error
message `invalid rootfs in image configuration`.
This provides a more meaningful error in case a user attempts to
download a media type which isn't image related.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>