As pointed out by Tonis, there's a race between ReleaseRWLayer()
and GetRWLayer():
```
----- goroutine 1 ----- ----- goroutine 2 -----
ReleaseRWLayer()
m := ls.mounts[l.Name()]
...
m.deleteReference(l)
m.hasReferences()
... GetRWLayer()
... mount := ls.mounts[id]
ls.driver.Remove(m.mountID)
ls.store.RemoveMount(m.name) return mount.getReference()
delete(ls.mounts, m.Name())
----------------------- -----------------------
```
When something like this happens, GetRWLayer will return
an RWLayer without a storage. Oops.
There might be more races like this, and it seems the best
solution is to lock by layer id/name by using pkg/locker.
With this in place, name collision could not happen, so remove
the part of previous commit that protected against it in
CreateRWLayer (temporary nil assigmment and associated rollback).
So, now we have
* layerStore.mountL sync.Mutex to protect layerStore.mount map[]
(against concurrent access);
* mountedLayer's embedded `sync.Mutex` to protect its references map[];
* layerStore.layerL (which I haven't touched);
* per-id locker, to avoid name conflicts and concurrent operations
on the same rw layer.
The whole rig seems to look more readable now (mutexes use is
straightforward, no nested locks).
Reported-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit af433dd200)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Goroutine stack analisys shown some lock contention
while doing massively (100 instances of `docker rm`)
parallel image removal, with many goroutines waiting
for the mountL mutex. Optimize it.
With this commit, the above operation is about 3x
faster, with no noticeable change to container
creation times (tested on aufs and overlay2).
kolyshkin@:
- squashed commits
- added description
- protected CreateRWLayer against name collisions by
temporary assiging nil to ls.mounts[name], and treating
nil as "non-existent" in all the other functions.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 05250a4f00)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In applyTar, if the driver's ApplyDiff returns an error, the function
returns early without calling io.Copy.
As a consequence, the resources (a goroutine and some buffers holding
the uncompressed image, the digest, etc...) allocated or referenced by
NewInputTarStream above aren't released, as the worker goroutine only
finishes when it finds EOF or a closed pipe.
Signed-off-by: Sergio Lopez <slp@redhat.com>
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
Layer metadata storage has not been implemented outside of the layer
store and will be deprecated by containerd metadata storage. To prepare
for this and freeze the current metadata storage, remove the exported
interface and make it internal to the layer store.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Signed-off-by: John Howard <jhoward@microsoft.com>
The re-coalesces the daemon stores which were split as part of the
original LCOW implementation.
This is part of the work discussed in https://github.com/moby/moby/issues/34617,
in particular see the document linked to in that issue.
Signed-off-by: John Howard <jhoward@microsoft.com>
This PR has the API changes described in https://github.com/moby/moby/issues/34617.
Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded
OCI Image-spec `Platform` structure.
In addition, it renames (almost all) uses of a string variable platform (and associated)
methods/functions to os. This makes it much clearer to disambiguate with the swarm
"platform" which is really os/arch. This is a stepping stone to getting the daemon towards
fully multi-platform/arch-aware, and makes it clear when "operating system" is being
referred to rather than "platform" which is misleadingly used - sometimes in the swarm
meaning, but more often as just the operating system.
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.
Signed-off-by: Akash Gupta <akagup@microsoft.com>
This allows graphdrivers to declare that they can reproduce the original
diff stream for a layer. If they do so, the layer store will not use
tar-split processing, but will still verify the digest on layer export.
This makes it easier to experiment with non-default diff formats.
Signed-off-by: Alfred Landrum <alfred.landrum@docker.com>
The `digest` data type, used throughout docker for image verification
and identity, has been broken out into `opencontainers/go-digest`. This
PR updates the dependencies and moves uses over to the new type.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Move some of the optional parameters of CreateRWLayer() in a struct
called CreateRWLayerOpts. This will make it easy to add more options
arguments without having to change signature of CreateRWLayer().
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This allows for easy extension of adding more parameters to existing
parameters list. Otherwise adding a single parameter changes code
at so many places.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
The `archive` package defines aliases for `io.ReadCloser` and
`io.Reader`. These don't seem to provide an benefit other than type
decoration. Per this change, several unnecessary type cases were
removed.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
init layer is read/write layer and not read only layer. Following commit
introduced new graph driver method CreateReadWrite.
ef5bfad Adding readOnly parameter to graphdriver Create method
So far only windows seem to be differentiating between above two methods.
Making this change to make sure -init layer calls right method so that
we don't have surprises in future.
Windows does not need init layer. This patch also gets rid of creation of
init layer on windows.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
As part of making graphdrivers support pluginv2, a PluginGetter
interface was necessary for cleaner separation and avoiding import
cycles.
This commit creates a PluginGetter interface and makes pluginStore
implement it. Then the pluginStore object is created in the daemon
(rather than by the plugin manager) and passed to plugin init as
well as to the different subsystems (eg. graphdrivers, volumedrivers).
A side effect of this change was that some code was moved out of
experimental. This is good, since plugin support will be stable soon.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
Since the layer store was introduced, the level above the graphdriver
now differentiates between read/write and read-only layers. This
distinction is useful for graphdrivers that need to take special steps
when creating a layer based on whether it is read-only or not.
Adding this parameter allows the graphdrivers to differentiate, which
in the case of the Windows graphdriver, removes our dependence on parsing
the id of the parent for "-init" in order to infer this information.
This will also set the stage for unblocking some of the layer store
unit tests in the next preview build of Windows.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Get was calling getReference without layerL held. This meant writes to
the references map could race. Such races are dangerous because they can
corrupt the map and crash the process.
Fixes#21616Fixes#21674
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Fix unmount issues in the daemon crash and restart lifecycle, w.r.t
graph drivers. This change sets a live container RWLayer's activity
count to 1, so that the RWLayer is aware of the mount. Note that
containerd has experimental support for restore live containers.
Added/updated corresponding tests.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
This allows a graph driver to provide a custom FileGetter for tar-split
to use. Windows will use this to provide a more efficient implementation
in a follow-up change.
Signed-off-by: John Starks <jostarks@microsoft.com>
Add continue when layer fails on store creation
Trim whitespace from layerstore files to keep trailing space from failing a layer load
Fixes#19449
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Support restoreCustomImage for windows with a new interface to extract
the graph driver from the LayerStore.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
RWLayer will now have more operations and be protected through a referenced type rather than always looked up by string in the layer store.
Separates creation of RWLayer (write capture layer) from mounting of the layer.
This allows mount labels to be applied after creation and allowing RWLayer objects to have the same lifespan as a container without performance regressions from requiring mount.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)