This fix tries to address issues raised in #20936 and #22443
where `docker pull` or `docker push` fails because of the
concurrent connection failing.
Currently, the number of maximum concurrent connections is
controlled by `maxDownloadConcurrency` and `maxUploadConcurrency`
which are hardcoded to 3 and 5 respectively. Therefore, in
situations where network connections don't support multiple
downloads/uploads, failures may encounter for `docker push`
or `docker pull`.
This fix tries changes `maxDownloadConcurrency` and
`maxUploadConcurrency` to adjustable by passing
`--max-concurrent-uploads` and `--max-concurrent-downloads` to
`docker daemon` command.
The documentation related to docker daemon has been updated.
Additional test case have been added to cover the changes in this fix.
This fix fixes#20936. This fix fixes#22443.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Use sockets.DialerFromEnvironment, as is done in other places,
to transparently support SOCKS proxy config from ALL_PROXY
environment variable.
Requires the *engine* have the ALL_PROXY env var set, which
doesn't seem ideal. Maybe it should be a CLI option somehow?
Only tested with push and a v2 registry so far. I'm happy to look
further into testing more broadly, but I wanted to get feedback on
the general idea first.
Signed-off-by: Brett Higgins <brhiggins@arbor.net>
@nwt noticed that the media type specified in the config section of a
schema2 manifest is application/octet-stream, instead of the correct
value application/vnd.docker.container.image.v1+json.
This brings in https://github.com/docker/distribution/pull/1622 to fix
this.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Now that we are checking if the image and host have the same architectures
via #21272, this value should be null so that the test passes on non-x86
machines
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
Previously, Windows only supported running with a OS-managed base image.
With this change, Windows supports normal, Linux-like layered images, too.
Signed-off-by: John Starks <jostarks@microsoft.com>
These fields are needed to specify the exact version of Windows that an
image can run on. They may be useful for other platforms in the future.
This also changes image.store.Create to validate that the loaded image is
supported on the current machine. This change affects Linux as well, since
it now validates the architecture and OS fields.
Signed-off-by: John Starks <jostarks@microsoft.com>
The prior error message caused confusion. If a user attempts to push an
image up to a registry, but they misspelled (or forgot to properly tag
their image) they would see the message 'Repository does not exist', which
is not very clear and causes some to think that there might be a problem
with the registry or connectivity to it, when the problem was simply just
that an image with that tag specified does not exist locally.
Signed-off-by: Dave MacDonald <mindlapse@gmail.com>
Close could be called twice on a temporary download file, which could
have bad side effects.
This fixes the problem by setting to ld.tmpFile to nil when the download
completes sucessfully. Then the call to ld.Close will have no effect,
and only the download manager will close the temporary file when it's
done extracting the layer from it. ld.Close will be responsible for
closing the file if we hit the retry limit and there is still a partial
download present.
Fixes#21675
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Fix unmount issues in the daemon crash and restart lifecycle, w.r.t
graph drivers. This change sets a live container RWLayer's activity
count to 1, so that the RWLayer is aware of the mount. Note that
containerd has experimental support for restore live containers.
Added/updated corresponding tests.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
The current error message is "Error: image [name] not found". This makes
sense from the perspective of the v1 pull, since we found the repository
doesn't exist over the v1 protocol. However, in the vast majority of
cases, this error will be produced by fallback situations, where we
first try to pull the tag with the v2 protocol, and then fall back the
v1 protocol, which probably isn't even supported by the server.
Including the tag in the error message makes a lot more sense since the
actual repository may exist on v2, but not the tag.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
This adds support for the passthrough on build, push, login, and search.
Revamp the integration test to cover these cases and make it more
robust.
Use backticks instead of quoted strings for backslash-heavy string
contstands.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Changes how the Engine interacts with Registry servers on image pull.
Previously, Engine sent a User-Agent string to the Registry server
that included only the Engine's version information. This commit
appends to that string the fields from the User-Agent sent by the
client (e.g., Compose) of the Engine. This allows Registry server
operators to understand what tools are actually generating pulls on
their registries.
Signed-off-by: Mike Goelzer <mgoelzer@docker.com>
- cherry-pick from 1.10.3 branch: 0186f4d422
- add token service test suite
- add integration test (missing in 1.10.3 branch)
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
This test was checking that it received every progress update that was
produced. But delivery of these intermediate progress updates is not
guaranteed. A new update can overwrite the previous one if the previous
one hasn't been sent to the channel yet.
The call to t.Fatalf exited the current goroutine which was consuming
the channel, which caused a deadlock and eventual test timeout rather
than a proper failure message.
Failure seen here:
https://jenkins.dockerproject.org/job/Docker-PRs-experimental/16400/console
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
The download manager assumed there was at least one layer involved in
all images. This can be false if the image is essentially a copy of
`scratch`.
Fix a nil pointer dereference that happened in this case. Add
integration tests that involve schema1 and schema2 manifests.
Fixes#21213
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Use token handler options for initialization.
Update auth endpoint to set identity token in response.
Update credential store to match distribution interface changes.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Further differentiate the APIEndpoint used with V2 with the endpoint type which is only used for v1 registry interactions
Rename Endpoint to V1Endpoint and remove version ambiguity
Use distribution token handler for login
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Concurrent uploads which share layers worked correctly as of #18353,
but unfortunately #18785 caused a regression. This PR removed the logic
that shares digests between different push sessions. This overlooked the
case where one session was waiting for another session to upload a
layer.
This commit adds back the ability to propagate this digest information,
using the distribution.Descriptor type because this is what is received
from stats and uploads, and also what is ultimately needed for building
the manifest.
Surprisingly, there was no test covering this case. This commit adds
one. It fails without the fix.
See recent comments on #9132.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Attempt layer mounts from up to 3 source repositories, possibly
falling back to a standard blob upload for cross repository pushes.
Addresses compatiblity issues with token servers which do not grant
multiple repository scopes, resulting in an authentication failure for
layer mounts, which would otherwise cause the push to terminate with an
error.
Signed-off-by: Brian Bland <brian.bland@docker.com>
This allows easier URL handling in code that uses APIEndpoint.
If we continued to store the URL unparsed, it would require redundant
parsing whenver we want to extract information from it. Also, parsing
the URL earlier should give improve validation.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
With the --insecure-registry daemon option (or talking to a registry on
a local IP), the daemon will first try TLS, and then try plaintext if
something goes wrong with the push or pull. It doesn't make sense to try
plaintext if a HTTP request went through while using TLS. This commit
changes the logic to keep track of host/port combinations where a TLS
attempt managed to do at least one HTTP request (whether the response
code indicated success or not). If the host/port responded to a HTTP
using TLS, we won't try to make plaintext HTTP requests to it.
This will result in better error messages, which sometimes ended up
showing the result of the plaintext attempt, like this:
Error response from daemon: Get
http://myregistrydomain.com:5000/v2/: malformed HTTP response
"\x15\x03\x01\x00\x02\x02"
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Several improvements to error handling:
- Introduce ImageConfigPullError type, wrapping errors related to
downloading the image configuration blob in schema2. This allows for a
more descriptive error message to be seen by the end user.
- Change some logrus.Debugf calls that display errors to logrus.Errorf.
Add log lines in the push/pull fallback cases to make sure the errors
leading to the fallback are shown.
- Move error-related types and functions which are only used by the
distribution package out of the registry package.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
This reverts commit 84b2162c1a.
The intent of this commit was to set an idle timeout on a HTTP
connection. If a read took more than 60 seconds to complete, or a write
took more than 60 seconds to complete, the connection would be
considered dead.
This doesn't work properly, because the HTTP internals apparently read
from the connection concurrently while writing. An upload that doesn't
complete in 60 seconds leads to a timeout.
Fixes#19967
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
`Upload` already closes the reader returned by `compress` and the
progressreader passed into it, before returning. But even so, the
io.Copy inside compress' goroutine needs to attempt a read from the
progressreader to notice that it's closed, and this read has a side
effect of outputting a progress message. If this happens after `Upload`
returns, it can result in a write to a closed channel. Change `compress`
to return a channel that allows the caller to wait for its goroutine to
finish before freeing any resources connected to the reader that was
passed to it.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Otherwise, some operations can get stuck indefinitely when the remote
side is unresponsive.
Fixes#12823
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
A watcher would output the current progress item when it was detached,
in case it missed that item earlier, which would leave the user seeing
some intermediate step of the operation. This commit changes it to only
output it on detach if it didn't already output the same item.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Currently, the temporary file storing downloaded layer data is only
removed after a successful download or a digest verification error. A
transport-level error does not cause it to be removed. This is a
regression from 1.9 that could cause disk usage to grow until the Docker
daemon is restarted.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Things could go wrong if Watch was called after the last existing
watcher was released. The call to Watch would succeed even though it was
not really adding a watcher, and the corresponding call to Release would
close hasWatchers a second time.
The fix for this is twofold:
1. We allow transfers to gain new watchers after the watcher count has
touched zero. This means that the channel returned by Released should
not be closed until all watchers have been released AND the transfer is
no longer tracked by the transfer manager, meaning it won't be possible
for additional calls to Watch to race with closing the channel returned
by Released.
The Transfer interface has a new method called Close so the transfer can
know when the transfer manager no longer references it.
Remove the Cancel method. It's not used and should not be exported.
2. Even though (1) makes it possible to add watchers after all the
previous watchers have been released, we want to avoid doing this in
practice. A transfer that has had all its watchers released is in the
process of being cancelled, and attaching to one of these will never be
the correct behavior. Add a check if a watcher is attaching to a
cancelled transfer. In this case, wait for the transfer to be removed
from the map and try again. This will ensure correct behavior when a
watcher tries to attach during the race window.
Either (1) or (2) should be sufficient to fix the race involved here,
but the combination is the most correct approach. (1) fixes the
low-level plumbing to be resilient to the race condition, and (2) avoids
using it in a racy way.
Fixes#19606
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
There was already a check that prevented protocol-level fallback in this
situation, but retries within a specific protocol will still happen.
This makes it take a long time for the pull to finally error out.
This fixes slowness in TestDaemonNoSpaceleftOnDeviceError, which used to
take a long time due to the backoff between retry attempts:
PASS: docker_cli_daemon_test.go:1868: DockerDaemonSuite.TestDaemonNoSpaceleftOnDeviceError 5.882s
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
When authorization errors are returned by the token process the error will be wrapped in url.Error.
In order to check the underlying error for retry this error message should be unwrapped.
Unwrapping this error allows failure to push due to an unauthorized response to keep from retrying, possibly resulting in later 429 responses.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Revert the portions of #17617 that report all errors when a pull
falls back, and go back to just reporting the last error. This was nice
to have, but causes some UX issues because nonexistent images show
additional "unauthorized" errors.
Keep the part of the PR that handled ENOSPC, as this appears to work
even without tracking multiple errors.
Fixes#19419
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Tracks source repository information for each blob in the blobsum
service, which is then used to attempt to mount blobs from another
repository when pushing instead of having to re-push blobs to the same
registry.
Signed-off-by: Brian Bland <brian.bland@docker.com>
One of the things this test checks is that the progress indicator
completes for each download. Some progress messages may be lost because
only one message is buffered for each download, but the last progress
message is guaranteed not to be lost. The test therefore checks for a
10/10 progress indication.
However, the assumption that this is the last progress message to be
sent is incorrect. The last message is actually "Pull complete". So
check for this instead.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
A manifest list refers to platform-specific manifests. This allows
for images that target more than one architecture to share the same tag.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Currently this always uses the schema1 manifest builder. Later, it will
be changed to attempt schema2 first, and fall back when necessary.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
The trust code used to parse the console output of `docker push` to
extract the digest, tag, and size information and determine what to
sign. This is fragile and might give an attacker control over what gets
signed if the attacker can find a way to influence what gets printed as
part of the push output.
This commit sends the push metadata out-of-band. It introduces an `Aux`
field in JSONMessage that can carry application-specific data alongside
progress updates. Instead of parsing formatted output, the client looks
in this field to get the digest, size, and tag from the push.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
- Stop serializing JSONMessage in favor of events.Message.
- Keep backwards compatibility with JSONMessage for container events.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Support restoreCustomImage for windows with a new interface to extract
the graph driver from the LayerStore.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
This is a followup to #18839. That PR relaxed the fallback logic so that
if a manifest doesn't exist on v2, or the user is unauthorized to access
it, we try again with the v1 protocol. A similar special case is needed
for "pull all tags" (docker pull -a). If the v2 registry doesn't
recognize the repository, or doesn't allow the user to access it, we
should fall back to v1 and try to pull all tags from the v1 registry.
Conversely, if the v2 registry does allow us to list the tags, there
should be no fallback, even if there are errors pulling those tags.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
RWLayer will now have more operations and be protected through a referenced type rather than always looked up by string in the layer store.
Separates creation of RWLayer (write capture layer) from mounting of the layer.
This allows mount labels to be applied after creation and allowing RWLayer objects to have the same lifespan as a container without performance regressions from requiring mount.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)