Perform chmod before rename with the atomic file writer.
Ensure writeErr is set on short write and file is removed on write error.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
This fix tries to fix wrong CPU count after CPU hot-plugging.
On windows, GetProcessAffinityMask has been used to probe the
number of CPUs in real time.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix tries to address issues raised in #23768 where the CPU count
is not updated after cpu ho-plugging.
This fix follows the suggestion from #23768 and replace go's `runtime.NumCPU()`
with `sysconf(_SC_NPROCESSORS_ONLN)` so that correct CPU count could
be obtained even after CPU hot-plugging.
This fix is tested manually, as is suggested in #23768.
This fix fixes#23768.
The NumCPU() in Linux is based on @wmark 's implementation.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Currently when overlay creates a whiteout file then the overlay2 layer is archived,
the correct tar header will be created for the whiteout file, but the tar logic will then attempt to open the file causing a failure.
When tar encounters such failures the file is skipped and excluded for the archive, causing the whiteout to be ignored.
By skipping the copy of empty files, no open attempt will be made on whiteout files.
Fixes#23863
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Always enable VT output emulation when starting the process so that
non-attaching commands can still output VT codes.
Also remove the version block for using the native console and just rely
on supported flags being present.
Signed-off-by: John Starks <jostarks@microsoft.com>
This patch introduces a new experimental engine-level plugin management
with a new API and command line. Plugins can be distributed via a Docker
registry, and their lifecycle is managed by the engine.
This makes plugins a first-class construct.
For more background, have a look at issue #20363.
Documentation is in a separate commit. If you want to understand how the
new plugin system works, you can start by reading the documentation.
Note: backwards compatibility with existing plugins is maintained,
albeit they won't benefit from the advantages of the new system.
Signed-off-by: Tibor Vass <tibor@docker.com>
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.
This fix fixes#23459.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
When pivot_root fails we need to unmount the bind mounted path we
previously mounted in preparation for pivot_root.
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
There might be other (valid) reasons for setxattr(2) to fail, so only
ignore it when it's a not supported error (ENOTSUP). Otherwise, bail.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This is similar to network scopes where a volume can either be `local`
or `global`. A `global` volume is one that exists across the entire
cluster where as a `local` volume exists on a single engine.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fix tries to address the issue raised in #22420. When
`--tmpfs` is specified with `/tmp`, the default value is
`rw,nosuid,nodev,noexec,relatime,size=65536k`. When `--tmpfs`
is specified with `/tmp:rw`, then the value changed to
`rw,nosuid,nodev,noexec,relatime`.
The reason for such an inconsistency is because docker tries
to add `size=65536k` option only when user provides no option.
This fix tries to address this issue by always pre-progating
`size=65536k` along with `rw,nosuid,nodev,noexec,relatime`.
If user provides a different value (e.g., `size=8192k`), it
will override the `size=65536k` anyway since the combined
options will be parsed and merged to remove any duplicates.
Additional test cases have been added to cover the changes
in this fix.
This fix fixes#22420.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
For me when I run the test I see:
```
Downloading from http://nourl/bad
Importing 283 B
Untar re-exec error: exit status 1: output: unexpected EOF
```
and nothing about "dial tcp" so it appears that the output is
system dependent and therefore we can't really check it. I think
checking for non-zero exit code is sufficient so I'm removing this
string check.
Signed-off-by: Doug Davis <dug@us.ibm.com>
Using golang 1.6, is it now possible to ignore SIGPIPE events on
stdout/stderr. Previous versions of the golang library cached 10
events and then killed the process receiving the events.
systemd-journald sends SIGPIPE events when jounald is restarted and
the target of the unit file writes to stdout/stderr. Docker logs to stdout/stderr.
This patch silently ignores all SIGPIPE events.
Signed-off-by: Jhon Honce <jhonce@redhat.com>
The path we're trying to remove doesn't exist after a successful
chroot+chdir because a / is only appended after pivot_root is
successful and so we can't cleanup anymore with the old path.
Also fix leaking .pivot_root dirs under /var/lib/docker/tmp/docker-builder*
on error.
Fix https://github.com/docker/docker/issues/22587
Introduced by https://github.com/docker/docker/pull/22506
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
The jsonlog logger currently allows specifying envs and labels that
should be propagated to the log message, however there has been no way
to read that back.
This adds a new API option to enable inserting these attrs back to the
log reader.
With timestamps, this looks like so:
```
92016-04-08T15:28:09.835913720Z foo=bar,hello=world hello
```
The extra attrs are comma separated before the log message but after
timestamps.
Without timestaps it looks like so:
```
foo=bar,hello=world hello
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fixes one issue with Docker running under a grsec kernel, which
denies chmod and mknod under chroot.
Note, if pivot_root fails it will still fallback to chroot.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This generates an ID string for calls to Mount/Unmount, allowing drivers
to differentiate between two callers of `Mount` and `Unmount`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Leslie B. Lamport is an American computer scientist. Lamport is
best known for his seminal work in distributed systems and as the
initial developer of the document preparation system LaTeX.
Maria Gaetana Agnesi was an Italian mathematician, philosopher,
theologian and humanitarian. She was the first woman to write a
mathematics handbook and the first woman appointed as a Mathematics
Professor at a University.
Signed-off-by: Ali Dehghani <ali.dehghani.g@gmail.com>
pidfile.New() was opening a file in /proc to determine if the owning
process still exists. Use syscall.OpenProcess() on Windows instead.
Other OSes may also need to be updated here.
Signed-off-by: John Starks <jostarks@microsoft.com>
If a build context tar has path names of the form 'x/./y', they will be
stored in this unnormalized form internally by tarsum. When the builder
walks the untarred directory tree and queries hashes for each relative
path, it will query paths of the form 'x/y', and they will not be found.
To correct this, have tarsum normalize path names by calling Clean.
Add a test to detect this caching false positive.
Fixes#21715
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Since there are other users of pkg/listeners, it doesn't make sense to
contain Docker-specific semantics and warnings inside it. To that end,
move the scary warning about -tlsverify and the libnetwork port
allocation code to CmdDaemon (where they belong). This helps massively
reduce the dependency tree for users of pkg/listeners.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This makes separating middlewares from the core api easier.
As an example, the authorization middleware is moved to
it's own package.
Initialize all static middlewares when the server is created, reducing
allocations every time a route is wrapper with the middlewares.
Signed-off-by: David Calavera <david.calavera@gmail.com>
This should not have been in init() as it causes these lookups to happen
in all reexecs of the Docker binary. The only time it needs to be
resolved is when a user is added, which is extremely rare.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
Creates a `fixedBuffer` type that is used to encapsulate functionality
for reading/writing from the underlying byte slices.
Uses lazily-loaded set of sync.Pools for storing buffers that are no
longer needed so they can be re-used.
```
benchmark old ns/op new ns/op delta
BenchmarkBytesPipeWrite-8 138469 48985 -64.62%
BenchmarkBytesPipeRead-8 130922 56601 -56.77%
benchmark old allocs new allocs delta
BenchmarkBytesPipeWrite-8 18 8 -55.56%
BenchmarkBytesPipeRead-8 0 0 +0.00%
benchmark old bytes new bytes delta
BenchmarkBytesPipeWrite-8 66903 1649 -97.54%
BenchmarkBytesPipeRead-8 0 1 +Inf%
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
These fields are needed to specify the exact version of Windows that an
image can run on. They may be useful for other platforms in the future.
This also changes image.store.Create to validate that the loaded image is
supported on the current machine. This change affects Linux as well, since
it now validates the architecture and OS fields.
Signed-off-by: John Starks <jostarks@microsoft.com>
Now that listeners is no longer an internal of the client, make it less
Docker-specific (despite there still being some open questions as how to
deal with some of the warnings that listeners has to emit). We should
move as much of the Docker-specific stuff (especially the port
allocation) to docker/ where it belongs (or maybe pass a check function).
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This code will be used in containerd and is quite useful in general to
people who want a nice way of creating listeners from proto://address
arguments (even supporting socket activation). Separate it out from
docker/ so people can use it much more easily.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Avoids allocations and copying by using a buffer pool for intermediate
writes.
```
benchmark old ns/op new ns/op delta
BenchmarkWrite-8 996 175 -82.43%
benchmark old MB/s new MB/s speedup
BenchmarkWrite-8 4414.48 25069.46 5.68x
benchmark old allocs new allocs delta
BenchmarkWrite-8 2 0 -100.00%
benchmark old bytes new bytes delta
BenchmarkWrite-8 4616 0 -100.00%
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When a plugin is first found, it is loaded into the available plugins
even though it's not activated yet.
If activation fails it is taken out of the list.
While it is in the list, other callers may see it and try to check it's
manifest. If it is not fully activated yet, the manifest will be nil and
cause a panic.
This is especially problematic for drivers that are down and have not
been activated yet.
We could just not load the plugin into the available list until it's
fully active, however that will just cause multiple of the same plugin
to attemp to be loaded.
We could check if the manifest is nil and return early (instead of
panicing on a nil manifest), but this will cause a 2nd caller to receive
a response while the first caller is still waiting, which can be
awkward.
This change uses a condition variable to handle activation (instead of
sync.Once). If the plugin is not activated, callers will all wait until
it is activated and receive a broadcast from the condition variable
signaling that it's ok to proceed, in which case we'll check if their
was an error in activation and proceed accordingly.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Since certain filesystems don't support extended attributes, ignore
errors produced (emitting a warning) when attempting to apply extended
attributes to file.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Change user/group creation to use flags to adduser/useradd to enforce it
being a system user. Use system user defaults that auto-create a
matching group. These changes allow us to remove all group creation
code, and in doing so we also removed the code that finds available uid,
gid integers and use post-creation query to gather the system-generated
uid and gid.
The only added complexity is that today distros don't auto-create
subordinate ID ranges for a new ID if it is a system ID, so we now need
to handle finding a free range and then calling the `usermod` tool to
add the ranges for that ID. Note that this requires the distro supports
the `-v` and `-w` flags on `usermod` for subordinate ID range additions.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
pkg/chrootarchive/diff_unix.go erroneously calls flush on stdout, which tries to read from stdout returning an error.
This has been fixed by removing the call and by modifying flush to return errors and checking for these errors on calls to flush.
Signed-off-by: Amit Krishnan <krish.amit@gmail.com>
In TestJSONFormatProgress, the progress string was used for comparison.
However, the progress string (progress.String()) uses time.Now().UTC()
to generate the timeLeftBox which is not a fixed value and cannot be
compared reliably.
This PR fixes the issue by stripping the timeLeftBox field before doing
the comparison.
This PR fixes#21124.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
If the destination does not exist, it needs to be created with ownership
mapping to the remapped uid/gid ranges if user namespaces are enabled.
This fixes ADD operations, similar to the prior fixes for COPY and WORKDIR.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
Closes#20470
Before this PR we used to scan the entire build context when there were
exclusions in the .dockerignore file (paths that started with !). Now we
only traverse into subdirs when one of the exclusions starts with that dir
path.
Signed-off-by: Doug Davis <dug@us.ibm.com>
I noticied an inconsistency when reviewing docker/pull/20692.
Changing Ip to IP and Nf to NF.
More info: The golang folks recommend that you keep the initials consistent:
https://github.com/golang/go/wiki/CodeReviewComments#initialisms.
Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>
During "COPY" or other tar unpack operations, a target/destination
parent dir might not exist and should be created with ownership of the
root in the right context (including remapped root when user namespaces
are enabled)
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
Stop using global variables as prefixes to inject the writer header.
That can cause issues when two writers set the length of the buffer in
the same header concurrently.
Stop Writing to the internal buffer twice for each write. This could
mess up with the ordering information is written.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Since Docker is already skipping newlines in /etc/sub{uid,gid},
this patch skips commented out lines - otherwise Docker fails to start.
Add unit test also.
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Based on the discussion, we have changed the following:
1. Send body only if content-type is application/json (based on the
Docker official daemon REST specification, this is the provided for all
APIs that requires authorization.
2. Correctly verify that the msg body is smaller than max cap (this was
the actual bug). Fix includes UT.
3. Minor: Check content length > 0 (it was -1 for load, altough an
attacker can still modify this)
Signed-off-by: Liron Levin <liron@twistlock.com>
When execute `docker export -o path xxx` and path is a directory docker
has no privilege to write to, daemon will print lots of error logs that
most of them are duplicated and redundant.
This will remove unnecessary error logs and print only once.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
benchmark old ns/op new ns/op delta
BenchmarkPubSub-8 1036494796 1032443513 -0.39%
benchmark old allocs new allocs delta
BenchmarkPubSub-8 2467 1441 -41.59%
benchmark old bytes new bytes delta
BenchmarkPubSub-8 212216 187792 -11.51%
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
pkg/filenotify isn't used anymore and it causes problems with
hack/vendor.sh (nothing uses it, so hack/vendor.sh will remove the
vendored code).
Signed-off-by: Aleksa Sarai <asarai@suse.com>
inotify event is trigged immediately there's data written to disk.
But at the time that the inotify event is received, the json line might
not fully saved to disk. If the json decoder tries to decode in such
case, an io.UnexpectedEOF will be trigged.
We used to retry for several times to mitigate the io.UnexpectedEOF error.
But there are still flaky tests caused by the partial log entries.
The daemon knows exactly when there are new log entries emitted. We can
use the pubsub package to notify all the log readers instead of inotify.
Signed-off-by: Shijiang Wei <mountkin@gmail.com>
try to fix broken test. will squash once tests pass
Signed-off-by: Shijiang Wei <mountkin@gmail.com>
Using {{if major}}{{if minor}} doesn't work as expected when the major
version changes. In addition, this didn't support patch levels (which is
necessary in some cases when distributions ship apparmor weirdly).
Signed-off-by: Aleksa Sarai <asarai@suse.com>
Save was failing file integrity checksums due to bugs in both
Windows and Docker. This commit includes fixes to file time handling
in tarexport and system.chtimes that are necessary along with
the Windows platform fixes to correctly support save. With this
change, sysfile_backups for windowsfilter driver are no longer
needed, so that code is removed.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
To support the requirement of blocking the request after the daemon
responded the authorization plugin use a `response recorder` that replay
the response after the flow ends.
This commit adds support for commands that hijack the connection and
flushes data via the http.Flusher interface. This resolves the error
with the event endpoint.
Signed-off-by: Liron Levin <liron@twistlock.com>
dockerinit has been around for a very long time. It was originally used
as a way for us to do configuration for LXC containers once the
container had started. LXC is no longer supported, and /.dockerinit has
been dead code for quite a while. This removes all code and references
in code to dockerinit.
Signed-off-by: Aleksa Sarai <asarai@suse.com>
Use a back-compat struct to handle listing volumes for volumes we know
about (because, presumably, they are being used by a container) for
volume drivers which don't yet support `List`.
Adds a fall-back for the volume driver `Get` call, which will use
`Create` when the driver returns a `404` for `Get`. The old behavior was
to always use `Create` to get a volume reference.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- Return an error if any of the keys don't match valid flags.
- Fix an issue ignoring merged values as named values.
- Fix tlsverify configuration key.
- Fix bug in mflag to avoid panics when one of the flag set doesn't have any flag.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Read configuration after flags making this the priority:
1- Apply configuration from file.
2- Apply configuration from flags.
Reload configuration when a signal is received, USR2 in Linux:
- Reload router if the debug configuration changes.
- Reload daemon labels.
- Reload cluster discovery.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Born in Germany, she had to flee on the kindertransport to England in
1939. In the 1950s she worked at the Post Office Research Station at
Dollis Hill, building computers from scratch, and took evening classes
to get a degree in Mathematics.
In 1962 she set up a software company, employing almost entirely women,
working at home; the company was floated in 1996. Her team's projects
included programming Concorde's black box flight recorder. She adopted
the name "Steve" to fit in in a male domainated world.
http://www.bbc.co.uk/programmes/b05pmvl8https://en.wikipedia.org/wiki/Steve_Shirley
Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
The trust code used to parse the console output of `docker push` to
extract the digest, tag, and size information and determine what to
sign. This is fragile and might give an attacker control over what gets
signed if the attacker can find a way to influence what gets printed as
part of the push output.
This commit sends the push metadata out-of-band. It introduces an `Aux`
field in JSONMessage that can carry application-specific data alongside
progress updates. Instead of parsing formatted output, the client looks
in this field to get the digest, size, and tag from the push.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Don't rely on sqlite db for name registration and linking.
Instead register names and links when the daemon starts to an in-memory
store.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>