beenull/moby

Author	SHA1	Message	Date
Sebastiaan van Stijn	4f057d8bb6	Merge pull request #43887 from thaJeztah/22.06_backport_implicit_runtime_config [22.06 backport] daemon: support other containerd runtimes (MVP)	2022-08-03 23:55:51 +02:00
Akihiro Suda	d9a6b805b3	Merge pull request #43884 from vvoland/fix-exitcode-wait-22.06 [22.06 backport] state/Wait: Fix race when reading exit status	2022-07-30 15:51:39 +09:00
Cory Snider	6de52a29a8	daemon: support other containerd runtimes (MVP) Contrary to popular belief, the OCI Runtime specification does not specify the command-line API for runtimes. Looking at containerd's architecture from the lens of the OCI Runtime spec, the _shim_ is the OCI Runtime and runC is "just" an implementation detail of the io.containerd.runc.v2 runtime. When one configures a non-default runtime in Docker, what they're really doing is instructing Docker to create containers using the io.containerd.runc.v2 runtime with a configuration option telling the runtime that the runC binary is at some non-default path. Consequently, only OCI runtimes which are compatible with the io.containerd.runc.v2 shim, such as crun, can be used in this manner. Other OCI runtimes, including kata-containers v2, come with their own containerd shim and are not compatible with io.containerd.runc.v2. As Docker has not historically provided a way to select a non-default runtime which requires its own shim, runtimes such as kata-containers v2 could not be used with Docker. Allow other containerd shims to be used with Docker; no daemon configuration required. If the daemon is instructed to create a container with a runtime name which does not match any of the configured or stock runtimes, it passes the name along to containerd verbatim. A user can start a container with the kata-containers runtime, for example, simply by calling docker run --runtime io.containerd.kata.v2 Runtime names which containerd would interpret as a path to an arbitrary binary are disallowed. While handy for development and testing it is not strictly necessary and would allow anyone with Engine API access to trivially execute any binary on the host as root, so we have decided it would be safest for our users if it was not allowed. It is not yet possible to set an alternative containerd shim as the default runtime; it can only be configured per-container. 
Signed-off-by: Cory Snider <csnider@mirantis.com> (cherry picked from commit `547da0d575`) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-07-29 20:36:50 +02:00
Paweł Gronowski	e2bd8edb0d	daemon/restart: Don't mutate AutoRemove when restarting This caused a race condition where AutoRemove could be restored before container was considered for restart and made autoremove containers impossible to restart. ``` $ make DOCKER_GRAPHDRIVER=vfs BIND_DIR=. TEST_FILTER='TestContainerWithAutoRemoveCanBeRestarted' TESTFLAGS='-test.count 1' test-integration ... === RUN TestContainerWithAutoRemoveCanBeRestarted === RUN TestContainerWithAutoRemoveCanBeRestarted/kill === RUN TestContainerWithAutoRemoveCanBeRestarted/stop --- PASS: TestContainerWithAutoRemoveCanBeRestarted (1.61s) --- PASS: TestContainerWithAutoRemoveCanBeRestarted/kill (0.70s) --- PASS: TestContainerWithAutoRemoveCanBeRestarted/stop (0.86s) PASS DONE 3 tests in 3.062s ``` Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>	2022-07-29 16:49:56 +02:00
Illo Abdulrahim	6d41219bae	Fix file capabilities dropping in Dockerfile doCopyXattrs() was never reached because the copyXattrs boolean was false; as a result, file capabilities were not copied. Moved copyXattr() out of doCopyXattrs(). Signed-off-by: Illo Abdulrahim <abdulrahim.illo@nokia.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl> (cherry picked from commit `31f654a704`) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-07-28 09:39:21 +02:00
Olli Janatuinen	112fb22152	Windows: Re-create custom NAT networks after restart if missing from HNS Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com> (cherry picked from commit `67c36d5`) Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>	2022-07-23 23:16:23 -07:00
Sebastiaan van Stijn	a9081299dd	logger/journald: fix SA4011: ineffective break statement This was introduced in `906b979b88`, which changed a `goto` to a `break`, but afaics, the intent was still to break out of the loop. (linter didn't catch this before because it didn't have the right build-tag set) daemon/logger/journald/read.go:238:4: SA4011: ineffective break statement. Did you mean to break out of the outer loop? (staticcheck) break // won't be able to write anything anymore ^ Signed-off-by: Sebastiaan van Stijn <github@gone.nl> (cherry picked from commit `75577fe7a8`) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-07-20 16:57:22 +02:00
Sebastiaan van Stijn	cdbca4061b	gofmt GoDoc comments with go1.19 Older versions of Go don't format comments, so committing this as a separate commit, so that we can already make these changes before we upgrade to Go 1.19. Signed-off-by: Sebastiaan van Stijn <github@gone.nl> (cherry picked from commit `52c1a2fae8`) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-07-13 22:42:29 +02:00
Sebastiaan van Stijn	1cab8eda24	replace golint with revive, as it's deprecated WARN [runner] The linter 'golint' is deprecated (since v1.41.0) due to: The repository of the linter has been archived by the owner. Replaced by revive. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-07-04 10:15:54 +02:00
Sebastiaan van Stijn	10c56efa97	linting: error strings should not be capitalized (revive) client/request.go:183:28: error-strings: error strings should not be capitalized or end with punctuation or a newline (revive) err = errors.Wrap(err, "In the default daemon configuration on Windows, the docker client must be run with elevated privileges to connect.") ^ client/request.go:186:28: error-strings: error strings should not be capitalized or end with punctuation or a newline (revive) err = errors.Wrap(err, "This error may indicate that the docker daemon is not running.") ^ Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-07-04 10:15:06 +02:00
Sebastiaan van Stijn	1f187e640c	daemon/config: use more assertions in tests Removes some custom handling, some of which were giving the wrong error on failure ("expected no error" when we were checking for an error). Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-29 19:59:23 +02:00
Sebastiaan van Stijn	10e42f599a	daemon/config: TestUnixValidateConfigurationErrors: use subtests Use sub-tests and make sure we get the expected error Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-29 19:59:21 +02:00
Sebastiaan van Stijn	751222d907	daemon/config: verify that flags were set correctly in tests To prevent (e.g.) introducing a typo in the flag-name and invalidating the tests because of that. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-29 19:59:20 +02:00
Sebastiaan van Stijn	f73aadb230	daemon/config: New(): set more defaults Set the defaults when constructing the config, instead of setting them indirectly through the command-line flags. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-29 19:59:18 +02:00
Sebastiaan van Stijn	a0d0db126c	daemon/config: set default MTU when initializing config Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-29 19:59:16 +02:00
Sebastiaan van Stijn	62f71c4505	daemon/config: fix TestDaemonConfigurationMerge This test was validating that the config file would not overwrite the log-opt, but the test did not set up the flags correctly; as the flags were not marked as "changed", it would not detect a conflict between the config-file and daemon-flags. This patch: - removes the incorrect fields from the JSON file - initializes the Config using config.New(), so that any defaults are also set - sets flag values by actually setting them through the flags Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-29 19:59:14 +02:00
Sebastiaan van Stijn	9b39cab510	daemon/config: improve some tests - TestReloadWithDuplicateLabels() also check value - TestReloadDefaultConfigNotExist, TestReloadBadDefaultConfig, TestReloadWithConflictingLabels: verify that config is not reloaded. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-29 19:59:08 +02:00
Sebastiaan van Stijn	f8231c62f4	daemon/config: Validate() also validate default MTU Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-29 19:55:08 +02:00
Paweł Gronowski	56a20dbc19	container/exec: Support ConsoleSize Clients can now set the console size of the executed process at creation time. This matters, for example, when executing commands that render a text user interface bounded by the console dimensions. Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>	2022-06-24 11:54:25 +02:00
Sebastiaan van Stijn	0861539571	Merge pull request #43680 from rumpl/move-image-inspect Move the inspect code away from the image service	2022-06-22 20:12:15 +02:00
Djordje Lukic	b4ffe3a9fb	Move the inspect code away from the image service The LookupImage method is only used by the inspect image route and returns an api/types struct. The dependency of the daemon/images package on api/types is wrong; the daemon doesn't need to know about the API types. Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>	2022-06-22 15:08:55 +02:00
Paweł Gronowski	2ec3e14c0f	test: Add tests for logging 1. Add integration tests for the ContainerLogs API call. Each test handles a distinct case of ContainerLogs output. - Muxed stream, when container is started without tty - Single stream, when container is started with tty 2. Add unit test for LogReader suite that tests concurrent logging. It checks that there are no race conditions when logging concurrently from multiple goroutines. Co-authored-by: Cory Snider <csnider@mirantis.com> Signed-off-by: Cory Snider <csnider@mirantis.com> Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>	2022-06-10 09:26:17 +02:00
Sebastiaan van Stijn	20d6b5c1bd	Merge pull request #43702 from thaJeztah/daemon_event_simplify daemon: LogDaemonEventWithAttributes: don't call SystemInfo()	2022-06-08 02:25:23 +02:00
Sebastiaan van Stijn	3b94561db2	Merge pull request #43662 from vvoland/fix-logs-regression2 daemon/logger: Driver-scope buffer pools, bigger buffers	2022-06-07 22:04:14 +02:00
Sebastiaan van Stijn	f90056a79d	daemon: LogDaemonEventWithAttributes: don't call SystemInfo() This function was calling SystemInfo() only to get the daemon's name to add to the event that's generated. SystemInfo() is quite heavy, and no info other than the Name was used. The name returned is just looking up the hostname, so instead, call `hostName()` directly. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-07 22:01:12 +02:00
Akihiro Suda	2c7a6d7bb1	daemon: remove support for deprecated io.containerd.runtime.v1.linux This has been deprecated in Docker 20.10.0 (`f63f73a4a8`) Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2022-06-05 18:41:30 +09:00
Sebastiaan van Stijn	b241e2008e	daemon.NewDaemon(): fix network feature detection on first start Commit `483aa6294b` introduced a regression, causing spurious warnings to be shown when starting a daemon for the first time after a fresh install: docker info ... WARNING: IPv4 forwarding is disabled WARNING: bridge-nf-call-iptables is disabled WARNING: bridge-nf-call-ip6tables is disabled The information shown is incorrect, as checking the corresponding options on the system, shows that these options are available: cat /proc/sys/net/ipv4/ip_forward 1 cat /proc/sys/net/bridge/bridge-nf-call-iptables 1 cat /proc/sys/net/bridge/bridge-nf-call-ip6tables 1 The reason this is failing is because the daemon itself reconfigures those options during networking initialization in `configureIPForwarding()`; `cf4595265e/libnetwork/drivers/bridge/setup_ip_forwarding.go (L14-L25)` Network initialization happens in the `daemon.restore()` function within `daemon.NewDaemon()`: `cf4595265e/daemon/daemon.go (L475-L478)` However, `483aa6294b` moved detection of features earlier in the `daemon.NewDaemon()` function, and collects the system information (`d.RawSysInfo()`) before we enter `daemon.restore()`; `cf4595265e/daemon/daemon.go (L1008-L1011)` For optimization (collecting the system information comes at a cost), those results are cached on the daemon, and will only be performed once (using a `sync.Once`). This patch: - introduces a `getSysInfo()` utility, which collects system information without caching the results - uses `getSysInfo()` to collect the preliminary information needed at that point in the daemon's lifecycle. - moves printing warnings to the end of `daemon.NewDaemon()`, after all information can be read correctly. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-06-03 17:54:43 +02:00
Sebastiaan van Stijn	553b0edb4c	fix unclosed file-handles in tests These seemed to prevent cleaning up directories; On arm64: === RUN TestSysctlOverride testing.go:1090: TempDir RemoveAll cleanup: unlinkat /tmp/TestSysctlOverride2860094781/001/mounts/shm: device or resource busy --- FAIL: TestSysctlOverride (0.00s) On Windows: === Failed === FAIL: github.com/docker/docker/daemon TestLoadOrCreateTrustKeyInvalidKeyFile (0.00s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestLoadOrCreateTrustKeyInvalidKeyFile2014634395\001\keyfile4156691647: The process cannot access the file because it is being used by another process. === FAIL: github.com/docker/docker/daemon/graphdriver TestIsEmptyDir (0.01s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestIsEmptyDir1962964337\001\dir-with-empty-file\file2523853824: The process cannot access the file because it is being used by another process. === FAIL: github.com/docker/docker/pkg/directory TestSizeEmptyFile (0.00s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeEmptyFile1562416712\001\file16507846: The process cannot access the file because it is being used by another process. === FAIL: github.com/docker/docker/pkg/directory TestSizeNonemptyFile (0.00s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeNonemptyFile1240832785\001\file3265662846: The process cannot access the file because it is being used by another process. === FAIL: github.com/docker/docker/pkg/directory TestSizeFileAndNestedDirectoryEmpty (0.00s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeFileAndNestedDirectoryEmpty2163416550\001\file3715413181: The process cannot access the file because it is being used by another process. 
=== FAIL: github.com/docker/docker/pkg/directory TestSizeFileAndNestedDirectoryNonempty (0.00s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeFileAndNestedDirectoryNonempty878205470\001\file3280422273: The process cannot access the file because it is being used by another process. === FAIL: github.com/docker/docker/volume/service TestSetGetMeta (0.01s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSetGetMeta3332268057\001\db: The process cannot access the file because it is being used by another process. === FAIL: github.com/docker/docker/volume/service TestList (0.03s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestList2846947953\001\volumes\metadata.db: The process cannot access the file because it is being used by another process. === FAIL: github.com/docker/docker/volume/service TestRestore (0.02s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestRestore3368254142\001\volumes\metadata.db: The process cannot access the file because it is being used by another process. === FAIL: github.com/docker/docker/daemon/graphdriver TestIsEmptyDir (0.00s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestIsEmptyDir2823795693\001\dir-with-empty-file\file2625561089: The process cannot access the file because it is being used by another process. === FAIL: github.com/docker/docker/pkg/directory TestSizeFileAndNestedDirectoryNonempty (0.00s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeFileAndNestedDirectoryNonempty4246252950\001\nested3442260313\file21164327: The process cannot access the file because it is being used by another process. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-05-31 21:53:38 +02:00
Paweł Gronowski	2463c40144	daemon/logger: Fix TestConcurrentLogging race test The recent fix for log corruption changed the signature of the NewLogFile and WriteLogEntry functions and the test wasn't adjusted to this change. Fix the test by adjusting to the new LogFile API. Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>	2022-05-31 14:02:59 +02:00
Paweł Gronowski	d8a731c3aa	daemon/logger: Increase initial buffers size Make the allocated buffers bigger to allow better reusability and avoid frequent reallocations. Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>	2022-05-30 20:50:56 +02:00
Paweł Gronowski	98810847c4	daemon/logger: Put Message back as soon as possible The Message is not needed after it is marshalled, so no need to hold it for the entire function scope. Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>	2022-05-30 20:50:56 +02:00
Paweł Gronowski	8fe2a68698	daemon/logger: Global buffer pools Moved the buffer pools in json-file and local logging drivers to the whole driver scope. It is more efficient to have a pool for the whole driver rather than for each logger instance. Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>	2022-05-30 20:50:56 +02:00
Sebastiaan van Stijn	4e09933aed	Merge pull request #43652 from thaJeztah/update_gotest_tools vendor: gotest.tools v3.2.0	2022-05-30 13:00:29 +02:00
Sebastiaan van Stijn	cb7b329911	daemon: fix daemon.Shutdown, daemon.Cleanup not cleaning up overlay2 mounts While working on deprecation of the `aufs` and `overlay` storage-drivers, the `TestCleanupMounts` had to be updated, as it was using `aufs` for testing. When rewriting the test to use `overlay2` instead (using an updated `mountsFixture`), I found out that the test was failing, and it appears that only `overlay`, but not `overlay2` was taken into account. These cleanup functions were added in `05cc737f54`, but at the time the `overlay2` storage driver was not yet implemented; `05cc737f54/daemon/graphdriver` This omission was likely missed in `23e5c94cfb`, because the original implementation re-used the `overlay` storage driver, but later on it was decided to make `overlay2` a separate storage driver. As a result of the above, `daemon.cleanupMountsByID()` would ignore any `overlay2` mounts during `daemon.Shutdown()` and `daemon.Cleanup()`. This patch: - Adds a new `mountsFixtureOverlay2` with example mounts for `overlay2` - Rewrites the tests to use `gotest.tools` for more informative output on failures. - Adds the missing regex patterns to `daemon/getCleanPatterns()`. The patterns are added at the start of the list to allow for the fastest match (`overlay2` is the default for most setups, and the code is iterating over possible options). As a follow-up, we could consider adding additional fixtures for different storage drivers. 
Before the fix is applied: go test -v -run TestCleanupMounts ./daemon/ === RUN TestCleanupMounts === RUN TestCleanupMounts/aufs === RUN TestCleanupMounts/overlay2 daemon_linux_test.go:135: assertion failed: 0 (unmounted int) != 1 (int): Expected to unmount the shm (and the shm only) --- FAIL: TestCleanupMounts (0.01s) --- PASS: TestCleanupMounts/aufs (0.00s) --- FAIL: TestCleanupMounts/overlay2 (0.01s) === RUN TestCleanupMountsByID === RUN TestCleanupMountsByID/aufs === RUN TestCleanupMountsByID/overlay2 daemon_linux_test.go:171: assertion failed: 0 (unmounted int) != 1 (int): Expected to unmount the root (and that only) --- FAIL: TestCleanupMountsByID (0.00s) --- PASS: TestCleanupMountsByID/aufs (0.00s) --- FAIL: TestCleanupMountsByID/overlay2 (0.00s) FAIL FAIL github.com/docker/docker/daemon 0.054s FAIL With the fix applied: go test -v -run TestCleanupMounts ./daemon/ === RUN TestCleanupMounts === RUN TestCleanupMounts/aufs === RUN TestCleanupMounts/overlay2 --- PASS: TestCleanupMounts (0.00s) --- PASS: TestCleanupMounts/aufs (0.00s) --- PASS: TestCleanupMounts/overlay2 (0.00s) === RUN TestCleanupMountsByID === RUN TestCleanupMountsByID/aufs === RUN TestCleanupMountsByID/overlay2 --- PASS: TestCleanupMountsByID (0.00s) --- PASS: TestCleanupMountsByID/aufs (0.00s) --- PASS: TestCleanupMountsByID/overlay2 (0.00s) PASS ok github.com/docker/docker/daemon 0.042s Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-05-29 16:28:13 +02:00
Sebastiaan van Stijn	467c275b58	Merge pull request #43650 from vvoland/fix-logs-regression daemon/logger: Share buffers by sync.Pool	2022-05-28 14:21:15 +02:00
Sebastiaan van Stijn	a5f6500958	replace deprecated gotest.tools' env.Patch() with t.SetEnv() Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-05-28 12:12:39 +02:00
Paweł Gronowski	7493342926	daemon/logger: Share buffers by sync.Pool Marshalling log messages by json-file and local drivers involved serializing the message into a shared buffer. This caused a regression resulting in log corruption with recent changes where Log may be called from multiple goroutines at the same time. Solution is to use a sync.Pool to manage the buffers used for the serialization. Also removed the MarshalFunc, which the driver had to expose to the LogFile so that it can marshal the message. This is now moved entirely to the driver. Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>	2022-05-27 16:44:06 +02:00
Sebastiaan van Stijn	c6cc03747d	daemon/images: use gotest.tools for tests, and use sub-tests Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-05-27 15:36:14 +02:00
Sebastiaan van Stijn	c3d7a0c603	Fix validation of IpcMode, PidMode, UTSMode, CgroupnsMode These HostConfig properties were not validated until the OCI spec for the container was created, which meant that `container run` and `docker create` would accept invalid values, and the invalid value would not be detected until `start` was called, returning a 500 "internal server error", as well as errors from containerd ("cleanup: failed to delete container from containerd: no such container") in the daemon logs. As a result, a faulty container was created, and the container state remained in the `created` state. This patch: - Updates `oci.WithNamespaces()` to return the correct `errdefs.InvalidParameter` - Updates `verifyPlatformContainerSettings()` to validate these settings, so that an error is returned when _creating_ the container. Before this patch: docker run -dit --ipc=shared --name foo busybox 2a00d74e9fbb7960c4718def8f6c74fa8ee754030eeb93ee26a516e27d4d029f docker: Error response from daemon: Invalid IPC mode: shared. docker ps -a --filter name=foo CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 2a00d74e9fbb busybox "sh" About a minute ago Created foo After this patch: docker run -dit --ipc=shared --name foo busybox docker: Error response from daemon: invalid IPC mode: shared. docker ps -a --filter name=foo CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES An integration test was added to verify the new validation, which can be run with: make BIND_DIR=. TEST_FILTER=TestCreateInvalidHostConfig DOCKER_GRAPHDRIVER=vfs test-integration Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-05-25 17:41:51 +02:00
Sebastiaan van Stijn	d633169483	Merge pull request #43484 from ndeloof/create_host_path introduce CreateMountpoint for parity between binds and mounts	2022-05-19 23:06:01 +02:00
Cory Snider	a67e159909	daemon/logger: hold LogFile lock less on ReadLogs Reduce the amount of time ReadLogs holds the LogFile fsop lock by releasing it as soon as all the files are opened, before parsing the compressed file headers. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:23:18 -04:00
Cory Snider	01915a725e	daemon/logger: follow LogFile without file watches File watches have been a source of complexity and unreliability in the LogFile follow implementation, especially when combined with file rotation. File change events can be unreliably delivered, especially on Windows, and the polling fallback adds latency. Following across rotations has never worked reliably on Windows. Without synchronization between the log writer and readers, race conditions abound: readers can read from the file while a log entry is only partially written, leading to decode errors and necessitating retries. In addition to the complexities stemming from file watches, the LogFile follow implementation had complexity from needing to handle file truncations, and (due to a now-fixed bug in the polling file watcher implementation) evictions to unlock the log file so it could be rotated. Log files are now always rotated, never truncated, so these situations no longer need to be handled by the follow code. Rewrite the LogFile follow implementation in terms of waiting until LogFile notifies it that a new message has been written to the log file. The LogFile informs the follower of the file offset of the last complete write so that the follower knows not to read past that, preventing it from attempting to decode partial messages and making retries unnecessary. Synchronization between LogFile and its followers is used at critical points to prevent missed notifications of writes and races between file rotations and the follower opening files for read. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:22:22 -04:00
Cory Snider	6d5bc07189	daemon/logger: fix refcounting decompressed files The refCounter used for sharing temporary decompressed log files and tracking when the files can be deleted is keyed off the source file's path. But the path of a log file is not stable: it is renamed on each rotation. Consequently, when logging is configured with both rotation and compression, multiple concurrent readers of a container's logs could read logs out of order, see duplicates or decompress a log file which has already been decompressed. Replace refCounter with a new implementation, sharedTempFileConverter, which is agnostic to the file path, keying off the source file's identity instead. Additionally, sharedTempFileConverter handles the full lifecycle of the temporary file, from creation to deletion. This is all abstracted from the consumer: all the bookkeeping and cleanup is handled behind the scenes when Close() is called on the returned reader value. Only one file descriptor is used per temporary file, which is shared by all readers. A channel is used for concurrency control so that the lock can be acquired inside a select statement. While not currently utilized, this makes it possible to add support for cancellation to sharedTempFileConverter in the future. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:22:22 -04:00
Cory Snider	49aa66b597	daemon/logger: rotate log files, never truncate Truncating the current log file while a reader is still reading through it results in log lines getting missed. In contrast, rotating the file allows readers who have the file open to continue reading from it undisturbed. Rotating frees up the file name for the logger to create a new file in its place. This remains true even when max-file=1; the current log file is "rotated" from its name without giving it a new one. On POSIXy filesystem APIs, rotating the last file is straightforward: unlink()ing a file name immediately deletes the name from the filesystem and makes it available for reuse, even if processes have the file open at the time. Windows on the other hand only makes the name available for reuse once the file itself is deleted, which only happens when no processes have it open. To reuse the file name while the file is still in use, the file needs to be renamed. So that's what we have to do: rotate the file to a temporary name before marking it for deletion. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:22:22 -04:00
Cory Snider	990b0e28ba	daemon/logger/local: fix appending newlines The json-file driver appends a newline character to log messages with PLogMetaData.Last set, but the local driver did not. Alter the behavior of the local driver to match that of the json-file driver. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:22:22 -04:00
Cory Snider	3844d1a3d1	daemon/logger: drain readers when logger is closed The LogFile follower would stop immediately upon the producer closing. The close signal would race the file watcher; if a message were to be logged and the logger immediately closed, the follower could miss that last message if the close signal (formerly ProducerGone) was to win the race. Add logic to perform one more round of reading when the producer is closed to catch up on any final logs. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:22:22 -04:00
Cory Snider	906b979b88	daemon/logger: remove ProducerGone from LogWatcher Whether or not the logger has been closed is a property of the logger, and only of concern to its log reading implementation, not log watchers. The loggers and their reader implementations can communicate as they see fit. A single channel per logger which is closed when the logger is closed is plenty sufficient to broadcast the state to log readers, with no extra bookkeeping or synchronization required. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:22:22 -04:00
Cory Snider	ae5f664f4e	daemon/logger: open log reader synchronously The asynchronous startup of the log-reading goroutine made the follow-tail tests nondeterministic. The Log calls in the tests which were supposed to happen after the reader started reading would sometimes execute before the reader, throwing off the counts. Tweak the ReadLogs implementation so that the order of operations is deterministic. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:22:22 -04:00
Cory Snider	9aa9d6fafc	daemon/logger: add test suite for LogReaders Add an extensive test suite for validating the behavior of any LogReader. Test the current LogFile-based implementations against it. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:22:21 -04:00
Cory Snider	961d32868c	daemon/logger: improve jsonfilelog read benchmark The jsonfilelog read benchmark was incorrectly reusing the same message pointer in the producer loop. The message value would be reset after the first call to jsonlogger.Log, resulting in all subsequent calls logging a zero-valued message. This is not a representative workload for benchmarking and throws off the throughput metric. Reduce variation between benchmark runs by using a constant timestamp. Write to the producer goroutine's error channel only on a non-nil error to eliminate spurious synchronization between producer and consumer goroutines external to the logger being benchmarked. Signed-off-by: Cory Snider <csnider@mirantis.com>	2022-05-19 15:22:21 -04:00
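
Commit 7493342926 above (`daemon/logger: Share buffers by sync.Pool`) describes replacing a shared serialization buffer with buffers managed by a `sync.Pool`, so that concurrent `Log` calls no longer corrupt each other's output. The Go sketch below illustrates that general pattern only. The `message` type, `bufPool`, and `marshal` are hypothetical names chosen for illustration; they are not the daemon's actual types or functions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sync"
)

// message is a hypothetical stand-in for a log entry (assumption: not the
// daemon's real message type).
type message struct {
	Line   string `json:"log"`
	Source string `json:"stream"`
}

// bufPool hands out reusable buffers. Each caller gets its own buffer for the
// duration of one marshal call, so goroutines never write into the same one.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// marshal serializes msg into a pooled buffer and returns a private copy of
// the encoded bytes.
func marshal(msg message) ([]byte, error) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	if err := json.NewEncoder(buf).Encode(msg); err != nil {
		return nil, err
	}
	// Copy the bytes out before the buffer is returned to the pool; the
	// buffer's backing array may be reused by another goroutine afterwards.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out, nil
}

func main() {
	var wg sync.WaitGroup
	results := make([]string, 8)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			b, err := marshal(message{Line: fmt.Sprintf("line %d", i), Source: "stdout"})
			if err != nil {
				panic(err)
			}
			results[i] = string(b)
		}(i)
	}
	wg.Wait()
	for _, r := range results {
		fmt.Print(r) // each entry is one intact JSON line
	}
}
```

A package-level pool, as in this sketch, corresponds to the driver-scoped pools mentioned in commit 8fe2a68698; whether the real drivers copy the bytes out or write them directly while holding the buffer is not stated in the log.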


7086 commits
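
Commit a9081299dd (`logger/journald: fix SA4011: ineffective break statement`) concerns a `break` that was meant to leave a loop but did not. As a minimal illustration of that class of bug in Go, the sketch below contrasts an unlabeled `break` inside a `switch` (which exits only the `switch`) with a labeled `break` (which exits the enclosing loop). The function names and inputs are hypothetical and unrelated to the actual code in `daemon/logger/journald/read.go`.

```go
package main

import "fmt"

// countWithPlainBreak intends to stop at "stop", but the unlabeled break only
// leaves the switch, so the loop keeps running. staticcheck reports this
// pattern as SA4011.
func countWithPlainBreak(items []string) int {
	n := 0
	for _, it := range items {
		switch it {
		case "stop":
			break // exits the switch only, not the for loop
		}
		n++
	}
	return n
}

// countWithLabeledBreak uses a label so that break exits the for loop itself.
func countWithLabeledBreak(items []string) int {
	n := 0
loop:
	for _, it := range items {
		switch it {
		case "stop":
			break loop // exits the labeled for loop
		}
		n++
	}
	return n
}

func main() {
	items := []string{"a", "stop", "b"}
	fmt.Println(countWithPlainBreak(items))   // prints 3
	fmt.Println(countWithLabeledBreak(items)) // prints 1
}
```

The same rule applies to `break` inside a `select` statement nested in a `for` loop.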