8-bit PCM samples are unsigned, at least in WAV, so after rescaling them
to the correct range we also need to center them around 0. This fix
should make 8-bit WAVs have the correct volume of double of what it was
before, and also future-proof for all other unsigned PCM sample formats
we may encounter.
This completely removes WavLoader's dependency on LegacyBuffer: We
directly create the result sample container and write into it. I took
this opportunity to rewrite most of the sample reading functions as a
single templated function, which combined with the better error handling
makes this "ported" code super concise.
This makes the code much more readable and concise, reduces the size of
the WavLoader class itself, moves almost all fallible initialization out
of the constructor and should provide better error handling in general.
Also, a lot of now-unnecessary imports are removed.
* All clang-tidy warnings fixed except read_header cognitive complexity
* Use size_t in more places
* Replace #define's with constexpr constants
* Some variable renaming for readability
Previously, we were sending Buffers to the server whenever we had new
audio data for it. This meant that for every audio enqueue action, we
needed to create a new shared memory anonymous buffer, send that
buffer's file descriptor over IPC (+recfd on the other side) and then
map the buffer into the audio server's memory to be able to play it.
This was fine for sending large chunks of audio data, like when playing
existing audio files. However, in the future we want to move to
real-time audio in some applications like Piano. This means that the
size of buffers that are sent need to be very small, as just the size of
a buffer itself is part of the audio latency. If we were to try
real-time audio with the existing system, we would run into problems
really quickly. Dealing with a continuous stream of new anonymous files
like the current audio system is rather expensive, as we need Kernel
help in multiple places. Additionally, every enqueue incurs an IPC call,
which are not optimized for >1000 calls/second (which would be needed
for real-time audio with buffer sizes of ~40 samples). So a fundamental
change in how we handle audio sending in userspace is necessary.
This commit moves the audio sending system onto a shared single producer
circular queue (SSPCQ) (introduced with one of the previous commits).
This queue is intended to live in shared memory and be accessed by
multiple processes at the same time. It was specifically written to
support the audio sending case, so e.g. it only supports a single
producer (the audio client). Now, audio sending follows these general
steps:
- The audio client connects to the audio server.
- The audio client creates a SSPCQ in shared memory.
- The audio client sends the SSPCQ's file descriptor to the audio server
with the set_buffer() IPC call.
- The audio server receives the SSPCQ and maps it.
- The audio client signals start of playback with start_playback().
- At the same time:
- The audio client writes its audio data into the shared-memory queue.
- The audio server reads audio data from the shared-memory queue(s).
Both sides have additional before-queue/after-queue buffers, depending
on the exact application.
- Pausing playback is just an IPC call, nothing happens to the buffer
except that the server stops reading from it until playback is
resumed.
- Muting has nothing to do with whether audio data is read or not.
- When the connection closes, the queues are unmapped on both sides.
This should already improve audio playback performance in a bunch of
places.
Implementation & commit notes:
- Audio loaders don't create LegacyBuffers anymore. LegacyBuffer is kept
for WavLoader, see previous commit message.
- Most intra-process audio data passing is done with FixedArray<Sample>
or Vector<Sample>.
- Improvements to most audio-enqueuing applications. (If necessary I can
try to extract some of the aplay improvements.)
- New APIs on LibAudio/ClientConnection which allows non-realtime
applications to enqueue audio in big chunks like before.
- Removal of status APIs from the audio server connection for
information that can be directly obtained from the shared queue.
- Split the pause playback API into two APIs with more intuitive names.
I know this is a large commit, and you can kinda tell from the commit
message. It's basically impossible to break this up without hacks, so
please forgive me. These are some of the best changes to the audio
subsystem and I hope that that makes up for this :yaktangle: commit.
:yakring:
With the following change in how we send audio, the old Buffer type is
not really needed anymore. However, moving WavLoader to the new system
is a bit more involved and out of the scope of this PR. Therefore, we
need to keep Buffer around, but to make it clear that it's the old
buffer type which will be removed soon, we rename it to LegacyBuffer.
Most of the users will be gone after the next commit anyways.
Apologies for the enormous commit, but I don't see a way to split this
up nicely. In the vast majority of cases it's a simple change. A few
extra places can use TRY instead of manual error checking though. :^)
Previously, a libc-like out-of-line error information was used in the
loader and its plugins. Now, all functions that may fail to do their job
return some sort of Result. The universally-used error type ist the new
LoaderError, which can contain information about the general error
category (such as file format, I/O, unimplemented features), an error
description, and location information, such as file index or sample
index.
Additionally, the loader plugins try to do as little work as possible in
their constructors. Right after being constructed, a user should call
initialize() and check the errors returned from there. (This is done
transparently by Loader itself.) If a constructor caused an error, the
call to initialize should check and return it immediately.
This opportunity was used to rework a lot of the internal error
propagation in both loader classes, especially FlacLoader. Therefore, a
couple of other refactorings may have sneaked in as well.
The adoption of LibAudio users is minimal. Piano's adoption is not
important, as the code will receive major refactoring in the near future
anyways. SoundPlayer's adoption is also less important, as changes to
refactor it are in the works as well. aplay's adoption is the best and
may serve as an example for other users. It also includes new buffering
behavior.
Buffer also gets some attention, making it OOM-safe and thereby also
propagating its errors to the user.
All audio applications (aplay, Piano, Sound Player) respect the ability
of the system to have theoretically any sample rate. Therefore, they
resample their own audio into the system sample rate.
LibAudio previously had its loaders resample their own audio, even
though they expose their sample rate. This is now changed. The loaders
output audio data in their file's sample rate, which the user has to
query and resample appropriately. Resampling code from Buffer, WavLoader
and FlacLoader is removed.
Note that these applications only check the sample rate at startup,
which is reasonable (the user has to restart applications when changing
the sample rate). Fully dynamic adaptation could both lead to errors and
will require another IPC interface. This seems to be enough for now.
Previously, ResampleHelper was fixed on handling double's, which makes
it unsuitable for the upcoming FLAC loader that needs to resample
integers. For this reason, ResampleHelper is templated to support
theoretically any type of sample, though only the necessary i32 and
double are templated right now.
The ResampleHelper implementations are moved from WavLoader.cpp to
Buffer.cpp.
This also improves some imports in the WavLoader files.
This commit addresses two issues:
1. If you play a 96 KHz Wave file, the slider position is incorrect,
because it is assumed all files are 44.1 KHz.
2. For high-bitrate files, there are audio dropouts due to not
buffering enough audio data.
Issue 1 is addressed by scaling the number of played samples by the
ratio between the source and destination sample rates.
Issue 2 is addressed by buffering a certain number of milliseconds
worth of audio data (instead of a fixed number of bytes).
This makes the the buffer size independent of the source sample rate.
Some of the code is redesigned to be simpler. The code that did the
book-keeping of which buffers need to be loaded and which have been
already played has been removed. Instead, we enqueue a new buffer based
on a low watermark of samples remaining in the audio server queue.
Other small fixes include:
1. Disable the stop button when playback is finished.
2. Remove hard-coded instances of 44100.
3. Update the GUI every 50 ms (was 100), which improves visualizations.
Prior code in `WavLoader::get_more_samples()` would attempt to
read the requested number of samples without actually checking
whether that many samples were remaining in the stream.
This was the cause of an audible pop at the end of a track, due
to reading non-audio data that is sometimes at the end of a Wave file.
Now we only attempt to read up to the end of sample data, but no
further.
Also, added comments to clarify the meaning of "sample", and how it
should be independent of the number of channels.
This enables support for playing float32 and float64
WAVE_FORMAT_EXTENSIBLE files.
The PCM data format is encoded in the
first two bytes of the SubFormat GUID inside of the
WAVE_FORMAT_EXTENSIBLE `fmt` chunk.
Also, fixed the RIFF header size check to allow up to
maximum_wav_size (currently defined as 1 GiB).
The RIFF header size is the size of the entire
file, so it should be checked against the largest Wave size.
This fixes a bug where if you try to play a Wave file a second
time (or loop with `aplay -l`), the second time will be pure
noise.
The function `Audio::Loader::seek` is meant to seek to a specific
audio sample, e.g. seek(0) should go to the first audio sample.
However, WavLoader was interpreting seek(0) as the beginning
of the file or stream, which contains non-audio header data.
This fixes the bug by capturing the byte offset of the start of the
audio data, and offseting the raw file/stream seek by that amount.
When samples are requested in `Audio::Loader::get_more_samples`,
the request comes in as a max number of bytes to read.
However, the requested number of bytes may not be an even multiple
of the bytes per sample of the loaded file. If this is the case, and
the bytes are read from the file/stream, then
the last sample will be a partial/runt sample, which then offsets
the remainder of the stream, causing white noise in playback.
This bug was discovered when trying to play 24-bit Wave files, which
happened to have a sample size that never aligned with the number
of requested bytes.
This commit fixes the bug by only reading a multiple of
"bytes per sample" for the loaded file.
IODeviceStreamReader isn't pulling its weight.
It's essentially a subset of InputFileStream with only one user
(WavLoader).
This refactors WavLoader to use InputFileStream instead.
LibAudio's WavLoader plugin for loading WAV files now supports loading
audio files with 32-bit float or 64-bit float samples.
By supporting these new non-int sample formats, Audio::Buffer now stores
the sample format (out of a list of supported formats) instead of the
raw bit depth. (The bit depth is easily calculated with
pcm_bits_per_sample)
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
It was using has_any_error, which causes an assertion failure when
destroying the stream. Instead, use handle_any_error, as the
WAV loader does handle errors.
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.