Previously, we were sending Buffers to the server whenever we had new
audio data for it. This meant that for every audio enqueue action, we
needed to create a new shared memory anonymous buffer, send that
buffer's file descriptor over IPC (+recfd on the other side) and then
map the buffer into the audio server's memory to be able to play it.
This was fine for sending large chunks of audio data, like when playing
existing audio files. However, in the future we want to move to
real-time audio in some applications like Piano. This means that the
size of buffers that are sent need to be very small, as just the size of
a buffer itself is part of the audio latency. If we were to try
real-time audio with the existing system, we would run into problems
really quickly. Dealing with a continuous stream of new anonymous files
like the current audio system is rather expensive, as we need Kernel
help in multiple places. Additionally, every enqueue incurs an IPC call,
which are not optimized for >1000 calls/second (which would be needed
for real-time audio with buffer sizes of ~40 samples). So a fundamental
change in how we handle audio sending in userspace is necessary.
This commit moves the audio sending system onto a shared single producer
circular queue (SSPCQ) (introduced with one of the previous commits).
This queue is intended to live in shared memory and be accessed by
multiple processes at the same time. It was specifically written to
support the audio sending case, so e.g. it only supports a single
producer (the audio client). Now, audio sending follows these general
steps:
- The audio client connects to the audio server.
- The audio client creates a SSPCQ in shared memory.
- The audio client sends the SSPCQ's file descriptor to the audio server
with the set_buffer() IPC call.
- The audio server receives the SSPCQ and maps it.
- The audio client signals start of playback with start_playback().
- At the same time:
- The audio client writes its audio data into the shared-memory queue.
- The audio server reads audio data from the shared-memory queue(s).
Both sides have additional before-queue/after-queue buffers, depending
on the exact application.
- Pausing playback is just an IPC call, nothing happens to the buffer
except that the server stops reading from it until playback is
resumed.
- Muting has nothing to do with whether audio data is read or not.
- When the connection closes, the queues are unmapped on both sides.
This should already improve audio playback performance in a bunch of
places.
Implementation & commit notes:
- Audio loaders don't create LegacyBuffers anymore. LegacyBuffer is kept
for WavLoader, see previous commit message.
- Most intra-process audio data passing is done with FixedArray<Sample>
or Vector<Sample>.
- Improvements to most audio-enqueuing applications. (If necessary I can
try to extract some of the aplay improvements.)
- New APIs on LibAudio/ClientConnection which allows non-realtime
applications to enqueue audio in big chunks like before.
- Removal of status APIs from the audio server connection for
information that can be directly obtained from the shared queue.
- Split the pause playback API into two APIs with more intuitive names.
I know this is a large commit, and you can kinda tell from the commit
message. It's basically impossible to break this up without hacks, so
please forgive me. These are some of the best changes to the audio
subsystem and I hope that that makes up for this :yaktangle: commit.
:yakring:
With the following change in how we send audio, the old Buffer type is
not really needed anymore. However, moving WavLoader to the new system
is a bit more involved and out of the scope of this PR. Therefore, we
need to keep Buffer around, but to make it clear that it's the old
buffer type which will be removed soon, we rename it to LegacyBuffer.
Most of the users will be gone after the next commit anyways.
Of course, Buffer is going to be removed very soon, but much of the
WavLoader behavior still depends on it. Therefore, this intermediary
API will allow adopting the Loader infrastructure without digging too
deep into the WavLoader legacy code. That's for later :^)
The Buffer files had contained both the ResampleHelper and the
sample format utilities. Because the Buffer class (and its file) is
going to be deleted soon, this commit separates those two things into
their own files.
The old FIXME asserting that Core::AnonymousBuffer cannot be invalid
or zero-sized is no longer accurate. Add a default constructor for
Audio::Buffer that has all invalid state instead of going to the OS to
allocate a 1 sample buffer for the "no more samples" states in the WAV
and FLAC plugins.
FixedArray now doesn't expose any infallible constructors anymore.
Rather, it exposes fallible methods. Therefore, it can be used for
OOM-safe code.
This commit also converts the rest of the system to use the new API.
However, as an example, VMObject can't take advantage of this yet,
as we would have to endow VMObject with a fallible static
construction method, which would require a very fundamental change
to VMObject's whole inheritance hierarchy.
Previously, a libc-like out-of-line error information was used in the
loader and its plugins. Now, all functions that may fail to do their job
return some sort of Result. The universally-used error type ist the new
LoaderError, which can contain information about the general error
category (such as file format, I/O, unimplemented features), an error
description, and location information, such as file index or sample
index.
Additionally, the loader plugins try to do as little work as possible in
their constructors. Right after being constructed, a user should call
initialize() and check the errors returned from there. (This is done
transparently by Loader itself.) If a constructor caused an error, the
call to initialize should check and return it immediately.
This opportunity was used to rework a lot of the internal error
propagation in both loader classes, especially FlacLoader. Therefore, a
couple of other refactorings may have sneaked in as well.
The adoption of LibAudio users is minimal. Piano's adoption is not
important, as the code will receive major refactoring in the near future
anyways. SoundPlayer's adoption is also less important, as changes to
refactor it are in the works as well. aplay's adoption is the best and
may serve as an example for other users. It also includes new buffering
behavior.
Buffer also gets some attention, making it OOM-safe and thereby also
propagating its errors to the user.
This consists of two changes: First, a utility function create_empty
allows the user to quickly create an empty buffer. Second, most creation
functions now return a NonnullRefPtr, as their failure causes a VERIFY
crash anyways.
"Frame" is an MPEG term, which is not only unintuitive but also
overloaded with different meaning by other codecs (e.g. FLAC).
Therefore, use the standard term Sample for the central audio structure.
The class is also extracted to its own file, because it's becoming quite
large. Bundling these two changes means not distributing similar
modifications (changing names and paths) across commits.
Co-authored-by: kleines Filmröllchen <malu.bertsch@gmail.com>
The conversion from a linear scale (how we think about audio) to a
logarithmic scale (how audio actually works) will be useful for other
operations, so let's extract it to its own utility function. Its inverse
will also allow reversible operations to be written more easily.
Across the entire audio system, audio now works in 0-1 terms instead of
0-100 as before. Therefore, volume is now a double instead of an int.
The master volume of the AudioServer changes smoothly through a
FadingProperty, preventing clicks. Finally, volume computations are done
with logarithmic scaling, which is more natural for the human ear.
Note that this could be 4-5 different commits, but as they change each
other's code all the time, it makes no sense to split them up.
All audio applications (aplay, Piano, Sound Player) respect the ability
of the system to have theoretically any sample rate. Therefore, they
resample their own audio into the system sample rate.
LibAudio previously had its loaders resample their own audio, even
though they expose their sample rate. This is now changed. The loaders
output audio data in their file's sample rate, which the user has to
query and resample appropriately. Resampling code from Buffer, WavLoader
and FlacLoader is removed.
Note that these applications only check the sample rate at startup,
which is reasonable (the user has to restart applications when changing
the sample rate). Fully dynamic adaptation could both lead to errors and
will require another IPC interface. This seems to be enough for now.
Floating-point ratios are inherently imprecise, and can lead to
unpredictable or nondeterministic behavior when resampling and expecting
a certain number of resulting samples. Therefore, the resampler now uses
integer ratios, with almost identical but fully predictable behavior.
This also introduces the reset() function that the FLAC loader will use
in the future.
Previously, ResampleHelper was fixed on handling double's, which makes
it unsuitable for the upcoming FLAC loader that needs to resample
integers. For this reason, ResampleHelper is templated to support
theoretically any type of sample, though only the necessary i32 and
double are templated right now.
The ResampleHelper implementations are moved from WavLoader.cpp to
Buffer.cpp.
This also improves some imports in the WavLoader files.
LibAudio's WavLoader plugin for loading WAV files now supports loading
audio files with 32-bit float or 64-bit float samples.
By supporting these new non-int sample formats, Audio::Buffer now stores
the sample format (out of a list of supported formats) instead of the
raw bit depth. (The bit depth is easily calculated with
pcm_bits_per_sample)
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
Because it's what it really is. A frame is composed of 1 or more samples, in
the case of SerenityOS 2 (stereo). This will make it less confusing for
future mantainability.