These were intentionally set up to be at the end of the granule size,
but since the stereo intensity loop is intentionally using a <= end
comparison (that’s how the scale factor bands work), we must skip these
dummy bands which would otherwise cause an out-of-bounds index.
We can just create a `BigEndianInputBitStream` where it is needed
instead of storing one in the class. After making `read_header()`
static, the only other user of the field was `read_size_information()`,
so let's do that there and then remove the field.
This makes the checks for a frame header more consistent, so if the
conditions for allowed frame headers change, there are less scattered
lines that will need to be changed.
`synchronize()` will now also properly scan the second byte of the hex
sequence `FF FF F0` as a sync code, where previously it would see
`FF F` and skip on to `F0`, ignoring its preceding `FF` that would
indicate that it is a sync code.
`synchronize()` can be simplified greatly by checking whole bytes with
bitwise operations, and doing so also avoids the overhead of reading
individual bits from a bitstream.
Making `read_frame()` also take a `SeekableStream` will allow it to be
used inside `synchronize()` in the next commit.
Prevously, the header size was used to calculate the `slot_count` field
of `MP3::Header`, but `build_seek_table()` just used the maximum size
of the header instead, causing it not to seek far enough, and in cases
where a possible sync code occurred two bytes before the next frame, it
would read that possible sync code as if it was a real frame. It would
then either reject it due to bad field values, or could possibly skip
over the next real frame due to a larger calculated frame size in the
bogus frame.
By fixing this issue, we now properly calculate the duration of MP3
files where these fake sync codes occur. In the case of the raw file
for this podcast:
https://changelog.com/podcast/554
the duration goes from 1:21:57 to 1:22:47, which is the real duration
according to the player user interface.
The seek table must locate the first MP3 frame in the file, so it makes
sense to locate the samples for the sample table first, then that
information to seek to the first frame.
Previously, we would just start from byte 0 and check individual bytes
of the file until we find two bytes starting with `FF F`, and then
assume that that was the MP3 frame sync code. However, some ID3v2 tags
do not have to be what is referred to as "unsynchronized", meaning that
they can contain that `FF F` sequence and cause our decoder to think it
has found a frame.
To avoid this happening, we can read a minimal amount of the ID3 header
to determine how many bytes to skip before attempting to find the MP3
frames.
This allows the recent podcast with Andreas to play here:
https://changelog.com/podcast/554
Seek points were being created after adding to the sample count in
`build_seek_table()`, meaning that they would be offset forward by
`MP3::frame_size` samples.
This also allows us to remove the hardcoded sample 0 seek point that
was previously added, since a seek point at sample 0 will now be added
by the loop.
A previous commit made it so that SeekTable doesn't provide a seek
point from `seek_point_before()` if there is not a seek point before
the requested sample index. However, MP3Loader was only setting a seek
point after the first 10 frames, meaning that it would do nothing when
seeking back to 0.
To fix this, add a seek point at byte 0 for the first sample, so that
`seek_point_before()` will never fail.
It's no longer needed now that this code uses ErrorOr instead of Result.
Ran:
rg -lw LOADER_TRY Userland/Libraries/LibAudio \
| xargs sed -i '' 's/LOADER_TRY/TRY/g'
...and then manually fixed up Userland/Libraries/LibAudio/LoaderError.h
to not redefine TRY but instead remove the now-unused LOADER_TRY,
and ran clang-format.
There are at most 576 granule samples/frequency lines, but the side data
can specify that the big_values granule type take up to 1024 samples.
The spec says in 2.4.3.4.6 that we should always stop reading Huffman
data once we have 576 samples, so that is what this change does. I also
add some useful spec comments while I'm here.
This removes a lot of duplicated stream creation code from the plugins,
and also simplifies the way that the appropriate plugin is found. This
mirrors the ImageDecoderPlugin design and necessitates new sniffing
methods on the loaders.
Similar to POSIX read, the basic read and write functions of AK::Stream
do not have a lower limit of how much data they read or write (apart
from "none at all").
Rename the functions to "read some [data]" and "write some [data]" (with
"data" being omitted, since everything here is reading and writing data)
to make them sufficiently distinct from the functions that ensure to
use the entire buffer (which should be the go-to function for most
usages).
No functional changes, just a lot of new FIXMEs.
Before, some loader plugins implemented their own buffering (FLAC&MP3),
some didn't require any (WAV), and some didn't buffer at all (QOA). This
meant that in practice, while you could load arbitrary amounts of
samples from some loader plugins, you couldn't do that with some others.
Also, it was ill-defined how many samples you would actually get back
from a get_more_samples call.
This commit fixes that by introducing a layer of abstraction between the
loader and its plugins (because that's the whole point of having the
extra class!). The plugins now only implement a load_chunks() function,
which is much simpler to implement and allows plugins to play fast and
loose with what they actually return. Basically, they can return many
chunks of samples, where one chunk is simply a convenient block of
samples to load. In fact, some loaders such as FLAC and QOA have
separate internal functions for loading exactly one chunk. The loaders
*should* load as many chunks as necessary for the sample count to be
reached or surpassed (the latter simplifies loading loops in the
implementations, since you don't need to know how large your next chunk
is going to be; a problem for e.g. FLAC). If a plugin has no problems
returning data of arbitrary size (currently WAV), it can return a single
chunk that exactly (or roughly) matches the requested sample count. If a
plugin is at the stream end, it can also return less samples than was
requested! The loader can handle all of these cases and may call into
load_chunk multiple times. If the plugin returns an empty chunk list (or
only empty chunks; again, they can play fast and loose), the loader
takes that as a stream end signal. Otherwise, the loader will always
return exactly as many samples as the user requested. Buffering is
handled by the loader, allowing any underlying plugin to deal with any
weird sample count requirement the user throws at it (looking at you,
SoundPlayer!).
This (not accidentally!) makes QOA work in SoundPlayer.
`Stream` will be qualified as `AK::Stream` until we remove the
`Core::Stream` namespace. `IODevice` now reuses the `SeekMode` that is
defined by `SeekableStream`, since defining its own would require us to
qualify it with `AK::SeekMode` everywhere.
Especially if buffered streams are involved, not filling the span
completely can also mean that we just ran out of filled buffer space and
we need to refill it on the beginning of the next read call.
This is to differentiate between the upcoming `AllocatingMemoryStream`,
which automatically allocates memory as needed instead of operating on a
static memory area.
This allows us to either pass a reference, which keeps compatibility
with old code, or to pass a NonnullOwnPtr, which allows us to
comfortably chain streams as usual.
This doesn't have any immediate uses, but this adapts the code a bit
more to `Core::Stream` conventions (as most functions there use
NonnullOwnPtr to handle streams) and it makes it a bit clearer that this
pointer isn't actually supposed to be null. In fact, MP3LoaderPlugin
and FlacLoaderPlugin apparently forgot to check for that completely
before starting to decode data.
This now prepares all the needed (fallible) components before actually
constructing a LoaderPlugin object, so we are no longer filling them in
at an arbitrary later point in time.
This has been overkill from the start, and it has been bugging me for a
long time. With this change, we're probably a bit slower on most
platforms but save huge amounts of space with all in-memory sample
datastructures.
Previously, we were sending Buffers to the server whenever we had new
audio data for it. This meant that for every audio enqueue action, we
needed to create a new shared memory anonymous buffer, send that
buffer's file descriptor over IPC (+recfd on the other side) and then
map the buffer into the audio server's memory to be able to play it.
This was fine for sending large chunks of audio data, like when playing
existing audio files. However, in the future we want to move to
real-time audio in some applications like Piano. This means that the
size of buffers that are sent need to be very small, as just the size of
a buffer itself is part of the audio latency. If we were to try
real-time audio with the existing system, we would run into problems
really quickly. Dealing with a continuous stream of new anonymous files
like the current audio system is rather expensive, as we need Kernel
help in multiple places. Additionally, every enqueue incurs an IPC call,
which are not optimized for >1000 calls/second (which would be needed
for real-time audio with buffer sizes of ~40 samples). So a fundamental
change in how we handle audio sending in userspace is necessary.
This commit moves the audio sending system onto a shared single producer
circular queue (SSPCQ) (introduced with one of the previous commits).
This queue is intended to live in shared memory and be accessed by
multiple processes at the same time. It was specifically written to
support the audio sending case, so e.g. it only supports a single
producer (the audio client). Now, audio sending follows these general
steps:
- The audio client connects to the audio server.
- The audio client creates a SSPCQ in shared memory.
- The audio client sends the SSPCQ's file descriptor to the audio server
with the set_buffer() IPC call.
- The audio server receives the SSPCQ and maps it.
- The audio client signals start of playback with start_playback().
- At the same time:
- The audio client writes its audio data into the shared-memory queue.
- The audio server reads audio data from the shared-memory queue(s).
Both sides have additional before-queue/after-queue buffers, depending
on the exact application.
- Pausing playback is just an IPC call, nothing happens to the buffer
except that the server stops reading from it until playback is
resumed.
- Muting has nothing to do with whether audio data is read or not.
- When the connection closes, the queues are unmapped on both sides.
This should already improve audio playback performance in a bunch of
places.
Implementation & commit notes:
- Audio loaders don't create LegacyBuffers anymore. LegacyBuffer is kept
for WavLoader, see previous commit message.
- Most intra-process audio data passing is done with FixedArray<Sample>
or Vector<Sample>.
- Improvements to most audio-enqueuing applications. (If necessary I can
try to extract some of the aplay improvements.)
- New APIs on LibAudio/ClientConnection which allows non-realtime
applications to enqueue audio in big chunks like before.
- Removal of status APIs from the audio server connection for
information that can be directly obtained from the shared queue.
- Split the pause playback API into two APIs with more intuitive names.
I know this is a large commit, and you can kinda tell from the commit
message. It's basically impossible to break this up without hacks, so
please forgive me. These are some of the best changes to the audio
subsystem and I hope that that makes up for this :yaktangle: commit.
:yakring:
With the following change in how we send audio, the old Buffer type is
not really needed anymore. However, moving WavLoader to the new system
is a bit more involved and out of the scope of this PR. Therefore, we
need to keep Buffer around, but to make it clear that it's the old
buffer type which will be removed soon, we rename it to LegacyBuffer.
Most of the users will be gone after the next commit anyways.