This moves both Gfx::CanonicalCode::write_symbol() and
Compress::CanonicalCode::write_symbol() inline.
It also adds `__attribute__((always_inline))` on the arguments
to visit() in the latter. (ALWAYS_INLINE doesn't work on lambdas.)
Numbers with `ministat`: I ran once:
Build/lagom/bin/image -o test.bmp Base/res/wallpapers/sunset-retro.png
and then ran to bench:
~/src/hack/bench.py -n 20 -o bench_foo1.txt \
Build/lagom/bin/image -o test.webp test.bmp
...and then `ministat bench_foo1.txt bench_foo2.txt` to compare.
The previous commit increased the time for this command by 38% compared
to the before state.
With this, it's an 8.6% regression. So still a regression, but a smaller
one.
Or, in other words, this commit reduces times by 21% compared to the
previous commit.
Numbers with hyperfine are similar -- with this on top of the previous
commit, this is a 7-11% regression, instead of an almost 50% regression.
(A local branch that changes how we compute CanonicalCodes so that we
actually compress a bit is perf-neutral since the image writing code
doesn't change.)
`hyperfine 'image -o test.webp test.bmp'`:
* Before: 23.7 ms ± 0.7 ms (116 runs)
* Previous commit: 33.2 ms ± 0.8 ms (82 runs)
* This commit: 25.5 ms ± 0.7 ms (102 runs)
`hyperfine 'animation -o wow.webp giphy.gif'`:
* Before: 85.5 ms ± 2.0 ms (34 runs)
* Previous commit: 127.7 ms ± 4.4 ms (22 runs)
* This commit: 95.3 ms ± 2.1 ms (31 runs)
`hyperfine 'animation -o wow.webp 7z7c.gif'`:
* Before: 12.6 ms ± 0.6 ms (198 runs)
* Previous commit: 16.5 ms ± 0.9 ms (153 runs)
* This commit: 13.5 ms ± 0.6 ms (186 runs)
We still construct the code length codes manually, and now we also
construct a PrefixCodeGroup manually that assigns 8 bits to all
symbols (except for fully-opaque alpha channels, and for the
unused distance codes, like before). But now we use the CanonicalCodes
from that PrefixCodeGroup for writing.
No behavior change at all, the output is bit-for-bit identical to
before. But this is a step towards actually huffman-coding symbols.
This is however a pretty big perf regression. For
`image -o test.webp test.bmp` (where test.bmp is retro-sunset.png
re-encoded as bmp), time goes from 23.7 ms to 33.2 ms.
`animation -o wow.webp giphy.gif` goes from 85.5 ms to 127.7 ms.
`animation -o wow.webp 7z7c.gif` goes from 12.6 ms to 16.5 ms.
No behavior change. No measurable performance different either.
(I tried `hyperfine 'Build/lagom/bin/image --no-output foo.webp'`
for a few input images before and after this change, and I didn't
see a difference. I also tried if moving both
Gfx::CanonicalCode::read_symbol() and
Compress::CanonicalCode::read_symbol() inline, and that didn't
help either.)
By doing this, we can remove all the special-case checks for `length`
from the generic GetById and GetByIdWithThis code paths, making every
other property lookup a bit faster.
Small progressions on most benchmarks, and some larger progressions:
- 12% on Octane/crypto.js
- 6% on Kraken/ai-astar.js
This returns the secondary target of a mouse event. For `onmouseenter`
and `onmouseover` events, this is the EventTarget the mouse exited
from. For `onmouseleave` and `onmouseout` events, this is the
EventTarget the mouse entered to.
Most of IPC::Connection is thread-safe, but the responsiveness timer is
very much not so, this commit makes sure only one thread can send stuff
through IPC to avoid having threads racing in IPC::Connection.
This is simply less work than making sure everything IPC::Connection
uses (now and later) is thread-safe.
Previously RS handled all the requests in an event loop, leading to
issues with connections being started in the middle of other connections
being started (and potentially blowing up the stack), ultimately causing
requests to be delayed because of other requests.
This commit reworks the way we handle these (specifically starting
connections) by first serialising the requests, and then performing them
in multiple threads concurrently; which yields a significant loading
performance and reliability increase.
Previously sharing a Timer/Notifier between threads (or just handing
its ownership to another thread) lead to a crash as they are
thread-specific.
This commit makes it so we can handle mutation (i.e. just deletion
or unregistering) in a thread-safe and lockfree manner.
The VERIFY assumed that we either have a finally block which we already
visited through a previous boundary or have no finally block what so
ever; But since 301a1fc763 we propagate
finalizers through unwind scopes breaking that assumption.
I saw that this not being implemented was causing a javascript exception
to be thrown when loading https://profiler.firefox.com/.
I was hoping that like the last time that partially implementing this
interface would allow the page to load further, but it seems that this
is unfortunately not the case this time!
However, this may help other pages load slightly further, and puts some
of the scaffolding in place for a proper implementation :^)
For now, this slot is always 0 - (the default value per spec). But
once we start actually processing audio streams this internal slot
should be changed correspondingly.
This will help us in detecting potential web compatability issues from
not having this implemented.
While we're at it, update the spec link, as it was moved from the DOM
parsing spec to the HTML one, and implement this function in a manner
that closr resembles spec text.
To determine the palette of colors we use the median cut algorithm.
While being a correct implementation, enhancements are obviously
existing on both the median cut algorithm and the encoding side.
This is useful to find the best matching color palette from an existing
bitmap. It can be used in PixelPaint but also in encoders of old image
formats that only support indexed colors e.g. GIF.
It wasn't safe to use addition_would_overflow(a, -b) to check if
subtraction (a - b) would overflow, since it doesn't cover this case.
I don't know why we didn't have subtraction_would_overflow(), so this
patch adds it. :^)