The existing ArithmeticDecoder (from Annex E) reads one bit at a
time.
ArithmeticIntegerDecoder (from Annex A) builds on top of that to
read integer values.
This will be used by both the symbol segment and the text segment
readers.
(This does not yet implement the IAID decoding procedure in A.3.
We only need that one in the text segment decoder at the moment,
and it's pretty small, so I'll put it inline there for now.)
Not used yet, so no behavior change yet.
The path we were using is no longer correct, and we've been silently
dropping this error. Use Core::Resource instead, which we use for most
other Ladybird resources. This would have made it much more obvious that
emoji were not installed with the application.
And add a verification step to the emoji data generator to ensure all
emoji are listed in this file. This file will be used as a sources list
in both the CMake and GN build systems.
It is probably possible to generate this list. But in a first attempt,
the CMake code to set the file as a dependency of a pseudo target, which
would then parse the file and install the listed emoji was getting quite
verbose and complicated. So for now, let's just maintain this list.
The spec for each of these states:
-> EOF:
This is an eof-in-comment parse error. Emit the current comment
token. Emit an end-of-file token.
We were neglecting to emit the current comment token before emitting an
EOF token. Note the existing EMIT_CURRENT_TOKEN macro was unused.
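The ordering the spec asks for can be sketched as follows. This is a minimal illustration, not the LibWeb tokenizer: `Token`, `TokenizeResult` and `tokenize_comment()` are hypothetical names, and the input is assumed to start right after `<!--`.

```cpp
#include <string>
#include <string_view>
#include <vector>

enum class TokenType { Comment, EndOfFile };

struct Token {
    TokenType type;
    std::string data;
};

struct TokenizeResult {
    std::vector<Token> tokens;
    bool eof_in_comment { false };
};

// Hypothetical helper: `input` is assumed to start right after "<!--".
TokenizeResult tokenize_comment(std::string_view input)
{
    TokenizeResult result;
    auto end = input.find("-->");
    if (end != std::string_view::npos) {
        // Properly terminated comment: emit it and stop (the rest of the
        // input is ignored in this sketch).
        result.tokens.push_back({ TokenType::Comment, std::string(input.substr(0, end)) });
        return result;
    }
    // EOF inside the comment: record the parse error, emit the comment
    // collected so far, and only then emit the end-of-file token.
    result.eof_in_comment = true;
    result.tokens.push_back({ TokenType::Comment, std::string(input) });
    result.tokens.push_back({ TokenType::EndOfFile, {} });
    return result;
}
```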
If a box is sized as replaced, it could still be anything, not only SVG.
This fixes crashing on https://www.shopify.com/ that was caused by a
missing paintable for a box that has a layout node. This occurred
because the box was not laid out in dimension_box_on_line().
`Node::shadow_including_root()` was missing a null check, which caused
a crash when manipulating a select element, whose option elements were
initially detached.
The HTMLMediaElement, for example, contains spec text which states any
ongoing fetch process must be "stopped". The spec does not indicate how
to do this, so our implementation is rather ad-hoc.
Our current implementation may cause a crash in places that assume one
of the fetch algorithms that we set to null is *not* null. For example:
    if (fetch_params.process_response) {
        queue_fetch_task([&fetch_params]() {
            fetch_params.process_response();
        });
    }
If the fetch process is stopped after queuing the fetch task, but
before the fetch task is run, we will crash when running this fetch
algorithm.
We now track queued fetch tasks on the fetch controller. When the fetch
process is stopped, we cancel any such pending task.
It is a little bit awkward maintaining a fetch task ID. Ideally, we
could use the underlying task ID throughout. But we do not have access
to the underlying task nor its ID when the task is running, at which
point we need some ID to remove from the pending task list.
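A minimal sketch of the idea, assuming a controller-local task ID: the `FetchController` class and its method names below are hypothetical stand-ins, not the actual LibWeb types, and the event loop is reduced to a direct call to `run_fetch_task()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a fetch controller; not LibWeb's actual class.
class FetchController {
public:
    using TaskID = std::uint64_t;

    // Queue a task and remember it under a controller-local ID.
    TaskID queue_fetch_task(std::function<void()> task)
    {
        TaskID id = m_next_task_id++;
        m_pending_tasks.emplace(id, std::move(task));
        return id;
    }

    // Called by the (assumed) event loop when the task's turn comes.
    // Returns false if the task was cancelled in the meantime.
    bool run_fetch_task(TaskID id)
    {
        auto it = m_pending_tasks.find(id);
        if (it == m_pending_tasks.end())
            return false;
        auto task = std::move(it->second);
        m_pending_tasks.erase(it);
        task();
        return true;
    }

    // Stopping the fetch drops every task that has not run yet.
    void stop_fetch() { m_pending_tasks.clear(); }

    std::size_t pending_task_count() const { return m_pending_tasks.size(); }

private:
    TaskID m_next_task_id { 0 };
    std::unordered_map<TaskID, std::function<void()>> m_pending_tasks;
};
```

Because a cancelled task is never invoked, the callbacks it would have called can be cleared without the task later finding them null.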
I created this file using `jbig2` (see below for details), but as
far as I can tell `jbig2` does not produce spec-compliant files:
1. It always writes two 0s for the run lengths that specify how
   many symbols to export at the end of a symbol segment.
2. It doesn't write any referred-to segments for text segments.
   I think it's supposed to write a referred-to segment that
   mentions the symbol segment the text segment refers to (?)
I locally tweaked `jbig2` to fix these two defects (*), so the image
added in this commit is correct as best I can tell. It opens fine
using `image` and `jbig2`'s decode mode, and via
`Meta/jbig2_to_pdf.py` in Firefox and Chrome. Without my tweaks,
the image decodes fine with `jbig2`, but not with any of the other
three. The image (in a pdf) does _not_ decode in Preview.app,
either with or without my local `jbig2` tweaks.
*: See the PR adding this image for my local diff.
I created the test image file by running this shell script with
`jbig2` tweaked as described above:
#!/bin/bash
set -eu
I=Build/lagom/bin/image
S=Tests/LibGfx/test-inputs/bmp/bitmap.bmp
$I "$S" --crop 232,70,120,250 -o mouth.bmp
$I "$S" --crop 135,100,100,100 -o nose.bmp
$I "$S" --crop 50,108,30,30 -o top_eye.bmp
$I "$S" --crop 60,265,30,30 -o bottom_eye.bmp
# I then manually converted those to 1bpp using Photoshop
# (Image->Mode->Grayscale, then Image->Mode->Bitmap...,
# File->Save As..., bmp) since `jbig2` gets confused by non-1bpp
# bmp files and `image` can't write 1bpp files :/
#
# (I tried `convert ${in} -monochrome ${in}-1bpp.bmp` via
# https://cancerberosgx.github.io/magic/playground/index.html
# first, but that produced bmp files that neither Preview.app nor
# `jbig2` could handle.)
#
# -HeightClass: Number of height classes
# -WidthClass: Maximum number of symbols in one height class
# -Simple means no refinement; the number after is the symbol's ID
# The 3 numbers after `-ID` are id, y, x. The `-ID` are sorted by x.
# -RefCorner 1 means "top left".
#
# `jbig2` writes symbol and text segments as specified in the ini
# file, and then only stores the bits of the input image that aren't
# already set through symbol and text segments.
cat << EOF > jbig2-symbol.ini
-sym -Seg 1
-sym -file -numClass -HeightClass 3 -WidthClass 2
-sym -file -numSymbol 4
-sym -file -Height 250
-sym -file -Width 120 -Simple 0 mouth-1bpp.bmp
-sym -file -EndOfHeightClass
-sym -file -Height 100
-sym -file -Width 100 -Simple 1 nose-1bpp.bmp
-sym -file -EndOfHeightClass
-sym -file -Height 30
-sym -file -Width 30 -Simple 2 top_eye-1bpp.bmp
-sym -file -Width 30 -Simple 3 bottom_eye-1bpp.bmp
-sym -file -EndOfHeightClass
-sym -Param -Huff_DH 0
-sym -Param -Huff_DW 0
-txt -Seg 2
-txt -Param -numInst 4
-ID 2 108 50 -ID 3 265 60 -ID 1 100 135 -ID 0 70 232
-txt -Param -RefCorner 1
-txt -Param -Xlocation 0
-txt -Param -Ylocation 0
-txt -Param -W 399
-txt -Param -H 400
EOF
J=$HOME/Downloads/T-REC-T.88-201808-I\!\!SOFT-ZST-E/Software
J=$J/JBIG2_SampleSoftware-A20180829/source/jbig2
$J -i "${S%.bmp}" -f bmp -o symbol -F jb2 -ini jbig2-symbol.ini
- New upstream stable version is available
- Networking is now fully stable and enabled by default
- SDL2 backend is now available alongside SDL1, so switch to it
- Fixed a name collision of PAGE_SIZE with Serenity headers
- Disable threaded IO on Serenity for now
- Many other changes and fixes
- See https://github.com/LekKit/RVVM/releases/tag/v0.6 for more
...because "change" event should be dispatched on control even if it
has "display: none" style.
This change fixes selection in labels dropdown on GitHub's "new issue"
page.
Previously, ChunkID's from_big_endian_number() and
as_big_endian_number() weren't inverses of each other.
ChunkID::from_big_endian_number() used to take a u32 that contained
`('f' << 24) | ('t' << 16) | ('y' << 8) | 'p'`, that is
'f', 't', 'y', 'p' in memory on big-endian and 'p', 'y', 't', 'f'
on little-endian, and return a ChunkID for 'f', 't', 'y', 'p'.
ChunkID::as_big_endian_number() used to return a u32 that for a
ChunkID storing 'f', 't', 'y', 'p' was always 'f', 't', 'y', 'p'
in memory on both little-endian and big-endian, that is it stored
`('f' << 24) | ('t' << 16) | ('y' << 8) | 'p'` on big-endian and
`('p' << 24) | ('y' << 16) | ('t' << 8) | 'f'` on little-endian.
`ChunkID::from_big_endian_number(0x11223344).as_big_endian_number()`
returned 0x44332211.
This change makes the two methods self-consistent: they now take
and return a u32 that always has the first ChunkID part in the
highest bits of the u32 (`'f' << 24`), and so on. That also means
they return a u32 that in-memory looks different on big-endian
and little-endian. Since that's normal for numbers, this also
renames the two methods to just `from_number()` and `to_number()`.
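The new semantics can be sketched like this. The `ChunkID` struct below is a simplified, hypothetical stand-in for the real class; only the shift-based conversion is meant to mirror the description above.

```cpp
#include <array>
#include <cstdint>

// Hypothetical minimal ChunkID, for illustration only.
struct ChunkID {
    std::array<char, 4> id;

    // The first character always comes from the highest 8 bits,
    // independent of host endianness.
    static ChunkID from_number(std::uint32_t number)
    {
        return ChunkID { {
            static_cast<char>((number >> 24) & 0xff),
            static_cast<char>((number >> 16) & 0xff),
            static_cast<char>((number >> 8) & 0xff),
            static_cast<char>(number & 0xff),
        } };
    }

    // Exact inverse of from_number().
    std::uint32_t to_number() const
    {
        return (static_cast<std::uint32_t>(static_cast<unsigned char>(id[0])) << 24)
            | (static_cast<std::uint32_t>(static_cast<unsigned char>(id[1])) << 16)
            | (static_cast<std::uint32_t>(static_cast<unsigned char>(id[2])) << 8)
            | static_cast<std::uint32_t>(static_cast<unsigned char>(id[3]));
    }
};
```

Since both directions use shifts rather than reinterpreting memory, `ChunkID::from_number(x).to_number() == x` holds on little-endian and big-endian hosts alike.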
With the semantics cleared up, change the one use in ISOBMFF to read a
BigEndian for chunk headers and brand codes. This has the effect of
tags now being printed in the right order.
Before:
```sh
% Build/lagom/bin/isobmff ~/Downloads/sample1.jp2
Unknown Box (' Pj')
[ 4 bytes ]
('pytf') (version = 0, flags = 0x0)
- major_brand = ' 2pj'
- minor_version = 0
- compatible_brands = { ' 2pj' }
Unknown Box ('h2pj')
[ 37 bytes ]
Unknown Box ('fniu')
[ 92 bytes ]
Unknown Box (' lmx')
[ 2736 bytes ]
Unknown Box ('c2pj')
[ 667336 bytes ]
```
After:
```sh
% Build/lagom/bin/isobmff ~/Downloads/sample1.jp2
hmm 0x11223344 0x11223344
Unknown Box ('jP ')
[ 4 bytes ]
('ftyp' ) (version = 0, flags = 0x0)
- major_brand = 'jp2 '
- minor_version = 0
- compatible_brands = { 'jp2 ' }
Unknown Box ('jp2h')
[ 37 bytes ]
Unknown Box ('uinf')
[ 92 bytes ]
Unknown Box ('xml ')
[ 2736 bytes ]
Unknown Box ('jp2c')
[ 667336 bytes ]
```
This was causing some racy behaviour in LibHTTP, and just generally
led to really bad stack traces; avoid that by switching to
Core::Promise and using the existing event loop.
Possibly resolves #23524 and #23642.
Given a selector like `.foo .bar #baz`, we know that elements with
the class names `foo` and `bar` must be present in the ancestor chain of
the candidate element, or the selector cannot match.
By keeping track of the current ancestor chain during style computation,
and which strings are used in tag names and attribute names, we can do
a quick check before evaluating the selector itself, to see if all the
required ancestors are present.
The way this works:
1. CSS::Selector now has a cache of up to 8 strings that must be present
in the ancestor chain of a matching element. Note that we actually
store string *hashes*, not the strings themselves.
2. When Document performs a recursive style update, we now push and pop
elements to the ancestor chain stack as they are entered and exited.
3. When entering/exiting an ancestor, StyleComputer collects all the
relevant string hashes from that ancestor element and updates a
counting bloom filter.
4. Before evaluating a selector, we first check if any of the hashes
required by the selector are definitely missing from the ancestor
filter. If so, it cannot be a match, and we reject it immediately.
5. Otherwise, we carry on and evaluate the selector as usual.
I originally tried doing this with a HashMap, but we ended up losing
a huge chunk of the time saved to HashMap instead. As it turns out,
a simple counting bloom filter is way better at handling this.
The cost is a flat 8KB per StyleComputer, and since it's a bloom filter,
false positives are a thing.
This is extremely efficient, and allows us to quickly reject the
majority of selectors on many huge websites.
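For illustration, a counting bloom filter and the rejection check could look roughly like this. It is a sketch under assumptions: the class name, the two-bucket hashing scheme and the 4096 16-bit counters (which happen to add up to 8KB) are made up here and are not taken from the actual StyleComputer code. Counter overflow is ignored.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical counting bloom filter; sizes and hashing are assumptions.
class CountingBloomFilter {
public:
    // Called when an ancestor carrying this string hash is entered.
    void increment(std::uint32_t hash)
    {
        ++m_buckets[first_bucket(hash)];
        ++m_buckets[second_bucket(hash)];
    }

    // Called when that ancestor is exited again.
    void decrement(std::uint32_t hash)
    {
        --m_buckets[first_bucket(hash)];
        --m_buckets[second_bucket(hash)];
    }

    // false means "definitely absent"; true may be a false positive.
    bool may_contain(std::uint32_t hash) const
    {
        return m_buckets[first_bucket(hash)] != 0 && m_buckets[second_bucket(hash)] != 0;
    }

private:
    static constexpr std::size_t bucket_count = 4096;
    static std::size_t first_bucket(std::uint32_t hash) { return hash % bucket_count; }
    static std::size_t second_bucket(std::uint32_t hash) { return (hash >> 16) % bucket_count; }

    std::array<std::uint16_t, bucket_count> m_buckets {};
};

// A selector can be rejected early if any hash it requires in the
// ancestor chain is definitely missing from the filter.
bool selector_can_be_rejected(CountingBloomFilter const& ancestors, std::vector<std::uint32_t> const& required_hashes)
{
    for (auto hash : required_hashes) {
        if (!ancestors.may_contain(hash))
            return true;
    }
    return false;
}
```

Using counters instead of single bits is what makes popping an ancestor possible: decrementing undoes exactly one earlier increment without clearing buckets still needed by other ancestors.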
Some example rejection rates:
- https://amazon.com: 77%
- https://github.com/SerenityOS/serenity: 61%
- https://nytimes.com: 57%
- https://store.steampowered.com: 55%
- https://en.wikipedia.org: 45%
- https://youtube.com: 32%
- https://shopify.com: 25%
This also yields a chunky 37% speedup on StyleBench. :^)
Previously, the invalid value default wasn't taken into account when
determining the value that should be returned from the getter of an
enumerated attribute. This caused a crash when an enumerated attribute
of type DOMString? was set to an invalid value.
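As a rough sketch of the intended getter behavior (not the generated bindings code): `enumerated_attribute_getter()` and its parameters are hypothetical, with `std::nullopt` standing in for a null DOMString? and for "no default state".

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>
#include <vector>

// ASCII case-insensitive comparison (assumes the default "C" locale).
static bool equals_ignoring_ascii_case(std::string const& a, std::string const& b)
{
    return a.size() == b.size()
        && std::equal(a.begin(), a.end(), b.begin(), [](unsigned char x, unsigned char y) {
               return std::tolower(x) == std::tolower(y);
           });
}

// Hypothetical getter for a reflected enumerated attribute.
std::optional<std::string> enumerated_attribute_getter(
    std::optional<std::string> const& attribute_value,
    std::vector<std::string> const& keywords,
    std::optional<std::string> const& missing_value_default,
    std::optional<std::string> const& invalid_value_default)
{
    // Attribute absent: fall back to the missing value default.
    if (!attribute_value.has_value())
        return missing_value_default;

    // Attribute matches a known keyword: return the canonical spelling.
    for (auto const& keyword : keywords) {
        if (equals_ignoring_ascii_case(*attribute_value, keyword))
            return keyword;
    }

    // Attribute present but unrecognized: use the invalid value default
    // instead of assuming a keyword must have matched.
    return invalid_value_default;
}
```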
I've seen a crash when trying to verify_cast some block-level box to a
BlockContainer when it's actually something else.
This patch adds a debug log message so we can learn more about it next
time it happens somewhere.
Since we drive painting for SVG-as-image manually anyway, there's no
need for them to say they are "ready to paint", since that just causes
unnecessary extra processing in the HTML event loop.
We do the same thing with the gzip utility for performance.
This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
0.428s to 0.303s.
This reduces the runtime of `./bin/base64 -d enwik8.base64 >/dev/null`
from 0.632s to 0.469s.
(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
There's no need to copy the result. We can also avoid increasing the
size of the output buffer by 1 for each written byte.
This reduces the runtime of `./bin/base64 -d enwik8.base64 >/dev/null`
from 0.917s to 0.632s.
(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
We don't really need the features provided by StringBuilder here, since
we know the exact size of the output. Avoiding StringBuilder avoids the
recurring capacity/size checks both within StringBuilder itself and its
internal ByteBuffer.
This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
0.976s to 0.428s.
(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
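The "exact size is known up front" point can be illustrated with a standalone encoder. This is a sketch using `std::string` rather than AK types, and `encode_base64()` here is a hypothetical free function, not the AK implementation.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Sketch: allocate the output once at its final size and write by index,
// instead of appending through a growable builder.
std::string encode_base64(std::string_view input)
{
    static constexpr char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Every 3 input bytes (rounded up) become exactly 4 output characters.
    // Pre-filling with '=' means padding needs no extra writes.
    std::string output(((input.size() + 2) / 3) * 4, '=');

    std::size_t out = 0;
    for (std::size_t i = 0; i < input.size(); i += 3) {
        std::size_t remaining = input.size() - i;
        unsigned b0 = static_cast<unsigned char>(input[i]);
        unsigned b1 = remaining > 1 ? static_cast<unsigned char>(input[i + 1]) : 0;
        unsigned b2 = remaining > 2 ? static_cast<unsigned char>(input[i + 2]) : 0;

        output[out] = alphabet[b0 >> 2];
        output[out + 1] = alphabet[((b0 & 0x03) << 4) | (b1 >> 4)];
        if (remaining > 1)
            output[out + 2] = alphabet[((b1 & 0x0f) << 2) | (b2 >> 6)];
        if (remaining > 2)
            output[out + 3] = alphabet[b2 & 0x3f];
        out += 4;
    }
    return output;
}
```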
We know we are only appending ASCII characters to the StringBuilder, so
do not bother validating the result.
This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
1.192s to 0.976s.
(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
Instead of invalidating animated style properties whenever
`Document::update_style()` is called, now we only do that when
animations might have actually progressed. We still have to ensure
animated properties are up-to-date in `update_style()` to ensure that
JS methods can access updated style properties.