Commit graph

56 commits

Author SHA1 Message Date
Rodrigo Tobar
3eaa27f53a LibPDF: Add infrastructure for accented character glyphs
Type1 accented character glyphs are composed of two other glyphs in the
same font: a base glyph and an accent glyph, given as char codes in the
standard encoding. These two glyphs are then composed together to form
the accented character.

This commit adds the data structures to hold the information for
accented characters, and also the routine that composes the final glyph
path out of the two individual components. All glyphs must have been
loaded by the time this composition takes place, and thus a new
protected consolidate_glyphs() routine has been added to perform this
calculation.
2023-02-08 19:47:15 +01:00
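The composition step described in 3eaa27f53a can be pictured with a minimal sketch. It is not the LibPDF code: `Point`, `PathSketch`, `AccentedCharacterSketch` and `compose()` are hypothetical names, paths are reduced to point lists, and placing the accent with a plain offset is an assumption made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified types; the real code works on glyph paths.
struct Point {
    float x;
    float y;
};
using PathSketch = std::vector<Point>;

// Data kept for an accented character until all glyphs are loaded.
// The offset field is an assumption about how the accent is positioned.
struct AccentedCharacterSketch {
    std::string base_name;
    std::string accent_name;
    Point accent_origin;
};

// Builds the final outline: the base path unchanged, followed by the
// accent path shifted by accent_origin.
PathSketch compose(PathSketch const& base, PathSketch const& accent, Point accent_origin)
{
    PathSketch result = base;
    for (auto const& point : accent)
        result.push_back({ point.x + accent_origin.x, point.y + accent_origin.y });
    return result;
}
```

Because both components must already exist when `compose()` runs, a consolidation pass after loading (as the commit describes) is the natural place to call it.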
Rodrigo Tobar
11a9bfd4b6 LibPDF: Turn Glyph into a class
Glyph was a simple structure, but even now it's become more complex than
it was initially. Turning it into a class hides some of that complexity,
and makes it easier to understand to external eyes.

While doing this I also decided to remove the float + bool combo for
keeping track of the glyph's width, and replaced it with an Optional
instead.
2023-02-08 19:47:15 +01:00
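As an illustration of the width change in 11a9bfd4b6, the sketch below replaces a float + bool pair with a single optional value. It uses `std::optional` as a stand-in for AK's `Optional`, and the class and method names are assumptions rather than the actual Glyph interface.

```cpp
#include <optional>

// Hypothetical stand-in for the Glyph class described in the commit.
class GlyphSketch {
public:
    // True only once a width has been set.
    bool has_width() const { return m_width.has_value(); }

    // Returns the stored width, or 0 if none was set (a choice made for
    // this sketch only).
    float width() const { return m_width.value_or(0.0f); }

    void set_width(float width) { m_width = width; }

private:
    // One member carries both "is it set?" and "what is it?".
    std::optional<float> m_width;
};
```

With a single optional member the two pieces of state cannot drift apart, which is the usual motivation for this kind of refactoring.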
Rodrigo Tobar
c084943457 LibPDF: Index Type1 glyphs by name, not char code
Storing glyphs indexed by char code in a Type1 Font Program binds a Font
Program instance to the particular Encoding that was used at Font
Program construction time. This makes it difficult to reuse Font Program
instances against different Encodings, which would otherwise be
possible.

This commit changes how we store the glyphs on Type1 Font Programs.
Instead of storing them on a map indexed by char code, the map is now
indexed by glyph name. In turn, when rendering a glyph we use the
Encoding object to turn the char code into a glyph name, which in turn
is used to index into the map of glyphs.

This is the first step towards reusability of Type1 Font Programs. It
also unlocks the ability to render glyphs that are described via the
"seac" command (standard encoding accented character), which requires
accessing the base and accent glyphs by name.
2023-02-08 19:47:15 +01:00
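The lookup path described in c084943457 (char code to glyph name through the Encoding, then glyph name to glyph) can be sketched as follows. All names here are hypothetical, and standard containers stand in for the AK types used in LibPDF.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, minimal glyph: only a width, no outline.
struct GlyphSketch {
    float width;
};

// Stand-in for an Encoding: char code -> glyph name.
using EncodingSketch = std::unordered_map<std::uint8_t, std::string>;

class FontProgramSketch {
public:
    // Glyphs are stored by name, so no Encoding is needed at load time.
    void add_glyph(std::string name, GlyphSketch glyph)
    {
        m_glyphs.emplace(std::move(name), glyph);
    }

    // The Encoding is supplied per lookup; returns nullptr if either the
    // char code or the glyph name is unknown.
    GlyphSketch const* glyph_for(std::uint8_t char_code, EncodingSketch const& encoding) const
    {
        auto name = encoding.find(char_code);
        if (name == encoding.end())
            return nullptr;
        auto glyph = m_glyphs.find(name->second);
        return glyph == m_glyphs.end() ? nullptr : &glyph->second;
    }

private:
    std::unordered_map<std::string, GlyphSketch> m_glyphs;
};
```

In this arrangement the same font program instance can serve two different encodings, since the encoding is only consulted at lookup time.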
Rodrigo Tobar
286e3e6872 LibPDF: Simplify Encoding to align with simple font requirements
All "Simple Fonts" in PDF (all but Type0 fonts) have the property that
glyphs are selected with single byte character codes. This means that
the Encoding objects should use u8 for representing these character
codes. Moreover, and as mentioned in a previous commit, there is no need
to store the Unicode code point associated with a character (which was
in turn wrongly associated with a glyph).

This commit greatly simplifies the Encoding class. Namely it:

 * Removes the unnecessary CharDescriptor class.
 * Changes the internal maps to be u8 -> FlyString and vice-versa,
   effectively providing two-way lookups.
 * Adds a new method to set a two-way u8 -> FlyString mapping and uses
   it in all possible places.
 * Simplifies the creation of Encoding objects.
 * Changes how the WinAnsi special treatment for bullet points is
   implemented.
2023-02-02 14:50:38 +01:00
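The two-way mapping listed in 286e3e6872 can be sketched as a pair of maps updated together. This is an assumption-laden illustration: `SimpleEncodingSketch` and its methods are hypothetical names, and `std::string` and `std::unordered_map` stand in for FlyString and the AK containers.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the simplified Encoding class.
class SimpleEncodingSketch {
public:
    // Registers one mapping in both directions. This sketch does not
    // remove a stale reverse entry if a char code is later remapped.
    void set(std::uint8_t char_code, std::string const& glyph_name)
    {
        m_code_to_name[char_code] = glyph_name;
        m_name_to_code[glyph_name] = char_code;
    }

    // char code -> glyph name, empty if the code is not mapped.
    std::optional<std::string> name_for(std::uint8_t char_code) const
    {
        auto it = m_code_to_name.find(char_code);
        if (it == m_code_to_name.end())
            return std::nullopt;
        return it->second;
    }

    // glyph name -> char code, empty if the name is not mapped.
    std::optional<std::uint8_t> code_for(std::string const& glyph_name) const
    {
        auto it = m_name_to_code.find(glyph_name);
        if (it == m_name_to_code.end())
            return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<std::uint8_t, std::string> m_code_to_name;
    std::unordered_map<std::string, std::uint8_t> m_name_to_code;
};
```

Using a single byte as the key type reflects the commit's point that simple fonts select glyphs with single-byte character codes.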
Tim Schumacher
ae64b68717 AK: Deprecate the old AK::Stream
This also removes a few cases where the respective header wasn't
actually required to be included.
2023-01-29 19:16:44 -07:00
Rodrigo Tobar
c4b45a82cd LibPDF: Add initial CFF parsing
The Compact Font Format specification (Adobe's Technical Note #5176) is
used by PDF's Type1C fonts to store their data. While being similar in
spirit to PS1 Type 1 Font Programs, it was designed for a more compact
representation and thus space reduction (but at the cost of an increase
in complexity). It also shares most of the charstring encoding logic,
which is why the CFF class also inherits from Type1FontProgram.

This initial implementation is still lacking many details, e.g.:

 * It doesn't include all the built-in CFF SIDs
 * It doesn't support CFF-provided SIDs (defaults those glyphs to the
   space character)
 * More checks in general
2023-01-25 15:40:11 +01:00