14 commits

Author SHA1 Message Date
Rodrigo Tobar
c592b889bf LibPDF: Add Reader::try_read for easier error propagation
This will allow us to use TRY(reader.try_read) instead of having to
verify the result of reader.remaining() before calling reader.read().
2023-01-25 15:40:11 +01:00
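A minimal sketch of the pattern this commit message describes, written in standard C++ rather than with SerenityOS's AK types. `SketchReader` and `sum_two_bytes` are hypothetical names chosen for illustration, and `std::optional` stands in for whatever error type the real `Reader::try_read` returns, so the explicit `if (!value)` checks play the role that `TRY(...)` plays in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a byte reader; this is not LibPDF's Reader class.
class SketchReader {
public:
    explicit SketchReader(std::vector<std::uint8_t> bytes)
        : m_bytes(std::move(bytes))
    {
    }

    // Number of bytes that have not been consumed yet.
    std::size_t remaining() const { return m_bytes.size() - m_offset; }

    // Unchecked read: the caller is expected to have verified remaining() first.
    template<typename T>
    T read()
    {
        static_assert(std::is_trivially_copyable_v<T>);
        assert(remaining() >= sizeof(T));
        T value;
        std::memcpy(&value, m_bytes.data() + m_offset, sizeof(T));
        m_offset += sizeof(T);
        return value;
    }

    // Checked read: folds the remaining() check into the call and reports
    // failure through the return value instead of asserting.
    template<typename T>
    std::optional<T> try_read()
    {
        if (remaining() < sizeof(T))
            return std::nullopt;
        return read<T>();
    }

private:
    std::vector<std::uint8_t> m_bytes;
    std::size_t m_offset { 0 };
};

// Example caller: each failed read propagates to the caller as an empty optional.
std::optional<int> sum_two_bytes(SketchReader& reader)
{
    auto first = reader.try_read<std::uint8_t>();
    if (!first)
        return std::nullopt;
    auto second = reader.try_read<std::uint8_t>();
    if (!second)
        return std::nullopt;
    return *first + *second;
}
```

With a checked read, the caller no longer repeats the bounds check at every call site; it only decides what to do when a read fails.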
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Linus Groh
d26aabff04 Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Julian Offenhäuser
4bd79338e8 LibPDF: Fix off-by-one error in Reader::remaining() 2022-11-19 15:42:08 +01:00
Julian Offenhäuser
9f4659cc63 LibPDF: Move consume and match helper functions to the Reader class 2022-09-17 10:07:14 +01:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Ali Mohammad Pur
bf59d9e824 Userland: Include Vector.h in a few places to make HeaderCheck happy
This header was being transitively pulled in, but that no longer happens
after 5f7d008791.
2021-11-11 20:36:36 +01:00
Andreas Kling
80d4e830a0 Everywhere: Pass AK::ReadonlyBytes by value 2021-11-11 01:27:46 +01:00
Ben Wiederhake
6089c4d97d LibPDF: Add missing headers to Reader.h 2021-09-20 17:39:36 +04:30
Matthew Olsson
612b183703 LibPDF: Convert to east-const to comply with the recent style changes 2021-06-12 22:45:01 +04:30
Matthew Olsson
e23bfd7252 LibPDF: Parse linearized PDF files
This is a big step, as most PDFs which are downloaded online will be
linearized. Pretty much the only difference is that the xref structure
is slightly different.
2021-06-12 22:45:01 +04:30
Matthew Olsson
97cc482087 LibPDF: Make Reader::dump_state a bit more readable 2021-05-25 00:24:09 +04:30
Matthew Olsson
8c745ad0d9 LibPDF: Parse page structures
This commit introduces the ability to parse the document catalog dict,
as well as the page tree and individual pages. Pages obviously aren't
fully parsed, as we won't care about most of the fields until we
start actually rendering PDFs.

One of the primary benefits of the PDF format is laziness. PDFs are
not meant to be parsed all at once, and the same is true for pages.
When a Document is constructed, it builds a map of page number to
object index, but it does not fetch and parse any of the pages. A page
is only parsed when a caller requests that particular page (and is
cached going forwards).

Additionally, this commit also adds an object_cast function which
logs bad casts if DEBUG_PDF is set. Additionally, utility functions
were added to ArrayObject and DictObject to get all types of objects
from the collections to avoid having to manually cast.
2021-05-10 10:32:39 +02:00
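The lazy page loading described in this commit message can be sketched roughly as follows. This is an illustration in standard C++, not the LibPDF code: `SketchDocument`, `SketchPage`, `parse_page` and `parse_count` are hypothetical names, the page-number-to-object-index map is simplified to a vector, and "parsing" is reduced to building a placeholder string.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical parsed page; a real page would hold far more than this.
struct SketchPage {
    std::uint32_t object_index;
    std::string contents;
};

// Hypothetical document that knows where each page lives but parses none up front.
class SketchDocument {
public:
    // page_object_indices[n] is the object index of page n (assumed layout).
    explicit SketchDocument(std::vector<std::uint32_t> page_object_indices)
        : m_page_object_indices(std::move(page_object_indices))
    {
    }

    std::size_t page_count() const { return m_page_object_indices.size(); }

    // Parses the page on first request and serves later requests from the cache.
    // Returns nullptr when the page number is out of range.
    std::shared_ptr<SketchPage const> get_page(std::size_t page_number)
    {
        if (page_number >= m_page_object_indices.size())
            return nullptr;

        auto cached = m_pages.find(page_number);
        if (cached != m_pages.end())
            return cached->second;

        auto page = parse_page(m_page_object_indices[page_number]);
        m_pages.emplace(page_number, page);
        return page;
    }

    // How many times a page was actually parsed; only here to make the laziness visible.
    std::size_t parse_count() const { return m_parse_count; }

private:
    // Placeholder for the expensive step of fetching and parsing a page object.
    std::shared_ptr<SketchPage const> parse_page(std::uint32_t object_index)
    {
        ++m_parse_count;
        auto page = std::make_shared<SketchPage>();
        page->object_index = object_index;
        page->contents = "page object " + std::to_string(object_index);
        return page;
    }

    std::vector<std::uint32_t> m_page_object_indices;
    std::unordered_map<std::size_t, std::shared_ptr<SketchPage const>> m_pages;
    std::size_t m_parse_count { 0 };
};
```

Constructing the document costs only the index lookup table; requesting the same page twice returns the same cached object without parsing it again.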
Matthew Olsson
72f693e9ed LibPDF: Add a basic parser and Document structure
This commit adds a parser as well as the Reader class, which serves
as a utility to aid in reading the PDF both forwards and in reverse.
The parser currently is capable of reading xref tables, as well as
all values. We don't really do anything with any of this information,
however.
2021-05-10 10:32:39 +02:00
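To illustrate why a reader that works in both directions is useful: a PDF file ends with a `startxref` keyword, a byte offset and an `%%EOF` marker, so a parser typically starts at the end of the file and works backwards to locate the xref table. The sketch below is an assumption-laden illustration in standard C++, not the LibPDF `Reader`; `SketchCursor`, `read_line` and `read_line_backwards` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical cursor over a byte buffer that can advance in either direction.
class SketchCursor {
public:
    explicit SketchCursor(std::string_view bytes)
        : m_bytes(bytes)
    {
    }

    void read_forwards_from_start()
    {
        m_forwards = true;
        m_offset = 0;
    }

    void read_backwards_from_end()
    {
        m_forwards = false;
        m_offset = static_cast<std::ptrdiff_t>(m_bytes.size()) - 1;
    }

    // True once the cursor has walked off either end of the buffer.
    bool done() const
    {
        return m_offset < 0 || m_offset >= static_cast<std::ptrdiff_t>(m_bytes.size());
    }

    // Returns the current character and steps one position in the current direction.
    char read()
    {
        assert(!done());
        char c = m_bytes[static_cast<std::size_t>(m_offset)];
        m_offset += m_forwards ? 1 : -1;
        return c;
    }

private:
    std::string_view m_bytes;
    std::ptrdiff_t m_offset { 0 };
    bool m_forwards { true };
};

// Collects characters in the cursor's current direction until a newline or the end.
std::string read_line(SketchCursor& cursor)
{
    std::string line;
    while (!cursor.done()) {
        char c = cursor.read();
        if (c == '\n')
            break;
        line.push_back(c);
    }
    return line;
}

// Same as read_line for a backwards cursor, but restores natural character order.
std::string read_line_backwards(SketchCursor& cursor)
{
    std::string line = read_line(cursor);
    std::reverse(line.begin(), line.end());
    return line;
}
```

Given the buffer `"%PDF-1.7\nstartxref\n1234\n%%EOF"`, reading forwards from the start yields the header line, while reading backwards from the end yields `%%EOF` first and then the offset `1234`, without scanning the rest of the file.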