Commit graph

4 commits

Author SHA1 Message Date
Julian Offenhäuser
77f5f7a6f4 LibPDF: Support parsing page tree nodes that are in object streams
conditionally_parse_page_tree_node used to assume that the xref table
contained a byte offset, even for compressed objects. It now uses the
common facilities for parsing objects, at the expense of some
performance.
2022-09-17 10:07:14 +01:00
Julian Offenhäuser
563d91b6c4 LibPDF: Implement loading compressed objects from object streams
Now, whenever the xref table points to a compressed object,
parse_object_with_index will look it up in the corresponding object
stream as if it were a regular object.

With this, our parser gains the bare minimum support for xref streams.
2022-09-17 10:07:14 +01:00
Julian Offenhäuser
f9beff7b5e LibPDF: Initial work on parsing xref streams
Since PDF version 1.5, a document may omit the xref table in favor of
a new kind of xref stream object. This is used to reference so-called
"compressed" objects that are part of an object stream.

With this patch we are able to parse this new kind of xref object, but
we'll have to implement object streams to use them correctly.
2022-09-17 10:07:14 +01:00
Julian Offenhäuser
4887aacec7 LibPDF: Move document-specific parsing functionality into its own class
The Parser class is now a generic PDF object parser, of which the new
DocumentParser class derives. DocumentParser now takes over all
functions relating to linearization, pages, xref and trailer handling.

This allows the use of multiple parsers in the same document's
context, which will be needed in order to handle PDF object streams.
2022-09-17 10:07:14 +01:00