Fix parsing error that occurs due to an invalid stream length
Currently, an error is raised when parsing a file that contains a stream
whose length value is invalid and where the supposed end is not followed
by the keyword 'endstream' but by an invalid PDF object.

We can infer from the invalid PDF object that the length value is not
valid and thus initiate the stream recovery algorithm.
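The recovery idea described above can be illustrated with a minimal, self-contained sketch. It does not use HexaPDF's real classes: `ToyMalformedError`, `toy_next_token`, and `toy_read_stream` are hypothetical stand-ins, and the fallback search for `endstream` is only an assumed approximation of a stream recovery step. The point it shows is that a tokenizer failure right after the declared stream end is treated the same as a missing `endstream` keyword.

```ruby
# Hypothetical stand-in for a tokenizer error; not HexaPDF's actual class.
class ToyMalformedError < StandardError; end

# Reads one token starting at +pos+. Raises on an unterminated literal string,
# mimicking a tokenizer that fails on an invalid PDF object.
def toy_next_token(data, pos)
  rest = (data[pos..] || '').lstrip
  if rest.start_with?('(')
    raise ToyMalformedError, 'unterminated literal string' unless rest.include?(')')
    rest[0..rest.index(')')]
  else
    rest[/\A\S+/]
  end
end

# Returns the stream data, trusting +declared_length+ only if the keyword
# 'endstream' follows it.
def toy_read_stream(data, pos, declared_length)
  # A tokenizer error is turned into nil so it falls through to recovery.
  tok = toy_next_token(data, pos + declared_length) rescue nil
  return data[pos, declared_length] if tok == 'endstream'

  # Recovery (assumed strategy): the declared length is wrong, so search
  # for the keyword instead and ignore the end-of-line marker before it.
  stop = data.index(/\r?\n?endstream/, pos)
  raise ToyMalformedError, 'keyword endstream not found' unless stop
  data[pos...stop]
end

# Declared length 2 points at "(ab", which cannot be tokenized.
recovered = toy_read_stream("12(ab\nendstream endobj", 0, 2)
puts recovered.length
```

Without the `rescue nil` on the token read, the first call would abort with `ToyMalformedError` before the recovery branch is reached, which mirrors the situation this commit addresses.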
gettalong committed Aug 10, 2022
1 parent 55387ed commit 0204d7f
Showing 3 changed files with 12 additions and 1 deletion.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,10 @@
* [HexaPDF::TestUtils] module that contains helper methods useful for testing
various parts of HexaPDF

### Fixed

* Parsing of streams with an invalid length value that led to a parsing error


## 0.24.0 - 2022-08-01

2 changes: 1 addition & 1 deletion lib/hexapdf/parser.rb
@@ -172,7 +172,7 @@ def parse_indirect_object(offset = nil)
end
@tokenizer.pos = pos + length rescue pos

- tok = @tokenizer.next_token
+ tok = @tokenizer.next_token rescue nil
unless tok.kind_of?(Tokenizer::Token) && tok == 'endstream'
maybe_raise("Invalid stream length, keyword endstream not found", pos: @tokenizer.pos)
@tokenizer.pos = pos
7 changes: 7 additions & 0 deletions test/hexapdf/test_parser.rb
@@ -120,6 +120,13 @@ def create_parser(str)
assert_equal('12', collector(stream.fiber))
end

it "recovers from an incorrect stream length value which leads to a parsing error" do
create_parser("1 0 obj<</Length 2>> stream\n12(ab\nendstream endobj")
obj, _, _, stream = @parser.parse_indirect_object
assert_equal(5, obj[:Length])
assert_equal('12(ab', collector(stream.fiber))
end

it "recovers from an invalid stream length value" do
create_parser("1 0 obj<</Length 2 0 R>> stream\n12endstream endobj")
@document.add([5], oid: 2)