Issues reading cross-reference stream with incorrect /Size
divergentdave opened this issue
I used origami to create a password-encrypted PDF to attempt to reproduce #159, and I found that pdf-rs is intolerant of files where the /Size of a cross-reference stream is too small for its contents.
I produced this file with the following script, from the existing pdf-sample.pdf:
#!/usr/bin/env ruby
require 'origami'

# Each row: cipher, key size in bits, hardened flag, output file name.
[
  ['RC4', 40, false, 'pdf-sample_rc4_rev2.pdf'],
  ['RC4', 64, false, 'pdf-sample_rc4_rev3.pdf'],
  ['AES', 128, false, 'pdf-sample_aes_128.pdf'],
  ['AES', 256, false, 'pdf-sample_aes_256.pdf'],
  ['AES', 256, true, 'pdf-sample_aes_256_hardened.pdf'],
].each do |(cipher, key_size, hardened, file_name)|
  pdf = Origami::PDF.read('../pdf-sample.pdf')
  pdf.encrypt(
    user_passwd: 'userpassword',
    owner_passwd: 'ownerpassword',
    cipher: cipher,
    key_size: key_size,
    hardened: hardened,
  )
  pdf.save(file_name, noindent: true)
end
By inspection, it's clear that the modifications origami made to the trailer's dictionary are inconsistent. The /Index ends with "28 2", yet the /Size is only 27; a subsection covering IDs 28 and 29 needs a /Size of at least 30. When this file is loaded, the last subsection of the cross-reference stream is successfully parsed, but then XRefTable::add_entries_from() discards both entries in the subsection (with IDs 28 and 29) because they don't fit in the vector. Thus, future indirect references to those objects fail to resolve. (Soon after, opening the file fails with "Entry 28 in xref table unspecified".)
Should Backend::read_xref_table_and_trailer() scan the /Index array and update the table's size if necessary?
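A rough sketch of that idea, not pdf-rs's actual code: the table would need room for the larger of the declared /Size and first + count over every /Index pair. The helper name required_xref_size is hypothetical, and the sketch assumes the /Index array has already been decoded into (first, count) tuples.

```rust
/// Hypothetical helper (not part of pdf-rs): returns the smallest table
/// length that can hold every subsection listed in /Index, and never less
/// than the declared /Size.
fn required_xref_size(declared_size: usize, index: &[(usize, usize)]) -> usize {
    index
        .iter()
        // A subsection (first, count) covers object IDs first..first + count,
        // so the table needs at least first + count slots.
        .map(|&(first, count)| first.saturating_add(count))
        // Start from the declared /Size so a correct file is left unchanged.
        .fold(declared_size, usize::max)
}
```

For the file in this report, calling required_xref_size(27, &[(28, 2)]) with only the last known /Index pair gives 30, which is enough to keep the entries for IDs 28 and 29. A real implementation would probably also want to cap the result, since a malicious /Index could otherwise request a huge allocation.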
I think there needs to be a fallback that reads the entire file and rebuilds the xref table.
What you proposed can be implemented as well.
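A minimal sketch of what such a fallback could look like, assuming the whole file is already in memory as bytes. All function names here are hypothetical and none of this is pdf-rs code. It scans for "<id> <generation> obj" headers and records where each one starts; a later definition of the same ID overwrites an earlier one.

```rust
use std::collections::BTreeMap;

/// Hypothetical fallback: map each object ID to (generation, byte offset of
/// its "<id> <generation> obj" header), found by scanning the raw bytes.
fn scan_object_offsets(data: &[u8]) -> BTreeMap<u32, (u16, usize)> {
    let mut found = BTreeMap::new();
    let mut pos = 0;
    while pos < data.len() {
        // Only try digits that begin a token (file start or after whitespace).
        let at_token_start = pos == 0 || data[pos - 1].is_ascii_whitespace();
        if at_token_start && data[pos].is_ascii_digit() {
            if let Some((id, generation, end)) = parse_header(data, pos) {
                // insert() overwrites, so the last definition in the file wins.
                found.insert(id, (generation, pos));
                pos = end;
                continue;
            }
        }
        pos += 1;
    }
    found
}

/// Parses "<id> <generation> obj" at `start`; returns the two numbers and
/// the position just past the "obj" keyword.
fn parse_header(data: &[u8], start: usize) -> Option<(u32, u16, usize)> {
    let (id, pos) = read_number(data, start)?;
    let pos = skip_whitespace(data, pos)?;
    let (generation, pos) = read_number(data, pos)?;
    let pos = skip_whitespace(data, pos)?;
    if !data[pos..].starts_with(b"obj") {
        return None;
    }
    let end = pos + 3;
    // Reject longer words that merely begin with "obj".
    if data.get(end).map_or(false, |b| b.is_ascii_alphanumeric()) {
        return None;
    }
    Some((u32::try_from(id).ok()?, u16::try_from(generation).ok()?, end))
}

/// Reads a run of ASCII digits; returns the value and the position after it.
fn read_number(data: &[u8], start: usize) -> Option<(u64, usize)> {
    let len = data[start..].iter().take_while(|b| b.is_ascii_digit()).count();
    if len == 0 {
        return None;
    }
    let text = std::str::from_utf8(&data[start..start + len]).ok()?;
    Some((text.parse().ok()?, start + len))
}

/// Requires at least one whitespace byte; returns the position after the run.
fn skip_whitespace(data: &[u8], start: usize) -> Option<usize> {
    let len = data[start..].iter().take_while(|b| b.is_ascii_whitespace()).count();
    if len == 0 { None } else { Some(start + len) }
}
```

This is deliberately naive. It can be fooled by binary stream data that happens to contain a matching byte sequence, and it cannot see objects stored inside object streams, which files using cross-reference streams often have. It would only be a last resort after the normal xref parsing fails.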