J-F-Liu / lopdf

A Rust library for PDF document manipulation.

Can't parse PDF with comments in content stream

baarkerlounger opened this issue

# parser.rs:304

pub fn content(input: &[u8]) -> Option<Content<Vec<Operation>>> {
    (content_space() * operation().repeat(0..).map(|operations| Content { operations }))
        .parse(input)
        .ok()
}

For the attached sample PDF, this returns only Some(Content { operations: [Operation { operator: "q", operands: [] }] }). Parsing appears to stop after the first operator instead of consuming the whole content stream.

Payslip.pdf

The content stream has comments in it:

q % -- BeginContent
0.75 0 0 -0.75 0 841.92 cm

The only operation returned is the q that comes right before the first comment, so the comment is probably where parsing goes wrong.

@jrmuizel, are comments not supported by lopdf, or do I need to do something different to account for them?
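In the meantime, one possible workaround is to blank out the comments before the bytes reach the parser. Below is a minimal sketch of that idea. It assumes comments run from % to the end of the line and only need to be skipped outside literal strings. The helper name strip_content_comments is hypothetical and not part of lopdf, and the sketch does not handle inline image data (BI ... ID ... EI), where a % byte can appear as raw binary.

```rust
/// Hypothetical helper (not part of lopdf): replaces every `%` comment in a
/// content stream with a single space, leaving literal strings `( ... )`
/// untouched. Assumption: no inline image data is present in the stream.
fn strip_content_comments(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len());
    let mut depth = 0usize; // nesting depth of literal-string parentheses
    let mut i = 0;

    while i < input.len() {
        let b = input[i];
        if depth > 0 {
            // Inside a literal string: copy bytes verbatim, tracking nesting.
            match b {
                b'\\' => {
                    // Copy the backslash together with the escaped byte, if any,
                    // so an escaped parenthesis does not change the depth.
                    out.push(b);
                    if i + 1 < input.len() {
                        out.push(input[i + 1]);
                        i += 1;
                    }
                }
                b'(' => {
                    depth += 1;
                    out.push(b);
                }
                b')' => {
                    depth -= 1;
                    out.push(b);
                }
                _ => out.push(b),
            }
            i += 1;
        } else if b == b'%' {
            // Comment: skip up to (but not including) the end-of-line byte,
            // and emit one space so neighbouring tokens stay separated.
            while i < input.len() && input[i] != b'\n' && input[i] != b'\r' {
                i += 1;
            }
            out.push(b' ');
        } else {
            if b == b'(' {
                depth = 1; // a literal string starts here
            }
            out.push(b);
            i += 1;
        }
    }
    out
}
```

Applied to the two lines quoted above, this leaves q followed by whitespace and then the cm line, which the content parser shown earlier should be able to consume. Whether lopdf should skip comments itself is still the open question.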