pdfcpu / pdfcpu

A PDF processor written in Go.

Home Page: http://pdfcpu.io/

Corrupt name object

henderjm opened this issue

Thank you for submitting a possible bug!

We have been using this library for some time, but we recently ran into an issue with one particular file. The file itself appears to be PDF 1.4.

Please ensure the following:

  • Your issue is based on the latest commit
    yes
  • State your OS and OS version
    Ubuntu 20.04 and macOS Sonoma
  • When reporting a problem with a specific PDF input file please avoid stating the organization responsible for the PDFWriter - just refer to the PDFWriter

Below is the output from Validate, which we ran after splitting the PDF into chunks failed:

 READ: 2024/03/10 16:11:37 Read: begin
 INFO: 2024/03/10 16:11:37 PDF Version 1.5 conforming reader
 READ: 2024/03/10 16:11:37 readXRefTable: begin
 READ: 2024/03/10 16:11:37 scanning for offsetLastXRefSection starting at 95234
 READ: 2024/03/10 16:11:37 Offset last xrefsection: 94991
 READ: 2024/03/10 16:11:37 buildXRefTableStartingAt: begin
 READ: 2024/03/10 16:11:37 headerVersion begin
 READ: 2024/03/10 16:11:37 headerVersion: end, found header version: 1.4
 READ: 2024/03/10 16:11:37 newPositionedReader: positioned to offset: 94991
 READ: 2024/03/10 16:11:37 xref line 1: <xref>
 READ: 2024/03/10 16:11:37 tryXRefSection: found xref section
 READ: 2024/03/10 16:11:37 parseXRefSection begin
 READ: 2024/03/10 16:11:37 parseXRefSection: <0 34>
 READ: 2024/03/10 16:11:37 parseXRefTableSubSection: begin
 READ: 2024/03/10 16:11:37 detected xref subsection, startObj=0 length=34
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #0 is unused, next free is object#0, generation=65535
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #1 is in use at offset=10, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 1
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #2 is in use at offset=103, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 2
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #3 is in use at offset=156, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 3
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #4 is in use at offset=265, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 4
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #5 is in use at offset=386, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 5
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #6 is in use at offset=7658, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 6
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #7 is in use at offset=7779, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 7
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #8 is in use at offset=14962, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 8
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #9 is in use at offset=15083, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 9
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #10 is in use at offset=23153, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 10
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #11 is in use at offset=23275, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 11
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #12 is in use at offset=31032, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 12
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #13 is in use at offset=31154, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 13
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #14 is in use at offset=39481, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 14
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #15 is in use at offset=39603, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 15
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #16 is in use at offset=47208, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 16
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #17 is in use at offset=47242, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 17
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #18 is in use at offset=47362, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 18
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #19 is in use at offset=47481, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 19
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #20 is in use at offset=47651, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 20
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #21 is in use at offset=54924, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 21
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #22 is in use at offset=55094, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 22
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #23 is in use at offset=62278, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 23
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #24 is in use at offset=62448, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 24
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #25 is in use at offset=70519, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 25
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #26 is in use at offset=70689, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 26
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #27 is in use at offset=78446, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 27
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #28 is in use at offset=78616, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 28
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #29 is in use at offset=86943, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 29
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #30 is in use at offset=87113, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 30
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #31 is in use at offset=94718, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 31
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #32 is in use at offset=94752, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 32
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: begin
 READ: 2024/03/10 16:11:37 createXRefTableEntry: Object #33 is in use at offset=94872, generation=0
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: Insert new xreftable entry for Object 33
 READ: 2024/03/10 16:11:37 parseXRefTableEntry: end
 READ: 2024/03/10 16:11:37 parseXRefTableSubSection: end
 READ: 2024/03/10 16:11:37 parseXRefSection: All subsections read!
 READ: 2024/03/10 16:11:37 parseXRefSection: parsing trailer dict..
 READ: 2024/03/10 16:11:37 line (len 7) <trailer>
 READ: 2024/03/10 16:11:37 line: <>
 READ: 2024/03/10 16:11:37 line: <<< /Size 34>
 READ: 2024/03/10 16:11:37 processTrailer: trailerString: (len:27) <<< /Size 34
/Root 1 0 R >>
>
 READ: 2024/03/10 16:11:37 processTrailer: trailerDict:
<<
	<Root, (1 0 R)>
	<Size, 34>
>>
 READ: 2024/03/10 16:11:37 parseTrailerDict begin
 READ: 2024/03/10 16:11:37 parseTrailer begin
 READ: 2024/03/10 16:11:37 parseTrailerRoot: Root object: (1 0 R)
 READ: 2024/03/10 16:11:37 parseTrailerf end
 READ: 2024/03/10 16:11:37 parseTrailerDict end
 READ: 2024/03/10 16:11:37 buildXRefTableStartingAt: end
TRACE: 2024/03/10 16:11:37 EnsureValidFreeList: begin
TRACE: 2024/03/10 16:11:37 EnsureValidFreeList: empty free list.
 READ: 2024/03/10 16:11:37 readXRefTable: end
 READ: 2024/03/10 16:11:37 dereferenceXRefTable: begin
 READ: 2024/03/10 16:11:37 decodeObjectStreams: begin
 READ: 2024/03/10 16:11:37 decodeObjectStreams: end
 READ: 2024/03/10 16:11:37 dereferenceObjects: begin
 READ: 2024/03/10 16:11:37 dereferenceObject: begin, dereferencing object 0
 READ: 2024/03/10 16:11:37 free object 0
 READ: 2024/03/10 16:11:37 dereferenceObject: begin, dereferencing object 1
 READ: 2024/03/10 16:11:37 in use object 1
 READ: 2024/03/10 16:11:37 dereferenceAndLoad: dereferencing object 1
 READ: 2024/03/10 16:11:37 ParseObject: begin, obj#1, offset:10
 READ: 2024/03/10 16:11:37 newPositionedReader: positioned to offset: 10
 READ: 2024/03/10 16:11:37 object: small obj w/o stream, parse until endobj
 READ: 2024/03/10 16:11:37 dict: end, #1
 READ: 2024/03/10 16:11:37 dereferenceAndLoad: end obj 1 of 34
<<<
	<Outlines, (2 0 R)>
	<PageMode, UseNone>
	<Pages, (3 0 R)>
	<Type, Catalog>
>>>
 READ: 2024/03/10 16:11:37 logStream: no ObjectStreamDict
 READ: 2024/03/10 16:11:37 dereferenceObject: begin, dereferencing object 2
 READ: 2024/03/10 16:11:37 in use object 2
 READ: 2024/03/10 16:11:37 dereferenceAndLoad: dereferencing object 2
 READ: 2024/03/10 16:11:37 ParseObject: begin, obj#2, offset:103
 READ: 2024/03/10 16:11:37 newPositionedReader: positioned to offset: 103
 READ: 2024/03/10 16:11:37 object: small obj w/o stream, parse until endobj
 READ: 2024/03/10 16:11:37 dict: end, #2
 READ: 2024/03/10 16:11:37 dereferenceAndLoad: end obj 2 of 34
<<<
	<Count, 0>
	<Type, Outlines>
>>>
 READ: 2024/03/10 16:11:37 logStream: no ObjectStreamDict
 READ: 2024/03/10 16:11:37 dereferenceObject: begin, dereferencing object 3
 READ: 2024/03/10 16:11:37 in use object 3
 READ: 2024/03/10 16:11:37 dereferenceAndLoad: dereferencing object 3
 READ: 2024/03/10 16:11:37 ParseObject: begin, obj#3, offset:156
 READ: 2024/03/10 16:11:37 newPositionedReader: positioned to offset: 156
 READ: 2024/03/10 16:11:37 object: small obj w/o stream, parse until endobj
 READ: 2024/03/10 16:11:37 dict: end, #3
 READ: 2024/03/10 16:11:37 dereferenceAndLoad: end obj 3 of 34
<<<
	<Count, 6>
	<Kids, [(19 0 R) (21 0 R) (23 0 R) (25 0 R) (27 0 R) (29 0 R)]>
	<Type, Pages>
>>>
 READ: 2024/03/10 16:11:37 logStream: no ObjectStreamDict
 READ: 2024/03/10 16:11:37 dereferenceObject: begin, dereferencing object 4
 READ: 2024/03/10 16:11:37 in use object 4
 READ: 2024/03/10 16:11:37 dereferenceAndLoad: dereferencing object 4
 READ: 2024/03/10 16:11:37 ParseObject: begin, obj#4, offset:265
 READ: 2024/03/10 16:11:37 newPositionedReader: positioned to offset: 265
 READ: 2024/03/10 16:11:37 object: small obj w/o stream, parse until endobj
Fatal: pdfcpu: parse: corrupt name object
github.com/pdfcpu/pdfcpu/pkg/pdfcpu/model.init
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/pdfcpu/model/annotation.go:75
runtime.doInit1
	/opt/homebrew/opt/go/libexec/src/runtime/proc.go:6740
runtime.doInit
	/opt/homebrew/opt/go/libexec/src/runtime/proc.go:6707
runtime.main
	/opt/homebrew/opt/go/libexec/src/runtime/proc.go:249
runtime.goexit
	/opt/homebrew/opt/go/libexec/src/runtime/asm_arm64.s:1197
dereferenceAndLoad: problem dereferencing object 4
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceAndLoad
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/pdfcpu/read.go:2644
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceObject
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/pdfcpu/read.go:2727
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceObjects
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/pdfcpu/read.go:2793
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceXRefTable
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/pdfcpu/read.go:2903
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.ReadWithContext
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/pdfcpu/read.go:105
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.Read
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/pdfcpu/read.go:74
github.com/pdfcpu/pdfcpu/pkg/api.ReadContext
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/api/api.go:74
github.com/pdfcpu/pdfcpu/pkg/api.Validate
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/api/validate.go:43
github.com/pdfcpu/pdfcpu/pkg/api.ValidateFile
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/api/validate.go:91
github.com/pdfcpu/pdfcpu/pkg/api.ValidateFiles
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/api/validate.go:110
github.com/pdfcpu/pdfcpu/pkg/cli.Validate
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/cli/cli.go:27
github.com/pdfcpu/pdfcpu/pkg/cli.Process
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/pkg/cli/process.go:35
main.process
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/cmd/pdfcpu/process.go:150
main.processValidateCommand
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/cmd/pdfcpu/process.go:207
main.commandMap.process
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/cmd/pdfcpu/cmd.go:143
main.main
	/Users/mark.hender/go/pkg/mod/github.com/pdfcpu/pdfcpu@v0.7.0/cmd/pdfcpu/main.go:56
runtime.main
	/opt/homebrew/opt/go/libexec/src/runtime/proc.go:267
runtime.goexit
	/opt/homebrew/opt/go/libexec/src/runtime/asm_arm64.s:1197

Sorry, this is not enough to analyze your problem.
It looks like a parsing problem, and I would need the file itself in order to provide a fix.