ashtuchkin / iconv-lite

Convert character encodings in pure javascript.

Parsed text is always garbled no matter how iconv-lite is used, even though the document can still be viewed

snoopy83101 opened this issue

Help!
This PDF document (a Chinese general invoice) displays correctly in a viewer, but the text is always garbled when I parse it.
Can you help me?

zzz.pdf

Not sure how I can help. You seem to be using iconv-lite correctly, so maybe the problem is in the PDF parser library?
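
One way to tell the two suspects apart is to check the decoding step on bytes whose content is already known. The snippet below is only a sketch: `decodeBytes` is a hypothetical helper, and it uses Node's built-in `TextDecoder` so it runs without dependencies (this assumes a Node build with full ICU, which the official binaries ship). With iconv-lite, the equivalent call would be `iconv.decode(Buffer.from(bytes), 'gbk')`.

```javascript
// Hypothetical helper: decode raw bytes with a named encoding.
// Assumption: Node is built with full ICU, so the "gbk" label is available.
function decodeBytes(bytes, encoding) {
  return new TextDecoder(encoding).decode(Uint8Array.from(bytes));
}

// "中文" encoded as GBK: 0xD6 0xD0 is 中, 0xCE 0xC4 is 文.
const gbkBytes = [0xd6, 0xd0, 0xce, 0xc4];

// Decoding with the matching encoding recovers the text.
const correct = decodeBytes(gbkBytes, "gbk");

// Decoding the same bytes as UTF-8 (wrong on purpose) yields
// U+FFFD replacement characters, i.e. garbled output.
const wrong = decodeBytes(gbkBytes, "utf-8");

console.log(correct);
console.log(wrong.includes("\ufffd"));
```

If known bytes decode cleanly but the text from the PDF is still garbled under every encoding you try, the bytes handed over by the parser are probably not plain encoded text, and the decoder is unlikely to be at fault.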

The problem may well be in the PDF parser. Unfortunately, the parser cannot handle every PDF: about 50% of Chinese invoices fail to parse.
So my only option is to convert the PDF to images with PDF2IMG and then run OCR, which means I no longer need to worry about text encoding.