yomurb / yomu

Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf)

Home Page: http://github.com/yomurb/yomu

incompatible character encodings: ASCII-8BIT and UTF-8

venkatasubbaiah opened this issue

When extracting text from PDF files, I get the error "incompatible character encodings: ASCII-8BIT and UTF-8". Please help me.

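It sounds like something is trying to combine UTF-8 encoded text with a string tagged as ASCII-8BIT (Ruby's binary encoding). Ruby raises this error when both strings contain non-ASCII bytes and their encodings differ.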
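Without the actual code and file this is only a guess at the cause, but the general failure mode can be reproduced in plain Ruby. The sketch below does not use yomu at all; `to_utf8` is a hypothetical helper, and it assumes the binary bytes really are UTF-8 text that was merely tagged as ASCII-8BIT.

```ruby
# Hypothetical helper (not part of yomu): retag binary bytes as UTF-8.
# Assumes the bytes are meant to be UTF-8; invalid sequences become "?".
def to_utf8(str)
  retagged = str.dup.force_encoding(Encoding::UTF_8)
  retagged.valid_encoding? ? retagged : retagged.scrub("?")
end

utf8   = "café"            # string literals are UTF-8 by default in Ruby 2.0+
binary = "caf\xC3\xA9".b   # the same bytes, but tagged ASCII-8BIT (binary)

begin
  utf8 + binary            # both sides hold non-ASCII bytes, encodings differ
rescue Encoding::CompatibilityError => e
  puts "raised: #{e.class}"
end

# Retagging the binary string first lets the concatenation succeed.
combined = utf8 + to_utf8(binary)
puts combined.encoding
```

If the bytes are in some other encoding (for example Latin-1), `force_encoding` is the wrong tool and `String#encode` with an explicit source encoding would be needed instead.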

Can you provide an example of code and a file that causes the error?

These blog posts are a good introduction to how modern Ruby versions handle string encoding:
http://yehudakatz.com/2010/05/05/ruby-1-9-encodings-a-primer-and-the-solution-for-rails/
http://yehudakatz.com/2010/05/17/encodings-unabridged/