martijn / xsv

High performance, lightweight .xlsx parser for Ruby that provides nothing a CSV parser wouldn't

Home Page: https://storck.io/posts/announcing-xsv-1-0-0/


Encoding::UndefinedConversionError when parsing non-ASCII character

vmsp opened this issue


To reproduce, just create a new XLSX with a single cell containing a non-ASCII character. Parsing it raises:

Encoding::UndefinedConversionError ("\xC3" from ASCII-8BIT to UTF-8)

Also, after having input the date 05/05/1995, xsv gives me 1995-05-05. Is this expected? And is there a way to override this behavior?

I'm using version 1.0.0.pre

Thanks!

Seems I forgot all about string encoding in the new parser. I will try to fix that today.

As for the dates, Xsv should always return a Date object for cells with a date. The Excel formatting is lost in the translation, because of the 'Excel separated values' philosophy of Xsv.
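If the goal is only to get a particular textual form back, the returned Date can be formatted on the caller's side. This is a minimal sketch, assuming cells arrive as plain Ruby values with dates as `Date` objects, as described above. The helper name `format_cell` and the `%d/%m/%Y` format are hypothetical; the thread does not say whether 05/05/1995 was day/month or month/day.

```ruby
require "date"

# Hypothetical helper (not part of Xsv): render Date cells as text and
# pass every other value through unchanged.
def format_cell(value, date_format = "%d/%m/%Y")
  value.is_a?(Date) ? value.strftime(date_format) : value
end

# A row as it might come back from a parser, with the date already typed.
row = [Date.new(1995, 5, 5), "some text", 42]
formatted = row.map { |cell| format_cell(cell) }
```

This only reproduces one fixed format. It does not recover whatever number format the cell actually had in Excel.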

I made a slightly brute-force update on the master branch. Can you test if this resolves the encoding issue for you?

5fe78a8
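For readers hitting the same error elsewhere, here is a sketch of the general failure mode and one common remedy. It is an illustration only and is not taken from the commit above. It assumes the bytes read from the file are valid UTF-8 but tagged as binary (ASCII-8BIT); the helper name `to_utf8` is hypothetical.

```ruby
# Bytes for "café" tagged as binary, as they might come out of a raw read.
raw = "caf\xC3\xA9".b

# Transcoding binary to UTF-8 fails, because Ruby has no mapping for bytes
# above 0x7F in ASCII-8BIT.
error_class = begin
  raw.encode("UTF-8")
  nil
rescue Encoding::UndefinedConversionError => e
  e.class
end

# Hypothetical helper: relabel the bytes as UTF-8 instead of transcoding them.
# The bytes are untouched; only the encoding tag on the copy changes.
def to_utf8(bytes)
  text = bytes.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, "input is not valid UTF-8" unless text.valid_encoding?
  text
end

fixed = to_utf8(raw)
```

`force_encoding` is only safe when the bytes really are UTF-8, which is why the sketch checks `valid_encoding?` before returning.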


I haven't yet found a gem that lets me skip the casting to an appropriate type and just returns everything as written. Maybe I could tempt you to consider this functionality?

Anyway, the issue is indeed fixed. Thank you!

Thanks for your feedback!

Excel stores dates as an integer (days since epoch), so the raw information wouldn't be useful to most users. Returning the formatted date, time or number as it appears in Excel would involve parsing and applying the Excel number formats. It is possible, but I currently don't have the time for it.
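To illustrate the point about raw values, here is a sketch of how a stored serial number maps to a date. It assumes the default 1900 date system and serials of 61 or higher (on or after 1900-03-01), where counting from 1899-12-30 gives the right result; earlier serials are affected by Excel treating 1900 as a leap year. The names `EXCEL_EPOCH`, `excel_serial_to_date` and `date_to_excel_serial` are hypothetical and not part of Xsv.

```ruby
require "date"

# Effective day zero for the 1900 date system (valid for serials >= 61).
EXCEL_EPOCH = Date.new(1899, 12, 30)

# Convert a whole-day Excel serial number to a Ruby Date.
def excel_serial_to_date(serial)
  EXCEL_EPOCH + serial
end

# Convert a Ruby Date back to its Excel serial number.
def date_to_excel_serial(date)
  (date - EXCEL_EPOCH).to_i
end

serial = date_to_excel_serial(Date.new(1995, 5, 5)) # the date from this issue
date   = excel_serial_to_date(serial)
```

The raw cell for 05/05/1995 is therefore just the integer 34824, which is why returning it "as written" would rarely be what a user wants.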