paulyoder / LinqToExcel

Use LINQ to retrieve data from spreadsheets and csv files

Numbers returned as formatted value

JohnHopstaken opened this issue

I have an Excel sheet that serves as a data source. The sheet contains a number of calculated values.
The cells containing these calculated values have been formatted.
These calculated values have to be imported by another application, but they should be imported without the format applied: the raw calculated value, so to speak.
Currently, however, the data is read in its formatted form, which is caused by the connection string being forced to use IMEX=1.
Not specifying IMEX, or using IMEX=2, returns the unformatted values from the file.

What is the reason the IMEX value is hardcoded to 1? Couldn't it just be replaced with the value 2 instead?
I can't think of a reason why you wouldn't want the actual value from a cell.

The most flexible solution would be to allow the user to specify the IMEX option when setting up the ExcelQueryFactory. I could send a pull request which allows this option to be set.
I would only allow the options 1 and 2, as using value 0 currently skips the first row when retrieving the data, which I consider unexpected behavior.
Well, at least I was surprised by this 😉

A PR sounds like a good idea. Let's refrain from using actual integer values though. Can you set it up with an enum with sensible names?
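A minimal sketch of what "an enum with sensible names" could look like, written in Python purely as an illustration and not as LinqToExcel code. The names `ImportMode`, `FORMATTED_VALUE`, `RAW_VALUE`, and `build_extended_properties` are hypothetical; the `Excel 12.0` provider string is also an assumption. IMEX=0 is left out on purpose, following the suggestion above to allow only 1 and 2.

```python
from enum import Enum


class ImportMode(Enum):
    """Hypothetical names for the IMEX values discussed in this thread."""

    FORMATTED_VALUE = 1  # IMEX=1, the value that is currently hardcoded
    RAW_VALUE = 2        # IMEX=2, reported above to return unformatted values


def build_extended_properties(mode: ImportMode, has_header: bool = True) -> str:
    """Build the 'Extended Properties' part of an OLE DB connection string."""
    hdr = "YES" if has_header else "NO"
    return f"Excel 12.0;HDR={hdr};IMEX={mode.value}"


if __name__ == "__main__":
    print(build_extended_properties(ImportMode.RAW_VALUE))
```

Callers would then pick a mode by name instead of passing a bare integer, which also makes it impossible to request a value the library does not want to support.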

Is there any progress on this?

Hi Tom,

I will send the PR this week. I kinda forgot about this issue, but I will take it up this week.
Sorry for the delay.

It turned out I could only reproduce this problem in the Excel file mentioned above.
When I implemented the option and created integration tests with a newly created test Excel file, both the "formatted value" and "raw value" options returned the unformatted values. 🤔
I don't know why the problem in my original Excel files was solved when I used the IMEX=2 option.
Therefore, as my tests were unfortunately unable to prove the expected difference between IMEX=1 and IMEX=2, there is no point in sending a PR with my changes. 😕

Hi John, I think this issue is also affected by the data types in the first 8 rows of the Excel sheet. If the first 8 rows are strings, the engine sets the whole column for the sheet as strings, so all values are returned as formatted values. If there is a mix of data types, the column is set as a mixed type and therefore, depending on the IMEX value used, either a formatted or an unformatted value is returned.

https://stackoverflow.com/questions/50013333/reading-excel-with-oledb-not-displaying-correct-values
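The row-sampling idea described in this comment can be sketched as follows. This is a simplified model of the explanation given here, not the actual behavior of the OLE DB driver, which may differ in detail; `guess_column_type` and its `sample_rows` parameter are hypothetical names.

```python
def guess_column_type(values, sample_rows=8):
    """Classify a column from its first `sample_rows` cells (simplified model)."""
    sample = values[:sample_rows]
    # Treat Python strings as text cells and everything else as numeric cells.
    kinds = {"text" if isinstance(v, str) else "number" for v in sample}
    if not kinds:
        return "unknown"  # empty column: nothing to sample
    if kinds == {"text"}:
        return "text"
    if kinds == {"number"}:
        return "number"
    return "mixed"


if __name__ == "__main__":
    # The ninth cell is numeric, but only the first eight cells are sampled.
    print(guess_column_type(["a"] * 8 + [1.5]))
    print(guess_column_type(["a", 1.5, "b"]))
```

Under this model, a test sheet only shows a difference between IMEX settings if the mixed values fall inside the sampled rows.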

Hi Tom,

I think you could be right. But in the mentioned Excel sheet I got either the formatted value or the unformatted value depending on the IMEX setting. It was consistent.
I don't understand why things behave differently in my newly created test Excel sheet, even if I use mixed input in a given column.
As said, because I cannot get things to work as envisioned, I will leave things as they are.