sheet_to_json: inconsistent blank cell parsing
gliluaume opened this issue
Hello,
I think there is an issue with the sheet_to_json function that depends on the file given. In some cases, blank cells are parsed to an empty string in the output JavaScript object; in other cases these cells are undefined (no key) in the output object.
I do not set the defval option, as I want undefined values to stay undefined.
You can see an example in the following repo: https://github.com/gliluaume/bug-xlsx
bug-xlsx.xlsx is supposed to match the XLSX format (generated from Google Sheets)
bug-xlsx-excel5.xls is an Excel 5.0 file
Using Node (see index.js), the output is consistent for each file, but not as expected:
./bug-xlsx.xlsx
[
  {
    Stuff1: 123,
    Stuff3: 456,
    Stuff4: 'AED',
    Stuff5: 'BAR',
    Stuff6: 'x',
    Stuff7: '',
    Stuff8: '',
    Stuff9: 'A'
  },
  {
    Stuff1: 123,
    Stuff3: 456,
    Stuff4: 'AED',
    Stuff5: 'BUS',
    Stuff6: 'x',
    Stuff7: '',
    Stuff8: '',
    Stuff9: 'A'
  },
  {
    Stuff1: 123,
    Stuff3: 456,
    Stuff4: 'AED',
    Stuff5: 'APPL',
    Stuff6: 'x',
    Stuff7: '',
    Stuff8: '',
    Stuff9: 'A'
  }
]
./bug-xlsx-excel5.xls
[
  {
    Stuff1: 123,
    Stuff3: 456,
    Stuff4: 'AED',
    Stuff5: 'BAR',
    Stuff6: 'x',
    Stuff9: 'A'
  },
  {
    Stuff1: 123,
    Stuff3: 456,
    Stuff4: 'AED',
    Stuff5: 'BUS',
    Stuff6: 'x',
    Stuff9: 'A'
  },
  {
    Stuff1: 123,
    Stuff3: 456,
    Stuff4: 'AED',
    Stuff5: 'APPL',
    Stuff6: 'x'
  }
]
Using the library in a browser (Firefox 104.0.2, 64-bit, on Windows 10), I can see the same behaviour.
The files themselves have different contents. You can inspect with the parser:
% for F in ./bug-xlsx*; do echo $F; node -pe 'var wb = require("xlsx").readFile("'$F'"); wb.Sheets[wb.SheetNames[0]].G2'; done
./bug-xlsx-excel5.xls
undefined
./bug-xlsx.xlsx
{ t: 's', v: '', r: '<t></t>', h: '', w: '' }
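To check a whole sheet rather than a single cell, something like the following sketch could work. It assumes the default (non-dense) worksheet layout, where cells are stored under address keys such as "G2" and metadata keys start with "!". The helper name findBlankStringCells and the mock worksheet are made up for illustration and are not part of the library.

```javascript
// Hypothetical helper: returns the addresses of cells stored as empty strings.
// Assumes a worksheet object keyed by cell address, as in the output above.
function findBlankStringCells(ws) {
  return Object.keys(ws).filter((addr) => {
    if (addr[0] === "!") return false; // skip metadata keys such as "!ref"
    const cell = ws[addr];
    return cell != null && cell.t === "s" && cell.v === "";
  });
}

// Mock worksheet shaped like the parsed XLSX file (illustrative data only).
const mockSheet = {
  "!ref": "A1:G2",
  A1: { t: "s", v: "Stuff1" },
  A2: { t: "n", v: 123 },
  G2: { t: "s", v: "", r: "<t></t>", h: "", w: "" },
};

console.log(findBlankStringCells(mockSheet)); // logs: [ 'G2' ]
```

With a real file, the worksheet would come from wb.Sheets[wb.SheetNames[0]] as in the command above.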
The XLSX file literally has blank strings:
<!-- xl/worksheets/sheet1.xml -->
<c r="G2" s="4" t="s"><v>11</v></c>
<!-- xl/sharedStrings.xml -- you have to count starting from index 0 -->
<si><t></t></si>
Google Sheets must be translating the unspecified cells to blank string cells. You can always programmatically delete those fields from the output of sheet_to_json:
const newobj = obj.map(r => Object.fromEntries(Object.entries(r).filter(([k,v]) => v !== "")));
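As a rough illustration of the one-liner applied to a row like the ones in the XLSX output, here is a small sketch. The function name dropEmptyStrings is hypothetical, and the sample row is trimmed from the output in the issue. The strict comparison only drops empty strings, so values such as 0 or false are kept.

```javascript
// Hypothetical wrapper around the one-liner; not part of the xlsx library.
function dropEmptyStrings(rows) {
  return rows.map((row) =>
    Object.fromEntries(
      // Keep every entry whose value is not exactly the empty string.
      Object.entries(row).filter(([, value]) => value !== "")
    )
  );
}

// Sample row shaped like the sheet_to_json output for the XLSX file.
const sampleRows = [
  { Stuff1: 123, Stuff6: "x", Stuff7: "", Stuff8: "", Stuff9: "A" },
];

console.log(dropEmptyStrings(sampleRows));
// logs: [ { Stuff1: 123, Stuff6: 'x', Stuff9: 'A' } ]
```

After this step the row has the same keys as the corresponding row parsed from the XLS file.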
Hello,
Thank you for your quick response!
Yes, I handle it by removing properties with empty string value, for now.