exceljs / exceljs

Excel Workbook Manager

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Defined Names Break/Corrupt Excel File into Repair Mode

jacoblockett opened this issue · comments

Problem

After a bit of time trying to figure out why my files were always being corrupted when using workbook.xlsx.writeFile(), I came to find out it was the named ranges (under the formulas tab) in the template I was reading that was causing the issue - more specifically, it was the way the ranges were being extracted and re-entered into a "new" xlsx file. To give you an idea of what I was going through, here's the xml for causing the issue:

First, the Excel Error

Excel found unreadable content in 'test.xlsx'. Do you want to recover the contents of this 
workbook? If you trust the source of this workbook, click Yes

Working XLSX Example

<definedName name="_xlnm.Print_Area" localSheetId="0">'12pm'!$C$1:$AX$12</definedName>
<definedName name="_xlnm.Print_Titles" localSheetId="0">'12pm'!$C:$E</definedName>

How ExcelJS is Rewriting it

<definedName name="_xlnm.Print_Area" localSheetId="0">12pm!C1:AX12</definedName>
<definedName name="_xlnm.Print_Titles" localSheetId="0">&apos;12pm&apos;!$$C:$$E</definedName>

As you can see, the first definedName named _xlnm.Print_Area had all of its apostrophes and dollar signs removed. _xlnm.Print_Titles however is replacing every apostrophe with &apos; entities, and for some reason, it is adding a new dollar sign on every run of the program (i.e. if I were to run it again, it would rewrite it as &apos;12pm&apos;!$$$C:$$$E)

Fortunately, I found out the apostrophes and dollar signs aren't necessary for these kind of simple names, so the _xlnm.Print_Area name works perfectly fine. The unfortunate side of things is the _xlnm.Print_Titles corrupts the file and sends excel into repair mode, at which point it manually deletes the names causing the issue.

Because I got very impatient trying to navigate through the code to find out why the latter of the two was doing all sorts of weird stuff (hint hint cough cough document the code a little more thoroughly please ^^), I decided to hard-patch the issue with a little filter.

Solution

If you're having the same issue as I am, (v. 1.6.1) navigate to the exceljs/dist/es5/xlsx/xform/book/defined-name-xform.js file, go to line 25:25, and replace the

xmlStream.writeText(model.ranges.join(','));

with

xmlStream.writeText(model.ranges.join(',').split('').filter(s => s !== '$' && s !== "\'").join(''));

This will remove all apostrophes and dollar signs before it writes to a file, thus making it readable by Excel.

Of course this is just a temporary fix, and I hope @guyonroche will be able to fix it permanently soon.

Brilliant @huntinghawk1415 this works. I'll try and give this a look too.
If I don't find the cause, I'll put up a PR based off this 😁. Thanks

I had this same error when attempting to put disallowed characters into a sheet name.

TestSet001/TestCase001

was replaced by:

TestSet001-TestCase001

and VOILA!

Thanks! Will break into multiple files then

@atljoseph Worksheetnames: #705

@huntinghawk1415 @LesterLyu Could You open PR with fixes for this bug?

@huntinghawk1415 Fix deployed in 1.12.1 - could you recheck please and close if satisfied

@guyonroche Apologies for the very delayed reply - I can confirm that the quotations and duplicate $ signs are now fixed, however as of 3.8.0 there's another issue. My original file's named ranges are as such:

<definedNames>
    <definedName hidden="1" name="_xlnm._FilterDatabase" localSheetId="1">'730 Democrat'!$A$6:$CO$17</definedName>
    <definedName name="_xlnm.Print_Area" localSheetId="0">'530-Republican'!$D$1:$CN$18</definedName>
    <definedName name="_xlnm.Print_Area" localSheetId="1">'730 Democrat'!$D$1:$CN$18</definedName>
    <definedName name="_xlnm.Print_Area" localSheetId="2">HOLDS!$C$1:$CO$18</definedName>
    <definedName name="_xlnm.Print_Titles" localSheetId="0">'530-Republican'!$D:$E,'530-Republican'!$4:$5</definedName>
    <definedName name="_xlnm.Print_Titles" localSheetId="1">'730 Democrat'!$D:$E,'730 Democrat'!$4:$5</definedName>
    <definedName name="_xlnm.Print_Titles" localSheetId="2">HOLDS!$C:$F,HOLDS!$4:$5</definedName>
</definedNames>

And the generated file gives:

<definedNames>
    <definedName name="_xlnm._FilterDatabase">'730 Democrat'!$A$6:$CO$17</definedName>
    <definedName name="_xlnm.Print_Area" localSheetId="0">'530-Republican'!$D1:$CN18</definedName>
    <definedName name="_xlnm.Print_Titles" localSheetId="0">'530-Republican'!$D:$E,'530-Republican'!$4:$5</definedName>
    <definedName name="_xlnm.Print_Area" localSheetId="1">'730 Democrat'!$D1:$CN18</definedName>
    <definedName name="_xlnm.Print_Titles" localSheetId="1">'730 Democrat'!$D:$E,'730 Democrat'!$4:$5</definedName>
    <definedName name="_xlnm.Print_Area" localSheetId="2">'HOLDS'!$C1:$CO18</definedName>
    <definedName name="_xlnm.Print_Titles" localSheetId="2">'HOLDS'!$C:$F,'HOLDS'!$4:$5</definedName>
</definedNames>

As you can see, the absolute cell declarations are being omitted in the generated file (!$D$1:$CN$18 VS !$D1:$CN18) as well as the localSheetId and hidden attributes from the very first definedName.
I'll open a separate issue for this considering the current issue has been fixed. Thanks!