exceljs / exceljs

Excel Workbook Manager

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

read csv file issue

hehhuang opened this issue · comments

var Excel = require("exceljs");
// read from a file
var workbook = new Excel.Workbook();
workbook.csv.readFile("3.csv")
.then(worksheet => {
// use workbook or worksheet
console.log(worksheet.getColumn(1).values);
});
/output :
[ <1 empty item>,
'name',
2001-04-30T16:00:00.000Z,
2001-04-30T16:00:00.000Z,
'george's Grocery aa55' ]
/
but the csv file is like this:
name
George's Grocery 005
Quinn's Supermarket #5
george's Grocery aa55
the issue is that if the value contains space+number(or other Special symbol), it will output a date type value.For example "a 123" will output "0122-12-31T15:54:17.000Z". How to resolve this.(xlsx file can work well).

I am encountering a similar issue, except it does not appear to be tied to a space or a special symbol.

String 0020Z6TNY parses as a string: 0020Z6TNY. (expected).
String 0020Z6TRL parses as a string: 0020Z6TRL. (expected).
String 00210PRG1 parses as a date: 1920-12-01T06:00:00.000Z. (unexpected).
String 00210PRGS parses as a date: 1920-12-01T05:00:00.000Z. (unexpected).
String 00211MDN2 parses as a date: 1921-01-01T07:00:00.000Z. (unexpected).

I wonder if this could be related to day.js date validation?

As a workaround I passed in an options object that specifies how each column should be parsed. This allowed me to override the behavior on a column by column basis. Changing the dateFormats part of the options object does not appear to work, which again may point back at day.js.

I looked into it a little bit more, and this is what I found. I would submit a pull request but there are a few different things going on between exceljs and dayjs, and I'm not sure what would be the right approach to rectifying it.

CSV.createInputStream's default map attempts to convert a cell's value into a date with dayjs. If the result passes a test as a valid date, it is returned as a date object for the value for the cell. However, the implementation of dayjs in exceljs is not extended with the customParseFormat plugin. Another issue is that it looks like dayjs does not accept an array of formats as an argument, only a single string as a primitive or a property in an object.

So instead of the intended behavior, dayjs defaults to parsing anything that passes a permissive regex test into a valid date, which exceljs then returns as the value for the cell. I tested the regex and had it pass and fail as reflected in my test cases above, but it did not do the same for hehhuang's examples.

The regex is
/^(\d{4})-?(\d{1,2})-?(\d{0,2})[^0-9]*(\d{1,2})?:?(\d{1,2})?:?(\d{1,2})?.?(\d{1,3})?$/
which is satisfied by anything that passes
/^\d{4}-?\d/

@hehhuang, is the example you presented exactly the same as you used when you experienced the issue? If so, these may be two separate issues.

Thanks for reply.My solution is to convert csv file to xlsx file that will not execute the date conversion operation.That will solve this issue correctly.

I'm seeing a similar issue with data like this being converted to dates:
RN-56785
GG-74510
KS-49593

Problem occurs with ExcelJS v2 & v3.

Works ok by downgrading to v1.15.0.

There is a nasty issue waiting in the wings. Even when using dayjs with customParseFormat, dayjs does not strictly enforce the format (and seems to have no option to do so). This results in non-dates still getting interpreted as dates even when using a carefully specified date format.

I hacked up exceljs to fix the issue noted by @LibertyNJ above and discovered this issue.

Consider the following program:

const dayjs = require('dayjs');
const customParseFormat = require('dayjs/plugin/customParseFormat');
dayjs.extend(customParseFormat);

const date = dayjs("NZ1019316", "D/M/YY");
console.log("Valid?", date.isValid());
console.log("Date?", date.format());

Output:

Valid? true
Date? 2032-07-10T00:00:00+10:00

See iamkun/dayjs#605

So the reason exceljs v1 worked is because it uses momentjs rather than dayjs. Unfortunately dayjs is not a drop-in replacement in this instance. I humbly suggest reverting to momentjs.

This code in exceljs v1 (exceljs/lib/csv/csv.js) works as desired: handles array of date formats, true means date formats are strictly enforced.
v1
const dt = moment(datum, dateFormats, true);

New code: dateFormats array is invalid (can be solved with loop), true is ignored since dayjs does not have a strict mode (insurmountable with current dayjs).

v3
const dt = dayjs(datum, dateFormats, true);