dbro / csvquote

Enables common Unix utilities like cut, awk, wc, and head to work correctly with CSV data containing delimiters and newlines

Changing the number of lines

christiansaiki opened this issue

First of all, thank you for this tool.
My issue is that when I use this tool, it changes the total number of lines.

wc -l old.csv
Displays: 344548 old.csv

csvquote old.csv > new.csv
wc -l new.csv
Displays: 344370 new.csv

Do you have any idea why this is happening?
Thanks

Hi Christian-
This happens when there are newline characters enclosed within double quotes, like this:

r1c1,"r1c2
has a newline in it",r1c3
r2c1,r2c2,r2c3

Such a file has 3 lines according to a simple "wc -l" command, but only 2 after being sanitized by csvquote, because csvquote replaces each newline inside a quoted field with a non-printing character. The output of the regular csvquote command is normally passed through other tools and then has its embedded newlines and commas restored with the "csvquote -r" command, all as part of a pipeline of shell commands separated by the pipe ("|") character. There are examples in the instructions.
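
The difference can be reproduced with a minimal sketch built on the small example above. The file name sample.csv is hypothetical, and the awk one-liner is only a rough stand-in for counting logical records: it assumes the double quote is the only quoting character and is not how csvquote itself works. The csvquote call is guarded so the snippet still runs where the tool is not installed.

```shell
# sample.csv is a hypothetical file holding the two-record example above.
printf 'r1c1,"r1c2\nhas a newline in it",r1c3\nr2c1,r2c2,r2c3\n' > sample.csv

# Physical lines: wc -l counts every newline byte, quoted or not.
physical=$(wc -l < sample.csv | tr -d ' ')

# Logical records: a record ends on a line where the running count of
# double quotes is even (rough stand-in, not csvquote's implementation).
logical=$(awk '{ n += gsub(/"/, "&") } n % 2 == 0 { r++ } END { print r + 0 }' sample.csv)

echo "physical lines:    $physical"
echo "logical records:   $logical"
echo "embedded newlines: $((physical - logical))"

# If csvquote is installed, its sanitized output should have one line per record.
if command -v csvquote >/dev/null 2>&1; then
    csvquote sample.csv | wc -l
fi
```

For the sample this reports 3 physical lines and 2 logical records. By the same reasoning, the gap in the original report (344548 - 344370 = 178) would suggest 178 newlines sitting inside quoted fields in old.csv, assuming nothing else differs between the two files.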