Add an option to specify minimum record length
yruslan opened this issue
Ruslan Yushchenko commented
Background
This comes from an issue with some ASCII files, but it is relevant to EBCDIC files as well.
Cobrix ignores all empty lines in ASCII files. But some files contain an EOF character at the end:
aaaa bbbb 1234
cccc dddd 5678
EOF
Since the last line contains a character, it is not empty and is treated as a record, resulting in one additional record:
+-----+-----+-----+
|A    |B    |C    |
+-----+-----+-----+
|aaaa |bbbb |1234 |
|cccc |dddd |5678 |
|null |null |null |
+-----+-----+-----+
It should be:
+-----+-----+-----+
|A    |B    |C    |
+-----+-----+-----+
|aaaa |bbbb |1234 |
|cccc |dddd |5678 |
+-----+-----+-----+
Feature
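Add an option to specify the minimum record length, so that shorter lines (such as a lone EOF character) are skipped instead of being parsed as records.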
Proposed Solution
.option("minimum_record_length", 2)
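
To illustrate the intended effect, here is a minimal standalone sketch in plain Python. It is not Cobrix code. The helper `filter_records` is hypothetical, the EOF marker is assumed to be the single character 0x1A (Ctrl-Z), and the length is assumed to be measured in characters after the line terminator is removed. The issue does not specify any of these details.

```python
from typing import Iterable, List

# Assumption: the trailing EOF marker is a single Ctrl-Z (0x1A) character.
EOF_MARKER = "\x1a"


def filter_records(lines: Iterable[str], minimum_record_length: int = 1) -> List[str]:
    """Keep only lines that are at least `minimum_record_length` characters long.

    Hypothetical helper. With the default of 1 it only skips empty lines,
    which mimics the current behavior described in this issue.
    """
    kept = []
    for line in lines:
        record = line.rstrip("\r\n")  # drop the line terminator before measuring
        if len(record) >= minimum_record_length:
            kept.append(record)
    return kept


lines = ["aaaa bbbb 1234\n", "cccc dddd 5678\n", EOF_MARKER]

current = filter_records(lines)       # only empty lines are skipped: EOF line is kept
proposed = filter_records(lines, 2)   # the 1-character EOF line is dropped as well
```

With a minimum length of 2, the one-character EOF line no longer produces the extra all-null row, while both real records are kept.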