BOM and enclosured multiline data
cbepxpa3ym opened this issue
Bug Report
| Information | Description |
|---|---|
| Version | 9.8.0 |
| PHP version | 8.1 |
| OS Platform | Windows 11 |
Summary
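When a CSV document starts with a BOM and its first field is an enclosed multiline value, the reader returns only the first line of that field, with the opening enclosure character left in place.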
Standalone code, or other way to reproduce the problem
Option 1
Create a new file in Excel and enter this text into the first column:
Save it as CSV with UTF-8 encoding (Excel prepends a BOM in this format) and run the following code:
var_export(Reader::createFromPath($path)->fetchOne()[0]);
Option 2
Run the following code:
$data = Reader::BOM_UTF8 . <<<CSV
"start
end"
CSV;
var_export(Reader::createFromString($data)->fetchOne()[0]);
Expected result
'start
end'
Actual result
'"start'
Note
It works as expected if there is no BOM.
@cbepxpa3ym thanks for reporting the bug. This one is due to the underlying PHP CSV parser. I'm not sure I can easily fix it. I will have to take a closer look at how to remove the BOM sequence before asking PHP to parse the CSV.
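To illustrate the parser behaviour described above without involving the library, here is a minimal sketch that uses only PHP's native `fgetcsv()`. The helper names `firstRecord()` and `stripUtf8Bom()` are hypothetical and exist only for this example. The assumption is that the BOM bytes sit in front of the opening enclosure, so the parser does not treat the field as enclosed and stops at the first line break.

```php
<?php

/**
 * Parse the first record of a CSV string with PHP's native fgetcsv().
 * Hypothetical helper for illustration only; not part of league/csv.
 */
function firstRecord(string $content): array
{
    $stream = fopen('php://memory', 'r+');
    fwrite($stream, $content);
    rewind($stream);

    // All control characters are passed explicitly to avoid relying on defaults.
    $record = fgetcsv($stream, null, ',', '"', '\\');
    fclose($stream);

    return $record === false ? [] : $record;
}

/**
 * Drop a leading UTF-8 BOM if one is present.
 * Hypothetical helper for illustration only.
 */
function stripUtf8Bom(string $content): string
{
    $bom = "\xEF\xBB\xBF";

    return str_starts_with($content, $bom) ? substr($content, strlen($bom)) : $content;
}

// Same payload as in the report: a BOM followed by one enclosed two-line field.
$data = "\xEF\xBB\xBF" . "\"start\nend\"";

$withBom = firstRecord($data);                   // BOM left in front of the enclosure
$withoutBom = firstRecord(stripUtf8Bom($data));  // BOM removed before parsing

var_export($withBom[0]);
echo PHP_EOL;
var_export($withoutBom[0]);
echo PHP_EOL;
```

With the BOM left in place, the first field should come back truncated at the line break, as in the report; once the BOM is stripped first, the full two-line value should be returned.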
Once the next minor version is released you will be able to do the following:
<?php
use League\Csv\Reader;
use League\Csv\SkipBOMSequence;
SkipBOMSequence::register();
$reader = Reader::createFromString($sequence.$data);
$reader->includeInputBOM();
$reader->addStreamFilter(SkipBOMSequence::getFiltername());
This will fix your issue. If you can't wait until then, you can already copy the following class into your codebase: https://github.com/thephpleague/csv/blob/master/src/SkipBOMSequence.php . Make sure to read the documentation to understand its usage and shortcomings: https://csv.thephpleague.com/9.0/connections/bom/#removing-the-bom-sequence .
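For readers who want to see the general idea behind removing a BOM with a stream filter, here is a rough sketch built on PHP's `php_user_filter`. It is not the library's `SkipBOMSequence` class: the class name `StripUtf8BomFilter` and the filter name `example.strip_utf8_bom` are made up for this example, it handles only the UTF-8 BOM, and it assumes the BOM arrives complete in the first chunk of data.

```php
<?php

/**
 * Hypothetical read filter that removes a leading UTF-8 BOM.
 * Illustration only; this is not the league/csv implementation.
 */
final class StripUtf8BomFilter extends php_user_filter
{
    /** Whether the first chunk of the stream has already been inspected. */
    private bool $checked = false;

    public function filter($in, $out, &$consumed, bool $closing): int
    {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;

            // Only the very first chunk can start with the BOM.
            if (!$this->checked) {
                $this->checked = true;
                if (str_starts_with($bucket->data, "\xEF\xBB\xBF")) {
                    $bucket->data = substr($bucket->data, 3);
                }
            }

            stream_bucket_append($out, $bucket);
        }

        return PSFS_PASS_ON;
    }
}

stream_filter_register('example.strip_utf8_bom', StripUtf8BomFilter::class);

// Same payload as in the report, read through the filter with native fgetcsv().
$stream = fopen('php://memory', 'r+');
fwrite($stream, "\xEF\xBB\xBF" . "\"start\nend\"");
rewind($stream);
stream_filter_append($stream, 'example.strip_utf8_bom', STREAM_FILTER_READ);

$record = fgetcsv($stream, null, ',', '"', '\\');
fclose($stream);

var_export($record[0]);
echo PHP_EOL;
```

Because the filter runs before the parser sees any bytes, the enclosure becomes the first character of the field and the multiline value should be parsed as a whole. A filter that must cope with a BOM split across chunks, or with other encodings, would need extra buffering that this sketch leaves out.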