Encoding Issue: Arabic Characters Conflict with Another Language in the Same Row
SheikhThingsUp opened this issue
I've noticed an issue with my script, which queries a database and generates a CSV file. When a row contains characters from a language other than Arabic or English, the Arabic characters in that row are not encoded correctly. Either the combine or the string method of Text::CSV seems to be causing the problem.
For example, in a line where the name "Pelé" is present, the Arabic word "نسيج" is transformed into random characters like "Ù�سÙ�ج,Ù�سÙ�ج". Interestingly, when I open the file in VI, I observe that the same word appears differently encoded in two different locations.
I've tried both with and without the binary => 1 option, and the issue persists either way.
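use Text::CSV;

my $csv = Text::CSV->new( { binary => 1 } );
open my $fh, ">:encoding(UTF-8)", "new.csv" or die "new.csv: $!";
print $fh "\x{feff}";                # write a BOM at the start of the file
# @row holds the columns of one row fetched from the database
my $status = $csv->combine(@row);    # combine columns into a string
my $line = $csv->string();           # retrieve the combined CSV line
print $fh $line;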
When I take Text::CSV out of the equation and write directly to the file with minimal transformation (just commas and quotes), it works fine.
The issue is also present in Text::CSV_XS.
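For comparison, here is a minimal sketch of the kind of direct write described above, with Text::CSV left out entirely. The helper name csv_line_plain, the file name plain.csv, and the sample row are made up for illustration. The sample fields are literals that Perl already treats as decoded character strings, which may not match what the database driver actually returns in the real script.

```perl
use strict;
use warnings;
use utf8;    # string literals in this file are UTF-8 text

# Hypothetical helper: wrap every field in double quotes, double any
# embedded quotes, and join the fields with commas. No Text::CSV involved.
sub csv_line_plain {
    my @fields = @_;
    return join ",", map {
        my $f = defined $_ ? $_ : "";    # treat undef as an empty field
        $f =~ s/"/""/g;                  # escape embedded quotes
        qq{"$f"};
    } @fields;
}

# Illustrative row only; the real @row comes from the database query.
my @row = ( "Pelé", "نسيج", 'say "hi"' );

open my $fh, ">:encoding(UTF-8)", "plain.csv" or die "plain.csv: $!";
print $fh "\x{feff}";                    # BOM, as in the snippet above
print $fh csv_line_plain(@row), "\n";
close $fh or die "plain.csv: $!";
```

This only handles the simple case (it quotes every field and does nothing special for other edge cases), so it is a point of comparison for the Text::CSV output rather than a replacement for it.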