explorerhq / django-sql-explorer

Easily share data across your company via SQL queries.

Exports need to include a BOM

marksweb opened this issue

Currently, exporting a CSV and opening it in Excel garbles any Unicode characters.

For example, Japanese characters display as 裕太
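
The garbling can be reproduced outside Excel. A minimal sketch, assuming Excel falls back to the Windows-1252 code page when the file has no BOM (the actual fallback depends on the user's locale):

```python
# Reproduce the mojibake: UTF-8 bytes misread as Windows-1252.
original = "裕太"
utf8_bytes = original.encode("utf-8")

# Assumption: without a BOM, Excel decodes the bytes with the legacy
# ANSI code page, which is Windows-1252 on many Western installations.
misread = utf8_bytes.decode("cp1252")
print(misread)  # the same garbled text as reported above
```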

The exporter needs to do something like this to write the BOM at the beginning of the file:

    contents = StringIO()
    contents.write(codecs.BOM_UTF8.decode('utf-8'))
    writer = csv.writer(contents)
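
For reference, a self-contained sketch of the same idea that can be run on its own. The helper name `csv_with_bom` and the sample rows are made up for illustration and are not part of django-sql-explorer:

```python
import codecs
import csv
from io import StringIO


def csv_with_bom(rows):
    """Return CSV text that starts with U+FEFF (hypothetical helper)."""
    contents = StringIO()
    # Decoding the three BOM bytes yields the single character U+FEFF.
    contents.write(codecs.BOM_UTF8.decode("utf-8"))
    writer = csv.writer(contents)
    writer.writerows(rows)
    return contents.getvalue()


text = csv_with_bom([["name"], ["裕太"]])
payload = text.encode("utf-8")  # bytes as they would be written to the file
print(payload.startswith(codecs.BOM_UTF8))
```

Encoding the plain CSV text with the `utf-8-sig` codec should give the same leading bytes without writing the BOM by hand.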

I'd like to work on that if that's ok?

Hi @bdista

It would be great if you're able to. I've not had the time yet so would love a PR to fix this!

Ok great, I'll prepare a PR, thanks!

Hi @marksweb,

While testing this issue locally, I noticed that the paths to the requirements files in test_project/start.sh don't match the actual requirements files.

Should I change that in the PR I'll make for the BOM, or do you prefer a dedicated issue and PR for this? (Sorry, newbie here.)

Thanks!

Hi @bdista, happy for you to include that fix.

I think I broke those paths when I migrated from Travis CI to GitHub Actions recently!

Hi @marksweb,

I've created a draft PR for this issue.
I'm not sure how people use these exports, so I'm wondering whether this could be a breaking change for anyone whose exported CSVs are then processed by other tools that don't play nicely with a UTF-8 BOM.
If that's a problem, perhaps a setting in app_settings.py could define whether a BOM is included or not.
Happy to try out other approaches if needed!
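
To make that concrete, here is a rough sketch of an opt-in flag. The `render_csv` function and its `include_bom` parameter are hypothetical and do not exist in the project; a real version would presumably read the value from app_settings.py. The last lines show how a consumer could tolerate either form by decoding with `utf-8-sig`, which strips a leading BOM if one is present:

```python
import codecs
import csv
from io import StringIO


def render_csv(rows, include_bom=True):
    """Encode rows as UTF-8 CSV bytes; include_bom is a hypothetical toggle."""
    buffer = StringIO()
    csv.writer(buffer).writerows(rows)
    # "utf-8-sig" prepends the BOM on encode; plain "utf-8" does not.
    encoding = "utf-8-sig" if include_bom else "utf-8"
    return buffer.getvalue().encode(encoding)


rows = [["name"], ["裕太"]]
with_bom = render_csv(rows, include_bom=True)
without_bom = render_csv(rows, include_bom=False)

# Decoding with "utf-8-sig" drops a leading BOM if present, so a
# downstream tool that reads this way sees the same text either way.
decoded_with = with_bom.decode("utf-8-sig")
decoded_without = without_bom.decode("utf-8-sig")
print(decoded_with == decoded_without)
```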

Thanks

Hi @bdista

At the moment Unicode characters don't export properly and nobody else has reported it, so I doubt this feature is heavily used, and I'd prefer to just get the bug fixed.

If this change leads to people raising issues because of the BOM, then I think we can consider a setting as an enhancement.

Actually, we also ran into this problem at our place.
Here is what we did.

We extended the built-in exporter a little:

    from io import BytesIO

    from explorer.exporters import CSVExporter


    class CSVExporterBOM(CSVExporter):
        def _get_output(self, res, **kwargs):
            csv_data = super(CSVExporterBOM, self)._get_output(res, **kwargs)
            csv_data_io = BytesIO()
            # Write the UTF-8 BOM first so Excel detects the encoding.
            csv_data_io.write(b'\xef\xbb\xbf')
            # Then append the parent's CSV text, encoded as UTF-8.
            csv_data_io.write(csv_data.getvalue().encode('utf-8'))
            return csv_data_io

Then we set this as the CSV exporter in the settings:

    EXPLORER_DATA_EXPORTERS = [
        ('csv', 'core.exporters.CSVExporterBOM'),
        ('excel', 'explorer.exporters.ExcelExporter'),
        ('json', 'explorer.exporters.JSONExporter')
    ]

So I think this is an existing problem; it was just easy to work around like this.

Ok thanks for the feedback! I'll leave the PR as is then.