datopian / datahub-qa

:package: Bugs, issues and suggestions for datahub.io

Home Page: https://datahub.io/

Encoding issues with `airport-codes_csv.csv`

yelizariev opened this issue

Describe the issue

How to reproduce

  1. Download airport codes file from the page https://datahub.io/core/airport-codes

wget https://datahub.io/core/airport-codes/r/airport-codes.csv

  2. Check the EPAR airport entry

grep EPAR, airport-codes_csv.csv
EPAR,small_airport,ArÅamów Airport,1455,EU,PL,PL-PK,Bircza,EPAR,,,"22.514298, 49.657501"

Expected behavior

The name must be Arłamów Airfield.

I also tried converting from different encodings, but without success:

grep EPAR, airport-codes_csv.csv  | iconv -f Windows-1252 -t UTF-8
EPAR,small_airport,Arłamów Airport,1455,EU,PL,PL-PK,Bircza,EPAR,,,"22.514298, 49.657501"
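An `Å` in place of `ł` may point to double encoding: the UTF-8 bytes of `ł` (`C5 82`) read as a single-byte charset and then encoded to UTF-8 a second time. If that is what happened, converting from a single-byte charset to UTF-8 adds another layer instead of removing one, and the reverse direction is the one to try. A minimal sketch, assuming ISO-8859-1 was the charset applied by mistake (an assumption, not confirmed for this dataset) and using a simulated sample string rather than the real file:

```shell
#!/bin/sh
# Simulated sample value; it is NOT read from the actual CSV file.
original='Arłamów Airport'

# Reproduce the suspected fault: treat the UTF-8 bytes as ISO-8859-1
# (assumed charset) and encode them to UTF-8 a second time.
garbled=$(printf '%s' "$original" | iconv -f ISO-8859-1 -t UTF-8)

# Undo it: converting UTF-8 to ISO-8859-1 restores the original
# UTF-8 byte sequence.
repaired=$(printf '%s' "$garbled" | iconv -f UTF-8 -t ISO-8859-1)

printf '%s\n' "$repaired"
```

If `iconv` stops with a conversion error on the real file, the assumption probably does not hold for every field, for example when only some characters were double-encoded or a different charset such as Windows-1252 was involved.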

@yelizariev thank you for reporting 🙏

This should be fixed upstream in the source dataset at https://github.com/datasets/airport-codes. Would you like to report it there and/or submit a fix? PRs are welcome. 🙂

INVALID / DUPLICATE. I think this is a duplicate of datasets/airport-codes#37.

Also, this should be reported at https://github.com/datasets/airport-codes