guilatrova / tryceratops

A linter to prevent exception handling antipatterns in Python (limited only for those who like dinosaurs).

`UnicodeDecodeError` on Windows

tobiasraabe opened this issue

Hi,

thanks for the great tool! I might be one of the few Windows users out here with special characters in their files, because I get this error:

Traceback (most recent call last):
  File "c:\users\tobia\git\tryceratops\src\tryceratops\files\discovery.py", line 54, in _parse_python_files_from_dir
    parsed, filefilter = parse_file(filename)
  File "c:\users\tobia\git\tryceratops\src\tryceratops\files\parser.py", line 42, in parse_file
    tree = parse_tree(content)
  File "c:\users\tobia\git\tryceratops\src\tryceratops\files\parser.py", line 37, in parse_tree
    return ast.parse(content.read())
  File "C:\tools\miniconda3\envs\pytask\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 10491: character maps to <undefined>

The special character is "┐", which causes the error because files on my machine are opened with this default encoding:

>>> import locale
>>> locale.getpreferredencoding()
'cp1252'
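
To illustrate why this particular byte trips the decoder, here is a minimal sketch. "┐" (U+2510) is encoded in UTF-8 as the three bytes E2 94 90, and 0x90 has no mapping in Python's cp1252 codec, which matches the byte named in the traceback. The helper `decodes_with` is a hypothetical name used only for this example and is not part of tryceratops.

```python
def decodes_with(data: bytes, encoding: str) -> bool:
    """Return True if `data` can be decoded with `encoding` (illustrative helper)."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# "\u2510" is the box-drawing character "┐" from the report
corner = "\u2510".encode("utf-8")

print(corner.hex())                    # the UTF-8 bytes of the character
print(decodes_with(corner, "utf-8"))   # decoding with the encoding it was written in
print(decodes_with(corner, "cp1252"))  # decoding with the Windows default from above
```

A file that contains this character therefore reads fine as UTF-8 but fails as soon as it is opened with the cp1252 default.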

Adding encoding="utf-8" to this open call fixes the issue.

def parse_file(filename: str) -> Tuple[ast.AST, FileFilter]:
    with open(filename, "r") as content:
        tree = parse_tree(content)
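
A minimal sketch of the suggested change, under stated assumptions: `parse_file_utf8` is a hypothetical, simplified stand-in that only returns the parsed tree, whereas the real `parse_file` also returns a `FileFilter`. It is not necessarily the fix that was shipped; it only shows the effect of passing an explicit encoding to `open`.

```python
import ast
import os
import tempfile


def parse_file_utf8(filename: str) -> ast.AST:
    # Hypothetical simplified variant of parse_file: an explicit encoding
    # makes the read independent of locale.getpreferredencoding().
    with open(filename, "r", encoding="utf-8") as content:
        return ast.parse(content.read())


# Usage: write a small module containing the box-drawing character as UTF-8,
# then parse it the same way on any platform.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write('BORDER = "\u2510"\n')
    tree = parse_file_utf8(path)

print(type(tree).__name__)
```

With the encoding fixed, the result no longer depends on the machine's default code page.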

Thank you very much for reporting and for the detailed explanation! @tobiasraabe

A fix is on the way and should be available soon :)

Thanks for the quick response, @guilatrova! Everything works fine with the new version.