ejwa / gitinspector

:bar_chart: The statistical analysis tool for git repositories

UnicodeEncodeError: 'charmap' codec can't encode character '\u0107' in position 865: character maps to <undefined>

rameshrr opened this issue · comments

gitinspector -F html > test.html

Traceback (most recent call last):
  File "C:\Users\EMD BACKUP\AppData\Roaming\npm\node_modules\gitinspector\gitinspector.py", line 24, in <module>
    gitinspector.main()
  File "C:\Users\EMD BACKUP\AppData\Roaming\npm\node_modules\gitinspector\gitinspector\gitinspector.py", line 206, in main
    run.process(repos)
  File "C:\Users\EMD BACKUP\AppData\Roaming\npm\node_modules\gitinspector\gitinspector\gitinspector.py", line 86, in process
    outputable.output(BlameOutput(summed_changes, summed_blames))
  File "C:\Users\EMD BACKUP\AppData\Roaming\npm\node_modules\gitinspector\gitinspector\output\outputable.py", line 39, in output
    outputable.output_html()
  File "C:\Users\EMD BACKUP\AppData\Roaming\npm\node_modules\gitinspector\gitinspector\output\blameoutput.py", line 95, in output_html
    print(blame_xml)
  File "C:\Program Files (x86)\Python36-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0107' in position 865: character maps to <undefined>

Hi @rameshrr. This is not caused by gitinspector itself, but by your terminal or destination pipe not being able to handle the characters. What is interesting, though, is that the output defaults to cp1252 despite the redirection, when it should default to UTF-8. That means set_stdout_encoding() is failing to set the encoding to UTF-8 for some reason.

This could actually happen if you run gitinspector under Python 3 with PYTHONIOENCODING forced/set to cp1252. It could also happen if isatty() returns true despite the redirection. There are already countless issues covering this here on the tracker.
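For illustration only, here is a minimal sketch of the kind of decision described above: leave the stream alone when it is a terminal, and force UTF-8 when it is redirected. The helper name utf8_stream_if_redirected is hypothetical, and this is not gitinspector's actual set_stdout_encoding() code.

```python
import io
import sys


def utf8_stream_if_redirected(stream):
    """Hypothetical helper, not part of gitinspector.

    Returns the stream unchanged when it is attached to a terminal.
    Otherwise returns a new text wrapper around the same underlying
    binary buffer that always encodes as UTF-8.
    """
    if stream.isatty():
        # A terminal knows its own encoding, so keep whatever Python chose.
        return stream
    # Redirected to a file or pipe: encode as UTF-8 regardless of locale.
    return io.TextIOWrapper(stream.buffer, encoding="utf-8", line_buffering=True)


if __name__ == "__main__":
    out = utf8_stream_if_redirected(sys.stdout)
    print("stdout encoding in use:", out.encoding, file=out)
    out.flush()
```

If isatty() wrongly reports a terminal while output is redirected, a check like this would take the first branch and keep the locale encoding, which would match the cp1252 behaviour seen in the traceback.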

In what terminal are you running gitinspector? You could try setting PYTHONIOENCODING to UTF-8 before running gitinspector; this would probably solve your issue.
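As a rough way to see what PYTHONIOENCODING changes, the sketch below starts a child Python interpreter with the variable set and reports which encoding the child picks for its redirected stdout. The helper name child_stdout_encoding is made up for this example.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding=None):
    """Hypothetical helper: report sys.stdout.encoding of a child Python.

    The child's stdout is a pipe, which is similar to redirecting
    output to a file on the command line.
    """
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)  # start from a clean state
    if io_encoding is not None:
        env["PYTHONIOENCODING"] = io_encoding
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


if __name__ == "__main__":
    print("without PYTHONIOENCODING:", child_stdout_encoding())
    print("with PYTHONIOENCODING=utf-8:", child_stdout_encoding("utf-8"))
```

In the Windows command prompt, the variable can be set for the current session with `set PYTHONIOENCODING=utf-8` before running the original command.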

To explain more clearly what is going on: your terminal/pipe is requesting CP1252 characters from Python, so Python tries to comply and encodes its output that way. Unfortunately, many Unicode characters, including the '\u0107' from your traceback, have no mapping in that encoding.
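The mismatch can be reproduced without gitinspector. The small sketch below, using a helper name (can_encode) chosen just for this example, checks whether a string can be represented in a given encoding:

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    sample = "\u0107"  # LATIN SMALL LETTER C WITH ACUTE, as in the traceback
    print("cp1252:", can_encode(sample, "cp1252"))
    print("utf-8: ", can_encode(sample, "utf-8"))
```

CP1252 is a single-byte encoding with at most 256 code points, so any character outside that table fails in the same way.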

Understood.

FYI, I'm using the Windows command prompt (Windows 10).

It would be better if the application handled the exception and printed a useful message to the user. Just my thoughts :)

This has also been discussed in the past. The exception from Python already provides more information than any error message would. Many different exceptions can occur from invalid decoding and terminal issues in Python; swallowing them is generally not a good idea and would just make problems more difficult to diagnose. We could maybe extend them and add some additional help, but I'm uncertain how useful it would really be. Gitinspector already prints certain hints concerning PYTHONIOENCODING to stderr under certain conditions.
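To make the "extend them and add some additional help" idea concrete, here is one possible sketch. It is not gitinspector code; the function name print_with_hint and the wording of the hint are assumptions made for this example. The original exception is re-raised, so nothing is swallowed.

```python
import io
import sys


def print_with_hint(text, stream=None, err=None):
    """Hypothetical wrapper: print text, adding a hint on encoding failure.

    The UnicodeEncodeError is always re-raised after the hint is written,
    so the full traceback remains available to the user.
    """
    stream = sys.stdout if stream is None else stream
    err = sys.stderr if err is None else err
    try:
        print(text, file=stream)
    except UnicodeEncodeError:
        encoding = getattr(stream, "encoding", "unknown")
        print(
            "hint: output encoding is %s; try setting "
            "PYTHONIOENCODING=utf-8 before running" % encoding,
            file=err,
        )
        raise


if __name__ == "__main__":
    # Simulate a cp1252 destination with an in-memory stream.
    demo_out = io.TextIOWrapper(io.BytesIO(), encoding="cp1252")
    try:
        print_with_hint("\u0107", stream=demo_out)
    except UnicodeEncodeError as exc:
        print("re-raised:", exc.reason)
```

Whether the extra line is worth it is the open question raised above, since the traceback already names the codec and the offending character.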

Try using PowerShell (included with Windows since Windows 7) and see if that makes a difference. The normal command prompt isn't really usable for any kind of decent work.