vaab / gitchangelog

Creates a changelog from git log history.

UnicodeEncodeError character maps to undefined

LexiconCode opened this issue

Running gitchangelog on a repository produces the following error message:

Error: Exception while running 'gitchangelog':
| Traceback (most recent call last):
| File "c:\python27\lib\site-packages\gitchangelog\gitchangelog.py", line 1972, in main
| config.get("publish", stdout)(content)
| File "c:\python27\lib\site-packages\gitchangelog\gitchangelog.py", line 1445, in stdout
| safe_print(chunk)
| File "c:\python27\lib\site-packages\gitchangelog\gitchangelog.py", line 1820, in safe_print
| content = content.encode(_preferred_encoding)
| File "c:\python27\lib\encodings\cp1252.py", line 12, in encode
| return codecs.charmap_encode(input,errors,encoding_table)
| UnicodeEncodeError: 'charmap' codec can't encode character u'\u011b' in position 64: character maps to <undefined>
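
The failure in the traceback can be reproduced outside gitchangelog. The following is a minimal sketch, assuming the changelog text contains U+011B as reported above. The helper encode_for_console and the sample text are hypothetical and are not part of gitchangelog.

```python
# Hypothetical sample text containing U+011B, the character from the traceback.
text = "Zm\u011bna"


def encode_for_console(content, encoding, errors="strict"):
    """Encode text to bytes, as a print helper might before writing output."""
    return content.encode(encoding, errors)


try:
    # cp1252 has no slot for U+011B, so strict encoding raises.
    encode_for_console(text, "cp1252")
except UnicodeEncodeError as exc:
    print("strict encoding failed:", exc.reason)

# errors="replace" substitutes "?" for each unencodable character.
print(encode_for_console(text, "cp1252", errors="replace"))

# UTF-8 can represent every Unicode character, so nothing is lost.
print(encode_for_console(text, "utf-8"))
```

The same reasoning applies to the 'ascii' codec, which rejects every character above U+007F.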

Try running it with Python 3.

I'm running Python 3, but it stops writing after encountering a Unicode lambda character (\u03bb) in a git commit. I get a UnicodeEncodeError too:

λ gitchangelog > CHANGELOG.md
UnicodeEncodeError:  There was a problem outputing the resulting
changelog to your console. This probably means that the changelog 
contains characters that cant be translated to characters ...

Strangely, the command alone prints to the terminal without error:

λ gitchangelog

I just need to capture the terminal output. I wouldn't mind ignoring the Unicode characters, given the option.

OS: Windows 7
Python: 3.6
gitchangelog: 3.0.3+

I have the same issue under openSUSE 15.1:

Error: Exception while running 'gitchangelog':
| Traceback (most recent call last):
| File "/home/ralph/bin/gitchangelog", line 1972, in main
| config.get("publish", stdout)(content)
| File "/home/ralph/bin/gitchangelog", line 1445, in stdout
| safe_print(chunk)
| File "/home/ralph/bin/gitchangelog", line 1820, in safe_print
| content = content.encode(_preferred_encoding)
| UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 38694: ordinal not in range(128)

And this is my fix:

#!/usr/bin/env python3

@roseswe In which file did you put that shebang?

I found a workaround on Windows with Python < 3.7:

Code

Per terminal session, run:

λ set PYTHONIOENCODING=UTF-8
λ gitchangelog

This sets the PYTHONIOENCODING environment variable to UTF-8, so Python encodes its standard streams as UTF-8 for the rest of that session.
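
The same override can be applied to a single command instead of the whole session. This is a rough sketch, assuming the tool writes through Python's standard output as the workaround suggests. capture_utf8 is a hypothetical helper, and the demo command is a stand-in child Python process rather than gitchangelog itself.

```python
import os
import subprocess
import sys
import tempfile


def capture_utf8(cmd, out_path):
    """Run cmd with PYTHONIOENCODING=utf-8 and write its raw stdout to out_path."""
    # Copy the current environment and override only the I/O encoding.
    env = dict(os.environ, PYTHONIOENCODING="utf-8")
    result = subprocess.run(cmd, env=env, stdout=subprocess.PIPE, check=True)
    # Write the bytes unchanged so no second encoding step can fail.
    with open(out_path, "wb") as fh:
        fh.write(result.stdout)
    return result.stdout


# Stand-in for ["gitchangelog"]: a child Python process printing U+03BB.
demo_cmd = [sys.executable, "-c", "print('\\u03bb demo')"]
demo_path = os.path.join(tempfile.mkdtemp(), "CHANGELOG.md")
demo_bytes = capture_utf8(demo_cmd, demo_path)

# ascii() keeps this print safe even on a console that cannot show U+03BB.
print(ascii(demo_bytes.decode("utf-8")))
```

Calling capture_utf8(["gitchangelog"], "CHANGELOG.md") would be the intended use, assuming gitchangelog is on the PATH. PYTHONIOENCODING also accepts an error handler after a colon, for example cp1252:replace, which may suit anyone who would rather drop the unsupported characters.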


Details

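It seems Python 3.6 falls back to the system's locale encoding when standard output is redirected to a file or pipe. On Windows 7 that encoding is typically CP-1252, which cannot represent characters such as λ, resulting in the error described above. Output written directly to the console is handled separately, which would explain why the command alone prints fine. Python 3.7 introduced a UTF-8 mode, enabled with PYTHONUTF8=1 (or -X utf8): 1 makes UTF-8 the default; 0 keeps the system encoding.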
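
To see which of these settings is in effect on a given machine, a small check like the one below may help. It is only a sketch: describe_io_encoding is a hypothetical helper name, and the values it reports depend on the platform, the Python version, and whether output is redirected.

```python
import locale
import sys


def describe_io_encoding():
    """Collect the settings that decide how print() encodes text."""
    return {
        # Encoding and error handler of the current stdout stream.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "stdout_errors": getattr(sys.stdout, "errors", None),
        # Locale encoding, used as the fallback for redirected output.
        "locale_encoding": locale.getpreferredencoding(False),
        # sys.flags.utf8_mode exists only on Python 3.7+; None means unavailable.
        "utf8_mode": getattr(sys.flags, "utf8_mode", None),
    }


for name, value in describe_io_encoding().items():
    print(f"{name}: {value}")
```

Running the script once directly and once with its output redirected to a file should show whether the stdout encoding changes between the two cases.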

Thus on any system, I extend @luis-valdes-21b 's suggestion to:

try using Python 3.7+ (with PYTHONUTF8=1) or set PYTHONIOENCODING=UTF-8

Since this is a resolved Python problem, I believe this issue can be closed now.

Thanks, everyone. This solves the issue for me.

Can we expect updated sources? ;-)

@roseswe In which file did you put that shebang?

In the main Python script, "gitchangelog.py".