redacted / XKCD-password-generator

Generate secure multiword passwords/passphrases, inspired by XKCD

Encoding when printing utf-8 to windows console

anlx-sw opened this issue

I have encoding problems when using xkcdpass to create passphrases from a wordlist containing UTF-8 characters and printing them to the Windows console.

This works without problems on Linux; the problem occurs only on Windows.

Maybe the output has to be prepared somehow to work on the windows console:
https://neurocline.github.io/dev/2016/10/13/python-utf8-windows.html

Environment:

Python 3.6.4
xkcdpass installed via pip (xkcdpass-1.14.3)

I tested it with the compiled C:\Python36\Scripts\xkcdpass.exe that pip installs.
The problem is the same in the normal cmd.exe console and in the powershell.exe console.

Sample Output:
Herrscher Silber fördern Plädoyer verstehe Ablösung

I think that should read:
Herrscher Silber fördern Plädoyer verstehe Ablösung

Update:
If I print the umlauts directly from python.exe started in the Windows console, I get no errors:

> python
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:54:40) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Umlaute ÄÖÜäöÜ")
Umlaute ÄÖÜäöÜ
>>> 

If I try to use xkcdpass as a module, the same error occurs:

> python
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:54:40) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import xkcdpass
>>> from xkcdpass import xkcd_password as xp
>>> wordfile = xp.locate_wordfile("ger-anlx-sorted.txt")
>>> mywords = xp.generate_wordlist(wordfile=wordfile)
>>> print(xp.generate_xkcdpassword(mywords))
Georgios verlangte Töne holte teilten unbekannt
>>>

This should read
Georgios verlangte Töne holte teilten unbekannt

As your test with printing Unicode directly succeeded, I suspect the cause is the open() call that reads the word file. I assume the word file is stored as UTF-8 on Windows as well, but open() without an encoding argument uses the platform-dependent default encoding. On Linux this is usually UTF-8; on Windows it is the ANSI code page, typically cp1252 on Western European systems (close to ISO-8859-1), which would explain your findings.

Can you try what happens when you change that line to this?

    with open(wordfile, encoding='utf-8') as wlf:
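
For anyone who wants to reproduce the effect without a Windows machine, here is a minimal sketch. It assumes the Windows default is cp1252 and uses a small made-up word list in a temporary file (the file name and the words are placeholders, not the actual xkcdpass word file). Reading the same UTF-8 bytes once as cp1252 and once as UTF-8 shows the difference:

```python
import locale
import os
import tempfile

# Hypothetical stand-in for a UTF-8 encoded word list.
words = ["fördern", "Plädoyer", "Töne"]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "wordlist.txt")

    # Store the words as UTF-8, as the real word file is assumed to be.
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(words))

    # Simulate the assumed Windows default: decode the UTF-8 bytes as cp1252.
    # Each umlaut is two bytes in UTF-8, so it turns into two wrong characters
    # (for example "Töne" comes back as "TÃ¶ne").
    with open(path, encoding="cp1252") as f:
        garbled = f.read().splitlines()

    # The suggested fix: state the encoding explicitly.
    with open(path, encoding="utf-8") as f:
        fixed = f.read().splitlines()

# The encoding open() falls back to when none is given on this machine.
print("platform default encoding:", locale.getpreferredencoding(False))
print("cp1252 read matches original:", garbled == words)
print("utf-8 read matches original: ", fixed == words)
```

With an explicit encoding the result no longer depends on the platform default, which is why the same word file behaves differently on Linux and Windows without it.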

A quick test on Windows 10 suggests that @florianjacob's fix above works. I've pushed the change; can you check whether it works for you?

Yes, I can confirm that this fix works for me. Thanks.

This fix is in the 1.16.1 release - thanks again