Parsing fails when encountering text in Cyrillic

Traut89now opened this issue 3 years ago

Hi,
when I try to search for deleted tweets from an account that uses Cyrillic, the process fails with the following exception:

Parsing text...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:07<00:00, 2.00s/it]
Traceback (most recent call last):
File "twayback.py", line 184, in
File "encodings\cp1250.py", line 19, in encode
UnicodeEncodeError: 'charmap' codec can't encode character '\u0430' in position 0: character maps to <undefined>
[20476] Failed to execute script 'twayback' due to unhandled exception!

Is there anything I can do on my end? Thanks.
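
For readers hitting the same traceback, here is a minimal sketch of what the error means. It assumes nothing about what line 184 of twayback.py actually does; `save_lines` is a hypothetical helper, not part of Twayback. The character `'\u0430'` is the Cyrillic letter "а", and the Windows code page cp1250 (Central European) has no slot for it, so encoding fails. Writing output with an explicit UTF-8 encoding is one common way to avoid this.

```python
import os
import tempfile

# U+0430 is the Cyrillic small letter "а" named in the traceback.
text = "\u0430"

# cp1250 contains no Cyrillic letters, so encoding raises the same
# UnicodeEncodeError that the traceback reports.
try:
    text.encode("cp1250")
    cp1250_error = None
except UnicodeEncodeError as exc:
    cp1250_error = str(exc)

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_bytes = text.encode("utf-8")


def save_lines(path, lines):
    """Hypothetical helper: write lines to a file as UTF-8.

    Passing encoding="utf-8" stops Python from falling back to the
    system's locale encoding (cp1250 on the reporter's machine).
    """
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))


# Round-trip a Cyrillic string through a temporary file.
tmp_dir = tempfile.mkdtemp()
tmp_path = os.path.join(tmp_dir, "tweets.txt")
save_lines(tmp_path, [text])
with open(tmp_path, encoding="utf-8") as f:
    round_trip = f.read()
```

Whether the original failure came from writing a file or from printing to the console is not clear from the traceback, so this only illustrates the encoding mismatch itself.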

Mennaruuk · Answer 1 · Tue Feb 15 2022 00:46:19 GMT+0800 (China Standard Time)

Yes, I’m currently working on it! Should be an easy fix. I will upload a working release today and ping you!!!

Mennaruuk · Answer 2 · Tue Feb 15 2022 03:16:33 GMT+0800 (China Standard Time)

@Traut89now Fixed! Check out the 02/14/2022 release and let me know how it goes please 😊

Traut89now · Answer 3 · Tue Feb 15 2022 04:18:37 GMT+0800 (China Standard Time)

Unfortunately, the fix seems to have broken the tool altogether for me. The process runs without throwing an error, but the "both" option generates only an empty .txt file, and the .html files contain no data either. I tested this on the @vostsbiry Twitter account you used, as well as on other accounts that don't use any irregular characters in their tweets.

Mennaruuk · Answer 4 · Tue Feb 15 2022 05:20:57 GMT+0800 (China Standard Time)

My bad 😅 It's my fault. Basically, the downloader and text parser were dealing with an empty list of URLs, so I made sure the list isn't empty anymore! Try again, and don't hesitate to report any more issues; I want to know if there are any problems. Thank you!

Traut89now · Answer 5 · Wed Feb 16 2022 01:00:11 GMT+0800 (China Standard Time)

Works perfectly now, thank you 👍 :) !