humandecoded / twayback

Automate downloading archived deleted Tweets.

Parsing fails when encountering text in Cyrillic

Traut89now opened this issue

Hi,
when I try to search for deleted tweets from an account that uses Cyrillic, the process fails with the following exception:

Parsing text...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:07<00:00, 2.00s/it]
Traceback (most recent call last):
  File "twayback.py", line 184, in <module>
  File "encodings\cp1250.py", line 19, in encode
UnicodeEncodeError: 'charmap' codec can't encode character '\u0430' in position 0: character maps to <undefined>
[20476] Failed to execute script 'twayback' due to unhandled exception!

Is there anything I could do on my end? Thanks.
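
For context, a `'charmap' codec can't encode character` error usually means Python encoded text with the Windows locale code page (cp1250 in the traceback above), which has no mapping for Cyrillic letters such as `'\u0430'`. The thread does not show where twayback writes its output, so the following is only a minimal sketch, assuming the failure comes from writing parsed tweet text to a file opened without an explicit encoding. The helper `save_tweet_text` and the file name `tweets.txt` are hypothetical and are not taken from twayback.

```python
import os
import tempfile


def save_tweet_text(path, texts):
    """Write tweet texts to `path` as UTF-8, one per line (hypothetical helper)."""
    # An explicit encoding avoids falling back to the locale code page
    # (cp1250 in the traceback), which cannot represent Cyrillic letters.
    with open(path, "w", encoding="utf-8") as f:
        for text in texts:
            f.write(text + "\n")


# Reproduce the failure mode: cp1250 has no mapping for Cyrillic characters.
sample = "\u0430\u0431\u0432"  # three Cyrillic letters, starting with '\u0430'
try:
    sample.encode("cp1250")
    cp1250_ok = True
except UnicodeEncodeError:
    cp1250_ok = False  # same exception class as in the traceback

# Writing and reading back with UTF-8 preserves the text.
out_path = os.path.join(tempfile.mkdtemp(), "tweets.txt")
save_tweet_text(out_path, [sample])
with open(out_path, encoding="utf-8") as f:
    round_trip = f.read()
```

Passing `encoding="utf-8"` on every text-mode `open()` makes the output independent of the user's Windows locale, which is why this class of bug tends to appear only on some machines.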

Yes, I’m currently working on it! Should be an easy fix. I will upload a working release today and ping you!!!

@Traut89now Fixed! Check out the 02/14/2022 release and let me know how it goes please 😊

Russian: [screenshot]

Unfortunately, the fix seems to have broken the tool altogether for me. The process runs without throwing an error, but the "both" option generates only an empty .txt file, and the .html files contain no data either. I tested this on the @vostsbiry Twitter account you used, as well as on other accounts that don't use any special characters in their tweets.

[screenshots]

My bad 😅 The downloader and text parser were being handed an empty list of URLs, so I made sure the list isn't empty anymore! Try again, and don't hesitate to report any more issues; I want to know about any problems that come up. Thank you!
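
The thread does not show the actual change. One defensive pattern for this class of bug is to check the URL list before downloading, so that an empty list fails loudly instead of silently producing empty .txt and .html files. This is a minimal sketch, not twayback's code: `parse_archived_tweets` and its return value are hypothetical stand-ins for the download-and-parse step.

```python
def parse_archived_tweets(urls):
    """Hypothetical stand-in for the download-and-parse step."""
    if not urls:
        # Fail loudly instead of silently writing empty output files.
        raise ValueError("no archived tweet URLs were found; nothing to download")
    # Placeholder for the real work: one parsed result per URL.
    return [f"parsed:{url}" for url in urls]


# An empty list is rejected up front.
try:
    parse_archived_tweets([])
    empty_rejected = False
except ValueError:
    empty_rejected = True

# A non-empty list is processed normally.
results = parse_archived_tweets(["https://web.archive.org/web/example"])
```

A guard like this would not fix the underlying reason the list was empty, but it would turn a silent empty result into an immediate, reportable error.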

Works perfectly now, thank you 👍 :) !