Error on Email Harvest from Google Search
ondrejtoral opened this issue
Please provide the following details.
Host System
- OS: Ubuntu 16.04.03 LTS
- Python version (python --version): 2.7.12
- Pip version (pip --version): 9.0.1
- Output of pip freeze: [https://gist.github.com/ondrejtoral/bf4a0b3f5120989eab811f084dd96512]
Error Description
I ran python 2.7 Belati.py -c mega.cz (and a couple of other domains) as root. Google search is blocked, and when Belati tries to harvest emails from Google I get this error:
[*] Perfoming Email Harvest from Google Search...
Error code: 503
[-] Not found or Unavailable. None
Traceback (most recent call last):
  File "Belati.py", line 432, in <module>
    BelatiApp = Belati()
  File "Belati.py", line 155, in __init__
    self.harvest_email_search(domain, proxy)
  File "Belati.py", line 323, in harvest_email_search
    self.db.insert_email_result(self.project_id, util.clean_list_string(harvest_result))
  File "/home/trl/Belati/plugins/util.py", line 74, in clean_list_string
    return str(", ".join(text))
TypeError: can only join an iterable
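The last frame suggests that harvest_result was None after the 503 response, so ", ".join(text) had nothing iterable to join. Below is a minimal sketch of a guard, assuming clean_list_string only needs to join a list of strings. Only the return line of the real function in plugins/util.py appears in the traceback, so the rest is an assumption and may not match the actual fix.

```python
def clean_list_string(text):
    """Join a list of strings with ", ".

    Assumption: a failed search passes None here, so None is treated
    as "no results" and an empty string is returned instead of raising.
    """
    if text is None:
        return ""
    return str(", ".join(text))
```

With this guard, a blocked search would store an empty email result instead of aborting the whole run.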
Hi, thank you for the quick fix! It went well until searching for PDFs:
[*] Searching PDF Document...
Error code: 503
Traceback (most recent call last):
  File "Belati.py", line 430, in <module>
    BelatiApp = Belati()
  File "Belati.py", line 165, in __init__
    self.harvest_document(domain, proxy)
  File "Belati.py", line 338, in harvest_document
    public_doc.init_crawl(domain_name, proxy_address, self.project_id)
  File "/home/trl/Belati/plugins/harvest_public_document.py", line 52, in init_crawl
    self.harvest_public_doc(domain, "pdf", proxy_address)
  File "/home/trl/Belati/plugins/harvest_public_document.py", line 70, in harvest_public_doc
    data = re.findall(regex, data)
  File "/usr/lib/python2.7/re.py", line 181, in findall
    return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
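This looks like the same pattern as the email error: the request returned 503, data was probably None, and re.findall only accepts a string. Here is a minimal sketch of a guard. find_matches is a hypothetical helper, not a function in Belati, and the regex in the usage line is only an illustration.

```python
import re


def find_matches(regex, data):
    """Return all regex matches in data, or an empty list if there is no data.

    Hypothetical helper: it assumes the caller gets None back when the
    search request fails (for example with a 503).
    """
    if data is None:
        return []
    return re.findall(regex, data)


# Illustrative usage with a made-up pattern and response body.
links = find_matches(r"\S+\.pdf", "see a.pdf and b.pdf")
```

An empty list lets the caller skip the document download step instead of crashing.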
Ah, I see, I will update it soon. Thanks for reminding me about this issue. Any other problems?
So far so good. If I find something else, I will open another issue.
I need to find some proxies; without Google search, the report is very basic.
Thank you for the great work!
Okay. Thanks for your report. I will update and close this issue :)