Pastebin.com error
TheFiZi opened this issue
It appears the rest of the scraping is working just fine, but I noticed Pastebin was having some problems today.
ERROR: URL Error ############################# http://pastebin.com/archive
Thread for pastebin.com crashed unexpectectly, recovering...: 'NoneType' object is not iterable
Traceback (most recent call last):
File "pystemon.py", line 92, in run
last_pasties = self.getLastPasties()
File "pystemon.py", line 106, in getLastPasties
htmlPage, headers = downloadUrl(self.archive_url)
TypeError: 'NoneType' object is not iterable
and
Downloading pasties from pastebin.com. Next download scheduled in 34 seconds
Downloading url: http://pastebin.com/archive with proxy: None and user-agent: None
ERROR: HTTP Error ############################# http://pastebin.com/archive
No HTML content for page http://pastebin.com/archive
Same issue with snipt.net
Downloading pasties from snipt.net. Next download scheduled in 29 seconds
Downloading url: https://snipt.net/?rss with proxy: None and user-agent: None
ERROR: URL Error ############################# https://snipt.net/?rss
Thread for snipt.net crashed unexpectectly, recovering...: 'NoneType' object is not iterable
Traceback (most recent call last):
File "pystemon.py", line 92, in run
last_pasties = self.getLastPasties()
File "pystemon.py", line 106, in getLastPasties
htmlPage, headers = downloadUrl(self.archive_url)
TypeError: 'NoneType' object is not iterable
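Both tracebacks are consistent with downloadUrl returning None after a URL or HTTP error, so the tuple unpacking on line 106 fails. Below is a minimal sketch of one way to guard against that. It is not pystemon's actual code: `download_url`, `get_last_pasties`, and the injected `fetch` callable are hypothetical names used only for illustration.

```python
def download_url(url, fetch):
    """Hypothetical stand-in for downloadUrl; `fetch` is an injected callable
    that returns (html, headers) or raises OSError on network failure."""
    try:
        return fetch(url)
    except OSError as exc:
        # urllib.error.URLError and HTTPError are subclasses of OSError in Python 3.
        print("ERROR: could not download %s: %s" % (url, exc))
        # Always return a 2-tuple so callers can unpack the result safely.
        return None, None


def get_last_pasties(archive_url, fetch):
    """Return the lines of the archive page, or an empty list on failure."""
    html_page, headers = download_url(archive_url, fetch)
    if html_page is None:
        # Nothing was downloaded; skip this round instead of crashing the thread.
        return []
    return html_page.splitlines()
```

With a guard like this, a failed download would leave the site's thread idle until the next scheduled attempt rather than raising a TypeError.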
It might be that pastebin is throttling your requests.
Could you take a network capture (Wireshark) while running pystemon and reproducing the issue?
You can email it to me at christophe at vandeplas dot com
I think you're right. The problem seems intermittent. I'll get a packet capture later tonight.
This was a throttling problem.
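For anyone hitting the same throttling, one common mitigation is to retry with an exponentially growing delay. The sketch below only illustrates that idea and is not something pystemon is confirmed to do. `fetch_with_backoff`, `retries`, `base_delay`, and the injected `fetch` and `sleep` callables are assumptions made for the example.

```python
import time


def fetch_with_backoff(fetch, url, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url); on OSError wait base_delay * 2**attempt, then retry.

    All names here are hypothetical; `fetch` is any callable that raises
    OSError (for example urllib.error.URLError) when the request fails.
    """
    for attempt in range(retries):
        try:
            return fetch(url)
        except OSError:
            if attempt == retries - 1:
                # Out of retries: let the caller see the last error.
                raise
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. In practice, lengthening the interval between downloads for the throttled site may be enough on its own.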