scalingexcellence / scrapybook

Scrapy Book Code

Home Page: http://scrapybook.com/

Error when running the scrapy shell command examples as well as the spider code

gtinjr opened this issue · comments

I am getting an error when running the following shell command inside the docker scrapybook_dev_1 container:
scrapy shell http://web:9312/properties/property_000000.html

The same error appears when running the following spider from the "A Scrapy project" section of the book:
scrapy crawl basic

2017-11-03 02:47:59 [boto] DEBUG: Retrieving credentials from metadata server.
2017-11-03 02:48:00 [boto] ERROR: Caught exception reading instance data
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/boto/utils.py", line 210, in retry_url
    r = opener.open(req, timeout=timeout)
  File "/usr/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
URLError:
2017-11-03 02:48:00 [boto] ERROR: Unable to read instance data, giving up

This error does not prevent the shell commands from finishing, nor does it prevent the spider from running. I also noticed that a similar, if not identical, issue was fixed in Scrapy 1.1. I just want to confirm that this is a known issue with these docker images. Adding a note to the README.md or updating the images to the latest Scrapy might help.

Yes - it's known... it's a pity and it confuses people. All my settings.py files (e.g. this) set those keys to empty strings, which mitigates the problem. I can't wait for the 2nd edition of the book, out early next year, when all those problems will go away.
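For reference, a minimal sketch of that mitigation, assuming it matches what the book's settings.py files do (blank AWS credentials stop boto from polling the EC2 metadata server):

# settings.py - empty credentials so boto never queries the metadata server
AWS_ACCESS_KEY_ID = ""
AWS_SECRET_ACCESS_KEY = ""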

Disabling S3 in settings.py solved the problem:

# settings.py - drop the S3 download handler entirely
DOWNLOAD_HANDLERS = {
    's3': None,
}
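If I understand Scrapy's handler mechanism correctly, mapping the s3 scheme to None removes the S3 download handler entirely, so boto is never initialized and the metadata-server lookup that produces the traceback above never happens.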