fhamborg / news-please

news-please - an integrated web crawler and information extractor for news that just works


Error: slice indices must be integers or None or have an __index__ method

aljbri opened this issue · comments

I get the following error:
[scrapy.core.scraper:168|ERROR] Spider error processing
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/scrapy/utils/defer.py", line 120, in iter_errback
    yield next(it)
  File "/usr/local/lib/python3.7/dist-packages/scrapy/utils/python.py", line 353, in __next__
    return next(self.data)
  File "/usr/local/lib/python3.7/dist-packages/scrapy/utils/python.py", line 353, in __next__
    return next(self.data)
  File "/usr/local/lib/python3.7/dist-packages/scrapy/core/spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.7/dist-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output
    for x in result:
  File "/usr/local/lib/python3.7/dist-packages/scrapy/core/spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.7/dist-packages/scrapy/spidermiddlewares/referer.py", line 342, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/usr/local/lib/python3.7/dist-packages/scrapy/core/spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.7/dist-packages/scrapy/spidermiddlewares/urllength.py", line 40, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/lib/python3.7/dist-packages/scrapy/core/spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.7/dist-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/lib/python3.7/dist-packages/scrapy/core/spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.7/dist-packages/newsplease/crawler/spiders/sitemap_crawler.py", line 47, in parse
    response, self.allowed_domains[0], self.original_url)
  File "/usr/local/lib/python3.7/dist-packages/newsplease/helper_classes/parse_crawler.py", line 44, in pass_to_pipeline_if_article
    response, source_domain, rss_title=None)
  File "/usr/local/lib/python3.7/dist-packages/newsplease/helper_classes/parse_crawler.py", line 56, in pass_to_pipeline
    .get_savepath(response.url)
  File "/usr/local/lib/python3.7/dist-packages/newsplease/helper_classes/savepath_parser.py", line 212, in get_savepath
    ), savepath
  File "/usr/lib/python3.7/re.py", line 194, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/usr/local/lib/python3.7/dist-packages/newsplease/helper_classes/savepath_parser.py", line 211, in <lambda>
    SavepathParser.get_max_url_file_name_length(abs_savepath)
  File "/usr/local/lib/python3.7/dist-packages/newsplease/helper_classes/savepath_parser.py", line 103, in append_md5_if_too_long
    return "%s_%s" % (component[:component_size],
TypeError: slice indices must be integers or None or have an __index__ method
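
From the last frame, the slice bound used inside append_md5_if_too_long does not appear to be an int, which in Python 3 usually means a float length (e.g. from a division) was computed upstream and passed in as the size. The snippet below is only a simplified sketch to reproduce the symptom, not the actual news-please code; append_md5_if_too_long is modelled loosely on the traceback, and append_md5_if_too_long_fixed is a hypothetical workaround name.

import hashlib

def append_md5_if_too_long(component, size):
    # Sketch of the failing helper: shorten `component` and append an MD5
    # suffix when it exceeds `size`. Details are assumptions, not the real code.
    if len(component) > size:
        # 32 hex chars for the MD5 digest plus 1 separator character.
        component_size = size - 32 - 1
        return "%s_%s" % (component[:component_size],
                          hashlib.md5(component.encode('utf-8')).hexdigest())
    return component

# Reproduces the reported TypeError: a float slice bound is rejected in Python 3.
try:
    append_md5_if_too_long("a" * 300, 255.0)  # float size -> float slice index
except TypeError as exc:
    print(exc)  # slice indices must be integers or None or have an __index__ method

# A defensive workaround would be to coerce the size to int before slicing:
def append_md5_if_too_long_fixed(component, size):
    size = int(size)  # guard against a float length computed upstream
    if len(component) > size:
        component_size = size - 32 - 1
        return "%s_%s" % (component[:component_size],
                          hashlib.md5(component.encode('utf-8')).hexdigest())
    return component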

Please use the bug report template.