rmax / scrapy-redis

Redis-based components for Scrapy.

Home Page: http://scrapy-redis.readthedocs.io

`scrapy_redis.scheduler.Scheduler` not compatible with `scrapy.dupefilters.BaseDupeFilter`

HairlessVillager opened this issue

self.df = load_object(self.dupefilter_cls).from_spider(spider)

This line calls `from_spider` on the configured dupefilter class.

However, `from_spider` is implemented only in `scrapy_redis.dupefilter.RFPDupeFilter`; `scrapy.dupefilters.BaseDupeFilter` does not declare it. Configuring a dupefilter class that lacks this method therefore raises:

  File "D:\Anaconda\anaconda3\envs\scrapy\Lib\site-packages\scrapy\crawler.py", line 160, in crawl
    yield self.engine.open_spider(self.spider, start_requests)
AttributeError: type object 'RFPDupeFilter' has no attribute 'from_spider'

and

  File "D:\Anaconda\anaconda3\envs\scrapy\Lib\site-packages\scrapy_redis\scheduler.py", line 149, in flush
    self.df.clear()
AttributeError: 'Scheduler' object has no attribute 'df'

Another user ran into the same problem: #242 (comment)
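
One possible direction for a fix, as a rough sketch only: fall back to another constructor hook when the dupefilter class has no `from_spider`. Everything below is an assumption for illustration. `PlainDupeFilter`, `SpiderAwareDupeFilter`, `FakeSpider` and `build_dupefilter` are hypothetical stand-ins, not Scrapy or scrapy-redis classes, and I am assuming the non-Redis dupefilter exposes a `from_settings` classmethod, which may differ between Scrapy versions.

```python
class PlainDupeFilter:
    """Stand-in for a dupefilter that offers from_settings but no from_spider."""

    def __init__(self, debug=False):
        self.debug = debug

    @classmethod
    def from_settings(cls, settings):
        return cls(debug=settings.get("DUPEFILTER_DEBUG", False))


class SpiderAwareDupeFilter:
    """Stand-in for a dupefilter that does offer from_spider."""

    def __init__(self, key):
        self.key = key

    @classmethod
    def from_spider(cls, spider):
        return cls(key=f"{spider.name}:dupefilter")


class FakeSpider:
    """Minimal spider stand-in: a name plus a dict used as settings."""

    def __init__(self, name, settings):
        self.name = name
        self.settings = settings


def build_dupefilter(cls, spider):
    """Hypothetical helper: use the first constructor hook the class provides."""
    if hasattr(cls, "from_spider"):
        return cls.from_spider(spider)
    if hasattr(cls, "from_settings"):
        return cls.from_settings(spider.settings)
    return cls()


spider = FakeSpider("demo", {"DUPEFILTER_DEBUG": True})
plain = build_dupefilter(PlainDupeFilter, spider)        # falls back to from_settings
aware = build_dupefilter(SpiderAwareDupeFilter, spider)  # uses from_spider
```

With a fallback like this, the scheduler would always end up with a `df` attribute, so the second `AttributeError` in `flush` should not occur either. Whether the maintainers would prefer this over requiring `from_spider` on every dupefilter is a separate question.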