Scrapy-redis usage related issues
Hao1617 opened this issue
Description
Let me describe my Python program. First, I fetch the data to crawl from MySQL, turn each record into a URL, and push the URLs to Redis to wait for crawling. If a crawl fails, the URL is resubmitted to Redis; successfully crawled data is stored back in MySQL through the pipeline. The MySQL-to-Redis load runs once a day, and I can confirm it only runs once a day.
After running, however, the amount of data stored in Redis is abnormal: it is larger than expected. I have found that the cause is URLs being resubmitted when crawl errors occur. How can I solve this?
Pass the retry count in the request's `meta` so that it doesn't retry indefinitely.
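A minimal sketch of the suggestion above: keep a counter in the request's `meta` and only resubmit the URL to Redis while the counter is below a cap. The function and key names (`should_retry`, `retry_count`, `MAX_RETRIES`) are illustrative, not part of scrapy-redis:

```python
# Bound resubmissions by tracking a retry counter in request.meta.
# Names here are illustrative; adapt them to your spider.
MAX_RETRIES = 3

def should_retry(meta, max_retries=MAX_RETRIES):
    """Return (retry, new_meta).

    retry is True while the counter in `meta` is below `max_retries`;
    new_meta carries the incremented counter for the resubmitted request.
    """
    count = meta.get("retry_count", 0)
    if count >= max_retries:
        # Give up: do NOT push the URL back to Redis.
        return False, meta
    return True, dict(meta, retry_count=count + 1)
```

In a Scrapy errback you would call this with `failure.request.meta`; if it returns `True`, re-yield a copy of the request with the updated `meta` (or push the URL back to Redis with the counter encoded), otherwise drop the URL. Note that Scrapy's built-in `RetryMiddleware` already does something similar: it tracks `retry_times` in `meta`, retries on network failures and on the HTTP status codes listed in the `RETRY_HTTP_CODES` setting, and gives up after `RETRY_TIMES` attempts.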
ok
Does this mean that if a request fails I don't need to re-add it manually, and the framework will retry automatically?
Under what circumstances will it retry?
This turned out to be a problem in my program's logic. Please close this issue.