jxlil / scrapy-aiohttp-downloader

Scrapy download handler that integrates aiohttp

scrapy-aiohttp-downloader

scrapy-aiohttp-downloader is a Scrapy download handler that downloads requests with aiohttp.

Installation

pip install scrapy-aiohttp-downloader

Activation

Replace the default http and/or https download handlers through the DOWNLOAD_HANDLERS setting:

DOWNLOAD_HANDLERS = {
    "http": "scrapy_aiohttp_downloader.AioHTTPDownloadHandler",
    "https": "scrapy_aiohttp_downloader.AioHTTPDownloadHandler",
}

Also, be sure to install the asyncio-based Twisted reactor:

TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
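Because the handler can replace http, https, or both, the two settings above can be generated for whichever schemes you choose. The following is a minimal sketch, not part of this package: the helper name aiohttp_settings and the two constants are hypothetical, and only the setting names and values come from this README.

```python
# Values copied from the Activation section of this README.
AIOHTTP_HANDLER = "scrapy_aiohttp_downloader.AioHTTPDownloadHandler"
ASYNCIO_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"


def aiohttp_settings(schemes=("http", "https")):
    """Build a settings dict that activates the handler for the given schemes.

    Hypothetical helper for illustration; it is not shipped with the package.
    """
    return {
        # One entry per scheme that should use the aiohttp handler.
        "DOWNLOAD_HANDLERS": {scheme: AIOHTTP_HANDLER for scheme in schemes},
        # The asyncio-based reactor is set alongside the handler.
        "TWISTED_REACTOR": ASYNCIO_REACTOR,
    }


# Example: replace only the https handler and leave http untouched.
https_only = aiohttp_settings(("https",))
```

The returned dict could be assigned to a spider's custom_settings, as in the Basic usage section.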

Basic usage

Set the "aiohttp" key in Request.meta to True to download a request using aiohttp:

import scrapy


class AioHTTPSpider(scrapy.Spider):
    name = "spider"
    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "http": "scrapy_aiohttp_downloader.AioHTTPDownloadHandler",
            "https": "scrapy_aiohttp_downloader.AioHTTPDownloadHandler",
        },
        "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
    }

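    def start_requests(self):
        # The "aiohttp" meta key marks this request for download with aiohttp.
        yield scrapy.Request(
            "https://example.com/",
            meta={"aiohttp": True},
        )

    def parse(self, response):
        # Default callback; log the result to confirm the download worked.
        self.logger.info("Fetched %s (status %s)", response.url, response.status)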
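When many requests need the meta key, it may be convenient to build the Request arguments in one place. This is a sketch under assumptions: the helper aiohttp_request_kwargs is hypothetical and not part of the package; it only relies on the "aiohttp" meta key described above and on the url and meta arguments of scrapy.Request.

```python
def aiohttp_request_kwargs(url, **extra_meta):
    """Return keyword arguments for scrapy.Request with the "aiohttp" key set.

    Hypothetical helper for illustration. Extra keyword arguments are copied
    into meta; the "aiohttp" key is applied last so it is always True.
    """
    return {"url": url, "meta": {**extra_meta, "aiohttp": True}}


# Example: carry a page number in meta alongside the "aiohttp" key.
kwargs = aiohttp_request_kwargs("https://example.com/", page=2)
```

Inside a spider, the result would be passed on as scrapy.Request(**kwargs).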

License: MIT License