Scrapy project (scrapy)

Organization data from GitHub: https://github.com/scrapy

An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.

Home Page: https://scrapy.org

GitHub: @scrapy

Scrapy project's repositories

scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Language: Python | License: BSD-3-Clause | Stargazers: 58234 | Watchers: 1773 | Issues: 3189

scrapyd

A service daemon to run Scrapy spiders

Language: Python | License: BSD-3-Clause | Stargazers: 3058 | Watchers: 91 | Issues: 316

scrapely

A pure-Python HTML screen-scraping library

dirbot

Scrapy project to scrape public web directories (educational) [DEPRECATED]

Language: Python | Stargazers: 1635 | Watchers: 167 | Issues: 0

quotesbot

A sample Scrapy project for educational purposes

Language: Python | License: MIT | Stargazers: 1316 | Watchers: 70 | Issues: 3

parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Language: Python | License: BSD-3-Clause | Stargazers: 1268 | Watchers: 35 | Issues: 125

scrapyd-client

Command-line client for the Scrapyd server

Language: Python | License: BSD-3-Clause | Stargazers: 772 | Watchers: 37 | Issues: 84

w3lib

Python library of web-related functions

Language: Python | License: BSD-3-Clause | Stargazers: 401 | Watchers: 25 | Issues: 61

cssselect

CSS Selectors for Python

Language: Python | License: NOASSERTION | Stargazers: 302 | Watchers: 23 | Issues: 59

queuelib

Collection of persistent (disk-based) and non-persistent (memory-based) queues for Python

Language: Python | License: BSD-3-Clause | Stargazers: 276 | Watchers: 20 | Issues: 19

loginform

Fill HTML login forms automatically

itemadapter

Common interface for data container classes

Language: Python | License: BSD-3-Clause | Stargazers: 66 | Watchers: 8 | Issues: 30

protego

A pure-Python robots.txt parser with support for modern conventions.

Language: Python | License: BSD-3-Clause | Stargazers: 65 | Watchers: 9 | Issues: 15

scrapy.org

The scrapy.org website

itemloaders

Library to populate items using XPath and CSS with a convenient API

Language: Python | License: BSD-3-Clause | Stargazers: 48 | Watchers: 9 | Issues: 33

booksbot

A crawler for http://books.toscrape.com

Language: Python | Stargazers: 40 | Watchers: 8 | Issues: 0

scrapy-bench

A CLI for benchmarking Scrapy.

Language: Python | License: MIT | Stargazers: 31 | Watchers: 7 | Issues: 15

scurl

Performance-focused replacement for Python urllib

Language: Python | License: Apache-2.0 | Stargazers: 21 | Watchers: 5 | Issues: 25

pypydispatcher

A fork of http://pydispatcher.sourceforge.net/ with PyPy support

Language: Python | License: NOASSERTION | Stargazers: 16 | Watchers: 8 | Issues: 1

xtractmime

An implementation of https://mimesniff.spec.whatwg.org/ for Python

Language: Python | License: BSD-3-Clause | Stargazers: 13 | Watchers: 9 | Issues: 2

base-chromium

Base component forked from the Chromium source code: https://chromium.googlesource.com/chromium/src/base/

Language: C++ | License: BSD-3-Clause | Stargazers: 7 | Watchers: 4 | Issues: 0

scrapy-itemloader

[Archived] Library to populate Scrapy items using XPath and CSS with a convenient API

Language: Python | License: BSD-3-Clause | Stargazers: 6 | Watchers: 6 | Issues: 2

form2request

Python 3.8+ library to build HTTP requests out of HTML forms

Language: Python | License: BSD-3-Clause | Stargazers: 4 | Watchers: 8 | Issues: 2

gsoc2014-integration-tests

GSoC 2014: Scrapy integration tests project

Language: Shell | Stargazers: 3 | Watchers: 3 | Issues: 0

scrapy-bench-speedcenter

Codespeed for scrapy-bench

Language: Python | Stargazers: 2 | Watchers: 4 | Issues: 0

url-chromium

URL component from the Chromium source code, forked from https://chromium.googlesource.com/chromium/src/url

Language: C++ | License: BSD-3-Clause | Stargazers: 2 | Watchers: 5 | Issues: 0