Scrapy project (scrapy)

An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.

Home Page: https://scrapy.org

Scrapy project's repositories

scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Language: Python | License: BSD-3-Clause | Stargazers: 50837 | Issues: 1775 | Issues: 2948

scrapyd

A service daemon to run Scrapy spiders

Language: Python | License: BSD-3-Clause | Stargazers: 2842 | Issues: 93 | Issues: 288

scrapely

A pure-Python HTML screen-scraping library

dirbot

Scrapy project to scrape public web directories (educational) [DEPRECATED]

Language: Python | Stargazers: 1631 | Issues: 168 | Issues: 0

quotesbot

This is a sample Scrapy project for educational purposes

Language: Python | License: MIT | Stargazers: 1266 | Issues: 70 | Issues: 3

parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Language: Python | License: BSD-3-Clause | Stargazers: 1074 | Issues: 35 | Issues: 106

scrapyd-client

Command line client for Scrapyd server

Language: Python | License: BSD-3-Clause | Stargazers: 753 | Issues: 38 | Issues: 82

w3lib

Python library of web-related functions

Language: Python | License: BSD-3-Clause | Stargazers: 381 | Issues: 25 | Issues: 58

cssselect

CSS Selectors for Python

Language: Python | License: NOASSERTION | Stargazers: 281 | Issues: 23 | Issues: 58

loginform

Fill HTML login forms automatically

queuelib

Collection of persistent (disk-based) and non-persistent (memory-based) queues for Python

Language: Python | License: BSD-3-Clause | Stargazers: 260 | Issues: 19 | Issues: 18

itemadapter

Common interface for data container classes

Language: Python | License: BSD-3-Clause | Stargazers: 60 | Issues: 9 | Issues: 27

scrapy.org

The scrapy.org website

protego

A pure-Python robots.txt parser with support for modern conventions.

Language: DIGITAL Command Language | License: BSD-3-Clause | Stargazers: 50 | Issues: 9 | Issues: 14

itemloaders

Library to populate items using XPath and CSS with a convenient API

Language: Python | License: BSD-3-Clause | Stargazers: 41 | Issues: 9 | Issues: 32

booksbot

A crawler for http://books.toscrape.com

Language: Python | Stargazers: 39 | Issues: 9 | Issues: 0

scrapy-bench

A CLI for benchmarking Scrapy.

Language: Python | License: MIT | Stargazers: 30 | Issues: 8 | Issues: 15

scurl

Performance-focused replacement for Python urllib

Language: Python | License: Apache-2.0 | Stargazers: 21 | Issues: 8 | Issues: 25

pypydispatcher

A fork of http://pydispatcher.sourceforge.net/ with PyPy support

Language: Python | License: NOASSERTION | Stargazers: 15 | Issues: 8 | Issues: 1

xtractmime

https://mimesniff.spec.whatwg.org/ implementation for Python

Language: Python | License: BSD-3-Clause | Stargazers: 13 | Issues: 9 | Issues: 2

base-chromium

base component forked from Chromium source https://chromium.googlesource.com/chromium/src/base/

Language: C++ | License: BSD-3-Clause | Stargazers: 7 | Issues: 5 | Issues: 0

scrapy-itemloader

[Archived] Library to populate Scrapy items using XPath and CSS with a convenient API

Language: Python | License: BSD-3-Clause | Stargazers: 6 | Issues: 6 | Issues: 2

gsoc2014-integration-tests

GSoC2014 - Scrapy Integration tests project

Language: Shell | Stargazers: 3 | Issues: 4 | Issues: 0

scrapy-bench-speedcenter

Codespeed for scrapy-bench

Language: Python | Stargazers: 2 | Issues: 5 | Issues: 0

url-chromium

url component from Chromium source code, forked from https://chromium.googlesource.com/chromium/src/url

Language: C++ | License: BSD-3-Clause | Stargazers: 2 | Issues: 5 | Issues: 0