Peefy / WebMagicSharp

This is a C# .NET Standard crawler framework forked from webmagic, which is written in Java.

WebMagicSharp.

A scalable crawler framework. It covers the whole lifecycle of a crawler: downloading, URL management, content extraction, and persistence. It can simplify the development of a specific crawler, and it is a C# crawler framework that is easy to use. It supports .NET Framework 2.0 through 4.7, .NET Standard 2.0, .NET Core 2.0, Xamarin.Forms, Xamarin.Android, Xamarin.iOS, Xamarin.Mac, Xamarin.Gtk, WPF, Silverlight, and Windows Forms. Thank you for using it.
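
The four lifecycle stages named above can be illustrated with a minimal, self-contained sketch. Every type in it (`IDownloader`, `IPipeline`, `QueueScheduler`, `InMemoryDownloader`, `ListPipeline`, `MiniSpider`) is hypothetical and written only for this illustration; none of them is taken from WebMagicSharp's actual API. The downloader serves canned pages from memory and the URLs are made up, so the example runs without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stage interfaces; these are NOT WebMagicSharp's actual types.
public interface IDownloader { string Download(string url); }
public interface IPipeline { void Save(string url, string title); }

// URL management: a FIFO queue that drops URLs it has already seen.
public class QueueScheduler
{
    private readonly Queue<string> _pending = new Queue<string>();
    private readonly HashSet<string> _seen = new HashSet<string>();

    public void Push(string url)
    {
        if (_seen.Add(url)) _pending.Enqueue(url);
    }

    public bool TryPoll(out string url)
    {
        if (_pending.Count > 0) { url = _pending.Dequeue(); return true; }
        url = null;
        return false;
    }
}

// Downloading: serves canned pages so the sketch runs without network access.
public class InMemoryDownloader : IDownloader
{
    private readonly Dictionary<string, string> _pages;
    public InMemoryDownloader(Dictionary<string, string> pages) { _pages = pages; }

    public string Download(string url)
    {
        string html;
        return _pages.TryGetValue(url, out html) ? html : null;
    }
}

// Persistence: keeps results in memory; a real pipeline might write to a file or database.
public class ListPipeline : IPipeline
{
    public readonly List<string> Items = new List<string>();
    public void Save(string url, string title) { Items.Add(url + " => " + title); }
}

public class MiniSpider
{
    private static readonly Regex TitleRx = new Regex("<title>(.*?)</title>", RegexOptions.IgnoreCase);
    private static readonly Regex LinkRx = new Regex("href=\"([^\"]+)\"", RegexOptions.IgnoreCase);

    private readonly IDownloader _downloader;
    private readonly IPipeline _pipeline;
    private readonly QueueScheduler _scheduler = new QueueScheduler();

    public MiniSpider(IDownloader downloader, IPipeline pipeline)
    {
        _downloader = downloader;
        _pipeline = pipeline;
    }

    public void Run(string startUrl)
    {
        _scheduler.Push(startUrl);
        string url;
        while (_scheduler.TryPoll(out url))
        {
            string html = _downloader.Download(url);
            if (html == null) continue; // skip pages that could not be downloaded

            // Content extraction: the page title plus any outgoing links.
            Match title = TitleRx.Match(html);
            _pipeline.Save(url, title.Success ? title.Groups[1].Value : "");
            foreach (Match link in LinkRx.Matches(html))
                _scheduler.Push(link.Groups[1].Value);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Two made-up pages that link to each other.
        var pages = new Dictionary<string, string>
        {
            { "http://example.test/a", "<title>A</title><a href=\"http://example.test/b\">b</a>" },
            { "http://example.test/b", "<title>B</title><a href=\"http://example.test/a\">a</a>" }
        };
        var pipeline = new ListPipeline();
        new MiniSpider(new InMemoryDownloader(pages), pipeline).Run("http://example.test/a");
        foreach (string item in pipeline.Items) Console.WriteLine(item);
    }
}
```

The two sample pages link to each other, so the scheduler's seen-set is what stops the crawl after each page has been visited once. Keeping each stage behind its own small type is one way to let a downloader or pipeline be swapped without touching the crawl loop.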

DotnetSpider

This project is no longer maintained. Please see DotnetSpider instead:

https://github.com/dotnetcore/DotnetSpider

So

Features:

  • Simple core with high flexibility.
  • Simple API for HTML extraction.
  • Annotate POJOs to customize a crawler; no configuration needed.
  • Multi-threading and distributed crawling support.
  • Easy to integrate.

Supported Platforms:

  • .NET Framework 2.0 ~ .NET Framework 4.7
  • .NET Standard 2.0
  • .NET Core 2.0
  • Xamarin: Xamarin.Forms, Xamarin.Android, Xamarin.iOS, Xamarin.Mac, Xamarin.Gtk
  • WPF, Silverlight, Windows Forms, etc.

Install:

  • WebMagicSharp:

Install-Package WebMagicSharp -Version 0.0.1.0
dotnet add package WebMagicSharp --version 0.0.1

  • WebMagicSharp.Extensions:

Install-Package WebMagicSharp.Extensions -Version 0.0.1
dotnet add package WebMagicSharp.Extensions --version 0.0.1

Get Started:

First crawler:

Thanks:

To write webmagic, I referred to the projects below:

License: Apache License 2.0


Languages

Language: C# 100.0%