renanbm / WebCrawler

Open source, multi-threaded website crawler written in C#, persisting in IBM's Cloudant NoSQL DB and configured for a Linux Docker image.

ASP.NET Core Web Crawler

This is an open source, multi-threaded, stateless website crawler written in C# / ASP.NET Core. It persists its data in IBM's Cloudant NoSQL DB and is configured to run in a Linux Docker image.

Deploy to Bluemix

Run the app locally

  1. Install ASP.NET Core and the .NET CLI (dotnet) by following the Getting Started instructions
  2. Clone this app
  3. cd into the app directory and then into src/WebCrawler.Spider.Web
  4. Copy the value of the VCAP_SERVICES environment variable from the application running in Bluemix and paste it into the vcap-local.json file
  5. Run dotnet restore
  6. Run dotnet run
  7. Access the running app in a browser at http://localhost:63939
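
The steps above can be sketched as a small shell helper. This is only a sketch under stated assumptions: the clone URL is inferred from the repository name renanbm/WebCrawler, the function name run_locally is hypothetical, and copying VCAP_SERVICES into vcap-local.json still has to be done by hand.

```shell
# Assumption: clone URL inferred from the repository name; adjust if yours differs.
REPO_URL="https://github.com/renanbm/WebCrawler.git"
APP_DIR="WebCrawler/src/WebCrawler.Spider.Web"

# Hypothetical helper: clones the repo if needed, then restores and runs the web project.
run_locally() {
  # Skip the clone when a previous run already created the directory.
  [ -d WebCrawler ] || git clone "$REPO_URL" || return 1
  cd "$APP_DIR" || return 1

  # Manual step: vcap-local.json must hold the VCAP_SERVICES value copied from Bluemix.
  if [ ! -s vcap-local.json ]; then
    echo "vcap-local.json is missing or empty; paste VCAP_SERVICES into it first" >&2
    return 1
  fi

  dotnet restore || return 1
  dotnet run
}
```

Calling run_locally from the directory where the clone should live runs the restore and start steps in order. If vcap-local.json has not been filled in yet, the helper stops with a message so the value can be pasted before trying again.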

License: MIT License


Languages

  • C# 96.0%
  • HTML 3.2%
  • Dockerfile 0.5%
  • CSS 0.3%
  • JavaScript 0.0%