gabrielmiller / Py-Webscraper

An awful search engine and crawler

Toastie

About

Toastie is a basic web spider written in Python that scans webpages' content and enters it into a database. The front-end of the website allows the user to search through scanned pages.
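The scan-then-search flow described above can be sketched roughly as follows. This is an illustration only, not Toastie's actual code: the names `pages`, `index_page`, and `search` are hypothetical, a plain dict stands in for the MongoDB collection, and tags are stripped with a simple regular expression.

```python
import re

# Hypothetical in-memory stand-in for the MongoDB collection: url -> page text.
pages = {}

TAG_RE = re.compile(r"<[^>]+>")
SPACE_RE = re.compile(r"\s+")


def index_page(url, html):
    """Strip tags from html and store the remaining text under url."""
    text = TAG_RE.sub(" ", html)
    text = SPACE_RE.sub(" ", text).strip()
    pages[url] = text
    return text


def search(term):
    """Return the sorted URLs whose stored text contains term (case-insensitive)."""
    needle = term.lower()
    return sorted(url for url, text in pages.items() if needle in text.lower())


if __name__ == "__main__":
    index_page("http://example.com/a", "<h1>Toast</h1> recipes")
    index_page("http://example.com/b", "<p>Nothing here</p>")
    print(search("toast"))
```

In a real deployment the dict would presumably be replaced by pymongo calls, and the substring match by a proper query against the database.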

Technologies

Python, Flask, Werkzeug, Jinja, Requests, urlparse, re, robotparser, pymongo, MongoDB, Twitter Bootstrap
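As a rough idea of how the URL and robots.txt modules listed above might fit together in a crawler, here is a minimal sketch. It is an assumption about usage, not code from this repository. The function names, the `toastie-sketch` user agent, and the regular expression are hypothetical. It is written for Python 3, where `urlparse` and `robotparser` live under `urllib.parse` and `urllib.robotparser`.

```python
import re
from urllib.parse import urljoin, urlparse
from urllib.robotparser import RobotFileParser

# Naive href matcher, good enough for a sketch; a real crawler would use an HTML parser.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def extract_links(base_url, html):
    """Return unique absolute http(s) links found in html, resolved against base_url."""
    links = []
    for href in HREF_RE.findall(html):
        absolute = urljoin(base_url, href)
        if urlparse(absolute).scheme in ("http", "https") and absolute not in links:
            links.append(absolute)
    return links


def allowed_links(links, robots_lines, user_agent="toastie-sketch"):
    """Keep only the links that the given robots.txt lines permit for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network access
    return [link for link in links if parser.can_fetch(user_agent, link)]


if __name__ == "__main__":
    html = '<a href="/about">About</a> <a href="/private/x">Private</a>'
    found = extract_links("http://example.com/index.html", html)
    print(allowed_links(found, ["User-agent: *", "Disallow: /private/"]))
```

Passing the robots.txt rules in as a list of lines keeps the sketch runnable offline; a live crawler would fetch the file from each site first.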

Installation

Install Python 2.7, pip, MongoDB, and the python-mongodb connector
sudo pip install virtualenv
git clone http://www.github.com/gmiller2007/Py-Webscraper
cd Py-Webscraper
source bin/activate
bin/pip install flask
bin/pip install pymongo
bin/pip install BeautifulSoup
deactivate

Run the Application

source bin/activate
bin/python2.7 application.py

Notes

To safely close the virtual environment, run the command 'deactivate'.

Languages

Python 98.1%, JavaScript 1.8%, Shell 0.1%, C 0.0%