s2t2 / gwu-courses-py

Home Page: https://gwu-courses-ad16e9fbadaf.herokuapp.com/

GWU Course Catalogue Scraper (Python)

Gives you a CSV file of courses in a given subject, based on your filter criteria.

Prerequisites

Chromedriver

Installing Chromedriver (and Chrome Binary) on Mac:

brew install chromedriver
brew upgrade chromedriver
brew install google-chrome

NOTE: on Mac, you also need to mark chromedriver as a trusted app in the Security and Privacy settings.

Setup

Virtual Environment

Set up a virtual environment:

conda create -n courses-env python=3.10
conda activate courses-env

Install packages:

pip install -r requirements.txt

Configure environment variables in ".env" file:

# this is the ".env" file...

FLASK_APP="web_app"
HEADLESS_MODE=true

# Mac:
CHROME_BINARY_PATH="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
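As an illustration of how these variables might be consumed, here is a minimal sketch that reads them from a mapping of environment variables. The `load_settings` function and the dictionary keys are hypothetical names chosen for this example; the actual app may read its configuration differently (for instance, after loading the ".env" file with a helper package).

```python
import os


def load_settings(env=None):
    """Read scraper settings from a mapping of environment variables.

    Hypothetical helper for illustration; not taken from the repo.
    """
    env = os.environ if env is None else env
    return {
        # Treat "true" (any casing) as enabled; any other value disables headless mode.
        "headless": env.get("HEADLESS_MODE", "false").strip().lower() == "true",
        # None when the variable is unset or empty.
        "chrome_binary_path": env.get("CHROME_BINARY_PATH") or None,
        # Assumed default that mirrors the value shown in the ".env" example.
        "flask_app": env.get("FLASK_APP", "web_app"),
    }


settings = load_settings({"HEADLESS_MODE": "true", "FLASK_APP": "web_app"})
print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests without touching the real environment.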

Usage

Command Line App

Course Search Version 1 (Deprecated)

Browse the course catalogue for a given subject, and download a CSV file of the course listings:

python -m app.browser

# TERM_ID="202203" SUBJECT_ID="CSCI" python -m app.browser

NOTE: this creates a new subdirectory in the "exports" directory, named after the subject, and downloads the files there (e.g. "exports/CSCI/courses.csv").

After doing this for each subject of interest, compile all courses into a single file:

python -m app.compiler
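The compile step can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in `app.compiler`: it assumes each subject directory holds a "courses.csv" with a header row, and the `compile_courses` function and the added `subject_dir` column are hypothetical.

```python
import csv
from pathlib import Path


def compile_courses(exports_dir, output_path):
    """Merge every <exports_dir>/<SUBJECT>/courses.csv into one CSV file.

    Hypothetical helper for illustration. Returns the number of course rows written.
    """
    rows = []
    fieldnames = ["subject_dir"]  # assumed extra column recording the source directory
    for csv_path in sorted(Path(exports_dir).glob("*/courses.csv")):
        with csv_path.open(newline="") as f:
            reader = csv.DictReader(f)
            # Collect the union of column names, preserving first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                row["subject_dir"] = csv_path.parent.name
                rows.append(row)

    with Path(output_path).open("w", newline="") as f:
        # restval fills in columns that a given subject file did not have.
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

For example, `compile_courses("exports", "exports/all_courses.csv")` would combine "exports/CSCI/courses.csv" and any sibling subject files; the output file name here is only a placeholder.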

Course Search Version 2

This newer version stores the data in memory and uses threading to speed up the process:

python -m app.multisubject

# HEADLESS_MODE=true TERM_ID="202303" SUBJECT_IDS="CSCI, EMSE" python -m app.multisubject
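The threading approach can be sketched with the standard library's `ThreadPoolExecutor`. This is an illustration only, not the code in `app.multisubject`: the function names are hypothetical, and `fake_fetch` stands in for the real browser-driven scrape of a single subject.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_subject_ids(raw):
    """Split a value like "CSCI, EMSE" into ["CSCI", "EMSE"]."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def fetch_all(subject_ids, fetch_subject, max_workers=4):
    """Fetch each subject in its own worker thread and keep the results in memory."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as subject_ids.
        results = executor.map(fetch_subject, subject_ids)
        return dict(zip(subject_ids, results))


def fake_fetch(subject_id):
    """Placeholder for the real scrape; returns one made-up course record."""
    return [{"subject": subject_id, "number": "1000"}]


courses = fetch_all(parse_subject_ids("CSCI, EMSE"), fake_fetch)
print(sorted(courses))
```

Threads suit this kind of work because each subject spends most of its time waiting on the browser and network, so the fetches can overlap.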

Web App

Put the app in headless mode via HEADLESS_MODE=true in the ".env" file.

Run local webserver (then visit localhost:5000):

# FLASK_APP=web_app flask run

# flask --app web_app run --debug

flask --app web_app run --debugger

Testing

Run tests:

pytest
# APP_ENV="CI" pytest

Deploying
