Ilirski / COMP1204-CW2


COMP1204-CW2 — steamdb web scraper

For Southampton UoSM COMP1204 CW2. This is a web scraper: a collection of scripts that scrape the website and store the data in a MySQL database.


Download the CLI programs below. Ensure they are placed in a directory on your PATH (such as /usr/local/bin) so that the scripts can run them.

Note: curl-impersonate and pup cannot be installed through apt. You'll need to install them manually by extracting the binaries from the release archives and placing them on your PATH. Here's a general guide:

  1. Go to their GitHub repository and click on Releases in the right side menu.
  2. Download the correct binary for your OS and CPU architecture. For example, on a Raspberry Pi, which runs Linux on an ARM architecture, the aarch64 or arm builds usually work. Keep this in mind if you're running these scripts on a VM.
  3. Extract the zip or tar archive.
  4. Finally, and most importantly, sudo mv the binaries required by the scripts to /usr/local/bin, or the scripts will not work. To verify that the binaries are on PATH, try running each one as a command in the CLI.
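
The verification in step 4 can be sketched as a small shell function. This is a minimal sketch, not part of the repository: `check_on_path` is a hypothetical helper name, and the binary names you pass to it are assumptions that depend on the release you downloaded (curl-impersonate in particular ships its binaries under release-specific names).

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether each named binary can be found on PATH.
# Returns 0 if all names resolve, 1 if at least one is missing.
check_on_path() {
  local name missing=0
  for name in "$@"; do
    if command -v "$name" >/dev/null 2>&1; then
      printf 'found: %s -> %s\n' "$name" "$(command -v "$name")"
    else
      printf 'missing: %s\n' "$name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (binary names are assumptions; use the ones you installed):
#   check_on_path pup
```

If a binary is reported as missing, check that it was moved to /usr/local/bin and that it is executable.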


Note: The script assumes that the following path exists: /var/lib/mysql/steam_games_db. /var/lib/mysql is the default data directory if you install MySQL with apt.
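
To confirm that the path exists, a check along these lines may help. It is a sketch under assumptions: `require_dir` is a hypothetical helper, not something the scripts provide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given directory exists.
require_dir() {
  if [ -d "$1" ]; then
    printf 'ok: %s exists\n' "$1"
  else
    printf 'missing: %s\n' "$1" >&2
    return 1
  fi
}

# Example:
#   require_dir /var/lib/mysql/steam_games_db
```

The MySQL data directory is often not readable by regular users, so the check can report the directory as missing even when it exists. In that case, run the equivalent test as root: `sudo test -d /var/lib/mysql/steam_games_db && echo ok`.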

  1. Run to create the MySQL database to store the data.
  2. Run with one or more Steam app ID(s) that you want to scrape.
./ 730
./ 550 990 4000
  3. Run to scrape live Steam and Twitch data from all the app ID(s) you've added.
  4. To plot the data, run with the app ID(s) you want to plot.
./ -a 730 -v
./ -pf -a 550,730
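
The examples above pass app IDs separated by spaces when adding them and separated by commas after -a when plotting. A small helper can validate a list of IDs and convert it to the comma-separated form. This is only a sketch: `join_app_ids` is a hypothetical name and is not part of the scripts, and it assumes app IDs are plain non-negative integers.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that every argument is numeric, then
# print the arguments joined with commas (e.g. for the -a option).
join_app_ids() {
  local id joined=""
  for id in "$@"; do
    # Reject empty strings and anything containing a non-digit.
    case "$id" in
      ''|*[!0-9]*)
        printf 'not a numeric app ID: %s\n' "$id" >&2
        return 1
        ;;
    esac
    # Add a comma only when something has already been collected.
    joined="${joined:+$joined,}$id"
  done
  printf '%s\n' "$joined"
}

# Example:
#   join_app_ids 550 730
```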


