msramalho / feup-erasmus

A @FEUP maniac's free time put to contentious good use - otherwise known as Erasmus OCD


FEUP Erasmus

A tool that makes use of sigpy to scrape the ERASMUS page at FEUP into JSON files, for time analysis and other fun ideas people usually have.

Features

  • python extract.py [USERNAME|201403027] [YEAR|2018] scrapes all the currently hardcoded allocations into archive/COURSE/YEAR/yyyy_mm_dd_hh_mm.json. The allocation ids are unpredictable, so they should be updated annually. Run the script daily (or at the rate of the system updates). The archive folder is gitignored, so it only persists on your local clone.
  • python anonymize.py creates a duplicate database in anonymous/COURSE/YEAR/yyyy_mm_dd_hh_mm.json, using funny, yet consistent, anonymous animals for students. This is needed because the identity of students and their GPA are not public information.
  • The Jupyter notebook discover previous years can be used to brute-force URL ids and find valid ones, so that past allocations can be found (I already did this for MIEIC up to 2019).
  • Additionally, @antonioalmeida has created a Google Sheets spreadsheet, reusable in future years, that allows for real-time updates if all students specify their preferences. The sheet can be copied from here.
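
The "funny, yet consistent" aliases mentioned for anonymize.py can be illustrated with a short sketch. This is not the script's actual code: the word lists, the salt, and the field names student_id and name are all hypothetical, and hashing is just one way to get the same alias for the same student on every run. The sketch only handles identity; how the real script treats the GPA is not described here.

```python
import hashlib

# Hypothetical word lists; the names used by the real script are not shown in this README.
ADJECTIVES = ["Brave", "Sleepy", "Curious", "Grumpy", "Jolly"]
ANIMALS = ["Otter", "Lynx", "Heron", "Badger", "Gecko"]


def pseudonym(student_id, salt="change-me"):
    """Map a student identifier to a stable animal alias."""
    digest = hashlib.sha256((salt + student_id).encode("utf-8")).digest()
    adjective = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    animal = ANIMALS[digest[1] % len(ANIMALS)]
    # A short hex suffix keeps collisions unlikely despite the small word lists.
    return f"{adjective} {animal} {digest[2:4].hex()}"


def anonymize(records):
    """Return copies of the records with identity fields replaced by an alias."""
    cleaned = []
    for record in records:
        copy = dict(record)
        # "student_id" and "name" are assumed field names, for illustration only.
        copy["student"] = pseudonym(str(copy.pop("student_id")))
        copy.pop("name", None)
        cleaned.append(copy)
    return cleaned
```

Because the alias depends only on the salt and the identifier, the same student gets the same animal in every snapshot, which keeps the anonymous files comparable over time. Keeping the salt private matters, since student numbers are easy to enumerate.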

Previous MIEIC years

Legacy

This is probably a stationary repo, as far as my dedication goes, but...

Here are some ideas for people that might want to improve it:

  • Extend to other faculties (this may even work just by changing the URL)
  • Perform the scraping with a cron job in future years and open a PR
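
For the scheduled-scraping idea, each run needs its own output file. A minimal sketch of building the archive/COURSE/YEAR/yyyy_mm_dd_hh_mm.json path described under Features follows; the function name snapshot_path and its parameters are hypothetical, not taken from extract.py.

```python
from datetime import datetime
from pathlib import Path


def snapshot_path(course, year, when, root="archive"):
    """Build ROOT/COURSE/YEAR/yyyy_mm_dd_hh_mm.json for one scrape run."""
    stamp = when.strftime("%Y_%m_%d_%H_%M")
    return Path(root) / course / str(year) / f"{stamp}.json"


print(snapshot_path("MIEIC", 2018, datetime(2018, 3, 1, 9, 30)).as_posix())
# archive/MIEIC/2018/2018_03_01_09_30.json
```

Since the timestamp is zero-padded and ordered from year down to minute, sorting the filenames alphabetically also sorts the snapshots chronologically.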


Languages

Python 54.2%, Jupyter Notebook 45.8%