mdamien / eu-cases

:elephant: crawling and parsing of the EU Competition Commission cases

Home Page: http://ec.europa.eu/competition/elojade/isef/index.cfm

EU Competition cases

Extract all the cases from http://ec.europa.eu/competition/elojade/isef/index.cfm

To use, first install the dependencies: pip3 install -r req.txt

Then, there are two datasets you can get: cases.json, with detailed information, or the CSVs, with less information (but easier to process):

cases.json

What it does: goes through all the results pages to collect the case links, downloads those cases, then parses them

  • this is done by the 0get_list.py, 1download_cases.py and 2parse_cases.py scripts
  • you get output/cases.json at the end
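
The three steps above can be chained from a small driver script. This is only a sketch: it assumes the scripts are run from the repository root, in the order their numeric prefixes suggest, and that they take no command-line arguments. `pipeline_commands` and `run_pipeline` are hypothetical helper names, not part of the repo.

```python
import subprocess
import sys

# Script names come from the README. Assumption: they run from the repo root,
# in this order, without arguments.
PIPELINE = ["0get_list.py", "1download_cases.py", "2parse_cases.py"]


def pipeline_commands(python=sys.executable):
    """Return the command line for each step, in execution order."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(dry_run=False):
    """Run each step in order, stopping at the first failure."""
    for command in pipeline_commands():
        print("running:", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if a step exits non-zero,
            # so a failed download is never followed by a parse.
            subprocess.run(command, check=True)
```

Calling `run_pipeline(dry_run=True)` only prints the commands, which is a cheap way to check the order before crawling the site.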

the CSVs

What it does: exports each case category:

  • this is done by wip0_get_exports.py and wip1_exports_to_csv.py
  • you get mergers.csv, aids.csv and cartels.csv at the end in output/export/
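
Once the CSVs exist, a quick way to see what they contain is to list their columns and count their rows. The sketch below assumes each file is comma-delimited, UTF-8 encoded and starts with a header row; the column names are not documented here, so it discovers them rather than hard-coding any. `summarize_csv` and `summarize_exports` are hypothetical helper names.

```python
import csv
from pathlib import Path

# File names and directory come from the README.
EXPORT_DIR = Path("output/export")
CSV_NAMES = ["mergers.csv", "aids.csv", "cartels.csv"]


def summarize_csv(path):
    """Return (column names, row count) for one CSV file."""
    # Assumption: comma-delimited, UTF-8, first line is the header.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        row_count = sum(1 for _ in reader)
        return list(reader.fieldnames or []), row_count


def summarize_exports(export_dir=EXPORT_DIR):
    """Summarize every expected CSV that exists in export_dir."""
    summary = {}
    for name in CSV_NAMES:
        path = Path(export_dir) / name
        if path.exists():
            summary[name] = summarize_csv(path)
    return summary
```

Files that are missing are skipped rather than raising, so the summary also works after a partial export.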

Examples of exploitation of the CSVs:

TODO

  • manage to download the cases based on the export list
  • automate and make the data and stats available online

