dlcotter / SEC-EDGAR-networks

CS626p1

CS626 - Large Scale Data Science project

This repository contains code and files in partial fulfilment of the requirements for CS626 by Daniel Cotter, Steve Roggenkamp, and Nima Seyedtalebi.

The code directory contains the code for this project.

Within code, the scripts directory contains many of the shell, Python, and Scala scripts we used for this project.

The code/sec-wc/src/main/python subdirectory contains the Python code used to process the data.

The rdbms directory contains artifacts from our early experimentation with loading data into a traditional database.

Languages

Java 47.2%, Python 37.7%, Scala 8.7%, Shell 6.4%