gridu / INTRO_SPARK-SCALA_FOR_STUDENTS

Steps to generate input data for your project:

  1. Copy the bigdata-input-generator folder into the root directory of your exam repo.
  2. Install the required Python packages and run the main script from the root directory of the exam repo to generate the datasets (you can use virtualenv for this step):

       pip install -r bigdata-input-generator/requirements.txt
       python bigdata-input-generator/main.py

  3. Add capstone-dataset (the generated folder) to .gitignore in your exam repo.
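
The install, generate, and ignore steps above can be wrapped in two small shell functions. This is only a sketch under a few assumptions: the function names generate_input and ignore_dataset and the .venv directory are illustrative and not part of the repo, Python's built-in venv module stands in for virtualenv, and the ignore entry is written as capstone-dataset/ (the trailing slash restricts the pattern to directories).

```shell
#!/usr/bin/env sh
# Sketch only: function names and the .venv directory are illustrative assumptions.

# Step 2: install the generator's dependencies in an isolated environment
# and run the main script. Must be called from the root of the exam repo.
generate_input() {
    python3 -m venv .venv &&
    . .venv/bin/activate &&
    pip install -r bigdata-input-generator/requirements.txt &&
    python bigdata-input-generator/main.py
}

# Step 3: add the generated folder to .gitignore, at most once.
ignore_dataset() {
    entry='capstone-dataset/'
    touch .gitignore
    # If the file does not end with a newline, add one so the new entry
    # does not get glued onto the last existing line.
    if [ -s .gitignore ] && [ -n "$(tail -c 1 .gitignore)" ]; then
        printf '\n' >> .gitignore
    fi
    # -x matches the whole line, -F treats the entry as a literal string.
    grep -qxF "$entry" .gitignore || printf '%s\n' "$entry" >> .gitignore
}
```

After sourcing the file, calling generate_input and then ignore_dataset from the root of the exam repo should cover steps 2 and 3. ignore_dataset can be re-run safely because it only appends the entry when it is missing.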
