Find me on Twitter: @newfront
Find me on Medium: @newfrontcreative
About Twilio: Twilio
- Docker (at least 2 CPU cores and 8 GB RAM)
- System Terminal (iTerm, Terminal, etc)
- Working Web Browser (Chrome or Firefox)
Install Docker Desktop (https://www.docker.com/products/docker-desktop)
Additional Docker Resources:
- 2 or more CPU cores.
- 8 GB of RAM or higher.
- The Apache Spark configuration is stored in /install/spark-defaults.conf. You can update those settings to match the CPU cores and memory allocated to your Docker setup.
The Spark defaults are shown below:
spark.cores.max 4
spark.executor.memory 8g
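Before changing the defaults, it can help to read back what the file currently sets. The following is a minimal sketch, not part of the workshop scripts: `conf_get` is a hypothetical helper name, and it assumes the whitespace-separated `key value` layout shown above (it does not handle `key=value` lines).

```shell
# conf_get FILE KEY: print the value of KEY from a spark-defaults.conf-style
# file. Assumes whitespace-separated "key value" lines; lines starting with
# "#" are treated as comments. If KEY appears more than once, the last wins.
conf_get() {
    awk -v key="$2" '
        /^[[:space:]]*#/ { next }             # skip comment lines
        $1 == key { val = $2; found = 1 }     # remember the value column
        END {
            if (found) print val
            exit found ? 0 : 1                # non-zero status when KEY is absent
        }
    ' "$1"
}

# Example (path as given above; adjust it to your checkout):
#   conf_get /install/spark-defaults.conf spark.cores.max
```

If the reported `spark.cores.max` or `spark.executor.memory` is larger than what Docker has been given, lowering the value in the file is the likely fix.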
- Install Docker (See Docker above)
- Once Docker is installed, open your terminal application and run:
cd /spark-intro-to-ml/docker
./run.sh install
./run.sh start
- The Main Application should now be running at http://localhost:8080/
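The containers can take a little while to come up after the start command, so the page may not respond right away. A minimal sketch of one way to wait for it is below; `retry` is a hypothetical helper, not part of `run.sh`, and the attempt count and delay in the example are arbitrary.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between tries.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Only sleep when another attempt is still to come.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: poll the application URL for up to about a minute.
#   retry 30 2 curl -fsS -o /dev/null http://localhost:8080/
```

Passing the probe as a command keeps the helper generic, so the same function could wrap a Redis check instead of `curl`.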
docker exec -it redis5 redis-cli
This should show the prompt 127.0.0.1:6379>
This should be a new install. Try entering info at the prompt to see the redis-server configuration.
The following command lets you view every command hitting Redis during the workshop:
docker exec -it redis5 redis-cli monitor
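The monitor stream can scroll quickly once the notebooks start writing. As a rough sketch, the output can be piped through a small tally to see which commands dominate. `tally_commands` is a hypothetical helper, and it assumes the line layout Redis documents for MONITOR (timestamp, `[db client-address]`, then the quoted command and its arguments).

```shell
# tally_commands: read `redis-cli monitor` output on stdin and print
# "COUNT COMMAND" lines, most frequent first (ties sorted by name).
tally_commands() {
    awk '
        NF >= 4 {                       # skips the leading "OK" line
            cmd = toupper($4)           # 4th field is the quoted command name
            gsub(/"/, "", cmd)          # strip the surrounding quotes
            count[cmd]++
        }
        END { for (c in count) print count[c], c }
    ' | sort -k1,1nr -k2,2
}

# Example: capture a 10-second window, then tally it. Assumes the `timeout`
# utility from GNU coreutils is available; -it is dropped because the output
# is piped rather than read interactively.
#   timeout 10 docker exec redis5 redis-cli monitor | tally_commands
```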
- Open your browser at http://localhost:8080 and you should see the Zeppelin home screen.
- Click on the notebook named 1-LoadAndQuery. When it loads, select the spark and md interpreters to attach to the notebook, then press the button at the top that says Run All Paragraphs.
- This first note in the notebook will take you through to 2-LoadTransformAndCluster and finally to 3-ReloadAndPredictLogistically.
https://hub.docker.com/_/redis/
https://github.com/RedisLabs/spark-redis
- Netflix Movies and Shows: https://www.kaggle.com/shivamb/netflix-shows
- House Prices: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data
- GoodReads Books: https://www.kaggle.com/jealousleopard/goodreadsbooks