rickfarmer / data-science-vm

A Big Data analytics VM for data science. This project automates the creation of a data scientist's toolbox on a virtual machine (VM), so that within a few minutes you can start working in a fully configured data science lab instead of performing the complex installation and configuration that a functioning development environment requires. The VM comes with R, Git, Python, Cloudera, Hadoop, YARN, MRv2, Mahout, MongoDB, Spark, Neo4j, and more pre-installed. It is built automatically as a single CentOS VM using Vagrant, with Chef and shell scripts, for VMware Fusion.


Data Science VM

## Need to install the following Vagrant plugins (gems)

    vagrant plugin install vagrant-omnibus
    vagrant plugin install vagrant-env
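To avoid reinstalling plugins that are already present, the check can be scripted. The following is a minimal sketch, not part of the repository: the function names `missing_plugins` and `install_missing_plugins` are hypothetical, and it assumes `vagrant plugin list` prints one plugin per line in the form `name (version, global)`.

```shell
# Plugins named in the README.
REQUIRED_PLUGINS="vagrant-omnibus vagrant-env"

# Print each required plugin that is absent from the plugin listing in $1.
# Assumption: each installed plugin appears as "name (version, ...)".
missing_plugins() {
  installed_list="$1"
  for plugin in $REQUIRED_PLUGINS; do
    if ! printf '%s\n' "$installed_list" | grep -q "^${plugin} "; then
      printf '%s\n' "$plugin"
    fi
  done
}

# Install whatever is missing. Nothing runs until this function is called.
install_missing_plugins() {
  for plugin in $(missing_plugins "$(vagrant plugin list)"); do
    vagrant plugin install "$plugin"
  done
}
```

Calling `install_missing_plugins` on the host machine (with `vagrant` on the `PATH`) installs only the plugins that are not yet listed.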

Users

Accounts, as username/password:

- root/vagrant
- joe/joe
- chuck/chuck
- cloudera/cloudera

Hive Embedded DB

PostgreSQL

- Host: e63:7432
- DB name: hive
- Username: hive
- Password: 8xlpmpA6NE
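These settings can be combined into a single connection string for the `psql` client. This is a sketch under assumptions: the helper name `hive_db_uri` is hypothetical, a PostgreSQL client is assumed to be installed, and the host name `e63` is assumed to resolve from wherever the command runs.

```shell
# Connection details copied from the README.
HIVE_DB_HOST="e63"
HIVE_DB_PORT="7432"
HIVE_DB_NAME="hive"
HIVE_DB_USER="hive"

# Print a libpq-style connection URI (hypothetical helper).
# The password is deliberately kept out of the URI.
hive_db_uri() {
  printf 'postgresql://%s@%s:%s/%s\n' \
    "$HIVE_DB_USER" "$HIVE_DB_HOST" "$HIVE_DB_PORT" "$HIVE_DB_NAME"
}
```

With the password supplied through the `PGPASSWORD` environment variable, running `psql "$(hive_db_uri)" -c '\dt'` should list the tables in the `hive` database.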

Hue

- URL: http://e63:8888/
- Credentials: hdfs/hdfs

Test the Cluster

sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
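In the command above, the two trailing arguments to the `pi` example are the number of map tasks (10) and the number of samples per map (100). The sketch below wraps the same command so those values can be varied and the result checked. The helper names `pi_job_cmd` and `pi_job_succeeded` are hypothetical, and the check assumes the example job prints a line beginning with `Estimated value of Pi is`.

```shell
# Jar path taken from the README command.
EXAMPLES_JAR="/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar"

# Print the smoke-test command for $1 map tasks and $2 samples per map.
# This only builds the command string; it does not run anything.
pi_job_cmd() {
  printf 'sudo -u hdfs hadoop jar %s pi %s %s\n' "$EXAMPLES_JAR" "$1" "$2"
}

# Succeed only if the job output read from stdin contains the result line.
# Assumption: the example prints "Estimated value of Pi is ...".
pi_job_succeeded() {
  grep -q 'Estimated value of Pi is'
}
```

On the VM, something like `eval "$(pi_job_cmd 10 100)" 2>&1 | pi_job_succeeded && echo "cluster OK"` would run the job and report success only when the estimate line appears.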

About


License: GNU General Public License v2.0


Languages

Ruby 85.2%, HTML 8.0%, Shell 6.8%