zhangyichi12 / stock_big-data

# Big-Data-Proj

### A high-performance distributed data processing system that uses Apache Kafka, Apache Cassandra, and Apache Spark to analyze stock data (200k msg/s on one MacBook Pro).

The setup below runs Docker through docker-machine on macOS.

  1. Create docker-machine
docker-machine create --driver virtualbox --virtualbox-cpu-count 2 --virtualbox-memory 2048 bigdata
  1. Map terminal to docker-machine
eval $(docker-machine env bigdata)
  1. Zookeeper
docker run -d -p 2181:2181 -p 2888:2888 -p 3888:3888 --name zookeeper confluent/zookeeper
  1. Kafka
docker run -d -p 9092:9092 -e KAFKA_ADVERTISED_HOST_NAME=`docker-machine ip bigdata` -e KAFKA_ADVERTISED_PORT=9092 --name kafka --link zookeeper:zookeeper confluent/kafka
  1. Cassandra
docker run -d -p 7199:7199 -p 9042:9042 -p 9160:9160 -p 7001:7001 --name cassandra cassandra:3.7
  1. Redis
docker run -d -p 6379:6379 --name redis redis:alpine
  1. pyenv and virtualenv: pyenv manages the Python version and virtualenv isolates dependencies. Run
source python_env/bin/activate

to enter the virtual environment.
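
Once the containers are running, it can help to confirm that each one accepts connections before starting the Python code. The script below is a minimal sketch and not part of this repo: the function names `service_port`, `port_open`, and `check_services` are hypothetical, the ports are the client ports published by the `docker run` commands above, and it assumes an `nc` that supports `-z` and `-w`.

```shell
#!/usr/bin/env bash
# Sketch: check that the containers started above accept TCP connections.
# The helper names here are illustrative and do not exist in the repo.

# Print the client port for a service name; return non-zero for unknown names.
service_port() {
  case "$1" in
    zookeeper) echo 2181 ;;
    kafka)     echo 9092 ;;
    cassandra) echo 9042 ;;
    redis)     echo 6379 ;;
    *)         return 1 ;;
  esac
}

# Return 0 if host:port accepts a TCP connection within 2 seconds.
# Assumption: the installed `nc` supports the -z and -w options.
port_open() {
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# Print one status line per service for the given host.
check_services() {
  local host="$1" name port
  for name in zookeeper kafka cassandra redis; do
    port="$(service_port "$name")"
    if port_open "$host" "$port"; then
      echo "$name:$port up"
    else
      echo "$name:$port down"
    fi
  done
}

# Run the check only when asked, so the file can also be sourced.
if [ "${1:-}" = "--check" ]; then
  check_services "$(docker-machine ip bigdata)"
fi
```

Saved under a name of your choice and run with `--check`, it prints one `up` or `down` line per service for the `bigdata` machine. An open port only shows that the container is listening, not that the service inside has finished starting.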
