mrsrinivas / spark-bench

spark-bench

Build

    $ git clone https://github.com/mrsrinivas/spark-bench.git
    $ cd spark-bench
    $ mvn install
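
Before submitting a job, it can help to confirm that the build produced the jar the run command expects. The sketch below assumes the artifact lands at `./target/spark-bench-1.0-fat.jar`, the path used in the submit command; `check_fat_jar` is a hypothetical helper name, not part of this repository.

```shell
# Hypothetical helper (not part of spark-bench): verify the fat jar exists.
# The default path is taken from the spark2-submit command in this README.
check_fat_jar() {
    jar_path="${1:-./target/spark-bench-1.0-fat.jar}"
    if [ -f "$jar_path" ]; then
        echo "found: $jar_path"
    else
        echo "missing: $jar_path (did mvn install finish successfully?)" >&2
        return 1
    fi
}
```

Calling `check_fat_jar` with no argument checks the default path; pass a different path if your build writes the jar elsewhere.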

Run

Run the DataGen Spark application on a YARN cluster:

    $ nohup spark2-submit \
        --master yarn \
        --executor-cores 2 \
        --num-executors 30 \
        --driver-memory 2g \
        --executor-memory 4g \
        --class com.mrsrinivas.app.DataGen \
        ./target/spark-bench-1.0-fat.jar \
        100G \
        30 \
        file:///scratch/username/datagen_in > spark-submit.log &
    
    [1] 11069
    nohup: ignoring input and redirecting stderr to stdout
    
    $ tail -f spark-submit.log
        

Once the job succeeds, the output directory should contain the following subdirectories:

    $ cd /scratch/username/datagen_in
    $ ls
    employees	stage-metrics
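
The same check can be scripted. This is a minimal sketch that only tests for the two subdirectories listed above (`employees` and `stage-metrics`); `check_datagen_output` is a hypothetical helper name, and it makes no assumption about the files inside those directories.

```shell
# Hypothetical helper (not part of spark-bench): confirm that the DataGen
# output directory contains the subdirectories shown in this README.
check_datagen_output() {
    out_dir="$1"
    status=0
    for sub in employees stage-metrics; do
        if [ -d "$out_dir/$sub" ]; then
            echo "ok: $out_dir/$sub"
        else
            echo "missing: $out_dir/$sub" >&2
            status=1
        fi
    done
    # Non-zero exit status if any expected subdirectory is absent.
    return "$status"
}
```

For the run shown above, the call would be `check_datagen_output /scratch/username/datagen_in`.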
    

License

Apache License 2.0

