Condla / spark-example

Spark Scala example.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Spark Example

This is a simple example of a Spark 1.6.x application that reads data from a Hive table using Spark SQL does some grouping and aggregation and saves the output to a file on HDFS

Hive --> Spark --> HDFS

Prerequistes

  • JDK 1.8
  • Spark 1.6.x
  • Scala 2.10

Usage

Import

If you are using IntelliJ, import the project as Maven project using JDK 1.8

Build

Use maven on the command line to create a jar.

mvn package

Find the jars, one with the other without dependencies in the target directory.

Submit to HDP

In order to submit your application to an existing YARN cluster with Spark 1.6.x libraries installed you need to perform the following commands. The kinit command is only required on a secure cluster.

kinit username  [-kt /path/to/keytab]
spark-submit --class com.hortonworks.SparkHiveExample --master yarn spark-example-1.0-SNAPSHOT.jar MyApplicationName /path/to/hdfs-dir/with/write/access/

About

Spark Scala example.


Languages

Language:Scala 100.0%