kakamessi99 / sparkle-1

Haskell on Apache Spark.

Sparkle: program Apache Spark applications in Haskell

Sparkle [spär′kəl]: a library for writing resilient analytics applications in Haskell that scale to thousands of nodes, using Spark and the rest of the Apache ecosystem under the hood.

This is an early tech preview and is not production ready.

Getting started

The tl;dr using the hello app as an example on your local machine:

$ stack build hello
$ mvn -f sparkle -Dsparkle.app=sparkle-example-hello package
$ spark-submit --master 'local[1]' sparkle/target/sparkle-0.1.jar

Requirements:

  • the Stack build tool;
  • either the Nix package manager,
  • or OpenJDK, Maven and Spark >= 1.6 installed from your distro.

To run a Spark application the process is as follows:

  1. create an application in the apps/ folder, in-repo or as a submodule;
  2. add your app to stack.yaml;
  3. build the app;
  4. package your app into a deployable JAR container;
  5. submit it to a local or cluster deployment of Spark.

To build:

$ stack [--nix] build

You can optionally pass --nix to any Stack command to have Nix provision Spark and Maven in a local sandbox, which makes builds reproducible. Otherwise, you'll need both installed through your OS distribution's package manager for the next steps.

To package your app:

$ mvn -f sparkle -Dsparkle.app=<app-executable-name> package

or, if you build with Nix:

$ stack --nix exec -- mvn -f sparkle -Dsparkle.app=<app-executable-name> package

And finally, to run your application, say locally:

$ spark-submit --master 'local[1]' sparkle/target/sparkle-0.1.jar

See here for other options, including launching a whole cluster from scratch on EC2.
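
The build, package and submit steps above can be chained in a small shell script. This is a minimal sketch rather than part of Sparkle: the function names `run` and `sparkle_run` and the `DRY_RUN` variable are hypothetical, and the JAR path is assumed to be `sparkle/target/sparkle-0.1.jar`, as in the tl;dr. It uses only the commands already shown in this README.

```shell
#!/bin/sh
# Sketch: chain the build, package and submit steps for one app.
# run, sparkle_run and DRY_RUN are illustrative names, not part of Sparkle.
set -eu

# Print a command, then execute it unless DRY_RUN is set to 1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# $1: Stack package name (e.g. hello)
# $2: app executable name (e.g. sparkle-example-hello)
sparkle_run() {
  run stack build "$1"
  run mvn -f sparkle "-Dsparkle.app=$2" package
  # Assumed JAR location, taken from the tl;dr above.
  run spark-submit --master 'local[1]' sparkle/target/sparkle-0.1.jar
}
```

After sourcing the script, `sparkle_run hello sparkle-example-hello` runs the three steps in order and stops at the first failure. Setting `DRY_RUN=1` beforehand prints the commands without executing them, which is a cheap way to check the app and executable names first.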

License

Copyright (c) 2015-2016 Tweag I/O Limited.

All rights reserved.

Sparkle is free software, and may be redistributed under the terms specified in the LICENSE file.

About

Tweag I/O

Sparkle is maintained by Tweag I/O.

Have questions? Need help? Tweet at @tweagio.

License: BSD 3-Clause "New" or "Revised" License

