gstamatakis / top_k_queries

Top-k queries algorithm in Spark - Scala.

Top-k queries

Project repository for top-k queries project.

Contents

Files contains useful PDFs.

Datasets contains tested datasets.

Building via the command line

Run the following:

sbt assembly

Running Locally

You can use IntelliJ IDEA to run the program locally via the Run/Debug buttons; see the next section.

Using in IntelliJ

To use this project in IntelliJ, import the project from existing sources using "sbt". Ensure that the Scala plugin is installed first. IntelliJ will download sbt for you.

To use IntelliJ as a local runner, follow these three steps:

1. Run src/main/scala/topk.TopKDriver in Debug mode (it will crash).

2. In the top right corner, edit the run configuration settings.

3. Add -k 3 as program arguments and check "Include dependencies with Provided scope" at the bottom.

You can now run/debug the TopKDriver program from the top right corner of the IDE.

If it's greyed out, simply click on it and save it again.

The Spark app can now be run locally.
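
Step 3 above passes -k 3 as program arguments. As a rough illustration only, here is a minimal sketch of how a driver could read such a flag, assuming -k takes a positive integer. The names TopKArgs and parseK are hypothetical and are not taken from this repository; the actual TopKDriver may parse its arguments differently.

```scala
import scala.util.Try

// Hypothetical sketch: not the repository's actual TopKDriver code.
object TopKArgs {

  /** Returns Some(k) when args contain "-k <positive integer>", otherwise None. */
  def parseK(args: Array[String]): Option[Int] =
    args
      .sliding(2)                                          // look at adjacent argument pairs
      .collectFirst { case Array("-k", value) => value }   // first value that follows "-k"
      .flatMap(v => Try(v.toInt).toOption)                 // reject non-numeric values
      .filter(_ > 0)                                       // k must be positive

  def main(args: Array[String]): Unit =
    parseK(args) match {
      case Some(k) => println(s"k = $k")
      case None =>
        // Reached when no valid "-k <n>" pair was supplied.
        Console.err.println("usage: -k <positive integer>")
        sys.exit(1)
    }
}
```

With the program arguments from step 3, parseK(Array("-k", "3")) returns Some(3); with no arguments it returns None and the sketch exits with a usage message.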

Troubleshooting

If the program isn't running or the output is empty, try running:

sbt clean assembly

Languages

Scala 80.8%, Python 17.8%, Batchfile 1.4%