mhausenblas / dromedar

Apache Drill On Apache Mesos

Dromedar: Drill on Mesos

This is a simple wrapper/enabler for running Apache Drill on Apache Mesos.

Dromedar (DRill On MEsos aDAptoR) is launched via Marathon. Whenever a query request comes in, it launches a number of Drillbits that depends on the size of the dataset being queried. The query-scale-factor (QSF) determines how many Drillbits are launched relative to the dataset size; it defaults to one Drillbit per 100 MB (1:100), or qsf=100 for short.
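
To make the QSF arithmetic concrete, here is a minimal sketch of how a Drillbit count could be derived from a dataset size. The function name `drillbits_for`, the round-up behaviour, and the minimum of one Drillbit are assumptions made for illustration; the text above only fixes the default ratio of 1:100.

```python
import math


def drillbits_for(dataset_size_mb: float, qsf: int = 100) -> int:
    """Return how many Drillbits to launch for a dataset of the given size.

    qsf is the query-scale-factor: megabytes of data per Drillbit.
    """
    if qsf <= 0:
        raise ValueError("qsf must be a positive number of MB per Drillbit")
    if dataset_size_mb < 0:
        raise ValueError("dataset size cannot be negative")
    # Assumption: round up so a partial 100 MB chunk still gets a Drillbit,
    # and always launch at least one Drillbit so the query can run.
    return max(1, math.ceil(dataset_size_mb / qsf))


if __name__ == "__main__":
    for size in (10, 100, 250, 1000):
        print(f"{size} MB at qsf=100: {drillbits_for(size)} Drillbit(s)")
```

With the default qsf=100, a 250 MB dataset would map to three Drillbits under these assumptions.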

Dromedar's architecture is as follows:

+----------------+ +----------------------------------------+                     
| Marathon       | |  Mesos worker node                     |                     
|                | |                                        |                     
|                | |       +-------------------------+      |                     
|                | |       |                         |      |                     
|                | |       |        Drillbit         <---------[3]-------> SQL client
|                | |       +------------+------------+      |                     
|                | |                   [2]                  |                     
|                | |       +------------+------------+      |                     
|                | |       |                         |      |                     
|                +----[2]-->    drillbit.sh start    |      |                     
|                | |       +-------------------------+      |                     
|                | |                                        |                     
|                | |                                        |                     
|                | |       +-------------------------+      |                     
|                | |       |                         |      |                     
|                | |HTTP API                         |      |                     
|                <----[2]--+         qsf.py          <---------[1]-------- [QSF]  
|                | |       |                         |      |                     
|                | |       +-------------------------+      |                     
+----------------+ +----------------------------------------+                     

Dromedar's underlying long-running service is qsf.py, which is initially deployed through dromedar.py using Marathon. Once qsf.py is running as a Web service, it performs the following steps:

  1. qsf.py takes a QSF as input via its HTTP interface on port 9876.
  2. qsf.py uses the Marathon HTTP API to trigger on-demand Drillbit creation using the drillbit.sh start command.
  3. The SQL client connects to (one of) the Drillbit(s) and executes the SQL query.
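
Step 2 above can be sketched as follows. This is an illustrative sketch, not the code in qsf.py: it builds a Marathon app definition that runs `drillbit.sh start` and wraps it in a POST request to Marathon's `/v2/apps` endpoint. The Marathon address, the app id `/dromedar/drillbit`, and the CPU and memory values are assumptions; only the `drillbit.sh start` command comes from the description above.

```python
import json
import urllib.request

# Assumption: Marathon is reachable on its default port on the local host.
MARATHON_URL = "http://localhost:8080"


def drillbit_app(instances: int, app_id: str = "/dromedar/drillbit") -> dict:
    """Build a Marathon app definition that starts the requested Drillbits."""
    if instances < 1:
        raise ValueError("at least one Drillbit instance is required")
    return {
        "id": app_id,                 # hypothetical app id
        "cmd": "drillbit.sh start",   # command named in the steps above
        "instances": instances,       # number of Drillbits to run
        "cpus": 1.0,                  # assumed resource request per Drillbit
        "mem": 2048,                  # assumed memory in MB per Drillbit
    }


def marathon_request(app: dict, base_url: str = MARATHON_URL) -> urllib.request.Request:
    """Wrap an app definition in a POST request for Marathon's /v2/apps."""
    return urllib.request.Request(
        url=f"{base_url}/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = marathon_request(drillbit_app(instances=3))
    # Print instead of sending, so the sketch runs without a Marathon instance.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would submit it, which requires a running Marathon instance at the assumed address.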

Dependencies

Note that Dromedar installs Apache Drill and the Marathon Python package directly. The only two things assumed to be available are Mesos and Marathon themselves.

Usage

$ ./launch.sh

Then, go to the Marathon UI where you should see something like the following:

[Screenshot: Dromedar launched in Marathon]

To Do

  • Bootstrap (install Drill, launch Dromedar via Marathon)
  • Implement QSF HTTP API
  • Implement Drillbit launch/teardown based on requests
  • Clarify relation/communication between QSF and SQL client (out of band??)
  • Strata implementation cross-check
  • Cluster deployment and testing
  • HAProxy deployment?
  • Examples and video walkthrough

License: Apache License 2.0