hbutani / spark-druid-olap

Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.

Home Page: http://sparklinedata.com/

Avoid Druid Broker Bottleneck

jpullokkaran opened this issue

Eliminate the broker as a bottleneck in cases where a large amount of data needs to be pulled out of Druid for subsequent processing in Spark. One possible solution is to talk directly to the Historical nodes.

For example:

SELECT c_name,
       bal,
       sales_prospects_amount
FROM   (SELECT c_name,
               Sum(c_acctbal) AS bal
        FROM   orderlineitempartsupplier
        GROUP  BY c_name
        HAVING Sum(c_acctbal) > 1000) r1
       JOIN (SELECT cname,
                    Sum(sales_prospects_amount) AS sales_prospects_amount
             FROM   sales_leads
             GROUP  BY cname) r2
         ON r1.c_name = r2.cname

Fixed with aae401f.
Set queryHistoricalServer = true; see the examples in HistoricalServerTest.
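
For readers who want to see where such a setting might go, here is a minimal sketch of a Spark SQL table definition with the option turned on. Only queryHistoricalServer and the table name orderlineitempartsupplier come from this thread. The data source name and every other option name and value are assumptions for illustration, so check HistoricalServerTest for the exact spelling and the full set of required options.

```sql
-- Hypothetical sketch, not copied from the project.
-- Assumed: the data source name "org.sparklinedata.druid" and the options
--          sourceDataframe, druidDatasource and druidHost (names and values are placeholders).
-- From this thread: queryHistoricalServer "true", which asks for Druid data to be
--          pulled from Historical nodes instead of going through the broker.
CREATE TEMPORARY TABLE orderlineitempartsupplier
USING org.sparklinedata.druid
OPTIONS (
  sourceDataframe "orderLineItemPartSupplierBase",
  druidDatasource "tpch",
  druidHost "localhost",
  queryHistoricalServer "true"
)
```

With a table defined along these lines, the example query from the issue would be submitted unchanged; the option only affects which Druid nodes are contacted.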