paypal / dione

Dione - a Spark and HDFS indexing library

Switch filesDF to HadoopRDD

shay1bz opened this issue

The current filesDF is both ugly and inefficient in terms of data locality.
We should try switching to something like HadoopRDD/NewHadoopRDD, or something more natural, so that we can leverage Spark's preferred-locations functionality.
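
To make the idea concrete, here is a rough sketch of one possible direction, not a proposed implementation. It assumes we only need an RDD of file paths whose partitions carry the HDFS block hosts as location hints. The helper name `filesWithLocality` and its `dir` parameter are hypothetical. It uses `SparkContext.makeRDD` with per-element location preferences instead of HadoopRDD itself, simply because that is the shortest way to show preferred locations in action.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper (not part of Dione): builds an RDD with one element per
// file under `dir`, where each element's partition prefers the hosts that
// store that file's blocks.
def filesWithLocality(sc: SparkContext, dir: String): RDD[String] = {
  val root = new Path(dir)
  val fs = root.getFileSystem(sc.hadoopConfiguration)

  // Pair every file path with the distinct hosts holding any of its blocks.
  val filesAndHosts: Seq[(String, Seq[String])] =
    fs.listStatus(root).toSeq.filter(_.isFile).map { status =>
      val hosts = fs
        .getFileBlockLocations(status, 0L, status.getLen)
        .flatMap(_.getHosts)
        .distinct
        .toSeq
      (status.getPath.toString, hosts)
    }

  // This makeRDD overload creates one partition per element and uses the
  // second tuple member as that partition's preferred locations.
  sc.makeRDD(filesAndHosts)
}
```

The scheduler treats these hosts as hints only, so tasks may still run elsewhere when the preferred executors are busy. A real HadoopRDD/NewHadoopRDD-based version would get the same hints from the input splits, which is probably the more natural fit here.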