weld-project / weld

High-performance runtime for data analytics applications

Home Page: https://www.weld.rs

Queries on weld groupby and sort

shashwatwork opened this issue

Hi,

For groupby, I used the following code snippet:

import time
import pandas as pd
import grizzly.grizzly as gr  # assumed import behind the "gr" alias used below

# Plain pandas
df = pd.DataFrame({"a": [3, 2, 3], "b": [4, 5, 6]})
start = time.time()
res = df.groupby("a").agg("sum")
end = time.time()
pandas_time_groupby = end - start
print("({:.3} seconds)".format(pandas_time_groupby))

# Weld
start = time.time()
df_weld = gr.DataFrameWeld(df)  # renamed from "input" to avoid shadowing the builtin
groupby_sd = df_weld.groupby("a").sum()
end = time.time()
weld_time_groupby = end - start
print("({:.3} seconds)".format(weld_time_groupby))

Queries:
1. How can I view or display the result of a Weld operation? For example, I need to print the output of groupby_sd.
2. Weld groupby runs faster than plain pandas on a small DataFrame, but as the DataFrame grows, pandas performs much better than Weld. (Please send me a snippet if Weld works well on a large volume of data.)
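
For the comparison in query 2, a single time.time() pair on a three-row DataFrame is mostly noise, so a repeated measurement may give a steadier number. The following is a minimal sketch using only the standard library; the helper name best_of and its repeats parameter are made up for illustration and are not part of pandas or Weld.

```python
from timeit import default_timer


def best_of(fn, repeats=5):
    """Call fn() `repeats` times and return the shortest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = default_timer()
        fn()
        timings.append(default_timer() - start)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings)


if __name__ == "__main__":
    # Stand-in workload; in the snippet above this could be
    # lambda: df.groupby("a").agg("sum")
    elapsed = best_of(lambda: sum(range(100000)))
    print("({:.3} seconds)".format(elapsed))
```

If the Weld wrapper builds its result lazily, the callable passed to best_of has to force the computation as well; otherwise only the construction of the query is timed, not its execution.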

Sort function
I have searched a lot for how to perform a sort operation with Weld, the way we sort in plain pandas. Please let me know how to sort with Weld.
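
For reference, this is the plain pandas behaviour I would like to reproduce with Weld. It is a sketch of the pandas side only; the helper name sort_by_column is hypothetical, and nothing here is Weld API.

```python
import pandas as pd


def sort_by_column(frame, column, ascending=True):
    """Return a copy of `frame` sorted by `column`, keeping tied rows in their original order."""
    # kind="mergesort" is a stable sort, so rows with equal keys keep their relative order.
    return frame.sort_values(by=column, ascending=ascending, kind="mergesort")


if __name__ == "__main__":
    df = pd.DataFrame({"a": [3, 2, 3], "b": [4, 5, 6]})
    print(sort_by_column(df, "a"))
```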