Evaluating query runtime without output
AKheli opened this issue · comments
Hello,
I am using PyDruid to measure a query's runtime in Druid, without taking into account the time spent returning the result output through the API.
from pydruid.db import connect
import time
conn = connect(host='localhost', port=8082, path='/druid/v2/sql/', scheme='http')
curs = conn.cursor()
start = time.time()
curs.execute("""
SELECT id_station, count(*) FROM bafu_comma where id_station IN (32, 54, 8, 25, 95, 13, 80, 16, 83, 27) group by id_station
""")
end1 = time.time()
print('execution runtime:', (end1 - start) * 1000, 'ms')
print('number of rows:', sum(1 for _ in curs))
end2 = time.time()
# for row in curs:
# print(row)
print('total time:', (end2 - start) * 1000, 'ms')
Is this a correct way of measuring the runtime? My execution time is always around 200 ms or 50 ms, which is a bit suspicious. Also, the total runtime that I obtain is much higher than the results that I obtain in the API.
Any ideas on how to properly evaluate a query execution time in Druid?
Thanks!
I'm not sure that's correct. The DB API connector streams the results from Druid, so unless you have iterated over the entire result set, I don't think you can assume that the query execution has finished.
Lines 365 to 380 in bd7b741
The correct time is probably closer to end2 - start in this case, I think.
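To make the difference between the two measurements explicit, here is a minimal sketch of a timing helper. The name `time_query` is hypothetical (it is not part of PyDruid), and the sketch assumes only that the cursor follows the Python DB API and can be iterated row by row, as the cursor in the snippet above is. It uses `time.perf_counter()` rather than `time.time()` because it is a monotonic clock intended for measuring short intervals.

```python
import time


def time_query(cursor, sql):
    """Time a query on a DB-API cursor (hypothetical helper, not a PyDruid API).

    Returns a dict with:
      execute_ms - time until execute() returns; with a streaming
                   connector this may cover only part of the work
      total_ms   - time until every row has been read from the cursor
      rows       - number of rows read
    """
    start = time.perf_counter()
    cursor.execute(sql)
    executed = time.perf_counter()

    # Drain the cursor without keeping the rows, so that the whole
    # result set has actually been received before the clock stops.
    rows = sum(1 for _ in cursor)
    drained = time.perf_counter()

    return {
        "execute_ms": (executed - start) * 1000,
        "total_ms": (drained - start) * 1000,
        "rows": rows,
    }
```

With the connection from the original snippet, this could be called as `time_query(conn.cursor(), "SELECT ...")`. The `total_ms` value still includes network transfer and client-side row parsing, so it should be read as an upper bound on the server-side execution time rather than an exact figure.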