qubole / sparklens

Qubole Sparklens tool for performance tuning Apache Spark

Home Page: http://sparklens.qubole.com

Getting negative wallclock time in the sparklens UI

shahidki31 opened this issue · comments

Hi,
Thank you for this amazing tool for performance monitoring of Spark jobs. I was trying out some long-running Spark queries with Sparklens, but I am getting some strange output in the Sparklens UI for the job timings, such as the wallclock time.
[Screenshots of the Sparklens UI showing negative wallclock times]
Could you please help me to resolve this issue?

@shahidki31 Thanks for raising this. We will take a look and get back to you.
I know of a couple of reasons why this happens. The first is missing job-end events in the event log file: when the job end time is not known, we try to estimate it, and bad estimates can lead to negative job times. The second is multiple jobs running in parallel. Sparklens computes driver time by subtracting "time spent in jobs" from the total wall clock time, and with parallel jobs it becomes tricky to work out "time spent in jobs". We made some changes to deal with this problem, but perhaps we are running into something new here. We will check and get back to you.
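To make the parallel-jobs pitfall concrete, here is a small self-contained Scala sketch (not the actual Sparklens code; the `JobSpan` type and the example timestamps are made up for illustration). It contrasts naively summing per-job durations, which double-counts overlapping jobs and can push the derived driver time negative, with merging overlapping job intervals first.

```scala
// Illustrative sketch only (not Sparklens source): why subtracting naively
// summed job times from wall clock time goes negative when jobs overlap.
object DriverTimeSketch {
  // Hypothetical representation of a job's start/end timestamps (ms).
  case class JobSpan(start: Long, end: Long)

  // Naive approach: sum each job's duration independently.
  // Parallel (overlapping) jobs are double-counted, so the sum can
  // exceed the application's total wall clock time.
  def naiveJobTime(jobs: Seq[JobSpan]): Long =
    jobs.map(j => j.end - j.start).sum

  // Safer approach: merge overlapping spans first, then sum the merged
  // intervals, so time covered by parallel jobs is counted only once.
  def mergedJobTime(jobs: Seq[JobSpan]): Long = {
    val merged = jobs.sortBy(_.start).foldLeft(List.empty[JobSpan]) {
      case (last :: rest, j) if j.start <= last.end =>
        JobSpan(last.start, math.max(last.end, j.end)) :: rest
      case (acc, j) => j :: acc
    }
    merged.map(j => j.end - j.start).sum
  }

  def main(args: Array[String]): Unit = {
    val appStart = 0L
    val appEnd   = 100L
    // Two jobs running in parallel for most of the application.
    val jobs = Seq(JobSpan(10L, 90L), JobSpan(20L, 95L))

    val wallClock = appEnd - appStart
    println(s"naive driver time  = ${wallClock - naiveJobTime(jobs)}")  // 100 - 155 = -55
    println(s"merged driver time = ${wallClock - mergedJobTime(jobs)}") // 100 - 85  = 15
  }
}
```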

@shahidki31 Parallel jobs were not being accounted for at one place in the code, which is why you were getting a negative driver wallclock time. I have fixed it and updated the jar. Please check now, it should work correctly. Thanks for reporting 👍
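For reference, a minimal way to attach Sparklens when re-testing with the updated jar might look like the sketch below. It assumes the `com.qubole.sparklens.QuboleJobListener` listener class from the project README and that the Sparklens jar is already on the driver classpath (for example via `--jars`).

```scala
// Minimal sketch, not an official snippet: register Sparklens on a
// SparkSession so it can collect job/stage timing metrics.
import org.apache.spark.sql.SparkSession

object SparklensEnabledApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sparklens-check")
      // Sparklens listener class documented in the project README.
      .config("spark.extraListeners", "com.qubole.sparklens.QuboleJobListener")
      .getOrCreate()

    // Any workload; Sparklens prints its report when the application ends.
    spark.range(0, 1000000).selectExpr("sum(id)").show()

    spark.stop()
  }
}
```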

Thanks @iamrohit and @mayurdb for the replies. Yes, I am running jobs in parallel (TPCDS queries, basically). It seems the console output gives the correct results; only the UI had the problem. I will check again. Thank you.
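For context, a workload like the one described here, with several queries submitted concurrently from a single driver so that their jobs overlap in the event log, could be sketched roughly as follows. The queries shown are simple placeholders, not actual TPCDS statements.

```scala
// Rough sketch of submitting several SQL queries in parallel from one
// driver, which is what produces overlapping jobs in the Spark event log.
import org.apache.spark.sql.SparkSession
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ParallelQueries {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("parallel-queries").getOrCreate()

    // Placeholder queries; each collect() triggers one or more Spark jobs,
    // and wrapping them in Futures makes those jobs overlap in time.
    val queries = Seq(
      "SELECT count(*) FROM range(10000000)",
      "SELECT sum(id)  FROM range(10000000)",
      "SELECT max(id)  FROM range(10000000)"
    )

    val results = queries.map(q => Future(spark.sql(q).collect()))
    results.foreach(f => Await.result(f, Duration.Inf))

    spark.stop()
  }
}
```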