pinterest / DoctorK

DoctorK is a service for Kafka cluster auto healing and workload balancing

Kafkastats fails to report stats due to ArithmeticException

paulkiernan opened this issue

I'm unable to get the kafkastats service to report metrics. On each report interval I see the following error:

21:25:49.144 [StatsReporter] ERROR com.pinterest.doctorkafka.stats.BrokerStatsReporter - Failed to report stats
java.lang.ArithmeticException: / by zero
        at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.computeNetworkStats(BrokerStatsRetriever.java:676) ~[kafkastats-0.2.4.2.jar:?]
        at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.retrieveBrokerStats(BrokerStatsRetriever.java:656) ~[kafkastats-0.2.4.2.jar:?]
        at com.pinterest.doctorkafka.stats.BrokerStatsReporter.run(BrokerStatsReporter.java:57) [kafkastats-0.2.4.2.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_171]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_171]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_171]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]

This appears to be caused by deltaT always being 0 in this method, so the division at line 676 divides by zero: https://github.com/pinterest/doctorkafka/blob/master/kafkastats/src/main/java/com/pinterest/doctorkafka/stats/BrokerStatsRetriever.java#L668
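
For anyone hitting the same trace, here is a minimal sketch of the failure mode and one possible guard. It is not the actual DoctorK code or the fix that was merged: the class NetworkRateSketch, both method names, and the way deltaT is derived from two millisecond timestamps are assumptions made for illustration. The only fact relied on is that Java integer division by zero throws ArithmeticException, whereas floating-point division does not.

```java
import java.util.OptionalDouble;

// Hypothetical illustration only; names and logic are not taken from DoctorK.
final class NetworkRateSketch {
    private NetworkRateSketch() {}

    // Assumed shape of the bug: deltaT is an integer number of seconds, so two
    // samples less than one second apart (or identical) yield deltaT == 0 and
    // the long division below throws ArithmeticException ("/ by zero").
    static long unguardedRate(long prevBytes, long currBytes, long prevMs, long currMs) {
        long deltaT = (currMs - prevMs) / 1000;
        return (currBytes - prevBytes) / deltaT;
    }

    // One possible guard: keep millisecond resolution, skip the sample when no
    // time has elapsed, and divide in floating point.
    static OptionalDouble guardedRate(long prevBytes, long currBytes, long prevMs, long currMs) {
        long deltaMs = currMs - prevMs;
        if (deltaMs <= 0) {
            return OptionalDouble.empty(); // nothing meaningful to report
        }
        return OptionalDouble.of((currBytes - prevBytes) * 1000.0 / deltaMs);
    }

    public static void main(String[] args) {
        // Two samples taken 2 seconds apart, 5000 bytes transferred.
        System.out.println(guardedRate(0, 5000, 1_000, 3_000));
        // Identical timestamps: the guarded version returns an empty result.
        System.out.println(guardedRate(0, 5000, 1_000, 1_000));
        try {
            unguardedRate(0, 5000, 1_000, 1_000);
        } catch (ArithmeticException e) {
            System.out.println("unguarded: " + e.getMessage());
        }
    }
}
```

Returning an empty OptionalDouble lets the caller skip a single report interval instead of failing the whole stats report, which is the behavior seen in the log above.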

I've tried running kafkastats with both the 0.2.4.3 and 0.2.4.2 releases, and the problem occurs in both.

@paulkiernan Thanks for reporting the issue! We have put in a fix (#76) for this and reset 0.2.4.3 to the latest commit. Could you try again?

Great, #76 appears to have fixed the broken calculation. Thanks!