pinterest / DoctorK

DoctorK is a service for Kafka cluster auto healing and workload balancing

ERROR com.pinterest.doctorkafka.DoctorKafka - No brokerstats info for cluster

sidharthk opened this issue · comments

Hi,

I am getting the error below while starting DoctorKafka, and the DoctorKafka UI is not populated with cluster details. I have pasted the DoctorKafka UI screenshot and the startup log:
INFO [2019-04-18 10:26:36,215] org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/kafka/doctorkafka-0.2.4.5-build
INFO [2019-04-18 10:26:36,217] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@65383667
INFO [2019-04-18 10:26:36,246] org.apache.zookeeper.ClientCnxn: Opening socket connection to server kafka-poc-2.com/192.168.100.6:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-04-18 10:26:36,247] org.apache.curator.framework.imps.CuratorFrameworkImpl: Default schema
WARN [2019-04-18 10:26:36,262] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
! java.net.ConnectException: Connection refused
! at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
! at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
! at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
! at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
INFO [2019-04-18 10:26:36,364] org.apache.zookeeper.ClientCnxn: Opening socket connection to server kafka-poc-2.com/10.144.179.90:2181. Will not attempt to authenticate using SASL (unknown error)
WARN [2019-04-18 10:26:36,365] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
! java.net.ConnectException: Connection refused
! at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
! at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
! at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
! at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
INFO [2019-04-18 10:26:36,466] org.apache.zookeeper.ClientCnxn: Opening socket connection to server kafka-poc-4.com/192.168.100.8:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-04-18 10:26:36,468] org.apache.zookeeper.ClientCnxn: Socket connection established to kafka-poc-4.com/192.168.100.8:2181, initiating session
WARN [2019-04-18 10:26:36,554] com.pinterest.doctorkafka.util.OstrichAdminService: Failed to load properties from build.properties
! java.lang.NullPointerException: null
! at com.pinterest.doctorkafka.util.OstrichAdminService.startAdminHttpService(OstrichAdminService.java:39)
! at com.pinterest.doctorkafka.util.OperatorUtil.startOstrichService(OperatorUtil.java:282)
! at com.pinterest.doctorkafka.DoctorKafkaMain.startMetricsService(DoctorKafkaMain.java:130)
! at com.pinterest.doctorkafka.DoctorKafkaMain.run(DoctorKafkaMain.java:80)
! at com.pinterest.doctorkafka.DoctorKafkaMain.run(DoctorKafkaMain.java:38)
! at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:43)
! at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:87)
! at io.dropwizard.cli.Cli.run(Cli.java:78)
! at io.dropwizard.Application.run(Application.java:93)
! at com.pinterest.doctorkafka.DoctorKafkaMain.main(DoctorKafkaMain.java:149)
INFO [2019-04-18 10:26:36,903] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@3586412d
INFO [2019-04-18 10:26:36,904] org.I0Itec.zkclient.ZkEventThread: Starting ZkClient event thread.
INFO [2019-04-18 10:26:36,904] org.I0Itec.zkclient.ZkClient: Waiting for keeper state SyncConnected
INFO [2019-04-18 10:26:36,908] org.apache.zookeeper.ClientCnxn: Opening socket connection to server kafka-poc-3.com/192.168.100.7:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-04-18 10:26:36,909] org.apache.zookeeper.ClientCnxn: Socket connection established to kafka-poc-3.com/192.168.100.7:2181, initiating session
INFO [2019-04-18 10:26:37,249] com.twitter.ostrich.stats.LatchedStatsListener$$anon$1: Starting LatchedStatsListener
INFO [2019-04-18 10:26:37,520] com.twitter.ostrich.admin.AdminHttpService: Admin HTTP interface started on port 2052.
INFO [2019-04-18 10:26:37,533] io.dropwizard.server.ServerFactory: Starting DoctorKafkaMain
INFO [2019-04-18 10:26:37,620] org.eclipse.jetty.setuid.SetUIDListener: Opened application@ea00de{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
.............
INFO [2019-04-18 10:26:39,226] org.apache.kafka.common.utils.AppInfoParser: Kafka version : 1.1.0
INFO [2019-04-18 10:26:39,226] org.apache.kafka.common.utils.AppInfoParser: Kafka commitId : fdcf75ea326b8e07
15:56:39.233 [pool-8-thread-1] ERROR com.pinterest.doctorkafka.DoctorKafka - No brokerstats info for cluster kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181
INFO [2019-04-18 10:26:39,242] org.apache.kafka.clients.consumer.internals.AbstractCoordinator: [Consumer clientId=consumer-5, groupId=operator_brokerstats_group_sidharth-kafka-poc-1] Successfully joined group with generation 7
INFO [2019-04-18 10:26:39,243] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator: [Consumer clientId=consumer-5, groupId=operator_brokerstats_group_sidharth-kafka-poc-1] Setting newly assigned partitions [brokerstats-0, brokerstats-1, brokerstats-2]
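Before digging into the config, note the repeated `java.net.ConnectException: Connection refused` for kafka-poc-2.com in the log above: at least one quorum member was not accepting connections at startup. As a quick sanity check, each ZooKeeper endpoint can be probed with a plain TCP connect. This is a minimal sketch; the hostnames are taken from the log and the port is assumed to be the standard 2181:

```python
import socket

# Hostnames from the startup log above; adjust to your quorum.
ZK_HOSTS = ["kafka-poc-2.com", "kafka-poc-3.com", "kafka-poc-4.com"]

def zk_reachable(host, port=2181, timeout=2.0):
    """Return True if a TCP connection to the ZooKeeper port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, DNS failure, and connect timeout.
        return False

if __name__ == "__main__":
    for h in ZK_HOSTS:
        state = "ok" if zk_reachable(h) else "UNREACHABLE"
        print(f"{h}:2181 {state}")
```

An intermittently unreachable quorum member does not by itself explain the "No brokerstats info" error (the session eventually connected via kafka-poc-4), but it is worth ruling out first.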

dropwizard_yaml_file.yaml
config: doctorkafka.properties
server:
  type: default
  maxThreads: 1024

doctorkafka.properties
###################################################################################
# DoctorKafka global settings, including the zookeeper url for kafkastats topic
###################################################################################

# [required] zookeeper quorum for storing doctorkafka metadata
#doctorkafka.zkurl=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181
doctorkafka.zkurl=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181

# [required] zookeeper connection string for kafkastats topic
#doctorkafka.brokerstats.zkurl=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181/cluster1
doctorkafka.brokerstats.zkurl=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181

# [required] kafka topic name for kafkastats
doctorkafka.brokerstats.topic=brokerstats

# [required] the time window that doctorkafka uses to compute broker stats
doctorkafka.brokerstats.backtrack.seconds=86400

# [optional] ssl related settings for the kafka cluster that hosts brokerstats;
# can be PLAINTEXT, SSL, SASL_PLAINTEXT, or SASL_SSL

doctorkafka.brokerstats.consumer.security.protocol=SSL
doctorkafka.brokerstats.consumer.ssl.client.auth=required
doctorkafka.brokerstats.consumer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
doctorkafka.brokerstats.consumer.ssl.endpoint.identification.algorithm=HTTPS
doctorkafka.brokerstats.consumer.ssl.key.password=key_password
doctorkafka.brokerstats.consumer.ssl.keystore.location=keystore_path
doctorkafka.brokerstats.consumer.ssl.keystore.password=keystore_password
doctorkafka.brokerstats.consumer.ssl.keystore.type=JKS
doctorkafka.brokerstats.consumer.ssl.secure.random.implementation=SHA1PRNG
doctorkafka.brokerstats.consumer.ssl.truststore.location=truststore_path
doctorkafka.brokerstats.consumer.ssl.truststore.password=truststore_password
doctorkafka.brokerstats.consumer.ssl.truststore.type=JKS

# [required] zookeeper connection string for action_report topic
#doctorkafka.action.report.zkurl=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181/cluster1
doctorkafka.action.report.zkurl=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181

# [required] kafka topic for storing the actions that doctorkafka takes
doctorkafka.action.report.topic=operator_report

# [optional] broker replacement interval in seconds
doctorkafka.action.broker_replacement.interval.seconds=43200

# [optional] broker replacement script
doctorkafka.action.broker_replacement.command="/usr/local/bin/ec2-replace-node.py -r "

# [optional] ssl related settings for the action report producer

doctorkafka.action.producer.security.protocol=SSL
doctorkafka.action.producer.ssl.client.auth=required
doctorkafka.action.producer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
doctorkafka.action.producer.ssl.endpoint.identification.algorithm=HTTPS
doctorkafka.action.producer.ssl.key.password=key_password
doctorkafka.action.producer.ssl.keystore.location=keystore_path
doctorkafka.action.producer.ssl.keystore.password=keystore_password
doctorkafka.action.producer.ssl.keystore.type=JKS
doctorkafka.action.producer.ssl.secure.random.implementation=SHA1PRNG
doctorkafka.action.producer.ssl.truststore.location=truststore_path
doctorkafka.action.producer.ssl.truststore.password=truststore_password
doctorkafka.action.producer.ssl.truststore.type=JKS

# [required] doctorkafka web port
doctorkafka.web.port=8080

# [optional] disable doctorkafka service restart
doctorkafka.restart.disabled=false

# [required] doctorkafka service restart interval
doctorkafka.restart.interval.seconds=86400

# [optional] ostrich port
doctorkafka.ostrich.port=2052

# [optional] tsd host and port
#doctorkafka.tsd.hostport=localhost:18621

# [required] email addresses for sending general notifications on cluster under-replication etc.
doctorkafka.emails.notification=email_address_1,email_address_2

# [required] email addresses for sending alerts to
doctorkafka.emails.alert=email_address_3,email_address_4

# [optional] brokerstats.version
doctorkafka.brokerstats.version=0.2.4.5

################################################################################
# cluster1 settings
################################################################################

# [required] whether DoctorKafka runs in dry run mode
kafkacluster.cluster1.dryrun=true

# [required] zookeeper url for the kafka cluster
#kafkacluster.cluster1.zkurl=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181/cluster1
kafkacluster.cluster1.zkurl=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181

# [required] the network-inbound limit in megabytes
kafkacluster.cluster1.network.inbound.limit.mb=35

# [required] the network-outbound limit in megabytes
kafkacluster.cluster1.network.outbound.limit.mb=80

# [required] the broker's maximum network bandwidth in megabytes
kafkacluster.cluster1.network.bandwith.max.mb=150

# [optional] ssl related kafka consumer settings for accessing topic metadata info of cluster1

kafkacluster.cluster1.consumer.security.protocol=SSL
kafkacluster.cluster1.consumer.ssl.client.auth=required
kafkacluster.cluster1.consumer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
kafkacluster.cluster1.consumer.ssl.endpoint.identification.algorithm=HTTPS
kafkacluster.cluster1.consumer.ssl.key.password=key_password
kafkacluster.cluster1.consumer.ssl.keystore.location=keystore_path
kafkacluster.cluster1.consumer.ssl.keystore.password=keystore_password
kafkacluster.cluster1.consumer.ssl.keystore.type=JKS
kafkacluster.cluster1.consumer.ssl.secure.random.implementation=SHA1PRNG
kafkacluster.cluster1.consumer.ssl.truststore.location=truststore_path
kafkacluster.cluster1.consumer.ssl.truststore.password=truststore_password
kafkacluster.cluster1.consumer.ssl.truststore.type=JKS

(screenshot: DoctorKafka UI showing no cluster details)

@sidharthk , how are you deploying kafkastats?

@sidharthk It seems that doctorkafka cannot find your cluster. Can you check whether the "kafkacluster.<yourclustername>.zkurl" for your cluster in the doctorkafka config file is the same as the "zookeeper.connect" setting in the broker's kafka configuration file (i.e. the file you provided as an argument to kafkastats)? Currently these must be identical for doctorkafka to recognize the cluster.
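The check suggested above can be scripted. This is a minimal sketch under the assumption (implied by the comment above) that DoctorKafka matches the two values by plain string equality, so a different host ordering or a chroot suffix like `/cluster1` on one side counts as a mismatch. The file paths, the `cluster1` name, and the helper names are placeholders:

```python
def read_property(path, key):
    """Return the value of `key` from a Java-style .properties file, or None."""
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            k, _, v = line.partition("=")
            if k.strip() == key:
                return v.strip()
    return None

def zkurls_match(doctorkafka_props, server_props, cluster="cluster1"):
    """Compare kafkacluster.<cluster>.zkurl against the broker's zookeeper.connect."""
    zkurl = read_property(doctorkafka_props, f"kafkacluster.{cluster}.zkurl")
    zk_connect = read_property(server_props, "zookeeper.connect")
    return zkurl is not None and zkurl == zk_connect
```

For example, `zkurls_match("doctorkafka.properties", "/path/to/broker/server.properties")` returning False would point at exactly the mismatch described above.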

We could not reproduce this issue. Closing as non-reproducible.