confluentinc / kafka-connect-hdfs

Kafka Connect HDFS connector

what does a leader node usually do?

kimnami opened this issue

Hi there!

I'd like to set up a Kafka Connect cluster in distributed mode.
Assume there are 4 nodes, like this:

node1
node2
node3
node4

I want to set node1 as the ADVERTISED_HOST, which I guess makes it something like a leader. I also set node1 as the REST_HOST.
So, when I run the Docker containers, I set the following environment variables on every node:

	-e CONNECT_REST_HOST_NAME="node1" \
	-e CONNECT_REST_PORT="8083" \
	-e CONNECT_REST_ADVERTISED_HOST_NAME="node1" \
	-e CONNECT_REST_ADVERTISED_PORT="8083" \
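
To make the setup concrete, here is a minimal sketch of how the flags above end up applied to all four nodes. It only prints one `docker run` command per node and does not start anything. The image name `my-connect-image` and the `connect-<node>` container names are hypothetical placeholders, and the other settings a Connect worker needs (bootstrap servers, group id, storage topics, and so on) are left out.

```shell
#!/usr/bin/env bash
# Sketch only: prints (does not run) one `docker run` command per node.
# IMAGE and the container names are hypothetical placeholders.
IMAGE="my-connect-image"

print_connect_cmds() {
  local node
  for node in node1 node2 node3 node4; do
    # Every node gets the same REST settings, all pointing at node1,
    # exactly as described in this issue.
    printf 'docker run -d --name connect-%s' "$node"
    printf ' -e CONNECT_REST_HOST_NAME="node1"'
    printf ' -e CONNECT_REST_PORT="8083"'
    printf ' -e CONNECT_REST_ADVERTISED_HOST_NAME="node1"'
    printf ' -e CONNECT_REST_ADVERTISED_PORT="8083"'
    printf ' %s\n' "$IMAGE"
  done
}

print_connect_cmds
```

Each printed command would be run on its own host, so node2, node3, and node4 also advertise `node1` as their REST host.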

Then I see leaderUrl='http://node1:8083/' in the log, like this:

[2021-04-28 18:24:09,320] INFO Joined group and got assignment: Assignment{error=0, leader='connect-1-303380fb-90ff-4c00-a302-a57ee8b4a0e4', leaderUrl='http://node1:8083/', offset=93, connectorIds=[], taskIds=[test5-0, test5-2]} (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
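
To compare what each worker reports, the `leaderUrl` field can be pulled out of a log line like the one above. This is a small sketch using `sed`; the helper name `extract_leader_url` is made up for illustration, and it assumes the field keeps the single-quoted `leaderUrl='...'` form shown in this log.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads log lines on stdin and prints the value of
# leaderUrl='...' for every line that contains that field.
extract_leader_url() {
  sed -n "s/.*leaderUrl='\([^']*\)'.*/\1/p"
}

# The log line from this issue, copied verbatim.
line="[2021-04-28 18:24:09,320] INFO Joined group and got assignment: Assignment{error=0, leader='connect-1-303380fb-90ff-4c00-a302-a57ee8b4a0e4', leaderUrl='http://node1:8083/', offset=93, connectorIds=[], taskIds=[test5-0, test5-2]} (org.apache.kafka.connect.runtime.distributed.DistributedHerder)"

printf '%s\n' "$line" | extract_leader_url
# prints: http://node1:8083/
```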

Given this, my questions are:

  • The docs I read suggest there is no leader in a Kafka Connect cluster, but it looks like CONNECT_REST_ADVERTISED_HOST_NAME makes a node the leader, as you can see in the log. Am I wrong? If I misunderstood, what exactly does CONNECT_REST_ADVERTISED_HOST_NAME do?
  • I thought CONNECT_REST_HOST_NAME would make only one node receive all the requests. But when I sent requests to node2, node3, or node4, they worked fine. So what exactly does this variable do?
  • Finally, if there is a leader in the cluster, what does the leader node usually do?

I thought this question might not be suitable for this repo, so I also asked it on Stack Overflow. Let me know if anybody knows a more suitable place for it.