Azure / azure-event-hubs-for-kafka

Azure Event Hubs for Apache Kafka Ecosystems

Home Page:https://docs.microsoft.com/azure/event-hubs/event-hubs-for-kafka-ecosystem-overview


Kafka API - offset commits are throttled to 2 commits/sec

ravel3 opened this issue

Description

During our tests we observed that the Event Hubs Kafka API throttles offset commits to ~2 commits/sec.
This happens regardless of whether the event hub (topic) has 1 or 10 partitions, and regardless of the number of consumer groups connected to it. The pricing tier does not matter either: we tested on Standard and Premium.


The Event Hubs docs say:

“Offset commits are throttled to 4 calls/second per partition with a maximum internal log size of 1 MB”
Source: https://docs.microsoft.com/en-us/azure/event-hubs/apache-kafka-troubleshooting-guide#limits

Could you please verify whether the problem is with the client configuration or with the Event Hubs limits configuration?

The following logs are an extract of the attached log file and show the described throttling taking place.
All DEBUG logs are in the attached file.


2022-06-02 14:59:47.005 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 81
2022-06-02 14:59:47.096 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 81 in **91ms**
2022-06-02 14:59:47.097 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 82
2022-06-02 14:59:48.200 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 82 in **1103ms**
2022-06-02 14:59:48.201 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 83
2022-06-02 14:59:48.278 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 83 in **77ms**
2022-06-02 14:59:48.279 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 84
2022-06-02 14:59:49.337 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 84 in **1058ms**
2022-06-02 14:59:49.337 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 85
2022-06-02 14:59:49.421 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 85 in **84ms**
2022-06-02 14:59:49.422 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 86
2022-06-02 14:59:50.396 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 86 in **974ms**
2022-06-02 14:59:50.397 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 87
2022-06-02 14:59:50.493 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 87 in **96ms**
2022-06-02 14:59:50.493 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 88
2022-06-02 14:59:51.491 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 88 in **997ms**
2022-06-02 14:59:51.491 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 89
2022-06-02 14:59:51.604 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 89 in **113ms**
2022-06-02 14:59:51.605 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 90
2022-06-02 14:59:52.538 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 90 in **933ms**
2022-06-02 14:59:52.538 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 91
2022-06-02 14:59:52.629 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 91 in **91ms**
2022-06-02 14:59:52.630 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 92
2022-06-02 14:59:53.739 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 92 in **1109ms**
2022-06-02 14:59:53.742 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 84
2022-06-02 14:59:54.457 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 84 in **715ms**
2022-06-02 14:59:54.457 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 85
2022-06-02 14:59:54.846 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 85 in **389ms**
2022-06-02 14:59:54.846 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 86
2022-06-02 14:59:55.485 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 86 in **639ms**
2022-06-02 14:59:55.485 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 87
2022-06-02 14:59:55.979 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 87 in **494ms**
2022-06-02 14:59:55.980 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 88
2022-06-02 14:59:56.752 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 88 in **772ms**
2022-06-02 14:59:56.752 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 89
2022-06-02 14:59:57.118 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 89 in **366ms**
2022-06-02 14:59:57.118 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 90
2022-06-02 14:59:57.939 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 90 in **821ms**
2022-06-02 14:59:57.939 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 91
2022-06-02 14:59:58.259 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 91 in **320ms**
2022-06-02 14:59:58.259 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 92
2022-06-02 14:59:59.064 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 92 in **805ms**
2022-06-02 14:59:59.065 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 93
2022-06-02 14:59:59.322 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 93 in **257ms**
2022-06-02 14:59:59.322 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 94
2022-06-02 15:00:00.191 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 94 in **869ms**
2022-06-02 15:00:00.192 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 95
2022-06-02 15:00:00.408 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 95 in **216ms**
2022-06-02 15:00:00.409 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 96
2022-06-02 15:00:01.269 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 96 in **860ms**
2022-06-02 15:00:01.270 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 97
2022-06-02 15:00:01.470 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 97 in **200ms**
2022-06-02 15:00:01.471 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 98
2022-06-02 15:00:02.443 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 98 in **972ms**
2022-06-02 15:00:02.444 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 99
2022-06-02 15:00:02.527 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 99 in **83ms**
2022-06-02 15:00:02.528 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 100
2022-06-02 15:00:03.672 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 100 in **1144ms**
2022-06-02 15:00:03.673 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 101
2022-06-02 15:00:03.753 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 101 in **80ms**
2022-06-02 15:00:03.754 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 102
2022-06-02 15:00:04.798 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 102 in **1044ms**
2022-06-02 15:00:04.798 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 103
2022-06-02 15:00:04.871 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 103 in **73ms**
2022-06-02 15:00:04.871 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 5 offset: 104
2022-06-02 15:00:05.883 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 5 offset: 104 in **1012ms**
2022-06-02 15:00:05.884 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 4 offset: 79
2022-06-02 15:00:05.959 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 4 offset: 79 in **75ms**
2022-06-02 15:00:05.959 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 4 offset: 80
2022-06-02 15:00:06.965 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 4 offset: 80 in **1006ms**
2022-06-02 15:00:06.966 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 4 offset: 81
2022-06-02 15:00:07.043 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 4 offset: 81 in **77ms**
2022-06-02 15:00:07.044 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 4 offset: 82
2022-06-02 15:00:08.075 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 4 offset: 82 in **1031ms**
2022-06-02 15:00:08.075 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 4 offset: 83
2022-06-02 15:00:08.155 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 4 offset: 83 in **79ms**
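
As a rough cross-check of the ~2 commits/sec figure, the commit durations in the extract above can be summed programmatically. The sketch below is only an illustration: the class and method names are made up for this example, and it assumes the log format shown above (`Committed msg ... in **Nms**`). It ignores the roughly 1 ms gap between one commit finishing and the next one starting, so it slightly overestimates the rate. For the first two commits (91 ms and 1103 ms) it gives 2 commits in 1194 ms, which is roughly 1.7 commits/sec.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommitRateFromLog {
    // Matches the duration at the end of a "Committed msg ..." line, e.g. "in **91ms**".
    private static final Pattern DURATION =
            Pattern.compile("Committed msg .* in \\*\\*(\\d+)ms\\*\\*");

    /** Returns commits per second implied by the summed commit durations, or 0 if none are found. */
    static double commitsPerSecond(List<String> lines) {
        long totalMs = 0;
        int commits = 0;
        for (String line : lines) {
            Matcher m = DURATION.matcher(line);
            if (m.find()) {
                totalMs += Long.parseLong(m.group(1));
                commits++;
            }
        }
        return totalMs == 0 ? 0.0 : commits * 1000.0 / totalMs;
    }

    public static void main(String[] args) {
        // Three lines copied from the log extract; "Will commit" lines are skipped by the pattern.
        List<String> sample = Arrays.asList(
            "2022-06-02 14:59:47.096 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 81 in **91ms**",
            "2022-06-02 14:59:47.097 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Will commit msg partition: 3 offset: 82",
            "2022-06-02 14:59:48.200 [     Gr_0-Cln_0] INFO  TestConsumerThread:76 - Gr_0/Cln_0: Committed msg partition: 3 offset: 82 in **1103ms**");
        System.out.println(commitsPerSecond(sample) + " commits/sec");
    }
}
```

Feeding the whole attached log file through `commitsPerSecond` should give a steadier estimate than two lines.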

How to reproduce

  1. Create an event hub with 10 partitions.
  2. Connect to the event hub with a Kafka client (Java).
  3. Poll, and commit synchronously after every consumed message.

    The following code snippet is a slightly modified version of the Quickstart Java consumer example:


while (true) {
    logMsg("Will poll msg");
    // poll(long) is deprecated since kafka-clients 2.0 in favor of poll(Duration);
    // left unchanged so the snippet also compiles against kafka-clients 1.0.0.
    final ConsumerRecords<String, String> consumerRecords = consumer.poll(10000);
    logMsg("Pulled msgs count:" + consumerRecords.count());
    for (ConsumerRecord<String, String> cr : consumerRecords) {
        logMsg("Consumer Record:(key: %s, value: %s, partition: %d, offset: %d, headers: %s)\n", cr.key(), cr.value(), cr.partition(), cr.offset(), Arrays.toString(cr.headers().toArray()));
        logMsg("Will commit partition: "+cr.partition()+" offset: " + cr.offset());
        // Blocking commit after every record; with max.poll.records=1 each poll returns at most one record.
        consumer.commitSync();
        logMsg("Committed partition: "+cr.partition()+" offset: " + cr.offset());
        logMsg("----------------------------------------------------------------");
    }
}
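
For comparison, a consumer that does not need a commit after every single record could issue one commit per polled batch instead. The sketch below is deliberately Kafka-free so it runs on its own: `Rec` is a hypothetical stand-in for `ConsumerRecord`, and `offsetsToCommit` is an illustrative helper, not part of any library. It computes, per partition, the offset that one batched commit would carry, which by Kafka convention is the last processed offset plus one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class BatchCommitOffsets {
    /** Hypothetical stand-in for ConsumerRecord: only partition and offset matter here. */
    static final class Rec {
        final int partition;
        final long offset;

        Rec(int partition, long offset) {
            this.partition = partition;
            this.offset = offset;
        }
    }

    /** For each partition, returns the highest processed offset + 1 (the next offset to read). */
    static Map<Integer, Long> offsetsToCommit(List<Rec> batch) {
        Map<Integer, Long> next = new TreeMap<>();
        for (Rec r : batch) {
            // Keep the maximum in case records of a partition arrive out of order in the list.
            next.merge(r.partition, r.offset + 1, Long::max);
        }
        return next;
    }

    public static void main(String[] args) {
        List<Rec> batch = Arrays.asList(new Rec(3, 81), new Rec(3, 82), new Rec(5, 84));
        // One commit call would carry all of these, instead of one call per record.
        System.out.println(offsetsToCommit(batch)); // prints {3=83, 5=85}
    }
}
```

With the real client, such a map would be converted to `Map<TopicPartition, OffsetAndMetadata>` and passed to `consumer.commitSync(offsets)`. Whether batching is acceptable depends on how many records the application can afford to reprocess after a crash.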

Has it worked previously?

No. We have observed the same behaviour from the beginning.

Checklist

Please provide the relevant information for the following items:

  • SDK (include version info): OpenJdk Java 11
  • Sample you're having trouble with: Kafka offset throttling is lower than documentation claims
  • If using Apache Kafka Java clients or a framework that uses Apache Kafka Java clients, version: kafka-clients 1.0.0 or kafka-clients 3.2.0
  • Kafka client configuration:
bootstrap.servers=/CUT/
group.id=Gr_0
request.timeout.ms=60000
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="/CUT/";
max.poll.records=1
max.poll.interval.ms=300000
isolation.level=read_committed
enable.auto.commit=false
# Event Hubs recommended config
heartbeat.interval.ms=3000
session.timeout.ms=30000
  • Namespace and EventHub/topic name: test-eventhub-kafka-commit-throttling.servicebus.windows.net:, subscriptionId: ae224581-db60-438f-90c2-3aebcf5d404b
  • Consumer or producer failure NA
  • Timestamps in UTC 2022-06-02 14:59:47.005
  • group.id or client.id group.id=Gr_0 client.id=Cln_0
  • Logs: logs_event-hubs.log
  • Standalone repro code snippet in How to reproduce section
  • Operating system: MacOS 12.4 / Ubuntu Linux 20.04
  • Minor issue
  • Namespace Standard tier (tested on Premium also)

Hi guys,

Has anyone been able to confirm (or not) the problem described above?

Where are these offsets stored by default, so that a consumer can look them up and resume from the persisted offset after a rebalance or restart?

For the Event Hubs Standard tier, we do slow down the offset commit rate if you exceed approximately 2 calls per second per topic-partition. However, since Nov 2022 we have changed the Premium and Dedicated tiers so that you can commit offsets as fast as your resource allocation allows (the PU core allocation limit for Premium, and hardware performance for Dedicated).
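
Given the figure of roughly 2 calls per second per topic-partition on Standard, a client could pace its own commits rather than wait to be slowed down by the service. The following is a minimal sketch, not an official recommendation: `CommitGate` and `tryAcquire` are hypothetical names, and the 500 ms interval is simply derived from the ~2 calls/sec figure above.

```java
import java.util.HashMap;
import java.util.Map;

class CommitGate {
    private final long minIntervalMs;
    // Time of the last allowed commit, per partition number.
    private final Map<Integer, Long> lastCommitMs = new HashMap<>();

    CommitGate(long minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    /** Returns true (and records the time) if this partition may commit at nowMs. */
    boolean tryAcquire(int partition, long nowMs) {
        Long last = lastCommitMs.get(partition);
        if (last != null && nowMs - last < minIntervalMs) {
            return false; // too soon: keep the offset pending and commit it later
        }
        lastCommitMs.put(partition, nowMs);
        return true;
    }

    public static void main(String[] args) {
        CommitGate gate = new CommitGate(500); // assumed interval: ~2 commits/sec per partition
        long[] times = {0, 100, 499, 500, 900, 1000};
        for (long t : times) {
            System.out.println(t + " ms -> commit allowed: " + gate.tryAcquire(3, t));
        }
    }
}
```

In a consumer loop, `consumer.commitSync()` would be called only when `tryAcquire` returns true, with one final commit on shutdown or in `ConsumerRebalanceListener.onPartitionsRevoked` so that pending offsets are not lost.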


@hmlam thank you for the clarification.
Are you able to update the Event Hubs documentation?
The following doc states a limit of 4 commits/sec without distinguishing between pricing tiers, which differs from what you wrote:

Offset commits are throttled to 4 calls/second per partition with a maximum internal log size of 1 MB
Source: https://learn.microsoft.com/en-us/azure/event-hubs/apache-kafka-troubleshooting-guide#limits