aio-libs / aiokafka

asyncio client for kafka

Home Page: http://aiokafka.readthedocs.io/


[QUESTION] Producer performance with large messages

YraganTron opened this issue · comments

I see performance issues when publishing large messages (30 kilobytes each).

I am publishing messages using a benchmark from the repository with these parameters:
python -m simple_produce_bench -b kafka -s 30000 --topic test --batch-size 200000 --linger-ms 20 --uvloop
Produced 168 messages in 1 second(s).
Produced 156 messages in 1 second(s).

If I set the message size to 1000 bytes, throughput looks much better:
python -m simple_produce_bench -b kafka -s 1000 --topic test --batch-size 200000 --linger-ms 20 --uvloop
Produced 5148 messages in 1 second(s).
Produced 4158 messages in 1 second(s).
Produced 4752 messages in 1 second(s).
Produced 4356 messages in 1 second(s).

Is this the expected throughput for this message size?
Or should I look for problems in the Kafka configuration itself?
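One detail worth noting when reading these benchmark parameters: the `--batch-size 200000` value is a capacity in bytes (it maps to the producer's `max_batch_size` setting in aiokafka), so the number of messages that fit in a single batch shrinks as the message size grows. A quick back-of-the-envelope sketch, using only the numbers from the two commands above:

```python
# Rough estimate of how many messages fit in one producer batch,
# assuming --batch-size is the batch capacity in bytes (aiokafka's
# max_batch_size) and ignoring per-record framing overhead.
BATCH_SIZE = 200_000  # bytes, from the benchmark commands above


def messages_per_batch(message_size: int, batch_size: int = BATCH_SIZE) -> int:
    """Upper bound on messages packed into a single batch."""
    return batch_size // message_size


print(messages_per_batch(30_000))  # -> 6: each produce request carries few messages
print(messages_per_batch(1_000))   # -> 200: per-request cost is amortized much better
```

With 30 KB messages a batch holds at most ~6 records, so per-request overhead is spread over far fewer messages than in the 1 KB case.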

Hey @YraganTron

It seems you have a roughly linear relationship between message size and messages per second: a message 30x bigger is produced at about 1/30 the rate, so the byte throughput stays about the same.
That doesn't look too bad for this situation; the limiting factors could be:

  • bandwidth between the client and the broker
  • CPU used by the python code to process each message
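The "linear impact" observation can be checked directly from the numbers reported above, by converting both runs to bytes per second:

```python
# Sanity check on the linear-scaling observation: compare byte throughput
# of the two benchmark runs. Message rates are taken from the output above.
def byte_throughput(msgs_per_sec: int, message_size: int) -> int:
    """Bytes moved per second at a given message rate and size."""
    return msgs_per_sec * message_size


large = byte_throughput(168, 30_000)  # 30 KB messages -> 5,040,000 B/s
small = byte_throughput(5148, 1_000)  # 1 KB messages  -> 5,148,000 B/s

# Both runs move ~5 MB/s, a difference of only a few percent.
print(large, small)
```

Since both runs saturate at nearly the same byte rate, the bottleneck is per-byte (network bandwidth, or CPU spent serializing/copying each byte) rather than per-message.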