Confluent Parallel Consumer

Parallel Apache Kafka client wrapper with client side queueing, a simpler consumer/producer API with key concurrency and extendable non-blocking IO processing.

Confluent’s product page for the project is here.

💡	If you like this project, please ⭐ Star it in GitHub to show your appreciation, help us gauge popularity of the project and allocate resources.

ℹ️	This is not a part of the Confluent commercial support offering, except through consulting engagements. See the Support and Issues section for more information.

❗	This project has been stable and reached its initial target feature set in Q1 2021. It is actively maintained by the CSID team at Confluent.

This library lets you process messages in parallel via a single Kafka Consumer, meaning you can increase consumer parallelism without increasing the number of partitions in the topic you intend to process. For many use cases this improves both throughput and latency by reducing load on your brokers. It also opens up new use cases like extreme parallelism, external data enrichment, and queuing.

Consume many messages concurrently with a single consumer instance:

parallelConsumer.poll(record ->
        log.info("Concurrently processing a record: {}", record)
);

An overview article on the library can also be found on Confluent’s blog: Introducing the Confluent Parallel Consumer.

1. Demo

Relative speed demonstration

Figure 1. Click on the animated SVG image to open the Asciinema.org player.

2. Video Overview

Kafka Summit Europe 2021 Presentation

Figure 2. A video overview is available on the Kafka Summit Europe 2021 page for the presentation, along with the slides.

Table of Contents

1. Demo
2. Video Overview
3. Motivation
- 3.1. Why would I need this?
  - 3.1.1. Before
  - 3.1.2. After
- 3.2. Background
- 3.3. FAQ
- 3.4. Scenarios
4. Features List
5. Performance
- 5.1. Illustrative Performance Example
6. Support and Issues
7. License
8. Usage
- 8.1. Maven
- 8.2. Common Preparation
- 8.3. Core
- 8.4. Batching
  - 8.4.1. Usage
  - 8.4.2. Restrictions
- 8.5. HTTP with the Vert.x Module
- 8.6. Project Reactor
- 8.7. Kafka Streams Concurrent Processing
- 8.8. Confluent Cloud
9. Upgrading
- 9.1. From 0.4 to 0.5
10. Ordering Guarantees
- 10.1. Vanilla Kafka Consumer Operation
- 10.2. Unordered
- 10.3. Ordered by Partition
- 10.4. Ordered by Key
- 10.5. Retries and Ordering
11. Retries
- 11.1. Retry Delay Function
- 11.2. Skipping Records
- 11.3. Circuit Breaker Pattern
- 11.4. Head of Line Blocking
- 11.5. Future Work
12. Result Models
13. Commit Mode
- 13.1. Apache Kafka EoS Transaction Model
14. Using with Kafka Streams
15. Roadmap
- 15.1. Medium Term - What’s up next ⏲
- 15.2. Long Term - The future ☁️
16. Usage Requirements
17. Development Information
- 17.1. Requirements
- 17.2. Notes
- 17.3. Recommended IDEA Plugins
- 17.4. Readme
- 17.5. Maven targets
- 17.6. Testing
  - 17.6.1. Integration Testing with TestContainers
18. Implementation Details
- 18.1. Core Architecture
- 18.2. Vert.x Architecture
- 18.3. Transactional System Architecture
- 18.4. Offset Map
  - 18.4.1. Storage Notes
19. Attribution
20. Change Log
- 20.1. Next Version
- 20.2. v0.5.2.3
  - 20.2.1. Improvements
- 20.3. v0.5.2.2
  - 20.3.1. Fixes
- 20.4. v0.5.2.1
  - 20.4.1. Fixes
- 20.5. v0.5.2.0
  - 20.5.1. Fixes and Improvements
  - 20.5.2. Build
  - 20.5.3. Dependencies
  - 20.5.4. Linked issues
- 20.6. v0.5.1.0
  - 20.6.1. Features
  - 20.6.2. Fixes and Improvements
- 20.7. v0.5.0.0
  - 20.7.1. Features
  - 20.7.2. Fixes and Improvements
  - 20.7.3. Build
- 20.8. v0.4.0.1
  - 20.8.1. Improvements
  - 20.8.2. Docs
- 20.9. v0.4.0.0
  - 20.9.1. Features
  - 20.9.2. Fixes and Improvements
- 20.10. v0.3.2.0
  - 20.10.1. Fixes and Improvements
- 20.11. v0.3.1.0
  - 20.11.1. Fixes and Improvements
- 20.12. v0.3.0.3
  - 20.12.1. Fixes and Improvements
- 20.13. v0.3.0.2
  - 20.13.1. Fixes and Improvements
- 20.14. v0.3.0.1
- 20.15. v0.3.0.0
  - 20.15.1. Features
  - 20.15.2. Improvements
  - 20.15.3. Fixes
- 20.16. v0.2.0.3
  - 20.16.1. Fixes
- 20.17. v0.2.0.2
  - 20.17.1. Fixes
- 20.18. v0.2.0.1 DO NOT USE - has critical bug
  - 20.18.1. Fixes
- 20.19. v0.2.0.0
  - 20.19.1. Features
  - 20.19.2. Improvements
  - 20.19.3. Fixes
- 20.20. v0.1
  - 20.20.1. Features:

3. Motivation

3.1. Why would I need this?

The unit of parallelism in Kafka’s consumers is the partition but sometimes you want to break away from this approach and manage parallelism yourself using threads rather than new instances of a Consumer. Notable use cases include:

Where partition counts are difficult to change and you need more parallelism than the current configuration allows.
You wish to avoid over provisioning partitions in topics due to unknown future requirements.
You wish to reduce the broker-side resource utilization associated with highly-parallel consumer groups.
You need queue-like semantics that use message level acknowledgment, for example to process a work queue with short- and long-running tasks.

When reading the below, keep in mind that the unit of concurrency, and thus performance, is restricted by the number of partitions (degree of sharding / concurrency). Currently, you can’t adjust the number of partitions in your Kafka topics without jumping through a lot of hoops, or breaking your key ordering.

3.1.1. Before

Figure 3. The slow consumer situation with the raw Apache Kafka Consumer client

3.1.2. After

Figure 4. Example usage of the Parallel Consumer

3.2. Background

The core Kafka consumer client gives you a batch of messages to process one at a time. Processing these in parallel on thread pools is difficult, particularly when considering offset management and strong ordering guarantees. You also need to manage your consume loop, and commit transactions properly if using Exactly Once semantics.

This wrapper library for the Apache Kafka Java client handles all this for you; you just supply your processing function.

Another common situation where concurrent processing of messages is advantageous is what is referred to as "competing consumers", a pattern that is often addressed in traditional messaging systems using a shared queue. Kafka doesn’t provide native queue support, and this can result in a slow-processing message blocking the messages behind it in the same partition. If log ordering isn’t a concern, this can be an unwelcome bottleneck for users. The Parallel Consumer provides a solution to this problem.

In addition, the Vert.x extension to this library supplies non-blocking interfaces, allowing higher still levels of concurrency with a further simplified interface. Also included now is a module for Project Reactor.io.

3.3. FAQ

Why not just run more consumers?

The typical way to address performance issues in a Kafka system is to increase the number of consumers reading from a topic. This is effective in many situations, but falls short in many others.
- Primarily: You cannot use more consumers than you have partitions available to read from. For example, if you have a topic with five partitions, you cannot use a group with more than five consumers to read from it.
- Running extra consumers has resource implications: each consumer takes up resources on both the client and broker side, adding overhead in terms of memory, CPU, and network bandwidth.
- Large consumer groups (especially many large groups) can cause a lot of strain on the consumer group coordination system, such as rebalance storms.
- Even with several partitions, you cannot achieve the performance levels obtainable by per-key ordered or unordered concurrent processing.
- A single slow or failing message will also still block all messages behind it, i.e. the entire partition. The process may recover, but the latency of all the messages behind the problematic one will be severely impacted.
Why not run more consumers within your application instance?
- This is in some respects a slightly easier way of running more consumer instances, and in others a more complicated way. However, you are still restricted by all the per consumer restrictions as described above.
Why not use the Vert.x library yourself in your processing loop?
- Vert.x is used in this library to provide a non-blocking IO system in the message processing step. Using Vert.x for ordered processing without this library requires dealing with the complicated, and far from straightforward, task of handling offset commits with Vert.x’s asynchronous processing system.
  
  Unordered processing with Vert.x is somewhat easier, however offset management is still quite complicated, and the Parallel Consumer also provides optimizations for message-level acknowledgment in this case. This library handles offset commits for both ordered and unordered processing cases.

3.4. Scenarios

Below are some real world use cases which illustrate concrete situations where the described advantages massively improve performance.

Slow consumer systems in transactional systems (online vs offline or reporting systems)
- Notification system:
  - Notification processing system which sends push notifications to a user to acknowledge a two-factor authentication request on their mobile and authorising a login to a website, requires optimal end-to-end latency for a good user experience.
  - A specific message in this queue uncharacteristically takes a long time to process because the third party system is sometimes unpredictably slow to respond and so holds up the processing for ALL other notifications for other users that are in the same partition behind this message.
  - Using key order concurrent processing will allow notifications to proceed while this message either slowly succeeds or times out and retries.
- Slow GPS tracking system (slow HTTP service interfaces that can scale horizontally)
  - GPS tracking messages from 100,000 different field devices pour through at a high rate into an input topic.
  - For each message, the GPS location coordinates are checked to be within allowed ranges using a legacy HTTP service, as dictated by business rules behind the service.
  - The service takes 50ms to process each message, however it can be scaled out horizontally without restriction.
  - The input topic only has 10 partitions and for various reasons (see above) cannot be changed.
  - With the vanilla consumer, messages on each partition must be consumed one after the other in serial order.
  - The maximum rate of message processing is then:
    
    1 second / 50 ms * 10 partitions = 200 messages per second.
  - By using this library, the 10 partitions can all be processed in key order.
    
    1 second / 50ms × 100,000 keys = 2,000,000 messages per second
    
    While the HTTP system probably cannot handle 2,000,000 messages per second, more importantly, your system is no longer the bottleneck.
- Slow CPU bound model processing for fraud prediction
  - Consider a system where message data is passed through a fraud prediction model which takes CPU cycles, instead of an external system being slow.
  - We can easily scale the number of CPUs on the virtual machine where the processing is being run, but we choose not to scale the partitions or consumers (see above).
  - By deploying onto machines with far more CPUs available, we can run our prediction model massively parallel, increasing our throughput and reducing our end-to-end response times.
Spikey load with latency sensitive non-functional requirements
- An upstream system regularly floods our input topic daily at close of business with settlement totals data from retail outlets.
  - Situations like this are common where systems are designed to comfortably handle average day time load, but are not provisioned to handle sudden increases in traffic as they don’t happen often enough to justify the increased spending on processing capacity that would otherwise remain idle.
  - Without adjusting the available partitions or running consumers, we can reduce our maximum end-to-end latency and increase throughput to get our global day’s outlet reports to division managers so action can be taken before close of business.
- Natural consumer behaviour
  - Consider scenarios where bursts of data flooding input topics are generated by sudden user behaviour such as sales or television events ("Oprah" moments).
  - For example, an evening, prime-time game show on TV where users send in quiz answers on their devices. The end-to-end latency of the responses to these answers needs to be as low as technically possible, even if the processing step is quick.
  - Instead of a vanilla client where each user response waits in a virtual queue with others to be processed, this library allows every single response to be processed in parallel.
Legacy partition structure
- In any existing setup where we need higher performance, in either throughput or latency, but there are not enough partitions for the needed concurrency level, the tool can be applied.
Partition overloaded brokers
- Clusters with under-provisioned hardware and with too many partitions already - where we cannot expand partitions even if we were able to.
- Similar to the above, but from the operations perspective, our system is already over partitioned, perhaps in order to support existing parallel workloads which aren’t using the tool (and so need large numbers of partitions).
- We encourage our development teams to migrate to the tool, and then begin a process of actually lowering the number of partitions in our topics in order to reduce operational complexity, improve reliability and perhaps save on infrastructure costs.
Server side resources are controlled by a different team we can’t influence
- The cluster our team is working with is not in our control; we cannot change the partition setup, or perhaps even the consumer layout.
- We can use the tool ourselves to improve our system performance without touching the cluster / topic setup.
Kafka Streams app that had a slow stage
- We use Kafka Streams for our message processing, but one of its steps has characteristics of the above and we need better performance. We can break out as described below into the tool for processing that step, then return to the Kafka Streams context.
Provisioning extra machines (either virtual machines or real machines) to run multiple clients has a cost. Using this library instead avoids the need for extra instances to be deployed at all.
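
The throughput arithmetic from the GPS tracking scenario above can be reproduced with a few lines of plain Java. This is only a back-of-envelope sketch: the class and method names are made up for illustration, and it assumes each "lane" (a partition for the vanilla consumer, a key for key-ordered processing) handles exactly one message at a time with a fixed processing time.

```java
final class ThroughputEstimate {

    /** Messages per second when each lane processes one message at a time. */
    static long messagesPerSecond(long processingMillis, long lanes) {
        return 1000L * lanes / processingMillis;
    }

    public static void main(String[] args) {
        // Vanilla consumer: one in-flight message per partition (10 partitions, 50 ms each).
        System.out.println(messagesPerSecond(50, 10));
        // Key ordering: one in-flight message per key (100,000 keys, 50 ms each).
        System.out.println(messagesPerSecond(50, 100_000));
    }
}
```

The two calls correspond to the 200 and 2,000,000 messages per second figures in the scenario; the second is an upper bound that the downstream HTTP service is unlikely to sustain.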

4. Features List

Have massively parallel consumption processing without running hundreds or thousands of:
- Kafka consumer clients,
- topic partitions,
  
  without operational burden or harming the cluster’s performance
Client side queueing system on top of Apache Kafka consumer
- Efficient individual message acknowledgement system (without local or third party external system state storage) to massively reduce (and usually completely eliminate) message replay upon failure - see Offset Map section for more details
Solution for the "head of line" blocking problem where continued failure of a single message, prevents progress for messages behind it in the queue
Per key concurrent processing, per partition and unordered message processing
Offsets committed correctly, in order, of only processed messages, regardless of concurrency level or retries
Batch support in all versions of the API to process batches of messages in parallel instead of single messages.
- Particularly useful for when your processing function can work with more than a single record at a time - e.g. sending records to an API which has a batch version like Elasticsearch
Vert.x and Reactor.io non-blocking library integration
- Non-blocking I/O work management
- Vert.x’s WebClient and general Vert.x Future support
- Reactor.io Publisher (Mono/Flux) and Java’s CompletableFuture (through Mono#fromFuture)
Fair partition traversal
Zero~ dependencies (Slf4j and Lombok) for the core module
Java 8 compatibility
Throttle control and broker liveliness management
Clean draining shutdown cycle
Manual global pause / resume of all partitions, without unsubscribing from topics (useful for implementing a simplistic circuit breaker)
- Circuit breaker patterns for individual partitions or keys can be done through throwing failure exceptions in the processing function (see PR #291 Explicit terminal and retriable exceptions for further refinement)
- Note: Pausing of a partition is also automatic, whenever back pressure has built up on a given partition
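
The feature list mentions that global pause / resume is useful for implementing a simplistic circuit breaker. A minimal sketch of that idea, using only the JDK, might look like the following. The class name, the failure threshold, and the `pauseAction` / `resumeAction` hooks are hypothetical placeholders for whatever pause and resume calls your setup exposes; this is not the library's API.

```java
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final Runnable pauseAction;  // hypothetical hook: pause consumption
    private final Runnable resumeAction; // hypothetical hook: resume consumption
    private int consecutiveFailures = 0;
    private boolean open = false;

    SimpleCircuitBreaker(int failureThreshold, Runnable pauseAction, Runnable resumeAction) {
        this.failureThreshold = failureThreshold;
        this.pauseAction = pauseAction;
        this.resumeAction = resumeAction;
    }

    /** A successful call clears the failure streak. */
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
    }

    /** Trips the breaker (and pauses once) when the failure streak reaches the threshold. */
    synchronized void recordFailure() {
        consecutiveFailures++;
        if (!open && consecutiveFailures >= failureThreshold) {
            open = true;
            pauseAction.run();
        }
    }

    /** Called when the downstream system is believed healthy again, e.g. by a health check. */
    synchronized void reset() {
        if (open) {
            open = false;
            consecutiveFailures = 0;
            resumeAction.run();
        }
    }

    synchronized boolean isOpen() {
        return open;
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(3,
                () -> System.out.println("pause consumption"),
                () -> System.out.println("resume consumption"));
        breaker.recordFailure();
        breaker.recordFailure();
        breaker.recordFailure(); // third consecutive failure trips the breaker
        breaker.reset();
    }
}
```

In practice the processing function would call `recordSuccess()` or `recordFailure()` around the downstream call, and something outside the processing path would decide when to call `reset()`.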

And more to come!

5. Performance

In the best case, you don’t care about ordering at all. In that case, the degree of concurrency achievable is simply set by the max thread and concurrency settings or, with the Vert.x extension, by the Vert.x Verticle being used, e.g. non-blocking HTTP calls.

For example, instead of having to run 1,000 consumers to process 1,000 messages at the same time, we can process all 1,000 concurrently on a single consumer instance.

More typically though you probably still want the per key ordering guarantees that Kafka provides. For this there is the per key ordering setting. This prevents the library from processing messages with the same key at the same time or out of order.

Massively reduce message processing latency regardless of partition count for spikey workloads where there is good key distribution, e.g. 100,000 “users” all triggering an action at once. As long as the processing layer can handle the load horizontally (e.g. an auto-scaling web service), per message latency will be massively decreased, potentially down to the time for processing a single message, if the integration point can handle the concurrency.

For example, if you have a key set of 10,000 unique keys, and you need to call an http endpoint to process each one, you can use the per key order setting, and in the best case the system will process 10,000 at the same time using the non-blocking Vert.x HTTP client library. The user just has to provide a function to extract from the message the HTTP call parameters and construct the HTTP request object.
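
To make the per key ordering idea concrete, here is a minimal JDK-only sketch in which tasks for the same key run strictly one after another while different keys run concurrently on a shared thread pool. It is not how the library is implemented; `KeyOrderedExecutor` and its methods are hypothetical names, and for brevity the sketch assumes tasks do not throw (a failed task would stall the rest of its key's chain) and never evicts idle keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class KeyOrderedExecutor {
    private final ExecutorService pool;
    // Tail of the task chain for each key: a new task starts only after the previous one ends.
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();

    KeyOrderedExecutor(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Queues the task behind any earlier task submitted with the same key. */
    CompletableFuture<Void> submit(String key, Runnable task) {
        return tails.compute(key, (k, tail) -> {
            CompletableFuture<Void> previous =
                    (tail == null) ? CompletableFuture.<Void>completedFuture(null) : tail;
            return previous.thenRunAsync(task, pool);
        });
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        KeyOrderedExecutor executor = new KeyOrderedExecutor(4);
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            String key = "key-" + (i % 2);
            int n = i;
            pending.add(executor.submit(key, () -> System.out.println(key + " -> message " + n)));
        }
        pending.forEach(CompletableFuture::join);
        executor.shutdown();
    }
}
```

Output lines for different keys may interleave in any order, but the messages for any single key are always printed in submission order.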

5.1. Illustrative Performance Example

(see VolumeTests.java)

The performance comparison results below, although based on real performance measurements, are for illustrative purposes: they show how the performance of the tool relates to instance counts, partition counts and key distribution, and how it compares to the vanilla client. Actual results will vary wildly depending upon the setup being deployed into.

For example, if you have hundreds of thousands of keys in your topic, randomly distributed, even with hundreds of partitions, with only a handful of instances of this wrapper deployed, you will probably see many orders of magnitude performance improvements - massively outperforming dozens of vanilla Kafka consumer clients.

Figure 5. Time taken to process a large number of messages with a Single Parallel Consumer vs a single Kafka Consumer, for different key space sizes. As the number of unique keys in the data set increases, the key ordered Parallel Consumer performance starts to approach that of the unordered Parallel Consumer. The raw Kafka consumer performance remains unaffected by the key distribution.

Figure 6. Consumer group size effect on total processing time vs a single Parallel Consumer. As instances are added to the consumer group, its performance starts to approach that of the single instance Parallel Consumer. Key ordering is faster than partition ordering, with unordered being the fastest.

Figure 7. Consumer group size effect on message latency vs a single Parallel Consumer. As instances are added to the consumer group, its performance starts to approach that of the single instance Parallel Consumer.

As an illustrative example of relative performance, given:

A random processing time between 0 and 5ms
10,000 messages to process
A single partition (simplifies comparison - a topic with 5 partitions is the same as 1 partition with a keyspace of 5)
Default ParallelConsumerOptions
- maxUncommittedMessagesToHandle = 1000
- maxConcurrency = 100
- numberOfThreads = 16

Table 1. Comparative performance of order modes and key spaces

Ordering	Number of keys	Duration	Note
Partition	20 (not relevant)	22.221s	This is the same as a single partition with a single normal serial consumer, as we can see: 2.5ms avg processing time * 10,000 msg / 1000ms = ~25s.
Key	1	26.743s	Same as above
Key	2	13.576s
Key	5	5.916s
Key	10	3.310s
Key	20	2.242s
Key	50	2.204s
Key	100	2.178s
Key	1,000	2.056s
Key	10,000	2.128s	As the key space is the same as the number of messages, this is similar (but restricted by max concurrency settings) to having a single consumer instance and partition per key. 10,000 msgs * avg processing time 2.5ms = ~25s of serial work, spread across the concurrent workers.
Unordered	20 (not relevant)	2.829s	As there is no order restriction, this is similar (but restricted by max concurrency settings) to having a single consumer instance and partition per key. 10,000 msgs * avg processing time 2.5ms = ~25s of serial work, spread across the concurrent workers.
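
As a rough way to reason about the shape of Table 1, the sketch below estimates duration as total serial work divided by effective parallelism. This is an idealised model and an assumption of this example, not something measured or taken from the library: it takes effective parallelism to be the smaller of the key count and the worker thread count, and it ignores polling, scheduling and commit overhead. The class and method names are hypothetical.

```java
final class DurationModel {

    /**
     * Idealised duration in seconds: total serial work divided by the effective
     * parallelism, which is capped by both the number of keys and the worker threads.
     */
    static double estimateSeconds(int messages, double avgMillis, int keys, int threads) {
        int parallelism = Math.max(1, Math.min(keys, threads));
        return messages * avgMillis / parallelism / 1000.0;
    }

    public static void main(String[] args) {
        int[] keyCounts = {1, 2, 5, 10, 20};
        for (int keys : keyCounts) {
            // 10,000 messages, 2.5 ms average processing time, 16 worker threads
            System.out.printf("keys=%d -> %.2fs%n", keys, estimateSeconds(10_000, 2.5, keys, 16));
        }
    }
}
```

For the Key rows with 1 to 20 keys, the measured durations in Table 1 are somewhat higher than these idealised estimates, and follow the same downward trend until the thread count becomes the limit.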

6. Support and Issues

If you encounter any issues, or have any suggestions or feature requests, please create issues in the GitHub issue tracker. Issues will be dealt with on a good faith, best efforts basis, by the small team maintaining this library.

We also encourage participation, so if you have any feature ideas etc, please get in touch, and we will help you work on submitting a PR!

ℹ️	We are very interested to hear about your experiences! And please vote on your favourite issues!

If you have questions, head over to the Confluent Slack community, or raise an issue on GitHub.

7. License

This library is copyright Confluent Inc, and licensed under the Apache License Version 2.0.

8. Usage

8.1. Maven

This project is available in Maven Central, repo1, along with SNAPSHOT builds (starting with 0.5-SNAPSHOT) in repo1’s SNAPSHOTS repo.

Latest version can be seen here.

Where ${project.version} is the version to be used:

group ID: io.confluent.parallelconsumer
artifact ID: parallel-consumer-core
version:

Core Module Dependency

<dependency>
    <groupId>io.confluent.parallelconsumer</groupId>
    <artifactId>parallel-consumer-core</artifactId>
    <version>${project.version}</version>
</dependency>

Reactor Module Dependency

<dependency>
    <groupId>io.confluent.parallelconsumer</groupId>
    <artifactId>parallel-consumer-reactor</artifactId>
    <version>${project.version}</version>
</dependency>

Vert.x Module Dependency

<dependency>
    <groupId>io.confluent.parallelconsumer</groupId>
    <artifactId>parallel-consumer-vertx</artifactId>
    <version>${project.version}</version>
</dependency>

8.2. Common Preparation

Set up the client

Consumer<String, String> kafkaConsumer = getKafkaConsumer(); // (1)
Producer<String, String> kafkaProducer = getKafkaProducer();

var options = ParallelConsumerOptions.<String, String>builder()
        .ordering(KEY) // (2)
        .maxConcurrency(1000) // (3)
        .consumer(kafkaConsumer)
        .producer(kafkaProducer)
        .build();

ParallelStreamProcessor<String, String> eosStreamProcessor =
        ParallelStreamProcessor.createEosStreamProcessor(options);

eosStreamProcessor.subscribe(of(inputTopic)); // (4)

return eosStreamProcessor;

(1) Set up your clients as per normal. A Producer is only required if using the produce flows.
(2) Choose your ordering type, KEY in this case. This ensures maximum concurrency while ensuring messages are processed and committed in KEY order, making sure no offset is committed unless all offsets before it in its partition are also completed.
(3) The maximum number of concurrent processing operations to be performing at any given time. Also, because the library coordinates offsets, enable.auto.commit must be disabled in your consumer.
(4) Subscribe to your topics
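
The `getKafkaConsumer()` helper in the listing above is not shown. As a hedged sketch of the configuration it would need, the following builds consumer properties with auto commit disabled. The property keys and deserializer class names are standard Apache Kafka consumer settings; the class name, bootstrap address and group ID are placeholders chosen for illustration.

```java
import java.util.Properties;

final class ConsumerConfigSketch {

    /** Builds consumer properties suitable for handing offset management to the library. */
    static Properties consumerProperties(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The library coordinates offset commits itself, so auto commit must be off.
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address and group ID, for illustration only.
        Properties props = consumerProperties("localhost:9092", "parallel-consumer-demo");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These properties would then be passed to the `KafkaConsumer` constructor when creating the consumer handed to the options builder.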

ℹ️	Because the library coordinates offsets, `enable.auto.commit` must be disabled.

After this setup, one then has the choice of interfaces:

ParallelStreamProcessor
VertxParallelStreamProcessor
JStreamParallelStreamProcessor
JStreamVertxParallelStreamProcessor

There is another interface: ParallelConsumer which is integrated, however there is currently no immediate implementation. See issue #12, and the ParallelConsumer JavaDoc:

/**
 * Asynchronous / concurrent message consumer for Kafka.
 * <p>
 * Currently there is no direct implementation, only the {@link ParallelStreamProcessor} version (see {@link
 * AbstractParallelEoSStreamProcessor}), but there may be in the future.
 *
 * @param <K> key consume / produce key type
 * @param <V> value consume / produce value type
 * @see AbstractParallelEoSStreamProcessor
 */

8.3. Core

8.3.1. Simple Message Process

This is the only thing you need to do, in order to get massively concurrent processing in your code.

Usage - print message content out to the console in parallel

parallelConsumer.poll(record ->
        log.info("Concurrently processing a record: {}", record)
);

See the core example project, and its test.

8.3.2. Process and Produce a Response Message

This interface allows you to process your message, then publish back to the broker zero, one or more result messages. You can also optionally provide a callback function to be run after the message(s) is(are) successfully published to the broker.

Usage - process a message and produce a result message back to the broker

parallelConsumer.pollAndProduce(context -> {
            var consumerRecord = context.getSingleRecord().getConsumerRecord();
            var result = processBrokerRecord(consumerRecord);
            return new ProducerRecord<>(outputTopic, consumerRecord.key(), result.payload);
        }, consumeProduceResult -> {
            log.debug("Message {} saved to broker at offset {}",
                    consumeProduceResult.getOut(),
                    consumeProduceResult.getMeta().offset());
        }
);

8.3.3. Callbacks vs Streams

You have the option to either use callbacks to be notified of events, or use the Streaming versions of the API, which use the java.util.stream.Stream system:

JStreamParallelStreamProcessor
JStreamVertxParallelStreamProcessor

In future versions, we plan to look at supporting other streaming systems like RxJava via modules.

8.4. Batching

The library also supports sending a batch of records as input to the user’s processing function in parallel. Using this, you can process several records in your function at once.

To use it, set a batch size in the options class.

There are then various access methods for the batch of records - see the PollContext object for more information.

❗	If an exception is thrown while processing the batch, all messages in the batch will be returned to the queue, to be retried with the standard retry system. There is no guarantee that the messages will be retried again in the same batch.

8.4.1. Usage

ParallelStreamProcessor<String, String> parallelConsumer =
        ParallelStreamProcessor.createEosStreamProcessor(ParallelConsumerOptions.<String, String>builder()
                .consumer(getKafkaConsumer())
                .producer(getKafkaProducer())
                .maxConcurrency(100)
                .batchSize(5) // (1)
                .build());
parallelConsumer.poll(context -> {
    // convert the batch into the payload for our processing
    List<String> payload = context.stream()
            .map(this::preparePayload)
            .collect(Collectors.toList());
    // process the entire batch payload at once
    processBatchPayload(payload);
});

(1) Choose your batch size.

8.4.2. Restrictions

If using a batch version of the API, you must choose a batch size in the options class.
If a batch size is chosen, the "normal" APIs cannot be used, and an error will be thrown.
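
The batch failure behaviour described in this section (an exception returns every message in the batch to the queue) can be pictured with a small JDK-only sketch. This is a simplification under stated assumptions, not the library's retry system: the names are hypothetical, and failed items are put back at the front of an in-memory queue, whereas the library makes no guarantee that messages are retried in the same batch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

final class BatchRetrySketch {

    /** Takes up to batchSize items from the head of the queue. */
    static <T> List<T> nextBatch(Deque<T> queue, int batchSize) {
        List<T> batch = new ArrayList<>(batchSize);
        while (batch.size() < batchSize && !queue.isEmpty()) {
            batch.add(queue.pollFirst());
        }
        return batch;
    }

    /** Processes one batch; on failure the whole batch is returned to the queue. */
    static <T> boolean processBatch(Deque<T> queue, int batchSize, Consumer<List<T>> handler) {
        List<T> batch = nextBatch(queue, batchSize);
        try {
            handler.accept(batch);
            return true;
        } catch (RuntimeException e) {
            // Re-queue in reverse so the original order is preserved at the head.
            for (int i = batch.size() - 1; i >= 0; i--) {
                queue.addFirst(batch.get(i));
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(Arrays.asList("a", "b", "c", "d", "e", "f", "g"));
        boolean ok = processBatch(queue, 5, batch -> System.out.println("processing " + batch));
        System.out.println("success=" + ok + ", remaining=" + queue);
    }
}
```

Because the handler sees the batch as a unit, one bad record causes all of its batch-mates to be retried too, which is worth keeping in mind when choosing a batch size.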

8.5. HTTP with the Vert.x Module

Usage - call an HTTP endpoint for each message

var resultStream = parallelConsumer.vertxHttpReqInfoStream(context -> {
    var consumerRecord = context.getSingleConsumerRecord();
    log.info("Concurrently constructing and returning RequestInfo from record: {}", consumerRecord);
    Map<String, String> params = UniMaps.of("recordKey", consumerRecord.key(), "payload", consumerRecord.value());
    return new RequestInfo("localhost", port, "/api", params); // (1)
});

(1) Simply return an object representing the request; the Vert.x HTTP engine will handle the rest, using its non-blocking engine

See the Vert.x example project, and its test.

8.6. Project Reactor

TODO example

8.7. Kafka Streams Concurrent Processing

Use your Streams app to process your data first, then send anything needed to be processed concurrently to an output topic, to be consumed by the parallel consumer.

Figure 8. Example usage with Kafka Streams

Preprocess in Kafka Streams, then process concurrently

void run() {
    preprocess(); // (1)
    concurrentProcess(); // (2)
}

void preprocess() {
    StreamsBuilder builder = new StreamsBuilder();
    builder.<String, String>stream(inputTopic)
            .mapValues((key, value) -> {
                log.info("Streams preprocessing key: {} value: {}", key, value);
                return String.valueOf(value.length());
            })
            .to(outputTopicName);

    startStreams(builder.build());
}

void startStreams(Topology topology) {
    streams = new KafkaStreams(topology, getStreamsProperties());
    streams.start();
}

void concurrentProcess() {
    setupParallelConsumer();

    parallelConsumer.poll(record -> {
        log.info("Concurrently processing a record: {}", record);
        messageCount.getAndIncrement();
    });
}

(1) Set up your Kafka Streams stage as per normal, performing any type of preprocessing in Kafka Streams
(2) For the slow consumer part of your Topology, drop down into the parallel consumer, and use massive concurrency

See the Kafka Streams example project, and its test.

8.8. Confluent Cloud

Provision your fully managed Kafka cluster in Confluent Cloud
1. Sign up for Confluent Cloud, a fully-managed Apache Kafka service.
2. After you log in to Confluent Cloud, click on Add cloud environment and name the environment learn-kafka. Using a new environment keeps your learning resources separate from your other Confluent Cloud resources.
3. Click on LEARN and follow the instructions to launch a Kafka cluster and to enable Schema Registry.
Access the client configuration settings
1. From the Confluent Cloud Console, navigate to your Kafka cluster. From the Clients view, get the connection information customized to your cluster (select Java).
2. Create new credentials for your Kafka cluster, and then Confluent Cloud will show a configuration block with your new credentials automatically populated (make sure show API keys is checked).
3. Use these settings presented to configure your clients.
Use these clients for steps outlined in the Common Preparation section.

9. Upgrading

9.1. From 0.4 to 0.5

This version has a breaking change in the API - instead of passing in ConsumerRecord instances, it passes in a PollContext object which has extra information and utility methods. See the PollContext class for more information.

10. Ordering Guarantees

The user can choose either ordered or unordered message processing.

In both ordered and unordered processing, the system will only commit offsets for messages which have been successfully processed.
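
One way to picture this commit rule is as a search for the first gap in the completed offsets. The sketch below is a simplified illustration using only the JDK, not the library's implementation. It follows Kafka's convention that the committed offset is the offset of the next record to read; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class CommittableOffset {

    /**
     * Returns the offset that is safe to commit: the first offset at or after the
     * currently committed one whose record has not yet completed successfully.
     */
    static long nextOffsetToCommit(long committedOffset, Set<Long> completed) {
        long next = committedOffset;
        while (completed.contains(next)) {
            next++;
        }
        return next;
    }

    public static void main(String[] args) {
        // Offsets 5, 6 and 8 have completed, but 7 is still in flight.
        Set<Long> completed = new HashSet<>(Arrays.asList(5L, 6L, 8L));
        System.out.println(nextOffsetToCommit(5L, completed));
    }
}
```

In the example, offset 8 has finished out of order but cannot be covered by the commit until offset 7 also completes, so the committable position stops at 7.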

🔥	`Unordered` processing could cause problems for third party integration where ordering by key is required.

🔥	Beware of third party systems which are not idempotent, or are key order sensitive.

❗	The below diagrams represent a single iteration of the system and a very small number of input partitions and messages.

10.1. Vanilla Kafka Consumer Operation

Given this input topic with three partitions and a series of messages:

Figure 9. Input topic

The normal Kafka client operates in the following manner. Note that offset commits are typically not performed after processing a single message, but this is illustrated in this manner for comparison with the single pass concurrent methods below. Usually many messages are committed in a single go, which is much more efficient, but for our illustrative purposes this is not really relevant, as we are demonstrating sequential vs concurrent processing of messages.

Figure 10. Normal execution of the raw Kafka client

10.2. Unordered

Unordered processing is where there is no restriction on the order of multiple messages processed per partition, allowing for highest level of concurrency.

This is the fastest option.

Figure 11. Unordered concurrent processing of messages

10.3. Ordered by Partition

At most only one message from any given input partition will be in flight at any given time. This means that concurrent processing is restricted to the number of input partitions.

The advantage of ordered processing mode is that for an assignment of 1000 partitions to a single consumer, you do not need to run 1000 consumer instances or threads to process the partitions in parallel.

Note that for a given partition, a slow processing message will prevent messages behind it from being processed. However, messages in other partitions assigned to the consumer will continue processing.

This option is most like normal operation, except if the consumer is assigned more than one partition, it is free to process all partitions in parallel.

Figure 12. Partition ordered concurrent processing of messages

10.4. Ordered by Key

Most similar to ordered by partition, this mode ensures process ordering by key (per partition).

The advantage of this mode is that although a given input topic may not have many partitions, it may have a large number of unique keys. Each of these key → message sets can be processed concurrently, bringing concurrent processing to a per key level, without having to increase the number of input partitions, whilst keeping strong ordering by key.

As usual, the offset tracking will be correct, regardless of the ordering of unique keys on the partition or adjacency to the committed offset, such that after failure or rebalance, the system will not replay messages already marked as successful.

This option provides the performance of maximum concurrency, while maintaining message processing order per key, which is sufficient for many applications.

Figure 13. Key ordering concurrent processing of messages
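The idea of per key ordering can be illustrated with plain JDK classes. The following is only a minimal sketch of the concept, not how the library implements it: KeyOrderedLanes and its submit method are hypothetical names, and the sketch ignores failures, retries and offset tracking. Work for the same key is chained so it runs sequentially, while work for different keys is free to run concurrently on the shared pool.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;

// Illustrative only: KeyOrderedLanes is a hypothetical class, not part of the library.
class KeyOrderedLanes {
    // Tail of the task chain for each key; new work for a key is appended to its tail.
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();
    private final ExecutorService pool;

    KeyOrderedLanes(ExecutorService pool) {
        this.pool = pool;
    }

    // Work for the same key runs one item after another, in submission order.
    // Work for different keys may run concurrently on the pool.
    CompletableFuture<Void> submit(String key, Runnable work) {
        return tails.compute(key, (k, tail) ->
                tail == null
                        ? CompletableFuture.runAsync(work, pool)
                        : tail.thenRunAsync(work, pool));
    }
}
```

Using the partition number as the key in the same sketch would give the behaviour of the ordered by partition mode, where concurrency is bounded by the number of partitions rather than the number of unique keys.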

10.5. Retries and Ordering

Even during retries, offsets will always be committed only after successful processing, and in order.

11. Retries

If processing of a record fails, the record will be placed back into its queue and retried with a configurable delay (see the ParallelConsumerOptions class). Ordering guarantees will always be adhered to, regardless of failure.

A failure is denoted by any exception being thrown from the user’s processing function. The system catches these exceptions, logs them and places the record back in the queue for processing later. All types of Exceptions thrown are considered retriable. To not retry a record, do not throw an exception from your processing function.

If for some reason you want to proactively fail a record, without relying on some other system throwing an exception which you don’t catch - simply throw an exception of your own design, which the system will treat the same way.
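As a rough sketch of the two behaviours described above, the following stand-in for the body of a user function throws its own exception to request a retry, and swallows an exception when retrying could never succeed. All names here (DownstreamUnavailableException, RecordHandler, Outcome) are hypothetical and exist only for illustration; the return value is there to make the outcome visible, whereas a real user function would simply return normally or throw.

```java
// Hypothetical exception type. Any exception thrown from the user function causes a retry.
class DownstreamUnavailableException extends RuntimeException {
    DownstreamUnavailableException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for the body of a user function passed to poll(...).
class RecordHandler {
    enum Outcome { PROCESSED, SKIPPED }

    static Outcome handle(String value, boolean downstreamHealthy) {
        if (!downstreamHealthy) {
            // Proactive failure: throwing sends the record back to its queue for a later retry.
            throw new DownstreamUnavailableException("Downstream not ready, retry later: " + value);
        }
        try {
            Integer.parseInt(value); // placeholder for the real processing
            return Outcome.PROCESSED;
        } catch (NumberFormatException e) {
            // Retrying cannot fix an invalid record, so swallow the exception.
            // Returning normally means the record is treated as successfully processed.
            return Outcome.SKIPPED;
        }
    }
}
```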

To configure the retry delay, see ParallelConsumerOptions#defaultRetryDelay.

At the moment there is no terminal error support, so messages will continue to be retried forever as long as an exception continues to be thrown from the user function (see Skipping Records). This will not hold up the queues in KEY or UNORDERED modes, but in PARTITION mode it will block progress. Offsets will also continue to be committed (see Commit Mode and Offset Map).

11.1. Retry Delay Function

As part of the enhanced retry epic, the ability to dynamically determine the retry delay was added. This can be used to customise retry delay for a record, such as exponential back off or have different delays for different types of records, or have the delay determined by the status of a system etc.

You can access the retry count of a record through its wrapped WorkContainer class, which is the input variable to the retry delay function.

Example retry delay function implementing exponential backoff

// a multiplier greater than 1 makes the delay grow with every failed attempt
final double multiplier = 2.0;
final int baseDelaySecond = 1;

ParallelConsumerOptions.<String, String>builder()
        .retryDelayProvider(recordContext -> {
            int numberOfFailedAttempts = recordContext.getNumberOfFailedAttempts();
            // delay = base * multiplier^attempts, converted from seconds to milliseconds
            long delayMillis = (long) (baseDelaySecond * Math.pow(multiplier, numberOfFailedAttempts) * 1000);
            return Duration.ofMillis(delayMillis);
        });
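An uncapped exponential delay can grow very large for a record that keeps failing. One possible variation, sketched here with a hypothetical helper class that is not part of the library, doubles the delay per failed attempt but never exceeds a maximum. It assumes non-negative durations and treats the first failed attempt as attempt 1.

```java
import java.time.Duration;

// Hypothetical helper, not part of the library.
class Backoff {
    // Doubles the base delay for each failed attempt after the first, up to max.
    static Duration delayFor(int failedAttempts, Duration base, Duration max) {
        long capMillis = max.toMillis();
        long millis = Math.min(base.toMillis(), capMillis);
        for (int attempt = 1; attempt < failedAttempts && millis < capMillis; attempt++) {
            // Compare against half the cap first so the doubling cannot overflow.
            millis = (millis > capMillis / 2) ? capMillis : millis * 2;
        }
        return Duration.ofMillis(millis);
    }
}
```

Such a helper could be called from the retry delay function, for example `recordContext -> Backoff.delayFor(recordContext.getNumberOfFailedAttempts(), Duration.ofSeconds(1), Duration.ofSeconds(30))`.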

11.2. Skipping Records

If for whatever reason you want to skip a record, simply do not throw an exception, or catch any exception being thrown, log and swallow it and return from the user function normally. The system will treat this as a record processing success, mark the record as completed and move on as though it was a normal operation.

A user may choose to skip a record for example, if it has been retried too many times or if the record is invalid or doesn’t need processing.

Implementing a max retries feature as a part of the system is planned.

Example of skipping a record after a maximum number of retries is reached

final int maxRetries = 10;
// number of retries seen so far for each record
final Map<ConsumerRecord<String, String>, Long> retriesCount = new ConcurrentHashMap<>();

pc.poll(context -> {
    var consumerRecord = context.getSingleRecord().getConsumerRecord();
    // 0 on the first attempt, incremented every time the record is retried
    Long retryCount = retriesCount.compute(consumerRecord, (key, oldValue) -> oldValue == null ? 0L : oldValue + 1);
    if (retryCount < maxRetries) {
        // an exception thrown here leaves the count in the map and triggers a retry
        processRecord(consumerRecord);
        // no exception, so completed - remove from map
        retriesCount.remove(consumerRecord);
    } else {
        log.warn("Retry count {} exceeded max of {} for record {}", retryCount, maxRetries, consumerRecord);
        // giving up, remove from map
        retriesCount.remove(consumerRecord);
    }
});

11.3. Circuit Breaker Pattern

Although the system doesn’t have an explicit circuit breaker pattern feature, one can be created by combining the custom retry delay function and proactive failure. For example, the retry delay can be calculated based upon the status of an external system - i.e. if the external system is currently out of action, use a higher retry delay. Then in the processing function, again check the status of the external system first, and if it’s still offline, throw an exception proactively without attempting to process the message. This will put the message back in the queue.

Example of circuit breaker implementation

final Map<String, Boolean> upMap = new ConcurrentHashMap<>();

pc.poll(context -> {
    var consumerRecord = context.getSingleRecord().getConsumerRecord();
    String serverId = extractServerId(consumerRecord);
    boolean up = upMap.computeIfAbsent(serverId, ignore -> true);

    if (!up) {
        // the server was marked as down earlier, so check whether it has recovered
        up = updateStatusOfSever(serverId);
    }

    if (up) {
        try {
            processRecord(consumerRecord);
            // no exception, so set server status UP
            upMap.put(serverId, true);
        } catch (CircuitBreakingException e) {
            log.warn("Server {} is circuitBroken, will retry message when server is up. Record: {}", serverId, consumerRecord);
            upMap.put(serverId, false);
            // rethrow, otherwise the record would be treated as successfully processed
            throw new RuntimeException(msg("Server {} is circuitBroken, will retry record later {}", serverId, consumerRecord), e);
        }
    } else {
        throw new RuntimeException(msg("Server {} currently down, will retry record later {}", serverId, consumerRecord));
    }
});

11.4. Head of Line Blocking

In order to have a failing record not block progress of a partition, one of the ordering modes other than PARTITION must be used, so that the system is allowed to process other messages - those for other keys in KEY ordering, or any message in UNORDERED processing. This is because in PARTITION ordering mode, records are always processed in partition order, so a failing record blocks every record behind it in that partition.

11.5. Future Work

Improvements to this system are planned, see the following issues:

12. Result Models

Void

Processing is complete simply when your provided function finishes, and the offsets are committed.

Streaming User Results

When your function is actually run, a result object will be streamed back to your client code, with information about the operation completion.

Streaming Message Publishing Results

After your operation completes, you can also choose to publish a result message back to Kafka. The message publishing metadata can be streamed back to your client code.

13. Commit Mode

The system gives you three choices for how to do offset commits. The simplest of the three are the two Consumer commit modes: synchronous and asynchronous. The transactional mode is explained in the next section.

Asynchronous mode is faster, as it doesn’t block the control loop.

Synchronous mode blocks the processing loop until a successful commit response is received. Asynchronous mode does not block, but is still capped by the max processing settings in the ParallelConsumerOptions class.

If you’re used to using the auto commit mode in the normal Kafka consumer, you can think of the Asynchronous mode being similar to this. We suggest starting with this mode, and it is the default.

13.1. Apache Kafka EoS Transaction Model

There is also the option to use Kafka’s Exactly Once Semantics (EoS) system. This causes all messages produced as a result of processing a message to be committed within a transaction, along with their source offset. This means that even under failure, the results will exist exactly once in the Kafka output topic. If, as a part of your processing, you create side effects in other systems, the usual idempotency requirements apply, as those side effects fall outside the boundaries of Kafka’s EoS.

NOTE: As with the synchronous processing mode, this will also block the processing loop until a successful transaction completes.

🔥	This cannot be true for any externally integrated third party system, unless that system is idempotent.

For implementation details, see the Transactional System Architecture section.

14. Using with Kafka Streams

Kafka Streams (KS) doesn’t yet (KIP-311, KIP-408) have parallel processing of messages. However, any given preprocessing can be done in KS, preparing the messages. One can then use this library to consume from an input topic, produced by KS to process the messages in parallel.

For a code example, see the Kafka Streams Concurrent Processing section.

Figure 14. Example usage with Kafka Streams

15. Roadmap

For released changes, see the CHANGELOG. For features in development, have a look at the GitHub issues.

15.1. Medium Term - What’s up next ⏲

Distributed tracing integration
Distributed rate limiting
Metrics
More customisable handling of HTTP interactions (confluentinc#65)

15.2. Long Term - The future ☁️

Automatic fanout (automatic selection of concurrency level based on downstream back pressure) (draft PR)
Dead Letter Queue (DLQ) handling
Call backs only once offset has been committed

16. Usage Requirements

Client side
- JDK 8
- SLF4J
- Apache Kafka (AK) Client libraries 2.5
- Supports all features of the AK client (e.g. security setups, schema registry etc)
- For use with Streams, see Using with Kafka Streams section
- For use with Connect:
  - Source: simply consume from the topic that your Connect plugin is publishing to
  - Sink: use the poll and producer style API and publish the records to the topic that the connector is sinking from
Server side
- Should work with any cluster that the linked AK client library works with
  - If using EoS/Transactions, needs a cluster setup that supports EoS/transactions

17. Development Information

17.1. Requirements

Uses Lombok, if you’re using IntelliJ Idea, get the plugin.
Integration tests require a running locally accessible Docker host.
Has a Maven profile setup for IntelliJ Idea, but not Eclipse for example.

17.2. Notes

The unit test code is set to run at a very high frequency, which can make it difficult to read debug logs (or impossible). If you want to debug the code or view the main logs, consider changing the below:

ParallelEoSStreamProcessorTestBase

ParallelEoSStreamProcessorTestBase#DEFAULT_BROKER_POLL_FREQUENCY_MS
ParallelEoSStreamProcessorTestBase#DEFAULT_COMMIT_INTERVAL_MAX_MS

17.3. Recommended IDEA Plugins

AsciiDoc
CheckStyle
CodeGlance
EditorConfig
Rainbow Brackets
SonarLint
Lombok

17.4. Readme

The README uses a special custom maven processor plugin to import live code blocks into the root readme, so that GitHub can show the real code as includes in the README. This is because GitHub doesn’t properly support the include directive.

The source of truth readme is in ./src/docs/README_TEMPLATE.adoc.

17.5. Maven targets

Compile and run all tests

mvn verify

Run tests excluding the integration tests

mvn test

Run all tests

mvn verify

Run any goal skipping tests (replace <goalName> e.g. install)

mvn <goalName> -DskipTests

See what profiles are active

mvn help:active-profiles

See what plugins or dependencies are available to be updated

mvn versions:display-plugin-updates versions:display-property-updates versions:display-dependency-updates

Run a single unit test

mvn -Dtest=TestCircle test

Run a specific integration test method in a submodule project, skipping unit tests

mvn -Dit.test=TransactionAndCommitModeTest#testLowMaxPoll -DskipUTs=true verify -DfailIfNoTests=false --projects parallel-consumer-core

Run git bisect to find a bad commit, edit the Maven command in bisect.sh and run

git bisect start good bad
git bisect run ./bisect.sh

Note: mvn compile - Due to a bug in Maven’s handling of test-jar dependencies - running mvn compile fails, use mvn test-compile instead. See issue #162 and this Stack Overflow question.

17.6. Testing

The project has good automated test coverage of all features, including integration tests that run against a real Kafka broker and database. If you want to run the tests yourself, clone the repository and run the command: mvn test. The tests require an active Docker server on localhost.

17.6.1. Integration Testing with TestContainers

We use the excellent Testcontainers library for integration testing with JUnit.

To speed up test execution, you can enable container reuse across test runs by setting the following in your ~/.testcontainers.properties file:

testcontainers.reuse.enable=true

This will leave the container running after the JUnit test is complete for reuse by subsequent runs.

ℹ️
The container will only be left running if it is not explicitly stopped by the JUnit rule. For this reason, we use a variant of the singleton container pattern instead of the JUnit rule.

Testcontainers detects if a container is reusable by hashing the container creation parameters from the JUnit test. If an existing container is not reusable, a new container will be created, but the old container will not be removed.

Target | Description
--- | ---
testcontainers-list | List all containers labeled as testcontainers
testcontainers-clean | Remove all containers labeled as testcontainers

Stop and remove all containers labeled with org.testcontainers=true

docker container ls --filter 'label=org.testcontainers=true' --format '{{.ID}}' \
| $(XARGS) docker container rm --force

List all containers labeled with org.testcontainers=true

docker container ls --filter 'label=org.testcontainers=true'

ℹ️
testcontainers-clean removes all docker containers on your system with the org.testcontainers=true label (including the most recent container which may be reusable).

See this testcontainers PR for details on the reusable containers feature.

18. Implementation Details

18.1. Core Architecture

Concurrency is controlled by the size of the thread pool (worker pool in the diagram). Work is performed in a blocking manner, by the user’s submitted lambda functions.

These are the main sub systems:

controller thread
broker poller thread
work pool thread
work management
offset map manipulation

Each thread collaborates with the others through thread safe Java collections.

Figure 15. Core Architecture. Threads are represented by letters and colours, with their steps in sequential numbers.

18.2. Vert.x Architecture

The Vert.x module is an optional extension to the core module. As depicted in the diagram, the architecture extends the core architecture.

Instead of the work thread pool count being the degree of concurrency, it is controlled by a max parallel requests setting, and work is performed asynchronously on the Vert.x engine by a core count aligned Vert.x managed thread pool using Vert.x asynchronous IO plugins (verticles).

Figure 16. Vert.x Architecture

18.3. Transactional System Architecture

18.4. Offset Map

Unlike a traditional queue, messages are not deleted on an acknowledgement. However, offsets are tracked per message, per consumer group - there is no message replay for successful messages, even over clean restarts.

Across a system failure, only completed messages not stored as such in the last offset payload commit will be replayed. This is not an exactly once guarantee, as message replay cannot be prevented across failure.

🔥	Note that Kafka’s Exactly Once Semantics (EoS) (transactional processing) also does not prevent duplicate message replay - it presents effectively once result messages in Kafka topics. Messages may still be replayed when using EoS. This is an important consideration when using it, especially when integrating with third party systems, which is a very common pattern for utilising this project.

As mentioned previously, offsets are always committed in the correct order and only once all previous messages have been successfully processed; regardless of ordering mode selected. We call this the "highest committable offset".

However, because messages can be processed out of order, messages beyond the highest committable offset must also be tracked for success and not replayed upon restart or failure. To achieve this the system goes a step further than normal Kafka offset commits.

When messages beyond the highest committable offset are successfully processed:

- they are stored as such in an internal memory map.
- when the system next commits offsets:
  - if there are any messages beyond the highest committable offset which have been marked as succeeded, the offset map is serialised, encoded into a base 64 string, and added to the commit message metadata.
- upon restore, if needed, the system deserializes this offset map and loads it back into memory.
- when each message is polled into the system:
  1. it checks if it has already been completed
  2. if so, it is skipped.

This ensures that no message is reprocessed if it’s been previously completed.
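The bookkeeping described above can be sketched with plain JDK collections. This is a simplified model under stated assumptions, not the library's internal data structures: PartitionProgress and its methods are hypothetical names, a single partition is modelled, and the value to commit follows the usual Kafka convention of being the next offset to consume.

```java
import java.util.BitSet;
import java.util.TreeSet;

// Hypothetical model of one partition's progress; not the library's internal classes.
class PartitionProgress {
    // Lowest offset not yet known to be complete; everything below it can be committed.
    private long nextExpected;
    // Offsets that succeeded out of order, beyond the committable point.
    private final TreeSet<Long> completedBeyond = new TreeSet<>();

    PartitionProgress(long committedOffset) {
        this.nextExpected = committedOffset;
    }

    void markSucceeded(long offset) {
        if (offset < nextExpected) {
            return; // already covered by the committable offset
        }
        completedBeyond.add(offset);
        // Advance over any run of completed offsets that is now contiguous.
        while (completedBeyond.remove(nextExpected)) {
            nextExpected++;
        }
    }

    // The offset that could be committed now (the next offset to consume).
    long nextOffsetToCommit() {
        return nextExpected;
    }

    // Bit i is set when offset (nextOffsetToCommit() + i) has already succeeded.
    BitSet completedBeyondAsBits() {
        BitSet bits = new BitSet();
        for (long offset : completedBeyond) {
            bits.set((int) (offset - nextExpected));
        }
        return bits;
    }

    // True when a polled record was already completed and must not be processed again.
    boolean shouldSkip(long offset) {
        return offset < nextExpected || completedBeyond.contains(offset);
    }
}
```

For example, after offsets 0, 1, 3 and 5 succeed, the committable offset in this model is 2, and offsets 3 and 5 are remembered so that they would be skipped if polled again.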

❗	Successful messages beyond the highest committable offset are still recorded as such in a specially constructed metadata payload stored alongside the Kafka committed offset. These messages are not replayed upon restore/restart.

The offset map is compressed in parallel using two different compression techniques - run length encoding and bitmap encoding. The sizes of the compressed maps are then compared, and the smallest chosen for serialization. If both serialised formats are significantly large, they are then both compressed using zstd compression, and if that results in a smaller serialization then the compressed form is used instead.
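To give a feel for why the two encodings suit different maps, here is a minimal sketch that encodes the same completion flags both ways and keeps the smaller result. The class, the method names and the byte layouts are assumptions made for illustration and do not match the library's actual formats; the zstd step is left out because it needs a third party library. Note also that a real format would have to record the map length, which BitSet#toByteArray does not preserve.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Base64;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch; names and byte layouts are illustrative, not the library's formats.
class OffsetMapEncoding {

    // One bit per offset: bit i is set when offset (base + i) has completed.
    static byte[] bitSetEncode(boolean[] completed) {
        BitSet bits = new BitSet(completed.length);
        for (int i = 0; i < completed.length; i++) {
            if (completed[i]) {
                bits.set(i);
            }
        }
        return bits.toByteArray();
    }

    // Alternating run lengths, always starting with a (possibly empty) run of incomplete offsets.
    static byte[] runLengthEncode(boolean[] completed) {
        List<Integer> runs = new ArrayList<>();
        boolean current = false;
        int length = 0;
        for (boolean c : completed) {
            if (c == current) {
                length++;
            } else {
                runs.add(length);
                current = c;
                length = 1;
            }
        }
        runs.add(length);
        ByteBuffer buffer = ByteBuffer.allocate(runs.size() * Integer.BYTES);
        for (int run : runs) {
            buffer.putInt(run);
        }
        return buffer.array();
    }

    // Returns "<format>:<base 64 payload>" for whichever encoding is smaller.
    static String smallest(boolean[] completed) {
        byte[] bitSet = bitSetEncode(completed);
        byte[] runLength = runLengthEncode(completed);
        boolean useBitSet = bitSet.length <= runLength.length;
        String tag = useBitSet ? "bitset" : "runlength";
        return tag + ":" + Base64.getEncoder().encodeToString(useBitSet ? bitSet : runLength);
    }
}
```

In this sketch, a map of 1000 offsets with a single incomplete entry takes 125 bytes as a bit set but only 16 bytes as run lengths, whereas a short map that alternates between complete and incomplete is far smaller as a bit set.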

18.4.1. Storage Notes

Runtime data model creates list of incomplete offsets
Continuously builds a full complete / not complete bit map from the base offset to be committed
Dynamically switching storage
- encodes into a BitSet, and a RunLength, then compresses both using zstd, then uses the smallest and tags as such in the encoded String
- Which is smallest can depend on the size and information density of the offset map
  - Smaller maps fit better into uncompressed BitSets ~(30 entry map bitset: compressed: 13 Bytes, uncompressed: 4 Bytes)
  - Larger maps with continuous sections usually better in compressed RunLength
  - Completely random offset maps, compressed and uncompressed BitSet is roughly the same (2000 entries, uncompressed bitset: 250, compressed: 259, compressed bytes array: 477)
  - Very large maps (20,000 entries), a compressed BitSet seems to be significantly smaller again if random.
Gets stored along with base offset for each partition, in the offset commitsync metadata string
The offset commit metadata has a hardcoded limit of 4096 bytes per partition (@see kafka.coordinator.group.OffsetConfig#DefaultMaxMetadataSize = 4096)
- Because of this, if our map doesn’t fit into this, we have to drop it and not use it, losing the shorter replay benefits. However with runlength encoding and typical offset patterns this should be quite rare.
  - Work is being done on continuous and predictive tracking of space requirements. This will optionally introduce local back pressure to stop the system from continuing past the point where it could not proceed without dropping the encoded map information - see Exact continuous offset encoding for precise offset payload size back pressure.
- Not being able to fit the map into the metadata, depends on message acknowledgement patterns in the use case and the numbers of messages involved. Also the information density in the map (i.e. a single not yet completed message in 4000 completed ones will be a tiny map and will fit very large amounts of messages)
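A back-of-the-envelope check of the size limit can be written as follows. This is only a sketch: MetadataBudget is a hypothetical class, and it assumes that the whole 4096 byte budget is available to a padded base 64 string of single byte characters, which ignores any extra tagging or framing the real payload may carry.

```java
// Hypothetical helper for estimating whether an encoded offset map fits in commit metadata.
class MetadataBudget {
    // The per partition metadata limit quoted in the notes above.
    static final int MAX_METADATA_BYTES = 4096;

    // Padded base 64 turns every started group of 3 input bytes into 4 output characters.
    static int base64Length(int rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    // True when the base 64 form of the encoded map stays within the limit.
    static boolean fits(byte[] encodedMap) {
        return base64Length(encodedMap.length) <= MAX_METADATA_BYTES;
    }
}
```

Under these assumptions the largest raw payload that fits is 3072 bytes, since 3072 bytes become exactly 4096 base 64 characters.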

19. Attribution

Apache®, Apache Kafka, and Kafka® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

20. Change Log

A high level summary of noteworthy changes in each version.

NOTE: Dependency version bumps are not listed here.

Adds a very simple Dependency Injection system modeled on Dagger (#398)
Various refactorings e.g. new ProducerWrap

20.3. v0.5.2.2

20.3.1. Fixes

Fixes dependency scope for Mockito from compile to test (#376)

20.4. v0.5.2.1

20.4.1. Fixes

Fixes regression issue with order of state truncation vs commit (#362)

20.5. v0.5.2.0

20.5.1. Fixes and Improvements

fixes #184: Fix multi topic subscription with KEY order by adding topic to shard key (#315)
fixes #329: Committing around transaction markers causes encoder to crash (#328)
build: Upgrade Truth-Generator to 0.1.1 for user Subject discovery (#332)

20.5.2. Build

build: Allow snapshots locally, fail in CI (#331)
build: OSS Index scan change to warn only and exclude Guava CVE-2020-8908 as it’s WONT_FIX (#330)

20.5.3. Dependencies

build(deps): bump reactor-core from 3.4.19 to 3.4.21 (#344)
build(deps): dependabot bump Mockito, Surefire, Reactor, AssertJ, Release (#342)
build(deps): dependabot bump TestContainers, Vert.x, Enforcer, Versions, JUnit, Postgress (#336)

20.5.4. Linked issues

Message with null key lead to continuous failure when using KEY ordering #318
Subscribing to two or more topics with KEY ordering, results in messages of the same Key never being processed #184
Cannot have negative length BitSet error - committing transaction adjacent offsets #329

20.6. v0.5.1.0

20.6.1. Features

#193: Pause / Resume PC (circuit breaker) without unsubscribing from topics

20.6.2. Fixes and Improvements

#225: Build and runtime support for Java 16+ (#289)
#306: Change Truth-Generator dependency from compile to test
#298: Improve PollAndProduce performance by first producing all records, and then waiting for the produce results. Previously, this was done for each ProduceRecord individually.

20.7. v0.5.0.0

20.7.1. Features

feature: Poll Context object for API (#223)
- PollContext API - provides central access to result set with various convenience methods as well as metadata about records, such as failure count
major: Batching feature and Event system improvements
- Batching - all API methods now support batching. See the Options class set batch size for more information.

20.7.2. Fixes and Improvements

Event system - better CPU usage in control thread
Concurrency stability improvements
Update dependencies
#247: Adopt Truth-Generator (#249)
- Adopt Truth Generator for automatic generation of Google Truth Subjects
Large rewrite of internal architecture for improved maintenance and simplicity which fixed some corner case issues
- refactor: Rename PartitionMonitor to PartitionStateManager (#269)
- refactor: Queue unification (#219)
- refactor: Partition state tracking instead of search (#218)
- refactor: Processing Shard object
fix: Concurrency and State improvements (#190)

20.7.3. Build

build: Lock TruthGenerator to 0.1 (#272)
build: Deploy SNAPSHOTS to maven central snapshots repo (#265)
build: Update Kafka to 3.1.0 (#229)
build: Crank up Enforcer rules and turn on ossindex audit
build: Fix logback dependency back to stable
build: Upgrade TestContainer and CP

20.8. v0.4.0.1

20.8.1. Improvements

Add option to specify timeout for how long to wait offset commits in periodic-consumer-sync commit-mode
Add option to specify timeout for how long to wait for blocking Producer#send

20.8.2. Docs

docs: Confluent Cloud configuration links
docs: Add Confluent’s product page for PC to README
docs: Add head of line blocking to README

20.9. v0.4.0.0

20.9.1. Features

Project Reactor non-blocking threading adapter module
Generic Vert.x Future support - i.e. FileSystem, db etc…

20.9.2. Fixes and Improvements

Vert.x concurrency control via WebClient host limits fixed - see #maxCurrency
Vert.x API cleanup of invalid usage
Out of bounds for empty collections
Use ConcurrentSkipListMap instead of TreeMap to prevent concurrency issues under high pressure
log: Show record topic in slow-work warning message

20.10. v0.3.2.0

20.10.1. Fixes and Improvements

Major: Upgrade to Apache Kafka 2.8 (still compatible with 2.6 and 2.7 though)
Adds support for managed executor service (Java EE Compatibility feature)
#65 support for custom retry delay providers

20.11. v0.3.1.0

20.11.1. Fixes and Improvements

Major refactor to code base - primarily the two large God classes
- Partition state now tracked separately
- Code moved into packages
Busy spin in some cases fixed (lower CPU usage)
Reduce use of static data for test assertions - remaining identified for later removal
Various fixes for parallel testing stability

Tests now run in parallel
License fixing / updating and code formatting
License format runs properly now when local, check on CI
Fix running on Windows and Linux
Fix JAVA_HOME issues

Details:

tests: Enable the fail fast feature now that it’s merged upstream
tests: Turn on parallel test runs
format: Format license, fix placement
format: Apply Idea formatting (fix license layout)
format: Update mycila license-plugin
test: Disable redundant vert.x test - too complicated to fix for little gain
test: Fix thread counting test by closing PC @After
test: Test bug due to static state overrides when run as a suite
format: Apply license format and run every All Idea build
format: Organise imports
fix: Apply license format when in dev laptops - CI only checks
fix: javadoc command for various OS and envs when JAVA_HOME missing
fix: By default, correctly run time JVM as jvm.location

20.13. v0.3.0.2

20.13.1. Fixes and Improvements

ci: Add CODEOWNER
fix: #101 Validate GroupId is configured on managed consumer
Use 8B1DA6120C2BF624 GPG Key For Signing
ci: Bump jdk8 version path
fix: #97 Vert.x thread and connection pools setup incorrect
Disable Travis and Codecov
ci: Apache Kafka and JDK build matrix
fix: Set Serdes for MockProducer for AK 2.7 partition fix KAFKA-10503 to fix new NPE
Only log slow message warnings periodically, once per sweep
Upgrade Kafka container version to 6.0.2
Clean up stalled message warning logs
Reduce log-level if no results are returned from user-function (warn → debug)
Enable java 8 Github
Fixes #87 - Upgrade UniJ version for UnsupportedClassVersion error
Bump TestContainers to stable release to specifically fix #3574
Clarify offset management capabilities

20.14. v0.3.0.1

fixes #62: Off by one error when restoring offsets when no offsets are encoded in metadata
fix: Actually skip work that is found as stale

20.15. v0.3.0.0

20.15.1. Features

Queueing and pressure system now self tuning, performance over default old tuning values (softMaxNumberMessagesBeyondBaseCommitOffset and maxMessagesToQueue) has doubled.
- These options have been removed from the system.
Offset payload encoding back pressure system
- If the payload begins to take more than a certain threshold amount of the maximum available, no more messages will be brought in for processing, until the space needed begins to reduce back below the threshold. This is to try to prevent the situation where the payload is too large to fit at all, and must be dropped entirely.
- See Proper offset encoding back pressure system so that offset payloads can’t ever be too large #47
- Messages that have failed to process, will always be allowed to retry, in order to reduce this pressure.

20.15.2. Improvements

Default ordering mode is now KEY ordering (was UNORDERED).
- This is a better default as it’s the safest mode that is still high performing. It maintains the partition ordering characteristic that all keys are processed in log order, yet for most use cases will be close to as fast as UNORDERED when the key space is large enough.
Support BitSet encoding lengths longer than Short.MAX_VALUE #37 - adds new serialisation formats that supports wider range of offsets - (32,767 vs 2,147,483,647) for both BitSet and run-length encoding.
Commit modes have been renamed to make it clearer that they are periodic, not per message.
Minor performance improvement, switching away from concurrent collections.

20.15.3. Fixes

Maximum offset payload space increased to correctly not be inversely proportional to assigned partition quantity.
Run-length encoding now supports compacted topics, plus other bug fixes as well as fixes to Bitset encoding.

20.16. v0.2.0.3

20.16.1. Fixes

Bitset overflow check (#35) - gracefully drop BitSet or Runlength encoding as an option if offset difference too large (short overflow)
- A new serialisation format will be added in next version - see Support BitSet encoding lengths longer than Short.MAX_VALUE #37
Gracefully drops encoding attempts if they can’t be run
Fixes a bug in the offset drop if it can’t fit in the offset metadata payload

20.17. v0.2.0.2

20.17.1. Fixes

Turns back on the Bitset overflow check (#35)

20.18. v0.2.0.1 DO NOT USE - has critical bug

20.18.1. Fixes

Incorrectly turns off an over-flow check in offset serialisation system (#35)

20.19. v0.2.0.0

20.19.1. Features

Choice of commit modes: Consumer Asynchronous, Synchronous and Producer Transactions
Producer instance is now optional
Using a transactional Producer is now optional
Use the Kafka Consumer to commit offsets Synchronously or Asynchronously

20.19.2. Improvements

Memory performance - garbage collect empty shards when in KEY ordering mode
Select tests adapted to non transactional (multiple commit modes) as well
Adds supervision to broker poller
Fixes a performance issue with the async committer not being woken up
Make committer thread revoke partitions and commit
Have onPartitionsRevoked be responsible for committing on close, instead of an explicit call to commit by controller
Make sure Broker Poller now drains properly, committing any waiting work

20.19.3. Fixes

Fixes bug in commit linger, remove genesis offset (0) from testing (avoid races), add ability to request commit
Fixes #25:
- Sometimes a transaction error occurs - Cannot call send in state COMMITTING_TRANSACTION #25
ReentrantReadWrite lock protects non-thread safe transactional producer from incorrect multithreaded use
Wider lock to prevent transactions containing produced messages that they shouldn’t
Must start tx in MockProducer as well
Fixes example app tests - incorrectly testing wrong thing and MockProducer not configured to auto complete
Add missing revoke flow to MockConsumer wrapper
Add missing latch timeout check

20.20. v0.1

20.20.1. Features:

Have massively parallel consumption processing without running hundreds or thousands of
- Kafka consumer clients
- topic partitions
  
  without operational burden or harming the cluster’s performance
Efficient individual message acknowledgement system (without local or third system state) to massively reduce message replay upon failure
Per key concurrent processing, per partition and unordered message processing
Offsets committed correctly, in order, of only processed messages, regardless of concurrency level or retries
Vert.x non-blocking library integration (HTTP currently)
Fair partition traversal
Zero~ dependencies (Slf4j and Lombok) for the core module
Java 8 compatibility
Throttle control and broker liveliness management
Clean draining shutdown cycle
