Kinesis checkpointing does not work correctly when leasing multiple shards
Limess opened this issue
Description
The Kinesis checkpointing mechanism only updates the checkpoint of a single shard per minute, rather than the checkpoint of every shard the worker has leased.
In an application where a single worker holds multiple shard leases, this causes checkpoints to drift far behind the latest position in the stream. We have observed up to 15 minutes of delay with 4 shard leases per worker.
Because the checkpoints are not kept up to date, a large amount of work is duplicated whenever a worker process is terminated.
Expectation
The checkpoint of each shard leased by a worker is updated approximately once per minute.
What actually happens
The checkpoint of only one shard is updated each minute.
Details
Amazonica version: 0.3.22
This seems to be caused by a single atom being shared by all record processors in a Kinesis worker, rather than one atom being created per record processor.
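
To illustrate the suspected cause, here is a minimal simulation in Python. It is not Amazonica code, and every name in it (`Timer`, `simulate`, `CHECKPOINT_INTERVAL`, the shard names) is hypothetical. It assumes the atom holds the time of the last checkpoint and that each record processor checkpoints once 60 seconds have passed since that time.

```python
from collections import Counter

CHECKPOINT_INTERVAL = 60  # assumed seconds between checkpoints


class Timer:
    """Holds the time of the last checkpoint (stands in for the atom)."""

    def __init__(self):
        self.last_checkpoint = 0


def simulate(shards, seconds, shared_timer):
    """Count checkpoints per shard over a simulated run.

    Every simulated second, each shard's processor handles one batch and
    checkpoints if its timer says the interval has elapsed.
    """
    one_timer = Timer()
    # Either every shard points at the same timer, or each gets its own.
    timers = {s: one_timer if shared_timer else Timer() for s in shards}
    checkpoints = Counter({s: 0 for s in shards})
    for now in range(1, seconds + 1):
        for shard in shards:
            timer = timers[shard]
            if now - timer.last_checkpoint >= CHECKPOINT_INTERVAL:
                checkpoints[shard] += 1
                # With a shared timer this also resets the clock for
                # every other shard, so they skip their checkpoint.
                timer.last_checkpoint = now
    return dict(checkpoints)


shards = ["shard-0", "shard-1", "shard-2", "shard-3"]
shared = simulate(shards, seconds=600, shared_timer=True)
per_processor = simulate(shards, seconds=600, shared_timer=False)
print("shared timer:       ", shared)
print("timer per processor:", per_processor)
```

With the shared timer, only one checkpoint happens per minute across all four shards. With one timer per processor, every shard checkpoints once per minute. In this deterministic sketch the same shard always wins the race; in a real worker the winning shard would presumably vary between intervals, so each shard would be checkpointed only occasionally.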