ArroyoSystems / arroyo

Distributed stream processing engine in Rust

Home Page:https://arroyo.dev

Maintenance for long running clusters

harshit2283 opened this issue · comments

Currently, Arroyo stores the metadata for every checkpoint in the checkpoints table. Over time, the table grows to a significant size, which may impact performance.

An immediate workaround (from @mwylde) is to run this query manually for each job, replacing {{ JOB_ID }} with the job's ID:

DELETE FROM checkpoints
WHERE checkpoints.id != (
  SELECT id FROM checkpoints
  WHERE job_id = '{{ JOB_ID }}'
  ORDER BY finish_time DESC
  LIMIT 1
) AND job_id='{{ JOB_ID }}';
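To see what the query above keeps and what it removes, here is a minimal sketch that runs it against an in-memory SQLite table. It is an illustration under assumptions, not Arroyo's code: the three-column table is a hypothetical stand-in that includes only the columns the query references (the real checkpoints table has more), SQLite stands in for the database Arroyo actually uses, and `prune_checkpoints` and `demo` are made-up helper names. The job ID is passed as a bound parameter rather than templated into the SQL string.

```python
import sqlite3

# The cleanup query from the issue, with the job ID bound as a named
# parameter instead of being substituted into the SQL text.
CLEANUP_SQL = """
DELETE FROM checkpoints
WHERE job_id = :job_id
  AND id != (
    SELECT id FROM checkpoints
    WHERE job_id = :job_id
    ORDER BY finish_time DESC
    LIMIT 1
  )
"""


def prune_checkpoints(conn, job_id):
    """Delete all checkpoint rows for job_id except the latest-finished one.

    Returns the number of rows deleted.
    """
    cur = conn.execute(CLEANUP_SQL, {"job_id": job_id})
    conn.commit()
    return cur.rowcount


def demo():
    # Hypothetical minimal schema: only the columns the query touches.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE checkpoints ("
        "id INTEGER PRIMARY KEY, job_id TEXT, finish_time TEXT)"
    )
    conn.executemany(
        "INSERT INTO checkpoints (job_id, finish_time) VALUES (?, ?)",
        [
            ("job_a", "2024-03-11T10:00:00"),
            ("job_a", "2024-03-11T10:01:00"),
            ("job_a", "2024-03-11T10:02:00"),
            ("job_b", "2024-03-11T10:00:30"),
        ],
    )
    deleted = prune_checkpoints(conn, "job_a")
    remaining = conn.execute(
        "SELECT job_id, finish_time FROM checkpoints ORDER BY id"
    ).fetchall()
    conn.close()
    return deleted, remaining
```

Calling `demo()` deletes the two older `job_a` rows and leaves the newest `job_a` checkpoint plus the untouched `job_b` row. Because the outer `job_id` filter scopes the delete to one job, the query has to be repeated for every job whose history should be trimmed.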

Just merged the fix