Helm scheduler HA : postgresql error on slot_pool lock
parisni opened this issue
Nicolas Paris commented
Official Helm Chart version
1.3.0 (latest released)
Apache Airflow version
2.2.1
Kubernetes Version
1.21
Helm Chart configuration
airflow:
  env:
    - name: AIRFLOW__SCHEDULER__USE_ROW_LEVEL_LOCKING
      value: "True"
  webserver:
    livenessProbe:
      initialDelaySeconds: 100
      timeoutSeconds: 100
      failureThreshold: 20
      periodSeconds: 25
    readinessProbe:
      initialDelaySeconds: 100
      timeoutSeconds: 100
      failureThreshold: 20
      periodSeconds: 25
  scheduler:
    replicas: 2
  executor: LocalExecutor
Docker Image customisations
No response
What happened
PostgreSQL logs this error:
ERROR: could not obtain lock on row in relation "slot_pool"
STATEMENT: SELECT slot_pool.pool AS slot_pool_pool, slot_pool.slots AS slot_pool_slots
FROM slot_pool FOR UPDATE NOWAIT
What you expected to happen
No response
How to reproduce
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct
Tanel Kiis commented
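This should be the intended behaviour. There is a so-called "critical section" in the scheduler job that only one scheduler instance can be in at a time. The pools database table (slot_pool) is used as a mutex: a scheduler that tries to take the row lock with the FOR UPDATE NOWAIT query shown above while another scheduler already holds it gets this error back immediately instead of waiting.
I don't think there is anything that can be done on the Airflow side to reduce the misleading error logging.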
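To illustrate why the error is harmless, here is a minimal sketch of the "table as mutex" pattern, written as an analogy only: it is not Airflow's code. A threading.Lock stands in for the row lock on slot_pool, and a non-blocking acquire stands in for FOR UPDATE NOWAIT. The names slot_pool_lock, try_critical_section and demo are made up for this example.

```python
import threading

# Hypothetical stand-in for the row lock on slot_pool (not Airflow's code).
slot_pool_lock = threading.Lock()


def try_critical_section(name, log):
    """Try to enter the critical section without waiting, like NOWAIT."""
    if not slot_pool_lock.acquire(blocking=False):
        # This is the point where PostgreSQL would log
        # "could not obtain lock on row in relation slot_pool".
        log.append(f"{name}: lock busy, skipping this loop")
        return False
    try:
        log.append(f"{name}: entered critical section")
        return True
    finally:
        slot_pool_lock.release()


def demo():
    """Simulate scheduler-2 arriving while scheduler-1 holds the lock."""
    log = []
    slot_pool_lock.acquire()  # scheduler-1 is inside the critical section
    try:
        second = try_critical_section("scheduler-2", log)
    finally:
        slot_pool_lock.release()  # scheduler-1 leaves
    # With the lock free again, the next attempt succeeds.
    first = try_critical_section("scheduler-1", log)
    return first, second, log


if __name__ == "__main__":
    for line in demo()[2]:
        print(line)
```

The failed attempt is not a fault: the scheduler that loses the race simply skips the critical section for that loop and tries again later, while the database still records the failed lock attempt in its log.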
Ephraim Anierobi commented
This is expected, not an issue
Ash Berlin-Taylor commented
I've worked out how I can stop that error from happening, just to remove a source of confusion: #19842