apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Home Page:https://airflow.apache.org/

Helm scheduler HA : postgresql error on slot_pool lock

parisni opened this issue · comments

Official Helm Chart version

1.3.0 (latest released)

Apache Airflow version

2.2.1

Kubernetes Version

1.21

Helm Chart configuration

airflow:

  env:
    - name: AIRFLOW__SCHEDULER__USE_ROW_LEVEL_LOCKING
      value: "True"

webserver:
  livenessProbe:
    initialDelaySeconds: 100
    timeoutSeconds: 100
    failureThreshold: 20
    periodSeconds: 25

  readinessProbe:
    initialDelaySeconds: 100
    timeoutSeconds: 100
    failureThreshold: 20
    periodSeconds: 25


scheduler:
  replicas: 2

executor: LocalExecutor

Docker Image customisations

No response

What happened

PostgreSQL logs this error:

ERROR:  could not obtain lock on row in relation "slot_pool"
STATEMENT:  SELECT slot_pool.pool AS slot_pool_pool, slot_pool.slots AS slot_pool_slots
FROM slot_pool FOR UPDATE NOWAIT

What you expected to happen

No response

How to reproduce

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

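This should be the intended behaviour. There is a so-called "critical section" in the scheduler job, which only one scheduler instance can be in at a time. The pools database table (slot_pool) is used as a mutex: each scheduler tries to lock its rows with SELECT ... FOR UPDATE NOWAIT, and when another scheduler already holds the lock, PostgreSQL fails the statement immediately and logs it as an error instead of waiting.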
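As a rough illustration of the mutex idea, here is a minimal Python sketch. It is an analogy only, not Airflow's code: an in-process threading.Lock stands in for the row lock on slot_pool, and the names critical_section_lock and try_critical_section are hypothetical. A non-blocking acquire plays the role of NOWAIT, so a caller that finds the lock taken gives up straight away rather than queueing.

```python
import threading

# Hypothetical stand-in for the slot_pool row lock. In Airflow the lock
# lives in the database, not in process memory.
critical_section_lock = threading.Lock()


def try_critical_section(scheduler_name, work):
    """Run work() only if the lock is free; never wait for it (like NOWAIT)."""
    # blocking=False returns False immediately when the lock is already held.
    if not critical_section_lock.acquire(blocking=False):
        return f"{scheduler_name}: lock busy, skipping this loop"
    try:
        return f"{scheduler_name}: {work()}"
    finally:
        # Always release, so the next attempt can enter the critical section.
        critical_section_lock.release()


# scheduler-2 tries to enter while scheduler-1 is still inside, then again afterwards.
inside = try_critical_section(
    "scheduler-1",
    lambda: try_critical_section("scheduler-2", lambda: "ran"),
)
afterwards = try_critical_section("scheduler-2", lambda: "ran")
print(inside)
print(afterwards)
```

The failed attempt here corresponds to the "could not obtain lock" line in the PostgreSQL log: with two scheduler replicas, one of them losing the race is expected rather than a sign of a fault.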

I don't think there is anything that can be done on the Airflow side to reduce the misleading error logging.

This is expected, not an issue

I've worked out how to stop that error from happening, just to remove a source of confusion: #19842