jayhack / prefect

The easiest way to coordinate your dataflow

Home Page: https://prefect.io

`propose_state` can reach recursion depth on `Wait`

jayhack opened this issue

First check

  • I added a descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the Prefect documentation for this issue.
  • I checked that this issue is related to Prefect and not one of its dependencies.

Bug summary

When the server responds with SetStateStatus.WAIT, propose_state calls itself again after waiting, so repeated WAIT responses can reach the maximum recursion depth. https://github.com/PrefectHQ/prefect/blob/main/src/prefect/engine.py#L1884-L1895

One such case is reproduced below: many task runs are submitted, but a concurrency limit keeps them from executing in a timely manner. When no concurrency slot is available, the server sends a Wait response telling the task to wait 30 seconds before trying again. If the concurrency limit is low and the number of tasks is high, a task can eventually hit the recursion limit.
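The failure mode can be illustrated without Prefect. The sketch below is a minimal stand-in, not Prefect's code: `BusyServer`, `propose_recursive`, and `hits_recursion_limit` are hypothetical names, and the 30 second wait is shortened to `asyncio.sleep(0)` so it runs quickly. Each WAIT response adds one more frame to the await chain, so enough consecutive waits should end in a RecursionError.

```python
import asyncio

WAIT = "WAIT"
ACCEPT = "ACCEPT"


class BusyServer:
    """Hypothetical stand-in for the API: answers WAIT a fixed number of times."""

    def __init__(self, waits: int) -> None:
        self.remaining = waits

    async def set_state(self) -> str:
        if self.remaining > 0:
            self.remaining -= 1
            return WAIT
        return ACCEPT


async def propose_recursive(server: BusyServer) -> str:
    """Retry on WAIT by calling itself, mirroring the pattern in the traceback."""
    status = await server.set_state()
    if status == WAIT:
        # Stand-in for the server-instructed delay (30 seconds in the report).
        await asyncio.sleep(0)
        # Every retry nests one level deeper instead of reusing the frame.
        return await propose_recursive(server)
    return status


def hits_recursion_limit(waits: int) -> bool:
    """Return True if the recursive retry raises RecursionError."""
    try:
        asyncio.run(propose_recursive(BusyServer(waits)))
    except RecursionError:
        return True
    return False
```

With a handful of waits the call returns normally. Once the number of consecutive waits exceeds the interpreter's recursion limit (see `sys.getrecursionlimit()`), the same call is expected to fail the way the task run does.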

  • We should change this to a loop rather than a recursive function.
  • It may be helpful to add exponential backoff on top of whatever wait time the server instructs.
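The two suggestions above could be combined roughly as follows. This is a sketch under stated assumptions, not a patch against Prefect: `Response`, `propose_state_loop`, `set_state`, `backoff_factor`, and `max_extra_delay` are hypothetical names and defaults chosen for illustration. The loop keeps the stack depth constant no matter how many WAIT responses arrive, and the extra delay grows geometrically up to a cap.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

WAIT = "WAIT"
ACCEPT = "ACCEPT"


@dataclass
class Response:
    """Hypothetical, simplified orchestration response."""

    status: str
    delay_seconds: float = 0.0


async def propose_state_loop(
    set_state: Callable[[], Awaitable[Response]],
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
    backoff_factor: float = 2.0,
    max_extra_delay: float = 300.0,
) -> Response:
    """Propose a state until the server stops answering WAIT.

    Uses a loop instead of recursion, so no frames accumulate across retries.
    """
    extra_delay = 0.0  # client-side backoff added on top of the server's delay
    while True:
        response = await set_state()
        if response.status != WAIT:
            return response
        # Honor the server-instructed delay, plus the current backoff.
        await sleep(response.delay_seconds + extra_delay)
        # Grow the backoff: 1s after the first wait, then multiply, then cap.
        extra_delay = min(max_extra_delay, max(1.0, extra_delay * backoff_factor))
```

Passing `sleep` as a parameter is only a convenience for testing; a real change would presumably keep whatever sleep call the engine already uses.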

Reproduction

$ prefect concurrency-limit create "my-tag" 1

from prefect import flow, task


@task(tags=["my-tag"])
def print_num_and_sleep(num):
    # Every run of this task counts against the "my-tag" concurrency limit.
    print(num)


@flow
def map_flow(nums):
    # Submit one task run per element of nums.
    print_num_and_sleep.map(nums)


map_flow(list(range(100000)))  # or some other high number

Error

Crash details:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1722, in report_task_run_crashes
    yield
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1351, in begin_task_run
    state = await orchestrate_task_run(
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1471, in orchestrate_task_run
    state = await propose_state(
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1890, in propose_state
    return await propose_state(
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1890, in propose_state
    return await propose_state(
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1890, in propose_state
    return await propose_state(
  [Previous line repeated 2947 more times]
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1857, in propose_state
    response = await client.set_task_run_state(
  File "/usr/local/lib/python3.10/site-packages/prefect/client/orchestration.py", line 1897, in set_task_run_state
    return OrchestrationResult.parse_obj(response.json())
  File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1076, in pydantic.main.validate_model
  File "pydantic/fields.py", line 884, in pydantic.fields.ModelField.validate
  File "pydantic/fields.py", line 1101, in pydantic.fields.ModelField._validate_singleton
  File "pydantic/fields.py", line 1151, in pydantic.fields.ModelField._apply_validators
  File "pydantic/class_validators.py", line 337, in pydantic.class_validators._generic_validator_basic.lambda13
  File "pydantic/main.py", line 711, in pydantic.main.BaseModel.validate
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1076, in pydantic.main.validate_model
  File "pydantic/fields.py", line 884, in pydantic.fields.ModelField.validate
  File "pydantic/fields.py", line 1101, in pydantic.fields.ModelField._validate_singleton
  File "pydantic/fields.py", line 1151, in pydantic.fields.ModelField._apply_validators
  File "pydantic/class_validators.py", line 337, in pydantic.class_validators._generic_validator_basic.lambda13
  File "pydantic/main.py", line 711, in pydantic.main.BaseModel.validate
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1076, in pydantic.main.validate_model
  File "pydantic/fields.py", line 884, in pydantic.fields.ModelField.validate
  File "pydantic/fields.py", line 1101, in pydantic.fields.ModelField._validate_singleton
  File "pydantic/fields.py", line 1151, in pydantic.fields.ModelField._apply_validators
  File "pydantic/class_validators.py", line 341, in pydantic.class_validators._generic_validator_basic.lambda15
  File "pydantic/validators.py", line 318, in pydantic.validators.uuid_validator
  File "/usr/local/lib/python3.10/uuid.py", line 170, in __init__
    if [hex, bytes, bytes_le, fields, int].count(None) != 4:
RecursionError: maximum recursion depth exceeded in comparison

04:20:18 PM | execute-185 | prefect.task_runs | ERROR | Crash detected! Execution was interrupted by an unexpected exception: RecursionError: maximum recursion depth exceeded in comparison


### Versions

```Text
Version:             2.8.6+2.g00b08ccf9
API version:         0.8.4
Python version:      3.11.0
Git commit:          00b08ccf
Built:               Thu, Mar 16, 2023 1:35 PM
OS/Arch:             darwin/x86_64
Profile:             postgres
Server type:         ephemeral
Server:
  Database:          postgresql
```

Additional context