Redshift Ingestion broken in 0.13.2
joshua-pgatour opened this issue · comments
Describe the bug
Execution finished with errors.
{'exec_id': '2236cd44-90eb-4781-9563-05ea51f4bbd4',
'infos': ['2024-05-03 19:27:34.726045 INFO: Starting execution for task with name=RUN_INGEST',
'2024-05-03 19:55:40.923833 INFO: Caught exception EXECUTING task_id=2236cd44-90eb-4781-9563-05ea51f4bbd4, name=RUN_INGEST, '
'stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 140, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
' return future.result()\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 282, in execute\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
"acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
'errors': ['2024-05-03 19:55:40.923635 ERROR: The ingestion process was killed by signal SIGKILL likely because it ran out of memory. You can '
'resolve this issue by allocating more memory to the datahub-actions container.']}
I have tried increasing the memory up to 32 GB and I still get this error. I've turned off lineage and profiling, and even tried ingesting just tables, no views.
Occasionally I get this error instead of the memory one:
  File "/tmp/datahub/ingest/venv-redshift-2b9c1ab97dc6cd7f/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 391, in _batch_workunits_by_urn
    for wu in stream:
  File "/tmp/datahub/ingest/venv-redshift-2b9c1ab97dc6cd7f/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 184, in auto_materialize_referenced_tags
    for wu in stream:
  File "/tmp/datahub/ingest/venv-redshift-2b9c1ab97dc6cd7f/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 91, in auto_status_aspect
    for wu in stream:
  File "/tmp/datahub/ingest/venv-redshift-2b9c1ab97dc6cd7f/lib/python3.10/site-packages/datahub/ingestion/source/redshift/redshift.py", line 468, in get_workunits_internal
    yield from self.extract_lineage(
  File "/tmp/datahub/ingest/venv-redshift-2b9c1ab97dc6cd7f/lib/python3.10/site-packages/datahub/ingestion/source/redshift/redshift.py", line 987, in extract_lineage
    lineage_extractor.populate_lineage(
  File "/tmp/datahub/ingest/venv-redshift-2b9c1ab97dc6cd7f/lib/python3.10/site-packages/datahub/ingestion/source/redshift/lineage.py", line 659, in populate_lineage
    table_renames, all_tables_set = self._process_table_renames(
  File "/tmp/datahub/ingest/venv-redshift-2b9c1ab97dc6cd7f/lib/python3.10/site-packages/datahub/ingestion/source/redshift/lineage.py", line 872, in _process_table_renames
    all_tables[database][schema].add(prev_name)
KeyError: 'pgat_competitions_x'
In this case it seems to be trying to reference a schema name that I have filtered out in the ingest recipe.
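To illustrate the failure mode: the traceback shows a nested lookup `all_tables[database][schema]` being indexed with a schema that was never collected. The sketch below is not DataHub's actual code or fix; `apply_table_renames`, the tuple layout of `renames`, and the table names are all hypothetical. It only shows how a guarded lookup would skip renames that belong to a filtered-out schema instead of raising `KeyError`.

```python
def apply_table_renames(all_tables, renames):
    """Record previous table names, skipping schemas that were not ingested.

    all_tables: {database: {schema: set of table names}} (hypothetical layout)
    renames: iterable of (database, schema, prev_name, new_name) tuples
    Returns the list of (database, schema, prev_name) entries that were skipped.
    """
    skipped = []
    for database, schema, prev_name, _new_name in renames:
        schemas = all_tables.get(database)
        # A schema excluded by the recipe's filter never appears in
        # all_tables, so indexing it directly would raise KeyError.
        if schemas is None or schema not in schemas:
            skipped.append((database, schema, prev_name))
            continue
        schemas[schema].add(prev_name)
    return skipped
```

With a rename recorded for `pgat_competitions_x` but only `curated_*` schemas collected, the rename is returned in `skipped` rather than crashing the run.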
@joshua-pgatour what CLI version is this with?
I merged a fix related to this in #9967
Thank you for the reply. I have tried going back one version at a time on datahub-actions dockerhub releases and the KeyError seems to stop happening around v10. However, I still have a memory issue. I have pretty much maxed out the size my pod can be and it still fails with memory SIGKILL. Any suggestions on getting around this? Here is my current recipe:
```yaml
source:
  type: redshift
  config:
    host_port: ''
    database: pgat
    username:
    table_lineage_mode: mixed
    include_table_lineage: false
    include_tables: true
    include_views: false
    profiling:
      enabled: false
      profile_table_level_only: false
    stateful_ingestion:
      enabled: true
    password: '${redshift_secret2}'
    schema_pattern:
      allow:
        - 'curated_.*'
```
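As a quick check of what the `schema_pattern` above lets through, here is a minimal sketch. It assumes allow patterns are applied as case-insensitive regexes anchored at the start of the schema name (`re.match` semantics); that matching rule is an assumption for illustration, and `schema_allowed` is a hypothetical helper, not a DataHub function.

```python
import re


def schema_allowed(schema, allow_patterns):
    """Return True if the schema name matches at least one allow pattern."""
    # Assumption: patterns are matched from the start of the name,
    # ignoring case.
    return any(re.match(p, schema, re.IGNORECASE) for p in allow_patterns)
```

Under that assumption, `schema_allowed("pgat_competitions_x", ["curated_.*"])` is false, which is consistent with the schema in the KeyError being one that the recipe filters out.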
So I figured out how to change the CLI version in the ingest recipe. Sorry, I thought it was controlled by the datahub-actions container version (I didn't know it was controlled in the recipe). CLI 0.10.5.1 works fine with 16 GB of memory and there is no KeyError. I will experiment to find the point at which this breaks, but I have to believe there's a memory leak in newer versions.
I can confirm that v0.13.2 has the memory problem. v0.13.1 works, but in my testing the ingest process has slowed significantly since 0.12
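For anyone comparing CLI versions the same way, a small harness like the one below can record wall-clock time and peak memory per run. This is a sketch under stated assumptions: it uses Python's Unix-only `resource` module, `run_and_measure` is a hypothetical helper, and `recipe.yml` in the usage comment is a placeholder file name. Note that `ru_maxrss` is reported in kilobytes on Linux and in bytes on macOS.

```python
import resource
import subprocess
import time


def run_and_measure(cmd):
    """Run cmd and return (returncode, elapsed_seconds, peak_rss_of_children)."""
    start = time.monotonic()
    proc = subprocess.run(cmd)
    elapsed = time.monotonic() - start
    # RUSAGE_CHILDREN reports the largest peak RSS among all child
    # processes waited for so far, so use a fresh interpreter per run
    # when comparing versions.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, elapsed, peak


# Example (placeholder recipe path):
# code, elapsed, peak = run_and_measure(["datahub", "ingest", "-c", "recipe.yml"])
# On POSIX, a return code of -9 means the process was killed by SIGKILL.
```

Running this once per CLI version gives comparable numbers for both the memory growth and the slowdown described above.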