open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

Home Page: https://open-metadata.org

bug: intermittent issues with connection to database service cause unwanted table deletion

mgorsk1 opened this issue

Affected module
Ingestion

Describe the bug
We have a Trino connection whose metadata ingestion has Mark Deleted Tables set to True. Our Trino uses Hive Metastore (HMS) under the hood, which occasionally experiences connectivity issues. If a connectivity issue occurs and Trino cannot return the list of tables, the ingestion removes all tables from OpenMetadata.

To Reproduce

  • Register a Trino service and a corresponding metadata ingestion process with Mark Deleted Tables set to True
  • Run ingestion while the system behind the Trino connector (for example HMS) is healthy
  • Cause the system behind the Trino connector to fail (for example, bring HMS down)
  • Run ingestion again against the unhealthy Trino connector

Because of the issues with the Trino connector, the table list is not fetched, and the tables are marked for deletion.
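
The failure mode described above can be illustrated with a minimal sketch. It is not the OpenMetadata ingestion code: it assumes, purely for illustration, that deletions are computed as "tables known from a previous run minus tables seen in this run" and that a listing error ends up as an empty result. The names `naive_tables_to_delete`, `healthy_listing`, and `failing_listing` are hypothetical.

```python
from typing import Callable, Iterable, Set


def naive_tables_to_delete(
    known_tables: Set[str],
    list_tables: Callable[[], Iterable[str]],
) -> Set[str]:
    """Hypothetical diff: anything known but not seen in this run is deleted."""
    try:
        seen = set(list_tables())
    except ConnectionError:
        # The listing failure is swallowed, so "could not list" looks
        # exactly like "the schema is empty".
        seen = set()
    return known_tables - seen


def healthy_listing() -> Iterable[str]:
    return ["sales.orders", "sales.customers"]


def failing_listing() -> Iterable[str]:
    # Stand-in for Trino failing because the Hive Metastore is unreachable.
    raise ConnectionError("metastore unreachable")


known = {"sales.orders", "sales.customers"}
healthy_result = naive_tables_to_delete(known, healthy_listing)
failing_result = naive_tables_to_delete(known, failing_listing)
```

In this sketch the healthy run marks nothing, while the failing run marks every known table, which matches the behavior reported here.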

Expected behavior
If listing tables fails, the tables should not be marked for deletion; the affected schema should be filtered out of processing in the mark_deleted_tables method.
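
One way to express the expected behavior is sketched below. It is only an illustration under stated assumptions, not the actual `mark_deleted_tables` implementation: the helper names `list_tables_or_none` and `tables_to_mark_deleted`, the per-schema dictionaries, and the example schemas are all hypothetical. The idea is to keep "listing failed" distinct from "listing returned nothing" and to skip any schema whose listing failed.

```python
from typing import Callable, Dict, Iterable, Optional, Set


def list_tables_or_none(
    list_tables: Callable[[], Iterable[str]],
) -> Optional[Set[str]]:
    """Return the listed tables, or None if the listing itself failed."""
    try:
        return set(list_tables())
    except Exception:
        # None means "unknown", which is different from an empty schema.
        return None


def tables_to_mark_deleted(
    known_by_schema: Dict[str, Set[str]],
    listers: Dict[str, Callable[[], Iterable[str]]],
) -> Set[str]:
    """Diff known vs. seen tables, skipping schemas that could not be listed."""
    to_delete: Set[str] = set()
    for schema, known in known_by_schema.items():
        seen = list_tables_or_none(listers[schema])
        if seen is None:
            # Listing failed: filter this schema out of deletion processing.
            continue
        to_delete |= known - seen
    return to_delete


def list_sales() -> Iterable[str]:
    # Healthy schema where "sales.old" was genuinely dropped.
    return ["sales.orders"]


def list_hr() -> Iterable[str]:
    # Stand-in for a schema whose backing metastore is unreachable.
    raise ConnectionError("metastore unreachable")


known_by_schema = {
    "sales": {"sales.orders", "sales.old"},
    "hr": {"hr.employees"},
}
result = tables_to_mark_deleted(
    known_by_schema, {"sales": list_sales, "hr": list_hr}
)
```

Here only the table that is really missing from the healthy schema is selected, and the tables in the unreachable schema are left untouched until a later run can list them.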

Version:

  • OS:
  • Python version:
  • OpenMetadata version: 1.5.4
  • OpenMetadata Ingestion package version: 1.5.4.0