The `add_license_url` DAG keeps timing out
krysal opened this issue · comments
Description
This DAG keeps timing out for unknown reasons when the number of items to modify is relatively high (>500k). In contrast, the `batched_update` DAG has been verified to handle this kind of update for loads of millions of rows. It was tested by backfilling the license URL for (by-nc-sa, 2.0), and it updated 11,090,909 records successfully.
However, repeated runs have shown licenses reappearing in the group of rows missing the field, so there could be ingestion flows that are not filling in this data, or some other problem (#4318). I'd like to update the `add_license_url` DAG to use `batched_update` and automate this process until we make sure all rows are complete.
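
For illustration only, here is a minimal sketch of the batching idea: select a bounded set of rows still missing the URL, update just those, commit, and repeat until none remain. It is not the actual `batched_update` implementation. The table name `image`, the columns `license`, `license_version` and `license_url`, the helper names, and the use of SQLite are all assumptions made so the example can run on its own.

```python
import sqlite3


def make_demo_db(n_target=2000, n_other=500):
    """Build a throwaway in-memory table (hypothetical schema, not the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image ("
        "id INTEGER PRIMARY KEY, license TEXT, license_version TEXT, license_url TEXT)"
    )
    rows = [("by-nc-sa", "2.0", None)] * n_target + [("by", "4.0", None)] * n_other
    conn.executemany(
        "INSERT INTO image (license, license_version, license_url) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def batched_backfill(conn, license_, version, url, batch_size=500):
    """Fill license_url for matching rows in small batches; return rows updated."""
    total = 0
    while True:
        # Pick the next bounded batch of rows that still lack the URL.
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM image "
                "WHERE license = ? AND license_version = ? AND license_url IS NULL "
                "LIMIT ?",
                (license_, version, batch_size),
            )
        ]
        if not ids:
            break
        placeholders = ",".join("?" * len(ids))
        conn.execute(
            f"UPDATE image SET license_url = ? WHERE id IN ({placeholders})",
            (url, *ids),
        )
        # Commit per batch so each transaction stays short.
        conn.commit()
        total += len(ids)
    return total
```

Because each pass only touches rows where the URL is still missing, the same function can be rerun on a schedule and will pick up any rows that show up later without the field, e.g. `batched_backfill(conn, "by-nc-sa", "2.0", "https://creativecommons.org/licenses/by-nc-sa/2.0/")`.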