astronomer / astronomer-cosmos

Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code

Home Page: https://astronomer.github.io/astronomer-cosmos/


Airflow using cosmos can't locate the dbt_project.yml file

SampreetKishan opened this issue · comments

Hello,

I'm not sure this is the place to ask, but I'm finding it extremely hard to get assistance with an issue I'm facing.

I have a simple dag.py file to deploy dbt projects on Airflow using Cosmos. I have copied my dbt project into the dags/dbt folder.

Here is my dag.py file

from datetime import datetime
import os
from cosmos import DbtDag, ProjectConfig, ProfileConfig, ExecutionConfig
from cosmos.profiles import SnowflakeUserPasswordProfileMapping

profile_config = ProfileConfig(profile_name="default",
                               target_name="dev",
                               profile_mapping=SnowflakeUserPasswordProfileMapping(conn_id="snowflake_conn", 
                                                    profile_args={
                                                        "database": "dbt_db",
                                                        "schema": "dbt_schema"
                                                        },
                                                    ))
print("Present working directory:", os.getcwd())
dbt_snowflake_dag = DbtDag(project_config=ProjectConfig("/Users/samkishan/dbt_snowflake_project/dbt_dag/dags/dbt/data_pipeline1/"),
                    operator_args={"install_deps": True},
                    profile_config=profile_config,
                    execution_config=ExecutionConfig(dbt_executable_path=f"{os.environ['AIRFLOW_HOME']}/dbt_venv/bin/dbt",),
                    schedule_interval="@daily",
                    start_date=datetime(2024, 5, 4),
                    catchup=False,
                    dag_id="dbt_snowflake_dag",)

print(f"Project Config: {dbt_snowflake_dag.project_config}")

I have made sure the dbt_project.yml file lies in this path: /Users/samkishan/dbt_snowflake_project/dbt_dag/dags/dbt/data_pipeline1/
(screenshot: local project directory showing dbt_project.yml)

However, on the Airflow UI, it doesn't seem to be able to locate this file.
(screenshot: Airflow UI import error)

Hey there, @SampreetKishan! I'm here to assist you with any bugs, questions, or contributions you have. Let's squash those bugs together! 🤖

To resolve the issue where Airflow cannot locate the dbt_project.yml file at the specified path, ensure that the project_dir parameter is correctly set and that the file exists at that location from the perspective of the environment where Airflow runs, which may differ from the machine where you authored the DAG. Cosmos checks for the existence of the dbt_project.yml file and raises an error if it is not found.

Here is the relevant code snippet that performs this check:

mandatory_paths: dict = {}
if self.project_dir:
    project_yml_path = Path(self.project_dir) / "dbt_project.yml"
    mandatory_paths = {
        "dbt_project.yml": project_yml_path,
        "models directory": self.models_path,
    }
if self.manifest_path:
    mandatory_paths["manifest"] = self.manifest_path

for name, path in mandatory_paths.items():
    if path is None or not Path(path).exists():
        raise CosmosValueError(f"Could not find {name} at {path}")

Ensure that:

  1. self.project_dir is correctly set to the path where your dbt project is located.
  2. The dbt_project.yml file exists at the specified project_dir.
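The checks above can be sanity-checked with a short script. This is a minimal sketch (the `find_dbt_project` helper is hypothetical, not part of Cosmos) mirroring the validation logic: the file must exist at `<project_dir>/dbt_project.yml` as seen by the process that runs Airflow.

```python
from pathlib import Path
import tempfile

# Hypothetical helper mirroring Cosmos's check: dbt_project.yml must exist
# directly under the configured project directory.
def find_dbt_project(project_dir: str) -> bool:
    return (Path(project_dir) / "dbt_project.yml").exists()

# Demo with a temporary directory standing in for dags/dbt/data_pipeline1.
with tempfile.TemporaryDirectory() as tmp:
    print(find_dbt_project(tmp))   # False: no dbt_project.yml yet
    (Path(tmp) / "dbt_project.yml").write_text("name: demo\nversion: '1.0'\n")
    print(find_dbt_project(tmp))   # True
```

Running a check like this from inside the Airflow environment (e.g. in the scheduler container) confirms whether the path you passed to ProjectConfig is actually visible there.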

Additionally, ensure that any dependencies or packages required by the dbt project are present. You can use the has_non_empty_dependencies_file function to check for the presence of dependencies.yml or packages.yml files:

def has_non_empty_dependencies_file(project_path: Path) -> bool:
    project_dir = Path(project_path)
    has_deps = False
    for filename in DBT_DEPENDENCIES_FILE_NAMES:
        filepath = project_dir / filename
        if filepath.exists() and filepath.stat().st_size > 0:
            has_deps = True
            break

    if not has_deps:
        logger.info(f"Project {project_path} does not have {DBT_DEPENDENCIES_FILE_NAMES}")
    return has_deps

If the path or file is incorrect, update the project_dir to the correct location and verify that the dbt_project.yml file is present. Also, ensure that any required dependencies are properly set up.


Never mind, I had to change the ProjectConfig object to point the dbt project to the path: "/usr/local/airflow/dags/dbt/data_pipeline1" and now it works fine.
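This is the usual culprit: the absolute host path (/Users/samkishan/...) does not exist inside the container where the Airflow scheduler runs, while the dags/ folder is mounted under AIRFLOW_HOME. A minimal sketch, assuming the standard Astro container layout where AIRFLOW_HOME is /usr/local/airflow (an assumption, not stated in the issue), of deriving the container path instead of hard-coding a host path:

```python
import os
from pathlib import Path

# Derive the project path from AIRFLOW_HOME rather than hard-coding a
# host-machine path; inside the container this resolves to the path that
# worked: /usr/local/airflow/dags/dbt/data_pipeline1.
airflow_home = os.environ.get("AIRFLOW_HOME", "/usr/local/airflow")
dbt_project_dir = Path(airflow_home) / "dags" / "dbt" / "data_pipeline1"

# ProjectConfig(str(dbt_project_dir)) would then receive the container path.
print(dbt_project_dir)
```

This keeps the DAG portable between local development and the deployed environment, since the path is computed wherever the DAG file is parsed.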