Airflow using cosmos can't locate the dbt_project.yml file
SampreetKishan opened this issue · comments
Hello,
Not sure this is the right place to ask, but I am finding it extremely hard to get assistance with an issue I am facing.
I have a simple dag.py file to deploy dbt projects on Airflow using Cosmos. I have copied my dbt project into the dags/dbt folder.
Here is my dag.py file
from datetime import datetime
import os

from cosmos import DbtDag, ProjectConfig, ProfileConfig, ExecutionConfig
from cosmos.profiles import SnowflakeUserPasswordProfileMapping

profile_config = ProfileConfig(
    profile_name="default",
    target_name="dev",
    profile_mapping=SnowflakeUserPasswordProfileMapping(
        conn_id="snowflake_conn",
        profile_args={
            "database": "dbt_db",
            "schema": "dbt_schema",
        },
    ),
)

print("Present working directory: ", os.getcwd())  # note: os.getcwd() must be called, not passed as os.getcwd

dbt_snowflake_dag = DbtDag(
    project_config=ProjectConfig("/Users/samkishan/dbt_snowflake_project/dbt_dag/dags/dbt/data_pipeline1/"),
    operator_args={"install_deps": True},
    profile_config=profile_config,
    execution_config=ExecutionConfig(
        dbt_executable_path=f"{os.environ['AIRFLOW_HOME']}/dbt_venv/bin/dbt",
    ),
    schedule_interval="@daily",
    start_date=datetime(2024, 5, 4),
    catchup=False,
    dag_id="dbt_snowflake_dag",
)

print(f"Project Config: {dbt_snowflake_dag.project_config}")
I have made sure the dbt_project.yml file lies in this path: /Users/samkishan/dbt_snowflake_project/dbt_dag/dags/dbt/data_pipeline1/
However, on the Airflow UI, it doesn't seem to be able to locate this file.
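One way to see whether the Airflow scheduler process can actually see the file at DAG-parse time is to log an explicit check from the DAG file. This is a minimal debug sketch; `log_project_visibility` is a hypothetical helper, not part of Cosmos:

```python
import logging
from pathlib import Path

log = logging.getLogger(__name__)

def log_project_visibility(project_dir: str) -> bool:
    """Log whether dbt_project.yml is visible from this process."""
    visible = (Path(project_dir) / "dbt_project.yml").is_file()
    log.info(
        "dbt_project.yml %s at %s",
        "found" if visible else "NOT found",
        project_dir,
    )
    return visible

# Called at module level in dag.py, so the result lands in the scheduler logs:
# log_project_visibility("/Users/samkishan/dbt_snowflake_project/dbt_dag/dags/dbt/data_pipeline1/")
```

If this logs "NOT found" in the scheduler, the path exists on your host but not inside the environment where Airflow parses the DAG.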
Hey there, @SampreetKishan! I'm here to assist you with any bugs, questions, or contributions you have. Let's squash those bugs together! 🤖
To resolve the issue where Airflow cannot locate the `dbt_project.yml` file at the specified path, ensure that the `project_dir` parameter is correctly set and that the file exists at the specified location. The code checks for the existence of the `dbt_project.yml` file and raises an error if it is not found.
Here is the relevant code snippet that performs this check:
if self.project_dir:
    project_yml_path = Path(self.project_dir) / "dbt_project.yml"
    mandatory_paths = {
        "dbt_project.yml": project_yml_path,
        "models directory": self.models_path,
    }
    if self.manifest_path:
        mandatory_paths["manifest"] = self.manifest_path
    for name, path in mandatory_paths.items():
        if path is None or not Path(path).exists():
            raise CosmosValueError(f"Could not find {name} at {path}")
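This validation can be reproduced outside of Cosmos to debug a path before the DAG is even parsed. Below is a standalone sketch mirroring the check; `check_dbt_project` is a hypothetical helper, not Cosmos's actual API:

```python
from pathlib import Path

def check_dbt_project(project_dir: str) -> None:
    """Mirror Cosmos's mandatory-path check: raise if the dbt project
    layout is missing dbt_project.yml or a models/ directory."""
    root = Path(project_dir)
    mandatory_paths = {
        "dbt_project.yml": root / "dbt_project.yml",
        "models directory": root / "models",
    }
    for name, path in mandatory_paths.items():
        if not path.exists():
            raise ValueError(f"Could not find {name} at {path}")
```

Running this in a `python` shell inside the same environment as the scheduler (not on the host) shows immediately which path component is missing.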
Ensure that:
- `self.project_dir` is correctly set to the path where your dbt project is located.
- The `dbt_project.yml` file exists at the specified `project_dir`.
Additionally, ensure that any dependencies or packages required by the dbt project are present. You can use the `has_non_empty_dependencies_file` function to check for the presence of `dependencies.yml` or `packages.yml` files:
from pathlib import Path

# In Cosmos, DBT_DEPENDENCIES_FILE_NAMES and logger are defined elsewhere in
# the package; DBT_DEPENDENCIES_FILE_NAMES covers packages.yml and dependencies.yml.

def has_non_empty_dependencies_file(project_path: Path) -> bool:
    project_dir = Path(project_path)
    has_deps = False
    for filename in DBT_DEPENDENCIES_FILE_NAMES:
        filepath = project_dir / filename
        if filepath.exists() and filepath.stat().st_size > 0:
            has_deps = True
            break
    if not has_deps:
        logger.info(f"Project {project_path} does not have {DBT_DEPENDENCIES_FILE_NAMES}")
    return has_deps
If the path or file is incorrect, update the `project_dir` to the correct location and verify that the `dbt_project.yml` file is present. Also, ensure that any required dependencies are properly set up [1][2][3].
Never mind, I had to change the ProjectConfig object to point the dbt project to the path: "/usr/local/airflow/dags/dbt/data_pipeline1" and now it works fine.
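For anyone hitting the same thing: the underlying cause is that the absolute host path does not exist inside the container where Airflow runs. One way to avoid hard-coding either path is to resolve the project relative to the DAG file itself. A hedged sketch, assuming the project lives under `<dags folder>/dbt/<project_name>` as in this thread; `dbt_project_dir` is a hypothetical helper:

```python
from pathlib import Path

def dbt_project_dir(dag_file: str, project_name: str) -> str:
    """Build the dbt project path relative to the DAG file's location,
    assuming the project sits under <dags folder>/dbt/<project_name>."""
    return str(Path(dag_file).parent / "dbt" / project_name)

# In dag.py this would look like:
# project_config = ProjectConfig(dbt_project_dir(__file__, "data_pipeline1"))
```

This way the same `dag.py` works on the host and inside the Airflow container, since the path always tracks wherever the dags folder is mounted.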