PrefectHQ / prefect-dbt

Collection of Prefect integrations for working with dbt with your Prefect flows.

Home Page: https://prefecthq.github.io/prefect-dbt/

Add an easy way to define the dbt profile in code

Antupis opened this issue

Currently, defining the dbt profile in code is a hassle, especially if a flow has multiple steps. When creating a flow, you need to specify profiles_dir and set overwrite_profiles so that the code is truly deterministic.

    credentials = DatabaseCredentials(
        driver=SyncDriver.POSTGRESQL_PSYCOPG2,
        username="postgres",
        password="password",
        database="postgres",
        host="localhost",
        port=5432,
    )

    target_configs = PostgresTargetConfigs(credentials=credentials, schema="postgres")

    dbt_cli_profile = DbtCliProfile(
        type="postgres", name="jaffa", target="jaffa_test", target_configs=target_configs
    )

    deps = trigger_dbt_cli_command(
        "dbt deps",
        profiles_dir=".",
        overwrite_profiles=True,
        project_dir=f"{project_root}/dbt/",
        dbt_cli_profile=dbt_cli_profile,
    )

    hesiod_etl = trigger_dbt_cli_command(
        f"dbt run -s +exposure:hesiod_etl --vars '{{client_key: {client_key}}}'",
        project_dir=f"{project_root}/dbt/",
        profiles_dir=".",
    )

It would be simpler to just pass dbt_cli_profile on every call. That is not DRY, but it is much cleaner than the current approach.

    credentials = DatabaseCredentials(
        driver=SyncDriver.POSTGRESQL_PSYCOPG2,
        username="postgres",
        password="password",
        database="postgres",
        host="localhost",
        port=5432,
    )

    target_configs = PostgresTargetConfigs(credentials=credentials, schema="postgres")

    dbt_cli_profile = DbtCliProfile(
        type="postgres", name="jaffa", target="jaffa_test", target_configs=target_configs
    )

    deps = trigger_dbt_cli_command(
        "dbt deps",
        overwrite_profiles=True,
        dbt_cli_profile=dbt_cli_profile,
    )

    hesiod_etl = trigger_dbt_cli_command(
        f"dbt run -s +exposure:hesiod_etl --vars '{{client_key: {client_key}}}'",
        project_dir=f"{project_root}/dbt/",
        dbt_cli_profile=dbt_cli_profile,
    )

Hi, thanks for reporting this to streamline the prefect-dbt experience!

To clarify, are you saying that specifying profiles_dir and overwrite_profiles on every trigger_dbt_cli_command call is a hassle?

If so, would this work?

    trigger_kwargs = dict(
        profiles_dir=".",
        overwrite_profiles=True,
        dbt_cli_profile=dbt_cli_profile,
    )
    deps = trigger_dbt_cli_command(
        "dbt deps",
        project_dir=f"{project_root}/dbt/",
        **trigger_kwargs
    )

    hesiod_etl = trigger_dbt_cli_command(
        f"dbt run -s +exposure:hesiod_etl --vars '{{client_key: {client_key}}}'",
        project_dir=f"{project_root}/dbt_projects/hesiod/",
        **trigger_kwargs
    )

Yeah, that would work. Should something like that be documented as an example? The currently documented examples only cover calling a single dbt task.

Sure, that'd be great! Would you be willing to contribute a small PR for that?

Sure, where should I document this?

README.md should be fine. Thanks!