opensource-observer / oso

Measuring the impact of open source software

Home Page: https://opensource.observer

Docs: Add some dbt guidelines

ryscheng opened this issue

What is it?

We should link to the dbt best-practices docs on how to structure and style staging, intermediate, and mart models:
https://docs.getdbt.com/best-practices/how-we-structure/1-guide-overview
https://docs.getdbt.com/best-practices/how-we-style/1-how-we-style-our-dbt-models
https://docs.getdbt.com/blog/stakeholder-friendly-model-names

And we should add our own guidelines:

  • Ensure consistent entity and source naming in all int tables. For instance, to_id in the events table should be to_artifact_id. We should aim to use id, source, network, namespace, etc. consistently.

  • Avoid complex marts. Push all complexity to int tables. Marts should simply be a direct copy or less granular version of an int table. This is a point @ravenac95 made previously.

  • Enumerate all columns in the mart explicitly, rather than using * statements. This makes it easier to trace changes through version control, and an upstream change that renames or drops a column in an int model will make the mart fail when dbt builds it instead of silently changing the mart's schema.
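
To illustrate the last guideline, here is a minimal sketch of what happens when an upstream column is renamed. It uses Python's built-in sqlite3 as a stand-in for the warehouse, not dbt itself, and the table name int_events and the columns event_time and amount are hypothetical (only to_artifact_id and to_id come from the example above). The explicit mart fails loudly, while the * mart builds and quietly picks up the new schema.

```python
import sqlite3

# Hypothetical mart queries; table and column names are illustrative only.
MART_EXPLICIT = "select event_time, to_artifact_id, amount from int_events"
MART_STAR = "select * from int_events"


def build_mart(conn, upstream_columns, mart_sql):
    """Recreate the upstream int table, then materialize the mart from it.

    Returns the mart's column names, or raises sqlite3.OperationalError
    if the mart query references a column that no longer exists.
    """
    conn.execute("drop table if exists mart")
    conn.execute("drop table if exists int_events")
    conn.execute(f"create table int_events ({', '.join(upstream_columns)})")
    conn.execute(f"create table mart as {mart_sql}")
    return [row[1] for row in conn.execute("pragma table_info(mart)")]


conn = sqlite3.connect(":memory:")
original = ["event_time", "to_artifact_id", "amount"]
renamed = ["event_time", "to_id", "amount"]  # simulated upstream rename

# Both styles build fine against the original upstream schema.
explicit_before = build_mart(conn, original, MART_EXPLICIT)

# After the rename, the * mart still builds, but its schema has changed.
star_after = build_mart(conn, renamed, MART_STAR)

# The explicit mart refuses to build, which surfaces the upstream change.
try:
    build_mart(conn, renamed, MART_EXPLICIT)
    explicit_failed = False
except sqlite3.OperationalError:
    explicit_failed = True
```

In a real dbt project the same queries would live in model files and select from the int model via ref(); the failure would then come from the warehouse when the mart is run.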

Adding some fixes to code_metrics here
#1332

Fixes to onchain_metrics here
#1333

Added style guide here
#1334

Filing separate issue for cleaning up the models themselves
#1335