This is a template for creating a fully functional dbt project for teaching, learning, writing, demoing, or any other scenario where you need a basic project with a synthesized jaffle shop business. We recommend beginners use the following steps to open this project right here on GitHub in a Codespace. If you're a little more experienced with devcontainers and want to go faster 🏎️, you can use the Gitpod link above for a quicker startup and deeper feature set.
This will create a new repository exactly like this one, and navigate you there. Make sure to execute the next instructions in that repo.
This will create a new codespace, a sandboxed devcontainer with everything you need for a dbt project. Once the codespace is finished setting up, you'll be ready to run a dbt build.
After the container is built and VSCode connects to it, VSCode will run a few cleanup commands and then a postCreateCommand, a set of commands run after the container is set up. This is where we install our dependencies, such as dbt, the duckdb adapter, and other necessities, as well as run dbt deps to install the dbt packages we want to use. That screen will look something like the above; when it's completed, it will close and leave you at a fresh terminal prompt. From there you're ready to do some analytics engineering!
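Once the terminal prompt returns, one quick way to confirm that the setup commands finished is to check that the expected tools are on your PATH. The following is a minimal sketch, not part of the template: `check_tools` is a hypothetical helper, and the tool list (dbt, meltano, npm) is an assumption based on the commands this README runs.

```shell
# check_tools is a hypothetical helper: it prints the names of any
# commands from its argument list that are not found on PATH.
check_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Strip the leading space left by the first append.
  echo "${missing# }"
}

# Assumed tool list, taken from the commands used in this README.
result=$(check_tools dbt meltano npm)
if [ -z "$result" ]; then
  echo "all tools found"
else
  echo "missing: $result"
fi
```

If anything is reported missing, rebuilding the container so the setup commands run again is likely the simplest fix.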
This template includes two additional tools for the other parts of the stack to create a more realistic experience:
- BI reporting built with Evidence - an open source, code-based BI tool to write reports with markdown and SQL.
- EL with Meltano - an open source tool that provides a CLI & version control for ELT pipelines.
With Evidence you can:
- Version control your BI layer
- Build reports in the same repo as your dbt project
- Deploy your reports to a static site
To run Evidence, use:
cd reports
npm run dev
See the Evidence CLI docs for more details.
You can make changes to the markdown pages in the reports/pages folder and see the reports update in the browser preview.
This project is preconfigured with Meltano, which can be used to extract and load raw data into DuckDB.
meltano run tap-jaffle-shop target-duckdb
Optionally, you can modify extract parameters using environment variables. For instance, this modified version will extract five years of data instead of the default one year.
TAP_JAFFLE_SHOP_YEARS=5 meltano run tap-jaffle-shop target-duckdb
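Whether the variable actually reaches Meltano depends on how it is set: a plain assignment on its own line stays local to the current shell, while an inline prefix or an export passes it to the child process. The sketch below illustrates the difference using `sh -c` as a stand-in for the meltano command, so it runs without Meltano installed; the variable names `plain`, `inline`, and `exported` are introduced here for illustration only.

```shell
# Start from a clean slate so earlier exports do not affect the result.
unset TAP_JAFFLE_SHOP_YEARS

# 1. Plain assignment on its own line: the child process does not see it.
TAP_JAFFLE_SHOP_YEARS=5
plain=$(sh -c 'echo "${TAP_JAFFLE_SHOP_YEARS:-unset}"')

# 2. Inline prefix: set only for this one command, and visible to it.
inline=$(TAP_JAFFLE_SHOP_YEARS=5 sh -c 'echo "${TAP_JAFFLE_SHOP_YEARS:-unset}"')

# 3. Export: visible to every later command in this shell session.
export TAP_JAFFLE_SHOP_YEARS
exported=$(sh -c 'echo "${TAP_JAFFLE_SHOP_YEARS:-unset}"')

echo "plain=$plain inline=$inline exported=$exported"
```

The inline form is convenient for a one-off run; exporting is the better fit when several meltano commands in the same session should share the setting.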
You can also modify any tap or target config with the interactive config command:
meltano config tap-jaffle-shop set --interactive
meltano config target-duckdb set --interactive
This project is optimized for running in a container. If you'd like to use it locally, outside of a container, you'll need to follow the instructions below.
- Create a python virtual environment and install the dependencies.
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
- Install Meltano with pipx, then install the project's Meltano plugins.
pipx install meltano
meltano install
- Run the EL pipeline.
meltano run el
- Install dbt dependencies and build the dbt project.
dbt deps
dbt build
- Install Evidence dependencies and run the Evidence server.
cd reports
npm install
npm run dev
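The local steps above can also be chained so that the sequence stops at the first failure. This is a rough sketch under stated assumptions, not a script shipped with the template: `run`, `setup_local`, and the `DRY_RUN` switch are hypothetical names introduced here, and the commands are copied from the steps above.

```shell
# run is a hypothetical wrapper: with DRY_RUN=1 it prints the command,
# otherwise it executes it in the current shell.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# setup_local mirrors the manual steps and returns at the first failure.
setup_local() {
  # Python virtual environment and dependencies.
  run python3 -m venv .venv || return 1
  run . .venv/bin/activate || return 1
  run pip install -r requirements.txt || return 1
  # Meltano and its plugins, then the EL pipeline.
  run pipx install meltano || return 1
  run meltano install || return 1
  run meltano run el || return 1
  # dbt packages and build.
  run dbt deps || return 1
  run dbt build || return 1
  # Evidence: this changes the caller's working directory, as the manual steps do.
  run cd reports || return 1
  run npm install || return 1
  run npm run dev
}
```

Setting DRY_RUN=1 before calling setup_local prints the planned commands without executing anything, which is a cheap way to review the sequence first. Note that the last step starts the Evidence dev server and keeps running until you stop it.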
We welcome issues and PRs requesting or adding new features. The package that generates the synthetic data, jafgen, is also under active development, and will add more types of source data to model as we go along. If you have tests, descriptions, new models, metrics, materialization types, or techniques you use this repo to demonstrate, which you feel would make for a more expansive baseline experience, we encourage you to consider contributing them back in so that this project becomes an even better collective tool for exploring and learning dbt over time.