Hello fellow Data Developers!
In this blogpost, I am going to provide an overview of dbt’s Jinja capabilities, and of dbt’s macros/packages.
To start, if you have not heard of dbt (data build tool), it is relatively new tool that is marketed as “the new standard for data transformation”. In short, dbt lets you quickly develop and deploy analytics code to run all of your data transformations, while connecting to your data warehouse (such as Snowflake or BigQuery).
Let’s dive in!
Jinja
What is Jinja? This is a templating language written in Python. In dbt, Jinja is used to write functional SQL. For example, Jinja can be used with DRY principles in mind and eliminate unnecessary extra code by utilizing IF/ELSE statements and FOR loops.
There are three Jinja delimiters that you must know:
- Statement Delimiter:
{% ... %}
These are used for things such as setting a variable, initiating a FOR loop, or running a IF/ELSE statement. For example, below I am setting a variable and executing a FOR loop:
{% set dog_names = ['skylord', 'sky', 'jaws'] %}{% for dog_name in dog_names %}
The current dog's name is {{ dog_name }}
{% endfor %}---OUTPUT---
The current dog's name is skylord.
The current dog's name is sky.
The current dog's name is jaws.
2. Expression Delimiter: {{ ... }}
These are used to print text to the file being rendered; specifically in dbt this is pure SQL that is compiled Jinja code. Above the {{ dog_name }}
is an example of the expression delimiter being used.
3. Comments Delimiter: {{# ... #}}
These are used to document code inline and will not be rendered to the pure SQL when compiled.
This is a high overview of Jinja, there are many other capabilities that can be utilized to really maximize the use of Jinja, a great resource is located here.
Macros
What is a Macro? In dbt, a macro is a function that is written in Jinja. This allows developers to reuse code easily and reference consistently used logic through their projects.
Here is an example of a simple macro that converts a cents value to a US dollars value. Specifically you can see there are two parameters: column_name
and decimal_places
. column_name
is the name of the column that is being transformed to dollars and decimal_places
, which defaults to 2 if not set, is the number of decimal (cents) places that are in the calculated FLOAT value.
{% macro cents_to_dollars(column_name, decimal_places=2) %}
round(1.0 * {{ column_name }} / 100, {{ decimal_places }})
{% endmacro %}
An example of it being used can be seen below in the dbt model SQL file:
select
id as order_id,
{{ cents_to_dollars('amount', 1) }} as amount
from some_order_payments_table
Packages
What are Packages? Packages are used to import models and macros into your dbt Project, these models and macros can be written internally or externally. If they are an external package, they will most likely be found at hub.getdbt.com website, which shares open-source packages.
These are imported through the packages.yml
file, which is located at the project’s root. An example of a packages.yml
file can be seen below, which imports the popular dbt_utils
package:
packages:
- package: dbt-labs/dbt_utils
version: 0.9.0
When you want to utilize a macro from an installed package you must use the following syntax:
{{ dbt_utils.date_spine(
datepart="day"
start_date="to_date('01/01/2021', 'mm/dd/yyyy')",
end_date="dateadd(week, 1, current_date)"
)
}}
Conclusion
This was a high-level overview of dbt’s powerful Jinja capabilities, Macros, and Packages. In summary: Jinja is a templating language that can enable a developer to write functional SQL, Macros are a way to utilize DRY principles and cut down on repeated code, and Packages are way to utilize open-source developed macros and models.
I hope that this quick run through is a good introduction to a few of the many powerful capabilities of dbt.
Happy Coding!