How do you write incremental models in DBT?

If you have a date field you can use it to load only the last data. I.e. you have transaction_date column.

{{ config(
  materialized='incremental',
    as_columnstore=false, 
  pre_hook="""
    {% if is_incremental() %}
      delete from {{this}} 
      where transaction_date >= '{{ (modules.datetime.datetime.now() - modules.datetime.timedelta(2)).isoformat() }}'
    {% endif %}  
  """
)}}

SELECT
   u.user_id, t.transaction_id
FROM users u
JOIN transactions t ON t.user_id = u.user_id
WHERE u.active = 1

{% if is_incremental() %}
 and transaction_date >= '{{ (modules.datetime.datetime.now() - modules.datetime.timedelta(2)).isoformat() }}'
{% endif %}

The first time you run this model it will ignore all the code under "if is_incremental()". All the upcoming runs it will clean up transactions for the last two days and reload them.


As @anders-swanson wrote in his comment, if transaction_id is definitely unique, you could set it as the unique_key and materialize your model as an incremental table.

dbt's docs explain how to do this. Using your example, it might be:

{{
    config(
        materialized='incremental',
        unique_key='transaction_id'
    )
}}

select
   u.user_id, t.transaction_id
from users u
join transactions t ON t.user_id = u.user_id
where u.active = 1

If transaction_id is not unique but transaction_id||user_id is, you could try creating a new column which concatenates these columns in an upstream dbt model, and then assigning that as the unique_key:

{{
    config(
        materialized='incremental',
        unique_key='pkey'
    )
}}

select
   u.user_id, 
   t.transaction_id,
   u.user_id||t.transaction_id as pkey
from users u
join transactions t ON t.user_id = u.user_id
where u.active = 1

Otherwise, you'll have to pull in a column that is either a) unique, or b) has an ordered quality that could be used to apply an is_incremental() filter (like @viacheslav-nefedov wrote).