How we execute dbt™ runs faster and cheaper

Introducing smart run for dbt Core™
April 6, 2023

Do you find yourself wasting time looking through your data lineage DAG, trying to figure out which models to run? Would you like to cut the cost of running your dbt models? Do your dbt runs take forever?

To solve these problems for ourselves, the Fivetran analytics team developed "smart run for dbt Core™," our way of running only the models that need to run, without anyone having to work out by hand how to craft the dbt run command.

Imagine the following scenario… 

You’re working on a code change that touches many models. You’ve made changes to the models in red below, and you want to know how your changes impact the model in green. Purple models are those that haven’t been touched.

The cheapest way in terms of time and compute cost to run the new sequence is to:

  1. Copy the tables for the models labeled “C” from the production schema to your development schema. This ensures that the models you run read data as fresh as the production environment. Note that the Copy command is free!
  2. Run the models labeled “R”.
  3. Ignore all models labeled “I”.
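The three steps above can be sketched in Python. This is a minimal illustration, not the actual dbt_smart_run.py code: the function names, the `parents` mapping, and every model name other than xactly_quotas and quota_attainment are hypothetical. It assumes a model must run if it lies on a path from a changed model to a target, and that a model is copied if it is an unchanged direct input of a model that runs.

```python
from collections import deque


def reachable(starts, edges):
    """Return every node reachable from `starts` via `edges` (starts included)."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def classify_models(parents, changed, targets):
    """Label each model 'R' (run), 'C' (copy from production) or 'I' (ignore).

    parents: dict mapping a model name to the list of models it selects from.
    changed: set of models edited in this code change.
    targets: set of models whose new output we want to inspect.
    """
    # Invert the parent map so the DAG can also be walked downstream.
    children = {}
    for model, ups in parents.items():
        children.setdefault(model, [])
        for up in ups:
            children.setdefault(up, []).append(model)

    upstream_of_targets = reachable(targets, parents)
    downstream_of_changes = reachable(changed, children)

    # Run: models affected by a change that also feed a target.
    run = upstream_of_targets & downstream_of_changes
    # Copy: unchanged direct inputs of the models that will run.
    copy = {p for m in run for p in parents.get(m, ()) if p not in run}

    labels = {}
    for model in children:  # children has a key for every model in the DAG
        if model in run:
            labels[model] = "R"
        elif model in copy:
            labels[model] = "C"
        else:
            labels[model] = "I"
    return labels


# Hypothetical DAG for illustration only.
example_parents = {
    "stg_xactly": [],
    "stg_salesforce": [],
    "stg_marketing": [],
    "xactly_quotas": ["stg_xactly"],
    "sales_bookings": ["stg_salesforce"],
    "marketing_report": ["stg_marketing"],
    "quota_attainment": ["xactly_quotas", "sales_bookings"],
}
example_labels = classify_models(
    example_parents, changed={"xactly_quotas"}, targets={"quota_attainment"}
)
for name, label in sorted(example_labels.items()):
    print(label, name)
```

In this made-up DAG, only xactly_quotas and quota_attainment are labeled R. Their unchanged inputs, stg_xactly and sales_bookings, are labeled C, and everything else is labeled I.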

That’s why the Fivetran Analytics team developed a Python script, “smart run for dbt Core™.”

Now, analysts at Fivetran don’t need to worry about this problem. They run `python3 dbt_smart_run.py` and it does the heavy lifting.

Consider the following example: assume we have changed the xactly_quotas model and nothing else. What is the least expensive way, in both time and cost, to understand how our change affects the quota_attainment model at the end?

We don’t actually need to run the entire tree. We can ignore some models, copy others, and run only the necessary models (see image above). That is exactly what smart run does automatically. You specify the target model, and it works out and does the rest.

Now, without wasting any time thinking, the analyst can simply run:

$ python3 dbt_smart_run.py -targets quota_attainment

We (the internal analytics team at Fivetran) chose to develop this for the following reasons:

  • The same command is used regardless of how complicated or simple your dbt run is. There is no need to remember different commands. We encourage our analysts to always use smart run for dbt Core™.
  • It does not rely on manifest.json.
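The post does not describe how the script discovers model dependencies without manifest.json. One possible approach, shown here purely as an assumption, is to scan each model's SQL for dbt `ref()` calls. The helper names and the regular expression below are hypothetical, and the pattern only handles the simple single-argument form `{{ ref('model_name') }}`.

```python
import re
from pathlib import Path

# Matches {{ ref('model') }} or {{ ref("model") }}; two-argument and
# versioned refs are deliberately out of scope for this sketch.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def parents_from_sql(sql_by_model):
    """Map each model name to the sorted list of models it references."""
    return {
        model: sorted(set(REF_PATTERN.findall(sql)))
        for model, sql in sql_by_model.items()
    }


def load_models(models_dir):
    """Read every .sql file under models_dir, keyed by file name without extension."""
    return {
        path.stem: path.read_text()
        for path in Path(models_dir).rglob("*.sql")
    }


# Hypothetical model SQL for illustration only.
example_sql = {
    "xactly_quotas": "select 1 as rep_id, 100 as quota",
    "quota_attainment": (
        "select * from {{ ref('xactly_quotas') }} "
        'join {{ ref("sales_bookings") }} using (rep_id)'
    ),
}
example_dag = parents_from_sql(example_sql)
print(example_dag)
```

Calling `parents_from_sql(load_models("models"))` on a real project would give a model-to-parents mapping built from the source files alone, at the cost of missing anything the simple pattern does not cover.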

Great, how can I use this on my team?

Check out the code here.

