New dbt recipe allowing you to rerun your dbt DAG from a failed node - by Alex Streed

Hi Prefectionists! :wave:

If you are using Prefect with dbt, we have a great new recipe you should try out! :cook:
@desertaxle has just released a recipe that lets you rerun a dbt DAG from the node that failed, instead of rerunning every model.

Get the recipe code

https://github.com/PrefectHQ/customer-success-recipes/tree/main/prefect/flows/dbt/rerun-models-from-failure

The problem this recipe helps to solve

When running a suite of dbt models, sometimes a few models may fail due to exogenous circumstances (e.g., network issues or improper authorization). Retrying the entire DbtShellTask with all your dbt models may be expensive and time-consuming, especially if only one of those models failed.

Solution

You can extend your dbt Prefect flow by adding a task that can rerun only failed models. The recipe uses the --select flag together with --defer and --state to only rerun your dbt DAG from a failed node:

dbt build --select result:error+ --defer --state ./target
  • The value for the --state flag can be modified to point to another location where dbt state artifacts are stored.

  • The flow uses the all_failed trigger to ensure that the rerun task only runs if the initial dbt task fails.

  • For more details, check out the README.


The flow code is located here:
https://github.com/PrefectHQ/customer-success-recipes/blob/main/prefect/flows/dbt/rerun-models-from-failure/flow.py
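
To illustrate the idea outside of Prefect, here is a minimal Python sketch of the same "run, then rerun only from the failed nodes" logic. It is not the recipe's flow.py: the function names (rerun_command, shell_runner, build_with_rerun) are hypothetical, and it assumes the initial command is a plain dbt build and that dbt is on your PATH. Only the rerun command itself comes from the post above.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes a command and returns its exit code (0 means success).
Runner = Callable[[Sequence[str]], int]


def rerun_command(state_dir: str = "./target") -> List[str]:
    """Build the dbt command that reruns errored nodes and their descendants."""
    return [
        "dbt", "build",
        "--select", "result:error+",
        "--defer",
        "--state", state_dir,  # where the previous run's artifacts are stored
    ]


def shell_runner(command: Sequence[str]) -> int:
    """Run a command in a subprocess and return its exit code."""
    return subprocess.run(list(command)).returncode


def build_with_rerun(run: Runner = shell_runner, state_dir: str = "./target") -> int:
    """Run `dbt build` (an assumed initial command); on failure, rerun from failed nodes."""
    exit_code = run(["dbt", "build"])
    if exit_code == 0:
        # Nothing failed, so there is nothing to rerun.
        return 0
    # Mirrors the all_failed trigger: the rerun only happens after a failure.
    return run(rerun_command(state_dir))
```

Calling build_with_rerun() with no arguments shells out to dbt. Passing your own run callable makes it possible to try the control flow without a dbt project.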

Wow, this is a really nice use of the capabilities the dbt CLI offers.

I have a bit of a hard time seeing when an immediate retry is useful, as most failures come from upstream issues rather than flakes, network issues and such. But that could be very different with other warehouses or organisations, of course. It probably doesn’t hurt either: if it saves you from waking up to failure alerts some days, that’s probably worth it.

Another idea here could be to add a manual trigger for the rerun task, so that you just have to click the manual approval once you’ve fixed any upstream issues. I’ve found myself kicking off a new flow and manually giving it a rerun command like dbt run --select failed_model+


@noahholm I recently got a response from dbt about WHEN to use this feature; sharing in case it’s useful:

  • Timeout issues with the database (you can simply rerun the Prefect flow as is)
  • Lack of permissions to read/write tables, schemas, etc. (you need to manually run grant statements in the database before running this Prefect flow again)
  • Another dbt job is touching the same table as the current job and causing concurrency issues at the database level (you can simply rerun the Prefect flow as is)

The flow code and README have moved to the following location: