Modular Data Stack — Build a Data Platform with Prefect, dbt and Snowflake

Orchestration platforms historically allowed managing dependencies only within individual data pipelines. The typical result was a tangle of DAGs and brittle engineering processes: to avoid breaking anything, you would think twice before touching the underlying logic.

Today, data practitioners, especially data platform engineers, are crossing the boundaries of teams, repositories, and data pipelines. Running things on a regular schedule alone doesn’t cut it anymore. Some dataflows are event-driven or triggered via ad-hoc API calls. To meet the demands of the rapidly changing world, data practitioners need to react quickly, deploy frequently, and have an automated development lifecycle with CI/CD.

This post dives into the “buckets of pain” data platform engineers still struggle with despite the Modern Data Stack, and discusses the desired solution and implementation design at a conceptual level. Hands-on demos follow in later parts of the series.

Part 2:

Part 3:

Part 4:

Part 5:

Part 6:

Part 7:

Summary of all parts:

The code for the entire blog post series is included in the prefect-dataplatform repository: