What's the relationship between the agent, executor, storage and FlowRunner and how do they affect local vs. backend-based flow run execution?

Agent vs. Executor

This Stack Overflow answer gives a good overview of the relationship between the agent and the executor. Here is a copy:

Agents represent the local infrastructure that a Flow can and should execute on, as specified by that Flow’s RunConfig. If a Flow should only run on Docker (or Kubernetes, or ECS, or whatever else) then the Flow Run is served by that Agent only. Agents can serve multiple Flows, so long as those Flows are all supported by that particular infrastructure. If a Flow Run is not tied to any particular infrastructure, then a UniversalRun is appropriate, and can be handled by any Agent. Most importantly, the Agent guarantees that the code and data associated with the Flows are never seen by the Prefect Server, by submitting requests to the server for Flows to run, along with updates on Flows in progress.
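The matching rule described above can be sketched as a toy model. This is only an illustration under stated assumptions: `ToyRunConfig`, `ToyAgent`, and `can_serve` are hypothetical names invented here, not Prefect classes or methods, and the real agents consider more than the infrastructure type.

```python
from dataclasses import dataclass


# Toy stand-ins for illustration only; these are NOT Prefect classes.
@dataclass(frozen=True)
class ToyRunConfig:
    infrastructure: str  # e.g. "docker", "kubernetes", or "universal"


@dataclass(frozen=True)
class ToyAgent:
    infrastructure: str  # the one kind of infrastructure this agent manages

    def can_serve(self, run_config: ToyRunConfig) -> bool:
        # A universal run config is not tied to any infrastructure,
        # so any agent may pick it up.
        if run_config.infrastructure == "universal":
            return True
        # Otherwise the flow run must target this agent's infrastructure.
        return run_config.infrastructure == self.infrastructure


docker_agent = ToyAgent("docker")
local_agent = ToyAgent("local")

docker_run = ToyRunConfig("docker")
universal_run = ToyRunConfig("universal")

print(docker_agent.can_serve(docker_run))    # Docker run on a Docker agent
print(local_agent.can_serve(docker_run))     # Docker run on a local agent
print(local_agent.can_serve(universal_run))  # universal run on any agent
```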

Executors, on the other hand, are responsible for the actual computation: that is, actually running the individual Tasks that make up a Flow. The Agent manages execution at a high level by calling submit on Tasks in the appropriate order, and by handling the results that the Executor returns. Because of this, an Executor has no knowledge of the Flow as a whole; it knows only about the Tasks that it received from the Agent. All Tasks in a single Flow are required to use the same Executor, but an Agent may communicate with different Executors for different Flows. Similarly, Executors can serve multiple Flow Runs, but at the Task level only.
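The division of labour described above, where one side resolves the order of Tasks and the other only computes what it is handed, can be sketched with the standard library alone. This is a toy model, not Prefect code: `run_flow`, the task names, and the `upstream` mapping are hypothetical, and `concurrent.futures.ThreadPoolExecutor` merely stands in for an executor. It assumes Python 3.9+ for `graphlib`.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+


def run_flow(tasks, upstream, executor):
    """Submit each task once its upstream results are available.

    `tasks` maps a task name to a callable; `upstream` maps a task name to
    the list of task names whose results it consumes. The executor only
    ever receives a single callable plus arguments, never the whole graph.
    """
    results = {}
    # static_order() yields every task after all of its upstream tasks.
    for name in TopologicalSorter(upstream).static_order():
        args = [results[dep] for dep in upstream.get(name, ())]
        future = executor.submit(tasks[name], *args)
        # Waiting here keeps the sketch simple; it runs tasks one at a time.
        results[name] = future.result()
    return results


def extract():
    return [1, 2, 3]


def double(values):
    return [2 * v for v in values]


def total(values):
    return sum(values)


tasks = {"extract": extract, "double": double, "total": total}
upstream = {"extract": [], "double": ["extract"], "total": ["double"]}

with ThreadPoolExecutor(max_workers=2) as pool:
    results = run_flow(tasks, upstream, pool)

print(results["total"])  # 12
```

In this sketch the ordering logic lives entirely in `run_flow`, while the pool sees only individual calls. That is roughly the sense in which the executor has no knowledge of the Flow as a whole.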

In specific terms:

  • For a Docker Agent and a Dask Executor, there would be a Docker container that would manage resolution of the DAG and status reports back to the server. The actual computation of each Task’s results would take place outside of that container though, on a Dask Distributed cluster.
  • For a Docker Agent and a Local Executor, the container would perform the same roles as above. However, the computation of the Tasks’ results would also occur within that container (“local” to that Agent).
  • For a Local Agent and a Dask Executor, the machine that registered as the agent would manage DAG resolution and communication to the Server as a standalone process on that machine, instead of within a container. The computation for each Task though would still take place externally, on a Dask Distributed cluster.

In short, the Agent sits between the backend (Prefect Cloud or Prefect Server) and the Executor, acting as a custodian for the lifetime of a Flow Run and delineating the separation of concerns for each of the other components.

:point_down: Slack discussion about the same topic