What happens if the agent dies in the middle of a flow run? Will the run be able to finish, and will its state be reflected in the backend?

The agent is intended to run as a lightweight, long-running process that constantly polls for scheduled runs. Your actual execution layer can be something different, e.g., a Kubernetes cluster.
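As a rough mental model, the split between polling and execution could be sketched like this. This is a toy illustration, not Prefect's actual agent code: `poll_once`, `run_agent`, `fetch_scheduled_runs`, and `submit_run` are hypothetical names introduced only for this example.

```python
import time
from typing import Callable, Iterable, List, Optional


def poll_once(
    fetch_scheduled_runs: Callable[[], Iterable[str]],
    submit_run: Callable[[str], None],
) -> List[str]:
    """Fetch the IDs of due runs and hand each one to the execution layer."""
    submitted = []
    for run_id in fetch_scheduled_runs():
        # The agent only submits the run; the run itself executes elsewhere
        # (for example, in a Kubernetes cluster).
        submit_run(run_id)
        submitted.append(run_id)
    return submitted


def run_agent(
    fetch_scheduled_runs: Callable[[], Iterable[str]],
    submit_run: Callable[[str], None],
    interval_seconds: float = 10.0,
    max_polls: Optional[int] = None,
) -> None:
    """Poll repeatedly; max_polls=None means poll until interrupted."""
    polls = 0
    while max_polls is None or polls < max_polls:
        poll_once(fetch_scheduled_runs, submit_run)
        polls += 1
        time.sleep(interval_seconds)
```

In this sketch the loop holds no run state of its own, which is why the process doing the polling can stay lightweight while the heavy work happens in the execution layer.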

What happens when the agent process is interrupted?

  • Runs deployed by a Docker, ECS, or Kubernetes agent can continue, because they execute as separate processes outside the agent. Even so, it’s best to treat the agent as a long-running service and not interrupt it.
  • Runs deployed by a local agent cannot continue.

Take action when an agent becomes unhealthy

Prefect Cloud also has an Automation that lets you take action when an agent becomes unhealthy. The action could be sending a message to the Ops team or triggering an automated process via WebhookAction.
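If you go the webhook route, the receiving side could look roughly like the sketch below, which uses only the Python standard library. The payload keys `agent` and `status`, the port, and the names `format_alert`, `AlertHandler`, and `serve` are assumptions made for illustration; check what your webhook is actually configured to send before relying on any field.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def format_alert(payload: dict) -> str:
    """Build a readable alert message from a webhook payload.

    The keys "agent" and "status" are hypothetical; adapt them to the
    payload your webhook actually sends.
    """
    agent = payload.get("agent", "unknown agent")
    status = payload.get("status", "unhealthy")
    return f"Agent '{agent}' reported as {status}"


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            # Reject bodies that are not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        # Replace print() with your own remediation or notification step.
        print(format_alert(payload))
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the receiver and block until interrupted."""
    HTTPServer(("0.0.0.0", port), AlertHandler).serve_forever()
```

Calling `serve()` starts the listener. From there, the `print` call is the place to hook in whatever automated process you want to trigger, such as restarting the agent.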