Agents and work queues FAQ

What is the relationship between an agent, a work queue, and a deployment?

What happens when an agent picks up a deployed flow run from a work queue?

https://discourse.prefect.io/t/what-happens-when-agent-picks-up-a-deployed-flow-run-from-a-work-queue/1464

What are use cases where assigning multiple work queues to a single agent can be beneficial?

One example is prioritizing work within a single agent process:

prefect work-queue create prio1 --limit 20
prefect work-queue create prio2 --limit 5
prefect agent start -q prio1 -q prio2

In this example, the agent simultaneously polls for work from both queues, but because work queue prio1 is considered more important, it has a higher concurrency limit than the prio2 work queue.

Runs from deployments assigned to the prio1 work queue are effectively prioritized over runs from deployments assigned to the prio2 work queue within a single agent process.
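As a sketch, a deployment can be pointed at one of these queues when it is built. The commands below follow the Prefect 2.x CLI; the flow path and names are illustrative, and flags may differ across versions:

```shell
# build a deployment and assign it to the prio1 work queue
# (./flows/etl.py:etl and the deployment name "prod" are placeholders)
prefect deployment build ./flows/etl.py:etl --name prod -q prio1

# register the generated deployment with the API
prefect deployment apply etl-deployment.yaml
```

Scheduled runs of this deployment will then land on the prio1 queue, where the agent started with `-q prio1 -q prio2` can pick them up.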

What are use cases where starting multiple agent processes for the same work queue can be beneficial?

  • If users don’t use a cluster such as Kubernetes and need to spread the load, they can run, e.g., multiple EC2 instances, each running an agent process polling from the same work queue.

  • While Prefect doesn’t guarantee that the runs would be equally load-balanced (the runs are picked up at random by the agents with matching work queue names), each run will be picked up from the work queue only once. Prefect leverages idempotency keys to avoid duplicate work and prevent situations where a single run could be simultaneously picked up by two agents polling from the same work queue.
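In this setup, every machine runs the same agent command against the shared queue; the queue name below is illustrative:

```shell
# run on each EC2 instance (or other VM) that should share the load;
# all agents poll the same queue, and idempotency keys ensure each
# flow run is handed to exactly one of them
prefect agent start -q shared-queue
```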

Does a work queue provide a mechanism to spread the load across tasks?

https://discourse.prefect.io/t/does-a-work-queue-provide-a-mechanism-to-spread-the-load-across-tasks/1615

For more about various deployment patterns, check out:

Hi! A question regarding agents: is it possible to limit the concurrency of flow runs per agent and queue rather than just per queue? In our case we have a number of workers all running the same flow; however, the tasks run for hours, often at (or near) 100% CPU usage across all cores, so they cannot handle more than one of these long-running flow runs at once. Our current solution is to split the deployments and queues by machine, but this means we have to manage the scheduling manually, which becomes difficult as the number of flow runs and machines increases over time. It would be fantastic to have them all poll the same queue for work but run at most one of a given flow per agent. Alternatively, allowing a deployment to be associated with multiple work queues would probably also work, but I understand from other docs there are reasons this isn’t available.

There’s a lot of flexibility in terms of how you could design the process here.

  1. You could split this long-running process into smaller parts e.g. leveraging subflows
  2. You could leverage the orchestrator pattern to coordinate codependent deployments (see: How to create a flow run from deployment (orchestrator pattern)?)
  3. You can set a concurrency limit of 1 per queue and run multiple queues and agent processes
  4. You can switch to some more reliable execution layer e.g. leverage the power of Kubernetes for better resource utilization

It’s already possible to have multiple agents polling from the same work queue, with a concurrency limit set on that work queue. It might be worth cross-checking the flow design to ensure not everything runs as part of one large flow, to better separate that load.

Unfortunately, the flow is a constraint programming problem, so the solver/problem is tightly coupled and cannot be broken down into smaller tasks to be distributed. I’ll have a look into point 2, as this could be something that might work longer term. Point 3 is what we are doing currently; however, it means that we would need to write some of the scheduling logic into our application to use it moving forward. I think moving to Kubernetes might be the way to go longer term, but I suspect there would be a performance penalty (there appears to be one for cloud versus bare metal) and costs would be prohibitive at the moment.

My understanding here is that this would limit the concurrency of the whole queue, not of a given agent. If I had a queue with 5 agents and a concurrency limit of 5, then there would be nothing stopping 2 of the flow runs from going to the same agent and potentially overwhelming it. Even being able to set a flow run concurrency per agent (independent of the work queue) as a flag when starting it would allow us to limit runs on our solver-specific machines.

I understand, but concurrency limits are set on a work queue, and work queues are very lightweight: nothing stops you from creating more work queues and more agent processes, even on a single machine. Think of them more as lightweight processes.