Horizontal scaling on Kubernetes & handling Workers termination

pudelskip · November 2, 2023, 10:58am

Hello
I’m running a self-hosted Prefect instance (version 2.13.1) on a Kubernetes cluster, and I’m having trouble setting up horizontal scaling of my pods correctly. My current architecture consists of one Kubernetes Deployment running the Prefect Server, three separate work pools, and three Kubernetes Deployments, one per work pool, each running Prefect Workers (of the process type) and scaled by an HPA (Horizontal Pod Autoscaler).

Scaling up and adding new pods with Prefect Workers is relatively straightforward. The problem arises when pods are scaled down and terminated. I’ve noticed (please correct me if I’m wrong) that there is no built-in mechanism to wait for currently running flows to finish, apart from a grace period. In addition, flow runs interrupted this way remain in the RUNNING state indefinitely and are never marked as CRASHED/CANCELLED.

My questions are:

1. Is there a recommended way, or a built-in solution, for handling the termination of Prefect Workers while they are still processing flows?
2. Do you have any general tips for horizontally scaling a Prefect instance, or for setting up a similar architecture, that address the issues I’ve mentioned above?
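
For illustration, the scale-down behaviour described above (stop picking up new work on termination, then wait for in-flight runs up to a grace period) can be sketched in plain Python. This is only a minimal sketch and not Prefect's API: `DrainingWorker`, `submit`, `drain`, and `install_sigterm_handler` are hypothetical names, and threads stand in for flow-run processes.

```python
import signal
import threading
import time


class DrainingWorker:
    """Hypothetical worker: refuses new work after a stop request and
    waits for in-flight runs to finish, up to a grace period."""

    def __init__(self, grace_seconds: float = 30.0):
        self.grace_seconds = grace_seconds
        self.stop_requested = threading.Event()
        self._in_flight = []
        self._lock = threading.Lock()

    def request_stop(self, signum=None, frame=None):
        # Usable as a signal handler: only flips a flag, does no real work.
        self.stop_requested.set()

    def submit(self, fn, *args) -> bool:
        # Once termination has been requested, no new runs are accepted.
        if self.stop_requested.is_set():
            return False
        thread = threading.Thread(target=fn, args=args)
        with self._lock:
            self._in_flight.append(thread)
        thread.start()
        return True

    def drain(self):
        """Wait for in-flight runs until the grace period expires.

        Returns the threads that are still running at the deadline, i.e.
        the runs that would be cut off and need their state reconciled.
        """
        deadline = time.monotonic() + self.grace_seconds
        with self._lock:
            threads = list(self._in_flight)
        for thread in threads:
            remaining = deadline - time.monotonic()
            thread.join(timeout=max(0.0, remaining))
        return [thread for thread in threads if thread.is_alive()]


def install_sigterm_handler(worker: DrainingWorker) -> None:
    # Must be called from the main thread. Kubernetes sends SIGTERM to the
    # container's main process when the pod is being terminated.
    signal.signal(signal.SIGTERM, worker.request_stop)
```

In a setup like this, the grace period would have to fit within the pod's `terminationGracePeriodSeconds`, because the container is killed with SIGKILL once that period elapses. Anything still returned by `drain()` at that point corresponds to the runs that currently stay in the RUNNING state.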
