How do I clean up job resources from Kubernetes deployments?

After a Kubernetes flow deployment completes, it would be awesome if the created nodes were scaled down to 0 and all residual job resources needed to run the flow were deleted. In Kubernetes 1.23 you can set a TTL field (ttlSecondsAfterFinished) in the job manifest, as described in "Automatic Clean-up for Finished Jobs" in the Kubernetes docs, which might accomplish this. Unfortunately, EKS only goes up to 1.22, so that feature is not available there yet.

I see that this issue currently exists: "Add a KubernetesFlowRunner configuration to automatically delete Kubernetes jobs after a flow run" (Issue #5755, PrefectHQ/prefect on github.com). I also see that we can pass customizations to KubernetesJob to patch the job manifest (see prefect.infrastructure.kubernetes.KubernetesJob in the Prefect 2.0 docs).
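To make the idea concrete, here is a rough sketch of what combining the TTL field with the KubernetesJob customizations might look like. It assumes the customizations are expressed as a JSON 6902 patch list and that the cluster honors ttlSecondsAfterFinished. The 60-second value, the `apply_add_ops` helper, and the `base_job` manifest are hypothetical stand-ins for illustration; they are not Prefect's own code or its real base manifest.

```python
import copy

# Hypothetical TTL: ask Kubernetes to delete the finished Job 60 seconds after it completes.
TTL_SECONDS = 60

# A JSON 6902 style patch that adds the TTL field to the Job spec.
ttl_patch = [
    {"op": "add", "path": "/spec/ttlSecondsAfterFinished", "value": TTL_SECONDS},
]


def apply_add_ops(manifest, patch):
    """Apply "add" operations whose paths address nested dict keys only.

    Illustrative helper, not a full JSON Patch implementation: it does not
    handle list indices, escapes, or any operation other than "add".
    """
    patched = copy.deepcopy(manifest)  # leave the input manifest untouched
    for op in patch:
        if op["op"] != "add":
            raise ValueError(f"unsupported operation: {op['op']}")
        *parents, leaf = op["path"].lstrip("/").split("/")
        target = patched
        for key in parents:
            target = target[key]
        target[leaf] = op["value"]
    return patched


# Minimal stand-in for a base Job manifest (assumption, not Prefect's template).
base_job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {"template": {"spec": {"restartPolicy": "Never", "containers": []}}},
}

patched_job = apply_add_ops(base_job, ttl_patch)
print(patched_job["spec"]["ttlSecondsAfterFinished"])
```

With Prefect installed, the same `ttl_patch` list would presumably be passed as `KubernetesJob(customizations=ttl_patch)`. Note that the TTL only removes the finished Job and its pods; scaling the nodes themselves down is a separate concern handled by whatever provisions them.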

Any advice on how to approach this would be great! With GPUs, it would be costly if the created nodes persisted. Right now, I am running my flows on EKS Fargate to test. Thank you!


Interesting question - I’d use https://karpenter.sh/ for that purpose, it works exactly as you described

Okay got it, thank you! We definitely plan to use Karpenter for provisioning our GPU instances. For the functional tests of our flows, before deploying on GPUs, we were planning to use EKS Fargate. I am not sure if Karpenter can provision Fargate nodes.


It can! Check out this recipe from their docs:
