How and when does a Kubernetes agent clean up Kubernetes jobs after flow run completion?

The Prefect agent queries for Kubernetes jobs that carry a Prefect label:

  jobs = self.batch_client.list_namespaced_job(
      namespace=self.namespace,
      label_selector="prefect.io/identifier",
      limit=20,
      _continue=_continue,
  )
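
The limit and _continue arguments suggest that the agent pages through the job list rather than fetching it in one call. A minimal sketch of such a pagination loop, assuming a client object that behaves like kubernetes.client.BatchV1Api; the helper name iter_prefect_jobs and the page_size parameter are hypothetical and not part of Prefect:

```python
from typing import Any, Iterator


def iter_prefect_jobs(
    batch_client: Any, namespace: str, page_size: int = 20
) -> Iterator[Any]:
    """Yield every job carrying the prefect.io/identifier label, page by page.

    Hypothetical helper for illustration; batch_client is assumed to expose
    list_namespaced_job like kubernetes.client.BatchV1Api does.
    """
    _continue = None
    while True:
        page = batch_client.list_namespaced_job(
            namespace=namespace,
            label_selector="prefect.io/identifier",
            limit=page_size,
            _continue=_continue,
        )
        yield from page.items
        # The API server returns a continue token while more pages remain.
        _continue = page.metadata._continue
        if not _continue:
            break
```

With the real client, this would be called on a BatchV1Api instance after loading the cluster configuration.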

Then, the agent checks whether the job has a flow run ID label and tries to retrieve that ID:

flow_run_id = job.metadata.labels.get("prefect.io/flow_run_id")

if not flow_run_id:
    # Do not attempt to process a job without a flow run id; we need
    # the id to manage the flow run state
    self.logger.warning(
        f"Cannot manage job {job_name!r}, it is missing a "
        "'prefect.io/flow_run_id' label."
    )
    continue

Now, having the flow run ID, Prefect can check the flow run state:

try:
    # Do not attempt to process a job with an invalid flow run id
    flow_run_state = self.client.get_flow_run_state(flow_run_id)
except ObjectNotFoundError:
    self.logger.warning(
        f"Job {job.metadata.name!r} is for flow run {flow_run_id!r} "
        "which does not exist. It will be ignored."
    )
    continue
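
The two guards above can be read as a single lookup step that either returns a state or tells the loop to skip the job. A minimal sketch, assuming a hypothetical get_state callable that stands in for self.client.get_flow_run_state and a local FlowRunNotFound exception that stands in for ObjectNotFoundError; neither name comes from Prefect:

```python
from typing import Any, Callable, Optional


class FlowRunNotFound(Exception):
    """Stand-in for ObjectNotFoundError (hypothetical, for illustration)."""


def lookup_flow_run_state(
    job: Any, get_state: Callable[[str], Any]
) -> Optional[Any]:
    """Return the flow run state for a job, or None if the job should be skipped."""
    labels = job.metadata.labels or {}
    flow_run_id = labels.get("prefect.io/flow_run_id")
    if not flow_run_id:
        # No label: the job cannot be tied to a flow run.
        return None
    try:
        return get_state(flow_run_id)
    except FlowRunNotFound:
        # The label points at a flow run that does not exist.
        return None
```

Returning None for both cases mirrors the two continue statements: in either situation the agent logs a warning and moves on to the next job.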

Then, once the job has finished, whether it succeeded or failed, the agent deletes the Kubernetes job, provided that delete_finished_jobs is enabled. Jobs that are still running are left alone:

delete_job = job.status.failed or job.status.succeeded
# deleting the job:
if delete_job and self.delete_finished_jobs:
    self.logger.debug(f"Deleting job {job_name}")
    try:
        self.job_pod_event_timestamps.pop(job_name, None)
        self.batch_client.delete_namespaced_job(
            name=job_name,
            namespace=self.namespace,
            body=kubernetes.client.V1DeleteOptions(
                propagation_policy="Foreground"
            ),
        )
    # (the error handling for this try block is omitted from this excerpt)
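
The deletion condition can be isolated as a small predicate to make its behavior explicit. A minimal sketch; should_delete_job is a hypothetical helper, and failed and succeeded are assumed to be the pod counts from the job status, which may be None:

```python
from typing import Optional


def should_delete_job(
    failed: Optional[int],
    succeeded: Optional[int],
    delete_finished_jobs: bool = True,
) -> bool:
    """Mirror the excerpt's condition: the job finished and deletion is enabled."""
    finished = bool(failed or succeeded)
    return finished and delete_finished_jobs


print(should_delete_job(failed=None, succeeded=1))         # True
print(should_delete_job(failed=1, succeeded=None))         # True
print(should_delete_job(failed=None, succeeded=None))      # False
print(should_delete_job(None, 1, delete_finished_jobs=False))  # False
```

Under this condition a failed job is removed just like a succeeded one; only jobs that have not finished, or agents with deletion disabled, keep the job around.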

Source code

Docs

Make sure delete_finished_jobs is set to True so that finished jobs are deleted; this is the default setting.

API reference