I’m deploying my flows to GCP Cloud Run from a push work pool, which I set up by following the Serverless Push Work Pools guide in the Prefect docs.
Due to my own mistake, I was saving too much data to disk, and since the Cloud Run filesystem is held in memory, I ran into an out-of-memory error. (Nothing to do with Prefect, PEBCAK)
However, it seems the default Cloud Run Jobs behaviour is to retry a failed ‘task’ (not a Prefect task, but a task in the Cloud Run sense) 3 times before giving up, and Prefect does not seem to reflect this in the UI very well.
To give a specific example of a Prefect task that was restarted multiple times in the job UI:
You can see here that this task ‘Finished in state Completed()’ 4 times, even though it shows a run count of 1, since it was repeatedly restarted by Cloud Run itself.
Cloud Run permits setting the --max-retries field via the CLI / YAML. Is it possible to define a setting in the work pool’s ‘Base Job Template’ so that Cloud Run does not retry?
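To make the question concrete, here is a rough sketch of the kind of change I have in mind. It assumes the base job template embeds a Cloud Run v1 Job manifest under job_configuration.job_body, which I have not confirmed for push work pools. In a Cloud Run v1 Job manifest the retry setting is spec.template.spec.template.spec.maxRetries. The helper name disable_cloud_run_retries and the sample template are made up for illustration.

```python
import copy

# Path to the Cloud Run TaskSpec inside the template.
# ASSUMPTION: the manifest sits under job_configuration.job_body;
# check the actual 'Base Job Template' of your work pool first.
TASK_SPEC_PATH = (
    "job_configuration", "job_body",
    "spec", "template", "spec", "template", "spec",
)


def disable_cloud_run_retries(base_job_template: dict) -> dict:
    """Return a copy of the template with Cloud Run task retries set to 0."""
    template = copy.deepcopy(base_job_template)  # leave the input untouched
    node = template
    for key in TASK_SPEC_PATH:
        # Create intermediate levels if the template does not define them.
        node = node.setdefault(key, {})
    node["maxRetries"] = 0
    return template


# Hypothetical, heavily trimmed template used only to exercise the helper.
sample_template = {
    "job_configuration": {
        "job_body": {
            "apiVersion": "run.googleapis.com/v1",
            "kind": "Job",
            "spec": {"template": {"spec": {"template": {"spec": {
                "containers": [{"image": "{{ image }}"}],
            }}}}},
        }
    },
    "variables": {},
}

patched = disable_cloud_run_retries(sample_template)
task_spec = patched["job_configuration"]["job_body"]["spec"]["template"]["spec"]["template"]["spec"]
print(task_spec["maxRetries"])
```

If the template already exposes a retry variable, setting that variable would presumably be cleaner than editing the manifest body directly.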
On a side note, I think this would make more sense as the default when the work pool is created, since I would prefer a crashed Cloud Run Job to be retried according to the configuration of the @flow / @task, rather than configuration that lives in GCP.