How to allocate more memory or more worker nodes on a per flow run basis?

anna_geller · January 31, 2022, 10:29pm

It’s hard to give a single general answer because it depends on your infrastructure.

If you want to pass custom keyword arguments to a Dask cluster class ad hoc for each flow run, you can set the DaskExecutor’s cluster_class to a function that builds the cluster at runtime. That function can read values such as n_workers from a Parameter task through prefect.context.parameters, as follows:

import prefect
from prefect import Flow, Parameter
from prefect.executors import DaskExecutor

def dynamic_executor():
    from distributed import LocalCluster

    # Any other cluster class works here too, e.g. dask_cloudprovider.aws.FargateCluster
    return LocalCluster(n_workers=prefect.context.parameters["n_workers"])

with Flow(
    "dynamic_n_workers", executor=DaskExecutor(cluster_class=dynamic_executor)
) as flow:
    flow.add_task(Parameter("n_workers", default=5))

As a result, you can start each new flow run with a different value of n_workers, chosen ad hoc when you trigger the run.
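If the cluster function reads more than one parameter, it can help to validate the values before the cluster is created. Here is a minimal sketch of that idea. The helper name `cluster_kwargs_from_parameters` and the optional `worker_memory` parameter are assumptions for illustration and are not part of Prefect or Dask; `n_workers` and `memory_limit` are regular LocalCluster keyword arguments.

```python
from typing import Any, Dict, Mapping

DEFAULT_N_WORKERS = 5  # mirrors the Parameter default used in the flow above


def cluster_kwargs_from_parameters(parameters: Mapping[str, Any]) -> Dict[str, Any]:
    """Build LocalCluster keyword arguments from flow-run parameters.

    Hypothetical helper: it only validates and renames values,
    it does not create a cluster itself.
    """
    n_workers = int(parameters.get("n_workers", DEFAULT_N_WORKERS))
    if n_workers < 1:
        raise ValueError(f"n_workers must be >= 1, got {n_workers}")

    kwargs: Dict[str, Any] = {"n_workers": n_workers}

    # "worker_memory" is an assumed, optional parameter name. LocalCluster
    # takes the per-worker limit as `memory_limit`, e.g. "2GiB".
    memory = parameters.get("worker_memory")
    if memory is not None:
        kwargs["memory_limit"] = str(memory)
    return kwargs


# Possible use inside the cluster function shown above:
#     return LocalCluster(**cluster_kwargs_from_parameters(prefect.context.parameters))
```

With this in place, a run started with n_workers=0 fails early with a clear error instead of failing somewhere inside Dask.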

The second option is to assign more memory in your run configuration on a per-flow-run basis. For example, you can override the memory_request set on a KubernetesRun from the UI when you start a run:

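from prefect import Flow
from prefect.run_configs import KubernetesRun

# FLOW_NAME and STORAGE are assumed to be defined elsewhere in your project.
with Flow(
    FLOW_NAME,
    storage=STORAGE,
    run_config=KubernetesRun(
        labels=["k8s"],
        cpu_request=0.5,
        memory_request="2Gi",  # default request; can be overridden per flow run
    ),
) as flow:
    ...  # add your tasks here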

The run configuration above requests 2 GiB of memory by default. If a flow run ends with an out-of-memory (OOM) error, you can trigger a new flow run from the UI with a higher memory request, without re-registering the flow.
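The same override can probably be scripted instead of clicked. The sketch below is not from the original answer: `bump_memory_request` and `rerun_with_more_memory` are hypothetical helper names, and it assumes that the Prefect 1.x `Client.create_flow_run` accepts a `run_config` argument that replaces the registered run configuration for that single run. Please check this against the Prefect version you use.

```python
import re

# Whole-number Kubernetes memory quantities with binary suffixes, e.g. "2Gi".
_QUANTITY = re.compile(r"(\d+)(Ki|Mi|Gi|Ti)")


def bump_memory_request(current: str, factor: int = 2) -> str:
    """Multiply a memory quantity such as "2Gi" by an integer factor."""
    match = _QUANTITY.fullmatch(current)
    if match is None:
        raise ValueError(f"unsupported memory quantity: {current!r}")
    if factor < 1:
        raise ValueError(f"factor must be >= 1, got {factor}")
    amount, unit = match.groups()
    return f"{int(amount) * factor}{unit}"


def rerun_with_more_memory(flow_id: str, current: str = "2Gi") -> str:
    """Start a new flow run that requests more memory; returns the flow run ID."""
    # Imported lazily so bump_memory_request works without Prefect installed.
    from prefect import Client
    from prefect.run_configs import KubernetesRun

    run_config = KubernetesRun(
        labels=["k8s"],
        cpu_request=0.5,
        memory_request=bump_memory_request(current),
    )
    # Assumption: run_config here overrides the flow's registered run config.
    return Client().create_flow_run(flow_id=flow_id, run_config=run_config)
```

Doubling the request is an arbitrary choice here; pick whatever factor matches the memory profile of your flow.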

The last option is to change the executor values directly in your flow definition:

import coiled
from prefect.executors import DaskExecutor

flow.executor = DaskExecutor(
    cluster_class=coiled.Cluster,
    cluster_kwargs={
        "software": "user/software_env_name",
        "shutdown_on_close": True,
        "name": "prefect-cluster",
        "scheduler_memory": "4 GiB",
        "worker_memory": "8 GiB",
    },
)

This works as long as you use script storage (e.g. one of the Git storage classes such as GitHub, Git, GitLab, or Bitbucket) rather than pickle storage. Once you commit your code with a modified value of worker_memory, the change should be reflected in your next flow run, because metadata about the executor is not stored in the backend. It’s retrieved from your flow storage at runtime.
