Controlling memory usage

Hi there,

I’m just getting started with Prefect for a local ML workflow use case.
I have some models I’d like to chain, feeding the output of model A into model B, or looping over a model a few times before moving on to other computation tasks.
I tried isolating each model inside a Task and using Prefect to orchestrate and monitor the whole process.

From my tests, it seems that Prefect keeps the result of each computation task around, even though I don’t specify that I want to retry on failure. Is there a way to discard each task’s result once it has been processed and fed to the downstream tasks?

This code is a minimal reproduction of the issue: the process gradually occupies all available memory when create_weights is a task, but everything is fine if its decorator is removed.

import numpy as np
from prefect import flow, task

@task
def create_weights(n):
    return np.random.randn(n, n)

@flow
def infer():
    for i in range(50):
        print(i)
        create_weights(10000)

if __name__ == '__main__':
    infer()
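As a sanity check that the growth comes from something retaining a reference to every result (rather than from numpy itself), here is a stdlib-only sketch of the two behaviours; bytearray stands in for the weight matrix, and the numbers are just illustrative:

```python
import sys

def create_weights(n):
    # Stand-in for np.random.randn(n, n): one large allocation.
    return bytearray(n)

SIZE = 10_000_000  # ~10 MB per "weight matrix"

# Behaviour A: something keeps a reference to every result
# (the pattern I seem to observe with task return values).
retained = [create_weights(SIZE) for _ in range(5)]
held_bytes = sum(sys.getsizeof(b) for b in retained)

# Behaviour B: each result is dropped once consumed, so only one
# allocation is live at a time.
live_bytes = 0
for _ in range(5):
    w = create_weights(SIZE)
    live_bytes = max(live_bytes, sys.getsizeof(w))
    # `w` is rebound on the next iteration, so the old buffer is freed.

print(held_bytes // live_bytes)  # → 5
```

With a task that keeps all 50 results of shape (10000, 10000) alive, the same arithmetic explains why the repro above eats memory linearly.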

More generally, is Prefect suitable for my use case, or are Tasks not expected to return large amounts of data or GPU objects?

Thanks for your help and for this amazing tool!
