Controlling memory usage

Hi there,

I’m just getting started with Prefect for a local ML workflow use case.
I have some models I’d like to chain, feeding the output of model A into model B, or looping over a model a few times before moving on to other computation tasks.
I tried isolating each model inside a Task and using Prefect to orchestrate and monitor the whole process.

From my tests, it seems that Prefect keeps the result of each computation task around, even though I don’t specify that I want to retry on failure. Is there a way to discard each task’s result once it has been processed and fed to the downstream tasks?

This code is a minimal reproduction of the issue: the process gradually occupies all available memory when create_weights is a task, but everything is fine if its decorator is removed.

import numpy as np
from prefect import flow, task

@task
def create_weights(n):
    return np.random.randn(n, n)

@flow
def infer():
    for i in range(50):
        print(i)
        create_weights(10000)

if __name__ == '__main__':
    infer()
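As a sanity check that the growth comes from something retaining a reference to every result (rather than from numpy itself), here is a stdlib-only sketch of the two behaviours; bytearray stands in for the weight matrix, and the numbers are just illustrative:

```python
import sys

def create_weights(n):
    # Stand-in for np.random.randn(n, n): one large allocation.
    return bytearray(n)

SIZE = 10_000_000  # ~10 MB per "weight matrix"

# Behaviour A: something keeps a reference to every result
# (the pattern I seem to observe with task return values).
retained = [create_weights(SIZE) for _ in range(5)]
held_bytes = sum(sys.getsizeof(b) for b in retained)

# Behaviour B: each result is dropped once consumed, so only one
# allocation is live at a time.
live_bytes = 0
for _ in range(5):
    w = create_weights(SIZE)
    live_bytes = max(live_bytes, sys.getsizeof(w))
    # `w` is rebound on the next iteration, so the old buffer is freed.

print(held_bytes // live_bytes)  # → 5
```

With a task that keeps all 50 results of shape (10000, 10000) alive, the same arithmetic explains why the repro above eats memory linearly.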

More generally, is Prefect suitable for my use case, or are Tasks not expected to return large amounts of data or GPU objects?

Thanks for your help and for this amazing tool!
