Hi there,
I’m just getting started with Prefect on a local ML workflow use case.
I have some models that I’d like to chain, feeding the output of model A into model B, or looping over a model a few times before moving on to other computation tasks.
I tried isolating each model inside a Task, and using prefect to orchestrate and monitor the whole process.
From my tests, it seems that Prefect keeps the result of each computation task around, even if I don’t specify that I want to retry on failure. Is there a way to discard the result of each task once it’s been processed and fed to the next ones?
The code below is a minimal reproduction of the issue: the process gradually occupies all available memory if create_weights is a task, but everything is fine if we remove its decorator.
```python
import numpy as np
from prefect import flow, task

@task
def create_weights(n):
    return np.random.randn(n, n)

@flow
def infer():
    for i in range(50):
        print(i)
        create_weights(10000)

if __name__ == '__main__':
    infer()
```
More generally, is Prefect suitable for my use case, or are Tasks not expected to return large amounts of data or GPU objects?
Thanks for your help and this amazing tool!