Persistent objects when running a flow as a service

dima-p · August 29, 2022, 8:33am

Hello Prefecters!

I am new to Prefect, and have a question regarding persistent objects.
I have a Python object that takes a long time to load. It is essentially a service for all my flows. Also, my agents execute only one flow at a time (no concurrency).
Today, each time the agent executes my flow, a new “python” process is started and the object is loaded into memory again, which adds a very long overhead.
I would like to load the object when the agent is initialized, keep it alive, and pass this instance to my flow. The flow would then use the object instance and avoid the long “load” step.

How can this be done in Prefect 2.0, and what are the limitations of such an approach?

Thanks!
Dima

anna_geller · August 29, 2022, 10:46am

It looks like you want to run your flow as a persistent service running 24/7 and executing only one flow continuously. In that case, this recipe might be helpful:

dima-p · August 29, 2022, 11:49am

Hi Anna, thanks for the quick reply!

This is indeed a good option. Let me ask two clarifying questions.
Our system processes a live data stream. A (non-Python) service sends an event for each new chunk of data that becomes available.
The Python pipeline (based on Prefect) will eventually run a prediction with a pre-trained deep-learning network on the new data chunk.
The prediction itself takes a few seconds, while loading the deep-learning model into memory and importing all the Python libraries might take up to one minute.

I can see two alternative solutions for my question:

1. Streaming service based on a Kafka listener.
   In this option I would build a Prefect flow that runs 24/7, first consumes the events from Kafka, and then executes the relevant computation flow, split into tasks of course.
   A question: does open-source Prefect 2.0 provide a Kafka consumer task out of the box? I found a KafkaBatchConsume object in Prefect 1.0, but can’t find one in Prefect 2.0.
2. A persistent server, triggered by an external call.
   In this alternative the Prefect flow is called by an external trigger. However, as explained in the first post of this thread, I do not want to load Python and the object each time the flow is run.
   Is there a solution in Prefect for such a requirement?
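
To make the first alternative concrete, here is a minimal sketch of the "load once, then loop over events" pattern in plain Python. Everything in it is an assumption for illustration: `SlowModel`, `process_chunk`, and `serve` are hypothetical names, the `events` iterable stands in for a Kafka consumer, and the short `sleep` stands in for the one-minute model load. No Prefect API is used; in a Prefect setup the call inside the loop is where a flow or task call would presumably go.

```python
import time
from typing import Iterable, List


class SlowModel:
    """Hypothetical stand-in for a deep-learning model that is slow to load."""

    load_count = 0  # counts how many times the expensive load happened

    def __init__(self) -> None:
        SlowModel.load_count += 1
        time.sleep(0.01)  # placeholder for the real load (up to a minute)

    def predict(self, chunk: List[float]) -> float:
        # Placeholder "prediction": the mean of the chunk.
        return sum(chunk) / len(chunk)


def process_chunk(model: SlowModel, chunk: List[float]) -> float:
    """Per-event work; reuses the already loaded model."""
    return model.predict(chunk)


def serve(events: Iterable[List[float]]) -> List[float]:
    """Load the model once, then handle every incoming event in this process."""
    model = SlowModel()  # expensive step, done a single time
    results = []
    for chunk in events:  # assumption: a Kafka consumer would be iterated here
        results.append(process_chunk(model, chunk))
    return results
```

For example, `serve([[1.0, 2.0, 3.0], [4.0, 6.0]])` handles both chunks with a single model load. The trade-off is that the process has to stay alive between events, which is exactly what the 24/7 service option implies.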

Thanks
Dima

anna_geller · August 29, 2022, 11:29pm

This is a really interesting use case! You may consider an alternative #3: caching the task that loads the deep learning model into memory, so that you don’t need to repeat that step every time you need a prediction.

To answer your questions directly:

Not atm, but we have this user contribution:
