I am new to Prefect, and have a question regarding persistent objects.
I have a Python object that takes a long time to load. It is essentially a service for all my flows. Also, my agents execute only one flow at a time (no concurrency).
Today, each time the agent executes my flow, a new Python process is started and the object is loaded into memory, which creates a very long overhead.
I would like to load the object during the initialization of the agent, keep it alive, and pass this instance to my flow. The flow would then use the object instance and avoid the long load process.
How is this possible in Prefect 2.0, and what are the limitations of such an approach?
It looks like you want to run your flow as a persistent service that runs 24/7 and executes only one flow continuously. In that case, this recipe might be helpful:
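For illustration, a minimal sketch of such a long-running flow could look like the following; the `load_model`, `predict`, and `get_next_chunk` functions are hypothetical placeholders standing in for your own loading, prediction, and polling logic:

```python
import time

from prefect import flow, task


def get_next_chunk():
    # Hypothetical helper: poll your data source and return a new chunk, or None.
    return None


@task
def load_model():
    # Hypothetical: expensive one-time load of the deep-learning model.
    ...


@task
def predict(model, chunk):
    # Hypothetical: run a few-second prediction on one chunk of data.
    ...


@flow
def prediction_service():
    # The model is loaded once when the flow starts ...
    model = load_model()
    # ... and then reused for every incoming chunk of data.
    while True:
        chunk = get_next_chunk()
        if chunk is not None:
            predict(model, chunk)
        else:
            time.sleep(1)


if __name__ == "__main__":
    prediction_service()
```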
This is indeed a good option. Let me ask two clarifying questions.
Our system processes a live data stream. There is a service (non-Python) which sends an event for each new chunk of data available.
The Python pipeline (based on Prefect) will eventually run a prediction on a pre-trained deep-learning network for the new data chunk.
The prediction itself will take a few seconds, while loading the deep-learning model into memory and importing all the Python libraries might take up to one minute.
I can see two alternative solutions for my question:
A streaming service based on a Kafka listener.
In this option I’ll build a Prefect flow that runs 24/7, first consuming the events from Kafka and then executing all the relevant computation, split into tasks of course (a rough sketch follows below, after option 2).
A question: does open-source Prefect 2.0 provide an out-of-the-box Kafka consumer task? I’ve found a KafkaBatchConsume task in Prefect 1.0, but can’t find it in Prefect 2.0.
A persistent server, triggered by an external call.
In this alternative, the Prefect flow is called by an external trigger. However, as explained in the first post of this thread, I do not want to pay the Python startup and model-loading cost each time the flow is run.
Is there a solution in Prefect for such a request?
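To make option 1 concrete, here is a rough sketch of what I have in mind. It assumes a plain confluent_kafka consumer inside the flow; the topic name, broker address, group id, and the `load_model`/`predict` tasks are placeholders, since I have not found a built-in Kafka task in Prefect 2.0:

```python
from confluent_kafka import Consumer
from prefect import flow, task


@task
def load_model():
    # Placeholder: load the pre-trained deep-learning model once (takes ~1 minute).
    ...


@task
def predict(model, payload: bytes):
    # Placeholder: run a few-second prediction on one data chunk.
    ...


@flow
def kafka_prediction_flow():
    # Load the model a single time when the 24/7 flow starts.
    model = load_model()

    # Plain Kafka consumer; topic and broker address are assumptions.
    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "prediction-service",
        "auto.offset.reset": "latest",
    })
    consumer.subscribe(["new-data-events"])

    try:
        while True:
            msg = consumer.poll(timeout=1.0)
            if msg is None or msg.error():
                continue
            # One prediction task per consumed event.
            predict(model, msg.value())
    finally:
        consumer.close()


if __name__ == "__main__":
    kafka_prediction_flow()
```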
This is a really interesting use case! You may consider an alternative #3: caching the task that loads the deep-learning model into memory so that you don’t need to recompute that step every time you need a prediction.
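As a minimal sketch, using Prefect 2.0 task caching with `task_input_hash` could look like the example below. Note that reusing the cached result across separate flow-run processes would also require result persistence and a serializable model object, so treat this as an illustration rather than a drop-in solution; the model path and function bodies are placeholders:

```python
from datetime import timedelta

from prefect import flow, task
from prefect.tasks import task_input_hash


@task(cache_key_fn=task_input_hash, cache_expiration=timedelta(hours=12))
def load_model(model_path: str):
    # Placeholder: expensive load of the pre-trained model.
    # With the same model_path, Prefect can reuse the cached result instead of
    # re-running this task (subject to result persistence and serialization).
    ...


@task
def predict(model, chunk):
    # Placeholder: few-second prediction on one chunk of data.
    ...


@flow
def prediction_flow(chunk, model_path: str = "/models/my-model.pt"):
    model = load_model(model_path)
    return predict(model, chunk)
```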