I would like each flow in the flows/ folder to be independent of the central project and built as a separate Docker container.
At startup, builder.py searches for all flows in the flows/ folder, applies a specific configuration to each, and registers them on the server.
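The discovery step could look roughly like this (a minimal sketch; the `*_flow.py` naming convention and the function name `discover_flow_files` are assumptions inferred from the file names in this thread, not part of Prefect):

```python
from pathlib import Path


def discover_flow_files(flows_dir="flows"):
    """Return candidate flow files, one per sub-package.

    Assumes each flow package (e.g. flows/test_pack1/) contains a
    single entry point named '*_flow.py' -- a convention inferred
    from the file names mentioned in this thread.
    """
    return sorted(Path(flows_dir).glob("*/*_flow.py"))
```

builder.py could then hand each discovered path to `extract_flow_from_file`.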
But I ran into a problem with importing third-party packages. Say test_pack1/requirements.txt contains SQLAlchemy==1.4.34, test_pack1/common/test_module.py does `import sqlalchemy`, and test_pack1/test_pack1_flow.py has a @task that uses a function from test_module.py. When the FlowBuilder class looks for the flow variable in test_pack1_flow.py, it does so via `flow = extract_flow_from_file(str(flow_module))`. At this step a ModuleNotFoundError occurs, since the central Prefect application has no such dependency (in its pyproject.toml). But once the Docker container is built, after flow.register(), the dependency will of course be there. How can I handle this step? Or maybe I'm doing something wrong?
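One possible way to get past the import step at registration time (a workaround sketch, not an official Prefect feature): temporarily stub out the flow's third-party imports so the module can be loaded just far enough to extract the flow object. The module names passed in would come from each flow's own requirements.txt.

```python
import sys
from unittest import mock


def stub_missing_modules(module_names):
    """Install MagicMock stand-ins for packages that are not available
    in the current environment, so loading a flow file for inspection
    does not raise ModuleNotFoundError. The stubs never execute real
    library code; they only let the import statements succeed."""
    for name in module_names:
        try:
            __import__(name)
        except ImportError:
            sys.modules[name] = mock.MagicMock()
```

For example, calling `stub_missing_modules(["sqlalchemy"])` before `extract_flow_from_file` would let the import in test_module.py succeed. Caveat: this only works if the flow's module-level code merely imports the package and doesn't rely on its real behavior at import time.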
I use Docker storage, the DockerRun run config, and the LocalExecutor.
I mean a project containing only Prefect itself in its dependencies (in pyproject.toml). It will live on the VPS, and its area of responsibility will include launching the Prefect server and a local (Docker) agent, and checking for (registering) new flows, which will be added via CI/CD. The root project will not be coupled to individual flows and their dependencies; it is only responsible for finding and registering flows.
So you mean that I have to install all the dependencies of each individual flow into the root project? That is, in my case, move all the dependencies from the small requirements.txt files into one main pyproject.toml?
Then what should I do in this case: test_pack1_flow.py uses, for example, pandas==1.4.2, while test_pack2_flow.py uses pandas==1.2.4?
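Conflicting pins like these are exactly what per-flow images solve: each container installs only its own requirements.txt, so the two pandas versions never share an environment. A sketch of reading each flow's pins (the function name is my own, not a Prefect API):

```python
from pathlib import Path


def read_flow_requirements(flow_dir):
    """Return the pinned dependencies of a single flow package.

    Each flow ships its own requirements.txt, so two flows can pin
    conflicting versions (pandas==1.4.2 vs pandas==1.2.4) because
    each pin set ends up in its own Docker image.
    """
    lines = (Path(flow_dir) / "requirements.txt").read_text().splitlines()
    reqs = []
    for raw in lines:
        line = raw.strip()
        if line and not line.startswith("#"):  # skip blanks and comments
            reqs.append(line)
    return reqs
```

The build step for each flow's image would then install exactly this list, keeping the root project's pyproject.toml untouched.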
I think your best option is to use the CLI command. This way you don't need to build any extra functionality for this, and it makes your flows that much easier to maintain.
prefect register --project xyz -p flows/
Providing the flows/ path will loop over all flows for you and re-register a flow only when something about it has changed: if the flow's metadata didn't change, it won't be re-registered, so you can safely use this in your CI pipeline.
This topic dives deeper into it:
Not necessarily - you can install only the packages you need. I'm only suggesting that packaging your code dependencies per flow/project with setup.py allows you to install them the same way both locally AND within your production Docker image (dev/prod parity). I don't use Poetry, so I can't say how this should be done with pyproject.toml; probably the same way.
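For reference, a per-flow setup.py along those lines could be as small as this (the package name and pin are taken from the example earlier in the thread; treat it as a sketch, not a layout Prefect prescribes):

```python
# Hypothetical flows/test_pack1/setup.py -- lets you `pip install -e .`
# locally and `pip install .` inside the flow's Docker image alike.
from setuptools import find_packages, setup

setup(
    name="test_pack1",
    version="0.1.0",
    packages=find_packages(),
    install_requires=["SQLAlchemy==1.4.34"],
)
```

With this in place, the flow's Dockerfile and a developer's local environment install the same package with the same pins, which is the dev/prod parity mentioned above.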
And by the way, if you're just getting started with Prefect, perhaps it makes sense to start directly with Prefect 2.0? We don't have a public timeline for when it will be out of beta, but this would be more beneficial for you long-term: