How to create a deployment that uses already available local source files?

Currently our Prefect agents are run from Docker images which contain all of our project source code. (So when we roll out new versions, our CI/CD deploys new versions of our Prefect agents that all have that code locally. I realize this is a bit at odds with how Deployments normally work, but this lets us easily run code in many environments without overwriting authoritative flow files on shared storage, etc.)

I was hoping to use a Process infrastructure block and a LocalFileSystem storage block to tell Prefect that the code is already available and can simply be executed locally inside the agent container.

My simplified definition looks like this:

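from prefect.deployments import Deployment
from prefect.filesystems import LocalFileSystem
from prefect.infrastructure import Process

from mypkg.flows.sample_data import generate_sample_data

Deployment.build_from_flow(
    name="sample-data-pipeline",
    description="A manually-triggered pipeline to generate sample data.",
    flow=generate_sample_data,
    storage=LocalFileSystem(
        basepath="/opt/app",
    ),
    infrastructure=Process(
        name="generate-sample-data-process",
    ),
    apply=True,
)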

What I haven’t been able to figure out is how to stop Prefect from copying files to the basepath specified here. For example, with the definition above it would copy the entire source tree to /opt/app. But everything needed to run the flows is already in the agent’s environment, so I’d just like to point the LocalFileSystem block at the directory where the flow files already exist.

Is this possible? Am I thinking about this all wrong?

Thanks in advance!

Edit: I did find the skip_upload flag and added it to the Deployment.build_from_flow() call. That sounded right, but when I provide a basepath that points to the root of the Python source tree, I get an error like this:

Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "/opt/app/.venv/lib/python3.10/site-packages/prefect/engine.py", line 276, in retrieve_flow_then_begin_flow_run
    flow = await load_flow_from_flow_run(flow_run, client=client)
  File "/opt/app/.venv/lib/python3.10/site-packages/prefect/client/utilities.py", line 40, in with_injected_client
    return await fn(*args, **kwargs)
  File "/opt/app/.venv/lib/python3.10/site-packages/prefect/deployments.py", line 191, in load_flow_from_flow_run
    basepath = deployment.path or Path(deployment.manifest_path).parent
  File "/usr/local/lib/python3.10/pathlib.py", line 960, in __new__
    self = cls._from_parts(args)
  File "/usr/local/lib/python3.10/pathlib.py", line 594, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/usr/local/lib/python3.10/pathlib.py", line 578, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

I’m realizing that the system wants some sort of manifest, so maybe my idea isn’t going to work out as I’d hoped.
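The failing line in the traceback can be reproduced in isolation. This is a minimal sketch, assuming that both path and manifest_path are unset (None) on the deployment, which is what the traceback suggests. FakeDeployment is a hypothetical stand-in for illustration, not a Prefect class:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class FakeDeployment:
    """Hypothetical stand-in holding only the two fields the traceback touches."""

    path: Optional[str] = None
    manifest_path: Optional[str] = None


def resolve_basepath(deployment: FakeDeployment):
    # Same expression as the line quoted in the traceback:
    # use `path` if it is set, otherwise fall back to the manifest's directory.
    return deployment.path or Path(deployment.manifest_path).parent


def basepath_or_error(deployment: FakeDeployment) -> str:
    # Report either the resolved base path or the TypeError that Path(None) raises.
    try:
        return str(resolve_basepath(deployment))
    except TypeError as exc:
        return f"TypeError: {exc}"


print(basepath_or_error(FakeDeployment(path="/opt/app")))  # /opt/app
print(basepath_or_error(FakeDeployment()))  # a TypeError, as in the traceback
```

If that assumption holds, the fallback to the manifest is only reached when path is empty, so setting path should be enough to avoid the error.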

I’ve managed to get this working, so I’m following up on my question with a short recipe in case someone else is trying to do this.

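  • Importantly, it seems that you need to explicitly set the path on the Deployment when using this approach. Judging by the traceback above, the flow loader tries deployment.path first and only then falls back to the manifest path, which is not set here.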

Here is a working minimal Deployment example:

from prefect.deployments import Deployment
from prefect.filesystems import LocalFileSystem
from prefect.infrastructure import Process

from mypkg.flows.sample_data import generate_sample_data

Deployment.build_from_flow(
    name="sample-data-pipeline",
    description="A manually-triggered pipeline to generate sample data.",
    flow=generate_sample_data,
    storage=LocalFileSystem(
        basepath="/opt/app",  # <-- I don't think this actually matters at all (?)
    ),
    path="/opt/app",  # <-- OTOH, this is critical
    infrastructure=Process(
        name="generate-sample-data-process",
    ),
    apply=True,
)

In this example, Prefect will expect the flow module to be located in the image at /opt/app/mypkg/flows/sample_data.py.
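To sanity-check where the flow file is expected to live, the path and the entrypoint can be joined by hand. This is only a sketch: it assumes the deployment's entrypoint has the form "relative/file.py:flow_function_name", recorded relative to the directory the build script was run from (that is my understanding of Prefect 2.x, not something confirmed above). expected_flow_file is a hypothetical helper, not part of Prefect:

```python
from pathlib import PurePosixPath


def expected_flow_file(deployment_path: str, entrypoint: str) -> PurePosixPath:
    """Join the deployment path with the file part of an entrypoint.

    Assumes the entrypoint looks like "relative/file.py:flow_function_name".
    """
    relative_file, separator, _flow_name = entrypoint.rpartition(":")
    if not separator or not relative_file:
        raise ValueError(f"entrypoint has no 'file:function' form: {entrypoint!r}")
    # PurePosixPath keeps the result identical on any host OS,
    # since the path refers to a location inside the Linux image.
    return PurePosixPath(deployment_path) / relative_file


flow_file = expected_flow_file(
    "/opt/app", "mypkg/flows/sample_data.py:generate_sample_data"
)
print(flow_file)  # /opt/app/mypkg/flows/sample_data.py
```

Under that assumption, running the build script from /opt/app (or from a checkout with the same layout) keeps the relative part of the entrypoint consistent with the path set on the Deployment.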
