In Prefect 1.0, there is a distinction between build time and runtime:
- When a flow is registered (i.e. at build time), Prefect stores either the location of the flow in Storage (GitHub, S3, Docker, etc.) when using a flow with stored_as_script=True, or the pickled Flow object when using pickle storage. For more about pickle vs script storage, check out: Storage | Prefect Docs
- During a flow run execution (i.e. at runtime), Prefect pulls the flow from the storage location and runs it.
1. Registering the flow on one machine and running it on another one
Flows using the default Local storage can only be run from the same machine
If users don’t specify any storage, it defaults to Local storage, which is:
- A serialized version of the flow stored in the ~/.prefect/flows folder when registering the flow using flow.register(),
- A location pointing to the local file on your machine when registering a flow using the CLI.
In both cases, the flow is retrieved from a local file at runtime. Therefore, Prefect by default assumes that you run this flow on the same machine from which you registered it.
Using Local storage during registration on a machine “A” but trying to run it on a machine “B”
The ModuleNotFoundError usually happens when you register the flow from your development machine and then try to run it on a different machine (or in a container) that doesn’t have that local file.
Failed to load and execute Flow’s environment: ModuleNotFoundError(“No module named ‘C’”)
This has been answered in this GitHub discussion: Failed to load and execute Flow's environment: ModuleNotFoundError ("No model named 'C'") · PrefectHQ/prefect · Discussion #5143
Possible solutions
If you are running a flow on a different machine than the one from which you registered it, you need to use a remote storage class such as:
- one of the Git storage classes (e.g. GitHub, GitLab, Git, Bitbucket or CodeCommit),
- or one of the cloud storage classes (e.g. S3, GCS or Azure),
so that the flow can be pulled from that location.
If you need an example on how to use various storage classes, check out this repo:
If you still want to use Local storage, register the flow from the agent machine. For instance, if your agent runs on a remote VM, SSH into it, copy your flow file to the correct location on that machine, and register your flow from there.
2. Missing code dependencies in the flow’s execution environment
A similar error often occurs when code dependencies are not packaged properly. Instead of ModuleNotFoundError(“No module named ‘/Users/username’”), you usually see an error like this:
FlowStorageError('An error occurred while unpickling the flow:
ModuleNotFoundError("No module named \'YOUR_MODULE_NAME\'")
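To see why unpickling can fail this way, here is a minimal sketch that uses only the standard library. It assumes that functions from importable modules are pickled by reference (module name plus function name) rather than by value, which is how the standard pickle module behaves and, as far as I know, also how the pickler used by Prefect treats importable modules. The module name my_flow_utils_demo and its clean function are hypothetical stand-ins for your own code dependencies.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical helper module standing in for your own code dependencies.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "my_flow_utils_demo.py").write_text(
    "def clean(x):\n    return x.strip()\n"
)

# "Registration" side: the module is importable, so pickling succeeds.
sys.path.insert(0, str(module_dir))
utils = importlib.import_module("my_flow_utils_demo")
# The payload records only the reference "my_flow_utils_demo.clean",
# not the source code of the function.
payload = pickle.dumps(utils.clean)

# "Execution" side: simulate an environment where the module is missing.
sys.path.remove(str(module_dir))
del sys.modules["my_flow_utils_demo"]
importlib.invalidate_caches()

error_name = None
try:
    pickle.loads(payload)
except ModuleNotFoundError as exc:
    error_name = type(exc).__name__
    print(f"Unpickling failed: {exc}")

# Making the module importable again lets the same payload load.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
restored = pickle.loads(payload)
print(restored("  ok  "))
```

Each of the solutions below is a different way of making the missing module importable in the environment where the flow run executes.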
Possible solutions to fix missing code dependencies
Start your agent from the same directory as your code dependencies
You could start your agent from a specific project directory to help Prefect find your extra modules:
prefect agent local start -p /Users/your_username/path/to/your_modules
Append your custom module to your PYTHONPATH
import sys
sys.path.append("/path/to/your_module")
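If the flow file may be loaded more than once in the same process, a small guard keeps the same entry from being appended repeatedly. This is only a sketch: ensure_on_path is a hypothetical helper name, not part of Prefect, and the directory is a placeholder. Whichever form you use, the sys.path change has to run before the import of your custom module.

```python
import sys
from pathlib import Path


def ensure_on_path(directory: str) -> bool:
    """Append directory to sys.path once; return True if it was added."""
    resolved = str(Path(directory).expanduser().resolve())
    if resolved in sys.path:
        return False
    sys.path.append(resolved)
    return True


# Placeholder location: replace it with the directory containing your module.
ensure_on_path("/path/to/your_module")

# The import of your own module must come after the sys.path change, e.g.:
# import your_module
```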
Package and install your code dependencies within your agent environment
Or you can install the package containing your utility functions within your agent’s execution environment. This repository shows how you can do that:
Package your code dependencies with a Docker container
This blog post covers this process in-depth:
https://discourse.prefect.io/t/the-simple-guide-to-productionizing-data-workflows-with-docker-by-kevin-kho/453