When I run my flow, I see an error: Failed to load and execute Flow's environment: ModuleNotFoundError("No module named '/Users/username'"). What is happening?

In Prefect 1.0, there is a distinction between build time and runtime:

  • When a flow is registered (i.e. at build time), Prefect stores either the location of the flow in Storage (GitHub, S3, Docker, etc.) when the flow uses stored_as_script=True, or the pickled Flow object when it uses pickle storage. For more about pickle vs script storage, check out: Storage | Prefect Docs
  • During a flow run execution (i.e. at runtime), Prefect pulls the flow from the storage location and runs it.

1. Registering the flow on one machine and running it on another one

Flows using the default Local storage can only be run from the same machine

If you don’t specify any storage, Prefect defaults to Local storage, which is either:

  1. A serialized version of the flow stored in the ~/.prefect/flows folder when registering the flow using flow.register(),
  2. A location pointing to the local file on your machine when registering a flow using the CLI.

In both cases, the flow is retrieved from a local file at runtime. Therefore, Prefect by default assumes that you run this flow on the same machine from which you registered it.

Using Local storage during registration on a machine “A” but trying to run it on a machine “B”

This ModuleNotFoundError usually occurs when you register the flow from your development machine and then try to run it on a different machine (or in a container) that doesn’t have that local file.
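To see how a file path can end up inside a ModuleNotFoundError at all, here is a minimal standard-library sketch. It assumes (this is not confirmed by anything above) that a loader which cannot find the registered file falls back to treating the stored location as a module name. The path and the load_by_module_name helper are hypothetical and are not Prefect APIs.

```python
import importlib

# Hypothetical location recorded at registration time on machine "A".
REGISTERED_LOCATION = "/Users/username/.prefect/flows/my-flow.prefect"


def load_by_module_name(location: str):
    """Treat `location` as a dotted module name (assumed fallback behaviour)."""
    return importlib.import_module(location)


error_message = None
try:
    # On machine "B" nothing importable matches this string, so Python's
    # import machinery reports the path-like prefix as a missing module.
    load_by_module_name(REGISTERED_LOCATION)
except ModuleNotFoundError as exc:
    error_message = str(exc)

print(error_message)
```

Whatever the exact internal mechanism, the takeaway is the same: the location stored at registration only makes sense on the machine where the file actually exists.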

Failed to load and execute Flow’s environment: ModuleNotFoundError (“No model named ‘C’”)

This has been answered in this GitHub discussion: Failed to load and execute Flow's environment: ModuleNotFoundError ("No model named 'C'") · PrefectHQ/prefect · Discussion #5143 · GitHub

Possible solutions

If you are running a flow on a different machine than the one from which you registered it, you need to use a remote storage class such as:

  • one of Git storage classes (e.g. GitHub, GitLab, Git, Bitbucket or CodeCommit)
  • or one of cloud storage classes (e.g. S3, GCS or Azure)

so that the flow can be pulled from that location.

If you need an example on how to use various storage classes, check out this repo:

If you still want to use Local storage, register the flow from the agent machine. For instance, if your agent runs on a remote VM, SSH into it, copy your flow file to the right location on that machine, and register your flow from there.

2. Missing code dependencies in the flow’s execution environment

A similar error often occurs when code dependencies are not packaged properly. Instead of ModuleNotFoundError("No module named '/Users/username'") you usually see an error like this:

FlowStorageError('An error occurred while unpickling the flow:
ModuleNotFoundError("No module named \'YOUR_MODULE_NAME\'")
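The unpickling failure can be reproduced without Prefect. The sketch below uses only the standard pickle module; my understanding is that Prefect 1.0 serializes flows with cloudpickle, which likewise stores functions from importable modules by reference (module name plus function name) rather than by value. The module name my_project_utils_demo is a hypothetical stand-in for your own code dependency.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical helper module standing in for your own code dependency.
MODULE_NAME = "my_project_utils_demo"

with tempfile.TemporaryDirectory() as build_dir:
    Path(build_dir, f"{MODULE_NAME}.py").write_text(
        "def clean(value):\n    return value.strip()\n"
    )

    # "Build time": the module is importable, so pickling succeeds.
    sys.path.insert(0, build_dir)
    module = importlib.import_module(MODULE_NAME)
    payload = pickle.dumps(module.clean)

    # Simulate the "runtime" environment: the module is no longer importable.
    sys.path.remove(build_dir)
    del sys.modules[MODULE_NAME]

importlib.invalidate_caches()

unpickle_error = None
try:
    # Unpickling re-imports the module by name, which now fails.
    pickle.loads(payload)
except ModuleNotFoundError as exc:
    unpickle_error = str(exc)

print(unpickle_error)
```

The pickled payload contains only a reference to the function, so the module itself has to be installed or importable wherever the flow run executes.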

Possible solutions to fix missing code dependencies

Start your agent from the same directory as your code dependencies

You could start your agent from a specific project directory to help Prefect find your extra modules:

prefect agent local start -p /Users/your_username/path/to/your_modules
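As a rough illustration of why this helps: a child process can import modules from any directory listed in its PYTHONPATH. The sketch below shows that general mechanism with the standard library only. It is not the agent's actual code, and the assumption that the -p paths reach the flow run process this way is mine. The module name agent_path_demo is hypothetical.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as modules_dir:
    # Hypothetical module that only exists in this extra directory.
    Path(modules_dir, "agent_path_demo.py").write_text("VALUE = 42\n")

    # Prepend the directory to PYTHONPATH for the child process only.
    child_env = dict(os.environ)
    existing = child_env.get("PYTHONPATH")
    child_env["PYTHONPATH"] = (
        modules_dir if not existing else os.pathsep.join([modules_dir, existing])
    )

    # The child interpreter can import the module because of PYTHONPATH.
    result = subprocess.run(
        [sys.executable, "-c", "import agent_path_demo; print(agent_path_demo.VALUE)"],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )

child_output = result.stdout.strip()
print(child_output)
```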

Append your custom module's directory to sys.path (the in-process equivalent of PYTHONPATH)

import sys

sys.path.append("/path/to/your_module")
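Note that the append only helps if it runs before the import of your module. Here is a small self-contained sketch of that ordering, using a temporary directory and a hypothetical module named flow_helpers_demo in place of "/path/to/your_module":

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical directory and module standing in for "/path/to/your_module".
modules_dir = tempfile.mkdtemp()
Path(modules_dir, "flow_helpers_demo.py").write_text("GREETING = 'hello'\n")

# The path change has to happen before the import statement runs.
if modules_dir not in sys.path:
    sys.path.append(modules_dir)

import flow_helpers_demo  # noqa: E402  (deliberately placed after the path change)

print(flow_helpers_demo.GREETING)
```

In a flow file, this means putting the sys.path.append call at the very top, above any imports of your own modules.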

Package and install your code dependencies within your agent environment

Or you can install the package containing your utility functions within your agent’s execution environment. This repository shows how you can do that:

Package your code dependencies with a Docker container

This blog post covers this process in-depth:
https://discourse.prefect.io/t/the-simple-guide-to-productionizing-data-workflows-with-docker-by-kevin-kho/453

Additional solutions from this Stack Overflow question answered by Chris White:

If you are using Local Storage + a Local Agent, you need to make sure the module_that_was_not_found directory is on your local importable Python PATH. There are a few ways of doing this:

  • run your Prefect Agent in the tutorial directory; your Local Agent’s path will then be inherited by the flows it submits
  • manually add the module_that_was_not_found/ directory to your global python path (I don’t recommend this)
  • add the module_that_was_not_found/ directory to your Agent’s path with the -p CLI flag; for example: prefect agent start -p ~/Developer/prefect/examples/module_that_was_not_found (this is the approach I recommend)

A similar question about packaging custom modules with ECSRun and GitHub storage

View in #prefect-community on Slack

@Jan_Nitschke: Hi Prefect, I want to run a flow on ECS and use GitHub as storage. My python code imports modules.
The flow definition looks something like:

from tasks import my_task
from prefect.storage import GitHub
from prefect import Flow
from prefect.run_configs import ECSRun


storage = GitHub(
    repo="repo",  # name of repo
    path="path/to/myflow.py",  # location of flow file in repo
    access_token_secret="GITHUB_ACCESS_KEY",  # name of personal access token secret
)

with Flow(name="foobar",
          run_config=ECSRun(),
          storage=storage) as flow:
    my_task()

The problem seems to be that the GitHub storage only clones the single file and not the entire project which causes my import to fail. (ModuleNotFoundError("No module named 'tasks'") ) I’ve seen that there has been some discussion around this issue but it hasn’t really helped me to solve the issue… Is my only option to clone the repo into the custom image that I use for my ECS task? But that would mean that I would have to rebuild that image every time I change something to my underlying modules, right?

@Anna_Geller: Since ECS tasks run your flows within Docker containers, you would need to package your code into a Docker image. If you need an example of how you can package your custom modules within a Dockerfile and push it to ECR, check out this repo

also this blog post

GitHub: GitHub - anna-geller/packaging-prefect-flows: Examples of various flow deployments for Prefect 1.0 (storage and run configurations)

Medium: The Simple Guide to Productionizing Data Workflows with Docker

@Jan_Nitschke: ok, thanks for the quick response and for the samples :slightly_smiling_face:

If you see the error:

ModuleNotFoundError: No module named 'prefect'

Try this solution: