How to package dependencies into a Docker image and push it to AWS ECR?

View in #prefect-server on Slack

@Saurabh_Sharma: Hi,
I am trying to use S3 storage for the flow code as follows:

from prefect.storage import S3

STORAGE = S3(
    bucket="prefect-pipelines",
    key="flows/forecaster/flow.py",
    stored_as_script=True,
    # ensures the flow script is uploaded to S3 during registration
    local_script_path="flow.py",
)

The following command is used for registering the flow:

prefect register --project forecaster -p forecaster/

But for some reason, the modules under the forecaster directory are not getting uploaded under the specified S3 key.

Need help in solving this. Thanks in advance!
Just for reference,

@Anna_Geller: S3 storage uploads only your flow code; it won’t package your dependencies. You could package your code dependencies and custom modules into a Docker image.

This Discourse topic provides an in-depth explanation of the issue and how you can approach it. Section #2, “Missing code dependencies in the flow’s execution environment”, is relevant to your use case.

Prefect Community: When I run my flow, I see an error: Failed to load and execute Flow’s environment: ModuleNotFoundError(“No module named ‘/Users/username’”). What is happening?
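
To illustrate the suggestion above, here is a minimal sketch of Docker storage (Prefect 1.x) that copies a local package into the image through the files argument and makes it importable through env_vars. The helper names docker_storage_kwargs and build_storage, the /modules destination, and the registry and image names in the usage line are assumptions for illustration; none of them come from the thread.

```python
from pathlib import Path


def docker_storage_kwargs(module_dir: str, registry_url: str, image_name: str) -> dict:
    """Build keyword arguments for prefect.storage.Docker (Prefect 1.x).

    module_dir is a local package directory to copy into the image.
    """
    src = Path(module_dir).resolve()  # Docker storage expects absolute source paths
    dest = f"/modules/{src.name}"     # hypothetical location inside the image
    return {
        "registry_url": registry_url,
        "image_name": image_name,
        "image_tag": "latest",
        # Copy the local package into the image at build time.
        "files": {str(src): dest},
        # Make the copied package importable; this overrides any PYTHONPATH
        # already set in the base image.
        "env_vars": {"PYTHONPATH": "/modules"},
    }


def build_storage(**kwargs):
    """Create the storage object; the import is local so the helper above
    can be used without Prefect installed."""
    from prefect.storage import Docker  # requires Prefect 1.x

    return Docker(**kwargs)
```

Usage might look like build_storage(**docker_storage_kwargs("forecaster", "<registry-host>", "<repository-path>")), with the result passed as storage= when creating the Flow.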

@Saurabh_Sharma: Awesome, thanks! Let me try this out.
Hi, I tried this, but for some reason the code packaging was not happening as expected. To reproduce the issue, I switched to Docker storage for my local workflow, using the ECR image as the base image, as follows:

from prefect import Flow
from prefect.storage import Docker

with Flow(
    FLOW_NAME,
    storage=Docker(
        registry_url="xxxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com",
        base_image="xxxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com/prefect-custom-image/forecaster",
        image_tag="latest",
    ),
) as flow:
    ....

But when I run my flow.py, I get an EOF error while it attempts to push the image to the registry. The logs are below:

 python flow.py
/Users/saurabh/.pyenv/versions/askria-financial-template-service/lib/python3.9/site-packages/prefect/core/flow.py:1708: UserWarning: No result handler was specified on your Flow. Cloud features such as input caching and resuming task runs from failure may not work properly.
  registered_flow = client.register(

[2022-04-22 13:58:32+0530] INFO - prefect.Docker | Building the flow's Docker storage...
Step 1/9 : FROM xxxxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com/prefect-custom-image/forecaster
 ---> d2addd0e99fc
Step 2/9 : ENV PREFECT__USER_CONFIG_PATH='/opt/prefect/config.toml'
 ---> Running in be01aa09d8d3
Removing intermediate container be01aa09d8d3
 ---> dc9c3512b05d
Step 3/9 : RUN pip install pip --upgrade
 ---> Running in fd9adcabcc72
Requirement already satisfied: pip in /usr/local/lib/python3.9/site-packages (22.0.4)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Removing intermediate container fd9adcabcc72
 ---> d7f400326e18
Step 4/9 : RUN pip show prefect || pip install git+https://github.com/PrefectHQ/prefect.git@1.0.0#egg=prefect[all_orchestration_extras]
 ---> Running in 97b539c2551a
Name: prefect
Version: 1.1.0
Summary: The Prefect Core automation and scheduling engine.
Home-page: https://www.github.com/PrefectHQ/prefect
Author: Prefect Technologies, Inc.
Author-email: help@prefect.io
License: Apache License 2.0
Location: /usr/local/lib/python3.9/site-packages
Requires: click, cloudpickle, croniter, dask, distributed, docker, importlib-resources, marshmallow, marshmallow-oneofschema, msgpack, mypy-extensions, packaging, pendulum, python-box, python-dateutil, python-slugify, pytz, pyyaml, requests, tabulate, toml, urllib3
Required-by:
Removing intermediate container 97b539c2551a
 ---> ac4b00fad79e
Step 5/9 : RUN pip install wheel
 ---> Running in 00dcc88ede0b
Requirement already satisfied: wheel in /usr/local/lib/python3.9/site-packages (0.37.1)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Removing intermediate container 00dcc88ede0b
 ---> cd84c116e8c9
Step 6/9 : RUN mkdir -p /opt/prefect/
 ---> Running in e5d1fa741f51
Removing intermediate container e5d1fa741f51
 ---> 2ec2b40d4fdb
Step 7/9 : COPY forecaster.flow /opt/prefect/flows/forecaster.prefect
 ---> 13cfddd0b18b
Step 8/9 : COPY healthcheck.py /opt/prefect/healthcheck.py
 ---> 11a2da36e87e
Step 9/9 : RUN python /opt/prefect/healthcheck.py '["/opt/prefect/flows/forecaster.prefect"]' '(3, 9)'
 ---> Running in a51b8b2f74aa
Beginning health checks...
System Version check: OK
Cloudpickle serialization check: OK
Result check: OK
All health checks passed.
Removing intermediate container a51b8b2f74aa
 ---> 976fd83bf28f
Successfully built 976fd83bf28f
Successfully tagged xxxxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com/forecaster:latest
[2022-04-22 13:58:45+0530] INFO - prefect.Docker | Pushing image to the registry...
Traceback (most recent call last):
  File "/Users/saurabh/Dropbox/Projects/professional/Askria/code/financial-template-service/workflows/forecaster/flow.py", line 104, in <module>
    flow.register(project_name="forecaster")
  File "/Users/saurabh/.pyenv/versions/askria-financial-template-service/lib/python3.9/site-packages/prefect/core/flow.py", line 1708, in register
    registered_flow = client.register(
  File "/Users/saurabh/.pyenv/versions/askria-financial-template-service/lib/python3.9/site-packages/prefect/client/client.py", line 848, in register
    serialized_flow = flow.serialize(build=build)  # type: Any
  File "/Users/saurabh/.pyenv/versions/askria-financial-template-service/lib/python3.9/site-packages/prefect/core/flow.py", line 1497, in serialize
    storage = self.storage.build()  # type: Optional[Storage]
  File "/Users/saurabh/.pyenv/versions/askria-financial-template-service/lib/python3.9/site-packages/prefect/storage/docker.py", line 325, in build
    self._build_image(push=push)
  File "/Users/saurabh/.pyenv/versions/askria-financial-template-service/lib/python3.9/site-packages/prefect/storage/docker.py", line 399, in _build_image
    self.push_image(full_name, self.image_tag)
  File "/Users/saurabh/.pyenv/versions/askria-financial-template-service/lib/python3.9/site-packages/prefect/storage/docker.py", line 613, in push_image
    raise InterruptedError(line.get("error"))
InterruptedError: EOF
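
The last frame of the traceback shows the pattern raise InterruptedError(line.get("error")), which suggests the "EOF" text arrived as an error entry in the push output rather than originating in Prefect itself. A minimal sketch of that kind of check follows; the function name check_push_output and the sample lines are hypothetical, and the surrounding Prefect code is not shown in the thread.

```python
def check_push_output(lines):
    """Scan decoded push-progress lines and fail on the first reported error."""
    statuses = []
    for line in lines:
        error = line.get("error")
        if error:
            # Same pattern as the last frame of the traceback above.
            raise InterruptedError(error)
        if "status" in line:
            statuses.append(line["status"])
    return statuses


# Hypothetical sample data: two progress lines followed by an error entry.
sample = [{"status": "Preparing"}, {"status": "Pushing"}, {"error": "EOF"}]
try:
    check_push_output(sample)
except InterruptedError as exc:
    print(f"push failed: {exc}")  # push failed: EOF
```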

After this, I also ran the following step in the shell to make sure the Docker login for ECR was done.

aws ecr get-login-password --region us-west-1 --profile askria-deploy | docker login --username AWS --password-stdin xxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com/prefect-custom-image

Login Succeeded

Not sure what else is missing. Would appreciate any sort of help. Thanks!
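
The login command above passes a repository path after the registry host, whereas the form shown in the AWS documentation targets the registry host alone (everything before the first slash of the image reference). The following is a minimal sketch of the same two-step login driven from Python; the function names and the injectable run parameter are assumptions made for illustration, and it needs the aws and docker CLIs on the PATH when run for real.

```python
import subprocess


def ecr_registry_host(image_ref: str) -> str:
    """Return the registry host: everything before the first slash."""
    return image_ref.split("/", 1)[0]


def ecr_login(image_ref, region, profile=None, run=subprocess.run):
    """Fetch an ECR password with the AWS CLI and pipe it to docker login.

    Returns the docker login command that was executed.
    """
    get_password = ["aws", "ecr", "get-login-password", "--region", region]
    if profile:
        get_password += ["--profile", profile]
    password = run(get_password, check=True, capture_output=True, text=True).stdout

    login = [
        "docker", "login",
        "--username", "AWS",
        "--password-stdin",
        ecr_registry_host(image_ref),  # host only, no repository path
    ]
    # The password is passed on stdin, mirroring the shell pipe.
    run(login, input=password, check=True, text=True)
    return login
```

The run parameter defaults to subprocess.run and exists only so the function can be exercised without touching AWS or Docker.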

Solution

@Anna_Geller: Can you try pushing the image manually to see whether that works? You can do that using:

docker push xxxxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com/forecaster:latest

You could also follow the setup described in the section “Deploying your flows to a remote Kubernetes cluster on AWS EKS” in this blog post.

@Saurabh_Sharma: Sure, trying this.
So the image is getting pushed to the ECR repo.

❯ docker push xxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com/prefect-custom-image/forecaster:latest
The push refers to repository [xxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com/prefect-custom-image/forecaster]
0ef046cb1c71: Pushed
cbcd7f3aa78a: Pushed
0e71c950bd25: Pushed
130afd05f548: Pushed
9244b3735769: Layer already exists
3599ac45f041: Layer already exists
0caacca1e28a: Layer already exists
cc4b440e810b: Layer already exists
f99c2bd0bdfc: Layer already exists
8bcf066bab18: Layer already exists
7fce09c1d950: Layer already exists
1401df2b50d5: Layer already exists
latest: digest: sha256:9a023585c2c97cc186c074785efa694f73cc2ff74443ab2ba56dbd1ffdb81609 size: 2840

@Anna_Geller: woohoo! :partying_face:

@Saurabh_Sharma: Let me follow the blog post you shared and get back to you.
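
One detail worth noting in the logs: the Prefect build tagged the image under the repository forecaster, while the manual push that succeeded targeted prefect-custom-image/forecaster. ECR typically expects a repository to exist before an image is pushed to it, so the difference may be worth checking; the thread itself does not confirm it as the cause. A small sketch that splits the two image references for comparison (split_image_ref is a hypothetical helper, and the account IDs are the redacted placeholders from the thread):

```python
def split_image_ref(ref: str) -> tuple:
    """Split 'registry/repo/path:tag' into (registry, repository, tag)."""
    name, _, tag = ref.rpartition(":")
    if not name or "/" in tag:
        # No tag present (no colon, or the colon belongs to a host:port).
        name, tag = ref, "latest"
    registry, _, repository = name.partition("/")
    return registry, repository, tag


# Image reference from the "Successfully tagged" line of the build log.
built = "xxxxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com/forecaster:latest"
# Image reference from the manual push that succeeded.
pushed = "xxxxxxxxx.dkr.ecr.us-west-1.amazonaws.com/prefect-custom-image/forecaster:latest"

print(split_image_ref(built)[1])   # forecaster
print(split_image_ref(pushed)[1])  # prefect-custom-image/forecaster
```

If the mismatch turns out to matter, Prefect 1.x Docker storage accepts an image_name argument that could be set to the existing repository path.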