How can I run my flow in a Docker container?

Prefect 2.0

In Prefect 2.0, deployments are flexible and can use any type of infrastructure: a local process, a Docker container, or a job running on a remote Kubernetes cluster.

Here is the syntax to build and create dockerized flow run deployments:

  1. Build step:
prefect deployment build path/to/flow_script.py:flow_name \
--name deployment_name --tag dev -sb storage_block_type/storage_block_name \
-ib infrastructure_block_type/infrastructure_block_name

In practice, this could look as follows:

prefect deployment build flows/hello.py:hello \
--name docker-custom --tag dev \
-sb s3/dev -ib docker-container/docker-custom-image
  2. Apply step:
prefect deployment apply deployment.yaml
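
If you want to script these two steps, here is a minimal Python sketch that assembles the same commands as argument lists. The helper name `build_command` and its parameters are hypothetical; only the flags already shown above (`--name`, `--tag`, `-sb`, `-ib`) are used. The sketch only prints the commands; to run them you would pass each list to `subprocess.run` on a machine where the `prefect` CLI is installed.

```python
import shlex


def build_command(entrypoint, name, tag, storage_block, infra_block):
    """Return the `prefect deployment build` call as an argv list.

    Hypothetical helper: it only mirrors the flags shown in this post.
    """
    return [
        "prefect", "deployment", "build", entrypoint,
        "--name", name,
        "--tag", tag,
        "-sb", storage_block,
        "-ib", infra_block,
    ]


# Values taken from the practical example above.
build = build_command(
    entrypoint="flows/hello.py:hello",
    name="docker-custom",
    tag="dev",
    storage_block="s3/dev",
    infra_block="docker-container/docker-custom-image",
)
apply = ["prefect", "deployment", "apply", "deployment.yaml"]

# Print shell-quoted commands so they can be inspected before running.
print(shlex.join(build))
print(shlex.join(apply))
```

Keeping the commands as lists avoids shell quoting problems if a block or deployment name ever contains spaces.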

To specify the type of infrastructure at build time, you can use either:

  1. the --infra flag
  2. the --infrastructure-block flag (short form: -ib)

The first generates a deployment manifest with a placeholder that you can fill in with extra information. The second requires that you create the infrastructure block beforehand, either from code or from the UI.
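
To illustrate the "from code" route, here is a rough sketch of saving a DockerContainer block and deriving the value you would then pass to -ib. It assumes that `DockerContainer` can be imported from `prefect.infrastructure` and that `save` accepts `overwrite=True`, as I understand the 2.0 GA API; please check this against your installed version. The helper names and the image name are hypothetical.

```python
def infra_block_slug(block_name, block_type="docker-container"):
    """Return the value to pass to -ib, in the form <block_type>/<block_name>."""
    return f"{block_type}/{block_name}"


def save_docker_block(block_name, image):
    """Create and save a DockerContainer block, then return its -ib slug.

    Needs Prefect 2 installed and a reachable Prefect API.
    """
    # Imported lazily so this file loads even where Prefect is not installed.
    # Assumption: this import path and the overwrite argument match your version.
    from prefect.infrastructure import DockerContainer

    block = DockerContainer(image=image)
    block.save(block_name, overwrite=True)
    return infra_block_slug(block_name)


# The slug used in the build example above:
print(infra_block_slug("docker-custom-image"))

# To actually create the block (hypothetical image name):
# save_docker_block("docker-custom-image", "your-registry/your-image:latest")
```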

Documentation on DockerContainer infrastructure blocks:

Simple Docker deployment

Looking at this project template, here is how you can create a deployment for a simple flow that doesn’t require any custom pip packages.

Docker deployment using a custom Dockerfile

If your flow needs custom pip packages, the Prefect base image won’t be enough: you need to package your dependencies into a custom image and push it to a registry. The code below shows how you can do all that programmatically by building a custom image and pushing it to Docker Hub.

Once the image is built, you can create a custom DockerContainer block and pass it to your deployment build:


Prefect 2.0 beta (2.0b8 and lower)

Using the flow_runner argument in a DeploymentSpec, you can dynamically allocate infrastructure for your flow runs. Since the code must be retrieved on the created infrastructure, configuring flow runners is possible only for deployed flows. Here is an example deploying flow runs as Docker containers:

# docker_deployment.py
from prefect.flow_runners import DockerFlowRunner
from prefect.deployments import DeploymentSpec

DeploymentSpec(
    name="deployment_name",
    flow_location="s3://bucket_name/path/to/my_flow.py",
    flow_runner=DockerFlowRunner(image="prefecthq/prefect:2.0"),
)

You can leverage the CLI to create a deployment from that file:

prefect deployment create docker_deployment.py

To execute that deployment locally, you can use the following syntax:

prefect deployment run 'flow_name/deployment_name'

For more details, check out the DockerFlowRunner tutorial.


Prefect 1.0

The same can be accomplished in Prefect 1.0 using storage and run configuration attached to the Flow object.

from prefect.run_configs import DockerRun
from prefect.storage import S3
from prefect import Flow

with Flow(
    "flow_name",
    run_config=DockerRun(image="prefecthq/prefect:1.0"),
    storage=S3(bucket="bucket_name"),
) as flow:
    ...

As this documentation is not valid in 2.0 GA, can we get an updated version attached to the migration guide? Flow runners no longer seem to exist in GA, but the documentation all points back to this older beta style.

Thanks for the pointer! Check out this while we update Discourse!

@scoussens I updated the topic, thanks so much for raising this

The code for this part did not work for me. I had to run the following:

prefect deployment build flows/healthcheck.py:healthcheck --name docker-simple --tag dev -sb s3/dev --infra docker
prefect deployment apply deployment.yaml
prefect deployment run healthcheck/docker-simple
prefect agent start --tag dev

With the last command I got this error:

botocore.exceptions.NoCredentialsError: Unable to locate credentials

Do you know which credentials I need to add, and where?
Thanks!

I got the same error with “Docker deployment using a custom Dockerfile”.

You need to create an S3 block that has credentials.
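
A rough sketch of what that could look like from code, assuming the `S3` block in `prefect.filesystems` takes `bucket_path`, `aws_access_key_id` and `aws_secret_access_key` (please verify against your Prefect version). The helper names are hypothetical, the bucket path is a placeholder, and the keys are read from the standard AWS environment variables rather than hard-coded.

```python
import os


def s3_block_kwargs(bucket_path, env=None):
    """Collect the fields for an S3 block, reading AWS keys from the environment."""
    env = os.environ if env is None else env
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "bucket_path": bucket_path,
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }


def save_s3_block(block_name, bucket_path):
    """Save an S3 block with credentials (needs Prefect 2 and a reachable API)."""
    # Imported lazily so this file loads even where Prefect is not installed.
    # Assumption: the field names above match the S3 block in your version.
    from prefect.filesystems import S3

    S3(**s3_block_kwargs(bucket_path)).save(block_name, overwrite=True)


# Example, matching the "-sb s3/dev" block name used in this thread:
# save_s3_block("dev", "bucket_name/path")
```

Because the credentials are stored on the block, the container can read the flow code from S3 without access to the credentials file on your machine.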

That worked perfectly, thanks!

Do you know why the credentials are mandatory here, but optional when normal flow code is stored in the same S3 bucket? I have another block for the same bucket that needs no credentials to store code there.

A question related to the code: for “Docker deployment using a custom Dockerfile”, is it possible to pass a parameter to the flow ‘hello’ when building the deployment, or at another point?
Even though the function is defined as def hello(user: str = "Marvin"): I still got an error running it, since it expected a value for the user parameter. I only solved it by adding parameters: {"user": "dummy"} to deployment.yaml before applying the deployment.

Finally, a general question about the post and the concept of “infrastructure”: the agent is running on my machine, and the agent starts a container inside which the flow runs, right?
I do not have much experience with Docker, but here the agent runs on the machine where I call prefect agent start, so “Simple Docker deployment” does not really have any advantage, since the point of running the flow in a container is that I can install extra things that I do not want on the machine where I call the agent, right? That is what “Docker deployment using a custom Dockerfile” does.

Sorry for the many questions --’ and thanks in advance! Top support!

Your locally running flow has access to your ~/.aws/credentials, but your container doesn’t.

We are working on adding --param flags to the CLI; for now you can set parameters on the flow decorator or in the deployment YAML file.
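
For the YAML route, a small sketch of generating the parameters line that was added to deployment.yaml earlier in this thread. The helper name `parameters_line` is hypothetical; it relies only on the fact that a JSON object is also valid YAML, so the output can be pasted over the existing parameters entry before running prefect deployment apply.

```python
import json


def parameters_line(params):
    """Render flow parameters as a one-line `parameters:` entry for deployment.yaml."""
    return "parameters: " + json.dumps(params)


print(parameters_line({"user": "dummy"}))
# prints: parameters: {"user": "dummy"}
```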

Yup exactly!

Cool thanks for the quick response!