How to use DockerFlowRunner with S3 storage? I'm getting an error "botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden"

Problem

This error usually occurs when you use DockerFlowRunner without mounting AWS credentials into your flow run container. Without them, the Docker container cannot authenticate to S3 to pull your flow code.

Solution

To fix this issue, mount your AWS credentials into the flow run container via the volumes argument on your DockerFlowRunner:

from prefect import flow, get_run_logger
from prefect.deployments import DeploymentSpec
from prefect.flow_runners import DockerFlowRunner


@flow
def docker_flow():
    logger = get_run_logger()
    logger.info("Hello from Docker!")


DeploymentSpec(
    name="example",
    flow=docker_flow,
    flow_runner=DockerFlowRunner(
        image="prefecthq/prefect:2.0b2-python3.9",
        volumes=["/Users/anna/.aws:/root/.aws"],  # ADJUST IT TO MATCH YOUR USER NAME
    ),
)

Note that you need to mount the credentials into the Docker root user's home directory (/root/.aws); otherwise, boto3 won't be able to find them.
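To verify the mount worked, you can open a Python shell inside the container and check boto3's credential lookup. A minimal sanity check (boto3 searches /root/.aws/credentials among its standard locations):

# Run inside the flow run container, e.g. via `docker exec -it <container> python`
import boto3

# None here means boto3 found no credentials, i.e. the mount didn't work
print(boto3.Session().get_credentials())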

Do you have the terminal commands / steps to bind mount the credentials? Or am I misunderstanding, and including the volumes line automatically mounts the credentials in the backend?

Thanks!

Exactly: when you add the volumes argument to your DockerFlowRunner, Prefect bind mounts those volumes when it deploys the flow run container.
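If you want to confirm it, you can inspect the bind mounts on the flow run container after it starts. A quick sketch with the docker Python SDK (the container name below is hypothetical; look up the real one with docker ps -a):

import docker

client = docker.from_env()
# Hypothetical name; use the actual container name or ID from `docker ps -a`
container = client.containers.get("my-flow-run-container")
# Should list the bind mount, e.g. ['/Users/anna/.aws:/root/.aws']
print(container.attrs["HostConfig"]["Binds"])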

Should this be added to the Docker tutorial (Running flows in Docker - Prefect 2.0)?


For docs you can tag Terry :smile: @terrence or create a GitHub issue directly

I have

from prefect import flow, get_run_logger
from prefect.deployments import DeploymentSpec
from prefect.flow_runners import DockerFlowRunner

@flow
def my_docker_flow():
    logger = get_run_logger()
    logger.info("Hello from Docker!")

DeploymentSpec(
    name="docker-example",
    flow=my_docker_flow,
    flow_runner=DockerFlowRunner(
        image="prefecthq/prefect:2.0b2-python3.9",
        volumes=["/Users/andrew/.gcp:/root/.gcp"],  # ADJUST IT TO MATCH YOUR USER NAME
    ),
)

I’ve run the following commands

gcloud auth login
gcloud auth application-default login
export GOOGLE_APPLICATION_CREDENTIALS=/Users/andrew/.secrets/gcp.json

But I still encounter

google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started

I think it’s related to:
“Note that you need to mount credentials to the Docker root user, otherwise boto3 won’t be able to find AWS credentials.”

But I am not sure how to mount to the Docker root user. Any thoughts? Thanks!

There is no such thing as a .gcp folder; GCS uses a different convention for credentials.

If anything, you could try your .gsutil creds.

Can you try with S3 and .aws for now?

GCS Service Accounts

For GCS you would need to mount the JSON service account file as a volume:

DeploymentSpec(
    name="docker-example",
    flow=my_docker_flow,
    flow_runner=DockerFlowRunner(
        image="prefecthq/prefect:2.0b2-python3.9",
        volumes=["/Users/anna/repos/packaging-prefect-flows/gcs_sa.json:/opt/prefect/gcs_sa.json"],  # ADJUST IT TO MATCH YOUR GCS JSON PATH
    ),
)
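Mounting the file alone may not be enough: google-auth also needs GOOGLE_APPLICATION_CREDENTIALS pointing at the mounted path inside the container. A sketch, assuming the env argument (from the universal flow runner settings) is available on DockerFlowRunner in your release:

DeploymentSpec(
    name="docker-example",
    flow=my_docker_flow,
    flow_runner=DockerFlowRunner(
        image="prefecthq/prefect:2.0b2-python3.9",
        volumes=["/Users/anna/repos/packaging-prefect-flows/gcs_sa.json:/opt/prefect/gcs_sa.json"],
        # Tell google-auth where to find the mounted service account file
        env={"GOOGLE_APPLICATION_CREDENTIALS": "/opt/prefect/gcs_sa.json"},
    ),
)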

I was able to get the GCP version running here:
