Prefect deployment build --infra docker-container --override volumes not a valid list

Thank you for adding an easy way to run a flow in a Docker container! I followed the blog post "Prefect 2.3.0 adds support for flows defined in Docker Images and GitHub Repositories" by Anna Geller (The Prefect Blog on Medium) and was able to get it working, but it took a manual edit of the YAML, so I can’t (yet) use --apply.

DockerContainer.volumes is a list, but prefect deployment build --override volumes=/host/path:/container/path puts a string in the YAML. I can deploy that YAML, but when the agent tries to run the flow, it fails with:

04:34:37.401 | ERROR   | prefect.agent - Failed to get infrastructure for flow run '783c7e09-bc5e-4957-811e-1d9eb2995b6b'.
Traceback (most recent call last):
  File "/home/thecap/.pyenv/versions/tourist-3.10.7/lib/python3.10/site-packages/prefect/agent.py", line 203, in submit_run
    infrastructure = await self.get_infrastructure(flow_run)
  File "/home/thecap/.pyenv/versions/tourist-3.10.7/lib/python3.10/site-packages/prefect/agent.py", line 189, in get_infrastructure
    infrastructure_block = Block._from_block_document(infra_document)
  File "/home/thecap/.pyenv/versions/tourist-3.10.7/lib/python3.10/site-packages/prefect/blocks/core.py", line 555, in _from_block_document
    block = block_cls.parse_obj(block_document.data)
  File "pydantic/main.py", line 521, in pydantic.main.BaseModel.parse_obj
  File "/home/thecap/.pyenv/versions/tourist-3.10.7/lib/python3.10/site-packages/prefect/blocks/core.py", line 175, in __init__
    super().__init__(*args, **kwargs)
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for DockerContainer
volumes
  value is not a valid list (type=type_error.list)

My workaround is editing the YAML to change

infra_overrides:
  volumes: /host/path:/container/path

to

infra_overrides:
  volumes:
    - /host/path:/container/path

but can someone suggest a way that I can use build --apply ?
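
For anyone scripting the same workaround, here is a minimal sketch of the edit in Python. It works on a plain dict that mirrors the infra_overrides section of the YAML. The helper name normalize_infra_overrides and the LIST_FIELDS set are hypothetical and not part of Prefect; the only list-typed field confirmed in this thread is volumes.

```python
from typing import Any

# Assumption: names of override fields that must be lists.
# Only "volumes" is confirmed by the traceback above.
LIST_FIELDS = {"volumes"}


def normalize_infra_overrides(overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of overrides where string values of list fields are wrapped in a list."""
    fixed = dict(overrides)
    for key in LIST_FIELDS:
        value = fixed.get(key)
        if isinstance(value, str):
            # A bare string such as "/host/path:/container/path" becomes a one-item list.
            fixed[key] = [value]
    return fixed


broken = {"volumes": "/host/path:/container/path"}
print(normalize_infra_overrides(broken))
# prints {'volumes': ['/host/path:/container/path']}
```

Reading and writing the deployment YAML itself would need a YAML library, which is left out of this sketch.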

Have you tried the approach from the linked repo? What’s your Dockerfile?

If you COPY the flow into your Docker image, you don’t have to mount volumes.

My web server (running Flask) and Prefect flows share ORM classes, so I thought it’d be nice to have them all use a single Docker image with a snapshot of my code and all the modules and libraries needed to run it (see tourist-with-flask/Dockerfile at master in TomGoBravo/tourist-with-flask on GitHub). I found prefect-docker-deployment/deploy.bash at main in anna-geller/prefect-docker-deployment on GitHub pretty handy, and it gave me the idea of configuring the deployment with prefect deployment build flags. The volume is for data that is stored on the host’s filesystem. I change the volume path to point at test or production data so I can use the same image for both. Switching to deploying with a script did the trick (tourist-with-flask/dataflow_deployment.py at master in TomGoBravo/tourist-with-flask on GitHub), though I’m still curious whether there is a way to do it from the command line.

  1. Did you consider building this as an installable package?
  2. Do you have it all in a single repo? If so, this should be pretty doable from CI/CD

Generally, I don’t understand why this wouldn’t work from the CLI: you can COPY your flow script into the container and specify its path. You can point to the exact location of your flow, e.g.:

prefect deployment build -n prod -q prod -a flows/some/nested/module.py:import_test