Can you recommend some repository structure to separate code dependencies between separate Prefect projects?

Matt_Delacour @Matt_Delacour: hi all :wave:

I am reading the Orion docs and playing with the beta Cloud product.
I mostly need help regarding the whole deployment process.

What I am looking for is a way to isolate dependencies between our use cases (ML, DBT, Dask, etc.). So in my mind, I imagine that our users will need to create Docker images that can be used when a job runs (see a quick diagram in the thread).

Looking at the Orion documentation, here is what’s not clear to me:
• How can I assign a specific Docker image to a job (so that we can spin up the right server)? The use of DockerFlowRunner() for that is unclear
• I am guessing that I would need to use tags to glue any logic around our different environments (prod / adhoc). But I don’t know if work-queues need to be created “manually” for each new “Repo user” (see diagram)
Also feel free to point me to a tutorial explaining all the best practices to deploy Prefect :pray:
And what do you think about this diagram? What should be changed in your opinion?

@Nate: hey @Matt_Delacour

https://orion-docs.prefect.io/concepts/flow-runners/

you can pass an image kwarg to DockerFlowRunner specifying which image to use for this flow, where you’d then pass your DockerFlowRunner to a DeploymentSpec
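To make the "one image per use case" idea concrete, here is a minimal sketch in plain Python. The use-case names, registry paths, and the tagging scheme are all hypothetical and not from the thread; the only thing taken from the message above is that the image string ends up in `DockerFlowRunner(image=...)`, which is then handed to a `DeploymentSpec`.

```python
# Hypothetical mapping from use case to Docker image. The names and
# registry paths are placeholders, not values from the thread.
IMAGES = {
    "ml": "ghcr.io/example-org/ml_img:latest",
    "dbt": "ghcr.io/example-org/dbt_img:latest",
}


def deployment_settings(use_case: str, env: str) -> dict:
    """Collect the values one deployment of `use_case` in `env` would need."""
    if use_case not in IMAGES:
        raise KeyError(f"no image registered for use case {use_case!r}")
    return {
        "name": f"{use_case}-{env}",
        # Intended as the value for DockerFlowRunner(image=...).
        "image": IMAGES[use_case],
        # Assumed tagging scheme (use case + environment), e.g. prod / adhoc.
        "tags": [use_case, env],
    }


settings = deployment_settings("ml", "prod")
print(settings["name"], settings["image"], settings["tags"])
```

Whether tags like these are the right way to route runs to work queues for prod vs. adhoc is an assumption here, so check it against the Orion docs for your version.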

a common pattern I’ve seen for managing docker images used in flows is a github repo with a structure like

base_img/
   Dockerfile
   requirements.txt
ML_img/
   Dockerfile (FROM base_img)
   requirements.txt
DBT_img/
   Dockerfile (FROM base_img)
   requirements.txt
some_other_img/
   Dockerfile (FROM base_img)
   requirements.txt

where a github action builds these docker images and pushes them to ECR or GHCR (or both) on some PR to main, and then you can use the image path (for whichever registry) in your FlowRunner instances
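For reference, the "image path" differs by registry. Here is a small sketch of how the two mentioned above are typically formed. The helper names, account ID, region, and org name are made up for illustration; only the general ECR and GHCR reference formats are standard.

```python
def ecr_image(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build an Amazon ECR image reference for a private repository."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def ghcr_image(owner: str, name: str, tag: str = "latest") -> str:
    """Build a GitHub Container Registry image reference.

    Image repository names must be lowercase, so owner and name are lowered.
    """
    return f"ghcr.io/{owner.lower()}/{name.lower()}:{tag}"


# Placeholder values; swap in your own account, region, and org.
print(ecr_image("123456789012", "us-east-1", "ml_img", "abc123"))
print(ghcr_image("Example-Org", "ML_img"))
```

Either string is what you would hand to the flow runner as its image. Using the git SHA as the tag is just one possible convention.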

some content about this from docker

YouTube Video: Building a Docker Image Pipeline Using GitHub Actions

Matt_Delacour @Matt_Delacour: > a common pattern I’ve seen for managing docker images used in flows is a github repo with a structure like
Yes, it makes sense, but that also means there is one centralized repository to manage images

I am thinking more about

Repo Use Case 1
image/
   Dockerfile
   poetry.lock
src/
   ...

Repo Use Case 2
image/
   Dockerfile
   poetry.lock
src/
   ...

And so then I need to make sure that those 2 repos will interact properly with Prefect.

Do you have any examples of using Prefect that way? :point_up:

@Nate: multiple repos for building images like this also seems reasonable! I don’t have any specific examples of that, but at the end of the day, all you really need is an image (appropriate for the Python code that makes up your flow) stored somewhere your runtime can pull it from, for example ECR for images to be run on EKS. use whichever registries/runtimes are most convenient for your stack

the actual process of building and pushing those images is currently entirely separate from prefect 2.0 flow deployments, although you can expect future docker functionalities for building images for deployments as prefect 2.0 matures

Matt_Delacour @Matt_Delacour: Sounds good. I will explore that next week :raised_hands:
Thanks