How to share code between flows

Hi!

I am currently trying to migrate to Prefect 2.0 and am having a little trouble understanding how to run flows with shared code. Maybe the answer is really obvious, but I couldn’t find anything through googling.
I’d really appreciate the help!

Problem
I want to create deployments for, and run, flows that import modules from a different folder in the same repository.

This is the repository structure:

repository
|-- flows
|---- flow1.py // accesses tasks/task1.py
|---- flow2.py // accesses tasks/task1.py and utils/utils1.py
|-- tasks
|---- task1.py
|-- utils
|---- utils1.py

There is quite a lot of shared code between the flows, so simply duplicating it into the flow files or folders is out of the question.
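To make concrete what I expect to work, here is a stdlib-only sketch of the import mechanics: if the agent runs the flows from the repository root (or the root is on `sys.path`), the shared packages resolve as normal top-level imports. The function names `my_task` and `helper` are made up for illustration.

```python
import os
import sys
import tempfile

# Build the repository layout from above in a temp directory.
repo = tempfile.mkdtemp()
for pkg, mod, body in [
    ("tasks", "task1", "def my_task(data):\n    return data.upper()\n"),
    ("utils", "utils1", "def helper():\n    return 'shared'\n"),
]:
    os.makedirs(os.path.join(repo, pkg), exist_ok=True)
    open(os.path.join(repo, pkg, "__init__.py"), "w").close()
    with open(os.path.join(repo, pkg, mod + ".py"), "w") as f:
        f.write(body)

# Equivalent to running the flow with the repository root as working directory.
sys.path.insert(0, repo)

from tasks.task1 import my_task
from utils.utils1 import helper

print(my_task(helper()))  # -> SHARED
```

This is exactly the situation I want: `flows/flow2.py` importing from both `tasks` and `utils` without copying anything.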

(Unsatisfying) Solutions that I have found:

  • Baking the dependencies into the Docker image
    Here the problem is that we would have to rebuild and redeploy the image every time the code changes. I would rather avoid this, because it
    a) is cumbersome, and
    b) could block deploying changes for a while, since we would have to wait until currently running flows finish.

  • Storage Blocks
    As far as I understand, I could create a storage block containing the necessary modules. However, the only way I could find is to include the whole repository into the storage block for each flow. That seems like a wasteful and inefficient approach.
    Additionally, each flow would effectively have its own source of truth for what the current code is.

Question(s)
How can I make this work without sacrificing the modularization/deduplication of my code or depending on redeployments to propagate code changes to the agent?

Thanks in advance!

Prefect should still work when you import code normally; a storage block that contains the whole repo should be sufficient.

However, the only way I could find is to include the whole repository into the storage block for each flow. That seems like a wasteful and inefficient approach.

i.e. You should not have to do this.

If there are a lot of shared components among a single series of work, I would recommend testing out our newly released projects. They are a great way to kick off pipelines with shared components as is. Additionally, you could have multiple GitHub storage blocks, one for each set of workflows you want to run.
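For instance, several deployments can reference the same GitHub storage block, so the shared code is stored once (a sketch using the Prefect 2.x `prefect deployment build` CLI; the block name `github/shared-repo` and the flow entrypoints are placeholders):

```shell
# Both deployments pull code from the same GitHub block; the whole repo is
# checked out at run time, so the shared tasks/ and utils/ imports resolve.
prefect deployment build flows/flow1.py:flow1 --name flow1 -sb github/shared-repo --apply
prefect deployment build flows/flow2.py:flow2 --name flow2 -sb github/shared-repo --apply
```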

Thanks for the replies! In the end, I just fundamentally misunderstood code storage. I got it working now using the generic GitHub storage block :slight_smile:
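For anyone who finds this later, the block setup is roughly this (Prefect 2.x API; the repository URL and block name are placeholders for my actual ones):

```python
from prefect.filesystems import GitHub

# One block for the whole repository: every deployment that references it
# pulls the full tree, so flows/, tasks/ and utils/ all come along.
block = GitHub(
    repository="https://github.com/my-org/my-repo",  # placeholder URL
    reference="main",
)
block.save("shared-repo", overwrite=True)
```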