GitHub storage block and dynamically setting branches

I would like to use the GitHub storage block for our flows, but it seems like the branch name (called reference in the block) is only respected at the block level, not the deployment level. I have the following code to create my deployment:

...
storage = await GitHub.load("repo")
storage.reference = Repository(".").head.shorthand
infra = await VertexAICustomTrainingJob.load("vertex")

deploy = await Deployment.build_from_flow(
    flow=hello_flow,
    name="hello-flow",
    storage=storage,
    infrastructure=infra,
)
...

And the yaml file created from this deployment has the correct reference:

...
storage:
  repository: https://github.com/<redacted>.git
  reference: prefect2-git-storage
  access_token: '**********'
  include_git_objects: true
...

But when the flow runs, the code is running from the main branch. The block is saved in Prefect without a reference, and I was hoping to be able to set it at the deployment level for development purposes, but I’m not sure if this is possible. Any help would be greatly appreciated!

hi @jeremy_thomas - if you’re just getting started with your deployment strategy, I would recommend taking a look at these docs and this example.

You can define a git_clone pull step and template in the branch that you need for a given deployment, whether via a block value, an env var, or the result of a shell script.
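As a rough sketch of what that could look like in prefect.yaml, assuming the branch comes from an environment variable: the variable name GIT_BRANCH and the secret block name github-token are hypothetical placeholders, not something from this thread, and the repository URL is left redacted as in the original post.

```yaml
# prefect.yaml (sketch, not a complete file)
pull:
  - prefect.deployments.steps.git_clone:
      repository: https://github.com/<redacted>.git
      # Resolved from the environment when `prefect deploy` runs.
      # GIT_BRANCH is a hypothetical variable name.
      branch: "{{ $GIT_BRANCH }}"
      # Hypothetical Secret block holding a GitHub token.
      access_token: "{{ prefect.blocks.secret.github-token }}"
```

With something like this, exporting GIT_BRANCH (for example, to the currently checked-out branch) before running prefect deploy should pin the deployment to that branch without touching a shared block.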

Let me know if you develop any questions


On your actual question

but it seems like the branch name (called reference in the block) is only respected at the block level, not the deployment level

this is correct - if I’m understanding your point, this is as designed.

Nate,

We have our deployments working using GCS as our storage backend, but this is a very slow solution when a flow has a lot of code. It also means our code is duplicated across GitHub and a GCS bucket.

I understand the design idea of having blocks being somewhat static, but we have infrastructure overrides - why not storage overrides? It would allow us to make blocks be just as reusable, and give us the ability to tweak settings per deployment if needed.

hi @jeremy_thomas - the resources I linked above show how you can define deployments in the prefect.yaml file. In particular, you can define a generic pull step (how the worker gets the flow code for a given deployment's flow run, which can be a shell script, git clone, whatever you want) that can be overridden on a per-deployment basis.

In effect, this should accomplish what you might be interested in as far as “storage overrides”
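For illustration, a per-deployment override might be sketched like this. The entrypoint path, the second deployment name, and the work pool name are hypothetical; the flow name and the branch prefect2-git-storage are taken from the original post.

```yaml
# prefect.yaml (sketch, not a complete file)

# Default pull step, used by any deployment that does not define its own.
pull:
  - prefect.deployments.steps.git_clone:
      repository: https://github.com/<redacted>.git
      branch: main

deployments:
  # Uses the default pull step above (main branch).
  - name: hello-flow
    entrypoint: flows/hello.py:hello_flow  # hypothetical path
    work_pool:
      name: my-work-pool  # hypothetical work pool

  # Overrides the pull step to run code from a development branch.
  - name: hello-flow-dev  # hypothetical deployment name
    entrypoint: flows/hello.py:hello_flow  # hypothetical path
    work_pool:
      name: my-work-pool  # hypothetical work pool
    pull:
      - prefect.deployments.steps.git_clone:
          repository: https://github.com/<redacted>.git
          branch: prefect2-git-storage
```

Both deployments share the same repository settings; only the branch differs, which is roughly the "storage override" being asked about.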

worth noting that the prefect.yaml + prefect deploy story is our main recommendation for deployment management, in contrast to the infra block / build_from_flow story