Prefect 2.6 deployments are much slower than in previous versions

I’m noticing that after upgrading to Prefect 2.6 my deployments got much slower, taking well over 60 minutes to complete (they used to finish in about 3 minutes).

It might be that I’m deploying things the wrong way, so I’ll first describe my setup. I have a powerful machine running CentOS 7 and Anaconda, with a conda environment for Prefect (Python 3.7.9) under the prefect username. There is also a MinIO instance for storage. Prefect Orion runs under systemd, with the following in ExecStart: conda run -n ams-prefect prefect orion start --host --port 4200 --analytics-off --scheduler.

On my laptop, I have a repository with the code, where I heavily use Deployment.build_from_flow(..., apply=True). I have a deployments/ file, where I essentially have this:


Then, to deploy, I just run python deployments from my command line. And this has now started taking an extremely long time :frowning: Am I doing something wrong? How can I debug this? I tried looking at the logs, but there aren’t any (at least sudo journalctl -u prefect-orion doesn’t show anything useful).

Great writeup! I believe the slow build might be because Prefect attempts to upload files to remote storage by default. Can you try passing skip_upload=True?

I’ve checked Monitoring in MinIO, and it seems that Prefect uploads the same version of the files over and over again, once per Deployment.build_from_flow call. Since my agents are on a different machine, I do need the remote storage (if I understand correctly how the system works?).

Setting skip_upload=True for all calls results in a much faster full deployment (22 s vs. 90 min). Setting skip_upload=True for all but one call also gives a faster full deployment (29 s).

That’s intentional. If you’d like different behavior, check out this CI/CD example.

Will take a look, thanks!

Now that I understand how code is stored, may I suggest an improvement: instead of uploading everything and overwriting, it would be nice if Prefect computed a hash of the content it uploads. Then, instead of storing under bucket/project/..., it would store the code under bucket/project/<hash>/.... If the hash already exists, there is no need to re-upload. A further benefit of this approach is that jobs can link to the exact code that was executed.
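The suggested scheme can be sketched in a few lines of plain Python. A dict stands in for the object-store bucket; nothing here is Prefect or MinIO API, it just illustrates the hash-then-skip idea:

```python
import hashlib
from typing import Dict

# Toy in-memory stand-in for an object-store bucket: maps object keys to bytes.
bucket: Dict[str, bytes] = {}


def content_hash(files: Dict[str, bytes]) -> str:
    """Deterministically hash the project's file names and contents."""
    h = hashlib.sha256()
    for name in sorted(files):
        h.update(name.encode())
        h.update(files[name])
    return h.hexdigest()


def upload_if_new(project: str, files: Dict[str, bytes]) -> str:
    """Store files under <project>/<hash>/...; skip upload if the hash exists."""
    digest = content_hash(files)
    prefix = f"{project}/{digest}"
    if not any(key.startswith(prefix + "/") for key in bucket):
        for name, data in files.items():
            bucket[f"{prefix}/{name}"] = data
    return prefix  # a job could record this prefix to pin the exact code it ran


files = {"flows.py": b"print('hi')", "util.py": b"# helpers"}
first = upload_if_new("project", files)
second = upload_if_new("project", files)  # same hash: nothing is re-uploaded
assert first == second and len(bucket) == 2
```

Unchanged content hashes to the same prefix, so the second call is a no-op, while any edit to a file produces a new prefix and a fresh upload alongside the old one.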