Calling `Deployment.build_from_flow` from jupyterhub notebook doesn't create deployments

Hi all,

I hope you can help me with creating deployments from jupyterhub notebooks.

As my colleagues prefer to run everything from notebooks, they asked me to come up with a method for scheduling their notebooks in Prefect (for its excellent scheduling capabilities), and to do so from the comfort of their other notebooks. For this, they only need to run a notebook via papermill in a specific conda environment, get the result as HTML, and mail it to designated email addresses.

Toward that goal, I created a small package that contains a flow for running a notebook, plus an entry function. I installed this package in the environment where Prefect Orion and the agents run, as well as in all the conda environments. The entry function looks like this:

# imports needed by the snippet (run_report and _get_flow_name live in my package)
from typing import Any, Dict, List, Optional, Union

from prefect.deployments import Deployment
from prefect.orion import schemas
from prefect.settings import PREFECT_API_URL, temporary_settings
from pydantic import AnyHttpUrl, NonNegativeInt, validate_arguments

@validate_arguments
def schedule(
    name: str,
    notebook_url: AnyHttpUrl,
    tags: List[str],
    queue: str,
    schedule: schemas.schedules.SCHEDULE_TYPES,
    parameters: Optional[Dict[str, Any]] = None,
    retries: NonNegativeInt = 0,
    timeout_seconds: Optional[Union[int, float]] = None,
    email_to: Optional[Union[List[str], str]] = None,
) -> Deployment:
    flow_name = _get_flow_name() # decorate the name with the calling user's name

    with temporary_settings(updates={PREFECT_API_URL: "..."}):
        rflow = run_report.with_options(
            retries=retries,
            name=flow_name,
            retry_delay_seconds=60,
        )
        if timeout_seconds:
            rflow = rflow.with_options(
                timeout_seconds=timeout_seconds,
            )
        d = Deployment.build_from_flow(
            flow=rflow,
            parameters={
                "name": name,
                "notebook_url": notebook_url,
                "parameters": parameters,  # what we pass to the notebook
                "retries": retries,
                "email_to": email_to,
            },
            name=name,
            schedule=schedule,
            version=1,
            work_queue_name=queue,
            tags=tags,
            skip_upload=True,
            # must have path and entrypoint set to avoid https://github.com/PrefectHQ/prefect/issues/6777
            path=".",
            entrypoint="nikolas_prefect_utils.core:run_report",
        )

        return d

When I call the above from the command line, it creates the flow and the deployment. Of course, I also have to call d.apply().

However, this fails for the intended use case: running it from JupyterHub notebooks. First, I don’t get a Deployment back, but a coroutine. Second, if I do this in a cell:

   d = await nikolas_prefect_utils.schedule(....)
   await d.apply()

I get back a UUID, yet there is no flow or deployment on our Prefect instance.

Does anyone have an idea what I’m doing wrong here? How can I debug where the issue is coming from?
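One thing I noticed while staring at this: in the notebook, the coroutine is created inside the `with temporary_settings(...)` block but only awaited after the block has exited. A minimal sketch with plain contextvars (a hypothetical simplification of Prefect’s settings context, not actual Prefect code) shows how that ordering can silently drop the API URL override, which would match the symptom of a UUID coming back but nothing appearing on our instance:

```python
import asyncio
from contextlib import contextmanager
from contextvars import ContextVar

# Stand-in for Prefect's settings context (hypothetical simplification):
API_URL = ContextVar("API_URL", default="http://ephemeral-local-api")

@contextmanager
def temporary_settings(url):
    token = API_URL.set(url)
    try:
        yield
    finally:
        API_URL.reset(token)

async def apply():
    # The setting is read when the coroutine RUNS, not when it is created
    return API_URL.get()

def schedule():
    with temporary_settings("http://our-prefect-instance"):
        return apply()  # a bare coroutine: nothing has executed yet

async def notebook_cell():
    coro = schedule()
    # By the time we await, temporary_settings has already been unwound:
    return await coro

print(asyncio.run(notebook_cell()))  # -> http://ephemeral-local-api
```

If Prefect’s temporary_settings behaves like this (I haven’t verified), the awaited apply() would talk to the default API instead of the one I set.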


AFAIK we have an open internal issue for this, but there are some complications due to Jupyter limitations. I’d recommend not using Jupyter for deployments and leveraging Python scripts instead, e.g. iterate in Jupyter for local development, but use scripts for production deployments.

One limitation is that you would need to run everything async to make that work.

Hi Anna,

Thanks, I’ll pass that suggestion on to my colleagues.

Just to make sure I understand: for creating deployments from Jupyter to work, I need to run everything async. But in my example I do use async (that is, I invoke await twice). I even get a UUID back, but no deployment. Could this be a bug?


I was generally trying to persuade you not to do it from Jupyter at all :smile:

Sorry for not being more helpful here, but I don’t know enough about this area.

if you suspect this might be a bug, it’s best to submit a bug report as a GitHub issue - the integrations team engineers will give you much better guidance than me here

This repo would be a great place to open an issue: GitHub - PrefectHQ/prefect-jupyter: Prefect integrations interacting with Jupyter.

Have you thought about recommending a process where those data scientists/Jupyter users commit their flows and CI/CD takes care of deployments? That would be cleaner for them (no clutter in notebooks) and less complicated than deploying from Jupyter.

You can always add a schedule from the UI.