Prefect script inside Docker - not connecting to Prefect Cloud

Hello,
I am a newbie to both Prefect and Docker, so please let me know if I got something terribly wrong.
I wrote a small Python script to collect data from a website, and I organized its flow and tasks with Prefect. It ran for a while on a single-board computer, and everything was fine. All the flow runs were reported to Prefect Cloud, where I could check for issues. Scheduling was done simply via crontab.

I had to containerize the script to run it on another system, and I also tried to automate the login to Prefect Cloud…but this does not seem to be working at all. Prefect still handles all failures and retries within the flow, but it never communicates with Prefect Cloud to “report” the flow runs and task status.

I tried to automate the login by adding a subprocess call to my Python script, just before running the flow.

The basic architecture of the code is this:

Flow definition (the tasks are defined earlier in the file)

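# Imports used by the excerpts in this post (they sit at the top of main.py)
import os
import subprocess
from datetime import datetime

from prefect import flow, get_run_logger
from prefect.task_runners import SequentialTaskRunner

@flow(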
    name="My flow",
    description="main flow for gathering data",
    retries=4,
    retry_delay_seconds=120,
    task_runner=SequentialTaskRunner(),
)
def main_flow():
    # Start logger
    logger = get_run_logger()
    logger.info(
        "Data collection started at {}".format(
            datetime.now().strftime(format="%Y-%m-%d %H:%M")
        )
    )
    # Do more stuff
    ...

Registration on Prefect Cloud and run (PREFECT_TK and PREFECT_WS are two environment variables holding my API key and workspace on Prefect Cloud)

if __name__ == "__main__":
    # Register with Prefect
    subprocess.run(
        [
            "prefect",
            "cloud",
            "login",
            "-k",
            os.getenv("PREFECT_TK"),
            "-w",
            os.getenv("PREFECT_WS"),
        ],
        capture_output=True,
    )
    main_flow()
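
One thing worth noting about the call above: with capture_output=True and no check of the return code, a failed login is discarded silently, so the script carries on as if nothing happened. A minimal sketch of a checked variant follows. The helper names build_login_command and run_checked are hypothetical (not part of Prefect), and the sketch only assumes the same `prefect cloud login -k ... -w ...` invocation shown above.

```python
import subprocess


def build_login_command(key, workspace):
    """Return the argv list for `prefect cloud login`.

    Raises ValueError if either value is missing, e.g. because the
    environment variable was not passed into the container.
    """
    if not key or not workspace:
        raise ValueError("PREFECT_TK and PREFECT_WS must both be set")
    return ["prefect", "cloud", "login", "-k", key, "-w", workspace]


def run_checked(command):
    """Run a command and raise if it exits with a non-zero status.

    The error message carries the exit code and stderr only, not the
    command line, so the API key does not end up in the logs.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            "command failed with exit code {}: {}".format(
                result.returncode, result.stderr.strip()
            )
        )
    return result.stdout
```

In main.py, the bare subprocess.run(...) call could then become run_checked(build_login_command(os.getenv("PREFECT_TK"), os.getenv("PREFECT_WS"))), which at least makes a failed or skipped login visible in the container logs.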

I get this warning at the beginning of the execution when I run my container, but I do not understand what it means or how to fix it:

/usr/local/lib/python3.11/contextlib.py:144: SAWarning: Skipped unsupported reflection of expression-based index ix_flow_run__coalesce_start_time_expected_start_time_desc
next(self.gen)

At this stage I cannot install any Prefect agent on the new host; Prefect can only live inside the Docker container and talk to Prefect Cloud from there. Any suggestion would be welcome!

Have you seen some of our getting started tutorials, e.g. this one or this one? You only need to log in to Cloud once from your terminal; there is no need to bake the login into your script.

the flow looks good :+1:

Hi Anna, thanks for your reply. I saw the tutorials, and when the flow was executed directly on a small server everything was working fine.

Now the flow is inside a Docker container that gets created for each run (and disposed of afterwards), and it no longer sends any reports to Prefect Cloud.

The new server only has Docker and a cron scheduler (I cannot install anything else there). So the connection with Prefect Cloud needs to be created every time I spin up the container (at least I think so), but it always fails (on Prefect Cloud I see no trace of the flow runs).

Should I create a shell script that connects to Prefect Cloud before I start the Python script with the flow? Or is there an option to bake the Prefect Cloud login into the container?

Here is my current Dockerfile for reference:

FROM python:3

WORKDIR /srv

COPY . /srv
# Update sources and install dependencies
RUN apt-get update -y
RUN apt-get install zip -y
RUN apt-get install unzip -y
RUN apt-get install sqlite3 -y

RUN pip install --upgrade pip
RUN pip install wheel
RUN pip install --no-cache-dir -r requirements.txt

# Install chromium and chromiumdriver
RUN apt-get install chromium -y
RUN apt-get install chromium-driver -y

CMD ["python", "main.py"]

perhaps try this way?

I also strongly recommend using the Prefect base image e.g. FROM prefecthq/prefect:2-python3.10

OK, I solved the issue by creating a very simple shell script that first logs in to Prefect Cloud and then starts the Python script. Now the runs are tracked correctly.
It is probably important that the login to Prefect Cloud happens BEFORE the script that uses Prefect actually starts.

Updated Dockerfile:

FROM python:3

WORKDIR /srv

COPY . /srv
# Update sources and install dependencies
RUN apt-get update -y
RUN apt-get install zip -y
RUN apt-get install unzip -y
RUN apt-get install sqlite3 -y

RUN pip install --upgrade pip
RUN pip install wheel
RUN pip install --no-cache-dir -r requirements.txt

# Install chromium and chromiumdriver
RUN apt-get install chromium -y
RUN apt-get install chromium-driver -y

CMD [ "/bin/bash", "run.sh" ]

And this is the content of run.sh:

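#!/bin/bash
# Log in to Prefect Cloud first (variables are quoted in case they contain spaces)
prefect cloud login -k "$PREFECT_TK" -w "$PREFECT_WS"
# Then start the script that defines and runs the flow
python main.py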

I then start the container with:

docker run --rm --env PREFECT_TK=MyPrefectKey --env PREFECT_WS=MyPrefectWorkspace myimage
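
A possible alternative that avoids the login step altogether: Prefect 2 also reads the settings PREFECT_API_KEY and PREFECT_API_URL from the environment, so passing those two variables to the container should be enough for the flow to report to Prefect Cloud. The sketch below is only a pre-flight check under that assumption. The helper names cloud_api_url and missing_settings are hypothetical, and the account and workspace IDs are placeholders (they are the IDs shown in the workspace URL, not the account/workspace handle used for PREFECT_WS).

```python
REQUIRED_SETTINGS = ("PREFECT_API_KEY", "PREFECT_API_URL")


def cloud_api_url(account_id, workspace_id):
    """Build the Prefect Cloud API URL for one workspace.

    Both arguments are placeholder IDs supplied by the caller.
    """
    return "https://api.prefect.cloud/api/accounts/{}/workspaces/{}".format(
        account_id, workspace_id
    )


def missing_settings(environ):
    """Return the names of required settings that are absent or empty."""
    return [name for name in REQUIRED_SETTINGS if not environ.get(name)]
```

At the top of main.py, a call such as missing_settings(os.environ) could be used to stop early with a clear message when the container was started without the variables. The container would then be started with --env PREFECT_API_KEY=... and --env PREFECT_API_URL=... instead of PREFECT_TK and PREFECT_WS, and run.sh would no longer need the login line.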

I will try the Prefect image too. Is it Debian-based like the Python one?

It’s based on the Python slim image, which in turn is based on debian:bullseye-slim.