How to deploy a Prefect 2.0 agent to an EC2 instance as your execution layer?

Click “Launch a new EC2 instance” in your preferred region, then select the Ubuntu 20.04 AMI:

Follow all the default steps until the following step:

Create a key pair if needed:

Follow the instructions in the instance's Connect section to SSH into the instance:

Once you are connected over SSH, switch to the root user so that everything runs with root privileges: sudo su.

Create a file called install_script.bash, for example with vim or simply with the cat command:

cat >> install_script.bash
# paste the lines below, then press Ctrl+D on an empty line to finish
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install software-properties-common -y
sudo apt-get install python3-dateutil -y
sudo apt install python3-pip -y
sudo apt install docker.io -y
PATH="$HOME/.local/bin:$PATH"
export PATH
pip3 install -U "prefect>=2.0b" supervisor

Run the script (for example with bash install_script.bash), then check that Prefect was installed properly using the prefect version command.
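If you prefer to script this check, the sketch below looks for the prefect executable on PATH and, if it is found, prints whatever prefect version reports. The helper name find_cli is made up for this example; it only relies on shutil.which and subprocess.run from the standard library.

```python
import shutil
import subprocess


def find_cli(name: str):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


def main() -> None:
    path = find_cli("prefect")
    if path is None:
        # pip3 may place scripts in ~/.local/bin; when the install script is
        # run with bash, its PATH change does not carry over to your shell.
        print("prefect not found on PATH; check whether ~/.local/bin is included")
        return
    result = subprocess.run([path, "version"], capture_output=True, text=True)
    print(result.stdout or result.stderr)


if __name__ == "__main__":
    main()
```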

Attach IAM role with S3 access to the instance

To allow flow storage with S3, attach an IAM role to the instance:

For convenience in a simple PoC, you may create a role with the AmazonS3FullAccess managed policy.

Once you save it, your instance should have access to S3! This is required so that your execution environment can pull the flow code from S3 storage.
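For anything beyond a PoC, a narrower policy is usually preferable to full S3 access. The sketch below builds an IAM policy document limited to a single bucket. The bucket name my-prefect-flows and the helper flow_storage_policy are placeholders invented for this example, and the action list (list, read, write) is an assumption about what flow storage needs, so adjust it to your setup.

```python
import json


def flow_storage_policy(bucket: str) -> dict:
    """Build an IAM policy document scoped to a single S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading and writing apply to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-prefect-flows" is a hypothetical bucket name.
    print(json.dumps(flow_storage_policy("my-prefect-flows"), indent=2))
```

The printed JSON can be pasted into the policy editor when creating the role.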

Authenticate with Prefect Cloud 2.0

Create a new API key in Prefect Cloud 2.0.

Copy the key and use it in this command:

prefect cloud login --key YOUR_API_KEY

You will be prompted to select a workspace. Choose yours and hit Enter.

This command stores several settings in your default profile. You can view them using prefect config view:

PREFECT_PROFILE='default'
PREFECT_API_URL='https://api-beta.prefect.io/api/accounts/acc_id/workspaces/workspace_id'
PREFECT_API_KEY='YOUR_API_KEY'
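To sanity-check the profile from a script, you could parse that output. This is a minimal sketch that assumes the output consists of NAME='value' lines like the ones shown above; parse_config_view is a hypothetical helper, not part of Prefect.

```python
def parse_config_view(text: str) -> dict:
    """Parse NAME='value' lines into a dict, skipping anything else."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        if not sep or not name:
            continue  # not a NAME=value line
        settings[name] = value.strip("'\"")
    return settings


sample = """\
PREFECT_PROFILE='default'
PREFECT_API_URL='https://api-beta.prefect.io/api/accounts/acc_id/workspaces/workspace_id'
PREFECT_API_KEY='YOUR_API_KEY'
"""

settings = parse_config_view(sample)
missing = [k for k in ("PREFECT_API_URL", "PREFECT_API_KEY") if k not in settings]
print("missing settings:", missing)  # missing settings: []
```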

Create a work-queue for flows deployed to this EC2 instance

In this example, we create a work queue named “ubuntu”. An agent polling that queue will pick up flow runs for any deployment that has the tag ubuntu:

prefect work-queue create -t ubuntu ubuntu

The command outputs a UUID for the new work queue. You’ll need it to start an agent, as explained in the next section.
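Since the queue ID has to be copied by hand, it can help to validate it before pasting it into a config file. The sketch below pulls the first UUID out of a piece of text. It makes no assumption about the exact wording of the CLI output, and first_uuid is a name chosen for this example.

```python
import re
import uuid

UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def first_uuid(text: str):
    """Return the first UUID found in text as a uuid.UUID, or None."""
    match = UUID_RE.search(text)
    return uuid.UUID(match.group(0)) if match else None


# The ID below is the example work queue ID used later in this post.
print(first_uuid("queue id: 360ff4b5-36d3-4bfe-8a08-e24f9aeaa86a"))
```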

Starting a supervisor process

Create a file called supervisord.conf with the following contents, replacing the UUID in the last line with your own work queue ID:

[unix_http_server]
file=/tmp/supervisor.sock   ; the path to the socket file

[supervisord]
loglevel=debug               ; log level; default info; others: debug,warn,trace

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket

[program:prefect-agent]
command=prefect agent start 360ff4b5-36d3-4bfe-8a08-e24f9aeaa86a
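To avoid hand-editing the UUID, the file can also be generated. This is a minimal sketch that assumes the same sections as the config above (without the inline comments); render_supervisord_conf is a hypothetical helper, and the UUID check simply rejects malformed IDs before they end up in the file.

```python
import configparser
import uuid

TEMPLATE = """\
[unix_http_server]
file=/tmp/supervisor.sock

[supervisord]
loglevel=debug

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock

[program:prefect-agent]
command=prefect agent start {work_queue_id}
"""


def render_supervisord_conf(work_queue_id: str) -> str:
    """Fill the work queue ID into the config after validating it as a UUID."""
    uuid.UUID(work_queue_id)  # raises ValueError for a malformed ID
    return TEMPLATE.format(work_queue_id=work_queue_id)


if __name__ == "__main__":
    conf = render_supervisord_conf("360ff4b5-36d3-4bfe-8a08-e24f9aeaa86a")
    # Parse the result back as a basic syntax check before writing it out.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf)
    print(parser["program:prefect-agent"]["command"])
```

Writing the returned string to supervisord.conf gives you the same file as the manual approach.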

Now to start the agent, run:

supervisord -c ./supervisord.conf

Start a simple flow to test that it worked!

Create a flow file called work_queue_test_flow.py:

import platform
from prefect import task, flow
from prefect import get_run_logger


@task
def say_hi():
    logger = get_run_logger()
    logger.info("Hello world!")


@task
def print_platform_info():
    logger = get_run_logger()
    logger.info(
        "Platform information: hostname = %s, Python = %s, platform = %s, OS version = %s",
        platform.node(),
        platform.python_version(),
        platform.platform(),
        platform.version(),
    )


@flow
def work_queue_test_flow():
    hi = say_hi()
    print_platform_info(wait_for=[hi])


from prefect.deployments import DeploymentSpec

DeploymentSpec(
    name="hello_world_local",
    flow=work_queue_test_flow,  # flow_location is inferred from flow
    tags=["local"],
)
DeploymentSpec(
    name="hello_world_ubuntu",
    flow=work_queue_test_flow,  # flow_location is inferred from flow
    tags=["ubuntu"],
)
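To make the role of the tags concrete, here is a toy model of which of the two deployments above a queue filtered on the ubuntu tag would pick up. This is not Prefect's implementation: queue_matches is invented for this example, and requiring every queue tag to be present on the deployment is an assumption (with a single tag, as here, it reduces to the rule described earlier).

```python
def queue_matches(queue_tags, deployment_tags) -> bool:
    """Toy rule (assumption): every queue tag must appear on the deployment."""
    return set(queue_tags) <= set(deployment_tags)


deployments = {
    "hello_world_local": ["local"],
    "hello_world_ubuntu": ["ubuntu"],
}

picked_up = [
    name for name, tags in deployments.items() if queue_matches(["ubuntu"], tags)
]
print(picked_up)  # ['hello_world_ubuntu']
```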

You can run the following commands from your laptop (not from EC2):

prefect deployment create work_queue_test_flow.py
prefect deployment run work-queue-test-flow/hello_world_ubuntu

Note that this generated a flow run named lavender-pigeon :bird:

If you go to your Cloud 2.0 dashboard, you should see the flow run marked as successful. Its logs should confirm that the run was executed on the remote Ubuntu EC2 instance, even though we triggered it from a terminal on our local development machine.

Ensuring the agent starts on VM reboot

But what if you stop your VM for the night and restart it later? To ensure you don’t have to go through this entire process again, you can add an @reboot entry to the system crontab so that supervisord (and with it the agent) starts on every boot:

echo "@reboot root supervisord -c /home/ubuntu/supervisord.conf -l /home/ubuntu/supervisord.log -u root" >> /etc/crontab
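Because >> appends unconditionally, running that command twice leaves two identical @reboot entries. Here is a small sketch of an idempotent variant; ensure_line is a hypothetical helper, and it would need to run as root to modify /etc/crontab.

```python
from pathlib import Path

CRON_LINE = (
    "@reboot root supervisord -c /home/ubuntu/supervisord.conf "
    "-l /home/ubuntu/supervisord.log -u root"
)


def ensure_line(path: str, line: str) -> bool:
    """Append line to the file unless it is already present.

    Returns True if the line was appended, False if it was already there.
    """
    target = Path(path)
    text = target.read_text() if target.exists() else ""
    if line in text.splitlines():
        return False
    # Start on a fresh line in case the file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with target.open("a") as handle:
        handle.write(f"{prefix}{line}\n")
    return True
```

Calling ensure_line("/etc/crontab", CRON_LINE) as root adds the entry on the first call and does nothing on later calls.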

After running this command, stop your instance and try to rerun the same deployment from the UI.

This flow run will be shown as late: with the instance stopped, there is no agent running to pick it up.

Now let’s start the instance again.

Once the instance boots up, we should see that the late flow run gets automatically picked up and executed!

Delightful, indeed! :smile: