How to deploy a Prefect 2.0 agent to an EC2 instance as your execution layer?

Click “Launch a new EC2 instance” in your preferred region, then select the Ubuntu 20.04 AMI:

Follow all the default steps until the following step:

Create a key pair if needed:

Follow the instructions in the instance’s Connect section to SSH into the instance:

Once you are connected over SSH, switch to the root user so that everything runs with root privileges: sudo su.

Create a file called install_script.bash, either with an editor such as vim or simply with the cat command:

cat >> install_script.bash
# paste the lines below, press Enter, then press Ctrl+D to save and exit
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install software-properties-common -y
sudo apt-get install python3-dateutil -y
sudo apt install python3-pip -y
sudo ln -s /usr/bin/python3 /usr/bin/python
# sudo apt install docker.io -y
PATH="$HOME/.local/bin:$PATH"
export PATH
pip3 install prefect supervisor s3fs

Run the script (for example with: bash install_script.bash). Afterwards, you can check that Prefect was installed properly using the prefect version command:
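If you also want to confirm that all three packages from the pip3 line are visible to Python, a minimal sketch like the one below can help. It only checks that the modules can be located and does not verify versions; the function name find_missing is made up for this example.

```python
import importlib.util

# Packages installed by the pip3 line in install_script.bash.
REQUIRED_PACKAGES = ("prefect", "supervisor", "s3fs")


def find_missing(packages=REQUIRED_PACKAGES):
    """Return the names of packages that Python cannot locate."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing()
print("missing packages:", missing or "none")
```

An empty result suggests the install script finished without skipping a package.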

Attach IAM role with S3 access to the instance

To allow flow storage with S3, attach an IAM role to the instance:

For convenience in a simple PoC, you may create a role with the AmazonS3FullAccess managed policy. For anything beyond a PoC, restrict access to the bucket that holds your flows.

Once you save it, your instance should have access to S3! This is required so that your execution environment can pull the flow code from S3 storage.
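To sanity-check the role from the instance, you could try listing your bucket with s3fs, which the install script already pulled in. This is only a sketch: the helper name can_list_bucket and the bucket name my-prefect-flows are placeholders, and it assumes s3fs picks up the instance role through the default AWS credential chain.

```python
def can_list_bucket(fs, bucket: str) -> bool:
    """Return True if listing the bucket succeeds with the given filesystem."""
    try:
        fs.ls(bucket)
        return True
    except Exception as exc:  # e.g. access denied, missing bucket, no credentials
        print(f"Cannot list {bucket!r}: {exc}")
        return False


# On the EC2 instance (hypothetical bucket name):
#   import s3fs
#   can_list_bucket(s3fs.S3FileSystem(), "my-prefect-flows")
```

The filesystem object is passed in rather than created inside the function, so the check can be tried with a stub before running it against real S3.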

Authenticate with Prefect Cloud 2.0

Create a new API key: Prefect Cloud

Copy the key and use it with this command:

prefect cloud login --key YOUR_API_KEY

This will prompt you to select a workspace - choose your workspace and hit Enter.

This command stores several settings in your default profile. You can view them using prefect config view:

PREFECT_PROFILE='default'
PREFECT_API_URL='https://api.prefect.cloud/api/accounts/acc_id/workspaces/workspace_id'
PREFECT_API_KEY='YOUR_API_KEY'
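The API URL follows the pattern visible in the output above. If you ever need to assemble these values yourself, for example to provide them as environment variables instead of logging in interactively, a small helper along these lines may be handy. The function names are invented for this sketch, and acc_id and workspace_id are the same placeholders as in the output above.

```python
def prefect_cloud_api_url(account_id: str, workspace_id: str) -> str:
    """Build the workspace-scoped API URL in the format shown by prefect config view."""
    return (
        "https://api.prefect.cloud/api"
        f"/accounts/{account_id}/workspaces/{workspace_id}"
    )


def prefect_env(account_id: str, workspace_id: str, api_key: str) -> dict:
    """Return the two settings as a mapping suitable for environment variables."""
    return {
        "PREFECT_API_URL": prefect_cloud_api_url(account_id, workspace_id),
        "PREFECT_API_KEY": api_key,
    }


print(prefect_cloud_api_url("acc_id", "workspace_id"))
```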

Create a work-queue for flows deployed to this EC2 instance

In this example, we create a work queue named “ubuntu”. An agent polling that work queue will pick up flow runs for any deployment that has the tag ubuntu:

prefect work-queue create -t ubuntu ubuntu

The output of that command:

The above command generated a UUID that you’ll need to start an agent, as explained in the next section.

Starting a supervisor process

Create a file called supervisord.conf with the following contents (adjust the agent command so that it references your own work queue):

[unix_http_server]
file=/tmp/supervisor.sock   ; the path to the socket file

[supervisord]
loglevel=debug               ; log level; default info; others: debug,warn,trace

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket

[program:prefect-agent]
command=prefect agent start -q ubuntu        ; poll the "ubuntu" work queue created earlier
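If you manage several instances or queues, you might prefer to generate this file rather than edit it by hand. The sketch below uses Python's standard configparser to render the same sections for a given queue name; the function name render_supervisord_conf is made up for this example, and the output uses "key = value" spacing, which supervisor should read the same way.

```python
import configparser
import io


def render_supervisord_conf(queue_name: str, socket_path: str = "/tmp/supervisor.sock") -> str:
    """Render a supervisord.conf that starts a Prefect agent for one work queue."""
    # interpolation=None keeps any "%" characters in values literal.
    config = configparser.ConfigParser(interpolation=None)
    config["unix_http_server"] = {"file": socket_path}
    config["supervisord"] = {"loglevel": "debug"}
    config["rpcinterface:supervisor"] = {
        "supervisor.rpcinterface_factory": "supervisor.rpcinterface:make_main_rpcinterface"
    }
    config["supervisorctl"] = {"serverurl": f"unix://{socket_path}"}
    config["program:prefect-agent"] = {
        "command": f"prefect agent start -q {queue_name}"
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(render_supervisord_conf("ubuntu"))
```

Redirecting the printed text into supervisord.conf gives you a file equivalent to the one above, with the queue name as the only moving part.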

Now to start the agent, run:

supervisord -c ./supervisord.conf

Start a simple flow to test that it worked!

Create a flow file called work_queue_test_flow.py:

import platform
import prefect
from prefect import task, flow
from prefect import get_run_logger
from prefect.orion.api.server import ORION_API_VERSION
import sys


@task
def log_platform_info():
    logger = get_run_logger()
    logger.info("Host's network name = %s", platform.node())
    logger.info("Python version = %s", platform.python_version())
    logger.info("Platform information (instance type) = %s ", platform.platform())
    logger.info("OS/Arch = %s/%s", sys.platform, platform.machine())
    logger.info("Prefect Version = %s 🚀", prefect.__version__)
    logger.info("Prefect API Version = %s", ORION_API_VERSION)


@flow
def healthcheck():
    log_platform_info()


if __name__ == "__main__":
    healthcheck()

You can run the following commands locally (not from EC2):

prefect deployment build work_queue_test_flow.py:healthcheck --name cicd -q ubuntu -t project -o deploy/s3.yaml -sb s3/prod -v GITHUB_SHA
prefect deployment apply deploy/s3.yaml
prefect deployment run healthcheck/cicd

Note that this generated a flow run named lavender-pigeon :bird:

If you go to your Cloud 2.0 dashboard, you should see a flow run confirming the success. The flow run logs should confirm that the run was executed on the remote Ubuntu EC2 instance, even though we triggered it from a terminal on our local development machine.

Ensuring the agent starts on VM reboot

But what if you stop your VM for the night and restart it later? To ensure you don’t have to go through this entire process again, you can use the following command:

echo "@reboot root supervisord -c /home/ubuntu/supervisord.conf -l /home/ubuntu/supervisord.log -u root" >> /etc/crontab
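One caveat: because the echo command appends with >>, running it twice leaves two identical @reboot entries in /etc/crontab. If that worries you, a small idempotent variant could look like the sketch below. The helper name ensure_line is invented for this example, and it must be run as root to write to /etc/crontab.

```python
from pathlib import Path

# Same entry as in the echo command above.
REBOOT_ENTRY = (
    "@reboot root supervisord -c /home/ubuntu/supervisord.conf "
    "-l /home/ubuntu/supervisord.log -u root"
)


def ensure_line(path, line: str) -> bool:
    """Append line to the file unless an identical line exists. Return True if added."""
    target = Path(path)
    text = target.read_text() if target.exists() else ""
    if line in text.splitlines():
        return False
    # Start on a fresh line if the file does not already end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with target.open("a") as handle:
        handle.write(f"{prefix}{line}\n")
    return True


# As root on the instance:
#   ensure_line("/etc/crontab", REBOOT_ENTRY)
```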

After running this command, stop your instance and try to rerun the same deployment from the UI:

Note that this flow run will be shown as late, since no agent can pick it up while the instance is stopped:

Now let’s start the instance again:

Once the instance boots up, we should see that the late flow run gets automatically picked up and executed!

Delightful, indeed! :smile:
