Click on “Launch a new EC2 instance” within your preferred region and then select the Ubuntu 20.04 AMI:
Follow all the default steps until the following step:
Create a key pair if needed:
Use the connect instructions explained under the Connect section to SSH to the instance:
Then, once you have connected to it over SSH, run everything as
Create a file called install_script.bash using e.g. vim or simply with:
cat >> install_script.bash # paste the lines below and then press Ctrl+D to finish
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install software-properties-common -y
sudo apt-get install python3-dateutil -y
sudo apt install python3-pip -y
sudo ln -s /usr/bin/python3 /usr/bin/python
# sudo apt install docker.io -y
PATH="$HOME/.local/bin:$PATH"
export PATH
pip3 install prefect supervisor s3fs
After running this script (for example with bash install_script.bash), you can check whether Prefect was installed properly using the prefect version command:
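If prefect version reports "command not found", the executables may have been installed into $HOME/.local/bin without that directory being on the PATH of your current shell (the script exports PATH only for the shell that runs it). The following is a minimal sketch of a check for that case, using only the Python standard library; the helper name missing_commands is an illustrative assumption, not part of Prefect or supervisor.

```python
import shutil


def missing_commands(names, path=None):
    """Return the commands from `names` that cannot be found on PATH.

    `path` is an optional PATH-style string; None means the current PATH.
    """
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    # Both executables are expected after `pip3 install prefect supervisor`.
    missing = missing_commands(["prefect", "supervisord"])
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("prefect and supervisord are both on PATH")
```

If either command is reported as missing, adding $HOME/.local/bin to PATH in your shell profile is usually enough.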
Attach IAM role with S3 access to the instance
To allow flow storage with S3, attach an IAM role to the instance:
For convenience in a simple PoC, you may create a role with
Once you save it, your instance should have access to S3! This is required so that your execution environment can pull the flow code from S3 storage.
Authenticate with Prefect Cloud 2.0
Create a new API key in Prefect Cloud.
Copy the key and use it with this command:
prefect cloud login --key YOUR_API_KEY
This will prompt you to select a workspace - choose your workspace and hit Enter.
The result of the above command is a set of settings stored in your default profile. You can view them using prefect config view:
PREFECT_PROFILE='default'
PREFECT_API_URL='https://api.prefect.cloud/api/accounts/acc_id/workspaces/workspace_id'
PREFECT_API_KEY='YOUR_API_KEY'
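To confirm that the profile really points at Prefect Cloud rather than a local API, you could parse that output and check the shape of PREFECT_API_URL. The sketch below assumes the KEY='value' format shown above; the function names parse_config_view and looks_like_cloud_url are illustrative and not part of Prefect.

```python
import shlex
from urllib.parse import urlparse


def parse_config_view(output: str) -> dict:
    """Parse KEY='value' pairs as printed by `prefect config view`."""
    settings = {}
    for token in shlex.split(output):  # shlex strips the surrounding quotes
        key, sep, value = token.partition("=")
        if sep:
            settings[key] = value
    return settings


def looks_like_cloud_url(api_url: str) -> bool:
    """Check for the /api/accounts/<id>/workspaces/<id> path shape."""
    parts = urlparse(api_url).path.strip("/").split("/")
    return (
        len(parts) == 5
        and parts[0] == "api"
        and parts[1] == "accounts"
        and parts[3] == "workspaces"
    )


if __name__ == "__main__":
    # Placeholder values copied from the example output in this post.
    sample = (
        "PREFECT_PROFILE='default'\n"
        "PREFECT_API_URL='https://api.prefect.cloud/api/accounts/acc_id/workspaces/workspace_id'\n"
        "PREFECT_API_KEY='YOUR_API_KEY'\n"
    )
    settings = parse_config_view(sample)
    print(sorted(settings))
    print(looks_like_cloud_url(settings["PREFECT_API_URL"]))
```

On the instance, you would feed the function the real output of prefect config view instead of the sample string.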
Create a work-queue for flows deployed to this EC2 instance
In this example, we create a work queue named “ubuntu”. An agent polling that work queue will pick up flow runs for any deployment that has a tag
prefect work-queue create -t ubuntu ubuntu
The output of that command:
The above command generated a UUID that you’ll need to start an agent, as explained in the next section.
Starting a supervisor process
Create a file called supervisord.conf with the following contents (replace the queue in the prefect agent start command with your own work queue):
[unix_http_server]
file=/tmp/supervisor.sock ; the path to the socket file

[supervisord]
loglevel=debug ; log level; default info; others: debug,warn,trace

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket

[program:prefect-agent]
command=prefect agent start -q dev
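Because the queue in the agent command has to match the work queue your deployments target, it can help to generate the file from a single parameter rather than editing it by hand. The following is a minimal sketch using Python's configparser; the function name render_supervisord_conf and its arguments are assumptions made for illustration, and the generated file omits the inline comments.

```python
import configparser
import io


def render_supervisord_conf(queue: str, socket_path: str = "/tmp/supervisor.sock") -> str:
    """Return supervisord.conf text that runs a Prefect agent for `queue`."""
    config = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    config["unix_http_server"] = {"file": socket_path}
    config["supervisord"] = {"loglevel": "debug"}
    config["rpcinterface:supervisor"] = {
        "supervisor.rpcinterface_factory": "supervisor.rpcinterface:make_main_rpcinterface"
    }
    config["supervisorctl"] = {"serverurl": f"unix://{socket_path}"}
    # Same agent command as in the file above, with the queue as a parameter.
    config["program:prefect-agent"] = {"command": f"prefect agent start -q {queue}"}

    buffer = io.StringIO()
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the config for the "dev" queue; redirect to supervisord.conf to save it.
    print(render_supervisord_conf("dev"), end="")
```

For example, python render_conf.py > supervisord.conf would write the file, assuming the sketch is saved under that (hypothetical) name.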
Now to start the agent, run:
supervisord -c ./supervisord.conf
Start a simple flow to test that it worked!
Create a flow file called
import platform
import prefect
from prefect import task, flow
from prefect import get_run_logger
from prefect.orion.api.server import ORION_API_VERSION
import sys


@task
def log_platform_info():
    logger = get_run_logger()
    logger.info("Host's network name = %s", platform.node())
    logger.info("Python version = %s", platform.python_version())
    logger.info("Platform information (instance type) = %s ", platform.platform())
    logger.info("OS/Arch = %s/%s", sys.platform, platform.machine())
    logger.info("Prefect Version = %s 🚀", prefect.__version__)
    logger.info("Prefect API Version = %s", ORION_API_VERSION)


@flow
def healthcheck():
    log_platform_info()


if __name__ == "__main__":
    healthcheck()
You can run the following commands locally (not from EC2):
prefect deployment build flows/healthcheck.py:healthcheck --name cicd -q prod -t project -o deploy/s3.yaml -sb s3/prod -v GITHUB_SHA
prefect deployment apply deploy/s3.yaml
prefect deployment run healthcheck/s3
Note that it generated a flow run with a name
If you go to your Cloud 2.0 dashboard, you should see a flow run confirming the success. The flow run logs should confirm that the flow run was executed on a remote Ubuntu EC2 instance, even though we triggered it from a terminal on our local development machine.
Ensuring the agent starts on VM reboot
But what if you stop your VM for the night and restart it later? To ensure you don’t have to go through this entire process again, you can use the following command:
echo "@reboot root supervisord -c /home/ubuntu/supervisord.conf -l /home/ubuntu/supervisord.log -u root" >> /etc/crontab
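Note that running the echo command a second time appends a duplicate line, so supervisord would be started twice at boot. As a hedged alternative, the sketch below appends the same entry only if it is not already present; the function name ensure_reboot_entry is hypothetical, and the demo writes to a temporary file rather than /etc/crontab.

```python
import tempfile
from pathlib import Path


def ensure_reboot_entry(crontab: Path, conf: str, log: str, user: str = "root") -> bool:
    """Append the @reboot line to `crontab` unless it is already there.

    Returns True if the line was appended, False if it already existed.
    """
    entry = f"@reboot {user} supervisord -c {conf} -l {log} -u {user}"
    text = crontab.read_text() if crontab.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with crontab.open("a") as handle:
        handle.write(prefix + entry + "\n")
    return True


if __name__ == "__main__":
    # Demo against a temporary file; the second call is a no-op.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "crontab"
        conf = "/home/ubuntu/supervisord.conf"
        log = "/home/ubuntu/supervisord.log"
        print(ensure_reboot_entry(demo, conf, log))
        print(ensure_reboot_entry(demo, conf, log))
        print(demo.read_text(), end="")
```

To use it for real, you would pass Path("/etc/crontab") and run the script as root.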
After running this command, stop your instance and try to rerun the same deployment from the UI:
Note that this flow run will be shown as late, because no agent can pick it up while the instance is stopped:
Now let’s start the instance again:
Once the instance boots up, we should see that the late flow run gets automatically picked up and executed!