Click “Launch a new EC2 instance” in your preferred region, then select the Ubuntu 20.04 AMI:
Follow all the default steps until the following step:
Create a key pair if needed:
Use the connect instructions explained under the Connect section to SSH to the instance:
Then, once you have connected over SSH, run everything as
Create a file called
install_script.bash using e.g. vim or simply with
cat >> install_script.bash  # paste the lines below, press Enter, then press Ctrl+D to finish
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install software-properties-common -y
sudo apt-get install python3-dateutil -y
sudo apt install python3-pip -y
sudo apt install docker.io -y
PATH="$HOME/.local/bin:$PATH"
export PATH
pip3 install -U "prefect>=2.0b" supervisor
After running this script, you can check if Prefect was properly installed using the
prefect version command:
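As a complement to the prefect version check, a minimal Python sketch can confirm that the packages and command-line tools installed by the script are visible to the current user. The helper name check_install is hypothetical and not part of Prefect; it relies only on the standard library and reports None for anything it cannot find.

```python
import shutil
from importlib import metadata


def check_install(
    packages=("prefect", "supervisor"),
    commands=("prefect", "supervisord", "docker"),
):
    """Report installed package versions and the resolved paths of CLI tools."""
    report = {"packages": {}, "commands": {}}
    for name in packages:
        try:
            report["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report["packages"][name] = None  # not installed for this interpreter
    for name in commands:
        report["commands"][name] = shutil.which(name)  # None if not on PATH
    return report


if __name__ == "__main__":
    for section, items in check_install().items():
        for name, value in items.items():
            print(f"{section}: {name} -> {value or 'MISSING'}")
```

If a package version is reported but the matching command is missing, one likely cause is that $HOME/.local/bin is not on PATH in the current shell, which is what the PATH lines in the install script address.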
To allow flow storage with S3, attach an IAM role to the instance:
For convenience in a simple PoC, you may create a role with
Once you save it, your instance should have access to S3. This is required so that the execution environment can pull flow code from S3 storage.
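If you later want a narrower grant than a broad PoC role, the following sketch builds an IAM policy document scoped to a single bucket. The bucket name is a placeholder and the chosen actions are an assumption about what flow storage needs; your setup may require a different set.

```python
import json


def s3_flow_storage_policy(bucket_name):
    """Build an IAM policy document limited to a single S3 bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading and writing applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-prefect-flows" is a hypothetical bucket name, not one from this tutorial.
    print(json.dumps(s3_flow_storage_policy("my-prefect-flows"), indent=2))
```

The printed JSON can be pasted into the policy editor when creating the role.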
Create a new API key: Prefect Cloud 2.0
Copy the key and use it in this command:
prefect cloud login --key YOUR_API_KEY
This will prompt you to select a workspace - choose your workspace and hit Enter.
The command above stores several settings in your default profile. You can view them using
prefect config view:
PREFECT_PROFILE='default'
PREFECT_API_URL='https://api-beta.prefect.io/api/accounts/acc_id/workspaces/workspace_id'
PREFECT_API_KEY='YOUR_API_KEY'
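If you need these values in a script, the output shown above is easy to parse. The sketch below is an illustration rather than a Prefect API: the function names parse_config_view and workspace_ids are hypothetical, and it assumes the KEY='value' format and the URL shape shown in this example.

```python
import shlex
from urllib.parse import urlparse


def parse_config_view(output):
    """Parse KEY='value' pairs in the format shown by `prefect config view`."""
    settings = {}
    for token in shlex.split(output):  # shlex strips the surrounding quotes
        key, sep, value = token.partition("=")
        if sep:
            settings[key] = value
    return settings


def workspace_ids(api_url):
    """Extract the account and workspace IDs from a Cloud API URL."""
    parts = urlparse(api_url).path.strip("/").split("/")
    # Assumed path shape: api/accounts/<account>/workspaces/<workspace>
    account = parts[parts.index("accounts") + 1]
    workspace = parts[parts.index("workspaces") + 1]
    return account, workspace


sample = (
    "PREFECT_PROFILE='default' "
    "PREFECT_API_URL='https://api-beta.prefect.io/api/accounts/acc_id/workspaces/workspace_id' "
    "PREFECT_API_KEY='YOUR_API_KEY'"
)
settings = parse_config_view(sample)
print(settings["PREFECT_PROFILE"])
print(workspace_ids(settings["PREFECT_API_URL"]))
```

Avoid printing or logging PREFECT_API_KEY in real scripts, since it grants access to your workspace.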
In this example, we create a work queue named “ubuntu”. An agent polling that work queue will deploy flow runs for any deployment that has a tag
prefect work-queue create -t ubuntu ubuntu
The output of that command:
The above command generated a UUID that you’ll need to start an agent, as explained in the next section.
Create a file called
supervisord.conf with the following contents (replace the example UUID with your WORK_QUEUE_ID):
[unix_http_server]
file=/tmp/supervisor.sock ; the path to the socket file

[supervisord]
loglevel=debug ; log level; default info; others: debug,warn,trace

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket

[program:prefect-agent]
command=prefect agent start 360ff4b5-36d3-4bfe-8a08-e24f9aeaa86a
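To avoid hand-editing the UUID, the config above can also be rendered from a template. This is a minimal sketch, assuming the same settings as the config shown here; the function name render_supervisord_conf is hypothetical, and the UUID check only catches malformed values, not IDs that do not exist in your workspace.

```python
import uuid

SUPERVISORD_TEMPLATE = """\
[unix_http_server]
file=/tmp/supervisor.sock

[supervisord]
loglevel=debug

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock

[program:prefect-agent]
command=prefect agent start {work_queue_id}
"""


def render_supervisord_conf(work_queue_id):
    """Return the config text with a validated work queue UUID filled in."""
    validated = str(uuid.UUID(work_queue_id))  # raises ValueError if malformed
    return SUPERVISORD_TEMPLATE.format(work_queue_id=validated)


if __name__ == "__main__":
    # The UUID below is the example value from the config above; use your own.
    print(render_supervisord_conf("360ff4b5-36d3-4bfe-8a08-e24f9aeaa86a"))
```

Redirecting the printed text into supervisord.conf gives a file equivalent to the one above, minus the inline comments.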
Now to start the agent, run:
supervisord -c ./supervisord.conf
Create a flow file called
import platform
from prefect import task, flow
from prefect import get_run_logger


@task
def say_hi():
    logger = get_run_logger()
    logger.info("Hello world!")


@task
def print_platform_info():
    logger = get_run_logger()
    logger.info(
        "Platform information: IP = %s, Python = %s, EC2 instance type = %s, OS Version = %s",
        platform.node(),
        platform.python_version(),
        platform.platform(),
        platform.version(),
    )


@flow
def work_queue_test_flow():
    hi = say_hi()
    print_platform_info(wait_for=[hi])


from prefect.deployments import DeploymentSpec

DeploymentSpec(
    name="hello_world_local",
    flow=work_queue_test_flow,
    # flow_location is inferred from flow
    tags=["local"],
)

DeploymentSpec(
    name="hello_world_ubuntu",
    flow=work_queue_test_flow,
    # flow_location is inferred from flow
    tags=["ubuntu"],
)
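The flow file defines two deployments that differ only in their tag. The sketch below is a simplified model of how a single-tag work queue selects between them; it is not Prefect's implementation, and the Deployment class and deployments_for_queue function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Deployment:
    name: str
    tags: list = field(default_factory=list)


def deployments_for_queue(queue_tag, deployments):
    """Return the names of deployments a single-tag work queue would serve."""
    return [d.name for d in deployments if queue_tag in d.tags]


deployments = [
    Deployment("hello_world_local", tags=["local"]),
    Deployment("hello_world_ubuntu", tags=["ubuntu"]),
]

# Only the deployment tagged "ubuntu" is selected by the "ubuntu" queue.
print(deployments_for_queue("ubuntu", deployments))
```

In this model the agent on the EC2 instance, which polls the “ubuntu” queue, never sees runs of hello_world_local; those would need a separate queue and agent.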
You can run the following commands from your laptop (not from EC2):
prefect deployment create work_queue_test_flow.py
prefect deployment run work-queue-test-flow/hello_world_ubuntu
Note that it generated a flow run with a name
If you go to your Cloud 2.0 dashboard, you should see a successful flow run. The flow run logs confirm that it was executed on the remote Ubuntu EC2 instance, even though we triggered it from a terminal on our local development machine.
But what if you stop your VM for the night and restart it later? To ensure you don’t have to go through this entire process again, you can use the following command:
echo "@reboot root supervisord -c /home/ubuntu/supervisord.conf -l /home/ubuntu/supervisord.log -u root" >> /etc/crontab
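Because the echo command appends unconditionally, running it twice leaves two identical @reboot entries. As an optional variant, the sketch below builds the same line and appends it only if it is missing. The helper names reboot_entry and ensure_line are hypothetical, and the demo writes to a temporary file; pointing it at /etc/crontab would require root.

```python
import tempfile
from pathlib import Path


def reboot_entry(conf_path, log_path, user="root"):
    """Build the @reboot crontab line that restarts supervisord."""
    return f"@reboot {user} supervisord -c {conf_path} -l {log_path} -u {user}"


def ensure_line(path, line):
    """Append line to the file unless already present. Returns True if added."""
    target = Path(path)
    text = target.read_text() if target.exists() else ""
    if line in text.splitlines():
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with target.open("a") as fh:
        fh.write(prefix + line + "\n")
    return True


if __name__ == "__main__":
    entry = reboot_entry(
        "/home/ubuntu/supervisord.conf", "/home/ubuntu/supervisord.log"
    )
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "crontab"
        print(ensure_line(demo, entry))  # first call appends the entry
        print(ensure_line(demo, entry))  # second call leaves the file unchanged
```

Either approach produces the same crontab entry; the script form is only safer to re-run.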
After running this command, stop your instance and try to rerun the same deployment from the UI:
Note that this flow run will be shown as late, because no agent can pick it up while the instance is stopped:
Now let’s start the instance again:
Once the instance boots up, we should see that the late flow run gets automatically picked up and executed!