Which Prefect Server components are stateless and can be scaled horizontally?

View in #prefect-community on Slack

M._Siddiqui @M._Siddiqui: Hello everyone, I am currently looking to try out a self-hosted Prefect Server and UI instance on AWS.
Based on the services I saw in the docker-compose file over here:
https://github.com/PrefectHQ/prefect/blob/master/src/prefect/cli/docker-compose.yml

Has anyone attempted to deploy this on an ECS Fargate cluster? Apart from the Postgres instance, I would like to deploy each service into its own Fargate instance. I am just wondering what some sane CPU and memory specs would be to kick things off, assuming I have a few flows with roughly 1,000 tasks overall, running twice every hour.
I would also like to know whether all these services are stateless, meaning I can scale them horizontally behind a load balancer. Or are there any services here that can only be scaled vertically in case of performance bottlenecks?
Any help would be appreciated! :pray:


@Anna_Geller: The Docker Compose setup is great if your workload can be handled on a single machine. Generally speaking, the database is the most important component that needs to scale, so using e.g. Aurora Postgres would be a good start. Other components should be fine on Fargate. For the scale that you mentioned, you would need to experiment with the settings to see what works best.
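
For a rough sense of that scale, here is a back-of-the-envelope estimate. It is only a sketch: it assumes "twice every hour" means two runs per hour around the clock and a 30-day month, neither of which is stated in the thread, and the names are made up for illustration.

```python
# Back-of-the-envelope workload estimate.
# Assumptions (not stated in the thread): runs happen around the clock,
# and a month has 30 days.
TASKS_PER_CYCLE = 1000  # "1000ish tasks" across all flows
CYCLES_PER_HOUR = 2     # "twice every hour"


def task_runs(hours: int) -> int:
    """Return the approximate number of task runs over the given hours."""
    return TASKS_PER_CYCLE * CYCLES_PER_HOUR * hours


print(f"per hour:  {task_runs(1):,}")        # per hour:  2,000
print(f"per day:   {task_runs(24):,}")       # per day:   48,000
print(f"per month: {task_runs(24 * 30):,}")  # per month: 1,440,000
```

Each task run writes state changes and logs to Postgres, so numbers like these are mostly useful for estimating database growth.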

M._Siddiqui @M._Siddiqui: @Anna_Geller so are the rest of the services, like Hasura, GraphQL, Towel and Apollo, stateless then?
Can I scale them out horizontally behind a load balancer?

@Anna_Geller: Yes, all components except Postgres are stateless in the sense that the state of your flows, flow runs, projects and everything else is stored and maintained in the database, so that e.g. upgrading to a new version requires only a database migration; all other components can be recreated. However, I’m not sure whether simply running 3 Apollo containers instead of one is enough to scale the service. I would suggest starting by scaling vertically, i.e. assigning more vCPU and RAM to your Fargate components when needed, before trying to scale horizontally and introducing load balancing.

Based on what I saw in the community, the main challenges in maintaining Server are:

  1. Ensuring that your database storage scales, because depending on the number of flow runs, logs etc., the database can fill up pretty quickly. And again, the DB is the main stateful component.
  2. Managing upgrades: when you use e.g. 0.15.10 on Server, you can’t run flows using a higher version such as 0.15.11, because your flow runs may then use API endpoints that don’t exist in your Server version. Upgrading Server requires a DB migration, which can be challenging.

If you want to avoid that, Prefect Cloud has 20,000 free task runs every month, which is a lot to get started, and you don’t need to worry about scale, DB administration or managing upgrades. There are also some features and performance optimizations that are only possible in Cloud.
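
The version constraint in point 2 can be sketched as a simple pre-deployment check. This is only an illustration: the helper names are hypothetical (not part of Prefect), it assumes plain `X.Y.Z` version strings, and it encodes only the rule described above, i.e. that flows should not use a higher Prefect version than Server.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison.

    Assumption: no pre-release or build suffixes (e.g. '0.15.11rc1').
    """
    return tuple(int(part) for part in version.split("."))


def flow_version_ok(flow_version: str, server_version: str) -> bool:
    """Return True if the flow's version is not higher than Server's."""
    return parse_version(flow_version) <= parse_version(server_version)


print(flow_version_ok("0.15.11", "0.15.10"))  # False: flow is newer than Server
print(flow_version_ok("0.15.10", "0.15.10"))  # True
print(flow_version_ok("0.15.9", "0.15.10"))   # True
```

Comparing integer tuples rather than raw strings matters here, since a string comparison would rank "0.15.9" above "0.15.10".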

Stack Overflow: Cleaning ~/.prefect/pg_data/ when using Prefect

M._Siddiqui @M._Siddiqui: You’ve always been an amazing help, thank you very much :raised_hands: :100:
