Constant timeouts when running Prefect on EC2 instance with local storage

I’m losing my mind here trying to migrate from Prefect 1.0 to 2.0. I had everything working in 2.0 locally and moved it to our EC2 instance. However, now I’m getting random timeout errors that cause my flows to crash.

My setup isn’t (I would think) that unusual:

  • I’m triggering a few Airbyte connections as sub-flows.
  • Once those run, I’m triggering another few sub-flows to run associated dbt jobs.
  • Airbyte and Prefect are running on the same EC2 instance.
  • I’m using local storage for the deployment. This worked just fine with 1.0.

I can’t make it through the Airbyte syncs without running into a timeout error. I’ve searched GitHub, Slack, and this forum … I’ve found a few similar failure modes, but no clear-cut resolution.

Hey Dra, nothing about your setup stands out as especially unusual. It would be helpful to attach some stack traces so that Discourse users can better understand the specific error you are encountering. Thanks!

Thanks for the reply @George_Coyne. I was able to (mostly) resolve my issues over the weekend. In case it’s helpful to future users:

  • Upgrade the anyio, httpx, and httpcore packages. This reduced the timeout errors I was experiencing, though it didn’t eliminate them entirely. There is an open issue on GitHub about this.
  • I then encountered a new error coming from Prefect itself (“RuntimeError: Cannot add child handler, the child watcher does not have a loop attached”). Upgrading Python from 3.7.16 to 3.8.16 resolved it, even though Prefect should be compatible with 3.7.x.
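
For anyone comparing environments (local vs. EC2), here is a minimal sketch for printing the installed versions of the packages mentioned above. It only uses the standard library `importlib.metadata` (Python 3.8+); the helper name `installed_versions` is made up for illustration.

```python
from importlib import metadata  # standard library on Python 3.8+


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Packages named in this thread; run this inside the same venv as the agent.
    for name, version in installed_versions(
        ["prefect", "anyio", "httpx", "httpcore"]
    ).items():
        print(f"{name}: {version or 'not installed'}")
```

Running it in both environments makes it easier to spot a version mismatch before digging into the tracebacks.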

Based on my reading, the timeout issue appears to be related to a bug in an upstream package. I’ve added more error monitoring in the interim, but hopefully the bug is resolved soon.
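
As an illustration of what interim monitoring could look like (not necessarily what was done here), this is a rough sketch of a wrapper that logs every timeout and retries with exponential backoff. The names `call_with_retries`, `make_call`, and `timeout_errors` are hypothetical, and the example is kept to the standard library so it runs anywhere.

```python
import asyncio
import logging

logger = logging.getLogger("timeout_monitor")  # hypothetical logger name


async def call_with_retries(make_call, timeout_errors, attempts=3, base_delay=1.0):
    """Await make_call() up to `attempts` times, logging each timeout.

    make_call: zero-argument callable returning a fresh awaitable per attempt.
    timeout_errors: exception class (or tuple of classes) treated as retryable.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except timeout_errors as exc:
            logger.warning("attempt %d/%d timed out: %r", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of attempts: let the caller see the failure
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

With httpx you would presumably pass `httpx.ReadTimeout` as `timeout_errors`. Prefect 2 tasks also accept `retries` and `retry_delay_seconds`, which may be the simpler option when the failing call is already a task.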

I spoke too soon. Timeout errors have now returned on every flow run.

> Flow run encountered an exception. Traceback (most recent call last):
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/backends/asyncio.py", line 34, in read
>     return await self._stream.receive(max_bytes=max_bytes)
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/anyio/_backends/_asyncio.py", line 1265, in receive
>     await self._protocol.read_event.wait()
>   File "/usr/lib64/python3.8/asyncio/locks.py", line 309, in wait
>     await fut
> asyncio.exceptions.CancelledError
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
>     yield
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/backends/asyncio.py", line 36, in read
>     return b""
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/anyio/_core/_tasks.py", line 118, in __exit__
>     raise TimeoutError
> TimeoutError
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
>     yield
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
>     resp = await self._pool.handle_async_request(req)
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/_async/connection_pool.py", line 253, in handle_async_request
>     raise exc
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/_async/connection_pool.py", line 237, in handle_async_request
>     response = await connection.handle_async_request(request)
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/_async/connection.py", line 90, in handle_async_request
>     return await self._connection.handle_async_request(request)
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/_async/http11.py", line 116, in handle_async_request
>     raise exc
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/_async/http11.py", line 95, in handle_async_request
>     ) = await self._receive_response_headers(**kwargs)
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/_async/http11.py", line 159, in _receive_response_headers
>     event = await self._receive_event(timeout=timeout)
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/_async/http11.py", line 195, in _receive_event
>     data = await self._network_stream.read(
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/backends/asyncio.py", line 36, in read
>     return b""
>   File "/usr/lib64/python3.8/contextlib.py", line 131, in __exit__
>     self.gen.throw(type, value, traceback)
>   File "/home/ec2-user/data-prefect/venv/lib64/python3.8/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
>     raise to_exc(exc)
> httpcore.ReadTimeout
>
> The above exception was the direct cause of the following exception:
>
> httpx.ReadTimeout
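
The traceback above is a chain of exceptions (CancelledError, then TimeoutError, then httpcore.ReadTimeout, then httpx.ReadTimeout). For logging or alerting, it can help to collapse such a chain into one line. A minimal sketch, with hypothetical helper names `exception_chain` and `summarize`, using only the standard `__cause__` and `__context__` attributes:

```python
def exception_chain(exc):
    """Return the chain of exceptions from the outermost to the root one."""
    chain = []
    seen = set()  # guards against accidental cycles in the chain
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        chain.append(exc)
        # Prefer the explicit cause ("raise ... from ..."); otherwise fall
        # back to the implicit context ("During handling of the above ...").
        exc = exc.__cause__ if exc.__cause__ is not None else exc.__context__
    return chain


def summarize(exc):
    """One-line summary, outermost exception first."""
    return " <- ".join(type(e).__name__ for e in exception_chain(exc))
```

This sketch ignores `__suppress_context__`, so it may show more context than Python's own traceback printer would.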

Hey Dra, Prefect works best on Python 3.10 or greater. If you’re going through dependency friction I would start there. Where are these logs from and where are timeouts occurring?

Upgrading to 3.10 will be my next step; it’s just a PITA on our EC2 instance since there isn’t a pre-built package.

The logs I shared were from the Prefect UI for flow state. The timeouts occur randomly during one of a few AirbyteSync tasks (it’s not the same one each time). I don’t get the same errors during any of the dbt jobs that run after the syncs. Airbyte is running in Docker on the same EC2 instance as Prefect.
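
To narrow down where the timeouts come from, one option is to time plain HTTP requests against the Airbyte API from the same host while a sync is running. The sketch below uses only the standard library; the function name `probe` is made up, and the URL in the example (`http://localhost:8000/api/v1/health`) is an assumption about a default Airbyte deployment, so adjust it to your setup.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return (ok, elapsed_seconds, detail) for a single GET request."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            detail = f"HTTP {resp.status}"
            ok = 200 <= resp.status < 300
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        ok, detail = False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, DNS failure, or a socket timeout.
        ok, detail = False, repr(exc)
    return ok, time.monotonic() - start, detail


if __name__ == "__main__":
    # Assumed URL; replace with wherever your Airbyte API is reachable.
    print(probe("http://localhost:8000/api/v1/health", timeout=10))
```

If the probe also slows down or fails during syncs, that would point at resource contention on the instance rather than at Prefect itself.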