I got an error with a flow on Prefect 2 that I’m not sure how to debug. I haven’t found any related reports in the PrefectHQ/prefect GitHub issues or on Slack.
The flow failed, and the stacktrace looks like:
Exception ignored in: <function BaseSubprocessTransport.__del__ at 0x7f9cf6c191b0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/base_subprocess.py", line 126, in __del__
    self.close()
  File "/usr/local/lib/python3.10/asyncio/base_subprocess.py", line 104, in close
    proto.pipe.close()
  File "/usr/local/lib/python3.10/asyncio/unix_events.py", line 547, in close
    self._close(None)
  File "/usr/local/lib/python3.10/asyncio/unix_events.py", line 571, in _close
    self._loop.call_soon(self._call_connection_lost, exc)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 753, in call_soon
    self._check_closed()
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
20:26:13.266 | INFO | prefect.infrastructure.process - Process 'electric-serval' exited cleanly.
Could this error have too many possible underlying causes to pin down, or does it point to a known issue?
Our setup:
We run 3 Prefect agents on a GCP VM, and the code that the flows kick off runs inside Docker containers. For now, flows are kicked off from the Cloud UI.
Thanks, but this flow wasn’t using any async code. Sorry I forgot to mention that. Have you seen this kind of error before? Perhaps it’s related to the prefect python library?
I understand, but we can’t help solve a bug if we can’t reproduce it. Perhaps try repeating what you did that led to that error, then generalize it into a small example you can share in a GitHub issue?
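As a possible starting point for such an example: the traceback begins in asyncio's BaseSubprocessTransport.__del__, which suggests a subprocess transport was finalized after the event loop that created it had already been closed. The sketch below uses only the standard library and no Prefect code, so it is an assumption about the mechanism rather than a confirmed reproduction of this flow's failure. The names run_child and main are made up for illustration. It shows the pattern that normally avoids the message: waiting for the child process inside the loop before the loop shuts down.

```python
import asyncio
import sys


async def run_child() -> tuple[int, str]:
    # Start a short-lived child process with a piped stdout.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "print('hello from child')",
        stdout=asyncio.subprocess.PIPE,
    )
    # communicate() reads the pipe to EOF and waits for the child to exit,
    # so the transport can be closed while the loop is still running.
    stdout, _ = await proc.communicate()
    return proc.returncode, stdout.decode().strip()


def main() -> tuple[int, str]:
    # asyncio.run() creates the loop, runs the coroutine, then closes the loop.
    return asyncio.run(run_child())


if __name__ == "__main__":
    print(main())
```

If a variant of this that skips the communicate() call produces the same "Event loop is closed" message on your Python 3.10 image, that would be a useful detail to include in the GitHub issue.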
If the task is fairly short (around 10 minutes), it still finishes in the Completed state. However, I have also repeatedly had the problem that the run neither terminates nor gets cancelled: it just shows “running”, but no further progress is made. The problem does not come from the task itself (which is just regular synchronous Python code).
The task runner is also running in a Docker container.
This is also exactly how it is shown in the prefect shell docs.
Small update to this (because I don’t know if it’s relevant): I had the exact same issue again. It seems to occur every time.
However, I don’t see the error message in the Prefect flow run: it still shows the flow as “running” (healthy) after 14 hours, while it should take about 1 to 2 hours.
The error message only appears in the Docker logs.
Note: You need to use prefect_shell>=0.1.5 (at least on my system) in order to import ShellOperation. Also, using the context manager (with ShellOperation(…) as operation) does not work for me, since wait_for_completion throws an error.
So the only way for me to get this kind of setup running is to use ShellOperation(…).run().