Prefect Agent on Docker Error

Hi,

I created a custom Docker image (from continuumio/miniconda3:22.11.1-alpine) and installed Prefect 2 and its AWS and Shell plugins on it. The agent runs and everything seems to be working fine, but every now and then (at least once every day!) the agent Docker process suddenly fails with an error like this:

Traceback (most recent call last):
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/utilities/services.py", line 46, in critical_service_loop
    await workload()
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/agent.py", line 261, in check_for_cancelled_flow_runs
    async for work_queue in self.get_work_queues():
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/agent.py", line 144, in get_work_queues
    work_queue = await self.client.read_work_queue_by_name(
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/client/orchestration.py", line 850, in read_work_queue_by_name
    response = await self._client.get(f"/work_queues/name/{name}")
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/httpx/_client.py", line 1754, in get
    return await self.request(
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/httpx/_client.py", line 1530, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/client/base.py", line 251, in send
    response = await self._send_with_retry(
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/client/base.py", line 194, in _send_with_retry
    response = await request()
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/httpx/_client.py", line 1617, in send
    response = await self._send_handling_auth(
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/httpx/_client.py", line 1645, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/httpx/_client.py", line 1682, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/httpx/_client.py", line 1719, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/httpx/_transports/default.py", line 352, in handle_async_request
    with map_httpcore_exceptions():
  File "/opt/conda/envs/odssm/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.LocalProtocolError: Invalid input ConnectionInputs.SEND_HEADERS in state ConnectionState.CLOSED


Traceback (most recent call last):
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/cli/_utilities.py", line 41, in wrapper
    return fn(*args, **kwargs)
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 260, in coroutine_wrapper
    return call()
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 245, in __call__
    return self.result()
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 173, in result
    return self.future.result(timeout=timeout)
  File "/opt/conda/envs/odssm/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/opt/conda/envs/odssm/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 218, in _run_async
    result = await coro
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/cli/agent.py", line 189, in start
    async with anyio.create_task_group() as tg:
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 662, in __aexit__
    raise exceptions[0]
  File "/opt/conda/envs/odssm/lib/python3.10/site-packages/prefect/utilities/services.py", line 104, in critical_service_loop
    raise RuntimeError("Service exceeded error threshold.")
RuntimeError: Service exceeded error threshold.
An exception occurred.

I don’t know what causes this, especially since the flows work fine while the agent is running. Is this a result of my flows running into errors? Or is it potentially an issue with how my Docker image is built for this agent?

My error traceback is much longer than this; please let me know if adding all of it might help.

Hi,

I had a similar problem, I think. I asked on Slack, and Prefect folks advised me to do this:

Hi Gosia, try adding PREFECT_API_ENABLE_HTTP2=False to your env variables for the agent

This solved my problem.
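
For anyone who starts the agent from a small Python wrapper rather than directly from the shell, here is a minimal sketch of passing that setting through the process environment. The helper names (agent_env, agent_command, start_agent) and the queue name "default" are placeholders made up for illustration; only the PREFECT_API_ENABLE_HTTP2 variable comes from the advice above.

```python
import os
import subprocess


def agent_env(base=None):
    """Return a copy of the environment with HTTP/2 disabled for the Prefect client."""
    env = dict(os.environ if base is None else base)
    # The variable suggested in the Slack reply; Prefect settings are read
    # from the environment, so the value is passed as a plain string.
    env["PREFECT_API_ENABLE_HTTP2"] = "False"
    return env


def agent_command(work_queue="default"):
    """Build the agent start command. "default" is a placeholder queue name."""
    return ["prefect", "agent", "start", "-q", work_queue]


def start_agent(work_queue="default"):
    """Launch the agent as a child process with the adjusted environment."""
    # check=True makes a non-zero exit from the agent raise CalledProcessError.
    return subprocess.run(agent_command(work_queue), env=agent_env(), check=True)
```

Calling start_agent() would then run the agent with HTTP/2 turned off. If the agent is started straight from the image instead, a Dockerfile line such as ENV PREFECT_API_ENABLE_HTTP2=False should have the same effect.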

Thank you for your suggestion, @chosia. I’ll try it out to see if it solves the problem.

On a separate note, I started my agent and waited one day to see what would happen. I noticed that the issue might also be related to some flow runs that were not cancelled properly and are stuck in the Cancelling state.

In the end I’m still not sure what is causing it, so I’ll try the env approach you suggested.