Local Prefect agent fails to finish the workflow when Prefect Orion runs in Kubernetes

I redeployed multiple times. Unfortunately, it still does not work. :cry:

I also tried adding a Traefik Ingress Controller and pointing PREFECT_API_URL at a different port, 40000, but I got the same error. Here are the steps:

  1. Install prefect-orion with the new PREFECT_API_URL:

    helm install \
      prefect-orion \
      prefect/prefect-orion \
      --namespace=hm-prefect \
      --create-namespace \
      --values=prefect-orion/my-values.yaml
    

    prefect-orion/my-values.yaml

    orion:
      env:
        - name: PREFECT_API_URL
          value: http://localhost:40000/api
    postgresql:
      auth:
        # username: prefect
        password: passw0rd
    
  2. Install Traefik Ingress Controller

    helm repo add traefik https://traefik.github.io/charts
    helm install \
      traefik \
      traefik/traefik \
      --namespace=hm-traefik \
      --create-namespace
    
  3. Install Traefik Ingress for the namespace hm-prefect

    kubectl apply --filename=traefik/hm-prefect-traefik-ingress.yaml
    

    traefik/hm-prefect-traefik-ingress.yaml

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: hm-traefik-ingress
      namespace: hm-prefect
      annotations:
        kubernetes.io/ingress.class: traefik
      labels:
        app.kubernetes.io/name: ingress
    spec:
      rules:
        - http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: prefect-orion
                    port:
                      number: 4200
    
  4. Port forward Traefik

    kubectl port-forward service/traefik --namespace=hm-traefik 40000:80
    
  5. Set PREFECT_API_URL locally

    prefect config set PREFECT_API_URL=http://localhost:40000/api
    
  6. Re-run

    # (You can skip this build and use my image directly, or change the tag to your own Docker Hub username)
    # Build hm-prefect-print-platform
    docker build --file=print_platform/Dockerfile --tag=hongbomiao/hm-prefect-print-platform:latest .
    docker push hongbomiao/hm-prefect-print-platform:latest
    
    # Start hm-prefect-print-platform
    python print_platform/add_kubernetes_job_block.py
    prefect deployment build print_platform/main.py:print_platform \
      --name=print-platform \
      --infra-block=kubernetes-job/print-platform-kubernetes-job-block \
      --work-queue=hm-queue \
      --apply
    prefect deployment run print-platform/print-platform
    

    At this point, everything still looks good.

  7. Start the agent locally. The same kind of error now appears:

    prefect agent start --work-queue=hm-queue
    
    Starting v2.7.5 agent connected to http://localhost:40000/api...
    
      ___ ___ ___ ___ ___ ___ _____     _   ___ ___ _  _ _____
     | _ \ _ \ __| __| __/ __|_   _|   /_\ / __| __| \| |_   _|
     |  _/   / _|| _|| _| (__  | |    / _ \ (_ | _|| .` | | |
     |_| |_|_\___|_| |___\___| |_|   /_/ \_\___|___|_|\_| |_|
    
    
    Agent started! Looking for work from queue(s): hm-queue...
    10:05:53.379 | INFO    | prefect.agent - Submitting flow run '106ad98e-ff8a-465c-b219-f587616abb22'
    10:05:55.925 | INFO    | prefect.infrastructure.kubernetes-job - Job 'quirky-bullfinch-d2vkq': Pod has status 'Pending'.
    10:05:55.927 | INFO    | prefect.agent - Completed submission of flow run '106ad98e-ff8a-465c-b219-f587616abb22'
    10:06:08.084 | INFO    | prefect.infrastructure.kubernetes-job - Job 'quirky-bullfinch-d2vkq': Pod has status 'Running'.
    <frozen runpy>:128: RuntimeWarning: 'prefect.engine' found in sys.modules after import of package 'prefect', but prior to execution of 'prefect.engine'; this may result in unpredictable behaviour
    18:06:18.039 | ERROR   | prefect.engine - Engine execution of flow run '106ad98e-ff8a-465c-b219-f587616abb22' exited with unexpected exception
    anyio._backends._asyncio.ExceptionGroup: 2 exceptions were raised in the task group:
    ----------------------------
    Traceback (most recent call last):
      File "/usr/local/lib/python3.11/site-packages/anyio/_core/_sockets.py", line 164, in try_connect
        stream = await asynclib.connect_tcp(remote_host, remote_port, local_address)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 1691, in connect_tcp
        await get_running_loop().create_connection(
      File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1079, in create_connection
        raise exceptions[0]
      File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1063, in create_connection
        sock = await self._connect_sock(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/asyncio/base_events.py", line 967, in _connect_sock
        await self.sock_connect(sock, address)
      File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 634, in sock_connect
        return await fut
               ^^^^^^^^^
      File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 674, in _sock_connect_cb
        raise OSError(err, f'Connect call failed {address}')
    ConnectionRefusedError: [Errno 111] Connect call failed ('::1', 40000, 0, 0)
    ----------------------------
    Traceback (most recent call last):
      File "/usr/local/lib/python3.11/site-packages/anyio/_core/_sockets.py", line 164, in try_connect
        stream = await asynclib.connect_tcp(remote_host, remote_port, local_address)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 1691, in connect_tcp
        await get_running_loop().create_connection(
      File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1079, in create_connection
        raise exceptions[0]
      File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1063, in create_connection
        sock = await self._connect_sock(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/asyncio/base_events.py", line 967, in _connect_sock
        await self.sock_connect(sock, address)
      File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 634, in sock_connect
        return await fut
               ^^^^^^^^^
      File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 674, in _sock_connect_cb
        raise OSError(err, f'Connect call failed {address}')
    ConnectionRefusedError: [Errno 111] Connect call failed ('127.0.0.1', 40000)
    
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.11/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
        yield
      File "/usr/local/lib/python3.11/site-packages/httpcore/backends/asyncio.py", line 111, in connect_tcp
        stream: anyio.abc.ByteStream = await anyio.connect_tcp(
                                       ^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/anyio/_core/_sockets.py", line 222, in connect_tcp
        raise OSError("All connection attempts failed") from cause
    OSError: All connection attempts failed
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
        yield
      File "/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
        resp = await self._pool.handle_async_request(req)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 253, in handle_async_request
        raise exc
      File "/usr/local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 237, in handle_async_request
        response = await connection.handle_async_request(request)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpcore/_async/connection.py", line 86, in handle_async_request
        raise exc
      File "/usr/local/lib/python3.11/site-packages/httpcore/_async/connection.py", line 63, in handle_async_request
        stream = await self._connect(request)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpcore/_async/connection.py", line 111, in _connect
        stream = await self._network_backend.connect_tcp(**kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpcore/backends/auto.py", line 29, in connect_tcp
        return await self._backend.connect_tcp(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpcore/backends/asyncio.py", line 109, in connect_tcp
        with map_exceptions(exc_map):
      File "/usr/local/lib/python3.11/contextlib.py", line 155, in __exit__
        self.gen.throw(typ, value, traceback)
      File "/usr/local/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
        raise to_exc(exc)
    httpcore.ConnectError: All connection attempts failed
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.11/site-packages/prefect/engine.py", line 1898, in <module>
        enter_flow_run_engine_from_subprocess(flow_run_id)
      File "/usr/local/lib/python3.11/site-packages/prefect/engine.py", line 188, in enter_flow_run_engine_from_subprocess
        return anyio.run(retrieve_flow_then_begin_flow_run, flow_run_id)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/anyio/_core/_eventloop.py", line 70, in run
        return asynclib.run(func, *args, **backend_options)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 292, in run
        return native_run(wrapper(), debug=debug)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/asyncio/runners.py", line 190, in run
        return runner.run(main)
               ^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/asyncio/runners.py", line 118, in run
        return self._loop.run_until_complete(task)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
        return future.result()
               ^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 287, in wrapper
        return await func(*args)
               ^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/prefect/client/utilities.py", line 47, in with_injected_client
        return await fn(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/prefect/engine.py", line 260, in retrieve_flow_then_begin_flow_run
        flow_run = await client.read_flow_run(flow_run_id)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/prefect/client/orion.py", line 1469, in read_flow_run
        response = await self._client.get(f"/flow_runs/{flow_run_id}")
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1757, in get
        return await self.request(
               ^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1533, in request
        return await self.send(request, auth=auth, follow_redirects=follow_redirects)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/prefect/client/base.py", line 229, in send
        response = await self._send_with_retry(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/prefect/client/base.py", line 187, in _send_with_retry
        response = await request()
                   ^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1620, in send
        response = await self._send_handling_auth(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1648, in _send_handling_auth
        response = await self._send_handling_redirects(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1685, in _send_handling_redirects
        response = await self._send_single_request(request)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1722, in _send_single_request
        response = await transport.handle_async_request(request)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py", line 352, in handle_async_request
        with map_httpcore_exceptions():
      File "/usr/local/lib/python3.11/contextlib.py", line 155, in __exit__
        self.gen.throw(typ, value, traceback)
      File "/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
        raise mapped_exc(message) from exc
    httpx.ConnectError: All connection attempts failed
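
The `ConnectionRefusedError` entries above target `127.0.0.1:40000` and `::1:40000`. Their file paths (`/usr/local/lib/python3.11/...`) and timestamps (18:06 versus 10:06 for the agent lines) suggest they may come from the flow-run pod rather than from the local agent. As a minimal sketch for narrowing that down, the probe below checks whether the host and port in a `PREFECT_API_URL` accept a TCP connection from wherever the script runs. It uses only the Python standard library; `api_endpoint`, `is_loopback`, and `can_connect` are hypothetical helper names, not part of Prefect.

```python
import os
import socket
from urllib.parse import urlsplit


def api_endpoint(api_url):
    """Return (host, port) for an API URL, defaulting the port by scheme."""
    parts = urlsplit(api_url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_loopback(host):
    """True if the host refers to the machine or container running this script."""
    return host in ("localhost", "::1") or host.startswith("127.")


def can_connect(api_url, timeout=3.0):
    """Try a plain TCP connection to the API host and port."""
    host, port = api_endpoint(api_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Falls back to the URL from step 5 when the variable is not set.
    url = os.environ.get("PREFECT_API_URL", "http://localhost:40000/api")
    host, port = api_endpoint(url)
    print(f"target={host}:{port} loopback={is_loopback(host)} reachable={can_connect(url)}")
```

Running it once on the local machine (with the port-forward from step 4 active) and once inside the job pod would show whether `localhost:40000` is reachable from both places. A loopback host refers to whichever machine or container runs the script, so a different result inside the pod would not be surprising.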