How can I resolve a PrefectHTTPStatusError on Prefect 2.3.1 with DaskTaskRunner?

Hi everyone,

I am executing a flow on several computers using DaskTaskRunner on Prefect 2.3.1.
For the moment I am testing on two computers:

  • one runs the dask-scheduler and a dask-worker connected to that scheduler
  • the other one only runs a dask-worker, also connected to that scheduler

Currently, I am only trying a simple example I found on this website:

from dask.distributed import Client
from prefect import flow, task
from prefect_dask import DaskTaskRunner

@task
def say_hello(name):
    print(f"hello {name}")

@task
def say_goodbye(name):
    print(f"goodbye {name}")

@flow(task_runner=DaskTaskRunner(address="tcp://X.X.X.X"))
def greetings(names):
    for name in names:
        say_hello.submit(name)
        say_goodbye.submit(name)

if __name__ == "__main__":
    # Run the flow with a sample list of names
    greetings(["arthur"])

However, when I run the code from one of the two computers, only that computer is able to run the tasks. The other one returns this kind of error:

Exception: 'PrefectHTTPStatusError("Client error \'404 Not Found\' for url \'http://ephemeral-orion/api/task_runs/a158be60-8b91-49d8-ab7e-46a5969116d6/set_state\'\\nResponse: {\'exception_message\': \'Task run with id a158be60-8b91-49d8-ab7e-46a5969116d6 not found\'}\\nFor more information check: https://httpstatuses.com/404")'

and

distributed.protocol.pickle - INFO - Failed to deserialize

Thank you in advance!

Interesting! It’s a bit hard to replicate this in a distributed environment, but can you perhaps try the same using Prefect Cloud (there is an always-free tier) and see whether that resolves the issue?

In theory, if the Dask cluster is set up properly, Prefect should be able to submit tasks to Dask. However, tasks running on Dask also need to reach the Orion API to report their state back, so it might be that your local Orion instance is not reachable from the Dask workers. Using Prefect Cloud would solve that.
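The `http://ephemeral-orion/...` URL in the error hints at this: when no API URL is configured, Prefect 2 falls back to an in-process API backed by the local database of whichever machine runs the code, so a worker on the second computer would look up a task run that only exists on the first. A minimal sketch for checking what each worker sees, assuming the URL is supplied through the `PREFECT_API_URL` environment variable (a value set with `prefect config set` lives in the profile and would not show up here). The function names are hypothetical; `Client.run` is the dask.distributed call that executes a function on every worker.

```python
import os
from typing import Optional


def describe_api_target(api_url: Optional[str]) -> str:
    """Classify where a Prefect client would send its state updates."""
    if not api_url:
        # No API URL configured: Prefect 2 uses an in-process ("ephemeral")
        # API backed by the local database of this machine.
        return "ephemeral (local to this machine)"
    return f"remote API at {api_url}"


def report_from_this_process() -> str:
    """Report the API target visible through the environment of this process."""
    return describe_api_target(os.environ.get("PREFECT_API_URL"))


def report_from_all_workers(scheduler_address: str) -> dict:
    """Run the check on every Dask worker (requires dask.distributed)."""
    from dask.distributed import Client  # imported lazily; only needed here

    with Client(scheduler_address) as client:
        # Client.run executes the function once on each worker and returns
        # a dict keyed by worker address.
        return client.run(report_from_this_process)
```

If any worker reports the ephemeral target, it cannot see task runs created by the machine that started the flow, which would be consistent with the 404 above.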

Thank you very much for your answer! I will definitely try Prefect Cloud. However, is there no way to connect Dask to Orion without Prefect Cloud?

I tested it with Prefect Cloud and it works better! But now another exception occurs on every computer that receives tasks:

2022-09-07 11:17:55,962 - distributed.worker - WARNING - Compute Failed
Key:       6b8ff000-ad9a-4de0-ac5a-0b9c35c93fe0
Function:  begin_task_run
args:      ()
kwargs:    {'task': <prefect.tasks.Task object at 0x7ff4aaaff760>, 'task_run': TaskRun(id=UUID('44d28b18-a961-43a8-9f5f-e02134767737'), name='say_goodbye-261e56a8-0', flow_run_id=UUID('16b3db98-966b-44b6-b73b-c4ff4682c878'), task_key='__main__.say_goodbye', dynamic_key='0', cache_key=None, cache_expiration=None, task_version=None, empirical_policy=TaskRunPolicy(max_retries=0, retry_delay_seconds=0.0, retries=0, retry_delay=0), tags=[], state_id=UUID('4543c5be-881c-47e5-9464-4ef76eef5130'), task_inputs={'name': []}, state_type=StateType.PENDING, state_name='Pending', run_count=0, expected_start_time=DateTime(2022, 9, 7, 2, 17, 54, 691980, tzinfo=Timezone('+00:00')), next_scheduled_start_time=None, start_time=None, end_time=None, total_run_time=datetime.timedelta(0), estimated_run_time=datetime.timedelta(0), estimated_start_time_delta=datetime.timedelta(microseconds=56332), state=Pending(message=None, type=PENDING, result=None)), 'parameters': {'name': 'arthur'}, 'wait_for': None, 'result_filesyste':
Exception: "PermissionError(13, 'Permission denied')"

I feel that it tries to modify a file or something, but I am not sure; the log output isn’t really clear… I tried several things but none of them worked.

Of course there is, but it’s much harder since you need to configure the distributed setup properly, including the network, etc.

What have you tried? It looks like Prefect is trying to write the task run results to a directory for which you don’t have write permissions. You can change that directory (on all servers you use with Prefect and Dask) this way:

prefect config set PREFECT_LOCAL_STORAGE_PATH='/your/custom/storage/path'
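Before settling on a path, it can help to confirm that the user running each dask-worker can actually write there. A small sketch using only the standard library; `can_write_to` is a hypothetical helper, not part of Prefect, and it creates the directory if it is missing.

```python
import os
import tempfile


def can_write_to(directory: str) -> bool:
    """Return True if this process can create a file inside `directory`."""
    try:
        # Creates the directory (and parents) when it does not exist yet.
        os.makedirs(directory, exist_ok=True)
        # Create and immediately delete a temporary file as a write probe.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        # PermissionError is a subclass of OSError, so a
        # "Permission denied" failure ends up here.
        return False
```

Run it as the same user that starts the dask-worker on each machine; a different user may have different permissions on the same path.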

Thanks again for your answer!
I thought it was trying to write to the orion.db file, so I changed the permissions on this file and on the other .prefect/ files, but that might not be it.
So now my question is (hopefully it’ll solve the problem): what does PREFECT_LOCAL_STORAGE_PATH correspond to?
Thanks again for your help!

You mean by default? By default it’s: PREFECT_LOCAL_STORAGE_PATH='${PREFECT_HOME}/storage'
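Since `PREFECT_HOME` itself defaults to `~/.prefect`, the default resolves relative to the home directory of whichever user runs the process on each machine. A rough sketch of that resolution, assuming the values come from environment variables only (settings stored in a Prefect profile are not covered); `resolve_local_storage_path` is a hypothetical helper, not a Prefect function.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_local_storage_path(env: Mapping[str, str]) -> Path:
    """Approximate where local results land, given environment variables."""
    explicit = env.get("PREFECT_LOCAL_STORAGE_PATH")
    if explicit:
        # An explicit setting wins over the default.
        return Path(explicit).expanduser()
    # Default: ${PREFECT_HOME}/storage, with PREFECT_HOME defaulting to ~/.prefect
    home = Path(env.get("PREFECT_HOME", "~/.prefect")).expanduser()
    return home / "storage"


if __name__ == "__main__":
    print(resolve_local_storage_path(os.environ))
```

Printing this on each worker, as the user that runs the dask-worker, shows which directory needs to be writable there.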

OK, I see what it is. So is the problem that a computer in the cluster is trying to write to (or do something with) the Prefect storage path of another computer?

In the end, I changed PREFECT_LOCAL_STORAGE_PATH to a path where I am sure they have permission, and it works! Thanks for the help!
However, I still don’t know why it didn’t work when I changed the permissions on the default path…
