Unable to use DockerFlowRunner when using Prefect 2.0 in WSL 2 on Windows

I have successfully started the Orion server, created a work queue and an agent, and run a flow in WSL 2. However, I am unable to run it in a Docker container. I set the IP address to $(hostname -i), which is 172.17.0.2; this makes the UI visible from Windows.

If I launch Orion on port 8082, I can see the UI, set the config, set the storage, and create a work queue. I can view the work queue from the command line, and it is also in the DB, but it does not appear in the UI. I can also launch an agent for the work queue, which claims to be connected to http://172.17.0.2:8082/api, but the agent does not appear in the DB.

If I launch Orion on the default port, then as soon as I try to set the config I get a server error with a huge database stack trace. The error is "sqlite3.OperationalError: near ",": syntax error" and the stack trace ends as below:

SELECT counts.interval_start, counts.interval_end, coalesce(json_group_array(json(counts.state_agg)) FILTER (WHERE counts.state_agg IS NOT NULL), '') AS states
FROM (SELECT intervals.interval_start AS interval_start, intervals.interval_end AS interval_end, CASE WHEN (count(runs.id) = ?) THEN NULL ELSE json_object(?, runs.state_type, ?, runs.state_name, ?, count(runs.id), ?, sum(max(?, CAST(STRFTIME('%s', runs.estimated_run_time) AS INTEGER))), ?, sum(max(?, CAST(STRFTIME('%s', runs.estimated_start_time_delta) AS INTEGER)))) END AS state_agg
FROM intervals LEFT OUTER JOIN (SELECT flow_run.id AS id, flow_run.expected_start_time AS expected_start_time, (SELECT CASE WHEN (flow_run.state_type = ?) THEN strftime(?, (julianday(flow_run.total_run_time) + julianday(strftime(?, (? + julianday(strftime('%Y-%m-%d %H:%M:%f000', 'now'))) - julianday(flow_run_state.timestamp)))) - ?) ELSE flow_run.total_run_time END AS anon_1
WHERE flow_run.state_id = flow_run_state.id) AS estimated_run_time, CASE WHEN (flow_run.start_time > flow_run.expected_start_time) THEN strftime(?, (? + julianday(flow_run.start_time)) - julianday(flow_run.expected_start_time)) WHEN (flow_run.start_time IS NULL AND (flow_run.state_type NOT IN (?, ?, ?)) AND flow_run.expected_start_time < strftime('%Y-%m-%d %H:%M:%f000', 'now')) THEN strftime(?, (? + julianday(strftime('%Y-%m-%d %H:%M:%f000', 'now'))) - julianday(flow_run.expected_start_time)) ELSE ? END AS estimated_start_time_delta, flow_run_state.type AS state_type, flow_run_state.name AS state_name
FROM flow_run JOIN flow_run_state ON flow_run.state_id = flow_run_state.id
WHERE flow_run.expected_start_time <= ? AND flow_run.expected_start_time >= ?) AS runs ON runs.expected_start_time >= intervals.interval_start AND runs.expected_start_time < intervals.interval_end GROUP BY intervals.interval_start, intervals.interval_end, runs.state_type, runs.state_name) AS counts GROUP BY counts.interval_start, counts.interval_end ORDER BY counts.interval_start
LIMIT ? OFFSET ?]
[parameters: ('2022-03-21T00:00:00+00:00', '2022-03-21T00:00:00+00:00', '+5760.0 seconds', '+5760.0 seconds', '2022-03-23T00:00:00+00:00', '-5760.0 seconds', 0, 'state_type', 'state_name', 'count_runs', 'sum_estimated_run_time', 0, 'sum_estimated_lateness', 0, 'RUNNING', '%Y-%m-%d %H:%M:%f000', '%Y-%m-%d %H:%M:%f000', 2440587.5, 2440587.5, '%Y-%m-%d %H:%M:%f000', 2440587.5, 'COMPLETED', 'FAILED', 'CANCELLED', '%Y-%m-%d %H:%M:%f000', 2440587.5, '1970-01-01 00:00:00.000000', '2022-03-23 00:00:00.000000', '2022-03-21 00:00:00.000000', 500, 0)]
(Background on this error at: Error Messages — SQLAlchemy 1.4 Documentation)

@simon_mackenzie we are not testing Orion on Windows yet.

Can you run prefect version within WSL and share the output? Back when I was using WSL 1 (not WSL 2), Docker Desktop didn’t work well with WSL, so I wouldn’t expect this to work, but maybe WSL 2 is different in that regard.
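
If it helps when sharing details, here is a minimal Python sketch that collects a few related facts from inside WSL. It is not part of Prefect; the function name environment_summary is made up for illustration, and the WSL check is only a heuristic that assumes the kernel release string contains "microsoft". The SQLite version is included because the failing statement above uses a FILTER clause on an aggregate function, which as far as I know needs SQLite 3.30 or later, so an older library could be one possible cause worth ruling out.

```python
import platform
import sqlite3
import sys


def environment_summary() -> dict:
    """Collect version details that are useful when reporting this kind of issue."""
    release = platform.uname().release
    return {
        "python": sys.version.split()[0],
        # Version of the SQLite library linked into Python, not of any database file.
        "sqlite": sqlite3.sqlite_version,
        "kernel": release,
        # Heuristic (assumption): WSL kernels usually have "microsoft" in the release string.
        "looks_like_wsl": "microsoft" in release.lower(),
    }


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
```

Run it with the same Python interpreter that Prefect is installed into, otherwise the reported SQLite version may not be the one Orion uses.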

Regarding the SQLite error you see, if nothing else works, you can remove the DB file, reset your DB and start from scratch. Here is how you can do this:

rm ~/.prefect/orion.db
prefect orion database reset -y
prefect orion start

Also, make sure you close all browser windows with the Orion UI before you reset your DB; leaving them open may cause some weird issues.
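
If you would rather keep a copy of the old database before deleting it, the removal step can be sketched in Python as below. This is only an illustration under assumptions: the helper name backup_and_remove and the ".bak" suffix are made up here, and nothing in Prefect requires a backup.

```python
import shutil
from pathlib import Path
from typing import Optional


def backup_and_remove(db_path: Path) -> Optional[Path]:
    """Copy db_path to '<name>.bak' next to it, delete the original, return the backup path.

    Returns None when the file does not exist, so there is nothing to remove.
    """
    if not db_path.exists():
        return None
    backup = db_path.with_name(db_path.name + ".bak")
    shutil.copy2(db_path, backup)  # copy first, so the data survives the delete
    db_path.unlink()
    return backup
```

For the path used above, you would call backup_and_remove(Path.home() / ".prefect" / "orion.db") and then run the database reset and start commands as before.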

In general, WSL2 runs everything in a Linux VM, but it could be that something is getting confused with respect to Windows line endings being different from Linux, or something funny with default encoding or similar.

Lastly, based on Docker Desktop WSL 2 backend | Docker Documentation you may need to explicitly enable WSL 2 backend on Docker Desktop to make that work. And having some scars doing similar things on Windows, I would restart the machine entirely after doing that just to be sure :smile:


@simon_mackenzie just to clarify: do you mean you are not able to create deployments with DockerFlowRunner, or are you trying to run Orion entirely within a Docker container?

It is Orion inside the container. I have not been able to run any flows; I get stuck before that. In one case I get the SQL error when setting the config. In the other, the agent is running but is not in the database, so flows only work locally.

So far we haven’t anticipated running Orion itself in a Docker container apart from the Kubernetes deployment: Running flows in Kubernetes - Prefect 2.0

So I assumed you had an issue using DockerFlowRunner - if not, can you provide exactly the steps you’ve taken so far and what exactly didn’t work?

I would recommend not running Orion in Docker (at least for now) until we have clear recipes and tests for that. But if you want to do that, be aware that we haven’t optimized for that use case so far and you may therefore experience some rough edges, especially with respect to networking.

If you want to run any server in a Docker container and view it from Windows, the host needs to be set to 0.0.0.0. For example:

prefect orion start --host 0.0.0.0
task_runner=RayTaskRunner(init_kwargs=dict(dashboard_host="0.0.0.0"))
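
To check whether a server started this way is actually reachable, a plain TCP connection test is usually enough. The sketch below uses only the Python standard library; the function name can_connect is hypothetical, and it only tells you that something accepts connections on that address, not that it is Orion.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For the setup described earlier in this thread, you could call can_connect("172.17.0.2", 8082) from the place where the agent or the browser runs. A False result there, combined with True from inside the container, would suggest a host binding or networking problem rather than a Prefect one.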

I thought this was fixed in some PR so that you don’t need to bind the host to 0.0.0.0, but it looks like it’s still the case. Thanks for sharing!