Hello there,
I have deployed my first flow to production with Prefect Cloud and a Docker agent.
As the command below shows, I mount a volume when starting the agent:
prefect agent docker start --show-flow-logs --env PATH_TO_FILES=/path/to/file --volume /home/Skam/path/to/file:/path/to/file
Everything works fine; however, every 2-3 days the runs start to fail with the error docker.errors.DockerException: Credentials store error: StoreError('Unexpected OS error "Too many open files", errno=24')
(full stack trace at the end).
This is not a big problem, since restarting the agent and rescheduling the failed runs works, but I’d rather not have the error in the first place.
A quick Google search does not reveal much (only an SO thread that advises using a now-deprecated --ulimit argument), so I am coming here for advice!
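In case it helps with diagnosis, here is a minimal sketch for checking whether the agent process slowly accumulates open file descriptors between restarts. It assumes a Linux host (it reads /proc), and the helper names count_open_fds and nofile_limits are hypothetical, not part of Prefect or the Docker SDK. The agent's PID would have to be looked up separately (for example with ps), and reading another user's process may require elevated permissions.

```python
import os
import resource
import sys


def count_open_fds(pid=None):
    """Count open file descriptors of a process via /proc (Linux only).

    pid=None inspects the current process. For the current process the
    count includes the descriptor temporarily used to list the directory.
    """
    target = "self" if pid is None else str(pid)
    return len(os.listdir(f"/proc/{target}/fd"))


def nofile_limits(pid=0):
    """Return the (soft, hard) RLIMIT_NOFILE limits; pid=0 is this process."""
    return resource.prlimit(pid, resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    # Optional first argument: PID of the process to inspect (e.g. the agent).
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else None
    open_fds = count_open_fds(pid)
    soft, hard = nofile_limits(pid or 0)
    print(f"open fds: {open_fds}, soft limit: {soft}, hard limit: {hard}")
```

Running this periodically against the agent's PID should show whether the count climbs toward the soft limit over the 2-3 days before the failures begin, which would point to a descriptor leak rather than a limit that is simply too low.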
Full stacktrace:
[2022-05-16 09:10:00,005] INFO - agent | Deploying flow run 9f0890cb-e068-4ad2-b2a6-bea91ac70e0f to execution environment...
[2022-05-16 09:10:00,357] INFO - agent | Pulling image <Image registry>
[2022-05-16 09:10:00,359] ERROR - agent | Exception encountered while deploying flow run 9f0890cb-e068-4ad2-b2a6-bea91ac70e0f
Traceback (most recent call last):
File "/home/lixoloic/.local/lib/python3.9/site-packages/docker/credentials/store.py", line 76, in _execute
output = subprocess.check_output(
File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/python3.9/subprocess.py", line 505, in run
with Popen(*popenargs, **kwargs) as process:
File "/usr/lib/python3.9/subprocess.py", line 951, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/lib/python3.9/subprocess.py", line 1722, in _execute_child
errpipe_read, errpipe_write = os.pipe()
OSError: [Errno 24] Too many open files
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/Skam/.local/lib/python3.9/site-packages/docker/auth.py", line 262, in _resolve_authconfig_credstore
data = store.get(registry)
File "/home/Skam/.local/lib/python3.9/site-packages/docker/credentials/store.py", line 33, in get
data = self._execute('get', server)
File "/home/Skam/.local/lib/python3.9/site-packages/docker/credentials/store.py", line 89, in _execute
raise errors.StoreError(
docker.credentials.errors.StoreError: Unexpected OS error "Too many open files", errno=24
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/Skam/.local/lib/python3.9/site-packages/prefect/agent/agent.py", line 388, in _deploy_flow_run
deployment_info = self.deploy_flow(flow_run)
File "/home/Skam/.local/lib/python3.9/site-packages/prefect/agent/docker/agent.py", line 387, in deploy_flow
pull_output = self.docker_client.pull(image, stream=True, decode=True)
File "/home/Skam/.local/lib/python3.9/site-packages/docker/api/image.py", line 409, in pull
header = auth.get_config_header(self, registry)
File "/home/Skam/.local/lib/python3.9/site-packages/docker/auth.py", line 45, in get_config_header
authcfg = resolve_authconfig(
File "/home/Skam/.local/lib/python3.9/site-packages/docker/auth.py", line 322, in resolve_authconfig
return authconfig.resolve_authconfig(registry)
File "/home/Skam/.local/lib/python3.9/site-packages/docker/auth.py", line 233, in resolve_authconfig
cfg = self._resolve_authconfig_credstore(registry, store_name)
File "/home/Skam/.local/lib/python3.9/site-packages/docker/auth.py", line 278, in _resolve_authconfig_credstore
raise errors.DockerException(
docker.errors.DockerException: Credentials store error: StoreError('Unexpected OS error "Too many open files", errno=24')
[2022-05-16 09:10:00,617] ERROR - agent | Updating flow run 9f0890cb-e068-4ad2-b2a6-bea91ac70e0f state to Failed...