Is there a supported way of checking whether a persisted result with a given cache_key exists outside of a task/flow?

Something like:

is_result_available(cache_key, other_necessary_args) -> bool
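To make the idea concrete, here is a minimal sketch of such a check. It is not a Prefect API: it assumes results are persisted to a local directory and that each file is named after its cache key, which is an assumption about the storage setup. The `storage_dir` parameter is a hypothetical stand-in for `other_necessary_args`.

```python
from pathlib import Path


def is_result_available(cache_key: str, storage_dir: Path) -> bool:
    """Return True if a file named after cache_key exists in storage_dir.

    Assumption: persisted results live on the local filesystem, one file per
    result, with the cache key used as the filename. A remote store would
    need its own existence check.
    """
    return (Path(storage_dir) / cache_key).is_file()
```

The check only needs the key and the storage location, so it can run outside any task or flow.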

My use case is a can of worms, but … it involves dynamically creating a flow from a DAG by calling Flow(some_function_created_from_DAG). Generally, it seems the nodes can be created as Prefect tasks with Task(some_node_fxn), where some_node_fxn takes upstream_node_results, during a postorder traversal, so that tasks don’t need to call other tasks. It would be nice to end the flow early when tasks are already precomputed, e.g. when is_result_available(cache_key, …) returns true. But that’s not strictly necessary.
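The traversal can be sketched in plain Python, without Prefect, roughly as follows. The DAG layout (a dict mapping each node to its upstream nodes), the function name `plan_nodes`, and the `is_available` callback are all hypothetical; in the real setup each planned node would be wrapped as a task.

```python
from typing import Callable, Dict, List, Set


def plan_nodes(
    dag: Dict[str, List[str]],
    target: str,
    is_available: Callable[[str], bool],
) -> List[str]:
    """Return, in postorder, the nodes that still need to run for target.

    A node whose result is already available is skipped along with its
    whole upstream subtree, since nothing above it has to be recomputed.
    """
    order: List[str] = []
    seen: Set[str] = set()

    def visit(node: str) -> None:
        if node in seen or is_available(node):
            return
        seen.add(node)
        for upstream in dag.get(node, []):
            visit(upstream)
        # All upstream nodes are handled, so this node can be scheduled.
        order.append(node)

    visit(target)
    return order


# Hypothetical DAG: each node maps to the nodes it depends on.
dag = {"report": ["fit", "clean"], "fit": ["clean"], "clean": ["load"], "load": []}
cached = {"clean"}  # pretend the result for "clean" is already persisted

plan = plan_nodes(dag, "report", cached.__contains__)
print(plan)  # ['fit', 'report']
```

If the target itself is already available, the plan is empty and the flow has nothing to do, which is the early-termination case described above.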

However, occasionally the upstream_node_results are not known until runtime: they are deterministic, but only determined when the task runs (e.g. the parameters to those upstream nodes / tasks aren’t available until runtime). In other words, I can’t avoid tasks calling other tasks here, so the node must be a subflow. These cases can get complex, so it would help if I could avoid manufacturing these runtime subflows.

I checked out Prefect several years ago but didn’t use it for my framework, which is designed for deeply layered scientific analyses. I’m reconsidering now with Prefect 2, but other than searching these topics and glancing at the source, I’m totally naive.
