Tracing cached results

Alenka_Volkova · May 24, 2023, 2:41pm

Hi,

I’m making a data processing pipeline with Prefect, where each task creates an intermediate file. For each run I’d like to be able to trace where all the intermediate files are stored.

Some tasks are expensive, so I use caching. The problem is that when a task is skipped and its cached result is used, I can’t find where that cached result was read from. Prefect generates the cache key and retrieves the persisted result, but as far as I can tell, neither the UI nor the run context shows where the persisted result was retrieved from.

The only workaround I’ve found so far is to have each task return the path to a file containing the data, rather than the data itself, and then log each path or publish it as an artifact:

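from prefect import flow, task
from prefect.tasks import task_input_hash
from prefect.artifacts import create_markdown_artifact

# Cache on the task inputs, so repeated runs with the same inputs are skipped
@task(cache_key_fn=task_input_hash)
def my_task():
    path = "some-unique-path"
    # ...write something to path...
    return path

# log_prints=True sends print() output to the flow run logs
@flow(log_prints=True)
def my_flow():
    path = my_task()
    # Record the path as an artifact...
    create_markdown_artifact(key="my-task-output-path", markdown=path)
    # ...or log it:
    print(f"my_task output is at {path}")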

…and do the same for every task of every flow. Is there a better way to do this?
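
One way the repetition might be reduced, sketched in plain Python: wrap each task call in a small helper that records whatever path comes back, then log or publish the collected paths once at the end of the flow. The names `record_output_path` and `output_paths` are hypothetical and not part of Prefect. The sketch assumes that calling a cached task still returns the stored path, as in the workaround above. A plain function stands in for the task here so the example runs without Prefect installed.

```python
from typing import Any, Callable

# Hypothetical registry: (task name, output path) pairs collected during one run.
output_paths: list[tuple[str, str]] = []


def record_output_path(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a callable so that the path it returns is also recorded."""
    # Fall back to repr() if the wrapped object has no __name__ attribute.
    name = getattr(fn, "__name__", repr(fn))

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        path = fn(*args, **kwargs)
        # Record the path regardless of how the callable produced it
        # (assumption: a cache hit returns the same stored path).
        output_paths.append((name, str(path)))
        return path

    return wrapper


def my_task() -> str:
    # Stand-in for a cached task that returns the path it wrote to.
    return "some-unique-path"


traced_task = record_output_path(my_task)
path = traced_task()

for task_name, task_path in output_paths:
    print(f"{task_name} output is at {task_path}")
```

Inside a real flow, the final loop could be replaced by a single artifact or log entry listing every collected path, so the per-task reporting code is written only once.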
