Passing files between steps as in kubeflow with InputPath and OutputPath

Hi! Is there a way to pass files between steps, as it’s implemented in Kubeflow? In Kubeflow, one can use function parameters of type InputPath and OutputPath, something like

from kfp.components import InputPath, OutputPath

def first_step(path: OutputPath(str)):
    # write the data to the local file `path`
    ...

def second_step(path: InputPath(str)):
    # read the data from the local file `path`
    ...

Kubeflow generates the file name and manages the storage, so the user code doesn’t depend on whether the files are stored on AWS, Google Cloud, or somewhere else.

As I understand it, in Prefect I can implement my own Result class with the required logic. But there are issues: (1) it has to be implemented in the first place; (2) implementation details leak into my user code. For example, if the files are stored in S3, I have to use the corresponding library to upload/download them.

I’ve read the docs about Results, checkpoints, and targets. It still seems different from Kubeflow: I want to work with the files directly in my code, not return plain Python data that is then serialized to files. I also want to decouple the code from the storage, so the code doesn’t change if the storage backend changes.
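To make the desired abstraction concrete, here is a minimal, purely illustrative sketch (all names are hypothetical, not a Prefect or Kubeflow API): user code only sees local file paths, while a pluggable storage object decides where the files actually live. A local-disk implementation is shown; an S3- or GCS-backed one would download/upload around the user functions instead.

```python
import tempfile
from pathlib import Path

class LocalStorage:
    """Hypothetical storage backend that hands out local file paths."""

    def __init__(self, root: str):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def output_path(self, name: str) -> str:
        # Generate a file name for a step to write to.
        return str(self.root / name)

    def input_path(self, name: str) -> str:
        # Resolve the file a downstream step should read.
        # A cloud implementation would download the object to a
        # temp file here and return that local path instead.
        return str(self.root / name)

# User code: works with plain local paths, knows nothing about S3/GCS.
def first_step(path: str):
    Path(path).write_text("some data")

def second_step(path: str) -> str:
    return Path(path).read_text()

storage = LocalStorage(tempfile.mkdtemp())
first_step(storage.output_path("step1.txt"))
data = second_step(storage.input_path("step1.txt"))
print(data)  # some data
```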

Am I missing something? How can I do this in Prefect?

Hi @tardenoisean, welcome to Discourse! :wave:

You can pass arbitrary data between Prefect tasks, whether that data is a file path or an entire dataframe. Anything you can do in Python, you can do in Prefect, and it’s up to you to design it based on your needs.
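For example, one common pattern is to have the first task write to a generated local file and return its path, which the downstream task receives as an ordinary argument. A minimal sketch in plain Python (in Prefect you would decorate these functions as tasks and wire them together in a flow; the functions themselves stay unchanged):

```python
import tempfile
from pathlib import Path

# In Prefect these would be tasks; shown as plain functions so the
# data-passing pattern is easy to see.

def first_step() -> str:
    # Write data to a generated local file and return its path.
    out_dir = Path(tempfile.mkdtemp())
    path = out_dir / "step1.txt"
    path.write_text("hello from first_step")
    return str(path)

def second_step(path: str) -> str:
    # Read the data from the file produced by the upstream task.
    return Path(path).read_text()

result = second_step(first_step())
print(result)  # hello from first_step
```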

Why is this important to you? Are you in a multi-cloud environment, using e.g. GCP and AWS simultaneously?