Hi! Is there a way to pass files between steps, as it’s implemented in kubeflow? In kubeflow, one can use function parameters of type InputPath
and OutputPath
, something like
def first_step(path: OutputPath(str)):
# write the data in the local file `path`
def second_step(path: InputPath(str)):
# read the data from the local file `path`
kubeflow generates the file name and manages the storage, the user code doesn’t depend on whether the files are stored on AWS, Google Cloud or somewhere else.
As I understand, in prefect I can implement my own Result
class that implements the required logic. But there are issues: (1) it should be implemented; (2) I will have implementation details in my user code. For example, if the files are stored in S3 I should use the corresponding library to upload/dowload them.
I’ve read the docs about Results, checkpoints and targets. It still seems to be different from kubeflow: I want to work with the files directly in my code, and I don’t want to return just python data that will be then serialized to the files. I also want to abstract the code from the storage, so the code doesn’t change if the storage is changed.
Do I miss something? How can I do it in prefect?