Save data during a flow run to the local file structure

For educational purposes I am trying to set up a local data pipeline using Prefect 2. I am looking for a way to save a pandas DataFrame or a JSON file during the execution of a flow.

Any help is appreciated.


Sure, do you already have your script written in Python with Pandas? Can you share your script so far?

Thanks for your quick reply. This is my script, with the flow consisting of multiple tasks.

import os
import yaml
import json

from minio import Minio
from prefect import flow
from prefect.filesystems import LocalFileSystem
from prefect.results import PersistedResult

from src.etl import (
    extract_sub_prefix, 
    get_object_paths, 
    download_files, 
    get_checkpoint,
    load_image_batch
)
from src.etl import dummy_transform, model_predict
from src.etl import update_processed_data


@flow(name='test-flow')
def etl_flow(BATCHSIZE, TEST_DATA, IS_TEST):
    # get checkpoint from SQL before loading next
    # Load Configurations
    with open('bucket_config.yaml', 'rb') as yaml_file:
        bucket_config = yaml.load(yaml_file, Loader=yaml.FullLoader)

    df_ckp = get_checkpoint(
        is_test=IS_TEST, 
        path=TEST_DATA
    )
    # --------------------------------
    # Extract
    # --------------------------------
    df_batch = load_image_batch(
        data=df_ckp,
        size=BATCHSIZE
    )
    # initiate s3 client
    client = Minio(
        bucket_config['HOST'], 
        access_key=bucket_config['ACCESS_KEY'],
        secret_key=bucket_config['SECRET_KEY']
    )
    download_files(
        client=client,
        bucket_name=bucket_config['BUCKET_NAME'],
        filenames=df_batch['object_name'].to_list(),
        n_threads=8,
    )
    
    # --------------------------------
    # Transform
    # --------------------------------
    with open('model_config.yaml', 'rb') as yaml_file:
        model_config = yaml.load(yaml_file, Loader=yaml.FullLoader)

    predictions = model_predict(
        data=df_ckp,
        cfg=model_config
    )

    # Here I would like to save the resulting JSON file
    with open('results.json', 'w') as json_file:
        json.dump(predictions, json_file)
        

    # --------------------------------
    # Update
    # --------------------------------
    df_ckp = update_processed_data(
        df=df_ckp, 
        processed_ids=df_batch['object_name'].to_list(),
        path=TEST_DATA
    )
   
    # In this step I would like to save df_ckp locally.


    
if __name__ == '__main__':
    etl_flow()

Nice work! If you run Orion locally (prefect orion start) and start this flow directly from your terminal/Python console, all that is left to accomplish your goal is to add:

df_ckp.to_json("filename.json", orient="records")

And how can I manage this when I start it via a work queue and the Orion UI?


Check out those resources:

It can be as simple as:

prefect deployment build -n dev -q default -a flows/etl_flow.py:etl_flow
prefect agent start -q default
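
Once the agent is running, you can trigger a run from the Orion UI or from the CLI, for example (assuming the flow name test-flow and the deployment name dev from the build command above):

prefect deployment run test-flow/dev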

Thank you for the references. I am probably missing something.

My deployment works fine, but what if I want to transform, e.g., a pandas DataFrame from my local file structure during a flow run executed by an agent, and then save the transformed DataFrame back to the local file structure?
If I run this flow locally by calling the script with the flow, as mentioned at the beginning, it works fine and the changes made to the DataFrame are visible. However, if I start it via the Orion interface, the changes do not persist.
How is it possible to achieve this persistence of the transformed DataFrame or the generated JSON?

You can persist all results by default by leveraging the command:

prefect config set PREFECT_RESULTS_PERSIST_BY_DEFAULT=true

All results will be stored by default as pickle files in ~/.prefect/storage; you can override this with:

prefect config set PREFECT_LOCAL_STORAGE_PATH='${PREFECT_HOME}/storage'
prefect config set PREFECT_RESULTS_DEFAULT_SERIALIZER='json'
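
If you prefer not to change the global setting, you can also turn on persistence per flow or per task via the decorators. A minimal sketch (the task and flow names are illustrative):

from prefect import flow, task

@task(persist_result=True, result_serializer="json")
def make_predictions():
    # stand-in for the real model output
    return {"object_1": 0.93, "object_2": 0.12}

@flow(name="persisted-results-demo", persist_result=True)
def demo_flow():
    return make_predictions()

if __name__ == "__main__":
    demo_flow()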

It works fine when you run this flow locally by calling the script, but not via the Orion interface (local agent, actually), because the local agent defaults to a temporary working directory.
Set infrastructure: working_dir: in the deployment YAML file to the path where you want to save ‘results.json’.
For example, I set it to

...
infrastructure:
  working_dir: /Users/myfolder/projects/prefect-test
...

After setting working_dir and applying it, ‘results.json’ is properly saved in /Users/myfolder/projects/prefect-test/results.json and visible.
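
As an alternative (or in addition), you can make the flow independent of the agent’s working directory by writing to an absolute path inside the flow. A minimal sketch, assuming the same output folder as above (save_predictions is a hypothetical helper, not part of the original script):

import json
from pathlib import Path

# adjust this path to your own machine
OUTPUT_DIR = Path("/Users/myfolder/projects/prefect-test")

def save_predictions(predictions: dict) -> None:
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
    with open(OUTPUT_DIR / "results.json", "w") as json_file:
        json.dump(predictions, json_file)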


@eshyun your solution looks great.
I’d like to do the same as above, but using Deployment.build_from_flow instead of the CLI. I cannot find fitting parameters in the documentation. Any ideas?

# not working:
Deployment.build_from_flow(…, working_dir="/Users/myfolder/projects/prefect-test")

If you want to use Deployment.build_from_flow, then create an infrastructure object as shown below.

from prefect.infrastructure import Process

# The two lines below can be removed once the infrastructure object has been
# created by running the Python script. The block can also be created in the
# Prefect Cloud UI.
my_infra = Process(working_dir="path/to/dir")
my_infra.save(name="infra-name")

Now run the Python script, and then add the line below to load the my_infra object for your Deployment.

my_infra = Process.load("infra-name")

Deployment.build_from_flow(
    name="",
    flow="",
    infrastructure=my_infra,  # the infrastructure object loaded above
)
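
For reference, a fuller sketch of how this can fit together (the names are illustrative and assume the etl_flow from the script above is importable):

from prefect.deployments import Deployment
from prefect.infrastructure import Process

from flows.etl_flow import etl_flow  # adjust to your module path

deployment = Deployment.build_from_flow(
    flow=etl_flow,
    name="dev",
    work_queue_name="default",
    infrastructure=Process.load("infra-name"),
)
deployment.apply()  # register the deployment with the Orion API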

Import LocalFileSystem with:

from prefect.filesystems import LocalFileSystem

Create your local storage block:

my_storage_block = LocalFileSystem(basepath='path/to/save/dir')

Save your storage block:

my_storage_block.save(name="image-storage")

Then you can set it in your @flow decorator like this:

@flow(name="retrieval flow", persist_result=True, result_storage=LocalFileSystem.load("image-storage"))
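
Putting it together, a minimal sketch of a flow whose results are persisted to that block (the task and DataFrame are illustrative, and the block name assumes the "image-storage" block saved above):

import pandas as pd
from prefect import flow, task
from prefect.filesystems import LocalFileSystem

@task(persist_result=True, result_serializer="json")
def transform(df: pd.DataFrame) -> dict:
    # stand-in for the real transformation
    return {"records": df.to_dict(orient="records")}

@flow(
    name="retrieval flow",
    persist_result=True,
    result_storage=LocalFileSystem.load("image-storage"),
)
def retrieval_flow():
    df = pd.DataFrame({"object_name": ["a.png", "b.png"]})
    return transform(df)

if __name__ == "__main__":
    retrieval_flow()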

I thought that just specifying the path when building a Deployment would be enough, but specifying the infrastructure with the same path solved a similar issue I had. Thanks for your answer!