View in #prefect-community on Slack
@Fina_Silva-Santisteban: Hi everyone! Does someone have examples of how they’ve used the SnowflakeQuery task?? Specially interested in examples using the imperative api!
I have the following so far:
flow.set_dependencies(
task=SnowflakeQuery,
keyword_tasks=dict(query='''SELECT * FROM dummy_table;'''
)
)
flow.set_dependencies(
task=save_query_result_as_df,
keyword_tasks=dict(result_set=SnowflakeQuery)
)
The task save_query_result_as_df
currently only does a print()
of the result_set
. I’m confused about a few things:
• I didn’t provide the SnowflakeQuery
with any authentication. Why doesn’t it throw an error about that?
• The print statement prints out <class 'prefect.tasks.snowflake.snowflake.SnowflakeQuery'>
, which makes me think the snowflakequery task wasn’t run? If it wasn’t run it would at least explain why it didn’t throw an error but how can I make it run?
ETL | Prefect Docs
@alex: Hey @Fina_Silva-Santisteban! I think that you will need to create an instance of the SnowflakeQuery
task class to use within your flow. Something like this
snowflake_task = SnowflakeQuery()
flow.set_dependencies(
task=snowflake_task,
keyword_tasks=dict(query='''SELECT * FROM dummy_table;'''
)
)
@Kevin_Kho: Yes to what Alex said and then the authentication goes in the first initialization
@Fina_Silva-Santisteban: OHHH yes that makes sense! And now I see the expected authentication error! Thanks so much @alex @Kevin_Kho
@Kevin_Kho @alex The SnowflakeQuery task returns List[List]: output of cursor.fetchall()
, which is not the most useful since I need the result set as a dataframe. I’ll need to set up and use the snowflake.connector
and pd.read_sql()
to get what I need. Can I submit this somewhere as a feature request??
@Kevin_Kho: Alex, do you know if this is possible by changing the cursor in the Snowflake Task?
I guess you can create your own version of the Snowflake Task and use the fetch_pandas_all on the cursor which returns a Pandas DataFrame
@alex: It doesn’t look like changing the cursor will enable getting results as a DataFrame in the current task. We’d either need to create a separate task that returns results as a DataFrame or add an option to the current task to allow the results to be returned as a DataFrame.