When you run your flows locally, you can leverage your favorite debugging tools (e.g. a debugger integrated into your IDE) to troubleshoot various failing task runs.
Once you move those flows to production and deploy (or register) them, you may leverage Automations or attach a state handler to your tasks.
Prefect 2.0
You can call a function that reacts to a state change, as described in this topic:
More ways to configure custom failure notifications in Orion are planned. Watch the Announcements category in the coming weeks and months.
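As a rough illustration of a function that reacts to a state change, here is a minimal sketch. It is not taken from the linked topic: the helper name `react_to_failure` and the `notify` callback are hypothetical, and the sketch only assumes that the state object exposes `is_failed()`, `name`, and `message`, as Prefect 2.0 states do.

```python
def react_to_failure(state, notify):
    """Call notify(message) if the state is a failure; return True if it fired."""
    if state.is_failed():
        notify(f"Run finished in state {state.name!r}: {state.message}")
        return True
    return False


# Possible usage inside a flow (not executed here). How you obtain the state
# object depends on the Prefect 2.0 release you run, so treat this as an
# assumption rather than the documented way:
#
# state = my_task(return_state=True)
# react_to_failure(state, notify=print)
```

Passing `notify` as a callable keeps the reaction logic independent of the notification channel, so the same function could print locally and post to Slack in production.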
Prefect 1.0
You can leverage Automations or attach a state handler to your tasks.
State handler with a full exception traceback
To get a full exception traceback you may use the following state handler logic:
import prefect
from prefect import task,…
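Since the snippet above is cut off, here is a separate minimal sketch of the idea, not the original code. It assumes the Prefect 1.0 state handler signature `(task, old_state, new_state)` and that a failed state keeps the raised exception in `new_state.result`. The names `format_failure` and `log_traceback_on_failure` are hypothetical, and the standard `logging` module stands in for the Prefect logger.

```python
import logging
import traceback

logger = logging.getLogger("failure_handler")


def format_failure(state):
    """Return the full traceback text for a failed state, if one is available."""
    exc = state.result  # assumption: a Failed state stores the raised exception here
    if isinstance(exc, BaseException):
        return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    # Fall back to a plain representation when the result is not an exception.
    return repr(exc)


def log_traceback_on_failure(task, old_state, new_state):
    """State handler: log the traceback whenever the task enters a failed state."""
    if new_state.is_failed():
        task_name = getattr(task, "name", task)
        logger.error("Task %s failed:\n%s", task_name, format_failure(new_state))
    # A state handler returns the state that the run should continue with.
    return new_state


# Attaching the handler in Prefect 1.0 (not executed here):
#
# from prefect import task
#
# @task(state_handlers=[log_traceback_on_failure])
# def divide(x, y):
#     return x / y
```

The handler returns `new_state` unchanged, so it only observes failures and does not alter how the run proceeds.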
Slack discussion
Here is a Slack discussion on the same topic:
View in #prefect-community on Slack
@Vadym_Dytyniak : Hello. Is it possible to get upstream task_run_ids by task_run_id using prefect.Client or GraphQL?
@Anna_Geller : Can you describe what you are trying to do? Why do you think you need to use GraphQL for that?
@Vadym_Dytyniak : When a task fails, we would like to have a utility that collects the checkpointed inputs that were passed to this task.
@Anna_Geller : May I ask why? What is the end goal you are trying to achieve? Do you want to apply your own restart logic of some sort?
@Vadym_Dytyniak : No, during debugging users would like to understand why this task failed: see the inputs, run the same task locally, and debug.
@Anna_Geller : For such debugging, perhaps a custom state handler that retrieves and prints (or sends as a Slack alert) the exception of a failed task run may be more helpful than a GraphQL query?
Personally, I think for local debugging you can just use your IDE's debugger, which lets you step through the code and see the variables/inputs at each step.
Also, when running your flows locally and debugging, such a GraphQL query wouldn’t help, since the data is only populated during a backend run, not when running a flow locally.
But if you still want to go that route you described with GraphQL queries, check out this PR that added Restarts in the UI repo: https://github.com/PrefectHQ/ui/pull/285/files
My understanding of the process in this PR:
• it queries for Failed task runs (which is also what you are trying to do),
• it queries for downstream tasks of failed task runs (for you it seems you want to go the opposite direction to fetch upstream task runs, but maybe the syntax already helps),
• then it restarts the task runs in the right order (respecting all the dependencies) and sets new task run states after completion (also something similar to what you try to do with rerunning specific task runs).
@Vadym_Dytyniak : Thanks!