Do Cloud Hooks work on a flow level or on a task level?

View in #prefect-community on Slack

@Yeachan_Park: Hello all. Have a small question about cloud hooks. According to the documentation they work only at the flow level and not at the task level. One of the states I can select in the cloud hook is “TimedOut”. However I can’t find how to get my flow to enter that state. When I create my flow, I don’t see any timeout condition for the entire flow run itself.

Have I understood it correctly that cloud hooks only work at the flow run level? And if so, how can I get my cloud hook to trigger when a flow run enters the state “TimedOut”? Thanks in advance.

@Anna_Geller: TimedOut means: “Finished state indicating failure due to execution timeout.” - so I would expect to see it being triggered more on a task run level when there is some timeout defined on a task and it took longer than that. I can ask the team to be sure.

But can you explain what problem you are trying to solve? Do you want to perform some action if your flow run took longer than e.g. 60 min?
Also: are you on Prefect Cloud or Server?

@Yeachan_Park: Hi Anna, thanks for your response! I am trying Prefect Server. I was trying to get SLAs to work at both the task level and the flow run level.

At the task level, I do not think this is supported. I can fetch the TimedOut state if the task times out, and use a state handler to send a message wherever, but not via a cloud hook as far as I can tell, since cloud hooks seem to be flow-run level. Either way, I guess there is no way for whatever is running to run to completion once it times out, AFAIK?

I did see this when configuring cloud hooks though, which made me ask this question:

Appreciate your help

@Anna_Geller: on a task level there is a keyword argument timeout on the task decorator you could use:

        - timeout (Union[int, timedelta], optional): The amount of time (in seconds) to wait while
            running this task before a timeout occurs; note that sub-second
            resolution is not supported, even when passing in a timedelta.

on a flow level, so far we don’t have such functionality for Prefect Server, but if you used Prefect Cloud, there are Automations allowing you to set an SLA on a flow level, e.g. if a flow run doesn’t finish within 60 min, do something (cancel the run, create a new run, send an alert, etc.)

@Yeachan_Park: That is clear, thanks for your quick help

@Anna_Geller: just for reference https://docs.prefect.io/orchestration/ui/automations.html#new-automations


@Yeachan_Park: and from what I read (and what you explained), automations are also based on the flow level, is that correct? I.e. it’s not possible to create an automation based on a single task running over x minutes

@Anna_Geller: as I mentioned before, for this you could use the timeout argument on the task decorator. This allows you to fail a task run if it takes longer than, say, 5 min. Then you could attach a state handler (the state_handlers argument) to the same task to perform some action if this task run fails. You could even perform the action only if the task ended in a TimedOut state

from prefect import task

# some_action_on_failed is a state handler defined elsewhere
@task(timeout=300, state_handlers=[some_action_on_failed])
def some_task():
    pass
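For readers wondering what a handler like some_action_on_failed could look like, here is a minimal sketch. It relies on the fact that a Prefect 1.x state handler is a plain callable taking (task, old_state, new_state) and returning a state. The names alert_on_state and notify, and the Fake* stand-in classes, are hypothetical and exist only so the sketch runs without Prefect installed.

```python
from typing import Any, Callable, List, Type


def alert_on_state(
    state_cls: Type[Any], notify: Callable[[str], None]
) -> Callable[[Any, Any, Any], Any]:
    """Build a state handler that calls notify() on entering a state_cls state."""

    def handler(obj: Any, old_state: Any, new_state: Any) -> Any:
        # Prefect 1.x calls handlers as handler(task_or_flow, old_state, new_state).
        if isinstance(new_state, state_cls) and not isinstance(old_state, state_cls):
            name = getattr(obj, "name", repr(obj))
            notify(f"{name} entered {type(new_state).__name__}")
        # Returning new_state leaves the state transition unchanged.
        return new_state

    return handler


# Hypothetical stand-ins for Prefect's state and task classes.
class FakeState:
    pass


class FakeTimedOut(FakeState):
    pass


class FakeTask:
    name = "some_task"


messages: List[str] = []
handler = alert_on_state(FakeTimedOut, messages.append)
handler(FakeTask(), FakeState(), FakeState())     # ordinary transition: no alert
handler(FakeTask(), FakeState(), FakeTimedOut())  # timeout transition: alert
print(messages)
```

In a real Prefect 1.x flow you would pass the real class instead of the stand-in, i.e. import TimedOut from prefect.engine.state and use state_handlers=[alert_on_state(TimedOut, send_message)], where send_message is whatever notification function you already have.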

@Yeachan_Park: Thanks, yes I understand. Am I correct in assuming that both automations and state handlers require an actual change in state, in flows (for automations) and in tasks (for state handlers)? I’m wondering if it’s possible to let flows/tasks execute past a predefined SLA but still send the alert. For example, set 60 minutes as the SLA threshold for a flow, send a notification once the 60 minutes is up, but let the flow continue and finish?

@Anna_Geller: this is hard - why do you need that? Can you explain your use case?
You could build a flow of flows: one subflow does the initial work and has an SLA of, say, 60 minutes configured with an Automation that sends an alert once this SLA is exceeded, while the work may still continue in other subflows. Everything could be orchestrated from a parent flow. Does it make sense?
If you need some examples of the flow-of-flows orchestration pattern, check the Discourse topics tagged flow-of-flows

I think the flow-of-flows would be the easiest and most reliable approach here

wait, scratch that - I think you can actually do just that in Prefect Cloud with an Automation - it will send an alert once 60 min have passed, but the flow run won’t be interrupted

@Yeachan_Park: OK, thanks! It’s clear that it’s possible on the flow level. Use case wise we do have quite a few extraction tasks that take quite a while to complete, but it would be nice to receive some alerts if they pass a certain threshold time to extract. It’s quite expensive to fail these and re-run them though, not just because of time but because it takes up quite a lot of the bandwidth of the upstream source.
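For the task-level case described here (alert once a threshold passes, but never fail the extraction), one possible workaround is a watchdog timer inside the task body. This is a plain-Python sketch and not a Prefect feature; sla_alert, notify and the labels are hypothetical names, and it assumes the process running the task stays alive long enough for the timer to fire.

```python
import threading
import time
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def sla_alert(
    seconds: float, notify: Callable[[str], None], label: str = "task"
) -> Iterator[None]:
    """Call notify() once if the wrapped block is still running after `seconds`."""
    timer = threading.Timer(
        seconds, notify, args=(f"{label} exceeded SLA of {seconds}s",)
    )
    timer.daemon = True  # do not keep the process alive just for the alert
    timer.start()
    try:
        yield  # the work runs here and is never interrupted
    finally:
        timer.cancel()  # no alert if the work finished within the SLA


alerts: List[str] = []

# Slow work: the timer fires after 0.05s, the work still runs to completion.
with sla_alert(0.05, alerts.append, label="slow_extract"):
    time.sleep(0.3)  # stands in for a long extraction

# Fast work: finishes first, so the timer is cancelled and nothing is sent.
with sla_alert(1.0, alerts.append, label="fast_extract"):
    time.sleep(0.01)

print(alerts)
```

Inside a task you would wrap the extraction call in `with sla_alert(3600, send_message, label="extract")`, with send_message being your own notification function. Because the alert comes from a timer rather than a state change, the task run keeps its Running state and no re-run is needed.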

@Anna_Geller: this totally makes sense! LMK if you have any other questions I can help with