Unit testing best practices for Prefect flows, subflows and tasks

anna_geller · June 17, 2022, 3:02am

This Discourse topic collects best practices and guidance around testing your dataflow.

https://discourse.prefect.io/t/how-to-disable-prefect-logger-for-unit-tests/1968

ahuang11 · June 17, 2022, 9:55pm

I really like these prompts; thanks for setting them up!

To answer “Unit testing tasks,” this is how I typically do it:

module.py

from prefect import task

@task
def divide_task(x, y):
    return x / y

tests/test_module.py

import pytest
from module import divide_task

def test_divide_task():
    assert divide_task.fn(1, 2) == 0.5  # fn calls underlying function


def test_divide_task_error():
    with pytest.raises(ZeroDivisionError):
        divide_task.fn(1, 0)

Then run pytest from the command line:
pytest .

There’s another way of writing it, but I think this might be too verbose.

import pytest
from prefect import flow
from module import divide_task

def test_divide_task():
    @flow
    def test_flow():
        return divide_task(1, 2)
    task_state = test_flow().result()
    actual = task_state.result()
    assert actual == 0.5

Also, there are docs here: Testing - Prefect 2.0

anna_geller · June 17, 2022, 10:00pm

You are amazing! I didn’t even plan to share it until Tuesday (until then I plan to refine this content plan even more), and you are already contributing.

serina · August 2, 2022, 6:25pm

Summary

Prefect 2.0 makes it easier than ever to test your flows, tasks, and subflows!

Video link

Coming soon.

Audience

Testing your Prefect flows, subflows, and tasks helps you identify and remove any errors from your code. It is paramount to test prior to merging or deploying your code to a higher environment.

What is Changing

Testing is easier: subflows, for example, no longer require registration prior to testing.
The flow decorator (@flow) means your flow code looks like ordinary Python code, making it simpler than ever to create more robust test cases.

How to Convert from 1.0 to 2.0

Flows in 1.0

In Prefect 1.0, we needed to define our flow with a context manager, and running the flow required calling .run().

"""Test a flow in Prefect 1.0"""
from prefect import Flow, task

@task
def my_task():
    return 42

with Flow('my flow') as my_flow:
    first_task = my_task()

# test a flow
def test_my_flow():
    # check the state of the flow for success
    state = my_flow.run()
    assert state.is_successful()

if __name__ == "__main__":
    test_my_flow()

Flows in 2.0

In Prefect 2.0, we replace the context manager with a flow decorator: @flow. Flows are directly callable so we don’t need .run().

We also have the option to use a context manager, prefect_test_harness, to run flows and tasks against a local SQLite database. If you want to use prefect_test_harness in multiple tests, you can use pytest.fixture and scope it to session to ensure efficient testing.

Learn more about pytest.fixture with prefect_test_harness.

Subflows in 1.0

In Prefect 1.0, in order to test a “flow of flows” locally, we would need to register the subflow and then use create_flow_run, specifying name, id and/or project. This makes testing more intricate.

Subflows in 2.0

In Prefect 2.0, testing a subflow is as easy as calling it.

"""Test a subflow in Prefect 2.0"""
from prefect import flow, task


@task
def subflow_task(nbr):
    return nbr * 2

@flow
def subflow(nbr):
    subflow_task(nbr)

@flow
def outer_flow(nbr):
    # pass the argument through; subflow requires nbr
    subflow(nbr)

# test a subflow (no parameter, so pytest does not look for a fixture)
def test_subflow():
    subflow(2)

# test a subflow task
def test_subflow_task():
    assert subflow_task.fn(25) == 50

if __name__ == "__main__":
    test_subflow()
    test_subflow_task()

We can even use .fn() on tasks to test individual tasks.

Tasks in 1.0

In Prefect 1.0, you would use .run() to call the task. Alternatively, you might have used TaskRunner to track the state and result.

"""Test a task in Prefect 1.0"""
from prefect import task

@task
def my_task():
    return 42

# test a task
def test_my_task():
    # .run() calls the task's underlying function
    assert my_task.run() == 42

if __name__ == "__main__":
    test_my_task()

Tasks in 2.0

In Prefect 2.0, .fn() calls the task’s underlying function, so you don’t need .run() or a flow to test a task.

"""Test individual tasks with Prefect 2.0. Can also use task.fn()"""
from prefect import flow, task
from pytest import raises

@task
def my_task():
    return 42

def test_my_task():
    assert my_task.fn() == 42

def test_my_task_fails():
    with raises(AssertionError):
        assert my_task.fn() == 45

if __name__ == "__main__":
    test_my_task()
    test_my_task_fails()

With the huge improvements in 2.0, it’s easier than ever to create rigorous testing while reaping the vast benefits Prefect offers. Happy engineering!

serina · August 4, 2022, 1:26pm

@anna_geller
We could either:

Remove the flow import, as this is just testing an individual task to show that it can be done. This shows improvements to the developer experience during testing.
Add flow decorators to the tests if we wanted visibility into them. This shows improvements in visibility of tests.

Andreas · August 9, 2022, 8:39am

What about when our flows/tasks use Prefect’s own logger via get_run_logger and we still want to test only the specific function without having to create a flow run for it? Calling .fn() does not seem to work in that case, as we get the error:

RuntimeError: There is no active flow or task run context.

anna_geller · August 9, 2022, 10:56am

I asked the team how best to approach it. You’re right that if your tasks or flows use a Prefect logger, running only the relevant function without flow run or task run context will fail.

anna_geller · August 9, 2022, 5:05pm

The answer I got so far:

You can disable the Python logger, e.g. logging.getLogger("prefect").disabled = True, or raise the log level with PREFECT_LOGGING_LEVEL=ERROR.
If you want the logger temporarily disabled for some tests, you can write a fixture that sets and unsets the disabled attribute. It must be set after prefect.logging.setup_logging is called.
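
A minimal sketch of such a set-and-unset helper, written with the standard library only. It toggles the Logger.disabled flag from Python's logging module. The helper name logger_disabled is hypothetical, not a Prefect API.

```python
import logging
from contextlib import contextmanager


@contextmanager
def logger_disabled(name="prefect"):
    """Temporarily silence the named logger, then restore its previous state."""
    logger = logging.getLogger(name)
    previous = logger.disabled
    logger.disabled = True
    try:
        yield logger
    finally:
        # Restore whatever state the logger had before entering the block
        logger.disabled = previous
```

In a pytest suite, a fixture could enter this context manager, yield to the test, and let the finally clause restore the logger afterwards.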

ahuang11 · September 6, 2022, 11:16pm

There’s now a context manager that disables the run logger and avoids the error RuntimeError: There is no active flow or task run context.

from prefect.logging import disable_run_logger

with disable_run_logger():
    a_task.fn()
