How to migrate Prefect 1.0 task library tasks to a Prefect Collection repository in Prefect 2.0?

What are Prefect Collections?

Prefect Collections are groupings of pre-built tasks and flows used to quickly build data flows with Prefect.

Collections are grouped around the services with which they interact. For example, to download data from an S3 bucket, you could use the s3_download task from the prefect-aws collection, or if you want to send a Slack message as part of your flow, you could use the send_message task from the prefect-slack collection.

By using Prefect Collections, you can reduce the amount of boilerplate code that you need to write for interacting with common services and focus on the outcome you’re seeking to achieve.

Usage

To use a Prefect Collection, first install the collection via pip. As an example, to use prefect-aws:

pip install prefect-aws

The AWS tasks and flows in that collection can then be imported and called within your flow:

from prefect import flow
from prefect_aws import AwsCredentials
from prefect_aws.secrets_manager import read_secret


@flow
def connect_to_database():
    aws_credentials = AwsCredentials(
        aws_access_key_id="access_key_id",
        aws_secret_access_key="secret_access_key"
    )
    secret_value = read_secret(
        secret_name="db_password",
        aws_credentials=aws_credentials
    )

    # Use secret_value to connect to a database

Available Collections

To see the list of available Prefect Collections and links to each collection’s GitHub repository and documentation, please refer to the Collection Catalog in the Prefect documentation.

Contributing Prefect Collections

Anyone can create and share a Prefect Collection and we encourage anyone interested in creating a collection to do so!

Generating a project

To help you get started with your collection, we’ve created a template that gives you the tools you need to create and publish your collection.

To generate a collection from the template, run the following:

# 1. Install cookiecutter
pip install cookiecutter

# 2. Generate a Prefect Collection project
cookiecutter https://github.com/PrefectHQ/prefect-collection-template

After your project has been generated, refer to the MAINTAINERS.md in the generated project for information about developing your collection.

Listing in the Collections Catalog

To list your collection in the Prefect Collections Catalog, submit a PR to the Prefect repository adding a file to the docs/collections/catalog directory with details about your collection. Please use TEMPLATE.yaml in that folder as a guide.

Related topics

How to migrate a task from Prefect 1.0 to a Prefect 2.0 collection?

View in #prefect-contributors on Slack

ale @ale: Hey folks :wave:
Are there any guidelines to “convert” existing Prefect 1.0 tasks to make them work with Prefect 2.0?

@Anna_Geller: Hi Ale, great question! I asked Chris about it - we will have more concrete guidelines for that later. For now, we have a guide on how you can contribute to Prefect Collections, discussed here: “How to contribute to Prefect Collections?”

ale @ale: Thanks Anna! :raised_hands:
After digging into the code and other collections I was able to set up a prefect-cubejs collection on my local environment :blush:

Here you can find the code!

What are the next steps to submit the code into Prefect repo?

@Anna_Geller: That’s awesome! I’m sure @alex can help with the review.

alex @alex: Hey @ale! Thanks so much for creating a Prefect Collection!

Here are the next steps I recommend for prefect-cubejs:

  • Ensure that your code has good unit test coverage. You can check this by running coverage run --branch -m pytest tests. We recommend at least 80% coverage.
  • Make sure that all the documentation is updated. This includes the docstrings in your Python code and README.md since they are used to generate your documentation. You can run mkdocs serve to view the generated documentation.
  • Publish a version of prefect-cubejs on PyPI. You can do this automatically by creating a tag in GitHub with a v prefix (e.g. v0.1.0). This will create a release on PyPI and also publish the autogenerated docs to GitHub Pages. You’ll need a PyPI token to publish to PyPI; refer to MAINTAINERS.md in your repo for more information.

Once you’ve completed the initial release, we can work together to get prefect-cubejs listed in the Prefect Collections Catalog.

If you run into any issues, let me know and I’ll be happy to help!

ale @ale: Hey @alex :wave:
Thanks for the detailed instructions!
I was able to release the very first version of prefect-cubejs on PyPI :tada:

Is there anything else I need to do?

alex @alex: Can you send me a link to the deployed documentation?

ale @ale: Sure, let me find it and will send it to you :sweat_smile:
Where can I find the URL of the generated GitHub Pages?

alex @alex: I think this quickstart will give you the info you need: Quickstart for GitHub Pages - GitHub Docs

ale @ale: I discovered GitHub Pages was disabled on my repo.
I’ve enabled it now, but I think I need to run the action again
There you go @alex https://alessandrolollo.github.io/prefect-cubejs/

alex @alex: Nice! Looks like the docstring on run_query needs to be filled in. Can you add a docstring that documents the arguments, return type, and examples for the task, similar to the tasks here: https://github.com/PrefectHQ/prefect-slack/blob/main/prefect_slack/messages.py?
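
A minimal sketch of the docstring layout being asked for, with a summary, an Args section, a Returns section, and an example. The function below is a hypothetical stand-in and not the actual run_query task from prefect-cubejs; its name, parameters, and return value are assumptions chosen for illustration, and the @task decorator is left off so the sketch runs without Prefect installed.

```python
from typing import Any, Dict, Optional


def build_query_payload(
    query: Dict[str, Any],
    subdomain: Optional[str] = None,
) -> Dict[str, Any]:
    """
    Build the request payload for a query (hypothetical example).

    Args:
        query: The query to send, expressed as a dictionary.
        subdomain: Account subdomain; left out of the payload
            when it is not provided.

    Returns:
        A dictionary containing the payload that would be sent to the API.

    Example:
        Build a payload for a simple count query:

        >>> build_query_payload({"measures": ["Orders.count"]})
        {'query': {'measures': ['Orders.count']}}
    """
    payload: Dict[str, Any] = {"query": query}
    if subdomain is not None:
        payload["subdomain"] = subdomain
    return payload
```

The types live only in the function signature here; the Args entries describe meaning rather than repeating the annotations.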

ale @ale: Working on it
Done!

alex @alex: Nice! In order to deploy the updated docs you’ll need to push the changes to the docs branch. Here’s a workflow that you can include in your project to deploy a new version of the docs: https://github.com/PrefectHQ/prefect-slack/blob/main/.github/workflows/publish-docs.yml. That workflow can be triggered manually from the GitHub Actions UI.

Also, I noticed references to prefect.engine.signals.FAIL in the new docstring, but I think that should be CubeJSAPIFailureException.
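
A minimal sketch of that change, for context: in Prefect 1.0 a task signalled failure by raising prefect.engine.signals.FAIL, whereas a Prefect 2.0 task fails when it raises an ordinary Python exception, so a collection can define its own exception type. Only the name CubeJSAPIFailureException comes from this thread; the helper function, its arguments, and the error message are hypothetical.

```python
class CubeJSAPIFailureException(Exception):
    """Raised when a Cube.js API call does not succeed."""


def check_response(status_code: int, body: str) -> str:
    """
    Return the response body on success (hypothetical helper).

    Args:
        status_code: HTTP status code returned by the API.
        body: Raw response body.

    Returns:
        The unchanged response body.

    Raises:
        CubeJSAPIFailureException: If the status code is not 200.
    """
    if status_code != 200:
        # Raising a regular exception inside a task marks the task run as failed.
        raise CubeJSAPIFailureException(
            f"Cube.js API call failed with status {status_code}: {body}"
        )
    return body
```

Documenting the exception under a Raises section keeps the docstring aligned with what the code actually raises.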

ale @ale: Will check it, thanks @alex :raised_hands:
Docstrings have been fixed and published to GitHub Pages :raised_hands:
@alex please let me know if there’s anything else I need to fix/improve

alex @alex: It looks like mkdocs is having trouble parsing the docstring on your task. I think that you can remove the typing from the docstring (e.g. (str, optional)) since mkdocs can pull typing info from the type hints on your function. If you want to experiment with changes to the documentation on your local you can run mkdocs serve which will rebuild the docs each time you make a change.
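
To illustrate why types in the docstring are redundant, the sketch below reads parameter and return types directly from a function's annotations using only the standard library. This is roughly the information a documentation generator can recover on its own; how the mkdocs tooling does it internally is not shown here, and the function itself is hypothetical.

```python
from typing import Optional, get_type_hints


def greet(name: str, greeting: Optional[str] = None) -> str:
    """
    Build a greeting.

    Args:
        name: Who to greet.
        greeting: Custom greeting word; falls back to "Hello".

    Returns:
        The formatted greeting.
    """
    return f"{greeting or 'Hello'}, {name}!"


# Collect the annotations for every parameter plus the return value.
hints = get_type_hints(greet)
for parameter, annotation in hints.items():
    print(f"{parameter}: {annotation}")
```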

ale @ale: Should be fixed now
