View in #prefect-community on Slack
@Alexander_Butler: I am working on standing up Prefect 2.0 in a production environment. It's for internal data pipelines and reverse ETL, so there's no fire hazard on my end in using 2.0 early.
Is there a general preference for YAML vs. code for the deployment specification? I noticed you can configure a flow deployment with YAML, but I can't find any information on the schema of that document. For example:
- name: elt-salesforce
  flow_location: ./salesforce_flows.py
  flow_name: elt-salesforce
  tags:
    - salesforce
    - core
  parameters:
    destination: "gcp"
  schedule:
    interval: 3600
Assuming interval is seconds? Can I specify another grain? Can schedule take a dict? If it takes cron, does that take a dict?
Honestly schedule is the primary question point. Everything else is straightforward enough.
@Kevin_Kho: Hi @Alexander_Butler, I’d need to check with the team tomorrow about this and get back to you.
@Anna_Geller: Good choice starting with 2.0 directly!
I’m more biased towards definition in Python, but YAML is also supported. Python definition is friendlier and cleaner.
Here is one example of using YAML:
- name: crypto_prices_etl_dev
  flow_location: /Users/anna/repos/gitops-orion-flows/flows/crypto_prices_etl.py
  flow_name: crypto_prices_etl
  tags:
    - dev
  schedule:
    interval: 3600
- name: repo_trending_check_prefect_dev
  flow_location: ./flows/repo_trending_check.py
  flow_name: repo_trending_check
  tags:
    - dev
  parameters:
    repo: "prefect"
  schedule:
    interval: 3600
- name: repo_trending_check_orion_dev
  flow_location: /Users/anna/repos/gitops-orion-flows/flows/repo_trending_check.py
  flow_name: repo_trending_check
  tags:
    - dev
  parameters:
    repo: "prefect"
  schedule:
    interval: 60
I believe the same definition via Python DeploymentSpec is much cleaner and easier to understand and change, but YAML is also fine.
@Alexander_Butler: I like Python too. I think the ambiguous bit is whether schedule in YAML supports cron, different kwargs for interval, or a different time grain.
@Michael_Adkins: The YAML is loaded using Pydantic models, which infer the type based on the keys.
So if you did cron: string-here instead of interval: integer, it'd be loaded as a cron schedule.
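To make the key-based inference concrete, here is a minimal standard-library sketch of the idea. The IntervalSchedule and CronSchedule classes and the load_schedule helper below are hypothetical stand-ins written for illustration; they are not Prefect's or Pydantic's actual implementation, which may differ in detail.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Union


@dataclass
class IntervalSchedule:
    # Hypothetical stand-in, not Prefect's actual class definition.
    interval: timedelta


@dataclass
class CronSchedule:
    # Hypothetical stand-in, not Prefect's actual class definition.
    cron: str


def load_schedule(spec: dict) -> Union[IntervalSchedule, CronSchedule]:
    """Pick a schedule type from the keys of a parsed YAML mapping."""
    if "cron" in spec:
        return CronSchedule(cron=str(spec["cron"]))
    if "interval" in spec:
        # A bare number is treated as a count of seconds.
        return IntervalSchedule(interval=timedelta(seconds=spec["interval"]))
    raise ValueError(f"cannot infer schedule type from keys: {sorted(spec)}")


# The mapping under `schedule:` in the YAML decides which type is built.
hourly = load_schedule({"interval": 3600})
on_the_hour = load_schedule({"cron": "0 * * * *"})
```

Under this reading, swapping the interval key for a cron key in the YAML is all it takes to change the schedule type.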
Per the Pydantic documentation, you can provide richer strings for intervals, not just a number of seconds:
timedelta fields can be:
- timedelta, existing timedelta object
- int or float, assumed as seconds
- str, following formats work:
  - [-][DD ][HH:MM]SS[.ffffff]
  - [±]P[DD]DT[HH]H[MM]M[SS]S (ISO 8601 format for timedelta)
See https://pydantic-docs.helpmanual.io/usage/types/#datetime-types
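As a rough illustration of what those accepted forms mean, the sketch below converts a few of them to timedelta using only the standard library. The to_timedelta helper is hypothetical and deliberately simplified: it handles numbers and a days/hours/minutes/seconds subset of the ISO 8601 form, and it omits the [-][DD ][HH:MM]SS[.ffffff] form. It is not Pydantic's parser. By the reasoning above, an interval such as "PT1H" in the YAML would presumably load as one hour, though that is not verified here against Prefect.

```python
import re
from datetime import timedelta

# Simplified ISO 8601 duration: optional sign, then days/hours/minutes/seconds.
_ISO_DURATION = re.compile(
    r"^(?P<sign>[-+]?)P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def to_timedelta(value) -> timedelta:
    """Hypothetical helper: coerce a number or ISO 8601 string to timedelta."""
    if isinstance(value, timedelta):
        return value
    if isinstance(value, (int, float)):
        # Plain numbers are assumed to be seconds.
        return timedelta(seconds=value)
    if not isinstance(value, str):
        raise TypeError(f"unsupported type: {type(value).__name__}")
    match = _ISO_DURATION.match(value)
    if match is None or value.lstrip("+-") in ("P", "PT"):
        raise ValueError(f"not a supported duration: {value!r}")
    # Keep only the components that were actually present in the string.
    parts = {
        name: float(text)
        for name, text in match.groupdict().items()
        if name != "sign" and text
    }
    delta = timedelta(**parts)
    return -delta if match["sign"] == "-" else delta


one_hour = to_timedelta("PT1H")
day_and_a_half = to_timedelta("P1DT12H")
```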
cc @terrence this is a good nugget
@terrence: Good note