@Alexander_Butler: I am working on standing up Prefect 2.0 in a production environment. It's for internal data pipeline and reverse ETL uses, so no fire hazards on my end from using 2.0 early here.
Is there a general preference on YAML vs. code for the deployment specification? I noticed you can configure a flow deployment with YAML, but I can't find any information on the schema of that document. For example:
- name: elt-salesforce
  flow_location: ./salesforce_flows.py
  flow_name: elt-salesforce
  tags:
    - salesforce
    - core
  parameters:
    destination: "gcp"
  schedule:
    interval: 3600
Assuming interval is seconds? Can I specify another grain? Can schedule take a dict? If it takes cron, does that take a dict?
Honestly schedule is the primary question point. Everything else is straightforward enough.
@Kevin_Kho: Hi @Alexander_Butler, I’d need to check with the team tomorrow about this and get back to you.
@Anna_Geller: Good choice starting with 2.0 directly!
I lean towards defining deployments in Python, but YAML is also supported. The Python definition is friendlier and cleaner.
Here is one example of using YAML:
- name: crypto_prices_etl_dev
  flow_location: /Users/anna/repos/gitops-orion-flows/flows/crypto_prices_etl.py
  flow_name: crypto_prices_etl
  tags:
    - dev
  schedule:
    interval: 3600
- name: repo_trending_check_prefect_dev
  flow_location: ./flows/repo_trending_check.py
  flow_name: repo_trending_check
  tags:
    - dev
  parameters:
    repo: "prefect"
  schedule:
    interval: 3600
- name: repo_trending_check_orion_dev
  flow_location: /Users/anna/repos/gitops-orion-flows/flows/repo_trending_check.py
  flow_name: repo_trending_check
  tags:
    - dev
  parameters:
    repo: "prefect"
  schedule:
    interval: 60
I believe the same definition via Python DeploymentSpec is much cleaner and easier to understand and change, but YAML is also fine.
@Alexander_Butler: I like Python too. I think the ambiguous bit is whether schedule supports cron, different kwargs for interval, or a different time grain.
@Michael_Adkins: The YAML is loaded using Pydantic models, which infer the type based on the keys.
So if you did "cron: string-here" instead of "interval: integer", it’d be loaded as a cron schedule.
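To make the key-based inference concrete, here is a minimal sketch in plain Pydantic. The IntervalSchedule, CronSchedule, and Deployment classes below are hypothetical stand-ins written for illustration, not Prefect's actual schemas, so treat this as an approximation of the behavior described above rather than the real loader.

```python
from datetime import timedelta
from typing import Union

from pydantic import BaseModel


# Hypothetical stand-ins for the schedule schemas (not Prefect's real classes).
class IntervalSchedule(BaseModel):
    interval: timedelta


class CronSchedule(BaseModel):
    cron: str


class Deployment(BaseModel):
    name: str
    # Pydantic keeps whichever Union member validates against the given keys.
    schedule: Union[IntervalSchedule, CronSchedule]


# These dicts mirror what a YAML loader would hand over for the schedule key.
interval_dep = Deployment(name="a", schedule={"interval": 3600})
cron_dep = Deployment(name="b", schedule={"cron": "0 * * * *"})

print(type(interval_dep.schedule).__name__)  # IntervalSchedule
print(type(cron_dep.schedule).__name__)      # CronSchedule
```

Because each stand-in model has a different required field, only one of them can validate a given dict, which is what lets the key name select the schedule type.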
From the Pydantic documentation, you can provide richer strings for intervals, so you are not limited to a number of seconds:
timedelta fields can be:
- timedelta, existing timedelta object
- int or float, assumed as seconds
- str, following formats work:
  - [-][DD ][HH:MM]SS[.ffffff]
  - [±]P[DD]DT[HH]H[MM]M[SS]S (ISO 8601 format for timedelta)
Field Types - pydantic
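As a quick check of those timedelta rules, the sketch below parses the same one-hour interval from an int, an ISO 8601 string, and an existing timedelta. The IntervalSchedule model here is a hypothetical one-field example, not Prefect's own class, and only the int and ISO 8601 forms are shown since those are the ones I'm confident behave the same across Pydantic versions.

```python
from datetime import timedelta

from pydantic import BaseModel


# Hypothetical one-field model used only to exercise timedelta parsing.
class IntervalSchedule(BaseModel):
    interval: timedelta


from_int = IntervalSchedule(interval=3600)                # int, assumed seconds
from_iso = IntervalSchedule(interval="PT1H")              # ISO 8601: one hour
from_obj = IntervalSchedule(interval=timedelta(hours=1))  # existing timedelta
daily = IntervalSchedule(interval="P1D")                  # ISO 8601: one day

print(from_int.interval == from_iso.interval == from_obj.interval)  # True
print(daily.interval)  # 1 day, 0:00:00
```

In YAML terms, this suggests that "interval: PT1H" would be read the same way as "interval: 3600", assuming the deployment schema declares the field as a timedelta.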
cc @terrence this is a good nugget
@terrence: Good note