- Airflow and Luigi are the OG data orchestrators that inspired databuild
- Airflow uses explicit declaration of DAG structure (see the first sketch below)
- Luigi uses implicit, discovered DAG structure (see the second sketch below)
- Both use DAG runs as a top-level unit of execution
  - This is nice because you can see what's going to happen once the DAG run has launched
  - This is not nice because you have to deal with mid-execution DAG runs during deployments - what do you do?
    - Do you terminate existing DAG runs and retrigger? (What if the workload is stateful? Don't do that!)
    - Do you let existing DAG runs finish?
    - How do you deal with DAG run identity under a changing DAG definition?
- These questions are all red herrings. We don't care about the DAG definition - we care about the data we want to produce.
- We should instead declare what partitions we want, and iteratively propagate that demand to upstream partitions (see the last sketch below)
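
For contrast, here is a minimal Airflow-style DAG with explicitly declared structure; the pipeline name, tasks, and callables are made up for illustration:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Tasks and edges are declared up front; Airflow materializes this
# exact graph for every DAG run.
with DAG(
    dag_id="daily_sales",  # hypothetical pipeline
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=lambda: print("extract"))
    transform = PythonOperator(task_id="transform", python_callable=lambda: print("transform"))
    load = PythonOperator(task_id="load", python_callable=lambda: print("load"))

    # The structure is explicit: you write the edges yourself.
    extract >> transform >> load
```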
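
And the same shape in Luigi, where no global DAG is declared: the graph is discovered by recursively walking requires() from whatever task you ask for. The task names and file paths here are hypothetical:

```python
import datetime

import luigi


class Extract(luigi.Task):
    date = luigi.DateParameter()

    def output(self):
        return luigi.LocalTarget(f"data/extract_{self.date}.csv")  # hypothetical path

    def run(self):
        with self.output().open("w") as f:
            f.write("raw data\n")


class Transform(luigi.Task):
    date = luigi.DateParameter()

    def requires(self):
        # The DAG is implicit: Luigi discovers it by calling
        # requires() on the task you asked for, then on its deps.
        return Extract(date=self.date)

    def output(self):
        return luigi.LocalTarget(f"data/transform_{self.date}.csv")

    def run(self):
        with self.input().open() as src, self.output().open("w") as dst:
            dst.write(src.read().upper())


if __name__ == "__main__":
    luigi.build([Transform(date=datetime.date(2024, 1, 1))], local_scheduler=True)
```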
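
Last, a minimal sketch of what declaring partitions could look like: propagate demand from the partitions you want to the missing upstream ones, assuming a static mapping from each partition to its inputs. None of this is databuild's actual API; the names and data are illustrative:

```python
# Hypothetical dependency rules: each partition lists the upstream
# partitions it is built from.
UPSTREAM = {
    "sales_report/2024-01-01": ["clean_sales/2024-01-01"],
    "clean_sales/2024-01-01": ["raw_sales/2024-01-01"],
    "raw_sales/2024-01-01": [],  # source data, nothing upstream
}


def plan(wanted: list[str], exists: set[str]) -> list[str]:
    """Propagate demand from wanted partitions to the missing upstream
    ones, returning them in build order (dependencies first)."""
    order: list[str] = []
    seen: set[str] = set()

    def visit(part: str) -> None:
        if part in seen or part in exists:
            return  # already planned, or already materialized
        seen.add(part)
        for dep in UPSTREAM.get(part, []):
            visit(dep)  # demand flows upstream before this partition is scheduled
        order.append(part)

    for part in wanted:
        visit(part)
    return order


# Declare the data we want; the plan falls out of the partition graph.
print(plan(["sales_report/2024-01-01"], exists={"raw_sales/2024-01-01"}))
# ['clean_sales/2024-01-01', 'sales_report/2024-01-01']
```

Because the unit of work is a partition rather than a DAG run, a deploy that changes the rules simply changes future plans; the deployment questions above never come up.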