databuild/docs/design/glossary.md
Stuart Axelbrooke ea83610d35
2025-09-27 15:29:22 -07:00

# `Job`
Atomic unit of work, producing and consuming specific partitions. See [jobs](core-build.md#jobs).
# `Graph`
Composes [jobs](#job) to build partitions. See [graphs](core-build.md#graphs).
# `Partition`
Partitions are atomic units of data, produced and depended on by jobs. A job can produce multiple partitions, but
multiple jobs cannot produce the same partition, i.e. each partition has exactly one producing job, so the
job -> partition relationship is unique/canonical.
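
The uniqueness rule above can be checked mechanically. The following is a minimal sketch, not databuild's implementation: the function name `find_conflicting_producers`, the job names, and the dict-of-lists input shape are all assumptions made for illustration.

```python
from collections import defaultdict


def find_conflicting_producers(job_outputs: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return every partition claimed by more than one job, mapped to the claiming jobs.

    `job_outputs` maps a (hypothetical) job name to the partition refs it produces.
    """
    producers: dict[str, list[str]] = defaultdict(list)
    for job, partitions in job_outputs.items():
        for partition in partitions:
            producers[partition].append(job)
    # A partition is canonical only if exactly one job produces it.
    return {p: sorted(jobs) for p, jobs in producers.items() if len(jobs) > 1}


# Hypothetical job names; the refs follow the S3 style used in this glossary.
job_outputs = {
    "ingest_clicks": ["s3://companybkt/datasets/clicks/date=2025-01-01"],
    "daily_rollup": [
        "s3://companybkt/datasets/rollup/date=2025-01-01",
        "s3://companybkt/datasets/clicks/date=2025-01-01",  # also claimed by ingest_clicks
    ],
}
conflicts = find_conflicting_producers(job_outputs)
```

An empty result means every partition has a single canonical producer. Any entry names the jobs that need to be reconciled.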
# `PartitionRef`
PartitionRefs are strings that uniquely identify partitions. They can contain anything, but generally they are S3
URIs, like `s3://companybkt/datasets/foo/date=2025-01-01`, or custom formats like
`dal://prod/clicks/region=4/date=2025-01-01/`. PartitionRefs are used as dependency signals during
[task graph analysis](core-build.md#graphanalyze). To make coupling explicit and improve ergonomics,
[GDLs](#graph-definition-language-gdl) generally provide helper classes for creating, parsing, and accessing the fields
of PartitionRefs.
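
As an illustration of such a helper class, here is a minimal sketch for the `dal://` example format above. The class name `ClicksPartition`, its field names, and its methods are hypothetical and not part of any databuild GDL. The sketch only assumes the `dal://<env>/clicks/region=<n>/date=<YYYY-MM-DD>/` layout shown in the example ref.

```python
from dataclasses import dataclass
from datetime import date

DAL_PREFIX = "dal://"


@dataclass(frozen=True)
class ClicksPartition:
    """Hypothetical typed view of a `dal://` clicks PartitionRef."""

    env: str
    region: int
    day: date

    def to_ref(self) -> str:
        """Render the fields back into the PartitionRef string."""
        return f"{DAL_PREFIX}{self.env}/clicks/region={self.region}/date={self.day.isoformat()}/"

    @classmethod
    def parse(cls, ref: str) -> "ClicksPartition":
        """Parse a PartitionRef string, raising ValueError if it is not a clicks ref."""
        if not ref.startswith(DAL_PREFIX):
            raise ValueError(f"not a dal:// ref: {ref!r}")
        env, dataset, *fields = ref[len(DAL_PREFIX):].strip("/").split("/")
        if dataset != "clicks":
            raise ValueError(f"not a clicks partition: {ref!r}")
        # Each remaining path segment is a key=value pair.
        values = dict(field.split("=", 1) for field in fields)
        return cls(
            env=env,
            region=int(values["region"]),
            day=date.fromisoformat(values["date"]),
        )


partition = ClicksPartition.parse("dal://prod/clicks/region=4/date=2025-01-01/")
round_tripped = partition.to_ref()
```

Keeping creation and parsing in one class means jobs exchange typed fields rather than hand-built strings, while the string form remains the dependency signal.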
# `PartitionPattern`
Patterns that group partitions (e.g. a dataset) and allow for validation (e.g. does this job actually produce the
expected output partition?).
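
A rough sketch of both uses, grouping and validation, follows. It assumes a pattern can be modeled as a regular expression over PartitionRef strings. This glossary does not specify databuild's actual pattern syntax, so the class below and its method names are purely illustrative.

```python
import re


class PartitionPattern:
    """Hypothetical pattern that groups PartitionRefs by regular expression."""

    def __init__(self, pattern: str) -> None:
        self._regex = re.compile(pattern)

    def matches(self, ref: str) -> bool:
        """True if the whole ref belongs to the group this pattern describes."""
        return self._regex.fullmatch(ref) is not None

    def unexpected(self, refs: list[str]) -> list[str]:
        """Return the refs that fall outside the pattern, e.g. stray job outputs."""
        return [ref for ref in refs if not self.matches(ref)]


# Groups every daily partition of the `foo` dataset from the PartitionRef example.
foo_dataset = PartitionPattern(r"s3://companybkt/datasets/foo/date=\d{4}-\d{2}-\d{2}")

produced = [
    "s3://companybkt/datasets/foo/date=2025-01-01",
    "s3://companybkt/datasets/bar/date=2025-01-01",  # not part of the foo dataset
]
stray = foo_dataset.unexpected(produced)
```

A job's declared outputs pass validation when `unexpected` returns an empty list.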
# Graph Definition Language (GDL)
Language-specific libraries that make implementing databuild graphs and jobs more succinct and ergonomic.
See [graph specification](graph-specification.md).