databuild/design/glossary.md


# `Job`
Atomic unit of work, producing and consuming specific partitions. See [jobs](./core-build.md#jobs).

# `Graph`
Composes [jobs](#job) to build partitions. See [graphs](./core-build.md#graphs).

# `Partition`
Partitions are atomic units of data, produced and depended on by jobs. A job can produce multiple partitions, but
multiple jobs cannot produce the same partition, i.e. each partition must map to exactly one (canonical) producing job.
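
The uniqueness rule can be checked mechanically. The following is a minimal sketch, not databuild's actual
validation: it assumes job outputs are available as a plain mapping from job name to partition ref strings, and the
function name `find_conflicting_partitions` is hypothetical.

```python
from collections import defaultdict


def find_conflicting_partitions(job_outputs: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return every partition claimed by more than one job, mapped to the sorted job names.

    `job_outputs` (hypothetical shape) maps a job name to the partition refs it produces.
    An empty result means the partition -> job mapping is unique.
    """
    producers: dict[str, set[str]] = defaultdict(set)
    for job, partitions in job_outputs.items():
        for partition in partitions:
            producers[partition].add(job)
    return {p: sorted(jobs) for p, jobs in producers.items() if len(jobs) > 1}
```

One job listing several partitions is fine; a conflict is reported only when the same ref appears under two
different jobs.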

# `PartitionRef`
PartitionRefs are strings that uniquely identify partitions. They can contain anything, but generally they are S3
URIs, like `s3://companybkt/datasets/foo/date=2025-01-01`, or custom formats like
`dal://prod/clicks/region=4/date=2025-01-01/`. PartitionRefs are used as dependency signals during
[task graph analysis](./core-build.md#graphanalyze). To make this coupling explicit and ergonomic,
[GSLs](#graph-specification-language-gsl) generally provide helper classes for creating and parsing PartitionRefs and
for accessing their fields.
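
As an illustration of such a helper, here is a minimal sketch for the `dal://.../clicks/...` format shown above. The
class name `ClicksPartitionRef`, its field names, and the regular expression are assumptions made for this example;
they are not part of any databuild GSL.

```python
import re
from dataclasses import dataclass

# Hypothetical layout, taken from the example ref: dal://<env>/clicks/region=<int>/date=<YYYY-MM-DD>/
_CLICKS_REF = re.compile(
    r"dal://(?P<env>[^/]+)/clicks/region=(?P<region>\d+)/date=(?P<date>\d{4}-\d{2}-\d{2})/?"
)


@dataclass(frozen=True)
class ClicksPartitionRef:
    """Typed view over a clicks partition ref (illustrative only)."""

    env: str
    region: int
    date: str

    def to_ref(self) -> str:
        """Render the canonical string form, including the trailing slash."""
        return f"dal://{self.env}/clicks/region={self.region}/date={self.date}/"

    @classmethod
    def parse(cls, ref: str) -> "ClicksPartitionRef":
        """Parse a ref string, raising ValueError if it does not match the expected layout."""
        match = _CLICKS_REF.fullmatch(ref)
        if match is None:
            raise ValueError(f"not a clicks partition ref: {ref!r}")
        return cls(env=match["env"], region=int(match["region"]), date=match["date"])
```

With a helper like this, jobs can construct and inspect refs through named fields instead of ad hoc string
formatting, so a change to the ref layout is made in one place.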

# `PartitionPattern`
Patterns that group partitions (e.g. into a dataset) and allow for validation (e.g. does this job actually produce
the expected output partition?).
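
The glossary does not fix a pattern syntax, so the following sketch simply assumes that a pattern is a regular
expression that must match a whole ref. The function name `unmatched_outputs` is hypothetical.

```python
import re


def unmatched_outputs(pattern: str, produced_refs: list[str]) -> list[str]:
    """Return the produced refs that do NOT fully match `pattern`.

    Assumes (for illustration) that a partition pattern is a regular expression.
    An empty result means every output belongs to the expected group.
    """
    compiled = re.compile(pattern)
    return [ref for ref in produced_refs if compiled.fullmatch(ref) is None]
```

For example, the pattern `s3://companybkt/datasets/foo/date=\d{4}-\d{2}-\d{2}` groups all daily partitions of the
`foo` dataset, and any output outside that dataset would be returned as unmatched.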

# `JobConfig`
The complete configuration of a job needed to produce the desired partitions, as calculated by
[`job.config`](./core-build.md#jobconfig).

# `JobGraph`
A complete graph of job configs, with [`PartitionRef`](#partitionref) dependency edges, which when executed will
produce the requested partitions.
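
To make the structure concrete, here is a minimal sketch of how dependency edges could be derived from partition refs
and turned into an execution order. `JobConfigSketch` and its fields are assumptions for this example and do not
reflect databuild's real `JobConfig` schema; inputs that no job in the graph produces are assumed to exist already.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class JobConfigSketch:
    """Hypothetical, simplified job config: a job name plus the refs it reads and writes."""

    job: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]


def execution_order(configs: list[JobConfigSketch]) -> list[str]:
    """Order jobs so that every job runs after the jobs producing its input partitions."""
    # Each partition has exactly one producer, so a plain dict is enough here.
    producer = {ref: config.job for config in configs for ref in config.outputs}
    sorter: TopologicalSorter = TopologicalSorter()
    for config in configs:
        # An edge exists wherever an input ref is another job's output ref.
        upstream = {producer[ref] for ref in config.inputs if ref in producer}
        sorter.add(config.job, *upstream)
    # static_order() raises graphlib.CycleError if the dependencies are cyclic.
    return list(sorter.static_order())
```

The edges are never declared directly: they fall out of matching one job's input refs against another job's output
refs, which is why the refs act as the dependency signal.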

# Graph Specification Language (GSL)
Language-specific libraries that make implementing databuild graphs and jobs more succinct and ergonomic.
See [graph specification](./graph-specification.md).