Improve README header

Stuart Axelbrooke 2025-08-01 21:58:38 -07:00
parent 70e34c4fa5
commit 3a9fd6a800

@@ -5,7 +5,15 @@ DataBuild is a trivially-deployable, partition-oriented, declarative data build
DataBuild is for teams at data-driven orgs who need reliable, flexible, and correct data pipelines and are tired of manually orchestrating complex dependency graphs. You define Jobs (that take input data partitions and produce output partitions), compose them into Graphs (partition dependency networks), and DataBuild handles the rest. Just ask it to build a partition, and DataBuild handles resolving the jobs that need to run, planning execution order, running builds concurrently, and tracking and exposing build progress. Instead of writing orchestration code that breaks when dependencies change, you focus on the data transformations while DataBuild ensures your pipelines are correct, observable, and reliable.
-For important context, check out [DESIGN.md](./DESIGN.md). Also, check out [`databuild.proto`](./databuild/databuild.proto) for key system interfaces.
+For important context, check out [DESIGN.md](./DESIGN.md). Also, check out [`databuild.proto`](./databuild/databuild.proto) for key system interfaces. Key features:
+- **Declarative dependencies** - Ask for data, get data. Define partition dependencies and DataBuild automatically plans what jobs to run and when.
+- **Partition-first design** - Build only what's needed. Late data arrivals and partial rebuilds work seamlessly with atomic data partitions.
+- **Deploy anywhere** - One binary, any platform. Bazel-based builds create hermetic applications that run locally, in containers, or in the cloud.
+- **Concurrent by design** - Multiple teams, zero conflicts. Event-sourced coordination enables parallel builds without stepping on each other.
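
Reviewer note: the "ask for a partition, get a plan" idea described in the README text can be illustrated with a small sketch. This is not DataBuild's API; the `JOBS` table, the partition names, and the `plan` function are all hypothetical, and the sketch only shows the general technique of walking partition dependencies and ordering the required jobs topologically.

```python
from graphlib import TopologicalSorter

# Hypothetical job table (not DataBuild's real interface):
# output partition -> (job name, input partitions it reads).
JOBS = {
    "reports/2025-08-01": ("build_report", ["clean/2025-08-01"]),
    "clean/2025-08-01": ("clean_events", ["raw/2025-08-01"]),
}


def plan(target, jobs, existing):
    """Return (job, partition) pairs needed to build `target`, in dependency order.

    Partitions listed in `existing` are treated as already built and are skipped.
    """
    graph = {}  # partition -> missing input partitions that must be built first
    stack = [target]
    while stack:
        part = stack.pop()
        if part in existing or part in graph:
            continue
        if part not in jobs:
            raise KeyError(f"no job produces partition {part!r}")
        _, inputs = jobs[part]
        missing = [p for p in inputs if p not in existing]
        graph[part] = missing
        stack.extend(missing)
    # static_order() yields each partition after all of its prerequisites.
    order = TopologicalSorter(graph).static_order()
    return [(jobs[p][0], p) for p in order]


if __name__ == "__main__":
    for job, partition in plan("reports/2025-08-01", JOBS, existing={"raw/2025-08-01"}):
        print(f"run {job} -> {partition}")
```

Only partitions that are missing end up in the plan, which is the "build only what's needed" behaviour the feature list describes; a real system would also need to handle concurrency and failure, which this sketch leaves out.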
## Usage