Core Build
Purpose: Enable continuous reconciliation of partition wants through distributed job execution.
Architecture
DataBuild uses a want-driven reconciliation model inspired by Kubernetes. Users declare wants (desired partitions), and the system continuously attempts to satisfy them through job execution.
Key Components
- Wants: Declarations of desired partitions with TTLs and SLAs
- Jobs: Stateless executables that transform input partitions to outputs
- Graph: Reconciliation runtime that monitors wants and dispatches jobs
- Build Event Log (BEL): Event-sourced ledger of all system activity
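To make the relationship between wants and the BEL concrete, here is a minimal sketch of an append-only log from which the set of active wants is derived by replay. The `Event` fields, the event kinds, and the `active_wants` helper are assumptions for illustration only; the actual message shapes are whatever databuild/databuild.proto defines.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape, not the real DataBuild schema.
    kind: str                 # "want_created" or "partition_available"
    partition: str
    timestamp: float
    ttl_seconds: float = 0.0  # only meaningful for "want_created"


@dataclass
class BuildEventLog:
    events: list = field(default_factory=list)

    def append(self, event: Event) -> None:
        # Events are only ever appended, never edited or removed.
        self.events.append(event)

    def active_wants(self, now: float) -> set:
        """Replay the log: wanted partitions that are unexpired and not yet built."""
        expires_at = {}
        available = set()
        for e in self.events:
            if e.kind == "want_created":
                expires_at[e.partition] = e.timestamp + e.ttl_seconds
            elif e.kind == "partition_available":
                available.add(e.partition)
        return {p for p, t in expires_at.items() if t > now and p not in available}


log = BuildEventLog()
log.append(Event("want_created", "sales/2024-01-01", timestamp=0.0, ttl_seconds=3600.0))
log.append(Event("want_created", "sales/2024-01-02", timestamp=0.0, ttl_seconds=60.0))
log.append(Event("partition_available", "sales/2024-01-01", timestamp=10.0))

active_at_30 = log.active_wants(now=30.0)    # one want built, one still pending
active_at_120 = log.active_wants(now=120.0)  # the pending want has expired
```

Because current state is computed from the log rather than stored separately, the same replay can answer "what was wanted at time T" for any T.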
Reconciliation Loop
The graph continuously:
- Scans active wants from the BEL
- Groups wants by responsible job (via graph lookup)
- Dispatches jobs to build wanted partitions
- Handles job results:
  - Success: Marks partitions available
  - Missing Dependencies: Creates wants for the missing deps, each with a traceable ID
  - Failure: May retry, depending on the job's retry strategy
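The steps of the loop can be sketched as a single reconciliation pass that is repeated until nothing is left to build. This is an illustrative toy, not DataBuild's implementation: `JobResult`, `reconcile_once`, `job_for`, and `run_job` are hypothetical names, and the in-memory sets stand in for state that would really come from the BEL.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class JobResult:
    status: str                                   # "success", "missing_deps", or "failure"
    missing: list = field(default_factory=list)   # upstream refs, if any


def reconcile_once(wants, job_for, run_job, available):
    """One pass: group unsatisfied wants by job, dispatch, handle results.

    Returns the set of new wants created for missing dependencies.
    """
    by_job = defaultdict(list)
    for partition in sorted(wants - available):
        by_job[job_for(partition)].append(partition)

    new_wants = set()
    for job_name, partitions in by_job.items():
        result = run_job(job_name, partitions)
        if result.status == "success":
            available.update(partitions)
        elif result.status == "missing_deps":
            new_wants.update(result.missing)
        # "failure": left unsatisfied; a retry strategy would decide what happens next.
    return new_wants


# Toy setup (assumed): the job name is the prefix before "/",
# and each "report/<x>" partition needs "raw/<x>".
available = set()


def job_for(partition):
    return partition.split("/")[0]


def run_job(job_name, partitions):
    if job_name == "report":
        deps = [p.replace("report/", "raw/") for p in partitions]
        missing = [d for d in deps if d not in available]
        if missing:
            return JobResult("missing_deps", missing)
    return JobResult("success")


wants = {"report/d1"}
passes = 0
while wants - available and passes < 10:
    wants |= reconcile_once(wants, job_for, run_job, available)
    passes += 1
```

In this toy run the first pass only discovers that `raw/d1` is needed; the second pass builds `raw/d1` and then `report/d1`, so the loop converges without any upfront plan.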
Jobs
Jobs are stateless executables with a single exec entrypoint. When invoked with the requested partitions as args, they do one of three things:
- Successfully produce the partitions
- Fail with a missing-dependency error listing the required upstream partitions
- Fail with some other error, which may be retried
Jobs declare execution preferences (batching, concurrency) as metadata, but contain no orchestration logic.
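A job along these lines might look like the sketch below. The exit codes, the JSON message format, and the `report/<date>` to `raw/<date>` dependency rule are all assumptions made for the example; the source does not say how a job signals each outcome.

```python
import datetime
import json
import sys

# Hypothetical exit codes, chosen only for this sketch.
EXIT_OK = 0
EXIT_FAILED = 1
EXIT_MISSING_DEPS = 2


def upstream_of(partition: str) -> list:
    # Toy dependency rule: each "report/<date>" needs "raw/<date>".
    _, date = partition.split("/", 1)
    return [f"raw/{date}"]


def build_partition(partition: str) -> None:
    # Placeholder transformation: only validates the date part of the ref.
    _, date = partition.split("/", 1)
    datetime.date.fromisoformat(date)  # raises ValueError on a malformed date


def exec_job(partitions: list, exists) -> tuple:
    """Build the requested partitions or report what is missing.

    `exists` tells the job whether an upstream partition is present.
    Returns (exit_code, message); nothing is remembered between calls.
    """
    missing = sorted({d for p in partitions for d in upstream_of(p) if not exists(d)})
    if missing:
        return EXIT_MISSING_DEPS, json.dumps({"missing": missing})
    try:
        for p in partitions:
            build_partition(p)
    except ValueError as err:
        return EXIT_FAILED, str(err)
    return EXIT_OK, json.dumps({"built": partitions})


def main(argv: list) -> int:
    # Pretend exactly one upstream partition already exists.
    present = {"raw/2024-01-01"}
    code, message = exec_job(argv, exists=lambda ref: ref in present)
    print(message, file=sys.stderr if code else sys.stdout)
    return code
```

A real script would end with `sys.exit(main(sys.argv[1:]))`. Note that the job only reports what it needs; deciding whether and when to build those upstream partitions is left entirely to the graph.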
Want Propagation
When jobs report missing dependencies, the graph:
- Parses the error for partition refs
- Creates child wants (linked via parent_want_id)
- Continues reconciliation with the expanded want set
This creates want chains that naturally traverse the dependency graph without upfront planning.
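As a rough illustration of how such a chain could be recorded and traced, the sketch below parses a made-up error format and links each child want to its parent and root. Only `parent_want_id` is named in the text above; the `Want` fields, the `missing dependency: <ref>` line format, and the helper functions are assumptions.

```python
import itertools
import re
from dataclasses import dataclass
from typing import Optional

_ids = itertools.count(1)

# Hypothetical error format: one "missing dependency: <partition ref>" per line.
MISSING_RE = re.compile(r"^missing dependency: (\S+)$", re.MULTILINE)


@dataclass(frozen=True)
class Want:
    want_id: int
    partition: str
    parent_want_id: Optional[int]
    root_want_id: int


def new_want(partition: str, parent: Optional[Want] = None) -> Want:
    want_id = next(_ids)
    if parent is None:
        # A user-declared want is its own root.
        return Want(want_id, partition, None, want_id)
    return Want(want_id, partition, parent.want_id, parent.root_want_id)


def propagate(parent: Want, error_text: str) -> list:
    """Create one child want per partition ref found in the job's error."""
    return [new_want(ref, parent) for ref in MISSING_RE.findall(error_text)]


def chain(want: Want, by_id: dict) -> list:
    """Walk parent links from a want back to the root it serves."""
    path = [want.partition]
    while want.parent_want_id is not None:
        want = by_id[want.parent_want_id]
        path.append(want.partition)
    return path


root = new_want("report/d1")
children = propagate(root, "missing dependency: clean/d1\nmissing dependency: dims/d1\n")
grandchildren = propagate(children[0], "missing dependency: raw/d1")

by_id = {w.want_id: w for w in [root, *children, *grandchildren]}
trace = chain(grandchildren[0], by_id)
```

Here `trace` runs from `raw/d1` through `clean/d1` back to `report/d1`, which is the kind of traceability the parent and root IDs are meant to provide.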
Correctness Strategy
- Idempotency: Jobs must produce identical outputs given the same inputs
- Atomicity: Partitions are either complete or absent
- Want chains: Full traceability via parent/root want IDs
- Event sourcing: All state changes recorded in BEL
- Protobuf interface: All build actions conform to the structs and interfaces defined in databuild/databuild.proto
The system achieves correctness through convergence rather than planning: it keeps reconciling until wants are satisfied or expired.