# Core Build
Purpose: Enable continuous reconciliation of partition wants through distributed job execution.

## Architecture
DataBuild uses a want-driven reconciliation model inspired by Kubernetes. Users declare wants (desired partitions), and the system continuously attempts to satisfy them through job execution.

### Key Components
- [**Wants**](wants.md): Declarations of desired partitions with TTLs and SLAs
- **Jobs**: Stateless executables that transform input partitions into output partitions
- **Graph**: Reconciliation runtime that monitors wants and dispatches jobs
- [**Build Event Log (BEL)**](build-event-log.md): Event-sourced ledger of all system activity
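
To make the relationship between wants and the BEL concrete, here is a minimal Python sketch. The field names, the two event kinds, and the `BuildEventLog` class are assumptions made for illustration, not DataBuild's schema, and SLAs are omitted for brevity. It shows one idea only: the set of active wants is derived by replaying an append-only log rather than stored as mutable state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Want:
    """A declared desire for one partition (hypothetical fields)."""
    want_id: str
    partition_ref: str
    created_at: float     # seconds since epoch
    ttl_seconds: float    # the want expires this long after creation


@dataclass(frozen=True)
class Event:
    """One immutable ledger entry; the `kind` values are illustrative."""
    kind: str                              # "want_created" or "partition_available"
    want: Optional[Want] = None            # set for "want_created"
    partition_ref: Optional[str] = None    # set for "partition_available"


@dataclass
class BuildEventLog:
    """Append-only ledger; current state is computed by replaying events."""
    events: list = field(default_factory=list)

    def append(self, event: Event) -> None:
        self.events.append(event)

    def active_wants(self, now: float) -> list:
        # A want is active if its partition is not yet available
        # and its TTL has not elapsed.
        available = {e.partition_ref for e in self.events
                     if e.kind == "partition_available"}
        return [e.want for e in self.events
                if e.kind == "want_created"
                and e.want.partition_ref not in available
                and now < e.want.created_at + e.want.ttl_seconds]
```

Because nothing is ever updated in place, satisfying a want is just another appended event, and the same replay answers "what is still wanted?" at any point in time.
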

## Reconciliation Loop
The graph continuously:
1. Scans active wants from the BEL
2. Groups wants by responsible job (via graph lookup)
3. Dispatches jobs to build wanted partitions
4. Handles job results:
   - **Success**: Marks partitions available
   - **Missing Dependencies**: Creates wants for the missing dependencies, tagged with a traceable ID
   - **Failure**: May retry, depending on the job's retry strategy
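
The steps above can be sketched as a single pass in Python. Everything named here (`reconcile_once`, `JobResult`, the status strings, and the child want ID format) is hypothetical. The sketch also attributes child wants to the first want in a batch, which is a simplification rather than a documented behavior.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class JobResult:
    status: str                                  # "success", "missing_deps", or "failure"
    missing: list = field(default_factory=list)  # upstream partition refs, if any


def reconcile_once(wants, job_for, run_job, available):
    """Run one pass of the loop.

    wants:     dict of want_id -> partition ref (the active wants, step 1)
    job_for:   callable mapping a partition ref to a job name (graph lookup)
    run_job:   callable (job_name, partition_refs) -> JobResult
    available: set of built partition refs; updated in place
    Returns new child wants as {child_want_id: (partition_ref, parent_want_id)}.
    """
    # Step 2: group unsatisfied wants by the job responsible for them.
    by_job = defaultdict(list)
    for want_id, ref in wants.items():
        if ref not in available:
            by_job[job_for(ref)].append((want_id, ref))

    new_wants = {}
    for job_name, items in by_job.items():
        refs = [ref for _, ref in items]
        result = run_job(job_name, refs)         # Step 3: dispatch
        if result.status == "success":           # Step 4: handle the result
            available.update(refs)
        elif result.status == "missing_deps":
            parent_id = items[0][0]              # simplification: first want in the batch
            for dep in result.missing:
                new_wants[f"{parent_id}/{dep}"] = (dep, parent_id)
        # "failure": the want stays unsatisfied, so a later pass can retry it
    return new_wants
```

Calling `reconcile_once` repeatedly, merging the returned child wants into `wants` each time, converges once every wanted partition is in `available`.
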

## Jobs
Jobs are stateless executables with a single `exec` entrypoint. When invoked with the requested partitions as arguments, a job does one of the following:
- Successfully produces the partitions
- Fails with a missing-dependency error that lists the required upstream partitions
- Fails with another error, which may be retried

Jobs declare execution preferences (batching, concurrency) as metadata, but contain no orchestration logic.
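
A job following this contract might look like the sketch below. The source does not specify how a job signals missing dependencies, so the exit code `2`, the JSON payload shape, and the `upstream_of` naming rule are assumptions for illustration only.

```python
import json
import sys

MISSING_DEPS_EXIT = 2  # hypothetical exit code reserved for missing dependencies


def upstream_of(partition_ref):
    # Illustrative naming rule: "report/2024-01-01" depends on "raw/2024-01-01".
    return "raw/" + partition_ref.split("/", 1)[1]


def exec_job(partition_refs, existing, out=sys.stdout):
    """Try to build every requested partition and return a process exit code.

    `existing` is a set of partition refs standing in for real storage.
    Any unhandled exception plays the role of an "other" failure.
    """
    missing = sorted({upstream_of(p) for p in partition_refs} - existing)
    if missing:
        # Report what is needed and stop. Deciding what to build next
        # is left to the graph, not to the job.
        json.dump({"missing_deps": missing}, out)
        return MISSING_DEPS_EXIT
    for ref in partition_refs:
        existing.add(ref)  # stand-in for actually writing the partition
    return 0
```

The job never schedules its own upstream work. It only reports what it needs, which is what keeps orchestration logic out of job code.
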

## Want Propagation
When a job reports missing dependencies, the graph:
1. Parses the error for partition refs
2. Creates child wants (linked via `parent_want_id`)
3. Continues reconciliation with the expanded want set

This creates want chains that naturally traverse the dependency graph without upfront planning.
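
The propagation steps can be illustrated with a short Python sketch. Only `parent_want_id` comes from the text above. The error message format, the `LinkedWant` class, the `root_want_id` field name, and the use of random UUIDs for IDs are assumptions.

```python
import re
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LinkedWant:
    want_id: str
    partition_ref: str
    parent_want_id: Optional[str] = None  # None for a user-declared want
    root_want_id: Optional[str] = None    # None for a user-declared want


# Hypothetical error format: "missing dependency: <ref>[, <ref>...]"
MISSING_RE = re.compile(r"missing dependenc(?:y|ies):\s*(.+)")


def parse_missing_refs(error_text):
    """Step 1: extract partition refs from a missing-dependency error."""
    match = MISSING_RE.search(error_text)
    if match is None:
        return []
    return [ref.strip() for ref in match.group(1).split(",") if ref.strip()]


def child_wants(parent, error_text):
    """Step 2: create one child want per missing ref, linked to its parent."""
    root = parent.root_want_id or parent.want_id
    return [LinkedWant(want_id=uuid.uuid4().hex, partition_ref=ref,
                       parent_want_id=parent.want_id, root_want_id=root)
            for ref in parse_missing_refs(error_text)]
```

Each child records both its immediate parent and the original user-declared want, so any want in a chain can be traced back to the request that caused it.
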

## Correctness Strategy
- **Idempotency**: Jobs must produce identical outputs given the same inputs
- **Atomicity**: Partitions are either complete or absent
- **Want chains**: Full traceability via parent/root want IDs
- **Event sourcing**: All state changes are recorded in the BEL
- **Protobuf interface**: All build actions conform to the structs and interfaces defined by [`databuild/databuild.proto`](../databuild/databuild.proto)

The system achieves correctness through convergence rather than planning: it continuously reconciles until wants are satisfied or expired.
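
As one illustration of the "complete or absent" property, a job writing to a local filesystem could use the common write-then-rename technique sketched below. This is an assumption about how a job might satisfy the requirement, not a description of how DataBuild or its jobs store partitions. The function name and path layout are hypothetical.

```python
import os
import tempfile


def write_partition_atomically(path, data):
    """Write `data` (bytes) so that `path` is either complete or absent."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # Write to a temporary file in the same directory, so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace is an atomic rename on POSIX systems.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, remove the partial file so nothing half-written remains.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Re-running the same write with the same bytes leaves the same file in place, so a retried job does not change the result, which is the behavior the idempotency requirement asks for.
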