databuild/docs/design/core-build.md
Stuart Axelbrooke ea83610d35
A lot of refactoring
2025-09-27 15:29:22 -07:00


# Core Build
Purpose: Enable continuous reconciliation of partition wants through distributed job execution.
## Architecture
DataBuild uses a want-driven reconciliation model inspired by Kubernetes. Users declare wants (desired partitions), and the system continuously attempts to satisfy them through job execution.
### Key Components
- [**Wants**](wants.md): Declarations of desired partitions with TTLs and SLAs
- **Jobs**: Stateless executables that transform input partitions to outputs
- **Graph**: Reconciliation runtime that monitors wants and dispatches jobs
- [**Build Event Log (BEL)**](build-event-log.md): Event-sourced ledger of all system activity
## Reconciliation Loop
The graph continuously:
1. Scans active wants from the BEL
2. Groups wants by responsible job (via graph lookup)
3. Dispatches jobs to build wanted partitions
4. Handles job results:
   - **Success**: Marks partitions available
   - **Missing Dependencies**: Creates wants for the missing deps, linked to the originating want for traceability
   - **Failure**: Retries if the job's retry strategy allows it
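
A minimal sketch of one pass of this loop, in Python. It is illustrative only: the `Graph` class, the `(status, missing)` result tuple, and the `job/date` partition ref format are assumptions made for the example, and the BEL is replaced by in-memory sets.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical job result: (status, missing_partition_refs), where status is
# "success", "missing_deps", or "failure".
JobFn = Callable[[List[str]], Tuple[str, List[str]]]


@dataclass
class Graph:
    job_for: Callable[[str], str]  # graph lookup: partition ref -> job name
    jobs: Dict[str, JobFn]
    wants: Set[str] = field(default_factory=set)      # stand-in for wants read from the BEL
    available: Set[str] = field(default_factory=set)  # partitions marked available

    def reconcile_once(self) -> None:
        # Steps 1-2: scan unsatisfied wants and group them by responsible job.
        grouped: Dict[str, List[str]] = defaultdict(list)
        for ref in sorted(self.wants - self.available):
            grouped[self.job_for(ref)].append(ref)
        # Steps 3-4: dispatch each job and handle its result.
        for job_name, refs in grouped.items():
            status, missing = self.jobs[job_name](refs)
            if status == "success":
                self.available.update(refs)
            elif status == "missing_deps":
                self.wants.update(missing)  # picked up on the next pass
            # "failure": the want stays active; retry policy is omitted here.


def raw_job(refs: List[str]) -> Tuple[str, List[str]]:
    return "success", []


def report_job(refs: List[str]) -> Tuple[str, List[str]]:
    # Each report partition needs the raw partition for the same date.
    needed = ["raw/" + ref.split("/", 1)[1] for ref in refs]
    missing = [dep for dep in needed if dep not in graph.available]
    return ("missing_deps", missing) if missing else ("success", [])


graph = Graph(job_for=lambda ref: ref.split("/", 1)[0],
              jobs={"raw": raw_job, "report": report_job})
graph.wants.add("report/2025-01-01")
graph.reconcile_once()  # report job reports raw/2025-01-01 as missing
graph.reconcile_once()  # raw job builds it, then the report job succeeds
print(sorted(graph.available))
```

Nothing here plans the build order up front. The second pass succeeds only because the first pass turned the missing dependency into a new want.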
## Jobs
Jobs are stateless executables with a single `exec` entrypoint. When invoked with the requested partitions as args, a job does one of the following:
- Successfully produces the partitions
- Fails with a missing-dependency error listing the required upstream partitions
- Fails with another error, which may be retried
Jobs declare execution preferences (batching, concurrency) as metadata, but contain no orchestration logic.
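
The three outcomes can be sketched as a single function. This document does not specify exit codes or an error format, so the constants, the JSON payloads, and the `run_exec` helper below are all hypothetical and chosen only to make the example concrete.

```python
import json
from typing import Callable, Iterable, List, Tuple

# Hypothetical exit codes and JSON payloads, chosen for this sketch only.
EXIT_OK, EXIT_ERROR, EXIT_MISSING_DEPS = 0, 1, 2


def run_exec(
    partitions: List[str],
    upstream_of: Callable[[str], Iterable[str]],
    exists: Callable[[str], bool],
    build: Callable[[str], None],
) -> Tuple[int, str]:
    """Handle one `exec` invocation and return (exit_code, payload)."""
    # Outcome 2: report every required upstream partition that is absent.
    missing = sorted({dep for p in partitions for dep in upstream_of(p) if not exists(dep)})
    if missing:
        return EXIT_MISSING_DEPS, json.dumps({"missing_deps": missing})
    # Outcome 3: any other error is surfaced so the graph can decide on a retry.
    try:
        for p in partitions:
            build(p)
    except Exception as err:
        return EXIT_ERROR, json.dumps({"error": str(err)})
    # Outcome 1: all requested partitions were produced.
    return EXIT_OK, json.dumps({"built": partitions})


store = {"raw/2025-01-01"}  # stand-in for partition storage
deps = lambda p: ["raw/" + p.split("/", 1)[1]]
first = run_exec(["report/2025-01-01"], deps, store.__contains__, store.add)
second = run_exec(["report/2025-01-02"], deps, store.__contains__, store.add)
```

The job only reports what it needs. Deciding what to build next is left to the graph, in keeping with the rule that jobs contain no orchestration logic.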
## Want Propagation
When jobs report missing dependencies, the graph:
1. Parses the error for partition refs
2. Creates child wants (linked via `parent_want_id`)
3. Continues reconciliation with expanded want set
This creates want chains that naturally traverse the dependency graph without upfront planning.
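
As a rough illustration of how `parent_want_id` links form a traceable chain, the sketch below creates child wants and walks back to the root. The `Want` dataclass, the ID scheme, and both helper functions are assumptions made for the example. Parsing the job error is skipped, since its format is not defined here.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Want:
    want_id: str
    partition: str
    parent_want_id: Optional[str] = None  # None marks a root want


_next_id = itertools.count(1)  # hypothetical ID source for this sketch


def propagate(parent: Want, missing_refs: List[str], wants: Dict[str, Want]) -> List[Want]:
    """Create one child want per missing partition ref, linked to `parent`."""
    children = []
    for ref in missing_refs:
        child = Want(f"want-{next(_next_id)}", ref, parent_want_id=parent.want_id)
        wants[child.want_id] = child
        children.append(child)
    return children


def chain_to_root(want: Want, wants: Dict[str, Want]) -> List[str]:
    """Follow parent_want_id links from `want` up to its root want."""
    path = [want.want_id]
    while want.parent_want_id is not None:
        want = wants[want.parent_want_id]
        path.append(want.want_id)
    return path


root = Want("want-0", "report/2025-01-01")
wants = {root.want_id: root}
(clean,) = propagate(root, ["clean/2025-01-01"], wants)
(raw,) = propagate(clean, ["raw/2025-01-01"], wants)
print(chain_to_root(raw, wants))  # ['want-2', 'want-1', 'want-0']
```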
## Correctness Strategy
- **Idempotency**: Jobs must produce identical outputs given the same inputs
- **Atomicity**: Partitions are either complete or absent
- **Want chains**: Full traceability via parent/root want IDs
- **Event sourcing**: All state changes recorded in BEL
- **Protobuf interface**: All build actions fit structs and interfaces defined by [`databuild/databuild.proto`](../databuild/databuild.proto)
The system achieves correctness through convergence rather than planning: it keeps reconciling until every want is either satisfied or expired.
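
To illustrate the event-sourcing point, the sketch below rebuilds the current state by replaying an append-only log. The event type names and the dict shape are hypothetical. The actual BEL schema is defined elsewhere and may differ.

```python
from typing import Dict, Iterable, Set, Tuple


def replay(events: Iterable[Dict[str, str]]) -> Tuple[Set[str], Set[str]]:
    """Fold the event log into (unsatisfied wants, available partitions)."""
    wanted: Set[str] = set()
    available: Set[str] = set()
    for event in events:
        kind, ref = event["type"], event["partition"]
        if kind == "want_created":
            wanted.add(ref)
        elif kind == "want_expired":
            wanted.discard(ref)
        elif kind == "partition_available":
            available.add(ref)
    return wanted - available, available


log = [
    {"type": "want_created", "partition": "report/2025-01-01"},
    {"type": "want_created", "partition": "raw/2025-01-01"},
    {"type": "partition_available", "partition": "raw/2025-01-01"},
    {"type": "want_created", "partition": "report/2024-12-31"},
    {"type": "want_expired", "partition": "report/2024-12-31"},
]
unsatisfied, available = replay(log)
print(sorted(unsatisfied))  # ['report/2025-01-01']
```

Under this model, convergence means the set of unsatisfied wants has become empty. Because state is derived only from the log, replaying the same events always yields the same result.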