# Build Event Log (BEL)
Purpose: Store build events and provide efficient cross-graph coordination via a minimal, append-only event stream.

## Architecture
- Follows an [event sourcing](https://martinfowler.com/eaaDev/EventSourcing.html) / [CQRS](https://www.wikipedia.org/wiki/cqrs) philosophy.
- BELs are written only by graph processes (e.g. the CLI or service), never by the jobs themselves.
- **Three-layer architecture:**
  1. **Storage Layer**: Append-only event storage with sequential scanning
  2. **Query Engine Layer**: App-layer aggregation for entity queries (partition status, build summaries, etc.)
  3. **Client Layer**: CLI, Service, Dashboard consuming aggregated views
- **Cross-graph coordination** via a minimal `GraphService` API that supports event streaming since a given index
- Storage backends focus on efficient append + sequential scan operations (file-based, SQLite, Postgres, Delta Lake)

## Correctness Strategy
- The access layer evaluates each event submitted for writing and returns an error if the event is not a valid next state according to the governing state diagram of the component involved.
- Events are versioned, with each version's schema stored in [`databuild.proto`](../databuild/databuild.proto).

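The write-time check described above can be sketched as a small transition table. This is only an illustration under stated assumptions: the `PartitionState` variants, the allowed transitions, and the `validate_transition` helper are hypothetical and are not taken from `databuild.proto`.

```rust
// Hypothetical partition lifecycle; the real states are defined in databuild.proto.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PartitionState {
    Requested,
    Building,
    Available,
    Failed,
}

// Error returned when an event would move a component to an illegal state.
#[derive(Debug, PartialEq, Eq)]
struct InvalidTransition {
    from: Option<PartitionState>,
    to: PartitionState,
}

/// Returns Ok(()) if `to` is a legal next state after `from`.
/// `from == None` means no event has been recorded for the component yet.
fn validate_transition(
    from: Option<PartitionState>,
    to: PartitionState,
) -> Result<(), InvalidTransition> {
    use PartitionState::*;
    let allowed = matches!(
        (from, to),
        (None, Requested)
            | (Some(Requested), Building)
            | (Some(Building), Available)
            | (Some(Building), Failed)
            | (Some(Failed), Requested) // assumed: a failed partition may be re-requested
    );
    if allowed {
        Ok(())
    } else {
        Err(InvalidTransition { from, to })
    }
}
```

In a design like this, the access layer would look up the component's latest recorded state, call the validator, and append the event only on `Ok(())`.
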
## Storage Layer Interface
Minimal append-only interface optimized for sequential scanning:

```rust
use async_trait::async_trait;

#[async_trait]
trait BELStorage {
    /// Appends an event to the log.
    async fn append_event(&self, event: BuildEvent) -> Result<i64>; // returns event index
    /// Lists events since `since_idx` that match `filter`, one page at a time.
    async fn list_events(&self, since_idx: i64, filter: EventFilter) -> Result<EventPage>;
}
```

Where `EventFilter` is defined in `databuild.proto` as:
```protobuf
message EventFilter {
  repeated string partition_refs = 1;     // Exact partition matches
  repeated string partition_patterns = 2; // Glob patterns like "data/users/*"
  repeated string job_labels = 3;         // Job-specific events
  repeated string task_ids = 4;           // Task run events
  repeated string build_request_ids = 5;  // Build-specific events
}
```

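To make the append + sequential scan contract concrete, here is a minimal in-memory sketch. It is synchronous and uses simplified stand-in types rather than the generated protobuf ones. The `InMemoryStorage` type, the `page_size` parameter, the `next_idx` field, the inclusive treatment of `since_idx`, and the "empty filter matches everything" rule are all assumptions made for illustration. Only a trailing `*` is handled in patterns, not full glob syntax.

```rust
// Simplified stand-ins for the generated protobuf types (assumptions).
#[derive(Debug, Clone, PartialEq)]
struct BuildEvent {
    partition_ref: String,
    message: String,
}

#[derive(Default)]
struct EventFilter {
    partition_refs: Vec<String>,     // exact matches
    partition_patterns: Vec<String>, // only a trailing "*" is handled here
}

impl EventFilter {
    // Assumption: an empty filter matches every event.
    fn matches(&self, event: &BuildEvent) -> bool {
        if self.partition_refs.is_empty() && self.partition_patterns.is_empty() {
            return true;
        }
        self.partition_refs.iter().any(|r| r == &event.partition_ref)
            || self.partition_patterns.iter().any(|p| match p.strip_suffix('*') {
                Some(prefix) => event.partition_ref.starts_with(prefix),
                None => p == &event.partition_ref,
            })
    }
}

struct EventPage {
    events: Vec<(i64, BuildEvent)>, // (index, event) pairs in log order
    next_idx: i64,                  // pass as `since_idx` on the next call
}

#[derive(Default)]
struct InMemoryStorage {
    log: Vec<BuildEvent>, // position in the Vec is the event index
}

impl InMemoryStorage {
    // Appends the event and returns the index it was stored at.
    fn append_event(&mut self, event: BuildEvent) -> i64 {
        self.log.push(event);
        (self.log.len() - 1) as i64
    }

    // Assumption: `since_idx` is inclusive; at most `page_size` log entries are scanned.
    fn list_events(&self, since_idx: i64, filter: &EventFilter, page_size: usize) -> EventPage {
        let start = (since_idx.max(0) as usize).min(self.log.len());
        let end = (start + page_size).min(self.log.len());
        let events = self.log[start..end]
            .iter()
            .enumerate()
            .filter(|(_, e)| filter.matches(e))
            .map(|(offset, e)| ((start + offset) as i64, e.clone()))
            .collect();
        EventPage { events, next_idx: end as i64 }
    }
}
```

Because `next_idx` advances by the number of entries scanned rather than the number matched, a caller can keep paging through long stretches of filtered-out events without getting stuck.
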
## Query Engine Interface
App-layer aggregation that scans storage layer events:

```rust
struct BELQueryEngine {
    storage: Box<dyn BELStorage>,
    partition_status_cache: Option<PartitionStatusCache>,
}

// Signatures only; method bodies are omitted in this overview.
impl BELQueryEngine {
    async fn get_latest_partition_status(&self, partition_ref: &str) -> Result<Option<PartitionStatus>>;
    async fn get_active_builds_for_partition(&self, partition_ref: &str) -> Result<Vec<String>>;
    async fn get_build_request_summary(&self, build_id: &str) -> Result<BuildRequestSummary>;
    async fn list_build_requests(
        &self,
        limit: u32,
        offset: u32,
        status_filter: Option<BuildRequestStatus>,
    ) -> Result<Vec<BuildRequestSummary>>;
}
```

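As a rough illustration of what "app-layer aggregation" could look like, the sketch below derives the latest partition status from a slice of events. It is synchronous and self-contained. The `PartitionEvent` struct, the `PartitionStatus` variants, and both helper functions are hypothetical simplifications, not the engine's actual implementation.

```rust
use std::collections::HashMap;

// Hypothetical status values; the real enum is defined in databuild.proto.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PartitionStatus {
    Requested,
    Building,
    Available,
    Failed,
}

// Simplified event: a status change for one partition (an assumption).
struct PartitionEvent {
    partition_ref: String,
    status: PartitionStatus,
}

/// Walks the log from newest to oldest and returns the first match,
/// i.e. the latest recorded status for `partition_ref`.
fn latest_partition_status(
    events: &[PartitionEvent],
    partition_ref: &str,
) -> Option<PartitionStatus> {
    events
        .iter()
        .rev()
        .find(|e| e.partition_ref == partition_ref)
        .map(|e| e.status)
}

/// Folds the whole log into a map of partition -> latest status,
/// roughly the kind of view a `PartitionStatusCache` might hold.
fn build_status_view(events: &[PartitionEvent]) -> HashMap<String, PartitionStatus> {
    let mut view = HashMap::new();
    for e in events {
        // Later events overwrite earlier ones, so the final value is the latest.
        view.insert(e.partition_ref.clone(), e.status);
    }
    view
}
```

The single-partition lookup avoids building any state, while the folded view trades memory for repeated lookups; which one suits the engine depends on query volume.
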
## Cross-Graph Coordination
Graphs coordinate via the `GraphService` API for efficient event streaming:

```rust
trait GraphService {
    async fn list_events(&self, since_idx: i64, filter: EventFilter) -> Result<EventPage>;
}
```

This enables:
- **Event-driven reactivity**: Downstream graphs react within seconds of upstream partition availability
- **Efficient subscriptions**: Filters restrict the scan to events for relevant partitions
- **Reliable coordination**: HTTP polling from a known index avoids the event-loss issues of push-based streaming APIs

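The polling pattern behind these properties can be sketched as a cursor that a downstream graph advances after each successful poll. This is a simplified, synchronous illustration: `UpstreamGraph`, `Subscriber`, `FakeUpstream`, and the `next_idx` field are hypothetical names, the filter argument is omitted, and events are plain strings.

```rust
// Simplified page type (an assumption): events plus the cursor for the next poll.
struct EventPage {
    events: Vec<String>,
    next_idx: i64,
}

// Synchronous stand-in for `GraphService::list_events`, without the filter.
trait UpstreamGraph {
    fn list_events(&self, since_idx: i64) -> EventPage;
}

// Downstream consumer that remembers how far it has read.
struct Subscriber {
    cursor: i64,
}

impl Subscriber {
    /// Polls once, hands each new event to `handle`, then advances the cursor.
    /// Returns the number of events handled.
    fn poll_once(&mut self, upstream: &dyn UpstreamGraph, mut handle: impl FnMut(&str)) -> usize {
        let page = upstream.list_events(self.cursor);
        for event in &page.events {
            handle(event.as_str());
        }
        self.cursor = page.next_idx;
        page.events.len()
    }
}

// Fake upstream backed by a Vec, for illustration only.
struct FakeUpstream {
    log: Vec<String>,
}

impl UpstreamGraph for FakeUpstream {
    fn list_events(&self, since_idx: i64) -> EventPage {
        let start = (since_idx.max(0) as usize).min(self.log.len());
        EventPage {
            events: self.log[start..].to_vec(),
            next_idx: self.log.len() as i64,
        }
    }
}
```

Since the cursor lives with the subscriber and the log is append-only, a poll that fails can be retried from the same index, which is one plausible reading of why polling sidesteps lost events.
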