# Build Event Log (BEL)

Purpose: Store build events and provide efficient cross-graph coordination via a minimal, append-only event stream.

## Architecture

- Uses [event sourcing](https://martinfowler.com/eaaDev/EventSourcing.html) / [CQRS](https://www.wikipedia.org/wiki/cqrs) philosophy.
- BELs are only ever written to by graph processes (e.g. CLI or service), not the jobs themselves.
- **Three-layer architecture:**
  1. **Storage Layer**: Append-only event storage with sequential scanning
  2. **Query Engine Layer**: App-layer aggregation for entity queries (partition status, build summaries, etc.)
  3. **Client Layer**: CLI, Service, Dashboard consuming aggregated views
- **Cross-graph coordination** via a minimal `GraphService` API that supports event streaming since a given index
- Storage backends focus on efficient append + sequential scan operations (file-based, SQLite, Postgres, Delta Lake)

## Correctness Strategy

- The access layer evaluates each event requested to be written and returns an error if the event is not a valid next state according to the involved component's governing state diagram.
- Events are versioned, with each version's schema stored in [`databuild.proto`](../databuild/databuild.proto).

## Storage Layer Interface

Minimal append-only interface optimized for sequential scanning:

```rust
trait BELStorage {
    fn append_event(&self, event: BuildEvent) -> Result; // returns event index
    fn list_events(&self, since_idx: i64, filter: EventFilter, limit: i64) -> Result;
}
```

Where `EventFilter` is defined in `databuild.proto` as:

```protobuf
message EventFilter {
  repeated string partition_refs = 1;     // Exact partition matches
  repeated string partition_patterns = 2; // Glob patterns like "data/users/*"
  repeated string job_labels = 3;         // Job-specific events
  repeated string job_run_ids = 4;        // Job run events
}
```

The data build state is then built on top of this, as a reducer over the BEL event stream.
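
To make the reducer idea concrete, here is a minimal in-memory sketch of an append-only log plus a fold that derives the latest status per partition. It is an illustration only, not the project's implementation: `InMemoryBel`, `PartitionStatus` and its variants, and the reduced `BuildEvent` and `EventFilter` shapes are hypothetical stand-ins for the types generated from `databuild.proto`. It also assumes that `since_idx` is inclusive and that an empty filter matches every event, and it uses `&mut self` and plain return values instead of the trait's `&self` and `Result`.

```rust
use std::collections::HashMap;

// Hypothetical status values; the real ones live in databuild.proto.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum PartitionStatus {
    Requested,
    Building,
    Available,
    Failed,
}

// Hypothetical, heavily reduced event shape for illustration.
#[derive(Clone, Debug)]
struct BuildEvent {
    partition_ref: String,
    status: PartitionStatus,
}

// Only exact partition matches are modelled here.
#[derive(Default)]
struct EventFilter {
    partition_refs: Vec<String>, // empty = match everything (assumption)
}

// In-memory append-only log; an event's index is its position in the Vec.
#[derive(Default)]
struct InMemoryBel {
    events: Vec<BuildEvent>,
}

impl InMemoryBel {
    // Appends the event and returns its index.
    fn append_event(&mut self, event: BuildEvent) -> i64 {
        self.events.push(event);
        (self.events.len() - 1) as i64
    }

    // Sequentially scans events with index >= since_idx (inclusive by assumption).
    fn list_events(
        &self,
        since_idx: i64,
        filter: &EventFilter,
        limit: usize,
    ) -> Vec<(i64, BuildEvent)> {
        self.events
            .iter()
            .enumerate()
            .skip(since_idx.max(0) as usize)
            .filter(|(_, e)| {
                filter.partition_refs.is_empty()
                    || filter.partition_refs.contains(&e.partition_ref)
            })
            .take(limit)
            .map(|(i, e)| (i as i64, e.clone()))
            .collect()
    }
}

// Reducer: folds the event stream into the latest status per partition.
fn reduce_partition_status(events: &[(i64, BuildEvent)]) -> HashMap<String, PartitionStatus> {
    let mut state = HashMap::new();
    for (_, e) in events {
        // Later events overwrite earlier ones, so the last write wins.
        state.insert(e.partition_ref.clone(), e.status.clone());
    }
    state
}
```

Because the state is derived purely by replaying events in index order, a query layer built this way can be rebuilt from scratch or resumed from a saved index without the storage backend needing anything beyond append and scan.
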

## Cross-Graph Coordination

Graphs coordinate via the `GraphService` API for efficient event streaming:

```rust
trait GraphService {
    async fn list_events(&self, since_idx: i64, filter: EventFilter, limit: i64) -> Result;
}
```

This enables:

- **Event-driven reactivity**: Downstream graphs react within seconds of upstream partition availability
- **Efficient subscriptions**: Only scan events for relevant partitions
- **Reliable coordination**: HTTP polling avoids event-loss issues of streaming APIs
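
The polling pattern can be sketched as a downstream cursor that advances only after a page of events has actually been received. This is a simplified illustration under several assumptions: `EventSource` is a hypothetical synchronous stand-in for `GraphService::list_events` (the real method is async and takes an `EventFilter`), `IndexedEvent`, `Subscriber`, and `FakeUpstream` are invented names, and `since_idx` is treated as inclusive.

```rust
// Hypothetical event shape that carries its index in the upstream log.
#[derive(Clone, Debug, PartialEq)]
struct IndexedEvent {
    idx: i64,
    partition_ref: String,
}

// Synchronous stand-in for GraphService::list_events (assumption: no filter,
// no error type, since_idx inclusive).
trait EventSource {
    fn list_events(&self, since_idx: i64, limit: usize) -> Vec<IndexedEvent>;
}

// Downstream state: the index of the next upstream event to fetch.
struct Subscriber {
    next_idx: i64,
}

impl Subscriber {
    // One polling round: fetch a page, then move the cursor past the last
    // event received. If the call fails or returns nothing, the cursor stays
    // put and the next round asks for the same range again.
    fn poll_once(&mut self, upstream: &dyn EventSource, limit: usize) -> Vec<IndexedEvent> {
        let page = upstream.list_events(self.next_idx, limit);
        if let Some(last) = page.last() {
            self.next_idx = last.idx + 1;
        }
        page
    }
}

// In-memory upstream used only to exercise the sketch.
struct FakeUpstream {
    events: Vec<IndexedEvent>,
}

impl EventSource for FakeUpstream {
    fn list_events(&self, since_idx: i64, limit: usize) -> Vec<IndexedEvent> {
        self.events
            .iter()
            .filter(|e| e.idx >= since_idx)
            .take(limit)
            .cloned()
            .collect()
    }
}
```

Since the consumer owns the cursor, a dropped response or a restart costs at most one repeated request; the upstream graph does not need to track subscribers or buffer undelivered events.
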