# Build Event Log Design

The foundation of persistence for DataBuild is the build event log, a fact table recording events related to build requests, partitions, and jobs. Each graph has exactly one build event log. Other views (potentially materialized) rely on and aggregate over it, e.g. to power the partition liveness catalog and to enable delegation to in-progress partition builds.

## 1. Schema

```protobuf
// Partition lifecycle states
enum PartitionStatus {
  PARTITION_UNKNOWN = 0;
  PARTITION_REQUESTED = 1;  // Partition requested but not yet scheduled
  PARTITION_SCHEDULED = 2;  // Job scheduled to produce this partition
  PARTITION_BUILDING = 3;   // Job actively building this partition
  PARTITION_AVAILABLE = 4;  // Partition successfully built and available
  PARTITION_FAILED = 5;     // Partition build failed
  PARTITION_STALE = 6;      // Partition exists but upstream dependencies changed
  PARTITION_DELEGATED = 7;  // Request delegated to existing build
}

// Job lifecycle
enum JobStatus {
  // TODO implement me
}

// Individual partition activity event
message BuildEvent {
  // TODO implement me
}
```

Build events are in practice job events, since jobs are the unit of work, but they also represent progress towards building specific partitions and their downstreams. One build request ID represents the literal request to the service (which may accept a caller-provided build request ID). The expectation is that most build requests involve multiple partitions, so the log should expose the tree structure over time, showing jobs succeeding and progress towards the requested partition being built. Individual job runs should have their own IDs so that they can be referenced later.

TODO narrative

## 2. Persistence

TODO narrative + design, with requirements:
- Should target postgres, sqlite, and delta tables

## 3. Access Layer

TODO narrative + design
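
Until the design above is filled in, the following is a minimal sketch of how an append-only event log could back a derived liveness view. It uses Python's standard-library `sqlite3` module, since SQLite is one of the listed persistence targets. Everything specific here is an assumption made for illustration: the `build_events` table, its columns, and the helper functions `append_event` and `latest_partition_status` are hypothetical, and the `BuildEvent` message itself is still a TODO. Only the `PARTITION_*` status names come from the schema above.

```python
import sqlite3

# Hypothetical table layout; the real BuildEvent fields are not yet defined.
SCHEMA = """
CREATE TABLE build_events (
    event_id         INTEGER PRIMARY KEY AUTOINCREMENT,  -- append order
    build_request_id TEXT NOT NULL,  -- the request to the service
    job_run_id       TEXT,           -- NULL until a job is involved
    partition_ref    TEXT NOT NULL,  -- partition the event is about
    status           TEXT NOT NULL   -- a PartitionStatus name
);
"""


def append_event(conn, build_request_id, job_run_id, partition_ref, status):
    """Append one fact row; existing rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO build_events "
        "(build_request_id, job_run_id, partition_ref, status) "
        "VALUES (?, ?, ?, ?)",
        (build_request_id, job_run_id, partition_ref, status),
    )


def latest_partition_status(conn):
    """Derive a liveness-style view: the most recent status per partition."""
    rows = conn.execute(
        """
        SELECT e.partition_ref, e.status
        FROM build_events AS e
        JOIN (
            SELECT partition_ref, MAX(event_id) AS last_id
            FROM build_events
            GROUP BY partition_ref
        ) AS m ON e.event_id = m.last_id
        """
    ).fetchall()
    return dict(rows)


def demo():
    """One request that needs two partitions, built by two job runs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    events = [
        ("req-1", None, "reports/2024-01-01", "PARTITION_REQUESTED"),
        ("req-1", "job-1", "events/2024-01-01", "PARTITION_BUILDING"),
        ("req-1", "job-1", "events/2024-01-01", "PARTITION_AVAILABLE"),
        ("req-1", "job-2", "reports/2024-01-01", "PARTITION_BUILDING"),
    ]
    for event in events:
        append_event(conn, *event)
    return latest_partition_status(conn)
```

Calling `demo()` returns a mapping from each partition to its latest recorded status. The view is computed entirely from the log, so it could equally be materialized or rebuilt at any time. A second request for a partition whose latest status is `PARTITION_BUILDING` is the kind of case where delegation to the in-progress build would apply.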