Update plan for BEL

Stuart Axelbrooke 2025-07-06 13:05:05 -07:00
parent c664ef36bd
commit 28500fb956


@@ -245,3 +245,94 @@ GROUP BY br.status;
```
The service layer builds higher-level operations on top of both the simple interface and direct SQL access.
## 4. Core Build Implementation Integration
### Command Line Interface
The core build implementation (`analyze.rs` and `execute.rs`) will be enhanced with build event logging capabilities through new command line arguments:
```bash
# Standard usage with build event logging
./analyze --build-event-log sqlite:///tmp/build.db partition_ref1 partition_ref2
./execute --build-event-log sqlite:///tmp/build.db < job_graph.json
# With explicit build request ID for correlation
./analyze --build-event-log postgres://user:pass@host/db --build-request-id 12345678-1234-1234-1234-123456789012
```
**New Command Line Arguments:**
- `--build-event-log <URI>` - Specify persistence URI for build events (logging to stdout is implicit)
  - `sqlite://path` - Persist to SQLite database file
  - `postgres://connection` - Persist to PostgreSQL database
- `--build-request-id <UUID>` - Optional build request ID (auto-generated if not provided)
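
The argument handling described above can be sketched as follows. This is a minimal illustration in Python rather than the Rust that `analyze.rs` and `execute.rs` would use; the helper names `parse_build_event_log_uri` and `resolve_build_request_id` are hypothetical, and the sketch assumes that only the `sqlite://` and `postgres://` schemes listed above are accepted.

```python
import uuid
from urllib.parse import urlparse


def parse_build_event_log_uri(uri):
    """Split a --build-event-log value into (backend, target).

    Hypothetical helper: the plan names the URI schemes but not the parser.
    """
    parsed = urlparse(uri)
    if parsed.scheme == "sqlite":
        # sqlite:///tmp/build.db has an empty netloc and path "/tmp/build.db";
        # sqlite://relative.db puts the file name in the netloc instead.
        path = parsed.netloc + parsed.path
        if not path:
            raise ValueError(f"missing SQLite path in {uri!r}")
        return ("sqlite", path)
    if parsed.scheme == "postgres":
        # Pass the full connection URI through to the database driver.
        return ("postgres", uri)
    raise ValueError(f"unsupported build event log URI: {uri!r}")


def resolve_build_request_id(value=None):
    """Validate a supplied UUID, or generate one when none is given."""
    if value is None:
        return str(uuid.uuid4())
    return str(uuid.UUID(value))
```

Rejecting unknown schemes at parse time means a mistyped URI fails before any build work starts, rather than silently disabling persistence.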
### Integration Points
**In `analyze.rs` (Graph Analysis Phase):**
1. **Build Request Lifecycle**: Log `BUILD_REQUEST_RECEIVED` when analysis starts, `BUILD_REQUEST_PLANNING` during dependency resolution, and `BUILD_REQUEST_COMPLETED` when analysis finishes
2. **Staleness Detection**: Query build event log for existing `PARTITION_AVAILABLE` events to identify non-stale partitions that can be skipped
3. **Delegation Logging**: Log `PARTITION_DELEGATED` events when skipping partitions that are already being built by another request
4. **Job Planning**: Log `PARTITION_SCHEDULED` events for partitions that will be built
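
As a rough illustration of the order in which the analysis phase might emit these events, the Python sketch below uses a plain list as the event log. The function `analyze`, the event dictionary shape, and the `owned_by` mapping (partition ref to the build request that already covers it) are assumptions made for this example, not part of the plan.

```python
def analyze(build_request_id, requested, owned_by, log):
    """Emit analysis-phase events and return the partitions left to build.

    requested: partition refs asked for by this build request.
    owned_by:  hypothetical mapping of partition ref -> ID of another build
               request that already covers that partition.
    log:       list standing in for the build event log.
    """
    def emit(event_type, partition_ref=None, delegated_to=None):
        log.append({
            "build_request_id": build_request_id,
            "type": event_type,
            "partition_ref": partition_ref,
            "delegated_to": delegated_to,
        })

    emit("BUILD_REQUEST_RECEIVED")
    emit("BUILD_REQUEST_PLANNING")

    to_build = []
    for ref in requested:
        if ref in owned_by:
            # Another request covers this partition: record the delegation.
            emit("PARTITION_DELEGATED", ref, delegated_to=owned_by[ref])
        else:
            emit("PARTITION_SCHEDULED", ref)
            to_build.append(ref)

    emit("BUILD_REQUEST_COMPLETED")
    return to_build
```

For example, `analyze("req-2", ["a", "b"], {"a": "req-1"}, log)` would delegate `a` to `req-1` and schedule only `b`.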
**In `execute.rs` (Graph Execution Phase):**
1. **Execution Lifecycle**: Log `BUILD_REQUEST_EXECUTING` when execution starts
2. **Job Execution Events**: Log `JOB_SCHEDULED`, `JOB_RUNNING`, and `JOB_COMPLETED` or `JOB_FAILED` events throughout job execution
3. **Partition Status**: Log `PARTITION_BUILDING` when jobs start, and `PARTITION_AVAILABLE` or `PARTITION_FAILED` when jobs complete
4. **Build Coordination**: Check for concurrent builds before starting partition work to avoid duplicate effort
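
One way the build coordination check could work against a SQLite-backed log is sketched below in Python, using only the standard library `sqlite3` module. The `build_events` table and its columns are a simplified stand-in invented for this example and are not the schema defined elsewhere in this plan; the helper names are likewise hypothetical.

```python
import sqlite3

PARTITION_STATES = ("PARTITION_BUILDING", "PARTITION_AVAILABLE", "PARTITION_FAILED")


def open_log(path=":memory:"):
    """Open a build event log with a simplified, hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS build_events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "build_request_id TEXT NOT NULL, "
        "event_type TEXT NOT NULL, "
        "partition_ref TEXT)"
    )
    return conn


def log_event(conn, request_id, event_type, partition_ref=None):
    conn.execute(
        "INSERT INTO build_events (build_request_id, event_type, partition_ref) "
        "VALUES (?, ?, ?)",
        (request_id, event_type, partition_ref),
    )
    conn.commit()


def is_building_elsewhere(conn, request_id, partition_ref):
    """True if the latest partition event from another request is BUILDING."""
    row = conn.execute(
        "SELECT event_type FROM build_events "
        "WHERE partition_ref = ? AND build_request_id != ? "
        "AND event_type IN (?, ?, ?) "
        "ORDER BY id DESC LIMIT 1",
        (partition_ref, request_id, *PARTITION_STATES),
    ).fetchone()
    return row is not None and row[0] == "PARTITION_BUILDING"


def build_partition(conn, request_id, partition_ref, run_job):
    """Run a job unless another request is already building the partition."""
    if is_building_elsewhere(conn, request_id, partition_ref):
        return False  # skip: duplicate effort avoided
    log_event(conn, request_id, "PARTITION_BUILDING", partition_ref)
    try:
        run_job()
    except Exception:
        log_event(conn, request_id, "PARTITION_FAILED", partition_ref)
        raise
    log_event(conn, request_id, "PARTITION_AVAILABLE", partition_ref)
    return True
```

Note that the check and the subsequent insert are not atomic in this sketch, so two requests could still race; a real implementation would likely need a transaction or a uniqueness constraint to close that window.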
### Non-Stale Partition Handling
The build event log enables intelligent partition skipping:
1. **During Analysis**: Query for recent `PARTITION_AVAILABLE` events to identify partitions that don't need rebuilding
2. **Staleness Logic**: Compare partition timestamps with upstream dependency timestamps to determine if rebuilding is needed
3. **Skip Logging**: Log `PARTITION_DELEGATED` events with references to the existing build request ID that produced the partition
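
The timestamp comparison in the staleness logic could look roughly like the Python sketch below. The function name `is_stale` and the rule that equal timestamps count as fresh are assumptions made for this example; the plan does not specify how ties or never-built partitions are treated.

```python
def is_stale(partition_built_at, upstream_built_at):
    """Decide whether a partition needs rebuilding.

    partition_built_at: timestamp of the latest PARTITION_AVAILABLE event
                        for the partition, or None if it was never built.
    upstream_built_at:  timestamps of the latest PARTITION_AVAILABLE events
                        for each upstream dependency.
    """
    if partition_built_at is None:
        return True  # never built, so it must be built now
    # Stale if any upstream was rebuilt after this partition was produced.
    return any(ts > partition_built_at for ts in upstream_built_at)
```

A partition with no upstream dependencies is fresh as soon as it has been built once, since there is nothing that could have changed underneath it.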
### Bazel Rules Integration
The `databuild_graph` rule in `rules.bzl` will be enhanced to propagate build event logging configuration:
```python
databuild_graph(
    name = "my_graph",
    jobs = [":job1", ":job2"],
    lookup = ":job_lookup",
    build_event_log = "sqlite:///tmp/builds.db",  # New attribute
)
```
```
**Generated Targets Enhancement:**
- `my_graph_analyze`: Receives `--build-event-log` argument
- `my_graph_exec`: Receives `--build-event-log` argument
- `my_graph_build`: Coordinates build request ID across analyze/execute phases
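
To illustrate what coordinating the build request ID across phases might involve, the Python sketch below assembles the two command lines with a single shared ID. The function `build_commands` is hypothetical, and the sketch assumes both binaries accept `--build-request-id` and that flags precede the partition refs, as in the command line examples earlier in this section.

```python
import uuid


def build_commands(partition_refs, build_event_log=None, build_request_id=None):
    """Return (analyze_argv, execute_argv) sharing one build request ID.

    Hypothetical helper: shows the correlation idea only, not how the
    generated Bazel targets are actually implemented.
    """
    request_id = build_request_id or str(uuid.uuid4())
    shared = ["--build-request-id", request_id]
    if build_event_log:
        shared = ["--build-event-log", build_event_log] + shared
    analyze_argv = ["./analyze", *shared, *partition_refs]
    # execute reads the job graph produced by analyze on stdin.
    execute_argv = ["./execute", *shared]
    return analyze_argv, execute_argv
```

Generating the ID once in the wrapper, instead of letting each binary auto-generate its own, is what lets events from both phases be joined on the same build request.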
### Implementation Strategy
**Phase 1: Infrastructure**
- Add `BuildEventLog` trait and implementations for stdout/SQLite/PostgreSQL
- Update `databuild.proto` with build event schema
- Add command line argument parsing to `analyze.rs` and `execute.rs`
**Phase 2: Analysis Integration**
- Integrate build event logging into `analyze.rs`
- Implement staleness detection queries
- Add partition delegation logic
**Phase 3: Execution Integration**
- Integrate build event logging into `execute.rs`
- Add job lifecycle event logging
- Implement build coordination checks
**Phase 4: Bazel Integration**
- Update `databuild_graph` rule with build event log support
- Add proper argument propagation and request ID correlation
- End-to-end testing with example graphs
### Key Benefits
1. **Stdout Logging**: Immediate visibility into build progress, since build events are always written to stdout
2. **Persistent History**: Database persistence enables build coordination and historical analysis
3. **Intelligent Skipping**: Avoid rebuilding partitions that are already up to date, removing redundant work from each build
4. **Build Coordination**: Prevent duplicate work when multiple builds target the same partitions
5. **Audit Trail**: Complete record of all build activities for debugging and monitoring