BEL Delta Table Backend Implementation Plan

Motivation & High-Level Goals

Problem Statement

DataBuild currently supports SQLite and has stubs for PostgreSQL as Build Event Log (BEL) backends. While SQLite works well for single-node deployments, and PostgreSQL would provide traditional RDBMS capabilities, neither offers the benefits of a modern lakehouse architecture. Delta Lake would provide ACID transactions, scalable storage, and better integration with data processing ecosystems while maintaining the same event-sourced/CQRS architecture.

Strategic Goals

  1. Lakehouse Architecture: Enable DataBuild to use Delta tables as a BEL backend, bringing lakehouse benefits to the orchestration layer
  2. Interface Compatibility: Maintain exact parity with the existing BuildEventLog trait interface
  3. ACID Guarantees: Leverage Delta's ACID transactions for concurrent build safety
  4. Schema Evolution: Version Delta table schemas alongside protobuf definitions for forward compatibility
  5. Storage Flexibility: Support both local filesystem and (future) cloud storage backends

Success Criteria

  • Delta backend passes all existing BEL trait tests with identical results to SQLite
  • CLI and Service can use Delta backend interchangeably via URI configuration
  • Events written to the Delta backend can be queried with performance characteristics comparable to SQLite for typical workloads
  • Schema versioning allows for backward-compatible evolution of event structures

Technical Design

URI Format

Following industry conventions for Delta table references:

  • Local filesystem: delta:///absolute/path/to/table
  • Future S3 support: delta+s3://bucket/path/to/table
  • Future Azure support: delta+azure://container/path/to/table
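
As a rough illustration of how these URIs could be dispatched to a backend, the sketch below matches on the scheme prefix. The function, the enum, and the sqlite:// form shown here are illustrative assumptions, not the actual code in databuild/event_log/mod.rs.

    // Illustrative sketch only: maps a BEL URI onto a backend choice.
    enum BackendKind {
        Sqlite { path: String },
        Delta { table_path: String },
    }

    fn backend_from_uri(uri: &str) -> Result<BackendKind, String> {
        if let Some(path) = uri.strip_prefix("delta://") {
            // delta:///absolute/path/to/table -> "/absolute/path/to/table"
            Ok(BackendKind::Delta { table_path: path.to_string() })
        } else if let Some(path) = uri.strip_prefix("sqlite://") {
            Ok(BackendKind::Sqlite { path: path.to_string() })
        } else {
            Err(format!("unsupported build event log URI: {uri}"))
        }
    }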

Table Schema

Single Delta table with nested structures matching the protobuf definitions:

CREATE TABLE build_events (
    -- Core event fields
    event_id STRING NOT NULL,
    timestamp BIGINT NOT NULL,
    build_request_id STRING NOT NULL,
    event_type STRING NOT NULL,
    
    -- Event-specific nested structures (all nullable)
    build_request_event STRUCT<
        status_code INT,
        status_name STRING,
        requested_partitions ARRAY<STRING>,
        message STRING
    >,
    
    partition_event STRUCT<
        partition_ref STRING,
        status_code INT,
        status_name STRING,
        message STRING,
        job_run_id STRING
    >,
    
    job_event STRUCT<
        job_run_id STRING,
        job_label STRING,
        target_partitions ARRAY<STRING>,
        status_code INT,
        status_name STRING,
        message STRING,
        config STRING,  -- JSON serialized JobConfig
        manifests STRING  -- JSON serialized array of PartitionManifest
    >,
    
    delegation_event STRUCT<
        partition_ref STRING,
        delegated_to_build_request_id STRING,
        message STRING
    >,
    
    job_graph_event STRUCT<
        job_graph STRING,  -- JSON serialized JobGraph
        message STRING
    >,
    
    partition_invalidation_event STRUCT<
        partition_ref STRING,
        reason STRING
    >,
    
    task_cancel_event STRUCT<
        job_run_id STRING,
        reason STRING
    >,
    
    build_cancel_event STRUCT<
        reason STRING
    >
)

Query Implementation

Use native delta-rs capabilities with in-memory filtering for CQRS-style aggregations:

  • All read operations implemented using delta-rs table scanning with Arrow RecordBatches
  • In-memory filtering and aggregation in Rust (similar to SQLite approach initially)
  • Leverage Delta's partition filtering where possible to reduce data scanned
  • No external query engine dependencies initially - can add DataFusion later when needed
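
A minimal sketch of the in-memory filtering step is shown below, assuming the RecordBatches have already been loaded from the Delta table and that the arrow crate (or deltalake's re-export of it) is in scope; the helper name is illustrative.

    use arrow::array::{BooleanArray, StringArray};
    use arrow::compute::filter_record_batch;
    use arrow::error::ArrowError;
    use arrow::record_batch::RecordBatch;

    /// Keep only rows whose build_request_id matches the requested id.
    fn filter_by_build_request(
        batch: &RecordBatch,
        build_request_id: &str,
    ) -> Result<RecordBatch, ArrowError> {
        let ids = batch
            .column_by_name("build_request_id")
            .expect("schema guarantees the column exists")
            .as_any()
            .downcast_ref::<StringArray>()
            .expect("build_request_id is a STRING column");

        // Build a boolean mask row by row, then apply it to the whole batch.
        let mask: BooleanArray = ids
            .iter()
            .map(|id| Some(id == Some(build_request_id)))
            .collect();
        filter_record_batch(batch, &mask)
    }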

Implementation Plan

Current Status: PHASES 1-2 COMPLETE; READY FOR PHASE 3

Issue Resolved: The Arrow ecosystem trait method conflict has been resolved by upgrading to deltalake v0.27:

  • Deltalake v0.27 resolves the arrow-arith trait method ambiguity with chrono
  • Dependencies enabled in MODULE.bazel (lines 138-141)
  • Build succeeds with all dependencies working together
  • The chrono removal work, already completed, provides additional value (better performance, fewer dependencies)

Previous Issue (Now Resolved):

  • deltalake v0.20 depended on arrow-arith v52.2.0 which had trait method conflicts with chrono::Datelike::quarter()
  • This created trait method ambiguity rather than version conflicts
  • Solution: Upgrade to deltalake v0.27 which uses compatible Arrow versions

Current State:

  • Delta backend structure implemented and enabled
  • Dependencies working correctly (deltalake v0.27)
  • Phase 2 (event writing) complete; ready to proceed with Phase 3 implementation

Phase 1: Basic Delta Backend Structure - COMPLETED

Status: Structure implemented; dependencies enabled and working

Completed Deliverables

  • New databuild/event_log/delta.rs module with full trait implementation
  • DeltaBuildEventLog struct implementing BuildEventLog trait
  • URI recognition in databuild/event_log/mod.rs for delta:// URIs
  • Dependencies initially disabled in MODULE.bazel (lines 138-144) due to the Arrow/chrono conflict; now enabled with deltalake v0.27

Implementation Status

  1. Delta dependencies were initially disabled in MODULE.bazel (since re-enabled with deltalake v0.27):

    # Delta backend temporarily disabled due to Arrow/chrono ecosystem conflict
    # Even with chrono removed from our direct dependencies, it comes in transitively
    # through rusqlite and schemars, and conflicts with deltalake's arrow-arith
    # crate.spec(
    #     package = "deltalake", 
    #     version = "0.20",
    # )
    
  2. Delta module created in databuild/event_log/delta.rs with complete structure:

    pub struct DeltaBuildEventLog {
        table_path: String,
    }
    // All trait methods implemented with detailed error messages
    
  3. URI recognition implemented in databuild/event_log/mod.rs

  4. Chrono dependency removed from DataBuild codebase (replaced with std::time in log_collector.rs)
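
For reference, the chrono-free timestamp pattern looks roughly like the sketch below; this is illustrative, not the exact log_collector.rs code.

    use std::time::{SystemTime, UNIX_EPOCH};

    /// Millisecond epoch timestamp without pulling in chrono.
    fn now_millis() -> i64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock is before the Unix epoch")
            .as_millis() as i64
    }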

Verification Status

  • Could not be tested while the dependencies were disabled
  • Code structure was ready for when the dependencies could be re-enabled (now done)
  • No direct chrono usage remains in DataBuild

Resolution Paths Considered (prior to the deltalake v0.27 upgrade)

  1. Wait for ecosystem fix: Monitor Arrow ecosystem for chrono conflict resolution
  2. Alternative Delta implementation: Research delta-rs alternatives or native Parquet backend
  3. Dependency replacement: Replace rusqlite/schemars with chrono-free alternatives
  4. Fork approach: Fork and modify dependencies to resolve conflicts

Phase 2: Event Writing Implementation - COMPLETED

Status: Full implementation complete with working Delta table creation and append

Completed Deliverables

  • Complete event serialization: event_to_record_batch() converts all BuildEvent types to Arrow RecordBatch (see the sketch after this list)
  • Arrow schema definition: Complete Delta table schema with all event type columns
  • JSON serialization: All event subtypes properly serialized as JSON strings
  • Error handling: Proper error mapping for serialization failures
  • Build verification: Code compiles successfully with deltalake v0.27
  • Comprehensive test suite: All 8 BuildEvent types have serialization tests that pass
  • Write API research: Found correct RecordBatchWriter and DeltaWriter APIs
  • Table creation implemented: StructField-based schema creation for new Delta tables
  • Full append functionality: Complete append_event() with table creation and writing
  • End-to-end test: test_append_event() passes, creating tables and writing events
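
The sketch below gives a rough shape for the event_to_record_batch() conversion named above. For brevity it collapses all event payloads into a single JSON column and uses a hypothetical event struct, whereas the real implementation targets the per-event-type columns in the table schema earlier in this plan.

    use std::sync::Arc;

    use arrow::array::{ArrayRef, Int64Array, StringArray};
    use arrow::datatypes::{DataType, Field, Schema};
    use arrow::error::ArrowError;
    use arrow::record_batch::RecordBatch;

    // Hypothetical, simplified event shape standing in for the protobuf BuildEvent.
    struct ExampleEvent {
        event_id: String,
        timestamp: i64,
        build_request_id: String,
        event_type: String,
        payload_json: String, // e.g. a serde_json-serialized event subtype
    }

    fn event_to_record_batch(ev: &ExampleEvent) -> Result<RecordBatch, ArrowError> {
        let schema = Arc::new(Schema::new(vec![
            Field::new("event_id", DataType::Utf8, false),
            Field::new("timestamp", DataType::Int64, false),
            Field::new("build_request_id", DataType::Utf8, false),
            Field::new("event_type", DataType::Utf8, false),
            Field::new("payload_json", DataType::Utf8, true),
        ]));
        // Single-row batch: one column array per schema field.
        let columns: Vec<ArrayRef> = vec![
            Arc::new(StringArray::from(vec![ev.event_id.as_str()])),
            Arc::new(Int64Array::from(vec![ev.timestamp])),
            Arc::new(StringArray::from(vec![ev.build_request_id.as_str()])),
            Arc::new(StringArray::from(vec![ev.event_type.as_str()])),
            Arc::new(StringArray::from(vec![ev.payload_json.as_str()])),
        ];
        RecordBatch::try_new(schema, columns)
    }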

Current Status

  • Event serialization working: BuildEvent → RecordBatch conversion fully implemented and tested
  • Write API working: RecordBatchWriter::for_table() → write() → flush_and_commit() pattern implemented
  • Table creation solved: Separate Delta schema using StructField for table creation
  • Append functionality complete: Full end-to-end event writing with ACID transactions
  • Ready for Phase 3: Core Delta backend functionality complete and tested

Technical Achievement

  • Dual schema approach: Arrow schema for RecordBatch, Delta StructField schema for table creation
  • Automatic table creation: Creates Delta table on first append if it doesn't exist
  • ACID compliance: Uses Delta's transaction system for reliable writes
  • Type safety: Proper enum conversions and JSON serialization with error handling
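
A sketch of the append path following the RecordBatchWriter pattern described above; it assumes the table already exists (the real append_event() also creates the table on first use) and simplifies error handling.

    use deltalake::arrow::record_batch::RecordBatch;
    use deltalake::writer::{DeltaWriter, RecordBatchWriter};
    use deltalake::DeltaTableError;

    async fn append_batch(table_path: &str, batch: RecordBatch) -> Result<i64, DeltaTableError> {
        // Open the existing Delta table (creation-on-first-append is omitted here).
        let mut table = deltalake::open_table(table_path).await?;

        // RecordBatchWriter::for_table() -> write() -> flush_and_commit()
        let mut writer = RecordBatchWriter::for_table(&table)?;
        writer.write(batch).await?;
        let version = writer.flush_and_commit(&mut table).await?;
        Ok(version)
    }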

Phase 2: Event Writing Implementation (Original Plan)

Goal: Implement event append functionality with ACID guarantees

Deliverables

  • Full append_event() implementation
  • Event serialization to Delta schema format
  • Transaction handling for concurrent writes

Implementation Tasks

  1. Implement event-to-row conversion:

    • Convert BuildEvent to Delta row format
    • Handle all event type variants
    • Serialize complex fields (configs, manifests) as JSON strings
  2. Implement append_event() with Delta transactions:

    • Open Delta table
    • Convert event to row
    • Append with ACID transaction
    • Handle conflicts/retries
  3. Add helper functions for enum conversions and JSON serialization

Tests & Verification

  • Parity test: Write the same events to SQLite and Delta, verify identical results
  • Concurrent write test: Multiple writers don't corrupt data
  • All event types can be written and read back

Success Criteria

  • Events written to Delta match SQLite implementation exactly
  • Concurrent writes maintain ACID properties
  • No data loss or corruption under load

Phase 3: Native Query Implementation

Goal: Implement all read operations using native delta-rs scanning with in-memory processing

Deliverables

  • All query methods implemented with Arrow RecordBatch scanning
  • In-memory filtering and aggregation functions
  • Status tracking queries using Rust iterators

Implementation Tasks

  1. Implement core query methods using delta-rs table scanning:

    • get_build_request_events() - Scan table, filter by build_request_id in memory
    • get_partition_events() - Scan table, filter by partition_ref from the nested partition_event column
    • get_job_run_events() - Scan table, filter by job_run_id from the event columns that carry it
    • get_events_in_range() - Use timestamp column for efficient filtering
  2. Implement aggregation queries with in-memory processing:

    • get_latest_partition_status() - Scan events, group by partition, find latest (see the sketch after this list)
    • get_active_builds_for_partition() - Filter active build events
    • list_build_requests() - Aggregate build request events with pagination
    • list_recent_partitions() - Process partition events chronologically
  3. Implement helper functions for Arrow RecordBatch processing and JSON parsing
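
A minimal sketch of the get_latest_partition_status() aggregation referenced above, operating on a hypothetical decoded row type rather than raw Arrow batches:

    use std::collections::HashMap;

    // Hypothetical decoded row shape; the real code decodes these from Arrow batches.
    struct PartitionEventRow {
        partition_ref: String,
        timestamp: i64,
        status_name: String,
    }

    /// Latest known status per partition: scan all partition events, keep the newest.
    fn latest_partition_status(rows: &[PartitionEventRow]) -> HashMap<String, (i64, String)> {
        let mut latest: HashMap<String, (i64, String)> = HashMap::new();
        for row in rows {
            let entry = latest
                .entry(row.partition_ref.clone())
                .or_insert((i64::MIN, String::new()));
            if row.timestamp >= entry.0 {
                // Newer (or equal) event wins.
                *entry = (row.timestamp, row.status_name.clone());
            }
        }
        latest
    }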

Tests & Verification

  • Parity tests: All queries return same results as SQLite
  • Performance tests: Acceptable performance for expected data volumes
  • Memory usage tests: Ensure in-memory processing doesn't cause issues

Success Criteria

  • All read methods return identical results to SQLite implementation
  • Performance acceptable for small-to-medium datasets (can optimize later)
  • Correct handling of pagination and filters using Rust iterators

Phase 4: Schema Versioning

Goal: Support schema evolution alongside protobuf versions

Deliverables

  • Schema version tracking in Delta table properties
  • Migration path for schema updates
  • Backward compatibility guarantees

Implementation Tasks

  1. Add schema version to Delta table properties:

    • Store version in table metadata
    • Check version on table open
    • Handle version mismatches
  2. Create schema migration framework:

    • Define migration path from v1 to vN
    • Implement safe column additions
    • Handle nullable fields for backward compatibility
  3. Document schema evolution process
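
As a sketch of the version check on table open, the code below compares a stored version string against the version this binary supports; the property key name and the plain HashMap of properties are illustrative assumptions, not a settled design.

    use std::collections::HashMap;

    const SUPPORTED_SCHEMA_VERSION: u32 = 1;
    // Illustrative property key; the real key name is part of the Phase 4 design.
    const SCHEMA_VERSION_KEY: &str = "databuild.bel.schema_version";

    /// Validate the stored schema version against what this binary supports.
    fn check_schema_version(table_properties: &HashMap<String, String>) -> Result<u32, String> {
        let stored = table_properties
            .get(SCHEMA_VERSION_KEY)
            .ok_or_else(|| format!("missing {SCHEMA_VERSION_KEY} table property"))?
            .parse::<u32>()
            .map_err(|e| format!("invalid schema version: {e}"))?;
        if stored > SUPPORTED_SCHEMA_VERSION {
            return Err(format!(
                "table schema version {stored} is newer than supported version {SUPPORTED_SCHEMA_VERSION}"
            ));
        }
        Ok(stored)
    }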

Tests & Verification

  • Test reading v1 data with v2 code
  • Test schema migration process
  • Verify no data loss during migration

Success Criteria

  • Schema version tracked and validated
  • Safe migration path defined
  • Backward compatibility maintained

Phase 5: Integration and Polish

Goal: Complete integration with DataBuild system

Deliverables

  • Full test coverage and parity validation
  • Documentation updates
  • Performance benchmarking

Implementation Tasks

  1. Complete test suite:

    • Unit tests for all methods
    • Integration tests with mock data
    • Parity test suite comparing all backends
    • Memory usage and performance tests
  2. Update documentation:

    • Add Delta backend to README
    • Document URI format and limitations
    • Add deployment considerations
    • Document when to choose Delta vs SQLite
  3. Performance optimization:

    • Profile scanning and filtering operations
    • Optimize JSON parsing and Arrow processing
    • Add benchmarks against SQLite backend

Tests & Verification

  • Full test suite passes
  • Performance benchmarks complete
  • E2E tests work with Delta backend (future)

Success Criteria

  • Delta backend fully integrated and tested
  • Performance characteristics documented and acceptable
  • Clear migration path from SQLite documented

Future Enhancements

Cloud Storage Support

  • Add object_store dependency
  • Implement S3, Azure, GCS support
  • Handle authentication and credentials

Performance Optimizations

  • Implement columnar filtering before deserialization
  • Add Delta table partitioning by timestamp
  • Cache frequently accessed metadata
  • Optimize Arrow RecordBatch processing

Advanced Features

  • Delta table compaction and optimization
  • Time-based partition pruning
  • Change data feed for incremental processing
  • Support for Delta table ACID transactions

Risks and Mitigations

Risk: Query Performance

Mitigation: Start with simple implementation, profile actual usage, optimize based on real workload patterns

Risk: Schema Evolution Complexity

Mitigation: Start with simple versioning, require manual migration initially, automate as patterns emerge

Risk: Delta Library Maturity

Mitigation: Pin to stable version, thorough testing, maintain SQLite as fallback option

Dependencies

Required Crates

  • deltalake - Delta Lake implementation (includes Arrow support)

Future Crates

  • object_store - Cloud storage support (future)

Testing Strategy

Unit Tests

  • Test each method independently
  • Mock Delta table for fast tests
  • Verify event serialization

Integration Tests

  • Full lifecycle tests (write → read → aggregate)
  • Concurrent operation tests
  • Large dataset tests

Parity Tests

  • Compare Delta and SQLite outputs
  • Ensure identical behavior
  • Validate all edge cases
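
A parity test could look roughly like the sketch below; the constructor names, the sample_events() fixture, and the use of tokio::test are assumptions for illustration, not the actual DataBuild test code.

    // Illustrative parity test; constructor names, fixtures, and signatures are
    // assumptions, not the exact DataBuild test code.
    #[tokio::test]
    async fn delta_matches_sqlite_for_build_request_events() {
        let sqlite = SqliteBuildEventLog::new("sqlite::memory:").await.unwrap();
        let delta = DeltaBuildEventLog::new("delta:///tmp/parity-test-table").await.unwrap();

        // Write the same events to both backends.
        for event in sample_events() {
            sqlite.append_event(event.clone()).await.unwrap();
            delta.append_event(event).await.unwrap();
        }

        // Read back through the shared BuildEventLog trait and compare.
        let from_sqlite = sqlite.get_build_request_events("build-1").await.unwrap();
        let from_delta = delta.get_build_request_events("build-1").await.unwrap();
        assert_eq!(from_sqlite, from_delta);
    }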

Success Metrics

  1. Functional Parity: 100% of BuildEventLog trait methods implemented
  2. Test Coverage: >90% code coverage with comprehensive tests
  3. Performance: Query latency within 2x of SQLite for p95 queries
  4. Reliability: Zero data loss under concurrent load
  5. Compatibility: CLI and Service work identically with Delta backend