databuild/plans/08-integration-test-v2.md

Integration Test Plan for DataBuild Delegation System

Overview

Create comprehensive integration tests for the basic_graph example that trigger delegation scenarios and verify Build Event Log (BEL) entries to ensure the delegation system works correctly and provides proper traceability.

Current Test Infrastructure Analysis

Existing Pattern: The current test suite in /tests/end_to_end/ follows a mature pattern:

  • Common utilities: lib/test_utils.sh, lib/db_utils.sh, lib/service_utils.sh
  • Test isolation: Separate SQLite databases per test to prevent interference
  • CLI vs Service validation: Tests ensure both paths produce identical events
  • Event analysis: Detailed breakdown of job/partition/request event counts
  • Robust service management: Start/stop with proper cleanup and health checks
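
As a concrete illustration of the isolation pattern, a per-test database helper might look like the following sketch (the helper names and the `DATABUILD_DB_PATH` variable are assumptions for illustration, not the real lib/test_utils.sh API):

```shell
#!/usr/bin/env bash
# Minimal sketch of per-test SQLite isolation. Helper names and the
# DATABUILD_DB_PATH variable are hypothetical, not the real test_utils.sh API.
set -euo pipefail

setup_test_db() {
  # Give each test its own database file so runs cannot interfere.
  local test_name="$1"
  TEST_DB_DIR="$(mktemp -d)"
  TEST_DB="${TEST_DB_DIR}/${test_name}.sqlite"
  export DATABUILD_DB_PATH="$TEST_DB"   # assumed: points the CLI/service at this DB
}

cleanup_test_db() {
  rm -rf "$TEST_DB_DIR"
}

setup_test_db "basic_graph_delegation"
echo "isolated db: $DATABUILD_DB_PATH"
cleanup_test_db
```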

Target System: basic_graph example with two jobs:

  • generate_number_job: Produces partitions like generated_number/pippin
  • sum_job: Depends on multiple generated numbers, produces sum/pippin_salem_sadie

New Test Implementation Plan

1. Create Delegation-Specific Test: basic_graph_delegation_test.sh

Test Scenarios:

  • Historical Delegation: Run same partition twice, verify second run delegates to first
  • Multi-partition Jobs: Test delegation behavior when jobs produce multiple partitions
  • Mixed Availability: Test jobs where some target partitions exist, others don't
  • BEL Verification: Validate specific delegation events and job status transitions

Core Test Cases:

  1. Single Partition Historical Delegation

    • Build generated_number/pippin (first run - normal execution)
    • Build generated_number/pippin again (second run - should delegate)
    • Verify BEL contains: DelegationEvent + JOB_SKIPPED for second run
  2. Multi-Partition Delegation Scenarios

    • Build generated_number/pippin, generated_number/salem, generated_number/sadie
    • Build sum/pippin_salem_sadie (should delegate to existing partitions)
    • Verify delegation events point to correct source build requests
  3. Partial Delegation Scenario

    • Build generated_number/pippin, generated_number/salem
    • Request generated_number/pippin, generated_number/salem, generated_number/sadie
    • Verify: delegations for pippin/salem, normal execution for sadie
  4. Cross-Run Delegation Chain

    • Run 1: Build generated_number/pippin
    • Run 2: Build generated_number/salem
    • Run 3: Build sum/pippin_salem_sadie (sadie must be built fresh; pippin and salem should delegate)
    • Verify delegation traceability to correct source builds
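
The historical-delegation case (test case 1) could be driven roughly as follows. The `databuild build` invocation is an assumed CLI shape and is stubbed here so the flow is runnable end to end; the event names mirror the BEL events described above:

```shell
#!/usr/bin/env bash
# Sketch of test case 1: build the same partition twice and expect the
# second run to delegate. The `databuild` function is a stub standing in
# for the real CLI; its log lines mimic BEL events.
set -euo pipefail

BEL_LOG="$(mktemp)"

databuild() {  # stub for: databuild build <partition_ref>
  local partition="$2"
  if grep -q "JOB_COMPLETED ${partition}" "$BEL_LOG"; then
    # Partition already built once: delegate instead of re-running the job.
    echo "DelegationEvent ${partition}" >> "$BEL_LOG"
    echo "JobEvent JOB_SKIPPED ${partition}" >> "$BEL_LOG"
  else
    echo "JobEvent JOB_SCHEDULED ${partition}" >> "$BEL_LOG"
    echo "JobEvent JOB_COMPLETED ${partition}" >> "$BEL_LOG"
  fi
}

databuild build generated_number/pippin   # first run: normal execution
databuild build generated_number/pippin   # second run: should delegate

grep -q "DelegationEvent generated_number/pippin" "$BEL_LOG"
grep -q "JOB_SKIPPED generated_number/pippin" "$BEL_LOG"
echo "PASS: second run delegated instead of re-executing"
```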

2. BEL Validation Utilities

New functions in lib/db_utils.sh:

  • get_delegation_events(): Extract delegation events for specific partition
  • verify_job_skipped(): Check job was properly skipped with delegation
  • get_delegation_source_build(): Validate delegation points to correct build request
  • compare_delegation_behavior(): Compare CLI vs Service delegation consistency
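
Sketches of the first three helpers, assuming the BEL can be dumped one event per line (the dump format, line layout, and function bodies here are illustrative; the real helpers would query the SQLite BEL directly):

```shell
#!/usr/bin/env bash
# Illustrative versions of the proposed db_utils.sh helpers, operating on a
# plain-text BEL dump (one event per line). The line format is an assumption.
set -euo pipefail

get_delegation_events() {  # get_delegation_events <bel_dump> <partition_ref>
  grep "^DelegationEvent ${2} " "$1" || true
}

verify_job_skipped() {     # verify_job_skipped <bel_dump> <partition_ref>
  grep -q "^JobEvent JOB_SKIPPED ${2}$" "$1"
}

get_delegation_source_build() {  # prints the delegated-to build request id
  get_delegation_events "$1" "$2" | awk '{print $3}'
}

# Demo against a fabricated dump:
BEL_DUMP="$(mktemp)"
cat > "$BEL_DUMP" <<'EOF'
JobEvent JOB_COMPLETED generated_number/pippin
DelegationEvent generated_number/pippin build-req-001
JobEvent JOB_SKIPPED generated_number/pippin
EOF

verify_job_skipped "$BEL_DUMP" generated_number/pippin && echo "skipped: yes"
echo "source build: $(get_delegation_source_build "$BEL_DUMP" generated_number/pippin)"
```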

Event Validation Logic:

```shell
# For historical delegation, verify the event sequence:
# 1. DelegationEvent(partition_ref, delegated_to_build_request_id, message)
# 2. JobEvent(status=JOB_SKIPPED, message="Job skipped - all target partitions already available")
# 3. No JobEvent(JOB_SCHEDULED/RUNNING/COMPLETED) for the delegated job

# For successful delegation:
# - Success rate should be 100% (JOB_SKIPPED counts as success)
# - The partition should show as available without re-execution
# - The build request should complete successfully
```
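
This event-sequence check can be expressed as a single assertion function, again over an illustrative one-event-per-line dump (format and names are assumptions, not the real BEL schema):

```shell
#!/usr/bin/env bash
# Checks the historical-delegation sequence: a DelegationEvent and a
# JOB_SKIPPED must be present, and no scheduling/execution events may
# follow the delegation. Line format is illustrative.
set -euo pipefail

verify_delegation_sequence() {  # verify_delegation_sequence <bel_dump> <partition_ref>
  local dump="$1" part="$2"
  local tail_events
  # Keep only events from the DelegationEvent for this partition onward.
  tail_events="$(sed -n "/^DelegationEvent ${part//\//\\/}/,\$p" "$dump")"
  if [ -z "$tail_events" ]; then
    echo "FAIL: no DelegationEvent for ${part}"; return 1
  fi
  if ! echo "$tail_events" | grep -q "JOB_SKIPPED ${part}"; then
    echo "FAIL: no JOB_SKIPPED for ${part}"; return 1
  fi
  if echo "$tail_events" | grep -Eq "JOB_(SCHEDULED|RUNNING|COMPLETED) ${part}"; then
    echo "FAIL: delegated job still executed"; return 1
  fi
  echo "PASS: delegation sequence ok for ${part}"
}

BEL_DUMP="$(mktemp)"
cat > "$BEL_DUMP" <<'EOF'
JobEvent JOB_COMPLETED generated_number/pippin
DelegationEvent generated_number/pippin build-req-001
JobEvent JOB_SKIPPED generated_number/pippin
EOF
verify_delegation_sequence "$BEL_DUMP" generated_number/pippin
```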

3. Performance and Reliability Validation

Delegation Efficiency Tests:

  • Time comparison: first run vs delegated run (should be significantly faster)
  • Resource usage: ensure delegated runs don't spawn job processes
  • Concurrency: multiple builds requesting same partition simultaneously
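
The time comparison could follow this shape in the test script; the two `sleep` calls stand in for a real build and a delegated build so the sketch is self-contained, and the threshold reflects the plan's sub-5-second budget for delegated runs:

```shell
#!/usr/bin/env bash
# Sketch of the first-run vs delegated-run timing check. `sleep` is a
# stand-in for real builds; replace with actual databuild invocations.
set -euo pipefail

elapsed() {  # elapsed <command...> -> prints whole seconds taken
  local start end
  start="$(date +%s)"
  "$@"
  end="$(date +%s)"
  echo $(( end - start ))
}

first_run_s="$(elapsed sleep 2)"   # stand-in for a full build
delegated_s="$(elapsed sleep 0)"   # stand-in for a delegated build

echo "first=${first_run_s}s delegated=${delegated_s}s"
if [ "$delegated_s" -lt 5 ] && [ "$delegated_s" -lt "$first_run_s" ]; then
  echo "PASS: delegated run is fast"
fi
```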

Error Scenarios:

  • Source build request failure handling
  • Corrupted delegation data
  • Stale partition detection

4. Integration with Existing Test Suite

File Structure:

```
tests/end_to_end/
├── basic_graph_delegation_test.sh    # New delegation-specific tests
├── basic_graph_test.sh               # Existing functionality tests (enhanced)
├── lib/
│   ├── delegation_utils.sh           # New delegation validation utilities
│   ├── db_utils.sh                   # Enhanced with delegation functions
│   └── test_utils.sh                 # Existing utilities
└── BUILD                             # Updated to include new test
```

Bazel Integration:

  • Add basic_graph_delegation_test as new sh_test target
  • Include in run_e2e_tests.sh execution
  • Tag with ["delegation", "e2e"] for selective running

5. CLI vs Service Delegation Consistency

Validation Approach:

  • Run identical delegation scenarios through both CLI and Service
  • Compare BEL entries for identical delegation behavior
  • Ensure both paths produce same success rates and event counts
  • Validate API responses include delegation information
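
The consistency check can reduce to comparing normalized event summaries from the two runs. A sketch, with an illustrative dump format and the Service run faked by copying the CLI log:

```shell
#!/usr/bin/env bash
# Sketch of the CLI-vs-Service consistency check: summarize each BEL dump
# into sorted per-event-type counts and diff the summaries.
set -euo pipefail

summarize() {  # summarize <bel_dump> -> "count event-key" lines
  awk '{print $1, $2}' "$1" | sort | uniq -c | sort -k2
}

CLI_LOG="$(mktemp)"; SVC_LOG="$(mktemp)"
printf '%s\n' \
  "JobEvent JOB_COMPLETED generated_number/pippin" \
  "DelegationEvent generated_number/pippin build-req-001" \
  "JobEvent JOB_SKIPPED generated_number/pippin" > "$CLI_LOG"
cp "$CLI_LOG" "$SVC_LOG"   # a real test would produce this via the Service path

if diff <(summarize "$CLI_LOG") <(summarize "$SVC_LOG") >/dev/null; then
  echo "PASS: CLI and Service delegation events match"
else
  echo "FAIL: delegation behavior diverged"; exit 1
fi
```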

6. Documentation and Debugging Support

Test Output Enhancement:

  • Clear delegation event logging during test execution
  • Detailed failure diagnostics showing expected vs actual delegation behavior
  • BEL dump utilities for debugging delegation issues
  • Performance metrics (execution time, event counts)

Expected Outcomes

Success Criteria:

  1. 100% Success Rate: Delegated builds show 100% success rate in dashboard
  2. Event Consistency: CLI and Service produce identical delegation events
  3. Traceability: All delegations link to correct source build requests
  4. Performance: Delegated runs complete in <5 seconds vs 30+ seconds for full execution
  5. Multi-partition Correctness: Complex jobs with mixed partition availability handled properly

Regression Prevention:

  • Automated validation prevents delegation system regressions
  • Comprehensive BEL verification ensures audit trail integrity
  • Performance benchmarks detect delegation efficiency degradation

Implementation Priority

  1. High: Core delegation test cases (historical, multi-partition)
  2. High: BEL validation utilities and event verification
  3. Medium: Performance benchmarking and efficiency validation
  4. Medium: Error scenario testing and edge cases
  5. Low: Advanced concurrency and stress testing

This plan provides a comprehensive testing strategy that validates both the functional correctness and performance benefits of the delegation system while ensuring long-term reliability and debuggability.

Implementation Notes

This plan was created following the user's request to improve system reliability and testability for the DataBuild delegation system. The focus is on the basic_graph example because it provides a simpler, more predictable test environment compared to the podcast_reviews example, while still covering all the essential delegation scenarios.

The delegation system currently shows some issues (a 67% success rate instead of 100%) that these tests should help diagnose, and guard against regressing once fixed. The comprehensive BEL validation will ensure that delegation events provide the audit trail and traceability intended by the system design.