Stuart Axelbrooke 4f05192229 Pre-typescript mithril commit		2025-07-20 17:54:32 -07:00
DataBuild End-to-End Tests

This directory contains end-to-end tests for DataBuild. They check that CLI and Service builds behave consistently across the example graphs.

Quick Start

To run all end-to-end tests:

# From the root of the databuild repository
./run_e2e_tests.sh

To run just the Bazel-integrated validation test:

bazel test //tests/end_to_end:e2e_runner_test

To run all tests (including core DataBuild tests):

bazel test //...

Test Coverage

Basic Graph Tests

Single Partition Build: CLI vs Service for generated_number/pippin
Multiple Partition Build: CLI vs Service for multiple partitions
Sum Partition Build: Tests dependency resolution with sum/pippin_salem_sadie
Event Validation: Compares build events between CLI and Service
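
As a minimal sketch of what an event comparison could look like, the helper below compares the set of distinct event types from two runs rather than the raw sequences. It assumes each run's events have already been exported to a text file with one event type per line. The function name `same_event_types` and that file layout are hypothetical and are not taken from the test library in lib/.

```shell
# Hypothetical helper: succeed when two files list the same set of event types.
# Assumes one event type per line; order and duplicates are ignored.
same_event_types() {
  cli_types="$(sort -u "$1")"
  service_types="$(sort -u "$2")"
  [ "$cli_types" = "$service_types" ]
}
```

A caller might write `same_event_types cli_events.txt service_events.txt || echo "event types differ"`, where both file names are placeholders. Comparing sets rather than sequences tolerates runs that emit a different number of events.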

Podcast Reviews Tests

Simple Pipeline: CLI build for reviews/date=2020-01-01
Complex Pipeline: Multi-stage data pipeline validation
Directory Dependencies: Tests jobs that require specific working directories

Validation Tests

Build Event Logging: Verifies SQLite database creation and event storage
Service API: Tests HTTP API endpoints and responses
Consistency: Ensures CLI and Service produce similar results

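A database-creation check can be sketched with plain shell. The helpers below assume the event log is configured with a URL of the form `sqlite://<path>`, as in the `DATABUILD_BUILD_EVENT_LOG` example under Manual Testing. The function names are hypothetical, and the check only confirms that a non-empty file exists. It does not inspect the stored events, because the database schema is not described here.

```shell
# Hypothetical helper: turn a sqlite:// URL into a filesystem path.
event_log_path() {
  printf '%s\n' "${1#sqlite://}"
}

# Succeed only if the database file exists and is non-empty.
event_log_exists() {
  [ -s "$(event_log_path "$1")" ]
}
```

After a build, `event_log_exists "$DATABUILD_BUILD_EVENT_LOG"` would then report whether the build wrote anything to disk.
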
Test Architecture

tests/end_to_end/
├── README.md                    # This file
├── BUILD                        # Bazel test targets
├── validate_runner.sh           # Simple validation test
├── simple_test.sh              # Working basic test
├── basic_graph_test.sh         # Comprehensive basic graph tests
├── podcast_reviews_test.sh     # Comprehensive podcast reviews tests
└── lib/
    ├── test_utils.sh           # Common test utilities
    ├── db_utils.sh             # Database comparison utilities
    └── service_utils.sh        # Service management utilities

Key Findings

Partition Format: Basic graph partition references take the form generated_number/pippin, not the bare name pippin
Service Configuration: Services use hardcoded database paths in their wrapper scripts
API Response Format: Service returns build_request_id and lowercase status values
Working Directory: Podcast reviews jobs must run from their package directory
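
To illustrate the API response finding, here is a rough sketch of pulling string fields out of a response body with `sed`. The sample body, the `status` key name, and the value `completed` are made up for illustration; only the `build_request_id` field name and the lowercase convention come from the findings above. The helper `json_string_field` is hypothetical and only handles flat, single-line JSON, so a proper JSON parser is the better choice for anything more involved.

```shell
# Hypothetical helper: print the string value of a key in a flat JSON object.
# Usage: json_string_field KEY JSON_TEXT
json_string_field() {
  printf '%s\n' "$2" |
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Made-up response body; real field values will differ.
response='{"build_request_id":"example-id-123","status":"completed"}'
json_string_field build_request_id "$response"
json_string_field status "$response"
```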

Test Results

The tests demonstrate successful end-to-end functionality:

✅ CLI Build: Generates proper build events (10 events for basic graph)
✅ Service Build: Responds correctly to HTTP API requests (14 events for basic graph)
✅ Event Consistency: Both paths emit their expected events, although the totals differ (10 for CLI, 14 for Service)
✅ Complex Pipelines: Podcast reviews pipeline executes successfully
✅ Database Isolation: Separate databases prevent test interference
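
One way to get per-test database isolation is to point `DATABUILD_BUILD_EVENT_LOG` at a fresh temporary directory for each test. The sketch below assumes the `sqlite://<path>` URL form shown under Manual Testing. The helper names, the `TEST_DB_DIR` variable, and the `events.db` file name are hypothetical, and the actual scripts may isolate databases differently.

```shell
# Hypothetical helper: give the current test its own event-log database.
# Sets TEST_DB_DIR and exports DATABUILD_BUILD_EVENT_LOG.
use_isolated_db() {
  TEST_DB_DIR="$(mktemp -d)"
  export DATABUILD_BUILD_EVENT_LOG="sqlite://${TEST_DB_DIR}/events.db"
}

# Remove the directory created by use_isolated_db, if any.
remove_isolated_db() {
  if [ -n "${TEST_DB_DIR:-}" ]; then
    rm -rf "$TEST_DB_DIR"
    TEST_DB_DIR=""
  fi
}
```

A test would call `use_isolated_db` before building and `remove_isolated_db` afterwards, for example from a `trap ... EXIT` handler.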

Manual Testing

You can also run individual tests manually:

# Test basic graph (start from the repository root)
cd examples/basic_graph
bazel build //:basic_graph.build //:basic_graph.service
../../tests/end_to_end/simple_test.sh \
  bazel-bin/basic_graph.build \
  bazel-bin/basic_graph.service

# Test podcast reviews CLI (start from the repository root again)
cd examples/podcast_reviews
bazel build //:podcast_reviews_graph.build
export DATABUILD_BUILD_EVENT_LOG="sqlite:///tmp/test.db"
bazel-bin/podcast_reviews_graph.build "reviews/date=2020-01-01"

Integration with CI/CD

The tests are designed to integrate with CI/CD systems:

Bazel Integration: bazel test //... runs validation tests
Shell Script: ./run_e2e_tests.sh provides standalone execution
Exit Codes: Scripts exit non-zero on failure so automation can detect a failed run
Cleanup: Automatic cleanup of test processes and files
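
The exit-code behavior can be sketched as a small runner that executes each test command and returns non-zero if any of them failed. This is an illustration only: `run_tests` is a hypothetical function, not the contents of run_e2e_tests.sh.

```shell
# Hypothetical runner: execute each command, report the result, and
# return non-zero if any of them failed.
run_tests() {
  failed=0
  for test_cmd in "$@"; do
    if "$test_cmd"; then
      echo "PASS: $test_cmd"
    else
      echo "FAIL: $test_cmd"
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

Every command runs even after a failure, so one run reports all failing tests, and the final status is what a CI system would act on. Cleanup of background processes and temporary files would typically be registered separately with `trap` so that it runs on both success and failure.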