DataBuild
A Bazel-based data build system.
For important context, read the manifesto and the core concepts. See databuild.proto for the key system interfaces.
Testing
Quick Test
Run the comprehensive end-to-end test suite:
./run_e2e_tests.sh
Core Unit Tests
# Run all core DataBuild tests
./scripts/bb_test_all
# Remote testing
./scripts/bb_remote_test_all
Manual Testing
# Test basic graph CLI build (start from the repository root)
cd examples/basic_graph
bazel run //:basic_graph.build -- "generated_number/pippin"
# Test podcast reviews CLI build (start from the repository root)
cd examples/podcast_reviews
bazel run //:podcast_reviews_graph.build -- "reviews/date=2020-01-01"
# Test service builds (run from examples/basic_graph)
bazel run //:basic_graph.service -- --port=8080
# Then in another terminal:
curl -X POST -H "Content-Type: application/json" \
-d '{"partitions": ["generated_number/pippin"]}' \
http://localhost:8080/api/v1/builds
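When requesting several partitions at once, it can help to generate the request body instead of writing it by hand. The following is a minimal sketch, not part of DataBuild: `build_request_payload` is a hypothetical helper, and it assumes the body shape shown in the curl command above and partition refs that contain no double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DataBuild): print a JSON build request
# body for the partition refs given as arguments.
# Assumption: refs contain no double quotes or backslashes, so no JSON
# escaping is performed.
build_request_payload() {
  local json='{"partitions": ['
  local sep=''
  local ref
  for ref in "$@"; do
    # Append each ref as a quoted JSON string, comma-separated.
    json="${json}${sep}\"${ref}\""
    sep=', '
  done
  printf '%s]}\n' "$json"
}
```

The output can be passed straight to curl, for example `curl -X POST -H "Content-Type: application/json" -d "$(build_request_payload generated_number/pippin)" http://localhost:8080/api/v1/builds`.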
Event Validation Tests
The end-to-end tests validate that CLI and Service builds emit identical events:
- Event count alignment: CLI and Service must generate the same total event count
- Event type breakdown: Job, partition, and build_request events must match exactly
- Event consistency: Both interfaces represent the same logical build process
Example test output:
Event breakdown:
Job events: CLI=2, Service=2
Partition events: CLI=3, Service=3
Request events: CLI=9, Service=9
✅ All build events (job, partition, and request) are identical
✅ Total event counts are identical: 14 events each
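The comparison described above can be approximated by hand. The sketch below is illustrative only and is not how run_e2e_tests.sh is implemented: both function names are hypothetical, and it assumes each interface's events have already been exported to a text file with one event type per line (for example `job`, `partition`, `build_request`). The real event log format may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise an event file as "<type>=<count>" lines.
# Assumption: the file holds one event type per line.
count_events_by_type() {
  sort "$1" | uniq -c | awk '{print $2 "=" $1}'
}

# Hypothetical helper: succeed only if both files contain the same number
# of events of each type (order within the files does not matter).
compare_event_counts() {
  local cli_log="$1" service_log="$2"
  if [ "$(count_events_by_type "$cli_log")" = "$(count_events_by_type "$service_log")" ]; then
    # grep -c '' counts lines, including a final line without a newline.
    echo "identical: $(grep -c '' "$cli_log") events each"
  else
    echo "mismatch"
    return 1
  fi
}
```

Usage would look like `compare_event_counts cli_events.txt service_events.txt`, where both file names are placeholders for whatever export you produce.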