# DataBuild

A bazel-based data build system.

For important context, check out [the manifesto](./manifesto.md), and [core concepts](./core-concepts.md). Also, check out [`databuild.proto`](./databuild/databuild.proto) for key system interfaces.

## Testing

### Quick Test

Run the comprehensive end-to-end test suite:

```bash
./run_e2e_tests.sh
```

### Core Unit Tests

```bash
# Run all core DataBuild tests
./scripts/bb_test_all

# Remote testing
./scripts/bb_remote_test_all
```

### Manual Testing

```bash
# Test basic graph CLI build
cd examples/basic_graph
bazel run //:basic_graph.build -- "generated_number/pippin"

# Test podcast reviews CLI build
cd examples/podcast_reviews
bazel run //:podcast_reviews_graph.build -- "reviews/date=2020-01-01"

# Test service builds
bazel run //:basic_graph.service -- --port=8080

# Then in another terminal:
curl -X POST -H "Content-Type: application/json" \
  -d '{"partitions": ["generated_number/pippin"]}' \
  http://localhost:8080/api/v1/builds
```

### Event Validation Tests

The end-to-end tests validate that CLI and Service builds emit identical events:

- **Event count alignment**: CLI and Service must generate the same total event count
- **Event type breakdown**: Job, partition, and build_request events must match exactly
- **Event consistency**: Both interfaces represent the same logical build process

Example test output:

```
Event breakdown:
  Job events: CLI=2, Service=2
  Partition events: CLI=3, Service=3
  Request events: CLI=9, Service=9
✅ All build events (job, partition, and request) are identical
✅ Total event counts are identical: 14 events each
```
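
The per-type comparison described under Event Validation Tests can be sketched roughly as follows. This is an illustration only, not the logic in `run_e2e_tests.sh`: the event shape (a dict with a `"type"` key) and the helper names `event_breakdown` and `compare_events` are assumptions made for the example.

```python
from collections import Counter


def event_breakdown(events):
    """Count events per type.

    Assumed shape: each event is a dict with a "type" key
    (hypothetical; the real event schema may differ).
    """
    return Counter(e["type"] for e in events)


def compare_events(cli_events, service_events):
    """Return human-readable mismatches; an empty list means the runs align."""
    cli = event_breakdown(cli_events)
    svc = event_breakdown(service_events)
    mismatches = []
    # Check every type seen by either interface, so a type that is
    # missing entirely on one side is reported as well.
    for event_type in sorted(set(cli) | set(svc)):
        if cli[event_type] != svc[event_type]:
            mismatches.append(
                f"{event_type}: CLI={cli[event_type]}, Service={svc[event_type]}"
            )
    return mismatches


# Sample data mirroring the counts in the example output:
# 2 job, 3 partition, and 9 build_request events (14 in total).
sample = (
    [{"type": "job"}] * 2
    + [{"type": "partition"}] * 3
    + [{"type": "build_request"}] * 9
)

print(compare_events(sample, sample))       # identical runs
print(compare_events(sample, sample[:-1]))  # Service is missing one request event
```

Comparing per-type counts rather than only the total catches the case where two runs emit the same number of events but with a different mix of types.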