databuild/README.md
Stuart Axelbrooke b27a249b09
Add testing details
2025-07-07 22:42:59 -07:00

DataBuild

A Bazel-based data build system.

For important context, check out the manifesto and core concepts. See databuild.proto for the key system interfaces.

Testing

Quick Test

Run the comprehensive end-to-end test suite:

./run_e2e_tests.sh

Core Unit Tests

# Run all core DataBuild tests
./scripts/bb_test_all

# Remote testing
./scripts/bb_remote_test_all

Manual Testing

# Test basic graph CLI build
cd examples/basic_graph
bazel run //:basic_graph.build -- "generated_number/pippin"

# Test podcast reviews CLI build
cd examples/podcast_reviews
bazel run //:podcast_reviews_graph.build -- "reviews/date=2020-01-01"

# Test service builds (run from examples/basic_graph, where the basic_graph targets are defined)
bazel run //:basic_graph.service -- --port=8080
# Then in another terminal:
curl -X POST -H "Content-Type: application/json" \
  -d '{"partitions": ["generated_number/pippin"]}' \
  http://localhost:8080/api/v1/builds
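
For scripted checks, the same request can be assembled from Python. This is a minimal sketch using only the standard library. The helper name `make_build_request` is hypothetical; the endpoint and payload shape are copied from the curl command above. The response format is not described here, so the sketch only constructs the request and leaves sending it commented out.

```python
import json
import urllib.request


def make_build_request(partitions, base_url="http://localhost:8080"):
    """Build (but do not send) the POST request from the curl example.

    `make_build_request` is a hypothetical helper, not part of DataBuild.
    """
    body = json.dumps({"partitions": list(partitions)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/builds",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = make_build_request(["generated_number/pippin"])

# To send it against a running service:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```

Keeping construction separate from sending makes the payload easy to inspect before a service is running.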

Event Validation Tests

The end-to-end tests validate that CLI and Service builds emit identical events:

  • Event count alignment: CLI and Service must generate the same total event count
  • Event type breakdown: Job, partition, and build_request events must match exactly
  • Event consistency: Both interfaces represent the same logical build process

Example test output:

Event breakdown:
  Job events:       CLI=2, Service=2
  Partition events: CLI=3, Service=3
  Request events:   CLI=9, Service=9
✅ All build events (job, partition, and request) are identical
✅ Total event counts are identical: 14 events each
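
The comparison described above can be sketched as follows. This is an illustration, not the logic in run_e2e_tests.sh: it assumes each event has already been reduced to a type label (`job`, `partition`, or `build_request`), and the function name `compare_event_counts` is hypothetical.

```python
from collections import Counter


def compare_event_counts(cli_types, service_types):
    """Compare two lists of event-type labels.

    Returns a dict of per-type mismatches {type: (cli_count, service_count)}
    and a bool saying whether the total event counts agree.
    """
    cli, svc = Counter(cli_types), Counter(service_types)
    mismatches = {
        t: (cli[t], svc[t])
        for t in sorted(set(cli) | set(svc))
        if cli[t] != svc[t]
    }
    totals_match = sum(cli.values()) == sum(svc.values())
    return mismatches, totals_match


# Counts taken from the example output: 2 job, 3 partition, 9 request events.
events = ["job"] * 2 + ["partition"] * 3 + ["build_request"] * 9
mismatches, totals_match = compare_event_counts(events, list(events))
```

Checking the per-type breakdown as well as the total matters because two runs can produce the same number of events while differing in type, for example two job events on one side against one job and one partition event on the other.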