DataBuild

A Bazel-based data build system.

For important context, see the manifesto (manifesto.md) and the core concepts (core-concepts.md). For the key system interfaces, see databuild.proto.

Testing

Quick Test

Run the comprehensive end-to-end test suite:

./run_e2e_tests.sh

Core Unit Tests

# Run all core DataBuild tests
./scripts/bb_test_all

# Remote testing
./scripts/bb_remote_test_all

Manual Testing

# Test basic graph CLI build (run from the repository root)
cd examples/basic_graph
bazel run //:basic_graph.build -- "generated_number/pippin"

# Test podcast reviews CLI build (run from the repository root)
cd examples/podcast_reviews
bazel run //:podcast_reviews_graph.build -- "reviews/date=2020-01-01"

# Test service builds (run from examples/basic_graph)
bazel run //:basic_graph.service -- --port=8080
# Then, in another terminal:
curl -X POST -H "Content-Type: application/json" \
  -d '{"partitions": ["generated_number/pippin"]}' \
  http://localhost:8080/api/v1/builds
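
When scripting the request above for several partitions, the JSON body can be assembled from a list of partition references. The following is a minimal bash sketch: `build_request_body` is a hypothetical helper, not part of DataBuild, and it assumes the references contain no characters that need JSON escaping (such as quotes or backslashes).

```shell
# Hypothetical helper (not part of DataBuild): print the JSON body for
# POST /api/v1/builds, given one or more partition references as arguments.
# Assumption: references need no JSON escaping.
build_request_body() {
  local body='{"partitions": ['
  local sep=''
  local ref
  for ref in "$@"; do
    body+="${sep}\"${ref}\""
    sep=', '
  done
  body+=']}'
  printf '%s\n' "$body"
}

# Usage, with the service from the step above listening on port 8080:
#   curl -X POST -H "Content-Type: application/json" \
#     -d "$(build_request_body "generated_number/pippin")" \
#     http://localhost:8080/api/v1/builds
```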

Event Validation Tests

The end-to-end tests validate that CLI and Service builds emit identical events:

  • Event count alignment: CLI and Service must generate the same total event count
  • Event type breakdown: Job, partition, and build_request events must match exactly
  • Event consistency: Both interfaces represent the same logical build process

Example test output:

Event breakdown:
  Job events:       CLI=2, Service=2
  Partition events: CLI=3, Service=3
  Request events:   CLI=9, Service=9
✅ All build events (job, partition, and request) are identical
✅ Total event counts are identical: 14 events each
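
The comparison described above can be sketched in bash roughly as follows. This is an illustration, not the logic in run_e2e_tests.sh: it assumes each interface's events have already been exported to a text file with one event type per line (for example `job`, `partition`, `build_request`), and the function names `count_events` and `compare_events` are hypothetical.

```shell
# Sketch only. Assumed input: one event type per line in each file.
# The real event log format is not described in this README.

# Print "type count" pairs, sorted by type, for one event file.
count_events() {
  sort "$1" | uniq -c | awk '{print $2, $1}'
}

# Print "identical" and return 0 when both files have the same per-type
# counts; print "mismatch" and return 1 otherwise. Matching per-type
# counts also imply matching totals.
compare_events() {
  local cli_counts service_counts
  cli_counts=$(count_events "$1")
  service_counts=$(count_events "$2")
  if [ "$cli_counts" = "$service_counts" ]; then
    echo "identical"
  else
    echo "mismatch"
    return 1
  fi
}
```

Comparing per-type counts rather than only the totals catches the case where both interfaces emit the same number of events but with a different mix of types.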