databuild/README.md
Stuart Axelbrooke b27a249b09
Add testing details
2025-07-07 22:42:59 -07:00

# DataBuild
A Bazel-based data build system.

For important context, check out [the manifesto](./manifesto.md), and [core concepts](./core-concepts.md). Also, check out [`databuild.proto`](./databuild/databuild.proto) for key system interfaces.
## Testing
### Quick Test
Run the comprehensive end-to-end test suite:
```bash
./run_e2e_tests.sh
```
### Core Unit Tests
```bash
# Run all core DataBuild tests
./scripts/bb_test_all
# Remote testing
./scripts/bb_remote_test_all
```
### Manual Testing
```bash
# Test basic graph CLI build (start from the repository root)
cd examples/basic_graph
bazel run //:basic_graph.build -- "generated_number/pippin"

# Test podcast reviews CLI build (sibling example directory)
cd ../podcast_reviews
bazel run //:podcast_reviews_graph.build -- "reviews/date=2020-01-01"

# Test service builds (basic_graph.service is assumed to live alongside
# basic_graph.build in examples/basic_graph)
cd ../basic_graph
bazel run //:basic_graph.service -- --port=8080

# Then in another terminal:
curl -X POST -H "Content-Type: application/json" \
  -d '{"partitions": ["generated_number/pippin"]}' \
  http://localhost:8080/api/v1/builds
```
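When requesting several partitions at once, writing the JSON body by hand gets error-prone. The sketch below builds the same request shape as the `curl` call above from a list of partition refs. `build_payload` is a hypothetical helper, not part of DataBuild, and it assumes refs contain no double quotes or backslashes, since it does no JSON escaping.

```bash
# Hypothetical helper (not part of DataBuild): prints a request body of the
# form {"partitions": [...]} from the partition refs given as arguments.
# Assumption: refs contain no double quotes or backslashes (no JSON escaping).
build_payload() (
  json='{"partitions": ['
  sep=''
  for ref in "$@"; do
    json="${json}${sep}\"${ref}\""
    sep=', '
  done
  printf '%s]}\n' "$json"
)

# Usage against a locally running service (port taken from the example above):
#   curl -X POST -H "Content-Type: application/json" \
#     -d "$(build_payload generated_number/pippin)" \
#     http://localhost:8080/api/v1/builds
```

The function body runs in a subshell, so its working variables do not leak into the calling shell.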
### Event Validation Tests
The end-to-end tests validate that CLI and Service builds emit identical events:
- **Event count alignment**: CLI and Service must generate the same total event count
- **Event type breakdown**: Job, partition, and build_request events must match exactly
- **Event consistency**: Both interfaces represent the same logical build process
Example test output:
```
Event breakdown:
Job events: CLI=2, Service=2
Partition events: CLI=3, Service=3
Request events: CLI=9, Service=9
✅ All build events (job, partition, and request) are identical
✅ Total event counts are identical: 14 events each
```
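The alignment checks above can be sketched in a few lines of shell. This is only an illustration of the idea, not the logic in `run_e2e_tests.sh`: it assumes a hypothetical input format in which each interface's events have been dumped to a file with one event type per line (for example `job`, `partition`, `build_request`). The function names are made up for this sketch.

```bash
# Hypothetical input: a file with one event type per line.
# count_events prints "<type>=<count>" lines, sorted by type.
count_events() {
  sort "$1" | uniq -c | awk '{ print $2 "=" $1 }'
}

# events_match succeeds only when both files have identical per-type counts,
# which also implies identical total event counts.
events_match() {
  [ "$(count_events "$1")" = "$(count_events "$2")" ]
}

# Usage (file names are placeholders):
#   events_match cli_events.txt service_events.txt && echo "events are identical"
```

Comparing the per-type breakdown rather than only the totals catches cases where the two interfaces emit the same number of events but of different kinds.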