# Multi-Hop Dependency Example

This example demonstrates DataBuild's ability to handle multi-hop dependencies between jobs.

## Overview

The example consists of two jobs:

- **job_alpha**: Produces the `data/alpha` partition
- **job_beta**: Depends on `data/alpha` and produces `data/beta`

When you request `data/beta`:

1. The beta job runs and detects that its `data/alpha` dependency is missing
2. The orchestrator creates a want for `data/alpha`
3. The alpha job runs and produces `data/alpha`
4. The beta job runs again and succeeds, producing `data/beta`

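
The flow above can be sketched as a small, self-contained shell simulation. It does not call DataBuild at all. The function names (`job_alpha`, `job_beta`, `job_for`, `resolve_want`), the marker files under a temporary directory, and the use of exit code 2 to signal a dependency miss are all assumptions made for illustration, not DataBuild's actual job protocol.

```bash
#!/usr/bin/env bash
# Toy simulation of the multi-hop flow. NOT DataBuild's real protocol.
# Partitions are modeled as marker files under a temporary directory.

STATE_DIR="$(mktemp -d)"

# Hypothetical convention: exit code 2 means "a dependency is missing".
DEP_MISS=2

partition_exists() {
    [ -e "$STATE_DIR/$1" ]
}

mark_live() {
    mkdir -p "$(dirname "$STATE_DIR/$1")"
    touch "$STATE_DIR/$1"
}

# job_alpha has no upstream dependencies, so it always succeeds.
job_alpha() {
    mark_live "data/alpha"
}

# job_beta needs data/alpha. If it is absent, the job prints the missing
# partition on stdout and returns the dependency-miss code.
job_beta() {
    if ! partition_exists "data/alpha"; then
        echo "data/alpha"
        return "$DEP_MISS"
    fi
    mark_live "data/beta"
}

# Map a partition to the (hypothetical) job that produces it.
job_for() {
    case "$1" in
        data/alpha) echo job_alpha ;;
        data/beta)  echo job_beta ;;
        *)          return 1 ;;
    esac
}

# Run the producing job; on a dependency miss, resolve a derivative want
# for the missing partition first, then retry the original job.
resolve_want() {
    local partition="$1" job missing status
    job="$(job_for "$partition")" || return 1
    while true; do
        echo "run $job for $partition"
        missing="$("$job")"
        status=$?
        if [ "$status" -eq 0 ]; then
            echo "live $partition"
            return 0
        elif [ "$status" -eq "$DEP_MISS" ]; then
            echo "miss $missing -> derivative want"
            resolve_want "$missing" || return 1
        else
            return 1
        fi
    done
}

resolve_want "data/beta"
# Prints:
#   run job_beta for data/beta
#   miss data/alpha -> derivative want
#   run job_alpha for data/alpha
#   live data/alpha
#   run job_beta for data/beta
#   live data/beta

rm -rf "$STATE_DIR"
```

The beta job appears twice in the output, once for the dependency miss and once for the successful retry, which mirrors steps 1 and 4 of the list.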
## Running the Example

From the repository root:

```bash
# Build the CLI
bazel build //databuild:databuild_cli

# Clean up any previous state
rm -f /tmp/databuild_multihop*.db /tmp/databuild_multihop_alpha_complete

# Start the server with the multihop configuration
./bazel-bin/databuild/databuild_cli serve \
    --port 3050 \
    --database /tmp/databuild_multihop.db \
    --config examples/multihop/config.json
```

In another terminal, create a want for `data/beta`:

```bash
# Create a want for data/beta (which will trigger the dependency chain)
./bazel-bin/databuild/databuild_cli --server http://localhost:3050 \
    want data/beta

# Watch the wants
./bazel-bin/databuild/databuild_cli --server http://localhost:3050 \
    wants list

# Watch the job runs
./bazel-bin/databuild/databuild_cli --server http://localhost:3050 \
    job-runs list

# Watch the partitions
./bazel-bin/databuild/databuild_cli --server http://localhost:3050 \
    partitions list
```

## Expected Behavior

1. An initial want for `data/beta` is created
2. The beta job runs, detects that `data/alpha` is missing, and reports a dependency miss
3. The orchestrator creates a derivative want for `data/alpha`
4. The alpha job runs and succeeds
5. The beta job runs again and succeeds
6. Both partitions are now in the `Live` state

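
To confirm the final state without rerunning the list command by hand, a polling helper along these lines may be convenient. It is only a sketch: `wait_for_live` and the `POLL_INTERVAL` variable are hypothetical. It also assumes that `partitions list` prints each partition on a single line together with its state, which this README does not confirm. Adjust the `grep` patterns to match the real output.

```bash
#!/usr/bin/env bash
# Hypothetical helper: poll a listing command until a partition shows as Live.
# ASSUMPTION: each output line contains the partition name and its state.

# Usage: wait_for_live <partition> <max_attempts> <command> [args...]
wait_for_live() {
    local partition="$1" attempts="$2" i=0
    shift 2
    while [ "$i" -lt "$attempts" ]; do
        # Run the listing command and look for a line that mentions both
        # the partition and the Live state.
        if "$@" 2>/dev/null | grep -F -- "$partition" | grep -q "Live"; then
            return 0
        fi
        i=$((i + 1))
        sleep "${POLL_INTERVAL:-1}"
    done
    return 1
}

# Example (requires the server from this README to be running):
# wait_for_live data/beta 30 \
#     ./bazel-bin/databuild/databuild_cli --server http://localhost:3050 \
#     partitions list
```

The helper returns 0 as soon as a matching line appears and 1 once the attempts are exhausted, so it can gate a follow-up step in a script.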
## Configuration Format

The example uses JSON format (`config.json`), but TOML is also supported. Here's the equivalent TOML:

```toml
[[jobs]]
label = "//examples/multihop:job_alpha"
entrypoint = "./examples/multihop/job_alpha.sh"
partition_patterns = ["data/alpha"]

[jobs.environment]
JOB_NAME = "alpha"

[[jobs]]
label = "//examples/multihop:job_beta"
entrypoint = "./examples/multihop/job_beta.sh"
partition_patterns = ["data/beta"]

[jobs.environment]
JOB_NAME = "beta"
```