Compare commits

...

5 commits

35 changed files with 394 additions and 8878 deletions


@@ -1,65 +1,43 @@
# Claude Instructions
# Agent Instructions
## Project Overview
DataBuild is a bazel-based data build system. Key files:
- [`DESIGN.md`](./DESIGN.md) - Overall design of databuild
- [`databuild.proto`](databuild/databuild.proto) - System interfaces
- [`manifesto.md`](manifesto.md) - Project philosophy
- [`core-concepts.md`](core-concepts.md) - Core concepts
- Component designs - design docs for specific aspects or components of databuild:
- [Core build](./design/core-build.md) - How the core semantics of databuild works and are implemented
- [Build event log](./design/build-event-log.md) - How the build event log works and is accessed
- [Service](./design/service.md) - How the databuild HTTP service and web app are designed.
- [Glossary](./design/glossary.md) - Centralized description of key terms.
- [Graph specification](./design/graph-specification.md) - Describes the different libraries that enable more succinct declaration of databuild applications than the core bazel-based interface.
- [Observability](./design/observability.md) - How observability is systematically achieved throughout databuild applications.
- [Deploy strategies](./design/deploy-strategies.md) - Different strategies for deploying databuild applications.
- [Triggers](./design/triggers.md) - How triggering works in databuild applications.
- [Why databuild?](./design/why-databuild.md) - Why to choose databuild over other, better-established orchestration solutions.
Please reference these for any related work, as they indicate key technical bias/direction of the project.
## Tenets
- Declarative over imperative wherever possible/reasonable.
- We are building for the future, and choose to do "the right thing" rather than taking shortcuts to get unstuck. If you get stuck, pause and ask for help/input.
- Do not add "unknown" results when parses or matches fail - these should always throw.
- Compile-time correctness is a super-power; investing in it speeds up the flywheel of development and user value.
## Build & Test
```bash
# Build all databuild components
bazel build //...
# Run databuild unit tests
bazel test //...
# Run end-to-end tests (validates CLI vs Service consistency)
./run_e2e_tests.sh
# Run all core unit tests
./scripts/bb_test_all
# Remote testing
./scripts/bb_remote_test_all
# Do not try to `bazel test //examples/basic_graph/...`, as this will not work.
```
## End-to-End Testing
The project includes comprehensive end-to-end tests that validate CLI and Service build consistency:
### Test Suite Structure
- `tests/end_to_end/simple_test.sh` - Basic CLI vs Service validation
- `tests/end_to_end/podcast_simple_test.sh` - Podcast reviews CLI vs Service validation
- `tests/end_to_end/basic_graph_test.sh` - Comprehensive basic graph testing
- `tests/end_to_end/podcast_reviews_test.sh` - Comprehensive podcast testing
### Event Validation
Tests ensure CLI and Service emit identical build events:
- **Build request events**: Orchestration lifecycle (received, planning, executing, completed)
- **Job events**: Job execution tracking
- **Partition events**: Partition build status
### CLI vs Service Event Alignment
Recent improvements ensure both paths emit identical events:
- CLI: Enhanced with orchestration events to match Service behavior
- Service: HTTP API orchestration events + core build events
- Validation: Tests fail if event counts or types differ between CLI and Service
### Running Individual Tests
```bash
# Test basic graph
tests/end_to_end/simple_test.sh \
examples/basic_graph/bazel-bin/basic_graph.build \
examples/basic_graph/bazel-bin/basic_graph.service
# Test podcast reviews (run from correct directory)
cd examples/podcast_reviews
../../tests/end_to_end/podcast_simple_test.sh \
bazel-bin/podcast_reviews_graph.build \
bazel-bin/podcast_reviews_graph.service
```
## Project Structure
- `databuild/` - Core system (Rust/Proto)
- `examples/` - Example implementations
@@ -89,21 +67,6 @@ def main():
handle_exec(sys.argv[2:]) # Perform actual work
```
### Job Configuration Requirements
**CRITICAL**: Job configs must include non-empty `args` for execution to work:
```python
config = {
"configs": [{
"outputs": [{"str": partition_ref}],
"inputs": [...],
"args": ["some_arg"], # REQUIRED: Cannot be empty []
"env": {"PARTITION_REF": partition_ref}
}]
}
```
Jobs with `"args": []` will only have their config function called during execution; `exec` is never invoked.
### DataBuild Execution Flow
1. **Planning Phase**: DataBuild calls `.cfg` targets to get job configurations
2. **Execution Phase**: DataBuild calls main job targets which pipe config to exec
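The two phases above can be sketched as a small driver (a hypothetical helper for illustration; the real planner batches configs across jobs and handles many partitions at once):

```python
import json
import os
import subprocess

def build_partition(job_binary: str, partition_ref: str) -> None:
    """Sketch of the two-phase flow (illustrative, not the real planner).

    Planning: call the job's `config` subcommand to get job configs.
    Execution: invoke `exec` with the args/env each config specifies.
    """
    plan = subprocess.run(
        [job_binary, "config", partition_ref],
        capture_output=True, text=True, check=True,
    )
    configs = json.loads(plan.stdout)["configs"]
    for cfg in configs:
        subprocess.run(
            [job_binary, "exec", *cfg["args"]],
            env={**os.environ, **cfg.get("env", {})},  # inherit env, overlay job env
            check=True,
        )
```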


@@ -39,6 +39,8 @@ The `databuild_job` rule expects to reference a binary that adheres to the follo
- For the `config` subcommand, it prints the JSON job config to stdout based on the requested partitions, e.g. for a binary `bazel-bin/my_binary`, it prints a valid job config when called like `bazel-bin/my_binary config my_dataset/color=red my_dataset/color=blue`.
- For the `exec` subcommand, it produces the partitions requested to the `config` subcommand when configured by the job config it produced. E.g., if `config` had produced `{..., "args": ["red", "blue"], "env": {"MY_ENV": "foo"}}`, then calling `MY_ENV=foo bazel-bin/my_binary exec red blue` should produce partitions `my_dataset/color=red` and `my_dataset/color=blue`.
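As a rough illustration, a minimal binary satisfying this contract might look like the following (the dataset name, helper names, and config shape are made up for the example):

```python
#!/usr/bin/env python3
"""Toy job binary implementing the config/exec contract (illustrative only)."""
import json
import os
import sys

def handle_config(partition_refs):
    # Emit one config covering all requested partitions.
    cfg = {
        "outputs": [{"str": ref} for ref in partition_refs],
        "inputs": [],
        "args": [ref.rsplit("=", 1)[-1] for ref in partition_refs],
        "env": {"MY_ENV": "foo"},
    }
    print(json.dumps({"configs": [cfg]}))

def handle_exec(colors):
    # Produce the partitions the config promised, using its args/env.
    for color in colors:
        print(f"producing my_dataset/color={color} (MY_ENV={os.environ.get('MY_ENV')})")

if __name__ == "__main__" and len(sys.argv) > 1:
    if sys.argv[1] == "config":
        handle_config(sys.argv[2:])
    elif sys.argv[1] == "exec":
        handle_exec(sys.argv[2:])
    else:
        sys.exit(f"unknown subcommand: {sys.argv[1]}")
```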
Jobs are executed via a wrapper component that provides observability, error handling, and standardized communication with the graph. The wrapper captures all job output as structured logs, enabling comprehensive monitoring without requiring jobs to have network connectivity.
### Graph
The `databuild_graph` rule expects two fields, `jobs` and `lookup`:

GEMINI.md Symbolic link

@@ -0,0 +1 @@
CLAUDE.md


@@ -6,6 +6,7 @@ status summary, job run statistics, etc.
## Architecture
- Uses [event sourcing](https://martinfowler.com/eaaDev/EventSourcing.html) /
[CQRS](https://en.wikipedia.org/wiki/CQRS) philosophy.
- BELs are only ever written to by graph processes (e.g. CLI or service), not the jobs themselves.
- BEL uses only two types of tables:
- The root event table, with event ID, timestamp, message, event type, and ID fields for related event types.
- Type-specific event tables (e.g. task event, partition event, build request event, etc.).
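A minimal sqlite sketch of that two-table shape (column and table names are invented for illustration; the real schema lives in the BEL implementation):

```python
import sqlite3

# Two-table BEL shape: a root event table plus one table per event type.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Root event table: one row per event, with ID fields linking typed rows
    CREATE TABLE events (
        event_id               INTEGER PRIMARY KEY,
        ts                     TEXT NOT NULL,
        message                TEXT,
        event_type             TEXT NOT NULL,  -- 'task' | 'partition' | 'build_request'
        task_event_id          INTEGER,
        partition_event_id     INTEGER,
        build_request_event_id INTEGER
    );
    -- One type-specific table per event type, e.g. partition events
    CREATE TABLE partition_events (
        partition_event_id INTEGER PRIMARY KEY,
        partition_ref      TEXT NOT NULL,
        status             TEXT NOT NULL
    );
""")
```

Reads join the root table to the typed table through the ID field, which keeps the event log append-only while still allowing typed queries.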


@@ -12,8 +12,11 @@ Purpose: Centralize the build logic and semantics in a performant, correct core.
- Graph-based composition is the basis for databuild application [deployment](./deploy-strategies.md)
## Jobs
Jobs are the atomic unit of work in databuild.
- Job wrapper fulfills configuration, observability, and record keeping
Jobs are the atomic unit of work in databuild, executed via a Rust-based wrapper that provides:
- Structured logging and telemetry collection
- Platform-agnostic execution across local, container, and cloud environments
- Zero-network-dependency operation via log-based communication
- Standardized error handling and exit code categorization
### `job.config`
Purpose: Enable planning of execution graph. Executed in-process when possible for speed. For interface details, see
@@ -50,17 +53,18 @@ trait DataBuildJob {
#### `job.exec` State Diagram
```mermaid
flowchart TD
begin((begin)) --> validate_config
emit_job_exec_fail --> fail((fail))
validate_config -- fail --> emit_config_validate_fail --> emit_job_exec_fail
validate_config -- success --> emit_config_validate_success --> launch_task
launch_task -- fail --> emit_task_launch_fail --> emit_job_exec_fail
launch_task -- success --> emit_task_launch_success --> await_task
await_task -- waited N seconds --> emit_heartbeat --> await_task
await_task -- non-zero exit code --> emit_task_failed --> emit_job_exec_fail
await_task -- zero exit code --> emit_task_success --> calculate_metadata
calculate_metadata -- fail --> emit_metadata_calculation_fail --> emit_job_exec_fail
calculate_metadata -- success --> emit_metadata ---> success((success))
begin((begin)) --> wrapper_validate_config
emit_job_exec_fail --> fail((fail))
wrapper_validate_config -- fail --> emit_config_validate_fail --> emit_job_exec_fail
wrapper_validate_config -- success --> emit_config_validate_success --> wrapper_launch_task
wrapper_launch_task -- fail --> emit_task_launch_fail --> emit_job_exec_fail
wrapper_launch_task -- success --> emit_task_launch_success --> wrapper_monitor_task
wrapper_monitor_task -- heartbeat timer --> emit_heartbeat --> wrapper_monitor_task
wrapper_monitor_task -- job stderr --> emit_log_entry --> wrapper_monitor_task
wrapper_monitor_task -- job stdout --> emit_log_entry --> wrapper_monitor_task
wrapper_monitor_task -- non-zero exit --> emit_task_failed --> emit_job_exec_fail
wrapper_monitor_task -- zero exit --> emit_task_success --> emit_partition_manifest
emit_partition_manifest --> success((success))
```
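In imperative terms, the wrapper states above sketch out roughly as follows (event emission is stubbed as prints, and heartbeat/stderr handling is simplified; the real wrapper is Rust and writes structured logs):

```python
import subprocess
import sys

def run_wrapped(cmd, config_ok):
    """Toy walk through the wrapper state machine: validate config,
    launch the task, monitor its output, then emit success/failure events."""
    def emit(event):
        print(f"event: {event}")

    if not config_ok:
        emit("config_validate_fail"); emit("job_exec_fail")
        return False
    emit("config_validate_success")
    try:
        task = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
    except OSError:
        emit("task_launch_fail"); emit("job_exec_fail")
        return False
    emit("task_launch_success")
    for line in task.stdout:  # monitor: forward job output as log entries
        emit(f"log_entry: {line.rstrip()}")
    if task.wait() != 0:
        emit("task_failed"); emit("job_exec_fail")
        return False
    emit("task_success"); emit("partition_manifest")
    return True
```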
## Graphs


@@ -1,19 +1,47 @@
# Observability
- Purpose
- To enable simple, comprehensive metrics and logging observability for databuild applications
- Wrappers as observability implementation
- Liveness guarantees are:
- Task process is still running
- Logs are being shipped
- Metrics are being gathered (graph scrapes worker metrics, re-exposes)
- Heartbeating
- Log shipping
- Metrics exposed
- Metrics
- Service
- Jobs
- Logging
- Service
- Jobs
## Purpose
Provide comprehensive, platform-agnostic observability for DataBuild applications through standardized job wrapper
telemetry.
## Architecture
### Wrapper-Based Observability
All observability flows through the job wrapper:
- **Jobs** emit application logs to stdout/stderr
- **Wrapper** captures and enriches with structured metadata
- **Graph** parses structured logs into metrics, events, and monitoring data
- [**BEL**](./build-event-log.md) stores aggregated telemetry for historical analysis
### Communication Protocol
Log-based telemetry using protobuf-defined structured messages:
- LogMessage: Application stdout/stderr with metadata
- MetricPoint: StatsD-style metrics with labels
- JobEvent: State transitions and system events
- PartitionManifest: Job completion with output metadata
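Sketched as plain Python dataclasses (field names here are guesses for illustration; `databuild.proto` is authoritative for the actual message definitions):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class LogMessage:
    """Application stdout/stderr line, enriched with metadata by the wrapper."""
    job_id: str
    partition_ref: str
    stream: str  # "stdout" | "stderr"
    body: str

@dataclass
class MetricPoint:
    """StatsD-style metric with labels."""
    name: str
    value: float
    kind: str  # e.g. "c" (counter), "g" (gauge), "ms" (timer)
    labels: Dict[str, str] = field(default_factory=dict)

@dataclass
class JobEvent:
    """State transition or system event from the wrapper."""
    job_id: str
    transition: str  # e.g. "task_launch_success"

@dataclass
class PartitionManifest:
    """Job completion record with output metadata."""
    partition_refs: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)
```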
## Implementation
### Metrics Collection
- Format: StatsD-like embedded in structured logs
- Aggregation: Graph components collect and expose via Prometheus
- Storage: Summary metrics stored in BEL for historical analysis
- Scope: Job execution, resource usage, partition metadata
### Logging
- Capture: All job stdout/stderr via wrapper
- Enhancement: Automatic injection of job_id, partition_ref, timestamps
- Format: Structured JSON for consistent parsing
- Retention: Platform-dependent (container logs, cloud logging APIs)
### Monitoring
- Heartbeats: 30-second intervals with resource utilization
- Health: Exit code categorization and failure analysis
- Alerting: Standard Prometheus/alertmanager integration
- Debugging: Complete log trails for job troubleshooting
### Platform Integration
- **Local**: Direct stdout pipe reading
- **Docker**: Container log persistence and `docker logs`
- **Kubernetes**: Pod logs API with configurable retention
- **Cloud**: Platform logging services (CloudWatch, Cloud Logging)

design/why-databuild.md Normal file

@@ -0,0 +1,11 @@
# Why DataBuild?
(work in progress)
Why?
- Orchestration logic changes all the time; better not to write it directly
- Declarative -> Compile time correctness (e.g. can detect when no job produces a partition pattern)
- Compartmentalized jobs + data deps -> Simplicity and compartmentalization of complexity
- Bazel based -> Easy to deploy, maintain, and update


@@ -1,24 +0,0 @@
import ../../.bazelrc
build --java_runtime_version=21
build --tool_java_runtime_version=21
test --test_output=errors
# Default to quiet mode for run commands
run --ui_event_filters=-info,-stdout,-stderr
run --noshow_progress
run --noannounce_rc
# Explicit quiet mode configuration (same as default)
run:quiet --ui_event_filters=-info,-stdout,-stderr
run:quiet --noshow_progress
run:quiet --noannounce_rc
# Loud mode configuration (override the quiet default)
run:loud --ui_event_filters=
run:loud --show_progress
run:loud --announce_rc
# TypeScript configuration
common --@aspect_rules_ts//ts:skipLibCheck=always


@@ -1,68 +0,0 @@
load("@databuild//databuild:rules.bzl", "databuild_graph", "databuild_job")
load("@rules_java//java:defs.bzl", "java_binary")
platform(
name = "linux_arm",
constraint_values = [
"@platforms//os:linux",
"@platforms//cpu:arm64",
],
)
platform(
name = "linux_x86",
constraint_values = [
"@platforms//os:linux",
"@platforms//cpu:x86_64",
],
)
databuild_graph(
name = "basic_graph",
jobs = [
"//:generate_number_job",
"//:sum_job",
],
lookup = ":job_lookup",
visibility = ["//visibility:public"],
)
py_binary(
name = "job_lookup",
srcs = ["job_lookup.py"],
main = "job_lookup.py",
)
databuild_job(
name = "generate_number_job",
binary = ":generate_number_binary",
visibility = ["//visibility:public"],
)
java_binary(
name = "generate_number_binary",
srcs = ["UnifiedGenerateNumber.java"],
main_class = "com.databuild.examples.basic_graph.UnifiedGenerateNumber",
deps = [
"@maven//:com_fasterxml_jackson_core_jackson_annotations",
"@maven//:com_fasterxml_jackson_core_jackson_core",
"@maven//:com_fasterxml_jackson_core_jackson_databind",
],
)
databuild_job(
name = "sum_job",
binary = ":sum_binary",
visibility = ["//visibility:public"],
)
java_binary(
name = "sum_binary",
srcs = ["UnifiedSum.java"],
main_class = "com.databuild.examples.basic_graph.UnifiedSum",
deps = [
"@maven//:com_fasterxml_jackson_core_jackson_annotations",
"@maven//:com_fasterxml_jackson_core_jackson_core",
"@maven//:com_fasterxml_jackson_core_jackson_databind",
],
)


@@ -1,6 +0,0 @@
package com.databuild.examples.basic_graph;
public class DataDep {
public String depType; // "query" or "materialize"
public String ref;
}


@@ -1,24 +0,0 @@
package com.databuild.examples.basic_graph;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import com.fasterxml.jackson.annotation.JsonAutoDetect;
import com.fasterxml.jackson.annotation.JsonAutoDetect.Visibility;
@JsonAutoDetect(fieldVisibility = Visibility.ANY)
public class JobConfig {
public List<DataDep> inputs;
public List<String> outputs;
public List<String> args;
public Map<String, String> env;
// Just one constructor if you want defaults
public JobConfig() {
this.inputs = new ArrayList<>();
this.outputs = new ArrayList<>();
this.args = new ArrayList<>();
this.env = new HashMap<>();
}
}


@@ -1,63 +0,0 @@
module(
name = "databuild_basic_composition",
version = "0.1",
)
# Databuild dep - overridden so ignore version
bazel_dep(name = "databuild", version = "0.0")
local_path_override(
module_name = "databuild",
path = "../..",
)
# Java dependencies
bazel_dep(name = "rules_java", version = "8.11.0")
# Configure JDK 17
register_toolchains("@rules_java//toolchains:all")
bazel_dep(name = "rules_jvm_external", version = "6.3")
maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven")
maven.install(
artifacts = [
"com.fasterxml.jackson.core:jackson-core:2.15.2",
"com.fasterxml.jackson.core:jackson-databind:2.15.2",
"com.fasterxml.jackson.core:jackson-annotations:2.15.2",
"com.fasterxml.jackson.module:jackson-module-jsonSchema:2.15.2",
],
repositories = [
"https://repo1.maven.org/maven2",
],
)
use_repo(maven, "maven")
# Rules OCI - necessary for producing a docker container
bazel_dep(name = "rules_oci", version = "2.2.6")
# For testing, we also recommend https://registry.bazel.build/modules/container_structure_test
oci = use_extension("@rules_oci//oci:extensions.bzl", "oci")
# Declare external images you need to pull, for example:
oci.pull(
name = "debian",
image = "docker.io/library/python",
platforms = [
"linux/arm64/v8",
"linux/amd64",
],
# 'latest' is not reproducible, but it's convenient.
# During the build we print a WARNING message that includes recommended 'digest' and 'platforms'
# values which you can use here in place of 'tag' to pin for reproducibility.
tag = "3.12-bookworm",
)
# For each oci.pull call, repeat the "name" here to expose them as dependencies.
use_repo(oci, "debian", "debian_linux_amd64", "debian_linux_arm64_v8")
# Platforms for specifying linux/arm
bazel_dep(name = "platforms", version = "0.0.11")
# TypeScript rules for handling skipLibCheck flag
# https://github.com/aspect-build/rules_ts/issues/483#issuecomment-1814586063
bazel_dep(name = "aspect_rules_ts", version = "3.6.3")

File diff suppressed because one or more lines are too long


@@ -1,39 +0,0 @@
# Basic Graph - Random Number Generator
This example demonstrates a databuild_job that generates a random number seeded based on the partition ref.
## Building Output Partitions
### CLI Build
Use the DataBuild CLI to build specific partitions:
```bash
# Builds bazel-bin/basic_graph.build
bazel build //:basic_graph.build
# Build individual partitions
bazel-bin/basic_graph.build pippin salem sadie
# Build sum partition
bazel-bin/basic_graph.build pippin_salem_sadie
```
### Service Build
Use the Build Graph Service for HTTP API access:
```bash
# Start the service (via `bazel run`, or run the built binary directly)
bazel run //:basic_graph.service
# or: bazel-bin/basic_graph.service
# Submit build request via HTTP
curl -X POST http://localhost:8080/api/v1/builds \
-H "Content-Type: application/json" \
-d '{"partitions": ["pippin", "salem", "sadie"]}'
# Check build status
curl http://localhost:8080/api/v1/builds/BUILD_REQUEST_ID
# Get partition status
curl http://localhost:8080/api/v1/partitions/pippin/status
```


@@ -1,117 +0,0 @@
package com.databuild.examples.basic_graph;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import java.util.ArrayList;
import java.util.List;
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Random;
/**
* Unified job that handles both configuration and execution via subcommands.
*/
public class UnifiedGenerateNumber {
public static String BASE_PATH = "/tmp/databuild_test/examples/basic_graph/";
public static void main(String[] args) {
if (args.length < 1) {
System.err.println("Usage: UnifiedGenerateNumber {config|exec} [args...]");
System.exit(1);
}
String command = args[0];
switch (command) {
case "config":
handleConfig(Arrays.copyOfRange(args, 1, args.length));
break;
case "exec":
handleExec(Arrays.copyOfRange(args, 1, args.length));
break;
default:
System.err.println("Unknown command: " + command);
System.err.println("Usage: UnifiedGenerateNumber {config|exec} [args...]");
System.exit(1);
}
}
private static void handleConfig(String[] args) {
if (args.length < 1) {
System.err.println("Config mode requires partition ref");
System.exit(1);
}
String partitionRef = args[0];
try {
ObjectMapper mapper = new ObjectMapper();
// Create job configuration
var config = mapper.createObjectNode();
// Create outputs as PartitionRef objects
var outputs = mapper.createArrayNode();
var outputPartRef = mapper.createObjectNode();
outputPartRef.put("str", partitionRef);
outputs.add(outputPartRef);
config.set("outputs", outputs);
config.set("inputs", mapper.createArrayNode());
config.set("args", mapper.createArrayNode().add("will").add("generate").add(partitionRef));
config.set("env", mapper.createObjectNode().put("PARTITION_REF", partitionRef));
var response = mapper.createObjectNode();
response.set("configs", mapper.createArrayNode().add(config));
System.out.println(mapper.writeValueAsString(response));
} catch (Exception e) {
System.err.println("Error creating config: " + e.getMessage());
System.exit(1);
}
}
private static void handleExec(String[] args) {
if (args.length < 3) {
System.err.println("Execute mode requires: will generate <partition_ref>");
System.exit(1);
}
String partitionRef = args[2];
try {
// Generate a random number based on the partition ref
MessageDigest md = MessageDigest.getInstance("SHA-256");
byte[] hash = md.digest(partitionRef.getBytes(StandardCharsets.UTF_8));
long seed = 0;
for (int i = 0; i < 8; i++) {
seed = (seed << 8) | (hash[i] & 0xFF);
}
Random random = new Random(seed);
int randomNumber = random.nextInt(100) + 1;
// Write to file - partitionRef is the full path
File outputFile = new File(partitionRef);
File outputDir = outputFile.getParentFile();
if (outputDir != null) {
outputDir.mkdirs();
}
try (FileWriter writer = new FileWriter(outputFile)) {
writer.write(String.valueOf(randomNumber));
}
System.out.println("Generated number " + randomNumber + " for partition " + partitionRef);
} catch (Exception e) {
System.err.println("Error in execution: " + e.getMessage());
System.exit(1);
}
}
}


@@ -1,137 +0,0 @@
package com.databuild.examples.basic_graph;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.stream.Collectors;
import java.io.FileWriter;
import java.io.IOException;
// import static com.databuild.examples.basic_graph.GenerateExecute.BASE_PATH;
/**
* Unified sum job that handles both configuration and execution via subcommands.
*/
public class UnifiedSum {
public static String BASE_PATH = "/tmp/databuild_test/examples/basic_graph/";
public static void main(String[] args) {
if (args.length < 1) {
System.err.println("Usage: UnifiedSum {config|exec} [args...]");
System.exit(1);
}
String command = args[0];
switch (command) {
case "config":
handleConfig(Arrays.copyOfRange(args, 1, args.length));
break;
case "exec":
handleExec(Arrays.copyOfRange(args, 1, args.length));
break;
default:
System.err.println("Unknown command: " + command);
System.err.println("Usage: UnifiedSum {config|exec} [args...]");
System.exit(1);
}
}
private static void handleConfig(String[] args) {
if (args.length != 1) {
System.err.println("Config mode requires exactly one partition ref");
System.exit(1);
}
String partitionRef = args[0];
String[] pathParts = partitionRef.split("/");
String[] upstreams = Arrays.stream(pathParts[pathParts.length - 1].split("_"))
.map(part -> BASE_PATH + "generated_number/" + part)
.toArray(String[]::new);
try {
ObjectMapper mapper = new ObjectMapper();
// Create data dependencies
var inputs = mapper.createArrayNode();
for (String upstream : upstreams) {
var dataDep = mapper.createObjectNode();
dataDep.put("dep_type", 0); // QUERY
var partRef = mapper.createObjectNode();
partRef.put("str", upstream);
dataDep.set("partition_ref", partRef);
inputs.add(dataDep);
}
// Create job configuration
var config = mapper.createObjectNode();
// Create outputs as PartitionRef objects
var outputs = mapper.createArrayNode();
var outputPartRef = mapper.createObjectNode();
outputPartRef.put("str", partitionRef);
outputs.add(outputPartRef);
config.set("outputs", outputs);
config.set("inputs", inputs);
var argsArray = mapper.createArrayNode();
for (String upstream : upstreams) {
argsArray.add(upstream);
}
config.set("args", argsArray);
config.set("env", mapper.createObjectNode().put("OUTPUT_REF", partitionRef));
var response = mapper.createObjectNode();
response.set("configs", mapper.createArrayNode().add(config));
System.out.println(mapper.writeValueAsString(response));
} catch (Exception e) {
System.err.println("Error creating config: " + e.getMessage());
System.exit(1);
}
}
private static void handleExec(String[] args) {
// Get output ref from env var OUTPUT_REF
String outputRef = System.getenv("OUTPUT_REF");
if (outputRef == null) {
System.err.println("Error: OUTPUT_REF environment variable is required");
System.exit(1);
}
// For each arg, load it from the file system and add it to the sum
int sum = 0;
for (String partitionRef : args) {
try {
String path = partitionRef;
int partitionValue = Integer.parseInt(new String(java.nio.file.Files.readAllBytes(java.nio.file.Paths.get(path))));
System.out.println("Summing partition " + partitionRef + " with value " + partitionValue);
sum += partitionValue;
} catch (Exception e) {
System.err.println("Error: Failed to read partition " + partitionRef + ": " + e.getMessage());
e.printStackTrace();
System.exit(1);
}
}
System.out.println("Sum of " + args.length + " partitions: " + sum);
// Write the sum to the output file
try {
File outputDir = new File(outputRef).getParentFile();
if (outputDir != null) {
outputDir.mkdirs();
}
try (FileWriter writer = new FileWriter(outputRef)) {
writer.write(String.valueOf(sum));
}
System.out.println("Wrote sum " + sum + " to " + outputRef);
} catch (Exception e) {
System.err.println("Error writing output: " + e.getMessage());
System.exit(1);
}
}
}


@@ -1 +0,0 @@
34


@@ -1,28 +0,0 @@
import sys
import json
from collections import defaultdict
def main():
output_refs = sys.argv[1:]
assert len(output_refs) > 0, "Need at least 1 ref to lookup"
result = defaultdict(list)
# Partition output prefix makes it obvious which job should fulfill
for ref in output_refs:
print(ref, file=sys.stderr)
body, tail = ref.rsplit("/", 1)
if "generated_number" in body:
result["//:generate_number_job"].append(ref)
elif "sum" in body:
result["//:sum_job"].append(ref)
else:
raise ValueError(f"No job found for ref `{ref}`")
print(json.dumps({k: v for k, v in result.items() if v}))
if __name__ == '__main__':
main()


@@ -1,36 +0,0 @@
sh_test(
name = "generate_number_test",
srcs = ["generate_number_test.sh"],
data = [
"//:generate_number_job.cfg",
"//:generate_number_job.exec",
],
)
sh_test(
name = "sum_test",
srcs = ["sum_test.sh"],
data = [
"//:sum_job.cfg",
"//:sum_job.exec",
],
)
sh_test(
name = "graph_test",
srcs = ["graph_test.sh"],
data = [
"//:basic_graph.lookup",
"//:basic_graph.analyze",
],
)
sh_test(
name = "exec_test",
srcs = ["exec_test.sh"],
data = [
"//:basic_graph.exec",
"//:basic_graph.build",
"//:basic_graph.analyze",
],
)


@@ -1,10 +0,0 @@
#!/usr/bin/env bash
set -e
# Test the .exec rule
echo exec
basic_graph.exec < <(basic_graph.analyze /tmp/databuild_test/examples/basic_graph/generated_number/pippin_salem_sadie)
# Test the .build rule
echo build
basic_graph.build /tmp/databuild_test/examples/basic_graph/generated_number/pippin_salem_sadie


@@ -1,15 +0,0 @@
#!/bin/bash
set -e
# Test configure
generate_number_job.cfg /tmp/databuild_test/examples/basic_graph/generated_number/pippin /tmp/databuild_test/examples/basic_graph/generated_number/salem /tmp/databuild_test/examples/basic_graph/generated_number/sadie
# Test run
generate_number_job.cfg /tmp/databuild_test/examples/basic_graph/generated_number/pippin | jq -c ".configs[0]" | generate_number_job.exec
# Validate that contents of pippin is 1 (deterministic based on SHA-256 hash)
if [[ "$(cat /tmp/databuild_test/examples/basic_graph/generated_number/pippin)" != "1" ]]; then
echo "Assertion failed: File does not contain 1"
cat /tmp/databuild_test/examples/basic_graph/generated_number/pippin
exit 1
fi


@@ -1,6 +0,0 @@
#!/usr/bin/env bash
set -e
basic_graph.lookup /tmp/databuild_test/examples/basic_graph/generated_number/pippin_salem_sadie
basic_graph.analyze /tmp/databuild_test/examples/basic_graph/generated_number/pippin_salem_sadie


@@ -1,24 +0,0 @@
#!/bin/bash
set -e
rm -rf /tmp/databuild_test/examples/basic_graph
mkdir -p /tmp/databuild_test/examples/basic_graph/generated_number
mkdir -p /tmp/databuild_test/examples/basic_graph/sum
# Test configure
sum_job.cfg /tmp/databuild_test/examples/basic_graph/sum/pippin_salem_sadie
# Test run
echo -n 83 > /tmp/databuild_test/examples/basic_graph/generated_number/pippin
echo -n 34 > /tmp/databuild_test/examples/basic_graph/generated_number/salem
echo -n 19 > /tmp/databuild_test/examples/basic_graph/generated_number/sadie
sum_job.cfg /tmp/databuild_test/examples/basic_graph/sum/pippin_salem_sadie | jq -c ".configs[0]" | sum_job.exec
# Validate that contents of output is 136
if [[ "$(cat /tmp/databuild_test/examples/basic_graph/sum/pippin_salem_sadie)" != "136" ]]; then
echo "Assertion failed: File does not contain 136"
cat /tmp/databuild_test/examples/basic_graph/sum/pippin_salem_sadie
exit 1
fi


@@ -1 +0,0 @@
import ../../.bazelrc


@@ -1,12 +0,0 @@
load("@databuild//databuild:rules.bzl", "databuild_job")
databuild_job(
name = "test_job",
binary = ":test_job_binary",
visibility = ["//visibility:public"],
)
sh_binary(
name = "test_job_binary",
srcs = ["unified_job.sh"],
)


@@ -1,11 +0,0 @@
module(
name = "databuild_hello_world",
version = "0.1",
)
# Databuild dep - overridden so ignore version
bazel_dep(name = "databuild", version = "0.0")
local_path_override(
module_name = "databuild",
path = "../..",
)


@@ -1,422 +0,0 @@
{
"lockFileVersion": 18,
"registryFileHashes": {
"https://bcr.bazel.build/bazel_registry.json": "8a28e4aff06ee60aed2a8c281907fb8bcbf3b753c91fb5a5c57da3215d5b3497",
"https://bcr.bazel.build/modules/abseil-cpp/20210324.2/MODULE.bazel": "7cd0312e064fde87c8d1cd79ba06c876bd23630c83466e9500321be55c96ace2",
"https://bcr.bazel.build/modules/abseil-cpp/20211102.0/MODULE.bazel": "70390338f7a5106231d20620712f7cccb659cd0e9d073d1991c038eb9fc57589",
"https://bcr.bazel.build/modules/abseil-cpp/20230125.1/MODULE.bazel": "89047429cb0207707b2dface14ba7f8df85273d484c2572755be4bab7ce9c3a0",
"https://bcr.bazel.build/modules/abseil-cpp/20230802.0.bcr.1/MODULE.bazel": "1c8cec495288dccd14fdae6e3f95f772c1c91857047a098fad772034264cc8cb",
"https://bcr.bazel.build/modules/abseil-cpp/20230802.0/MODULE.bazel": "d253ae36a8bd9ee3c5955384096ccb6baf16a1b1e93e858370da0a3b94f77c16",
"https://bcr.bazel.build/modules/abseil-cpp/20230802.1/MODULE.bazel": "fa92e2eb41a04df73cdabeec37107316f7e5272650f81d6cc096418fe647b915",
"https://bcr.bazel.build/modules/abseil-cpp/20240116.1/MODULE.bazel": "37bcdb4440fbb61df6a1c296ae01b327f19e9bb521f9b8e26ec854b6f97309ed",
"https://bcr.bazel.build/modules/abseil-cpp/20240116.1/source.json": "9be551b8d4e3ef76875c0d744b5d6a504a27e3ae67bc6b28f46415fd2d2957da",
"https://bcr.bazel.build/modules/apple_support/1.17.1/MODULE.bazel": "655c922ab1209978a94ef6ca7d9d43e940cd97d9c172fb55f94d91ac53f8610b",
"https://bcr.bazel.build/modules/apple_support/1.17.1/source.json": "6b2b8c74d14e8d485528a938e44bdb72a5ba17632b9e14ef6e68a5ee96c8347f",
"https://bcr.bazel.build/modules/aspect_bazel_lib/2.14.0/MODULE.bazel": "2b31ffcc9bdc8295b2167e07a757dbbc9ac8906e7028e5170a3708cecaac119f",
"https://bcr.bazel.build/modules/aspect_bazel_lib/2.14.0/source.json": "0cf1826853b0bef8b5cd19c0610d717500f5521aa2b38b72b2ec302ac5e7526c",
"https://bcr.bazel.build/modules/aspect_bazel_lib/2.7.2/MODULE.bazel": "780d1a6522b28f5edb7ea09630748720721dfe27690d65a2d33aa7509de77e07",
"https://bcr.bazel.build/modules/aspect_bazel_lib/2.7.7/MODULE.bazel": "491f8681205e31bb57892d67442ce448cda4f472a8e6b3dc062865e29a64f89c",
"https://bcr.bazel.build/modules/aspect_bazel_lib/2.9.3/MODULE.bazel": "66baf724dbae7aff4787bf2245cc188d50cb08e07789769730151c0943587c14",
"https://bcr.bazel.build/modules/aspect_rules_esbuild/0.21.0/MODULE.bazel": "77dc393c43ad79398b05865444c5200c6f1aae6765615544f2c7730b5858d533",
"https://bcr.bazel.build/modules/aspect_rules_esbuild/0.21.0/source.json": "062b1d3dba8adcfeb28fe60c185647f5a53ec0487ffe93cf0ae91566596e4b49",
"https://bcr.bazel.build/modules/aspect_rules_js/2.0.0/MODULE.bazel": "b45b507574aa60a92796e3e13c195cd5744b3b8aff516a9c0cb5ae6a048161c5",
"https://bcr.bazel.build/modules/aspect_rules_js/2.0.0/source.json": "a6b09288ab135225982a58ac0b5e2c032c331d88f80553d86596000e894e86b3",
"https://bcr.bazel.build/modules/aspect_rules_ts/3.6.3/MODULE.bazel": "d09db394970f076176ce7bab5b5fa7f0d560fd4f30b8432ea5e2c2570505b130",
"https://bcr.bazel.build/modules/aspect_rules_ts/3.6.3/source.json": "641e58c62e5090d52a0d3538451893acdb2d79a36e8b3d1d30a013c580bc2058",
"https://bcr.bazel.build/modules/bazel_features/1.1.1/MODULE.bazel": "27b8c79ef57efe08efccbd9dd6ef70d61b4798320b8d3c134fd571f78963dbcd",
"https://bcr.bazel.build/modules/bazel_features/1.10.0/MODULE.bazel": "f75e8807570484a99be90abcd52b5e1f390362c258bcb73106f4544957a48101",
"https://bcr.bazel.build/modules/bazel_features/1.11.0/MODULE.bazel": "f9382337dd5a474c3b7d334c2f83e50b6eaedc284253334cf823044a26de03e8",
"https://bcr.bazel.build/modules/bazel_features/1.15.0/MODULE.bazel": "d38ff6e517149dc509406aca0db3ad1efdd890a85e049585b7234d04238e2a4d",
"https://bcr.bazel.build/modules/bazel_features/1.17.0/MODULE.bazel": "039de32d21b816b47bd42c778e0454217e9c9caac4a3cf8e15c7231ee3ddee4d",
"https://bcr.bazel.build/modules/bazel_features/1.18.0/MODULE.bazel": "1be0ae2557ab3a72a57aeb31b29be347bcdc5d2b1eb1e70f39e3851a7e97041a",
"https://bcr.bazel.build/modules/bazel_features/1.19.0/MODULE.bazel": "59adcdf28230d220f0067b1f435b8537dd033bfff8db21335ef9217919c7fb58",
"https://bcr.bazel.build/modules/bazel_features/1.21.0/MODULE.bazel": "675642261665d8eea09989aa3b8afb5c37627f1be178382c320d1b46afba5e3b",
"https://bcr.bazel.build/modules/bazel_features/1.30.0/MODULE.bazel": "a14b62d05969a293b80257e72e597c2da7f717e1e69fa8b339703ed6731bec87",
"https://bcr.bazel.build/modules/bazel_features/1.30.0/source.json": "b07e17f067fe4f69f90b03b36ef1e08fe0d1f3cac254c1241a1818773e3423bc",
"https://bcr.bazel.build/modules/bazel_features/1.4.1/MODULE.bazel": "e45b6bb2350aff3e442ae1111c555e27eac1d915e77775f6fdc4b351b758b5d7",
"https://bcr.bazel.build/modules/bazel_features/1.9.0/MODULE.bazel": "885151d58d90d8d9c811eb75e3288c11f850e1d6b481a8c9f766adee4712358b",
"https://bcr.bazel.build/modules/bazel_features/1.9.1/MODULE.bazel": "8f679097876a9b609ad1f60249c49d68bfab783dd9be012faf9d82547b14815a",
"https://bcr.bazel.build/modules/bazel_skylib/1.0.3/MODULE.bazel": "bcb0fd896384802d1ad283b4e4eb4d718eebd8cb820b0a2c3a347fb971afd9d8",
"https://bcr.bazel.build/modules/bazel_skylib/1.1.1/MODULE.bazel": "1add3e7d93ff2e6998f9e118022c84d163917d912f5afafb3058e3d2f1545b5e",
"https://bcr.bazel.build/modules/bazel_skylib/1.2.0/MODULE.bazel": "44fe84260e454ed94ad326352a698422dbe372b21a1ac9f3eab76eb531223686",
"https://bcr.bazel.build/modules/bazel_skylib/1.2.1/MODULE.bazel": "f35baf9da0efe45fa3da1696ae906eea3d615ad41e2e3def4aeb4e8bc0ef9a7a",
"https://bcr.bazel.build/modules/bazel_skylib/1.3.0/MODULE.bazel": "20228b92868bf5cfc41bda7afc8a8ba2a543201851de39d990ec957b513579c5",
"https://bcr.bazel.build/modules/bazel_skylib/1.4.1/MODULE.bazel": "a0dcb779424be33100dcae821e9e27e4f2901d9dfd5333efe5ac6a8d7ab75e1d",
"https://bcr.bazel.build/modules/bazel_skylib/1.4.2/MODULE.bazel": "3bd40978e7a1fac911d5989e6b09d8f64921865a45822d8b09e815eaa726a651",
"https://bcr.bazel.build/modules/bazel_skylib/1.5.0/MODULE.bazel": "32880f5e2945ce6a03d1fbd588e9198c0a959bb42297b2cfaf1685b7bc32e138",
"https://bcr.bazel.build/modules/bazel_skylib/1.6.1/MODULE.bazel": "8fdee2dbaace6c252131c00e1de4b165dc65af02ea278476187765e1a617b917",
"https://bcr.bazel.build/modules/bazel_skylib/1.7.0/MODULE.bazel": "0db596f4563de7938de764cc8deeabec291f55e8ec15299718b93c4423e9796d",
"https://bcr.bazel.build/modules/bazel_skylib/1.7.1/MODULE.bazel": "3120d80c5861aa616222ec015332e5f8d3171e062e3e804a2a0253e1be26e59b",
"https://bcr.bazel.build/modules/bazel_skylib/1.8.1/MODULE.bazel": "88ade7293becda963e0e3ea33e7d54d3425127e0a326e0d17da085a5f1f03ff6",
"https://bcr.bazel.build/modules/bazel_skylib/1.8.1/source.json": "7ebaefba0b03efe59cac88ed5bbc67bcf59a3eff33af937345ede2a38b2d368a",
"https://bcr.bazel.build/modules/buildozer/7.1.2/MODULE.bazel": "2e8dd40ede9c454042645fd8d8d0cd1527966aa5c919de86661e62953cd73d84",
"https://bcr.bazel.build/modules/buildozer/7.1.2/source.json": "c9028a501d2db85793a6996205c8de120944f50a0d570438fcae0457a5f9d1f8",
"https://bcr.bazel.build/modules/google_benchmark/1.8.2/MODULE.bazel": "a70cf1bba851000ba93b58ae2f6d76490a9feb74192e57ab8e8ff13c34ec50cb",
"https://bcr.bazel.build/modules/googletest/1.11.0/MODULE.bazel": "3a83f095183f66345ca86aa13c58b59f9f94a2f81999c093d4eeaa2d262d12f4",
"https://bcr.bazel.build/modules/googletest/1.14.0.bcr.1/MODULE.bazel": "22c31a561553727960057361aa33bf20fb2e98584bc4fec007906e27053f80c6",
"https://bcr.bazel.build/modules/googletest/1.14.0.bcr.1/source.json": "41e9e129f80d8c8bf103a7acc337b76e54fad1214ac0a7084bf24f4cd924b8b4",
"https://bcr.bazel.build/modules/googletest/1.14.0/MODULE.bazel": "cfbcbf3e6eac06ef9d85900f64424708cc08687d1b527f0ef65aa7517af8118f",
"https://bcr.bazel.build/modules/jsoncpp/1.9.5/MODULE.bazel": "31271aedc59e815656f5736f282bb7509a97c7ecb43e927ac1a37966e0578075",
"https://bcr.bazel.build/modules/jsoncpp/1.9.5/source.json": "4108ee5085dd2885a341c7fab149429db457b3169b86eb081fa245eadf69169d",
"https://bcr.bazel.build/modules/libpfm/4.11.0/MODULE.bazel": "45061ff025b301940f1e30d2c16bea596c25b176c8b6b3087e92615adbd52902",
"https://bcr.bazel.build/modules/platforms/0.0.10/MODULE.bazel": "8cb8efaf200bdeb2150d93e162c40f388529a25852b332cec879373771e48ed5",
"https://bcr.bazel.build/modules/platforms/0.0.11/MODULE.bazel": "0daefc49732e227caa8bfa834d65dc52e8cc18a2faf80df25e8caea151a9413f",
"https://bcr.bazel.build/modules/platforms/0.0.11/source.json": "f7e188b79ebedebfe75e9e1d098b8845226c7992b307e28e1496f23112e8fc29",
"https://bcr.bazel.build/modules/platforms/0.0.4/MODULE.bazel": "9b328e31ee156f53f3c416a64f8491f7eb731742655a47c9eec4703a71644aee",
"https://bcr.bazel.build/modules/platforms/0.0.5/MODULE.bazel": "5733b54ea419d5eaf7997054bb55f6a1d0b5ff8aedf0176fef9eea44f3acda37",
"https://bcr.bazel.build/modules/platforms/0.0.6/MODULE.bazel": "ad6eeef431dc52aefd2d77ed20a4b353f8ebf0f4ecdd26a807d2da5aa8cd0615",
"https://bcr.bazel.build/modules/platforms/0.0.7/MODULE.bazel": "72fd4a0ede9ee5c021f6a8dd92b503e089f46c227ba2813ff183b71616034814",
"https://bcr.bazel.build/modules/platforms/0.0.8/MODULE.bazel": "9f142c03e348f6d263719f5074b21ef3adf0b139ee4c5133e2aa35664da9eb2d",
"https://bcr.bazel.build/modules/platforms/0.0.9/MODULE.bazel": "4a87a60c927b56ddd67db50c89acaa62f4ce2a1d2149ccb63ffd871d5ce29ebc",
"https://bcr.bazel.build/modules/protobuf/21.7/MODULE.bazel": "a5a29bb89544f9b97edce05642fac225a808b5b7be74038ea3640fae2f8e66a7",
"https://bcr.bazel.build/modules/protobuf/27.0/MODULE.bazel": "7873b60be88844a0a1d8f80b9d5d20cfbd8495a689b8763e76c6372998d3f64c",
"https://bcr.bazel.build/modules/protobuf/27.1/MODULE.bazel": "703a7b614728bb06647f965264967a8ef1c39e09e8f167b3ca0bb1fd80449c0d",
"https://bcr.bazel.build/modules/protobuf/29.0-rc2/MODULE.bazel": "6241d35983510143049943fc0d57937937122baf1b287862f9dc8590fc4c37df",
"https://bcr.bazel.build/modules/protobuf/29.0/MODULE.bazel": "319dc8bf4c679ff87e71b1ccfb5a6e90a6dbc4693501d471f48662ac46d04e4e",
"https://bcr.bazel.build/modules/protobuf/29.0/source.json": "b857f93c796750eef95f0d61ee378f3420d00ee1dd38627b27193aa482f4f981",
"https://bcr.bazel.build/modules/protobuf/3.19.0/MODULE.bazel": "6b5fbb433f760a99a22b18b6850ed5784ef0e9928a72668b66e4d7ccd47db9b0",
"https://bcr.bazel.build/modules/pybind11_bazel/2.11.1/MODULE.bazel": "88af1c246226d87e65be78ed49ecd1e6f5e98648558c14ce99176da041dc378e",
"https://bcr.bazel.build/modules/pybind11_bazel/2.11.1/source.json": "be4789e951dd5301282729fe3d4938995dc4c1a81c2ff150afc9f1b0504c6022",
"https://bcr.bazel.build/modules/re2/2023-09-01/MODULE.bazel": "cb3d511531b16cfc78a225a9e2136007a48cf8a677e4264baeab57fe78a80206",
"https://bcr.bazel.build/modules/re2/2023-09-01/source.json": "e044ce89c2883cd957a2969a43e79f7752f9656f6b20050b62f90ede21ec6eb4",
"https://bcr.bazel.build/modules/rules_android/0.1.1/MODULE.bazel": "48809ab0091b07ad0182defb787c4c5328bd3a278938415c00a7b69b50c4d3a8",
"https://bcr.bazel.build/modules/rules_android/0.1.1/source.json": "e6986b41626ee10bdc864937ffb6d6bf275bb5b9c65120e6137d56e6331f089e",
"https://bcr.bazel.build/modules/rules_cc/0.0.1/MODULE.bazel": "cb2aa0747f84c6c3a78dad4e2049c154f08ab9d166b1273835a8174940365647",
"https://bcr.bazel.build/modules/rules_cc/0.0.10/MODULE.bazel": "ec1705118f7eaedd6e118508d3d26deba2a4e76476ada7e0e3965211be012002",
"https://bcr.bazel.build/modules/rules_cc/0.0.13/MODULE.bazel": "0e8529ed7b323dad0775ff924d2ae5af7640b23553dfcd4d34344c7e7a867191",
"https://bcr.bazel.build/modules/rules_cc/0.0.14/MODULE.bazel": "5e343a3aac88b8d7af3b1b6d2093b55c347b8eefc2e7d1442f7a02dc8fea48ac",
"https://bcr.bazel.build/modules/rules_cc/0.0.15/MODULE.bazel": "6704c35f7b4a72502ee81f61bf88706b54f06b3cbe5558ac17e2e14666cd5dcc",
"https://bcr.bazel.build/modules/rules_cc/0.0.16/MODULE.bazel": "7661303b8fc1b4d7f532e54e9d6565771fea666fbdf839e0a86affcd02defe87",
"https://bcr.bazel.build/modules/rules_cc/0.0.2/MODULE.bazel": "6915987c90970493ab97393024c156ea8fb9f3bea953b2f3ec05c34f19b5695c",
"https://bcr.bazel.build/modules/rules_cc/0.0.6/MODULE.bazel": "abf360251023dfe3efcef65ab9d56beefa8394d4176dd29529750e1c57eaa33f",
"https://bcr.bazel.build/modules/rules_cc/0.0.8/MODULE.bazel": "964c85c82cfeb6f3855e6a07054fdb159aced38e99a5eecf7bce9d53990afa3e",
"https://bcr.bazel.build/modules/rules_cc/0.0.9/MODULE.bazel": "836e76439f354b89afe6a911a7adf59a6b2518fafb174483ad78a2a2fde7b1c5",
"https://bcr.bazel.build/modules/rules_cc/0.1.1/MODULE.bazel": "2f0222a6f229f0bf44cd711dc13c858dad98c62d52bd51d8fc3a764a83125513",
"https://bcr.bazel.build/modules/rules_cc/0.1.1/source.json": "d61627377bd7dd1da4652063e368d9366fc9a73920bfa396798ad92172cf645c",
"https://bcr.bazel.build/modules/rules_foreign_cc/0.9.0/MODULE.bazel": "c9e8c682bf75b0e7c704166d79b599f93b72cfca5ad7477df596947891feeef6",
"https://bcr.bazel.build/modules/rules_fuzzing/0.5.2/MODULE.bazel": "40c97d1144356f52905566c55811f13b299453a14ac7769dfba2ac38192337a8",
"https://bcr.bazel.build/modules/rules_fuzzing/0.5.2/source.json": "c8b1e2c717646f1702290959a3302a178fb639d987ab61d548105019f11e527e",
"https://bcr.bazel.build/modules/rules_java/4.0.0/MODULE.bazel": "5a78a7ae82cd1a33cef56dc578c7d2a46ed0dca12643ee45edbb8417899e6f74",
"https://bcr.bazel.build/modules/rules_java/5.3.5/MODULE.bazel": "a4ec4f2db570171e3e5eb753276ee4b389bae16b96207e9d3230895c99644b86",
"https://bcr.bazel.build/modules/rules_java/6.0.0/MODULE.bazel": "8a43b7df601a7ec1af61d79345c17b31ea1fedc6711fd4abfd013ea612978e39",
"https://bcr.bazel.build/modules/rules_java/6.3.0/MODULE.bazel": "a97c7678c19f236a956ad260d59c86e10a463badb7eb2eda787490f4c969b963",
"https://bcr.bazel.build/modules/rules_java/6.4.0/MODULE.bazel": "e986a9fe25aeaa84ac17ca093ef13a4637f6107375f64667a15999f77db6c8f6",
"https://bcr.bazel.build/modules/rules_java/6.5.2/MODULE.bazel": "1d440d262d0e08453fa0c4d8f699ba81609ed0e9a9a0f02cd10b3e7942e61e31",
"https://bcr.bazel.build/modules/rules_java/7.10.0/MODULE.bazel": "530c3beb3067e870561739f1144329a21c851ff771cd752a49e06e3dc9c2e71a",
"https://bcr.bazel.build/modules/rules_java/7.12.2/MODULE.bazel": "579c505165ee757a4280ef83cda0150eea193eed3bef50b1004ba88b99da6de6",
"https://bcr.bazel.build/modules/rules_java/7.2.0/MODULE.bazel": "06c0334c9be61e6cef2c8c84a7800cef502063269a5af25ceb100b192453d4ab",
"https://bcr.bazel.build/modules/rules_java/7.3.2/MODULE.bazel": "50dece891cfdf1741ea230d001aa9c14398062f2b7c066470accace78e412bc2",
"https://bcr.bazel.build/modules/rules_java/7.6.1/MODULE.bazel": "2f14b7e8a1aa2f67ae92bc69d1ec0fa8d9f827c4e17ff5e5f02e91caa3b2d0fe",
"https://bcr.bazel.build/modules/rules_java/8.12.0/MODULE.bazel": "8e6590b961f2defdfc2811c089c75716cb2f06c8a4edeb9a8d85eaa64ee2a761",
"https://bcr.bazel.build/modules/rules_java/8.12.0/source.json": "cbd5d55d9d38d4008a7d00bee5b5a5a4b6031fcd4a56515c9accbcd42c7be2ba",
"https://bcr.bazel.build/modules/rules_jvm_external/4.4.2/MODULE.bazel": "a56b85e418c83eb1839819f0b515c431010160383306d13ec21959ac412d2fe7",
"https://bcr.bazel.build/modules/rules_jvm_external/5.1/MODULE.bazel": "33f6f999e03183f7d088c9be518a63467dfd0be94a11d0055fe2d210f89aa909",
"https://bcr.bazel.build/modules/rules_jvm_external/5.2/MODULE.bazel": "d9351ba35217ad0de03816ef3ed63f89d411349353077348a45348b096615036",
"https://bcr.bazel.build/modules/rules_jvm_external/5.3/MODULE.bazel": "bf93870767689637164657731849fb887ad086739bd5d360d90007a581d5527d",
"https://bcr.bazel.build/modules/rules_jvm_external/6.1/MODULE.bazel": "75b5fec090dbd46cf9b7d8ea08cf84a0472d92ba3585b476f44c326eda8059c4",
"https://bcr.bazel.build/modules/rules_jvm_external/6.3/MODULE.bazel": "c998e060b85f71e00de5ec552019347c8bca255062c990ac02d051bb80a38df0",
"https://bcr.bazel.build/modules/rules_jvm_external/6.3/source.json": "6f5f5a5a4419ae4e37c35a5bb0a6ae657ed40b7abc5a5189111b47fcebe43197",
"https://bcr.bazel.build/modules/rules_kotlin/1.9.0/MODULE.bazel": "ef85697305025e5a61f395d4eaede272a5393cee479ace6686dba707de804d59",
"https://bcr.bazel.build/modules/rules_kotlin/1.9.6/MODULE.bazel": "d269a01a18ee74d0335450b10f62c9ed81f2321d7958a2934e44272fe82dcef3",
"https://bcr.bazel.build/modules/rules_kotlin/1.9.6/source.json": "2faa4794364282db7c06600b7e5e34867a564ae91bda7cae7c29c64e9466b7d5",
"https://bcr.bazel.build/modules/rules_license/0.0.3/MODULE.bazel": "627e9ab0247f7d1e05736b59dbb1b6871373de5ad31c3011880b4133cafd4bd0",
"https://bcr.bazel.build/modules/rules_license/0.0.7/MODULE.bazel": "088fbeb0b6a419005b89cf93fe62d9517c0a2b8bb56af3244af65ecfe37e7d5d",
"https://bcr.bazel.build/modules/rules_license/1.0.0/MODULE.bazel": "a7fda60eefdf3d8c827262ba499957e4df06f659330bbe6cdbdb975b768bb65c",
"https://bcr.bazel.build/modules/rules_license/1.0.0/source.json": "a52c89e54cc311196e478f8382df91c15f7a2bfdf4c6cd0e2675cc2ff0b56efb",
"https://bcr.bazel.build/modules/rules_nodejs/6.2.0/MODULE.bazel": "ec27907f55eb34705adb4e8257952162a2d4c3ed0f0b3b4c3c1aad1fac7be35e",
"https://bcr.bazel.build/modules/rules_nodejs/6.2.0/source.json": "a77c307175a82982f0847fd6a8660db5b21440d8a9d073642cb4afa7a18612ff",
"https://bcr.bazel.build/modules/rules_oci/2.2.6/MODULE.bazel": "2ba6ddd679269e00aeffe9ca04faa2d0ca4129650982c9246d0d459fe2da47d9",
"https://bcr.bazel.build/modules/rules_oci/2.2.6/source.json": "94e7decb8f95d9465b0bbea71c65064cd16083be1350c7468f131818641dc4a5",
"https://bcr.bazel.build/modules/rules_pkg/0.7.0/MODULE.bazel": "df99f03fc7934a4737122518bb87e667e62d780b610910f0447665a7e2be62dc",
"https://bcr.bazel.build/modules/rules_pkg/1.0.1/MODULE.bazel": "5b1df97dbc29623bccdf2b0dcd0f5cb08e2f2c9050aab1092fd39a41e82686ff",
"https://bcr.bazel.build/modules/rules_pkg/1.0.1/source.json": "bd82e5d7b9ce2d31e380dd9f50c111d678c3bdaca190cb76b0e1c71b05e1ba8a",
"https://bcr.bazel.build/modules/rules_proto/4.0.0/MODULE.bazel": "a7a7b6ce9bee418c1a760b3d84f83a299ad6952f9903c67f19e4edd964894e06",
"https://bcr.bazel.build/modules/rules_proto/5.3.0-21.7/MODULE.bazel": "e8dff86b0971688790ae75528fe1813f71809b5afd57facb44dad9e8eca631b7",
"https://bcr.bazel.build/modules/rules_proto/6.0.0/MODULE.bazel": "b531d7f09f58dce456cd61b4579ce8c86b38544da75184eadaf0a7cb7966453f",
"https://bcr.bazel.build/modules/rules_proto/6.0.2/MODULE.bazel": "ce916b775a62b90b61888052a416ccdda405212b6aaeb39522f7dc53431a5e73",
"https://bcr.bazel.build/modules/rules_proto/7.0.2/MODULE.bazel": "bf81793bd6d2ad89a37a40693e56c61b0ee30f7a7fdbaf3eabbf5f39de47dea2",
"https://bcr.bazel.build/modules/rules_proto/7.0.2/source.json": "1e5e7260ae32ef4f2b52fd1d0de8d03b606a44c91b694d2f1afb1d3b28a48ce1",
"https://bcr.bazel.build/modules/rules_python/0.10.2/MODULE.bazel": "cc82bc96f2997baa545ab3ce73f196d040ffb8756fd2d66125a530031cd90e5f",
"https://bcr.bazel.build/modules/rules_python/0.23.1/MODULE.bazel": "49ffccf0511cb8414de28321f5fcf2a31312b47c40cc21577144b7447f2bf300",
"https://bcr.bazel.build/modules/rules_python/0.25.0/MODULE.bazel": "72f1506841c920a1afec76975b35312410eea3aa7b63267436bfb1dd91d2d382",
"https://bcr.bazel.build/modules/rules_python/0.28.0/MODULE.bazel": "cba2573d870babc976664a912539b320cbaa7114cd3e8f053c720171cde331ed",
"https://bcr.bazel.build/modules/rules_python/0.31.0/MODULE.bazel": "93a43dc47ee570e6ec9f5779b2e64c1476a6ce921c48cc9a1678a91dd5f8fd58",
"https://bcr.bazel.build/modules/rules_python/0.4.0/MODULE.bazel": "9208ee05fd48bf09ac60ed269791cf17fb343db56c8226a720fbb1cdf467166c",
"https://bcr.bazel.build/modules/rules_python/0.40.0/MODULE.bazel": "9d1a3cd88ed7d8e39583d9ffe56ae8a244f67783ae89b60caafc9f5cf318ada7",
"https://bcr.bazel.build/modules/rules_python/0.40.0/source.json": "939d4bd2e3110f27bfb360292986bb79fd8dcefb874358ccd6cdaa7bda029320",
"https://bcr.bazel.build/modules/rules_rust/0.61.0/MODULE.bazel": "0318a95777b9114c8740f34b60d6d68f9cfef61e2f4b52424ca626213d33787b",
"https://bcr.bazel.build/modules/rules_rust/0.61.0/source.json": "d1bc743b5fa2e2abb35c436df7126a53dab0c3f35890ae6841592b2253786a63",
"https://bcr.bazel.build/modules/rules_shell/0.2.0/MODULE.bazel": "fda8a652ab3c7d8fee214de05e7a9916d8b28082234e8d2c0094505c5268ed3c",
"https://bcr.bazel.build/modules/rules_shell/0.3.0/MODULE.bazel": "de4402cd12f4cc8fda2354fce179fdb068c0b9ca1ec2d2b17b3e21b24c1a937b",
"https://bcr.bazel.build/modules/rules_shell/0.4.0/MODULE.bazel": "0f8f11bb3cd11755f0b48c1de0bbcf62b4b34421023aa41a2fc74ef68d9584f0",
"https://bcr.bazel.build/modules/rules_shell/0.4.0/source.json": "1d7fa7f941cd41dc2704ba5b4edc2e2230eea1cc600d80bd2b65838204c50b95",
"https://bcr.bazel.build/modules/stardoc/0.5.1/MODULE.bazel": "1a05d92974d0c122f5ccf09291442580317cdd859f07a8655f1db9a60374f9f8",
"https://bcr.bazel.build/modules/stardoc/0.5.3/MODULE.bazel": "c7f6948dae6999bf0db32c1858ae345f112cacf98f174c7a8bb707e41b974f1c",
"https://bcr.bazel.build/modules/stardoc/0.5.4/MODULE.bazel": "6569966df04610b8520957cb8e97cf2e9faac2c0309657c537ab51c16c18a2a4",
"https://bcr.bazel.build/modules/stardoc/0.5.6/MODULE.bazel": "c43dabc564990eeab55e25ed61c07a1aadafe9ece96a4efabb3f8bf9063b71ef",
"https://bcr.bazel.build/modules/stardoc/0.6.2/MODULE.bazel": "7060193196395f5dd668eda046ccbeacebfd98efc77fed418dbe2b82ffaa39fd",
"https://bcr.bazel.build/modules/stardoc/0.7.0/MODULE.bazel": "05e3d6d30c099b6770e97da986c53bd31844d7f13d41412480ea265ac9e8079c",
"https://bcr.bazel.build/modules/stardoc/0.7.1/MODULE.bazel": "3548faea4ee5dda5580f9af150e79d0f6aea934fc60c1cc50f4efdd9420759e7",
"https://bcr.bazel.build/modules/stardoc/0.7.1/source.json": "b6500ffcd7b48cd72c29bb67bcac781e12701cc0d6d55d266a652583cfcdab01",
"https://bcr.bazel.build/modules/upb/0.0.0-20220923-a547704/MODULE.bazel": "7298990c00040a0e2f121f6c32544bab27d4452f80d9ce51349b1a28f3005c43",
"https://bcr.bazel.build/modules/zlib/1.2.11/MODULE.bazel": "07b389abc85fdbca459b69e2ec656ae5622873af3f845e1c9d80fe179f3effa0",
"https://bcr.bazel.build/modules/zlib/1.3.1.bcr.5/MODULE.bazel": "eec517b5bbe5492629466e11dae908d043364302283de25581e3eb944326c4ca",
"https://bcr.bazel.build/modules/zlib/1.3.1.bcr.5/source.json": "22bc55c47af97246cfc093d0acf683a7869377de362b5d1c552c2c2e16b7a806",
"https://bcr.bazel.build/modules/zlib/1.3.1/MODULE.bazel": "751c9940dcfe869f5f7274e1295422a34623555916eb98c174c1e945594bf198"
},
"selectedYankedVersions": {},
"moduleExtensions": {
"@@apple_support+//crosstool:setup.bzl%apple_cc_configure_extension": {
"general": {
"bzlTransitiveDigest": "xcBTf2+GaloFpg7YEh/Bv+1yAczRkiCt3DGws4K7kSk=",
"usagesDigest": "3L+PK6aRnliv0iIS8m3kdo+LjmvjJWoFCm3qZcPSg+8=",
"recordedFileInputs": {},
"recordedDirentsInputs": {},
"envVariables": {},
"generatedRepoSpecs": {
"local_config_apple_cc_toolchains": {
"repoRuleId": "@@apple_support+//crosstool:setup.bzl%_apple_cc_autoconf_toolchains",
"attributes": {}
},
"local_config_apple_cc": {
"repoRuleId": "@@apple_support+//crosstool:setup.bzl%_apple_cc_autoconf",
"attributes": {}
}
},
"recordedRepoMappingEntries": [
[
"apple_support+",
"bazel_tools",
"bazel_tools"
],
[
"bazel_tools",
"rules_cc",
"rules_cc+"
]
]
}
},
"@@rules_kotlin+//src/main/starlark/core/repositories:bzlmod_setup.bzl%rules_kotlin_extensions": {
"general": {
"bzlTransitiveDigest": "hUTp2w+RUVdL7ma5esCXZJAFnX7vLbVfLd7FwnQI6bU=",
"usagesDigest": "QI2z8ZUR+mqtbwsf2fLqYdJAkPOHdOV+tF2yVAUgRzw=",
"recordedFileInputs": {},
"recordedDirentsInputs": {},
"envVariables": {},
"generatedRepoSpecs": {
"com_github_jetbrains_kotlin_git": {
"repoRuleId": "@@rules_kotlin+//src/main/starlark/core/repositories:compiler.bzl%kotlin_compiler_git_repository",
"attributes": {
"urls": [
"https://github.com/JetBrains/kotlin/releases/download/v1.9.23/kotlin-compiler-1.9.23.zip"
],
"sha256": "93137d3aab9afa9b27cb06a824c2324195c6b6f6179d8a8653f440f5bd58be88"
}
},
"com_github_jetbrains_kotlin": {
"repoRuleId": "@@rules_kotlin+//src/main/starlark/core/repositories:compiler.bzl%kotlin_capabilities_repository",
"attributes": {
"git_repository_name": "com_github_jetbrains_kotlin_git",
"compiler_version": "1.9.23"
}
},
"com_github_google_ksp": {
"repoRuleId": "@@rules_kotlin+//src/main/starlark/core/repositories:ksp.bzl%ksp_compiler_plugin_repository",
"attributes": {
"urls": [
"https://github.com/google/ksp/releases/download/1.9.23-1.0.20/artifacts.zip"
],
"sha256": "ee0618755913ef7fd6511288a232e8fad24838b9af6ea73972a76e81053c8c2d",
"strip_version": "1.9.23-1.0.20"
}
},
"com_github_pinterest_ktlint": {
"repoRuleId": "@@bazel_tools//tools/build_defs/repo:http.bzl%http_file",
"attributes": {
"sha256": "01b2e0ef893383a50dbeb13970fe7fa3be36ca3e83259e01649945b09d736985",
"urls": [
"https://github.com/pinterest/ktlint/releases/download/1.3.0/ktlint"
],
"executable": true
}
},
"rules_android": {
"repoRuleId": "@@bazel_tools//tools/build_defs/repo:http.bzl%http_archive",
"attributes": {
"sha256": "cd06d15dd8bb59926e4d65f9003bfc20f9da4b2519985c27e190cddc8b7a7806",
"strip_prefix": "rules_android-0.1.1",
"urls": [
"https://github.com/bazelbuild/rules_android/archive/v0.1.1.zip"
]
}
}
},
"recordedRepoMappingEntries": [
[
"rules_kotlin+",
"bazel_tools",
"bazel_tools"
]
]
}
},
"@@rules_oci+//oci:extensions.bzl%oci": {
"general": {
"bzlTransitiveDigest": "KHcdN2ovRQGX1MKsH0nGoGPFd/84U43tssN2jImCeJU=",
"usagesDigest": "/O1PwnnkqSBmI9Oe08ZYYqjM4IS8JR+/9rjgzVTNDaQ=",
"recordedFileInputs": {},
"recordedDirentsInputs": {},
"envVariables": {},
"generatedRepoSpecs": {
"oci_crane_darwin_amd64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%crane_repositories",
"attributes": {
"platform": "darwin_amd64",
"crane_version": "v0.18.0"
}
},
"oci_crane_darwin_arm64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%crane_repositories",
"attributes": {
"platform": "darwin_arm64",
"crane_version": "v0.18.0"
}
},
"oci_crane_linux_arm64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%crane_repositories",
"attributes": {
"platform": "linux_arm64",
"crane_version": "v0.18.0"
}
},
"oci_crane_linux_armv6": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%crane_repositories",
"attributes": {
"platform": "linux_armv6",
"crane_version": "v0.18.0"
}
},
"oci_crane_linux_i386": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%crane_repositories",
"attributes": {
"platform": "linux_i386",
"crane_version": "v0.18.0"
}
},
"oci_crane_linux_s390x": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%crane_repositories",
"attributes": {
"platform": "linux_s390x",
"crane_version": "v0.18.0"
}
},
"oci_crane_linux_amd64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%crane_repositories",
"attributes": {
"platform": "linux_amd64",
"crane_version": "v0.18.0"
}
},
"oci_crane_windows_armv6": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%crane_repositories",
"attributes": {
"platform": "windows_armv6",
"crane_version": "v0.18.0"
}
},
"oci_crane_windows_amd64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%crane_repositories",
"attributes": {
"platform": "windows_amd64",
"crane_version": "v0.18.0"
}
},
"oci_crane_toolchains": {
"repoRuleId": "@@rules_oci+//oci/private:toolchains_repo.bzl%toolchains_repo",
"attributes": {
"toolchain_type": "@rules_oci//oci:crane_toolchain_type",
"toolchain": "@oci_crane_{platform}//:crane_toolchain"
}
},
"oci_regctl_darwin_amd64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%regctl_repositories",
"attributes": {
"platform": "darwin_amd64"
}
},
"oci_regctl_darwin_arm64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%regctl_repositories",
"attributes": {
"platform": "darwin_arm64"
}
},
"oci_regctl_linux_arm64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%regctl_repositories",
"attributes": {
"platform": "linux_arm64"
}
},
"oci_regctl_linux_s390x": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%regctl_repositories",
"attributes": {
"platform": "linux_s390x"
}
},
"oci_regctl_linux_amd64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%regctl_repositories",
"attributes": {
"platform": "linux_amd64"
}
},
"oci_regctl_windows_amd64": {
"repoRuleId": "@@rules_oci+//oci:repositories.bzl%regctl_repositories",
"attributes": {
"platform": "windows_amd64"
}
},
"oci_regctl_toolchains": {
"repoRuleId": "@@rules_oci+//oci/private:toolchains_repo.bzl%toolchains_repo",
"attributes": {
"toolchain_type": "@rules_oci//oci:regctl_toolchain_type",
"toolchain": "@oci_regctl_{platform}//:regctl_toolchain"
}
}
},
"moduleExtensionMetadata": {
"explicitRootModuleDirectDeps": [],
"explicitRootModuleDirectDevDeps": [],
"useAllRepos": "NO",
"reproducible": false
},
"recordedRepoMappingEntries": [
[
"aspect_bazel_lib+",
"bazel_tools",
"bazel_tools"
],
[
"bazel_features+",
"bazel_tools",
"bazel_tools"
],
[
"rules_oci+",
"aspect_bazel_lib",
"aspect_bazel_lib+"
],
[
"rules_oci+",
"bazel_features",
"bazel_features+"
],
[
"rules_oci+",
"bazel_skylib",
"bazel_skylib+"
]
]
}
}
}
}

View file

@ -1,22 +0,0 @@
# Hello World

Demonstrates a simple parameterized job.

## Configure

```bash
$ bazel run //:test_job.cfg test_output
{"outputs":["test_output"],"inputs":[],"args":["will", "build", "test_output"],"env":{"foo":"bar"}}
```

## Execute

Doesn't actually write an output.

```bash
$ bazel run //:test_job.cfg test_output | bazel run //:test_job
EXECUTE!
foo=bar
args=will build test_output
```

View file

@ -1,8 +0,0 @@
sh_test(
    name = "test",
    srcs = ["test.sh"],
    data = [
        "//:test_job.cfg",
        "//:test_job.exec",
    ],
)

View file

@ -1,5 +0,0 @@
#!/usr/bin/env bash
test_job.cfg nice
test_job.cfg cool | jq -c ".configs[0]" | test_job.exec

View file

@ -1,21 +0,0 @@
#!/bin/bash
# Simple unified job that handles both config and exec via subcommands
case "${1:-}" in
  "config")
    # Configuration mode - output job config JSON
    partition_ref="${2:-}"
    echo "{\"configs\":[{\"outputs\":[{\"str\":\"${partition_ref}\"}],\"inputs\":[],\"args\":[\"will\", \"build\", \"${partition_ref}\"],\"env\":{\"foo\":\"bar\"}}]}"
    ;;
  "exec")
    # Execution mode - run the job
    echo 'EXECUTE UNIFIED!'
    echo "foo=$foo"
    echo "args=$@"
    ;;
  *)
    echo "Usage: $0 {config|exec} [args...]"
    exit 1
    ;;
esac

View file

@ -184,12 +184,6 @@ databuild_job(
    binary = ":test_job_binary",
)
py_binary(
    name = "test_job_binary",
    srcs = ["unified_job.py"],
    main = "unified_job.py",
)
# Test target
py_binary(
    name = "test_jobs",

View file

@ -0,0 +1,9 @@
<?xml version="1.0" encoding="UTF-8"?>
<module type="PYTHON_MODULE" version="4">
  <component name="NewModuleRootManager" inherit-compiler-output="true">
    <exclude-output />
    <content url="file://$MODULE_DIR$" />
    <orderEntry type="inheritedJdk" />
    <orderEntry type="sourceFolder" forTests="false" />
  </component>
</module>

View file

@ -1,47 +0,0 @@
#!/usr/bin/env python3
import json
import os
import sys


def main():
    if len(sys.argv) < 2:
        print("Usage: unified_job.py {config|exec} [args...]", file=sys.stderr)
        sys.exit(1)
    command = sys.argv[1]
    if command == "config":
        handle_config(sys.argv[2:])
    elif command == "exec":
        handle_exec(sys.argv[2:])
    else:
        print(f"Unknown command: {command}", file=sys.stderr)
        print("Usage: unified_job.py {config|exec} [args...]", file=sys.stderr)
        sys.exit(1)


def handle_config(args):
    if len(args) < 1:
        print("Config mode requires partition ref", file=sys.stderr)
        sys.exit(1)
    partition_ref = args[0]
    config = {
        "configs": [{
            "outputs": [{"str": partition_ref}],
            "inputs": [],
            "args": ["Hello", "gorgeous", partition_ref],
            "env": {"PARTITION_REF": partition_ref}
        }]
    }
    print(json.dumps(config))


def handle_exec(args):
    print("What a time to be alive.")
    print(f"Partition ref: {os.getenv('PARTITION_REF', 'unknown')}")
    print(f"Args: {args}")


if __name__ == "__main__":
    main()

284
plans/job-wrapper.md Normal file
View file

@ -0,0 +1,284 @@
# Job Wrapper v2 Plan
## Overview
The job wrapper is a critical component that mediates between DataBuild graphs and job executables, providing observability, error handling, and state management. This plan describes the next generation job wrapper implementation in Rust.
## Architecture
### Core Design Principles
1. **Single Communication Channel**: Jobs communicate with graphs exclusively through structured logs
2. **Platform Agnostic**: Works identically across local, Docker, K8s, and cloud platforms
3. **Zero Network Requirements**: Jobs don't need to connect to any services
4. **Fail-Safe**: Graceful handling of job crashes and fast completions
### Communication Model
```
Graph → Job: Launch with JobConfig (via CLI args/env)
Job → Graph: Structured logs (stdout)
Graph: Tails logs and interprets into metrics, events, and manifests
```
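As a sketch of this channel, the wrapper can frame each log entry as one JSON object per stdout line, which the graph parses line by line. The `frame_entry` helper and the field subset it emits are illustrative assumptions, not the real wrapper API; the field names follow the `JobLogEntry` message defined below.

```rust
// Hypothetical sketch: frame one structured log entry per stdout line
// (JSON lines), hand-rolling the JSON to stay dependency-free.
fn frame_entry(job_id: &str, seq: u64, event_type: &str) -> String {
    format!(
        "{{\"job_id\":\"{}\",\"sequence_number\":{},\"content\":{{\"event\":{{\"event_type\":\"{}\"}}}}}}",
        job_id, seq, event_type
    )
}

fn main() {
    // The graph tails these lines and parses each one independently.
    println!("{}", frame_entry("job-42", 1, "job_config_started"));
    println!("{}", frame_entry("job-42", 2, "task_launched"));
}
```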
## Structured Log Protocol
### Message Format (Protobuf)
```proto
message JobLogEntry {
  string timestamp = 1;
  string job_id = 2;
  string partition_ref = 3;
  uint64 sequence_number = 4; // Monotonic sequence starting from 1

  oneof content {
    LogMessage log = 5;
    MetricPoint metric = 6;
    JobEvent event = 7;
    PartitionManifest manifest = 8;
  }
}

message LogMessage {
  enum LogLevel {
    DEBUG = 0;
    INFO = 1;
    WARN = 2;
    ERROR = 3;
  }
  LogLevel level = 1;
  string message = 2;
  map<string, string> fields = 3;
}

message MetricPoint {
  string name = 1;
  double value = 2;
  map<string, string> labels = 3;
  string unit = 4;
}

message JobEvent {
  string event_type = 1; // "task_launched", "heartbeat", "task_completed", etc.
  google.protobuf.Any details = 2;
  map<string, string> metadata = 3;
}
```
### Log Stream Lifecycle
1. Wrapper emits `job_config_started` event (sequence #1)
2. Wrapper validates configuration
3. Wrapper emits `task_launched` event (sequence #2)
4. Job executes, wrapper captures stdout/stderr (sequence #3+)
5. Wrapper emits periodic `heartbeat` events (every 30s)
6. Wrapper detects job completion
7. Wrapper emits `PartitionManifest` message (final required message with highest sequence number)
8. Wrapper exits
The PartitionManifest serves as the implicit end-of-logs marker - the graph knows processing is complete when it sees this message. Sequence numbers enable the graph to detect missing or out-of-order messages and ensure reliable telemetry collection.
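The graph-side check this enables can be sketched as follows; the `Entry` and `EntryKind` types are simplified stand-ins for the decoded `JobLogEntry`, and `stream_is_complete` is an illustrative name, not part of any existing API.

```rust
// Simplified stand-in for a decoded JobLogEntry.
#[derive(PartialEq)]
enum EntryKind { Log, Metric, Event, Manifest }

struct Entry { seq: u64, kind: EntryKind }

fn stream_is_complete(entries: &[Entry]) -> bool {
    // Detect gaps or reordering via the monotonic sequence numbers.
    let contiguous = entries
        .iter()
        .enumerate()
        .all(|(i, e)| e.seq == (i as u64) + 1);
    // The PartitionManifest is the implicit end-of-logs marker.
    let ends_with_manifest =
        matches!(entries.last(), Some(e) if e.kind == EntryKind::Manifest);
    contiguous && ends_with_manifest
}

fn main() {
    let ok = vec![
        Entry { seq: 1, kind: EntryKind::Event },
        Entry { seq: 2, kind: EntryKind::Log },
        Entry { seq: 3, kind: EntryKind::Manifest },
    ];
    assert!(stream_is_complete(&ok));

    // A gap (seq 2 missing) means telemetry was lost; the stream is incomplete.
    let gap = vec![
        Entry { seq: 1, kind: EntryKind::Event },
        Entry { seq: 3, kind: EntryKind::Manifest },
    ];
    assert!(!stream_is_complete(&gap));
}
```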
## Wrapper Implementation
### Interfaces
```rust
trait JobWrapper {
    // Config mode - accepts PartitionRef objects
    fn config(outputs: Vec<PartitionRef>) -> Result<JobConfig>;

    // Exec mode - accepts serialized JobConfig
    fn exec(config: JobConfig) -> Result<()>;
}
```
### Exit Code Standards
Following POSIX conventions and avoiding collisions with standard exit codes:
Reference:
- https://manpages.ubuntu.com/manpages/noble/man3/sysexits.h.3head.html
- https://tldp.org/LDP/abs/html/exitcodes.html
```rust
// Standard POSIX codes we respect:
// 0  - Success
// 1  - General error
// 2  - Misuse of shell builtin
// 64 - Command line usage error (EX_USAGE)
// 65 - Data format error (EX_DATAERR)
// 66 - Cannot open input (EX_NOINPUT)
// 69 - Service unavailable (EX_UNAVAILABLE)
// 70 - Internal software error (EX_SOFTWARE)
// 71 - System error (EX_OSERR)
// 73 - Can't create output file (EX_CANTCREAT)
// 74 - Input/output error (EX_IOERR)
// 75 - Temp failure; retry (EX_TEMPFAIL)
// 77 - Permission denied (EX_NOPERM)
// 78 - Configuration error (EX_CONFIG)

// DataBuild-specific codes (100+ to avoid collisions):
// 100-109 - User-defined permanent failures
// 110-119 - User-defined transient failures
// 120-129 - User-defined resource failures
// 130+    - Other user-defined codes

enum ExitCodeCategory {
    Success,          // 0
    StandardError,    // 1-63 (shell/system)
    PosixError,       // 64-78 (sysexits.h)
    TransientFailure, // 75 (EX_TEMPFAIL) or 110-119; takes precedence over the two ranges above
    UserDefined,      // 100+
}
```
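A hedged sketch of how the wrapper might map a raw exit code onto these categories. Resolving the overlap between EX_TEMPFAIL (75), the POSIX range, and 110-119 by checking transient codes first is an assumption of this sketch, not settled design.

```rust
#[derive(Debug, PartialEq)]
enum ExitCodeCategory {
    Success,
    StandardError,
    PosixError,
    TransientFailure,
    UserDefined,
}

fn categorize(code: i32) -> ExitCodeCategory {
    match code {
        0 => ExitCodeCategory::Success,
        // Transient codes are matched first so 75 is not swallowed by 64..=78.
        75 | 110..=119 => ExitCodeCategory::TransientFailure, // retryable
        64..=78 => ExitCodeCategory::PosixError,              // sysexits.h
        1..=63 => ExitCodeCategory::StandardError,            // shell/system
        _ => ExitCodeCategory::UserDefined,                   // 100+ and anything else
    }
}

fn main() {
    assert_eq!(categorize(0), ExitCodeCategory::Success);
    assert_eq!(categorize(75), ExitCodeCategory::TransientFailure);
    assert_eq!(categorize(78), ExitCodeCategory::PosixError);
    assert_eq!(categorize(112), ExitCodeCategory::TransientFailure);
    assert_eq!(categorize(101), ExitCodeCategory::UserDefined);
}
```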
## Platform-Specific Log Handling
### Local Execution
- Graph spawns wrapper process
- Graph reads from stdout pipe directly
- PartitionManifest indicates completion
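The local path can be sketched with the standard library alone — here `sh -c` stands in for the real wrapper binary, and `run_and_tail` is a hypothetical helper name:

```rust
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};

// Sketch: the graph spawns the wrapper process and reads its stdout
// line by line until the pipe closes.
fn run_and_tail(cmd: &str) -> std::io::Result<Vec<String>> {
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(cmd)
        .stdout(Stdio::piped())
        .spawn()?;
    let stdout = child.stdout.take().expect("stdout was piped");
    // Each line would be parsed as one structured log message.
    let lines: Vec<String> = BufReader::new(stdout)
        .lines()
        .collect::<Result<_, _>>()?;
    child.wait()?;
    Ok(lines)
}
```

In the real graph, each line would be fed to the log parser rather than collected, and the loop would stop once the `PartitionManifest` message is seen.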
### Docker
- Graph runs `docker run` with wrapper as entrypoint
- Graph uses `docker logs -f` to tail output
- Logs persist after container exit
### Kubernetes
- Job pods use wrapper as container entrypoint
- Graph tails logs via K8s API
- Configure `terminationGracePeriodSeconds` for log retention
### Cloud Run / Lambda
- Wrapper logs to platform logging service
- Graph queries logs via platform API
- Natural buffering and persistence
## Observability Features
### Metrics Collection
For metrics, we'll use a simplified StatsD-like format in our structured logs, which the graph can aggregate and expose via Prometheus format:
```json
{
"timestamp": "2025-01-27T10:30:45Z",
"content": {
"metric": {
"name": "rows_processed",
"value": 1500000,
"labels": {
"partition": "date=2025-01-27",
"stage": "transform"
},
"unit": "count"
}
}
}
```
The graph component will:
- Aggregate metrics from job logs
- Expose them in Prometheus format for scraping (when running as a service)
- Store summary metrics in the BEL for historical analysis
For CLI-invoked builds, metrics are still captured in the BEL but not exposed for scraping (which is acceptable since these are typically one-off runs).
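A minimal sketch of the graph-side aggregation step, using only the standard library — `MetricAggregator` is a hypothetical name, and labels are pre-serialized into a single key for brevity where a real implementation would keep them structured:

```rust
use std::collections::HashMap;

// Sketch: sum counter-style metrics by (name, labels) before exposing
// them in a Prometheus-style exposition format.
#[derive(Default)]
struct MetricAggregator {
    counters: HashMap<(String, String), f64>,
}

impl MetricAggregator {
    fn record(&mut self, name: &str, labels: &str, value: f64) {
        *self
            .counters
            .entry((name.to_string(), labels.to_string()))
            .or_insert(0.0) += value;
    }

    /// Render one `name{labels} value` line per series, sorted for stability.
    fn render(&self) -> String {
        let mut entries: Vec<_> = self.counters.iter().collect();
        entries.sort_by(|a, b| a.0.cmp(b.0));
        let mut out = String::new();
        for ((name, labels), value) in entries {
            out.push_str(&format!("{}{{{}}} {}\n", name, labels, value));
        }
        out
    }
}
```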
### Heartbeating
Fixed 30-second heartbeat interval (based on Kubernetes best practices):
```json
{
"timestamp": "2025-01-27T10:30:45Z",
"content": {
"event": {
"event_type": "heartbeat",
"metadata": {
"memory_usage_mb": "1024",
"cpu_usage_percent": "85.2"
}
}
}
}
```
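One way the wrapper might drive this: a background thread that emits a heartbeat every interval until the job-completion signal arrives. This is a sketch with a parameterized interval (30s in production); `spawn_heartbeat` is a hypothetical helper:

```rust
use std::sync::mpsc::{channel, RecvTimeoutError, Sender};
use std::thread;
use std::time::Duration;

// Sketch: emit a heartbeat event on a fixed interval until stopped.
// Returns a sender; sending () (or dropping it) stops the loop.
fn spawn_heartbeat(interval: Duration, emit: Sender<&'static str>) -> Sender<()> {
    let (stop_tx, stop_rx) = channel::<()>();
    thread::spawn(move || loop {
        match stop_rx.recv_timeout(interval) {
            // No stop signal within one interval: emit a heartbeat.
            Err(RecvTimeoutError::Timeout) => {
                if emit.send("heartbeat").is_err() {
                    break; // receiver gone, nothing left to notify
                }
            }
            // Stop requested or sender dropped: exit the loop.
            _ => break,
        }
    });
    stop_tx
}
```

In the wrapper, the emit side would serialize the full heartbeat event (including the memory/CPU metadata shown above) to stdout rather than send a string.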
### Log Bandwidth Limits
To prevent log flooding:
- Maximum log rate: 1000 messages/second
- Maximum message size: 1MB
- If limits are exceeded: the wrapper emits a rate-limit warning and drops messages
- Final metrics report the dropped-message count
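The per-second budget with drop accounting can be sketched as follows. Time is passed in explicitly (whole seconds since start) so the logic is deterministic; the wrapper would read a monotonic clock instead. `RateLimiter` is a hypothetical name:

```rust
// Sketch: allow up to `max_per_second` messages per one-second window,
// counting everything over budget as dropped.
struct RateLimiter {
    max_per_second: u64,
    window: u64,          // which one-second window we are counting in
    sent_in_window: u64,
    dropped: u64,
}

impl RateLimiter {
    fn new(max_per_second: u64) -> Self {
        RateLimiter { max_per_second, window: 0, sent_in_window: 0, dropped: 0 }
    }

    /// Returns true if a message may be emitted at time `now_secs`.
    fn allow(&mut self, now_secs: u64) -> bool {
        if now_secs != self.window {
            // New window: reset the budget.
            self.window = now_secs;
            self.sent_in_window = 0;
        }
        if self.sent_in_window < self.max_per_second {
            self.sent_in_window += 1;
            true
        } else {
            self.dropped += 1; // surfaced in the final metrics
            false
        }
    }
}
```

The 1MB message-size cap would be a separate check before this one, since an oversized message should be rejected regardless of rate.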
## Testing Strategy
### Unit Tests
- Log parsing and serialization
- Exit code categorization
- Rate limiting behavior
- State machine transitions
### Integration Tests
- Full job execution lifecycle
- Platform-specific log tailing
- Fast job completion handling
- Large log volume handling
### Platform Tests
- Local process execution
- Docker container runs
- Kubernetes job pods
- Cloud Run invocations
### Failure Scenario Tests
- Job crashes (SIGSEGV, SIGKILL)
- Wrapper crashes
- Log tailing interruptions
- Platform-specific failures
## Implementation Phases
### Phase 0: Minimal Bootstrap
Implement the absolute minimum to unblock development and testing:
- Simple JSON-based logging (no protobuf yet)
- Basic wrapper that only handles happy path
- Support for local execution only
- Minimal log parsing in graph
- Integration with existing example jobs
This phase delivers a working end-to-end system that can then be evolved incrementally.
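The Phase 0 "simple JSON-based logging" could be as small as a single helper that formats one JSON object per line. A sketch — field names mirror the examples in this document, `log_event` is a hypothetical name, and escaping covers only quotes and backslashes for brevity:

```rust
// Sketch: emit one JSON log line per event, no protobuf, no dependencies.
fn log_event(timestamp: &str, event_type: &str) -> String {
    // Minimal escaping; a real implementation would cover control characters too.
    fn esc(s: &str) -> String {
        s.replace('\\', "\\\\").replace('"', "\\\"")
    }
    format!(
        "{{\"timestamp\":\"{}\",\"content\":{{\"event\":{{\"event_type\":\"{}\"}}}}}}",
        esc(timestamp),
        esc(event_type)
    )
}
```

Keeping the Phase 0 field layout identical to the target schema means the graph-side parser written now keeps working when the wrapper later switches to generated protobuf JSON.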
### Phase 1: Core Protocol
- Define protobuf schemas
- Implement structured logger
- Add error handling and exit codes
- Implement heartbeating
- Graph-side log parser improvements
### Phase 2: Platform Support
- Docker integration
- Kubernetes support
- Cloud platform adapters
- Platform-specific testing
### Phase 3: Production Hardening
- Rate limiting
- Error recovery
- Performance optimization
- Monitoring integration
### Phase 4: Advanced Features
- In-process config for library jobs
- Custom metrics backends
- Advanced failure analysis
## Success Criteria
1. **Zero Network Dependencies**: Jobs run without any network access
2. **Platform Parity**: Identical behavior across all execution platforms
3. **Minimal Overhead**: < 100ms wrapper overhead for config, < 1s for exec
4. **Complete Observability**: All job state changes captured in logs
5. **Graceful Failures**: No log data loss even in crash scenarios
## Next Steps
1. Implement minimal bootstrap wrapper
2. Test with existing example jobs
3. Iterate on log format based on real usage
4. Gradually add features per implementation phases