Compare commits

...

2 commits

Author SHA1 Message Date
c2bd4f230c Add explicit requirements.md
2025-09-07 18:03:55 -07:00
cf449529a3 lets go 2025-09-03 21:32:17 -07:00
187 changed files with 581 additions and 30043 deletions

71
AGENTS.md Normal file

@@ -0,0 +1,71 @@
# Agent Instructions
## Project Overview
DataBuild is a bazel-based data build system. Key files:
- [`DESIGN.md`](./DESIGN.md) - Overall design of databuild
- [`databuild.proto`](databuild/databuild.proto) - System interfaces
- Component designs - design docs for specific aspects or components of databuild:
    - [Core build](./design/core-build.md) - How the core semantics of databuild work and are implemented
- [Build event log](./design/build-event-log.md) - How the build event log works and is accessed
- [Service](./design/service.md) - How the databuild HTTP service and web app are designed.
- [Glossary](./design/glossary.md) - Centralized description of key terms.
- [Graph specification](./design/graph-specification.md) - Describes the different libraries that enable more succinct declaration of databuild applications than the core bazel-based interface.
- [Deploy strategies](./design/deploy-strategies.md) - Different strategies for deploying databuild applications.
- [Wants](./design/wants.md) - How triggering works in databuild applications.
    - [Why databuild?](./design/why-databuild.md) - Why to choose databuild instead of other, better-established orchestration solutions.
Please reference these for any related work, as they indicate the key technical bias/direction of the project.
## Tenets
- Declarative over imperative wherever possible/reasonable.
- We are building for the future, and choose to do "the right thing" rather than taking shortcuts to get unstuck. If you get stuck, pause and ask for help/input.
- Do not add "unknown" results when parses or matches fail - these should always throw.
- Compile-time correctness is a super-power, and investing in it speeds up the flywheel for development and user value.
- **CLI/Service Interchangeability**: Both the CLI and service must produce identical artifacts (BEL events, logs, metrics, outputs) in the same locations. Users should be able to build with one interface and query/inspect results from the other seamlessly. This principle applies to all DataBuild operations, not just builds.
## Build & Test
```bash
# Build all databuild components
bazel build //...
# Run databuild unit tests
bazel test //...
# Run end-to-end tests (validates CLI vs Service consistency)
./run_e2e_tests.sh
# Do not try to `bazel test //examples/basic_graph/...`, as this will not work.
```
## Project Structure
- `databuild/` - Core system (Rust/Proto)
- `examples/` - Example implementations
- `scripts/` - Build utilities
## DataBuild Job Architecture
### Job Target Structure
Each DataBuild job creates the following Bazel targets (see the sketch below):
- `job_name.exec` - Execution target (calls binary with "exec" subcommand)
- `job_name` - Main job target (pipes config output to exec input)
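A minimal job declaration might look like the sketch below. The `binary` attribute name and the `load` path are assumptions for illustration, not the documented API of the `databuild_job` rule; check the rule definition for the real signature.
```python
# Hypothetical sketch; attribute names and load path are assumptions.
load("//databuild:rules.bzl", "databuild_job")

databuild_job(
    name = "job1",          # produces :job1 and :job1.exec
    binary = ":job1_bin",   # the binary the job wrapper executes
)
```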
### Graph Configuration
```python
databuild_graph(
name = "my_graph",
jobs = [":job1", ":job2"], # Reference base job targets
lookup = ":job_lookup", # Binary that routes partition refs to jobs
)
```
### Job Lookup Pattern
```python
def lookup_job_for_partition(partition_ref: str) -> str:
if pattern.match(partition_ref):
return "//:job_name" # Return base job target
raise ValueError(f"No job found for: {partition_ref}")
```
## Notes / Tips
- Rust dependencies are managed via rules_rust, so new dependencies should be added to the `MODULE.bazel` file (see the sketch below).
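As a rough sketch of that workflow (Starlark in `MODULE.bazel`), adding a crate via rules_rust's crate_universe module extension usually looks like the following; the exact extension label, tag names, and versions vary by rules_rust release, so treat the specifics as assumptions and check the rules_rust docs.
```python
# MODULE.bazel sketch; extension label/attrs may differ by rules_rust version.
bazel_dep(name = "rules_rust", version = "0.54.1")  # version is illustrative

crate = use_extension("@rules_rust//crate_universe:extension.bzl", "crate")
crate.spec(package = "serde_json", version = "1")   # declare the new crate
crate.from_specs()
use_repo(crate, "crates")  # BUILD files then depend on "@crates//:serde_json"
```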

105
CLAUDE.md

@@ -1,105 +0,0 @@
# Agent Instructions
## Project Overview
DataBuild is a bazel-based data build system. Key files:
- [`DESIGN.md`](./DESIGN.md) - Overall design of databuild
- [`databuild.proto`](databuild/databuild.proto) - System interfaces
- Component designs - design docs for specific aspects or components of databuild:
- [Core build](./design/core-build.md) - How the core semantics of databuild works and are implemented
- [Build event log](./design/build-event-log.md) - How the build event log works and is accessed
- [Service](./design/service.md) - How the databuild HTTP service and web app are designed.
- [Glossary](./design/glossary.md) - Centralized description of key terms.
- [Graph specification](./design/graph-specification.md) - Describes the different libraries that enable more succinct declaration of databuild applications than the core bazel-based interface.
- [Observability](./design/observability.md) - How observability is systematically achieved throughout databuild applications.
- [Deploy strategies](./design/deploy-strategies.md) - Different strategies for deploying databuild applications.
- [Wants](./design/wants.md) - How triggering works in databuild applications.
- [Why databuild?](./design/why-databuild.md) - Why to choose databuild instead of other better established orchestration solutions.
Please reference these for any related work, as they indicate key technical bias/direction of the project.
## Tenets
- Declarative over imperative wherever possible/reasonable.
- We are building for the future, and choose to do "the right thing" rather than taking shortcuts to get unstuck. If you get stuck, pause and ask for help/input.
- Do not add "unknown" results when parses or matches fail - these should always throw.
- Compile time correctness is a super-power, and investment in it speeds up flywheel for development and user value.
- **CLI/Service Interchangeability**: Both the CLI and service must produce identical artifacts (BEL events, logs, metrics, outputs) in the same locations. Users should be able to build with one interface and query/inspect results from the other seamlessly. This principle applies to all DataBuild operations, not just builds.
## Build & Test
```bash
# Build all databuild components
bazel build //...
# Run databuild unit tests
bazel test //...
# Run end-to-end tests (validates CLI vs Service consistency)
./run_e2e_tests.sh
# Do not try to `bazel test //examples/basic_graph/...`, as this will not work.
```
## Project Structure
- `databuild/` - Core system (Rust/Proto)
- `examples/` - Example implementations
- `scripts/` - Build utilities
## Key Components
- Graph analysis/execution in Rust
- Bazel rules for job orchestration
- Java/Python examples for different use cases
## DataBuild Job Architecture
### Job Target Structure
Each DataBuild job creates three Bazel targets:
- `job_name.cfg` - Configuration target (calls binary with "config" subcommand)
- `job_name.exec` - Execution target (calls binary with "exec" subcommand)
- `job_name` - Main job target (pipes config output to exec input)
### Unified Job Binary Pattern
Jobs use a single binary with subcommands:
```python
def main():
command = sys.argv[1] # "config" or "exec"
if command == "config":
handle_config(sys.argv[2:]) # Output job configuration JSON
elif command == "exec":
handle_exec(sys.argv[2:]) # Perform actual work
```
### DataBuild Execution Flow
1. **Planning Phase**: DataBuild calls `.cfg` targets to get job configurations
2. **Execution Phase**: DataBuild calls main job targets which pipe config to exec
3. **Job Resolution**: Job lookup returns base job names (e.g., `//:job_name`), not `.cfg` variants
### Graph Configuration
```python
databuild_graph(
name = "my_graph",
jobs = [":job1", ":job2"], # Reference base job targets
lookup = ":job_lookup", # Binary that routes partition refs to jobs
)
```
### Job Lookup Pattern
```python
def lookup_job_for_partition(partition_ref: str) -> str:
if pattern.match(partition_ref):
return "//:job_name" # Return base job target
raise ValueError(f"No job found for: {partition_ref}")
```
### Common Pitfalls
- **Not using protobuf-defined interface**: Where structs and interfaces are defined centrally in [`databuild.proto`](./databuild/databuild.proto), those interfaces should always be used. E.g., in rust depending on them via the prost-generated structs, and in the web app via the OpenAPI-generated typescript interfaces.
- **Empty args**: Jobs with `"args": []` won't execute properly
- **Wrong target refs**: Job lookup must return base targets, not `.cfg` variants
- **Missing partition refs**: All outputs must be addressable via partition references
- **Not adding new generated files to OpenAPI outs**: Bazel hermeticity demands that we specify each output file, so when the OpenAPI code gen would create new files, we need to explicitly add them to the target's outs field.
## Notes / Tips
- Rust dependencies are implemented via rules_rust, so new dependencies should be added in the `MODULE.bazel` file.
## Documentation
We use plans / designs in the [plans](./plans/) directory to anchor most large scale efforts. We create plans that are good bets, though not necessarily exhaustive, then (and this is critical) we update them after the work is completed, or after significant progress towards completion.

1
CLAUDE.md Symbolic link

@@ -0,0 +1 @@
AGENTS.md

View file

@@ -1,45 +1,34 @@
# DataBuild Design # DataBuild Design
DataBuild is a trivially-deployable, partition-oriented, declarative build system. Where data orchestration flows are normally imperative and implicit (do this, then do that, etc), DataBuild uses stated data dependencies to make this process declarative and explicit. DataBuild scales the declarative nature of tools like DBT to meet the needs of modern, broadly integrated data and ML organizations, which consume data from many sources that arrive on a highly varying basis. DataBuild enables confident, bounded completeness in a world where input data is effectively never complete at any given time. DataBuild is a trivially-deployable, partition-oriented, declarative build system. Where data orchestration flows are normally imperative and implicitly coupled (do this, then do that, etc), DataBuild uses stated data dependencies to make this process declarative and explicit. DataBuild scales the declarative nature of tools like DBT to meet the needs of modern, broadly integrated data and ML organizations, which consume data from many sources that arrive on a highly varying basis. DataBuild enables confident, bounded completeness in a world where input data is effectively never complete at any given time.
## Philosophy ## Philosophy
Inspired by [these requirements.](./design/requirements.md)
Many large-scale systems for producing data leave the complexity of true orchestration to the user - even DAG-based systems for implementing dependencies leave the system as a collection of DAGs, requiring engineers to solve the same "why doesn't this data exist?" and "how do I build this data?" questions. Many large-scale systems for producing data leave the complexity of true orchestration to the user - even DAG-based systems for implementing dependencies leave the system as a collection of DAGs, requiring engineers to solve the same "why doesn't this data exist?" and "how do I build this data?" questions.
DataBuild takes inspiration from modern data orchestration and build systems to fully internalize this complexity, using the Job concept to localize all decisions of turning upstream data into output data (and making all dependencies explicit); and the Graph concept to handle composition of jobs, answering what sequence of jobs must be run to build a specific partition of data. With Jobs and Graphs, DataBuild takes complete responsibility for the data build process, allowing engineers to consider concerns only local to the jobs relevant to their feature. DataBuild takes inspiration from modern data orchestration and build systems to fully internalize this complexity, using the Job concept to localize all decisions of turning upstream data into output data (and making all dependencies explicit); and the Graph concept to handle composition of jobs, enabling continuous data reconciliation for data platforms of all sizes. With Jobs and Graphs, DataBuild takes complete responsibility for the data build process, allowing engineers to consider concerns only local to the jobs relevant to their feature.
Graphs and jobs are defined in [bazel](https://bazel.build), allowing graphs (and their constituent jobs) to be built and deployed trivially. Graphs and jobs are defined in [bazel](https://bazel.build), allowing graphs (and their constituent jobs) to be built and deployed trivially.
## Concepts ## Concepts
- **Partitions** - A partition is an atomic unit of data. DataBuild's data dependencies work by using partition references (e.g. `s3://some/dataset/date=2025-06-01`) as dependency signals between jobs, allowing the construction of build graphs to produce arbitrary partitions. - **Partitions** - A partition is an atomic unit of data. DataBuild's data dependencies work by using partition references (e.g. `s3://some/dataset/date=2025-06-01`) as dependency signals between jobs, allowing the construction of build graphs to produce arbitrary partitions.
- **Jobs** - Their `exec` entrypoint builds partitions from partitions, and their `config` entrypoint specifies what partitions are required to produce the requested partition(s), along with the specific config to run `exec` with to build said partitions. - **Jobs** - Builds requested partitions from specific input partitions, or raises when input partitions are missing (specifying which partitions can't be built because of which missing partitions).
- **Graphs** - Composes jobs together to achieve multi-job orchestration, using a `lookup` mechanism to resolve a requested partition to the job that can build it. Together with its constituent jobs, Graphs can fully plan the build of any set of partitions. Most interactions with a DataBuild app happen with a graph. - **Graphs** - Composes jobs together to achieve multi-job orchestration, using a `lookup` mechanism to resolve a requested partition to the job that can build it. Together with its constituent jobs, Graphs can fully build any set of partitions. Most interactions with a DataBuild app happen with a graph.
- **Build Event Log** - Encodes the state of the system, recording build requests, job activity, partition production, etc to enable running databuild as a deployed application. - **Build Event Log** - Encodes the state of the system, recording partition wants, job activity, partition production, etc to enable running databuild as a deployed application.
- **Wants** - Partition wants can be registered with DataBuild, causing it to build the wanted partitions as soon as its graph-external dependencies are met. - **Wants** - Partition wants can be registered with DataBuild, enabling continuous data reconciliation and build of wanted partitions as soon as their graph-external dependencies are met.
- **Taints** - Taints mark a partition as invalid, indicating that readers should not use it, and that it should be rebuilt when requested or depended upon. If there is a still-active want for the tainted partition, it will be rebuilt immediately. - **Taints** - Taints mark a partition as invalid, indicating that readers should not use it, and that it should be rebuilt when requested or depended upon. If there is a still-active want for the tainted partition, it will be rebuilt immediately.
- **Bazel Targets** - Bazel is a fast, extensible, and hermetic build system. DataBuild uses bazel targets to describe graphs and jobs, making graphs themselves deployable applications. Implementing a DataBuild app is the process of integrating your data build jobs in `databuild_job` bazel targets, and connecting them with a `databuild_graph` target. - **Bazel Targets** - Bazel is a fast, extensible, and hermetic build system. DataBuild uses bazel targets to describe graphs and jobs, making graphs themselves deployable applications. Implementing a DataBuild app is the process of integrating your data build jobs in `databuild_job` bazel targets, and connecting them with a `databuild_graph` target.
- [**Graph Specification Strategies**](design/graph-specification.md) (coming soon) Application libraries in Python/Rust/Scala that use language features to enable ergonomic and succinct specification of jobs and graphs. - [**Graph Definition Languages**](design/graph-specification.md) Application libraries in Python/Rust/Scala that use language features to enable ergonomic and succinct specification of jobs and graphs.
### Partition / Job Assumptions and Best Practices ## Bazel Components
- **Partitions are atomic and final** - Either the data is complete or its "not there".
- **Partitions are mutually exclusive and collectively exhaustive** - Row membership to a partition should be unambiguous and consistent.
- **Jobs are idempotent** - For the same input data and parameters, the same partition is produced (functionally).
### Partition Delegation
If a partition is already up to date, or is already being built by a previous build request, a new build request will "delegate" to that build request. Instead of running the job to build said partition again, it will emit a delegation event in the build event log, explicitly pointing to the build action it is delegating to.
## Components
### Job ### Job
The `databuild_job` rule expects to reference a binary that adheres to the following expectations: The `databuild_job` rule requires just a binary target that it can execute, and any relevant metadata that helps the graph call it properly. The referenced binary should accept a list of partitions that it needs to produce, and if any required partitions are missing, report which are missing and which requested partitions they prevent from being built.
- For the `config` subcommand, it prints the JSON job config to stdout based on the requested partitions, e.g. for a binary `bazel-bin/my_binary`, it prints a valid job config when called like `bazel-bin/my_binary config my_dataset/color=red my_dataset/color=blue`.
- For the `exec` subcommand, it produces the partitions requested to the `config` subcommand when configured by the job config it produced. E.g., if `config` had produced `{..., "args": ["red", "blue"], "env": {"MY_ENV": "foo"}`, then calling `MY_ENV=foo bazel-bin/my_binary exec red blue` should produce partitions `my_dataset/color=red` and `my_dataset/color=blue`.
Jobs are executed via a wrapper component that provides observability, error handling, and standardized communication with the graph. The wrapper captures all job output as structured logs, enabling comprehensive monitoring without requiring jobs to have network connectivity. Jobs are executed via a wrapper component that provides observability, error handling, and standardized communication with the graph. The wrapper captures all job output as structured logs, enabling comprehensive monitoring without requiring jobs to have network connectivity.
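To make the job contract above concrete, here is a hedged sketch of a job entrypoint: it accepts the partitions it must produce, and reports which requested partitions are blocked by which missing inputs. The helper names and the JSON report shape are illustrative assumptions only; the real interface is whatever `databuild.proto` defines.
```python
# Illustrative sketch; helpers and report shape are assumptions, not the real protocol.
import json
import sys

def inputs_for(partition: str) -> list[str]:
    """Hypothetical: upstream partition refs needed to build `partition`."""
    return []

def is_live(partition: str) -> bool:
    """Hypothetical: check whether an input partition exists."""
    return True

def produce(partition: str) -> None:
    """Hypothetical: do the actual work of building the partition."""
    print(f"built {partition}", file=sys.stderr)

def main(requested: list[str]) -> int:
    blocked = {}
    for out in requested:
        missing = [p for p in inputs_for(out) if not is_live(p)]
        if missing:
            blocked[out] = missing  # requested partition -> missing inputs
    if blocked:
        print(json.dumps({"missing_deps": blocked}))
        return 1
    for out in requested:
        produce(out)
    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```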
@@ -50,19 +39,16 @@ The `databuild_graph` rule expects two fields, `jobs`, and `lookup`: The `databuild_graph` rule expects two fields, `jobs`, and `lookup`:
- The `lookup` binary target should return a JSON object with keys as job labels and values as the list of partitions that each job is responsible for producing. This enables graph planning by walking backwards in the data dependency graph. - The `lookup` binary target should return a JSON object with keys as job labels and values as the list of partitions that each job is responsible for producing. This enables graph planning by walking backwards in the data dependency graph.
- The `jobs` list should just be a list of all jobs involved in the graph. The graph will recursively call config to resolve the full set of jobs to run. - The `jobs` list should just be a list of all jobs involved in the graph. The graph will recursively call config to resolve the full set of jobs to run.
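For illustration, the lookup output described above is just a JSON object keyed by job label; the labels and partition values below are placeholders, and only the shape comes from the contract.
```python
# Hypothetical lookup output: job labels -> partitions each job would produce.
import json

print(json.dumps({
    "//:daily_rollup": ["s3://some/dataset/date=2025-06-01"],
    "//:ingest_raw": ["s3://some/source/date=2025-06-01"],
}))
```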
### Build Event Log (BEL) ### [Build Event Log (BEL)](./design/build-event-log.md)
The BEL encodes all relevant build actions that occur, enabling concurrent builds. This includes: The BEL encodes all relevant build actions that occur, enabling distributed/concurrent builds. This includes submitted wants and job events (started, succeeded, partitions missing, etc.).
- Graph events, including "build requested", "build started", "analysis started", "build failed", "build completed", etc.
- Job events, including "..."
The BEL is similar to [event-sourced](https://martinfowler.com/eaaDev/EventSourcing.html) systems, as all application state is rendered from aggregations over the BEL. This enables the BEL to stay simple while also powering concurrent builds, the data catalog, and the DataBuild service. The BEL is similar to [event-sourced](https://martinfowler.com/eaaDev/EventSourcing.html) systems, as all application state is rendered from aggregations over the BEL. This enables the BEL to stay simple while also powering concurrent builds, the data catalog, and the DataBuild service.
### Triggers and Wants (Coming Soon) ### Wants and Taints
["Wants"](./design/wants.md) are the main mechanism for continually building partitions over time. In real world scenarios, it is standard for data to arrive late, or not at all. Wants cause the databuild graph to continually attempt to build the wanted partitions until a) the partitions are live or b) the want expires, at which another script can be run. Wants are the mechanism that implements SLA checking. ["Wants"](./design/wants.md) are the main mechanism for eventually built partitions. In real world scenarios, it is standard for data to arrive late, or not at all. Wants cause the databuild graph to continually attempt to build the wanted partitions while they aren't live, and enabling it to list wants who are past SLA.
You can also use cron-based triggers, which return partition refs that they want built. Taints allow for manual/programmatic invalidation of built partitions. Partitions tainted since their last build are considered non-existent, and will be rebuilt if any other wanted partition depends on them. This also opens the door for invalidating downstream partitions.
# Key Insights # Key Insights
@@ -70,6 +56,15 @@ You can also use cron-based triggers, which return partition refs that they want
- Orchestration decisions and application logic are innately coupled. - Orchestration decisions and application logic are innately coupled.
- "systemd for data platforms" - "systemd for data platforms"
## What About Configuration?
Configuration is all the information provided to a job that isn't a) the data the job reads or b) the partitions the job is being asked to produce. This could be info like "what modeling strategy do we use for this customer" or "when was this feed configured", etc. It has the inconvenient feature of being critical for practical business value while also being difficult to fit in as data (since you often want to change and "tweak" it).
DataBuild explicitly and intentionally treats configuration as a job-internal concept: jobs are not pure functions, but it is a good idea for almost all of the implementation to be purely functional. It's recommended to calculate structured job configuration up front (along with trying to resolve the required input data), then invoke the rest of your job as a pure function over the config and data.
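A minimal sketch of that recommendation, with hypothetical names: resolve configuration and inputs at the impure edge, then hand both to a pure function that does the real work.
```python
# Hypothetical sketch: impure edges (config + input resolution) wrap a pure core.
def resolve_config(partition: str) -> dict:
    """Impure: look up settings such as the modeling strategy for this customer."""
    return {"strategy": "default"}

def load_inputs(partition: str) -> list[dict]:
    """Impure: read the upstream partitions this job depends on."""
    return []

def transform(config: dict, rows: list[dict]) -> list[dict]:
    """Pure: the same config and data always yield the same output partition."""
    return [dict(row, strategy=config["strategy"]) for row in rows]

def build_partition(partition: str) -> list[dict]:
    return transform(resolve_config(partition), load_inputs(partition))
```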
What about situations where data is configured by a web app, etc? Taints are a great way to invalidate partitions that are impacted by config changes, and you can create callbacks in your application to taint impacted partitions.
## Assumptions ## Assumptions
- Job -> partition relationships are canonical, job runs are idempotent - Job -> partition relationships are canonical, job runs are idempotent

View file

@@ -18,7 +18,9 @@
█████████╔╝ ██████╔═╝ ██╔╝ ████████╗ ███████╔═╝ █████████╔╝ ██████╔═╝ ██╔╝ ████████╗ ███████╔═╝
╚════════╝ ╚═════╝ ╚═╝ ╚═══════╝ ╚══════╝ ╚════════╝ ╚═════╝ ╚═╝ ╚═══════╝ ╚══════╝
- -- S Y S T E M O N L I N E -- - - - -- D E C L A R A T I V E -- - -
- - -- P A R T I T I O N E D -- - -
- - -- D A T A B U I L D S -- - -
``` ```
DataBuild is a trivially-deployable, partition-oriented, declarative data build system. DataBuild is a trivially-deployable, partition-oriented, declarative data build system.
@@ -33,8 +35,6 @@ For important context, check out [DESIGN.md](./DESIGN.md), along with designs in
- **Deploy anywhere** - One binary, any platform. Bazel-based builds create hermetic applications that run locally, in containers, or in the cloud. - **Deploy anywhere** - One binary, any platform. Bazel-based builds create hermetic applications that run locally, in containers, or in the cloud.
- **Concurrent by design** - Multiple teams, zero conflicts. Event-sourced coordination enables parallel builds without stepping on each other.
## Usage ## Usage
### Graph Description Methods ### Graph Description Methods
@@ -103,3 +103,6 @@ End to end testing:
```bash ```bash
./run_e2e_tests.sh ./run_e2e_tests.sh
``` ```
#### Test Strategy
Where possible, we make invalid state unrepresentable via Rust's type system. Where that is not possible, we prefer [property-testing](https://en.wikipedia.org/wiki/Software_testing#Property_testing), with a handful of bespoke tests to capture critical edge cases or important behaviors.

View file

@@ -20,30 +20,8 @@ rust_binary(
rust_library( rust_library(
name = "databuild", name = "databuild",
srcs = [ srcs = [
"event_log/mock.rs", "build_event_log.rs",
"event_log/mod.rs",
"event_log/query_engine.rs",
"event_log/sqlite_storage.rs",
"event_log/storage.rs",
"event_log/writer.rs",
"format_consistency_test.rs",
"lib.rs", "lib.rs",
"log_access.rs",
"log_collector.rs",
"mermaid_utils.rs",
"metric_templates.rs",
"metrics_aggregator.rs",
"orchestration/error.rs",
"orchestration/events.rs",
"orchestration/mod.rs",
"repositories/builds/mod.rs",
"repositories/jobs/mod.rs",
"repositories/mod.rs",
"repositories/partitions/mod.rs",
"repositories/tasks/mod.rs",
"service/handlers.rs",
"service/mod.rs",
"status_utils.rs",
":generate_databuild_rust", ":generate_databuild_rust",
], ],
edition = "2021", edition = "2021",
@@ -69,20 +47,9 @@ rust_library(
], ],
) )
# OpenAPI Spec Generator binary (no dashboard dependency) rust_test(
# No need to run this manually - it will automatically generate source and it will be used in name = "databuild_test",
# the related targets (e.g. //databuild/client:extract_openapi_spec) crate = ":databuild",
rust_binary(
name = "openapi_spec_generator",
srcs = ["service/openapi_spec_generator.rs"],
edition = "2021",
visibility = ["//visibility:public"],
deps = [
":databuild",
"@crates//:log",
"@crates//:serde_json",
"@crates//:tokio",
],
) )
# Build Graph Service binary # Build Graph Service binary
@@ -111,17 +78,6 @@ rust_binary(
], ],
) )
# Test for orchestration module
rust_test(
name = "orchestration_test",
crate = ":databuild",
edition = "2021",
deps = [
"@crates//:tempfile",
"@crates//:tokio",
],
)
# Legacy filegroup for backwards compatibility # Legacy filegroup for backwards compatibility
filegroup( filegroup(
name = "proto", name = "proto",

View file

@@ -1,26 +0,0 @@
# DataBuild
## API
A sort of requirements doc for the semantics of DataBuild, enumerating the nouns and verbs they can do.
### Graph
- `analyze` - Produce the job graph required to build the requested set of partitions.
- `build` - Analyze and then execute the produced job graph to build the requested partitions.
- `builds`
- `list` - List past builds.
- `show` - Shows current status of specified build and list events. Can tail build events for a build with `--follow/-f`
- `cancel` - Cancel specified build.
- `partitions`
- `list` - Lists partitions.
- `show` - Shows current status of the specified partition.
- `invalidate` - Marks a partition as invalid (will be rebuilt, won't be read).
- `jobs`
- `list` - List jobs in the graph.
- `show` - Shows task statistics (success %, runtime, etc) and recent task results.
- `tasks` (job runs)
- `list` - Lists past tasks.
- `show` - Describes current task status and lists events.
- `cancel` - Cancels a specific task.

View file

@@ -0,0 +1,154 @@
use crate::data_build_event::Event;
use crate::{
BuildState, DataBuildEvent, EventFilter, WantState,
};
use std::error::Error;
use std::sync::{Arc, RwLock};
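/// Storage backend for the build event log: appends events and lists them
/// back in order (optionally filtered), independent of any derived state.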
trait BELStorage {
fn append_event(&mut self, event: Event) -> Result<u64, Box<dyn Error>>;
fn list_events(
&self,
since_idx: u64,
filter: EventFilter,
limit: u64,
) -> Result<Vec<DataBuildEvent>, Box<dyn Error>>;
}
struct BuildEventLog<B: BELStorage> {
storage: B,
state: Arc<RwLock<BuildState>>,
}
impl<B: BELStorage> BuildEventLog<B> {
fn create(storage: B) -> BuildEventLog<B> {
BuildEventLog {
storage,
state: Arc::new(Default::default()),
}
}
}
impl<B: BELStorage> BuildEventLog<B> {
fn append_event(&mut self, event: Event) -> Result<u64, Box<dyn Error>> {
let idx = self.storage.append_event(event.clone())?;
self.reduce(event);
Ok(idx)
}
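/// Folds a single event into the aggregated `BuildState`; most variants are
/// currently no-ops, and only want creation updates the state.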
fn reduce(&mut self, event: Event) {
match event {
Event::JobRunBuffer(_) => {}
Event::JobRunQueue(_) => {}
Event::JobRunStarted(_) => {}
Event::JobRunHeartbeat(_) => {}
Event::JobRunSuccess(_) => {}
Event::JobRunFailure(_) => {}
Event::JobRunCancel(_) => {}
Event::JobRunMissingDeps(_) => {}
Event::WantCreate(e) => {
self.state
.write()
.expect("couldn't take write lock")
.wants
.insert(e.want_id.clone(), WantState { want_id: e.want_id });
}
Event::WantCancel(_) => {}
Event::TaintCreate(_) => {}
Event::TaintDelete(_) => {}
}
}
}
#[cfg(test)]
mod tests {
use crate::build_event_log::{BELStorage, BuildEventLog};
use crate::data_build_event::Event;
use crate::{DataBuildEvent, EventFilter, PartitionRef, WantCreateEvent};
use std::error::Error;
use std::time::{SystemTime, UNIX_EPOCH};
struct TestBELStorage {
events: Vec<DataBuildEvent>,
}
impl TestBELStorage {
fn create() -> TestBELStorage {
TestBELStorage { events: vec![] }
}
}
impl BELStorage for TestBELStorage {
fn append_event(&mut self, event: Event) -> Result<u64, Box<dyn Error>> {
let now = SystemTime::now();
let duration_since_epoch = now.duration_since(UNIX_EPOCH)
.expect("Time went backwards");
let timestamp = duration_since_epoch.as_nanos() as u64;
let dbe = DataBuildEvent {
timestamp,
event_id: self.events.len() as u64,
event: Some(event),
};
self.events.push(dbe);
Ok(self.events.len() as u64)
}
fn list_events(
&self,
_since_idx: u64,
_filter: EventFilter,
_limit: u64,
) -> Result<Vec<DataBuildEvent>, Box<dyn Error>> {
Ok(self.events.clone())
}
}
#[test]
fn test_hello() {
assert_eq!(2 + 3, 5);
}
#[test]
fn test_append_event() {
let storage = TestBELStorage::create();
let mut log = BuildEventLog::create(storage);
// Initial state
assert_eq!(log.storage.events.len(), 0);
let want_id = "1234".to_string();
{
let state = log.state.read().unwrap();
assert!(state.wants.get(&want_id).is_none());
}
// Given
log.append_event(Event::WantCreate(WantCreateEvent {
want_id: want_id.clone(),
root_want_id: "123".to_string(),
parent_want_id: "123".to_string(),
partitions: vec![PartitionRef {
r#ref: "".to_string(),
}],
data_timestamp: 0,
ttl_seconds: 1,
sla_seconds: 1,
source: None,
comment: None,
}))
.expect("append_event failed");
// Assert
assert_eq!(log.storage.events.len(), 1);
let state = log.state.read().expect("couldn't take read lock");
assert!(state.wants.get(&want_id).is_some(), "want_id not found");
assert_eq!(
state
.wants
.get(&want_id)
.map(|want| want.want_id.clone())
.expect("state.wants want_id not found"),
want_id,
"want_id not equal",
);
}
}

View file

@@ -1,27 +0,0 @@
load("@rules_rust//rust:defs.bzl", "rust_binary")
# DataBuild CLI wrapper using orchestrator
rust_binary(
name = "databuild_cli",
srcs = [
"main.rs",
"error.rs",
],
edition = "2021",
visibility = ["//visibility:public"],
data = [
"//databuild/graph:analyze",
"//databuild/graph:execute",
],
deps = [
"//databuild:databuild",
"@crates//:clap",
"@crates//:log",
"@crates//:serde",
"@crates//:serde_json",
"@crates//:simple_logger",
"@crates//:thiserror",
"@crates//:tokio",
"@crates//:uuid",
],
)

View file

@@ -1,31 +0,0 @@
use crate::event_log::BuildEventLogError;
use crate::orchestration::OrchestrationError;
#[derive(Debug, thiserror::Error)]
pub enum CliError {
#[error("Event log error: {0}")]
EventLog(#[from] BuildEventLogError),
#[error("Orchestration error: {0}")]
Orchestration(#[from] OrchestrationError),
#[error("Analysis error: {0}")]
Analysis(String),
#[error("Execution error: {0}")]
Execution(String),
#[error("Environment error: {0}")]
Environment(String),
#[error("Invalid arguments: {0}")]
InvalidArguments(String),
#[error("Database error: {0}")]
Database(String),
#[error("Output formatting error: {0}")]
Output(String),
}
pub type Result<T> = std::result::Result<T, CliError>;

View file

@@ -1,999 +0,0 @@
use databuild::*;
use databuild::event_log::create_bel_query_engine;
use databuild::orchestration::{BuildOrchestrator, BuildResult};
use databuild::repositories::{
partitions::PartitionsRepository,
jobs::JobsRepository,
tasks::TasksRepository,
builds::BuildsRepository
};
use clap::{Arg, Command as ClapCommand, ArgMatches};
use log::{info, error};
use simple_logger::SimpleLogger;
use std::env;
use std::process::{Command, Stdio};
use uuid::Uuid;
mod error;
use error::{CliError, Result};
/// Run the analyze command and return the job graph
async fn run_analysis(
partitions: &[String],
orchestrator: &BuildOrchestrator,
) -> Result<JobGraph> {
info!("Running analysis for partitions: {:?}", partitions);
// Get required environment variables
let candidate_jobs = env::var("DATABUILD_CANDIDATE_JOBS_CFG")
.map_err(|_| CliError::Environment("DATABUILD_CANDIDATE_JOBS_CFG not set".to_string()))?;
let job_lookup_path = env::var("DATABUILD_JOB_LOOKUP_PATH")
.map_err(|_| CliError::Environment("DATABUILD_JOB_LOOKUP_PATH not set".to_string()))?;
let graph_label = env::var("DATABUILD_GRAPH_LABEL")
.map_err(|_| CliError::Environment("DATABUILD_GRAPH_LABEL not set".to_string()))?;
// Find analyze binary using runfiles
let analyze_path = env::var("DATABUILD_ANALYZE_BINARY")
.map_err(|_| CliError::Environment("DATABUILD_ANALYZE_BINARY not set".to_string()))?;
// Build analyze command
let cmd = Command::new(analyze_path)
.args(partitions)
.env("DATABUILD_CANDIDATE_JOBS_CFG", candidate_jobs)
.env("DATABUILD_JOB_LOOKUP_PATH", job_lookup_path)
.env("DATABUILD_GRAPH_LABEL", graph_label)
.env("DATABUILD_MODE", "plan")
.env("DATABUILD_BUILD_REQUEST_ID", orchestrator.build_request_id())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.map_err(|e| CliError::Analysis(format!("Failed to spawn analyze process: {}", e)))?;
let output = cmd.wait_with_output()
.map_err(|e| CliError::Analysis(format!("Failed to run analyze: {}", e)))?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
return Err(CliError::Analysis(format!("Analysis failed: {}", stderr)));
}
let stdout = String::from_utf8_lossy(&output.stdout);
let job_graph: JobGraph = serde_json::from_str(&stdout)
.map_err(|e| CliError::Analysis(format!("Failed to parse job graph: {}", e)))?;
info!("Analysis complete, found {} tasks", job_graph.nodes.len());
Ok(job_graph)
}
/// Run the execute command with the job graph
async fn run_execution(
job_graph: JobGraph,
orchestrator: &BuildOrchestrator,
) -> Result<BuildResult> {
info!("Running execution for {} tasks", job_graph.nodes.len());
// Serialize job graph to JSON for the execute command
let job_graph_json = serde_json::to_string(&job_graph)
.map_err(|e| CliError::Execution(format!("Failed to serialize job graph: {}", e)))?;
// Get required environment variables
let candidate_jobs = env::var("DATABUILD_CANDIDATE_JOBS_CFG")
.map_err(|_| CliError::Environment("DATABUILD_CANDIDATE_JOBS_CFG not set".to_string()))?;
let build_event_log_uri = env::var("DATABUILD_BUILD_EVENT_LOG").unwrap_or_else(|_| "stdout".to_string());
// Find execute binary using runfiles
let execute_path = env::var("DATABUILD_EXECUTE_BINARY")
.map_err(|_| CliError::Environment("DATABUILD_EXECUTE_BINARY not set".to_string()))?;
// Build execute command
let mut cmd = Command::new(execute_path)
.env("DATABUILD_CANDIDATE_JOBS_CFG", candidate_jobs)
.env("DATABUILD_BUILD_EVENT_LOG", build_event_log_uri)
.env("DATABUILD_BUILD_REQUEST_ID", orchestrator.build_request_id())
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.map_err(|e| CliError::Execution(format!("Failed to spawn execute process: {}", e)))?;
// Write job graph to stdin
if let Some(stdin) = cmd.stdin.as_mut() {
use std::io::Write;
stdin.write_all(job_graph_json.as_bytes())
.map_err(|e| CliError::Execution(format!("Failed to write job graph to execute: {}", e)))?;
}
let output = cmd.wait_with_output()
.map_err(|e| CliError::Execution(format!("Failed to run execute: {}", e)))?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
error!("Execution failed:\n{}", stderr);
return Err(CliError::Execution("Execution failed".to_string()));
}
// For now, assume success if the command completed without error
// In the future, we could parse the output to get more detailed results
info!("Execution completed successfully");
Ok(BuildResult::Success { jobs_completed: job_graph.nodes.len() })
}
async fn handle_build_command(matches: &ArgMatches) -> Result<()> {
let partitions: Vec<String> = matches.get_many::<String>("partitions")
.unwrap()
.cloned()
.collect();
let event_log_uri = matches.get_one::<String>("event-log")
.cloned()
.or_else(|| env::var("DATABUILD_BUILD_EVENT_LOG").ok())
.unwrap_or_else(|| "stdout".to_string());
let build_request_id = matches.get_one::<String>("build-request-id")
.cloned()
.or_else(|| env::var("DATABUILD_BUILD_REQUEST_ID").ok())
.unwrap_or_else(|| Uuid::new_v4().to_string());
info!("Build request ID: {}", build_request_id);
info!("Partitions: {:?}", partitions);
info!("Event log URI: {}", event_log_uri);
// Create event log and orchestrator
let query_engine = create_bel_query_engine(&event_log_uri).await?;
let requested_partitions: Vec<PartitionRef> = partitions.iter()
.map(|p| PartitionRef { str: p.clone() })
.collect();
let orchestrator = BuildOrchestrator::new(
query_engine.clone(),
build_request_id,
requested_partitions,
);
// Emit orchestration events
orchestrator.start_build().await?;
orchestrator.start_planning().await?;
// Run analysis
let job_graph = run_analysis(&partitions, &orchestrator).await?;
orchestrator.start_execution().await?;
// Run execution
let result = run_execution(job_graph, &orchestrator).await?;
orchestrator.complete_build(result).await?;
info!("DataBuild CLI completed successfully");
Ok(())
}
fn format_timestamp(timestamp_nanos: i64) -> String {
use std::time::{UNIX_EPOCH, Duration};
let timestamp_secs = timestamp_nanos / 1_000_000_000;
let system_time = UNIX_EPOCH + Duration::from_secs(timestamp_secs as u64);
match system_time.duration_since(UNIX_EPOCH) {
Ok(duration) => {
let secs = duration.as_secs();
let days = secs / 86400;
let hours = (secs % 86400) / 3600;
let minutes = (secs % 3600) / 60;
if days > 0 {
format!("{}d {}h ago", days, hours)
} else if hours > 0 {
format!("{}h {}m ago", hours, minutes)
} else {
format!("{}m ago", minutes)
}
}
Err(_) => "unknown".to_string(),
}
}
#[tokio::main]
async fn main() -> Result<()> {
// Initialize logger
SimpleLogger::new()
.with_level(log::LevelFilter::Info)
.init()
.map_err(|e| CliError::Environment(format!("Failed to initialize logger: {}", e)))?;
// Parse command line arguments
let matches = ClapCommand::new("databuild")
.version("1.0")
.about("DataBuild unified CLI")
.subcommand_required(false)
.arg_required_else_help(false)
.arg(
Arg::new("partitions")
.help("Partition references to build (legacy direct build mode)")
.num_args(1..)
.value_name("PARTITIONS")
)
.subcommand(
ClapCommand::new("build")
.about("Build partitions using the DataBuild execution engine")
.arg(
Arg::new("partitions")
.help("Partition references to build")
.required(true)
.num_args(1..)
.value_name("PARTITIONS")
)
.arg(
Arg::new("event-log")
.long("event-log")
.help("Event log URI (default: stdout)")
.value_name("URI")
)
.arg(
Arg::new("build-request-id")
.long("build-request-id")
.help("Build request ID (default: generate UUID)")
.value_name("ID")
)
)
.subcommand(
ClapCommand::new("partitions")
.about("Query and manage partitions")
.subcommand(
ClapCommand::new("list")
.about("List all partitions")
.arg(Arg::new("limit").long("limit").short('l').value_name("LIMIT").help("Maximum number of partitions to show"))
.arg(Arg::new("format").long("format").short('f').value_name("FORMAT").help("Output format (table or json)").default_value("table"))
)
.subcommand(
ClapCommand::new("show")
.about("Show partition details")
.arg(Arg::new("partition_ref").required(true).help("Partition reference"))
.arg(Arg::new("format").long("format").short('f').value_name("FORMAT").help("Output format (table or json)").default_value("table"))
)
.subcommand(
ClapCommand::new("invalidate")
.about("Invalidate a partition")
.arg(Arg::new("partition_ref").required(true).help("Partition reference"))
.arg(Arg::new("reason").long("reason").short('r').required(true).help("Reason for invalidation"))
.arg(Arg::new("build_request_id").long("build-request-id").short('b').required(true).help("Build request ID"))
)
)
.subcommand(
ClapCommand::new("jobs")
.about("Query job execution data")
.subcommand(
ClapCommand::new("list")
.about("List all jobs")
.arg(Arg::new("limit").long("limit").short('l').value_name("LIMIT").help("Maximum number of jobs to show"))
.arg(Arg::new("format").long("format").short('f').value_name("FORMAT").help("Output format (table or json)").default_value("table"))
)
.subcommand(
ClapCommand::new("show")
.about("Show job details")
.arg(Arg::new("job_label").required(true).help("Job label"))
.arg(Arg::new("format").long("format").short('f').value_name("FORMAT").help("Output format (table or json)").default_value("table"))
)
)
.subcommand(
ClapCommand::new("tasks")
.about("Query and manage tasks (job runs)")
.subcommand(
ClapCommand::new("list")
.about("List all tasks")
.arg(Arg::new("limit").long("limit").short('l').value_name("LIMIT").help("Maximum number of tasks to show"))
.arg(Arg::new("format").long("format").short('f').value_name("FORMAT").help("Output format (table or json)").default_value("table"))
)
.subcommand(
ClapCommand::new("show")
.about("Show task details")
.arg(Arg::new("job_run_id").required(true).help("Job run ID"))
.arg(Arg::new("format").long("format").short('f').value_name("FORMAT").help("Output format (table or json)").default_value("table"))
)
.subcommand(
ClapCommand::new("cancel")
.about("Cancel a task")
.arg(Arg::new("job_run_id").required(true).help("Job run ID"))
.arg(Arg::new("reason").long("reason").short('r').required(true).help("Reason for cancellation"))
.arg(Arg::new("build_request_id").long("build-request-id").short('b').required(true).help("Build request ID"))
)
)
.subcommand(
ClapCommand::new("builds")
.about("Query and manage build requests")
.subcommand(
ClapCommand::new("list")
.about("List all builds")
.arg(Arg::new("limit").long("limit").short('l').value_name("LIMIT").help("Maximum number of builds to show"))
.arg(Arg::new("format").long("format").short('f').value_name("FORMAT").help("Output format (table or json)").default_value("table"))
)
.subcommand(
ClapCommand::new("show")
.about("Show build details")
.arg(Arg::new("build_request_id").required(true).help("Build request ID"))
.arg(Arg::new("format").long("format").short('f').value_name("FORMAT").help("Output format (table or json)").default_value("table"))
)
.subcommand(
ClapCommand::new("cancel")
.about("Cancel a build")
.arg(Arg::new("build_request_id").required(true).help("Build request ID"))
.arg(Arg::new("reason").long("reason").short('r').required(true).help("Reason for cancellation"))
)
)
.arg(
Arg::new("event-log")
.long("event-log")
.help("Event log URI (default: sqlite:databuild.db for repository commands)")
.value_name("URI")
.global(true)
)
.get_matches();
// Get global event log URI
let event_log_uri = matches.get_one::<String>("event-log")
.cloned()
.or_else(|| env::var("DATABUILD_BUILD_EVENT_LOG").ok())
.unwrap_or_else(|| "sqlite:databuild.db".to_string());
match matches.subcommand() {
Some(("build", sub_matches)) => {
handle_build_command(sub_matches).await?;
}
Some(("partitions", sub_matches)) => {
handle_partitions_command(sub_matches, &event_log_uri).await?;
}
Some(("jobs", sub_matches)) => {
handle_jobs_command(sub_matches, &event_log_uri).await?;
}
Some(("tasks", sub_matches)) => {
handle_tasks_command(sub_matches, &event_log_uri).await?;
}
Some(("builds", sub_matches)) => {
handle_builds_command(sub_matches, &event_log_uri).await?;
}
_ => {
// Check if direct partition arguments were provided (legacy mode)
if let Some(partitions) = matches.get_many::<String>("partitions") {
let partition_list: Vec<String> = partitions.cloned().collect();
if !partition_list.is_empty() {
// Create a synthetic build command with these partitions
let build_cmd = ClapCommand::new("build")
.arg(Arg::new("partitions").num_args(1..))
.arg(Arg::new("event-log").long("event-log"))
.arg(Arg::new("build-request-id").long("build-request-id"));
let build_matches = build_cmd.try_get_matches_from(
std::iter::once("build".to_string()).chain(partition_list.clone())
).map_err(|e| CliError::InvalidArguments(format!("Failed to parse legacy build arguments: {}", e)))?;
handle_build_command(&build_matches).await?;
return Ok(());
}
}
// Show help if no subcommand or arguments provided
let mut cmd = ClapCommand::new("databuild")
.version("1.0")
.about("DataBuild unified CLI");
cmd.print_help().unwrap();
println!();
}
}
Ok(())
}
async fn handle_partitions_command(matches: &ArgMatches, event_log_uri: &str) -> Result<()> {
let query_engine = create_bel_query_engine(event_log_uri).await
.map_err(|e| CliError::Database(format!("Failed to connect to event log: {}", e)))?;
let repository = PartitionsRepository::new(query_engine);
match matches.subcommand() {
Some(("list", sub_matches)) => {
let limit = sub_matches.get_one::<String>("limit").and_then(|s| s.parse::<u32>().ok());
let format = sub_matches.get_one::<String>("format").map(|s| s.as_str()).unwrap_or("table");
// Use new protobuf response format for consistency with service
let request = PartitionsListRequest {
limit,
offset: None, // TODO: Add offset support to CLI
status_filter: None, // TODO: Add status filtering to CLI
};
let response = repository.list_protobuf(request).await
.map_err(|e| CliError::Database(format!("Failed to list partitions: {}", e)))?;
match format {
"json" => {
let json = serde_json::to_string_pretty(&response)
.map_err(|e| CliError::Output(format!("Failed to serialize to JSON: {}", e)))?;
println!("{}", json);
}
_ => {
if response.partitions.is_empty() {
println!("No partitions found");
return Ok(());
}
println!("Partitions ({} total):", response.total_count);
println!();
println!("{:<30} {:<15} {:<12} {:<12} {:<20}", "Partition", "Status", "Builds", "Invalidated", "Last Updated");
println!("{}", "-".repeat(90));
for partition in response.partitions {
let last_updated = format_timestamp(partition.last_updated);
println!("{:<30} {:<15} {:<12} {:<12} {:<20}",
partition.partition_ref.map(|p| p.str).unwrap_or("".to_string()),
partition.status_name, // Use human-readable status name
partition.builds_count,
partition.invalidation_count,
last_updated
);
}
if response.has_more {
println!("\nNote: More results available. Use --limit to control output.");
}
}
}
}
Some(("show", sub_matches)) => {
let partition_ref = sub_matches.get_one::<String>("partition_ref").unwrap();
let format = sub_matches.get_one::<String>("format").map(|s| s.as_str()).unwrap_or("table");
let result = repository.show_protobuf(partition_ref).await
.map_err(|e| CliError::Database(format!("Failed to show partition: {}", e)))?;
match result {
Some(detail) => {
match format {
"json" => {
let json = serde_json::to_string_pretty(&detail)
.map_err(|e| CliError::Output(format!("Failed to serialize to JSON: {}", e)))?;
println!("{}", json);
}
_ => {
println!("Partition: {}", detail.partition_ref.map(|p| p.str).unwrap_or("".to_string()));
println!("Status: {} ({})", detail.status_name, detail.status_code);
println!("Builds involved: {}", detail.builds_count);
println!("Invalidation count: {}", detail.invalidation_count);
println!("Last updated: {}", format_timestamp(detail.last_updated));
if let Some(ref last_build) = detail.last_successful_build {
println!("\nLast successful build: {}", last_build);
}
if !detail.timeline.is_empty() {
println!("\nTimeline ({} events):", detail.timeline.len());
for event in detail.timeline {
let timestamp = format_timestamp(event.timestamp);
println!(" {} [{}] {}", timestamp, event.status_name, event.message);
if event.message.starts_with("Invalidated:") {
// Invalidation reason is in the message
}
}
}
}
}
}
None => {
match format {
"json" => {
println!("null");
}
_ => {
println!("Partition '{}' not found", partition_ref);
}
}
}
}
}
Some(("invalidate", sub_matches)) => {
let partition_ref = sub_matches.get_one::<String>("partition_ref").unwrap();
let reason = sub_matches.get_one::<String>("reason").unwrap();
let build_request_id = sub_matches.get_one::<String>("build_request_id").unwrap();
let partition_ref_obj = PartitionRef { str: partition_ref.clone() };
repository.invalidate(&partition_ref_obj.str, reason.clone(), build_request_id.clone()).await
.map_err(|e| CliError::Database(format!("Failed to invalidate partition: {}", e)))?;
println!("Successfully invalidated partition '{}' with reason: {}", partition_ref, reason);
}
_ => {
println!("Unknown partitions subcommand. Use 'list', 'show', or 'invalidate'.");
}
}
Ok(())
}
async fn handle_jobs_command(matches: &ArgMatches, event_log_uri: &str) -> Result<()> {
let query_engine = create_bel_query_engine(event_log_uri).await
.map_err(|e| CliError::Database(format!("Failed to connect to event log: {}", e)))?;
let repository = JobsRepository::new(query_engine);
match matches.subcommand() {
Some(("list", sub_matches)) => {
let limit = sub_matches.get_one::<String>("limit").and_then(|s| s.parse().ok());
let format = sub_matches.get_one::<String>("format").map(|s| s.as_str()).unwrap_or("table");
let jobs = repository.list(limit).await
.map_err(|e| CliError::Database(format!("Failed to list jobs: {}", e)))?;
match format {
"json" => {
let json = serde_json::to_string_pretty(&jobs)
.map_err(|e| CliError::Output(format!("Failed to serialize to JSON: {}", e)))?;
println!("{}", json);
}
_ => {
if jobs.is_empty() {
println!("No jobs found");
return Ok(());
}
println!("Jobs ({} total):", jobs.len());
println!();
println!("{:<40} {:<8} {:<8} {:<8} {:<8} {:<8} {:<20}", "Job Label", "Runs", "Success", "Failed", "Cancel", "Avg Parts", "Last Run");
println!("{}", "-".repeat(120));
for job in jobs {
let success_rate = if job.total_runs > 0 {
(job.successful_runs as f64 / job.total_runs as f64 * 100.0) as u32
} else {
0
};
let last_run = format_timestamp(job.last_run_timestamp);
let last_status = format!("{:?}", job.last_run_status);
println!("{:<40} {:<8} {:<8} {:<8} {:<8} {:<8.1} {:<20}",
job.job_label,
job.total_runs,
format!("{}({}%)", job.successful_runs, success_rate),
job.failed_runs,
job.cancelled_runs,
job.average_partitions_per_run,
format!("{} ({})", last_run, last_status)
);
}
}
}
}
Some(("show", sub_matches)) => {
let job_label = sub_matches.get_one::<String>("job_label").unwrap();
let format = sub_matches.get_one::<String>("format").map(|s| s.as_str()).unwrap_or("table");
let result = repository.show_protobuf(job_label).await
.map_err(|e| CliError::Database(format!("Failed to show job: {}", e)))?;
match result {
Some(detail) => {
match format {
"json" => {
let json = serde_json::to_string_pretty(&detail)
.map_err(|e| CliError::Output(format!("Failed to serialize to JSON: {}", e)))?;
println!("{}", json);
}
_ => {
println!("Job: {}", detail.job_label);
println!("Total runs: {}", detail.total_runs);
println!("Successful runs: {} ({:.1}%)", detail.successful_runs,
if detail.total_runs > 0 { detail.successful_runs as f64 / detail.total_runs as f64 * 100.0 } else { 0.0 });
println!("Failed runs: {}", detail.failed_runs);
println!("Cancelled runs: {}", detail.cancelled_runs);
println!("Average partitions per run: {:.1}", detail.average_partitions_per_run);
println!("Last run: {} ({} - {})", format_timestamp(detail.last_run_timestamp), detail.last_run_status_name, detail.last_run_status_code);
if !detail.recent_builds.is_empty() {
println!("\nRecent builds:");
for build_id in &detail.recent_builds {
println!(" - {}", build_id);
}
}
if !detail.runs.is_empty() {
println!("\nExecution history ({} runs):", detail.runs.len());
println!("{:<25} {:<15} {:<15} {:<10} {:<30}", "Run ID", "Status", "Duration", "Parts", "Build Request");
println!("{}", "-".repeat(95));
for run in detail.runs.iter().take(10) { // Show last 10 runs
let duration_str = if let Some(duration) = run.duration_ms {
if duration > 1000 {
format!("{:.1}s", duration as f64 / 1000.0)
} else {
format!("{}ms", duration)
}
} else {
"N/A".to_string()
};
println!("{:<25} {:<15} {:<15} {:<10} {:<30}",
run.job_run_id,
run.status_name,
duration_str,
run.target_partitions.len(),
run.build_request_id
);
}
if detail.runs.len() > 10 {
println!("... and {} more runs", detail.runs.len() - 10);
}
}
}
}
}
None => {
match format {
"json" => {
println!("null");
}
_ => {
println!("Job '{}' not found", job_label);
}
}
}
}
}
_ => {
println!("Unknown jobs subcommand. Use 'list' or 'show'.");
}
}
Ok(())
}
async fn handle_tasks_command(matches: &ArgMatches, event_log_uri: &str) -> Result<()> {
let query_engine = create_bel_query_engine(event_log_uri).await
.map_err(|e| CliError::Database(format!("Failed to connect to event log: {}", e)))?;
let repository = TasksRepository::new(query_engine);
match matches.subcommand() {
Some(("list", sub_matches)) => {
let limit = sub_matches.get_one::<String>("limit").and_then(|s| s.parse().ok());
let format = sub_matches.get_one::<String>("format").map(|s| s.as_str()).unwrap_or("table");
let tasks = repository.list(limit).await
.map_err(|e| CliError::Database(format!("Failed to list tasks: {}", e)))?;
match format {
"json" => {
let json = serde_json::to_string_pretty(&tasks)
.map_err(|e| CliError::Output(format!("Failed to serialize to JSON: {}", e)))?;
println!("{}", json);
}
_ => {
if tasks.is_empty() {
println!("No tasks found");
return Ok(());
}
println!("Tasks ({} total):", tasks.len());
println!();
println!("{:<25} {:<30} {:<15} {:<15} {:<10} {:<20}", "Job Run ID", "Job Label", "Status", "Duration", "Parts", "Scheduled");
println!("{}", "-".repeat(115));
for task in tasks {
let duration_str = if let Some(duration) = task.duration_ms {
if duration > 1000 {
format!("{:.1}s", duration as f64 / 1000.0)
} else {
format!("{}ms", duration)
}
} else {
"N/A".to_string()
};
let scheduled = format_timestamp(task.scheduled_at);
let status_str = if task.cancelled {
format!("{:?}*", task.status) // Add asterisk for cancelled tasks
} else {
format!("{:?}", task.status)
};
println!("{:<25} {:<30} {:<15} {:<15} {:<10} {:<20}",
task.job_run_id,
task.job_label,
status_str,
duration_str,
task.target_partitions.len(),
scheduled
);
}
println!("\n* = Cancelled task");
}
}
}
Some(("show", sub_matches)) => {
let job_run_id = sub_matches.get_one::<String>("job_run_id").unwrap();
let format = sub_matches.get_one::<String>("format").map(|s| s.as_str()).unwrap_or("table");
let result = repository.show_protobuf(job_run_id).await
.map_err(|e| CliError::Database(format!("Failed to show task: {}", e)))?;
match result {
Some(detail) => {
match format {
"json" => {
let json = serde_json::to_string_pretty(&detail)
.map_err(|e| CliError::Output(format!("Failed to serialize to JSON: {}", e)))?;
println!("{}", json);
}
_ => {
println!("Task: {}", detail.job_run_id);
println!("Job: {}", detail.job_label);
println!("Build request: {}", detail.build_request_id);
println!("Status: {} ({})", detail.status_name, detail.status_code);
println!("Target partitions: {}", detail.target_partitions.len());
println!("Scheduled: {}", format_timestamp(detail.scheduled_at));
if let Some(started) = detail.started_at {
println!("Started: {}", format_timestamp(started));
}
if let Some(completed) = detail.completed_at {
println!("Completed: {}", format_timestamp(completed));
}
if let Some(duration) = detail.duration_ms {
if duration > 1000 {
println!("Duration: {:.1}s", duration as f64 / 1000.0);
} else {
println!("Duration: {}ms", duration);
}
}
if detail.cancelled {
println!("Cancelled: Yes");
if let Some(ref reason) = detail.cancel_reason {
println!("Cancel reason: {}", reason);
}
}
if !detail.message.is_empty() {
println!("Message: {}", detail.message);
}
if !detail.target_partitions.is_empty() {
println!("\nTarget partitions:");
for partition in &detail.target_partitions {
println!(" - {}", partition.str);
}
}
if !detail.timeline.is_empty() {
println!("\nTimeline ({} events):", detail.timeline.len());
for event in detail.timeline {
let timestamp = format_timestamp(event.timestamp);
let status_info = if let Some(ref status_name) = event.status_name {
format!(" -> {}", status_name)
} else {
String::new()
};
println!(" {} [{}]{} {}", timestamp, event.event_type, status_info, event.message);
if let Some(ref reason) = event.cancel_reason {
println!(" Reason: {}", reason);
}
}
}
}
}
}
None => {
match format {
"json" => {
println!("null");
}
_ => {
println!("Task '{}' not found", job_run_id);
}
}
}
}
}
Some(("cancel", sub_matches)) => {
let job_run_id = sub_matches.get_one::<String>("job_run_id").unwrap();
let reason = sub_matches.get_one::<String>("reason").unwrap();
let build_request_id = sub_matches.get_one::<String>("build_request_id").unwrap();
repository.cancel(job_run_id, reason.clone(), build_request_id.clone()).await
.map_err(|e| CliError::Database(format!("Failed to cancel task: {}", e)))?;
println!("Successfully cancelled task '{}' with reason: {}", job_run_id, reason);
}
_ => {
println!("Unknown tasks subcommand. Use 'list', 'show', or 'cancel'.");
}
}
Ok(())
}
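/// Handles the `builds` subcommands (`list`, `show`, `cancel`) by querying the build event log.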
async fn handle_builds_command(matches: &ArgMatches, event_log_uri: &str) -> Result<()> {
let query_engine = create_bel_query_engine(event_log_uri).await
.map_err(|e| CliError::Database(format!("Failed to connect to event log: {}", e)))?;
let repository = BuildsRepository::new(query_engine);
match matches.subcommand() {
Some(("list", sub_matches)) => {
let limit = sub_matches.get_one::<String>("limit").and_then(|s| s.parse().ok());
let format = sub_matches.get_one::<String>("format").map(|s| s.as_str()).unwrap_or("table");
let builds = repository.list(limit).await
.map_err(|e| CliError::Database(format!("Failed to list builds: {}", e)))?;
match format {
"json" => {
let json = serde_json::to_string_pretty(&builds)
.map_err(|e| CliError::Output(format!("Failed to serialize to JSON: {}", e)))?;
println!("{}", json);
}
_ => {
if builds.is_empty() {
println!("No builds found");
return Ok(());
}
println!("Builds ({} total):", builds.len());
println!();
println!("{:<40} {:<15} {:<15} {:<8} {:<8} {:<8} {:<20}", "Build Request ID", "Status", "Duration", "Parts", "Jobs", "Comp", "Requested");
println!("{}", "-".repeat(120));
for build in builds {
let duration_str = if let Some(duration) = build.duration_ms {
if duration > 60000 {
format!("{:.1}m", duration as f64 / 60000.0)
} else if duration > 1000 {
format!("{:.1}s", duration as f64 / 1000.0)
} else {
format!("{}ms", duration)
}
} else {
"N/A".to_string()
};
let requested = format_timestamp(build.requested_at);
let status_str = if build.cancelled {
format!("{:?}*", build.status) // Add asterisk for cancelled builds
} else {
format!("{:?}", build.status)
};
let completion_rate = if build.total_jobs > 0 {
format!("{}/{}", build.completed_jobs, build.total_jobs)
} else {
"0/0".to_string()
};
println!("{:<40} {:<15} {:<15} {:<8} {:<8} {:<8} {:<20}",
build.build_request_id,
status_str,
duration_str,
build.requested_partitions.len(),
build.total_jobs,
completion_rate,
requested
);
}
println!("\n* = Cancelled build");
}
}
}
Some(("show", sub_matches)) => {
let build_request_id = sub_matches.get_one::<String>("build_request_id").unwrap();
let format = sub_matches.get_one::<String>("format").map(|s| s.as_str()).unwrap_or("table");
let result = repository.show_protobuf(build_request_id).await
.map_err(|e| CliError::Database(format!("Failed to show build: {}", e)))?;
match result {
Some(detail) => {
match format {
"json" => {
let json = serde_json::to_string_pretty(&detail)
.map_err(|e| CliError::Output(format!("Failed to serialize to JSON: {}", e)))?;
println!("{}", json);
}
_ => {
println!("Build: {}", detail.build_request_id);
println!("Status: {} ({})", detail.status.clone().unwrap().name, detail.status.unwrap().code);
println!("Requested partitions: {}", detail.requested_partitions.len());
println!("Total jobs: {}", detail.total_jobs);
println!("Completed jobs: {}", detail.completed_jobs);
println!("Failed jobs: {}", detail.failed_jobs);
println!("Cancelled jobs: {}", detail.cancelled_jobs);
println!("Requested: {}", format_timestamp(detail.requested_at));
if let Some(started) = detail.started_at {
println!("Started: {}", format_timestamp(started));
}
if let Some(completed) = detail.completed_at {
println!("Completed: {}", format_timestamp(completed));
}
if let Some(duration) = detail.duration_ms {
if duration > 60000 {
println!("Duration: {:.1}m", duration as f64 / 60000.0);
} else if duration > 1000 {
println!("Duration: {:.1}s", duration as f64 / 1000.0);
} else {
println!("Duration: {}ms", duration);
}
}
if detail.cancelled {
println!("Cancelled: Yes");
if let Some(ref reason) = detail.cancel_reason {
println!("Cancel reason: {}", reason);
}
}
if !detail.requested_partitions.is_empty() {
println!("\nRequested partitions:");
for partition in &detail.requested_partitions {
println!(" - {}", partition.str);
}
}
// Show job statistics
if detail.total_jobs > 0 {
let success_rate = (detail.completed_jobs as f64 / detail.total_jobs as f64 * 100.0) as u32;
println!("\nJob statistics:");
println!(" Success rate: {}% ({}/{})", success_rate, detail.completed_jobs, detail.total_jobs);
if detail.failed_jobs > 0 {
println!(" Failed: {}", detail.failed_jobs);
}
if detail.cancelled_jobs > 0 {
println!(" Cancelled: {}", detail.cancelled_jobs);
}
}
if !detail.timeline.is_empty() {
println!("\nTimeline ({} events):", detail.timeline.len());
for event in detail.timeline {
let timestamp = format_timestamp(event.timestamp);
let status_info = event.status.as_ref().map(|s| format!(" -> {}", s.name)).unwrap_or_default();
println!(" {} [{}]{} {}", timestamp, event.event_type, status_info, event.message);
if let Some(ref reason) = event.cancel_reason {
println!(" Reason: {}", reason);
}
}
}
}
}
}
None => {
match format {
"json" => {
println!("null");
}
_ => {
println!("Build '{}' not found", build_request_id);
}
}
}
}
}
Some(("cancel", sub_matches)) => {
let build_request_id = sub_matches.get_one::<String>("build_request_id").unwrap();
let reason = sub_matches.get_one::<String>("reason").unwrap();
repository.cancel(build_request_id, reason.clone()).await
.map_err(|e| CliError::Database(format!("Failed to cancel build: {}", e)))?;
println!("Successfully cancelled build '{}' with reason: {}", build_request_id, reason);
}
_ => {
println!("Unknown builds subcommand. Use 'list', 'show', or 'cancel'.");
}
}
Ok(())
}

View file

@ -1,194 +0,0 @@
load("@aspect_rules_ts//ts:defs.bzl", "ts_config", "ts_project")
# Extract OpenAPI spec from the dedicated spec generator binary
genrule(
name = "extract_openapi_spec",
srcs = [],
outs = ["openapi.json"],
cmd = """
$(location //databuild:openapi_spec_generator) > $@
""",
tools = [
"//databuild:openapi_spec_generator",
],
visibility = ["//visibility:public"],
)
# TypeScript generator configuration
filegroup(
name = "typescript_generator_config",
srcs = ["typescript_generator_config.json"],
visibility = ["//visibility:public"],
)
# Generate TypeScript client using OpenAPI Generator JAR
genrule(
name = "typescript_client",
srcs = [
":extract_openapi_spec",
":typescript_generator_config",
],
outs = [
"typescript_generated/src/apis/DefaultApi.ts",
"typescript_generated/src/apis/index.ts",
"typescript_generated/src/models/index.ts",
"typescript_generated/src/models/ActivityApiResponse.ts",
"typescript_generated/src/models/ActivityResponse.ts",
"typescript_generated/src/models/AnalyzeRequest.ts",
"typescript_generated/src/models/AnalyzeResponse.ts",
"typescript_generated/src/models/BuildCancelPathRequest.ts",
"typescript_generated/src/models/BuildCancelRepositoryResponse.ts",
"typescript_generated/src/models/BuildDetailRequest.ts",
"typescript_generated/src/models/BuildDetailResponse.ts",
"typescript_generated/src/models/BuildEventSummary.ts",
"typescript_generated/src/models/BuildRequest.ts",
"typescript_generated/src/models/BuildRequestResponse.ts",
"typescript_generated/src/models/BuildSummary.ts",
"typescript_generated/src/models/BuildRequestStatus.ts",
"typescript_generated/src/models/BuildTimelineEvent.ts",
"typescript_generated/src/models/BuildsListApiResponse.ts",
"typescript_generated/src/models/BuildsListResponse.ts",
"typescript_generated/src/models/CancelBuildRepositoryRequest.ts",
"typescript_generated/src/models/InvalidatePartitionRequest.ts",
"typescript_generated/src/models/JobDailyStats.ts",
"typescript_generated/src/models/JobDetailRequest.ts",
"typescript_generated/src/models/JobDetailResponse.ts",
"typescript_generated/src/models/JobMetricsRequest.ts",
"typescript_generated/src/models/JobMetricsResponse.ts",
"typescript_generated/src/models/JobRunDetail.ts",
"typescript_generated/src/models/JobSummary.ts",
"typescript_generated/src/models/JobsListApiResponse.ts",
"typescript_generated/src/models/JobsListResponse.ts",
"typescript_generated/src/models/PaginationInfo.ts",
"typescript_generated/src/models/PartitionDetailRequest.ts",
"typescript_generated/src/models/PartitionDetailResponse.ts",
"typescript_generated/src/models/PartitionEventsRequest.ts",
"typescript_generated/src/models/PartitionEventsResponse.ts",
"typescript_generated/src/models/PartitionInvalidatePathRequest.ts",
"typescript_generated/src/models/PartitionInvalidateResponse.ts",
"typescript_generated/src/models/PartitionRef.ts",
"typescript_generated/src/models/PartitionStatusRequest.ts",
"typescript_generated/src/models/PartitionStatusResponse.ts",
"typescript_generated/src/models/PartitionSummary.ts",
"typescript_generated/src/models/PartitionTimelineEvent.ts",
"typescript_generated/src/models/PartitionsListApiResponse.ts",
"typescript_generated/src/models/PartitionsListResponse.ts",
"typescript_generated/src/models/CancelTaskRequest.ts",
"typescript_generated/src/models/JobRunDetailResponse.ts",
"typescript_generated/src/models/JobRunSummary.ts",
"typescript_generated/src/models/JobRunSummary2.ts",
"typescript_generated/src/models/JobRunTimelineEvent.ts",
"typescript_generated/src/models/JobRunsListApiResponse.ts",
"typescript_generated/src/models/JobRunsListResponse.ts",
"typescript_generated/src/models/TaskCancelPathRequest.ts",
"typescript_generated/src/models/TaskCancelResponse.ts",
"typescript_generated/src/models/TaskDetailRequest.ts",
"typescript_generated/src/runtime.ts",
"typescript_generated/src/index.ts",
],
cmd = """
# Download OpenAPI Generator JAR
OPENAPI_JAR=/tmp/openapi-generator-cli.jar
if [ ! -f $$OPENAPI_JAR ]; then
curl -L -o $$OPENAPI_JAR https://repo1.maven.org/maven2/org/openapitools/openapi-generator-cli/7.2.0/openapi-generator-cli-7.2.0.jar
fi
# Create temporary directory for generation
TEMP_DIR=$$(mktemp -d)
# Generate TypeScript client to temp directory
java -jar $$OPENAPI_JAR generate \
-i $(location :extract_openapi_spec) \
-g typescript-fetch \
-c $(location :typescript_generator_config) \
-o $$TEMP_DIR
# Copy generated files to expected output locations
cp $$TEMP_DIR/src/apis/DefaultApi.ts $(location typescript_generated/src/apis/DefaultApi.ts)
cp $$TEMP_DIR/src/apis/index.ts $(location typescript_generated/src/apis/index.ts)
cp $$TEMP_DIR/src/models/index.ts $(location typescript_generated/src/models/index.ts)
cp $$TEMP_DIR/src/models/ActivityApiResponse.ts $(location typescript_generated/src/models/ActivityApiResponse.ts)
cp $$TEMP_DIR/src/models/ActivityResponse.ts $(location typescript_generated/src/models/ActivityResponse.ts)
cp $$TEMP_DIR/src/models/AnalyzeRequest.ts $(location typescript_generated/src/models/AnalyzeRequest.ts)
cp $$TEMP_DIR/src/models/AnalyzeResponse.ts $(location typescript_generated/src/models/AnalyzeResponse.ts)
cp $$TEMP_DIR/src/models/BuildCancelPathRequest.ts $(location typescript_generated/src/models/BuildCancelPathRequest.ts)
cp $$TEMP_DIR/src/models/BuildCancelRepositoryResponse.ts $(location typescript_generated/src/models/BuildCancelRepositoryResponse.ts)
cp $$TEMP_DIR/src/models/BuildDetailRequest.ts $(location typescript_generated/src/models/BuildDetailRequest.ts)
cp $$TEMP_DIR/src/models/BuildDetailResponse.ts $(location typescript_generated/src/models/BuildDetailResponse.ts)
cp $$TEMP_DIR/src/models/BuildEventSummary.ts $(location typescript_generated/src/models/BuildEventSummary.ts)
cp $$TEMP_DIR/src/models/BuildRequest.ts $(location typescript_generated/src/models/BuildRequest.ts)
cp $$TEMP_DIR/src/models/BuildRequestResponse.ts $(location typescript_generated/src/models/BuildRequestResponse.ts)
cp $$TEMP_DIR/src/models/BuildSummary.ts $(location typescript_generated/src/models/BuildSummary.ts)
cp $$TEMP_DIR/src/models/BuildRequestStatus.ts $(location typescript_generated/src/models/BuildRequestStatus.ts)
cp $$TEMP_DIR/src/models/BuildTimelineEvent.ts $(location typescript_generated/src/models/BuildTimelineEvent.ts)
cp $$TEMP_DIR/src/models/BuildsListApiResponse.ts $(location typescript_generated/src/models/BuildsListApiResponse.ts)
cp $$TEMP_DIR/src/models/BuildsListResponse.ts $(location typescript_generated/src/models/BuildsListResponse.ts)
cp $$TEMP_DIR/src/models/CancelBuildRepositoryRequest.ts $(location typescript_generated/src/models/CancelBuildRepositoryRequest.ts)
cp $$TEMP_DIR/src/models/InvalidatePartitionRequest.ts $(location typescript_generated/src/models/InvalidatePartitionRequest.ts)
cp $$TEMP_DIR/src/models/JobDailyStats.ts $(location typescript_generated/src/models/JobDailyStats.ts)
cp $$TEMP_DIR/src/models/JobDetailRequest.ts $(location typescript_generated/src/models/JobDetailRequest.ts)
cp $$TEMP_DIR/src/models/JobDetailResponse.ts $(location typescript_generated/src/models/JobDetailResponse.ts)
cp $$TEMP_DIR/src/models/JobMetricsRequest.ts $(location typescript_generated/src/models/JobMetricsRequest.ts)
cp $$TEMP_DIR/src/models/JobMetricsResponse.ts $(location typescript_generated/src/models/JobMetricsResponse.ts)
cp $$TEMP_DIR/src/models/JobRunDetail.ts $(location typescript_generated/src/models/JobRunDetail.ts)
cp $$TEMP_DIR/src/models/JobRunSummary.ts $(location typescript_generated/src/models/JobRunSummary.ts)
cp $$TEMP_DIR/src/models/JobSummary.ts $(location typescript_generated/src/models/JobSummary.ts)
cp $$TEMP_DIR/src/models/JobsListApiResponse.ts $(location typescript_generated/src/models/JobsListApiResponse.ts)
cp $$TEMP_DIR/src/models/JobsListResponse.ts $(location typescript_generated/src/models/JobsListResponse.ts)
cp $$TEMP_DIR/src/models/PaginationInfo.ts $(location typescript_generated/src/models/PaginationInfo.ts)
cp $$TEMP_DIR/src/models/PartitionDetailRequest.ts $(location typescript_generated/src/models/PartitionDetailRequest.ts)
cp $$TEMP_DIR/src/models/PartitionDetailResponse.ts $(location typescript_generated/src/models/PartitionDetailResponse.ts)
cp $$TEMP_DIR/src/models/PartitionEventsRequest.ts $(location typescript_generated/src/models/PartitionEventsRequest.ts)
cp $$TEMP_DIR/src/models/PartitionEventsResponse.ts $(location typescript_generated/src/models/PartitionEventsResponse.ts)
cp $$TEMP_DIR/src/models/PartitionInvalidatePathRequest.ts $(location typescript_generated/src/models/PartitionInvalidatePathRequest.ts)
cp $$TEMP_DIR/src/models/PartitionInvalidateResponse.ts $(location typescript_generated/src/models/PartitionInvalidateResponse.ts)
cp $$TEMP_DIR/src/models/PartitionRef.ts $(location typescript_generated/src/models/PartitionRef.ts)
cp $$TEMP_DIR/src/models/PartitionStatusRequest.ts $(location typescript_generated/src/models/PartitionStatusRequest.ts)
cp $$TEMP_DIR/src/models/PartitionStatusResponse.ts $(location typescript_generated/src/models/PartitionStatusResponse.ts)
cp $$TEMP_DIR/src/models/PartitionSummary.ts $(location typescript_generated/src/models/PartitionSummary.ts)
cp $$TEMP_DIR/src/models/PartitionTimelineEvent.ts $(location typescript_generated/src/models/PartitionTimelineEvent.ts)
cp $$TEMP_DIR/src/models/PartitionsListApiResponse.ts $(location typescript_generated/src/models/PartitionsListApiResponse.ts)
cp $$TEMP_DIR/src/models/PartitionsListResponse.ts $(location typescript_generated/src/models/PartitionsListResponse.ts)
cp $$TEMP_DIR/src/models/JobRunTimelineEvent.ts $(location typescript_generated/src/models/JobRunTimelineEvent.ts)
cp $$TEMP_DIR/src/models/JobRunsListApiResponse.ts $(location typescript_generated/src/models/JobRunsListApiResponse.ts)
cp $$TEMP_DIR/src/models/JobRunsListResponse.ts $(location typescript_generated/src/models/JobRunsListResponse.ts)
cp $$TEMP_DIR/src/models/CancelTaskRequest.ts $(location typescript_generated/src/models/CancelTaskRequest.ts)
cp $$TEMP_DIR/src/models/JobRunDetailResponse.ts $(location typescript_generated/src/models/JobRunDetailResponse.ts)
cp $$TEMP_DIR/src/models/JobRunSummary2.ts $(location typescript_generated/src/models/JobRunSummary2.ts)
cp $$TEMP_DIR/src/models/TaskCancelPathRequest.ts $(location typescript_generated/src/models/TaskCancelPathRequest.ts)
cp $$TEMP_DIR/src/models/TaskCancelResponse.ts $(location typescript_generated/src/models/TaskCancelResponse.ts)
cp $$TEMP_DIR/src/models/TaskDetailRequest.ts $(location typescript_generated/src/models/TaskDetailRequest.ts)
cp $$TEMP_DIR/src/runtime.ts $(location typescript_generated/src/runtime.ts)
cp $$TEMP_DIR/src/index.ts $(location typescript_generated/src/index.ts)
""",
visibility = ["//visibility:public"],
)
# TypeScript configuration for the client
ts_config(
name = "ts_config",
src = "tsconfig.json",
visibility = ["//visibility:public"],
)
# Create a proper TypeScript project from the generated files
ts_project(
name = "typescript_lib",
srcs = [":typescript_client"],
allow_js = True,
declaration = True,
resolve_json_module = True,
transpiler = "tsc",
tsconfig = ":ts_config",
visibility = ["//visibility:public"],
)
# Main TypeScript client target
filegroup(
name = "typescript",
srcs = [
":typescript_client",
],
visibility = ["//visibility:public"],
)

View file

@ -1,21 +0,0 @@
{
"compilerOptions": {
"target": "ES2020",
"module": "CommonJS",
"moduleResolution": "node",
"allowJs": true,
"declaration": true,
"strict": false,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"isolatedModules": true,
"noEmit": false
},
"include": ["**/*"],
"exclude": [
"node_modules",
"**/*.test.ts"
]
}

View file

@ -1,14 +0,0 @@
{
"enumPropertyNaming": "snake_case",
"withInterfaces": true,
"useSingleRequestParameter": true,
"typescriptThreePlus": true,
"npmName": "databuild-client",
"npmVersion": "1.0.0",
"stringEnums": true,
"generateAliasAsModel": false,
"modelPropertyNaming": "snake_case",
"paramNaming": "snake_case",
"supportsES6": true,
"withoutRuntimeChecks": false
}

View file

@ -1,111 +0,0 @@
load("@aspect_rules_esbuild//esbuild:defs.bzl", "esbuild")
load("@aspect_rules_js//js:defs.bzl", "js_test")
load("@aspect_rules_ts//ts:defs.bzl", "ts_config", "ts_project")
load("@databuild_npm//:defs.bzl", "npm_link_all_packages")
npm_link_all_packages(name = "node_modules")
filegroup(
name = "dist",
srcs = [
# To be added once we have one
# "favicon.svg",
"index.html",
":app_dist",
":css",
],
visibility = ["//visibility:public"],
)
genrule(
name = "css",
srcs = [
"index.css",
"index.html",
":node_modules/daisyui",
":app_dist",
],
outs = ["dist.css"],
cmd = """
# Must manually copy sources, because tailwind silently ignores symlinked files:
# https://github.com/tailwindlabs/tailwindcss/issues/13731
WORKDIR=$$(dirname $(location index.css))
find $$WORKDIR -type l -exec bash -c 'echo "> $${0}" && cp -fL "$${0}" "$${0}.tmp" && mv "$${0}.tmp" "$${0}"' {} \\;
# Copy over source from built TS app so that tailwind can see the used classes
for fpath in $(locations :app_dist); do
cp $$fpath $$WORKDIR
done
# Include daisyui plugin
cp -R $(@D)/node_modules/.aspect_rules_js/*/node_modules $$WORKDIR/node_modules
# Run tailwind build
$(location //tools/build_rules:tailwind) -i $(location index.css) -o $@
""",
tools = ["//tools/build_rules:tailwind"],
)
ts_config(
name = "ts_config_app",
src = ":tsconfig_app.json",
visibility = ["//visibility:public"],
)
# Making modules of ts projects seems to be a rat's nest.
# Hopefully we can figure this out in the future.
ts_project(
name = "app",
srcs = [
"index.ts",
"layout.ts",
"pages.ts",
"services.ts",
"types.ts",
"utils.ts",
# Test files
"index.test.ts",
"utils.test.ts",
"transformation-tests.ts",
],
allow_js = True,
resolve_json_module = True,
transpiler = "tsc",
tsconfig = ":ts_config_app",
deps = [
":node_modules/@types/mithril",
":node_modules/@types/node",
":node_modules/@types/ospec",
":node_modules/mithril",
":node_modules/ospec",
":node_modules/whatwg-fetch",
"//databuild/client:typescript_lib",
],
)
esbuild(
name = "app_dist",
srcs = [":app"],
bazel_sandbox_plugin = True,
entry_point = "index.js",
# esbuild_log_level = "verbose",
# js_log_level = "debug",
metafile = True,
visibility = ["//visibility:public"],
)
js_test(
name = "app_test",
chdir = package_name(),
data = [":app"],
entry_point = "index.test.js",
)
# Test to verify strict TypeScript configuration catches expected failures
sh_test(
name = "strict_config_test",
srcs = ["test-strict-config.sh"],
data = [
"test-data/strict-config-failures.ts",
"tsconfig_app.json",
":node_modules/@types/node",
":node_modules/typescript",
],
)

View file

@ -1,4 +0,0 @@
# Dashboard
A dashboard for viewing past build status, currently running builds, and more. Still an early prototype.

View file

@ -1,127 +0,0 @@
# Dashboard Type Safety Architecture
## Overview
This document describes the type safety architecture implemented in the DataBuild dashboard to prevent runtime errors from backend API changes.
## Problem Statement
The dashboard previously experienced runtime crashes when backend API changes were deployed:
- `status.toLowerCase()` failed when `status` changed from a string to an object (see the sketch below)
- `partition.str` access failed when partition structure changed
- TypeScript compilation passed but runtime errors occurred
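For concreteness, here is a minimal sketch of the first failure mode; the response shapes below are illustrative only, not the actual generated client types:
```typescript
// Illustrative shapes only - not the real generated client types.
const oldResponse = { status: "COMPLETED" };
const newResponse = { status: { code: 4, name: "COMPLETED" } };

console.log(oldResponse.status.toLowerCase()); // "completed" - fine while status was a string
console.log(newResponse.status.name);          // the new shape carries a code and a name

// After the backend change, the old call site still type-checks when the
// response is typed as `any`, but throws at runtime:
//   (newResponse.status as any).toLowerCase(); // TypeError: ... is not a function
```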
## Solution Architecture
### 1. Dashboard Data Contracts
We define stable TypeScript interfaces in `types.ts` that represent the data shapes the UI components expect:
```typescript
export interface DashboardBuild {
build_request_id: string;
status: string; // Always a human-readable string
requested_partitions: string[]; // Always flat string array
// ... other fields
}
```
### 2. Transformation Layer
The `services.ts` file contains transformation functions that convert OpenAPI-generated types to dashboard types:
```typescript
function transformBuildSummary(apiResponse: BuildSummary): DashboardBuild {
return {
build_request_id: apiResponse.build_request_id,
status: apiResponse.status_name, // Extract string from API
requested_partitions: apiResponse.requested_partitions.map(p => p.str), // Flatten objects
// ... transform other fields
};
}
```
### 3. Component Isolation
All UI components use only dashboard types, never raw API types:
```typescript
// GOOD: Using dashboard types
const build: DashboardBuild = await DashboardService.getBuildDetail(id);
m('div', build.status.toLowerCase()); // Safe - status is always string
// BAD: Using API types directly
const build: BuildSummary = await apiClient.getBuild(id);
m('div', build.status.toLowerCase()); // Unsafe - status might be object
```
## Benefits
1. **Compile-time Safety**: TypeScript catches type mismatches during development
2. **Runtime Protection**: Transformation functions handle API changes gracefully
3. **Clear Boundaries**: UI code is isolated from API implementation details
4. **Easier Updates**: API changes require updates only in transformation functions
## Testing Strategy
### Unit Tests
- `transformation-tests.ts`: Verify transformation functions produce correct dashboard types
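A minimal sketch of such a test using ospec. It assumes `transformBuildSummary` is importable from `services.ts`; the actual `transformation-tests.ts` re-declares the function locally because `services.ts` does not export it.
```typescript
import o from 'ospec';
// Assumed export, for illustration; the real test file defines its own copy.
import { transformBuildSummary } from './services';

o.spec('transformBuildSummary', () => {
  o('produces a string status and a flat partition list', () => {
    // Illustrative API payload shaped like the generated BuildSummary.
    const apiResponse: any = {
      build_request_id: 'build-123',
      status_name: 'COMPLETED',
      requested_partitions: [{ str: 'partition-1' }, { str: 'partition-2' }],
    };
    const build = transformBuildSummary(apiResponse);
    o(build.status).equals('COMPLETED');
    o(build.requested_partitions).deepEquals(['partition-1', 'partition-2']);
  });
});
```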
### Strict TypeScript Configuration
- `exactOptionalPropertyTypes`: Ensures optional properties are handled explicitly
- `strictNullChecks`: Prevents null/undefined errors
- `noImplicitAny`: Requires explicit typing
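As a small sketch, these flags reject patterns like the following at compile time (the interface name here is hypothetical; the full set of checked patterns lives in `test-data/strict-config-failures.ts`):
```typescript
// Hypothetical interface, for illustration only.
interface PartitionInfo {
  partition_ref: string;
  last_updated?: number;
}

const p: PartitionInfo = { partition_ref: 'example' };
console.log(p.partition_ref);

// strictNullChecks: error - `p.last_updated` is possibly undefined.
// const label = p.last_updated.toString();

// exactOptionalPropertyTypes: error - `undefined` is not assignable to an
// optional property declared as `number`.
// const q: PartitionInfo = { partition_ref: 'example', last_updated: undefined };
```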
## Maintenance Guidelines
### When Backend API Changes
1. Update the OpenAPI spec and regenerate client
2. TypeScript compilation will fail in transformation functions if types changed
3. Update only the transformation functions to handle new API shape
4. Run tests to verify UI components still work correctly
### Adding New Features
1. Define dashboard types in `types.ts`
2. Create transformation functions in `services.ts`
3. Use only dashboard types in components
4. Add tests for the transformation logic
## Example: Handling API Evolution
If the backend changes `status` from string to object:
```typescript
// Old API
{ status_name: "COMPLETED" }
// New API
{ status: { code: 4, name: "COMPLETED" } }
// Transformation handles both
function transformBuildSummary(apiResponse: any): DashboardBuild {
return {
status: apiResponse.status_name || apiResponse.status?.name || 'UNKNOWN',
// ... other fields
};
}
```
The UI components continue working without changes because they always receive the expected `string` type.
## Monitoring
To maintain type safety over time:
1. **Build-time Checks**: TypeScript compilation catches type errors
2. **Test Suite**: Transformation tests run on every build
3. **Code Reviews**: Ensure new code follows the pattern
4. **Documentation**: Keep this document updated with patterns
## Related Files
- `types.ts` - Dashboard type definitions
- `services.ts` - API transformation functions
- `transformation-tests.ts` - Unit tests for transformations
- `tsconfig_app.json` - Strict TypeScript configuration

View file

@ -1,78 +0,0 @@
@import "tailwindcss" source("./**/*.{js,html}");
@plugin "daisyui" {
}
@plugin "daisyui/theme" {
name: "databuild-light";
default: true;
prefersdark: false;
color-scheme: "light";
--color-base-100: oklch(100% 0 0);
--color-base-200: oklch(98% 0.002 247.839);
--color-base-300: oklch(96% 0.003 264.542);
--color-base-content: oklch(21% 0.034 264.665);
--color-primary: oklch(37% 0.01 67.558);
--color-primary-content: oklch(100% 0 0);
--color-secondary: oklch(77% 0.152 181.912);
--color-secondary-content: oklch(100% 0 0);
--color-accent: oklch(75% 0.183 55.934);
--color-accent-content: oklch(100% 0 0);
--color-neutral: oklch(37% 0.01 67.558);
--color-neutral-content: oklch(98% 0.002 247.839);
--color-info: oklch(80% 0.105 251.813);
--color-info-content: oklch(28% 0.091 267.935);
--color-success: oklch(84% 0.238 128.85);
--color-success-content: oklch(27% 0.072 132.109);
--color-warning: oklch(85% 0.199 91.936);
--color-warning-content: oklch(27% 0.077 45.635);
--color-error: oklch(70% 0.191 22.216);
--color-error-content: oklch(25% 0.092 26.042);
--radius-selector: 0.5rem;
--radius-field: 0.5rem;
--radius-box: 0.5rem;
--size-selector: 0.25rem;
--size-field: 0.25rem;
--border: 1px;
--depth: 0;
--noise: 0;
}
@plugin "daisyui/theme" {
name: "databuild-dark";
default: false;
prefersdark: false;
color-scheme: "dark";
--color-base-100: oklch(15% 0.002 247.839);
--color-base-200: oklch(18% 0.003 264.542);
--color-base-300: oklch(22% 0.006 264.531);
--color-base-content: oklch(92% 0.034 264.665);
--color-primary: oklch(75% 0.005 56.366);
--color-primary-content: oklch(15% 0.006 56.043);
--color-secondary: oklch(65% 0.152 181.912);
--color-secondary-content: oklch(15% 0 0);
--color-accent: oklch(70% 0.183 55.934);
--color-accent-content: oklch(15% 0 0);
--color-neutral: oklch(25% 0.01 67.558);
--color-neutral-content: oklch(92% 0.002 247.839);
--color-info: oklch(65% 0.165 254.624);
--color-info-content: oklch(15% 0.091 267.935);
--color-success: oklch(75% 0.238 128.85);
--color-success-content: oklch(15% 0.072 132.109);
--color-warning: oklch(80% 0.199 91.936);
--color-warning-content: oklch(15% 0.077 45.635);
--color-error: oklch(65% 0.191 22.216);
--color-error-content: oklch(15% 0.092 26.042);
--radius-selector: 0.5rem;
--radius-field: 0.5rem;
--radius-box: 0.5rem;
--size-selector: 0.25rem;
--size-field: 0.25rem;
--border: 1px;
--depth: 0;
--noise: 0;
}

View file

@ -1,21 +0,0 @@
<!doctype html>
<html data-theme="databuild-light">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>DataBuild Dashboard</title>
<link href="/static/dist.css" rel="stylesheet">
<script src="/static/app_dist.js"></script>
<script type="module">
import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
window.mermaid = mermaid;
mermaid.initialize({ startOnLoad: true });
console.info("mermaid loaded", mermaid);
</script>
</head>
<body>
<div id="app">
Loading...
</div>
</body>
</html>

View file

@ -1,15 +0,0 @@
const { appName } = require('./index');
const o = require('ospec');
// Import transformation tests
require('./transformation-tests');
o.spec("appName", () => {
o("should be databuild", () => {
o(appName).equals("databuild") `Should be databuild`;
});
});
// TODO - I think we can create an ospec target that invokes these with the ospec CLI?
// https://github.com/MithrilJS/ospec?tab=readme-ov-file#command-line-interface
o.run();

View file

@ -1,76 +0,0 @@
import m from 'mithril';
import { Layout } from './layout';
import {
RecentActivity,
BuildStatus,
PartitionsList,
PartitionStatus,
JobsList,
JobMetrics,
GraphAnalysis
} from './pages';
import { decodePartitionRef } from './utils';
import {
TypedComponent,
LayoutWrapperAttrs,
RecentActivityAttrs,
BuildStatusAttrs,
PartitionStatusAttrs,
PartitionsListAttrs,
JobsListAttrs,
JobMetricsAttrs,
GraphAnalysisAttrs
} from './types';
export const appName = "databuild";
// Wrapper components that include layout - now with type safety
function createLayoutWrapper<TAttrs>(component: TypedComponent<TAttrs>): m.Component<TAttrs> {
const wrapper: any = {
view: (vnode: m.Vnode<TAttrs>) => m(Layout, [component.view.call(component, vnode)])
};
// Only add lifecycle methods if they exist to avoid exactOptionalPropertyTypes issues
if (component.oninit) {
wrapper.oninit = (vnode: m.Vnode<TAttrs>) => component.oninit!.call(component, vnode);
}
if (component.oncreate) {
wrapper.oncreate = (vnode: m.VnodeDOM<TAttrs>) => component.oncreate!.call(component, vnode);
}
if (component.onupdate) {
wrapper.onupdate = (vnode: m.VnodeDOM<TAttrs>) => component.onupdate!.call(component, vnode);
}
if (component.onbeforeremove) {
wrapper.onbeforeremove = (vnode: m.VnodeDOM<TAttrs>) => component.onbeforeremove!.call(component, vnode);
}
if (component.onremove) {
wrapper.onremove = (vnode: m.VnodeDOM<TAttrs>) => component.onremove!.call(component, vnode);
}
if (component.onbeforeupdate) {
wrapper.onbeforeupdate = (vnode: m.Vnode<TAttrs>, old: m.VnodeDOM<TAttrs>) => component.onbeforeupdate!.call(component, vnode, old);
}
return wrapper;
}
// Route definitions with type safety
const routes = {
'/': createLayoutWrapper<RecentActivityAttrs>(RecentActivity),
'/builds/:id': createLayoutWrapper<BuildStatusAttrs>(BuildStatus),
'/partitions': createLayoutWrapper<PartitionsListAttrs>(PartitionsList),
'/partitions/:base64_ref': createLayoutWrapper<PartitionStatusAttrs>(PartitionStatus),
'/jobs': createLayoutWrapper<JobsListAttrs>(JobsList),
'/jobs/:label': createLayoutWrapper<JobMetricsAttrs>(JobMetrics),
'/analyze': createLayoutWrapper<GraphAnalysisAttrs>(GraphAnalysis),
};
if (typeof window !== "undefined") {
document.addEventListener("DOMContentLoaded", () => {
// Initialize theme from localStorage
const savedTheme = localStorage.getItem('theme') || 'databuild-light';
document.documentElement.setAttribute('data-theme', savedTheme);
// Set up routing
m.route(document.getElementById('app') as HTMLElement, '/', routes);
});
}

View file

@ -1,52 +0,0 @@
import m from 'mithril';
export const Layout = {
view: (vnode: any) => [
m('header.navbar.bg-base-100.shadow-lg', [
m('div.navbar-start', [
m('div.dropdown', [
m('div.btn.btn-ghost.lg:hidden[tabindex="0"][role="button"]', [
m('svg.w-5.h-5[xmlns="http://www.w3.org/2000/svg"][fill="none"][viewBox="0 0 24 24"]', [
m('path[stroke-linecap="round"][stroke-linejoin="round"][stroke-width="2"][stroke="currentColor"][d="M4 6h16M4 12h8m-8 6h16"]'),
]),
]),
m('ul.menu.menu-sm.dropdown-content.bg-base-100.rounded-box.z-1.mt-3.w-52.p-2.shadow[tabindex="0"]', [
m('li', m(m.route.Link, { href: '/partitions' }, 'Partitions')),
m('li', m(m.route.Link, { href: '/jobs' }, 'Jobs')),
m('li', m(m.route.Link, { href: '/analyze' }, 'Analyze')),
]),
]),
m(m.route.Link, { href: '/', class: 'btn btn-ghost text-xl' }, 'DataBuild Dashboard'),
]),
m('div.navbar-center.hidden.lg:flex', [
m('ul.menu.menu-horizontal.px-1', [
m('li', m(m.route.Link, { href: '/' }, 'Dashboard')),
m('li', m(m.route.Link, { href: '/partitions' }, 'Partitions')),
m('li', m(m.route.Link, { href: '/jobs' }, 'Jobs')),
m('li', m(m.route.Link, { href: '/analyze' }, 'Analyze')),
]),
]),
m('div.navbar-end', [
m('label.swap.swap-rotate', [
m('input.theme-controller[type="checkbox"]', {
value: 'databuild-dark',
onchange: (e: Event) => {
const target = e.target as HTMLInputElement;
const theme = target.checked ? 'databuild-dark' : 'databuild-light';
document.documentElement.setAttribute('data-theme', theme);
localStorage.setItem('theme', theme);
},
checked: localStorage.getItem('theme') === 'databuild-dark'
}),
m('svg.swap-off.fill-current.w-6.h-6[xmlns="http://www.w3.org/2000/svg"][viewBox="0 0 24 24"]', [
m('path[d="M5.64,17l-.71.71a1,1,0,0,0,0,1.41,1,1,0,0,0,1.41,0l.71-.71A1,1,0,0,0,5.64,17ZM5,12a1,1,0,0,0-1-1H3a1,1,0,0,0,0,2H4A1,1,0,0,0,5,12Zm7-7a1,1,0,0,0,1-1V3a1,1,0,0,0-2,0V4A1,1,0,0,0,12,5ZM5.64,7.05a1,1,0,0,0,.7.29,1,1,0,0,0,.71-.29,1,1,0,0,0,0-1.41l-.71-.71A1,1,0,0,0,4.93,6.34Zm12,.29a1,1,0,0,0,.7-.29l.71-.71a1,1,0,1,0-1.41-1.41L17,5.64a1,1,0,0,0,0,1.41A1,1,0,0,0,17.66,7.34ZM21,11H20a1,1,0,0,0,0,2h1a1,1,0,0,0,0-2Zm-9,8a1,1,0,0,0-1,1v1a1,1,0,0,0,2,0V20A1,1,0,0,0,12,19ZM18.36,17A1,1,0,0,0,17,18.36l.71.71a1,1,0,0,0,1.41,0,1,1,0,0,0,0-1.41ZM12,6.5A5.5,5.5,0,1,0,17.5,12,5.51,5.51,0,0,0,12,6.5Zm0,9A3.5,3.5,0,1,1,15.5,12,3.5,3.5,0,0,1,12,15.5Z"]'),
]),
m('svg.swap-on.fill-current.w-6.h-6[xmlns="http://www.w3.org/2000/svg"][viewBox="0 0 24 24"]', [
m('path[d="M21.64,13a1,1,0,0,0-1.05-.14,8.05,8.05,0,0,1-3.37.73A8.15,8.15,0,0,1,9.08,5.49a8.59,8.59,0,0,1,.25-2A1,1,0,0,0,8,2.36,10.14,10.14,0,1,0,22,14.05,1,1,0,0,0,21.64,13Zm-9.5,6.69A8.14,8.14,0,0,1,7.08,5.22v.27A10.15,10.15,0,0,0,17.22,15.63a9.79,9.79,0,0,0,2.1-.22A8.11,8.11,0,0,1,12.14,19.73Z"]'),
]),
]),
]),
]),
m('main.min-h-screen.bg-base-200.pt-4', vnode.children),
]
};

View file

@ -1,16 +0,0 @@
{
"private": true,
"devDependencies": {
"typescript": "5.7.3",
"@types/node": "^22.12.0",
"mithril": "^2.2.7",
"@types/mithril": "^2.2.7",
"ospec": "^4.2.0",
"@types/ospec": "^4.2.0",
"whatwg-fetch": "^3.6.20",
"daisyui": "^5.0.0-beta.6"
},
"pnpm": {
"onlyBuiltDependencies": []
}
}

File diff suppressed because it is too large

View file

@ -1,111 +0,0 @@
lockfileVersion: '9.0'
settings:
autoInstallPeers: true
excludeLinksFromLockfile: false
importers:
.:
devDependencies:
'@types/mithril':
specifier: ^2.2.7
version: 2.2.7
'@types/node':
specifier: ^22.12.0
version: 22.12.0
'@types/ospec':
specifier: ^4.2.0
version: 4.2.0
daisyui:
specifier: ^5.0.0-beta.6
version: 5.0.0-beta.6
mithril:
specifier: ^2.2.7
version: 2.2.13
ospec:
specifier: ^4.2.0
version: 4.2.1
typescript:
specifier: ^5.7.3
version: 5.7.3
whatwg-fetch:
specifier: ^3.6.20
version: 3.6.20
packages:
'@types/mithril@2.2.7':
resolution: {integrity: sha512-uetxoYizBMHPELl6DSZUfO6Q/aOm+h0NUCv9bVAX2iAxfrdBSOvU9KKFl+McTtxR13F+BReYLY814pJsZvnSxg==}
'@types/node@22.12.0':
resolution: {integrity: sha512-Fll2FZ1riMjNmlmJOdAyY5pUbkftXslB5DgEzlIuNaiWhXd00FhWxVC/r4yV/4wBb9JfImTu+jiSvXTkJ7F/gA==}
'@types/ospec@4.2.0':
resolution: {integrity: sha512-QgwAtrYYstU7otBXmQ2yjUWaYMWkF48EevmG+IfYzAWk39cwsTw7ZHp7dK2XyA3eJ2v5AvbMa5ijcLewklDRDA==}
balanced-match@1.0.2:
resolution: {integrity: sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw==}
brace-expansion@2.0.1:
resolution: {integrity: sha512-XnAIvQ8eM+kC6aULx6wuQiwVsnzsi9d3WxzV3FpWTGA19F621kwdbsAcFKXgKUHZWsy+mY6iL1sHTxWEFCytDA==}
daisyui@5.0.0-beta.6:
resolution: {integrity: sha512-gwXHv6MApRBrvUayzg83vS6bfZ+y7/1VGLu0a8/cEAMviS4rXLCd4AndEdlVxhq+25wkAp0CZRkNQ7O4wIoFnQ==}
fs.realpath@1.0.0:
resolution: {integrity: sha512-OO0pH2lK6a0hZnAdau5ItzHPI6pUlvI7jMVnxUQRtw4owF2wk8lOSabtGDCTP4Ggrg2MbGnWO9X8K1t4+fGMDw==}
glob@9.3.5:
resolution: {integrity: sha512-e1LleDykUz2Iu+MTYdkSsuWX8lvAjAcs0Xef0lNIu0S2wOAzuTxCJtcd9S3cijlwYF18EsU3rzb8jPVobxDh9Q==}
engines: {node: '>=16 || 14 >=14.17'}
lru-cache@10.4.3:
resolution: {integrity: sha512-JNAzZcXrCt42VGLuYz0zfAzDfAvJWW6AfYlDBQyDV5DClI2m5sAmK+OIO7s59XfsRsWHp02jAJrRadPRGTt6SQ==}
minimatch@8.0.4:
resolution: {integrity: sha512-W0Wvr9HyFXZRGIDgCicunpQ299OKXs9RgZfaukz4qAW/pJhcpUfupc9c+OObPOFueNy8VSrZgEmDtk6Kh4WzDA==}
engines: {node: '>=16 || 14 >=14.17'}
minipass@4.2.8:
resolution: {integrity: sha512-fNzuVyifolSLFL4NzpF+wEF4qrgqaaKX0haXPQEdQ7NKAN+WecoKMHV09YcuL/DHxrUsYQOK3MiuDf7Ip2OXfQ==}
engines: {node: '>=8'}
minipass@7.1.2:
resolution: {integrity: sha512-qOOzS1cBTWYF4BH8fVePDBOO9iptMnGUEZwNc/cMWnTV2nVLZ7VoNWEPHkYczZA0pdoA7dl6e7FL659nX9S2aw==}
engines: {node: '>=16 || 14 >=14.17'}
mithril@2.2.13:
resolution: {integrity: sha512-dfWFYmRJDXAROG6B1AsQXEwhSgFZ65Am/5Xj3oJ/R1wZtrC0W20P4sIAtFQB0SZsGwV7H2MiEJiFGmlUtXF1Ww==}
ospec@4.2.1:
resolution: {integrity: sha512-LsJw2WMaVlFDiaIPPH+LMtsxOABjFD29XQ12ENZM+8Cwgg5BEgW65CB+SPL1PceIun+HSfdw8hkf27C8iF/XFw==}
hasBin: true
path-scurry@1.11.1:
resolution: {integrity: sha512-Xa4Nw17FS9ApQFJ9umLiJS4orGjm7ZzwUrwamcGQuHSzDyth9boKDaycYdDcZDuqYATXw4HFXgaqWTctW/v1HA==}
engines: {node: '>=16 || 14 >=14.18'}
typescript@5.7.3:
resolution: {integrity: sha512-84MVSjMEHP+FQRPy3pX9sTVV/INIex71s9TL2Gm5FG/WG1SqXeKyZ0k7/blY/4FdOzI12CBy1vGc4og/eus0fw==}
engines: {node: '>=14.17'}
hasBin: true
undici-types@6.20.0:
resolution: {integrity: sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==}
whatwg-fetch@3.6.20:
resolution: {integrity: sha512-EqhiFU6daOA8kpjOWTL0olhVOF3i7OrFzSYiGsEMB8GcXS+RrzauAERX65xMeNWVqxA6HXH2m69Z9LaKKdisfg==}
snapshots:
'@types/mithril@2.2.7': {}
'@types/node@22.12.0':
dependencies:
undici-types: 6.20.0
'@types/ospec@4.2.0': {}
balanced-match@1.0.2: {}
brace-expansion@2.0.1:
dependencies:
balanced-match: 1.0.2
daisyui@5.0.0-beta.6: {}
fs.realpath@1.0.0: {}
glob@9.3.5:
dependencies:
fs.realpath: 1.0.0
minimatch: 8.0.4
minipass: 4.2.8
path-scurry: 1.11.1
lru-cache@10.4.3: {}
minimatch@8.0.4:
dependencies:
brace-expansion: 2.0.1
minipass@4.2.8: {}
minipass@7.1.2: {}
mithril@2.2.13: {}
ospec@4.2.1:
dependencies:
glob: 9.3.5
path-scurry@1.11.1:
dependencies:
lru-cache: 10.4.3
minipass: 7.1.2
typescript@5.7.3: {}
undici-types@6.20.0: {}
whatwg-fetch@3.6.20: {}

View file

@ -1,2 +0,0 @@
packages:
- "databuild"

View file

@ -1,490 +0,0 @@
// Import the generated TypeScript client
import {
DefaultApi,
Configuration,
ActivityApiResponse,
ActivityResponse,
BuildSummary,
BuildDetailResponse,
PartitionSummary,
JobsListApiResponse,
JobMetricsResponse,
JobSummary,
JobRunSummary,
JobDailyStats
} from '../client/typescript_generated/src/index';
// Import our dashboard types
import {
DashboardActivity,
DashboardBuild,
DashboardPartition,
DashboardJob,
isDashboardActivity,
isDashboardBuild,
isDashboardPartition,
isDashboardJob
} from './types';
// Configure the API client
const apiConfig = new Configuration({
basePath: '', // Use relative paths since we're on the same host
});
const apiClient = new DefaultApi(apiConfig);
// Transformation functions: Convert API responses to dashboard types
// These functions prevent runtime errors by ensuring consistent data shapes
function transformBuildSummary(apiResponse: BuildSummary): DashboardBuild {
return {
build_request_id: apiResponse.build_request_id,
status: apiResponse.status!,
requested_partitions: apiResponse.requested_partitions, // Keep as PartitionRef array
total_jobs: apiResponse.total_jobs,
completed_jobs: apiResponse.completed_jobs,
failed_jobs: apiResponse.failed_jobs,
cancelled_jobs: apiResponse.cancelled_jobs,
requested_at: apiResponse.requested_at,
started_at: apiResponse.started_at ?? null,
completed_at: apiResponse.completed_at ?? null,
duration_ms: apiResponse.duration_ms ?? null,
cancelled: apiResponse.cancelled,
};
}
function transformBuildDetail(apiResponse: BuildDetailResponse): DashboardBuild {
return {
build_request_id: apiResponse.build_request_id,
status: apiResponse.status!,
requested_partitions: apiResponse.requested_partitions, // Keep as PartitionRef array
total_jobs: apiResponse.total_jobs,
completed_jobs: apiResponse.completed_jobs,
failed_jobs: apiResponse.failed_jobs,
cancelled_jobs: apiResponse.cancelled_jobs,
requested_at: apiResponse.requested_at,
started_at: apiResponse.started_at ?? null,
completed_at: apiResponse.completed_at ?? null,
duration_ms: apiResponse.duration_ms ?? null,
cancelled: apiResponse.cancelled,
};
}
function transformPartitionSummary(apiResponse: PartitionSummary): DashboardPartition {
if (!apiResponse.partition_ref) {
throw new Error('PartitionSummary must have a valid partition_ref');
}
return {
partition_ref: apiResponse.partition_ref, // Keep as PartitionRef object
status_code: apiResponse.status_code,
status_name: apiResponse.status_name,
last_updated: apiResponse.last_updated ?? null,
build_requests: (apiResponse as any).build_requests || [], // This field might not be in the OpenAPI spec
};
}
function transformJobSummary(apiResponse: JobSummary): DashboardJob {
return {
job_label: apiResponse.job_label,
total_runs: apiResponse.total_runs,
successful_runs: apiResponse.successful_runs,
failed_runs: apiResponse.failed_runs,
cancelled_runs: apiResponse.cancelled_runs,
last_run_timestamp: apiResponse.last_run_timestamp,
last_run_status_code: apiResponse.last_run_status_code,
last_run_status_name: apiResponse.last_run_status_name,
average_partitions_per_run: apiResponse.average_partitions_per_run,
recent_builds: apiResponse.recent_builds || [], // Default for optional array field
};
}
function transformActivityResponse(apiResponse: ActivityResponse): DashboardActivity {
return {
active_builds_count: apiResponse.active_builds_count,
recent_builds: apiResponse.recent_builds.map(transformBuildSummary),
recent_partitions: apiResponse.recent_partitions.map(transformPartitionSummary),
total_partitions_count: apiResponse.total_partitions_count,
system_status: apiResponse.system_status,
graph_name: apiResponse.graph_name,
};
}
// Type guards for runtime validation
function isValidBuildDetailResponse(data: unknown): data is BuildDetailResponse {
return typeof data === 'object' &&
data !== null &&
'build_request_id' in data &&
'status_name' in data &&
'requested_partitions' in data;
}
function isValidActivityResponse(data: unknown): data is ActivityResponse {
return typeof data === 'object' &&
data !== null &&
'active_builds_count' in data &&
'recent_builds' in data &&
'recent_partitions' in data;
}
function isValidJobsListApiResponse(data: unknown): data is JobsListApiResponse {
return typeof data === 'object' &&
data !== null &&
'data' in data;
}
// API Service for fetching recent activity data
export class DashboardService {
private static instance: DashboardService;
static getInstance(): DashboardService {
if (!DashboardService.instance) {
DashboardService.instance = new DashboardService();
}
return DashboardService.instance;
}
async getRecentActivity(): Promise<DashboardActivity> {
try {
// Use the new activity endpoint that aggregates all the data we need
const activityApiResponse: ActivityApiResponse = await apiClient.apiV1ActivityGet();
console.info('Recent activity:', activityApiResponse);
const activityResponse = activityApiResponse.data;
// Validate API response structure
if (!isValidActivityResponse(activityResponse)) {
throw new Error('Invalid activity response structure');
}
// Transform API response to dashboard format using transformation function
const dashboardActivity = transformActivityResponse(activityResponse);
// Validate transformed result
if (!isDashboardActivity(dashboardActivity)) {
throw new Error('Transformation produced invalid dashboard activity');
}
return dashboardActivity;
} catch (error) {
console.error('Failed to fetch recent activity:', error);
// Fall back to valid dashboard format if API call fails
return {
active_builds_count: 0,
recent_builds: [],
recent_partitions: [],
total_partitions_count: 0,
system_status: 'error',
graph_name: 'Unknown Graph'
};
}
}
async getJobs(searchTerm?: string): Promise<DashboardJob[]> {
try {
// Build query parameters manually since the generated client may not support query params correctly
const queryParams = new URLSearchParams();
if (searchTerm) {
queryParams.append('search', searchTerm);
}
const url = `/api/v1/jobs${queryParams.toString() ? '?' + queryParams.toString() : ''}`;
const response = await fetch(url);
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
const data: unknown = await response.json();
// Validate API response structure
if (!isValidJobsListApiResponse(data)) {
throw new Error('Invalid jobs list response structure');
}
// Transform each job using our transformation function
const dashboardJobs = data.data.jobs.map(transformJobSummary);
// Validate each transformed job
for (const job of dashboardJobs) {
if (!isDashboardJob(job)) {
throw new Error('Transformation produced invalid dashboard job');
}
}
return dashboardJobs;
} catch (error) {
console.error('Failed to fetch jobs:', error);
return [];
}
}
async getBuildDetail(buildId: string): Promise<DashboardBuild | null> {
try {
const url = `/api/v1/builds/${buildId}`;
const response = await fetch(url);
if (!response.ok) {
if (response.status === 404) {
return null; // Build not found
}
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
const data: unknown = await response.json();
// Validate API response structure
if (!isValidBuildDetailResponse(data)) {
throw new Error('Invalid build detail response structure');
}
// Transform to dashboard format
const dashboardBuild = transformBuildDetail(data);
// Validate transformed result
if (!isDashboardBuild(dashboardBuild)) {
throw new Error('Transformation produced invalid dashboard build');
}
return dashboardBuild;
} catch (error) {
console.error('Failed to fetch build detail:', error);
return null;
}
}
async getPartitionDetail(partitionRef: string): Promise<DashboardPartition | null> {
try {
// Encode partition ref for URL safety
const encodedRef = btoa(partitionRef).replace(/\+/g, '-').replace(/\//g, '_').replace(/=/g, '');
const url = `/api/v1/partitions/${encodedRef}`;
const response = await fetch(url);
if (!response.ok) {
if (response.status === 404) {
return null; // Partition not found
}
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
const data: unknown = await response.json();
// For partition detail, we need to extract the PartitionSummary from the response
// and transform it to dashboard format
if (typeof data === 'object' && data !== null && 'partition_ref' in data) {
const dashboardPartition = transformPartitionSummary(data as PartitionSummary);
if (!isDashboardPartition(dashboardPartition)) {
throw new Error('Transformation produced invalid dashboard partition');
}
return dashboardPartition;
} else {
throw new Error('Invalid partition detail response structure');
}
} catch (error) {
console.error('Failed to fetch partition detail:', error);
return null;
}
}
async getJobMetrics(jobLabel: string): Promise<DashboardJob | null> {
try {
// Encode job label like partition refs for URL safety
const encodedLabel = btoa(jobLabel).replace(/\+/g, '-').replace(/\//g, '_').replace(/=/g, '');
const url = `/api/v1/jobs/${encodedLabel}`;
const response = await fetch(url);
if (!response.ok) {
if (response.status === 404) {
return null; // Job not found
}
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
const data: unknown = await response.json();
console.log('Job metrics response:', data);
// Extract job summary from metrics response and transform it
if (typeof data === 'object' && data !== null && 'job_label' in data) {
const dashboardJob = transformJobSummary(data as unknown as JobSummary);
console.log('Transformed job summary:', dashboardJob);
if (!isDashboardJob(dashboardJob)) {
throw new Error('Transformation produced invalid dashboard job');
}
return dashboardJob;
}
throw new Error('Invalid job metrics response structure');
} catch (error) {
console.error('Failed to fetch job metrics:', error);
return null;
}
}
async getMermaidDiagram(buildId: string): Promise<string | null> {
try {
const url = `/api/v1/builds/${buildId}/mermaid`;
const response = await fetch(url);
if (!response.ok) {
if (response.status === 404) {
return null; // Build not found or no job graph
}
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
const data = await response.json();
// Validate response structure
if (typeof data === 'object' && data !== null && 'mermaid' in data && typeof data.mermaid === 'string') {
return data.mermaid;
}
throw new Error('Invalid mermaid response structure');
} catch (error) {
console.error('Failed to fetch mermaid diagram:', error);
return null;
}
}
}
// Polling manager with Page Visibility API integration
export class PollingManager {
private intervals: Map<string, NodeJS.Timeout> = new Map();
private isTabVisible: boolean = true;
private visibilityChangeHandler: () => void;
constructor() {
this.visibilityChangeHandler = () => {
this.isTabVisible = !document.hidden;
// Pause or resume polling based on tab visibility
if (this.isTabVisible) {
this.resumePolling();
} else {
this.pausePolling();
}
};
// Set up Page Visibility API listener only in browser environment
if (typeof document !== 'undefined') {
document.addEventListener('visibilitychange', this.visibilityChangeHandler);
}
}
startPolling(key: string, callback: () => void, intervalMs: number): void {
// Clear existing interval if any
this.stopPolling(key);
// Only start polling if tab is visible
if (this.isTabVisible) {
const interval = setInterval(callback, intervalMs);
this.intervals.set(key, interval);
}
}
stopPolling(key: string): void {
const interval = this.intervals.get(key);
if (interval) {
clearInterval(interval);
this.intervals.delete(key);
}
}
private pausePolling(): void {
// Store current intervals but clear them
for (const [key, interval] of this.intervals) {
clearInterval(interval);
}
}
private resumePolling(): void {
// This is a simplified approach - in practice you'd want to store the callback
// and interval info to properly resume. For now, components will handle this
// by checking visibility state when setting up polling.
}
cleanup(): void {
// Clean up all intervals
for (const interval of this.intervals.values()) {
clearInterval(interval);
}
this.intervals.clear();
// Remove event listener only in browser environment
if (typeof document !== 'undefined') {
document.removeEventListener('visibilitychange', this.visibilityChangeHandler);
}
}
isVisible(): boolean {
return this.isTabVisible;
}
}
// Export singleton instance
export const pollingManager = new PollingManager();
// Utility functions for time formatting
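// Timestamps from the API are epoch nanoseconds; divide by 1,000,000 to get milliseconds for JS Date.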
export function formatTime(epochNanos: number): string {
const date = new Date(epochNanos / 1000000);
const now = new Date();
const diffMs = now.getTime() - date.getTime();
if (diffMs < 60000) { // Less than 1 minute
return 'just now';
} else if (diffMs < 3600000) { // Less than 1 hour
const minutes = Math.floor(diffMs / 60000);
return `${minutes}m ago`;
} else if (diffMs < 86400000) { // Less than 1 day
const hours = Math.floor(diffMs / 3600000);
return `${hours}h ago`;
} else {
return date.toLocaleDateString();
}
}
export function formatDateTime(epochNanos: number): string {
const date = new Date(epochNanos / 1000000);
const dateStr = date.toLocaleDateString('en-US');
const timeStr = date.toLocaleTimeString('en-US', {
hour: 'numeric',
minute: '2-digit',
second: '2-digit',
hour12: true,
timeZoneName: 'short'
});
const millisStr = date.getMilliseconds().toString().padStart(3, '0');
// Insert milliseconds between seconds and AM/PM: "7/12/2025, 9:03:48.264 AM EST"
return `${dateStr}, ${timeStr.replace(/(\d{2})\s+(AM|PM)/, `$1.${millisStr} $2`)}`;
}
export function formatDuration(durationNanos?: number | null): string {
let durationMs = durationNanos ? durationNanos / 1000000 : null;
if (!durationMs || durationMs <= 0) {
return '—';
}
if (durationMs < 1000) {
return `${Math.round(durationMs)}ms`;
} else if (durationMs < 60000) {
return `${(durationMs / 1000).toFixed(1)}s`;
} else if (durationMs < 3600000) {
const minutes = Math.floor(durationMs / 60000);
const seconds = Math.floor((durationMs % 60000) / 1000);
return `${minutes}m ${seconds}s`;
} else {
const hours = Math.floor(durationMs / 3600000);
const minutes = Math.floor((durationMs % 3600000) / 60000);
return `${hours}h ${minutes}m`;
}
}
export function formatDate(epochNanos: number): string {
const date = new Date(epochNanos / 1000000);
return date.toLocaleDateString('en-US', {
month: 'short',
day: 'numeric',
year: 'numeric'
});
}

View file

@ -1,44 +0,0 @@
// Test file designed to fail TypeScript compilation with strict config
// These are the exact patterns that caused runtime failures in production
// Test 1: Reproduce original status.toLowerCase() failure
const mockResponseWithStatusObject = { status_code: 1, status_name: "COMPLETED" };
// This should cause compilation error: Property 'status' does not exist
const test1 = mockResponseWithStatusObject.status?.toLowerCase();
// Test 2: Reproduce original status?.status access failure
const test2 = mockResponseWithStatusObject.status?.status;
// Test 3: Optional field access without null check
interface PartitionSummaryTest {
last_updated?: number;
partition_ref: string;
}
const testPartition: PartitionSummaryTest = {
partition_ref: "test-partition"
};
// This should fail: accessing optional field without null check
const timestamp = testPartition.last_updated.toString();
// Test 4: Exact optional property types
interface StrictTest {
required: string;
optional?: string;
}
// This should fail with exactOptionalPropertyTypes
const testObj: StrictTest = {
required: "test",
optional: undefined // undefined not assignable to optional string
};
// Test 5: Array access without undefined handling
const testArray: string[] = ["a", "b", "c"];
const element: string = testArray[10]; // Should include undefined in type
// Test 6: Null access without proper checks
let possiblyNull: string | null = Math.random() > 0.5 ? "value" : null;
const upperCase = possiblyNull.toUpperCase(); // Should fail with strictNullChecks

View file

@ -1,69 +0,0 @@
#!/bin/bash
# Test script to verify strict TypeScript configuration catches expected failures
set -e
echo "Testing strict TypeScript configuration..."
# Find TypeScript compiler in runfiles
if [[ -n "${RUNFILES_DIR:-}" ]]; then
TSC="${RUNFILES_DIR}/_main/databuild/dashboard/node_modules/typescript/bin/tsc"
else
# Fallback for local execution
TSC="$(find . -name tsc -type f | head -1)"
if [[ -z "$TSC" ]]; then
echo "ERROR: Could not find TypeScript compiler"
exit 1
fi
fi
# Get paths relative to runfiles
if [[ -n "${RUNFILES_DIR:-}" ]]; then
TEST_DATA_DIR="${RUNFILES_DIR}/_main/databuild/dashboard/test-data"
TSCONFIG="${RUNFILES_DIR}/_main/databuild/dashboard/tsconfig_app.json"
else
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TEST_DATA_DIR="$SCRIPT_DIR/test-data"
TSCONFIG="$SCRIPT_DIR/tsconfig_app.json"
fi
# Function to test that TypeScript compilation fails with expected errors
test_compilation_failures() {
local test_file="$1"
local expected_errors="$2"
echo "Testing compilation failures for: $test_file"
# Run TypeScript compilation and capture output
if node "$TSC" --noEmit --strict --strictNullChecks --noImplicitAny --noImplicitReturns --noUncheckedIndexedAccess --exactOptionalPropertyTypes "$test_file" 2>&1; then
echo "ERROR: Expected TypeScript compilation to fail for $test_file, but it passed"
return 1
fi
# Check that we get the expected error patterns
local tsc_output=$(node "$TSC" --noEmit --strict --strictNullChecks --noImplicitAny --noImplicitReturns --noUncheckedIndexedAccess --exactOptionalPropertyTypes "$test_file" 2>&1 || true)
IFS='|' read -ra ERROR_PATTERNS <<< "$expected_errors"
for pattern in "${ERROR_PATTERNS[@]}"; do
if ! echo "$tsc_output" | grep -q "$pattern"; then
echo "ERROR: Expected error pattern '$pattern' not found in TypeScript output"
echo "Actual output:"
echo "$tsc_output"
return 1
fi
done
echo "✓ Compilation correctly failed with expected errors"
}
# Test 1: Verify strict config catches undefined property access
test_compilation_failures "$TEST_DATA_DIR/strict-config-failures.ts" "Property 'status' does not exist|is possibly 'undefined'|Type 'undefined' is not assignable"
echo "All strict TypeScript configuration tests passed!"
echo ""
echo "Summary of what strict config catches:"
echo "- ✓ Undefined property access (status.toLowerCase() failures)"
echo "- ✓ Optional field access without null checks"
echo "- ✓ Exact optional property type mismatches"
echo "- ✓ Array access without undefined handling"
echo "- ✓ Null/undefined access without proper checks"

View file

@ -1,318 +0,0 @@
// Phase 3.5: Unit tests for transformation functions
// These tests verify that transformation functions prevent the observed runtime failures
import o from 'ospec';
import {
BuildSummary,
BuildDetailResponse,
PartitionSummary,
JobSummary,
ActivityResponse,
BuildRequestStatus
} from '../client/typescript_generated/src/index';
// Import types directly since we're now in the same ts_project
import {
DashboardActivity,
DashboardBuild,
DashboardPartition,
DashboardJob,
isDashboardActivity,
isDashboardBuild,
isDashboardPartition,
isDashboardJob
} from './types';
// Mock transformation functions for testing (since they're not exported from services.ts)
function transformBuildSummary(apiResponse: BuildSummary): DashboardBuild {
return {
build_request_id: apiResponse.build_request_id,
status: apiResponse.status!,
requested_partitions: apiResponse.requested_partitions, // Keep as PartitionRef array
total_jobs: apiResponse.total_jobs,
completed_jobs: apiResponse.completed_jobs,
failed_jobs: apiResponse.failed_jobs,
cancelled_jobs: apiResponse.cancelled_jobs,
requested_at: apiResponse.requested_at,
started_at: apiResponse.started_at ?? null,
completed_at: apiResponse.completed_at ?? null,
duration_ms: apiResponse.duration_ms ?? null,
cancelled: apiResponse.cancelled,
};
}
function transformBuildDetail(apiResponse: BuildDetailResponse): DashboardBuild {
return {
build_request_id: apiResponse.build_request_id,
status: apiResponse.status!,
requested_partitions: apiResponse.requested_partitions, // Keep as PartitionRef array
total_jobs: apiResponse.total_jobs,
completed_jobs: apiResponse.completed_jobs,
failed_jobs: apiResponse.failed_jobs,
cancelled_jobs: apiResponse.cancelled_jobs,
requested_at: apiResponse.requested_at,
started_at: apiResponse.started_at ?? null,
completed_at: apiResponse.completed_at ?? null,
duration_ms: apiResponse.duration_ms ?? null,
cancelled: apiResponse.cancelled,
};
}
function transformPartitionSummary(apiResponse: any): DashboardPartition {
return {
partition_ref: apiResponse.partition_ref, // Keep as PartitionRef object
status_code: apiResponse.status_code,
status_name: apiResponse.status_name,
last_updated: apiResponse.last_updated ?? null,
build_requests: apiResponse.build_requests || [],
};
}
function transformJobSummary(apiResponse: JobSummary): DashboardJob {
return {
job_label: apiResponse.job_label,
total_runs: apiResponse.total_runs,
successful_runs: apiResponse.successful_runs,
failed_runs: apiResponse.failed_runs,
cancelled_runs: apiResponse.cancelled_runs,
last_run_timestamp: apiResponse.last_run_timestamp,
last_run_status_code: apiResponse.last_run_status_code,
last_run_status_name: apiResponse.last_run_status_name,
average_partitions_per_run: apiResponse.average_partitions_per_run,
recent_builds: apiResponse.recent_builds || [],
};
}
function transformActivityResponse(apiResponse: ActivityResponse): DashboardActivity {
return {
active_builds_count: apiResponse.active_builds_count,
recent_builds: apiResponse.recent_builds.map(transformBuildSummary),
recent_partitions: apiResponse.recent_partitions.map(transformPartitionSummary),
total_partitions_count: apiResponse.total_partitions_count,
system_status: apiResponse.system_status,
graph_name: apiResponse.graph_name,
};
}
// Test Data Mocks
const mockBuildSummary: BuildSummary = {
build_request_id: 'build-123',
status: {code: 4, name: 'COMPLETED'},
requested_partitions: [{ str: 'partition-1' }, { str: 'partition-2' }],
total_jobs: 5,
completed_jobs: 5,
failed_jobs: 0,
cancelled_jobs: 0,
requested_at: 1640995200000000000, // 2022-01-01 00:00:00 UTC in nanos
started_at: 1640995260000000000, // 2022-01-01 00:01:00 UTC in nanos
completed_at: 1640995320000000000, // 2022-01-01 00:02:00 UTC in nanos
duration_ms: 60000, // 1 minute
cancelled: false
};
const mockPartitionSummary: any = {
partition_ref: { str: 'test-partition' },
status_code: 4, // PARTITION_AVAILABLE
status_name: 'AVAILABLE',
last_updated: 1640995200000000000,
builds_count: 3,
invalidation_count: 0,
build_requests: ['build-123', 'build-124'],
last_successful_build: 'build-123'
};
const mockJobSummary: JobSummary = {
job_label: '//:test-job',
total_runs: 10,
successful_runs: 9,
failed_runs: 1,
cancelled_runs: 0,
average_partitions_per_run: 2.5,
last_run_timestamp: 1640995200000000000,
last_run_status_code: 3, // JOB_COMPLETED
last_run_status_name: 'COMPLETED',
recent_builds: ['build-123', 'build-124']
};
const mockActivityResponse: ActivityResponse = {
active_builds_count: 2,
recent_builds: [mockBuildSummary],
recent_partitions: [mockPartitionSummary],
total_partitions_count: 100,
system_status: 'healthy',
graph_name: 'test-graph'
};
// Test Suite
o.spec('Transformation Functions', () => {
o('transformBuildSummary handles status fields correctly', () => {
const result = transformBuildSummary(mockBuildSummary);
// The key fix: status_name should be a string, status_code a number
o(typeof result.status?.code).equals('number');
o(typeof result.status?.name).equals('string');
o(result.status.name).equals('COMPLETED');
// This should not throw (preventing the original runtime error)
o(() => result.status.name.toLowerCase()).notThrows('status_name.toLowerCase should work');
});
o('transformBuildSummary handles null optional fields', () => {
const buildWithNulls: BuildSummary = {
...mockBuildSummary,
started_at: null,
completed_at: null,
duration_ms: null
};
const result = transformBuildSummary(buildWithNulls);
// Explicit null handling prevents undefined property access
o(result.started_at).equals(null);
o(result.completed_at).equals(null);
o(result.duration_ms).equals(null);
});
o('transformPartitionSummary preserves PartitionRef objects correctly', () => {
const result = transformPartitionSummary(mockPartitionSummary);
// The key fix: partition_ref should remain as PartitionRef object
o(typeof result.partition_ref).equals('object');
o(result.partition_ref.str).equals('test-partition');
// This should not throw (preventing original runtime errors)
o(() => result.partition_ref.str.toLowerCase()).notThrows('partition_ref.str.toLowerCase should work');
});
o('transformPartitionSummary handles missing arrays safely', () => {
const partitionWithoutArray: any = {
...mockPartitionSummary
};
delete partitionWithoutArray.build_requests;
const result = transformPartitionSummary(partitionWithoutArray);
// Should default to empty array, preventing length/iteration errors
o(Array.isArray(result.build_requests)).equals(true);
o(result.build_requests.length).equals(0);
});
o('transformJobSummary handles status fields correctly', () => {
const result = transformJobSummary(mockJobSummary);
// The key fix: both status code and name should be preserved
o(typeof result.last_run_status_code).equals('number');
o(typeof result.last_run_status_name).equals('string');
o(result.last_run_status_name).equals('COMPLETED');
// This should not throw
o(() => result.last_run_status_name.toLowerCase()).notThrows('last_run_status_name.toLowerCase should work');
});
o('transformActivityResponse maintains structure consistency', () => {
const result = transformActivityResponse(mockActivityResponse);
// Should pass our type guard
o(isDashboardActivity(result)).equals(true);
// All nested objects should be properly transformed
o(result.recent_builds.length).equals(1);
o(typeof result.recent_builds[0]?.status.name).equals('string');
o(result.recent_partitions.length).equals(1);
o(typeof result.recent_partitions[0]?.partition_ref).equals('object');
o(typeof result.recent_partitions[0]?.partition_ref.str).equals('string');
});
o('transformations prevent original runtime failures', () => {
const result = transformActivityResponse(mockActivityResponse);
// These are the exact patterns that caused runtime failures:
// 1. status_name.toLowerCase() - should not crash
result.recent_builds.forEach((build: DashboardBuild) => {
o(() => build.status.name.toLowerCase()).notThrows('build.status.name.toLowerCase should work');
o(build.status.name.toLowerCase()).equals('completed');
});
// 2. partition_ref.str access - should access string property
result.recent_partitions.forEach((partition: DashboardPartition) => {
o(typeof partition.partition_ref).equals('object');
o(typeof partition.partition_ref.str).equals('string');
o(() => partition.partition_ref.str.toLowerCase()).notThrows('partition.partition_ref.str.toLowerCase should work');
});
// 3. Null/undefined handling - should be explicit
result.recent_builds.forEach((build: DashboardBuild) => {
// These fields can be null but never undefined
o(build.started_at === null || typeof build.started_at === 'number').equals(true);
o(build.completed_at === null || typeof build.completed_at === 'number').equals(true);
o(build.duration_ms === null || typeof build.duration_ms === 'number').equals(true);
});
});
});
// Edge Cases and Error Conditions
o.spec('Transformation Edge Cases', () => {
o('handles empty arrays correctly', () => {
const emptyActivity: ActivityResponse = {
...mockActivityResponse,
recent_builds: [],
recent_partitions: []
};
const result = transformActivityResponse(emptyActivity);
o(Array.isArray(result.recent_builds)).equals(true);
o(result.recent_builds.length).equals(0);
o(Array.isArray(result.recent_partitions)).equals(true);
o(result.recent_partitions.length).equals(0);
});
o('handles malformed PartitionRef gracefully', () => {
const malformedPartition: any = {
...mockPartitionSummary,
partition_ref: { str: '' } // Empty string
};
const result = transformPartitionSummary(malformedPartition);
o(typeof result.partition_ref.str).equals('string');
o(result.partition_ref.str).equals('');
});
o('transformations produce valid dashboard types', () => {
// Test that all transformation results pass type guards
const transformedBuild = transformBuildSummary(mockBuildSummary);
const transformedPartition = transformPartitionSummary(mockPartitionSummary);
const transformedJob = transformJobSummary(mockJobSummary);
const transformedActivity = transformActivityResponse(mockActivityResponse);
o(isDashboardBuild(transformedBuild)).equals(true);
o(isDashboardPartition(transformedPartition)).equals(true);
o(isDashboardJob(transformedJob)).equals(true);
o(isDashboardActivity(transformedActivity)).equals(true);
});
});
// Performance and Memory Tests
o.spec('Transformation Performance', () => {
o('transforms large datasets efficiently', () => {
const largeActivity: ActivityResponse = {
...mockActivityResponse,
recent_builds: Array(1000).fill(mockBuildSummary),
recent_partitions: Array(1000).fill(mockPartitionSummary)
};
const start = Date.now();
const result = transformActivityResponse(largeActivity);
const duration = Date.now() - start;
// Should complete transformation in reasonable time
o(duration < 1000).equals(true); // Less than 1 second
o(result.recent_builds.length).equals(1000);
o(result.recent_partitions.length).equals(1000);
});
});
// Export default removed - tests are run by importing this file

View file

@ -1,21 +0,0 @@
{
"compilerOptions": {
"target": "es2016", /* Set the JavaScript language version for emitted JavaScript and include compatible library declarations. */
"lib": ["es6","dom", "es2021"], /* Specify a set of bundled library declaration files that describe the target runtime environment. */
"module": "commonjs", /* Specify what module code is generated. */
"rootDir": "./", /* Specify the root folder within your source files. */
"moduleResolution": "node", /* Specify how TypeScript looks up a file from a given module specifier. */
"resolveJsonModule": true, /* Enable importing .json files. */
"allowJs": true, /* Allow JavaScript files to be a part of your program. Use the 'checkJS' option to get errors from these files. */
"inlineSourceMap": true, /* Include sourcemap files inside the emitted JavaScript. */
"esModuleInterop": true, /* Emit additional JavaScript to ease support for importing CommonJS modules. This enables 'allowSyntheticDefaultImports' for type compatibility. */
"forceConsistentCasingInFileNames": true, /* Ensure that casing is correct in imports. */
"strict": true, /* Enable all strict type-checking options. */
"noImplicitAny": true, /* Enable error reporting for expressions and declarations with an implied 'any' type. */
"strictNullChecks": true, /* Enable error reporting for null and undefined values. */
"noImplicitReturns": true, /* Enable error reporting for codepaths that do not explicitly return. */
"noUncheckedIndexedAccess": true, /* Add 'undefined' to index signature results. */
"exactOptionalPropertyTypes": true, /* Ensure optional property types are exact. */
"skipLibCheck": true /* Skip type checking all .d.ts files. */
}
}

View file

@ -1,22 +0,0 @@
{
"compilerOptions": {
"target": "es2016", /* Set the JavaScript language version for emitted JavaScript and include compatible library declarations. */
"lib": ["es6","dom"], /* Specify a set of bundled library declaration files that describe the target runtime environment. */
"module": "commonjs", /* Specify what module code is generated. */
"rootDir": "./", /* Specify the root folder within your source files. */
"moduleResolution": "node", /* Specify how TypeScript looks up a file from a given module specifier. */
"baseUrl": "./", /* Specify the base directory to resolve non-relative module names. */
"resolveJsonModule": true, /* Enable importing .json files. */
"allowJs": true, /* Allow JavaScript files to be a part of your program. Use the 'checkJS' option to get errors from these files. */
"inlineSourceMap": true, /* Include sourcemap files inside the emitted JavaScript. */
"esModuleInterop": true, /* Emit additional JavaScript to ease support for importing CommonJS modules. This enables 'allowSyntheticDefaultImports' for type compatibility. */
"forceConsistentCasingInFileNames": true, /* Ensure that casing is correct in imports. */
"strict": true, /* Enable all strict type-checking options. */
"noImplicitAny": true, /* Enable error reporting for expressions and declarations with an implied 'any' type. */
"strictNullChecks": true, /* Enable error reporting for null and undefined values. */
"noImplicitReturns": true, /* Enable error reporting for codepaths that do not explicitly return. */
"noUncheckedIndexedAccess": true, /* Add 'undefined' to index signature results. */
"exactOptionalPropertyTypes": true, /* Ensure optional property types are exact. */
"skipLibCheck": true /* Skip type checking all .d.ts files. */
}
}

View file

@ -1,285 +0,0 @@
import m from 'mithril';
import {
ActivityResponse,
ActivityApiResponse,
BuildSummary,
BuildDetailResponse,
PartitionSummary,
PartitionDetailResponse,
PartitionEventsResponse,
JobSummary,
JobMetricsResponse,
JobDailyStats,
JobRunSummary,
PartitionRef,
BuildRequestStatus
} from '../client/typescript_generated/src/index';
// Dashboard-optimized types - canonical frontend types independent of backend schema
// These types prevent runtime errors by ensuring consistent data shapes throughout components
export interface DashboardBuild {
build_request_id: string;
status: BuildRequestStatus;
requested_partitions: PartitionRef[];
total_jobs: number;
completed_jobs: number;
failed_jobs: number;
cancelled_jobs: number;
requested_at: number;
started_at: number | null;
completed_at: number | null;
duration_ms: number | null;
cancelled: boolean;
}
export interface DashboardPartition {
partition_ref: PartitionRef;
status_code: number;
status_name: string;
last_updated: number | null;
build_requests: string[];
}
export interface DashboardJob {
job_label: string;
total_runs: number;
successful_runs: number;
failed_runs: number;
cancelled_runs: number;
last_run_timestamp: number;
last_run_status_code: number;
last_run_status_name: string;
average_partitions_per_run: number;
recent_builds: string[];
}
export interface DashboardActivity {
active_builds_count: number;
recent_builds: DashboardBuild[];
recent_partitions: DashboardPartition[];
total_partitions_count: number;
system_status: string;
graph_name: string;
}
// Dashboard timeline event types for consistent UI handling
export interface DashboardBuildTimelineEvent {
timestamp: number;
status: BuildRequestStatus;
message: string;
event_type: string;
cancel_reason?: string;
}
export interface DashboardPartitionTimelineEvent {
timestamp: number;
status: BuildRequestStatus;
message: string;
build_request_id: string;
job_run_id?: string;
}
// Generic typed component interface that extends Mithril's component
// Uses intersection type to allow arbitrary properties while ensuring type safety for lifecycle methods
export interface TypedComponent<TAttrs = {}> extends Record<string, any> {
oninit?(vnode: m.Vnode<TAttrs>): void;
oncreate?(vnode: m.VnodeDOM<TAttrs>): void;
onupdate?(vnode: m.VnodeDOM<TAttrs>): void;
onbeforeremove?(vnode: m.VnodeDOM<TAttrs>): Promise<any> | void;
onremove?(vnode: m.VnodeDOM<TAttrs>): void;
onbeforeupdate?(vnode: m.Vnode<TAttrs>, old: m.VnodeDOM<TAttrs>): boolean | void;
view(vnode: m.Vnode<TAttrs>): m.Children;
}
// Helper type for typed vnodes
export type TypedVnode<TAttrs = {}> = m.Vnode<TAttrs>;
export type TypedVnodeDOM<TAttrs = {}> = m.VnodeDOM<TAttrs>;
// Route parameter types
export interface RouteParams {
[key: string]: string;
}
export interface BuildRouteParams extends RouteParams {
id: string;
}
export interface PartitionRouteParams extends RouteParams {
base64_ref: string;
}
export interface JobRouteParams extends RouteParams {
label: string;
}
// Component attribute interfaces that reference OpenAPI types
export interface RecentActivityAttrs {
// No external attrs needed - component manages its own data loading
}
export interface BuildStatusAttrs {
id: string;
}
export interface PartitionStatusAttrs {
base64_ref: string;
}
export interface PartitionsListAttrs {
// No external attrs needed - component manages its own data loading
}
export interface JobsListAttrs {
// No external attrs needed - component manages its own data loading
}
export interface JobMetricsAttrs {
label: string;
}
export interface GraphAnalysisAttrs {
// No external attrs needed for now
}
// Badge component attribute interfaces with OpenAPI type constraints
export interface BuildStatusBadgeAttrs {
status: string; // This should be constrained to BuildSummary status values
size?: 'xs' | 'sm' | 'md' | 'lg';
class?: string;
}
export interface PartitionStatusBadgeAttrs {
status: string; // This should be constrained to PartitionSummary status values
size?: 'xs' | 'sm' | 'md' | 'lg';
class?: string;
}
export interface EventTypeBadgeAttrs {
eventType: string; // This should be constrained to known event types
size?: 'xs' | 'sm' | 'md' | 'lg';
class?: string;
}
// Layout wrapper attributes
export interface LayoutWrapperAttrs {
// Layout wrapper will pass through attributes to wrapped component
[key: string]: any;
}
// Data types for component state (using Dashboard types for consistency)
export interface RecentActivityData {
data: DashboardActivity | null;
loading: boolean;
error: string | null;
}
export interface BuildStatusData {
data: DashboardBuild | null;
partitionStatuses: Map<string, DashboardPartition>; // Key is partition_ref.str
timeline: DashboardBuildTimelineEvent[];
loading: boolean;
error: string | null;
buildId: string;
}
export interface PartitionStatusData {
data: DashboardPartition | null;
timeline: DashboardPartitionTimelineEvent[];
loading: boolean;
error: string | null;
partitionRef: string;
buildHistory: DashboardBuild[];
}
export interface JobsListData {
jobs: DashboardJob[];
searchTerm: string;
loading: boolean;
error: string | null;
searchTimeout: NodeJS.Timeout | null;
}
export interface JobMetricsData {
jobLabel: string;
job: DashboardJob | null;
loading: boolean;
error: string | null;
}
// Utility type for creating typed components
export type CreateTypedComponent<TAttrs> = TypedComponent<TAttrs>;
/*
## Dashboard Type Transformation Rationale
The dashboard types provide a stable interface between the OpenAPI-generated types and UI components:
1. **Explicit Null Handling**: Protobuf optional fields become `T | null` instead of `T | undefined`
to ensure consistent null checking throughout the application.
2. **Type Safety**: Keep protobuf structure (PartitionRef objects, status codes) to maintain
type safety from backend to frontend. Only convert to display strings in components.
3. **Clear Boundaries**: Dashboard types are the contract between services and components.
Services handle API responses, components handle presentation.
Key principles:
- Preserve protobuf structure for type safety
- Explicit null handling for optional fields
- Convert to display strings only at the UI layer
- Consistent types prevent runtime errors
*/
// Type guards and validators for Dashboard types
export function isDashboardActivity(data: any): data is DashboardActivity {
return data &&
typeof data.active_builds_count === 'number' &&
typeof data.graph_name === 'string' &&
Array.isArray(data.recent_builds) &&
Array.isArray(data.recent_partitions) &&
typeof data.system_status === 'string' &&
typeof data.total_partitions_count === 'number';
}
export function isDashboardBuild(data: any): data is DashboardBuild {
return data &&
typeof data.build_request_id === 'string' &&
typeof data.status?.code === 'number' &&
typeof data.status?.name === 'string' &&
typeof data.requested_at === 'number' &&
Array.isArray(data.requested_partitions);
}
export function isDashboardPartition(data: any): data is DashboardPartition {
return data &&
data.partition_ref &&
typeof data.partition_ref.str === 'string' &&
typeof data.status_code === 'number' &&
typeof data.status_name === 'string' &&
(data.last_updated === null || typeof data.last_updated === 'number') &&
Array.isArray(data.build_requests);
}
export function isDashboardJob(data: any): data is DashboardJob {
return data &&
typeof data.job_label === 'string' &&
typeof data.total_runs === 'number' &&
typeof data.last_run_status_code === 'number' &&
typeof data.last_run_status_name === 'string' &&
Array.isArray(data.recent_builds);
}
// Helper function to create type-safe Mithril components
export function createTypedComponent<TAttrs>(
component: TypedComponent<TAttrs>
): m.Component<TAttrs> {
return component as m.Component<TAttrs>;
}
// Helper for type-safe route handling
export function getTypedRouteParams<T extends RouteParams>(vnode: m.Vnode<T>): T {
return vnode.attrs;
}

View file

@ -1,52 +0,0 @@
import o from 'ospec';
// Inline the utils functions for testing since we can't import from the app module in tests
function encodePartitionRef(ref: string): string {
return btoa(ref).replace(/\+/g, '-').replace(/\//g, '_').replace(/=/g, '');
}
function decodePartitionRef(encoded: string): string {
// Add padding if needed
const padding = '='.repeat((4 - (encoded.length % 4)) % 4);
const padded = encoded.replace(/-/g, '+').replace(/_/g, '/') + padding;
return atob(padded);
}
o.spec('URL Encoding Utils', () => {
o('should encode and decode partition references correctly', () => {
const testCases = [
'simple/partition',
'complex/partition/with/slashes',
'partition+with+plus',
'partition=with=equals',
'partition_with_underscores',
'partition-with-dashes',
'partition/with/mixed+symbols=test_case-123',
];
testCases.forEach(original => {
const encoded = encodePartitionRef(original);
const decoded = decodePartitionRef(encoded);
o(decoded).equals(original)(`Failed for: ${original}`);
// Encoded string should be URL-safe (no +, /, or = characters)
o(encoded.includes('+')).equals(false)(`Encoded string contains +: ${encoded}`);
o(encoded.includes('/')).equals(false)(`Encoded string contains /: ${encoded}`);
o(encoded.includes('=')).equals(false)(`Encoded string contains =: ${encoded}`);
});
});
o('should handle empty string', () => {
const encoded = encodePartitionRef('');
const decoded = decodePartitionRef(encoded);
o(decoded).equals('');
});
o('should handle special characters', () => {
const special = 'test/path?query=value&other=123#fragment';
const encoded = encodePartitionRef(special);
const decoded = decodePartitionRef(encoded);
o(decoded).equals(special);
});
});

View file

@ -1,108 +0,0 @@
// URL encoding utilities for partition references
export function encodePartitionRef(ref: string): string {
return btoa(ref).replace(/\+/g, '-').replace(/\//g, '_').replace(/=/g, '');
}
export function decodePartitionRef(encoded: string): string {
// Add padding if needed
const padding = '='.repeat((4 - (encoded.length % 4)) % 4);
const padded = encoded.replace(/-/g, '+').replace(/_/g, '/') + padding;
return atob(padded);
}
// Job label encoding utilities (same pattern as partition refs)
export function encodeJobLabel(label: string): string {
return btoa(label).replace(/\+/g, '-').replace(/\//g, '_').replace(/=/g, '');
}
export function decodeJobLabel(encoded: string): string {
// Add padding if needed
const padding = '='.repeat((4 - (encoded.length % 4)) % 4);
const padded = encoded.replace(/-/g, '+').replace(/_/g, '/') + padding;
return atob(padded);
}
import m from 'mithril';
import {
TypedComponent,
BuildStatusBadgeAttrs,
PartitionStatusBadgeAttrs,
EventTypeBadgeAttrs,
createTypedComponent
} from './types';
// Mithril components for status badges - encapsulates both logic and presentation
export const BuildStatusBadge: TypedComponent<BuildStatusBadgeAttrs> = {
view(vnode: m.Vnode<BuildStatusBadgeAttrs>) {
const { status, size = 'sm', class: className, ...attrs } = vnode.attrs;
const normalizedStatus = status.toLowerCase();
let badgeClass = 'badge-neutral';
if (normalizedStatus.includes('completed')) {
badgeClass = 'badge-success';
} else if (normalizedStatus.includes('executing') || normalizedStatus.includes('planning')) {
badgeClass = 'badge-warning';
} else if (normalizedStatus.includes('received')) {
badgeClass = 'badge-info';
} else if (normalizedStatus.includes('failed') || normalizedStatus.includes('cancelled')) {
badgeClass = 'badge-error';
}
return m(`span.badge.badge-${size}.${badgeClass}`, { class: className, ...attrs }, status);
}
};
export const PartitionStatusBadge: TypedComponent<PartitionStatusBadgeAttrs> = {
view(vnode: m.Vnode<PartitionStatusBadgeAttrs>) {
const { status, size = 'sm', class: className, ...attrs } = vnode.attrs;
if (!status) {
return m(`span.badge.badge-${size}.badge-neutral`, { class: className, ...attrs }, 'Unknown');
}
const normalizedStatus = status.toLowerCase();
let badgeClass = 'badge-neutral';
if (normalizedStatus.includes('available')) {
badgeClass = 'badge-success';
} else if (normalizedStatus.includes('building') || normalizedStatus.includes('analyzed')) {
badgeClass = 'badge-warning';
} else if (normalizedStatus.includes('requested') || normalizedStatus.includes('delegated')) {
badgeClass = 'badge-info';
} else if (normalizedStatus.includes('failed')) {
badgeClass = 'badge-error';
}
return m(`span.badge.badge-${size}.${badgeClass}`, { class: className, ...attrs }, status);
}
};
export const EventTypeBadge: TypedComponent<EventTypeBadgeAttrs> = {
view(vnode: m.Vnode<EventTypeBadgeAttrs>) {
const { eventType, size = 'sm', class: className, ...attrs } = vnode.attrs;
let badgeClass = 'badge-ghost';
let displayName = eventType;
switch (eventType) {
case 'build_request':
badgeClass = 'badge-primary';
displayName = 'Build';
break;
case 'job':
badgeClass = 'badge-secondary';
displayName = 'Job';
break;
case 'partition':
badgeClass = 'badge-accent';
displayName = 'Partition';
break;
case 'delegation':
badgeClass = 'badge-info';
displayName = 'Delegation';
break;
}
return m(`span.badge.badge-${size}.${badgeClass}`, { class: className, ...attrs }, displayName);
}
};

File diff suppressed because it is too large

View file

@ -1,29 +0,0 @@
py_library(
name = "dsl",
srcs = ["dsl.py"],
visibility = ["//visibility:public"],
deps = [
"//databuild:py_proto",
],
)
py_library(
name = "generator_lib",
srcs = ["generator_lib.py"],
visibility = ["//visibility:public"],
deps = [
":dsl",
"//databuild:py_proto",
],
)
py_binary(
name = "generator",
srcs = ["generator.py"],
data = ["dsl_job_wrapper.py"],
main = "generator.py",
visibility = ["//visibility:public"],
deps = [
":generator_lib",
],
)

View file

@ -1,431 +0,0 @@
from databuild.proto import JobConfig, PartitionRef, DataDep, DepType
from typing import Self, Protocol, get_type_hints, get_origin, get_args
from dataclasses import fields, is_dataclass, dataclass, field
import re
class PartitionPattern:
_raw_pattern: str
@property
def _pattern(self) -> re.Pattern:
return re.compile(self._raw_pattern)
def _validate_pattern(self):
"""Checks that both conditions are met:
1. All fields from the PartitionFields type are present in the pattern
2. All fields from the pattern are present in the PartitionFields type
"""
# TODO how do I get this to be called?
assert is_dataclass(self), "PartitionPattern subclasses must also be dataclasses (to define partition fields)"
pattern_fields = set(self._pattern.groupindex.keys())
partition_fields = {field.name for field in fields(self)}
if pattern_fields != partition_fields:
raise ValueError(f"Pattern fields {pattern_fields} do not match partition fields {partition_fields}")
@classmethod
def deserialize(cls, raw_value: str) -> Self:
"""Parses a partition from a string based on the defined pattern."""
# Compile the class-level raw pattern so we can match against it
pattern = re.compile(cls._raw_pattern)
# Match the raw value against the pattern
match = pattern.match(raw_value)
if not match:
raise ValueError(f"String '{raw_value}' does not match pattern '{cls._raw_pattern}'")
# Extract the field values from the match
field_values = match.groupdict()
# Create and return a new instance with the extracted values
return cls(**field_values)
def serialize(self) -> str:
"""Returns a string representation by filling in the pattern template with field values."""
# Start with the pattern
result = self._raw_pattern
# Replace each named group in the pattern with its corresponding field value
for field in fields(self):
# Find the named group pattern and replace it with the actual value
# We need to replace the regex pattern with the actual value
# Look for the pattern (?P<field_name>...) and replace with the field value
pattern_to_replace = rf'\(\?P<{field.name}>[^)]+\)'
actual_value = getattr(self, field.name)
result = re.sub(pattern_to_replace, actual_value, result)
return result
class DataBuildJob(Protocol):
# The types of partitions that this job produces
output_types: list[type[PartitionPattern]]
def config(self, outputs: list[PartitionPattern]) -> list[JobConfig]: ...
def exec(self, *args: str) -> None: ...
class DataBuildGraph:
def __init__(self, label: str):
self.label = label
self.lookup = {}
def job(self, cls: type[DataBuildJob]) -> type[DataBuildJob]:
"""Register a job with the graph."""
for partition in cls.output_types:
assert partition not in self.lookup, f"Partition `{partition}` already registered"
self.lookup[partition] = cls
return cls
def generate_bazel_module(self):
"""Generates a complete databuild application, packaging up referenced jobs and this graph via bazel targets"""
raise NotImplementedError
def generate_bazel_package(self, name: str, output_dir: str, deps: list = None) -> None:
"""Generate BUILD.bazel and binaries into a generated/ subdirectory.
Args:
name: Base name for the generated graph (without .generate suffix)
output_dir: Directory to write generated files to (will create generated/ subdir)
deps: List of Bazel dependency labels to use in generated BUILD.bazel
"""
import os
import shutil
# Create generated/ subdirectory
generated_dir = os.path.join(output_dir, "generated")
os.makedirs(generated_dir, exist_ok=True)
# Generate BUILD.bazel with job and graph targets
self._generate_build_bazel(generated_dir, name, deps or [])
# Generate individual job scripts (instead of shared wrapper)
self._generate_job_scripts(generated_dir)
# Generate job lookup binary
self._generate_job_lookup(generated_dir, name)
package_name = self._get_package_name()
print(f"Generated DataBuild package '{name}' in {generated_dir}")
if package_name != "UNKNOWN_PACKAGE":
print(f"Run 'bazel build \"@databuild//{package_name}/generated:{name}_graph.analyze\"' to use the generated graph")
else:
print(f"Run 'bazel build generated:{name}_graph.analyze' to use the generated graph")
def _generate_build_bazel(self, output_dir: str, name: str, deps: list) -> None:
"""Generate BUILD.bazel with databuild_job and databuild_graph targets."""
import os
# Get job classes from the lookup table
job_classes = sorted(set(self.lookup.values()), key=lambda cls: cls.__name__)
# Format deps for BUILD.bazel
if deps:
deps_str = ", ".join([f'"{dep}"' for dep in deps])
else:
# Fallback to parent package if no deps provided
parent_package = self._get_package_name()
deps_str = f'"//{parent_package}:dsl_src"'
# Generate py_binary targets for each job
job_binaries = []
job_targets = []
for job_class in job_classes:
job_name = self._snake_case(job_class.__name__)
binary_name = f"{job_name}_binary"
job_targets.append(f'"{job_name}"')
job_script_name = f"{job_name}.py"
job_binaries.append(f'''py_binary(
name = "{binary_name}",
srcs = ["{job_script_name}"],
main = "{job_script_name}",
deps = [{deps_str}],
)
databuild_job(
name = "{job_name}",
binary = ":{binary_name}",
)''')
# Generate the complete BUILD.bazel content
build_content = f'''load("@databuild//databuild:rules.bzl", "databuild_job", "databuild_graph")
# Generated by DataBuild DSL - do not edit manually
# This file is generated in a subdirectory to avoid overwriting the original BUILD.bazel
{chr(10).join(job_binaries)}
py_binary(
name = "{name}_job_lookup",
srcs = ["{name}_job_lookup.py"],
deps = [{deps_str}],
)
databuild_graph(
name = "{name}_graph",
jobs = [{", ".join(job_targets)}],
lookup = ":{name}_job_lookup",
visibility = ["//visibility:public"],
)
# Create tar archive of generated files for testing
genrule(
name = "existing_generated",
srcs = glob(["*.py", "BUILD.bazel"]),
outs = ["existing_generated.tar"],
cmd = "mkdir -p temp && cp $(SRCS) temp/ && find temp -exec touch -t 197001010000 {{}} + && tar -cf $@ -C temp .",
visibility = ["//visibility:public"],
)
'''
with open(os.path.join(output_dir, "BUILD.bazel"), "w") as f:
f.write(build_content)
def _generate_job_scripts(self, output_dir: str) -> None:
"""Generate individual Python scripts for each job class."""
import os
# Get job classes and generate a script for each one
job_classes = list(set(self.lookup.values()))
graph_module_path = self._get_graph_module_path()
for job_class in job_classes:
job_name = self._snake_case(job_class.__name__)
script_name = f"{job_name}.py"
script_content = f'''#!/usr/bin/env python3
"""
Generated job script for {job_class.__name__}.
"""
import sys
import json
from {graph_module_path} import {job_class.__name__}
from databuild.proto import PartitionRef, JobConfigureResponse, to_dict
def parse_outputs_from_args(args: list[str]) -> list:
"""Parse partition output references from command line arguments."""
outputs = []
for arg in args:
# Find which output type can deserialize this partition reference
for output_type in {job_class.__name__}.output_types:
try:
partition = output_type.deserialize(arg)
outputs.append(partition)
break
except ValueError:
continue
else:
raise ValueError(f"No output type in {job_class.__name__} can deserialize partition ref: {{arg}}")
return outputs
if __name__ == "__main__":
if len(sys.argv) < 2:
raise Exception(f"Invalid command usage")
command = sys.argv[1]
job_instance = {job_class.__name__}()
if command == "config":
# Parse output partition references as PartitionRef objects (for Rust wrapper)
output_refs = [PartitionRef(str=raw_ref) for raw_ref in sys.argv[2:]]
# Also parse them into DSL partition objects (for DSL job.config())
outputs = parse_outputs_from_args(sys.argv[2:])
# Call job's config method - returns list[JobConfig]
configs = job_instance.config(outputs)
# Wrap in JobConfigureResponse and serialize using to_dict()
response = JobConfigureResponse(configs=configs)
print(json.dumps(to_dict(response)))
elif command == "exec":
# The exec method expects a JobConfig but the Rust wrapper passes args
# For now, let the DSL job handle the args directly
# TODO: This needs to be refined based on actual Rust wrapper interface
job_instance.exec(*sys.argv[2:])
else:
raise Exception(f"Invalid command `{{sys.argv[1]}}`")
'''
script_path = os.path.join(output_dir, script_name)
with open(script_path, "w") as f:
f.write(script_content)
# Make it executable
os.chmod(script_path, 0o755)
def _generate_job_lookup(self, output_dir: str, name: str) -> None:
"""Generate job lookup binary that maps partition patterns to job targets."""
import os
# Build the job lookup mappings with full package paths
package_name = self._get_package_name()
lookup_mappings = []
for partition_type, job_class in self.lookup.items():
job_name = self._snake_case(job_class.__name__)
pattern = partition_type._raw_pattern
full_target = f"//{package_name}/generated:{job_name}"
lookup_mappings.append(f' r"{pattern}": "{full_target}",')
lookup_content = f'''#!/usr/bin/env python3
"""
Generated job lookup for DataBuild DSL graph.
Maps partition patterns to job targets.
"""
import sys
import re
import json
from collections import defaultdict
# Mapping from partition patterns to job targets
JOB_MAPPINGS = {{
{chr(10).join(lookup_mappings)}
}}
def lookup_job_for_partition(partition_ref: str) -> str:
"""Look up which job can build the given partition reference."""
for pattern, job_target in JOB_MAPPINGS.items():
if re.match(pattern, partition_ref):
return job_target
raise ValueError(f"No job found for partition: {{partition_ref}}")
def main():
if len(sys.argv) < 2:
print("Usage: job_lookup.py <partition_ref> [partition_ref...]", file=sys.stderr)
sys.exit(1)
results = defaultdict(list)
try:
for partition_ref in sys.argv[1:]:
job_target = lookup_job_for_partition(partition_ref)
results[job_target].append(partition_ref)
# Output the results as JSON (matching existing lookup format)
print(json.dumps(dict(results)))
except ValueError as e:
print(f"ERROR: {{e}}", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()
'''
lookup_file = os.path.join(output_dir, f"{name}_job_lookup.py")
with open(lookup_file, "w") as f:
f.write(lookup_content)
# Make it executable
os.chmod(lookup_file, 0o755)
def _snake_case(self, name: str) -> str:
"""Convert CamelCase to snake_case."""
import re
s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()
def _get_graph_module_path(self) -> str:
"""Get the module path for the graph containing this instance."""
# Try to find the module by looking at where the graph object is defined
import inspect
import sys
# Look through all loaded modules to find where this graph instance is defined
for module_name, module in sys.modules.items():
if hasattr(module, 'graph') and getattr(module, 'graph') is self:
if module_name != '__main__':
return module_name
# Look through the call stack to find the module that imported us
for frame_info in inspect.stack():
frame_globals = frame_info.frame.f_globals
module_name = frame_globals.get('__name__')
if module_name and module_name != '__main__' and 'graph' in frame_globals:
# Check if this frame has our graph
if frame_globals.get('graph') is self:
return module_name
# Last resort fallback - this will need to be manually configured
return "UNKNOWN_MODULE"
def _get_package_name(self) -> str:
"""Get the Bazel package name where the DSL source files are located."""
# Extract package from the graph label if available
if hasattr(self, 'label') and self.label.startswith('//'):
# Extract package from label like "//databuild/test/app:dsl_graph"
package_part = self.label.split(':')[0]
return package_part[2:] # Remove "//" prefix
# Fallback to trying to infer from module path
module_path = self._get_graph_module_path()
if module_path != "UNKNOWN_MODULE":
# Convert module path to package path
# e.g., "databuild.test.app.dsl.graph" -> "databuild/test/app/dsl"
parts = module_path.split('.')
if parts[-1] in ['graph', 'main']:
parts = parts[:-1]
return '/'.join(parts)
return "UNKNOWN_PACKAGE"
@dataclass
class JobConfigBuilder:
outputs: list[PartitionRef] = field(default_factory=list)
inputs: list[DataDep] = field(default_factory=list)
args: list[str] = field(default_factory=list)
env: dict[str, str] = field(default_factory=dict)
def build(self) -> JobConfig:
return JobConfig(
outputs=self.outputs,
inputs=self.inputs,
args=self.args,
env=self.env,
)
def add_inputs(self, *partitions: PartitionPattern, dep_type: DepType=DepType.MATERIALIZE) -> Self:
for p in partitions:
dep_type_name = "materialize" if dep_type == DepType.MATERIALIZE else "query"
self.inputs.append(DataDep(dep_type_code=dep_type, dep_type_name=dep_type_name, partition_ref=PartitionRef(str=p.serialize())))
return self
def add_outputs(self, *partitions: PartitionPattern) -> Self:
for p in partitions:
self.outputs.append(PartitionRef(str=p.serialize()))
return self
def add_args(self, *args: str) -> Self:
self.args.extend(args)
return self
def set_args(self, args: list[str]) -> Self:
self.args = args
return self
def set_env(self, env: dict[str, str]) -> Self:
self.env = env
return self
def add_env(self, **kwargs) -> Self:
for k, v in kwargs.items():
assert isinstance(k, str), f"Expected a string key, got `{k}`"
assert isinstance(v, str), f"Expected a string value, got `{v}`"
self.env[k] = v
return self
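As a reading aid for `JobConfigBuilder` above, here is a minimal usage sketch. The partition class mirrors the `CategoryAnalysisPartition` defined in the DSL tests later in this diff; the concrete field values, args, and env entries are illustrative only.

```python
# Minimal sketch: composing a JobConfig with JobConfigBuilder.
# The partition class mirrors the DSL tests below; values are illustrative.
from dataclasses import dataclass

from databuild.dsl.python.dsl import JobConfigBuilder, PartitionPattern
from databuild.proto import DepType


@dataclass
class DateCategory:
    data_date: str
    category: str


class CategoryAnalysisPartition(DateCategory, PartitionPattern):
    _raw_pattern = r"category_analysis/category=(?P<category>[^/]+)/date=(?P<data_date>\d{4}-\d{2}-\d{2})"


output = CategoryAnalysisPartition(data_date="2025-01-01", category="comedy")
upstream = CategoryAnalysisPartition(data_date="2024-12-31", category="comedy")

config = (
    JobConfigBuilder()
    .add_outputs(output)                                 # stored as PartitionRef(str=output.serialize())
    .add_inputs(upstream, dep_type=DepType.MATERIALIZE)  # stored as a DataDep on the upstream partition
    .add_args("--date", output.data_date)
    .add_env(CATEGORY=output.category)
    .build()
)
```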

View file

@ -1,118 +0,0 @@
#!/usr/bin/env python3
"""
Shared DSL job wrapper that can execute any DataBuildJob defined in a DSL graph.
Configured via environment variables:
- DATABUILD_DSL_GRAPH_MODULE: Python module path containing the graph (e.g., 'databuild.test.app.dsl.graph')
- DATABUILD_JOB_CLASS: Job class name to execute (e.g., 'IngestColorVotes')
"""
import sys
import json
import os
import importlib
from typing import List, Any
from databuild.proto import JobConfig
def parse_outputs_from_args(args: List[str], job_class: Any) -> List[Any]:
"""Parse partition output references from command line arguments into partition objects."""
outputs = []
for arg in args:
# Find which output type can deserialize this partition reference
for output_type in job_class.output_types:
try:
partition = output_type.deserialize(arg)
outputs.append(partition)
break
except ValueError:
continue
else:
raise ValueError(f"No output type in {job_class.__name__} can deserialize partition ref: {arg}")
return outputs
def main():
if len(sys.argv) < 2:
print("Usage: dsl_job_wrapper.py <config|exec> [args...]", file=sys.stderr)
sys.exit(1)
command = sys.argv[1]
# Read configuration from environment
graph_module_path = os.environ.get('DATABUILD_DSL_GRAPH_MODULE')
job_class_name = os.environ.get('DATABUILD_JOB_CLASS')
if not graph_module_path:
print("ERROR: DATABUILD_DSL_GRAPH_MODULE environment variable not set", file=sys.stderr)
sys.exit(1)
if not job_class_name:
print("ERROR: DATABUILD_JOB_CLASS environment variable not set", file=sys.stderr)
sys.exit(1)
try:
# Import the graph module
module = importlib.import_module(graph_module_path)
graph = getattr(module, 'graph')
# Get the job class
job_class = getattr(module, job_class_name)
# Create job instance
job_instance = job_class()
except (ImportError, AttributeError) as e:
print(f"ERROR: Failed to load job {job_class_name} from {graph_module_path}: {e}", file=sys.stderr)
sys.exit(1)
if command == "config":
try:
# Parse output partition references from remaining args
output_refs = sys.argv[2:]
if not output_refs:
print("ERROR: No output partition references provided", file=sys.stderr)
sys.exit(1)
outputs = parse_outputs_from_args(output_refs, job_class)
# Call job's config method
configs = job_instance.config(outputs)
# Output each config as JSON (one per line for multiple configs)
for config in configs:
# Convert JobConfig to dict for JSON serialization
config_dict = {
'outputs': [{'str': ref.str} for ref in config.outputs],
'inputs': [
{
'dep_type_code': dep.dep_type_code,
'dep_type_name': dep.dep_type_name,
'partition_ref': {'str': dep.partition_ref.str}
} for dep in config.inputs
],
'args': config.args,
'env': config.env,
}
print(json.dumps(config_dict))
except Exception as e:
print(f"ERROR: Config failed: {e}", file=sys.stderr)
sys.exit(1)
elif command == "exec":
try:
# Read config from stdin
job_instance.exec(*sys.argv[2:])
except Exception as e:
print(f"ERROR: Execution failed: {e}", file=sys.stderr)
sys.exit(1)
else:
print(f"ERROR: Unknown command '{command}'. Use 'config' or 'exec'.", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()
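To make the environment-variable contract in the wrapper's docstring concrete, here is a minimal invocation sketch. The module path and job class are the ones named in that docstring; the output partition ref is a hypothetical placeholder that would have to match one of the job's `output_types` patterns.

```python
# Minimal sketch: driving dsl_job_wrapper.py in "config" mode from Python.
# Env var names come from the wrapper docstring; the partition ref is a
# hypothetical placeholder that must match one of the job's output_types patterns.
import os
import subprocess

env = dict(
    os.environ,
    DATABUILD_DSL_GRAPH_MODULE="databuild.test.app.dsl.graph",
    DATABUILD_JOB_CLASS="IngestColorVotes",
)

subprocess.run(
    ["python3", "dsl_job_wrapper.py", "config", "color_votes/date=2025-01-01"],
    env=env,
    check=True,  # raises if the wrapper exits non-zero
)
```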

View file

@ -1,29 +0,0 @@
#!/usr/bin/env python3
"""
DSL code generator that can be run as a py_binary with proper dependencies.
"""
import sys
from databuild.dsl.python.generator_lib import generate_dsl_package
def main():
if len(sys.argv) != 4:
print("Usage: generator.py <module_path> <graph_attr> <output_dir>", file=sys.stderr)
sys.exit(1)
module_path = sys.argv[1]
graph_attr = sys.argv[2]
output_dir = sys.argv[3]
try:
generate_dsl_package(module_path, graph_attr, output_dir)
except Exception as e:
print(f"ERROR: {e}", file=sys.stderr)
import traceback
traceback.print_exc()
sys.exit(1)
if __name__ == "__main__":
main()

View file

@ -1,38 +0,0 @@
#!/usr/bin/env python3
"""
Core DSL code generation library that can be imported by different generator binaries.
"""
import os
import importlib
def generate_dsl_package(module_path: str, graph_attr: str, output_dir: str, deps: list = None):
"""
Generate DataBuild DSL package from a graph definition.
Args:
module_path: Python module path (e.g., "databuild.test.app.dsl.graph")
graph_attr: Name of the graph attribute in the module
output_dir: Directory where to generate the DSL package
deps: List of Bazel dependency labels to use in generated BUILD.bazel
"""
# Extract the base name from the output directory for naming
name = os.path.basename(output_dir.rstrip('/')) or "graph"
try:
# Import the graph module
module = importlib.import_module(module_path)
graph = getattr(module, graph_attr)
# Generate the bazel package
graph.generate_bazel_package(name, output_dir, deps or [])
print(f"Generated DataBuild DSL package in {output_dir}")
except ImportError as e:
raise ImportError(f"Failed to import {graph_attr} from {module_path}: {e}")
except AttributeError as e:
raise AttributeError(f"Module {module_path} does not have attribute {graph_attr}: {e}")
except Exception as e:
raise Exception(f"Generation failed: {e}")
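For orientation, a minimal sketch of calling `generate_dsl_package` directly; the `generator.py` binary shown earlier simply forwards its three positional CLI arguments to this function. The module path, graph attribute, output directory, and deps label below are illustrative.

```python
# Minimal sketch: generating a DSL package programmatically.
# All argument values here are illustrative.
from databuild.dsl.python.generator_lib import generate_dsl_package

generate_dsl_package(
    module_path="databuild.test.app.dsl.graph",  # module defining the DataBuildGraph instance
    graph_attr="graph",                          # attribute name holding the graph
    output_dir="databuild/test/app/dsl",         # BUILD.bazel and job scripts land in a generated/ subdir
    deps=["//databuild/test/app:dsl_src"],       # Bazel deps for the generated py_binary targets
)
```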

View file

@ -1,8 +0,0 @@
py_test(
name = "dsl_test",
srcs = glob(["*.py"]),
deps = [
"//databuild/dsl/python:dsl",
"@databuild_pypi//pytest",
],
)

View file

@ -1,75 +0,0 @@
from databuild.dsl.python.dsl import PartitionPattern, DataBuildGraph, DataBuildJob
from databuild.proto import JobConfig, PartitionManifest
from dataclasses import dataclass
import pytest
@dataclass
class DateCategory:
data_date: str
category: str
class CategoryAnalysisPartition(DateCategory, PartitionPattern):
_raw_pattern = r"category_analysis/category=(?P<category>[^/]+)/date=(?P<data_date>\d{4}-\d{2}-\d{2})"
def test_basic_partition_pattern():
p1 = CategoryAnalysisPartition(data_date="2025-01-01", category="comedy")
assert p1.serialize() == "category_analysis/category=comedy/date=2025-01-01"
p2 = CategoryAnalysisPartition.deserialize("category_analysis/category=technology/date=2025-01-02")
assert p2.data_date == "2025-01-02"
assert p2.category == "technology"
class NotEnoughFieldsPartition(DateCategory, PartitionPattern):
# Doesn't use the partition fields
_raw_pattern = r"invalid_partition_pattern"
class TooManyFieldsPartition(DateCategory, PartitionPattern):
# Declares an extra `hour` field that is not among the partition fields
_raw_pattern = r"category_analysis/category=(?P<category>[^/]+)/date=(?P<data_date>\d{4}-\d{2}-\d{2})/hour=(?P<hour>\d{2})"
def test_invalid_partition_pattern():
with pytest.raises(ValueError):
NotEnoughFieldsPartition(data_date="2025-01-01", category="comedy")._validate_pattern()
with pytest.raises(ValueError):
TooManyFieldsPartition(data_date="2025-01-01", category="comedy")._validate_pattern()
def test_basic_graph_definition():
graph = DataBuildGraph("//:test_graph")
@graph.job
class TestJob(DataBuildJob):
output_types = [CategoryAnalysisPartition]
def exec(self, config: JobConfig) -> None: ...
def config(self, outputs: list[PartitionPattern]) -> list[JobConfig]: ...
assert len(graph.lookup) == 1
assert CategoryAnalysisPartition in graph.lookup
def test_graph_collision():
graph = DataBuildGraph("//:test_graph")
@graph.job
class TestJob1(DataBuildJob):
output_types = [CategoryAnalysisPartition]
def exec(self, config: JobConfig) -> None: ...
def config(self, outputs: list[PartitionPattern]) -> list[JobConfig]: ...
with pytest.raises(AssertionError):
# Outputs the same partition, so should raise
@graph.job
class TestJob2(DataBuildJob):
output_types = [CategoryAnalysisPartition]
def exec(self, config: JobConfig) -> None: ...
def config(self, outputs: list[PartitionPattern]) -> list[JobConfig]: ...
if __name__ == "__main__":
raise SystemExit(pytest.main([__file__]))

View file

@ -1,665 +0,0 @@
use crate::*;
use crate::event_log::{BuildEventLogError, Result};
use crate::event_log::storage::BELStorage;
use crate::event_log::query_engine::BELQueryEngine;
use async_trait::async_trait;
use std::sync::{Arc, Mutex};
use rusqlite::Connection;
/// MockBuildEventLog provides an in-memory SQLite database for testing
///
/// This implementation makes it easy to specify test data and verify behavior
/// while using the real code paths for event writing and repository queries.
///
/// Key features:
/// - Uses in-memory SQLite for parallel test execution
/// - Provides event constructors with sensible defaults
/// - Allows easy specification of test scenarios
/// - Uses the same SQL schema as production SQLite implementation
pub struct MockBuildEventLog {
connection: Arc<Mutex<Connection>>,
}
impl MockBuildEventLog {
/// Create a new MockBuildEventLog with an in-memory SQLite database
pub async fn new() -> Result<Self> {
let conn = Connection::open(":memory:")
.map_err(|e| BuildEventLogError::ConnectionError(e.to_string()))?;
// Disable foreign key constraints for simplicity in testing
// conn.execute("PRAGMA foreign_keys = ON", [])
let mock = Self {
connection: Arc::new(Mutex::new(conn)),
};
// Initialize the schema
mock.initialize().await?;
Ok(mock)
}
/// Create a new MockBuildEventLog with predefined events
pub async fn with_events(events: Vec<BuildEvent>) -> Result<Self> {
let mock = Self::new().await?;
// Insert all provided events
for event in events {
mock.append_event(event).await?;
}
Ok(mock)
}
/// Get the number of events in the mock event log
pub async fn event_count(&self) -> Result<usize> {
let conn = self.connection.lock().unwrap();
let mut stmt = conn.prepare("SELECT COUNT(*) FROM build_events")
.map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let count: i64 = stmt.query_row([], |row| row.get(0))
.map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
Ok(count as usize)
}
/// Get all events ordered by timestamp
pub async fn get_all_events(&self) -> Result<Vec<BuildEvent>> {
let conn = self.connection.lock().unwrap();
let mut stmt = conn.prepare(
"SELECT event_data FROM build_events ORDER BY timestamp ASC"
).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let rows = stmt.query_map([], |row| {
let event_data: String = row.get(0)?;
Ok(event_data)
}).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let mut events = Vec::new();
for row in rows {
let event_data = row.map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let event: BuildEvent = serde_json::from_str(&event_data)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
events.push(event);
}
Ok(events)
}
/// Clear all events from the mock event log
pub async fn clear(&self) -> Result<()> {
let conn = self.connection.lock().unwrap();
// Clear all tables
conn.execute("DELETE FROM build_events", [])
.map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
conn.execute("DELETE FROM build_request_events", [])
.map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
conn.execute("DELETE FROM partition_events", [])
.map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
conn.execute("DELETE FROM job_events", [])
.map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
conn.execute("DELETE FROM delegation_events", [])
.map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
conn.execute("DELETE FROM job_graph_events", [])
.map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
Ok(())
}
/// Initialize the database schema for testing
pub async fn initialize(&self) -> Result<()> {
let conn = self.connection.lock().unwrap();
// Create main events table
conn.execute(
"CREATE TABLE IF NOT EXISTS build_events (
event_id TEXT PRIMARY KEY,
timestamp INTEGER NOT NULL,
build_request_id TEXT NOT NULL,
event_type TEXT NOT NULL,
event_data TEXT NOT NULL
)",
[],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
// Create supporting tables for easier queries
conn.execute(
"CREATE TABLE IF NOT EXISTS build_request_events (
event_id TEXT PRIMARY KEY,
status TEXT NOT NULL,
requested_partitions TEXT NOT NULL,
message TEXT NOT NULL
)",
[],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
conn.execute(
"CREATE TABLE IF NOT EXISTS partition_events (
event_id TEXT PRIMARY KEY,
partition_ref TEXT NOT NULL,
status TEXT NOT NULL,
message TEXT NOT NULL,
job_run_id TEXT
)",
[],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
conn.execute(
"CREATE TABLE IF NOT EXISTS job_events (
event_id TEXT PRIMARY KEY,
job_run_id TEXT NOT NULL,
job_label TEXT NOT NULL,
target_partitions TEXT NOT NULL,
status TEXT NOT NULL,
message TEXT NOT NULL,
config_json TEXT,
manifests_json TEXT NOT NULL
)",
[],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
conn.execute(
"CREATE TABLE IF NOT EXISTS delegation_events (
event_id TEXT PRIMARY KEY,
partition_ref TEXT NOT NULL,
delegated_to_build_request_id TEXT NOT NULL,
message TEXT NOT NULL
)",
[],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
conn.execute(
"CREATE TABLE IF NOT EXISTS job_graph_events (
event_id TEXT PRIMARY KEY,
job_graph_json TEXT NOT NULL
)",
[],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
Ok(())
}
/// Append an event to the mock event log
pub async fn append_event(&self, event: BuildEvent) -> Result<()> {
let conn = self.connection.lock().unwrap();
// Serialize the entire event for storage
let event_data = serde_json::to_string(&event)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
// Insert into main events table
conn.execute(
"INSERT INTO build_events (event_id, timestamp, build_request_id, event_type, event_data) VALUES (?1, ?2, ?3, ?4, ?5)",
rusqlite::params![
event.event_id,
event.timestamp,
event.build_request_id,
match &event.event_type {
Some(crate::build_event::EventType::BuildRequestEvent(_)) => "build_request",
Some(crate::build_event::EventType::PartitionEvent(_)) => "partition",
Some(crate::build_event::EventType::JobEvent(_)) => "job",
Some(crate::build_event::EventType::DelegationEvent(_)) => "delegation",
Some(crate::build_event::EventType::JobGraphEvent(_)) => "job_graph",
Some(crate::build_event::EventType::PartitionInvalidationEvent(_)) => "partition_invalidation",
Some(crate::build_event::EventType::JobRunCancelEvent(_)) => "job_run_cancel",
Some(crate::build_event::EventType::BuildCancelEvent(_)) => "build_cancel",
Some(crate::build_event::EventType::WantEvent(_)) => "want",
Some(crate::build_event::EventType::TaintEvent(_)) => "taint",
None => "unknown",
},
event_data
],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
// Insert into specific event type table for better querying
match &event.event_type {
Some(crate::build_event::EventType::BuildRequestEvent(br_event)) => {
let partitions_json = serde_json::to_string(&br_event.requested_partitions)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
conn.execute(
"INSERT INTO build_request_events (event_id, status, requested_partitions, message) VALUES (?1, ?2, ?3, ?4)",
rusqlite::params![
event.event_id,
br_event.clone().status.unwrap().code.to_string(),
partitions_json,
br_event.message
],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
}
Some(crate::build_event::EventType::PartitionEvent(p_event)) => {
conn.execute(
"INSERT INTO partition_events (event_id, partition_ref, status, message, job_run_id) VALUES (?1, ?2, ?3, ?4, ?5)",
rusqlite::params![
event.event_id,
p_event.partition_ref.as_ref().map(|r| &r.str).unwrap_or(&String::new()),
p_event.status_code.to_string(),
p_event.message,
if p_event.job_run_id.is_empty() { None } else { Some(&p_event.job_run_id) }
],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
}
Some(crate::build_event::EventType::JobEvent(j_event)) => {
let partitions_json = serde_json::to_string(&j_event.target_partitions)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
let config_json = j_event.config.as_ref()
.map(|c| serde_json::to_string(c))
.transpose()
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
let manifests_json = serde_json::to_string(&j_event.manifests)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
conn.execute(
"INSERT INTO job_events (event_id, job_run_id, job_label, target_partitions, status, message, config_json, manifests_json) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8)",
rusqlite::params![
event.event_id,
j_event.job_run_id,
j_event.job_label.as_ref().map(|l| &l.label).unwrap_or(&String::new()),
partitions_json,
j_event.status_code.to_string(),
j_event.message,
config_json,
manifests_json
],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
}
_ => {} // Other event types don't need special handling for testing
}
Ok(())
}
/// Get all events for a specific build request
pub async fn get_build_request_events(&self, build_request_id: &str, _limit: Option<u32>) -> Result<Vec<BuildEvent>> {
let conn = self.connection.lock().unwrap();
let mut stmt = conn.prepare(
"SELECT event_data FROM build_events WHERE build_request_id = ? ORDER BY timestamp ASC"
).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let rows = stmt.query_map([build_request_id], |row| {
let event_data: String = row.get(0)?;
Ok(event_data)
}).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let mut events = Vec::new();
for row in rows {
let event_data = row.map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let event: BuildEvent = serde_json::from_str(&event_data)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
events.push(event);
}
Ok(events)
}
/// Get all events for a specific partition
pub async fn get_partition_events(&self, partition_ref: &str, _limit: Option<u32>) -> Result<Vec<BuildEvent>> {
let conn = self.connection.lock().unwrap();
let mut stmt = conn.prepare(
"SELECT e.event_data FROM build_events e
JOIN partition_events p ON e.event_id = p.event_id
WHERE p.partition_ref = ? ORDER BY e.timestamp ASC"
).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let rows = stmt.query_map([partition_ref], |row| {
let event_data: String = row.get(0)?;
Ok(event_data)
}).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let mut events = Vec::new();
for row in rows {
let event_data = row.map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let event: BuildEvent = serde_json::from_str(&event_data)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
events.push(event);
}
Ok(events)
}
/// Get the latest status for a partition
pub async fn get_latest_partition_status(&self, partition_ref: &str) -> Result<Option<(PartitionStatus, i64)>> {
let conn = self.connection.lock().unwrap();
let mut stmt = conn.prepare(
"SELECT p.status, e.timestamp FROM build_events e
JOIN partition_events p ON e.event_id = p.event_id
WHERE p.partition_ref = ? ORDER BY e.timestamp DESC LIMIT 1"
).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let result = stmt.query_row([partition_ref], |row| {
let status_str: String = row.get(0)?;
let timestamp: i64 = row.get(1)?;
let status_code = status_str.parse::<i32>().unwrap_or(0);
let status = PartitionStatus::try_from(status_code).unwrap_or(PartitionStatus::PartitionUnknown);
Ok((status, timestamp))
});
match result {
Ok(status_and_timestamp) => Ok(Some(status_and_timestamp)),
Err(rusqlite::Error::QueryReturnedNoRows) => Ok(None),
Err(e) => Err(BuildEventLogError::QueryError(e.to_string())),
}
}
/// Get events in a timestamp range (used by BELStorage)
pub async fn get_events_in_range(&self, start: i64, end: i64) -> Result<Vec<BuildEvent>> {
let conn = self.connection.lock().unwrap();
let mut stmt = conn.prepare(
"SELECT event_data FROM build_events WHERE timestamp >= ? AND timestamp <= ? ORDER BY timestamp ASC"
).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let rows = stmt.query_map([start, end], |row| {
let event_data: String = row.get(0)?;
Ok(event_data)
}).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let mut events = Vec::new();
for row in rows {
let event_data = row.map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let event: BuildEvent = serde_json::from_str(&event_data)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
events.push(event);
}
Ok(events)
}
}
/// Utility functions for creating test events with sensible defaults
pub mod test_events {
use super::*;
use crate::event_log::{generate_event_id, current_timestamp_nanos};
use uuid::Uuid;
/// Create a build request received event with random defaults
pub fn build_request_received(
build_request_id: Option<String>,
partitions: Vec<PartitionRef>,
) -> BuildEvent {
BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id,
event_type: Some(build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestReceived.status()),
requested_partitions: partitions,
message: "Build request received".to_string(),
comment: None,
want_id: None,
})),
}
}
/// Create a build request event with specific status
pub fn build_request_event(
build_request_id: Option<String>,
partitions: Vec<PartitionRef>,
status: BuildRequestStatus,
) -> BuildEvent {
BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id,
event_type: Some(build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(status.clone()),
requested_partitions: partitions,
message: format!("Build request status: {:?}", status.name),
comment: None,
want_id: None,
})),
}
}
/// Create a partition status event with random defaults
pub fn partition_status(
build_request_id: Option<String>,
partition_ref: PartitionRef,
status: PartitionStatus,
job_run_id: Option<String>,
) -> BuildEvent {
BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id,
event_type: Some(build_event::EventType::PartitionEvent(PartitionEvent {
partition_ref: Some(partition_ref),
status_code: status as i32,
status_name: status.to_display_string(),
message: format!("Partition status: {:?}", status),
job_run_id: job_run_id.unwrap_or_default(),
})),
}
}
/// Create a job event with random defaults
pub fn job_event(
build_request_id: Option<String>,
job_run_id: Option<String>,
job_label: JobLabel,
target_partitions: Vec<PartitionRef>,
status: JobStatus,
) -> BuildEvent {
BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id,
event_type: Some(build_event::EventType::JobEvent(JobEvent {
job_run_id: job_run_id.unwrap_or_else(|| Uuid::new_v4().to_string()),
job_label: Some(job_label),
target_partitions,
status_code: status as i32,
status_name: status.to_display_string(),
message: format!("Job status: {:?}", status),
config: None,
manifests: vec![],
})),
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use super::test_events::*;
#[tokio::test]
async fn test_mock_build_event_log_basic() {
let mock = MockBuildEventLog::new().await.unwrap();
// Initially empty
assert_eq!(mock.event_count().await.unwrap(), 0);
// Add an event
let build_id = "test-build-123".to_string();
let partition = PartitionRef { str: "test/partition".to_string() };
let event = build_request_received(Some(build_id.clone()), vec![partition]);
mock.append_event(event).await.unwrap();
// Check event count
assert_eq!(mock.event_count().await.unwrap(), 1);
// Query events by build request
let events = mock.get_build_request_events(&build_id, None).await.unwrap();
assert_eq!(events.len(), 1);
// Clear events
mock.clear().await.unwrap();
assert_eq!(mock.event_count().await.unwrap(), 0);
}
#[tokio::test]
async fn test_mock_build_event_log_with_predefined_events() {
let build_id = "test-build-456".to_string();
let partition = PartitionRef { str: "data/users".to_string() };
let events = vec![
build_request_received(Some(build_id.clone()), vec![partition.clone()]),
partition_status(Some(build_id.clone()), partition.clone(), PartitionStatus::PartitionBuilding, None),
partition_status(Some(build_id.clone()), partition.clone(), PartitionStatus::PartitionAvailable, None),
];
let mock = MockBuildEventLog::with_events(events).await.unwrap();
// Should have 3 events
assert_eq!(mock.event_count().await.unwrap(), 3);
// Query partition events
let partition_events = mock.get_partition_events(&partition.str, None).await.unwrap();
assert_eq!(partition_events.len(), 2); // Two partition events
// Check latest partition status
let latest_status = mock.get_latest_partition_status(&partition.str).await.unwrap();
assert!(latest_status.is_some());
let (status, _timestamp) = latest_status.unwrap();
assert_eq!(status, PartitionStatus::PartitionAvailable);
}
#[tokio::test]
async fn test_event_constructors() {
let partition = PartitionRef { str: "test/data".to_string() };
let job_label = JobLabel { label: "//:test_job".to_string() };
// Test build request event constructor
let br_event = build_request_received(None, vec![partition.clone()]);
assert!(matches!(br_event.event_type, Some(build_event::EventType::BuildRequestEvent(_))));
// Test partition event constructor
let p_event = partition_status(None, partition.clone(), PartitionStatus::PartitionAvailable, None);
assert!(matches!(p_event.event_type, Some(build_event::EventType::PartitionEvent(_))));
// Test job event constructor
let j_event = job_event(None, None, job_label, vec![partition], JobStatus::JobCompleted);
assert!(matches!(j_event.event_type, Some(build_event::EventType::JobEvent(_))));
}
}
/// MockBELStorage is a BELStorage implementation that wraps MockBuildEventLog
/// This allows us to use the real BELQueryEngine in tests while having control over the data
pub struct MockBELStorage {
mock_log: Arc<MockBuildEventLog>,
}
impl MockBELStorage {
pub async fn new() -> Result<Self> {
let mock_log = Arc::new(MockBuildEventLog::new().await?);
Ok(Self { mock_log })
}
pub async fn with_events(events: Vec<BuildEvent>) -> Result<Self> {
let mock_log = Arc::new(MockBuildEventLog::with_events(events).await?);
Ok(Self { mock_log })
}
}
#[async_trait]
impl BELStorage for MockBELStorage {
async fn append_event(&self, event: BuildEvent) -> Result<i64> {
self.mock_log.append_event(event).await?;
Ok(0) // Return dummy index for mock storage
}
async fn list_events(&self, since_idx: i64, filter: EventFilter) -> Result<EventPage> {
        // Get all events first (MockBuildEventLog is keyed by timestamp, so fetch the full range)
let mut events = self.mock_log.get_events_in_range(0, i64::MAX).await?;
// Apply filtering based on EventFilter
events.retain(|event| {
// Filter by build request IDs if specified
            if !filter.build_request_ids.is_empty() {
                let matches_build = event
                    .build_request_id
                    .as_ref()
                    .map(|id| filter.build_request_ids.contains(id))
                    .unwrap_or(false);
                if !matches_build {
                    return false;
                }
            }
// Filter by partition refs if specified
if !filter.partition_refs.is_empty() {
let has_matching_partition = match &event.event_type {
Some(build_event::EventType::PartitionEvent(pe)) => {
pe.partition_ref.as_ref()
.map(|pr| filter.partition_refs.contains(&pr.str))
.unwrap_or(false)
}
Some(build_event::EventType::BuildRequestEvent(bre)) => {
bre.requested_partitions.iter()
.any(|pr| filter.partition_refs.contains(&pr.str))
}
Some(build_event::EventType::JobEvent(je)) => {
je.target_partitions.iter()
.any(|pr| filter.partition_refs.contains(&pr.str))
}
_ => false,
};
if !has_matching_partition {
return false;
}
}
// Filter by job labels if specified
if !filter.job_labels.is_empty() {
let has_matching_job = match &event.event_type {
Some(build_event::EventType::JobEvent(je)) => {
je.job_label.as_ref()
.map(|jl| filter.job_labels.contains(&jl.label))
.unwrap_or(false)
}
_ => false,
};
if !has_matching_job {
return false;
}
}
// Filter by job run IDs if specified
if !filter.job_run_ids.is_empty() {
let has_matching_job_run = match &event.event_type {
Some(build_event::EventType::JobEvent(je)) => {
filter.job_run_ids.contains(&je.job_run_id)
}
Some(build_event::EventType::JobRunCancelEvent(jrce)) => {
filter.job_run_ids.contains(&jrce.job_run_id)
}
Some(build_event::EventType::PartitionEvent(pe)) => {
if pe.job_run_id.is_empty() {
false
} else {
filter.job_run_ids.contains(&pe.job_run_id)
}
}
// Add other job-run-related events here if they exist
_ => false,
};
if !has_matching_job_run {
return false;
}
}
true
});
Ok(EventPage {
events,
next_idx: since_idx + 1, // Simple increment for testing
has_more: false, // Simplify for testing
})
}
async fn initialize(&self) -> Result<()> {
self.mock_log.initialize().await
}
}
/// Helper function to create a BELQueryEngine for testing with mock data
pub async fn create_mock_bel_query_engine() -> Result<Arc<BELQueryEngine>> {
let storage: Arc<dyn BELStorage> = Arc::new(MockBELStorage::new().await?);
Ok(Arc::new(BELQueryEngine::new(storage)))
}
/// Helper function to create a BELQueryEngine for testing with predefined events
pub async fn create_mock_bel_query_engine_with_events(events: Vec<BuildEvent>) -> Result<Arc<BELQueryEngine>> {
let storage: Arc<dyn BELStorage> = Arc::new(MockBELStorage::with_events(events).await?);
Ok(Arc::new(BELQueryEngine::new(storage)))
}
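
Taken together, these helpers let tests drive the real `BELQueryEngine` over canned events. Below is a minimal sketch of that wiring; the test name and the concrete partition/build ids are illustrative rather than part of the source, and it assumes the mock helpers, `test_events`, `PartitionRef`, and `PartitionStatus` are in scope as they are in this module:

```rust
// Illustrative test; identifiers like "build-abc" are placeholders.
#[tokio::test]
async fn query_engine_over_mock_events() {
    let build_id = "build-abc".to_string();
    let partition = PartitionRef { str: "data/users".to_string() };
    let events = vec![
        test_events::build_request_received(Some(build_id.clone()), vec![partition.clone()]),
        test_events::partition_status(Some(build_id.clone()), partition.clone(), PartitionStatus::PartitionBuilding, None),
        test_events::partition_status(Some(build_id), partition.clone(), PartitionStatus::PartitionAvailable, None),
    ];
    // Real query engine backed by MockBELStorage.
    let engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
    let (status, _ts) = engine
        .get_latest_partition_status(&partition.str)
        .await
        .unwrap()
        .expect("partition should have a status");
    assert_eq!(status, PartitionStatus::PartitionAvailable);
}
```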

@ -1,113 +0,0 @@
use crate::*;
use std::error::Error as StdError;
use uuid::Uuid;
pub mod writer;
pub mod mock;
pub mod storage;
pub mod sqlite_storage;
pub mod query_engine;
#[derive(Debug)]
pub enum BuildEventLogError {
DatabaseError(String),
SerializationError(String),
ConnectionError(String),
QueryError(String),
}
impl std::fmt::Display for BuildEventLogError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
BuildEventLogError::DatabaseError(msg) => write!(f, "Database error: {}", msg),
BuildEventLogError::SerializationError(msg) => write!(f, "Serialization error: {}", msg),
BuildEventLogError::ConnectionError(msg) => write!(f, "Connection error: {}", msg),
BuildEventLogError::QueryError(msg) => write!(f, "Query error: {}", msg),
}
}
}
impl StdError for BuildEventLogError {}
pub type Result<T> = std::result::Result<T, BuildEventLogError>;
#[derive(Debug, Clone)]
pub struct QueryResult {
pub columns: Vec<String>,
pub rows: Vec<Vec<String>>,
}
// Summary types for list endpoints
#[derive(Debug, Clone)]
pub struct BuildRequestSummary {
pub build_request_id: String,
pub status: BuildRequestStatus,
pub requested_partitions: Vec<String>,
pub created_at: i64,
pub updated_at: i64,
}
#[derive(Debug, Clone)]
pub struct PartitionSummary {
pub partition_ref: String,
pub status: PartitionStatus,
pub updated_at: i64,
pub build_request_id: Option<String>,
}
#[derive(Debug, Clone)]
pub struct ActivitySummary {
pub active_builds_count: u32,
pub recent_builds: Vec<BuildRequestSummary>,
pub recent_partitions: Vec<PartitionSummary>,
pub total_partitions_count: u32,
}
// Helper function to generate event ID
pub fn generate_event_id() -> String {
Uuid::new_v4().to_string()
}
// Helper function to get current timestamp in nanoseconds
pub fn current_timestamp_nanos() -> i64 {
std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap()
.as_nanos() as i64
}
// Helper function to create build event with metadata
pub fn create_build_event(
build_request_id: String,
event_type: crate::build_event::EventType,
) -> BuildEvent {
BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id.clone()),
event_type: Some(event_type),
}
}
// Parse build event log URI and create BEL query engine with appropriate storage backend
pub async fn create_bel_query_engine(uri: &str) -> Result<std::sync::Arc<query_engine::BELQueryEngine>> {
use std::sync::Arc;
use storage::BELStorage;
if uri == "stdout" {
let storage: Arc<dyn BELStorage> = Arc::new(storage::StdoutBELStorage::new());
storage.initialize().await?;
Ok(Arc::new(query_engine::BELQueryEngine::new(storage)))
} else if uri.starts_with("sqlite://") {
let path = &uri[9..]; // Remove "sqlite://" prefix
let storage: Arc<dyn BELStorage> = Arc::new(sqlite_storage::SqliteBELStorage::new(path)?);
storage.initialize().await?;
Ok(Arc::new(query_engine::BELQueryEngine::new(storage)))
} else {
Err(BuildEventLogError::ConnectionError(
format!("Unsupported build event log URI for BEL query engine: {}", uri)
))
}
}
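
For reference, a caller selects the storage backend purely through the URI scheme handled above. A hedged sketch, assuming the crate is consumed as `databuild` and using a placeholder database path and function name:

```rust
// Hedged sketch; the database path is a placeholder.
use databuild::event_log::{create_bel_query_engine, Result};

async fn open_engine_and_inspect(build_id: &str) -> Result<()> {
    // "sqlite://<path>" selects the SQLite backend; "stdout" selects the append-only debug backend.
    let engine = create_bel_query_engine("sqlite:///tmp/databuild/bel.db").await?;
    let events = engine.get_build_request_events(build_id, None).await?;
    println!("{} events recorded for build {}", events.len(), build_id);
    Ok(())
}
```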

@ -1,389 +0,0 @@
use super::*;
use super::storage::BELStorage;
use std::sync::Arc;
use std::collections::HashMap;
/// App-layer aggregation that scans storage events
pub struct BELQueryEngine {
storage: Arc<dyn BELStorage>,
}
impl BELQueryEngine {
pub fn new(storage: Arc<dyn BELStorage>) -> Self {
Self { storage }
}
/// Get latest status for a partition by scanning recent events
pub async fn get_latest_partition_status(&self, partition_ref: &str) -> Result<Option<(PartitionStatus, i64)>> {
let filter = EventFilter {
partition_refs: vec![partition_ref.to_string()],
partition_patterns: vec![],
job_labels: vec![],
job_run_ids: vec![],
build_request_ids: vec![],
};
let events = self.storage.list_events(0, filter).await?;
self.aggregate_partition_status(&events.events)
}
/// Get all build requests that are currently building a partition
pub async fn get_active_builds_for_partition(&self, partition_ref: &str) -> Result<Vec<String>> {
let filter = EventFilter {
partition_refs: vec![partition_ref.to_string()],
partition_patterns: vec![],
job_labels: vec![],
job_run_ids: vec![],
build_request_ids: vec![],
};
let events = self.storage.list_events(0, filter).await?;
let mut active_builds: Vec<String> = Vec::new();
let mut build_states: HashMap<String, BuildRequestStatusCode> = HashMap::new();
// Process events chronologically to track build states
for event in events.events {
let build_request_id = event.build_request_id.clone().unwrap();
match &event.event_type {
Some(crate::build_event::EventType::BuildRequestEvent(br_event)) => {
if let Ok(code) = BuildRequestStatusCode::try_from(br_event.clone().status.unwrap().code) {
build_states.insert(build_request_id.clone(), code);
}
}
Some(crate::build_event::EventType::PartitionEvent(p_event)) => {
if let Some(partition_event_ref) = &p_event.partition_ref {
if partition_event_ref.str == partition_ref {
// Check if this partition is actively being built
if let Ok(status) = PartitionStatus::try_from(p_event.status_code) {
if matches!(status, PartitionStatus::PartitionBuilding | PartitionStatus::PartitionAnalyzed) {
// Check if the build request is still active
if let Some(build_status) = build_states.get(&build_request_id) {
if matches!(build_status,
BuildRequestStatusCode::BuildRequestReceived |
BuildRequestStatusCode::BuildRequestPlanning |
BuildRequestStatusCode::BuildRequestExecuting |
BuildRequestStatusCode::BuildRequestAnalysisCompleted
) {
if !active_builds.contains(&build_request_id) {
active_builds.push(build_request_id.clone());
}
}
}
}
}
}
}
}
_ => {}
}
}
Ok(active_builds)
}
/// Get summary of a build request by aggregating its events
pub async fn get_build_request_summary(&self, build_id: &str) -> Result<BuildRequestSummary> {
let filter = EventFilter {
partition_refs: vec![],
partition_patterns: vec![],
job_labels: vec![],
job_run_ids: vec![],
build_request_ids: vec![build_id.to_string()],
};
let events = self.storage.list_events(0, filter).await?;
// If no events found, build doesn't exist
if events.events.is_empty() {
return Err(BuildEventLogError::QueryError(format!("Build request '{}' not found", build_id)));
}
let mut status = BuildRequestStatusCode::BuildRequestUnknown.status();
let mut requested_partitions = Vec::new();
let mut created_at = 0i64;
let mut updated_at = 0i64;
for event in events.events.iter().filter(|event| event.build_request_id.is_some()) {
if event.timestamp > 0 {
if created_at == 0 || event.timestamp < created_at {
created_at = event.timestamp;
}
if event.timestamp > updated_at {
updated_at = event.timestamp;
}
}
if let Some(crate::build_event::EventType::BuildRequestEvent(br_event)) = &event.event_type {
if let Ok(event_status) = BuildRequestStatus::try_from(br_event.status.clone().unwrap()) {
status = event_status;
}
if !br_event.requested_partitions.is_empty() {
requested_partitions = br_event.requested_partitions.iter()
.map(|p| p.str.clone())
.collect();
}
}
}
Ok(BuildRequestSummary {
build_request_id: build_id.to_string(),
status,
requested_partitions,
created_at,
updated_at,
})
}
/// List build requests with pagination and filtering
pub async fn list_build_requests(&self, request: BuildsListRequest) -> Result<BuildsListResponse> {
// For now, scan all events and aggregate
let filter = EventFilter {
partition_refs: vec![],
partition_patterns: vec![],
job_labels: vec![],
job_run_ids: vec![],
build_request_ids: vec![],
};
let events = self.storage.list_events(0, filter).await?;
let mut build_summaries: HashMap<String, BuildRequestSummary> = HashMap::new();
// Aggregate by build request ID
for event in events.events.iter().filter(|event| event.build_request_id.is_some()) {
if let Some(crate::build_event::EventType::BuildRequestEvent(br_event)) = &event.event_type {
let build_id = &event.build_request_id.clone().unwrap();
let entry = build_summaries.entry(build_id.clone()).or_insert_with(|| {
BuildRequestSummary {
build_request_id: build_id.clone(),
status: BuildRequestStatusCode::BuildRequestUnknown.status(),
requested_partitions: Vec::new(),
created_at: event.timestamp,
updated_at: event.timestamp,
}
});
if let Ok(status) = BuildRequestStatus::try_from(br_event.status.clone().unwrap()) {
entry.status = status;
}
entry.updated_at = event.timestamp.max(entry.updated_at);
if !br_event.requested_partitions.is_empty() {
entry.requested_partitions = br_event.requested_partitions.iter()
.map(|p| p.str.clone())
.collect();
}
}
}
let mut builds: Vec<_> = build_summaries.into_values().collect();
builds.sort_by(|a, b| b.created_at.cmp(&a.created_at)); // Most recent first
// Apply status filter if provided
if let Some(status_filter) = &request.status_filter {
if let Ok(filter_status) = status_filter.parse::<i32>() {
if let Ok(status) = BuildRequestStatusCode::try_from(filter_status) {
builds.retain(|b| b.status.code == status as i32);
}
}
}
let total_count = builds.len() as u32;
let offset = request.offset.unwrap_or(0) as usize;
let limit = request.limit.unwrap_or(50) as usize;
let paginated_builds = builds.into_iter()
.skip(offset)
.take(limit)
.map(|summary| BuildSummary {
build_request_id: summary.build_request_id,
status: Some(summary.status),
requested_partitions: summary.requested_partitions.into_iter()
.map(|s| PartitionRef { str: s })
.collect(),
total_jobs: 0, // TODO: Implement
completed_jobs: 0, // TODO: Implement
failed_jobs: 0, // TODO: Implement
cancelled_jobs: 0, // TODO: Implement
requested_at: summary.created_at,
started_at: None, // TODO: Implement
completed_at: None, // TODO: Implement
duration_ms: None, // TODO: Implement
cancelled: false, // TODO: Implement
comment: None,
})
.collect();
Ok(BuildsListResponse {
builds: paginated_builds,
total_count,
has_more: (offset + limit) < total_count as usize,
})
}
/// Get activity summary for dashboard
pub async fn get_activity_summary(&self) -> Result<ActivitySummary> {
let builds_response = self.list_build_requests(BuildsListRequest {
limit: Some(5),
offset: Some(0),
status_filter: None,
}).await?;
let active_builds_count = builds_response.builds.iter()
.filter(|b| matches!(
BuildRequestStatusCode::try_from(b.status.clone().unwrap().code).unwrap_or(BuildRequestStatusCode::BuildRequestUnknown),
BuildRequestStatusCode::BuildRequestReceived |
BuildRequestStatusCode::BuildRequestPlanning |
BuildRequestStatusCode::BuildRequestExecuting |
BuildRequestStatusCode::BuildRequestAnalysisCompleted
))
.count() as u32;
let recent_builds = builds_response.builds.into_iter()
.map(|b| BuildRequestSummary {
build_request_id: b.build_request_id,
status: b.status.unwrap_or(BuildRequestStatusCode::BuildRequestUnknown.status()),
requested_partitions: b.requested_partitions.into_iter().map(|p| p.str).collect(),
created_at: b.requested_at,
updated_at: b.completed_at.unwrap_or(b.requested_at),
})
.collect();
// For partitions, we'd need a separate implementation
let recent_partitions = Vec::new(); // TODO: Implement partition listing
Ok(ActivitySummary {
active_builds_count,
recent_builds,
recent_partitions,
total_partitions_count: 0, // TODO: Implement
})
}
/// Helper to aggregate partition status from events
fn aggregate_partition_status(&self, events: &[BuildEvent]) -> Result<Option<(PartitionStatus, i64)>> {
let mut latest_status = None;
let mut latest_timestamp = 0i64;
// Look for the most recent partition event for this partition
for event in events {
if let Some(crate::build_event::EventType::PartitionEvent(p_event)) = &event.event_type {
if event.timestamp >= latest_timestamp {
if let Ok(status) = PartitionStatus::try_from(p_event.status_code) {
latest_status = Some(status);
latest_timestamp = event.timestamp;
}
}
}
}
Ok(latest_status.map(|status| (status, latest_timestamp)))
}
/// Get build request ID that created an available partition
pub async fn get_build_request_for_available_partition(&self, partition_ref: &str) -> Result<Option<String>> {
let filter = EventFilter {
partition_refs: vec![partition_ref.to_string()],
partition_patterns: vec![],
job_labels: vec![],
job_run_ids: vec![],
build_request_ids: vec![],
};
let events = self.storage.list_events(0, filter).await?;
// Find the most recent PARTITION_AVAILABLE event
let mut latest_available_build_id = None;
let mut latest_timestamp = 0i64;
for event in events.events {
if let Some(crate::build_event::EventType::PartitionEvent(p_event)) = &event.event_type {
if let Some(partition_event_ref) = &p_event.partition_ref {
if partition_event_ref.str == partition_ref {
if let Ok(status) = PartitionStatus::try_from(p_event.status_code) {
if status == PartitionStatus::PartitionAvailable && event.timestamp >= latest_timestamp {
latest_available_build_id = event.build_request_id.clone();
latest_timestamp = event.timestamp;
}
}
}
}
}
}
Ok(latest_available_build_id)
}
/// Append an event to storage
pub async fn append_event(&self, event: BuildEvent) -> Result<i64> {
self.storage.append_event(event).await
}
/// Get all events for a specific partition
pub async fn get_partition_events(&self, partition_ref: &str, _limit: Option<u32>) -> Result<Vec<BuildEvent>> {
let filter = EventFilter {
partition_refs: vec![partition_ref.to_string()],
partition_patterns: vec![],
job_labels: vec![],
job_run_ids: vec![],
build_request_ids: vec![],
};
let events = self.storage.list_events(0, filter).await?;
Ok(events.events)
}
/// Execute a raw SQL query (for backwards compatibility)
pub async fn execute_query(&self, _query: &str) -> Result<QueryResult> {
// TODO: Implement SQL query execution if needed
// For now, return empty result to avoid compilation errors
Ok(QueryResult {
columns: vec![],
rows: vec![],
})
}
/// Get all events in a timestamp range
pub async fn get_events_in_range(&self, _start: i64, _end: i64) -> Result<Vec<BuildEvent>> {
// TODO: Implement range filtering
// For now, get all events
let filter = EventFilter {
partition_refs: vec![],
partition_patterns: vec![],
job_labels: vec![],
job_run_ids: vec![],
build_request_ids: vec![],
};
let events = self.storage.list_events(0, filter).await?;
Ok(events.events)
}
/// Get all events for a specific job run
pub async fn get_job_run_events(&self, job_run_id: &str) -> Result<Vec<BuildEvent>> {
let filter = EventFilter {
partition_refs: vec![],
partition_patterns: vec![],
job_labels: vec![],
job_run_ids: vec![job_run_id.to_string()],
build_request_ids: vec![],
};
let events = self.storage.list_events(0, filter).await?;
Ok(events.events)
}
/// Get all events for a specific build request
pub async fn get_build_request_events(&self, build_request_id: &str, _limit: Option<u32>) -> Result<Vec<BuildEvent>> {
let filter = EventFilter {
partition_refs: vec![],
partition_patterns: vec![],
job_labels: vec![],
job_run_ids: vec![],
build_request_ids: vec![build_request_id.to_string()],
};
let events = self.storage.list_events(0, filter).await?;
Ok(events.events)
}
}
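
A consumer of this aggregation layer only needs a `BELQueryEngine` built from some storage backend. The sketch below is illustrative; it assumes the generated `BuildsListRequest` type sits at the crate root, as other usages in this diff suggest, and that `engine` was constructed elsewhere (for example via `create_bel_query_engine` or the mock helpers):

```rust
// Sketch only; not part of the source.
use databuild::BuildsListRequest;
use databuild::event_log::{query_engine::BELQueryEngine, Result};

async fn print_recent_builds(engine: &BELQueryEngine) -> Result<()> {
    let page = engine
        .list_build_requests(BuildsListRequest {
            limit: Some(10),
            offset: Some(0),
            status_filter: None,
        })
        .await?;
    for build in &page.builds {
        println!(
            "{} -> {} requested partition(s)",
            build.build_request_id,
            build.requested_partitions.len()
        );
    }
    Ok(())
}
```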

@ -1,154 +0,0 @@
use super::*;
use super::storage::BELStorage;
use async_trait::async_trait;
use rusqlite::{params, Connection};
use std::path::Path;
use std::sync::{Arc, Mutex};
pub struct SqliteBELStorage {
connection: Arc<Mutex<Connection>>,
}
impl SqliteBELStorage {
pub fn new(path: &str) -> Result<Self> {
// Create parent directory if it doesn't exist
if let Some(parent) = Path::new(path).parent() {
std::fs::create_dir_all(parent)
.map_err(|e| BuildEventLogError::ConnectionError(
format!("Failed to create directory {}: {}", parent.display(), e)
))?;
}
let conn = Connection::open(path)
.map_err(|e| BuildEventLogError::ConnectionError(e.to_string()))?;
Ok(Self {
connection: Arc::new(Mutex::new(conn)),
})
}
}
#[async_trait]
impl BELStorage for SqliteBELStorage {
async fn append_event(&self, event: BuildEvent) -> Result<i64> {
let serialized = serde_json::to_string(&event)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
let conn = self.connection.lock().unwrap();
let _row_id = conn.execute(
"INSERT INTO build_events (event_data) VALUES (?)",
params![serialized],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
Ok(conn.last_insert_rowid())
}
async fn list_events(&self, since_idx: i64, filter: EventFilter) -> Result<EventPage> {
let conn = self.connection.lock().unwrap();
// For simplicity in the initial implementation, we'll do basic filtering
// More sophisticated JSON path filtering can be added later if needed
let mut query = "SELECT rowid, event_data FROM build_events WHERE rowid > ?".to_string();
let mut params_vec = vec![since_idx.to_string()];
// Add build request ID filter if provided
if !filter.build_request_ids.is_empty() {
query.push_str(" AND (");
for (i, build_id) in filter.build_request_ids.iter().enumerate() {
if i > 0 { query.push_str(" OR "); }
query.push_str("JSON_EXTRACT(event_data, '$.build_request_id') = ?");
params_vec.push(build_id.clone());
}
query.push_str(")");
}
// Add ordering and pagination
query.push_str(" ORDER BY rowid ASC LIMIT 1000");
let mut stmt = conn.prepare(&query)
.map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
// Convert params to rusqlite params
let param_refs: Vec<&dyn rusqlite::ToSql> = params_vec.iter()
.map(|p| p as &dyn rusqlite::ToSql)
.collect();
let rows = stmt.query_map(&param_refs[..], |row| {
let rowid: i64 = row.get(0)?;
let event_data: String = row.get(1)?;
Ok((rowid, event_data))
}).map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let mut events = Vec::new();
let mut max_idx = since_idx;
for row in rows {
let (rowid, event_data) = row.map_err(|e| BuildEventLogError::QueryError(e.to_string()))?;
let event: BuildEvent = serde_json::from_str(&event_data)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
// Apply additional filtering in memory for now
let mut include_event = true;
if !filter.partition_refs.is_empty() {
include_event = false;
if let Some(event_type) = &event.event_type {
if let crate::build_event::EventType::PartitionEvent(pe) = event_type {
if let Some(partition_ref) = &pe.partition_ref {
if filter.partition_refs.contains(&partition_ref.str) {
include_event = true;
}
}
}
}
}
if !filter.job_run_ids.is_empty() && include_event {
include_event = false;
if let Some(event_type) = &event.event_type {
if let crate::build_event::EventType::JobEvent(je) = event_type {
if filter.job_run_ids.contains(&je.job_run_id) {
include_event = true;
}
}
}
}
if include_event {
events.push(event);
max_idx = rowid;
}
}
        // Heuristic: a full SQL page suggests more rows exist; note that in-memory
        // filtering can make this undercount when it drops rows from a full page.
        let has_more = events.len() >= 1000;
Ok(EventPage {
events,
next_idx: max_idx,
has_more,
})
}
async fn initialize(&self) -> Result<()> {
let conn = self.connection.lock().unwrap();
conn.execute(
"CREATE TABLE IF NOT EXISTS build_events (
rowid INTEGER PRIMARY KEY AUTOINCREMENT,
event_data TEXT NOT NULL
)",
[],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
// Create index for efficient JSON queries
conn.execute(
"CREATE INDEX IF NOT EXISTS idx_build_request_id ON build_events(
JSON_EXTRACT(event_data, '$.build_request_id')
)",
[],
).map_err(|e| BuildEventLogError::DatabaseError(e.to_string()))?;
Ok(())
}
}
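
As a usage sketch of this backend through the `BELStorage` trait: the path and build id below are placeholders, and `EventFilter` is assumed to be the generated type at the crate root.

```rust
// Hedged sketch; not part of the source.
use databuild::EventFilter;
use databuild::event_log::{sqlite_storage::SqliteBELStorage, storage::BELStorage, Result};

async fn scan_build(build_id: &str) -> Result<()> {
    let storage = SqliteBELStorage::new("/tmp/databuild/bel.db")?;
    storage.initialize().await?; // creates the table and JSON index if missing
    let filter = EventFilter {
        partition_refs: vec![],
        partition_patterns: vec![],
        job_labels: vec![],
        job_run_ids: vec![],
        build_request_ids: vec![build_id.to_string()],
    };
    // Pages are capped at 1000 rows; follow `next_idx`/`has_more` to keep scanning.
    let page = storage.list_events(0, filter).await?;
    for event in &page.events {
        println!("{} @ {}", event.event_id, event.timestamp);
    }
    Ok(())
}
```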

@ -1,75 +0,0 @@
use crate::*;
use async_trait::async_trait;
use super::Result;
/// Simple stdout storage backend for debugging
pub struct StdoutBELStorage;
impl StdoutBELStorage {
pub fn new() -> Self {
Self
}
}
#[async_trait]
impl BELStorage for StdoutBELStorage {
async fn append_event(&self, event: BuildEvent) -> Result<i64> {
let json = serde_json::to_string(&event)
.map_err(|e| BuildEventLogError::SerializationError(e.to_string()))?;
println!("BUILD_EVENT: {}", json);
Ok(0) // Return dummy index for stdout
}
async fn list_events(&self, _since_idx: i64, _filter: EventFilter) -> Result<EventPage> {
// Stdout implementation doesn't support querying
Err(BuildEventLogError::QueryError(
"Stdout storage backend doesn't support querying".to_string()
))
}
async fn initialize(&self) -> Result<()> {
Ok(()) // Nothing to initialize for stdout
}
}
/// Minimal append-only interface optimized for sequential scanning
#[async_trait]
pub trait BELStorage: Send + Sync {
/// Append a single event, returns the sequential index
async fn append_event(&self, event: BuildEvent) -> Result<i64>;
/// List events with filtering, starting from a given index
async fn list_events(&self, since_idx: i64, filter: EventFilter) -> Result<EventPage>;
/// Initialize storage backend (create tables, etc.)
async fn initialize(&self) -> Result<()>;
}
/// Factory function to create storage backends from URI
pub async fn create_bel_storage(uri: &str) -> Result<Box<dyn BELStorage>> {
if uri == "stdout" {
Ok(Box::new(StdoutBELStorage::new()))
} else if uri.starts_with("sqlite://") {
let path = &uri[9..]; // Remove "sqlite://" prefix
let storage = crate::event_log::sqlite_storage::SqliteBELStorage::new(path)?;
storage.initialize().await?;
Ok(Box::new(storage))
} else if uri.starts_with("postgres://") {
// TODO: Implement PostgresBELStorage
Err(BuildEventLogError::ConnectionError(
"PostgreSQL storage backend not yet implemented".to_string()
))
} else {
Err(BuildEventLogError::ConnectionError(
format!("Unsupported build event log URI: {}", uri)
))
}
}
/// Factory function to create query engine from URI
pub async fn create_bel_query_engine(uri: &str) -> Result<std::sync::Arc<crate::event_log::query_engine::BELQueryEngine>> {
let storage = create_bel_storage(uri).await?;
let storage_arc = std::sync::Arc::from(storage);
Ok(std::sync::Arc::new(crate::event_log::query_engine::BELQueryEngine::new(storage_arc)))
}

@ -1,460 +0,0 @@
use crate::*;
use crate::event_log::{BuildEventLogError, Result, create_build_event, current_timestamp_nanos, generate_event_id, query_engine::BELQueryEngine};
use std::sync::Arc;
use log::debug;
/// Common interface for writing events to the build event log with validation
pub struct EventWriter {
query_engine: Arc<BELQueryEngine>,
}
impl EventWriter {
/// Create a new EventWriter with the specified query engine
pub fn new(query_engine: Arc<BELQueryEngine>) -> Self {
Self { query_engine }
}
/// Append an event directly to the event log
pub async fn append_event(&self, event: BuildEvent) -> Result<()> {
self.query_engine.append_event(event).await.map(|_| ())
}
/// Get access to the underlying query engine for direct operations
pub fn query_engine(&self) -> &BELQueryEngine {
self.query_engine.as_ref()
}
/// Request a new build for the specified partitions
pub async fn request_build(
&self,
build_request_id: String,
requested_partitions: Vec<PartitionRef>,
) -> Result<()> {
debug!("Writing build request event for build: {}", build_request_id);
let event = create_build_event(
build_request_id,
build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestReceived.status()),
requested_partitions,
message: "Build request received".to_string(),
comment: None,
want_id: None,
}),
);
self.query_engine.append_event(event).await.map(|_| ())
}
/// Update build request status
pub async fn update_build_status(
&self,
build_request_id: String,
status: BuildRequestStatus,
message: String,
) -> Result<()> {
debug!("Updating build status for {}: {:?}", build_request_id, status);
let event = create_build_event(
build_request_id,
build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(status),
requested_partitions: vec![],
message,
comment: None,
want_id: None,
}),
);
self.query_engine.append_event(event).await.map(|_| ())
}
/// Update build request status with partition list
pub async fn update_build_status_with_partitions(
&self,
build_request_id: String,
status: BuildRequestStatus,
requested_partitions: Vec<PartitionRef>,
message: String,
) -> Result<()> {
debug!("Updating build status for {}: {:?}", build_request_id, status);
let event = create_build_event(
build_request_id,
build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(status),
requested_partitions,
message,
comment: None,
want_id: None,
}),
);
self.query_engine.append_event(event).await.map(|_| ())
}
/// Update partition status
pub async fn update_partition_status(
&self,
build_request_id: String,
partition_ref: PartitionRef,
status: PartitionStatus,
message: String,
job_run_id: Option<String>,
) -> Result<()> {
debug!("Updating partition status for {}: {:?}", partition_ref.str, status);
let event = BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id),
event_type: Some(build_event::EventType::PartitionEvent(PartitionEvent {
partition_ref: Some(partition_ref),
status_code: status as i32,
status_name: status.to_display_string(),
message,
job_run_id: job_run_id.unwrap_or_default(),
})),
};
self.query_engine.append_event(event).await.map(|_| ())
}
/// Invalidate a partition with a reason
pub async fn invalidate_partition(
&self,
build_request_id: String,
partition_ref: PartitionRef,
reason: String,
) -> Result<()> {
// First validate that the partition exists by checking its current status
let current_status = self.query_engine.get_latest_partition_status(&partition_ref.str).await?;
if current_status.is_none() {
return Err(BuildEventLogError::QueryError(
format!("Cannot invalidate non-existent partition: {}", partition_ref.str)
));
}
let event = BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id),
event_type: Some(build_event::EventType::PartitionInvalidationEvent(
PartitionInvalidationEvent {
partition_ref: Some(partition_ref),
reason,
}
)),
};
self.query_engine.append_event(event).await.map(|_| ())
}
/// Schedule a job for execution
pub async fn schedule_job(
&self,
build_request_id: String,
job_run_id: String,
job_label: JobLabel,
target_partitions: Vec<PartitionRef>,
config: JobConfig,
) -> Result<()> {
debug!("Scheduling job {} for partitions: {:?}", job_label.label, target_partitions);
let event = BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id),
event_type: Some(build_event::EventType::JobEvent(JobEvent {
job_run_id,
job_label: Some(job_label),
target_partitions,
status_code: JobStatus::JobScheduled as i32,
status_name: JobStatus::JobScheduled.to_display_string(),
message: "Job scheduled for execution".to_string(),
config: Some(config),
manifests: vec![],
})),
};
self.query_engine.append_event(event).await.map(|_| ())
}
/// Update job status
pub async fn update_job_status(
&self,
build_request_id: String,
job_run_id: String,
job_label: JobLabel,
target_partitions: Vec<PartitionRef>,
status: JobStatus,
message: String,
manifests: Vec<PartitionManifest>,
) -> Result<()> {
debug!("Updating job {} status to {:?}", job_run_id, status);
let event = BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id),
event_type: Some(build_event::EventType::JobEvent(JobEvent {
job_run_id,
job_label: Some(job_label),
target_partitions,
status_code: status as i32,
status_name: status.to_display_string(),
message,
config: None,
manifests,
})),
};
self.query_engine.append_event(event).await.map(|_| ())
}
/// Cancel a task (job run) with a reason
pub async fn cancel_task(
&self,
build_request_id: String,
job_run_id: String,
reason: String,
) -> Result<()> {
// Validate that the job run exists and is in a cancellable state
let job_events = self.query_engine.get_job_run_events(&job_run_id).await?;
if job_events.is_empty() {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel non-existent job run: {}", job_run_id)
));
}
// Find the latest job status
let latest_status = job_events.iter()
.rev()
.find_map(|e| match &e.event_type {
Some(build_event::EventType::JobEvent(job)) => Some(job.status_code),
_ => None,
});
match latest_status {
Some(status) if status == JobStatus::JobCompleted as i32 => {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel completed job run: {}", job_run_id)
));
}
Some(status) if status == JobStatus::JobFailed as i32 => {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel failed job run: {}", job_run_id)
));
}
Some(status) if status == JobStatus::JobCancelled as i32 => {
return Err(BuildEventLogError::QueryError(
format!("Job run already cancelled: {}", job_run_id)
));
}
_ => {}
}
let event = BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id),
event_type: Some(build_event::EventType::JobRunCancelEvent(JobRunCancelEvent {
job_run_id,
reason,
})),
};
self.query_engine.append_event(event).await.map(|_| ())
}
/// Cancel a build request with a reason
pub async fn cancel_build(
&self,
build_request_id: String,
reason: String,
) -> Result<()> {
// Validate that the build exists and is in a cancellable state
let build_events = self.query_engine.get_build_request_events(&build_request_id, None).await?;
if build_events.is_empty() {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel non-existent build: {}", build_request_id)
));
}
// Find the latest build status
let latest_status = build_events.iter()
.rev()
.find_map(|e| match &e.event_type {
Some(build_event::EventType::BuildRequestEvent(br)) => Some(br.clone().status.unwrap().code),
_ => None,
});
match latest_status {
Some(status) if status == BuildRequestStatusCode::BuildRequestCompleted as i32 => {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel completed build: {}", build_request_id)
));
}
Some(status) if status == BuildRequestStatusCode::BuildRequestFailed as i32 => {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel failed build: {}", build_request_id)
));
}
Some(status) if status == BuildRequestStatusCode::BuildRequestCancelled as i32 => {
return Err(BuildEventLogError::QueryError(
format!("Build already cancelled: {}", build_request_id)
));
}
_ => {}
}
let event = BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id.clone()),
event_type: Some(build_event::EventType::BuildCancelEvent(BuildCancelEvent {
reason,
})),
};
self.query_engine.append_event(event).await.map(|_| ())?;
// Also emit a build request status update
self.update_build_status(
build_request_id,
BuildRequestStatusCode::BuildRequestCancelled.status(),
"Build cancelled by user".to_string(),
).await
}
/// Record a delegation event when a partition build is delegated to another build
pub async fn record_delegation(
&self,
build_request_id: String,
partition_ref: PartitionRef,
delegated_to_build_request_id: String,
message: String,
) -> Result<()> {
debug!("Recording delegation of {} to build {}", partition_ref.str, delegated_to_build_request_id);
let event = create_build_event(
build_request_id,
build_event::EventType::DelegationEvent(DelegationEvent {
partition_ref: Some(partition_ref),
delegated_to_build_request_id,
message,
}),
);
self.query_engine.append_event(event).await.map(|_| ())
}
/// Record the analyzed job graph
pub async fn record_job_graph(
&self,
build_request_id: String,
job_graph: JobGraph,
message: String,
) -> Result<()> {
debug!("Recording job graph for build: {}", build_request_id);
let event = BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id),
event_type: Some(build_event::EventType::JobGraphEvent(JobGraphEvent {
job_graph: Some(job_graph),
message,
})),
};
self.query_engine.append_event(event).await.map(|_| ())
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::event_log::mock::create_mock_bel_query_engine;
#[tokio::test]
async fn test_event_writer_build_lifecycle() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let writer = EventWriter::new(query_engine);
let build_id = "test-build-123".to_string();
let partitions = vec![PartitionRef { str: "test/partition".to_string() }];
// Test build request
writer.request_build(build_id.clone(), partitions.clone()).await.unwrap();
// Test status updates
writer.update_build_status(
build_id.clone(),
BuildRequestStatusCode::BuildRequestPlanning.status(),
"Starting planning".to_string(),
).await.unwrap();
writer.update_build_status(
build_id.clone(),
BuildRequestStatusCode::BuildRequestExecuting.status(),
"Starting execution".to_string(),
).await.unwrap();
writer.update_build_status(
build_id.clone(),
BuildRequestStatusCode::BuildRequestCompleted.status(),
"Build completed successfully".to_string(),
).await.unwrap();
}
#[tokio::test]
async fn test_event_writer_partition_and_job() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let writer = EventWriter::new(query_engine);
let build_id = "test-build-456".to_string();
let partition = PartitionRef { str: "data/users".to_string() };
let job_run_id = "job-run-789".to_string();
let job_label = JobLabel { label: "//:test_job".to_string() };
// Test partition status update
writer.update_partition_status(
build_id.clone(),
partition.clone(),
PartitionStatus::PartitionBuilding,
"Building partition".to_string(),
Some(job_run_id.clone()),
).await.unwrap();
// Test job scheduling
let config = JobConfig {
outputs: vec![partition.clone()],
inputs: vec![],
args: vec!["test".to_string()],
env: std::collections::HashMap::new(),
};
writer.schedule_job(
build_id.clone(),
job_run_id.clone(),
job_label.clone(),
vec![partition.clone()],
config,
).await.unwrap();
// Test job status update
writer.update_job_status(
build_id.clone(),
job_run_id,
job_label,
vec![partition],
JobStatus::JobCompleted,
"Job completed successfully".to_string(),
vec![],
).await.unwrap();
}
}

@ -1,143 +0,0 @@
#[cfg(test)]
mod format_consistency_tests {
use super::*;
use crate::*;
use crate::repositories::partitions::PartitionsRepository;
use crate::event_log::mock::{create_mock_bel_query_engine_with_events, test_events};
use std::sync::Arc;
#[tokio::test]
async fn test_partitions_list_json_format_consistency() {
// Create test data
let build_id = "test-build-123".to_string();
let partition1 = PartitionRef { str: "data/users".to_string() };
let partition2 = PartitionRef { str: "data/orders".to_string() };
let events = vec![
test_events::build_request_received(Some(build_id.clone()), vec![partition1.clone(), partition2.clone()]),
test_events::partition_status(Some(build_id.clone()), partition1.clone(), PartitionStatus::PartitionBuilding, None),
test_events::partition_status(Some(build_id.clone()), partition1.clone(), PartitionStatus::PartitionAvailable, None),
test_events::partition_status(Some(build_id.clone()), partition2.clone(), PartitionStatus::PartitionBuilding, None),
test_events::partition_status(Some(build_id.clone()), partition2.clone(), PartitionStatus::PartitionFailed, None),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repository = PartitionsRepository::new(query_engine);
// Test the new unified protobuf format
let request = PartitionsListRequest {
limit: Some(10),
offset: None,
status_filter: None,
};
let response = repository.list_protobuf(request).await.unwrap();
// Serialize to JSON and verify structure
let json_value = serde_json::to_value(&response).unwrap();
// Verify top-level structure matches expected protobuf schema
assert!(json_value.get("partitions").is_some());
assert!(json_value.get("total_count").is_some());
assert!(json_value.get("has_more").is_some());
let partitions = json_value["partitions"].as_array().unwrap();
assert_eq!(partitions.len(), 2);
// Verify each partition has dual status fields
for partition in partitions {
assert!(partition.get("partition_ref").is_some());
assert!(partition.get("status_code").is_some(), "Missing status_code field");
assert!(partition.get("status_name").is_some(), "Missing status_name field");
assert!(partition.get("last_updated").is_some());
assert!(partition.get("builds_count").is_some());
assert!(partition.get("invalidation_count").is_some());
// Verify status fields are consistent
let status_code = partition["status_code"].as_i64().unwrap();
let status_name = partition["status_name"].as_str().unwrap();
            // Map status codes to expected display names and verify both fields agree
            let expected_name = match status_code {
                1 => "requested",
                2 => "analyzed",
                3 => "building",
                4 => "available",
                5 => "failed",
                6 => "delegated",
                _ => "unknown",
            };
            assert_eq!(status_name, expected_name, "status_name should match status_code");
            // Spot-check the two statuses exercised by this test's fixture data
            if status_name == "available" {
                assert_eq!(status_code, 4, "Available status should have code 4");
            } else if status_name == "failed" {
                assert_eq!(status_code, 5, "Failed status should have code 5");
            }
}
// Verify JSON serialization produces expected field names (snake_case for JSON)
let json_str = serde_json::to_string_pretty(&response).unwrap();
assert!(json_str.contains("\"partitions\""));
assert!(json_str.contains("\"total_count\""));
assert!(json_str.contains("\"has_more\""));
assert!(json_str.contains("\"partition_ref\""));
assert!(json_str.contains("\"status_code\""));
assert!(json_str.contains("\"status_name\""));
assert!(json_str.contains("\"last_updated\""));
assert!(json_str.contains("\"builds_count\""));
assert!(json_str.contains("\"invalidation_count\""));
println!("✅ Partitions list JSON format test passed");
println!("Sample JSON output:\n{}", json_str);
}
#[tokio::test]
async fn test_status_conversion_utilities() {
use crate::status_utils::*;
// Test PartitionStatus conversions
let status = PartitionStatus::PartitionAvailable;
assert_eq!(status.to_display_string(), "available");
assert_eq!(PartitionStatus::from_display_string("available"), Some(status));
// Test JobStatus conversions
let job_status = JobStatus::JobCompleted;
assert_eq!(job_status.to_display_string(), "completed");
assert_eq!(JobStatus::from_display_string("completed"), Some(job_status));
// Test BuildRequestStatus conversions
let build_status = BuildRequestStatusCode::BuildRequestCompleted.status();
assert_eq!(build_status.name, "completed");
// Test invalid conversions
assert_eq!(PartitionStatus::from_display_string("invalid"), None);
println!("✅ Status conversion utilities test passed");
}
#[test]
fn test_protobuf_response_helper_functions() {
use crate::status_utils::list_response_helpers::*;
// Test PartitionSummary creation
let summary = create_partition_summary(
PartitionRef { str: "test/partition".to_string() },
PartitionStatus::PartitionAvailable,
1234567890,
5,
2,
Some("build-123".to_string()),
);
assert_eq!(summary.partition_ref, Some(PartitionRef { str: "test/partition".to_string() }));
assert_eq!(summary.status_code, 4); // PartitionAvailable = 4
assert_eq!(summary.status_name, "available");
assert_eq!(summary.last_updated, 1234567890);
assert_eq!(summary.builds_count, 5);
assert_eq!(summary.invalidation_count, 2);
assert_eq!(summary.last_successful_build, Some("build-123".to_string()));
println!("✅ Protobuf response helper functions test passed");
}
}

@ -1,43 +0,0 @@
load("@rules_rust//rust:defs.bzl", "rust_binary", "rust_library")
exports_files([
"rust_analyze_wrapper.sh.tpl",
"rust_execute_wrapper.sh.tpl",
])
rust_binary(
name = "execute",
srcs = ["execute.rs"],
edition = "2021",
visibility = ["//visibility:public"],
deps = [
"//databuild",
"@crates//:clap",
"@crates//:crossbeam-channel",
"@crates//:log",
"@crates//:serde",
"@crates//:serde_json",
"@crates//:simple_logger",
"@crates//:tokio",
"@crates//:uuid",
],
)
rust_binary(
name = "analyze",
srcs = ["analyze.rs"],
edition = "2021",
visibility = ["//visibility:public"],
deps = [
"//databuild",
"@crates//:clap",
"@crates//:crossbeam-channel",
"@crates//:log",
"@crates//:num_cpus",
"@crates//:serde",
"@crates//:serde_json",
"@crates//:simple_logger",
"@crates//:tokio",
"@crates//:uuid",
],
)

@ -1,10 +0,0 @@
# DataBuild Graph
## Entrypoints
- `graph.build` - Build the requested partitions.
- `graph.analyze` - Calculate the `JobGraph` that would produce the requested partitions.
- `graph.mermaid` - Calculate a [mermaid](https://mermaid.js.org/syntax/flowchart.html) diagram describing the `JobGraph`.
- `graph.serve` - Run the databuild server for this graph.
- `graph.image` / `graph.load` - Build a deployable graph artifact and wrap it in a container. `load` registers the container locally.

@ -1,652 +0,0 @@
use std::collections::{HashMap, HashSet};
use std::env;
use std::process::{Command, exit};
use std::sync::{Arc, Mutex};
use std::thread;
use log::{info, error};
use simple_logger::SimpleLogger;
use clap::{Arg, Command as ClapCommand};
use uuid::Uuid;
use databuild::*;
use databuild::event_log::{create_bel_query_engine, create_build_event};
use databuild::mermaid_utils::generate_mermaid_diagram;
// Configure a job to produce the desired outputs
fn configure(job_label: &str, output_refs: &[String]) -> Result<Vec<Task>, String> {
let candidate_jobs_str = env::var("DATABUILD_CANDIDATE_JOBS_CFG")
.map_err(|e| format!("Failed to get DATABUILD_CANDIDATE_JOBS_CFG: {}", e))?;
let job_path_map: HashMap<String, String> = serde_json::from_str(&candidate_jobs_str)
.map_err(|e| format!("Failed to parse DATABUILD_CANDIDATE_JOBS_CFG: {}", e))?;
// Look up the executable path for this job
let exec_path = job_path_map.get(job_label)
.ok_or_else(|| format!("Job {} is not a candidate job", job_label))?;
// Check if executable exists
if !std::path::Path::new(exec_path).exists() {
return Err(format!("Executable not found at path: {}", exec_path));
}
info!("Executing job configuration: {} {:?}", exec_path, output_refs);
// Execute the job configuration command
let output = Command::new(exec_path)
.args(output_refs)
.output()
.map_err(|e| format!("Failed to execute job config: {}", e))?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
error!("Job configuration failed: {}", stderr);
return Err(format!("Failed to run job config: {}", stderr));
}
info!("Job configuration succeeded for {}", job_label);
// Parse the job configurations
let stdout = String::from_utf8_lossy(&output.stdout);
let job_configure_response: JobConfigureResponse = serde_json::from_str(&stdout)
.map_err(|e| {
error!("Error parsing job configs for {}: {}. `{}`", job_label, e, stdout);
format!("Failed to parse job configs: {}", e)
})?;
let job_configs = job_configure_response.configs;
// Create tasks
let tasks: Vec<Task> = job_configs.into_iter()
.map(|cfg| Task {
job: Some(JobLabel { label: job_label.to_string() }),
config: Some(cfg),
})
.collect();
info!("Created {} tasks for job {}", tasks.len(), job_label);
Ok(tasks)
}
// Resolve produces a mapping of required job refs to the partitions it produces
fn resolve(output_refs: &[String]) -> Result<HashMap<String, Vec<String>>, String> {
let lookup_path = env::var("DATABUILD_JOB_LOOKUP_PATH")
.map_err(|e| format!("Failed to get DATABUILD_JOB_LOOKUP_PATH: {}", e))?;
// Run the job lookup
info!("Executing job lookup: {} {:?}", lookup_path, output_refs);
let output = Command::new(&lookup_path)
.args(output_refs)
.output()
.map_err(|e| format!("Failed to execute job lookup: {}", e))?;
if !output.status.success() {
error!("Job lookup failed: {}", output.status);
let stderr = String::from_utf8_lossy(&output.stderr);
error!("stderr: {}", stderr);
let stdout = String::from_utf8_lossy(&output.stdout);
error!("stdout: {}", stdout);
return Err(format!("Failed to run job lookup: {}", stderr));
}
info!("Job lookup succeeded for {} output refs", output_refs.len());
// Parse the result
let stdout = String::from_utf8_lossy(&output.stdout);
let result: HashMap<String, Vec<String>> = serde_json::from_str(&stdout)
.map_err(|e| {
error!("Error parsing job lookup result: {}", e);
format!("Failed to parse job lookup result: {}", e)
})?;
info!("Job lookup found {} job mappings", result.len());
for (job, refs) in &result {
info!(" Job {} produces {} refs", job, refs.len());
}
Ok(result)
}
// Configure multiple jobs in parallel
fn configure_parallel(job_refs: HashMap<String, Vec<String>>, num_workers: usize) -> Result<Vec<Task>, String> {
// Create a channel for jobs
let (job_sender, job_receiver) = crossbeam_channel::unbounded();
// Fill the jobs channel
for (job_label, produced_refs) in job_refs {
job_sender.send((job_label, produced_refs)).unwrap();
}
drop(job_sender); // Close the channel
// Create a channel for results
let (task_sender, task_receiver) = crossbeam_channel::unbounded();
let error = Arc::new(Mutex::new(None));
// Spawn worker threads
let mut handles = vec![];
for _ in 0..num_workers {
let job_receiver = job_receiver.clone();
let task_sender = task_sender.clone();
let error = Arc::clone(&error);
let handle = thread::spawn(move || {
for (job_label, produced_refs) in job_receiver {
// Check if an error has already occurred
if error.lock().unwrap().is_some() {
return;
}
match configure(&job_label, &produced_refs) {
Ok(tasks) => {
task_sender.send(tasks).unwrap();
}
Err(e) => {
let mut error_guard = error.lock().unwrap();
if error_guard.is_none() {
*error_guard = Some(e);
}
return;
}
}
}
});
handles.push(handle);
}
// Close the task sender
drop(task_sender);
// Wait for all workers to finish
for handle in handles {
handle.join().unwrap();
}
// Check for errors
let error_guard = error.lock().unwrap();
if let Some(e) = &*error_guard {
return Err(e.clone());
}
// Collect results
let mut all_tasks = Vec::new();
while let Ok(tasks) = task_receiver.try_recv() {
all_tasks.extend(tasks);
}
Ok(all_tasks)
}
// Simple staleness check - all requested partitions need jobs created
// Delegation optimization happens in execution phase
async fn check_partition_staleness(
partition_refs: &[String],
_query_engine: &std::sync::Arc<databuild::event_log::query_engine::BELQueryEngine>,
_build_request_id: &str
) -> Result<(Vec<String>, Vec<String>), String> {
// Analysis phase creates jobs for all requested partitions
// Execution phase will handle delegation optimization
let stale_partitions = partition_refs.to_vec();
let delegated_partitions = Vec::new();
Ok((stale_partitions, delegated_partitions))
}
// Plan creates a job graph for given output references
async fn plan(
output_refs: &[String],
query_engine: Option<std::sync::Arc<databuild::event_log::query_engine::BELQueryEngine>>,
build_request_id: &str
) -> Result<JobGraph, String> {
info!("Starting planning for {} output refs: {:?}", output_refs.len(), output_refs);
// Log build request received event
if let Some(ref query_engine_ref) = query_engine {
let event = create_build_event(
build_request_id.to_string(),
crate::build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestReceived.status()),
requested_partitions: output_refs.iter().map(|s| PartitionRef { str: s.clone() }).collect(),
message: "Analysis started".to_string(),
comment: None,
want_id: None,
})
);
if let Err(e) = query_engine_ref.append_event(event).await {
error!("Failed to log build request event: {}", e);
}
}
// Check for partition staleness and delegation opportunities
let (stale_refs, _delegated_refs) = if let Some(ref query_engine_ref) = query_engine {
match check_partition_staleness(output_refs, query_engine_ref, build_request_id).await {
Ok((stale, delegated)) => {
info!("Staleness check: {} stale, {} delegated partitions", stale.len(), delegated.len());
(stale, delegated)
}
Err(e) => {
error!("Failed to check partition staleness: {}", e);
// Fall back to building all partitions
(output_refs.to_vec(), Vec::new())
}
}
} else {
// No event log, build all partitions
(output_refs.to_vec(), Vec::new())
};
// Only plan for stale partitions that need to be built
let mut unhandled_refs = HashSet::new();
for ref_str in &stale_refs {
unhandled_refs.insert(ref_str.clone());
}
// Note: Partition analysis events will be logged after successful job graph creation
let mut epoch = 0;
let mut nodes = Vec::new();
// Determine the number of workers based on available CPU cores or environment variable
let mut num_workers = num_cpus::get();
if let Ok(worker_env) = env::var("DATABUILD_PARALLEL_WORKERS") {
if let Ok(parsed_workers) = worker_env.parse::<usize>() {
if parsed_workers < 1 {
num_workers = 1;
info!("Warning: DATABUILD_PARALLEL_WORKERS must be at least 1, using: {}", num_workers);
} else {
num_workers = parsed_workers;
}
} else {
info!("Warning: Invalid DATABUILD_PARALLEL_WORKERS value '{}', using default: {}", worker_env, num_workers);
}
}
info!("Using {} workers for parallel execution", num_workers);
// Log planning phase start
if let Some(ref query_engine_ref) = query_engine {
let event = create_build_event(
build_request_id.to_string(),
crate::build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestPlanning.status()),
requested_partitions: output_refs.iter().map(|s| PartitionRef { str: s.clone() }).collect(),
message: "Graph analysis in progress".to_string(),
comment: None,
want_id: None,
})
);
if let Err(e) = query_engine_ref.append_event(event).await {
error!("Failed to log planning event: {}", e);
}
}
while !unhandled_refs.is_empty() {
if epoch >= 1000 {
error!("Planning timeout: still planning after {} epochs, giving up", epoch);
return Err(format!("Still planning after {} epochs, giving up", epoch));
}
info!("Planning epoch {} with {} unhandled refs", epoch, unhandled_refs.len());
// Resolve jobs for all unhandled refs
let unhandled_refs_list: Vec<String> = unhandled_refs.iter().cloned().collect();
let job_refs = resolve(&unhandled_refs_list)?;
// Configure jobs in parallel
let new_nodes = configure_parallel(job_refs.clone(), num_workers)?;
// Remove handled refs
for (_, produced_refs) in job_refs {
for ref_str in produced_refs {
unhandled_refs.remove(&ref_str);
}
}
if !unhandled_refs.is_empty() {
error!("Error: Still have unhandled refs after configuration phase: {:?}", unhandled_refs);
return Err(format!("Should have no unhandled refs after configuration phase, but had: {:?}", unhandled_refs));
}
epoch += 1;
// Add new nodes to the graph
nodes.extend(new_nodes.clone());
info!("Planning epoch {} completed: added {} new nodes, total nodes: {}", epoch, new_nodes.len(), nodes.len());
// Plan next epoch
let mut new_unhandled_count = 0;
for task in &new_nodes {
for input in &task.config.as_ref().unwrap().inputs {
if input.dep_type_code == 1 { // MATERIALIZE = 1
if !unhandled_refs.contains(&input.partition_ref.as_ref().unwrap().str) {
new_unhandled_count += 1;
}
unhandled_refs.insert(input.partition_ref.as_ref().unwrap().str.clone());
}
}
}
if new_unhandled_count > 0 {
info!("Added {} new unhandled refs for next planning epoch", new_unhandled_count);
}
}
if !nodes.is_empty() {
info!("Planning complete: created graph with {} nodes for {} output refs", nodes.len(), output_refs.len());
// Log analysis completion event
if let Some(ref query_engine) = query_engine {
let event = create_build_event(
build_request_id.to_string(),
crate::build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestAnalysisCompleted.status()),
requested_partitions: output_refs.iter().map(|s| PartitionRef { str: s.clone() }).collect(),
message: format!("Analysis completed successfully, {} tasks planned", nodes.len()),
comment: None,
want_id: None,
})
);
if let Err(e) = query_engine.append_event(event).await {
error!("Failed to log analysis completion event: {}", e);
}
// Store the job graph as an event in the build event log
let job_graph = JobGraph {
label: Some(GraphLabel { label: "analyzed_graph".to_string() }),
outputs: output_refs.iter().map(|s| PartitionRef { str: s.clone() }).collect(),
nodes: nodes.clone(),
};
let job_graph_event = create_build_event(
build_request_id.to_string(),
crate::build_event::EventType::JobGraphEvent(JobGraphEvent {
job_graph: Some(job_graph),
message: format!("Job graph analysis completed with {} tasks", nodes.len()),
}),
);
if let Err(e) = query_engine.append_event(job_graph_event).await {
error!("Failed to log job graph event: {}", e);
}
}
Ok(JobGraph {
label: Some(GraphLabel { label: "analyzed_graph".to_string() }),
outputs: output_refs.iter().map(|s| PartitionRef { str: s.clone() }).collect(),
nodes,
})
} else {
error!("Planning failed: no nodes created for output refs {:?}", output_refs);
// Log planning failure
if let Some(ref query_engine) = query_engine {
let event = create_build_event(
build_request_id.to_string(),
crate::build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestFailed.status()),
requested_partitions: output_refs.iter().map(|s| PartitionRef { str: s.clone() }).collect(),
message: "No jobs found for requested partitions".to_string(),
comment: None,
want_id: None,
})
);
if let Err(e) = query_engine.append_event(event).await {
error!("Failed to log failure event: {}", e);
}
}
Err("Unknown failure in graph planning".to_string())
}
}
// Local implementation of generate_mermaid_diagram, superseded by databuild::mermaid_utils::generate_mermaid_diagram and kept commented out below
// fn generate_mermaid_diagram(graph: &JobGraph) -> String {
// // Start the mermaid flowchart
// let mut mermaid = String::from("flowchart TD\n");
//
// // Track nodes we've already added to avoid duplicates
// let mut added_nodes = HashSet::new();
// let mut added_refs = HashSet::new();
//
// // Map to track which refs are outputs (to highlight them)
// let mut is_output_ref = HashSet::new();
// for ref_str in &graph.outputs {
// is_output_ref.insert(ref_str.str.clone());
// }
//
// // Process each task in the graph
// for task in &graph.nodes {
// // Create a unique ID for this job+outputs combination
// let outputs_strs: Vec<String> = task.config.as_ref().unwrap().outputs.iter().map(|o| o.str.clone()).collect();
// let outputs_key = outputs_strs.join("_");
// let mut job_node_id = format!("job_{}", task.job.as_ref().unwrap().label.replace("//", "_"));
// job_node_id = job_node_id.replace(":", "_").replace("=", "_").replace("?", "_").replace(" ", "_");
// job_node_id = format!("{}_{}", job_node_id, outputs_key.replace("/", "_").replace("=", "_"));
//
// // Create a descriptive label that includes both job label and outputs
// let job_label = &task.job.as_ref().unwrap().label;
// let outputs_label = if !task.config.as_ref().unwrap().outputs.is_empty() {
// if task.config.as_ref().unwrap().outputs.len() == 1 {
// format!(" [{}]", task.config.as_ref().unwrap().outputs[0].str)
// } else {
// format!(" [{}, ...]", task.config.as_ref().unwrap().outputs[0].str)
// }
// } else {
// String::new()
// };
//
// // Add the job node if not already added
// if !added_nodes.contains(&job_node_id) {
// // Represent job as a process shape with escaped label
// mermaid.push_str(&format!(
// " {}[\"`**{}** {}`\"]:::job\n",
// job_node_id,
// job_label,
// outputs_label
// ));
// added_nodes.insert(job_node_id.clone());
// }
//
// // Process inputs (dependencies)
// for input in &task.config.as_ref().unwrap().inputs {
// let ref_node_id = format!("ref_{}", input.partition_ref.as_ref().unwrap().str.replace("/", "_").replace("=", "_"));
//
// // Add the partition ref node if not already added
// if !added_refs.contains(&ref_node_id) {
// let node_class = if is_output_ref.contains(&input.partition_ref.as_ref().unwrap().str) {
// "outputPartition"
// } else {
// "partition"
// };
//
// // Represent partition as a cylinder
// mermaid.push_str(&format!(
// " {}[(\"{}\")]:::{}\n",
// ref_node_id,
// input.partition_ref.as_ref().unwrap().str.replace("/", "_").replace("=", "_"),
// node_class
// ));
// added_refs.insert(ref_node_id.clone());
// }
//
// // Add the edge from input to job
// if input.dep_type == 1 { // MATERIALIZE = 1
// // Solid line for materialize dependencies
// mermaid.push_str(&format!(" {} --> {}\n", ref_node_id, job_node_id));
// } else {
// // Dashed line for query dependencies
// mermaid.push_str(&format!(" {} -.-> {}\n", ref_node_id, job_node_id));
// }
// }
//
// // Process outputs
// for output in &task.config.as_ref().unwrap().outputs {
// let ref_node_id = format!("ref_{}", output.str.replace("/", "_").replace("=", "_"));
//
// // Add the partition ref node if not already added
// if !added_refs.contains(&ref_node_id) {
// let node_class = if is_output_ref.contains(&output.str) {
// "outputPartition"
// } else {
// "partition"
// };
//
// // Represent partition as a cylinder
// mermaid.push_str(&format!(
// " {}[(\"Partition: {}\")]:::{}\n",
// ref_node_id,
// output.str,
// node_class
// ));
// added_refs.insert(ref_node_id.clone());
// }
//
// // Add the edge from job to output
// mermaid.push_str(&format!(" {} --> {}\n", job_node_id, ref_node_id));
// }
// }
//
// // Add styling
// mermaid.push_str("\n %% Styling\n");
// mermaid.push_str(" classDef job fill:#f9f,stroke:#333,stroke-width:1px;\n");
// mermaid.push_str(" classDef partition fill:#bbf,stroke:#333,stroke-width:1px;\n");
// mermaid.push_str(" classDef outputPartition fill:#bfb,stroke:#333,stroke-width:2px;\n");
//
// mermaid
// }
#[tokio::main]
async fn main() {
// Initialize logger
SimpleLogger::new().init().unwrap();
let mode = env::var("DATABUILD_MODE").unwrap_or_else(|_| "unknown".to_string());
info!("Starting analyze.rs in mode: {}", mode);
// Parse command line arguments (only for partition references)
let matches = ClapCommand::new("analyze")
.version("1.0")
.about("DataBuild graph analysis tool")
.arg(
Arg::new("partitions")
.help("Partition references to analyze")
.required(false)
.num_args(0..)
.value_name("PARTITIONS")
)
.get_matches();
let args: Vec<String> = matches.get_many::<String>("partitions")
.unwrap_or_default()
.cloned()
.collect();
// Validate arguments based on mode
match mode.as_str() {
"plan" | "mermaid" => {
if args.is_empty() {
error!("Error: Partition references are required for {} mode", mode);
eprintln!("Error: Partition references are required for {} mode", mode);
exit(1);
}
}
"import_test" => {
// No partition arguments needed for test mode
}
_ => {
// Unknown mode, will be handled later
}
}
// Get build event log configuration from environment variables
let build_event_log_uri = env::var("DATABUILD_BUILD_EVENT_LOG").ok();
let build_request_id = env::var("DATABUILD_BUILD_REQUEST_ID")
.unwrap_or_else(|_| Uuid::new_v4().to_string());
// Initialize build event log if provided
let query_engine = if let Some(uri) = build_event_log_uri {
match create_bel_query_engine(&uri).await {
Ok(engine) => {
info!("Initialized build event log: {}", uri);
Some(engine)
}
Err(e) => {
error!("Failed to initialize build event log {}: {}", uri, e);
exit(1);
}
}
} else {
None
};
match mode.as_str() {
"plan" => {
// Get output refs from command line arguments
match plan(&args, query_engine, &build_request_id).await {
Ok(graph) => {
// Output the job graph as JSON
match serde_json::to_string(&graph) {
Ok(json_data) => {
info!("Successfully generated job graph with {} nodes", graph.nodes.len());
println!("{}", json_data);
}
Err(e) => {
error!("Error marshaling job graph: {}", e);
eprintln!("Error marshaling job graph: {}", e);
exit(1);
}
}
}
Err(e) => {
eprintln!("Error: {}", e);
exit(1);
}
}
}
"lookup" => {
// Get output refs from command line arguments
match resolve(&args) {
Ok(result) => {
// Output the result as JSON
match serde_json::to_string(&result) {
Ok(json_data) => {
info!("Successfully completed lookup for {} output refs with {} job mappings", args.len(), result.len());
println!("{}", json_data);
}
Err(e) => {
error!("Error marshaling lookup result: {}", e);
eprintln!("Error marshaling lookup result: {}", e);
exit(1);
}
}
}
Err(e) => {
eprintln!("Error: {}", e);
exit(1);
}
}
}
"mermaid" => {
// Get output refs from command line arguments
match plan(&args, None, &build_request_id).await {
Ok(graph) => {
// Generate and output the mermaid diagram
let mermaid_diagram = generate_mermaid_diagram(&graph);
println!("{}", mermaid_diagram);
info!("Successfully generated mermaid diagram for {} nodes", graph.nodes.len());
}
Err(e) => {
eprintln!("Error: {}", e);
exit(1);
}
}
}
"import_test" => {
info!("Running in import_test mode");
println!("ok :)");
info!("Import test completed successfully");
}
_ => {
error!("Error: Unknown mode '{}'", mode);
eprintln!("Unknown MODE `{}`", mode);
exit(1);
}
}
}

View file

@ -1,817 +0,0 @@
use databuild::{JobGraph, Task, JobStatus, BuildRequestStatus, BuildRequestStatusCode, PartitionStatus, BuildRequestEvent, JobEvent, PartitionEvent, PartitionRef};
use databuild::event_log::{create_bel_query_engine, create_build_event};
use databuild::build_event::EventType;
use databuild::log_collector::{LogCollector, LogCollectorError};
use crossbeam_channel::{Receiver, Sender};
use log::{debug, error, info, warn};
use std::collections::{HashMap, HashSet};
use std::io::{BufReader, Read, Write};
use std::path::{Path, PathBuf};
use std::process::{Command, Stdio};
use std::sync::Arc;
use std::{env, thread};
use std::time::{Duration, Instant};
// Command line parsing removed - using environment variables
use uuid::Uuid;
const NUM_WORKERS: usize = 4;
const LOG_INTERVAL: Duration = Duration::from_secs(5);
const FAIL_FAST: bool = true; // Same default as the Go version
#[derive(Debug, Clone, PartialEq, Eq)]
enum TaskState {
Pending,
Running,
Succeeded,
Failed,
}
#[derive(Debug, Clone)]
struct TaskExecutionResult {
task_key: String,
job_label: String, // For logging
success: bool,
stdout: String,
stderr: String,
duration: Duration,
error_message: Option<String>,
}
// Generates a unique key for a task based on its JobLabel, input and output references.
// Mirrors the Go implementation's getTaskKey.
fn get_task_key(task: &Task) -> String {
let mut key_parts = Vec::new();
key_parts.push(task.job.as_ref().unwrap().label.clone());
for input_dep in &task.config.as_ref().unwrap().inputs {
key_parts.push(format!("input:{}", input_dep.partition_ref.as_ref().unwrap().str));
}
for output_ref in &task.config.as_ref().unwrap().outputs {
key_parts.push(format!("output:{}", output_ref.str));
}
key_parts.join("|")
}
fn worker(
task_rx: Receiver<Arc<Task>>,
result_tx: Sender<TaskExecutionResult>,
worker_id: usize,
) {
info!("[Worker {}] Starting", worker_id);
while let Ok(task) = task_rx.recv() {
let task_key = get_task_key(&task);
info!("[Worker {}] Starting job: {} (Key: {})", worker_id, task.job.as_ref().unwrap().label, task_key);
let start_time = Instant::now();
let candidate_jobs_str = env::var("DATABUILD_CANDIDATE_JOBS_EXEC")
.map_err(|e| format!("Failed to get DATABUILD_CANDIDATE_JOBS_EXEC: {}", e)).unwrap();
let job_path_map: HashMap<String, String> = serde_json::from_str(&candidate_jobs_str)
.map_err(|e| format!("Failed to parse DATABUILD_CANDIDATE_JOBS_EXEC: {}", e)).unwrap();
// Look up the executable path for this job
let job_label = &task.job.as_ref().unwrap().label;
let exec_path = job_path_map.get(job_label)
.ok_or_else(|| format!("Job {} is not a candidate job", job_label)).unwrap();
let config_json = match serde_json::to_string(&task.config.as_ref().unwrap()) {
Ok(json) => json,
Err(e) => {
let err_msg = format!("Failed to serialize task config for {}: {}", task.job.as_ref().unwrap().label, e);
error!("[Worker {}] {}", worker_id, err_msg);
result_tx
.send(TaskExecutionResult {
task_key,
job_label: task.job.as_ref().unwrap().label.clone(),
success: false,
stdout: String::new(),
stderr: err_msg.clone(),
duration: start_time.elapsed(),
error_message: Some(err_msg),
})
.unwrap_or_else(|e| error!("[Worker {}] Failed to send error result: {}", worker_id, e));
continue;
}
};
// Generate a job run ID for this execution
let job_run_id = Uuid::new_v4().to_string();
info!("Running job {} (Path: {}) with config: {}", job_label, exec_path, config_json);
let mut cmd = Command::new(&exec_path);
cmd.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(Stdio::piped());
// Set environment variables from the current process's environment
// This mirrors the Go `cmd.Env = os.Environ()` behavior.
// Task-specific env vars from task.config.env are passed via JSON through stdin.
cmd.env_clear(); // Start with no environment variables
for (key, value) in std::env::vars() {
cmd.env(key, value); // Add current process's environment variables
}
// Add the job run ID so the job wrapper can use the same ID
cmd.env("DATABUILD_JOB_RUN_ID", &job_run_id);
match cmd.spawn() {
Ok(mut child) => {
if let Some(mut child_stdin) = child.stdin.take() {
if let Err(e) = child_stdin.write_all(config_json.as_bytes()) {
let err_msg = format!("[Worker {}] Failed to write to stdin for {}: {}", worker_id, task.job.as_ref().unwrap().label, e);
error!("{}", err_msg);
// Ensure child is killed if stdin write fails before wait
let _ = child.kill();
let _ = child.wait(); // Reap the child
result_tx.send(TaskExecutionResult {
task_key,
job_label: task.job.as_ref().unwrap().label.clone(),
success: false,
stdout: String::new(),
stderr: err_msg.clone(),
duration: start_time.elapsed(),
error_message: Some(err_msg),
})
.unwrap_or_else(|e| error!("[Worker {}] Failed to send error result: {}", worker_id, e));
continue;
}
drop(child_stdin); // Close stdin to signal EOF to the child
} else {
let err_msg = format!("[Worker {}] Failed to get stdin for {}", worker_id, task.job.as_ref().unwrap().label);
error!("{}", err_msg);
result_tx.send(TaskExecutionResult {
task_key,
job_label: task.job.as_ref().unwrap().label.clone(),
success: false,
stdout: String::new(),
stderr: err_msg.clone(),
duration: start_time.elapsed(),
error_message: Some(err_msg),
})
.unwrap_or_else(|e| error!("[Worker {}] Failed to send error result: {}", worker_id, e));
continue;
}
// Initialize log collector
let mut log_collector = match LogCollector::new(LogCollector::default_logs_dir()) {
Ok(mut collector) => {
// Set the job label mapping for this job run
collector.set_job_label(&job_run_id, &task.job.as_ref().unwrap().label);
collector
},
Err(e) => {
let err_msg = format!("[Worker {}] Failed to initialize log collector for {}: {}",
worker_id, task.job.as_ref().unwrap().label, e);
error!("{}", err_msg);
result_tx
.send(TaskExecutionResult {
task_key,
job_label: task.job.as_ref().unwrap().label.clone(),
success: false,
stdout: String::new(),
stderr: err_msg.clone(),
duration: start_time.elapsed(),
error_message: Some(err_msg),
})
.unwrap_or_else(|e| error!("[Worker {}] Failed to send error result: {}", worker_id, e));
continue;
}
};
// Collect stdout/stderr and process with LogCollector
let stdout_handle = child.stdout.take();
let stderr_handle = child.stderr.take();
let mut stdout_content = String::new();
let mut stderr_content = String::new();
// Read stdout and process with LogCollector
if let Some(stdout) = stdout_handle {
let stdout_reader = BufReader::new(stdout);
if let Err(e) = log_collector.consume_job_output(&job_run_id, stdout_reader) {
warn!("[Worker {}] Failed to process job logs for {}: {}",
worker_id, task.job.as_ref().unwrap().label, e);
}
}
// Read stderr (raw, not structured)
if let Some(mut stderr) = stderr_handle {
if let Err(e) = stderr.read_to_string(&mut stderr_content) {
warn!("[Worker {}] Failed to read stderr for {}: {}",
worker_id, task.job.as_ref().unwrap().label, e);
}
}
// Wait for the process to finish
match child.wait() {
Ok(status) => {
let duration = start_time.elapsed();
let success = status.success();
// Close the log collector for this job
if let Err(e) = log_collector.close_job(&job_run_id) {
warn!("[Worker {}] Failed to close log collector for {}: {}",
worker_id, task.job.as_ref().unwrap().label, e);
}
if success {
info!(
"[Worker {}] Job succeeded: {} (Duration: {:?}, Job Run ID: {})",
worker_id, task.job.as_ref().unwrap().label, duration, job_run_id
);
} else {
error!(
"[Worker {}] Job failed: {} (Duration: {:?}, Status: {:?}, Job Run ID: {})\nStderr: {}",
worker_id, task.job.as_ref().unwrap().label, duration, status, job_run_id, stderr_content
);
}
result_tx
.send(TaskExecutionResult {
task_key,
job_label: task.job.as_ref().unwrap().label.clone(),
success,
stdout: format!("Job logs written to JSONL (Job Run ID: {})", job_run_id),
stderr: stderr_content,
duration,
error_message: if success { None } else { Some(format!("Exited with status: {:?}", status)) },
})
.unwrap_or_else(|e| error!("[Worker {}] Failed to send result: {}", worker_id, e));
}
Err(e) => {
let err_msg = format!("[Worker {}] Failed to execute or wait for {}: {}", worker_id, task.job.as_ref().unwrap().label, e);
error!("{}", err_msg);
result_tx
.send(TaskExecutionResult {
task_key,
job_label: task.job.as_ref().unwrap().label.clone(),
success: false,
stdout: String::new(),
stderr: err_msg.clone(),
duration: start_time.elapsed(),
error_message: Some(err_msg),
})
.unwrap_or_else(|e| error!("[Worker {}] Failed to send execution error result: {}", worker_id, e));
}
}
}
Err(e) => {
let err_msg = format!("[Worker {}] Failed to spawn command for {}: {} (Path: {:?})", worker_id, task.job.as_ref().unwrap().label, e, exec_path);
error!("{}", err_msg);
result_tx
.send(TaskExecutionResult {
task_key,
job_label: task.job.as_ref().unwrap().label.clone(),
success: false,
stdout: String::new(),
stderr: err_msg.clone(),
duration: start_time.elapsed(),
error_message: Some(err_msg),
})
.unwrap_or_else(|e| error!("[Worker {}] Failed to send spawn error result: {}", worker_id, e));
}
}
}
info!("[Worker {}] Exiting", worker_id);
}
fn is_task_ready(task: &Task, completed_outputs: &HashSet<String>) -> bool {
let mut missing_deps = Vec::new();
for dep in &task.config.as_ref().unwrap().inputs {
if dep.dep_type_code == 1 { // MATERIALIZE = 1
if !completed_outputs.contains(&dep.partition_ref.as_ref().unwrap().str) {
missing_deps.push(&dep.partition_ref.as_ref().unwrap().str);
}
}
}
if !missing_deps.is_empty() {
debug!("Task {} not ready - missing dependencies: {:?}", task.job.as_ref().unwrap().label, missing_deps);
return false;
}
true
}
// Check if partitions are already available or being built by other build requests
async fn check_build_coordination(
task: &Task,
query_engine: &Arc<databuild::event_log::query_engine::BELQueryEngine>,
build_request_id: &str
) -> Result<(bool, bool, Vec<(PartitionRef, String)>), String> {
let outputs = &task.config.as_ref().unwrap().outputs;
let mut available_partitions = Vec::new();
let mut needs_building = false;
for output_ref in outputs {
debug!("Checking build coordination for partition: {}", output_ref.str);
// First check if this partition is already available
match query_engine.get_latest_partition_status(&output_ref.str).await {
Ok(Some((status, _timestamp))) => {
debug!("Partition {} has status: {:?}", output_ref.str, status);
if status == databuild::PartitionStatus::PartitionAvailable {
// Get which build request created this partition
match query_engine.get_build_request_for_available_partition(&output_ref.str).await {
Ok(Some(source_build_id)) => {
info!("Partition {} already available from build {}", output_ref.str, source_build_id);
available_partitions.push((output_ref.clone(), source_build_id));
continue;
}
Ok(None) => {
error!("Partition {} is available but no source build found - this indicates a bug in the event log implementation", output_ref.str);
return Err(format!("Available partition {} has no source build ID. This suggests the event log is missing required data.", output_ref.str));
}
Err(e) => {
error!("Failed to get source build for partition {}: {}", output_ref.str, e);
return Err(format!("Cannot determine source build for available partition {}: {}", output_ref.str, e));
}
}
} else {
debug!("Partition {} has non-available status {:?}, needs building", output_ref.str, status);
needs_building = true;
}
}
Ok(None) => {
debug!("Partition {} has no status, needs building", output_ref.str);
needs_building = true;
}
Err(e) => {
error!("Failed to check partition status for {}: {}", output_ref.str, e);
return Err(format!("Cannot check partition status: {}. Use a queryable event log (e.g., SQLite) for builds that need to check existing partitions.", e));
}
}
// Check if this partition is being built by another request
match query_engine.get_active_builds_for_partition(&output_ref.str).await {
Ok(active_builds) => {
let other_builds: Vec<String> = active_builds.into_iter()
.filter(|id| id != build_request_id)
.collect();
if !other_builds.is_empty() {
info!("Partition {} is already being built by other requests: {:?}. Delegating.",
output_ref.str, other_builds);
// Log delegation event for active builds
for delegated_to_build_id in &other_builds {
let event = create_build_event(
build_request_id.to_string(),
EventType::DelegationEvent(databuild::DelegationEvent {
partition_ref: Some(output_ref.clone()),
delegated_to_build_request_id: delegated_to_build_id.clone(),
message: "Delegated to active build during execution".to_string(),
})
);
if let Err(e) = query_engine.append_event(event).await {
error!("Failed to log delegation event: {}", e);
}
}
return Ok((false, false, available_partitions)); // Don't build, delegated to active build
}
}
Err(e) => {
error!("Failed to check active builds for partition {}: {}", output_ref.str, e);
return Err(format!("Cannot check active builds: {}. Use a queryable event log (e.g., SQLite) for builds that need to check for concurrent execution.", e));
}
}
// If we reach here, this partition needs to be built
needs_building = true;
}
// Only skip the job if ALL partitions are already available
if !needs_building && available_partitions.len() == outputs.len() {
Ok((false, true, available_partitions)) // Don't build, skip due to all partitions available
} else {
Ok((true, false, available_partitions)) // Need to build (some partitions unavailable)
}
}
fn log_status_summary(
task_states: &HashMap<String, TaskState>,
original_tasks_by_key: &HashMap<String, Arc<Task>>,
) {
let mut pending_tasks = Vec::new();
let mut running_tasks = Vec::new();
let mut succeeded_tasks = Vec::new();
let mut failed_tasks = Vec::new();
for (key, state) in task_states {
let label = original_tasks_by_key.get(key).map_or_else(|| key.as_str(), |t| t.job.as_ref().unwrap().label.as_str());
match state {
TaskState::Pending => pending_tasks.push(label),
TaskState::Running => running_tasks.push(label),
TaskState::Succeeded => succeeded_tasks.push(label),
TaskState::Failed => failed_tasks.push(label),
}
}
info!("Task Status Summary:");
info!(" Pending ({}): {:?}", pending_tasks.len(), pending_tasks);
info!(" Running ({}): {:?}", running_tasks.len(), running_tasks);
info!(" Succeeded ({}): {:?}", succeeded_tasks.len(), succeeded_tasks);
info!(" Failed ({}): {:?}", failed_tasks.len(), failed_tasks);
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
simple_logger::SimpleLogger::new()
.with_level(
std::env::var("RUST_LOG")
.unwrap_or_else(|_| "info".to_string())
.parse()
.unwrap_or(log::LevelFilter::Info)
)
.init()?;
// Get build event log configuration from environment variables
let build_event_log_uri = std::env::var("DATABUILD_BUILD_EVENT_LOG").ok();
let build_request_id = std::env::var("DATABUILD_BUILD_REQUEST_ID")
.unwrap_or_else(|_| Uuid::new_v4().to_string());
// Initialize build event log if provided
let build_event_log = if let Some(uri) = build_event_log_uri {
match create_bel_query_engine(&uri).await {
Ok(log) => {
info!("Initialized build event log: {}", uri);
Some(log)
}
Err(e) => {
error!("Failed to initialize build event log {}: {}", uri, e);
std::process::exit(1);
}
}
} else {
None
};
let mut buffer = String::new();
std::io::stdin().read_to_string(&mut buffer)?;
let graph: JobGraph = serde_json::from_str(&buffer)?;
info!("Executing job graph with {} nodes", graph.nodes.len());
// Log build request execution start (existing detailed event)
if let Some(ref query_engine) = build_event_log {
let event = create_build_event(
build_request_id.clone(),
EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestExecuting.status()),
requested_partitions: graph.outputs.clone(),
message: format!("Starting execution of {} jobs", graph.nodes.len()),
comment: None,
want_id: None,
})
);
if let Err(e) = query_engine.append_event(event).await {
error!("Failed to log execution start event: {}", e);
}
}
let mut task_states: HashMap<String, TaskState> = HashMap::new();
let mut original_tasks_by_key: HashMap<String, Arc<Task>> = HashMap::new();
let graph_nodes_arc: Vec<Arc<Task>> = graph.nodes.into_iter().map(Arc::new).collect();
for task_node in &graph_nodes_arc {
let key = get_task_key(task_node);
task_states.insert(key.clone(), TaskState::Pending);
original_tasks_by_key.insert(key, task_node.clone());
}
let mut completed_outputs: HashSet<String> = HashSet::new();
let mut job_results: Vec<TaskExecutionResult> = Vec::new();
let (task_tx, task_rx): (Sender<Arc<Task>>, Receiver<Arc<Task>>) = crossbeam_channel::unbounded();
let (result_tx, result_rx): (Sender<TaskExecutionResult>, Receiver<TaskExecutionResult>) = crossbeam_channel::unbounded();
let mut worker_handles = Vec::new();
for i in 0..NUM_WORKERS {
let task_rx_clone = task_rx.clone();
let result_tx_clone = result_tx.clone();
worker_handles.push(thread::spawn(move || {
worker(task_rx_clone, result_tx_clone, i + 1);
}));
}
// Drop the original result_tx so that the result channel disconnects once all
// worker-held clones are dropped, letting result_rx drain to completion.
drop(result_tx);
let mut last_log_time = Instant::now();
let mut active_tasks_count = 0;
let mut fail_fast_triggered = false;
loop {
// 1. Process results
while let Ok(result) = result_rx.try_recv() {
active_tasks_count -= 1;
info!(
"Received result for task {}: Success: {}",
result.job_label, result.success
);
let current_state = if result.success {
TaskState::Succeeded
} else {
TaskState::Failed
};
task_states.insert(result.task_key.clone(), current_state);
// Log job completion events
if let Some(ref query_engine) = build_event_log {
if let Some(original_task) = original_tasks_by_key.get(&result.task_key) {
let job_run_id = Uuid::new_v4().to_string();
// Log job completion
let job_event = create_build_event(
build_request_id.clone(),
EventType::JobEvent(JobEvent {
job_run_id: job_run_id.clone(),
job_label: original_task.job.clone(),
target_partitions: original_task.config.as_ref().unwrap().outputs.clone(),
status_code: if result.success { JobStatus::JobCompleted as i32 } else { JobStatus::JobFailed as i32 },
status_name: if result.success { JobStatus::JobCompleted.to_display_string() } else { JobStatus::JobFailed.to_display_string() },
message: if result.success { "Job completed successfully".to_string() } else { result.error_message.clone().unwrap_or_default() },
config: original_task.config.clone(),
manifests: vec![], // Would be populated from actual job output
})
);
if let Err(e) = query_engine.append_event(job_event).await {
error!("Failed to log job completion event: {}", e);
}
// Log partition status updates
for output_ref in &original_task.config.as_ref().unwrap().outputs {
let partition_event = create_build_event(
build_request_id.clone(),
EventType::PartitionEvent(PartitionEvent {
partition_ref: Some(output_ref.clone()),
status_code: if result.success { PartitionStatus::PartitionAvailable as i32 } else { PartitionStatus::PartitionFailed as i32 },
status_name: if result.success { PartitionStatus::PartitionAvailable.to_display_string() } else { PartitionStatus::PartitionFailed.to_display_string() },
message: if result.success { "Partition built successfully".to_string() } else { "Partition build failed".to_string() },
job_run_id: job_run_id.clone(),
})
);
if let Err(e) = query_engine.append_event(partition_event).await {
error!("Failed to log partition status event: {}", e);
}
}
}
}
if result.success {
if let Some(original_task) = original_tasks_by_key.get(&result.task_key) {
for output_ref in &original_task.config.as_ref().unwrap().outputs {
completed_outputs.insert(output_ref.str.clone());
}
}
} else {
if FAIL_FAST {
warn!("Fail-fast enabled and task {} failed. Shutting down.", result.job_label);
fail_fast_triggered = true;
}
}
job_results.push(result);
}
// 2. Check for fail-fast break
if fail_fast_triggered && active_tasks_count == 0 { // Wait for running tasks to finish if fail fast
info!("All active tasks completed after fail-fast trigger.");
break;
}
if fail_fast_triggered && active_tasks_count > 0 {
// Don't schedule new tasks, just wait for active ones or log
} else if !fail_fast_triggered { // Only dispatch if not in fail-fast shutdown
// 3. Dispatch ready tasks
for task_node in &graph_nodes_arc {
let task_key = get_task_key(task_node);
if task_states.get(&task_key) == Some(&TaskState::Pending) {
if is_task_ready(task_node, &completed_outputs) {
// Check build coordination if event log is available
let (should_build, is_skipped, available_partitions) = if let Some(ref query_engine) = build_event_log {
match check_build_coordination(task_node, query_engine, &build_request_id).await {
Ok((should_build, is_skipped, available_partitions)) => (should_build, is_skipped, available_partitions),
Err(e) => {
error!("Error checking build coordination for {}: {}",
task_node.job.as_ref().unwrap().label, e);
(true, false, Vec::<(PartitionRef, String)>::new()) // Default to building on error
}
}
} else {
(true, false, Vec::<(PartitionRef, String)>::new()) // No event log, always build
};
if !should_build {
if is_skipped {
// Task skipped due to all partitions already available
info!("Task {} skipped - all target partitions already available", task_node.job.as_ref().unwrap().label);
// Log delegation events for each available partition
if let Some(ref query_engine) = build_event_log {
for (partition_ref, source_build_id) in &available_partitions {
let delegation_event = create_build_event(
build_request_id.clone(),
EventType::DelegationEvent(databuild::DelegationEvent {
partition_ref: Some(partition_ref.clone()),
delegated_to_build_request_id: source_build_id.clone(),
message: "Delegated to historical build - partition already available".to_string(),
})
);
if let Err(e) = query_engine.append_event(delegation_event).await {
error!("Failed to log historical delegation event: {}", e);
}
}
// Log JOB_SKIPPED event
let job_run_id = Uuid::new_v4().to_string();
let job_event = create_build_event(
build_request_id.clone(),
EventType::JobEvent(JobEvent {
job_run_id: job_run_id.clone(),
job_label: task_node.job.clone(),
target_partitions: task_node.config.as_ref().unwrap().outputs.clone(),
status_code: JobStatus::JobSkipped as i32,
status_name: JobStatus::JobSkipped.to_display_string(),
message: "Job skipped - all target partitions already available".to_string(),
config: task_node.config.clone(),
manifests: vec![],
})
);
if let Err(e) = query_engine.append_event(job_event).await {
error!("Failed to log job skipped event: {}", e);
}
}
} else {
// Task delegated to active build
info!("Task {} delegated to active build request", task_node.job.as_ref().unwrap().label);
}
task_states.insert(task_key.clone(), TaskState::Succeeded);
// Mark outputs as completed
for output_ref in &task_node.config.as_ref().unwrap().outputs {
completed_outputs.insert(output_ref.str.clone());
}
continue;
}
info!("Dispatching task: {}", task_node.job.as_ref().unwrap().label);
// Log job scheduling events
if let Some(ref query_engine) = build_event_log {
let job_run_id = Uuid::new_v4().to_string();
// Log job scheduled
let job_event = create_build_event(
build_request_id.clone(),
EventType::JobEvent(JobEvent {
job_run_id: job_run_id.clone(),
job_label: task_node.job.clone(),
target_partitions: task_node.config.as_ref().unwrap().outputs.clone(),
status_code: JobStatus::JobScheduled as i32,
status_name: JobStatus::JobScheduled.to_display_string(),
message: "Job scheduled for execution".to_string(),
config: task_node.config.clone(),
manifests: vec![],
})
);
if let Err(e) = query_engine.append_event(job_event).await {
error!("Failed to log job scheduled event: {}", e);
}
// Log partition building status
for output_ref in &task_node.config.as_ref().unwrap().outputs {
let partition_event = create_build_event(
build_request_id.clone(),
EventType::PartitionEvent(PartitionEvent {
partition_ref: Some(output_ref.clone()),
status_code: PartitionStatus::PartitionBuilding as i32,
status_name: PartitionStatus::PartitionBuilding.to_display_string(),
message: "Partition build started".to_string(),
job_run_id: job_run_id.clone(),
})
);
if let Err(e) = query_engine.append_event(partition_event).await {
error!("Failed to log partition building event: {}", e);
}
}
}
task_states.insert(task_key.clone(), TaskState::Running);
task_tx.send(task_node.clone())?;
active_tasks_count += 1;
}
}
}
}
// 4. Periodic logging
if last_log_time.elapsed() >= LOG_INTERVAL {
log_status_summary(&task_states, &original_tasks_by_key);
// Debug: Check for deadlock (pending tasks with no running tasks)
let has_pending = task_states.values().any(|s| *s == TaskState::Pending);
if has_pending && active_tasks_count == 0 {
warn!("Potential deadlock detected: {} pending tasks with no running tasks",
task_states.values().filter(|s| **s == TaskState::Pending).count());
// Log details of pending tasks and their preconditions
for (key, state) in &task_states {
if *state == TaskState::Pending {
if let Some(task) = original_tasks_by_key.get(key) {
warn!("Pending task: {} ({})", task.job.as_ref().unwrap().label, key);
warn!(" Required inputs:");
for dep in &task.config.as_ref().unwrap().inputs {
if dep.dep_type_code == 1 { // MATERIALIZE = 1
let available = completed_outputs.contains(&dep.partition_ref.as_ref().unwrap().str);
warn!(" {} - {}", dep.partition_ref.as_ref().unwrap().str, if available { "AVAILABLE" } else { "MISSING" });
}
}
warn!(" Produces outputs:");
for output in &task.config.as_ref().unwrap().outputs {
warn!(" {}", output.str);
}
}
}
}
}
last_log_time = Instant::now();
}
// 5. Check completion
let all_done = task_states.values().all(|s| *s == TaskState::Succeeded || *s == TaskState::Failed);
if active_tasks_count == 0 && all_done {
info!("All tasks are in a terminal state and no tasks are active.");
break;
}
// Avoid busy-waiting if no events, give channels time
// Select would be better here, but for simplicity:
thread::sleep(Duration::from_millis(50));
}
info!("Shutting down workers...");
drop(task_tx); // Signal workers to stop by closing the task channel
for handle in worker_handles {
handle.join().expect("Failed to join worker thread");
}
info!("All workers finished.");
// Final processing of any remaining results (should be minimal if loop logic is correct)
while let Ok(result) = result_rx.try_recv() {
active_tasks_count -= 1; // Should be 0
info!(
"Received late result for task {}: Success: {}",
result.job_label, result.success
);
// Update state for completeness, though it might not affect overall outcome now
let current_state = if result.success { TaskState::Succeeded } else { TaskState::Failed };
task_states.insert(result.task_key.clone(), current_state);
job_results.push(result);
}
let success_count = job_results.iter().filter(|r| r.success).count();
let failure_count = job_results.len() - success_count;
info!("Execution complete: {} succeeded, {} failed", success_count, failure_count);
// Log final build request status (existing detailed event)
if let Some(ref query_engine) = build_event_log {
let final_status = if failure_count > 0 || fail_fast_triggered {
BuildRequestStatusCode::BuildRequestFailed
} else {
BuildRequestStatusCode::BuildRequestCompleted
};
let event = create_build_event(
build_request_id.clone(),
EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(final_status.status()),
requested_partitions: graph.outputs.clone(),
message: format!("Execution completed: {} succeeded, {} failed", success_count, failure_count),
comment: None,
want_id: None,
})
);
if let Err(e) = query_engine.append_event(event).await {
error!("Failed to log final build request event: {}", e);
}
}
if failure_count > 0 || fail_fast_triggered {
error!("Execution finished with errors.");
std::process::exit(1);
}
Ok(())
}

View file

@ -1,13 +0,0 @@
#!/bin/bash
set -e
%{RUNFILES_PREFIX}
%{PREFIX}
# Locate the Rust binary using its standard runfiles path
# Assumes workspace name is 'databuild'
EXECUTABLE_BINARY="$(rlocation "databuild/databuild/graph/analyze")"
# Run the analysis
exec "${EXECUTABLE_BINARY}" "$@"

View file

@ -1,11 +0,0 @@
#!/bin/bash
set -e
%{RUNFILES_PREFIX}
%{PREFIX}
EXECUTABLE_BINARY="$(rlocation "databuild/databuild/graph/execute")"
# Run the execution
exec "${EXECUTABLE_BINARY}" "$@"

View file

@ -1,5 +0,0 @@
sh_test(
name = "analyze_test",
srcs = ["analyze_test.sh"],
data = ["//databuild/graph:analyze"],
)

View file

@ -1,3 +0,0 @@
#!/usr/bin/env bash
DATABUILD_MODE=import_test DATABUILD_JOB_LOOKUP_PATH=foo DATABUILD_CANDIDATE_JOBS=bar databuild/graph/analyze

View file

@ -1,27 +0,0 @@
load("@rules_rust//rust:defs.bzl", "rust_binary", "rust_test")
rust_binary(
name = "job_wrapper",
srcs = ["main.rs"],
visibility = ["//visibility:public"],
deps = [
"//databuild",
"@crates//:serde",
"@crates//:serde_json",
"@crates//:uuid",
"@crates//:sysinfo",
],
)
rust_test(
name = "job_wrapper_test",
srcs = ["main.rs"],
deps = [
"//databuild",
"@crates//:serde",
"@crates//:serde_json",
"@crates//:uuid",
"@crates//:sysinfo",
"@crates//:tempfile",
],
)

View file

@ -1,4 +0,0 @@
# DataBuild Jobs
Contains wrappers and tools for implementing DataBuild jobs.

View file

@ -1,985 +0,0 @@
use std::env;
use std::io::{self, Read, Write};
use std::process::{Command, Stdio};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::{Duration, SystemTime, UNIX_EPOCH};
// All serialization handled by protobuf serde derives
use serde_json;
use sysinfo::{Pid, ProcessRefreshKind, System};
use uuid::Uuid;
// Import protobuf types from databuild
use databuild::{
job_log_entry, log_message, JobConfig, JobLabel, JobLogEntry, LogMessage, PartitionManifest,
PartitionRef, Task, WrapperJobEvent,
};
// All types now come from protobuf - no custom structs needed
// Configuration constants
const DEFAULT_HEARTBEAT_INTERVAL_MS: u64 = 30_000; // 30 seconds
const DEFAULT_METRICS_INTERVAL_MS: u64 = 100; // 100 milliseconds
const TEST_HEARTBEAT_INTERVAL_MS: u64 = 100; // Fast heartbeats for testing
const TEST_METRICS_INTERVAL_MS: u64 = 50; // Fast metrics for testing
#[derive(Debug)]
struct HeartbeatMessage {
entry: JobLogEntry,
}
fn get_timestamp() -> String {
SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs()
.to_string()
}
trait LogSink {
fn emit(&mut self, entry: JobLogEntry);
}
struct StdoutSink;
impl LogSink for StdoutSink {
fn emit(&mut self, entry: JobLogEntry) {
println!("{}", serde_json::to_string(&entry).unwrap());
}
}
struct JobWrapper<S: LogSink> {
job_id: String,
sequence_number: u64,
start_time: i64,
sink: S,
}
impl JobWrapper<StdoutSink> {
fn new() -> Self {
Self::new_with_sink(StdoutSink)
}
}
impl<S: LogSink> JobWrapper<S> {
fn new_with_sink(sink: S) -> Self {
// Use job ID from environment if provided by graph execution, otherwise generate one
let job_id = env::var("DATABUILD_JOB_RUN_ID")
.unwrap_or_else(|_| Uuid::new_v4().to_string());
Self {
job_id,
sequence_number: 0,
start_time: SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs() as i64,
sink,
}
}
fn next_sequence(&mut self) -> u64 {
self.sequence_number += 1;
self.sequence_number
}
fn emit_log(&mut self, outputs: &[PartitionRef], content: job_log_entry::Content) {
let entry = JobLogEntry {
timestamp: get_timestamp(),
job_id: self.job_id.clone(),
outputs: outputs.to_vec(),
sequence_number: self.next_sequence(),
content: Some(content),
};
self.sink.emit(entry);
}
fn config_mode(&mut self, outputs: Vec<String>) -> Result<(), Box<dyn std::error::Error>> {
// Convert to PartitionRef objects
let output_refs: Vec<PartitionRef> = outputs
.iter()
.map(|s| PartitionRef { r#str: s.clone() })
.collect();
// Following the state diagram: wrapper_validate_config -> emit_config_validate_success
self.emit_log(
&output_refs,
job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "config_validate_success".to_string(),
metadata: std::collections::HashMap::new(),
job_status: None,
exit_code: None,
job_label: None, // Will be enriched by LogCollector
}),
);
// For Phase 0, we still need to produce the expected JSON config format
// so the current graph system can parse it. Later phases will change this.
let config = JobConfig {
outputs: output_refs.clone(),
inputs: vec![],
args: outputs.clone(),
env: {
let mut env_map = std::collections::HashMap::new();
if let Some(partition_ref) = outputs.first() {
env_map.insert("PARTITION_REF".to_string(), partition_ref.clone());
}
env_map
},
};
// For config mode, we need to output the standard config format to stdout
// The structured logs will come later during exec mode
let configs_wrapper = serde_json::json!({
"configs": [config]
});
println!("{}", serde_json::to_string(&configs_wrapper)?);
Ok(())
}
fn exec_mode(&mut self, job_binary: &str) -> Result<(), Box<dyn std::error::Error>> {
// Read the job config from stdin
let mut buffer = String::new();
io::stdin().read_to_string(&mut buffer)?;
let config: JobConfig = serde_json::from_str(&buffer)?;
self.exec_mode_with_config(job_binary, config)
}
fn exec_mode_with_config(
&mut self,
job_binary: &str,
config: JobConfig,
) -> Result<(), Box<dyn std::error::Error>> {
let outputs = &config.outputs;
// Following the state diagram:
// 1. wrapper_validate_config -> emit_config_validate_success
self.emit_log(
outputs,
job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "config_validate_success".to_string(),
job_status: None,
exit_code: None,
metadata: std::collections::HashMap::new(),
job_label: None, // Will be enriched by LogCollector
}),
);
// 2. wrapper_launch_task -> emit_task_launch_success
self.emit_log(
outputs,
job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "task_launch_success".to_string(),
job_status: None,
exit_code: None,
metadata: std::collections::HashMap::new(),
job_label: None, // Will be enriched by LogCollector
}),
);
// Execute the original job binary with the exec subcommand
let mut cmd = Command::new(job_binary);
cmd.arg("exec");
// Add the args from the config
for arg in &config.args {
cmd.arg(arg);
}
cmd.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(Stdio::piped());
// Set environment variables from config
for (key, value) in &config.env {
cmd.env(key, value);
}
let mut child = cmd.spawn()?;
let child_pid = child.id();
// Send the config to the job
if let Some(stdin) = child.stdin.as_mut() {
stdin.write_all(serde_json::to_string(&config).unwrap().as_bytes())?;
}
// Start heartbeat thread with channel communication
let heartbeat_job_id = self.job_id.clone();
let heartbeat_outputs = outputs.clone();
let heartbeat_sequence = Arc::new(Mutex::new(0u64));
let heartbeat_sequence_clone = heartbeat_sequence.clone();
let (heartbeat_tx, heartbeat_rx) = mpsc::channel::<HeartbeatMessage>();
let heartbeat_handle = thread::spawn(move || {
let mut system = System::new_all();
let pid = Pid::from(child_pid as usize);
let heartbeat_interval_ms = env::var("DATABUILD_HEARTBEAT_INTERVAL_MS")
.unwrap_or_else(|_| DEFAULT_HEARTBEAT_INTERVAL_MS.to_string())
.parse::<u64>()
.unwrap_or(DEFAULT_HEARTBEAT_INTERVAL_MS);
loop {
thread::sleep(Duration::from_millis(heartbeat_interval_ms));
// Refresh process info
system.refresh_processes_specifics(ProcessRefreshKind::new());
// Check if process still exists
if let Some(process) = system.process(pid) {
let memory_mb = process.memory() as f64 / 1024.0 / 1024.0;
let cpu_percent = process.cpu_usage();
// Create heartbeat event with metrics
let mut metadata = std::collections::HashMap::new();
metadata.insert("memory_usage_mb".to_string(), format!("{:.3}", memory_mb));
metadata.insert(
"cpu_usage_percent".to_string(),
format!("{:.3}", cpu_percent),
);
// Get next sequence number for heartbeat
let seq = {
let mut seq_lock = heartbeat_sequence_clone.lock().unwrap();
*seq_lock += 1;
*seq_lock
};
let heartbeat_event = JobLogEntry {
timestamp: get_timestamp(),
job_id: heartbeat_job_id.clone(),
outputs: heartbeat_outputs.clone(),
sequence_number: seq,
content: Some(job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "heartbeat".to_string(),
job_status: None,
exit_code: None,
metadata,
job_label: None, // Will be enriched by LogCollector
})),
};
// Send heartbeat through channel instead of printing directly
if heartbeat_tx.send(HeartbeatMessage { entry: heartbeat_event }).is_err() {
// Main thread dropped receiver, exit
break;
}
} else {
// Process no longer exists, exit heartbeat thread
break;
}
}
});
// Track metrics while job is running
let job_start_time = SystemTime::now();
let mut system = System::new();
let pid = Pid::from(child_pid as usize);
// Initial refresh to establish baseline for CPU measurements
system.refresh_cpu();
system.refresh_processes_specifics(ProcessRefreshKind::new().with_cpu());
let mut peak_memory_mb = 0.0f64;
let mut cpu_samples = Vec::new();
let mut stdout_buffer = Vec::new();
let mut stderr_buffer = Vec::new();
// Sleep briefly to allow the process to start up before measuring
let sample_interval_ms = env::var("DATABUILD_METRICS_INTERVAL_MS")
.unwrap_or_else(|_| DEFAULT_METRICS_INTERVAL_MS.to_string())
.parse::<u64>()
.unwrap_or(DEFAULT_METRICS_INTERVAL_MS);
thread::sleep(Duration::from_millis(sample_interval_ms));
// Poll process status and metrics
let (output, peak_memory_mb, total_cpu_ms, job_duration) = loop {
// Check if process has exited
match child.try_wait()? {
Some(status) => {
// Process has exited, collect any remaining output
if let Some(mut stdout) = child.stdout.take() {
stdout.read_to_end(&mut stdout_buffer)?;
}
if let Some(mut stderr) = child.stderr.take() {
stderr.read_to_end(&mut stderr_buffer)?;
}
// Calculate final metrics
let job_duration = job_start_time.elapsed().map_err(|e| {
io::Error::new(
io::ErrorKind::Other,
format!("Time calculation error: {}", e),
)
})?;
// Calculate CPU time: average CPU percentage * wall-clock time
let total_cpu_ms = if cpu_samples.is_empty() {
0.0
} else {
let avg_cpu_percent =
cpu_samples.iter().sum::<f32>() as f64 / cpu_samples.len() as f64;
(avg_cpu_percent / 100.0) * job_duration.as_millis() as f64
};
// Detach the heartbeat thread (dropping the JoinHandle does not stop it); the
// thread exits on its own once the child process is gone or the receiver is dropped.
drop(heartbeat_handle);
// Process any remaining heartbeat messages
while let Ok(heartbeat_msg) = heartbeat_rx.try_recv() {
self.sink.emit(heartbeat_msg.entry);
}
// Update sequence number to account for heartbeats
let heartbeat_count = heartbeat_sequence.lock().unwrap();
self.sequence_number = self.sequence_number.max(*heartbeat_count);
drop(heartbeat_count);
// Create output struct to match original behavior
let output = std::process::Output {
status,
stdout: stdout_buffer,
stderr: stderr_buffer,
};
break (output, peak_memory_mb, total_cpu_ms, job_duration);
}
None => {
// Check for heartbeat messages and emit them
while let Ok(heartbeat_msg) = heartbeat_rx.try_recv() {
self.sink.emit(heartbeat_msg.entry);
}
// Process still running, collect metrics
// Refresh CPU info and processes
system.refresh_cpu();
system.refresh_processes_specifics(ProcessRefreshKind::new().with_cpu());
// Sleep to allow CPU measurement interval
thread::sleep(Duration::from_millis(sample_interval_ms));
// Refresh again to get updated CPU usage
system.refresh_cpu();
system.refresh_processes_specifics(ProcessRefreshKind::new().with_cpu());
if let Some(process) = system.process(pid) {
let memory_mb = process.memory() as f64 / 1024.0 / 1024.0;
peak_memory_mb = peak_memory_mb.max(memory_mb);
let cpu_usage = process.cpu_usage();
cpu_samples.push(cpu_usage);
}
}
}
};
let success = output.status.success();
let exit_code = output.status.code().unwrap_or(-1);
// Capture and forward job stdout/stderr as log messages
if !output.stdout.is_empty() {
let stdout_str = String::from_utf8_lossy(&output.stdout);
self.emit_log(
outputs,
job_log_entry::Content::Log(LogMessage {
level: log_message::LogLevel::Info as i32,
message: stdout_str.to_string(),
fields: std::collections::HashMap::new(),
}),
);
}
if !output.stderr.is_empty() {
let stderr_str = String::from_utf8_lossy(&output.stderr);
self.emit_log(
outputs,
job_log_entry::Content::Log(LogMessage {
level: log_message::LogLevel::Error as i32,
message: stderr_str.to_string(),
fields: std::collections::HashMap::new(),
}),
);
}
// Emit job summary with resource metrics
let mut summary_metadata = std::collections::HashMap::new();
summary_metadata.insert(
"runtime_ms".to_string(),
format!("{:.3}", job_duration.as_millis() as f64),
);
summary_metadata.insert(
"peak_memory_mb".to_string(),
format!("{:.3}", peak_memory_mb),
);
summary_metadata.insert("total_cpu_ms".to_string(), format!("{:.3}", total_cpu_ms));
summary_metadata.insert("exit_code".to_string(), exit_code.to_string());
self.emit_log(
outputs,
job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "job_summary".to_string(),
job_status: None,
exit_code: Some(exit_code),
metadata: summary_metadata,
job_label: None, // Will be enriched by LogCollector
}),
);
if success {
// Following the state diagram: wrapper_monitor_task -> zero exit -> emit_task_success
self.emit_log(
outputs,
job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "task_success".to_string(),
job_status: Some("JOB_COMPLETED".to_string()),
exit_code: Some(exit_code),
metadata: std::collections::HashMap::new(),
job_label: None, // Will be enriched by LogCollector
}),
);
// Then emit_partition_manifest -> success
let end_time = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs() as i64;
self.emit_log(
outputs,
job_log_entry::Content::Manifest(PartitionManifest {
outputs: config.outputs.clone(),
inputs: vec![], // Phase 0: no input manifests yet
start_time: self.start_time,
end_time,
task: Some(Task {
job: Some(JobLabel {
label: env::var("DATABUILD_JOB_LABEL")
.unwrap_or_else(|_| "unknown".to_string()),
}),
config: Some(config.clone()),
}),
metadata: std::collections::HashMap::new(), // Phase 0: no metadata yet
}),
);
} else {
// Following the state diagram: wrapper_monitor_task -> non-zero exit -> emit_task_failed
self.emit_log(
outputs,
job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "task_failed".to_string(),
job_status: Some("JOB_FAILED".to_string()),
exit_code: Some(exit_code),
metadata: std::collections::HashMap::new(),
job_label: None, // Will be enriched by LogCollector
}),
);
// Then emit_job_exec_fail -> fail (don't emit partition manifest on failure)
self.emit_log(
outputs,
job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "job_exec_fail".to_string(),
job_status: Some("JOB_FAILED".to_string()),
exit_code: Some(exit_code),
metadata: {
let mut meta = std::collections::HashMap::new();
meta.insert(
"error".to_string(),
format!("Job failed with exit code {}", exit_code),
);
meta
},
job_label: None, // Will be enriched by LogCollector
}),
);
}
// Forward the original job's output to stdout for compatibility
io::stdout().write_all(&output.stdout)?;
io::stderr().write_all(&output.stderr)?;
if !success {
std::process::exit(exit_code);
}
Ok(())
}
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
let args: Vec<String> = env::args().collect();
if args.len() < 2 {
eprintln!("Usage: job_wrapper <config|exec> [args...]");
std::process::exit(1);
}
let mode = &args[1];
let mut wrapper = JobWrapper::new();
match mode.as_str() {
"config" => {
let outputs = args[2..].to_vec();
wrapper.config_mode(outputs)?;
}
"exec" => {
// For exec mode, we need to know which original job binary to call.
// For Phase 0 this comes from the DATABUILD_JOB_BINARY environment variable,
// falling back to python3 if it is unset.
let job_binary =
env::var("DATABUILD_JOB_BINARY").unwrap_or_else(|_| "python3".to_string()); // Default fallback
wrapper.exec_mode(&job_binary)?;
}
_ => {
eprintln!("Unknown mode: {}", mode);
std::process::exit(1);
}
}
Ok(())
}
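// Example invocation (illustrative; the binary name, partition ref, and config JSON below are hypothetical):
//   job_wrapper config reviews/date=2025-01-01          # print the JobConfig JSON for a partition
//   echo "$CONFIG_JSON" | DATABUILD_JOB_BINARY=./my_job job_wrapper exec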
#[cfg(test)]
mod tests {
use super::*;
// Test infrastructure
struct TestSink {
entries: Vec<JobLogEntry>,
}
impl TestSink {
fn new() -> Self {
Self {
entries: Vec::new(),
}
}
fn find_event(&self, event_type: &str) -> Option<&JobLogEntry> {
self.entries.iter().find(|entry| {
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
event.event_type == event_type
} else {
false
}
})
}
}
impl LogSink for TestSink {
fn emit(&mut self, entry: JobLogEntry) {
self.entries.push(entry);
}
}
// Helper functions for testing
fn generate_test_config(outputs: &[String]) -> JobConfig {
JobConfig {
outputs: outputs
.iter()
.map(|s| PartitionRef { r#str: s.clone() })
.collect(),
inputs: vec![],
args: outputs.to_vec(),
env: {
let mut env_map = std::collections::HashMap::new();
if let Some(partition_ref) = outputs.first() {
env_map.insert("PARTITION_REF".to_string(), partition_ref.clone());
}
env_map
},
}
}
#[test]
fn test_job_log_entry_serialization() {
let entry = JobLogEntry {
timestamp: "1234567890".to_string(),
job_id: "test-id".to_string(),
outputs: vec![PartitionRef { r#str: "test/partition".to_string() }],
sequence_number: 1,
content: Some(job_log_entry::Content::Log(LogMessage {
level: log_message::LogLevel::Info as i32,
message: "test message".to_string(),
fields: std::collections::HashMap::new(),
})),
};
let json = serde_json::to_string(&entry).unwrap();
assert!(json.contains("\"timestamp\":\"1234567890\""));
assert!(json.contains("\"sequence_number\":1"));
assert!(json.contains("\"Log\":{")); // Capitalized field name
assert!(json.contains("\"message\":\"test message\""));
}
#[test]
fn test_sequence_number_increment() {
let mut wrapper = JobWrapper::new();
assert_eq!(wrapper.next_sequence(), 1);
assert_eq!(wrapper.next_sequence(), 2);
assert_eq!(wrapper.next_sequence(), 3);
}
#[test]
fn test_config_mode_output_format() {
let outputs = vec!["test/partition".to_string()];
let config = generate_test_config(&outputs);
// Verify it produces expected structure
assert_eq!(config.outputs.len(), 1);
assert_eq!(config.outputs[0].r#str, "test/partition");
assert_eq!(config.args, outputs);
assert_eq!(
config.env.get("PARTITION_REF"),
Some(&"test/partition".to_string())
);
}
#[test]
fn test_multiple_outputs_config() {
let outputs = vec![
"reviews/date=2025-01-01".to_string(),
"reviews/date=2025-01-02".to_string(),
];
let config = generate_test_config(&outputs);
assert_eq!(config.outputs.len(), 2);
assert_eq!(config.outputs[0].r#str, "reviews/date=2025-01-01");
assert_eq!(config.outputs[1].r#str, "reviews/date=2025-01-02");
// First output is used as PARTITION_REF
assert_eq!(
config.env.get("PARTITION_REF"),
Some(&"reviews/date=2025-01-01".to_string())
);
}
#[test]
fn test_wrapper_job_event_creation() {
// Test success event
let event = WrapperJobEvent {
event_type: "task_success".to_string(),
job_status: Some("JOB_COMPLETED".to_string()),
exit_code: Some(0),
metadata: std::collections::HashMap::new(),
job_label: None,
};
assert_eq!(event.event_type, "task_success");
assert_eq!(event.job_status, Some("JOB_COMPLETED".to_string()));
assert_eq!(event.exit_code, Some(0));
// Test failure event
let event = WrapperJobEvent {
event_type: "task_failed".to_string(),
job_status: Some("JOB_FAILED".to_string()),
exit_code: Some(1),
metadata: std::collections::HashMap::new(),
job_label: None,
};
assert_eq!(event.event_type, "task_failed");
assert_eq!(event.job_status, Some("JOB_FAILED".to_string()));
assert_eq!(event.exit_code, Some(1));
}
#[test]
fn test_log_message_levels() {
let info_log = LogMessage {
level: log_message::LogLevel::Info as i32,
message: "info message".to_string(),
fields: std::collections::HashMap::new(),
};
assert_eq!(info_log.level, log_message::LogLevel::Info as i32);
let error_log = LogMessage {
level: log_message::LogLevel::Error as i32,
message: "error message".to_string(),
fields: std::collections::HashMap::new(),
};
assert_eq!(error_log.level, log_message::LogLevel::Error as i32);
}
#[test]
fn test_partition_manifest_structure() {
let config = generate_test_config(&vec!["test/partition".to_string()]);
let manifest = PartitionManifest {
outputs: config.outputs.clone(),
inputs: vec![],
start_time: 1234567890,
end_time: 1234567900,
task: Some(Task {
job: Some(JobLabel {
label: "//test:job".to_string(),
}),
config: Some(config),
}),
metadata: std::collections::HashMap::new(),
};
assert_eq!(manifest.outputs.len(), 1);
assert_eq!(manifest.outputs[0].r#str, "test/partition");
assert_eq!(manifest.end_time - manifest.start_time, 10);
assert!(manifest.task.is_some());
}
#[test]
fn test_timestamp_generation() {
let ts1 = get_timestamp();
std::thread::sleep(std::time::Duration::from_millis(10));
let ts2 = get_timestamp();
// Timestamps should be parseable as integers
let t1: u64 = ts1.parse().expect("Should be valid timestamp");
let t2: u64 = ts2.parse().expect("Should be valid timestamp");
// Second timestamp should be equal or greater
assert!(t2 >= t1);
}
#[test]
fn test_job_wrapper_initialization() {
let wrapper = JobWrapper::new();
assert_eq!(wrapper.sequence_number, 0);
assert!(!wrapper.job_id.is_empty());
assert!(wrapper.start_time > 0);
}
#[test]
fn test_cpu_metrics_are_captured() {
use std::io::Write;
use tempfile::NamedTempFile;
// Create a CPU-intensive test script
let mut temp_file = NamedTempFile::new().expect("Failed to create temp file");
let script_content = r#"#!/usr/bin/env python3
import sys
import json
import time
if len(sys.argv) > 1 and sys.argv[1] == "config":
config = {
"outputs": [{"str": "test/cpu"}],
"inputs": [],
"args": [],
"env": {"PARTITION_REF": "test/cpu"}
}
print(json.dumps({"configs": [config]}))
elif len(sys.argv) > 1 and sys.argv[1] == "exec":
# CPU-intensive work that runs longer
start_time = time.time()
total = 0
while time.time() - start_time < 0.5: # Run for at least 500ms
total += sum(range(1_000_000))
print(f"Sum: {total}")
"#;
temp_file
.write_all(script_content.as_bytes())
.expect("Failed to write script");
let script_path = temp_file.path().to_str().unwrap();
// Make script executable
std::fs::set_permissions(
script_path,
std::os::unix::fs::PermissionsExt::from_mode(0o755),
)
.expect("Failed to set permissions");
// Set up environment for fast sampling and the test script
env::set_var("DATABUILD_METRICS_INTERVAL_MS", "10"); // Even faster for CPU test
env::set_var("DATABUILD_JOB_BINARY", script_path);
// Create test sink and wrapper
let sink = TestSink::new();
let mut wrapper = JobWrapper::new_with_sink(sink);
// Create a JobConfig for the test
let config = JobConfig {
outputs: vec![PartitionRef {
r#str: "test/cpu".to_string(),
}],
inputs: vec![],
args: vec![],
env: {
let mut env_map = std::collections::HashMap::new();
env_map.insert("PARTITION_REF".to_string(), "test/cpu".to_string());
env_map
},
};
// exec_mode reads its config from stdin, so tests use the test-specific
// exec_mode_with_config, which takes the config directly instead
let result = wrapper.exec_mode_with_config(script_path, config);
// Clean up environment
env::remove_var("DATABUILD_METRICS_INTERVAL_MS");
env::remove_var("DATABUILD_JOB_BINARY");
// Check that exec_mode succeeded
if let Err(e) = &result {
println!("exec_mode failed with error: {}", e);
}
assert!(result.is_ok(), "exec_mode should succeed: {:?}", result);
// Find the job_summary event
let summary_event = wrapper
.sink
.find_event("job_summary")
.expect("Should have job_summary event");
if let Some(job_log_entry::Content::JobEvent(event)) = &summary_event.content {
// Verify we have CPU metrics
let cpu_ms_str = event
.metadata
.get("total_cpu_ms")
.expect("Should have total_cpu_ms metric");
let cpu_ms: f64 = cpu_ms_str
.parse()
.expect("CPU metric should be valid float");
// For CPU-intensive work, we should get non-zero CPU time
assert!(
cpu_ms > 0.0,
"Expected non-zero CPU time for CPU-intensive workload, but got {:.3}ms",
cpu_ms
);
// Also verify runtime is reasonable
let runtime_ms_str = event
.metadata
.get("runtime_ms")
.expect("Should have runtime_ms metric");
let runtime_ms: f64 = runtime_ms_str
.parse()
.expect("Runtime metric should be valid float");
assert!(runtime_ms > 0.0, "Should have non-zero runtime");
println!(
"CPU test results: {:.3}ms CPU time over {:.3}ms runtime",
cpu_ms, runtime_ms
);
} else {
panic!("job_summary event should contain JobEvent");
}
}
#[test]
fn test_heartbeat_functionality() {
use std::io::Write;
use tempfile::NamedTempFile;
// Create a longer-running test script to trigger heartbeats
let mut temp_file = NamedTempFile::new().expect("Failed to create temp file");
let script_content = r#"#!/usr/bin/env python3
import sys
import json
import time
if len(sys.argv) > 1 and sys.argv[1] == "config":
config = {
"outputs": [{"str": "test/heartbeat"}],
"inputs": [],
"args": [],
"env": {"PARTITION_REF": "test/heartbeat"}
}
print(json.dumps({"configs": [config]}))
elif len(sys.argv) > 1 and sys.argv[1] == "exec":
# Sleep long enough to trigger at least 2 heartbeats
time.sleep(0.3) # 300ms with 100ms heartbeat interval should give us 2-3 heartbeats
print("Job completed")
"#;
temp_file
.write_all(script_content.as_bytes())
.expect("Failed to write script");
let script_path = temp_file.path().to_str().unwrap();
// Make script executable
std::fs::set_permissions(
script_path,
std::os::unix::fs::PermissionsExt::from_mode(0o755),
)
.expect("Failed to set permissions");
// Set up environment for fast heartbeats and the test script
env::set_var("DATABUILD_HEARTBEAT_INTERVAL_MS", &TEST_HEARTBEAT_INTERVAL_MS.to_string());
env::set_var("DATABUILD_METRICS_INTERVAL_MS", &TEST_METRICS_INTERVAL_MS.to_string());
env::set_var("DATABUILD_JOB_BINARY", script_path);
// Create test sink and wrapper
let sink = TestSink::new();
let mut wrapper = JobWrapper::new_with_sink(sink);
// Create a JobConfig for the test
let config = JobConfig {
outputs: vec![PartitionRef {
r#str: "test/heartbeat".to_string(),
}],
inputs: vec![],
args: vec![],
env: {
let mut env_map = std::collections::HashMap::new();
env_map.insert("PARTITION_REF".to_string(), "test/heartbeat".to_string());
env_map
},
};
// Run the job
let result = wrapper.exec_mode_with_config(script_path, config);
// Clean up environment
env::remove_var("DATABUILD_HEARTBEAT_INTERVAL_MS");
env::remove_var("DATABUILD_METRICS_INTERVAL_MS");
env::remove_var("DATABUILD_JOB_BINARY");
// Check that exec_mode succeeded
assert!(result.is_ok(), "exec_mode should succeed: {:?}", result);
// Count heartbeat events
let heartbeat_count = wrapper
.sink
.entries
.iter()
.filter(|entry| {
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
event.event_type == "heartbeat"
} else {
false
}
})
.count();
// We should have at least 1 heartbeat event (possibly 2-3 depending on timing)
assert!(
heartbeat_count >= 1,
"Expected at least 1 heartbeat event, but got {}",
heartbeat_count
);
// Verify heartbeat event structure
let heartbeat_event = wrapper
.sink
.entries
.iter()
.find(|entry| {
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
event.event_type == "heartbeat"
} else {
false
}
})
.expect("Should have at least one heartbeat event");
if let Some(job_log_entry::Content::JobEvent(event)) = &heartbeat_event.content {
// Verify heartbeat contains memory and CPU metrics
assert!(
event.metadata.contains_key("memory_usage_mb"),
"Heartbeat should contain memory_usage_mb"
);
assert!(
event.metadata.contains_key("cpu_usage_percent"),
"Heartbeat should contain cpu_usage_percent"
);
}
}
}

@@ -1,41 +1,4 @@
mod build_event_log;
// Include generated protobuf code
include!("databuild.rs");
// Event log module
pub mod event_log;
// Orchestration module
pub mod orchestration;
// Service module
pub mod service;
// Repository pattern implementations
pub mod repositories;
pub mod mermaid_utils;
// Status conversion utilities
pub mod status_utils;
// Log collection module
pub mod log_collector;
// Log access module
pub mod log_access;
// Metric templates module
pub mod metric_templates;
// Metrics aggregator module
pub mod metrics_aggregator;
// Format consistency tests
#[cfg(test)]
mod format_consistency_test;
// Re-export commonly used types from event_log
pub use event_log::{BuildEventLogError, create_bel_query_engine};
// Re-export orchestration types
pub use orchestration::{BuildOrchestrator, BuildResult, OrchestrationError};

@@ -1,440 +0,0 @@
use crate::{JobLogEntry, JobLogsRequest, JobLogsResponse, log_message};
use serde_json;
use std::collections::HashMap;
use std::fs::{self, File};
use std::io::{BufRead, BufReader};
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};
use thiserror::Error;
#[derive(Error, Debug)]
pub enum LogAccessError {
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("JSON parsing error: {0}")]
Json(#[from] serde_json::Error),
#[error("Invalid request: {0}")]
InvalidRequest(String),
#[error("Job not found: {0}")]
JobNotFound(String),
}
pub struct LogReader {
logs_base_path: PathBuf,
}
impl LogReader {
pub fn new<P: AsRef<Path>>(logs_base_path: P) -> Self {
Self {
logs_base_path: logs_base_path.as_ref().to_path_buf(),
}
}
/// Create LogReader with the default logs directory
pub fn default() -> Self {
Self::new(crate::log_collector::LogCollector::default_logs_dir())
}
/// Get job logs according to the request criteria
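/// Applies the request's since_timestamp and min_level filters and stops after
/// `limit` entries (defaulting to 1000 when limit is 0).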
pub fn get_job_logs(&self, request: &JobLogsRequest) -> Result<JobLogsResponse, LogAccessError> {
let job_file_path = self.find_job_file(&request.job_run_id)?;
let file = File::open(&job_file_path)?;
let reader = BufReader::new(file);
let mut entries = Vec::new();
let mut count = 0u32;
let limit = if request.limit > 0 { request.limit } else { 1000 }; // Default limit
for line in reader.lines() {
let line = line?;
// Skip empty lines
if line.trim().is_empty() {
continue;
}
// Parse the log entry
let entry: JobLogEntry = serde_json::from_str(&line)?;
// Apply filters
if !self.matches_filters(&entry, request) {
continue;
}
entries.push(entry);
count += 1;
// Stop if we've hit the limit
if count >= limit {
break;
}
}
// If we hit the limit, assume there may be more entries beyond it
let has_more = count == limit;
Ok(JobLogsResponse {
entries,
has_more,
})
}
/// List available job run IDs for a given date range
pub fn list_available_jobs(&self, date_range: Option<(String, String)>) -> Result<Vec<String>, LogAccessError> {
let mut job_ids = Vec::new();
// If no date range specified, look at all directories
if let Some((start_date, end_date)) = date_range {
// Parse date range and iterate through dates
for date_str in self.date_range_iterator(&start_date, &end_date)? {
let date_dir = self.logs_base_path.join(&date_str);
if date_dir.exists() {
job_ids.extend(self.get_job_ids_from_directory(&date_dir)?);
}
}
} else {
// List all date directories and collect job IDs
if self.logs_base_path.exists() {
for entry in fs::read_dir(&self.logs_base_path)? {
let entry = entry?;
if entry.file_type()?.is_dir() {
job_ids.extend(self.get_job_ids_from_directory(&entry.path())?);
}
}
}
}
// Remove duplicates and sort
job_ids.sort();
job_ids.dedup();
Ok(job_ids)
}
/// Get metrics points for a specific job
pub fn get_job_metrics(&self, job_run_id: &str) -> Result<Vec<crate::MetricPoint>, LogAccessError> {
let job_file_path = self.find_job_file(job_run_id)?;
let file = File::open(&job_file_path)?;
let reader = BufReader::new(file);
let mut metrics = Vec::new();
for line in reader.lines() {
let line = line?;
// Skip empty lines
if line.trim().is_empty() {
continue;
}
// Parse the log entry
let entry: JobLogEntry = serde_json::from_str(&line)?;
// Extract metrics from the entry
if let Some(crate::job_log_entry::Content::Metric(metric)) = entry.content {
metrics.push(metric);
}
}
Ok(metrics)
}
/// Find the JSONL file for a specific job run ID
fn find_job_file(&self, job_run_id: &str) -> Result<PathBuf, LogAccessError> {
// Search through all date directories for the job file
if !self.logs_base_path.exists() {
return Err(LogAccessError::JobNotFound(job_run_id.to_string()));
}
for entry in fs::read_dir(&self.logs_base_path)? {
let entry = entry?;
if entry.file_type()?.is_dir() {
let job_file = entry.path().join(format!("{}.jsonl", job_run_id));
if job_file.exists() {
return Ok(job_file);
}
}
}
Err(LogAccessError::JobNotFound(job_run_id.to_string()))
}
/// Check if a log entry matches the request filters
fn matches_filters(&self, entry: &JobLogEntry, request: &JobLogsRequest) -> bool {
// Filter by timestamp (since_timestamp is in nanoseconds)
if request.since_timestamp > 0 {
if let Ok(entry_timestamp) = entry.timestamp.parse::<u64>() {
let entry_timestamp_ns = entry_timestamp * 1_000_000_000; // Convert seconds to nanoseconds
if entry_timestamp_ns <= request.since_timestamp as u64 {
return false;
}
}
}
// Filter by log level (only applies to log messages)
if request.min_level > 0 {
if let Some(crate::job_log_entry::Content::Log(log_msg)) = &entry.content {
if log_msg.level < request.min_level {
return false;
}
}
// For non-log entries (metrics, events), we include them regardless of min_level
}
true
}
/// Get job IDs from files in a specific directory
fn get_job_ids_from_directory(&self, dir_path: &Path) -> Result<Vec<String>, LogAccessError> {
let mut job_ids = Vec::new();
for entry in fs::read_dir(dir_path)? {
let entry = entry?;
if entry.file_type()?.is_file() {
if let Some(file_name) = entry.file_name().to_str() {
if file_name.ends_with(".jsonl") {
// Extract job ID by removing .jsonl extension
let job_id = file_name.trim_end_matches(".jsonl");
job_ids.push(job_id.to_string());
}
}
}
}
Ok(job_ids)
}
/// Generate the date strings covering a range (YYYY-MM-DD format)
fn date_range_iterator(&self, start_date: &str, end_date: &str) -> Result<Vec<String>, LogAccessError> {
// Simple implementation - for production might want more robust date parsing
let start_parts: Vec<&str> = start_date.split('-').collect();
let end_parts: Vec<&str> = end_date.split('-').collect();
if start_parts.len() != 3 || end_parts.len() != 3 {
return Err(LogAccessError::InvalidRequest("Invalid date format, expected YYYY-MM-DD".to_string()));
}
// For now, just return the start and end dates
// In a full implementation, you'd iterate through all dates in between
let mut dates = vec![start_date.to_string()];
if start_date != end_date {
dates.push(end_date.to_string());
}
Ok(dates)
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::{job_log_entry, log_message, LogMessage, PartitionRef, MetricPoint};
use std::io::Write;
use tempfile::TempDir;
fn create_test_log_entry(job_id: &str, sequence: u64, timestamp: &str) -> JobLogEntry {
JobLogEntry {
timestamp: timestamp.to_string(),
job_id: job_id.to_string(),
outputs: vec![PartitionRef { r#str: "test/partition".to_string() }],
sequence_number: sequence,
content: Some(job_log_entry::Content::Log(LogMessage {
level: log_message::LogLevel::Info as i32,
message: format!("Test log message {}", sequence),
fields: HashMap::new(),
})),
}
}
fn create_test_metric_entry(job_id: &str, sequence: u64, timestamp: &str) -> JobLogEntry {
JobLogEntry {
timestamp: timestamp.to_string(),
job_id: job_id.to_string(),
outputs: vec![PartitionRef { r#str: "test/partition".to_string() }],
sequence_number: sequence,
content: Some(job_log_entry::Content::Metric(MetricPoint {
name: "test_metric".to_string(),
value: 42.0,
labels: HashMap::new(),
unit: "count".to_string(),
})),
}
}
fn setup_test_logs(temp_dir: &TempDir) -> Result<(), Box<dyn std::error::Error>> {
// Create date directory
let date_dir = temp_dir.path().join("2025-01-27");
fs::create_dir_all(&date_dir)?;
// Create a test job file
let job_file = date_dir.join("job_123.jsonl");
let mut file = File::create(&job_file)?;
// Write test entries
let entry1 = create_test_log_entry("job_123", 1, "1737993600"); // 2025-01-27 12:00:00
let entry2 = create_test_log_entry("job_123", 2, "1737993660"); // 2025-01-27 12:01:00
let entry3 = create_test_metric_entry("job_123", 3, "1737993720"); // 2025-01-27 12:02:00
writeln!(file, "{}", serde_json::to_string(&entry1)?)?;
writeln!(file, "{}", serde_json::to_string(&entry2)?)?;
writeln!(file, "{}", serde_json::to_string(&entry3)?)?;
Ok(())
}
#[test]
fn test_log_reader_creation() {
let temp_dir = TempDir::new().unwrap();
let reader = LogReader::new(temp_dir.path());
assert_eq!(reader.logs_base_path, temp_dir.path());
}
#[test]
fn test_get_job_logs_basic() {
let temp_dir = TempDir::new().unwrap();
setup_test_logs(&temp_dir).unwrap();
let reader = LogReader::new(temp_dir.path());
let request = JobLogsRequest {
job_run_id: "job_123".to_string(),
since_timestamp: 0,
min_level: 0,
limit: 10,
};
let response = reader.get_job_logs(&request).unwrap();
assert_eq!(response.entries.len(), 3);
assert!(!response.has_more);
// Verify the entries are in order
assert_eq!(response.entries[0].sequence_number, 1);
assert_eq!(response.entries[1].sequence_number, 2);
assert_eq!(response.entries[2].sequence_number, 3);
}
#[test]
fn test_get_job_logs_with_timestamp_filter() {
let temp_dir = TempDir::new().unwrap();
setup_test_logs(&temp_dir).unwrap();
let reader = LogReader::new(temp_dir.path());
let request = JobLogsRequest {
job_run_id: "job_123".to_string(),
since_timestamp: 1737993600_000_000_000, // 2025-01-27 12:00:00 in nanoseconds
min_level: 0,
limit: 10,
};
let response = reader.get_job_logs(&request).unwrap();
// Should get entries 2 and 3 (after the timestamp)
assert_eq!(response.entries.len(), 2);
assert_eq!(response.entries[0].sequence_number, 2);
assert_eq!(response.entries[1].sequence_number, 3);
}
#[test]
fn test_get_job_logs_with_level_filter() {
let temp_dir = TempDir::new().unwrap();
setup_test_logs(&temp_dir).unwrap();
let reader = LogReader::new(temp_dir.path());
let request = JobLogsRequest {
job_run_id: "job_123".to_string(),
since_timestamp: 0,
min_level: log_message::LogLevel::Warn as i32, // Only WARN and ERROR
limit: 10,
};
let response = reader.get_job_logs(&request).unwrap();
// Should get only the metric entry (sequence 3) since log entries are INFO level
assert_eq!(response.entries.len(), 1);
assert_eq!(response.entries[0].sequence_number, 3);
}
#[test]
fn test_get_job_logs_with_limit() {
let temp_dir = TempDir::new().unwrap();
setup_test_logs(&temp_dir).unwrap();
let reader = LogReader::new(temp_dir.path());
let request = JobLogsRequest {
job_run_id: "job_123".to_string(),
since_timestamp: 0,
min_level: 0,
limit: 2,
};
let response = reader.get_job_logs(&request).unwrap();
assert_eq!(response.entries.len(), 2);
assert!(response.has_more);
assert_eq!(response.entries[0].sequence_number, 1);
assert_eq!(response.entries[1].sequence_number, 2);
}
#[test]
fn test_list_available_jobs() {
let temp_dir = TempDir::new().unwrap();
setup_test_logs(&temp_dir).unwrap();
// Create another job file
let date_dir = temp_dir.path().join("2025-01-27");
let job_file2 = date_dir.join("job_456.jsonl");
let mut file2 = File::create(&job_file2).unwrap();
let entry = create_test_log_entry("job_456", 1, "1737993600");
writeln!(file2, "{}", serde_json::to_string(&entry).unwrap()).unwrap();
let reader = LogReader::new(temp_dir.path());
let job_ids = reader.list_available_jobs(None).unwrap();
assert_eq!(job_ids.len(), 2);
assert!(job_ids.contains(&"job_123".to_string()));
assert!(job_ids.contains(&"job_456".to_string()));
}
#[test]
fn test_get_job_metrics() {
let temp_dir = TempDir::new().unwrap();
setup_test_logs(&temp_dir).unwrap();
let reader = LogReader::new(temp_dir.path());
let metrics = reader.get_job_metrics("job_123").unwrap();
assert_eq!(metrics.len(), 1);
assert_eq!(metrics[0].name, "test_metric");
assert_eq!(metrics[0].value, 42.0);
assert_eq!(metrics[0].unit, "count");
}
#[test]
fn test_job_not_found() {
let temp_dir = TempDir::new().unwrap();
let reader = LogReader::new(temp_dir.path());
let request = JobLogsRequest {
job_run_id: "nonexistent_job".to_string(),
since_timestamp: 0,
min_level: 0,
limit: 10,
};
let result = reader.get_job_logs(&request);
assert!(result.is_err());
assert!(matches!(result.unwrap_err(), LogAccessError::JobNotFound(_)));
}
#[test]
fn test_default_log_reader() {
let reader = LogReader::default();
// Should use the default logs directory
let expected = crate::log_collector::LogCollector::default_logs_dir();
assert_eq!(reader.logs_base_path, expected);
}
}

@@ -1,402 +0,0 @@
use crate::{JobLogEntry, job_log_entry};
use serde_json;
use std::collections::HashMap;
use std::fs::{self, File, OpenOptions};
use std::io::{BufRead, Write};
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};
use thiserror::Error;
/// Convert days since Unix epoch to (year, month, day)
/// This is a simplified algorithm good enough for log file naming
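/// e.g. days_to_ymd(0) == (1970, 1, 1) and days_to_ymd(19_723) == (2024, 1, 1)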
fn days_to_ymd(days: i32) -> (i32, u32, u32) {
// Start from 1970-01-01
let mut year = 1970;
let mut remaining_days = days;
// Handle years
loop {
let days_in_year = if is_leap_year(year) { 366 } else { 365 };
if remaining_days < days_in_year {
break;
}
remaining_days -= days_in_year;
year += 1;
}
// Handle months
let mut month = 1;
for m in 1..=12 {
let days_in_month = days_in_month(year, m);
if remaining_days < days_in_month as i32 {
month = m;
break;
}
remaining_days -= days_in_month as i32;
}
let day = remaining_days + 1; // Days are 1-indexed
(year, month, day as u32)
}
/// Check if a year is a leap year
fn is_leap_year(year: i32) -> bool {
(year % 4 == 0 && year % 100 != 0) || (year % 400 == 0)
}
/// Get number of days in a given month
fn days_in_month(year: i32, month: u32) -> u32 {
match month {
1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
4 | 6 | 9 | 11 => 30,
2 => if is_leap_year(year) { 29 } else { 28 },
_ => unreachable!("invalid month: {}", month),
}
}
#[derive(Error, Debug)]
pub enum LogCollectorError {
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("JSON parsing error: {0}")]
Json(#[from] serde_json::Error),
#[error("Invalid log entry: {0}")]
InvalidLogEntry(String),
}
pub struct LogCollector {
logs_dir: PathBuf,
active_files: HashMap<String, File>,
job_label_mapping: HashMap<String, String>, // job_run_id -> job_label
}
impl LogCollector {
pub fn new<P: AsRef<Path>>(logs_dir: P) -> Result<Self, LogCollectorError> {
let logs_dir = logs_dir.as_ref().to_path_buf();
// Ensure the base logs directory exists
if !logs_dir.exists() {
fs::create_dir_all(&logs_dir)?;
}
Ok(Self {
logs_dir,
active_files: HashMap::new(),
job_label_mapping: HashMap::new(),
})
}
/// Set the job label for a specific job run ID
pub fn set_job_label(&mut self, job_run_id: &str, job_label: &str) {
self.job_label_mapping.insert(job_run_id.to_string(), job_label.to_string());
}
/// Get the default logs directory based on environment variable or fallback
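/// e.g. DATABUILD_LOGS_DIR=/var/log/databuild (path illustrative) overrides the ./logs/databuild fallback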
pub fn default_logs_dir() -> PathBuf {
std::env::var("DATABUILD_LOGS_DIR")
.map(PathBuf::from)
.unwrap_or_else(|_| {
// Fallback to ./logs/databuild for safety - avoid system directories
std::env::current_dir()
.unwrap_or_else(|_| PathBuf::from("."))
.join("logs")
.join("databuild")
})
}
/// Create a date-organized directory path for today
fn get_date_directory(&self) -> Result<PathBuf, LogCollectorError> {
let now = SystemTime::now()
.duration_since(UNIX_EPOCH)
.map_err(|e| LogCollectorError::InvalidLogEntry(format!("System time error: {}", e)))?;
let timestamp = now.as_secs();
// Convert timestamp to YYYY-MM-DD format
// Using a simple calculation instead of chrono
let days_since_epoch = timestamp / 86400; // 86400 seconds in a day
let days_since_1970 = days_since_epoch as i32;
// Calculate year, month, day from days since epoch
// This is a simplified calculation - good enough for log file naming
let (year, month, day) = days_to_ymd(days_since_1970);
let date_str = format!("{:04}-{:02}-{:02}", year, month, day);
let date_dir = self.logs_dir.join(date_str);
// Ensure the date directory exists
if !date_dir.exists() {
fs::create_dir_all(&date_dir)?;
}
Ok(date_dir)
}
/// Get or create a file handle for a specific job run
fn get_job_file(&mut self, job_run_id: &str) -> Result<&mut File, LogCollectorError> {
if !self.active_files.contains_key(job_run_id) {
let date_dir = self.get_date_directory()?;
let file_path = date_dir.join(format!("{}.jsonl", job_run_id));
let file = OpenOptions::new()
.create(true)
.append(true)
.open(&file_path)?;
self.active_files.insert(job_run_id.to_string(), file);
}
Ok(self.active_files.get_mut(job_run_id).unwrap())
}
/// Write a single log entry to the appropriate JSONL file
pub fn write_log_entry(&mut self, job_run_id: &str, entry: &JobLogEntry) -> Result<(), LogCollectorError> {
let file = self.get_job_file(job_run_id)?;
let json_line = serde_json::to_string(entry)?;
writeln!(file, "{}", json_line)?;
file.flush()?;
Ok(())
}
/// Consume stdout from a job process and parse/store log entries
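/// Lines that parse as JobLogEntry are validated and enriched with the job label;
/// any other line is wrapped in an INFO-level LogMessage entry with sequence_number 0.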
pub fn consume_job_output<R: BufRead>(&mut self, job_run_id: &str, reader: R) -> Result<(), LogCollectorError> {
for line in reader.lines() {
let line = line?;
// Skip empty lines
if line.trim().is_empty() {
continue;
}
// Try to parse as JobLogEntry
match serde_json::from_str::<JobLogEntry>(&line) {
Ok(mut entry) => {
// Validate that the job_id matches
if entry.job_id != job_run_id {
return Err(LogCollectorError::InvalidLogEntry(
format!("Job ID mismatch: expected {}, got {}", job_run_id, entry.job_id)
));
}
// Enrich WrapperJobEvent and Manifest with job_label if available
if let Some(job_label) = self.job_label_mapping.get(job_run_id) {
match &mut entry.content {
Some(job_log_entry::Content::JobEvent(ref mut job_event)) => {
job_event.job_label = Some(job_label.clone());
}
Some(job_log_entry::Content::Manifest(ref mut manifest)) => {
if let Some(ref mut task) = manifest.task {
if let Some(ref mut job) = task.job {
job.label = job_label.clone();
}
}
}
_ => {} // No enrichment needed for Log entries
}
}
self.write_log_entry(job_run_id, &entry)?;
}
Err(_) => {
// If it's not a JobLogEntry, treat it as raw output and create a log entry
let raw_entry = JobLogEntry {
timestamp: SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs()
.to_string(),
job_id: job_run_id.to_string(),
outputs: vec![], // Raw output doesn't have specific outputs
sequence_number: 0, // Raw output gets sequence 0
content: Some(crate::job_log_entry::Content::Log(crate::LogMessage {
level: crate::log_message::LogLevel::Info as i32,
message: line,
fields: HashMap::new(),
})),
};
self.write_log_entry(job_run_id, &raw_entry)?;
}
}
}
Ok(())
}
/// Close and flush all active files
pub fn close_all(&mut self) -> Result<(), LogCollectorError> {
for (_, mut file) in self.active_files.drain() {
file.flush()?;
}
Ok(())
}
/// Close and flush a specific job's file
pub fn close_job(&mut self, job_run_id: &str) -> Result<(), LogCollectorError> {
if let Some(mut file) = self.active_files.remove(job_run_id) {
file.flush()?;
}
Ok(())
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::{job_log_entry, log_message, LogMessage, PartitionRef};
use std::io::Cursor;
use tempfile::TempDir;
fn create_test_log_entry(job_id: &str, sequence: u64) -> JobLogEntry {
JobLogEntry {
timestamp: "1234567890".to_string(),
job_id: job_id.to_string(),
outputs: vec![PartitionRef { r#str: "test/partition".to_string() }],
sequence_number: sequence,
content: Some(job_log_entry::Content::Log(LogMessage {
level: log_message::LogLevel::Info as i32,
message: "Test log message".to_string(),
fields: HashMap::new(),
})),
}
}
#[test]
fn test_log_collector_creation() {
let temp_dir = TempDir::new().unwrap();
let collector = LogCollector::new(temp_dir.path()).unwrap();
assert_eq!(collector.logs_dir, temp_dir.path());
assert!(collector.active_files.is_empty());
}
#[test]
fn test_write_single_log_entry() {
let temp_dir = TempDir::new().unwrap();
let mut collector = LogCollector::new(temp_dir.path()).unwrap();
let entry = create_test_log_entry("job_123", 1);
collector.write_log_entry("job_123", &entry).unwrap();
// Verify file was created and contains the entry
collector.close_all().unwrap();
// Check that a date directory was created
let date_dirs: Vec<_> = fs::read_dir(temp_dir.path()).unwrap().collect();
assert_eq!(date_dirs.len(), 1);
// Check that the job file exists in the date directory
let date_dir_path = date_dirs[0].as_ref().unwrap().path();
let job_files: Vec<_> = fs::read_dir(&date_dir_path).unwrap().collect();
assert_eq!(job_files.len(), 1);
let job_file_path = job_files[0].as_ref().unwrap().path();
assert!(job_file_path.file_name().unwrap().to_string_lossy().contains("job_123"));
// Verify content
let content = fs::read_to_string(&job_file_path).unwrap();
assert!(content.contains("Test log message"));
assert!(content.contains("\"sequence_number\":1"));
}
#[test]
fn test_consume_structured_output() {
let temp_dir = TempDir::new().unwrap();
let mut collector = LogCollector::new(temp_dir.path()).unwrap();
let entry1 = create_test_log_entry("job_456", 1);
let entry2 = create_test_log_entry("job_456", 2);
let input = format!("{}\n{}\n",
serde_json::to_string(&entry1).unwrap(),
serde_json::to_string(&entry2).unwrap()
);
let reader = Cursor::new(input);
collector.consume_job_output("job_456", reader).unwrap();
collector.close_all().unwrap();
// Verify both entries were written
let date_dirs: Vec<_> = fs::read_dir(temp_dir.path()).unwrap().collect();
let date_dir_path = date_dirs[0].as_ref().unwrap().path();
let job_files: Vec<_> = fs::read_dir(&date_dir_path).unwrap().collect();
let job_file_path = job_files[0].as_ref().unwrap().path();
let content = fs::read_to_string(&job_file_path).unwrap();
let lines: Vec<&str> = content.trim().split('\n').collect();
assert_eq!(lines.len(), 2);
// Verify both entries can be parsed back
let parsed1: JobLogEntry = serde_json::from_str(lines[0]).unwrap();
let parsed2: JobLogEntry = serde_json::from_str(lines[1]).unwrap();
assert_eq!(parsed1.sequence_number, 1);
assert_eq!(parsed2.sequence_number, 2);
}
#[test]
fn test_consume_mixed_output() {
let temp_dir = TempDir::new().unwrap();
let mut collector = LogCollector::new(temp_dir.path()).unwrap();
let entry = create_test_log_entry("job_789", 1);
let structured_line = serde_json::to_string(&entry).unwrap();
let input = format!("{}\nRaw output line\nAnother raw line\n", structured_line);
let reader = Cursor::new(input);
collector.consume_job_output("job_789", reader).unwrap();
collector.close_all().unwrap();
// Verify all lines were captured (1 structured + 2 raw)
let date_dirs: Vec<_> = fs::read_dir(temp_dir.path()).unwrap().collect();
let date_dir_path = date_dirs[0].as_ref().unwrap().path();
let job_files: Vec<_> = fs::read_dir(&date_dir_path).unwrap().collect();
let job_file_path = job_files[0].as_ref().unwrap().path();
let content = fs::read_to_string(&job_file_path).unwrap();
let lines: Vec<&str> = content.trim().split('\n').collect();
assert_eq!(lines.len(), 3);
// First line should be the structured entry
let parsed1: JobLogEntry = serde_json::from_str(lines[0]).unwrap();
assert_eq!(parsed1.sequence_number, 1);
// Second and third lines should be raw output entries
let parsed2: JobLogEntry = serde_json::from_str(lines[1]).unwrap();
let parsed3: JobLogEntry = serde_json::from_str(lines[2]).unwrap();
assert_eq!(parsed2.sequence_number, 0); // Raw output gets sequence 0
assert_eq!(parsed3.sequence_number, 0);
if let Some(job_log_entry::Content::Log(log_msg)) = &parsed2.content {
assert_eq!(log_msg.message, "Raw output line");
} else {
panic!("Expected log content");
}
}
#[test]
fn test_default_logs_dir() {
let default_dir = LogCollector::default_logs_dir();
// Should be a valid path
assert!(default_dir.is_absolute() || default_dir.starts_with("."));
assert!(default_dir.to_string_lossy().contains("logs"));
assert!(default_dir.to_string_lossy().contains("databuild"));
}
#[test]
fn test_job_id_validation() {
let temp_dir = TempDir::new().unwrap();
let mut collector = LogCollector::new(temp_dir.path()).unwrap();
let mut entry = create_test_log_entry("wrong_job_id", 1);
entry.job_id = "wrong_job_id".to_string();
let input = serde_json::to_string(&entry).unwrap();
let reader = Cursor::new(input);
let result = collector.consume_job_output("expected_job_id", reader);
assert!(result.is_err());
assert!(result.unwrap_err().to_string().contains("Job ID mismatch"));
}
}

@@ -1,915 +0,0 @@
use crate::*;
use std::collections::{HashMap, HashSet};
/// Represents the status of a job or partition for visualization
#[derive(Debug, Clone, PartialEq)]
pub enum NodeStatus {
Pending,
Running,
Completed,
Failed,
Cancelled,
Skipped,
Available,
Delegated,
}
impl NodeStatus {
/// Get the CSS class name for this status
fn css_class(&self) -> &'static str {
match self {
NodeStatus::Pending => "pending",
NodeStatus::Running => "running",
NodeStatus::Completed => "completed",
NodeStatus::Failed => "failed",
NodeStatus::Cancelled => "cancelled",
NodeStatus::Skipped => "skipped",
NodeStatus::Available => "available",
NodeStatus::Delegated => "delegated",
}
}
}
/// Extract current status information from build events
pub fn extract_status_map(events: &[BuildEvent]) -> (HashMap<String, NodeStatus>, HashMap<String, NodeStatus>) {
let mut job_statuses: HashMap<String, NodeStatus> = HashMap::new();
let mut partition_statuses: HashMap<String, NodeStatus> = HashMap::new();
// Process events in chronological order to get latest status
let mut sorted_events = events.to_vec();
sorted_events.sort_by_key(|e| e.timestamp);
for event in sorted_events {
match &event.event_type {
Some(crate::build_event::EventType::JobEvent(job_event)) => {
if let Some(job_label) = &job_event.job_label {
let status = match job_event.status_code {
1 => NodeStatus::Running, // JOB_SCHEDULED
2 => NodeStatus::Running, // JOB_RUNNING
3 => NodeStatus::Completed, // JOB_COMPLETED
4 => NodeStatus::Failed, // JOB_FAILED
5 => NodeStatus::Cancelled, // JOB_CANCELLED
6 => NodeStatus::Skipped, // JOB_SKIPPED
_ => NodeStatus::Pending,
};
// Create a unique key using job label + target partitions (same as node ID)
let outputs_label = job_event.target_partitions.iter()
.map(|p| p.str.clone())
.collect::<Vec<_>>()
.join("___");
let unique_key = encode_id(&(job_label.label.clone() + "___" + &outputs_label));
job_statuses.insert(unique_key, status);
}
}
Some(crate::build_event::EventType::PartitionEvent(partition_event)) => {
if let Some(partition_ref) = &partition_event.partition_ref {
let status = match partition_event.status_code {
1 => NodeStatus::Pending, // PARTITION_REQUESTED
2 => NodeStatus::Pending, // PARTITION_ANALYZED
3 => NodeStatus::Running, // PARTITION_BUILDING
4 => NodeStatus::Available, // PARTITION_AVAILABLE
5 => NodeStatus::Failed, // PARTITION_FAILED
6 => NodeStatus::Delegated, // PARTITION_DELEGATED
_ => NodeStatus::Pending,
};
partition_statuses.insert(partition_ref.str.clone(), status);
}
}
_ => {}
}
}
(job_statuses, partition_statuses)
}
/// Convert NodeStatus to EdgeStatus for edge coloring
fn map_node_status_to_edge_status(node_status: &NodeStatus) -> EdgeStatus {
match node_status {
NodeStatus::Failed => EdgeStatus::Failed,
NodeStatus::Running => EdgeStatus::Running,
NodeStatus::Completed => EdgeStatus::Completed,
NodeStatus::Available => EdgeStatus::Available,
NodeStatus::Pending => EdgeStatus::Pending,
NodeStatus::Cancelled => EdgeStatus::Failed, // Treat cancelled as failed
NodeStatus::Skipped => EdgeStatus::Pending, // Treat skipped as pending
NodeStatus::Delegated => EdgeStatus::Available, // Treat delegated as available
}
}
/// Encodes ID for safe usage in mermaid graph
fn encode_id(id: &str) -> String {
id.replace("/", "_").replace("=", "_").replace(":", "_")
}
/// Trait for all Mermaid node types
trait MermaidNode {
fn id(&self) -> &str;
#[allow(dead_code)]
fn label(&self) -> &str;
fn render(&self, status: &NodeStatus) -> String;
}
/// Represents a job node in the Mermaid diagram
struct MermaidJobNode {
task: Task,
id: String,
label: String,
}
impl MermaidJobNode {
fn from(task: &Task) -> Option<MermaidJobNode> {
let job_label: String = match &task.job {
Some(job) => job.label.clone(),
None => return None,
};
let outputs_label: String = match &task.config {
Some(config) => config.outputs.iter()
.map(|o| o.str.clone())
.collect::<Vec<_>>()
.join("___"),
None => String::new(),
};
let id = encode_id(&(job_label.clone() + "___" + &outputs_label));
let label = format!("**{}** {}", job_label, outputs_label);
Some(MermaidJobNode {
task: task.clone(),
id,
label,
})
}
fn to_mermaid(&self, job_statuses: &HashMap<String, NodeStatus>) -> String {
// Use the same unique ID logic for status lookup as we use for the node ID
let status = job_statuses.get(&self.id).unwrap_or(&NodeStatus::Pending);
self.render(status)
}
}
impl MermaidNode for MermaidJobNode {
fn id(&self) -> &str {
&self.id
}
fn label(&self) -> &str {
&self.label
}
fn render(&self, status: &NodeStatus) -> String {
format!(" {}[\"{}\"]:::job_{}\n", self.id, self.label, status.css_class())
}
}
/// Represents a partition node in the Mermaid diagram
struct MermaidPartitionNode {
id: String,
label: String,
is_output: bool,
}
impl MermaidPartitionNode {
fn new(partition_ref: &str, is_output: bool) -> Self {
let id = format!("ref_{}", encode_id(partition_ref));
let label = partition_ref.to_string();
Self {
id,
label,
is_output,
}
}
}
impl MermaidNode for MermaidPartitionNode {
fn id(&self) -> &str {
&self.id
}
fn label(&self) -> &str {
&self.label
}
fn render(&self, status: &NodeStatus) -> String {
let node_class = if self.is_output {
format!("outputPartition_{}", status.css_class())
} else {
format!("partition_{}", status.css_class())
};
format!(" {}[(\"{}\")]:::{}\n", self.id, encode_id(&self.label), node_class)
}
}
/// Types of edges in the diagram
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum EdgeType {
Solid, // Regular dependency
Dotted, // Weak dependency
}
/// Status of an edge for coloring purposes
#[derive(Debug, Clone, PartialEq)]
enum EdgeStatus {
Failed, // Red - critical path issues
Running, // Yellow - actively processing
Completed, // Green - successfully processed
Available, // Light green - data ready
Pending, // Gray - waiting/not started
}
/// Represents an edge between two nodes
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct MermaidEdge {
from_id: String,
to_id: String,
edge_type: EdgeType,
}
impl MermaidEdge {
fn new(from_id: String, to_id: String, edge_type: EdgeType) -> Self {
Self { from_id, to_id, edge_type }
}
fn render(&self) -> String {
match self.edge_type {
EdgeType::Solid => format!(" {} --> {}\n", self.from_id, self.to_id),
EdgeType::Dotted => format!(" {} -.-> {}\n", self.from_id, self.to_id),
}
}
}
/// Collection of edges with deduplication
struct EdgeCollection {
edges: HashSet<MermaidEdge>,
}
impl EdgeCollection {
fn new() -> Self {
Self {
edges: HashSet::new(),
}
}
fn add(&mut self, edge: MermaidEdge) {
self.edges.insert(edge);
}
fn render_all(&self) -> String {
self.edges.iter()
.map(|edge| edge.render())
.collect::<Vec<_>>()
.join("")
}
}
/// Style rule for a specific node type and status combination
struct StyleRule {
class_name: String,
fill: &'static str,
stroke: &'static str,
stroke_width: &'static str,
}
impl StyleRule {
fn render(&self) -> String {
format!(
" classDef {} fill:{},stroke:{},stroke-width:{};\n",
self.class_name, self.fill, self.stroke, self.stroke_width
)
}
}
/// Manages all styling for the Mermaid diagram
struct MermaidStyleSheet {
rules: Vec<StyleRule>,
}
impl MermaidStyleSheet {
fn default() -> Self {
let mut rules = Vec::new();
// Job status styles
rules.push(StyleRule {
class_name: "job_pending".to_string(),
fill: "#e0e0e0",
stroke: "#333",
stroke_width: "1px",
});
rules.push(StyleRule {
class_name: "job_running".to_string(),
fill: "#ffeb3b",
stroke: "#333",
stroke_width: "2px",
});
rules.push(StyleRule {
class_name: "job_completed".to_string(),
fill: "#4caf50",
stroke: "#333",
stroke_width: "2px",
});
rules.push(StyleRule {
class_name: "job_failed".to_string(),
fill: "#f44336",
stroke: "#333",
stroke_width: "2px",
});
rules.push(StyleRule {
class_name: "job_cancelled".to_string(),
fill: "#ff9800",
stroke: "#333",
stroke_width: "2px",
});
rules.push(StyleRule {
class_name: "job_skipped".to_string(),
fill: "#9e9e9e",
stroke: "#333",
stroke_width: "1px",
});
// Partition status styles
rules.push(StyleRule {
class_name: "partition_pending".to_string(),
fill: "#e3f2fd",
stroke: "#333",
stroke_width: "1px",
});
rules.push(StyleRule {
class_name: "partition_running".to_string(),
fill: "#fff9c4",
stroke: "#333",
stroke_width: "2px",
});
rules.push(StyleRule {
class_name: "partition_available".to_string(),
fill: "#c8e6c9",
stroke: "#333",
stroke_width: "2px",
});
rules.push(StyleRule {
class_name: "partition_failed".to_string(),
fill: "#ffcdd2",
stroke: "#333",
stroke_width: "2px",
});
rules.push(StyleRule {
class_name: "partition_delegated".to_string(),
fill: "#d1c4e9",
stroke: "#333",
stroke_width: "2px",
});
// Output partition status styles (highlighted versions)
rules.push(StyleRule {
class_name: "outputPartition_pending".to_string(),
fill: "#bbdefb",
stroke: "#333",
stroke_width: "3px",
});
rules.push(StyleRule {
class_name: "outputPartition_running".to_string(),
fill: "#fff59d",
stroke: "#333",
stroke_width: "3px",
});
rules.push(StyleRule {
class_name: "outputPartition_available".to_string(),
fill: "#a5d6a7",
stroke: "#333",
stroke_width: "3px",
});
rules.push(StyleRule {
class_name: "outputPartition_failed".to_string(),
fill: "#ef9a9a",
stroke: "#333",
stroke_width: "3px",
});
rules.push(StyleRule {
class_name: "outputPartition_delegated".to_string(),
fill: "#b39ddb",
stroke: "#333",
stroke_width: "3px",
});
Self { rules }
}
fn render(&self) -> String {
let mut result = String::from("\n %% Styling\n");
for rule in &self.rules {
result.push_str(&rule.render());
}
result
}
fn get_edge_color(&self, status: &EdgeStatus) -> &'static str {
match status {
EdgeStatus::Failed => "#ff4444", // Red
EdgeStatus::Running => "#ffaa00", // Orange
EdgeStatus::Completed => "#44aa44", // Green
EdgeStatus::Available => "#88cc88", // Light green
EdgeStatus::Pending => "#888888", // Gray
}
}
}
/// Builder for constructing Mermaid diagrams
struct MermaidDiagramBuilder {
job_nodes: HashMap<String, MermaidJobNode>,
partition_nodes: HashMap<String, MermaidPartitionNode>,
edges: EdgeCollection,
output_refs: HashSet<String>,
edge_count: usize,
}
impl MermaidDiagramBuilder {
fn new() -> Self {
Self {
job_nodes: HashMap::new(),
partition_nodes: HashMap::new(),
edges: EdgeCollection::new(),
output_refs: HashSet::new(),
edge_count: 0,
}
}
fn set_output_refs(&mut self, refs: &[PartitionRef]) {
for ref_str in refs {
self.output_refs.insert(ref_str.str.clone());
}
}
fn add_job_node(&mut self, node: MermaidJobNode) {
self.job_nodes.insert(node.id().to_string(), node);
}
fn add_partition_node(&mut self, partition_ref: &str) -> String {
let is_output = self.output_refs.contains(partition_ref);
let node = MermaidPartitionNode::new(partition_ref, is_output);
let id = node.id().to_string();
self.partition_nodes.entry(partition_ref.to_string())
.or_insert(node);
id
}
fn add_edge(&mut self, from_id: String, to_id: String, edge_type: EdgeType) {
self.edges.add(MermaidEdge::new(from_id, to_id, edge_type));
}
fn add_edge_with_status(&mut self, from_id: String, to_id: String, edge_type: EdgeType,
edge_status: EdgeStatus, result: &mut String, stylesheet: &MermaidStyleSheet) {
// Create the edge
let edge = MermaidEdge::new(from_id, to_id, edge_type);
// Check if this edge already exists (for deduplication)
if self.edges.edges.contains(&edge) {
return; // Skip duplicate edge
}
// Render the edge
result.push_str(&edge.render());
// Add edge to collection for deduplication tracking
self.edges.add(edge);
// Immediately render the linkStyle if status is not pending
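// Mermaid's linkStyle addresses links by their zero-based declaration order,
// so edge_count must count every edge rendered so far.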
if edge_status != EdgeStatus::Pending {
let color = stylesheet.get_edge_color(&edge_status);
result.push_str(&format!(" linkStyle {} stroke:{},stroke-width:2px\n",
self.edge_count, color));
}
self.edge_count += 1;
}
fn build_with_edges(self, statuses: &(HashMap<String, NodeStatus>, HashMap<String, NodeStatus>),
stylesheet: MermaidStyleSheet, edges_content: String) -> String {
let (job_statuses, partition_statuses) = statuses;
let mut result = String::from("flowchart TD\n");
// Render all job nodes
for (_, job_node) in self.job_nodes {
result.push_str(&job_node.to_mermaid(job_statuses));
}
// Render all partition nodes
for (partition_ref, node) in self.partition_nodes {
let status = partition_statuses.get(&partition_ref).unwrap_or(&NodeStatus::Pending);
result.push_str(&node.render(status));
}
// Add the edges content (which includes linkStyle statements)
result.push_str(&edges_content);
// Apply styles
result.push_str(&stylesheet.render());
result
}
}
pub fn generate_mermaid_diagram(graph: &JobGraph) -> String {
generate_mermaid_with_status(graph, &[])
}
/// Generate a mermaid diagram for a job graph with current status annotations
pub fn generate_mermaid_with_status(
graph: &JobGraph,
events: &[BuildEvent],
) -> String {
let statuses = extract_status_map(events);
let (job_statuses, partition_statuses) = &statuses;
let mut builder = MermaidDiagramBuilder::new();
let stylesheet = MermaidStyleSheet::default();
// Set output refs for highlighting
builder.set_output_refs(&graph.outputs);
// String to accumulate edges with their styles
let mut edges_content = String::new();
// Process all task nodes
for task in &graph.nodes {
if let Some(job_node) = MermaidJobNode::from(task) {
let job_id = job_node.id().to_string();
builder.add_job_node(job_node);
if let Some(config) = &task.config {
// Process inputs (dependencies)
for input in &config.inputs {
if let Some(partition_ref) = &input.partition_ref {
let ref_id = builder.add_partition_node(&partition_ref.str);
let edge_type = if input.dep_type_code == 1 {
EdgeType::Solid
} else {
EdgeType::Dotted
};
// Get partition status for edge coloring
let partition_status = partition_statuses.get(&partition_ref.str)
.unwrap_or(&NodeStatus::Pending);
let edge_status = map_node_status_to_edge_status(partition_status);
builder.add_edge_with_status(ref_id, job_id.clone(), edge_type,
edge_status, &mut edges_content, &stylesheet);
}
}
// Process outputs
for output in &config.outputs {
let ref_id = builder.add_partition_node(&output.str);
// Get job status for edge coloring
let job_status = job_statuses.get(&job_id)
.unwrap_or(&NodeStatus::Pending);
let edge_status = map_node_status_to_edge_status(job_status);
builder.add_edge_with_status(job_id.clone(), ref_id, EdgeType::Solid,
edge_status, &mut edges_content, &stylesheet);
}
}
}
}
// Build the diagram with edges content
builder.build_with_edges(&statuses, stylesheet, edges_content)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_encode_id() {
assert_eq!(encode_id("path/to/file"), "path_to_file");
assert_eq!(encode_id("key=value"), "key_value");
assert_eq!(encode_id("scope:item"), "scope_item");
assert_eq!(encode_id("a/b=c:d"), "a_b_c_d");
}
#[test]
fn test_mermaid_job_node() {
let mut task = Task::default();
task.job = Some(JobLabel { label: "test_job".to_string() });
task.config = Some(JobConfig {
outputs: vec![
PartitionRef { str: "output1".to_string() },
PartitionRef { str: "output2".to_string() },
],
inputs: vec![],
args: vec![],
env: HashMap::new(),
});
let node = MermaidJobNode::from(&task).expect("Failed to create job node");
assert_eq!(node.id(), "test_job___output1___output2");
assert_eq!(node.label(), "**test_job** output1___output2");
let rendered = node.render(&NodeStatus::Running);
assert!(rendered.contains("test_job___output1___output2"));
assert!(rendered.contains("**test_job** output1___output2"));
assert!(rendered.contains("job_running"));
}
#[test]
fn test_mermaid_partition_node() {
let node = MermaidPartitionNode::new("data/partition=1", false);
assert_eq!(node.id(), "ref_data_partition_1");
assert_eq!(node.label(), "data/partition=1");
let rendered = node.render(&NodeStatus::Available);
assert!(rendered.contains("ref_data_partition_1"));
assert!(rendered.contains("data_partition_1"));
assert!(rendered.contains("partition_available"));
// Test output partition
let output_node = MermaidPartitionNode::new("output/data", true);
let output_rendered = output_node.render(&NodeStatus::Available);
assert!(output_rendered.contains("outputPartition_available"));
}
#[test]
fn test_edge_collection() {
let mut edges = EdgeCollection::new();
// Add edges
edges.add(MermaidEdge::new("node1".to_string(), "node2".to_string(), EdgeType::Solid));
edges.add(MermaidEdge::new("node2".to_string(), "node3".to_string(), EdgeType::Dotted));
// Test deduplication
edges.add(MermaidEdge::new("node1".to_string(), "node2".to_string(), EdgeType::Solid));
let rendered = edges.render_all();
assert!(rendered.contains("node1 --> node2"));
assert!(rendered.contains("node2 -.-> node3"));
// Should only have 2 unique edges
assert_eq!(rendered.matches("-->").count(), 1);
assert_eq!(rendered.matches("-.->").count(), 1);
}
#[test]
fn test_simple_graph_generation() {
// Create task 1
let mut task1 = Task::default();
task1.job = Some(JobLabel { label: "job1".to_string() });
task1.config = Some(JobConfig {
inputs: vec![{
let mut input = DataDep::default();
input.partition_ref = Some(PartitionRef { str: "input/data".to_string() });
input.dep_type_code = 1; // Solid dependency
input.dep_type_name = "materialize".to_string();
input
}],
outputs: vec![
PartitionRef { str: "intermediate/data".to_string() },
],
args: vec![],
env: HashMap::new(),
});
// Create task 2
let mut task2 = Task::default();
task2.job = Some(JobLabel { label: "job2".to_string() });
task2.config = Some(JobConfig {
inputs: vec![{
let mut input = DataDep::default();
input.partition_ref = Some(PartitionRef { str: "intermediate/data".to_string() });
input.dep_type_code = 0; // Dotted dependency
input.dep_type_name = "query".to_string();
input
}],
outputs: vec![
PartitionRef { str: "output/data".to_string() },
],
args: vec![],
env: HashMap::new(),
});
// Create a simple graph
let mut graph = JobGraph::default();
graph.nodes = vec![task1, task2];
graph.outputs = vec![
PartitionRef { str: "output/data".to_string() },
];
let mermaid = generate_mermaid_diagram(&graph);
// Check basic structure
assert!(mermaid.starts_with("flowchart TD\n"));
// Check nodes - verify both ID and label are present
assert!(mermaid.contains("job1___intermediate_data"), "Missing job1 node ID");
assert!(mermaid.contains("**job1** intermediate/data"), "Missing job1 label");
assert!(mermaid.contains("job2___output_data"), "Missing job2 node ID");
assert!(mermaid.contains("**job2** output/data"), "Missing job2 label");
assert!(mermaid.contains("ref_input_data"));
assert!(mermaid.contains("ref_intermediate_data"));
assert!(mermaid.contains("ref_output_data"));
// Check edges
assert!(mermaid.contains("ref_input_data --> job1"));
assert!(mermaid.contains("job1___intermediate_data --> ref_intermediate_data"));
assert!(mermaid.contains("ref_intermediate_data -.-> job2"));
assert!(mermaid.contains("job2___output_data --> ref_output_data"));
// Check styling
assert!(mermaid.contains("classDef job_pending"));
assert!(mermaid.contains("classDef partition_pending"));
assert!(mermaid.contains("classDef outputPartition_pending"));
}
#[test]
fn test_status_extraction() {
let mut event1 = BuildEvent::default();
event1.timestamp = 1;
event1.event_type = Some(crate::build_event::EventType::JobEvent({
let mut job_event = JobEvent::default();
job_event.job_label = Some(JobLabel { label: "test_job".to_string() });
job_event.status_code = 2; // JOB_RUNNING
job_event
}));
let mut event2 = BuildEvent::default();
event2.timestamp = 2;
event2.event_type = Some(crate::build_event::EventType::PartitionEvent({
let mut partition_event = PartitionEvent::default();
partition_event.partition_ref = Some(PartitionRef { str: "test/partition".to_string() });
partition_event.status_code = 4; // PARTITION_AVAILABLE
partition_event
}));
let events = vec![event1, event2];
let (job_statuses, partition_statuses) = extract_status_map(&events);
// Should use the unique key (job_label + target_partitions) instead of just job_label
assert_eq!(job_statuses.get("test_job"), None, "Should not find job by label alone");
assert_eq!(partition_statuses.get("test/partition"), Some(&NodeStatus::Available));
}
#[test]
fn test_job_status_per_task_instance() {
// Test that different task instances with same job label get different status
let mut event1 = BuildEvent::default();
event1.event_type = Some(crate::build_event::EventType::JobEvent({
let mut job_event = JobEvent::default();
job_event.job_label = Some(JobLabel { label: "same_job".to_string() });
job_event.target_partitions = vec![PartitionRef { str: "output1".to_string() }];
job_event.status_code = 2; // JOB_RUNNING
job_event
}));
let mut event2 = BuildEvent::default();
event2.event_type = Some(crate::build_event::EventType::JobEvent({
let mut job_event = JobEvent::default();
job_event.job_label = Some(JobLabel { label: "same_job".to_string() });
job_event.target_partitions = vec![PartitionRef { str: "output2".to_string() }];
job_event.status_code = 3; // JOB_COMPLETED
job_event
}));
let events = vec![event1, event2];
let (job_statuses, _) = extract_status_map(&events);
// Each task should have its own status based on unique key
assert_eq!(job_statuses.get("same_job___output1"), Some(&NodeStatus::Running));
assert_eq!(job_statuses.get("same_job___output2"), Some(&NodeStatus::Completed));
assert_eq!(job_statuses.get("same_job"), None, "Should not find job by label alone");
}
#[test]
fn test_edge_coloring_with_status() {
// Create a simple graph with status
let mut task1 = Task::default();
task1.job = Some(JobLabel { label: "job1".to_string() });
task1.config = Some(JobConfig {
inputs: vec![{
let mut input = DataDep::default();
input.partition_ref = Some(PartitionRef { str: "input/data".to_string() });
input.dep_type_code = 1; // Solid dependency
input.dep_type_name = "materialize".to_string();
input
}],
outputs: vec![
PartitionRef { str: "intermediate/data".to_string() },
],
args: vec![],
env: HashMap::new(),
});
let mut graph = JobGraph::default();
graph.nodes = vec![task1];
graph.outputs = vec![
PartitionRef { str: "intermediate/data".to_string() },
];
// Create events to set status
let mut partition_event = BuildEvent::default();
partition_event.event_type = Some(crate::build_event::EventType::PartitionEvent({
let mut pe = PartitionEvent::default();
pe.partition_ref = Some(PartitionRef { str: "input/data".to_string() });
pe.status_code = 4; // PARTITION_AVAILABLE
pe
}));
let mut job_event = BuildEvent::default();
job_event.event_type = Some(crate::build_event::EventType::JobEvent({
let mut je = JobEvent::default();
je.job_label = Some(JobLabel { label: "job1".to_string() });
je.target_partitions = vec![PartitionRef { str: "intermediate/data".to_string() }];
je.status_code = 2; // JOB_RUNNING
je
}));
let events = vec![partition_event, job_event];
let mermaid = generate_mermaid_with_status(&graph, &events);
// Check that linkStyle statements are present
assert!(mermaid.contains("linkStyle"), "Should contain linkStyle statements");
assert!(mermaid.contains("#88cc88"), "Should contain available edge color (light green)");
assert!(mermaid.contains("#ffaa00"), "Should contain running edge color (orange)");
// Check basic structure is still intact
assert!(mermaid.contains("flowchart TD"));
assert!(mermaid.contains("job1___intermediate_data"));
assert!(mermaid.contains("ref_input_data"));
assert!(mermaid.contains("ref_intermediate_data"));
}
#[test]
fn test_edge_status_mapping() {
assert_eq!(map_node_status_to_edge_status(&NodeStatus::Failed), EdgeStatus::Failed);
assert_eq!(map_node_status_to_edge_status(&NodeStatus::Running), EdgeStatus::Running);
assert_eq!(map_node_status_to_edge_status(&NodeStatus::Completed), EdgeStatus::Completed);
assert_eq!(map_node_status_to_edge_status(&NodeStatus::Available), EdgeStatus::Available);
assert_eq!(map_node_status_to_edge_status(&NodeStatus::Pending), EdgeStatus::Pending);
assert_eq!(map_node_status_to_edge_status(&NodeStatus::Cancelled), EdgeStatus::Failed);
assert_eq!(map_node_status_to_edge_status(&NodeStatus::Skipped), EdgeStatus::Pending);
assert_eq!(map_node_status_to_edge_status(&NodeStatus::Delegated), EdgeStatus::Available);
}
#[test]
fn test_edge_deduplication() {
// Create a graph that could potentially have duplicate edges
let mut task1 = Task::default();
task1.job = Some(JobLabel { label: "job1".to_string() });
task1.config = Some(JobConfig {
inputs: vec![{
let mut input = DataDep::default();
input.partition_ref = Some(PartitionRef { str: "shared_input".to_string() });
input.dep_type_code = 1;
input.dep_type_name = "materialize".to_string();
input
}],
outputs: vec![
PartitionRef { str: "output1".to_string() },
],
args: vec![],
env: HashMap::new(),
});
let mut task2 = Task::default();
task2.job = Some(JobLabel { label: "job2".to_string() });
task2.config = Some(JobConfig {
inputs: vec![{
let mut input = DataDep::default();
input.partition_ref = Some(PartitionRef { str: "shared_input".to_string() });
input.dep_type_code = 1;
input.dep_type_name = "materialize".to_string();
input
}],
outputs: vec![
PartitionRef { str: "output2".to_string() },
],
args: vec![],
env: HashMap::new(),
});
let mut graph = JobGraph::default();
graph.nodes = vec![task1, task2];
graph.outputs = vec![
PartitionRef { str: "output1".to_string() },
PartitionRef { str: "output2".to_string() },
];
let mermaid = generate_mermaid_diagram(&graph);
// Count how many times the shared edge appears
let shared_edge_count = mermaid.matches("ref_shared_input --> job").count();
// Should only appear once per job (2 total), not duplicated
assert_eq!(shared_edge_count, 2, "Should have exactly 2 edges from shared_input (one to each job)");
// Verify no duplicate edges in the output
let lines: Vec<&str> = mermaid.lines().collect();
let edge_lines: Vec<&str> = lines.iter().filter(|line| line.contains("-->") || line.contains("-.->")).cloned().collect();
let unique_edges: std::collections::HashSet<&str> = edge_lines.iter().cloned().collect();
assert_eq!(edge_lines.len(), unique_edges.len(), "Should have no duplicate edges in output");
}
}
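A minimal sketch of driving the generator outside the test module, assuming the module's existing imports and the same crate-internal types the tests above exercise (`Task`, `JobConfig`, `DataDep`, `JobGraph`, `PartitionRef`, `JobLabel`) plus the two entry points `generate_mermaid_diagram` / `generate_mermaid_with_status`:
```rust
// Sketch: one task that materializes input/data into output/report.
fn render_example() -> String {
    let mut task = Task::default();
    task.job = Some(JobLabel { label: "report_job".to_string() });
    task.config = Some(JobConfig {
        inputs: vec![{
            let mut dep = DataDep::default();
            dep.partition_ref = Some(PartitionRef { str: "input/data".to_string() });
            dep.dep_type_code = 1; // solid "materialize" edge
            dep.dep_type_name = "materialize".to_string();
            dep
        }],
        outputs: vec![PartitionRef { str: "output/report".to_string() }],
        args: vec![],
        env: HashMap::new(),
    });

    let mut graph = JobGraph::default();
    graph.nodes = vec![task];
    graph.outputs = vec![PartitionRef { str: "output/report".to_string() }];

    // Static rendering; pass build events to generate_mermaid_with_status
    // instead to get status classes and colored linkStyle lines.
    generate_mermaid_diagram(&graph)
}
```
The returned string starts with `flowchart TD` and can be dropped straight into any Mermaid-capable viewer.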

View file

@ -1,523 +0,0 @@
use crate::{JobLogEntry, job_log_entry, WrapperJobEvent};
use std::collections::HashMap;
/// Template for metric extraction from job events
#[derive(Debug, Clone)]
pub struct MetricTemplate {
pub name: String,
pub help: String,
pub metric_type: MetricType,
pub extractor: MetricExtractor,
pub labels: Vec<String>, // Static label names for this metric
}
/// Prometheus metric types
#[derive(Debug, Clone)]
pub enum MetricType {
Counter,
Gauge,
Histogram,
Summary,
}
/// Strategy for extracting metric values from job events
#[derive(Debug, Clone)]
pub enum MetricExtractor {
/// Extract from job event metadata by key
EventMetadata {
event_type: String,
metadata_key: String,
/// Optional conversion function name for non-numeric values
converter: Option<MetricConverter>,
},
/// Count occurrences of specific event types
EventCount {
event_type: String,
},
/// Extract job duration from start/end events
JobDuration,
/// Extract peak memory from job summary
PeakMemory,
/// Extract total CPU time from job summary
TotalCpuTime,
/// Extract exit code from job events
ExitCode,
}
/// Converters for non-numeric metadata values
#[derive(Debug, Clone)]
pub enum MetricConverter {
/// Convert boolean strings to 0/1
BoolToFloat,
/// Convert status strings to numeric codes
StatusToCode(HashMap<String, f64>),
/// Parse duration strings like "123ms" to seconds
DurationToSeconds,
}
/// Result of metric extraction
#[derive(Debug)]
pub struct ExtractedMetric {
pub name: String,
pub value: f64,
pub labels: HashMap<String, String>,
pub help: String,
pub metric_type: MetricType,
}
impl MetricTemplate {
/// Extract a metric from a job log entry if applicable
pub fn extract(&self, entry: &JobLogEntry) -> Option<ExtractedMetric> {
let value = match &self.extractor {
MetricExtractor::EventMetadata { event_type, metadata_key, converter } => {
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
if event.event_type == *event_type {
if let Some(raw_value) = event.metadata.get(metadata_key) {
self.convert_value(raw_value, converter)?
} else {
return None;
}
} else {
return None;
}
} else {
return None;
}
},
MetricExtractor::EventCount { event_type } => {
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
if event.event_type == *event_type {
1.0
} else {
return None;
}
} else {
return None;
}
},
MetricExtractor::JobDuration => {
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
if event.event_type == "job_summary" {
if let Some(runtime_str) = event.metadata.get("runtime_ms") {
runtime_str.parse::<f64>().ok()? / 1000.0 // Convert to seconds
} else {
return None;
}
} else {
return None;
}
} else {
return None;
}
},
MetricExtractor::PeakMemory => {
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
if event.event_type == "job_summary" {
if let Some(memory_str) = event.metadata.get("peak_memory_mb") {
memory_str.parse::<f64>().ok()?
} else {
return None;
}
} else {
return None;
}
} else {
return None;
}
},
MetricExtractor::TotalCpuTime => {
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
if event.event_type == "job_summary" {
if let Some(cpu_str) = event.metadata.get("total_cpu_ms") {
cpu_str.parse::<f64>().ok()? / 1000.0 // Convert to seconds
} else {
return None;
}
} else {
return None;
}
} else {
return None;
}
},
MetricExtractor::ExitCode => {
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
if let Some(exit_code) = event.exit_code {
exit_code as f64
} else {
return None;
}
} else {
return None;
}
},
};
// Generate labels for this metric
let mut labels = HashMap::new();
// Always attach job_id as a label; the aggregator strips it by default for cardinality safety
labels.insert("job_id".to_string(), entry.job_id.clone());
// Extract job label from manifest if available - this is the low-cardinality identifier
if let Some(job_log_entry::Content::Manifest(manifest)) = &entry.content {
if let Some(task) = &manifest.task {
if let Some(job) = &task.job {
labels.insert("job_label".to_string(), job.label.clone());
}
}
}
// Add job status and job label if available from job events
if let Some(job_log_entry::Content::JobEvent(event)) = &entry.content {
if let Some(job_status) = &event.job_status {
labels.insert("job_status".to_string(), job_status.clone());
}
if let Some(job_label) = &event.job_label {
labels.insert("job_label".to_string(), job_label.clone());
}
}
Some(ExtractedMetric {
name: self.name.clone(),
value,
labels,
help: self.help.clone(),
metric_type: self.metric_type.clone(),
})
}
fn convert_value(&self, raw_value: &str, converter: &Option<MetricConverter>) -> Option<f64> {
match converter {
None => raw_value.parse().ok(),
Some(MetricConverter::BoolToFloat) => {
match raw_value.to_lowercase().as_str() {
"true" | "1" | "yes" => Some(1.0),
"false" | "0" | "no" => Some(0.0),
_ => None,
}
},
Some(MetricConverter::StatusToCode(mapping)) => {
mapping.get(raw_value).copied()
},
Some(MetricConverter::DurationToSeconds) => {
// Parse formats like "123ms", "45s", "2.5m"
if raw_value.ends_with("ms") {
raw_value.trim_end_matches("ms").parse::<f64>().ok().map(|v| v / 1000.0)
} else if raw_value.ends_with("s") {
raw_value.trim_end_matches("s").parse::<f64>().ok()
} else if raw_value.ends_with("m") {
raw_value.trim_end_matches("m").parse::<f64>().ok().map(|v| v * 60.0)
} else {
raw_value.parse::<f64>().ok()
}
},
}
}
}
/// Get standard DataBuild metric templates
pub fn get_standard_metrics() -> Vec<MetricTemplate> {
vec![
// Job execution metrics
MetricTemplate {
name: "databuild_job_duration_seconds".to_string(),
help: "Duration of job execution in seconds".to_string(),
metric_type: MetricType::Histogram,
extractor: MetricExtractor::JobDuration,
labels: vec!["job_label".to_string()],
},
MetricTemplate {
name: "databuild_job_peak_memory_mb".to_string(),
help: "Peak memory usage of job in megabytes".to_string(),
metric_type: MetricType::Gauge,
extractor: MetricExtractor::PeakMemory,
labels: vec!["job_label".to_string()],
},
MetricTemplate {
name: "databuild_job_cpu_time_seconds".to_string(),
help: "Total CPU time consumed by job in seconds".to_string(),
metric_type: MetricType::Counter,
extractor: MetricExtractor::TotalCpuTime,
labels: vec!["job_label".to_string()],
},
MetricTemplate {
name: "databuild_job_exit_code".to_string(),
help: "Exit code of job execution".to_string(),
metric_type: MetricType::Gauge,
extractor: MetricExtractor::ExitCode,
labels: vec!["job_label".to_string(), "job_status".to_string()],
},
// Job event counters
MetricTemplate {
name: "databuild_job_events_total".to_string(),
help: "Total number of job events".to_string(),
metric_type: MetricType::Counter,
extractor: MetricExtractor::EventCount { event_type: "task_success".to_string() },
labels: vec!["job_label".to_string()],
},
MetricTemplate {
name: "databuild_job_failures_total".to_string(),
help: "Total number of job failures".to_string(),
metric_type: MetricType::Counter,
extractor: MetricExtractor::EventCount { event_type: "task_failed".to_string() },
labels: vec!["job_label".to_string()],
},
MetricTemplate {
name: "databuild_heartbeats_total".to_string(),
help: "Total number of heartbeat events".to_string(),
metric_type: MetricType::Counter,
extractor: MetricExtractor::EventCount { event_type: "heartbeat".to_string() },
labels: vec!["job_label".to_string()],
},
]
}
#[cfg(test)]
mod tests {
use super::*;
use crate::{PartitionRef, log_message, LogMessage};
fn create_test_job_summary_entry(job_id: &str, runtime_ms: &str, memory_mb: &str, cpu_ms: &str, exit_code: i32) -> JobLogEntry {
let mut metadata = HashMap::new();
metadata.insert("runtime_ms".to_string(), runtime_ms.to_string());
metadata.insert("peak_memory_mb".to_string(), memory_mb.to_string());
metadata.insert("total_cpu_ms".to_string(), cpu_ms.to_string());
metadata.insert("exit_code".to_string(), exit_code.to_string());
JobLogEntry {
timestamp: "1234567890".to_string(),
job_id: job_id.to_string(),
outputs: vec![PartitionRef { r#str: "reviews/date=2025-01-27".to_string() }],
sequence_number: 1,
content: Some(job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "job_summary".to_string(),
job_status: Some("JOB_COMPLETED".to_string()),
exit_code: Some(exit_code),
metadata,
job_label: None,
})),
}
}
fn create_test_task_success_entry(job_id: &str) -> JobLogEntry {
JobLogEntry {
timestamp: "1234567890".to_string(),
job_id: job_id.to_string(),
outputs: vec![PartitionRef { r#str: "podcasts/date=2025-01-27".to_string() }],
sequence_number: 2,
content: Some(job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "task_success".to_string(),
job_status: Some("JOB_COMPLETED".to_string()),
exit_code: Some(0),
metadata: HashMap::new(),
job_label: None,
})),
}
}
#[test]
fn test_job_duration_extraction() {
let template = MetricTemplate {
name: "test_duration".to_string(),
help: "Test duration".to_string(),
metric_type: MetricType::Histogram,
extractor: MetricExtractor::JobDuration,
labels: vec![],
};
let entry = create_test_job_summary_entry("test-job", "2500", "64.5", "1200", 0);
let metric = template.extract(&entry).unwrap();
assert_eq!(metric.name, "test_duration");
assert_eq!(metric.value, 2.5); // 2500ms -> 2.5s
assert_eq!(metric.labels.get("job_id").unwrap(), "test-job");
// Note: job_label would only be available from manifest entries, not job_summary
}
#[test]
fn test_memory_extraction() {
let template = MetricTemplate {
name: "test_memory".to_string(),
help: "Test memory".to_string(),
metric_type: MetricType::Gauge,
extractor: MetricExtractor::PeakMemory,
labels: vec![],
};
let entry = create_test_job_summary_entry("test-job", "2500", "128.75", "1200", 0);
let metric = template.extract(&entry).unwrap();
assert_eq!(metric.value, 128.75);
}
#[test]
fn test_cpu_time_extraction() {
let template = MetricTemplate {
name: "test_cpu".to_string(),
help: "Test CPU".to_string(),
metric_type: MetricType::Counter,
extractor: MetricExtractor::TotalCpuTime,
labels: vec![],
};
let entry = create_test_job_summary_entry("test-job", "2500", "64.5", "1500", 0);
let metric = template.extract(&entry).unwrap();
assert_eq!(metric.value, 1.5); // 1500ms -> 1.5s
}
#[test]
fn test_exit_code_extraction() {
let template = MetricTemplate {
name: "test_exit_code".to_string(),
help: "Test exit code".to_string(),
metric_type: MetricType::Gauge,
extractor: MetricExtractor::ExitCode,
labels: vec![],
};
let entry = create_test_job_summary_entry("test-job", "2500", "64.5", "1200", 42);
let metric = template.extract(&entry).unwrap();
assert_eq!(metric.value, 42.0);
assert_eq!(metric.labels.get("job_status").unwrap(), "JOB_COMPLETED");
}
#[test]
fn test_event_count_extraction() {
let template = MetricTemplate {
name: "test_success_count".to_string(),
help: "Test success count".to_string(),
metric_type: MetricType::Counter,
extractor: MetricExtractor::EventCount { event_type: "task_success".to_string() },
labels: vec![],
};
let entry = create_test_task_success_entry("test-job");
let metric = template.extract(&entry).unwrap();
assert_eq!(metric.value, 1.0);
// Note: job_label would only be available from manifest entries, not job events
}
#[test]
fn test_event_metadata_extraction() {
let template = MetricTemplate {
name: "test_runtime".to_string(),
help: "Test runtime from metadata".to_string(),
metric_type: MetricType::Gauge,
extractor: MetricExtractor::EventMetadata {
event_type: "job_summary".to_string(),
metadata_key: "runtime_ms".to_string(),
converter: None,
},
labels: vec![],
};
let entry = create_test_job_summary_entry("test-job", "3000", "64.5", "1200", 0);
let metric = template.extract(&entry).unwrap();
assert_eq!(metric.value, 3000.0);
}
#[test]
fn test_bool_converter() {
let template = MetricTemplate {
name: "test_bool".to_string(),
help: "Test bool".to_string(),
metric_type: MetricType::Gauge,
extractor: MetricExtractor::EventMetadata {
event_type: "test_event".to_string(),
metadata_key: "success".to_string(),
converter: Some(MetricConverter::BoolToFloat),
},
labels: vec![],
};
assert_eq!(template.convert_value("true", &Some(MetricConverter::BoolToFloat)), Some(1.0));
assert_eq!(template.convert_value("false", &Some(MetricConverter::BoolToFloat)), Some(0.0));
assert_eq!(template.convert_value("yes", &Some(MetricConverter::BoolToFloat)), Some(1.0));
assert_eq!(template.convert_value("no", &Some(MetricConverter::BoolToFloat)), Some(0.0));
assert_eq!(template.convert_value("invalid", &Some(MetricConverter::BoolToFloat)), None);
}
#[test]
fn test_duration_converter() {
let template = MetricTemplate {
name: "test_duration".to_string(),
help: "Test duration".to_string(),
metric_type: MetricType::Gauge,
extractor: MetricExtractor::EventMetadata {
event_type: "test_event".to_string(),
metadata_key: "duration".to_string(),
converter: Some(MetricConverter::DurationToSeconds),
},
labels: vec![],
};
assert_eq!(template.convert_value("1000ms", &Some(MetricConverter::DurationToSeconds)), Some(1.0));
assert_eq!(template.convert_value("5s", &Some(MetricConverter::DurationToSeconds)), Some(5.0));
assert_eq!(template.convert_value("2.5m", &Some(MetricConverter::DurationToSeconds)), Some(150.0));
assert_eq!(template.convert_value("42", &Some(MetricConverter::DurationToSeconds)), Some(42.0));
}
#[test]
fn test_standard_metrics() {
let metrics = get_standard_metrics();
assert!(!metrics.is_empty());
// Verify we have the key metrics
let metric_names: Vec<&String> = metrics.iter().map(|m| &m.name).collect();
assert!(metric_names.contains(&&"databuild_job_duration_seconds".to_string()));
assert!(metric_names.contains(&&"databuild_job_peak_memory_mb".to_string()));
assert!(metric_names.contains(&&"databuild_job_cpu_time_seconds".to_string()));
assert!(metric_names.contains(&&"databuild_job_failures_total".to_string()));
}
#[test]
fn test_no_extraction_for_wrong_event_type() {
let template = MetricTemplate {
name: "test_metric".to_string(),
help: "Test".to_string(),
metric_type: MetricType::Counter,
extractor: MetricExtractor::EventCount { event_type: "task_failed".to_string() },
labels: vec![],
};
let entry = create_test_task_success_entry("test-job"); // This is task_success, not task_failed
let result = template.extract(&entry);
assert!(result.is_none());
}
#[test]
fn test_no_extraction_for_log_entries() {
let template = MetricTemplate {
name: "test_metric".to_string(),
help: "Test".to_string(),
metric_type: MetricType::Counter,
extractor: MetricExtractor::JobDuration,
labels: vec![],
};
// Create a log entry instead of job event
let entry = JobLogEntry {
timestamp: "1234567890".to_string(),
job_id: "test-job".to_string(),
outputs: vec![PartitionRef { r#str: "test/partition".to_string() }],
sequence_number: 1,
content: Some(job_log_entry::Content::Log(LogMessage {
level: log_message::LogLevel::Info as i32,
message: "Test log message".to_string(),
fields: HashMap::new(),
})),
};
let result = template.extract(&entry);
assert!(result.is_none());
}
}
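A minimal sketch of a custom template alongside `get_standard_metrics()`, assuming a hypothetical `rows_written` key in `job_summary` metadata; the types (`MetricTemplate`, `MetricExtractor`, `ExtractedMetric`) are the ones defined in this file:
```rust
// Hypothetical metric: rows written per job, read from job_summary metadata.
fn rows_written_template() -> MetricTemplate {
    MetricTemplate {
        name: "databuild_job_rows_written".to_string(),
        help: "Rows written by a job, taken from job_summary metadata".to_string(),
        metric_type: MetricType::Gauge,
        extractor: MetricExtractor::EventMetadata {
            event_type: "job_summary".to_string(),
            metadata_key: "rows_written".to_string(), // assumed metadata key
            converter: None,                          // value is already numeric
        },
        labels: vec!["job_label".to_string()],
    }
}

// Extraction is a filter-map over log entries: non-matching entries yield None.
fn extract_rows_written(entries: &[JobLogEntry]) -> Vec<ExtractedMetric> {
    let template = rows_written_template();
    entries.iter().filter_map(|e| template.extract(e)).collect()
}
```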

View file

@ -1,507 +0,0 @@
use crate::{JobLogEntry, log_access::LogReader, metric_templates::{MetricTemplate, ExtractedMetric, MetricType, get_standard_metrics}};
use std::collections::{HashMap, HashSet};
use std::path::Path;
use thiserror::Error;
#[derive(Error, Debug)]
pub enum MetricsError {
#[error("Log access error: {0}")]
LogAccess(#[from] crate::log_access::LogAccessError),
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("Too many label combinations for metric {metric}: {count} > {limit}")]
CardinalityLimit { metric: String, count: usize, limit: usize },
}
/// Aggregated metric value with labels
#[derive(Debug, Clone)]
pub struct AggregatedMetric {
pub name: String,
pub help: String,
pub metric_type: MetricType,
pub samples: Vec<MetricSample>,
}
/// Individual metric sample
#[derive(Debug, Clone)]
pub struct MetricSample {
pub labels: HashMap<String, String>,
pub value: f64,
pub timestamp_ms: Option<u64>,
}
/// Configuration for metrics aggregation
#[derive(Debug, Clone)]
pub struct MetricsConfig {
/// Maximum number of unique label combinations per metric (cardinality safety)
pub max_cardinality_per_metric: usize,
/// Time range for metrics collection (in hours from now)
pub time_range_hours: u64,
/// Whether to include job_id in labels (can create high cardinality)
pub include_job_id_labels: bool,
/// Maximum number of jobs to process per metric
pub max_jobs_per_metric: usize,
}
impl Default for MetricsConfig {
fn default() -> Self {
Self {
max_cardinality_per_metric: 1000, // Prometheus recommended limit
time_range_hours: 24, // Last 24 hours
include_job_id_labels: false, // Disabled by default for cardinality safety
max_jobs_per_metric: 100, // Limit recent jobs
}
}
}
/// Aggregates metrics from job logs with cardinality safety
pub struct MetricsAggregator {
log_reader: LogReader,
config: MetricsConfig,
templates: Vec<MetricTemplate>,
}
impl MetricsAggregator {
/// Create a new metrics aggregator
pub fn new<P: AsRef<Path>>(logs_path: P, config: MetricsConfig) -> Self {
Self {
log_reader: LogReader::new(logs_path),
config,
templates: get_standard_metrics(),
}
}
/// Create with default configuration
pub fn with_defaults<P: AsRef<Path>>(logs_path: P) -> Self {
Self::new(logs_path, MetricsConfig::default())
}
/// Add custom metric template
pub fn add_template(&mut self, template: MetricTemplate) {
self.templates.push(template);
}
/// Aggregate all metrics from recent job logs
pub fn aggregate_metrics(&self) -> Result<Vec<AggregatedMetric>, MetricsError> {
// Get recent job IDs
let job_ids = self.get_recent_job_ids()?;
let mut aggregated: HashMap<String, AggregatedMetric> = HashMap::new();
let mut cardinality_counters: HashMap<String, HashSet<String>> = HashMap::new();
// Process each job's logs
for job_id in job_ids.iter().take(self.config.max_jobs_per_metric) {
if let Ok(entries) = self.get_job_entries(job_id) {
for entry in entries {
self.process_entry(&entry, &mut aggregated, &mut cardinality_counters)?;
}
}
}
Ok(aggregated.into_values().collect())
}
/// Generate Prometheus format output
pub fn to_prometheus_format(&self) -> Result<String, MetricsError> {
let metrics = self.aggregate_metrics()?;
let mut output = String::new();
for metric in metrics {
// Add help comment
output.push_str(&format!("# HELP {} {}\n", metric.name, metric.help));
// Add type comment
let type_str = match metric.metric_type {
MetricType::Counter => "counter",
MetricType::Gauge => "gauge",
MetricType::Histogram => "histogram",
MetricType::Summary => "summary",
};
output.push_str(&format!("# TYPE {} {}\n", metric.name, type_str));
// Add samples
for sample in metric.samples {
output.push_str(&format!("{}{} {}\n",
metric.name,
self.format_labels(&sample.labels),
sample.value
));
}
output.push('\n');
}
Ok(output)
}
/// Get recent job IDs within the configured time range
fn get_recent_job_ids(&self) -> Result<Vec<String>, MetricsError> {
// For now, get all available jobs. In production, this would filter by date
let job_ids = self.log_reader.list_available_jobs(None)?;
Ok(job_ids)
}
/// Get log entries for a specific job
fn get_job_entries(&self, job_id: &str) -> Result<Vec<JobLogEntry>, MetricsError> {
use crate::JobLogsRequest;
let request = JobLogsRequest {
job_run_id: job_id.to_string(),
since_timestamp: 0,
min_level: 0,
limit: 1000, // Get all entries for the job
};
let response = self.log_reader.get_job_logs(&request)?;
Ok(response.entries)
}
/// Process a single log entry through all metric templates
fn process_entry(
&self,
entry: &JobLogEntry,
aggregated: &mut HashMap<String, AggregatedMetric>,
cardinality_counters: &mut HashMap<String, HashSet<String>>,
) -> Result<(), MetricsError> {
for template in &self.templates {
if let Some(mut extracted) = template.extract(entry) {
// Apply cardinality safety filters
if !self.config.include_job_id_labels {
extracted.labels.remove("job_id");
}
// Check cardinality limit
let label_signature = self.get_label_signature(&extracted.labels);
let cardinality_set = cardinality_counters
.entry(extracted.name.clone())
.or_insert_with(HashSet::new);
if cardinality_set.len() >= self.config.max_cardinality_per_metric
&& !cardinality_set.contains(&label_signature) {
// Skip this metric to avoid cardinality explosion
continue;
}
cardinality_set.insert(label_signature);
// Add to aggregated metrics
let agg_metric = aggregated
.entry(extracted.name.clone())
.or_insert_with(|| AggregatedMetric {
name: extracted.name.clone(),
help: extracted.help.clone(),
metric_type: extracted.metric_type.clone(),
samples: Vec::new(),
});
// For counters, sum values with same labels; for gauges, keep latest
let existing_sample = agg_metric.samples.iter_mut()
.find(|s| s.labels == extracted.labels);
if let Some(sample) = existing_sample {
match extracted.metric_type {
MetricType::Counter => {
sample.value += extracted.value; // Sum counters
},
MetricType::Gauge | MetricType::Histogram | MetricType::Summary => {
sample.value = extracted.value; // Replace with latest
},
}
} else {
agg_metric.samples.push(MetricSample {
labels: extracted.labels,
value: extracted.value,
timestamp_ms: None, // Could add timestamp parsing if needed
});
}
}
}
Ok(())
}
/// Generate a signature string for label combinations
fn get_label_signature(&self, labels: &HashMap<String, String>) -> String {
let mut pairs: Vec<_> = labels.iter().collect();
pairs.sort_by_key(|&(k, _)| k);
pairs.iter()
.map(|(k, v)| format!("{}={}", k, v))
.collect::<Vec<_>>()
.join(",")
}
/// Format labels for Prometheus output
fn format_labels(&self, labels: &HashMap<String, String>) -> String {
if labels.is_empty() {
return String::new();
}
let mut pairs: Vec<_> = labels.iter().collect();
pairs.sort_by_key(|&(k, _)| k);
let formatted_pairs: Vec<String> = pairs.iter()
.map(|(k, v)| format!("{}=\"{}\"", k, self.escape_label_value(v)))
.collect();
format!("{{{}}}", formatted_pairs.join(","))
}
/// Escape label values for Prometheus format
fn escape_label_value(&self, value: &str) -> String {
value
.replace('\\', "\\\\")
.replace('"', "\\\"")
.replace('\n', "\\n")
.replace('\t', "\\t")
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::{job_log_entry, PartitionRef, WrapperJobEvent};
use std::io::Write;
use tempfile::TempDir;
fn create_test_logs(temp_dir: &TempDir) -> Result<(), Box<dyn std::error::Error>> {
// Create date directory
let date_dir = temp_dir.path().join("2025-01-27");
std::fs::create_dir_all(&date_dir)?;
// Create test job file with job summary
let job_file = date_dir.join("test_job_123.jsonl");
let mut file = std::fs::File::create(&job_file)?;
let entry = JobLogEntry {
timestamp: "1753763856".to_string(),
job_id: "test_job_123".to_string(),
outputs: vec![PartitionRef { r#str: "reviews/date=2025-01-27".to_string() }],
sequence_number: 4,
content: Some(job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "job_summary".to_string(),
job_status: Some("JOB_COMPLETED".to_string()),
exit_code: Some(0),
metadata: {
let mut meta = HashMap::new();
meta.insert("runtime_ms".to_string(), "2500.000".to_string());
meta.insert("peak_memory_mb".to_string(), "128.5".to_string());
meta.insert("total_cpu_ms".to_string(), "1200.000".to_string());
meta.insert("exit_code".to_string(), "0".to_string());
meta
},
job_label: None,
})),
};
writeln!(file, "{}", serde_json::to_string(&entry)?)?;
// Create task_success entry
let success_entry = JobLogEntry {
timestamp: "1753763857".to_string(),
job_id: "test_job_123".to_string(),
outputs: vec![PartitionRef { r#str: "reviews/date=2025-01-27".to_string() }],
sequence_number: 5,
content: Some(job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "task_success".to_string(),
job_status: Some("JOB_COMPLETED".to_string()),
exit_code: Some(0),
metadata: HashMap::new(),
job_label: None,
})),
};
writeln!(file, "{}", serde_json::to_string(&success_entry)?)?;
Ok(())
}
#[test]
fn test_metrics_aggregation() {
let temp_dir = TempDir::new().unwrap();
create_test_logs(&temp_dir).unwrap();
let aggregator = MetricsAggregator::with_defaults(temp_dir.path());
let metrics = aggregator.aggregate_metrics().unwrap();
assert!(!metrics.is_empty());
// Find duration metric
let duration_metric = metrics.iter()
.find(|m| m.name == "databuild_job_duration_seconds")
.expect("Should have duration metric");
assert_eq!(duration_metric.samples.len(), 1);
assert_eq!(duration_metric.samples[0].value, 2.5); // 2500ms -> 2.5s
// Verify labels - should only have job_id (which gets excluded) and job_status
let labels = &duration_metric.samples[0].labels;
assert_eq!(labels.get("job_status").unwrap(), "JOB_COMPLETED");
assert!(!labels.contains_key("job_id")); // Should be excluded by default
// Note: job_label would only be available from manifest entries, not job_summary events
}
#[test]
fn test_prometheus_format() {
let temp_dir = TempDir::new().unwrap();
create_test_logs(&temp_dir).unwrap();
let aggregator = MetricsAggregator::with_defaults(temp_dir.path());
let prometheus_output = aggregator.to_prometheus_format().unwrap();
assert!(prometheus_output.contains("# HELP databuild_job_duration_seconds"));
assert!(prometheus_output.contains("# TYPE databuild_job_duration_seconds histogram"));
assert!(prometheus_output.contains("databuild_job_duration_seconds{"));
assert!(prometheus_output.contains("job_status=\"JOB_COMPLETED\""));
assert!(prometheus_output.contains("} 2.5"));
}
#[test]
fn test_cardinality_safety() {
let config = MetricsConfig {
max_cardinality_per_metric: 2, // Very low limit for testing
time_range_hours: 24,
include_job_id_labels: true, // Enable to test cardinality
max_jobs_per_metric: 100,
};
let temp_dir = TempDir::new().unwrap();
// Create multiple jobs to test cardinality limit
let date_dir = temp_dir.path().join("2025-01-27");
std::fs::create_dir_all(&date_dir).unwrap();
for i in 1..=5 {
let job_file = date_dir.join(format!("job_{}.jsonl", i));
let mut file = std::fs::File::create(&job_file).unwrap();
let entry = JobLogEntry {
timestamp: "1753763856".to_string(),
job_id: format!("job_{}", i),
outputs: vec![PartitionRef { r#str: format!("table_{}/date=2025-01-27", i) }],
sequence_number: 1,
content: Some(job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "task_success".to_string(),
job_status: Some("JOB_COMPLETED".to_string()),
exit_code: Some(0),
metadata: HashMap::new(),
job_label: None,
})),
};
writeln!(file, "{}", serde_json::to_string(&entry).unwrap()).unwrap();
}
let aggregator = MetricsAggregator::new(temp_dir.path(), config);
let metrics = aggregator.aggregate_metrics().unwrap();
// Find the success count metric
let success_metric = metrics.iter()
.find(|m| m.name == "databuild_job_events_total")
.expect("Should have success count metric");
// Should be limited by cardinality (max 2 unique label combinations)
assert!(success_metric.samples.len() <= 2,
"Expected <= 2 samples due to cardinality limit, got {}",
success_metric.samples.len());
}
#[test]
fn test_label_escaping() {
let aggregator = MetricsAggregator::with_defaults("/tmp");
assert_eq!(aggregator.escape_label_value("normal"), "normal");
assert_eq!(aggregator.escape_label_value("with\"quotes"), "with\\\"quotes");
assert_eq!(aggregator.escape_label_value("with\\backslash"), "with\\\\backslash");
assert_eq!(aggregator.escape_label_value("with\nnewline"), "with\\nnewline");
assert_eq!(aggregator.escape_label_value("with\ttab"), "with\\ttab");
}
#[test]
fn test_label_signature_generation() {
let aggregator = MetricsAggregator::with_defaults("/tmp");
let mut labels1 = HashMap::new();
labels1.insert("job_label".to_string(), "test_job".to_string());
labels1.insert("job_status".to_string(), "JOB_COMPLETED".to_string());
let mut labels2 = HashMap::new();
labels2.insert("job_status".to_string(), "JOB_COMPLETED".to_string());
labels2.insert("job_label".to_string(), "test_job".to_string());
// Order shouldn't matter
assert_eq!(
aggregator.get_label_signature(&labels1),
aggregator.get_label_signature(&labels2)
);
let signature = aggregator.get_label_signature(&labels1);
assert!(signature.contains("job_label=test_job"));
assert!(signature.contains("job_status=JOB_COMPLETED"));
}
#[test]
fn test_counter_vs_gauge_aggregation() {
let temp_dir = TempDir::new().unwrap();
let date_dir = temp_dir.path().join("2025-01-27");
std::fs::create_dir_all(&date_dir).unwrap();
let job_file = date_dir.join("test_job.jsonl");
let mut file = std::fs::File::create(&job_file).unwrap();
// Create multiple task_success events (should be summed as counter)
for i in 1..=3 {
let entry = JobLogEntry {
timestamp: format!("175376385{}", i),
job_id: "test_job".to_string(),
outputs: vec![PartitionRef { r#str: "reviews/date=2025-01-27".to_string() }],
sequence_number: i,
content: Some(job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "task_success".to_string(),
job_status: Some("JOB_COMPLETED".to_string()),
exit_code: Some(0),
metadata: HashMap::new(),
job_label: None,
})),
};
writeln!(file, "{}", serde_json::to_string(&entry).unwrap()).unwrap();
}
// Create job summaries with different memory values (should use latest as gauge)
for (i, memory) in ["100.0", "150.0", "120.0"].iter().enumerate() {
let entry = JobLogEntry {
timestamp: format!("175376386{}", i),
job_id: "test_job".to_string(),
outputs: vec![PartitionRef { r#str: "reviews/date=2025-01-27".to_string() }],
sequence_number: (i + 10) as u64,
content: Some(job_log_entry::Content::JobEvent(WrapperJobEvent {
event_type: "job_summary".to_string(),
job_status: Some("JOB_COMPLETED".to_string()),
exit_code: Some(0),
metadata: {
let mut meta = HashMap::new();
meta.insert("peak_memory_mb".to_string(), memory.to_string());
meta.insert("runtime_ms".to_string(), "1000".to_string());
meta.insert("total_cpu_ms".to_string(), "500".to_string());
meta
},
job_label: None,
})),
};
writeln!(file, "{}", serde_json::to_string(&entry).unwrap()).unwrap();
}
let aggregator = MetricsAggregator::with_defaults(temp_dir.path());
let metrics = aggregator.aggregate_metrics().unwrap();
// Check counter behavior (task_success events should be summed)
let success_metric = metrics.iter()
.find(|m| m.name == "databuild_job_events_total")
.expect("Should have success count metric");
assert_eq!(success_metric.samples[0].value, 3.0); // 3 events summed
// Check gauge behavior (memory should be latest value)
let memory_metric = metrics.iter()
.find(|m| m.name == "databuild_job_peak_memory_mb")
.expect("Should have memory metric");
assert_eq!(memory_metric.samples[0].value, 120.0); // Latest value
}
}
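A minimal usage sketch for the aggregator, assuming a logs directory laid out as in the tests above (`<logs>/<YYYY-MM-DD>/<job_id>.jsonl`); the path is an example value:
```rust
// Produce a Prometheus exposition body suitable for a /metrics handler.
fn scrape_body() -> Result<String, MetricsError> {
    let config = MetricsConfig {
        max_cardinality_per_metric: 500,
        time_range_hours: 24,
        include_job_id_labels: false, // keep label cardinality bounded
        max_jobs_per_metric: 100,
    };
    let aggregator = MetricsAggregator::new("/var/lib/databuild/logs", config);
    aggregator.to_prometheus_format()
}
```
Custom templates can be registered on top of the standard set with `add_template` before calling `to_prometheus_format`.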

View file

@ -1,15 +0,0 @@
use crate::event_log::BuildEventLogError;
#[derive(Debug, thiserror::Error)]
pub enum OrchestrationError {
#[error("Event log error: {0}")]
EventLog(#[from] BuildEventLogError),
#[error("Build coordination error: {0}")]
Coordination(String),
#[error("Invalid build state transition: {current} -> {requested}")]
InvalidStateTransition { current: String, requested: String },
}
pub type Result<T> = std::result::Result<T, OrchestrationError>;
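A minimal sketch of what the `#[from]` attribute buys callers elsewhere in the orchestration module: `?` converts a `BuildEventLogError` into `OrchestrationError::EventLog` automatically, so the explicit `map_err` used in `mod.rs` becomes optional. `EventWriter::append_event` is the same call the orchestrator makes; treating it this way is an assumption about where such a helper would live:
```rust
use crate::BuildEvent;
use crate::event_log::writer::EventWriter;

async fn append_one(writer: &EventWriter, event: BuildEvent) -> Result<()> {
    // BuildEventLogError -> OrchestrationError::EventLog via the #[from] impl
    writer.append_event(event).await?;
    Ok(())
}
```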

View file

@ -1,156 +0,0 @@
use crate::*;
use crate::event_log::{create_build_event, current_timestamp_nanos, generate_event_id};
/// Helper functions for creating standardized build events
pub fn create_build_request_received_event(
build_request_id: String,
requested_partitions: Vec<PartitionRef>,
) -> BuildEvent {
create_build_event(
build_request_id,
build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestReceived.status()),
requested_partitions,
message: "Build request received".to_string(),
comment: None,
want_id: None,
}),
)
}
pub fn create_build_planning_started_event(
build_request_id: String,
) -> BuildEvent {
create_build_event(
build_request_id,
build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestPlanning.status()),
requested_partitions: vec![],
message: "Starting build planning".to_string(),
comment: None,
want_id: None,
}),
)
}
pub fn create_build_execution_started_event(
build_request_id: String,
) -> BuildEvent {
create_build_event(
build_request_id,
build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestExecuting.status()),
requested_partitions: vec![],
message: "Starting build execution".to_string(),
comment: None,
want_id: None,
}),
)
}
pub fn create_build_completed_event(
build_request_id: String,
result: &super::BuildResult,
) -> BuildEvent {
let message = match result {
super::BuildResult::Success { jobs_completed } => {
format!("Build completed successfully with {} jobs", jobs_completed)
}
super::BuildResult::Failed { jobs_completed, jobs_failed } => {
format!("Build failed: {} jobs completed, {} jobs failed", jobs_completed, jobs_failed)
}
super::BuildResult::FailFast { trigger_job } => {
format!("Build failed fast due to job: {}", trigger_job)
}
};
let status = match result {
super::BuildResult::Success { .. } => BuildRequestStatusCode::BuildRequestCompleted.status(),
super::BuildResult::Failed { .. } | super::BuildResult::FailFast { .. } => BuildRequestStatusCode::BuildRequestFailed.status(),
};
create_build_event(
build_request_id,
build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(status),
requested_partitions: vec![],
message,
comment: None,
want_id: None,
}),
)
}
pub fn create_analysis_completed_event(
build_request_id: String,
requested_partitions: Vec<PartitionRef>,
task_count: usize,
) -> BuildEvent {
create_build_event(
build_request_id,
build_event::EventType::BuildRequestEvent(BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestAnalysisCompleted.status()),
requested_partitions,
message: format!("Analysis completed successfully, {} tasks planned", task_count),
comment: None,
want_id: None,
}),
)
}
pub fn create_job_scheduled_event(
build_request_id: String,
job_event: &JobEvent,
) -> BuildEvent {
BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id),
event_type: Some(build_event::EventType::JobEvent(job_event.clone())),
}
}
pub fn create_job_completed_event(
build_request_id: String,
job_event: &JobEvent,
) -> BuildEvent {
BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id),
event_type: Some(build_event::EventType::JobEvent(job_event.clone())),
}
}
pub fn create_partition_available_event(
build_request_id: String,
partition_event: &PartitionEvent,
) -> BuildEvent {
BuildEvent {
event_id: generate_event_id(),
timestamp: current_timestamp_nanos(),
build_request_id: Some(build_request_id),
event_type: Some(build_event::EventType::PartitionEvent(partition_event.clone())),
}
}
pub fn create_delegation_event(
build_request_id: String,
partition_ref: &str,
target_build: &str,
message: &str,
) -> BuildEvent {
let partition = PartitionRef {
str: partition_ref.to_string(),
};
create_build_event(
build_request_id,
build_event::EventType::DelegationEvent(DelegationEvent {
partition_ref: Some(partition),
delegated_to_build_request_id: target_build.to_string(),
message: message.to_string(),
}),
)
}
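A minimal sketch of how these helpers pair with `BuildResult` from the parent module, assuming it sits inside `orchestration::events` so the file's existing `use crate::*` is in scope; the request ID and partition ref are example values:
```rust
// Emit the bookend events of a small, successful build.
fn example_events(build_request_id: &str) -> Vec<BuildEvent> {
    vec![
        create_build_request_received_event(
            build_request_id.to_string(),
            vec![PartitionRef { str: "data/users/date=2025-01-27".to_string() }],
        ),
        create_build_completed_event(
            build_request_id.to_string(),
            &super::BuildResult::Success { jobs_completed: 3 },
        ),
    ]
}
```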

View file

@ -1,261 +0,0 @@
use crate::*;
use crate::event_log::{writer::EventWriter, query_engine::BELQueryEngine};
use log::info;
use std::sync::Arc;
pub mod error;
pub mod events;
pub use error::{OrchestrationError, Result};
/// Result of a build execution
#[derive(Debug, Clone)]
pub enum BuildResult {
Success { jobs_completed: usize },
Failed { jobs_completed: usize, jobs_failed: usize },
FailFast { trigger_job: String },
}
/// Core orchestrator for managing build lifecycle and event emission
pub struct BuildOrchestrator {
event_writer: EventWriter,
build_request_id: String,
requested_partitions: Vec<PartitionRef>,
}
impl BuildOrchestrator {
/// Create a new build orchestrator
pub fn new(
query_engine: Arc<BELQueryEngine>,
build_request_id: String,
requested_partitions: Vec<PartitionRef>,
) -> Self {
Self {
event_writer: EventWriter::new(query_engine),
build_request_id,
requested_partitions,
}
}
/// Get the build request ID
pub fn build_request_id(&self) -> &str {
&self.build_request_id
}
/// Get the requested partitions
pub fn requested_partitions(&self) -> &[PartitionRef] {
&self.requested_partitions
}
/// Emit build request received event and start the build lifecycle
pub async fn start_build(&self) -> Result<()> {
info!("Starting build for request: {}", self.build_request_id);
self.event_writer.request_build(
self.build_request_id.clone(),
self.requested_partitions.clone(),
).await
.map_err(OrchestrationError::EventLog)?;
Ok(())
}
/// Emit build planning started event
pub async fn start_planning(&self) -> Result<()> {
info!("Starting build planning for request: {}", self.build_request_id);
self.event_writer.update_build_status(
self.build_request_id.clone(),
BuildRequestStatusCode::BuildRequestPlanning.status(),
"Starting build planning".to_string(),
).await
.map_err(OrchestrationError::EventLog)?;
Ok(())
}
/// Emit build execution started event
pub async fn start_execution(&self) -> Result<()> {
info!("Starting build execution for request: {}", self.build_request_id);
self.event_writer.update_build_status(
self.build_request_id.clone(),
BuildRequestStatusCode::BuildRequestExecuting.status(),
"Starting build execution".to_string(),
).await
.map_err(OrchestrationError::EventLog)?;
Ok(())
}
/// Emit build completion event
pub async fn complete_build(&self, result: BuildResult) -> Result<()> {
info!("Completing build for request: {} with result: {:?}",
self.build_request_id, result);
let (status, message) = match &result {
BuildResult::Success { jobs_completed } => {
(BuildRequestStatusCode::BuildRequestCompleted,
format!("Build completed successfully with {} jobs", jobs_completed))
}
BuildResult::Failed { jobs_completed, jobs_failed } => {
(BuildRequestStatusCode::BuildRequestFailed,
format!("Build failed: {} jobs completed, {} jobs failed", jobs_completed, jobs_failed))
}
BuildResult::FailFast { trigger_job } => {
(BuildRequestStatusCode::BuildRequestFailed,
format!("Build failed fast due to job: {}", trigger_job))
}
};
self.event_writer.update_build_status(
self.build_request_id.clone(),
status.status(),
message,
).await
.map_err(OrchestrationError::EventLog)?;
Ok(())
}
/// Emit analysis completed event
pub async fn emit_analysis_completed(&self, task_count: usize) -> Result<()> {
self.event_writer.update_build_status_with_partitions(
self.build_request_id.clone(),
BuildRequestStatusCode::BuildRequestAnalysisCompleted.status(),
self.requested_partitions.clone(),
format!("Analysis completed successfully, {} tasks planned", task_count),
).await
.map_err(OrchestrationError::EventLog)?;
Ok(())
}
/// Emit job scheduled event
pub async fn emit_job_scheduled(&self, job: &JobEvent) -> Result<()> {
let event = events::create_job_scheduled_event(
self.build_request_id.clone(),
job,
);
self.event_writer.append_event(event).await
.map_err(OrchestrationError::EventLog)?;
Ok(())
}
/// Emit job completed event
pub async fn emit_job_completed(&self, job: &JobEvent) -> Result<()> {
let event = events::create_job_completed_event(
self.build_request_id.clone(),
job,
);
self.event_writer.append_event(event).await
.map_err(OrchestrationError::EventLog)?;
Ok(())
}
/// Emit partition available event
pub async fn emit_partition_available(&self, partition: &PartitionEvent) -> Result<()> {
let event = events::create_partition_available_event(
self.build_request_id.clone(),
partition,
);
self.event_writer.append_event(event).await
.map_err(OrchestrationError::EventLog)?;
Ok(())
}
/// Emit delegation event
pub async fn emit_delegation(
&self,
partition_ref: &str,
target_build: &str,
message: &str,
) -> Result<()> {
let partition = PartitionRef { str: partition_ref.to_string() };
self.event_writer.record_delegation(
self.build_request_id.clone(),
partition,
target_build.to_string(),
message.to_string(),
).await
.map_err(OrchestrationError::EventLog)?;
Ok(())
}
}
#[cfg(test)]
mod tests {
use super::*;
#[tokio::test]
async fn test_build_lifecycle_events() {
// Use mock BEL query engine for testing
let query_engine = crate::event_log::mock::create_mock_bel_query_engine().await.unwrap();
let partitions = vec![PartitionRef { str: "test/partition".to_string() }];
let orchestrator = BuildOrchestrator::new(
query_engine,
"test-build-123".to_string(),
partitions.clone(),
);
// Test full build lifecycle
orchestrator.start_build().await.unwrap();
orchestrator.start_planning().await.unwrap();
orchestrator.start_execution().await.unwrap();
orchestrator.complete_build(BuildResult::Success { jobs_completed: 5 }).await.unwrap();
// Note: Since we're using the real BELQueryEngine with mock storage,
// we can't easily inspect emitted events in this test without significant refactoring.
// The test verifies that the orchestration methods complete without errors,
// which exercises the event emission code paths.
// TODO: If we need to verify specific events, we could:
// 1. Query the mock storage through the query engine
// 2. Create a specialized test storage that captures events
// 3. Use the existing MockBuildEventLog test pattern with dependency injection
}
#[tokio::test]
async fn test_partition_and_job_events() {
// Use mock BEL query engine for testing
let query_engine = crate::event_log::mock::create_mock_bel_query_engine().await.unwrap();
let orchestrator = BuildOrchestrator::new(
query_engine,
"test-build-456".to_string(),
vec![],
);
// Test analysis completed event
orchestrator.emit_analysis_completed(3).await.unwrap();
// Test job event
let partition = PartitionRef { str: "data/users".to_string() };
let job_event = JobEvent {
job_run_id: "job-run-123".to_string(),
job_label: Some(JobLabel { label: "//:test_job".to_string() }),
target_partitions: vec![partition.clone()],
status_code: JobStatus::JobScheduled as i32,
status_name: JobStatus::JobScheduled.to_display_string(),
message: "Job scheduled".to_string(),
config: None,
manifests: vec![],
};
orchestrator.emit_job_scheduled(&job_event).await.unwrap();
// Note: Same testing limitation as above.
// We verify that the methods complete successfully without panicking.
}
}
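A minimal lifecycle sketch tying the orchestrator methods together in the order a CLI or service run would drive them, assuming the `mod.rs` imports above; the request ID and task count are example values:
```rust
async fn run_build(
    query_engine: std::sync::Arc<BELQueryEngine>,
    partitions: Vec<PartitionRef>,
) -> Result<()> {
    let orchestrator = BuildOrchestrator::new(
        query_engine,
        "build-20250127-0001".to_string(), // example request ID
        partitions,
    );

    orchestrator.start_build().await?;
    orchestrator.start_planning().await?;
    orchestrator.emit_analysis_completed(2).await?; // e.g. 2 tasks planned
    orchestrator.start_execution().await?;
    // ... emit_job_scheduled / emit_job_completed / emit_partition_available per task ...
    orchestrator.complete_build(BuildResult::Success { jobs_completed: 2 }).await
}
```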

View file

@ -1,408 +0,0 @@
use crate::*;
use crate::event_log::{BuildEventLogError, Result};
use crate::event_log::query_engine::BELQueryEngine;
use crate::{BuildDetailResponse, BuildTimelineEvent as ServiceBuildTimelineEvent};
use std::sync::Arc;
// use std::collections::HashMap; // Commented out since not used with new query engine
use serde::Serialize;
/// Repository for querying build data from the build event log
pub struct BuildsRepository {
query_engine: Arc<BELQueryEngine>,
}
/// Summary of a build request and its current status
#[derive(Debug, Clone, Serialize)]
pub struct BuildInfo {
pub build_request_id: String,
pub status: BuildRequestStatus,
pub requested_partitions: Vec<PartitionRef>,
pub requested_at: i64,
pub started_at: Option<i64>,
pub completed_at: Option<i64>,
pub duration_ms: Option<i64>,
pub total_jobs: usize,
pub completed_jobs: usize,
pub failed_jobs: usize,
pub cancelled_jobs: usize,
pub cancelled: bool,
pub cancel_reason: Option<String>,
}
/// Detailed timeline of a build's execution events
#[derive(Debug, Clone, Serialize)]
pub struct BuildEvent {
pub timestamp: i64,
pub event_type: String,
pub status: Option<BuildRequestStatus>,
pub message: String,
pub cancel_reason: Option<String>,
}
impl BuildsRepository {
/// Create a new BuildsRepository
pub fn new(query_engine: Arc<BELQueryEngine>) -> Self {
Self { query_engine }
}
/// List all builds with their current status
///
/// Returns a list of all build requests that have been made,
/// including their current status and execution details.
pub async fn list(&self, limit: Option<usize>) -> Result<Vec<BuildInfo>> {
// Use query engine to list builds with the protobuf request format
let request = BuildsListRequest {
limit: limit.map(|l| l as u32),
offset: Some(0),
status_filter: None,
};
let response = self.query_engine.list_build_requests(request).await?;
// Convert from protobuf BuildSummary to repository BuildInfo
let builds = response.builds.into_iter().map(|build| {
BuildInfo {
build_request_id: build.build_request_id,
status: build.status.clone().unwrap_or(BuildRequestStatusCode::BuildRequestUnknown.status()),
requested_partitions: build.requested_partitions,
requested_at: build.requested_at,
started_at: build.started_at,
completed_at: build.completed_at,
duration_ms: build.duration_ms,
total_jobs: build.total_jobs as usize,
completed_jobs: build.completed_jobs as usize,
failed_jobs: build.failed_jobs as usize,
cancelled_jobs: build.cancelled_jobs as usize,
cancelled: build.cancelled,
cancel_reason: None, // TODO: Add cancel reason to BuildSummary if needed
}
}).collect();
Ok(builds)
}
/// Show detailed information about a specific build
///
/// Returns the complete timeline of events for the specified build,
/// including all status changes and any cancellation events.
pub async fn show(&self, build_request_id: &str) -> Result<Option<(BuildInfo, Vec<BuildEvent>)>> {
// Use query engine to get build summary
let summary_result = self.query_engine.get_build_request_summary(build_request_id).await;
match summary_result {
Ok(summary) => {
// Convert BuildRequestSummary to BuildInfo
let build_info = BuildInfo {
build_request_id: summary.build_request_id,
status: summary.status,
requested_partitions: summary.requested_partitions.into_iter()
.map(|s| PartitionRef { str: s })
.collect(),
requested_at: summary.created_at,
started_at: None, // TODO: Track started_at in query engine
completed_at: Some(summary.updated_at),
duration_ms: None, // TODO: Calculate duration in query engine
total_jobs: 0, // TODO: Implement job counting in query engine
completed_jobs: 0,
failed_jobs: 0,
cancelled_jobs: 0,
cancelled: false, // TODO: Track cancellation in query engine
cancel_reason: None,
};
// Get all events for this build to create a proper timeline
let all_events = self.query_engine.get_build_request_events(build_request_id, None).await?;
// Create timeline from build request events
let mut timeline = Vec::new();
for event in all_events {
if let Some(crate::build_event::EventType::BuildRequestEvent(br_event)) = &event.event_type {
if let Some(status) = br_event.clone().status {
timeline.push(BuildEvent {
timestamp: event.timestamp,
event_type: "build_status".to_string(),
status: Some(status),
message: br_event.message.clone(),
cancel_reason: None,
});
}
}
}
// Sort timeline by timestamp
timeline.sort_by_key(|e| e.timestamp);
Ok(Some((build_info, timeline)))
}
Err(_) => {
// Build not found
Ok(None)
}
}
}
/// Show detailed information about a specific build using protobuf response format
///
/// Returns the complete build details with dual status fields and timeline events.
pub async fn show_protobuf(&self, build_request_id: &str) -> Result<Option<BuildDetailResponse>> {
// Get build info and timeline using existing show method
if let Some((build_info, timeline)) = self.show(build_request_id).await? {
// Convert timeline events to protobuf format
let protobuf_timeline: Vec<ServiceBuildTimelineEvent> = timeline
.into_iter()
.map(|event| ServiceBuildTimelineEvent {
timestamp: event.timestamp,
status: event.status,
message: event.message,
event_type: event.event_type,
cancel_reason: event.cancel_reason,
})
.collect();
let response = BuildDetailResponse {
build_request_id: build_info.build_request_id,
status: Some(build_info.status),
requested_partitions: build_info.requested_partitions,
total_jobs: build_info.total_jobs as u32,
completed_jobs: build_info.completed_jobs as u32,
failed_jobs: build_info.failed_jobs as u32,
cancelled_jobs: build_info.cancelled_jobs as u32,
requested_at: build_info.requested_at,
started_at: build_info.started_at,
completed_at: build_info.completed_at,
duration_ms: build_info.duration_ms,
cancelled: build_info.cancelled,
cancel_reason: build_info.cancel_reason,
timeline: protobuf_timeline,
};
Ok(Some(response))
} else {
Ok(None)
}
}
/// Cancel a build with a reason
///
/// Appends a build cancellation event to the build event log via the query engine.
/// It validates that the build exists and is in a cancellable state.
pub async fn cancel(&self, build_request_id: &str, _reason: String) -> Result<()> {
// First check if the build exists and get its current status
let build_info = self.show(build_request_id).await?;
if build_info.is_none() {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel non-existent build: {}", build_request_id)
));
}
let (build, _timeline) = build_info.unwrap();
// Check if build is in a cancellable state
match BuildRequestStatusCode::try_from(build.status.code) {
Ok(BuildRequestStatusCode::BuildRequestCompleted) => {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel completed build: {}", build_request_id)
));
}
Ok(BuildRequestStatusCode::BuildRequestFailed) => {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel failed build: {}", build_request_id)
));
}
Ok(BuildRequestStatusCode::BuildRequestCancelled) => {
return Err(BuildEventLogError::QueryError(
format!("Build already cancelled: {}", build_request_id)
));
}
_ => {}
}
// Create a build cancellation event
use crate::event_log::create_build_event;
let cancel_event = create_build_event(
build_request_id.to_string(),
crate::build_event::EventType::BuildRequestEvent(crate::BuildRequestEvent {
status: Some(BuildRequestStatusCode::BuildRequestCancelled.status()),
requested_partitions: build.requested_partitions,
message: format!("Build cancelled: {}", reason),
comment: None,
want_id: None,
})
);
// Append the cancellation event
self.query_engine.append_event(cancel_event).await?;
Ok(())
}
/// List builds using protobuf response format with dual status fields
///
/// Returns BuildSummary protobuf messages with status_code and status_name.
pub async fn list_protobuf(&self, limit: Option<usize>) -> Result<Vec<crate::BuildSummary>> {
// Get build info using existing list method
let builds = self.list(limit).await?;
// Convert to protobuf format
let protobuf_builds: Vec<crate::BuildSummary> = builds
.into_iter()
.map(|build| crate::BuildSummary {
build_request_id: build.build_request_id,
status: Some(build.status),
requested_partitions: build.requested_partitions.into_iter().map(|p| crate::PartitionRef { str: p.str }).collect(),
total_jobs: build.total_jobs as u32,
completed_jobs: build.completed_jobs as u32,
failed_jobs: build.failed_jobs as u32,
cancelled_jobs: build.cancelled_jobs as u32,
requested_at: build.requested_at,
started_at: build.started_at,
completed_at: build.completed_at,
duration_ms: build.duration_ms,
cancelled: build.cancelled,
comment: None,
})
.collect();
Ok(protobuf_builds)
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::event_log::mock::{create_mock_bel_query_engine, create_mock_bel_query_engine_with_events, test_events};
#[tokio::test]
async fn test_builds_repository_list_empty() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let repo = BuildsRepository::new(query_engine);
let builds = repo.list(None).await.unwrap();
assert!(builds.is_empty());
}
#[tokio::test]
async fn test_builds_repository_list_with_data() {
let build_id1 = "build-123".to_string();
let build_id2 = "build-456".to_string();
let partition1 = PartitionRef { str: "data/users".to_string() };
let partition2 = PartitionRef { str: "data/orders".to_string() };
// Create events for multiple builds
let events = vec![
test_events::build_request_event(Some(build_id1.clone()), vec![partition1.clone()], BuildRequestStatusCode::BuildRequestReceived.status()),
test_events::build_request_event(Some(build_id1.clone()), vec![partition1.clone()], BuildRequestStatusCode::BuildRequestCompleted.status()),
test_events::build_request_event(Some(build_id2.clone()), vec![partition2.clone()], BuildRequestStatusCode::BuildRequestReceived.status()),
test_events::build_request_event(Some(build_id2.clone()), vec![partition2.clone()], BuildRequestStatusCode::BuildRequestFailed.status()),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = BuildsRepository::new(query_engine);
let builds = repo.list(None).await.unwrap();
assert_eq!(builds.len(), 2);
// Find builds by id
let build1 = builds.iter().find(|b| b.build_request_id == build_id1).unwrap();
let build2 = builds.iter().find(|b| b.build_request_id == build_id2).unwrap();
assert_eq!(build1.status, BuildRequestStatusCode::BuildRequestCompleted.status());
assert_eq!(build1.requested_partitions.len(), 1);
assert!(!build1.cancelled);
assert_eq!(build2.status, BuildRequestStatusCode::BuildRequestFailed.status());
assert_eq!(build2.requested_partitions.len(), 1);
assert!(!build2.cancelled);
}
#[tokio::test]
async fn test_builds_repository_show() {
let build_id = "build-789".to_string();
let partition = PartitionRef { str: "analytics/daily".to_string() };
let events = vec![
test_events::build_request_event(Some(build_id.clone()), vec![partition.clone()], BuildRequestStatusCode::BuildRequestReceived.status()),
test_events::build_request_event(Some(build_id.clone()), vec![partition.clone()], BuildRequestStatusCode::BuildRequestPlanning.status()),
test_events::build_request_event(Some(build_id.clone()), vec![partition.clone()], BuildRequestStatusCode::BuildRequestExecuting.status()),
test_events::build_request_event(Some(build_id.clone()), vec![partition.clone()], BuildRequestStatusCode::BuildRequestCompleted.status()),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = BuildsRepository::new(query_engine);
let result = repo.show(&build_id).await.unwrap();
assert!(result.is_some());
let (info, timeline) = result.unwrap();
assert_eq!(info.build_request_id, build_id);
assert_eq!(info.status, BuildRequestStatusCode::BuildRequestCompleted.status());
assert!(!info.cancelled);
assert_eq!(timeline.len(), 4);
assert_eq!(timeline[0].status, Some(BuildRequestStatusCode::BuildRequestReceived.status()));
assert_eq!(timeline[1].status, Some(BuildRequestStatusCode::BuildRequestPlanning.status()));
assert_eq!(timeline[2].status, Some(BuildRequestStatusCode::BuildRequestExecuting.status()));
assert_eq!(timeline[3].status, Some(BuildRequestStatusCode::BuildRequestCompleted.status()));
}
#[tokio::test]
async fn test_builds_repository_show_nonexistent() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let repo = BuildsRepository::new(query_engine);
let result = repo.show("nonexistent-build").await.unwrap();
assert!(result.is_none());
}
#[tokio::test]
async fn test_builds_repository_cancel() {
let build_id = "build-cancel-test".to_string();
let partition = PartitionRef { str: "test/data".to_string() };
// Start with a running build
let events = vec![
test_events::build_request_event(Some(build_id.clone()), vec![partition.clone()], BuildRequestStatusCode::BuildRequestReceived.status()),
test_events::build_request_event(Some(build_id.clone()), vec![partition.clone()], BuildRequestStatusCode::BuildRequestExecuting.status()),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = BuildsRepository::new(query_engine.clone());
// Cancel the build
repo.cancel(&build_id, "User requested cancellation".to_string()).await.unwrap();
// Verify the cancellation was recorded
// Note: This test demonstrates the pattern, but the MockBELStorage would need
// to be enhanced to properly store build cancel events for full verification
// Try to cancel a non-existent build
let result = repo.cancel("nonexistent-build", "Should fail".to_string()).await;
assert!(result.is_err());
}
#[tokio::test]
async fn test_builds_repository_cancel_completed_build() {
let build_id = "completed-build".to_string();
let partition = PartitionRef { str: "test/data".to_string() };
// Create a completed build
let events = vec![
test_events::build_request_event(Some(build_id.clone()), vec![partition.clone()], BuildRequestStatusCode::BuildRequestReceived.status()),
test_events::build_request_event(Some(build_id.clone()), vec![partition.clone()], BuildRequestStatusCode::BuildRequestCompleted.status()),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = BuildsRepository::new(query_engine);
// Try to cancel the completed build - should fail
let result = repo.cancel(&build_id, "Should fail".to_string()).await;
assert!(result.is_err());
if let Err(BuildEventLogError::QueryError(msg)) = result {
assert!(msg.contains("Cannot cancel completed build"));
} else {
panic!("Expected QueryError for completed build cancellation");
}
}
}


@ -1,499 +0,0 @@
use crate::*;
use crate::event_log::{BuildEventLogError, Result};
use crate::event_log::query_engine::BELQueryEngine;
use crate::{JobDetailResponse, JobRunDetail as ServiceJobRunDetail};
use std::sync::Arc;
use std::collections::HashMap;
use serde::Serialize;
/// Repository for querying job data from the build event log
pub struct JobsRepository {
query_engine: Arc<BELQueryEngine>,
}
/// Summary of a job's execution history and statistics
#[derive(Debug, Clone, Serialize)]
pub struct JobInfo {
pub job_label: String,
pub total_runs: usize,
pub successful_runs: usize,
pub failed_runs: usize,
pub cancelled_runs: usize,
pub last_run_timestamp: i64,
pub last_run_status: JobStatus,
pub average_partitions_per_run: f64,
pub recent_builds: Vec<String>, // Build request IDs that used this job
}
/// Detailed information about a specific job execution
#[derive(Debug, Clone, Serialize)]
pub struct JobRunDetail {
pub job_run_id: String,
pub job_label: String,
pub build_request_id: String,
pub target_partitions: Vec<PartitionRef>,
pub status: JobStatus,
pub scheduled_at: i64,
pub started_at: Option<i64>,
pub completed_at: Option<i64>,
pub duration_ms: Option<i64>,
pub message: String,
pub config: Option<JobConfig>,
pub manifests: Vec<PartitionManifest>,
}
impl JobsRepository {
/// Create a new JobsRepository
pub fn new(query_engine: Arc<BELQueryEngine>) -> Self {
Self { query_engine }
}
/// List all jobs with their execution statistics
///
/// Returns a summary of all jobs that have been executed, including
/// success/failure statistics and recent activity.
pub async fn list(&self, limit: Option<usize>) -> Result<Vec<JobInfo>> {
// Get all job events from the event log
let events = self.query_engine.get_events_in_range(0, i64::MAX).await?;
let mut job_data: HashMap<String, Vec<JobRunDetail>> = HashMap::new();
// Collect all job events and group by job label
for event in events {
if let Some(build_event::EventType::JobEvent(j_event)) = &event.event_type {
let job_label = j_event.job_label.as_ref()
.map(|l| l.label.clone())
.unwrap_or_else(|| "unknown".to_string());
let status = match j_event.status_code {
1 => JobStatus::JobScheduled,
2 => JobStatus::JobRunning,
3 => JobStatus::JobCompleted,
4 => JobStatus::JobFailed,
5 => JobStatus::JobCancelled,
6 => JobStatus::JobSkipped,
_ => JobStatus::JobUnknown,
};
// Create or update job run detail
let job_runs = job_data.entry(job_label.clone()).or_insert_with(Vec::new);
// Find existing run or create new one
if let Some(existing_run) = job_runs.iter_mut().find(|r| r.job_run_id == j_event.job_run_id) {
// Update existing run with new status
existing_run.status = status;
existing_run.message = j_event.message.clone();
match status {
JobStatus::JobRunning => {
existing_run.started_at = Some(event.timestamp);
}
JobStatus::JobCompleted | JobStatus::JobFailed | JobStatus::JobCancelled => {
existing_run.completed_at = Some(event.timestamp);
if let Some(started) = existing_run.started_at {
existing_run.duration_ms = Some((event.timestamp - started) / 1_000_000); // Convert to ms
}
existing_run.manifests = j_event.manifests.clone();
}
_ => {}
}
} else {
// Create new job run
let job_run = JobRunDetail {
job_run_id: j_event.job_run_id.clone(),
job_label: job_label.clone(),
build_request_id: event.build_request_id.clone().unwrap(),
target_partitions: j_event.target_partitions.clone(),
status,
scheduled_at: event.timestamp,
started_at: if status == JobStatus::JobRunning { Some(event.timestamp) } else { None },
completed_at: None,
duration_ms: None,
message: j_event.message.clone(),
config: j_event.config.clone(),
manifests: j_event.manifests.clone(),
};
job_runs.push(job_run);
}
}
}
// Convert to JobInfo structs with statistics
let mut job_infos: Vec<JobInfo> = job_data.into_iter()
.map(|(job_label, job_runs)| {
let total_runs = job_runs.len();
let successful_runs = job_runs.iter().filter(|r| r.status == JobStatus::JobCompleted).count();
let failed_runs = job_runs.iter().filter(|r| r.status == JobStatus::JobFailed).count();
let cancelled_runs = job_runs.iter().filter(|r| r.status == JobStatus::JobCancelled).count();
let (last_run_timestamp, last_run_status) = job_runs.iter()
.max_by_key(|r| r.scheduled_at)
.map(|r| (r.scheduled_at, r.status.clone()))
.unwrap_or((0, JobStatus::JobUnknown));
let total_partitions: usize = job_runs.iter()
.map(|r| r.target_partitions.len())
.sum();
let average_partitions_per_run = if total_runs > 0 {
total_partitions as f64 / total_runs as f64
} else {
0.0
};
// Get recent unique build request IDs
let mut recent_builds: Vec<String> = job_runs.iter()
.map(|r| r.build_request_id.clone())
.collect::<std::collections::HashSet<_>>()
.into_iter()
.collect();
recent_builds.sort();
recent_builds.truncate(10); // Keep last 10 builds
JobInfo {
job_label,
total_runs,
successful_runs,
failed_runs,
cancelled_runs,
last_run_timestamp,
last_run_status,
average_partitions_per_run,
recent_builds,
}
})
.collect();
// Sort by last run timestamp (most recent first)
job_infos.sort_by(|a, b| b.last_run_timestamp.cmp(&a.last_run_timestamp));
// Apply limit if specified
if let Some(limit) = limit {
job_infos.truncate(limit);
}
Ok(job_infos)
}
/// Show detailed information about a specific job
///
/// Returns all execution runs for the specified job label, including
/// detailed timing, status, and output information.
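///
/// A minimal usage sketch (illustrative only; assumes an `Arc<BELQueryEngine>`
/// named `query_engine` is already in scope):
///
/// ```ignore
/// let repo = JobsRepository::new(query_engine);
/// if let Some((info, runs)) = repo.show("//:process_data").await? {
///     println!("{} runs, {} successful", info.total_runs, info.successful_runs);
///     for run in &runs {
///         println!("{} -> {:?} in {:?} ms", run.job_run_id, run.status, run.duration_ms);
///     }
/// }
/// ```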
pub async fn show(&self, job_label: &str) -> Result<Option<(JobInfo, Vec<JobRunDetail>)>> {
// Get all job events for this specific job
let events = self.query_engine.get_events_in_range(0, i64::MAX).await?;
let mut job_runs: Vec<JobRunDetail> = Vec::new();
// Collect all job events for this job label
for event in events {
if let Some(build_event::EventType::JobEvent(j_event)) = &event.event_type {
let event_job_label = j_event.job_label.as_ref()
.map(|l| l.label.clone())
.unwrap_or_else(|| "unknown".to_string());
if event_job_label != job_label {
continue;
}
let status = match j_event.status_code {
1 => JobStatus::JobScheduled,
2 => JobStatus::JobRunning,
3 => JobStatus::JobCompleted,
4 => JobStatus::JobFailed,
5 => JobStatus::JobCancelled,
6 => JobStatus::JobSkipped,
_ => JobStatus::JobUnknown,
};
// Find existing run or create new one
if let Some(existing_run) = job_runs.iter_mut().find(|r| r.job_run_id == j_event.job_run_id) {
// Update existing run with new status
existing_run.status = status;
existing_run.message = j_event.message.clone();
match status {
JobStatus::JobRunning => {
existing_run.started_at = Some(event.timestamp);
}
JobStatus::JobCompleted | JobStatus::JobFailed | JobStatus::JobCancelled => {
existing_run.completed_at = Some(event.timestamp);
if let Some(started) = existing_run.started_at {
existing_run.duration_ms = Some((event.timestamp - started) / 1_000_000); // Convert to ms
}
existing_run.manifests = j_event.manifests.clone();
}
_ => {}
}
} else {
// Create new job run
let job_run = JobRunDetail {
job_run_id: j_event.job_run_id.clone(),
job_label: job_label.to_string(),
build_request_id: event.build_request_id.clone().unwrap(),
target_partitions: j_event.target_partitions.clone(),
status,
scheduled_at: event.timestamp,
started_at: if status == JobStatus::JobRunning { Some(event.timestamp) } else { None },
completed_at: None,
duration_ms: None,
message: j_event.message.clone(),
config: j_event.config.clone(),
manifests: j_event.manifests.clone(),
};
job_runs.push(job_run);
}
}
}
if job_runs.is_empty() {
return Ok(None);
}
// Sort runs by scheduled time (most recent first)
job_runs.sort_by(|a, b| b.scheduled_at.cmp(&a.scheduled_at));
// Calculate job statistics
let total_runs = job_runs.len();
let successful_runs = job_runs.iter().filter(|r| r.status == JobStatus::JobCompleted).count();
let failed_runs = job_runs.iter().filter(|r| r.status == JobStatus::JobFailed).count();
let cancelled_runs = job_runs.iter().filter(|r| r.status == JobStatus::JobCancelled).count();
let (last_run_timestamp, last_run_status) = job_runs.iter()
.max_by_key(|r| r.scheduled_at)
.map(|r| (r.scheduled_at, r.status.clone()))
.unwrap_or((0, JobStatus::JobUnknown));
let total_partitions: usize = job_runs.iter()
.map(|r| r.target_partitions.len())
.sum();
let average_partitions_per_run = if total_runs > 0 {
total_partitions as f64 / total_runs as f64
} else {
0.0
};
// Get recent unique build request IDs
let mut recent_builds: Vec<String> = job_runs.iter()
.map(|r| r.build_request_id.clone())
.collect::<std::collections::HashSet<_>>()
.into_iter()
.collect();
recent_builds.sort();
recent_builds.truncate(10); // Keep last 10 builds
let job_info = JobInfo {
job_label: job_label.to_string(),
total_runs,
successful_runs,
failed_runs,
cancelled_runs,
last_run_timestamp,
last_run_status,
average_partitions_per_run,
recent_builds,
};
Ok(Some((job_info, job_runs)))
}
/// Show detailed information about a specific job using protobuf response format
///
/// Returns the complete job details with dual status fields and run details.
pub async fn show_protobuf(&self, job_label: &str) -> Result<Option<JobDetailResponse>> {
// Get job info and runs using existing show method
if let Some((job_info, job_runs)) = self.show(job_label).await? {
// Convert job runs to protobuf format
let protobuf_runs: Vec<ServiceJobRunDetail> = job_runs
.into_iter()
.map(|run| ServiceJobRunDetail {
job_run_id: run.job_run_id,
build_request_id: run.build_request_id,
target_partitions: run.target_partitions,
status_code: run.status as i32,
status_name: run.status.to_display_string(),
started_at: run.started_at,
completed_at: run.completed_at,
duration_ms: run.duration_ms,
message: run.message,
})
.collect();
let response = JobDetailResponse {
job_label: job_info.job_label,
total_runs: job_info.total_runs as u32,
successful_runs: job_info.successful_runs as u32,
failed_runs: job_info.failed_runs as u32,
cancelled_runs: job_info.cancelled_runs as u32,
average_partitions_per_run: job_info.average_partitions_per_run,
last_run_timestamp: job_info.last_run_timestamp,
last_run_status_code: job_info.last_run_status as i32,
last_run_status_name: job_info.last_run_status.to_display_string(),
recent_builds: job_info.recent_builds,
runs: protobuf_runs,
};
Ok(Some(response))
} else {
Ok(None)
}
}
/// List jobs using protobuf response format with dual status fields
///
/// Returns JobsListResponse protobuf message with JobSummary objects containing
/// last_run_status_code and last_run_status_name fields.
pub async fn list_protobuf(&self, request: JobsListRequest) -> Result<JobsListResponse> {
// Get job info using existing list method
let jobs = self.list(request.limit.map(|l| l as usize)).await?;
// Convert to protobuf format
let protobuf_jobs: Vec<crate::JobSummary> = jobs
.into_iter()
.map(|job| crate::JobSummary {
job_label: job.job_label,
total_runs: job.total_runs as u32,
successful_runs: job.successful_runs as u32,
failed_runs: job.failed_runs as u32,
cancelled_runs: job.cancelled_runs as u32,
average_partitions_per_run: job.average_partitions_per_run,
last_run_timestamp: job.last_run_timestamp,
last_run_status_code: job.last_run_status as i32,
last_run_status_name: job.last_run_status.to_display_string(),
recent_builds: job.recent_builds,
})
.collect();
let total_count = protobuf_jobs.len() as u32;
Ok(JobsListResponse {
jobs: protobuf_jobs,
total_count,
})
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::event_log::mock::{create_mock_bel_query_engine, create_mock_bel_query_engine_with_events, test_events};
#[tokio::test]
async fn test_jobs_repository_list_empty() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let repo = JobsRepository::new(query_engine);
let jobs = repo.list(None).await.unwrap();
assert!(jobs.is_empty());
}
#[tokio::test]
async fn test_jobs_repository_list_with_data() {
let build_id = "test-build-123".to_string();
let job_label1 = JobLabel { label: "//:process_data".to_string() };
let job_label2 = JobLabel { label: "//:generate_reports".to_string() };
let partition1 = PartitionRef { str: "data/users".to_string() };
let partition2 = PartitionRef { str: "reports/summary".to_string() };
// Create events for multiple jobs
let events = vec![
test_events::job_event(Some(build_id.clone()), Some("job-run-1".to_string()), job_label1.clone(), vec![partition1.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("job-run-1".to_string()), job_label1.clone(), vec![partition1.clone()], JobStatus::JobCompleted),
test_events::job_event(Some(build_id.clone()), Some("job-run-2".to_string()), job_label2.clone(), vec![partition2.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("job-run-2".to_string()), job_label2.clone(), vec![partition2.clone()], JobStatus::JobFailed),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = JobsRepository::new(query_engine);
let jobs = repo.list(None).await.unwrap();
assert_eq!(jobs.len(), 2);
// Find jobs by label
let process_job = jobs.iter().find(|j| j.job_label == "//:process_data").unwrap();
let reports_job = jobs.iter().find(|j| j.job_label == "//:generate_reports").unwrap();
assert_eq!(process_job.total_runs, 1);
assert_eq!(process_job.successful_runs, 1);
assert_eq!(process_job.failed_runs, 0);
assert_eq!(process_job.last_run_status, JobStatus::JobCompleted);
assert_eq!(reports_job.total_runs, 1);
assert_eq!(reports_job.successful_runs, 0);
assert_eq!(reports_job.failed_runs, 1);
assert_eq!(reports_job.last_run_status, JobStatus::JobFailed);
}
#[tokio::test]
async fn test_jobs_repository_show() {
let build_id = "test-build-456".to_string();
let job_label = JobLabel { label: "//:analytics_job".to_string() };
let partition = PartitionRef { str: "analytics/daily".to_string() };
let events = vec![
test_events::job_event(Some(build_id.clone()), Some("job-run-123".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("job-run-123".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobRunning),
test_events::job_event(Some(build_id.clone()), Some("job-run-123".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobCompleted),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = JobsRepository::new(query_engine);
let result = repo.show(&job_label.label).await.unwrap();
assert!(result.is_some());
let (info, runs) = result.unwrap();
assert_eq!(info.job_label, "//:analytics_job");
assert_eq!(info.total_runs, 1);
assert_eq!(info.successful_runs, 1);
assert_eq!(info.last_run_status, JobStatus::JobCompleted);
assert_eq!(runs.len(), 1);
let run = &runs[0];
assert_eq!(run.job_run_id, "job-run-123");
assert_eq!(run.status, JobStatus::JobCompleted);
assert_eq!(run.target_partitions.len(), 1);
assert_eq!(run.target_partitions[0].str, "analytics/daily");
}
#[tokio::test]
async fn test_jobs_repository_show_nonexistent() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let repo = JobsRepository::new(query_engine);
let result = repo.show("//:nonexistent_job").await.unwrap();
assert!(result.is_none());
}
#[tokio::test]
async fn test_jobs_repository_statistics() {
let build_id = "test-build-789".to_string();
let job_label = JobLabel { label: "//:batch_processor".to_string() };
let partition = PartitionRef { str: "batch/data".to_string() };
// Create multiple runs with different outcomes
let events = vec![
// First run - successful
test_events::job_event(Some(build_id.clone()), Some("run-1".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("run-1".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobCompleted),
// Second run - failed
test_events::job_event(Some(build_id.clone()), Some("run-2".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("run-2".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobFailed),
// Third run - cancelled
test_events::job_event(Some(build_id.clone()), Some("run-3".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("run-3".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobCancelled),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = JobsRepository::new(query_engine);
let result = repo.show(&job_label.label).await.unwrap();
assert!(result.is_some());
let (info, _runs) = result.unwrap();
assert_eq!(info.total_runs, 3);
assert_eq!(info.successful_runs, 1);
assert_eq!(info.failed_runs, 1);
assert_eq!(info.cancelled_runs, 1);
assert_eq!(info.average_partitions_per_run, 1.0);
}
}


@ -1,17 +0,0 @@
//! Repository pattern implementations for reading from the build event log
//!
//! This module provides read-oriented repository interfaces that query the build
//! event log for different types of data (and append a small number of control
//! events, such as cancellations and invalidations). Each repository focuses on a
//! specific domain:
//!
//! - PartitionsRepository: Query partition status and history
//! - JobsRepository: Query job execution data
//! - TasksRepository: Query task (job run) information
//! - BuildsRepository: Query build request data
//!
//! All repositories are backed by the BELQueryEngine and provide a clean
//! separation between read and write operations.
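//!
//! A minimal usage sketch (illustrative only; assumes the caller already holds an
//! `Arc<BELQueryEngine>` from the event log setup, and that `Result` is this
//! crate's event-log result alias):
//!
//! ```ignore
//! use std::sync::Arc;
//!
//! async fn print_recent_builds(query_engine: Arc<BELQueryEngine>) -> Result<()> {
//!     let repo = builds::BuildsRepository::new(query_engine);
//!     for build in repo.list(Some(10)).await? {
//!         println!("{} -> {:?}", build.build_request_id, build.status);
//!     }
//!     Ok(())
//! }
//! ```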
pub mod partitions;
pub mod jobs;
pub mod tasks;
pub mod builds;


@ -1,373 +0,0 @@
use crate::*;
use crate::event_log::{BuildEventLogError, Result};
use crate::event_log::query_engine::BELQueryEngine;
use std::sync::Arc;
use std::collections::HashMap;
use serde::Serialize;
/// Repository for querying partition data from the build event log
pub struct PartitionsRepository {
query_engine: Arc<BELQueryEngine>,
}
/// Summary of a partition's current state and history
#[derive(Debug, Clone, Serialize)]
pub struct PartitionInfo {
pub partition_ref: PartitionRef,
pub current_status: PartitionStatus,
pub last_updated: i64,
pub builds_count: usize,
pub last_successful_build: Option<String>,
pub invalidation_count: usize,
}
/// Detailed partition status with timeline
#[derive(Debug, Clone, Serialize)]
pub struct PartitionStatusEvent {
pub timestamp: i64,
pub status: PartitionStatus,
pub message: String,
pub build_request_id: String,
pub job_run_id: Option<String>,
}
impl PartitionsRepository {
/// Create a new PartitionsRepository
pub fn new(query_engine: Arc<BELQueryEngine>) -> Self {
Self { query_engine }
}
/// List all partitions with their current status
///
/// Returns a list of all partitions that have been referenced in the build event log,
/// along with their current status and summary information.
pub async fn list(&self, limit: Option<usize>) -> Result<Vec<PartitionInfo>> {
// Get all events to find unique partitions
let events = self.query_engine.get_events_in_range(0, i64::MAX).await?;
// Collect unique partition references
let mut unique_partitions = std::collections::HashSet::new();
for event in &events {
match &event.event_type {
Some(crate::build_event::EventType::PartitionEvent(p_event)) => {
if let Some(partition_ref) = &p_event.partition_ref {
unique_partitions.insert(partition_ref.str.clone());
}
}
Some(crate::build_event::EventType::BuildRequestEvent(br_event)) => {
for partition_ref in &br_event.requested_partitions {
unique_partitions.insert(partition_ref.str.clone());
}
}
Some(crate::build_event::EventType::JobEvent(j_event)) => {
for partition_ref in &j_event.target_partitions {
unique_partitions.insert(partition_ref.str.clone());
}
}
_ => {}
}
}
// Get status for each partition and count builds
let mut partition_infos = Vec::new();
for partition_ref in unique_partitions {
if let Ok(Some((status, last_updated))) = self.query_engine.get_latest_partition_status(&partition_ref).await {
// Count builds that reference this partition by looking at BuildRequestEvents
let mut builds_count = 0;
for event in &events {
if let Some(crate::build_event::EventType::BuildRequestEvent(br_event)) = &event.event_type {
if br_event.requested_partitions.iter().any(|p| p.str == partition_ref) {
builds_count += 1;
}
}
}
partition_infos.push(PartitionInfo {
partition_ref: PartitionRef { str: partition_ref },
current_status: status,
last_updated,
builds_count,
last_successful_build: None, // TODO: Find last successful build
invalidation_count: 0, // TODO: Count invalidation events
});
}
}
// Sort by partition reference for consistent ordering
partition_infos.sort_by(|a, b| a.partition_ref.str.cmp(&b.partition_ref.str));
// Apply limit if specified
if let Some(limit) = limit {
partition_infos.truncate(limit);
}
Ok(partition_infos)
}
// TODO: Implement show_protobuf against the BELQueryEngine (currently a stub that returns None)
/// Show detailed information about a specific partition
///
/// Returns the complete timeline of status changes for the specified partition,
/// including all builds that have referenced it.
pub async fn show(&self, partition_ref: &str) -> Result<Option<(PartitionInfo, Vec<PartitionStatusEvent>)>> {
// Get partition events from query engine
let events = self.query_engine.get_partition_events(partition_ref, None).await?;
if events.is_empty() {
return Ok(None);
}
// Get the latest partition status
let latest_status_result = self.query_engine.get_latest_partition_status(partition_ref).await?;
let (status, last_updated) = latest_status_result.unwrap_or((PartitionStatus::PartitionUnknown, 0));
// Count builds that reference this partition
let all_events = self.query_engine.get_events_in_range(0, i64::MAX).await?;
let mut builds_count = 0;
for event in &all_events {
if let Some(crate::build_event::EventType::BuildRequestEvent(br_event)) = &event.event_type {
if br_event.requested_partitions.iter().any(|p| p.str == partition_ref) {
builds_count += 1;
}
}
}
// Create partition info
let partition_info = PartitionInfo {
partition_ref: PartitionRef { str: partition_ref.to_string() },
current_status: status,
last_updated,
builds_count,
last_successful_build: None, // TODO: Find last successful build
invalidation_count: 0, // TODO: Count invalidation events
};
// Convert events to PartitionStatusEvent
let mut status_events = Vec::new();
for event in events {
if let Some(crate::build_event::EventType::PartitionEvent(p_event)) = &event.event_type {
if let Ok(event_status) = PartitionStatus::try_from(p_event.status_code) {
status_events.push(PartitionStatusEvent {
timestamp: event.timestamp,
status: event_status,
message: p_event.message.clone(),
build_request_id: event.build_request_id.unwrap(),
job_run_id: if p_event.job_run_id.is_empty() { None } else { Some(p_event.job_run_id.clone()) },
});
}
}
}
// Sort events by timestamp
status_events.sort_by_key(|e| e.timestamp);
Ok(Some((partition_info, status_events)))
}
/// Invalidate a partition with a reason
///
/// This method validates that the partition exists in the build event log,
/// then appends a partition invalidation event.
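///
/// Illustrative sketch (the partition ref and build request id below are
/// placeholders borrowed from the tests, not real data):
///
/// ```ignore
/// let repo = PartitionsRepository::new(query_engine);
/// repo.invalidate(
///     "data/users",
///     "upstream schema change".to_string(),
///     "build-123".to_string(),
/// ).await?;
/// ```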
pub async fn invalidate(&self, partition_ref: &str, reason: String, build_request_id: String) -> Result<()> {
// Check if the partition exists by looking for any events that reference it
let partition_events = self.query_engine.get_partition_events(partition_ref, None).await?;
let all_events = self.query_engine.get_events_in_range(0, i64::MAX).await?;
// Check if partition is referenced in any build request events
let mut partition_exists = !partition_events.is_empty();
if !partition_exists {
for event in &all_events {
if let Some(crate::build_event::EventType::BuildRequestEvent(br_event)) = &event.event_type {
if br_event.requested_partitions.iter().any(|p| p.str == partition_ref) {
partition_exists = true;
break;
}
}
}
}
if !partition_exists {
return Err(crate::event_log::BuildEventLogError::QueryError(
format!("Cannot invalidate non-existent partition: {}", partition_ref)
));
}
// Create a partition invalidation event
use crate::event_log::create_build_event;
let invalidation_event = create_build_event(
build_request_id,
crate::build_event::EventType::PartitionInvalidationEvent(crate::PartitionInvalidationEvent {
partition_ref: Some(crate::PartitionRef { str: partition_ref.to_string() }),
reason,
})
);
// Append the invalidation event
self.query_engine.append_event(invalidation_event).await?;
Ok(())
}
/// Show detailed information about a specific partition using protobuf response format
///
/// Returns the complete partition details with dual status fields and timeline events.
pub async fn show_protobuf(&self, _partition_ref: &str) -> Result<Option<PartitionDetailResponse>> {
// TODO: Implement with query engine - for now return None
Ok(None)
}
/// List partitions returning protobuf response format with dual status fields
///
/// This method provides the unified CLI/Service response format with both
/// status codes (enum values) and status names (human-readable strings).
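///
/// Sketch of consuming the dual status fields (illustrative only; assumes the
/// remaining `PartitionsListRequest` fields can be default-constructed):
///
/// ```ignore
/// let request = PartitionsListRequest { limit: Some(50), ..Default::default() };
/// let response = repo.list_protobuf(request).await?;
/// for p in &response.partitions {
///     // status_code carries the enum value; status_name the display string.
///     println!("{:?}: {} ({})", p.partition_ref, p.status_name, p.status_code);
/// }
/// ```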
pub async fn list_protobuf(&self, request: PartitionsListRequest) -> Result<PartitionsListResponse> {
// Get partition info using existing list method
let partition_infos = self.list(request.limit.map(|l| l as usize)).await?;
// Convert to protobuf format
let protobuf_partitions: Vec<crate::PartitionSummary> = partition_infos
.into_iter()
.map(|info| crate::PartitionSummary {
partition_ref: Some(info.partition_ref),
status_code: info.current_status as i32,
status_name: info.current_status.to_display_string(),
last_updated: info.last_updated,
builds_count: info.builds_count as u32,
last_successful_build: info.last_successful_build,
invalidation_count: info.invalidation_count as u32,
})
.collect();
let total_count = protobuf_partitions.len() as u32;
Ok(PartitionsListResponse {
partitions: protobuf_partitions,
total_count,
has_more: false, // TODO: Implement pagination
})
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::event_log::mock::{create_mock_bel_query_engine, create_mock_bel_query_engine_with_events, test_events};
#[tokio::test]
async fn test_partitions_repository_list_empty() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let repo = PartitionsRepository::new(query_engine);
let partitions = repo.list(None).await.unwrap();
assert!(partitions.is_empty());
}
#[tokio::test]
async fn test_partitions_repository_list_with_data() {
let build_id = "test-build-123".to_string();
let partition1 = PartitionRef { str: "data/users".to_string() };
let partition2 = PartitionRef { str: "data/orders".to_string() };
// Create events for multiple partitions
let events = vec![
test_events::build_request_received(Some(build_id.clone()), vec![partition1.clone(), partition2.clone()]),
test_events::partition_status(Some(build_id.clone()), partition1.clone(), PartitionStatus::PartitionBuilding, None),
test_events::partition_status(Some(build_id.clone()), partition1.clone(), PartitionStatus::PartitionAvailable, None),
test_events::partition_status(Some(build_id.clone()), partition2.clone(), PartitionStatus::PartitionBuilding, None),
test_events::partition_status(Some(build_id.clone()), partition2.clone(), PartitionStatus::PartitionFailed, None),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = PartitionsRepository::new(query_engine.clone());
let partitions = repo.list(None).await.unwrap();
assert_eq!(partitions.len(), 2);
// Find partitions by name
let users_partition = partitions.iter().find(|p| p.partition_ref.str == "data/users").unwrap();
let orders_partition = partitions.iter().find(|p| p.partition_ref.str == "data/orders").unwrap();
assert_eq!(users_partition.current_status, PartitionStatus::PartitionAvailable);
assert_eq!(orders_partition.current_status, PartitionStatus::PartitionFailed);
assert_eq!(users_partition.builds_count, 1);
assert_eq!(orders_partition.builds_count, 1);
}
#[tokio::test]
async fn test_partitions_repository_show() {
let build_id = "test-build-456".to_string();
let partition = PartitionRef { str: "analytics/metrics".to_string() };
let events = vec![
test_events::build_request_received(Some(build_id.clone()), vec![partition.clone()]),
test_events::partition_status(Some(build_id.clone()), partition.clone(), PartitionStatus::PartitionRequested, None),
test_events::partition_status(Some(build_id.clone()), partition.clone(), PartitionStatus::PartitionBuilding, None),
test_events::partition_status(Some(build_id.clone()), partition.clone(), PartitionStatus::PartitionAvailable, None),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = PartitionsRepository::new(query_engine);
let result = repo.show(&partition.str).await.unwrap();
assert!(result.is_some());
let (info, timeline) = result.unwrap();
assert_eq!(info.partition_ref.str, "analytics/metrics");
assert_eq!(info.current_status, PartitionStatus::PartitionAvailable);
assert_eq!(info.builds_count, 1);
assert_eq!(timeline.len(), 3);
// Verify timeline order
assert_eq!(timeline[0].status, PartitionStatus::PartitionRequested);
assert_eq!(timeline[1].status, PartitionStatus::PartitionBuilding);
assert_eq!(timeline[2].status, PartitionStatus::PartitionAvailable);
}
#[tokio::test]
async fn test_partitions_repository_show_nonexistent() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let repo = PartitionsRepository::new(query_engine);
let result = repo.show("nonexistent/partition").await.unwrap();
assert!(result.is_none());
}
#[tokio::test]
async fn test_partitions_repository_invalidate() {
let build_id = "test-build-789".to_string();
let partition = PartitionRef { str: "temp/data".to_string() };
// Start with an existing partition
let events = vec![
test_events::partition_status(Some(build_id.clone()), partition.clone(), PartitionStatus::PartitionAvailable, None),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = PartitionsRepository::new(query_engine.clone());
// Invalidate the partition
repo.invalidate(&partition.str, "Test invalidation".to_string(), build_id.clone()).await.unwrap();
// Verify the invalidation was recorded
// Note: This test demonstrates the pattern, but the MockBuildEventLog would need
// to be enhanced to properly store invalidation events for full verification
// Try to invalidate a non-existent partition
let result = repo.invalidate("nonexistent/partition", "Should fail".to_string(), build_id).await;
assert!(result.is_err());
}
}


@ -1,519 +0,0 @@
use crate::*;
use crate::event_log::{BuildEventLogError, Result};
use crate::event_log::query_engine::BELQueryEngine;
use crate::{JobRunDetailResponse, JobRunTimelineEvent as ServiceTaskTimelineEvent};
use std::sync::Arc;
use std::collections::HashMap;
use serde::Serialize;
/// Repository for querying task (job run) data from the build event log
pub struct TasksRepository {
query_engine: Arc<BELQueryEngine>,
}
/// Summary of a task's execution
#[derive(Debug, Clone, Serialize)]
pub struct TaskInfo {
pub job_run_id: String,
pub job_label: String,
pub build_request_id: String,
pub status: JobStatus,
pub target_partitions: Vec<PartitionRef>,
pub scheduled_at: i64,
pub started_at: Option<i64>,
pub completed_at: Option<i64>,
pub duration_ms: Option<i64>,
pub message: String,
pub config: Option<JobConfig>,
pub manifests: Vec<PartitionManifest>,
pub cancelled: bool,
pub cancel_reason: Option<String>,
}
/// Detailed timeline of a task's execution events
#[derive(Debug, Clone, Serialize)]
pub struct TaskEvent {
pub timestamp: i64,
pub event_type: String,
pub status: Option<JobStatus>,
pub message: String,
pub cancel_reason: Option<String>,
}
impl TasksRepository {
/// Create a new TasksRepository
pub fn new(query_engine: Arc<BELQueryEngine>) -> Self {
Self { query_engine }
}
/// List all tasks with their current status
///
/// Returns a list of all job runs (tasks) that have been executed,
/// including their current status and execution details.
pub async fn list(&self, limit: Option<usize>) -> Result<Vec<TaskInfo>> {
// Get all events from the event log
let events = self.query_engine.get_events_in_range(0, i64::MAX).await?;
let mut task_data: HashMap<String, TaskInfo> = HashMap::new();
let mut task_cancellations: HashMap<String, String> = HashMap::new();
// First pass: collect all task cancel events
for event in &events {
if let Some(build_event::EventType::JobRunCancelEvent(tc_event)) = &event.event_type {
task_cancellations.insert(tc_event.job_run_id.clone(), tc_event.reason.clone());
}
}
// Second pass: collect all job events and build task information
for event in events {
if let Some(build_event::EventType::JobEvent(j_event)) = &event.event_type {
let job_label = j_event.job_label.as_ref()
.map(|l| l.label.clone())
.unwrap_or_else(|| "unknown".to_string());
let status = match j_event.status_code {
1 => JobStatus::JobScheduled,
2 => JobStatus::JobRunning,
3 => JobStatus::JobCompleted,
4 => JobStatus::JobFailed,
5 => JobStatus::JobCancelled,
6 => JobStatus::JobSkipped,
_ => JobStatus::JobUnknown,
};
// Create or update task info
let task = task_data.entry(j_event.job_run_id.clone()).or_insert_with(|| {
TaskInfo {
job_run_id: j_event.job_run_id.clone(),
job_label: job_label.clone(),
build_request_id: event.build_request_id.clone().unwrap(),
status: JobStatus::JobUnknown,
target_partitions: j_event.target_partitions.clone(),
scheduled_at: event.timestamp,
started_at: None,
completed_at: None,
duration_ms: None,
message: String::new(),
config: None,
manifests: vec![],
cancelled: false,
cancel_reason: None,
}
});
// Update task with new information
task.status = status;
task.message = j_event.message.clone();
match status {
JobStatus::JobScheduled => {
task.scheduled_at = event.timestamp;
if let Some(config) = &j_event.config {
task.config = Some(config.clone());
}
}
JobStatus::JobRunning => {
task.started_at = Some(event.timestamp);
}
JobStatus::JobCompleted | JobStatus::JobFailed | JobStatus::JobCancelled => {
task.completed_at = Some(event.timestamp);
if let Some(started) = task.started_at {
task.duration_ms = Some((event.timestamp - started) / 1_000_000); // Convert to ms
}
task.manifests = j_event.manifests.clone();
}
_ => {}
}
// Check if this task was cancelled
if let Some(cancel_reason) = task_cancellations.get(&j_event.job_run_id) {
task.cancelled = true;
task.cancel_reason = Some(cancel_reason.clone());
}
}
}
// Convert to vector and sort by scheduled time (most recent first)
let mut tasks: Vec<TaskInfo> = task_data.into_values().collect();
tasks.sort_by(|a, b| b.scheduled_at.cmp(&a.scheduled_at));
// Apply limit if specified
if let Some(limit) = limit {
tasks.truncate(limit);
}
Ok(tasks)
}
/// Show detailed information about a specific task
///
/// Returns the complete timeline of events for the specified task,
/// including all status changes and any cancellation events.
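///
/// A minimal usage sketch (illustrative only; `"job-run-123"` is a placeholder id
/// and `query_engine` is an `Arc<BELQueryEngine>` already in scope):
///
/// ```ignore
/// let repo = TasksRepository::new(query_engine);
/// if let Some((task, timeline)) = repo.show("job-run-123").await? {
///     println!("{} is {:?}", task.job_run_id, task.status);
///     for event in &timeline {
///         println!("{} {} {:?}", event.timestamp, event.event_type, event.status);
///     }
/// }
/// ```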
pub async fn show(&self, job_run_id: &str) -> Result<Option<(TaskInfo, Vec<TaskEvent>)>> {
// Get all events for this specific job run
let job_events = self.query_engine.get_job_run_events(job_run_id).await?;
if job_events.is_empty() {
return Ok(None);
}
let mut task_info: Option<TaskInfo> = None;
let mut timeline: Vec<TaskEvent> = Vec::new();
// Process job events to build task information
for event in &job_events {
if let Some(build_event::EventType::JobEvent(j_event)) = &event.event_type {
let job_label = j_event.job_label.as_ref()
.map(|l| l.label.clone())
.unwrap_or_else(|| "unknown".to_string());
let status = match j_event.status_code {
1 => JobStatus::JobScheduled,
2 => JobStatus::JobRunning,
3 => JobStatus::JobCompleted,
4 => JobStatus::JobFailed,
5 => JobStatus::JobCancelled,
6 => JobStatus::JobSkipped,
_ => JobStatus::JobUnknown,
};
// Create or update task info
if task_info.is_none() {
task_info = Some(TaskInfo {
job_run_id: j_event.job_run_id.clone(),
job_label: job_label.clone(),
build_request_id: event.build_request_id.clone().unwrap(),
status: JobStatus::JobUnknown,
target_partitions: j_event.target_partitions.clone(),
scheduled_at: event.timestamp,
started_at: None,
completed_at: None,
duration_ms: None,
message: String::new(),
config: None,
manifests: vec![],
cancelled: false,
cancel_reason: None,
});
}
let task = task_info.as_mut().unwrap();
task.status = status;
task.message = j_event.message.clone();
match status {
JobStatus::JobScheduled => {
task.scheduled_at = event.timestamp;
if let Some(config) = &j_event.config {
task.config = Some(config.clone());
}
}
JobStatus::JobRunning => {
task.started_at = Some(event.timestamp);
}
JobStatus::JobCompleted | JobStatus::JobFailed | JobStatus::JobCancelled => {
task.completed_at = Some(event.timestamp);
if let Some(started) = task.started_at {
task.duration_ms = Some((event.timestamp - started) / 1_000_000); // Convert to ms
}
task.manifests = j_event.manifests.clone();
}
_ => {}
}
// Add to timeline
timeline.push(TaskEvent {
timestamp: event.timestamp,
event_type: "job_status_change".to_string(),
status: Some(status),
message: j_event.message.clone(),
cancel_reason: None,
});
}
}
// Also check for task cancel events in all events
let all_events = self.query_engine.get_events_in_range(0, i64::MAX).await?;
for event in all_events {
if let Some(build_event::EventType::JobRunCancelEvent(tc_event)) = &event.event_type {
if tc_event.job_run_id == job_run_id {
if let Some(task) = task_info.as_mut() {
task.cancelled = true;
task.cancel_reason = Some(tc_event.reason.clone());
}
timeline.push(TaskEvent {
timestamp: event.timestamp,
event_type: "task_cancel".to_string(),
status: None,
message: "Task cancelled".to_string(),
cancel_reason: Some(tc_event.reason.clone()),
});
}
}
}
// Sort timeline by timestamp
timeline.sort_by_key(|e| e.timestamp);
Ok(task_info.map(|info| (info, timeline)))
}
/// Cancel a task with a reason
///
/// This method uses the EventWriter to write a task cancellation event.
/// It validates that the task exists and is in a cancellable state.
pub async fn cancel(&self, job_run_id: &str, reason: String, build_request_id: String) -> Result<()> {
// First check if the task exists and get its current status
let task_info = self.show(job_run_id).await?;
if task_info.is_none() {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel non-existent task: {}", job_run_id)
));
}
let (task, _timeline) = task_info.unwrap();
// Check if task is in a cancellable state
match task.status {
JobStatus::JobCompleted => {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel completed task: {}", job_run_id)
));
}
JobStatus::JobFailed => {
return Err(BuildEventLogError::QueryError(
format!("Cannot cancel failed task: {}", job_run_id)
));
}
JobStatus::JobCancelled => {
return Err(BuildEventLogError::QueryError(
format!("Task already cancelled: {}", job_run_id)
));
}
_ => {}
}
// Use EventWriter to write the cancellation event
let event_writer = crate::event_log::writer::EventWriter::new(self.query_engine.clone());
event_writer.cancel_task(build_request_id, job_run_id.to_string(), reason).await
}
/// Show detailed information about a specific task using protobuf response format
///
/// Returns the complete task details with dual status fields and timeline events.
pub async fn show_protobuf(&self, job_run_id: &str) -> Result<Option<JobRunDetailResponse>> {
// Get task info and timeline using existing show method
if let Some((task_info, timeline)) = self.show(job_run_id).await? {
// Convert timeline events to protobuf format
let protobuf_timeline: Vec<ServiceTaskTimelineEvent> = timeline
.into_iter()
.map(|event| ServiceTaskTimelineEvent {
timestamp: event.timestamp,
status_code: event.status.map(|s| s as i32),
status_name: event.status.map(|s| s.to_display_string()),
message: event.message,
event_type: event.event_type,
cancel_reason: event.cancel_reason,
})
.collect();
let response = JobRunDetailResponse {
job_run_id: task_info.job_run_id,
job_label: task_info.job_label,
build_request_id: task_info.build_request_id,
status_code: task_info.status as i32,
status_name: task_info.status.to_display_string(),
target_partitions: task_info.target_partitions,
scheduled_at: task_info.scheduled_at,
started_at: task_info.started_at,
completed_at: task_info.completed_at,
duration_ms: task_info.duration_ms,
cancelled: task_info.cancelled,
cancel_reason: task_info.cancel_reason,
message: task_info.message,
timeline: protobuf_timeline,
};
Ok(Some(response))
} else {
Ok(None)
}
}
/// List tasks using protobuf response format with dual status fields
///
/// Returns JobRunsListResponse protobuf message with JobRunSummary objects containing
/// status_code and status_name fields.
pub async fn list_protobuf(&self, request: JobRunsListRequest) -> Result<JobRunsListResponse> {
// Get task info using existing list method
let tasks = self.list(request.limit.map(|l| l as usize)).await?;
// Convert to protobuf format
let protobuf_tasks: Vec<crate::JobRunSummary> = tasks
.into_iter()
.map(|task| crate::JobRunSummary {
job_run_id: task.job_run_id,
job_label: task.job_label,
build_request_id: task.build_request_id,
status_code: task.status as i32,
status_name: task.status.to_display_string(),
target_partitions: task.target_partitions.into_iter().map(|p| crate::PartitionRef { str: p.str }).collect(),
scheduled_at: task.scheduled_at,
started_at: task.started_at,
completed_at: task.completed_at,
duration_ms: task.duration_ms,
cancelled: task.cancelled,
message: task.message,
})
.collect();
let total_count = protobuf_tasks.len() as u32;
Ok(JobRunsListResponse {
tasks: protobuf_tasks,
total_count,
})
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::event_log::mock::{create_mock_bel_query_engine, create_mock_bel_query_engine_with_events, test_events};
#[tokio::test]
async fn test_tasks_repository_list_empty() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let repo = TasksRepository::new(query_engine);
let tasks = repo.list(None).await.unwrap();
assert!(tasks.is_empty());
}
#[tokio::test]
async fn test_tasks_repository_list_with_data() {
let build_id = "test-build-123".to_string();
let job_label = JobLabel { label: "//:process_data".to_string() };
let partition = PartitionRef { str: "data/users".to_string() };
// Create events for multiple tasks
let events = vec![
test_events::job_event(Some(build_id.clone()), Some("task-1".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("task-1".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobCompleted),
test_events::job_event(Some(build_id.clone()), Some("task-2".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("task-2".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobFailed),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = TasksRepository::new(query_engine);
let tasks = repo.list(None).await.unwrap();
assert_eq!(tasks.len(), 2);
// Find tasks by job run id
let task1 = tasks.iter().find(|t| t.job_run_id == "task-1").unwrap();
let task2 = tasks.iter().find(|t| t.job_run_id == "task-2").unwrap();
assert_eq!(task1.status, JobStatus::JobCompleted);
assert_eq!(task1.job_label, "//:process_data");
assert!(!task1.cancelled);
assert_eq!(task2.status, JobStatus::JobFailed);
assert_eq!(task2.job_label, "//:process_data");
assert!(!task2.cancelled);
}
#[tokio::test]
async fn test_tasks_repository_show() {
let build_id = "test-build-456".to_string();
let job_label = JobLabel { label: "//:analytics_task".to_string() };
let partition = PartitionRef { str: "analytics/daily".to_string() };
let events = vec![
test_events::job_event(Some(build_id.clone()), Some("task-123".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("task-123".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobRunning),
test_events::job_event(Some(build_id.clone()), Some("task-123".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobCompleted),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = TasksRepository::new(query_engine);
let result = repo.show("task-123").await.unwrap();
assert!(result.is_some());
let (info, timeline) = result.unwrap();
assert_eq!(info.job_run_id, "task-123");
assert_eq!(info.job_label, "//:analytics_task");
assert_eq!(info.status, JobStatus::JobCompleted);
assert!(!info.cancelled);
assert_eq!(timeline.len(), 3);
assert_eq!(timeline[0].status, Some(JobStatus::JobScheduled));
assert_eq!(timeline[1].status, Some(JobStatus::JobRunning));
assert_eq!(timeline[2].status, Some(JobStatus::JobCompleted));
}
#[tokio::test]
async fn test_tasks_repository_show_nonexistent() {
let query_engine = create_mock_bel_query_engine().await.unwrap();
let repo = TasksRepository::new(query_engine);
let result = repo.show("nonexistent-task").await.unwrap();
assert!(result.is_none());
}
#[tokio::test]
async fn test_tasks_repository_cancel() {
let build_id = "test-build-789".to_string();
let job_label = JobLabel { label: "//:batch_task".to_string() };
let partition = PartitionRef { str: "batch/data".to_string() };
// Start with a running task
let events = vec![
test_events::job_event(Some(build_id.clone()), Some("task-456".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("task-456".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobRunning),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = TasksRepository::new(query_engine.clone());
// Cancel the task
repo.cancel("task-456", "User requested cancellation".to_string(), build_id.clone()).await.unwrap();
// Verify the cancellation was recorded
// Note: This test demonstrates the pattern, but the MockBELStorage would need
// to be enhanced to properly store task cancel events for full verification
// Try to cancel a non-existent task
let result = repo.cancel("nonexistent-task", "Should fail".to_string(), build_id).await;
assert!(result.is_err());
}
#[tokio::test]
async fn test_tasks_repository_cancel_completed_task() {
let build_id = "test-build-999".to_string();
let job_label = JobLabel { label: "//:completed_task".to_string() };
let partition = PartitionRef { str: "test/data".to_string() };
// Create a completed task
let events = vec![
test_events::job_event(Some(build_id.clone()), Some("completed-task".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobScheduled),
test_events::job_event(Some(build_id.clone()), Some("completed-task".to_string()), job_label.clone(), vec![partition.clone()], JobStatus::JobCompleted),
];
let query_engine = create_mock_bel_query_engine_with_events(events).await.unwrap();
let repo = TasksRepository::new(query_engine);
// Try to cancel the completed task - should fail
let result = repo.cancel("completed-task", "Should fail".to_string(), build_id).await;
assert!(result.is_err());
if let Err(BuildEventLogError::QueryError(msg)) = result {
assert!(msg.contains("Cannot cancel completed task"));
} else {
panic!("Expected QueryError for completed task cancellation");
}
}
}


@ -1,77 +0,0 @@
# In modules/jq/BUILD.bazel
load("@bazel_skylib//lib:selects.bzl", "selects")
exports_files([
"simple_executable_wrapper.sh.tpl",
])
# Platform detection
config_setting(
name = "darwin",
constraint_values = ["@platforms//os:osx"],
)
config_setting(
name = "linux",
constraint_values = ["@platforms//os:linux"],
)
# Download jq binaries
genrule(
name = "download_jq_linux",
outs = ["jq-linux64"],
cmd = """
curl -L "https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64" -o "$@"
chmod +x "$@"
""",
)
genrule(
name = "download_jq_macos",
outs = ["jq-osx-amd64"],
cmd = """
curl -L "https://github.com/stedolan/jq/releases/download/jq-1.6/jq-osx-amd64" -o "$@"
chmod +x "$@"
""",
)
# Create jq binary target for each platform
genrule(
name = "jq_linux_bin",
srcs = [":jq-linux64"],
outs = ["jq_linux"],
cmd = "cp $< $@ && chmod +x $@",
executable = True,
)
genrule(
name = "jq_macos_bin",
srcs = [":jq-osx-amd64"],
outs = ["jq_macos"],
cmd = "cp $< $@ && chmod +x $@",
executable = True,
)
# Create platform-specific filegroups
filegroup(
name = "jq_bin_linux",
srcs = [":jq_linux"],
visibility = ["//visibility:public"],
)
filegroup(
name = "jq_bin_macos",
srcs = [":jq_macos"],
visibility = ["//visibility:public"],
)
# Create a binary target for jq
sh_binary(
name = "jq",
srcs = select({
":darwin": [":jq_macos"],
":linux": [":jq_linux"],
"//conditions:default": [":jq_linux"],
}),
visibility = ["//visibility:public"],
)


@ -1,40 +0,0 @@
#!/bin/bash
set -e
%{RUNFILES_PREFIX}
%{PREFIX}
# Check if rlocation function is available
if ! type rlocation >/dev/null 2>&1; then
echo "Error: rlocation function not available. Runfiles may not be properly initialized." >&2
exit 1
fi
# Resolve the executable using rlocation
EXECUTABLE_BINARY="$(rlocation "_main/%{EXECUTABLE_SHORT_PATH}")"
# Check if rlocation returned something
if [[ -z "${EXECUTABLE_BINARY}" ]]; then
echo "Error: rlocation returned empty result for '_main/%{EXECUTABLE_SHORT_PATH}'" >&2
exit 1
fi
# Check if the resolved binary exists
if [[ ! -f "${EXECUTABLE_BINARY}" ]]; then
echo "Error: Resolved executable '${EXECUTABLE_BINARY}' does not exist" >&2
exit 1
fi
# Check if the resolved binary is executable
if [[ ! -x "${EXECUTABLE_BINARY}" ]]; then
echo "Error: Resolved executable '${EXECUTABLE_BINARY}' is not executable" >&2
exit 1
fi
# Run the configuration
if [[ -n "${EXECUTABLE_SUBCOMMAND:-}" ]]; then
exec "${EXECUTABLE_BINARY}" "${EXECUTABLE_SUBCOMMAND}" "$@"
else
exec "${EXECUTABLE_BINARY}" "$@"
fi

File diff suppressed because it is too large.


@ -1,140 +0,0 @@
use databuild::service::BuildGraphService;
use std::collections::HashMap;
use std::env;
use std::net::SocketAddr;
use clap::{Arg, Command};
use log::info;
use simple_logger::SimpleLogger;
#[tokio::main]
async fn main() {
SimpleLogger::new().init().unwrap();
let matches = Command::new("build-graph-service")
.version("1.0")
.about("DataBuild Build Graph Service")
.arg(
Arg::new("port")
.short('p')
.long("port")
.value_name("PORT")
.help("Port to listen on")
.default_value("8080")
)
.arg(
Arg::new("host")
.long("host")
.value_name("HOST")
.help("Host to bind to")
.default_value("0.0.0.0")
)
.arg(
Arg::new("event-log")
.long("event-log")
.value_name("URI")
.help("Build event log URI")
.default_value("sqlite:///tmp/databuild.db")
)
.arg(
Arg::new("graph-label")
.long("graph-label")
.value_name("LABEL")
.help("Graph label")
.default_value("//example:graph")
)
.arg(
Arg::new("job-lookup-path")
.long("job-lookup-path")
.value_name("PATH")
.help("Job lookup binary path")
.default_value("job_lookup")
)
.arg(
Arg::new("print-openapi-spec")
.long("print-openapi-spec")
.help("Print OpenAPI spec to stdout and exit")
.action(clap::ArgAction::SetTrue)
)
.get_matches();
let port: u16 = matches.get_one::<String>("port").unwrap()
.parse().expect("Invalid port number");
let host = matches.get_one::<String>("host").unwrap();
// Check environment variable first, fall back to command line argument
let event_log_uri = env::var("DATABUILD_BUILD_EVENT_LOG")
.unwrap_or_else(|_| matches.get_one::<String>("event-log").unwrap().to_string());
let graph_label = matches.get_one::<String>("graph-label").unwrap().to_string();
let job_lookup_path = matches.get_one::<String>("job-lookup-path").unwrap().to_string();
// Get candidate jobs from environment
let candidate_jobs: HashMap<String, String> = env::var("DATABUILD_CANDIDATE_JOBS")
.map(|s| serde_json::from_str(&s).unwrap_or_else(|_| HashMap::new()))
.unwrap_or_else(|_| HashMap::new());
// Handle OpenAPI spec generation
if matches.get_flag("print-openapi-spec") {
// Disable logging for OpenAPI generation to keep output clean
log::set_max_level(log::LevelFilter::Off);
// Create a minimal service instance for OpenAPI generation
let service = match BuildGraphService::new(
"sqlite://:memory:", // Use in-memory database for spec generation
graph_label,
job_lookup_path,
candidate_jobs,
).await {
Ok(service) => service,
Err(e) => {
eprintln!("Failed to create service for OpenAPI generation: {}", e);
std::process::exit(1);
}
};
// Generate and print OpenAPI spec
let spec = service.generate_openapi_spec();
match serde_json::to_string_pretty(&spec) {
Ok(json) => {
println!("{}", json);
std::process::exit(0);
}
Err(e) => {
eprintln!("Failed to serialize OpenAPI spec: {}", e);
std::process::exit(1);
}
}
}
info!("Starting Build Graph Service on {}:{}", host, port);
info!("Event log URI: {}", event_log_uri);
info!("Graph label: {}", graph_label);
info!("Job lookup path: {}", job_lookup_path);
info!("Candidate jobs: {} configured", candidate_jobs.len());
// Create service
let service = match BuildGraphService::new(
&event_log_uri,
graph_label,
job_lookup_path,
candidate_jobs,
).await {
Ok(service) => service,
Err(e) => {
eprintln!("Failed to create service: {}", e);
std::process::exit(1);
}
};
// Create router
let app = service.create_router();
// Start server
let addr: SocketAddr = format!("{}:{}", host, port).parse().unwrap();
info!("Build Graph Service listening on {}", addr);
let listener = tokio::net::TcpListener::bind(&addr).await.unwrap();
axum::serve(listener, app.into_make_service())
.await
.unwrap();
}


@ -1,479 +0,0 @@
use crate::*;
use crate::event_log::BuildEventLogError;
use aide::{
axum::{
routing::{get, post, delete},
ApiRouter,
},
openapi::OpenApi,
};
use axum::{Extension, response::Response, http::StatusCode};
use axum_jsonschema::Json;
use serde::{Deserialize, Serialize};
use schemars::JsonSchema;
use std::collections::HashMap;
use std::sync::Arc;
use rusqlite::ToSql;
use tokio::sync::RwLock;
use uuid::Uuid;
pub mod handlers;
#[derive(Clone)]
pub struct BuildGraphService {
pub query_engine: Arc<crate::event_log::query_engine::BELQueryEngine>,
pub event_log_uri: String,
pub active_builds: Arc<RwLock<HashMap<String, BuildRequestState>>>,
pub graph_label: String,
pub job_lookup_path: String,
pub candidate_jobs: HashMap<String, String>,
}
#[derive(Debug, Clone)]
pub struct BuildRequestState {
pub build_request_id: String,
pub status: BuildRequestStatus,
pub requested_partitions: Vec<String>,
pub created_at: i64,
pub updated_at: i64,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct BuildRequest {
pub partitions: Vec<String>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct BuildRequestResponse {
pub build_request_id: String,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct BuildEventSummary {
pub event_id: String,
pub timestamp: i64,
pub event_type: String,
pub message: String,
pub build_request_id: String, // Build request ID for navigation
// Navigation-relevant fields (populated based on event type)
pub job_label: Option<String>, // For job events
pub partition_ref: Option<String>, // For partition events
pub delegated_build_id: Option<String>, // For delegation events
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct PartitionStatusResponse {
pub partition_ref: String,
pub status_code: i32,
pub status_name: String,
pub last_updated: Option<i64>,
pub build_requests: Vec<String>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct PartitionEventsResponse {
pub partition_ref: String,
pub events: Vec<BuildEventSummary>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct AnalyzeRequest {
pub partitions: Vec<String>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct AnalyzeResponse {
#[schemars(schema_with = "job_graph_schema")]
pub job_graph: serde_json::Value,
}
fn job_graph_schema(_gen: &mut schemars::gen::SchemaGenerator) -> schemars::schema::Schema {
schemars::schema::Schema::Object(schemars::schema::SchemaObject {
instance_type: Some(schemars::schema::SingleOrVec::Single(Box::new(schemars::schema::InstanceType::Object))),
..Default::default()
})
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct ErrorResponse {
pub error: String,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct BuildCancelResponse {
pub cancelled: bool,
pub build_request_id: String,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct BuildCancelRepositoryResponse {
pub cancelled: bool,
pub build_request_id: String,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct PartitionInvalidateResponse {
pub invalidated: bool,
pub partition_ref: String,
pub reason: String,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct TaskCancelResponse {
pub cancelled: bool,
pub job_run_id: String,
pub reason: String,
}
// List endpoints request/response types
// Removed: duplicate of crate::BuildsListResponse from proto
// Wrapper structs for API responses that contain protobuf data + service metadata
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct BuildsListApiResponse {
pub data: crate::BuildsListResponse,
pub request_id: Option<String>,
pub pagination: Option<PaginationInfo>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct PartitionsListApiResponse {
pub data: crate::PartitionsListResponse,
pub request_id: Option<String>,
pub pagination: Option<PaginationInfo>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct JobsListApiResponse {
pub data: crate::JobsListResponse,
pub request_id: Option<String>,
pub pagination: Option<PaginationInfo>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct JobRunsListApiResponse {
pub data: crate::JobRunsListResponse,
pub request_id: Option<String>,
pub pagination: Option<PaginationInfo>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct ActivityApiResponse {
pub data: crate::ActivityResponse,
pub request_id: Option<String>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct PaginationInfo {
pub total_count: u32,
pub has_more: bool,
pub limit: Option<u32>,
pub offset: Option<u32>,
}
// Removed: Legacy types that duplicate proto definitions
// - BuildSummary (use crate::BuildSummary from proto)
// - PartitionsListResponse (use crate::PartitionsListResponse from proto)
// - PartitionSummary (use crate::PartitionSummary from proto)
// Job-related request/response types
// Removed: JobsListResponse and JobSummary (use crate:: proto versions)
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct JobMetricsResponse {
pub job_label: String,
pub success_rate: f64,
pub avg_duration_ms: Option<i64>,
pub total_runs: u32,
pub recent_runs: Vec<JobRunSummary>,
pub daily_stats: Vec<JobDailyStats>,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct JobRunSummary {
pub build_request_id: String,
pub partitions: Vec<String>,
pub status_code: i32,
pub status_name: String,
pub duration_ms: Option<i64>,
pub started_at: i64,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct JobDailyStats {
pub date: String,
pub success_rate: f64,
pub avg_duration_ms: Option<i64>,
pub total_runs: u32,
}
impl BuildGraphService {
pub async fn new(
event_log_uri: &str,
graph_label: String,
job_lookup_path: String,
candidate_jobs: HashMap<String, String>,
) -> Result<Self, BuildEventLogError> {
let query_engine = crate::event_log::storage::create_bel_query_engine(event_log_uri).await?;
Ok(Self {
query_engine,
event_log_uri: event_log_uri.to_string(),
active_builds: Arc::new(RwLock::new(HashMap::new())),
graph_label,
job_lookup_path,
candidate_jobs,
})
}
pub fn generate_openapi_spec(&self) -> OpenApi {
let mut api = OpenApi::default();
// Create API router with all routes to generate OpenAPI spec
let _ = ApiRouter::new()
.api_route("/api/v1/builds", post(handlers::submit_build_request))
.api_route("/api/v1/builds", get(handlers::list_builds_repository))
.api_route("/api/v1/builds/:build_request_id", get(handlers::get_build_detail))
.api_route("/api/v1/builds/:build_request_id", delete(handlers::cancel_build_repository))
.api_route("/api/v1/partitions", get(handlers::list_partitions_repository))
.api_route("/api/v1/partitions/:partition_ref", get(handlers::get_partition_detail))
.api_route("/api/v1/partitions/:partition_ref/status", get(handlers::get_partition_status))
.api_route("/api/v1/partitions/:partition_ref/events", get(handlers::get_partition_events))
.api_route("/api/v1/partitions/:partition_ref/invalidate", post(handlers::invalidate_partition))
.api_route("/api/v1/jobs", get(handlers::list_jobs_repository))
.api_route("/api/v1/jobs/:label", get(handlers::get_job_detail))
.api_route("/api/v1/jobs/:label/metrics", get(handlers::get_job_metrics))
.api_route("/api/v1/tasks", get(handlers::list_tasks_repository))
.api_route("/api/v1/tasks/:job_run_id", get(handlers::get_task_detail))
.api_route("/api/v1/tasks/:job_run_id/cancel", post(handlers::cancel_task))
.api_route("/api/v1/activity", get(handlers::get_activity_summary))
.api_route("/api/v1/analyze", post(handlers::analyze_build_graph))
.finish_api(&mut api);
api
}
pub fn create_router(self) -> axum::Router {
let mut api = OpenApi::default();
let api_router = ApiRouter::new()
.api_route("/api/v1/builds", post(handlers::submit_build_request))
.api_route("/api/v1/builds", get(handlers::list_builds_repository))
.api_route("/api/v1/builds/:build_request_id", get(handlers::get_build_detail))
.api_route("/api/v1/builds/:build_request_id/mermaid", get(handlers::get_build_mermaid_diagram))
.api_route("/api/v1/builds/:build_request_id", delete(handlers::cancel_build_repository))
.api_route("/api/v1/partitions", get(handlers::list_partitions_repository))
.api_route("/api/v1/partitions/:partition_ref", get(handlers::get_partition_detail))
.api_route("/api/v1/partitions/:partition_ref/status", get(handlers::get_partition_status))
.api_route("/api/v1/partitions/:partition_ref/events", get(handlers::get_partition_events))
.api_route("/api/v1/partitions/:partition_ref/invalidate", post(handlers::invalidate_partition))
.api_route("/api/v1/jobs", get(handlers::list_jobs_repository))
.api_route("/api/v1/jobs/:label", get(handlers::get_job_detail))
.api_route("/api/v1/jobs/:label/metrics", get(handlers::get_job_metrics))
.api_route("/api/v1/tasks", get(handlers::list_tasks_repository))
.api_route("/api/v1/tasks/:job_run_id", get(handlers::get_task_detail))
.api_route("/api/v1/tasks/:job_run_id/cancel", post(handlers::cancel_task))
.api_route("/api/v1/activity", get(handlers::get_activity_summary))
.api_route("/api/v1/analyze", post(handlers::analyze_build_graph))
// Job logs and metrics endpoints
.api_route("/api/v1/logs/jobs", get(handlers::list_available_jobs))
.api_route("/api/v1/logs/jobs/:job_run_id", get(handlers::get_job_logs))
.api_route("/api/v1/logs/jobs/:job_run_id/metrics", get(handlers::get_job_run_metrics))
.route("/api/v1/metrics", axum::routing::get(handlers::get_prometheus_metrics))
.route("/api/v1/openapi.json", get(Self::openapi_spec))
.with_state(Arc::new(self))
.finish_api(&mut api);
let static_router = axum::Router::new()
.route("/", axum::routing::get(Self::serve_index))
.route("/static/*file", axum::routing::get(Self::serve_static));
axum::Router::new()
.merge(api_router)
.merge(static_router)
.layer(Extension(api))
.layer(axum::middleware::from_fn(Self::cors_middleware))
}
pub async fn openapi_spec(Extension(api): Extension<OpenApi>) -> Json<OpenApi> {
Json(api)
}
fn resolve_fpath(fpath: &str) -> String {
let standard_prefix = "databuild+";
let test_prefix = "_main";
match (
std::fs::read_dir(Self::get_runfile_path(&format!("{}/databuild/dashboard", standard_prefix))),
std::fs::read_dir(Self::get_runfile_path(&format!("{}/databuild/dashboard", test_prefix))),
) {
(Ok(_), _) => Self::get_runfile_path(&format!("{}/databuild/dashboard/{}", standard_prefix, fpath)),
(Err(_), Ok(_)) => Self::get_runfile_path(&format!("{}/databuild/dashboard/{}", test_prefix, fpath)),
(_, Err(_)) => panic!("Failed to find dashboard files"),
}
}
pub async fn serve_index() -> Response {
match std::fs::read_to_string(&Self::resolve_fpath("index.html")) {
Ok(content) => Response::builder()
.header("content-type", "text/html")
.body(content.into())
.unwrap(),
Err(_) => Response::builder()
.status(StatusCode::INTERNAL_SERVER_ERROR)
.body("Failed to load dashboard".into())
.unwrap(),
}
}
pub async fn serve_static(axum::extract::Path(file): axum::extract::Path<String>) -> Response {
match std::fs::read(&Self::resolve_fpath(&file)) {
Ok(content) => {
let content_type = match file.split('.').last() {
Some("html") => "text/html",
Some("css") => "text/css",
Some("js") => "application/javascript",
Some("png") => "image/png",
Some("jpg") | Some("jpeg") => "image/jpeg",
Some("svg") => "image/svg+xml",
Some("ico") => "image/x-icon",
_ => "application/octet-stream",
};
Response::builder()
.header("content-type", content_type)
.body(content.into())
.unwrap()
}
Err(_) => Response::builder()
.status(StatusCode::NOT_FOUND)
.body("404 Not Found".into())
.unwrap(),
}
}
fn get_dashboard_file_path(relative_path: &str) -> String {
let runfiles_dir = std::env::var("DASHBOARD_FILES_DIR").unwrap();
format!("{}/{}", runfiles_dir, relative_path)
}
fn get_runfile_path(relative_path: &str) -> String {
if let Ok(runfiles_dir) = std::env::var("RUNFILES_DIR") {
format!("{}/{}", runfiles_dir, relative_path)
} else if let Ok(_manifest_file) = std::env::var("RUNFILES_MANIFEST_FILE") {
// Parse manifest file to find the actual path
// For now, just use the relative path
relative_path.to_string()
} else {
// Development mode - files might be in the workspace
relative_path.to_string()
}
}
pub async fn cors_middleware(
request: axum::http::Request<axum::body::Body>,
next: axum::middleware::Next,
) -> axum::response::Response {
let response = next.run(request).await;
let (mut parts, body) = response.into_parts();
parts.headers.insert(
axum::http::header::ACCESS_CONTROL_ALLOW_ORIGIN,
axum::http::HeaderValue::from_static("*"),
);
parts.headers.insert(
axum::http::header::ACCESS_CONTROL_ALLOW_METHODS,
axum::http::HeaderValue::from_static("GET, POST, DELETE, OPTIONS"),
);
parts.headers.insert(
axum::http::header::ACCESS_CONTROL_ALLOW_HEADERS,
axum::http::HeaderValue::from_static("Content-Type, Authorization"),
);
axum::response::Response::from_parts(parts, body)
}
pub fn generate_build_request_id() -> String {
Uuid::new_v4().to_string()
}
pub fn status_to_string(status: BuildRequestStatus) -> String {
match BuildRequestStatusCode::try_from(status.code) {
Ok(BuildRequestStatusCode::BuildRequestUnknown) => "unknown".to_string(),
Ok(BuildRequestStatusCode::BuildRequestReceived) => "received".to_string(),
Ok(BuildRequestStatusCode::BuildRequestPlanning) => "planning".to_string(),
Ok(BuildRequestStatusCode::BuildRequestAnalysisCompleted) => "analysis_completed".to_string(),
Ok(BuildRequestStatusCode::BuildRequestExecuting) => "executing".to_string(),
Ok(BuildRequestStatusCode::BuildRequestCompleted) => "completed".to_string(),
Ok(BuildRequestStatusCode::BuildRequestFailed) => "failed".to_string(),
Ok(BuildRequestStatusCode::BuildRequestCancelled) => "cancelled".to_string(),
Ok(BuildRequestStatusCode::BuildRequestPreconditionFailed) => "precondition_failed".to_string(),
Err(_) => "error".to_string(),
}
}
pub fn partition_status_to_string(status: PartitionStatus) -> String {
match status {
PartitionStatus::PartitionUnknown => "unknown".to_string(),
PartitionStatus::PartitionRequested => "requested".to_string(),
PartitionStatus::PartitionAnalyzed => "analyzed".to_string(),
PartitionStatus::PartitionBuilding => "building".to_string(),
PartitionStatus::PartitionAvailable => "available".to_string(),
PartitionStatus::PartitionFailed => "failed".to_string(),
PartitionStatus::PartitionDelegated => "delegated".to_string(),
}
}
}
pub type ServiceState = Arc<BuildGraphService>;
// Repository-based response types
// Removed: PartitionDetailResponse and PartitionTimelineEvent (use crate:: proto versions)
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct JobsRepositoryListResponse {
pub jobs: Vec<JobRepositorySummary>,
pub total_count: u32,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct JobRepositorySummary {
pub job_label: String,
pub total_runs: usize,
pub successful_runs: usize,
pub failed_runs: usize,
pub cancelled_runs: usize,
pub average_partitions_per_run: f64,
pub last_run_timestamp: i64,
pub last_run_status: String,
pub recent_builds: Vec<String>,
}
// Removed: JobDetailResponse, JobRunDetail, JobRunsListResponse, JobRunSummary (use crate:: proto versions)
// Removed: TaskDetailResponse and TaskTimelineEvent (use crate:: proto versions)
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct BuildsRepositoryListResponse {
pub builds: Vec<BuildRepositorySummary>,
pub total_count: u32,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
pub struct BuildRepositorySummary {
pub build_request_id: String,
pub status: String,
pub requested_partitions: Vec<String>,
pub total_jobs: usize,
pub completed_jobs: usize,
pub failed_jobs: usize,
pub cancelled_jobs: usize,
pub requested_at: i64,
pub started_at: Option<i64>,
pub completed_at: Option<i64>,
pub duration_ms: Option<i64>,
pub cancelled: bool,
}
// Removed: BuildDetailResponse and BuildTimelineEvent (use crate:: proto versions)


@ -1,35 +0,0 @@
use databuild::service::BuildGraphService;
use std::collections::HashMap;
#[tokio::main]
async fn main() {
// Disable logging to keep output clean
log::set_max_level(log::LevelFilter::Off);
// Create a minimal service instance for OpenAPI generation
let service = match BuildGraphService::new(
"sqlite://:memory:", // Use in-memory database for spec generation
"//example:graph".to_string(),
"job_lookup".to_string(),
HashMap::new(),
).await {
Ok(service) => service,
Err(e) => {
eprintln!("Failed to create service for OpenAPI generation: {}", e);
std::process::exit(1);
}
};
// Generate and print OpenAPI spec
let spec = service.generate_openapi_spec();
match serde_json::to_string_pretty(&spec) {
Ok(json) => {
println!("{}", json);
std::process::exit(0);
}
Err(e) => {
eprintln!("Failed to serialize OpenAPI spec: {}", e);
std::process::exit(1);
}
}
}


@ -1,291 +0,0 @@
use crate::*;
/// Utilities for converting status enums to human-readable strings
/// This provides consistent status naming across CLI and Service interfaces
impl PartitionStatus {
/// Convert partition status to human-readable string matching current CLI/service format
pub fn to_display_string(&self) -> String {
match self {
PartitionStatus::PartitionUnknown => "unknown".to_string(),
PartitionStatus::PartitionRequested => "requested".to_string(),
PartitionStatus::PartitionAnalyzed => "analyzed".to_string(),
PartitionStatus::PartitionBuilding => "building".to_string(),
PartitionStatus::PartitionAvailable => "available".to_string(),
PartitionStatus::PartitionFailed => "failed".to_string(),
PartitionStatus::PartitionDelegated => "delegated".to_string(),
}
}
/// Parse a display string back to enum (for filtering, etc.)
pub fn from_display_string(s: &str) -> Option<Self> {
match s {
"unknown" => Some(PartitionStatus::PartitionUnknown),
"requested" => Some(PartitionStatus::PartitionRequested),
"analyzed" => Some(PartitionStatus::PartitionAnalyzed),
"building" => Some(PartitionStatus::PartitionBuilding),
"available" => Some(PartitionStatus::PartitionAvailable),
"failed" => Some(PartitionStatus::PartitionFailed),
"delegated" => Some(PartitionStatus::PartitionDelegated),
_ => None,
}
}
}
impl JobStatus {
/// Convert job status to human-readable string matching current CLI/service format
pub fn to_display_string(&self) -> String {
match self {
JobStatus::JobUnknown => "unknown".to_string(),
JobStatus::JobScheduled => "scheduled".to_string(),
JobStatus::JobRunning => "running".to_string(),
JobStatus::JobCompleted => "completed".to_string(),
JobStatus::JobFailed => "failed".to_string(),
JobStatus::JobCancelled => "cancelled".to_string(),
JobStatus::JobSkipped => "skipped".to_string(),
}
}
/// Parse a display string back to enum
pub fn from_display_string(s: &str) -> Option<Self> {
match s {
"unknown" => Some(JobStatus::JobUnknown),
"scheduled" => Some(JobStatus::JobScheduled),
"running" => Some(JobStatus::JobRunning),
"completed" => Some(JobStatus::JobCompleted),
"failed" => Some(JobStatus::JobFailed),
"cancelled" => Some(JobStatus::JobCancelled),
"skipped" => Some(JobStatus::JobSkipped),
_ => None,
}
}
}
impl BuildRequestStatusCode {
/// Convert build request status to human-readable string matching current CLI/service format
pub fn to_display_string(&self) -> String {
match self {
BuildRequestStatusCode::BuildRequestUnknown => "unknown".to_string(),
BuildRequestStatusCode::BuildRequestReceived => "received".to_string(),
BuildRequestStatusCode::BuildRequestPlanning => "planning".to_string(),
BuildRequestStatusCode::BuildRequestAnalysisCompleted => "analysis_completed".to_string(),
BuildRequestStatusCode::BuildRequestExecuting => "executing".to_string(),
BuildRequestStatusCode::BuildRequestCompleted => "completed".to_string(),
BuildRequestStatusCode::BuildRequestFailed => "failed".to_string(),
BuildRequestStatusCode::BuildRequestCancelled => "cancelled".to_string(),
BuildRequestStatusCode::BuildRequestPreconditionFailed => "precondition_failed".to_string(),
}
}
/// Parse a display string back to enum
pub fn from_display_string(s: &str) -> Option<Self> {
match s {
"unknown" => Some(BuildRequestStatusCode::BuildRequestUnknown),
"received" => Some(BuildRequestStatusCode::BuildRequestReceived),
"planning" => Some(BuildRequestStatusCode::BuildRequestPlanning),
"analysis_completed" => Some(BuildRequestStatusCode::BuildRequestAnalysisCompleted),
"executing" => Some(BuildRequestStatusCode::BuildRequestExecuting),
"completed" => Some(BuildRequestStatusCode::BuildRequestCompleted),
"failed" => Some(BuildRequestStatusCode::BuildRequestFailed),
"cancelled" => Some(BuildRequestStatusCode::BuildRequestCancelled),
"precondition failed" => Some(BuildRequestStatusCode::BuildRequestPreconditionFailed),
_ => None,
}
}
pub fn status(&self) -> BuildRequestStatus {
BuildRequestStatus {
code: self.clone().into(),
name: self.to_display_string(),
}
}
}
impl DepType {
/// Convert dependency type to human-readable string
pub fn to_display_string(&self) -> String {
match self {
DepType::Query => "query".to_string(),
DepType::Materialize => "materialize".to_string(),
}
}
/// Parse a display string back to enum
pub fn from_display_string(s: &str) -> Option<Self> {
match s {
"query" => Some(DepType::Query),
"materialize" => Some(DepType::Materialize),
_ => None,
}
}
}
/// Helper functions for creating protobuf list responses with dual status fields
pub mod list_response_helpers {
use super::*;
/// Create a PartitionSummary from repository data
pub fn create_partition_summary(
partition_ref: PartitionRef,
status: PartitionStatus,
last_updated: i64,
builds_count: usize,
invalidation_count: usize,
last_successful_build: Option<String>,
) -> PartitionSummary {
PartitionSummary {
partition_ref: Some(partition_ref),
status_code: status as i32,
status_name: status.to_display_string(),
last_updated,
builds_count: builds_count as u32,
invalidation_count: invalidation_count as u32,
last_successful_build,
}
}
/// Create a JobSummary from repository data
pub fn create_job_summary(
job_label: String,
total_runs: usize,
successful_runs: usize,
failed_runs: usize,
cancelled_runs: usize,
average_partitions_per_run: f64,
last_run_timestamp: i64,
last_run_status: JobStatus,
recent_builds: Vec<String>,
) -> JobSummary {
JobSummary {
job_label,
total_runs: total_runs as u32,
successful_runs: successful_runs as u32,
failed_runs: failed_runs as u32,
cancelled_runs: cancelled_runs as u32,
average_partitions_per_run,
last_run_timestamp,
last_run_status_code: last_run_status as i32,
last_run_status_name: last_run_status.to_display_string(),
recent_builds,
}
}
/// Create a TaskSummary from repository data
pub fn create_task_summary(
job_run_id: String,
job_label: String,
build_request_id: String,
status: JobStatus,
target_partitions: Vec<PartitionRef>,
scheduled_at: i64,
started_at: Option<i64>,
completed_at: Option<i64>,
duration_ms: Option<i64>,
cancelled: bool,
message: String,
) -> JobRunSummary {
JobRunSummary {
job_run_id,
job_label,
build_request_id,
status_code: status as i32,
status_name: status.to_display_string(),
target_partitions,
scheduled_at,
started_at,
completed_at,
duration_ms,
cancelled,
message,
}
}
/// Create a BuildSummary from repository data
pub fn create_build_summary(
build_request_id: String,
status: BuildRequestStatus,
requested_partitions: Vec<PartitionRef>,
total_jobs: usize,
completed_jobs: usize,
failed_jobs: usize,
cancelled_jobs: usize,
requested_at: i64,
started_at: Option<i64>,
completed_at: Option<i64>,
duration_ms: Option<i64>,
cancelled: bool,
comment: Option<String>,
) -> BuildSummary {
BuildSummary {
build_request_id,
status: Some(status),
requested_partitions,
total_jobs: total_jobs as u32,
completed_jobs: completed_jobs as u32,
failed_jobs: failed_jobs as u32,
cancelled_jobs: cancelled_jobs as u32,
requested_at,
started_at,
completed_at,
duration_ms,
cancelled,
comment,
}
}
/// Create a DataDep with dual fields from repository data
pub fn create_data_dep(
dep_type: DepType,
partition_ref: PartitionRef,
) -> DataDep {
DataDep {
dep_type_code: dep_type as i32,
dep_type_name: dep_type.to_display_string(),
partition_ref: Some(partition_ref),
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_partition_status_conversions() {
let status = PartitionStatus::PartitionAvailable;
assert_eq!(status.to_display_string(), "available");
assert_eq!(PartitionStatus::from_display_string("available"), Some(status));
}
#[test]
fn test_job_status_conversions() {
let status = JobStatus::JobCompleted;
assert_eq!(status.to_display_string(), "completed");
assert_eq!(JobStatus::from_display_string("completed"), Some(status));
}
#[test]
fn test_build_request_status_conversions() {
let status = BuildRequestStatusCode::BuildRequestCompleted.status();
assert_eq!(status.name, "completed");
}
#[test]
fn test_dep_type_conversions() {
let dep_type = DepType::Materialize;
assert_eq!(dep_type.to_display_string(), "materialize");
assert_eq!(DepType::from_display_string("materialize"), Some(dep_type));
let dep_type = DepType::Query;
assert_eq!(dep_type.to_display_string(), "query");
assert_eq!(DepType::from_display_string("query"), Some(dep_type));
}
#[test]
fn test_invalid_display_string() {
assert_eq!(PartitionStatus::from_display_string("invalid"), None);
assert_eq!(JobStatus::from_display_string("invalid"), None);
assert_eq!(BuildRequestStatusCode::from_display_string("invalid"), None);
assert_eq!(DepType::from_display_string("invalid"), None);
}
}


@ -1,61 +0,0 @@
load("@rules_proto//proto:defs.bzl", "proto_library")
load("@rules_rust//rust:defs.bzl", "rust_test")
# Test the databuild generation
rust_test(
name = "databuild_test",
srcs = [
"databuild_test.rs",
"//databuild:generate_databuild_rust",
],
edition = "2021",
deps = [
"@crates//:prost",
"@crates//:schemars",
"@crates//:serde",
"@crates//:serde_json",
],
)
# Generate Rust code for simple proto using prost generator
genrule(
name = "generate_simple_rust",
srcs = ["simple.proto"],
outs = ["simple.rs"],
cmd = "PROTOC=$(location @com_google_protobuf//:protoc) $(location //databuild:prost_generator) $(location simple.proto) /dev/null $@",
tools = [
"//databuild:prost_generator",
"@com_google_protobuf//:protoc",
],
)
# Simple proto for testing
proto_library(
name = "simple_proto",
srcs = ["simple.proto"],
visibility = ["//visibility:public"],
)
# Test the simple generation
rust_test(
name = "simple_test",
srcs = [
"simple_test.rs",
":generate_simple_rust",
],
edition = "2021",
deps = [
"@crates//:prost",
"@crates//:schemars",
"@crates//:serde",
"@crates//:serde_json",
],
)
py_test(
name = "py_proto_test",
srcs = ["py_proto_test.py"],
deps = [
"//databuild:py_proto",
],
)


@ -1,15 +0,0 @@
py_library(
name = "job_src",
srcs = glob(["**/*.py"], exclude=["e2e_test_common.py"]),
visibility = ["//visibility:public"],
deps = [
"//databuild:py_proto",
"//databuild/dsl/python:dsl",
],
)
py_library(
name = "e2e_test_common",
srcs = ["e2e_test_common.py"],
visibility = ["//visibility:public"],
)


@ -1,34 +0,0 @@
# Test DataBuild App
This directory contains common job components for testing databuild apps described via different methods, e.g. the core bazel targets, the python DSL, etc.
## Structure
The fictitious use case is "daily color votes". The underlying input data is votes per color per day, which we combine and aggregate in ways that help us test different aspects of databuild. Job exec contents should be trivial, as the purpose is to test composition. Types of partition relationships:
- Time-range: 1 day depending on N prior days (see the sketch after the diagram below)
- Multi-partition-output jobs:
  - Always output multiple partitions, e.g. one per type
  - Consume different inputs based on the desired output
  - Produce multiple partitions of the same type depending on the input
```mermaid
flowchart TD
daily_color_votes[(daily_color_votes/$date/$color)]
color_votes_1w[(color_votes_1w/$date/$color)]
color_votes_1m[(color_votes_1m/$date/$color)]
daily_votes[(daily_votes/$date)]
votes_1w[(votes_1w/$date)]
votes_1m[(votes_1m/$date)]
color_vote_report[(color_vote_report/$date/$color)]
ingest_color_votes --> daily_color_votes
daily_color_votes --> trailing_color_votes --> color_votes_1w & color_votes_1m
daily_color_votes --> aggregate_color_votes --> daily_votes
color_votes_1w --> aggregate_color_votes --> votes_1w
color_votes_1m --> aggregate_color_votes --> votes_1m
daily_votes & votes_1w & votes_1m & color_votes_1w & color_votes_1m --> color_vote_report_calc --> color_vote_report
```
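To make the time-range relationship concrete, here is a minimal sketch of what a trailing-window `configure()` could look like. The window length, partition naming, and environment variables are illustrative assumptions, not the actual `trailing_color_votes` implementation.

```python
from datetime import date, timedelta

from databuild.proto import DataDep, DepType, JobConfig, JobConfigureResponse, PartitionRef

WINDOW_DAYS = 7  # assumed window length; the real trailing jobs may differ


def configure(outputs: list[PartitionRef]) -> JobConfigureResponse:
    configs = []
    for output in outputs:
        # e.g. "color_votes_1w/2024-01-21/red" depends on the 7 trailing daily partitions
        _, data_date, color = output.str.split("/")
        end = date.fromisoformat(data_date)
        inputs = [
            PartitionRef(str=f"daily_color_votes/{end - timedelta(days=i)}/{color}")
            for i in range(WINDOW_DAYS)
        ]
        configs.append(JobConfig(
            outputs=[output],
            inputs=[
                DataDep(dep_type_code=DepType.MATERIALIZE, dep_type_name="materialize", partition_ref=ref)
                for ref in inputs
            ],
            args=[],
            env={"DATA_DATE": data_date, "COLOR": color},
        ))
    return JobConfigureResponse(configs=configs)
```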
## Data Access
Data access is implemented in [`dal.py`](./dal.py), with data written as lists of dicts in JSON. Partition fields are stored as values in those dicts.
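For illustration, a partition written through that layer might look roughly like the following; the file location and field names are assumptions, not taken from `dal.py`.

```python
import json
from pathlib import Path

# Hypothetical on-disk shape: one JSON file per partition, containing a list of
# dicts whose fields carry the partition values (date, color) plus the payload.
rows = [
    {"date": "2024-01-15", "color": "red", "votes": 42},
    {"date": "2024-01-15", "color": "red", "votes": 7},
]

path = Path("/tmp/daily_color_votes/2024-01-15/red.json")  # illustrative path only
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(rows))

assert json.loads(path.read_text()) == rows
```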


@ -1,157 +0,0 @@
load("//databuild:rules.bzl", "databuild_graph", "databuild_job")
py_library(
name = "job_src",
srcs = glob(["**/*.py"]),
visibility = ["//visibility:public"],
deps = [
"//databuild:py_proto",
"//databuild/dsl/python:dsl",
],
)
# Tests
py_test(
name = "test_trailing_color_votes",
srcs = ["jobs/trailing_color_votes/test.py"],
main = "jobs/trailing_color_votes/test.py",
deps = [
":job_src",
"//databuild/test/app:job_src",
],
)
py_test(
name = "test_ingest_color_votes",
srcs = ["jobs/ingest_color_votes/test.py"],
main = "jobs/ingest_color_votes/test.py",
deps = [
":job_src",
"//databuild/test/app:job_src",
],
)
py_test(
name = "test_aggregate_color_votes",
srcs = ["jobs/aggregate_color_votes/test.py"],
main = "jobs/aggregate_color_votes/test.py",
deps = [
":job_src",
"//databuild/test/app:job_src",
],
)
py_test(
name = "test_color_vote_report_calc",
srcs = ["jobs/color_vote_report_calc/test.py"],
main = "jobs/color_vote_report_calc/test.py",
deps = [
":job_src",
"//databuild/test/app:job_src",
],
)
py_test(
name = "test_graph_analysis",
srcs = ["graph/graph_test.py"],
data = [
":bazel_graph.analyze",
":bazel_graph_lookup",
],
main = "graph/graph_test.py",
deps = [
":job_src",
"//databuild/test/app:job_src",
],
)
py_test(
name = "test_e2e",
srcs = ["test_e2e.py"],
data = [":bazel_graph.build"],
main = "test_e2e.py",
deps = ["//databuild/test/app:e2e_test_common"],
)
# Bazel-defined
## Graph
databuild_graph(
name = "bazel_graph",
jobs = [
":ingest_color_votes",
":trailing_color_votes",
":aggregate_color_votes",
":color_vote_report_calc",
],
lookup = ":bazel_graph_lookup",
)
py_binary(
name = "bazel_graph_lookup",
srcs = ["graph/lookup.py"],
main = "graph/lookup.py",
)
## Ingest Color Votes
databuild_job(
name = "ingest_color_votes",
binary = ":ingest_color_votes_binary",
)
py_binary(
name = "ingest_color_votes_binary",
srcs = ["jobs/ingest_color_votes/main.py"],
main = "jobs/ingest_color_votes/main.py",
deps = [
":job_src",
"//databuild/test/app:job_src",
],
)
## Trailing Color Votes
databuild_job(
name = "trailing_color_votes",
binary = ":trailing_color_votes_binary",
)
py_binary(
name = "trailing_color_votes_binary",
srcs = ["jobs/trailing_color_votes/main.py"],
main = "jobs/trailing_color_votes/main.py",
deps = [
":job_src",
"//databuild/test/app:job_src",
],
)
## Aggregate Color Votes
databuild_job(
name = "aggregate_color_votes",
binary = ":aggregate_color_votes_binary",
)
py_binary(
name = "aggregate_color_votes_binary",
srcs = ["jobs/aggregate_color_votes/main.py"],
main = "jobs/aggregate_color_votes/main.py",
deps = [
":job_src",
"//databuild/test/app:job_src",
],
)
## Color Vote Report Calc
databuild_job(
name = "color_vote_report_calc",
binary = ":color_vote_report_calc_binary",
)
py_binary(
name = "color_vote_report_calc_binary",
srcs = ["jobs/color_vote_report_calc/main.py"],
main = "jobs/color_vote_report_calc/main.py",
deps = [
":job_src",
"//databuild/test/app:job_src",
],
)


@ -1,4 +0,0 @@
# Bazel-Based Graph Definition
The bazel-based graph definition relies on declaring `databuild_job` and `databuild_graph` targets which reference binaries.
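For orientation, a job/binary pairing typically looks like the sketch below; the names are illustrative, and the real declarations live in this package's `BUILD.bazel`.

```python
# BUILD.bazel (sketch)
load("//databuild:rules.bzl", "databuild_job")

databuild_job(
    name = "my_job",              # illustrative job name
    binary = ":my_job_binary",    # binary handles the "config" and "exec" subcommands
)

py_binary(
    name = "my_job_binary",
    srcs = ["jobs/my_job/main.py"],
    main = "jobs/my_job/main.py",
)
```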


@ -1,91 +0,0 @@
#!/usr/bin/env python3
"""
Integration test for the databuild graph analysis.
This test verifies that when we request color vote reports, the graph analyzer
correctly identifies all upstream dependencies and jobs required.
"""
import subprocess
import json
import unittest
import os
from pathlib import Path
class GraphAnalysisTest(unittest.TestCase):
def setUp(self):
# Determine the path to bazel_graph.analyze
# In bazel test, we need to find the executable in the runfiles
runfiles_dir = os.environ.get('RUNFILES_DIR')
test_srcdir = os.environ.get('TEST_SRCDIR')
possible_paths = []
if runfiles_dir:
possible_paths.append(os.path.join(runfiles_dir, '_main', 'databuild', 'test', 'app', 'bazel_graph.analyze'))
possible_paths.append(os.path.join(runfiles_dir, 'databuild', 'test', 'app', 'bazel_graph.analyze'))
if test_srcdir:
possible_paths.append(os.path.join(test_srcdir, '_main', 'databuild', 'test', 'app', 'bazel_graph.analyze'))
possible_paths.append(os.path.join(test_srcdir, 'databuild', 'test', 'app', 'bazel_graph.analyze'))
# Fallback for local testing
possible_paths.extend([
'bazel-bin/databuild/test/app/bazel_graph.analyze',
'./bazel_graph.analyze'
])
self.graph_analyze = None
for path in possible_paths:
if os.path.exists(path):
self.graph_analyze = path
break
# Ensure the executable exists
if not self.graph_analyze:
self.skipTest(f"Graph analyze executable not found in any of these paths: {possible_paths}")
def run_graph_analyze(self, partition_refs):
"""Run graph.analyze with the given partition references."""
cmd = [self.graph_analyze] + partition_refs
result = subprocess.run(cmd, capture_output=True, text=True, cwd=os.getcwd())
if result.returncode != 0:
self.fail(f"Graph analyze failed with return code {result.returncode}.\nStdout: {result.stdout}\nStderr: {result.stderr}")
# Parse the JSON output
try:
return json.loads(result.stdout)
except json.JSONDecodeError as e:
self.fail(f"Failed to parse JSON output: {e}\nOutput: {result.stdout}")
def test_single_color_report_dependencies(self):
"""Test dependencies for a single color vote report."""
partition_refs = ["color_vote_report/2024-01-15/red"]
result = self.run_graph_analyze(partition_refs)
self.assertIn('nodes', result)
# TODO expand
def test_multiple_color_reports_same_date(self):
"""Test dependencies when requesting multiple colors for the same date."""
partition_refs = [
"color_vote_report/2024-01-15/red",
"color_vote_report/2024-01-15/blue"
]
result = self.run_graph_analyze(partition_refs)
self.assertIn('nodes', result)
# TODO expand
def test_multiple_dates_dependencies(self):
"""Test dependencies when requesting reports for different dates."""
partition_refs = [
"color_vote_report/2024-01-15/red",
"color_vote_report/2024-01-16/red"
]
result = self.run_graph_analyze(partition_refs)
self.assertIn('nodes', result)
# TODO expand
if __name__ == '__main__':
unittest.main()


@ -1,29 +0,0 @@
#!/usr/bin/env python3
from collections import defaultdict
import sys
import json
LABEL_BASE = "//databuild/test/app/bazel"
def lookup(raw_ref: str):
if raw_ref.startswith("daily_color_votes"):
return LABEL_BASE + ":ingest_color_votes"
elif raw_ref.startswith("color_votes_1"):
return LABEL_BASE + ":trailing_color_votes"
elif raw_ref.startswith("daily_votes") or raw_ref.startswith("votes_1w") or raw_ref.startswith("votes_1m"):
return LABEL_BASE + ":aggregate_color_votes"
elif raw_ref.startswith("color_vote_report"):
return LABEL_BASE + ":color_vote_report_calc"
else:
raise ValueError(f"Unable to resolve job for partition: `{raw_ref}`")
if __name__ == "__main__":
results = defaultdict(list)
for raw_ref in sys.argv[1:]:
results[lookup(raw_ref)].append(raw_ref)
# Output the results as JSON
print(json.dumps(dict(results)))


@ -1 +0,0 @@
jobs/aggregate_color_votes/README.md


@ -1,42 +0,0 @@
from databuild.proto import PartitionRef, JobConfigureResponse, JobConfig, DepType, DataDep
from databuild.test.app.colors import COLORS
from datetime import date
def configure(outputs: list[PartitionRef]) -> JobConfigureResponse:
configs = []
for output in outputs:
parts = output.str.split("/")
if len(parts) == 2:
output_type, data_date = parts
date.fromisoformat(data_date) # Validate date format
# Determine input type based on output type
if output_type == "daily_votes":
input_prefix = "daily_color_votes"
elif output_type == "votes_1w":
input_prefix = "color_votes_1w"
elif output_type == "votes_1m":
input_prefix = "color_votes_1m"
else:
raise ValueError(f"Unknown output type: {output_type}")
# Create inputs for all colors
inputs = []
for color in COLORS:
input_ref = PartitionRef(str=f"{input_prefix}/{data_date}/{color}")
inputs.append(input_ref)
configs.append(JobConfig(
outputs=[output],
inputs=[DataDep(dep_type_code=DepType.MATERIALIZE, dep_type_name="materialize", partition_ref=ref) for ref in inputs],
args=[],
env={
"DATA_DATE": data_date,
"AGGREGATE_TYPE": output_type
}
))
else:
raise ValueError(f"Invalid output partition format: {output.str}")
return JobConfigureResponse(configs=configs)


@ -1,20 +0,0 @@
"""Main entrypoint for the aggregate_color_votes job for use with bazel-defined graph."""
import sys
import os
import json
from databuild.proto import PartitionRef, to_dict
from databuild.test.app.bazel.jobs.aggregate_color_votes.config import configure
from databuild.test.app.jobs.aggregate_color_votes.execute import execute
if __name__ == "__main__":
if sys.argv[1] == "config":
response = configure([
PartitionRef(str=raw_ref)
for raw_ref in sys.argv[2:]
])
print(json.dumps(to_dict(response)))
elif sys.argv[1] == "exec":
execute(os.environ["DATA_DATE"], os.environ["AGGREGATE_TYPE"])
else:
raise Exception(f"Invalid command `{sys.argv[1]}`")


@ -1,59 +0,0 @@
import unittest
from databuild.proto import PartitionRef
from databuild.test.app.bazel.jobs.aggregate_color_votes.config import configure
from databuild.test.app.colors import COLORS
class TestAggregateColorVotesConfig(unittest.TestCase):
def test_configure_daily_votes(self):
outputs = [PartitionRef(str="daily_votes/2024-01-15")]
response = configure(outputs)
self.assertEqual(len(response.configs), 1)
config = response.configs[0]
self.assertEqual(len(config.outputs), 1)
self.assertEqual(len(config.inputs), len(COLORS)) # One input per color
self.assertEqual(config.env["AGGREGATE_TYPE"], "daily_votes")
self.assertEqual(config.env["DATA_DATE"], "2024-01-15")
# Check that inputs are from daily_color_votes
for i, color in enumerate(COLORS):
expected_input = f"daily_color_votes/2024-01-15/{color}"
self.assertEqual(config.inputs[i].partition_ref.str, expected_input)
def test_configure_weekly_votes(self):
outputs = [PartitionRef(str="votes_1w/2024-01-21")]
response = configure(outputs)
self.assertEqual(len(response.configs), 1)
config = response.configs[0]
self.assertEqual(config.env["AGGREGATE_TYPE"], "votes_1w")
# Check that inputs are from color_votes_1w
for i, color in enumerate(COLORS):
expected_input = f"color_votes_1w/2024-01-21/{color}"
self.assertEqual(config.inputs[i].partition_ref.str, expected_input)
def test_configure_monthly_votes(self):
outputs = [PartitionRef(str="votes_1m/2024-01-31")]
response = configure(outputs)
self.assertEqual(len(response.configs), 1)
config = response.configs[0]
self.assertEqual(config.env["AGGREGATE_TYPE"], "votes_1m")
# Check that inputs are from color_votes_1m
for i, color in enumerate(COLORS):
expected_input = f"color_votes_1m/2024-01-31/{color}"
self.assertEqual(config.inputs[i].partition_ref.str, expected_input)
def test_configure_multiple_outputs(self):
outputs = [
PartitionRef(str="daily_votes/2024-01-15"),
PartitionRef(str="votes_1w/2024-01-21")
]
response = configure(outputs)
self.assertEqual(len(response.configs), 2) # One config per output
if __name__ == "__main__":
unittest.main()


@ -1 +0,0 @@
jobs/color_vote_report_calc/README.md


@ -1,48 +0,0 @@
from databuild.proto import PartitionRef, JobConfigureResponse, JobConfig, DataDep, DepType
from datetime import date
from collections import defaultdict
def configure(outputs: list[PartitionRef]) -> JobConfigureResponse:
# This job produces a single job config that handles all requested outputs
all_dates = set()
all_colors = set()
for output in outputs:
parts = output.str.split("/")
if len(parts) == 3 and parts[0] == "color_vote_report":
prefix, data_date, color = parts
date.fromisoformat(data_date) # Validate date format
all_dates.add(data_date)
all_colors.add(color)
else:
raise ValueError(f"Invalid output partition format: {output.str}")
# Build inputs for all dates and colors that are actually requested
inputs = []
# Add total vote aggregates for all dates
for data_date in all_dates:
inputs.extend([
PartitionRef(str=f"daily_votes/{data_date}"),
PartitionRef(str=f"votes_1w/{data_date}"),
PartitionRef(str=f"votes_1m/{data_date}")
])
# Add color-specific inputs for all date/color combinations that are requested
for output in outputs:
data_date, color = output.str.split("/")[1], output.str.split("/")[2]
inputs.extend([
PartitionRef(str=f"daily_color_votes/{data_date}/{color}"),
PartitionRef(str=f"color_votes_1w/{data_date}/{color}"),
PartitionRef(str=f"color_votes_1m/{data_date}/{color}")
])
# Single job config for all outputs - pass output partition refs as args
config = JobConfig(
outputs=outputs,
inputs=[DataDep(dep_type_code=DepType.MATERIALIZE, dep_type_name="materialize", partition_ref=ref) for ref in inputs],
args=[output.str for output in outputs],
env={}
)
return JobConfigureResponse(configs=[config])


@ -1,20 +0,0 @@
"""Main entrypoint for the color_vote_report_calc job for use with bazel-defined graph."""
import sys
import os
import json
from databuild.proto import PartitionRef, to_dict
from databuild.test.app.bazel.jobs.color_vote_report_calc.config import configure
from databuild.test.app.jobs.color_vote_report_calc.execute import execute
if __name__ == "__main__":
if sys.argv[1] == "config":
response = configure([
PartitionRef(str=raw_ref)
for raw_ref in sys.argv[2:]
])
print(json.dumps(to_dict(response)))
elif sys.argv[1] == "exec":
execute(sys.argv[2:])
else:
raise Exception(f"Invalid command `{sys.argv[1]}`")


@ -1,60 +0,0 @@
import unittest
from databuild.proto import PartitionRef
from databuild.test.app.bazel.jobs.color_vote_report_calc.config import configure
class TestColorVoteReportCalcConfig(unittest.TestCase):
def test_configure_single_output(self):
outputs = [PartitionRef(str="color_vote_report/2024-01-15/red")]
response = configure(outputs)
self.assertEqual(len(response.configs), 1) # Always single config
config = response.configs[0]
self.assertEqual(len(config.outputs), 1)
self.assertEqual(config.args, ["color_vote_report/2024-01-15/red"])
# Should have inputs for total votes and color-specific votes
expected_inputs = [
"daily_votes/2024-01-15",
"votes_1w/2024-01-15",
"votes_1m/2024-01-15",
"daily_color_votes/2024-01-15/red",
"color_votes_1w/2024-01-15/red",
"color_votes_1m/2024-01-15/red"
]
actual_inputs = [inp.partition_ref.str for inp in config.inputs]
for expected in expected_inputs:
self.assertIn(expected, actual_inputs)
def test_configure_multiple_outputs_same_date(self):
outputs = [
PartitionRef(str="color_vote_report/2024-01-15/red"),
PartitionRef(str="color_vote_report/2024-01-15/blue")
]
response = configure(outputs)
self.assertEqual(len(response.configs), 1) # Single config for all outputs
config = response.configs[0]
self.assertEqual(len(config.outputs), 2)
self.assertEqual(set(config.args), {
"color_vote_report/2024-01-15/red",
"color_vote_report/2024-01-15/blue"
})
def test_configure_multiple_dates(self):
outputs = [
PartitionRef(str="color_vote_report/2024-01-15/red"),
PartitionRef(str="color_vote_report/2024-01-16/red")
]
response = configure(outputs)
self.assertEqual(len(response.configs), 1) # Single config for all outputs
config = response.configs[0]
self.assertEqual(len(config.outputs), 2)
# Should have total vote inputs for both dates
actual_inputs = [inp.partition_ref.str for inp in config.inputs]
self.assertIn("daily_votes/2024-01-15", actual_inputs)
self.assertIn("daily_votes/2024-01-16", actual_inputs)
if __name__ == "__main__":
unittest.main()


@ -1 +0,0 @@
jobs/ingest_color_votes/README.md


@ -1,13 +0,0 @@
from databuild.proto import PartitionRef, JobConfigureResponse, JobConfig
from datetime import date
def configure(outputs: list[PartitionRef]) -> JobConfigureResponse:
configs = []
for output in outputs:
prefix, data_date, color = output.str.split("/")
date.fromisoformat(data_date) # Should be able to parse date
assert prefix == "daily_color_votes"
configs.append(JobConfig(outputs = [output], inputs=[], args=[], env={"DATA_DATE": data_date, "COLOR": color}))
return JobConfigureResponse(configs=configs)

Some files were not shown because too many files have changed in this diff.