# Detail & Lineage Views

## Vision

Provide rich, navigable views into databuild's execution history that answer operational questions:

- **"What work was done to fulfill this want?"** - The full DAG of partitions built and jobs run
- **"Where did this data come from?"** - Trace a partition's lineage back through its inputs
- **"What downstream data uses this?"** - Understand impact before tainting or debugging staleness

## Three Distinct Views

### 1. Want Fulfillment View

Shows the work tree rooted at a want: all partitions built, jobs run, and derivative wants spawned to fulfill it.

```
W-001 "data/gamma" [Successful]
│
├── data/gamma [Live, uuid:abc]
│   └── JR-789 [Succeeded]
│       ├── read: data/beta [Live, uuid:def]
│       └── read: data/alpha [Live, uuid:ghi]
│
└── derivative: W-002 "data/beta" [Successful]
    │   └── triggered by: JR-456 dep-miss
    │
    └── data/beta [Live, uuid:def]
        └── JR-456 [DepMiss → retry → Succeeded]
            └── read: data/alpha [Live, uuid:ghi]
```

Key insight: This shows **specific partition instances** (by UUID), not just refs. A want's fulfillment is a concrete snapshot of what was built.

### 2. Partition Lineage View

The data flow graph: partition ↔ job_run alternating. Navigable upstream (inputs) and downstream (consumers).

```
                   UPSTREAM
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
      [data/a]     [data/b]     [data/c]
          │            │            │
          └────────────┼────────────┘
                       ▼
              JR-xyz [Succeeded]
                       │
                       ▼
              ══════════════════
              ║   data/beta    ║  ← FOCUS
              ║     [Live]     ║
              ══════════════════
                       │
                       ▼
               JR-abc [Running]
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
      [data/x]     [data/y]     [data/z]
                       │
                  DOWNSTREAM
```

This view answers: "What data flows into/out of this partition?" Click to navigate.

### 3. JobRun Detail View

Not a graph - just the immediate context of a single job execution:

- **Scheduled for**: Which want(s) triggered this job
- **Read**: Input partitions (with UUIDs - the specific versions read)
- **Wrote**: Output partitions (with UUIDs)
- **Status history**: Queued → Running → Succeeded/Failed/DepMiss
- **If DepMiss**: Which derivative wants were spawned

## Data Requirements

### Track read_deps on success

Currently only captured on dep-miss. Need to extend `JobRunSuccessEventV1`:

```protobuf
message JobRunSuccessEventV1 {
  string job_run_id = 1;
  repeated ReadDeps read_deps = 2;  // NEW
}
```

### Inverted consumer index

To answer "what reads this partition", need:

```rust
partition_consumers: BTreeMap<String, Vec<String>>  // partition_ref → consumer partition_refs
```

Built from read_deps on job success.

## Open Questions

1. How do we visualize retries? (Same want, multiple job attempts)
2. Should partition lineage show historical versions or just current?
3. Performance strategy for high fan-out (partition read by 1000s of consumers)?
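
## Sketch: Building the Consumer Index

The inverted consumer index described under Data Requirements could be maintained roughly as follows. This is a minimal sketch, not databuild's actual implementation. The `ReadDeps` struct with `impacted` and `read` fields, the `LineageIndex` type, and the method names are all hypothetical stand-ins. It also swaps the consumer list for a `BTreeSet` so that repeated job runs do not record the same consumer twice.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical shape of one read_deps entry (an assumption, not the real
/// message): the outputs in `impacted` were built by reading the inputs in `read`.
#[derive(Debug, Clone)]
pub struct ReadDeps {
    pub impacted: Vec<String>,
    pub read: Vec<String>,
}

/// Inverted consumer index: partition_ref -> consumer partition_refs.
#[derive(Debug, Default)]
pub struct LineageIndex {
    partition_consumers: BTreeMap<String, BTreeSet<String>>,
}

impl LineageIndex {
    /// Record the read_deps carried by a job success event.
    pub fn on_job_success(&mut self, read_deps: &[ReadDeps]) {
        for dep in read_deps {
            for input in &dep.read {
                // Every output of this job run consumes every input it read.
                let consumers = self.partition_consumers.entry(input.clone()).or_default();
                consumers.extend(dep.impacted.iter().cloned());
            }
        }
    }

    /// Direct consumers of a partition, in sorted order.
    pub fn consumers(&self, partition_ref: &str) -> Vec<String> {
        self.partition_consumers
            .get(partition_ref)
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default()
    }

    /// All transitive downstream partitions, found by breadth-first search.
    pub fn downstream(&self, partition_ref: &str) -> BTreeSet<String> {
        let mut seen = BTreeSet::new();
        let mut queue = VecDeque::from([partition_ref.to_string()]);
        while let Some(current) = queue.pop_front() {
            if let Some(consumers) = self.partition_consumers.get(&current) {
                for consumer in consumers {
                    // Enqueue each consumer only the first time it is seen.
                    if seen.insert(consumer.clone()) {
                        queue.push_back(consumer.clone());
                    }
                }
            }
        }
        seen
    }
}
```

With this shape, the downstream half of the Partition Lineage View reduces to a call such as `index.downstream("data/alpha")`. The traversal visits each partition at most once, but its cost still grows with the number of consumers, so the high fan-out question above remains open.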