add prd
parent d812bb51e2
commit 368558d9d8
1 changed files with 105 additions and 0 deletions
docs/plans/detail-lineage.md
Normal file
@@ -0,0 +1,105 @@
# Detail & Lineage Views

## Vision

Provide rich, navigable views into databuild's execution history that answer operational questions:

- **"What work was done to fulfill this want?"** - The full DAG of partitions built and jobs run
- **"Where did this data come from?"** - Trace a partition's lineage back through its inputs
- **"What downstream data uses this?"** - Understand impact before tainting or debugging staleness

## Three Distinct Views

### 1. Want Fulfillment View

Shows the work tree rooted at a want: all partitions built, jobs run, and derivative wants spawned to fulfill it.

```
W-001 "data/gamma" [Successful]
│
├── data/gamma [Live, uuid:abc]
│   └── JR-789 [Succeeded]
│       ├── read: data/beta [Live, uuid:def]
│       └── read: data/alpha [Live, uuid:ghi]
│
└── derivative: W-002 "data/beta" [Successful]
    │   └── triggered by: JR-456 dep-miss
    │
    └── data/beta [Live, uuid:def]
        └── JR-456 [DepMiss → retry → Succeeded]
            └── read: data/alpha [Live, uuid:ghi]
```

Key insight: This shows **specific partition instances** (by UUID), not just refs. A want's fulfillment is a concrete snapshot of what was built.
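
One way the tree above could be represented in memory is sketched below. Every type and function name here (`PartitionInstance`, `JobRunNode`, `WantNode`, `built_instances`, `example_tree`) is hypothetical and chosen for illustration; none comes from databuild itself. The point is that each node carries a UUID, so the tree pins down concrete instances and not just refs.

```rust
// Hypothetical types for illustration only; not databuild's actual model.

/// A specific partition instance: the ref plus the UUID of the built version.
#[derive(Debug, Clone, PartialEq)]
pub struct PartitionInstance {
    pub partition_ref: String,
    pub uuid: String,
}

/// One job run in the tree, with the partition instances it read.
#[derive(Debug, Clone)]
pub struct JobRunNode {
    pub job_run_id: String,
    pub reads: Vec<PartitionInstance>,
}

/// The work tree rooted at a want.
#[derive(Debug, Clone)]
pub struct WantNode {
    pub want_id: String,
    /// Partition instances built for this want, each paired with the job run that built it.
    pub built: Vec<(PartitionInstance, JobRunNode)>,
    /// Derivative wants spawned by dep-misses while fulfilling this want.
    pub derivatives: Vec<WantNode>,
}

/// Collects every partition instance built anywhere in the tree,
/// listing a want's own partitions before those of its derivatives.
pub fn built_instances(want: &WantNode) -> Vec<PartitionInstance> {
    let mut out: Vec<PartitionInstance> =
        want.built.iter().map(|(p, _)| p.clone()).collect();
    for child in &want.derivatives {
        out.extend(built_instances(child));
    }
    out
}

fn inst(partition_ref: &str, uuid: &str) -> PartitionInstance {
    PartitionInstance {
        partition_ref: partition_ref.to_string(),
        uuid: uuid.to_string(),
    }
}

/// Builds the W-001 example from the diagram above.
pub fn example_tree() -> WantNode {
    let beta = WantNode {
        want_id: "W-002".to_string(),
        built: vec![(
            inst("data/beta", "def"),
            JobRunNode {
                job_run_id: "JR-456".to_string(),
                reads: vec![inst("data/alpha", "ghi")],
            },
        )],
        derivatives: vec![],
    };
    WantNode {
        want_id: "W-001".to_string(),
        built: vec![(
            inst("data/gamma", "abc"),
            JobRunNode {
                job_run_id: "JR-789".to_string(),
                reads: vec![inst("data/beta", "def"), inst("data/alpha", "ghi")],
            },
        )],
        derivatives: vec![beta],
    }
}
```

Calling `built_instances(&example_tree())` yields the `data/gamma` and `data/beta` instances in that order. The sketch does not model retries (the `DepMiss → retry → Succeeded` annotation), which remains an open question.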

### 2. Partition Lineage View

The data flow graph, in which partitions and job runs alternate (partition ↔ job_run). It is navigable both upstream (inputs) and downstream (consumers).

```
                UPSTREAM
                    │
       ┌────────────┼────────────┐
       ▼            ▼            ▼
   [data/a]     [data/b]     [data/c]
       │            │            │
       └────────────┼────────────┘
                    ▼
           JR-xyz [Succeeded]
                    │
                    ▼
           ══════════════════
           ║   data/beta    ║  ← FOCUS
           ║     [Live]     ║
           ══════════════════
                    │
                    ▼
            JR-abc [Running]
                    │
       ┌────────────┼────────────┐
       ▼            ▼            ▼
   [data/x]     [data/y]     [data/z]
                    │
               DOWNSTREAM
```

This view answers: "What data flows into/out of this partition?" Click to navigate.
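
The upstream half of this view amounts to a graph walk that alternates partition → producing job run → input partitions. The following is a minimal sketch under assumed names: `LineageGraph`, `produced_by`, `reads`, `record`, and `upstream` are all hypothetical, and it keys on partition refs only, ignoring UUID versions.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical in-memory lineage graph; field names are illustrative.
#[derive(Default)]
pub struct LineageGraph {
    /// partition_ref -> job_run_id that wrote it
    pub produced_by: BTreeMap<String, String>,
    /// job_run_id -> partition_refs it read
    pub reads: BTreeMap<String, Vec<String>>,
}

impl LineageGraph {
    /// Records that `job_run_id` read `inputs` and wrote `output`.
    pub fn record(&mut self, job_run_id: &str, inputs: &[&str], output: &str) {
        self.produced_by
            .insert(output.to_string(), job_run_id.to_string());
        self.reads.insert(
            job_run_id.to_string(),
            inputs.iter().map(|s| s.to_string()).collect(),
        );
    }

    /// Breadth-first walk from `focus` toward its inputs. Returns every
    /// transitive upstream partition ref reachable from `focus`.
    pub fn upstream(&self, focus: &str) -> BTreeSet<String> {
        let mut seen = BTreeSet::new();
        let mut queue = VecDeque::new();
        queue.push_back(focus.to_string());
        while let Some(partition) = queue.pop_front() {
            // Step partition -> job run; source partitions have no producer.
            if let Some(job) = self.produced_by.get(&partition) {
                // Step job run -> the partitions it read.
                for input in self.reads.get(job).into_iter().flatten() {
                    // `insert` returns false for already-visited refs,
                    // which also guarantees termination.
                    if seen.insert(input.clone()) {
                        queue.push_back(input.clone());
                    }
                }
            }
        }
        seen
    }
}
```

For the diagram above, recording `JR-xyz` as reading `data/a`, `data/b`, and `data/c` to write `data/beta` makes `upstream("data/beta")` return those three refs. The downstream direction needs the reverse mapping, which is what the inverted consumer index under Data Requirements provides.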

### 3. JobRun Detail View

Not a graph - just the immediate context of a single job execution:

- **Scheduled for**: Which want(s) triggered this job
- **Read**: Input partitions (with UUIDs - the specific versions read)
- **Wrote**: Output partitions (with UUIDs)
- **Status history**: Queued → Running → Succeeded/Failed/DepMiss
- **If DepMiss**: Which derivative wants were spawned
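
The fields listed above map naturally onto a flat record. The sketch below is one possible shape, not databuild's actual types: `JobRunStatus`, `JobRunDetail`, and their methods are hypothetical, and the transition table is an assumption read off the "Queued → Running → Succeeded/Failed/DepMiss" line. How retries appear in the history is left unmodeled.

```rust
/// Hypothetical status enum mirroring the states listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobRunStatus {
    Queued,
    Running,
    Succeeded,
    Failed,
    DepMiss,
}

impl JobRunStatus {
    /// Assumed transition table: Queued -> Running -> one terminal state.
    pub fn can_transition_to(self, next: JobRunStatus) -> bool {
        use JobRunStatus::*;
        matches!(
            (self, next),
            (Queued, Running) | (Running, Succeeded) | (Running, Failed) | (Running, DepMiss)
        )
    }
}

/// Hypothetical record backing the JobRun detail view.
pub struct JobRunDetail {
    pub job_run_id: String,
    /// Want ids that triggered this job.
    pub scheduled_for: Vec<String>,
    /// (partition_ref, uuid) pairs for the specific versions read.
    pub read: Vec<(String, String)>,
    /// (partition_ref, uuid) pairs for the partitions written.
    pub wrote: Vec<(String, String)>,
    /// Statuses in the order they were entered.
    pub status_history: Vec<JobRunStatus>,
    /// Derivative want ids spawned on dep-miss; empty otherwise.
    pub derivative_wants: Vec<String>,
}

impl JobRunDetail {
    /// True if every consecutive pair in the history is an allowed transition.
    pub fn history_is_valid(&self) -> bool {
        self.status_history
            .windows(2)
            .all(|w| w[0].can_transition_to(w[1]))
    }
}
```

Keeping the full `status_history`, not only the latest status, is what lets the view render the progression as a timeline.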

## Data Requirements

### Track read_deps on success

Read dependencies are currently captured only on dep-miss. `JobRunSuccessEventV1` needs to be extended so they are also recorded on success:

```protobuf
message JobRunSuccessEventV1 {
    string job_run_id = 1;
    repeated ReadDeps read_deps = 2;  // NEW
}
```

### Inverted consumer index

Answering "what reads this partition?" requires an inverted index:

```rust
partition_consumers: BTreeMap<String, Vec<String>>  // partition_ref → consumer partition_refs
```

The index is built from `read_deps` on job success.
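
A minimal sketch of how the index might be maintained follows. The shape of `ReadDeps` is not given in this document, so the struct below, with its `impacted` and `read` fields, is a hypothetical stand-in; `ConsumerIndex`, `on_job_success`, and `consumers_of` are likewise illustrative names.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one ReadDeps entry: the partitions a job
/// read (`read`) in order to produce certain outputs (`impacted`).
/// The real message's fields are an assumption here.
pub struct ReadDeps {
    pub impacted: Vec<String>,
    pub read: Vec<String>,
}

#[derive(Default)]
pub struct ConsumerIndex {
    /// partition_ref -> consumer partition_refs
    pub partition_consumers: BTreeMap<String, Vec<String>>,
}

impl ConsumerIndex {
    /// Folds the read_deps of one successful job run into the index.
    pub fn on_job_success(&mut self, read_deps: &[ReadDeps]) {
        for dep in read_deps {
            for input in &dep.read {
                let consumers = self
                    .partition_consumers
                    .entry(input.clone())
                    .or_default();
                for output in &dep.impacted {
                    // Skip duplicates so rebuilding a partition does not
                    // add the same consumer twice.
                    if !consumers.contains(output) {
                        consumers.push(output.clone());
                    }
                }
            }
        }
    }

    /// Returns the partitions known to read `partition_ref` (empty if none).
    pub fn consumers_of(&self, partition_ref: &str) -> &[String] {
        self.partition_consumers
            .get(partition_ref)
            .map(|v| v.as_slice())
            .unwrap_or_default()
    }
}
```

The linear `contains` check keeps the `Vec<String>` value type from the declaration above but costs O(n) per insert. For partitions with very many consumers, a `BTreeSet<String>` value would be one alternative to consider.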

## Open Questions

1. How do we visualize retries? (Same want, multiple job attempts)
2. Should partition lineage show historical versions or just current?
3. Performance strategy for high fan-out (partition read by 1000s of consumers)?