update docs
parent 6cb11af642
commit f353660f97
2 changed files with 86 additions and 44 deletions
7
docs/narrative/what-llms-dont-do.md
Normal file
@@ -0,0 +1,7 @@
+
+# What LLMs Don't Do
+
+- Create and cultivate technical strategy
+  - Don't have a specific vision of the organizing formalization of the problem + the technical solution
+- Adhere to technical strategy
+  - Please please please just read the relevant docs!

@@ -93,27 +93,59 @@ message JobRunSuccessEventV1 {
 To answer "what reads this partition", need:

 ```rust
-partition_consumers: BTreeMap<String, Vec<String>> // partition_ref → consumer partition_refs
+partition_consumers: BTreeMap<Uuid, Vec<(Uuid, String)>> // input_uuid → (output_uuid, job_run_id)
 ```

-Built from read_deps on job success.
+Indexed by UUID (not ref) because partition refs get reused across rebuilds, but UUIDs are immutable per instance. This preserves historical lineage correctly.
+
+Built from read_deps when processing JobRunSuccessEventV1.
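
As a rough illustration of the UUID-keyed index described above, the sketch below resolves partition refs to UUIDs when a success event is handled and records one edge per input/output pair. It is not the project's code: `Uuid` is replaced by a `u128` alias to stay dependency-free, and `ReadDep`, `canonical`, `record_success`, and `consumers_of` are hypothetical names chosen for the example.

```rust
use std::collections::BTreeMap;

// Stand-in for a real UUID type (assumption: keeps the sketch dependency-free).
type Uuid = u128;

// Hypothetical shape of one read_deps entry: the outputs ("impacted")
// and the inputs ("read") they were derived from, both as partition refs.
struct ReadDep {
    impacted: Vec<String>,
    read: Vec<String>,
}

#[derive(Default)]
struct BuildState {
    // partition ref -> UUID of the current (canonical) instance
    canonical: BTreeMap<String, Uuid>,
    // input_uuid -> list of (output_uuid, job_run_id)
    partition_consumers: BTreeMap<Uuid, Vec<(Uuid, String)>>,
}

impl BuildState {
    // Resolve refs to UUIDs at success time, then record each input -> output edge.
    // Refs without a canonical instance are skipped in this sketch.
    fn record_success(&mut self, job_run_id: &str, read_deps: &[ReadDep]) {
        for dep in read_deps {
            for input_ref in &dep.read {
                let Some(&input) = self.canonical.get(input_ref) else { continue };
                for output_ref in &dep.impacted {
                    let Some(&output) = self.canonical.get(output_ref) else { continue };
                    self.partition_consumers
                        .entry(input)
                        .or_default()
                        .push((output, job_run_id.to_string()));
                }
            }
        }
    }

    // Answers "what reads this partition instance?" (empty if nothing does).
    fn consumers_of(&self, input: Uuid) -> Vec<(Uuid, String)> {
        self.partition_consumers.get(&input).cloned().unwrap_or_default()
    }
}
```

Because edges are keyed by the UUID that was canonical when the job succeeded, rebuilding a ref under a new UUID leaves the earlier edges attached to the old instance, which is the historical-lineage property the text relies on.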

 ## Design Decisions

 1. **Retries**: List all job runs triggered by a want, collapsing retries in the UI (expandable)
 2. **Lineage UUIDs**: Resolve partition refs to canonical UUIDs at job success time (jobs don't need to know about UUIDs)
 3. **High fan-out**: Truncate to N items with "+X more" expansion
+4. **Consumer index by UUID**: Index consumers by partition UUID (not ref) since refs get reused across rebuilds but UUIDs are immutable per instance
+5. **Job run as lineage source of truth**: Partition details don't duplicate upstream info - they reference their builder job run, which holds the read_deps
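
The high fan-out decision (item 3) could look something like the helper below. This is only a sketch: `truncate_fan_out` and its return shape are assumptions made for illustration, and the document does not say where the truncation actually happens (query layer or template).

```rust
// Hypothetical helper: keep the first `limit` items and, if anything was
// hidden, return a "+X more" label for the UI to render as an expander.
fn truncate_fan_out<T: Clone>(items: &[T], limit: usize) -> (Vec<T>, Option<String>) {
    if items.len() <= limit {
        // Nothing to hide, so no label is produced.
        return (items.to_vec(), None);
    }
    let hidden = items.len() - limit;
    (items[..limit].to_vec(), Some(format!("+{hidden} more")))
}
```

Returning the label separately from the visible items keeps the list itself untouched, so a template can decide how to present the expansion.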
+
+## API Response Pattern
+
+Detail and list endpoints return a wrapper with the primary data plus a shared index of related entities:
+
+```protobuf
+message GetJobRunResponse {
+  JobRunDetail data = 1;
+  RelatedEntities index = 2;
+}
+
+message ListJobRunsResponse {
+  repeated JobRunDetail data = 1;
+  RelatedEntities index = 2; // shared across all items
+}
+
+message RelatedEntities {
+  map<string, PartitionDetail> partitions = 1;
+  map<string, JobRunDetail> job_runs = 2;
+  map<string, WantDetail> wants = 3;
+}
+```
+
+**Why this pattern:**
+- **No recursion** - Detail types stay flat, don't embed each other
+- **Deduplication** - Each entity appears once in the index, even if referenced by multiple items in `data`
+- **O(1) lookup** - Templates access `index.partitions["data/beta"]` directly
+- **Composable** - Same pattern works for single-item and list endpoints
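
To make the deduplication and lookup points concrete, here is a small sketch of a list response sharing one index. The struct fields are simplified stand-ins for the generated protobuf types, and `list_job_runs` and its `lookup` resolver are hypothetical; only the `data` plus `index` shape comes from the messages above.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the generated types (field sets are assumptions).
#[derive(Clone, Debug, PartialEq)]
struct PartitionDetail {
    partition_ref: String,
    status: String,
}

#[derive(Clone, Debug)]
struct JobRunDetail {
    id: String,
    read_partition_refs: Vec<String>,
}

#[derive(Default, Debug)]
struct RelatedEntities {
    partitions: HashMap<String, PartitionDetail>,
}

struct ListJobRunsResponse {
    data: Vec<JobRunDetail>,
    index: RelatedEntities,
}

// Build a list response; `lookup` is a hypothetical resolver from ref to detail.
fn list_job_runs(
    runs: Vec<JobRunDetail>,
    lookup: impl Fn(&str) -> Option<PartitionDetail>,
) -> ListJobRunsResponse {
    let mut index = RelatedEntities::default();
    for run in &runs {
        for partition_ref in &run.read_partition_refs {
            // Insert each referenced partition once, however many runs mention it.
            if !index.partitions.contains_key(partition_ref) {
                if let Some(detail) = lookup(partition_ref) {
                    index.partitions.insert(partition_ref.clone(), detail);
                }
            }
        }
    }
    ListJobRunsResponse { data: runs, index }
}
```

Two job runs that read the same partition produce a single index entry, and a consumer can fetch it by key rather than walking nested detail objects.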

 ## Implementation Plan

-### Phase 1: Data Model
+### ✅ Phase 1: Data Model (Complete)

 **1.1 Extend JobRunSuccessEventV1**

 ```protobuf
 message JobRunSuccessEventV1 {
   string job_run_id = 1;
-  repeated ReadDeps read_deps = 2; // NEW: preserves impacted→read relationships
+  repeated ReadDeps read_deps = 2; // preserves impacted→read relationships
 }
 ```


@@ -133,63 +165,66 @@ UUIDs resolved by looking up canonical partitions when processing success event.
 **1.3 Add consumer index to BuildState**

 ```rust
-// input_partition_ref → list of (output_partition_ref, job_run_id)
-partition_consumers: BTreeMap<String, Vec<(String, String)>>
+// input_uuid → list of (output_uuid, job_run_id)
+partition_consumers: BTreeMap<Uuid, Vec<(Uuid, String)>>
 ```

-Populated from `read_deps` when processing JobRunSuccessEventV1.
+Populated from `read_deps` when processing JobRunSuccessEventV1. Uses UUIDs (not refs) to preserve historical lineage across partition rebuilds.

-### Phase 2: Extend Existing API Endpoints
+### ✅ Phase 2: API Response Pattern (Complete)

-**2.1 GET /api/wants/:id**
+**2.1 RelatedEntities wrapper**

-Add to response:
-- `job_runs`: All job runs servicing this want (with status, partitions built)
-- `derivative_wants`: Wants spawned by dep-miss from this want's jobs
+Added `RelatedEntities` message and `index` field to all Get*/List* responses.

-**2.2 GET /api/partitions/:ref**
+**2.2 HasRelatedIds trait**

-Add to response:
-- `built_by`: Job run that built this partition (with read_deps + resolved UUIDs)
-- `upstream`: Input partitions (refs + UUIDs) from builder's read_deps
-- `downstream`: Consumer partitions (refs + UUIDs) from consumer index
+Implemented trait for Want, JobRun, Partition that returns the IDs of related entities. Query layer uses this to build the index.

-**2.3 GET /api/job_runs/:id**
+**2.3 Query methods**

-Add to response:
-- `read_deps`: With resolved UUIDs for each partition
-- `wrote_partitions`: With UUIDs
-- `derivative_wants`: If DepMiss, the wants that were spawned
+Added `*_with_index()` methods that collect related IDs via the trait and resolve them to full entity details.
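
One way the trait and a `*_with_index()`-style query could fit together is sketched below. Only the trait name `HasRelatedIds` and the idea of collecting IDs and then resolving them come from the text; `RelatedIds`, the `Want` fields, `Store`, and `job_runs_index_for` are hypothetical, and job run details are reduced to a status string for brevity.

```rust
use std::collections::{BTreeSet, HashMap};

// IDs of entities a detail refers to; the field set is an assumption.
#[derive(Default)]
struct RelatedIds {
    partition_refs: BTreeSet<String>,
    job_run_ids: BTreeSet<String>,
    want_ids: BTreeSet<String>,
}

trait HasRelatedIds {
    fn related_ids(&self) -> RelatedIds;
}

// Simplified want; the real type carries more fields.
struct Want {
    id: String,
    job_run_ids: Vec<String>,
    derivative_want_ids: Vec<String>,
}

impl HasRelatedIds for Want {
    fn related_ids(&self) -> RelatedIds {
        RelatedIds {
            // Collecting into sets removes duplicate IDs up front.
            job_run_ids: self.job_run_ids.iter().cloned().collect(),
            want_ids: self.derivative_want_ids.iter().cloned().collect(),
            ..Default::default()
        }
    }
}

// Hypothetical store: job run id -> status (stand-in for a full detail).
struct Store {
    job_runs: HashMap<String, String>,
}

impl Store {
    // Resolve the job run IDs an entity refers to; unknown IDs are dropped.
    fn job_runs_index_for(&self, entity: &impl HasRelatedIds) -> HashMap<String, String> {
        entity
            .related_ids()
            .job_run_ids
            .into_iter()
            .filter_map(|id| self.job_runs.get(&id).map(|status| (id.clone(), status.clone())))
            .collect()
    }
}
```

Keeping ID collection behind a trait means the query layer needs one resolution routine per entity kind, regardless of which detail type asked for it.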

-### Phase 3: Frontend
+### ✅ Phase 3: Job Integration (Complete)

-**3.1 Want detail page**
+Jobs already emit `DATABUILD_DEP_READ_JSON` and the full pipeline is wired up:

-Add "Fulfillment" section:
-- List of job runs (retries collapsed, expandable)
-- Derivative wants as nested items
-- Partition UUIDs linked to partition detail
+1. **Job execution** (`job_run.rs`): `SubProcessBackend::poll` parses `DATABUILD_DEP_READ_JSON` lines from stdout and stores in `SubProcessCompleted.read_deps`
+2. **Event creation** (`job_run.rs`): `to_event()` creates `JobRunSuccessEventV1` with `read_deps`
+3. **Event handling** (`event_handlers.rs`): `handle_job_run_success()` resolves `read_partition_uuids` and `wrote_partition_uuids`, populates `partition_consumers` index
+4. **API serialization** (`job_run_state.rs`): `to_detail()` includes `read_deps`, `read_partition_uuids`, `wrote_partition_uuids` in `JobRunDetail`
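
The first step of the pipeline above, picking dependency lines out of a job's stdout, might be sketched as follows. The exact line format is not shown in this diff, so the `DATABUILD_DEP_READ_JSON:` prefix with a colon separator is an assumption, as is the helper name `extract_dep_read_payloads`; the payloads are returned as raw strings and left unparsed, since the JSON schema is not given here.

```rust
// Assumed marker format: `DATABUILD_DEP_READ_JSON:<json payload>`.
// The real separator and payload schema are not shown in this document.
const DEP_READ_PREFIX: &str = "DATABUILD_DEP_READ_JSON:";

// Collect the raw JSON payloads from a job's stdout, ignoring ordinary log lines.
fn extract_dep_read_payloads(stdout: &str) -> Vec<&str> {
    stdout
        .lines()
        // Keep only lines that start with the marker, dropping the marker itself.
        .filter_map(|line| line.trim().strip_prefix(DEP_READ_PREFIX))
        .map(str::trim)
        .filter(|payload| !payload.is_empty())
        .collect()
}
```

Running this on every completed job, not only on dep-miss, is what lets the success event carry `read_deps`.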

-**3.2 Partition detail page**
+### ✅ Phase 4: Frontend (Complete)

-Add "Lineage" section:
-- Upstream: builder job → input partitions (navigable)
-- Downstream: consumer jobs → output partitions (truncated at N)
+**4.1 JobRun detail page**

-**3.3 JobRun detail page**
+Added to `job_runs/detail.html`:
+- "Read Partitions" section showing partition refs with UUIDs (linked to partition detail)
+- "Wrote Partitions" section showing partition refs with UUIDs (linked to partition detail)
+- "Derivative Wants" section showing wants spawned by dep-miss (linked to want detail)

-Add:
-- "Read" section with partition refs + UUIDs
-- "Wrote" section with partition refs + UUIDs
-- "Derivative Wants" section (if DepMiss)
+Extended `JobRunDetailView` with:
+- `read_deps: Vec<ReadDepsView>` - impacted→read dependency relationships
+- `read_partitions: Vec<PartitionRefWithUuidView>` - input partitions with UUIDs
+- `wrote_partitions: Vec<PartitionRefWithUuidView>` - output partitions with UUIDs
+- `derivative_want_ids: Vec<String>` - derivative wants from dep-miss

-### Phase 4: Job Integration
+**4.2 Partition detail page**

-Extend `DATABUILD_DEP_READ_JSON` parsing to run on job success (not just dep-miss). Jobs already emit this; we just need to capture it.
+Added to `partitions/detail.html`:
+- "Lineage - Built By" section showing the builder job run (linked to job run detail for upstream lineage)
+- "Lineage - Downstream Consumers" section showing UUIDs of downstream partitions

-## Sequencing
+Extended `PartitionDetailView` with:
+- `built_by_job_run_id: Option<String>` - job run that built this partition
+- `downstream_partition_uuids: Vec<String>` - downstream consumers from index

-1. Proto + state changes
-2. Event handler updates
-3. API response extensions
-4. Frontend enhancements
+**4.3 Want detail page**
+
+Added to `wants/detail.html`:
+- "Fulfillment - Job Runs" section listing all job runs that serviced this want
+- "Fulfillment - Derivative Wants" section listing derivative wants spawned by dep-misses
+
+Extended `WantDetailView` with:
+- `job_run_ids: Vec<String>` - all job runs that serviced this want
+- `derivative_want_ids: Vec<String>` - derivative wants spawned