update partitions refactor plan
parent dfc1d19237, commit 7ccec59364
1 changed file with 139 additions and 10 deletions

@@ -208,6 +208,7 @@ Can answer:
- **UpForRetry**: Upstream dependencies satisfied, partition ready to retry building
- **Live**: Successfully built
- **Failed**: Hard failure (shouldn't retry)
- **UpstreamFailed**: Partition failed because upstream dependencies failed (terminal state)
- **Tainted**: Marked invalid by taint event

**Removed:** Missing state - partitions only exist when jobs start building them or are completed.

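A minimal sketch of this lifecycle as a Rust enum, combining the states listed above with the Building and UpstreamBuilding states referenced throughout the plan (the name and derives are assumptions, not code from the repo):

```rust
/// Lifecycle of a single partition instance (identified by UUID).
#[derive(Debug, Clone, PartialEq, Eq)]
enum PartitionState {
    Building,         // a job run is currently building this partition
    UpstreamBuilding, // job reported a dep miss; waiting on upstream partitions
    UpForRetry,       // upstream deps satisfied, ready to retry building
    Live,             // successfully built
    Failed,           // hard failure, shouldn't be retried
    UpstreamFailed,   // terminal: upstream dependencies failed
    Tainted,          // marked invalid by a taint event
}
```
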
@@ -215,6 +216,7 @@ Can answer:
Key transitions:

- Building → UpstreamBuilding (job reports dep miss)
- UpstreamBuilding → UpForRetry (all upstream deps satisfied)
- UpstreamBuilding → UpstreamFailed (upstream dependency hard failure)
- Building → Live (job succeeds)
- Building → Failed (job hard failure)
- UpForRetry → Building (new job queued for retry, creates fresh UUID)

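One way to make these transitions explicit is a validity check over state pairs; a hypothetical sketch using the PartitionState enum above, not an API the plan requires:

```rust
/// Returns true if `from → to` is one of the transitions listed above.
/// Anything else is treated as a bug (per the error-handling FAQ: panic
/// with context rather than fail silently).
fn is_valid_partition_transition(from: &PartitionState, to: &PartitionState) -> bool {
    matches!(
        (from, to),
        (PartitionState::Building, PartitionState::UpstreamBuilding)
            | (PartitionState::UpstreamBuilding, PartitionState::UpForRetry)
            | (PartitionState::UpstreamBuilding, PartitionState::UpstreamFailed)
            | (PartitionState::Building, PartitionState::Live)
            | (PartitionState::Building, PartitionState::Failed)
            | (PartitionState::UpForRetry, PartitionState::Building)
    )
}
```
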
@@ -309,6 +311,7 @@ Can answer:
- Check canonical partition states for all partition refs
- Transition based on observation (in priority order):
  - If ANY canonical partition is Failed → New → Failed (job can't be safely retried)
  - If ANY canonical partition is UpstreamFailed → New → UpstreamFailed (upstream deps failed)
  - If ALL canonical partitions exist AND are Live → New → Successful (already built!)
  - If ANY canonical partition is Building → New → Building (being built now)
  - If ANY canonical partition is UpstreamBuilding → New → UpstreamBuilding (waiting for deps)

@@ -316,7 +319,7 @@ Can answer:
  - Otherwise (partitions don't exist or other states) → New → Idle (need to schedule)
- For derivative wants, additional logic may transition to UpstreamBuilding

Key insight: Most wants will go New → Idle because partitions won't exist yet (only created when jobs start). Subsequent wants for already-building partitions go New → Building. Wants arriving during dep miss go New → UpstreamBuilding. Wants for partitions ready to retry go New → Idle. Wants for already-Live partitions go New → Successful. Wants for Failed or UpstreamFailed partitions go New → Failed/UpstreamFailed.

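A sketch of this sensing pass as a priority-ordered check, reusing the PartitionState enum sketched earlier; the WantState enum and the function name are assumptions for illustration, not the orchestrator's actual types:

```rust
/// Want states referenced in this plan.
#[derive(Debug, Clone, PartialEq, Eq)]
enum WantState {
    New,
    Idle,
    Building,
    UpstreamBuilding,
    Successful,
    Failed,
    UpstreamFailed,
}

/// Sense the initial state of a New want from the canonical partition states
/// of its partition refs, in the priority order described above.
/// `None` means no canonical partition exists for that ref yet.
fn sense_new_want(canonical_states: &[Option<PartitionState>]) -> WantState {
    let states: Vec<&PartitionState> = canonical_states.iter().flatten().collect();

    if states.iter().any(|s| **s == PartitionState::Failed) {
        WantState::Failed
    } else if states.iter().any(|s| **s == PartitionState::UpstreamFailed) {
        WantState::UpstreamFailed
    } else if !canonical_states.is_empty()
        && canonical_states
            .iter()
            .all(|s| matches!(s, Some(PartitionState::Live)))
    {
        WantState::Successful
    } else if states.iter().any(|s| **s == PartitionState::Building) {
        WantState::Building
    } else if states.iter().any(|s| **s == PartitionState::UpstreamBuilding) {
        WantState::UpstreamBuilding
    } else {
        // Partitions don't exist yet, or sit in other states (UpForRetry,
        // Tainted): the want still needs to be scheduled.
        WantState::Idle
    }
}
```
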
3. **Keep WantSchedulability building check**

@@ -350,6 +353,7 @@ Can answer:
New transitions needed:

- **New → Failed:** Any partition failed
- **New → UpstreamFailed:** Any partition upstream failed
- **New → Successful:** All partitions live
- **New → Idle:** Normal case, partitions don't exist
- **New → Building:** Partitions already building when want created

@@ -468,14 +472,22 @@ Complete flow when a job has dependency miss:
   - Wants for partition: Building → UpstreamBuilding
   - Wants track the derivative want IDs in their UpstreamBuildingState

4. **Upstream builds complete or fail:** (a code sketch of this step follows the list)
   - **Success case:** Derivative wants build upstream partitions → upstream partition becomes Live
     - **Lookup downstream_waiting:** Get `downstream_waiting[upstream_partition_ref]` → list of UUIDs waiting for this upstream
     - For each waiting partition UUID:
       - Get partition from `partitions_by_uuid[uuid]`
       - Check if ALL its MissingDeps are now satisfied (canonical partitions for all refs are Live)
       - If satisfied: transition partition UpstreamBuilding → UpForRetry
       - Remove uuid from `downstream_waiting` entries (cleanup)
   - **Failure case:** Upstream partition transitions to Failed (hard failure)
     - **Lookup downstream_waiting:** Get `downstream_waiting[failed_partition_ref]` → list of UUIDs waiting for this upstream
     - For each waiting partition UUID in UpstreamBuilding state:
       - Transition partition: UpstreamBuilding → UpstreamFailed
       - Transition associated wants: UpstreamBuilding → UpstreamFailed
       - Remove uuid from `downstream_waiting` entries (cleanup)
     - This propagates failure information down the dependency chain

5. **Want becomes schedulable:**
   - When partition transitions to UpForRetry, wants transition: UpstreamBuilding → Idle

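A simplified sketch of this completion/failure handling, assuming hypothetical shapes for `partitions_by_uuid` and `downstream_waiting` and reusing the PartitionState enum from earlier; the full MissingDeps check and the matching want transitions are omitted for brevity:

```rust
use std::collections::HashMap;
use uuid::Uuid;

/// Hypothetical slice of build state relevant to dep-miss handling.
struct DepMissIndex {
    /// Every partition instance, keyed by UUID.
    partitions_by_uuid: HashMap<Uuid, PartitionState>,
    /// For each upstream partition ref, the downstream partition UUIDs
    /// currently waiting on it (i.e. sitting in UpstreamBuilding).
    downstream_waiting: HashMap<String, Vec<Uuid>>,
}

impl DepMissIndex {
    /// Called when an upstream partition ref resolves: Live on success,
    /// Failed on hard failure. Transitions the waiting downstream partitions.
    fn on_upstream_resolved(&mut self, upstream_ref: &str, upstream_state: &PartitionState) {
        // O(1) lookup of affected downstream partitions; removing the entry
        // doubles as the cleanup step described above.
        let waiting = self.downstream_waiting.remove(upstream_ref).unwrap_or_default();
        for uuid in waiting {
            let state = self
                .partitions_by_uuid
                .get_mut(&uuid)
                .unwrap_or_else(|| panic!("partitions_by_uuid missing {uuid}"));
            if *state != PartitionState::UpstreamBuilding {
                continue; // only UpstreamBuilding partitions react here
            }
            match upstream_state {
                // Success: the real flow would also confirm that ALL of this
                // partition's MissingDeps are now Live before retrying.
                PartitionState::Live => *state = PartitionState::UpForRetry,
                // Hard failure: propagate down the dependency chain (terminal).
                PartitionState::Failed => *state = PartitionState::UpstreamFailed,
                _ => {}
            }
        }
    }
}
```
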
@@ -493,7 +505,8 @@ Complete flow when a job has dependency miss:
- UpstreamBuilding also acts as lease (upstreams not ready, can't retry yet)
- UpForRetry releases lease (upstreams ready, safe to schedule)
- Failed releases lease but blocks new wants (hard failure, shouldn't retry)
- UpstreamFailed releases lease and blocks new wants (upstream deps failed, can't succeed)
- `downstream_waiting` index enables O(1) lookup of affected partitions when upstreams complete or fail

**Taint Handling:**

@@ -657,6 +670,122 @@ JobRunBufferEventV1 {
- Want completes → partition remains in history
- Partition expires → new UUID for rebuild, canonical updated

## Implementation FAQs

### Q: Do we need to maintain backwards compatibility with existing events?

**A:** No. We can assume no need to maintain backwards compatibility or retain data produced before this change. This simplifies the implementation significantly - no need to handle old event formats or generate UUIDs for replayed pre-UUID events.

### Q: How should we handle reference errors and index inconsistencies?

**A:** Panic, with contextual information, on any reference issue. This includes:
- Missing partition UUIDs in `partitions_by_uuid`
- Missing canonical pointers in `canonical_partitions`
- Inverted index inconsistencies (wants_for_partition, downstream_waiting)
- Invalid state transitions

Add assertions and validation throughout to catch these issues immediately rather than failing silently.

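As an illustration of this policy, a lookup helper might fail loudly with context instead of returning a default; names are hypothetical, and PartitionState is the enum sketched earlier:

```rust
use std::collections::HashMap;
use uuid::Uuid;

/// Look up a partition's state by UUID, panicking with context if the index
/// is inconsistent (treated as a bug, not a recoverable error).
fn partition_state_or_panic<'a>(
    partitions_by_uuid: &'a HashMap<Uuid, PartitionState>,
    uuid: &Uuid,
    context: &str,
) -> &'a PartitionState {
    partitions_by_uuid
        .get(uuid)
        .unwrap_or_else(|| panic!("partitions_by_uuid missing {uuid} (while {context})"))
}
```
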
### Q: What about cleanup of the `wants_for_partition` inverted index?

**A:** Don't remove wants from the index when they complete. This is acceptable for the initial implementation: even years of partition builds on a mature data platform would amount to fewer than a million entries (for example, a few hundred daily-partitioned datasets over a decade), which is manageable. We can add cleanup later if needed.

### Q: What happens when an upstream partition is Tainted instead of becoming Live?

**A:** Tainting an upstream means it is no longer Live, so the downstream job should dep miss. The system will operate correctly:
1. Downstream job discovers upstream is Tainted (not Live) → dep miss
2. Derivative want created for tainted upstream
3. Tainted upstream triggers rebuild (new UUID, replaces canonical)
4. Derivative want succeeds → downstream can resume

### Q: How should UUIDs be generated? Should the Orchestrator calculate them?

**A:** Use deterministic derivation instead of orchestrator generation:

```rust
use sha2::{Digest, Sha256}; // sha2 crate
use uuid::Uuid;             // uuid crate

fn derive_partition_uuid(job_run_id: &str, partition_ref: &str) -> Uuid {
    // Hash job_run_id + partition_ref bytes
    let mut hasher = Sha256::new();
    hasher.update(job_run_id.as_bytes());
    hasher.update(partition_ref.as_bytes());
    let hash = hasher.finalize();
    // Convert first 16 bytes to UUID
    Uuid::from_slice(&hash[0..16]).unwrap()
}
```

**Benefits:**
- No orchestrator UUID state/generation needed
- Deterministic replay (same job + ref = same UUID)
- Event schema stays simple (job_run_id + partition refs)
- Build state derives UUIDs in `handle_job_run_buffer()`
- No need for `PartitionInstanceRef` message in protobuf

### Q: How do we enforce safe canonical partition access?

**A:** Add and use helper methods in BuildState to enforce correct access patterns (sketched below):
- `get_canonical_partition(ref)` - lookup canonical partition for a ref
- `get_canonical_partition_uuid(ref)` - get UUID of canonical partition
- `get_partition_by_uuid(uuid)` - direct UUID lookup
- `get_wants_for_partition(ref)` - query inverted index

The existing `get_partition()` function should be updated to use canonical lookup. Code should always access "current state" via canonical_partitions, not by ref lookup in the deprecated partitions map.

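A sketch of what these helpers could look like; the field names, the `PartitionRef`/`WantId` aliases, and the `Partition` record are assumptions, and the real BuildState layout may differ:

```rust
use std::collections::HashMap;
use uuid::Uuid;

type PartitionRef = String;
type WantId = String;

/// A specific partition instance (sketch; reuses PartitionState from earlier).
struct Partition {
    uuid: Uuid,
    state: PartitionState,
}

struct BuildState {
    partitions_by_uuid: HashMap<Uuid, Partition>,
    canonical_partitions: HashMap<PartitionRef, Uuid>,
    wants_for_partition: HashMap<PartitionRef, Vec<WantId>>,
}

impl BuildState {
    /// Canonical partition for a ref, if one exists.
    fn get_canonical_partition(&self, r: &PartitionRef) -> Option<&Partition> {
        self.canonical_partitions
            .get(r)
            .map(|uuid| self.get_partition_by_uuid(uuid))
    }

    /// UUID of the canonical partition for a ref, if one exists.
    fn get_canonical_partition_uuid(&self, r: &PartitionRef) -> Option<Uuid> {
        self.canonical_partitions.get(r).copied()
    }

    /// Direct UUID lookup; panics with context on a dangling reference.
    fn get_partition_by_uuid(&self, uuid: &Uuid) -> &Partition {
        self.partitions_by_uuid
            .get(uuid)
            .unwrap_or_else(|| panic!("partitions_by_uuid missing {uuid}"))
    }

    /// Wants currently associated with a partition ref (inverted index).
    fn get_wants_for_partition(&self, r: &PartitionRef) -> &[WantId] {
        self.wants_for_partition
            .get(r)
            .map(|v| v.as_slice())
            .unwrap_or(&[])
    }
}
```
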
### Q: What is the want schedulability check logic?

**A:** A want is schedulable if:
- The canonical partition doesn't exist for any of its partition refs, OR
- The canonical partition exists and is in Tainted or UpForRetry state

In other words: `!exists || Tainted || UpForRetry`

Building and UpstreamBuilding partitions act as leases (not schedulable).

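A per-ref sketch of that predicate, reusing the PartitionState enum from earlier; how the result is aggregated across a want's partition refs is left to the existing WantSchedulability logic:

```rust
/// Schedulability for a single partition ref: `!exists || Tainted || UpForRetry`.
/// Building and UpstreamBuilding act as leases, and Live, Failed and
/// UpstreamFailed likewise leave the ref non-schedulable.
fn partition_ref_schedulable(canonical: Option<&PartitionState>) -> bool {
    matches!(
        canonical,
        None | Some(PartitionState::Tainted) | Some(PartitionState::UpForRetry)
    )
}
```
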
### Q: Should we implement phases strictly sequentially?

**A:** No. Proceed in the most efficient and productive manner possible. Phases can be combined or reordered as makes sense. For example, Phase 1 + Phase 2 can be done together since want state sensing depends on the new partition states.

### Q: Should we write tests incrementally or implement everything first?

**A:** Implement tests as we go. Write unit tests for each component as it's implemented, then integration tests for full scenarios.

### Q: Should wants reference partition UUIDs or partition refs?

**A:** Wants should NEVER reference partition instances (via UUID). Wants should ONLY reference canonical partitions via partition ref strings. This is already the case - wants include partition refs, which allows the orchestrator to resolve partition info for want state updates. The separation is:
- Wants → Partition Refs (canonical, user-facing)
- Jobs → Partition UUIDs (specific instances, historical)

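Roughly, that separation in the types; all names and fields here are illustrative, not the actual event or protobuf schema:

```rust
use uuid::Uuid;

/// Wants only carry canonical partition refs (user-facing strings).
struct Want {
    id: String,
    partition_refs: Vec<String>,
}

/// Build state ties a job run to the specific partition instances (UUIDs) it
/// is building; instance UUIDs are derived from (job_run_id, partition_ref)
/// as described in the UUID-generation FAQ above.
struct JobRun {
    job_run_id: String,
    building_partition_uuids: Vec<Uuid>,
}
```
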
### Q: Should we add UpstreamFailed state for partitions?

**A:** Yes. This provides symmetry with want semantics and clear terminal state propagation:

**Scenario:**
1. Partition A: Building → Failed (hard failure)
2. Partition B needs A, dep misses → UpstreamBuilding
3. Derivative want created for A, immediately fails (A is Failed)
4. Partition B: UpstreamBuilding → UpstreamFailed

**Benefits:**
- Clear signal that partition can never succeed (upstreams failed)
- Mirrors Want UpstreamFailed semantics (consistency)
- Useful for UIs and debugging
- Prevents indefinite waiting in UpstreamBuilding state

**Transition logic:**
- When partition transitions to Failed, lookup `downstream_waiting[failed_partition_ref]`
- For each downstream partition UUID in UpstreamBuilding state, transition to UpstreamFailed
- This propagates failure information down the dependency chain

**Add to Phase 1 partition states:**
- **UpstreamFailed**: Partition failed because upstream dependencies failed (terminal state)

**Add transition:**
- UpstreamBuilding → UpstreamFailed (upstream dependency hard failure)

### Q: Can a job build the same partition ref multiple times?

**A:** No, this is invalid. A job run cannot build the same partition multiple times. Each partition ref should appear at most once in a job's building_partitions list.

## Summary

Adding partition UUIDs solves fundamental architectural problems: