Wants System
Purpose: Enable declarative partition requirements with continuous reconciliation, SLA tracking, and efficient event-driven execution.
Overview
Wants declare intent for partitions to exist. The graph continuously reconciles these wants: it attempts execution immediately and retries once missing dependencies become available. Each job either succeeds or fails with a list of missing dependencies, which become new wants. This creates a self-discovering dependency chain without upfront planning.
Want Identity
Wants are idempotent through deterministic ID generation:
message PartitionWant {
    string want_id = 1;          // Hash(partition_ref + data_timestamp + source)
    string root_want_id = 2;     // Original user want
    string parent_want_id = 3;   // Want that triggered this
    PartitionRef partition_ref = 4;
    uint64 data_timestamp = 5;   // Business time (e.g., "2024-01-01" → midnight UTC)
    uint64 ttl_seconds = 6;      // From data_timestamp
    uint64 sla_seconds = 7;      // From data_timestamp
    WantSource source = 8;
}
Multiple identical want requests produce the same want_id, preventing duplication.
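As a rough illustration of the ID scheme in the schema comment, the sketch below hashes the three fields into a stable string. The choice of SHA-256, the string form of partition_ref and source, and the length-prefixed encoding are all assumptions made for this example; the source only says the ID is a hash of those fields.

```python
import hashlib


def want_id(partition_ref: str, data_timestamp: int, source: str) -> str:
    """Derive a deterministic want ID from partition_ref, data_timestamp and source."""
    parts = [partition_ref, str(data_timestamp), source]
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    payload = "".join(f"{len(p)}:{p}" for p in parts).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# 1704067200 is 2024-01-01 00:00:00 UTC; the partition names are hypothetical.
first = want_id("sales/date=2024-01-01", 1704067200, "user")
second = want_id("sales/date=2024-01-01", 1704067200, "user")
other = want_id("sales/date=2024-01-02", 1704153600, "user")
```

Here `first` and `second` are equal, so registering the same want twice is a no-op, while `other` differs because its partition and timestamp differ.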
Execution Flow
- Want Registration: User/trigger creates wants with deterministic IDs
- Immediate Dispatch: Graph attempts execution without checking dependencies
- Runtime Discovery: Jobs fail with MissingDependenciesError(partitions)
- Want Propagation: Graph creates upstream wants from missing dependencies
- Event-Driven Retry: When partitions become available (via BEL events), graph retries dependent wants
No polling is required: partition availability events directly trigger reconciliation.
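The dispatch and propagation steps above might look roughly like the following. This is a minimal sketch, not the actual implementation: the `Want` dataclass, the `dispatch` helper, the `failing_job` function, and the `want:` placeholder IDs are hypothetical names introduced here. Only MissingDependenciesError and the root/parent want fields come from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class MissingDependenciesError(Exception):
    """Raised by a job when upstream partitions it needs do not exist yet."""

    def __init__(self, partitions: List[str]):
        super().__init__(f"missing partitions: {partitions}")
        self.partitions = list(partitions)


@dataclass(frozen=True)
class Want:
    want_id: str
    partition_ref: str
    root_want_id: str                     # original user want
    parent_want_id: Optional[str] = None  # want that triggered this one


def dispatch(want: Want, run_job: Callable[[str], None]) -> List[Want]:
    """Run the job without a dependency pre-check; return upstream wants on failure."""
    try:
        run_job(want.partition_ref)
    except MissingDependenciesError as err:
        return [
            Want(
                want_id=f"want:{ref}",    # placeholder ID; a real ID would be a hash
                partition_ref=ref,
                root_want_id=want.root_want_id,
                parent_want_id=want.want_id,
            )
            for ref in err.partitions
        ]
    return []


def failing_job(partition_ref: str) -> None:
    # Hypothetical job: it always needs a "clean/..." partition that is absent.
    raise MissingDependenciesError(["clean/2024-01-01"])


root = Want("want:report/2024-01-01", "report/2024-01-01", "want:report/2024-01-01")
upstream = dispatch(root, failing_job)
```

Each upstream want keeps the root want's ID and records the failing want as its parent, so a chain of discovered dependencies can be traced back to the original request.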
Reconciliation Loop
The graph monitors two event streams:
- New wants: Trigger immediate execution attempts
- Partition completions: Trigger retries of wants that previously failed on missing dependencies
This creates an event-driven cascade where upstream completion immediately unlocks downstream work.
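One way to picture the two event streams is a single in-memory queue carrying "want" and "built" events. The sketch below is illustrative only: the `Reconciler` class, its `waiting` index, and the `run_job(ref, built)` signature are assumptions of this example, and partitions are identified by plain strings rather than PartitionRef messages.

```python
from collections import defaultdict, deque


class MissingDependenciesError(Exception):
    def __init__(self, partitions):
        super().__init__(f"missing partitions: {partitions}")
        self.partitions = list(partitions)


class Reconciler:
    def __init__(self, run_job):
        self.run_job = run_job           # hypothetical: builds a partition or raises
        self.built = set()               # partitions known to exist
        self.wants = set()               # registered wants (deduplicated)
        self.waiting = defaultdict(set)  # missing partition -> wants blocked on it
        self.events = deque()            # pending ("want" | "built", ref) events

    def submit(self, ref):
        if ref not in self.wants:        # identical wants are registered once
            self.wants.add(ref)
            self.events.append(("want", ref))

    def run(self):
        while self.events:
            kind, ref = self.events.popleft()
            if kind == "want":
                self._attempt(ref)
            else:  # "built": retry every want that was blocked on this partition
                for blocked in sorted(self.waiting.pop(ref, ())):
                    self._attempt(blocked)

    def _attempt(self, ref):
        if ref in self.built:
            return
        try:
            self.run_job(ref, self.built)
        except MissingDependenciesError as err:
            for dep in err.partitions:
                self.waiting[dep].add(ref)
                self.submit(dep)         # missing dependency becomes a new want
            return
        self.built.add(ref)
        self.events.append(("built", ref))


# Hypothetical three-stage pipeline: report needs clean, clean needs raw.
deps = {"report": ["clean"], "clean": ["raw"], "raw": []}
order = []


def run_job(ref, built):
    missing = [d for d in deps[ref] if d not in built]
    if missing:
        raise MissingDependenciesError(missing)
    order.append(ref)


graph = Reconciler(run_job)
graph.submit("report")
graph.run()
```

Submitting only "report" ends with `order` equal to `["raw", "clean", "report"]`: each completion event retries the want that was blocked on it, and nothing is polled.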
SLA Management
SLAs and TTLs anchor to data_timestamp, not creation time:
- TTL: "Build January 1st data within 30 days of January 1st"
- SLA: "January 1st data should be ready by 9am January 2nd"
This makes wants truly idempotent: the same logical want always has identical constraints regardless of when it is created.
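The two examples above can be worked through numerically. This sketch assumes data_timestamp is expressed in Unix epoch seconds and that "9am" means 09:00 UTC; neither is stated in the source, and the `deadlines` helper is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def deadlines(data_timestamp: int, ttl_seconds: int, sla_seconds: int):
    """Return (sla_deadline, ttl_expiry) in UTC, both measured from data_timestamp."""
    base = datetime.fromtimestamp(data_timestamp, tz=timezone.utc)
    return base + timedelta(seconds=sla_seconds), base + timedelta(seconds=ttl_seconds)


# Business time: January 1st, 2024 at midnight UTC.
jan_1 = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp())

sla_deadline, ttl_expiry = deadlines(
    jan_1,
    ttl_seconds=30 * 86400,  # "within 30 days of January 1st"
    sla_seconds=33 * 3600,   # 9am on January 2nd is 33 hours after midnight January 1st
)
```

Because neither value depends on the wall clock at creation time, recreating the want a week later yields the same SLA deadline (2024-01-02 09:00 UTC) and the same TTL expiry (2024-01-31 00:00 UTC).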
Benefits
- No planning overhead: Jobs discover dependencies at runtime
- Natural batching: Graph can group wants per job preferences
- Continuous progress: Partial availability enables marginal execution
- Simple deployment: No in-flight state beyond wants themselves