Wants System
Purpose: Enable declarative partition requirements with continuous reconciliation, SLA tracking, and efficient event-driven execution.
Overview
Wants declare intent to have partitions exist. The graph continuously reconciles these wants by attempting execution when dependencies are satisfied. Jobs either succeed or fail with missing dependencies, which become new wants. This creates a self-discovering dependency chain without upfront planning.
Want Identity
Wants use UUID-based identity:
message WantCreateEventV1 {
  string want_id = 1;                    // UUID generated at creation time
  repeated PartitionRef partitions = 2;  // Partitions this want requests
  uint64 data_timestamp = 3;             // Business time (e.g., "2024-01-01" → midnight UTC)
  uint64 ttl_seconds = 4;                // From data_timestamp
  uint64 sla_seconds = 5;                // From data_timestamp
  EventSource source = 6;                // Origin: job, API, CLI, web app...
  optional string comment = 7;
}
Because want IDs are freshly generated UUIDs, duplicate prevention is handled at the scheduling layer: the orchestrator checks canonical partition state before scheduling jobs, so multiple wants for the same partition simply observe the same build progress rather than triggering redundant work.
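A minimal sketch of that check, assuming a dict-backed state store and a dispatch_job hook (both hypothetical stand-ins for the orchestrator's real interfaces):

from enum import Enum

class PartitionState(Enum):
    MISSING = "missing"     # no build recorded
    BUILDING = "building"   # a job is already producing it
    COMPLETE = "complete"   # partition exists

def schedule(want_partitions, state_store, dispatch_job):
    # Consult canonical partition state before launching anything; wants
    # whose partitions are already BUILDING/COMPLETE just observe progress.
    for ref in want_partitions:
        if state_store.get(ref, PartitionState.MISSING) is not PartitionState.MISSING:
            continue  # no redundant work for duplicate wants
        state_store[ref] = PartitionState.BUILDING
        dispatch_job(ref)

# Two wants for the same partition launch exactly one job.
store, launched = {}, []
schedule(["sales/2024-01-01"], store, launched.append)
schedule(["sales/2024-01-01"], store, launched.append)
assert launched == ["sales/2024-01-01"]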
Execution Flow
- Want Registration: User/trigger creates wants (UUIDs assigned at creation)
- Immediate Dispatch: Graph attempts execution without checking dependencies
- Runtime Discovery: Jobs fail with MissingDependenciesError(partitions)
- Want Propagation: Graph creates upstream wants from missing dependencies
- Event-Driven Retry: When partitions become available (via BEL events), graph retries dependent wants
No polling is required: partition availability events directly trigger reconciliation.
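A sketch of a single execution attempt under this flow; MissingDependenciesError is named in the steps above, while run_job, create_want, and park are hypothetical hooks into the graph:

class MissingDependenciesError(Exception):
    # Raised by a job at runtime when its input partitions don't exist yet.
    def __init__(self, partitions):
        super().__init__(f"missing dependencies: {partitions}")
        self.partitions = partitions

def attempt(want, run_job, create_want, park):
    try:
        run_job(want)  # optimistic: dispatch without any upfront dependency check
    except MissingDependenciesError as err:
        for ref in err.partitions:
            create_want(ref)        # missing dependency becomes a new upstream want
        park(want, err.partitions)  # re-attempted when those partitions land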
Reconciliation Loop
The graph monitors two event streams:
- New wants: Trigger immediate execution attempts
- Partition completions: Trigger retry of wants previously failed on missing dependencies
This creates an event-driven cascade where upstream completion immediately unlocks downstream work.
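Wired together, the loop might look like the sketch below, with an in-memory index of parked wants; the ("want", ...) / ("partition", ...) event shapes are assumptions about the two streams, not the BEL wire format:

from collections import defaultdict

def reconcile(events, attempt):
    # events yields ("want", want) for new wants and ("partition", ref)
    # for partition-completion (BEL) events; attempt(want, park) tries to
    # run a want and parks it on the partitions it is still missing.
    blocked = defaultdict(list)  # partition ref -> wants waiting on it

    def park(want, refs):
        for ref in refs:
            blocked[ref].append(want)

    for kind, payload in events:
        if kind == "want":
            attempt(payload, park)                 # immediate execution attempt
        elif kind == "partition":
            for want in blocked.pop(payload, []):  # completion cascades downstream
                attempt(want, park)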
SLA Management
SLAs and TTLs anchor to data_timestamp, not creation time:
- TTL: "Build January 1st data within 30 days of January 1st"
- SLA: "January 1st data should be ready by 9am January 2nd"
This makes wants truly idempotent: the same logical want always carries identical constraints regardless of when it is created.
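Since both constraints are plain offsets from data_timestamp, the deadline arithmetic is trivial; a sketch (field names follow WantCreateEventV1, the deadlines helper itself is hypothetical):

from datetime import datetime, timedelta, timezone

def deadlines(data_timestamp, ttl_seconds, sla_seconds):
    # Both deadlines anchor to business time, never to creation time.
    anchor = datetime.fromtimestamp(data_timestamp, tz=timezone.utc)
    return {
        "due_at": anchor + timedelta(seconds=sla_seconds),      # SLA
        "expires_at": anchor + timedelta(seconds=ttl_seconds),  # TTL
    }

# "2024-01-01" data: due 9am Jan 2 (33h), expired 30 days after Jan 1,
# regardless of when the want was actually created.
jan1 = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp())
d = deadlines(jan1, ttl_seconds=30 * 86400, sla_seconds=33 * 3600)
assert d["due_at"] == datetime(2024, 1, 2, 9, tzinfo=timezone.utc)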
Benefits
- No planning overhead: Jobs discover dependencies at runtime
- Natural batching: Graph can group wants per job preferences
- Continuous progress: Partial dependency availability still enables incremental execution
- Simple deployment: No in-flight state beyond wants themselves