Wants System

Purpose: Enable declarative partition requirements with continuous reconciliation, SLA tracking, and efficient event-driven execution.

Overview

Wants declare intent to have partitions exist. The graph continuously reconciles these wants by attempting execution when dependencies are satisfied. Jobs either succeed or fail with missing dependencies, which become new wants. This creates a self-discovering dependency chain without upfront planning.

Want Identity

Wants use UUID-based identity:

message WantCreateEventV1 {
  string want_id = 1;              // UUID generated at creation time
  repeated PartitionRef partitions = 2;  // Partitions this want requests
  uint64 data_timestamp = 3;       // Business time (e.g., "2024-01-01" → midnight UTC)
  uint64 ttl_seconds = 4;          // Time-to-live, measured from data_timestamp
  uint64 sla_seconds = 5;          // SLA deadline offset, measured from data_timestamp
  EventSource source = 6;          // Origin: job, API, CLI, web app...
  optional string comment = 7;
}

Because each want gets a fresh UUID, two wants for the same partition are distinct records. Duplicate work is prevented at the scheduling layer instead: the orchestrator checks canonical partition state before scheduling jobs, so multiple wants for the same partition observe the same build progress rather than triggering redundant work.
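
The scheduling-layer check can be illustrated with a minimal Python sketch. The names here (PartitionState, Want, Orchestrator, submit) are hypothetical and not taken from the actual implementation; the sketch only assumes that the orchestrator keeps one canonical state per partition and consults it before scheduling.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class PartitionState(Enum):
    # Hypothetical canonical states; the real system may track more.
    MISSING = "missing"
    BUILDING = "building"
    LIVE = "live"


@dataclass
class Want:
    partitions: list[str]
    # Every want gets its own UUID, even if it requests the same partitions.
    want_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class Orchestrator:
    def __init__(self) -> None:
        self.partition_state: dict[str, PartitionState] = {}
        self.scheduled: list[str] = []  # partitions a job was scheduled for

    def submit(self, want: Want) -> None:
        for ref in want.partitions:
            state = self.partition_state.get(ref, PartitionState.MISSING)
            if state is PartitionState.MISSING:
                # First want for this partition: schedule exactly one build.
                self.partition_state[ref] = PartitionState.BUILDING
                self.scheduled.append(ref)
            # Otherwise the partition is already building or live, and this
            # want simply observes the existing progress.


orch = Orchestrator()
a = Want(["sales/2024-01-01"])
b = Want(["sales/2024-01-01"])
orch.submit(a)
orch.submit(b)
```

Here a and b have different want_id values, yet only one build is scheduled for the shared partition.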

Execution Flow

Want Registration: User/trigger creates wants (UUIDs assigned at creation)
Immediate Dispatch: Graph attempts execution without checking dependencies
Runtime Discovery: Jobs fail with MissingDependenciesError(partitions)
Want Propagation: Graph creates upstream wants from missing dependencies
Event-Driven Retry: When partitions become available (via BEL events), graph retries dependent wants

No polling is required: partition availability events directly trigger reconciliation.
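
The dispatch, discovery, and propagation steps above can be sketched as follows. Only MissingDependenciesError(partitions) comes from this document; the DEPS map, run_job, and attempt are hypothetical stand-ins, and the retry is triggered by hand here rather than by events.

```python
class MissingDependenciesError(Exception):
    """Raised by a job when upstream partitions are not yet available."""

    def __init__(self, partitions):
        super().__init__(f"missing: {partitions}")
        self.partitions = list(partitions)


# Hypothetical dependency map: partition -> upstream partitions it reads.
DEPS = {
    "report/2024-01-01": ["sales/2024-01-01"],
    "sales/2024-01-01": [],
}


def run_job(partition, live):
    """Stand-in for a job: fails if any upstream partition is not live."""
    missing = [d for d in DEPS.get(partition, []) if d not in live]
    if missing:
        raise MissingDependenciesError(missing)
    live.add(partition)


def attempt(partition, live, wants):
    """Dispatch immediately; on failure, register upstream wants."""
    try:
        run_job(partition, live)
        return True
    except MissingDependenciesError as err:
        for upstream in err.partitions:
            if upstream not in wants:
                wants.append(upstream)
        return False


live = set()
wants = ["report/2024-01-01"]
first = attempt("report/2024-01-01", live, wants)   # fails, discovers sales
second = attempt("sales/2024-01-01", live, wants)   # upstream builds
third = attempt("report/2024-01-01", live, wants)   # retry now succeeds
```

The dependency on sales/2024-01-01 is never declared up front; it enters the want list only because the first attempt reported it as missing.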

Reconciliation Loop

The graph monitors two event streams:

New wants: Trigger immediate execution attempts
Partition completions: Trigger a retry of wants that previously failed on missing dependencies

This creates an event-driven cascade where upstream completion immediately unlocks downstream work.
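
A minimal single-process sketch of this loop is shown below. It assumes an in-memory queue in place of the real event streams and a hypothetical blocked_on index from each missing partition to the partitions waiting on it; the Reconciler class and its method names are illustrative only.

```python
from collections import defaultdict, deque


class Reconciler:
    def __init__(self, deps):
        self.deps = deps                    # hypothetical: partition -> upstreams
        self.live = set()                   # partitions that have been built
        self.blocked_on = defaultdict(set)  # upstream -> partitions waiting on it
        self.events = deque()               # stand-in for both event streams
        self.attempts = []                  # execution attempts, in order

    def want(self, partition):
        self.events.append(("want", partition))

    def _attempt(self, partition):
        if partition in self.live:
            return  # already built; nothing to do
        self.attempts.append(partition)
        missing = [d for d in self.deps.get(partition, []) if d not in self.live]
        if missing:
            for upstream in missing:
                # Remember who is waiting, then want the missing upstream.
                self.blocked_on[upstream].add(partition)
                self.events.append(("want", upstream))
        else:
            self.live.add(partition)
            self.events.append(("partition_available", partition))

    def run(self):
        while self.events:
            kind, partition = self.events.popleft()
            if kind == "want":
                self._attempt(partition)
            else:
                # A completion retries only the wants blocked on it.
                for waiter in sorted(self.blocked_on.pop(partition, ())):
                    self._attempt(waiter)


r = Reconciler({"c": ["b"], "b": ["a"], "a": []})
r.want("c")
r.run()
```

For the chain c → b → a, the recorded attempts are c, b, a, b, c: each failed attempt is retried exactly once, at the moment its upstream completes, with no polling in between.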

SLA Management

SLAs and TTLs anchor to data_timestamp, not creation time:

TTL: "Build January 1st data within 30 days of January 1st"
SLA: "January 1st data should be ready by 9am January 2nd"

This makes a want's constraints independent of when it is created: the same logical want (same partitions and data_timestamp) always carries identical deadlines, so re-creating it is effectively idempotent.
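
As a worked example, the two deadlines can be derived from the fields of WantCreateEventV1. This sketch assumes data_timestamp is expressed in seconds since the Unix epoch, which the message definition does not state; the deadlines helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def deadlines(data_timestamp: int, ttl_seconds: int, sla_seconds: int):
    """Return (sla_deadline, ttl_expiry), both anchored to data_timestamp.

    Assumes data_timestamp is seconds since the Unix epoch (an assumption).
    """
    anchor = datetime.fromtimestamp(data_timestamp, tz=timezone.utc)
    sla_deadline = anchor + timedelta(seconds=sla_seconds)
    ttl_expiry = anchor + timedelta(seconds=ttl_seconds)
    return sla_deadline, ttl_expiry


# January 1st data: ready by 9am January 2nd, built within 30 days.
jan1 = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp())
sla_deadline, ttl_expiry = deadlines(
    jan1,
    ttl_seconds=30 * 86400,  # 30 days
    sla_seconds=33 * 3600,   # 24h to reach January 2nd, plus 9h
)
```

Creation time never enters the calculation, so a want created a week late yields the same sla_deadline and ttl_expiry as one created on time.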

Benefits

No planning overhead: Jobs discover dependencies at runtime
Natural batching: Graph can group wants per job preferences
Continuous progress: Partial availability enables marginal execution
Simple deployment: No in-flight state beyond wants themselves
