databuild/plans/20-wants-initial.md

Wants System Implementation

Overview

This plan implements the wants system described in design/wants.md, moving DataBuild from direct build requests to a declarative, want-based model with cross-graph coordination and SLA tracking. It builds on the 3-tier build event log (BEL) architecture and the client-server CLI established in the previous phases.

Prerequisites

This plan assumes completion of:

  • Phase 18: 3-tier BEL architecture with storage/query/client layers
  • Phase 19: Client-server CLI architecture with service delegation

Implementation Phases

Phase 1: Extend BEL Storage for Wants

  1. Add PartitionWantEvent to databuild.proto

    • Want event schema as defined in design/wants.md
    • Want source tracking (CLI, dashboard, scheduled, API)
    • TTL and SLA timestamp fields
    • External dependency specifications
  2. Extend BELStorage Interface

    • Add append_want() method for want events
    • Extend EventFilter to support want filtering
    • Add want-specific query capabilities to storage layer
  3. Implement in SQLite Storage Backend

    • Add wants table with appropriate indexes
    • Implement want filtering in list_events()
    • Schema migration logic for existing databases
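
The SQLite step above could be sketched as follows. This is a minimal illustration, not the schema from design/wants.md: the table layout, the column names, and the helper names `migrate`, `append_want`, and `list_active_wants` are all assumptions made for the example.

```python
import sqlite3
import time
import uuid

# Hypothetical schema: column names are illustrative, not taken from design/wants.md.
MIGRATION = """
CREATE TABLE IF NOT EXISTS wants (
    want_id       TEXT PRIMARY KEY,
    partition_ref TEXT NOT NULL,
    source        TEXT NOT NULL,      -- e.g. cli, dashboard, scheduled, api
    created_at    INTEGER NOT NULL,   -- unix seconds
    sla_at        INTEGER,            -- when the partition is expected
    ttl_at        INTEGER             -- when the want stops being actionable
);
CREATE INDEX IF NOT EXISTS idx_wants_partition ON wants(partition_ref);
CREATE INDEX IF NOT EXISTS idx_wants_ttl ON wants(ttl_at);
"""


def migrate(conn):
    """Safe to run against an existing database: only creates what is missing."""
    conn.executescript(MIGRATION)


def append_want(conn, partition_ref, source, sla_at=None, ttl_at=None, now=None):
    """Insert one want row and return its generated id."""
    want_id = str(uuid.uuid4())
    created = int(time.time()) if now is None else now
    conn.execute(
        "INSERT INTO wants VALUES (?, ?, ?, ?, ?, ?)",
        (want_id, partition_ref, source, created, sla_at, ttl_at),
    )
    conn.commit()
    return want_id


def list_active_wants(conn, now, source=None):
    """Return partition refs of wants whose TTL has not passed."""
    sql = "SELECT partition_ref FROM wants WHERE (ttl_at IS NULL OR ttl_at > ?)"
    params = [now]
    if source is not None:
        sql += " AND source = ?"
        params.append(source)
    sql += " ORDER BY created_at, partition_ref"
    return [row[0] for row in conn.execute(sql, params)]
```

Using `CREATE ... IF NOT EXISTS` keeps the migration idempotent, so it can run on every service start without checking whether the database already has a wants table.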

Phase 2: Basic Want API in Service

  1. Implement Want Management in Service

    • Service methods for creating and querying wants
    • Want lifecycle management (creation, expiration, satisfaction)
    • Integration with existing service auto-management
  2. Add Want HTTP Endpoints

    • POST /api/v1/wants - Create new want
    • GET /api/v1/wants - List active wants with filtering
    • GET /api/v1/wants/{id} - Get want details
    • DELETE /api/v1/wants/{id} - Cancel want
  3. CLI Want Commands

    • ./bazel-bin/my_graph.build want create <partition-ref> with SLA/TTL options
    • ./bazel-bin/my_graph.build want list with filtering options
    • ./bazel-bin/my_graph.build want status <partition-ref> to show the status of wants for a partition
    • Modify build commands to create wants via service
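
The four endpoints listed above map onto a small set of service operations. The sketch below uses an in-memory stand-in to show that mapping; the `WantService` class, its `handle` method, the status codes, and the payload fields are assumptions for illustration and do not describe the actual service.

```python
import itertools


class WantService:
    """In-memory stand-in for the service layer; not DataBuild's actual API."""

    def __init__(self):
        self._wants = {}
        self._ids = itertools.count(1)

    def handle(self, method, path, body=None):
        """Dispatch a request and return a (status_code, payload) pair."""
        parts = [p for p in path.split("/") if p]
        if parts[:3] != ["api", "v1", "wants"] or len(parts) > 4:
            return 404, {"error": "not found"}

        if len(parts) == 3:  # collection: /api/v1/wants
            if method == "POST":
                new_id = str(next(self._ids))
                want = {**(body or {}), "id": new_id, "status": "active"}
                self._wants[new_id] = want
                return 201, want
            if method == "GET":
                active = [w for w in self._wants.values() if w["status"] == "active"]
                return 200, active
            return 405, {"error": "method not allowed"}

        want = self._wants.get(parts[3])  # item: /api/v1/wants/{id}
        if want is None:
            return 404, {"error": "unknown want"}
        if method == "GET":
            return 200, want
        if method == "DELETE":
            want["status"] = "cancelled"  # keep the record so it stays queryable
            return 200, want
        return 405, {"error": "method not allowed"}
```

In this sketch a cancelled want is marked rather than deleted, so GET /api/v1/wants/{id} can still report on it after cancellation. Whether the real service keeps cancelled wants is a design decision this plan does not state.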

Phase 3: Want-Driven Build Evaluation

  1. Implement Build Evaluator in Service

    • Continuous evaluation loop that checks for buildable wants
    • External dependency satisfaction checking
    • TTL expiration filtering for active wants
  2. Replace Build Request Handling

    • Graph build commands create wants instead of direct build requests
    • Service background loop evaluates wants and triggers builds
    • Maintain atomic build semantics while satisfying multiple wants
  3. Build Coordination Logic

    • Aggregate wants that can be satisfied by the same build
    • Priority handling for urgent wants (short SLA)
    • Resource coordination across concurrent want evaluation
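
One pass of the evaluation loop described in this phase might look like the sketch below. The `Want` fields and the `plan_build` function are hypothetical, and the rule that an earlier SLA means higher priority is an assumption about how "urgent" would be defined.

```python
from dataclasses import dataclass, field


@dataclass
class Want:
    partition_ref: str
    sla_at: int                      # unix seconds; earlier means more urgent
    ttl_at: int                      # unix seconds; want is ignored after this
    external_deps: list = field(default_factory=list)


def plan_build(wants, available_partitions, now):
    """Return the partition refs for one build, most urgent first.

    A want is buildable when its TTL has not passed and every external
    dependency is already available. Wants for the same partition are
    merged so that a single build satisfies all of them.
    """
    earliest_sla = {}
    for want in wants:
        if want.ttl_at <= now:
            continue  # expired: filtered out of the active set
        if not all(dep in available_partitions for dep in want.external_deps):
            continue  # blocked on an upstream partition
        current = earliest_sla.get(want.partition_ref)
        if current is None or want.sla_at < current:
            earliest_sla[want.partition_ref] = want.sla_at
    # Sort by SLA, then by ref so the order is deterministic on ties.
    return sorted(earliest_sla, key=lambda ref: (earliest_sla[ref], ref))
```

The returned list would feed a single atomic build request, which keeps the all-or-nothing semantics while still satisfying several wants at once.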

Phase 4: Cross-Graph Coordination

  1. Implement GraphService API

    • HTTP API for cross-graph event streaming as defined in design/wants.md
    • Event filtering for efficient partition pattern subscriptions
    • Service-to-service communication for upstream dependencies
  2. Upstream Dependency Configuration

    • Service configuration for upstream DataBuild instances
    • Partition pattern subscriptions to upstream graphs
    • Automatic want evaluation when upstream partitions become available
  3. Cross-Graph Event Sync

    • Background sync process for upstream events
    • Triggering local build evaluation on upstream availability
    • Reliable HTTP-based coordination to avoid message loss
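
The sync process in this phase can be sketched as a cursor-based pull. Everything named here is an assumption for illustration: the event shape (`index`, `partition_ref`), glob-style partition patterns matched with Python's `fnmatchcase`, and the `filter_events` and `UpstreamSync` helpers. The `fetch` callable stands in for the HTTP call to the upstream GraphService.

```python
from fnmatch import fnmatchcase


def filter_events(events, patterns, since_index):
    """Upstream side: events after the cursor that match a subscribed pattern."""
    return [
        e for e in events
        if e["index"] > since_index
        and any(fnmatchcase(e["partition_ref"], p) for p in patterns)
    ]


class UpstreamSync:
    """Downstream side: polls one upstream graph and remembers its position."""

    def __init__(self, fetch, patterns):
        self.fetch = fetch          # callable(patterns, since_index) -> list of events
        self.patterns = patterns
        self.cursor = 0             # index of the last event processed
        self.available = set()      # upstream partitions known to exist

    def sync_once(self):
        """Pull new events; return refs that became available in this round."""
        new_refs = []
        for event in self.fetch(self.patterns, self.cursor):
            ref = event["partition_ref"]
            if ref not in self.available:
                self.available.add(ref)
                new_refs.append(ref)
            self.cursor = max(self.cursor, event["index"])
        return new_refs
```

Because the downstream side only advances its cursor after processing an event, a failed or interrupted fetch is retried from the same position on the next round. That is one way a pull-based HTTP design can avoid message loss without a broker. The refs returned by `sync_once` are what would trigger local want evaluation.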

Phase 5: SLA Monitoring and Dashboard Integration

  1. SLA Violation Tracking

    • External monitoring endpoints for SLA violations
    • Want timeline and status tracking
    • Integration with existing dashboard for want visualization
  2. Want Dashboard Features

    • Want creation and monitoring UI
    • Cross-graph dependency visualization
    • SLA violation dashboard and alerting
  3. Migration from Direct Builds

    • All build requests go through the want system
    • Remove direct build request pathways
    • Update documentation for new build model
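
An SLA violation endpoint needs some rule for deciding which wants are in violation. The sketch below shows one possible rule, assuming hypothetical want fields (`id`, `sla_at`, `satisfied_at`) and assuming that a want satisfied after its SLA time still counts as a violation. The actual definition belongs to design/wants.md.

```python
def sla_violations(wants, now):
    """Return violation records, most overdue first.

    An open want is measured against `now`; a satisfied want is measured
    against the time it was satisfied.
    """
    report = []
    for want in wants:
        done_at = want.get("satisfied_at")
        reference = now if done_at is None else done_at
        overdue = reference - want["sla_at"]
        if overdue > 0:
            report.append({
                "id": want["id"],
                "overdue_seconds": overdue,
                "open": done_at is None,   # True while the want is still unsatisfied
            })
    return sorted(report, key=lambda r: r["overdue_seconds"], reverse=True)
```

An external monitor could poll an endpoint backed by a function like this and alert on any record where `open` is true.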

Benefits of Want-Based Architecture

Unified Build Model

  • All builds (manual, scheduled, triggered) use the same want mechanism
  • Complete audit trail in the build event log
  • Consistent SLA tracking across all build types

Event-Driven Efficiency

  • Builds are triggered only when dependencies change
  • Cross-graph coordination via efficient event streaming
  • No polling for task readiness within builds

Atomic Build Semantics Preserved

  • Individual build requests remain all-or-nothing
  • Fast failure provides immediate feedback
  • Partial progress via multiple build requests over time

Flexible SLA Management

  • Separate business expectations (SLA) from operational limits (TTL)
  • External monitoring with clear blame assignment
  • Automatic cleanup of stale wants

Cross-Graph Scalability

  • Reliable HTTP-based coordination
  • Efficient filtering via partition patterns
  • Decentralized architecture with clear boundaries

Success Criteria

Phase 1: Storage Foundation

  • Want events can be stored and queried in BEL storage
  • EventFilter supports want-specific filtering
  • SQLite backend handles want operations efficiently

Phase 2: Basic Want API

  • Service can create and query wants via HTTP API
  • CLI want commands (create, list, status) work for want management
  • Build commands create wants instead of direct builds

Phase 3: Want-Driven Builds

  • Service background loop evaluates wants continuously
  • Build evaluation triggers on want creation and external events
  • TTL expiration and external dependency checking work correctly

Phase 4: Cross-Graph Coordination

  • GraphService API returns filtered events for cross-graph coordination
  • Upstream partition availability triggers downstream want evaluation
  • Service-to-service communication is reliable and efficient

Phase 5: Complete Migration

  • All builds go through the want system
  • Dashboard supports want creation and monitoring
  • SLA violation endpoints provide monitoring integration
  • Documentation reflects new want-based build model

Risk Mitigation

  1. Incremental Migration: Implement wants alongside existing build system initially
  2. Performance Validation: Ensure want evaluation doesn't introduce significant latency
  3. Backwards Compatibility: Maintain existing build semantics during transition
  4. Monitoring Integration: Provide clear observability into want lifecycle and performance