Client-Server CLI Architecture

Overview

This plan transforms DataBuild's CLI from a monolithic in-process execution model to a Bazel-style client-server architecture. The CLI becomes a thin client that delegates all operations to a persistent service process, enabling better resource management and build coordination.

Current State Analysis

The current CLI (databuild/cli/main.rs) does all of its work in-process. Each invocation:

  • Creates its own build event log (BEL) connections
  • Runs analysis and execution in-process
  • Spawns bazel processes directly
  • Does not coordinate with other concurrent CLI invocations

This creates several limitations:

  • No coordination between concurrent builds
  • Multiple BEL connections from concurrent CLI calls
  • Each CLI process spawns its own bazel execution
  • No shared execution environment for builds

Target Architecture

Bazel-Style Client-Server Model

CLI (Thin Client):

  • Auto-starts service if not running
  • Delegates all operations to service via HTTP
  • Streams progress back to user
  • Leaves idle shutdown to the service, which exits on its own after a timeout

Service (Persistent Process):

  • Maintains single BEL connection
  • Coordinates builds across multiple CLI calls
  • Manages bazel execution processes
  • Auto-shuts down after idle timeout
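
The idle-shutdown behavior can be sketched as a small tracker that the service consults periodically. This is a minimal sketch rather than DataBuild's implementation: the `IdleTracker` type and its method names are hypothetical, and it assumes the service counts both API calls and running builds as activity.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical idle tracker; the names are illustrative, not DataBuild's API.
pub struct IdleTracker {
    last_activity: Mutex<Instant>,
    active_builds: AtomicUsize,
    timeout: Duration,
}

impl IdleTracker {
    pub fn new(timeout: Duration) -> Self {
        Self {
            last_activity: Mutex::new(Instant::now()),
            active_builds: AtomicUsize::new(0),
            timeout,
        }
    }

    /// Record activity; intended to be called on every API request.
    pub fn touch(&self) {
        *self.last_activity.lock().unwrap() = Instant::now();
    }

    /// A running build keeps the service alive regardless of the timeout.
    pub fn build_started(&self) {
        self.active_builds.fetch_add(1, Ordering::SeqCst);
        self.touch();
    }

    pub fn build_finished(&self) {
        self.active_builds.fetch_sub(1, Ordering::SeqCst);
        self.touch();
    }

    /// True when no build is running and `timeout` has elapsed since the last activity.
    pub fn should_shut_down(&self, now: Instant) -> bool {
        if self.active_builds.load(Ordering::SeqCst) > 0 {
            return false;
        }
        let last = *self.last_activity.lock().unwrap();
        now.saturating_duration_since(last) >= self.timeout
    }
}
```

A background task could call `should_shut_down(Instant::now())` every few seconds and begin a graceful shutdown once it returns true. Passing `now` in explicitly keeps the check easy to test without sleeping.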

Implementation Plan

Phase 1: Service Foundation

  1. Extend Current Service for CLI Operations

    • Add new endpoints to handle CLI build requests
    • Move analysis and execution logic from CLI to service
    • Service maintains orchestrator state and coordinates builds
  2. Add CLI-Specific API Endpoints

    • /api/v1/cli/build - Handle build requests from CLI
    • /api/v1/cli/builds/{id}/progress - Stream build progress via Server-Sent Events
    • Request/response types for CLI build operations
    • Background vs foreground build support
  3. Add Service Auto-Management

    • Service tracks last activity timestamp
    • Configurable auto-shutdown timeout (default: 5 minutes)
    • Service monitors for idle state and gracefully shuts down
    • Activity tracking includes API calls and active builds
  4. Service Port Management

    • Service attempts to bind to preferred port (e.g., 8080)
    • If that port is unavailable, tries the next port in a configured range
    • Service writes actual port to lockfile/pidfile for CLI discovery
    • CLI reads port from lockfile to connect to running service
    • Cleanup lockfile on service shutdown
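
The port fallback and lockfile handshake described in item 4 might look roughly like the following. It is a sketch using only the standard library. The function names and the two-line lockfile layout (PID, then port) are assumptions made for illustration; the plan does not specify a lockfile format.

```rust
use std::fs;
use std::io;
use std::net::TcpListener;
use std::path::Path;

/// Try each port in `preferred..=last` on loopback and return the first listener that binds.
pub fn bind_in_range(preferred: u16, last: u16) -> io::Result<TcpListener> {
    let mut last_err = io::Error::new(io::ErrorKind::InvalidInput, "empty port range");
    for port in preferred..=last {
        match TcpListener::bind(("127.0.0.1", port)) {
            Ok(listener) => return Ok(listener),
            Err(e) => last_err = e, // port taken or otherwise unusable: try the next one
        }
    }
    Err(last_err)
}

/// Write "<pid>\n<port>\n" so a CLI can discover the running service.
/// The two-line layout is an assumption of this sketch.
pub fn write_lockfile(path: &Path, listener: &TcpListener) -> io::Result<u16> {
    let port = listener.local_addr()?.port();
    fs::write(path, format!("{}\n{}\n", std::process::id(), port))?;
    Ok(port)
}

/// Read the (pid, port) pair back on the CLI side.
pub fn read_lockfile(path: &Path) -> io::Result<(u32, u16)> {
    let text = fs::read_to_string(path)?;
    let mut lines = text.lines();
    let bad = || io::Error::new(io::ErrorKind::InvalidData, "malformed lockfile");
    let pid: u32 = lines.next().and_then(|s| s.trim().parse().ok()).ok_or_else(bad)?;
    let port: u16 = lines.next().and_then(|s| s.trim().parse().ok()).ok_or_else(bad)?;
    Ok((pid, port))
}
```

On shutdown the service would remove the file with `std::fs::remove_file`. A stale lockfile left behind by a crash is still possible, which is one reason the CLI should health-check the recorded port before trusting it.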

Phase 2: Thin CLI Implementation

  1. New CLI Main Function

    • Replace existing main with service delegation logic
    • Parse arguments and determine target service operation
    • Handle service connection and auto-start logic
    • Preserve existing CLI interface and help text
  2. Service Client Implementation

    • HTTP client for communicating with service
    • Auto-start service if not already running
    • Health check and connection retry logic
    • Progress streaming for real-time build feedback
  3. Build Command via Service

    • Parse build arguments and create service request
    • Submit build request to service endpoint
    • Stream progress updates for foreground builds
    • Return immediately for background builds with build ID
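
For foreground builds, the client has to decode the Server-Sent Events stream from the progress endpoint. The sketch below shows a line-oriented parser for the basic SSE framing (`event:` and `data:` fields, with a blank line ending each event). The `SseParser` and `SseEvent` names are hypothetical, and the event names the service would actually emit are not defined by this plan.

```rust
/// One decoded event from the progress stream.
#[derive(Debug, PartialEq)]
pub struct SseEvent {
    pub event: String,
    pub data: String,
}

/// Accumulates fields until a blank line completes an event.
#[derive(Default)]
pub struct SseParser {
    event: Option<String>,
    data: Vec<String>,
}

impl SseParser {
    /// Feed one line without its trailing newline; returns an event when one completes.
    pub fn push_line(&mut self, line: &str) -> Option<SseEvent> {
        let line = line.trim_end_matches('\r');
        if line.is_empty() {
            // A blank line dispatches the pending event, if any data was collected.
            if self.data.is_empty() {
                self.event = None;
                return None;
            }
            let event = self.event.take().unwrap_or_else(|| "message".to_string());
            let data = std::mem::take(&mut self.data).join("\n");
            return Some(SseEvent { event, data });
        }
        if line.starts_with(':') {
            return None; // comment line, commonly used as a keep-alive
        }
        // "field: value" with at most one leading space stripped from the value.
        let (field, value) = match line.split_once(':') {
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => self.event = Some(value.to_string()),
            "data" => self.data.push(value.to_string()),
            _ => {} // "id", "retry", and unknown fields are ignored in this sketch
        }
        None
    }
}
```

The CLI would read the HTTP response body line by line, pass each line to `push_line`, and render each returned event as a progress update.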

Phase 3: Repository Commands via Service

  1. Delegate Repository Commands to Service

    • Partition, build, job, and task commands go through service
    • Use existing service API endpoints where available
    • Maintain same output formats (table, JSON) as current CLI
    • Preserve all existing functionality and options
  2. Service Client Repository Methods

    • Client methods for each repository operation
    • Handle pagination, filtering, and formatting options
    • Error handling that maps HTTP status codes to clear CLI errors
    • URL encoding for partition references and other parameters
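
Partition references can contain characters such as `/` and `=` that are not safe inside a single URL path segment. A minimal percent-encoding helper might look like this. The `/api/v1/partitions/...` path in `partition_url` is a hypothetical route used only for illustration; the real endpoint layout is whatever the existing service API defines.

```rust
/// Percent-encode `input` for use as one URL path segment.
/// Only RFC 3986 "unreserved" characters pass through unchanged.
pub fn encode_path_segment(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // Everything else, including each byte of multi-byte UTF-8, becomes %XX.
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build a request URL for one partition (hypothetical route, see the note above).
pub fn partition_url(base: &str, partition_ref: &str) -> String {
    format!(
        "{}/api/v1/partitions/{}",
        base.trim_end_matches('/'),
        encode_path_segment(partition_ref)
    )
}
```

Encoding the whole reference as one segment keeps a `/` inside the reference from being read as a path separator by the server's router.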

Phase 4: Complete Migration

  1. Remove Old CLI Implementation

    • Delete existing databuild/cli/main.rs implementation
    • Remove in-process analysis and execution logic
    • Clean up CLI-specific dependencies that are no longer needed
    • Update build configuration to use new thin client only
  2. Service Integration Testing

    • End-to-end testing of CLI-to-service communication
    • Verify all existing CLI functionality works through service
    • Performance testing to ensure no regression
    • Error handling validation for various failure modes

Phase 5: Configuration and Polish

  1. Environment Variable Support

    • DATABUILD_SERVICE_URL for custom service locations
    • DATABUILD_SERVICE_TIMEOUT for auto-shutdown configuration
    • Existing BEL environment variables passed to service
    • Clear precedence rules for configuration sources
  2. Error Handling and User Experience

    • Service startup timeout and clear error messages
    • Connection failure handling with fallback suggestions
    • Health check logic to verify service readiness
    • Graceful handling of service unavailability
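
One possible precedence order for locating the service is sketched below: an explicit DATABUILD_SERVICE_URL wins, then a port discovered from the lockfile, and otherwise the CLI starts a service. This ordering is an assumption, since the plan only calls for "clear precedence rules". The sketch also assumes DATABUILD_SERVICE_TIMEOUT is given in whole seconds and falls back to the 5-minute default from Phase 1. The type and function names are hypothetical.

```rust
use std::time::Duration;

/// Where the CLI should look for the service. Variant names are illustrative.
#[derive(Debug, PartialEq)]
pub enum ServiceLocation {
    /// Taken from DATABUILD_SERVICE_URL.
    Explicit(String),
    /// Built from the port recorded in the lockfile.
    Discovered(String),
    /// Nothing configured or running: the CLI should auto-start a service.
    NeedsStart,
}

/// Assumed precedence: environment variable, then lockfile, then auto-start.
pub fn resolve_service(env_url: Option<&str>, lockfile_port: Option<u16>) -> ServiceLocation {
    if let Some(url) = env_url.map(str::trim).filter(|u| !u.is_empty()) {
        return ServiceLocation::Explicit(url.trim_end_matches('/').to_string());
    }
    match lockfile_port {
        Some(port) => ServiceLocation::Discovered(format!("http://127.0.0.1:{}", port)),
        None => ServiceLocation::NeedsStart,
    }
}

/// Parse DATABUILD_SERVICE_TIMEOUT, assumed here to be whole seconds.
pub fn parse_timeout(env_value: Option<&str>) -> Result<Duration, String> {
    match env_value.map(str::trim).filter(|v| !v.is_empty()) {
        None => Ok(Duration::from_secs(300)), // default: 5 minutes
        Some(raw) => raw.parse::<u64>().map(Duration::from_secs).map_err(|e| {
            format!("DATABUILD_SERVICE_TIMEOUT must be whole seconds, got {:?}: {}", raw, e)
        }),
    }
}
```

A caller could supply the environment value with `std::env::var("DATABUILD_SERVICE_URL").ok()` and pass `.as_deref()` of that to `resolve_service`. Returning an error for an unparseable timeout, rather than silently using the default, matches the plan's goal of clear error messages.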

Benefits of Client-Server Architecture

Build Coordination

  • Multiple CLI calls share same service instance
  • Coordination between concurrent builds
  • Single BEL connection eliminates connection conflicts

Resource Management

  • Auto-shutdown prevents resource leaks
  • Service manages persistent connections
  • Better isolation between CLI and build execution
  • Shared bazel execution environment

Improved User Experience

  • Background builds with --background flag
  • Real-time progress streaming
  • Consistent build execution environment

Simplified Architecture

  • Single execution path through service
  • Cleaner separation of concerns
  • Reduced code duplication

Future-Ready Foundation

  • Service architecture prepared for additional coordination features
  • HTTP API foundation for programmatic access
  • Clear separation of concerns between client and execution

Success Criteria

Phase 1-2: Service Foundation

  • Service can handle CLI build requests
  • Service auto-shutdown works correctly
  • Service port management and discovery works
  • New CLI can start and connect to service
  • Build requests execute with same functionality as current CLI

Phase 3-4: Complete Migration

  • All CLI commands work via service delegation
  • Repository commands (partitions, builds, etc.) work via HTTP API
  • Old CLI implementation completely removed
  • Error handling provides clear user feedback

Phase 5: Polish

  • Multiple concurrent CLI calls work correctly
  • Background builds work as expected
  • Performance meets or exceeds current CLI
  • Service management is reliable and transparent

Risk Mitigation

  1. Thorough Testing: Comprehensive testing before removing old CLI
  2. Feature Parity: Ensure all existing functionality works via service
  3. Performance Validation: Benchmark new implementation against current performance
  4. Simple Protocol: Use HTTP/JSON for service communication (not gRPC initially)
  5. Clear Error Messages: Service startup and connection failures should be obvious to users