# Client-Server CLI Architecture

## Overview

This plan transforms DataBuild's CLI from a monolithic in-process execution model to a Bazel-style client-server architecture. The CLI becomes a thin client that delegates all operations to a persistent service process, enabling better resource management and build coordination.

## Current State Analysis

The current CLI (`databuild/cli/main.rs`) directly:

- Creates event log connections
- Runs analysis and execution in-process
- Spawns bazel processes directly
- No coordination between concurrent CLI invocations

This creates several limitations:

- No coordination between concurrent builds
- Multiple BEL connections from concurrent CLI calls
- Each CLI process spawns separate bazel execution
- No shared execution environment for builds

## Target Architecture

### Bazel-Style Client-Server Model

**CLI (Thin Client)**:

- Auto-starts service if not running
- Delegates all operations to service via HTTP
- Streams progress back to user
- Auto-shuts down idle service

**Service (Persistent Process)**:

- Maintains single BEL connection
- Coordinates builds across multiple CLI calls
- Manages bazel execution processes
- Auto-shuts down after idle timeout

## Implementation Plan

### Phase 1: Service Foundation

1. **Extend Current Service for CLI Operations**
   - Add new endpoints to handle CLI build requests
   - Move analysis and execution logic from CLI to service
   - Service maintains orchestrator state and coordinates builds
   - Add real-time progress streaming for CLI consumption

2. **Add CLI-Specific API Endpoints**
   - `/api/v1/cli/build` - Handle build requests from CLI
   - `/api/v1/cli/builds/{id}/progress` - Stream build progress via Server-Sent Events
   - Request/response types for CLI build operations
   - Background vs foreground build support

3. **Add Service Auto-Management**
   - Service tracks last activity timestamp
   - Configurable auto-shutdown timeout (default: 5 minutes)
   - Service monitors for idle state and gracefully shuts down
   - Activity tracking includes API calls and active builds

4. **Service Port Management**
   - Service attempts to bind to preferred port (e.g., 8080)
   - If port unavailable, tries next available port in range
   - Service writes actual port to lockfile/pidfile for CLI discovery
   - CLI reads port from lockfile to connect to running service
   - Cleanup lockfile on service shutdown

### Phase 2: Thin CLI Implementation

1. **New CLI Main Function**
   - Replace existing main with service delegation logic
   - Parse arguments and determine target service operation
   - Handle service connection and auto-start logic
   - Preserve existing CLI interface and help text

2. **Service Client Implementation**
   - HTTP client for communicating with service
   - Auto-start service if not already running
   - Health check and connection retry logic
   - Progress streaming for real-time build feedback

3. **Build Command via Service**
   - Parse build arguments and create service request
   - Submit build request to service endpoint
   - Stream progress updates for foreground builds
   - Return immediately for background builds with build ID

### Phase 3: Repository Commands via Service

1. **Delegate Repository Commands to Service**
   - Partition, build, job, and task commands go through service
   - Use existing service API endpoints where available
   - Maintain same output formats (table, JSON) as current CLI
   - Preserve all existing functionality and options

2. **Service Client Repository Methods**
   - Client methods for each repository operation
   - Handle pagination, filtering, and formatting options
   - Error handling and appropriate HTTP status code handling
   - URL encoding for partition references and other parameters

### Phase 4: Complete Migration

1. **Remove Old CLI Implementation**
   - Delete existing `databuild/cli/main.rs` implementation
   - Remove in-process analysis and execution logic
   - Clean up CLI-specific dependencies that are no longer needed
   - Update build configuration to use new thin client only

2. **Service Integration Testing**
   - End-to-end testing of CLI-to-service communication
   - Verify all existing CLI functionality works through service
   - Performance testing to ensure no regression
   - Error handling validation for various failure modes

### Phase 5: Integration and Testing

1. **Environment Variable Support**
   - `DATABUILD_SERVICE_URL` for custom service locations
   - `DATABUILD_SERVICE_TIMEOUT` for auto-shutdown configuration
   - Existing BEL environment variables passed to service
   - Clear precedence rules for configuration sources

2. **Error Handling and User Experience**
   - Service startup timeout and clear error messages
   - Connection failure handling with fallback suggestions
   - Health check logic to verify service readiness
   - Graceful handling of service unavailability

## Benefits of Client-Server Architecture

### ✅ **Build Coordination**

- Multiple CLI calls share same service instance
- Coordination between concurrent builds
- Single BEL connection eliminates connection conflicts

### ✅ **Resource Management**

- Auto-shutdown prevents resource leaks
- Service manages persistent connections
- Better isolation between CLI and build execution
- Shared bazel execution environment

### ✅ **Improved User Experience**

- Background builds with `--background` flag
- Real-time progress streaming
- Consistent build execution environment

### ✅ **Simplified Architecture**

- Single execution path through service
- Cleaner separation of concerns
- Reduced code duplication

### ✅ **Future-Ready Foundation**

- Service architecture prepared for additional coordination features
- HTTP API foundation for programmatic access
- Clear separation of concerns between client and execution

## Success Criteria

### Phase 1-2: Service Foundation

- [ ] Service can handle CLI build requests
- [ ] Service auto-shutdown works correctly
- [ ] Service port management and discovery works
- [ ] New CLI can start and connect to service
- [ ] Build requests execute with same functionality as current CLI

### Phase 3-4: Complete Migration

- [ ] All CLI commands work via service delegation
- [ ] Repository commands (partitions, builds, etc.) work via HTTP API
- [ ] Old CLI implementation completely removed
- [ ] Error handling provides clear user feedback

### Phase 5: Polish

- [ ] Multiple concurrent CLI calls work correctly
- [ ] Background builds work as expected
- [ ] Performance meets or exceeds current CLI
- [ ] Service management is reliable and transparent

## Risk Mitigation

1. **Thorough Testing**: Comprehensive testing before removing old CLI
2. **Feature Parity**: Ensure all existing functionality works via service
3. **Performance Validation**: Benchmark new implementation against current performance
4. **Simple Protocol**: Use HTTP/JSON for service communication (not gRPC initially)
5. **Clear Error Messages**: Service startup and connection failures should be obvious to users
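
To make the auto-management rules from Phase 1 concrete (last-activity timestamp, 5-minute default timeout, API calls and active builds both counting as activity), the idle check could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the planned implementation: `IdleTracker`, `DEFAULT_IDLE_TIMEOUT`, and all method names are hypothetical and do not exist in DataBuild today. Timestamps are passed in explicitly so the logic stays easy to test.

```rust
use std::time::{Duration, Instant};

/// Default idle timeout named in the plan: 5 minutes.
/// The constant name itself is hypothetical.
pub const DEFAULT_IDLE_TIMEOUT: Duration = Duration::from_secs(5 * 60);

/// Hypothetical tracker for service activity; not an existing DataBuild type.
pub struct IdleTracker {
    last_activity: Instant,
    active_builds: usize,
    timeout: Duration,
}

impl IdleTracker {
    /// Start tracking, treating service startup as the first activity.
    pub fn new(now: Instant, timeout: Duration) -> Self {
        Self { last_activity: now, active_builds: 0, timeout }
    }

    /// Record an API call (or any other activity) at `now`.
    pub fn touch(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// A build started: it counts as activity and blocks shutdown while running.
    pub fn build_started(&mut self, now: Instant) {
        self.active_builds += 1;
        self.touch(now);
    }

    /// A build finished: the idle clock restarts from this moment.
    pub fn build_finished(&mut self, now: Instant) {
        self.active_builds = self.active_builds.saturating_sub(1);
        self.touch(now);
    }

    /// True only when no build is running and the timeout has fully elapsed.
    pub fn should_shut_down(&self, now: Instant) -> bool {
        self.active_builds == 0
            && now.saturating_duration_since(self.last_activity) >= self.timeout
    }
}
```

In a sketch like this, the service would call `touch` from its request handling path and poll `should_shut_down` on a timer, starting graceful shutdown (including lockfile cleanup) once it returns `true`. The timeout would presumably come from `DATABUILD_SERVICE_TIMEOUT` when set, though the plan does not specify that variable's format.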