Client-Server CLI Architecture
Overview
This plan transforms DataBuild's CLI from a monolithic in-process execution model to a Bazel-style client-server architecture. The CLI becomes a thin client that delegates all operations to a persistent service process, enabling better resource management and build coordination.
Current State Analysis
The current CLI (databuild/cli/main.rs) directly:
- Creates event log connections
- Runs analysis and execution in-process
- Spawns bazel processes directly
- Does not coordinate with other concurrent CLI invocations
This creates several limitations:
- No coordination between concurrent builds
- Multiple BEL connections from concurrent CLI calls
- Each CLI process spawns separate bazel execution
- No shared execution environment for builds
Target Architecture
Bazel-Style Client-Server Model
CLI (Thin Client):
- Auto-starts service if not running
- Delegates all operations to service via HTTP
- Streams progress back to user
- Auto-shuts down idle service
Service (Persistent Process):
- Maintains single BEL connection
- Coordinates builds across multiple CLI calls
- Manages bazel execution processes
- Auto-shuts down after idle timeout
Implementation Plan
Phase 1: Service Foundation
Extend Current Service for CLI Operations
- Add new endpoints to handle CLI build requests
- Move analysis and execution logic from CLI to service
- Service maintains orchestrator state and coordinates builds
Add CLI-Specific API Endpoints
- /api/v1/cli/build: Handle build requests from CLI
- /api/v1/cli/builds/{id}/progress: Stream build progress via Server-Sent Events
- Request/response types for CLI build operations
- Background vs foreground build support
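As a rough illustration of the endpoints listed above, the request and response shapes might look something like the following. This is a sketch only: the type names, the field names (`partitions`, `mode`, `build_id`), and the helper functions are hypothetical and not taken from the DataBuild codebase. Only the two URL paths come from this plan. The `sse_frame` helper shows the standard Server-Sent Events framing (`data:` lines terminated by a blank line).

```rust
// Hypothetical request/response shapes for the CLI endpoints.
// Names and fields are illustrative assumptions, not DataBuild's actual API.

/// Whether the CLI waits and streams progress, or returns right away.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildMode {
    Foreground,
    Background,
}

/// Body the CLI would POST to /api/v1/cli/build (assumed shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CliBuildRequest {
    pub partitions: Vec<String>,
    pub mode: BuildMode,
}

/// Reply carrying the ID used to follow the build (assumed shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CliBuildResponse {
    pub build_id: String,
}

/// Path of the build submission endpoint named in this plan.
pub const BUILD_PATH: &str = "/api/v1/cli/build";

/// Path of the progress stream for one build.
pub fn progress_path(build_id: &str) -> String {
    format!("/api/v1/cli/builds/{}/progress", build_id)
}

/// Format one progress update as a Server-Sent Events frame:
/// one "data: <line>" per line of the message, then a blank line.
pub fn sse_frame(message: &str) -> String {
    let mut frame = String::new();
    for line in message.lines() {
        frame.push_str("data: ");
        frame.push_str(line);
        frame.push('\n');
    }
    frame.push('\n');
    frame
}
```

A real implementation would presumably derive serialization for these types with whatever JSON library the service already uses; that choice is left open here.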
Add Service Auto-Management
- Service tracks last activity timestamp
- Configurable auto-shutdown timeout (default: 5 minutes)
- Service monitors for idle state and gracefully shuts down
- Activity tracking includes API calls and active builds
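The idle-detection rules above could be sketched as a small tracker that the service updates on every API call and build state change. The `ActivityTracker` type and its method names are hypothetical. The current time is passed in explicitly so the logic can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// Hypothetical idle tracker: the service is idle when no build is
/// running and no activity has been seen for `idle_timeout`.
pub struct ActivityTracker {
    last_activity: Instant,
    active_builds: usize,
    idle_timeout: Duration,
}

impl ActivityTracker {
    pub fn new(idle_timeout: Duration, now: Instant) -> Self {
        ActivityTracker { last_activity: now, active_builds: 0, idle_timeout }
    }

    /// Any API call counts as activity.
    pub fn record_api_call(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// A running build keeps the service alive regardless of the timer.
    pub fn build_started(&mut self, now: Instant) {
        self.active_builds += 1;
        self.last_activity = now;
    }

    /// Finishing a build restarts the idle timer.
    pub fn build_finished(&mut self, now: Instant) {
        self.active_builds = self.active_builds.saturating_sub(1);
        self.last_activity = now;
    }

    /// True once the service has been idle for at least the timeout.
    pub fn should_shut_down(&self, now: Instant) -> bool {
        self.active_builds == 0
            && now.saturating_duration_since(self.last_activity) >= self.idle_timeout
    }
}
```

A background task in the service could poll `should_shut_down(Instant::now())` periodically and begin a graceful shutdown when it returns true.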
Service Port Management
- Service attempts to bind to preferred port (e.g., 8080)
- If port unavailable, tries next available port in range
- Service writes actual port to lockfile/pidfile for CLI discovery
- CLI reads port from lockfile to connect to running service
- Clean up lockfile on service shutdown
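One possible shape for the port fallback and lockfile handshake described above, using only the standard library. The function names and the lockfile layout (PID on the first line, port on the second) are assumptions made for illustration. The plan does not specify a file format.

```rust
use std::fs;
use std::io;
use std::net::{Ipv4Addr, SocketAddrV4, TcpListener};
use std::path::Path;

/// Try `preferred`, then each following port up to `preferred + span`.
/// Returns the first listener that binds, or the last bind error.
pub fn bind_in_range(preferred: u16, span: u16) -> io::Result<TcpListener> {
    let mut last_err = io::Error::new(io::ErrorKind::AddrInUse, "no port tried");
    for offset in 0..=span {
        let port = match preferred.checked_add(offset) {
            Some(port) => port,
            None => break, // ran past the highest port number
        };
        match TcpListener::bind(SocketAddrV4::new(Ipv4Addr::LOCALHOST, port)) {
            Ok(listener) => return Ok(listener),
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}

/// Record the service PID and actual port (assumed layout: one per line).
pub fn write_lockfile(path: &Path, port: u16) -> io::Result<()> {
    fs::write(path, format!("{}\n{}\n", std::process::id(), port))
}

/// Read the port back on the CLI side.
pub fn read_lockfile_port(path: &Path) -> io::Result<u16> {
    let contents = fs::read_to_string(path)?;
    contents
        .lines()
        .nth(1)
        .and_then(|line| line.trim().parse().ok())
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "malformed lockfile"))
}

/// Remove the lockfile during service shutdown.
pub fn remove_lockfile(path: &Path) -> io::Result<()> {
    fs::remove_file(path)
}
```

A stale lockfile left behind by a crashed service is not handled here. The CLI would still need to confirm that something is actually listening on the recorded port before trusting it.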
Phase 2: Thin CLI Implementation
New CLI Main Function
- Replace existing main with service delegation logic
- Parse arguments and determine target service operation
- Handle service connection and auto-start logic
- Preserve existing CLI interface and help text
Service Client Implementation
- HTTP client for communicating with service
- Auto-start service if not already running
- Health check and connection retry logic
- Progress streaming for real-time build feedback
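The connection retry step might look roughly like the sketch below, which polls until the service accepts a TCP connection or a startup deadline passes. It is only a reachability probe: a fuller health check would presumably issue an HTTP request to a health endpoint, which this plan does not name. The function name and parameters are hypothetical.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Retry connecting to `addr` until it succeeds or `startup_timeout` elapses.
/// Returns the connected stream, or the last connection error.
pub fn wait_until_reachable(
    addr: SocketAddr,
    startup_timeout: Duration,
    retry_delay: Duration,
) -> io::Result<TcpStream> {
    let deadline = Instant::now() + startup_timeout;
    // connect_timeout rejects a zero duration, so enforce a small minimum.
    let per_attempt = retry_delay.max(Duration::from_millis(1));
    loop {
        match TcpStream::connect_timeout(&addr, per_attempt) {
            Ok(stream) => return Ok(stream),
            // Out of time: surface the most recent error to the caller.
            Err(e) if Instant::now() >= deadline => return Err(e),
            // Service may still be starting; wait and try again.
            Err(_) => thread::sleep(retry_delay),
        }
    }
}
```

The CLI could call this right after spawning the service process, using the port read from the lockfile, and report a startup timeout if it returns an error.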
Build Command via Service
- Parse build arguments and create service request
- Submit build request to service endpoint
- Stream progress updates for foreground builds
- Return immediately for background builds with build ID
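A minimal sketch of the build command flow above: parse the arguments, then decide whether to stream progress or return the build ID. The `--background` flag comes from this plan. Treating every other argument as a partition reference, and the `BuildArgs` and `NextStep` types, are assumptions for illustration.

```rust
/// Parsed form of the build command (hypothetical shape).
#[derive(Debug, PartialEq, Eq)]
pub struct BuildArgs {
    pub partitions: Vec<String>,
    pub background: bool,
}

/// Parse the arguments that follow the build subcommand.
pub fn parse_build_args<I, S>(args: I) -> Result<BuildArgs, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut partitions = Vec::new();
    let mut background = false;
    for arg in args {
        match arg.as_ref() {
            "--background" => background = true,
            flag if flag.starts_with("--") => {
                return Err(format!("unknown flag: {}", flag));
            }
            // Assumption: positional arguments are partition references.
            partition => partitions.push(partition.to_string()),
        }
    }
    if partitions.is_empty() {
        return Err("at least one partition reference is required".to_string());
    }
    Ok(BuildArgs { partitions, background })
}

/// What the CLI does once the service has accepted the build.
#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    /// Foreground: follow the progress stream until the build ends.
    StreamProgress { build_id: String },
    /// Background: print the ID and exit immediately.
    PrintBuildId { build_id: String },
}

pub fn next_step(args: &BuildArgs, build_id: &str) -> NextStep {
    if args.background {
        NextStep::PrintBuildId { build_id: build_id.to_string() }
    } else {
        NextStep::StreamProgress { build_id: build_id.to_string() }
    }
}
```

Because the existing CLI interface is meant to be preserved, the real parser would need to accept whatever flags the current build command already supports rather than rejecting unknown ones.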
Phase 3: Repository Commands via Service
Delegate Repository Commands to Service
- Partition, build, job, and task commands go through service
- Use existing service API endpoints where available
- Maintain same output formats (table, JSON) as current CLI
- Preserve all existing functionality and options
Service Client Repository Methods
- Client methods for each repository operation
- Handle pagination, filtering, and formatting options
- Error handling, including appropriate handling of HTTP status codes
- URL encoding for partition references and other parameters
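Partition references may contain characters such as `/`, `:` or `=` that are not safe to place in a URL unescaped. The sketch below percent-encodes everything outside the RFC 3986 unreserved set and builds a query string from it. The function names and the parameter names in the usage note are hypothetical. A production client would more likely rely on the URL handling of its HTTP library.

```rust
/// Percent-encode every byte outside the RFC 3986 unreserved set
/// (ASCII letters, digits, '-', '.', '_', '~').
pub fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Join key/value pairs into an encoded query string (without the leading '?').
pub fn query_string(params: &[(&str, &str)]) -> String {
    params
        .iter()
        .map(|(key, value)| format!("{}={}", percent_encode(key), percent_encode(value)))
        .collect::<Vec<_>>()
        .join("&")
}
```

For example, `query_string(&[("partition_ref", "data/sales:date=2024-01-01"), ("limit", "10")])` keeps the reference intact as a single parameter value instead of letting `/` or `=` alter the URL structure.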
Phase 4: Complete Migration
Remove Old CLI Implementation
- Delete existing databuild/cli/main.rs implementation
- Remove in-process analysis and execution logic
- Clean up CLI-specific dependencies that are no longer needed
- Update build configuration to use new thin client only
Service Integration Testing
- End-to-end testing of CLI-to-service communication
- Verify all existing CLI functionality works through service
- Performance testing to ensure no regression
- Error handling validation for various failure modes
Phase 5: Integration and Testing
Environment Variable Support
- DATABUILD_SERVICE_URL for custom service locations
- DATABUILD_SERVICE_TIMEOUT for auto-shutdown configuration
- Existing BEL environment variables passed to service
- Clear precedence rules for configuration sources
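The plan calls for clear precedence rules without fixing them, so the following is one possible ordering rather than a decided one: an explicit `DATABUILD_SERVICE_URL` wins, then a port discovered from the lockfile, and otherwise the CLI starts a service. Interpreting `DATABUILD_SERVICE_TIMEOUT` as a number of seconds is also an assumption. The 5-minute default matches the auto-shutdown default stated earlier. Values are passed in as options so the logic does not depend on the process environment.

```rust
use std::time::Duration;

/// Default idle timeout from this plan: 5 minutes.
pub const DEFAULT_IDLE_TIMEOUT: Duration = Duration::from_secs(5 * 60);

/// Where the CLI should send its requests (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum ServiceTarget {
    /// Taken verbatim from DATABUILD_SERVICE_URL.
    Explicit(String),
    /// Built from the port recorded in the lockfile.
    Discovered(String),
    /// Nothing configured or running: auto-start a service.
    NeedsStart,
}

/// Assumed precedence: environment variable, then lockfile, then auto-start.
pub fn resolve_target(env_url: Option<&str>, lockfile_port: Option<u16>) -> ServiceTarget {
    // An empty or whitespace-only variable is treated as unset.
    if let Some(url) = env_url.map(str::trim).filter(|u| !u.is_empty()) {
        return ServiceTarget::Explicit(url.to_string());
    }
    match lockfile_port {
        Some(port) => ServiceTarget::Discovered(format!("http://127.0.0.1:{}", port)),
        None => ServiceTarget::NeedsStart,
    }
}

/// Parse DATABUILD_SERVICE_TIMEOUT, assumed here to be whole seconds.
pub fn parse_idle_timeout(raw: Option<&str>) -> Result<Duration, String> {
    match raw.map(str::trim).filter(|v| !v.is_empty()) {
        None => Ok(DEFAULT_IDLE_TIMEOUT),
        Some(value) => value
            .parse::<u64>()
            .map(Duration::from_secs)
            .map_err(|e| format!("invalid DATABUILD_SERVICE_TIMEOUT {:?}: {}", value, e)),
    }
}
```

At the call site, the environment value can be supplied with `std::env::var("DATABUILD_SERVICE_URL").ok().as_deref()`.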
Error Handling and User Experience
- Service startup timeout and clear error messages
- Connection failure handling with fallback suggestions
- Health check logic to verify service readiness
- Graceful handling of service unavailability
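To make the failure modes above concrete, the client could map them onto a small error type whose messages carry a suggestion for the user. The variant names and message wording below are illustrative assumptions, not an existing DataBuild type.

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical client-side errors for service startup and connection.
#[derive(Debug)]
pub enum ServiceError {
    /// The auto-started service never became reachable.
    StartupTimedOut { waited: Duration },
    /// A configured or discovered service could not be reached.
    ConnectionFailed { url: String, reason: String },
    /// The service answered but did not report itself ready.
    NotReady { url: String },
}

impl fmt::Display for ServiceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ServiceError::StartupTimedOut { waited } => write!(
                f,
                "service did not start within {}s; check whether its port range is already in use",
                waited.as_secs()
            ),
            ServiceError::ConnectionFailed { url, reason } => write!(
                f,
                "could not connect to service at {} ({}); check DATABUILD_SERVICE_URL or remove a stale lockfile",
                url, reason
            ),
            ServiceError::NotReady { url } => write!(
                f,
                "service at {} is running but not ready; retry shortly",
                url
            ),
        }
    }
}

impl std::error::Error for ServiceError {}
```

Keeping the suggestion inside the `Display` output means the thin client's main function can print the error as-is and exit with a non-zero status.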
Benefits of Client-Server Architecture
✅ Build Coordination
- Multiple CLI calls share same service instance
- Coordination between concurrent builds
- Single BEL connection eliminates connection conflicts
✅ Resource Management
- Auto-shutdown prevents resource leaks
- Service manages persistent connections
- Better isolation between CLI and build execution
- Shared bazel execution environment
✅ Improved User Experience
- Background builds with --background flag
- Real-time progress streaming
- Consistent build execution environment
✅ Simplified Architecture
- Single execution path through service
- Cleaner separation of concerns
- Reduced code duplication
✅ Future-Ready Foundation
- Service architecture prepared for additional coordination features
- HTTP API foundation for programmatic access
- Clear separation of concerns between client and execution
Success Criteria
Phase 1-2: Service Foundation
- Service can handle CLI build requests
- Service auto-shutdown works correctly
- Service port management and discovery work
- New CLI can start and connect to service
- Build requests execute with same functionality as current CLI
Phase 3-4: Complete Migration
- All CLI commands work via service delegation
- Repository commands (partitions, builds, etc.) work via HTTP API
- Old CLI implementation completely removed
- Error handling provides clear user feedback
Phase 5: Polish
- Multiple concurrent CLI calls work correctly
- Background builds work as expected
- Performance meets or exceeds current CLI
- Service management is reliable and transparent
Risk Mitigation
- Thorough Testing: Comprehensive testing before removing old CLI
- Feature Parity: Ensure all existing functionality works via service
- Performance Validation: Benchmark new implementation against current performance
- Simple Protocol: Use HTTP/JSON for service communication (not gRPC initially)
- Clear Error Messages: Service startup and connection failures should be obvious to users