# Web App Compile-Time Correctness Plan
## Problem Statement
The DataBuild web application currently has a type-safety blind spot: backend protobuf changes can cause runtime failures in the frontend without any compile-time warning. Although we achieved end-to-end type generation (Proto → Rust → OpenAPI → TypeScript), inconsistent data transformation patterns and a loose TypeScript configuration let type mismatches slip through.
**Specific observed failures:**
- `status.toLowerCase()` crashes when status objects are passed instead of strings
- `status?.status` accesses non-existent properties on protobuf response objects
- Partitions page fails silently due to unhandled nullability
- Inconsistent data shapes flowing through components
## Root Cause Analysis
1. **Mixed Data Contracts**: Some components expect `{ status: string }` while APIs return `{ status_code: number, status_name: string }`
2. **Inconsistent Transformations**: Data shape changes happen ad-hoc throughout the component tree
3. **Protobuf Nullability**: Generated types are honest about optional fields, but TypeScript config allows unsafe access
4. **Service Boundary Leakage**: Backend implementation details leak into frontend components
## Solution: Three-Pronged Approach
### Option 2: Consistent Data Transformation (Primary)
- Define canonical dashboard types separate from generated API types
- Transform data at service boundaries, never in components
- Single source of truth for data shapes within the frontend
### Option 4: Generated Type Enforcement (Supporting)
- Use generated protobuf types in service layer for accurate contracts
- Leverage protobuf's honest nullability information
- Maintain type safety chain from backend to service boundary
### Option 3: Stricter TypeScript Configuration (Foundation)
- Enable strict null checks to catch undefined access patterns
- Prevent implicit any types that mask runtime errors
- Force explicit handling of protobuf's optional fields
## Implementation Plan
### Phase 1: TypeScript Configuration Hardening
**Goal**: Enable strict type checking to surface existing issues
**Tasks**:
1. Update `tsconfig.json` with strict configuration:
```json
{
  "compilerOptions": {
    "strict": true,
    "noImplicitAny": true,
    "strictNullChecks": true,
    "noImplicitReturns": true,
    "noUncheckedIndexedAccess": true,
    "exactOptionalPropertyTypes": true
  }
}
```
2. Run TypeScript compilation to identify all type errors
3. Create tracking issue for each compilation error
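As a rough illustration of what these flags change in everyday code, the sketch below shows the explicit handling that `strictNullChecks` and `noUncheckedIndexedAccess` are expected to force. `PartitionRow`, `formatLastUpdated`, and `firstPartitionRef` are hypothetical names used only for this example; they are not part of the dashboard code.

```typescript
// Hypothetical shape with an optional field, as a generated protobuf type might expose it.
interface PartitionRow {
  partition_ref: string;
  last_updated?: number;
}

// strictNullChecks: `row.last_updated.toString()` is rejected because the
// field may be undefined, so the absent case has to be handled explicitly.
function formatLastUpdated(row: PartitionRow): string {
  return row.last_updated === undefined ? "never" : String(row.last_updated);
}

// noUncheckedIndexedAccess: `rows[0]` has type `PartitionRow | undefined`,
// so an empty list must be handled before reading a property.
function firstPartitionRef(rows: PartitionRow[]): string | null {
  const first = rows[0];
  return first === undefined ? null : first.partition_ref;
}
```

Note that `"strict": true` already turns on `noImplicitAny` and `strictNullChecks`; listing them separately is harmless and documents intent.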
**Success Criteria**: TypeScript build passes with strict configuration enabled
**Estimated Time**: 1-2 days
### Phase 1.5: Verification of Strict Configuration
**Goal**: Prove strict TypeScript catches the specific issues we identified
**Tasks**:
1. Create test cases that reproduce original failures:
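```typescript
// Test file: dashboard/verification-tests.ts
const mockResponse = { status_code: 1, status_name: "COMPLETED" };
// Each access below must be rejected by the compiler. `@ts-expect-error` makes
// that a checked expectation: the build fails if the line ever type-checks.
// @ts-expect-error `status` does not exist on the response shape
const test1 = mockResponse.status?.toLowerCase();
// @ts-expect-error nested access through the same missing property
const test2 = mockResponse.status?.status;
```
2. Run TypeScript compilation and verify each expected error is reported (temporarily remove a `@ts-expect-error` directive to read the underlying message):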
- Document which strict rules catch which specific issues
- Confirm `strictNullChecks` prevents undefined property access
- Verify `noImplicitAny` surfaces type gaps
3. Test protobuf nullable field handling:
```typescript
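interface TestPartitionSummary {
  last_updated?: number; // optional field from protobuf
}
declare const partition: TestPartitionSummary;
// Under strictNullChecks this must be rejected as "possibly undefined":
// @ts-expect-error
const timestamp = partition.last_updated.toString();
// Explicit handling of the missing case compiles:
const safeTimestamp = partition.last_updated?.toString() ?? null;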
```
**Success Criteria**:
- All identified runtime failures now cause compile-time errors
- Clear mapping between strict TypeScript rules and caught issues
- Zero false positives in existing working code
**Estimated Time**: 0.5 days
### Phase 2: Define Dashboard Data Contracts
**Goal**: Create canonical frontend types independent of backend schema
**Tasks**:
1. Define dashboard types in `dashboard/types.ts`:
```typescript
// Dashboard-optimized types
interface DashboardBuild {
  build_request_id: string;
  status: string; // Always human-readable name
  requested_partitions: string[]; // Always string values
  total_jobs: number;
  completed_jobs: number;
  failed_jobs: number;
  cancelled_jobs: number;
  requested_at: number;
  started_at: number | null;
  completed_at: number | null;
  duration_ms: number | null;
  cancelled: boolean;
}

interface DashboardPartition {
  partition_ref: string; // Always string value
  status: string; // Always human-readable name
  last_updated: number | null;
  build_requests: string[];
}

interface DashboardJob {
  job_label: string;
  total_runs: number;
  successful_runs: number;
  failed_runs: number;
  cancelled_runs: number;
  last_run_timestamp: number;
  last_run_status: string; // Always human-readable name
  average_partitions_per_run: number;
  recent_builds: string[];
}
```
2. Update component attribute interfaces to use dashboard types
3. Document the rationale for each transformation decision
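One consequence of modelling absent values as `number | null` is that a component cannot format a timestamp or duration without deciding what to show when the value is missing. A minimal sketch, assuming a hypothetical `formatDuration` helper and an assumed "in progress" label, neither of which is specified by this plan:

```typescript
// Hypothetical helper: renders DashboardBuild.duration_ms for display.
// The `number | null` parameter type forces the "not finished yet" case
// to be handled before any arithmetic happens.
function formatDuration(durationMs: number | null): string {
  if (durationMs === null) {
    return "in progress";
  }
  const totalSeconds = Math.floor(durationMs / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return minutes > 0 ? `${minutes}m ${seconds}s` : `${seconds}s`;
}
```

Because the dashboard type uses `null` rather than an optional property, the "missing" case is a value that must be passed explicitly, which keeps it visible at every call site.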
**Success Criteria**: All dashboard types are self-contained and UI-optimized
**Estimated Time**: 2-3 days
### Phase 3: Service Layer Transformation
**Goal**: Create consistent transformation boundaries between API and dashboard
**Tasks**:
1. Implement transformation functions in `services.ts`:
```typescript
// Transform API responses to dashboard types
function transformBuildDetail(apiResponse: BuildDetailResponse): DashboardBuild {
  return {
    build_request_id: apiResponse.build_request_id,
    status: apiResponse.status_name,
    requested_partitions: apiResponse.requested_partitions.map(p => p.str),
    total_jobs: apiResponse.total_jobs,
    completed_jobs: apiResponse.completed_jobs,
    failed_jobs: apiResponse.failed_jobs,
    cancelled_jobs: apiResponse.cancelled_jobs,
    requested_at: apiResponse.requested_at,
    started_at: apiResponse.started_at ?? null,
    completed_at: apiResponse.completed_at ?? null,
    duration_ms: apiResponse.duration_ms ?? null,
    cancelled: apiResponse.cancelled,
  };
}

function transformPartitionSummary(apiResponse: PartitionSummary): DashboardPartition {
  return {
    partition_ref: apiResponse.partition_ref.str,
    status: apiResponse.status_name,
    last_updated: apiResponse.last_updated ?? null,
    build_requests: apiResponse.build_requests,
  };
}
```
2. Update all service methods to use transformation functions
3. Add type guards for runtime validation:
```typescript
function isValidBuildResponse(data: unknown): data is BuildDetailResponse {
  return typeof data === 'object' &&
    data !== null &&
    'build_request_id' in data &&
    'status_name' in data;
}
```
4. Handle API errors with proper typing
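Task 4 leaves open what "proper typing" looks like. One possible shape is a discriminated union that makes callers check for failure before touching the data. The sketch below is an assumption, not the plan's chosen design: `ServiceResult`, `parseBuildDetail`, and the trimmed `ApiBuild` and `UiBuild` types are hypothetical stand-ins for the real generated and dashboard types.

```typescript
// Trimmed, hypothetical stand-ins for BuildDetailResponse and DashboardBuild.
interface ApiBuild {
  build_request_id: string;
  status_name: string;
  started_at?: number;
}
interface UiBuild {
  build_request_id: string;
  status: string;
  started_at: number | null;
}

// Discriminated union: `ok` tells the compiler which branch is in play.
type ServiceResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

function isApiBuild(data: unknown): data is ApiBuild {
  if (typeof data !== "object" || data === null) return false;
  const record = data as Record<string, unknown>;
  return (
    typeof record["build_request_id"] === "string" &&
    typeof record["status_name"] === "string" &&
    (record["started_at"] === undefined || typeof record["started_at"] === "number")
  );
}

// Validate first, then transform: a malformed payload becomes a typed
// error value instead of reaching a component.
function parseBuildDetail(data: unknown): ServiceResult<UiBuild> {
  if (!isApiBuild(data)) {
    return { ok: false, error: "unexpected build detail response shape" };
  }
  return {
    ok: true,
    value: {
      build_request_id: data.build_request_id,
      status: data.status_name,
      started_at: data.started_at ?? null,
    },
  };
}
```

A caller can only reach `value` after checking `ok`, so a failed request cannot be rendered as if it were a build.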
**Success Criteria**: All API data flows through consistent transformation layer
**Estimated Time**: 3-4 days
### Phase 3.5: Transformation Validation
**Goal**: Prove transformation functions prevent observed failures and handle edge cases
**Tasks**:
1. Create comprehensive unit tests for transformation functions:
```typescript
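// Test file: dashboard/transformation-tests.ts
// Complete fixture: the field list mirrors what transformBuildDetail reads and
// is assumed to match the generated BuildDetailResponse type.
const baseResponse: BuildDetailResponse = {
  build_request_id: 'build-1',
  status_code: 1,
  status_name: 'COMPLETED',
  requested_partitions: [{ str: 'test-partition' }],
  total_jobs: 1,
  completed_jobs: 1,
  failed_jobs: 0,
  cancelled_jobs: 0,
  requested_at: 0,
  cancelled: false,
};
describe('transformBuildDetail', () => {
  it('handles status objects correctly', () => {
    const result = transformBuildDetail(baseResponse);
    expect(typeof result.status).toBe('string');
    expect(result.status).toBe('COMPLETED');
  });
  it('handles null optional fields', () => {
    // Cast: the wire format may carry null where the generated type says "absent".
    const apiResponse = { ...baseResponse, started_at: null } as unknown as BuildDetailResponse;
    const result = transformBuildDetail(apiResponse);
    expect(result.started_at).toBe(null);
    expect(result.completed_at).toBe(null);
  });
});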
```
2. Test edge cases and malformed responses:
- Missing required fields
- Null values where not expected
- Wrong data types in API responses
- Verify type guards catch invalid responses
3. Validate PartitionRef transformations:
```typescript
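it('converts PartitionRef objects to strings', () => {
  // Include every field transformPartitionSummary reads, not only partition_ref.
  const apiResponse = {
    partition_ref: { str: 'test-partition' },
    status_code: 1,
    status_name: 'COMPLETED',
    build_requests: [],
  };
  const result = transformPartitionSummary(apiResponse);
  expect(typeof result.partition_ref).toBe('string');
  expect(result.partition_ref).toBe('test-partition');
});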
```
4. Test transformation against real protobuf response shapes:
- Use actual OpenAPI generated types in tests
- Verify transformations work with current API schema
- Document transformation rationale for each field
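The edge cases in task 2 are worth running against the guard itself, because a guard that only checks for key presence accepts payloads whose values have the wrong type. The sketch below contrasts a presence-only check with a hypothetical stricter variant; `hasBuildKeys`, `hasBuildFieldTypes`, and `wronglyAccepted` are illustrative names, not existing functions.

```typescript
// Presence-only check, mirroring the shape of a key-based guard.
function hasBuildKeys(data: unknown): boolean {
  return typeof data === "object" && data !== null &&
    "build_request_id" in data && "status_name" in data;
}

// Hypothetical stricter variant: also checks the runtime type of each field.
function hasBuildFieldTypes(data: unknown): boolean {
  if (typeof data !== "object" || data === null) return false;
  const record = data as Record<string, unknown>;
  return typeof record["build_request_id"] === "string" &&
    typeof record["status_name"] === "string";
}

// Malformed payloads of the kinds listed in task 2.
const malformedPayloads: Array<{ label: string; payload: unknown }> = [
  { label: "missing required field", payload: { build_request_id: "b1" } },
  { label: "null where not expected", payload: { build_request_id: "b1", status_name: null } },
  { label: "wrong data type", payload: { build_request_id: "b1", status_name: { code: 1 } } },
];

// Labels of the malformed payloads that a given check would wrongly accept.
function wronglyAccepted(check: (data: unknown) => boolean): string[] {
  return malformedPayloads
    .filter(testCase => check(testCase.payload))
    .map(testCase => testCase.label);
}
```

Here the presence-only check lets the null and wrong-type payloads through, while the type-checking variant rejects all three, which is the behavior the "type guards catch invalid responses" task is meant to confirm.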
**Success Criteria**:
- All transformation functions have >90% test coverage
- Edge cases and null handling verified
- Real API response shapes handled correctly
- Type guards prevent invalid data from reaching components
**Estimated Time**: 1 day
### Phase 4: Component Migration
**Goal**: Update all components to use dashboard types exclusively
**Tasks**:
1. Update component implementations to use dashboard types:
- Remove direct `.status_code`/`.status_name` access
- Use transformed string status values
- Handle null values explicitly where needed
2. Fix specific identified issues:
- Line 472: `status?.status` → use `status` directly
- Badge components: Ensure they receive strings
- Partition list: Use consistent partition type
3. Update component attribute interfaces to match dashboard types
4. Add runtime assertions where needed:
```typescript
if (!status) {
  console.warn('Missing status in component');
  return m('span', 'Unknown Status');
}
```
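Once the service layer guarantees that `status` is a string, a badge helper can call `toLowerCase()` without defensive checks. The following is only a sketch: `statusBadgeClass` is a hypothetical helper, the CSS class names are invented for illustration, and status names other than `COMPLETED` are assumed rather than taken from the API.

```typescript
// Hypothetical mapping from a dashboard status name to a CSS class.
// All class names, and every status name except COMPLETED, are assumptions.
function statusBadgeClass(statusName: string): string {
  switch (statusName.toLowerCase()) {
    case "completed":
      return "badge-success";
    case "failed":
      return "badge-error";
    case "cancelled":
      return "badge-warning";
    default:
      // Unknown names fall back to a neutral style instead of throwing.
      return "badge-neutral";
  }
}
```

Typing the parameter as `string` means a caller that still passes a status object fails to compile, which is the original `status.toLowerCase()` crash moved to build time.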
**Success Criteria**: All components compile and work with dashboard types
**Estimated Time**: 2-3 days
### Phase 4.5: Continuous Component Verification
**Goal**: Verify components work correctly with dashboard types throughout migration
**Tasks**:
1. After each component migration, run verification tests:
```typescript
// Component-specific tests
describe('BuildDetailComponent', () => {
  it('renders status as string correctly', () => {
    const dashboardBuild: DashboardBuild = {
      status: 'COMPLETED', // Transformed string, not object
      // ... other fields
    };
    // Note: m() only builds a vnode; the view must actually be rendered
    // before any .toLowerCase() call inside it executes.
    const component = m(BuildDetailComponent, { build: dashboardBuild });
    // Verify no runtime errors with .toLowerCase()
  });
});
```
2. Test component attribute interfaces match usage:
- Verify TypeScript compilation passes for each component
- Check that vnode.attrs typing prevents invalid property access
- Test null handling in component rendering
3. Integration tests with real transformed data:
- Use actual service layer transformation outputs
- Verify components render correctly with dashboard types
- Test error states and missing data scenarios
**Success Criteria**:
- Each migrated component passes TypeScript compilation
- No runtime errors when using transformed dashboard types
- Components gracefully handle null/undefined dashboard fields
**Estimated Time**: 0.5 days (distributed across Phase 4)
### Phase 5: Schema Change Simulation & Integration Testing
**Goal**: Verify end-to-end compile-time correctness with simulated backend changes
**Tasks**:
1. **Automated Schema Change Testing**:
```bash
# Create test script: scripts/test-schema-changes.sh
# Test 1: Add new required field to protobuf
# - Modify databuild.proto temporarily
# - Regenerate Rust types and OpenAPI schema
# - Verify TypeScript compilation fails predictably
# - Document exact error messages
# Test 2: Remove existing field
# - Remove field from protobuf definition
# - Verify transformation functions catch missing fields
# - Confirm components fail compilation when accessing removed field
# Test 3: Change field type (string → object)
# - Modify status field structure in protobuf
# - Verify transformation layer prevents type mismatches
# - Confirm this catches issues like original status.toLowerCase() failure
```
2. **Full Build Cycle Verification**:
- Proto change → `bazel build //databuild:openapi_spec_generator`
- OpenAPI regeneration → `bazel build //databuild/client:typescript_client`
- TypeScript compilation → `bazel build //databuild/dashboard:*`
- Document each failure point and error messages
3. **End-to-End Type Safety Validation**:
```typescript
// Create comprehensive integration tests
describe('End-to-End Type Safety', () => {
  it('prevents runtime failures from schema changes', async () => {
    // Test actual API calls with transformed responses
    const service = DashboardService.getInstance();
    const activity = await service.getRecentActivity();
    // Verify transformed types prevent original failures
    activity.recentBuilds.forEach(build => {
      expect(typeof build.status).toBe('string');
      expect(() => build.status.toLowerCase()).not.toThrow();
    });
  });
});
```
4. **Regression Testing for Original Failures**:
- Test status.toLowerCase() with transformed data
- Test status?.status access patterns
- Test partition.str access with transformed partition refs
- Verify null handling in timestamp fields
5. **Real Data Flow Testing**:
- New build creation → status updates → completion
- Partition status changes using dashboard types
- Job execution monitoring with transformed data
- Error states and edge cases
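The "change field type" scenario in the schema change script could also be pinned down with a type-level assertion that lives next to the transformation functions. This is a sketch under assumptions: `GeneratedPartitionSummary` is a hypothetical stand-in for the generated type, and `IsSame` is an illustrative helper rather than an existing utility.

```typescript
// Hypothetical stand-in for the generated API type.
interface GeneratedPartitionSummary {
  partition_ref: { str: string };
  status_name: string;
  last_updated?: number;
}

// Type-level helper: resolves to `true` only when A and B are mutually assignable.
type IsSame<A, B> = [A] extends [B] ? ([B] extends [A] ? true : false) : false;

// These constants compile only while the generated type keeps the shape the
// transformation layer relies on. If `status_name` became an object, the
// annotation would resolve to `false` and assigning `true` would not compile.
const statusNameIsString: IsSame<GeneratedPartitionSummary["status_name"], string> = true;
const partitionRefIsWrapped: IsSame<GeneratedPartitionSummary["partition_ref"], { str: string }> = true;
```

The check costs nothing at runtime and fails in the same compilation step as the rest of the dashboard, so it fits into the full build cycle described above.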
**Success Criteria**:
- Schema changes cause predictable TypeScript compilation failures
- Transformation layer prevents all identified runtime failures
- Full build cycle catches type mismatches at each stage
- Zero runtime type errors with dashboard types
- Original failure scenarios now impossible with strict types
**Estimated Time**: 2-3 days
### Phase 6: Documentation & Monitoring
**Goal**: Establish practices to maintain type safety over time
**Tasks**:
1. Document transformation patterns:
- When to create new dashboard types
- How to handle protobuf schema changes
- Service layer responsibilities
2. Add runtime monitoring:
- Log transformation failures
- Track API response shape mismatches
- Monitor for unexpected null values
3. Create development guidelines:
- Never use generated types directly in components
- Always transform at service boundaries
- Handle nullability explicitly
4. Set up CI checks:
- Strict TypeScript compilation in build pipeline
- Automated schema change detection tests
- Integration test suite for type safety validation
- Pre-commit hooks for TypeScript compilation
5. **Create Ongoing Verification Tools**:
```bash
# CI script: scripts/verify-type-safety.sh
# - Run schema change simulation tests
# - Verify transformation tests pass
# - Check strict TypeScript compilation
# - Validate component integration tests
```
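For the runtime monitoring task, one option is to wrap each transformation so that failures are recorded rather than thrown into component rendering. The sketch below is an assumption about how that could look: `monitored`, `TransformFailure`, and the in-memory `transformFailures` list are hypothetical, and a real setup would presumably forward the records to whatever logging the dashboard already uses.

```typescript
// Hypothetical record of a failed transformation, suitable for logging or metrics.
interface TransformFailure {
  transform: string;
  message: string;
}

const transformFailures: TransformFailure[] = [];

// Wraps a transformation so that a thrown error is recorded and reported
// as `null` instead of propagating into component rendering.
function monitored<In, Out>(
  transformName: string,
  transform: (input: In) => Out,
): (input: In) => Out | null {
  return (input: In): Out | null => {
    try {
      return transform(input);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      transformFailures.push({ transform: transformName, message });
      console.warn(`transformation ${transformName} failed: ${message}`);
      return null;
    }
  };
}
```

Returning `null` keeps the failure visible in the type signature, so callers still have to handle it explicitly under strict null checks.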
**Success Criteria**:
- Team has clear practices for maintaining type safety
- CI pipeline catches type safety regressions automatically
- Schema change testing is automated and repeatable
- Documentation provides concrete examples and rationale
**Estimated Time**: 2 days
## Risk Mitigation
### High-Impact Risks
1. **Breaking Change Volume**: Strict TypeScript may reveal many existing issues
- *Mitigation*: Implement incrementally, fix issues in phases
- *Rollback*: Keep loose config as backup during transition
2. **Performance Impact**: Additional transformation layer overhead
- *Mitigation*: Profile transformation functions, optimize hot paths
- *Monitoring*: Track bundle size and runtime performance
3. **Developer Learning Curve**: Team needs to adapt to strict null checks
- *Mitigation*: Provide training on handling optional types
- *Support*: Create examples and best practices documentation
### Medium-Impact Risks
1. **API Response Changes**: Backend might return unexpected data shapes
- *Mitigation*: Add runtime validation in service layer
- *Detection*: Monitor for transformation failures
2. **Third-party Type Conflicts**: Generated types might conflict with other libraries
- *Mitigation*: Use type aliases and careful imports
- *Testing*: Verify integration with existing dependencies
## Success Metrics
### Compile-Time Safety
- [ ] Zero `any` types in dashboard code
- [ ] All protobuf optional fields handled explicitly
- [ ] TypeScript strict mode enabled and passing
- [ ] Component attribute interfaces match usage
### Runtime Reliability
- [ ] Zero "undefined is not a function" errors
- [ ] Zero "cannot read property of undefined" errors
- [ ] All API error states handled gracefully
- [ ] Consistent data shapes across all components
### Development Experience
- [ ] Backend schema changes cause predictable frontend compilation results
- [ ] Clear error messages when types don't match
- [ ] Consistent patterns for handling new data types
- [ ] Fast iteration cycle maintained
## Future Considerations
### Schema Evolution Strategy
- Plan for handling breaking vs non-breaking backend changes
- Consider versioning approach for dashboard types
- Establish deprecation process for old data shapes
### Tooling Enhancements
- Consider code generation for transformation functions
- Explore runtime schema validation libraries
- Investigate GraphQL for stronger API contracts
### Performance Optimization
- Profile transformation layer performance
- Consider caching strategies for transformed data
- Measure the bundle size impact of the transformation layer (type annotations are erased at compile time and add no runtime cost)
---
## Implementation Notes
This plan prioritizes compile-time correctness while maintaining development velocity. The phased approach allows for incremental progress and risk mitigation, while the three-pronged strategy (Options 2+3+4) provides comprehensive type safety from protobuf definitions through to component rendering.
The key insight is that true compile-time correctness requires both accurate type definitions AND consistent data transformation patterns enforced by strict TypeScript configuration.