AWS Provider Architecture Design

Overview

The Open Resource Broker implements a provider architecture that enables dynamic provisioning of compute resources across multiple AWS configurations and provider instances. This document describes the design, implementation, and usage patterns of the AWS provider system.

Architecture Components

CQRS Implementation Status

The system implements a CQRS (Command Query Responsibility Segregation) architecture:

Completed CQRS Components:

  • CommandBus and QueryBus infrastructure in src/infrastructure/di/buses.py
  • Query DTOs: ListTemplatesQuery, GetTemplateQuery, ValidateTemplateQuery
  • Command DTOs: CreateTemplateCommand, UpdateTemplateCommand, DeleteTemplateCommand, ValidateTemplateCommand
  • Template list endpoint using QueryBus

Implementation Status:

  • Template API endpoints (GET, POST, PUT, DELETE): using CQRS handlers
  • Machine management endpoints: using CQRS pattern
  • Request processing endpoints: using CQRS pattern
  • Provider management endpoints: using CQRS pattern

Architecture Features:

  • All API endpoints use the CQRS buses for command/query separation
  • Consistent async/await patterns across all handlers
  • Read operations (queries) are separated from write operations (commands)
  • Query handling supports caching
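To make the read path concrete, here is a minimal sketch of how a query bus might route a ListTemplatesQuery to an async handler. The QueryBus class below is a simplified stand-in and not the implementation in src/infrastructure/di/buses.py; the provider_type filter, the handler function, and the sample data are assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, List, Optional, Type


@dataclass(frozen=True)
class ListTemplatesQuery:
    """Read-side DTO; the provider_type filter is a hypothetical field."""
    provider_type: Optional[str] = None


class QueryBus:
    """Simplified stand-in: maps each query type to exactly one async handler."""

    def __init__(self) -> None:
        self._handlers: Dict[Type[Any], Callable[[Any], Awaitable[Any]]] = {}

    def register(self, query_type: Type[Any],
                 handler: Callable[[Any], Awaitable[Any]]) -> None:
        self._handlers[query_type] = handler

    async def execute(self, query: Any) -> Any:
        handler = self._handlers.get(type(query))
        if handler is None:
            raise LookupError(f"no handler registered for {type(query).__name__}")
        return await handler(query)


# Hypothetical in-memory data standing in for the template repository.
_TEMPLATES = [
    {"template_id": "web-server", "provider_type": "aws"},
    {"template_id": "generic", "provider_type": None},
]


async def handle_list_templates(query: ListTemplatesQuery) -> List[dict]:
    """Query handler: it only reads and never mutates state."""
    if query.provider_type is None:
        return list(_TEMPLATES)
    return [t for t in _TEMPLATES if t["provider_type"] == query.provider_type]


bus = QueryBus()
bus.register(ListTemplatesQuery, handle_list_templates)
result = asyncio.run(bus.execute(ListTemplatesQuery(provider_type="aws")))
```

A command bus would look the same apart from its handlers, which are the only place where state changes are allowed.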

1. Domain Model Extensions

Template Aggregate

The Template aggregate has been extended with multi-provider fields:

class Template:
    template_id: str
    provider_type: Optional[str]      # NEW: Provider type (e.g. aws)
    provider_name: Optional[str]      # NEW: Provider instance name (e.g. aws-us-east-1)
    provider_api: Optional[str]       # NEW: Specific API to use (e.g. EC2Fleet, SpotFleet)
    # ... existing fields

Request Aggregate

The Request aggregate now tracks provider selection:

class Request:
    provider_type: str
    provider_instance: Optional[str]  # NEW: Selected provider instance
    # ... existing fields

2. Provider Selection Service

The ProviderSelectionService selects a provider instance for each template using one of four strategies, tried in the order listed below.

Selection Strategies

  1. Explicit Selection: Template specifies exact provider instance
  2. Load Balanced Selection: Distribute across provider instances by type
  3. Capability-Based Selection: Select based on API requirements
  4. Default Selection: Use configuration defaults

Selection Algorithm

def select_provider_for_template(template: Template) -> ProviderSelectionResult:
    if template.provider_name:
        return explicit_selection(template.provider_name)
    elif template.provider_type:
        return load_balanced_selection(template.provider_type)
    elif template.provider_api:
        return capability_based_selection(template.provider_api)
    else:
        return default_selection()
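The precedence above can be exercised with a small runnable sketch. The PROVIDERS registry, the DEFAULT_INSTANCE constant, and the reason field are hypothetical stand-ins for the real configuration and result types, and the load-balanced branch simply takes the first matching instance as a placeholder.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Template:
    template_id: str
    provider_type: Optional[str] = None
    provider_name: Optional[str] = None
    provider_api: Optional[str] = None


@dataclass
class ProviderSelectionResult:
    provider_instance: str
    reason: str  # hypothetical field recording which strategy matched


# Hypothetical registry: instance name -> provider type and supported APIs.
PROVIDERS: Dict[str, dict] = {
    "aws-us-east-1": {"type": "aws",
                      "apis": ["EC2Fleet", "SpotFleet", "RunInstances", "ASG"]},
    "aws-us-west-2": {"type": "aws", "apis": ["EC2Fleet", "RunInstances"]},
}
DEFAULT_INSTANCE = "aws-us-east-1"


def _first_match(matches: List[str], what: str) -> str:
    if not matches:
        raise LookupError(f"no provider instance for {what}")
    return matches[0]


def select_provider_for_template(template: Template) -> ProviderSelectionResult:
    # 1. Explicit: the template names an instance.
    if template.provider_name:
        if template.provider_name not in PROVIDERS:
            raise LookupError(f"provider instance {template.provider_name!r} not found")
        return ProviderSelectionResult(template.provider_name, "explicit")
    # 2. Load balanced by type (placeholder: first instance of that type).
    if template.provider_type:
        matches = [n for n, p in PROVIDERS.items()
                   if p["type"] == template.provider_type]
        name = _first_match(matches, f"type {template.provider_type!r}")
        return ProviderSelectionResult(name, "load_balanced")
    # 3. Capability based: any instance that supports the requested API.
    if template.provider_api:
        matches = [n for n, p in PROVIDERS.items()
                   if template.provider_api in p["apis"]]
        name = _first_match(matches, f"API {template.provider_api!r}")
        return ProviderSelectionResult(name, "capability")
    # 4. Fall back to the configured default.
    return ProviderSelectionResult(DEFAULT_INSTANCE, "default")
```

Note that in this branch order, provider_api is not consulted when provider_type is set, so a template that sets both relies on a separate capability check to catch an instance that lacks the requested API.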

3. Provider Capability Service

The ProviderCapabilityService validates template requirements against provider capabilities:

Validation Levels

  • STRICT: All warnings become errors
  • LENIENT: Warnings allowed, only critical errors fail
  • BASIC: Only critical validation, minimal checks

Capability Validation

def validate_template_requirements(
    template: Template,
    provider_instance: str,
    level: ValidationLevel
) -> ValidationResult:
    # Validate API support
    # Check instance limits
    # Verify pricing model support
    # Validate fleet type compatibility
    ...  # body elided; the comments above outline the checks performed
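One way to read the three validation levels is as a policy applied to the findings of the individual checks. The sketch below assumes that each check reports either a critical error or a warning; the apply_level helper, the fields of ValidationResult, and the sample warning text are hypothetical, and under BASIC it assumes that non-critical checks are skipped entirely.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ValidationLevel(Enum):
    STRICT = "strict"
    LENIENT = "lenient"
    BASIC = "basic"


@dataclass
class ValidationResult:
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return not self.errors


def apply_level(critical: List[str], warnings: List[str],
                level: ValidationLevel) -> ValidationResult:
    """Turn raw findings into a result according to the validation level."""
    if level is ValidationLevel.STRICT:
        # All warnings become errors.
        return ValidationResult(errors=critical + warnings)
    if level is ValidationLevel.LENIENT:
        # Warnings are reported but do not fail validation.
        return ValidationResult(errors=list(critical), warnings=list(warnings))
    # BASIC: only critical findings are considered (assumption: the
    # non-critical checks are not run, so no warnings are reported).
    return ValidationResult(errors=list(critical))


# Hypothetical findings for a template that passes every critical check.
findings = ["max_instances is close to the provider limit"]
strict = apply_level([], findings, ValidationLevel.STRICT)
lenient = apply_level([], findings, ValidationLevel.LENIENT)
basic = apply_level([], findings, ValidationLevel.BASIC)
```

With these findings the template fails under STRICT and passes under LENIENT and BASIC.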

4. Template Repository Architecture

The template system uses a repository pattern that follows Clean Architecture principles.

Template Repository Implementation

The TemplateRepositoryImpl class provides a complete implementation of both AggregateRepository and TemplateRepository interfaces:

class TemplateRepositoryImpl(TemplateRepository):
    """Template repository implementation for configuration-based template management."""

    # Abstract methods from AggregateRepository
    def save(self, aggregate: Template) -> None:
        """Save a template aggregate."""

    def find_by_id(self, aggregate_id: str) -> Optional[Template]:
        """Find template by aggregate ID."""

    def delete(self, aggregate_id: str) -> None:
        """Delete template by aggregate ID."""

    # Abstract methods from TemplateRepository
    def find_by_template_id(self, template_id: str) -> Optional[Template]:
        """Find template by template ID (delegates to find_by_id)."""

    def find_by_provider_api(self, provider_api: str) -> List[Template]:
        """Find templates by provider API type."""

    def find_active_templates(self) -> List[Template]:
        """Find all active templates."""

    def search_templates(self, criteria: Dict[str, Any]) -> List[Template]:
        """Search templates by criteria."""

Key Architecture Improvements

  1. Full Interface Compliance: Implements all required abstract methods from both base interfaces
  2. Method Delegation: Avoids code duplication by delegating find_by_template_id to find_by_id
  3. Clean Dependency Injection: Uses factory pattern registration instead of decorator-based DI
  4. Comprehensive Functionality: Provides both required methods and convenience methods
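The delegation and factory-registration points can be illustrated with an in-memory sketch. It is not the configuration-based TemplateRepositoryImpl; the InMemoryTemplateRepository class, the is_active flag, and the create_template_repository factory are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Template:
    template_id: str
    provider_api: Optional[str] = None
    is_active: bool = True  # hypothetical flag backing find_active_templates


class InMemoryTemplateRepository:
    """Stores templates in a dict keyed by template ID."""

    def __init__(self) -> None:
        self._templates: Dict[str, Template] = {}

    def save(self, aggregate: Template) -> None:
        self._templates[aggregate.template_id] = aggregate

    def find_by_id(self, aggregate_id: str) -> Optional[Template]:
        return self._templates.get(aggregate_id)

    def delete(self, aggregate_id: str) -> None:
        self._templates.pop(aggregate_id, None)

    def find_by_template_id(self, template_id: str) -> Optional[Template]:
        # Delegation: the template ID doubles as the aggregate ID,
        # so the lookup logic lives in one place.
        return self.find_by_id(template_id)

    def find_by_provider_api(self, provider_api: str) -> List[Template]:
        return [t for t in self._templates.values()
                if t.provider_api == provider_api]

    def find_active_templates(self) -> List[Template]:
        return [t for t in self._templates.values() if t.is_active]


def create_template_repository() -> InMemoryTemplateRepository:
    """Factory that a DI container could register, instead of decorating the class."""
    return InMemoryTemplateRepository()


repo = create_template_repository()
repo.save(Template("web-server", provider_api="EC2Fleet"))
repo.save(Template("batch", provider_api="SpotFleet", is_active=False))
```

Registering a factory keeps the repository class free of container-specific decorators, which is the dependency-injection point made in the list above.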

Template Loading Architecture

The template system implements hierarchical template loading through the scheduler strategy:

File Priority Order (Highest to Lowest)

  1. Provider instance files: {provider-instance}_templates.json
  2. Provider type files: {provider-type}prov_templates.json
  3. Main templates file: templates.json
  4. Legacy templates file: awsprov_templates.json

Template Override Behavior

Templates with the same template_id in higher priority files override those in lower priority files.
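The priority order can be sketched as a function that returns the candidate file names for a given provider type and instance. The function name is hypothetical, and the de-duplication step is an assumption: for the aws provider type, the provider type file and the legacy file share the name awsprov_templates.json, so the sketch keeps only the first (higher priority) occurrence.

```python
from typing import List


def candidate_template_files(provider_type: str, provider_instance: str) -> List[str]:
    """Return template file names ordered from highest to lowest priority."""
    candidates = [
        f"{provider_instance}_templates.json",   # 1. provider instance file
        f"{provider_type}prov_templates.json",   # 2. provider type file
        "templates.json",                        # 3. main templates file
        "awsprov_templates.json",                # 4. legacy templates file
    ]
    # dict.fromkeys preserves insertion order and drops later duplicates.
    return list(dict.fromkeys(candidates))


aws_files = candidate_template_files("aws", "aws-us-east-1")
other_files = candidate_template_files("provider1", "provider1-region-a")
```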

Configuration Schema

Provider Configuration

providers:
  selection_policy: "WEIGHTED_ROUND_ROBIN"
  default_provider_type: "aws"
  default_provider_instance: "aws-us-east-1"
  providers:
    - name: "aws-us-east-1"
      type: "aws"
      enabled: true
      priority: 1
      weight: 10
      capabilities: ["EC2Fleet", "SpotFleet", "RunInstances", "ASG"]
    - name: "aws-us-west-2"
      type: "aws"
      enabled: true
      priority: 2
      weight: 5
      capabilities: ["EC2Fleet", "RunInstances"]
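A configuration like this is worth checking at startup. The sketch below mirrors the inner mapping of the YAML as a Python dict and applies a few consistency rules; the rules themselves (unique names, positive weights, default instance must be enabled) and the validate_provider_config function are assumptions for illustration, not the checks the broker is known to perform.

```python
from typing import Any, Dict, List

# The mapping under the top-level "providers" key, written as a Python dict.
CONFIG: Dict[str, Any] = {
    "selection_policy": "WEIGHTED_ROUND_ROBIN",
    "default_provider_type": "aws",
    "default_provider_instance": "aws-us-east-1",
    "providers": [
        {"name": "aws-us-east-1", "type": "aws", "enabled": True,
         "priority": 1, "weight": 10,
         "capabilities": ["EC2Fleet", "SpotFleet", "RunInstances", "ASG"]},
        {"name": "aws-us-west-2", "type": "aws", "enabled": True,
         "priority": 2, "weight": 5,
         "capabilities": ["EC2Fleet", "RunInstances"]},
    ],
}


def validate_provider_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    providers = config.get("providers", [])
    names = [p["name"] for p in providers]

    # Hypothetical rule: instance names must be unique.
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"duplicate provider name: {name}")

    # Hypothetical rule: weights must be positive for weighted selection.
    for p in providers:
        if p.get("weight", 0) <= 0:
            problems.append(f"{p['name']}: weight must be positive")

    # Hypothetical rule: the default instance must exist and be enabled.
    default = config.get("default_provider_instance")
    enabled = {p["name"] for p in providers if p.get("enabled")}
    if default is not None and default not in enabled:
        problems.append(
            f"default_provider_instance {default!r} is not an enabled provider")
    return problems


problems = validate_provider_config(CONFIG)
```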

Template Examples

Explicit Provider Selection

{
  "template_id": "explicit-aws-east",
  "provider_name": "aws-us-east-1",
  "provider_api": "EC2Fleet",
  "image_id": "ami-12345",
  "subnet_ids": ["subnet-123"],
  "max_instances": 5
}

Provider Type Selection (Load Balanced)

{
  "template_id": "load-balanced-aws",
  "provider_type": "aws",
  "provider_api": "SpotFleet",
  "image_id": "ami-67890",
  "subnet_ids": ["subnet-456"],
  "max_instances": 10
}

API-Based Selection

{
  "template_id": "api-based-selection",
  "provider_api": "RunInstances",
  "image_id": "ami-abcdef",
  "subnet_ids": ["subnet-789"],
  "max_instances": 3
}

Provider Selection Algorithms

Weighted Round Robin

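Distributes requests across provider instances in proportion to their configured weights. The function below draws one provider at random per request (weighted random selection), so the distribution matches the weights on average rather than following a strict rotation:

import random
from typing import List

def weighted_round_robin_selection(providers: List[ProviderInstance]) -> str:
    total_weight = sum(p.weight for p in providers)
    if total_weight <= 0:
        raise ValueError("no providers with a positive weight")

    # Draw a point in [1, total_weight]; weights must be non-negative integers.
    random_value = random.randint(1, total_weight)

    current_weight = 0
    for provider in providers:
        current_weight += provider.weight
        if random_value <= current_weight:
            return provider.name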
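The function above picks a provider with a random draw, so short runs can deviate from the configured ratio. If a deterministic rotation is wanted, one alternative is the scheme often called smooth weighted round-robin. The sketch below is that alternative, not the broker's implementation; the SmoothWeightedRoundRobin class and the minimal ProviderInstance dataclass are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProviderInstance:
    name: str
    weight: int
    enabled: bool = True


class SmoothWeightedRoundRobin:
    """Deterministic selection that spreads picks evenly by weight."""

    def __init__(self, providers: List[ProviderInstance]) -> None:
        self._providers = [p for p in providers if p.enabled and p.weight > 0]
        if not self._providers:
            raise ValueError("no enabled providers with a positive weight")
        self._total = sum(p.weight for p in self._providers)
        self._current: Dict[str, int] = {p.name: 0 for p in self._providers}

    def next(self) -> str:
        # Every provider gains its weight; the leader is picked and then
        # pays back the total, which interleaves heavy and light providers.
        for p in self._providers:
            self._current[p.name] += p.weight
        chosen = max(self._providers, key=lambda p: self._current[p.name])
        self._current[chosen.name] -= self._total
        return chosen.name


selector = SmoothWeightedRoundRobin([
    ProviderInstance("aws-us-east-1", weight=10),
    ProviderInstance("aws-us-west-2", weight=5),
])
picks = [selector.next() for _ in range(6)]
```

With weights 10 and 5, every three consecutive picks contain two for aws-us-east-1 and one for aws-us-west-2, and the pattern repeats exactly.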

Priority-Based Selection

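Selects the enabled provider with the highest priority, where a lower priority value means a higher priority:

def priority_based_selection(providers: List[ProviderInstance]) -> str:
    enabled_providers = [p for p in providers if p.enabled]
    if not enabled_providers:
        raise ValueError("no enabled providers")
    # The smallest priority value wins.
    return min(enabled_providers, key=lambda p: p.priority).name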

Template File Organization

Directory Structure

config/
- templates.json                       # Main templates
- awsprov_templates.json               # AWS provider type templates
- provider1prov_templates.json         # Provider1 provider type templates
- aws-us-east-1_templates.json         # AWS US East instance templates
- aws-us-west-2_templates.json         # AWS US West instance templates
- provider1-region-a_templates.json    # Provider1 Region A instance templates

Template Inheritance

Templates inherit and override properties based on file priority:

// templates.json (base)
{
  "template_id": "web-server",
  "image_id": "ami-base",
  "instance_type": "t2.micro",
  "max_instances": 2
}

// awsprov_templates.json (provider override)
{
  "template_id": "web-server",
  "provider_type": "aws",
  "provider_api": "EC2Fleet",
  "instance_type": "t3.small",
  "max_instances": 5
}

// aws-us-east-1_templates.json (instance override)
{
  "template_id": "web-server",
  "provider_name": "aws-us-east-1",
  "image_id": "ami-east-optimized",
  "max_instances": 10
}

Final resolved template:

{
  "template_id": "web-server",
  "provider_name": "aws-us-east-1",
  "provider_type": "aws",
  "provider_api": "EC2Fleet",
  "image_id": "ami-east-optimized",
  "instance_type": "t3.small",
  "max_instances": 10
}
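The resolved template above is what a field-by-field merge of the three layers produces when they are applied from lowest to highest priority. The sketch below reproduces that result; the resolve_template helper is hypothetical, and it assumes a shallow merge in which list values such as subnet_ids would be replaced wholesale rather than combined.

```python
from typing import Any, Dict, Iterable


def resolve_template(layers: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge layers ordered from lowest to highest priority.

    Later layers overwrite earlier ones field by field (shallow merge).
    """
    resolved: Dict[str, Any] = {}
    for layer in layers:
        resolved.update(layer)
    return resolved


base = {  # templates.json
    "template_id": "web-server",
    "image_id": "ami-base",
    "instance_type": "t2.micro",
    "max_instances": 2,
}
provider_type_override = {  # awsprov_templates.json
    "template_id": "web-server",
    "provider_type": "aws",
    "provider_api": "EC2Fleet",
    "instance_type": "t3.small",
    "max_instances": 5,
}
instance_override = {  # aws-us-east-1_templates.json
    "template_id": "web-server",
    "provider_name": "aws-us-east-1",
    "image_id": "ami-east-optimized",
    "max_instances": 10,
}

resolved = resolve_template([base, provider_type_override, instance_override])
```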

API Integration

REST API Endpoints

Provider Information

GET /api/v1/providers
GET /api/v1/providers/{provider-instance}/capabilities
GET /api/v1/providers/{provider-instance}/templates

Template Management

GET /api/v1/templates?provider_type=aws
GET /api/v1/templates?provider_name=aws-us-east-1
POST /api/v1/templates/validate

Request Processing

POST /api/v1/requests
{
  "templateId": "web-server",
  "maxNumber": 5,
  "providerPreference": {
    "type": "aws",
    "instance": "aws-us-east-1"
  }
}
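On the server side, a request body like the one above has to be mapped onto the provider fields of the Request aggregate. The following is a minimal sketch of that mapping; the ProvisionRequest dataclass, the parse_request function, and the validation rules are hypothetical, and providerPreference is treated as optional.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ProvisionRequest:
    template_id: str
    max_number: int
    provider_type: Optional[str] = None
    provider_instance: Optional[str] = None


def parse_request(body: Dict[str, Any]) -> ProvisionRequest:
    """Validate the JSON body and map camelCase keys to request fields."""
    template_id = body.get("templateId")
    if not isinstance(template_id, str) or not template_id:
        raise ValueError("templateId is required")

    max_number = body.get("maxNumber")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (not isinstance(max_number, int) or isinstance(max_number, bool)
            or max_number < 1):
        raise ValueError("maxNumber must be a positive integer")

    # A missing preference leaves selection to the selection service.
    preference = body.get("providerPreference") or {}
    return ProvisionRequest(
        template_id=template_id,
        max_number=max_number,
        provider_type=preference.get("type"),
        provider_instance=preference.get("instance"),
    )


body = json.loads("""
{
  "templateId": "web-server",
  "maxNumber": 5,
  "providerPreference": {"type": "aws", "instance": "aws-us-east-1"}
}
""")
request = parse_request(body)
```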

CLI Commands

Provider Management

# List available providers
orb providers list

# Show provider capabilities
orb providers show aws-us-east-1

# Validate provider configuration
orb providers validate

Template Operations

# List templates by provider
orb templates list --provider-type aws
orb templates list --provider-name aws-us-east-1

# Show template source information
orb templates show web-server --source-info

# Validate template against provider
orb templates validate web-server --provider aws-us-east-1

Error Handling and Validation

Provider Selection Errors

  • No enabled providers: When no providers are available
  • Provider not found: When explicit provider doesn't exist
  • Provider disabled: When selected provider is disabled
  • No compatible providers: When no providers support required API

Template Validation Errors

  • API not supported: Provider doesn't support required API
  • Instance limit exceeded: Request exceeds provider limits
  • Pricing model mismatch: Provider doesn't support pricing model
  • Fleet type incompatible: Provider doesn't support fleet type

Error Response Format

{
  "error": {
    "code": "PROVIDER_NOT_FOUND",
    "message": "Provider instance 'aws-invalid' not found in configuration",
    "details": {
      "requested_provider": "aws-invalid",
      "available_providers": ["aws-us-east-1", "aws-us-west-2"]
    }
  }
}
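A response in this shape can be produced by giving each selection error a code and a details mapping and converting the exception at the API boundary. The sketch below shows one way to do that; the exception classes, the base code PROVIDER_SELECTION_ERROR, and the helper functions are hypothetical.

```python
from typing import Any, Dict, List, Optional


class ProviderSelectionError(Exception):
    """Base class; subclasses set the machine-readable code."""
    code = "PROVIDER_SELECTION_ERROR"  # hypothetical generic code

    def __init__(self, message: str,
                 details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.details: Dict[str, Any] = details or {}


class ProviderNotFoundError(ProviderSelectionError):
    code = "PROVIDER_NOT_FOUND"


def to_error_response(exc: ProviderSelectionError) -> Dict[str, Any]:
    """Convert an exception into the error response format."""
    return {"error": {"code": exc.code,
                      "message": str(exc),
                      "details": exc.details}}


def require_provider(name: str, available: List[str]) -> str:
    """Return the name if configured, otherwise raise with context."""
    if name not in available:
        raise ProviderNotFoundError(
            f"Provider instance '{name}' not found in configuration",
            {"requested_provider": name,
             "available_providers": list(available)},
        )
    return name


try:
    require_provider("aws-invalid", ["aws-us-east-1", "aws-us-west-2"])
except ProviderSelectionError as exc:
    response = to_error_response(exc)
```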

Performance Considerations

Template Caching

  • Templates are cached in memory with file modification time tracking
  • Cache is automatically refreshed when template files change
  • Manual cache refresh available via API and CLI
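The caching behavior described above can be sketched with a small cache keyed by file path that stores the modification time next to the parsed content. The TemplateFileCache class and its loads counter are hypothetical and only illustrate the technique.

```python
import json
import os
from typing import Any, Dict, Tuple


class TemplateFileCache:
    """Caches parsed JSON files and reloads a file when its mtime changes."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[int, Any]] = {}
        self.loads = 0  # number of actual file reads, useful for metrics

    def get(self, path: str) -> Any:
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # cache hit: file unchanged since last load
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
        self._entries[path] = (mtime, data)
        self.loads += 1
        return data

    def refresh(self) -> None:
        """Manual refresh: drop everything so the next get() reloads."""
        self._entries.clear()
```

One limitation of this approach is that a file rewritten within the timestamp resolution of the file system keeps the same modification time and is not reloaded until a manual refresh.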

Provider Selection Optimization

  • Provider configurations are cached at startup
  • Selection algorithms use pre-computed weights and priorities
  • Capability validation results are cached per provider-API combination

File I/O Optimization

  • Template files are loaded once and cached
  • Only modified files are reloaded
  • Batch operations minimize file system calls

Monitoring and Observability

Metrics

  • Provider selection distribution
  • Template validation success/failure rates
  • File loading performance
  • Cache hit/miss ratios

Logging

  • Provider selection decisions with reasoning
  • Template override chains
  • Validation failures with details
  • Performance timing information

Health Checks

  • Provider availability status
  • Template file accessibility
  • Configuration validation status
  • Cache consistency checks

Migration Guide

From Single Provider

  1. Update configuration to include provider instances
  2. Migrate templates to provider-specific files (optional)
  3. Update API calls to include provider preferences (optional)
  4. Test provider selection behavior

Template Migration

# Migrate existing templates to provider-specific files
orb templates migrate --from templates.json --to-provider aws-us-east-1

# Validate migrated templates
orb templates validate --all --provider aws-us-east-1

Best Practices

Configuration

  • Use meaningful provider instance names
  • Set appropriate weights for load balancing
  • Enable only necessary providers
  • Validate provider configurations regularly

Template Organization

  • Use provider-specific files for customizations
  • Keep common templates in main file
  • Document template inheritance chains
  • Clean up unused templates regularly

Monitoring

  • Monitor provider selection distribution
  • Track validation failure patterns
  • Alert on provider availability issues
  • Review performance regularly

Future Enhancements

Planned Features

  • Dynamic provider discovery
  • Cross-provider failover
  • Improved scheduling algorithms
  • Provider cost optimization
  • Multi-region template synchronization

Extension Points

  • Custom selection strategies
  • Provider-specific validation rules
  • Template transformation pipelines
  • External provider registries