ML Container Creator - Architecture Guide

Project Overview

This Yeoman generator scaffolds Docker container projects for deploying ML models to AWS SageMaker using the Bring Your Own Container (BYOC) paradigm.

Quick Architecture Overview

For newcomers, here's how the generator works:

User runs: yo ml-container-creator
           ↓
    ┌─────────────────┐
    │   index.js      │  ← Main generator (orchestration)
    │  (~50 lines)    │
    └─────────────────┘
           ↓
    ┌─────────────────┐
    │ PromptRunner    │  ← Collects user input
    │ (prompts.js)    │     • Project name
    └─────────────────┘     • Framework choice
           ↓                • Optional modules
    ┌─────────────────┐
    │ TemplateManager │  ← Validates & determines templates
    │                 │     • Checks supported options
    └─────────────────┘     • Builds ignore patterns
           ↓
    ┌─────────────────┐
    │ Template Copy   │  ← Copies & processes templates
    │ (EJS processing)│     • Replaces variables
    └─────────────────┘     • Excludes unwanted files
           ↓
    Generated Project Ready! 🎉

Detailed Architecture

Generator Structure (Modular Design)

The generator follows a clean, modular architecture:

Main generator: generators/app/index.js - Orchestrates the generation process (~50 lines)
Prompt definitions: generators/app/lib/prompts.js - All user prompts organized by phase
Prompt orchestration: generators/app/lib/prompt-runner.js - Manages user interaction flow
Template logic: generators/app/lib/template-manager.js - Handles conditional template copying
Templates: generators/app/templates/ - EJS templates that get copied and processed

Key Components

1. Main Generator (`index.js`)

Purpose: Orchestrates the generation process
Size: ~50 lines (was 300+ before refactoring)
Responsibilities:
Delegates to specialized modules
Sets destination directory
Handles errors
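
The delegation described above could look roughly like the following dependency-free sketch. The real generator extends the yeoman-generator base class; a plain class stands in for it here so the flow can be read in isolation. The collaborator method names (`run`, `validate`, `getIgnorePatterns`) and both stub objects are assumptions for illustration, not the project's actual API.

```javascript
// Stand-in for generators/app/index.js. The real file extends the
// yeoman-generator base class; a plain class is used here for illustration.
class MlContainerGenerator {
  // Both collaborators are injected so they can be stubbed in tests.
  constructor(promptRunner, createTemplateManager) {
    this.promptRunner = promptRunner;
    this.createTemplateManager = createTemplateManager;
    this.answers = {};
  }

  // Phase 1: delegate all user interaction to the prompt runner.
  async prompting() {
    this.answers = await this.promptRunner.run();
  }

  // Phase 2: validate, then work out which templates to skip.
  writing() {
    const manager = this.createTemplateManager(this.answers);
    manager.validate(); // assumed to throw on unsupported combinations
    return {
      destination: this.answers.destinationDir,
      ignore: manager.getIgnorePatterns(),
    };
  }
}

// Hypothetical stubs so the sketch runs on its own.
const stubRunner = {
  run: async () => ({ projectName: 'demo', destinationDir: './demo', framework: 'sklearn' }),
};
const stubManagerFactory = (answers) => ({
  validate: () => {
    if (!answers.framework) throw new Error('framework is required');
  },
  getIgnorePatterns: () => ['**/code/serve'],
});

// Drives the two phases in the order Yeoman would call them.
async function main() {
  const generator = new MlContainerGenerator(stubRunner, stubManagerFactory);
  await generator.prompting();
  return generator.writing();
}
```

Injecting the collaborators keeps the orchestrator small and lets each module be swapped for a stub, which is the property the ~50-line figure relies on.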

2. Prompt Runner (`lib/prompt-runner.js`)

Purpose: Manages user interaction
Phases:
📋 Project Configuration
🔧 Core Configuration
📦 Module Selection
💪 Infrastructure & Performance
Output: Combined answers object
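
As a rough illustration of how four phases can yield one combined answers object, here is a minimal sketch. The phase contents, the default values, and the `ask` callback (a stand-in for the generator's prompt method) are all hypothetical; only the phase titles come from the list above.

```javascript
// Hypothetical phase list; the real definitions live in lib/prompts.js.
const phases = [
  { title: '📋 Project Configuration', prompts: [{ name: 'projectName', default: 'my-model' }] },
  { title: '🔧 Core Configuration', prompts: [{ name: 'framework', default: 'sklearn' }] },
  { title: '📦 Module Selection', prompts: [{ name: 'includeTesting', default: true }] },
  { title: '💪 Infrastructure & Performance', prompts: [{ name: 'awsRegion', default: 'us-east-1' }] },
];

// Runs each phase in order and merges the results into one answers object.
// `ask` receives the phase's prompt definitions plus the answers collected
// so far, and resolves to an object of new answers.
async function runPhases(phaseList, ask, log = () => {}) {
  let answers = {};
  for (const phase of phaseList) {
    log(phase.title);
    const phaseAnswers = await ask(phase.prompts, answers);
    answers = { ...answers, ...phaseAnswers };
  }
  return answers;
}

// Non-interactive stand-in that accepts every default.
const acceptDefaults = async (prompts) =>
  Object.fromEntries(prompts.map((p) => [p.name, p.default]));
```

Calling `runPhases(phases, acceptDefaults, console.log)` prints each phase title and resolves to the merged object. Passing earlier answers into `ask` is one way to let later phases depend on earlier choices.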

3. Template Manager (`lib/template-manager.js`)

Purpose: Handles template logic
Functions:
Validates user configuration
Determines which templates to include/exclude
Centralizes conditional logic

4. Prompts (`lib/prompts.js`)

Purpose: Defines all user prompts
Organization: Grouped by phase
Benefits: Easy to add new prompts
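
A phase-grouped prompt file might be organized along these lines. The sketch uses the Inquirer question format that Yeoman's prompt method accepts (`type`, `name`, `message`, `choices`, `default`, `when`). The grouping keys, messages, and the `activePromptNames` helper are assumptions, and it presumes answers from earlier phases are forwarded to later ones.

```javascript
// Illustrative prompt definitions grouped by phase. Prompt names mirror the
// template variables in this guide; everything else is hypothetical.
const prompts = {
  core: [
    {
      type: 'list',
      name: 'framework',
      message: 'Which ML framework are you using?',
      choices: ['sklearn', 'xgboost', 'tensorflow', 'transformers'],
    },
    {
      type: 'list',
      name: 'modelServer',
      message: 'Which model server?',
      // Choices depend on the framework picked in the previous question.
      choices: (answers) =>
        answers.framework === 'transformers' ? ['vllm', 'sglang'] : ['flask', 'fastapi'],
    },
  ],
  modules: [
    {
      type: 'confirm',
      name: 'includeSampleModel',
      message: 'Include a sample model?',
      default: false,
      // Not offered for transformers, which load from the Hugging Face Hub.
      when: (answers) => answers.framework !== 'transformers',
    },
  ],
};

// Returns the names of the prompts that would be asked for a set of answers.
function activePromptNames(promptList, answers) {
  return promptList
    .filter((p) => (typeof p.when === 'function' ? p.when(answers) : p.when !== false))
    .map((p) => p.name);
}
```

Adding a prompt then means appending one object to the relevant phase array, with a `when` function if it only applies to some configurations.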

Key Concepts

Yeoman Generator Pattern: Extends yeoman-generator base class
Phases: Yeoman calls generator methods in a fixed order of phases (prompting → writing → install are the ones relevant here)
Template Processing: Uses EJS syntax (<%= variable %>) in template files
Conditional Generation: Files can be excluded via ignorePatterns array
Modular Design: Separation of concerns for maintainability

Supported Configurations

Frameworks

sklearn - scikit-learn models
xgboost - XGBoost models
tensorflow - TensorFlow/Keras models
transformers - Hugging Face transformer models (LLMs)

Model Servers

flask - Flask-based serving (traditional ML)
fastapi - FastAPI-based serving (traditional ML)
vllm - vLLM serving (transformers only)
sglang - SGLang serving (transformers only)

Model Formats

sklearn: pkl, joblib
xgboost: json, model, ubj
tensorflow: keras, h5, SavedModel
transformers: N/A (loaded from Hugging Face Hub)
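
The three lists above can be read as one lookup table. The following sketch encodes them and checks a configuration against the table. The object name, its shape, and `validateConfig` are hypothetical and are not taken from the generator's source.

```javascript
// Supported options as listed in this guide. Name and shape are assumptions.
const SUPPORTED_COMBINATIONS = {
  sklearn: { servers: ['flask', 'fastapi'], formats: ['pkl', 'joblib'] },
  xgboost: { servers: ['flask', 'fastapi'], formats: ['json', 'model', 'ubj'] },
  tensorflow: { servers: ['flask', 'fastapi'], formats: ['keras', 'h5', 'SavedModel'] },
  // Transformers models are pulled from the Hugging Face Hub, so no format.
  transformers: { servers: ['vllm', 'sglang'], formats: [] },
};

// Returns a list of human-readable problems; an empty list means valid.
function validateConfig({ framework, modelServer, modelFormat }) {
  if (!Object.hasOwn(SUPPORTED_COMBINATIONS, framework)) {
    return [`Unsupported framework: ${framework}`];
  }
  const entry = SUPPORTED_COMBINATIONS[framework];
  const errors = [];
  if (!entry.servers.includes(modelServer)) {
    errors.push(`${modelServer} is not available for ${framework}`);
  }
  // Frameworks with an empty format list skip the format check entirely.
  if (entry.formats.length > 0 && !entry.formats.includes(modelFormat)) {
    errors.push(`${modelFormat} is not a ${framework} model format`);
  }
  return errors;
}
```

For example, `validateConfig({ framework: 'sklearn', modelServer: 'vllm', modelFormat: 'pkl' })` returns a single error, because vLLM is listed for transformers only.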

Code Conventions

JavaScript Style

Use ES6+ features (const, arrow functions, async/await)
Prefer const over let, avoid var
Use template literals for string interpolation
Follow existing ESLint configuration

Generator Methods

prompting() - Collect user input via interactive prompts
writing() - Copy and process template files
_validateAnswers() - Private helper for validation (the _ prefix keeps Yeoman from running it as a phase)

Template Variables

All answers are stored in this.answers and available in templates:

{
  projectName,
  destinationDir,
  framework,
  modelFormat,
  modelServer,
  includeSampleModel,
  includeTesting,
  testTypes,
  deployTarget,
  instanceType,
  awsRegion,
  buildTimestamp
}

Configuration Decision Tree

Framework?
├── sklearn/xgboost/tensorflow (Traditional ML)
│   ├── Model Format? (pkl, json, keras, etc.)
│   ├── Server? (Flask or FastAPI)
│   ├── Include sample model? (Yes/No)
│   ├── Include tests? (Yes/No)
│   └── Instance type? (CPU or GPU)
│
└── transformers (LLMs)
    ├── Server? (vLLM or SGLang)
    ├── Include tests? (Yes - endpoint only)
    └── Instance type? (GPU only)

Template Structure & Logic

Generated Project Structure

project-name/
├── Dockerfile              ← Always included
├── requirements.txt        ← Excluded for transformers
├── nginx-predictors.conf   ← Excluded for transformers (traditional ML only)
├── nginx-tensorrt.conf     ← Included only for TensorRT-LLM
├── code/
│   ├── model_handler.py   ← Excluded for transformers
│   ├── serve.py           ← Excluded for transformers
│   ├── serve              ← Excluded for traditional ML
│   └── flask/             ← Excluded if not Flask
├── deploy/
│   ├── build_and_push.sh  ← Always included
│   ├── deploy.sh          ← Always included
│   └── upload_to_s3.sh    ← Excluded for traditional ML
├── sample_model/          ← Optional module
└── test/                  ← Optional module

File Generation Logic

The generator uses exclusion patterns to determine which templates to copy:

// Example: transformers projects exclude the traditional ML files
const ignorePatterns = [];
if (framework === 'transformers') {
    ignorePatterns.push(
        'model_handler.py',      // Custom model loading
        'serve.py',              // Flask/FastAPI entry point
        'nginx-predictors.conf', // Traditional ML reverse proxy
        'requirements.txt'       // Traditional ML dependencies
    );
}

Template Exclusion Logic

Files are conditionally excluded based on configuration:

Transformers: Excludes traditional ML serving code (model_handler.py, serve.py, nginx-predictors.conf, requirements.txt)
Traditional ML: Excludes transformer serving code (code/serve, upload_to_s3.sh, nginx-tensorrt.conf, start_server.sh)
TensorRT-LLM: Includes nginx-tensorrt.conf and start_server.sh for SageMaker compatibility
Non-Flask: Excludes Flask-specific code
No sample model: Excludes sample_model/ directory
No testing: Excludes test/ directory
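
The rules above can be collected into a single function that maps answers to glob patterns. This is a sketch only: the function name and the exact glob strings are assumptions, and the TensorRT-LLM rule is left out because this guide does not say which answer selects it.

```javascript
// Maps a set of answers to the glob patterns that should be skipped when
// copying templates. Pattern strings are illustrative, not the project's own.
function buildIgnorePatterns(answers) {
  const patterns = [];

  if (answers.framework === 'transformers') {
    // Transformers: drop the traditional ML serving code and dependencies.
    patterns.push(
      '**/code/model_handler.py',
      '**/code/serve.py',
      '**/nginx-predictors.conf',
      '**/requirements.txt'
    );
  } else {
    // Traditional ML: drop the transformer serving code.
    patterns.push(
      '**/code/serve',
      '**/deploy/upload_to_s3.sh',
      '**/nginx-tensorrt.conf',
      '**/start_server.sh'
    );
  }

  // Optional pieces are excluded unless explicitly selected.
  if (answers.modelServer !== 'flask') patterns.push('**/code/flask/**');
  if (!answers.includeSampleModel) patterns.push('**/sample_model/**');
  if (!answers.includeTesting) patterns.push('**/test/**');

  return patterns;
}
```

Keeping every rule in one function means a new framework or module only touches this one place, and the result is easy to assert on in unit tests.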

Benefits of This Architecture

✅ Maintainable

Small, focused modules
Clear separation of concerns
Easy to understand and modify

✅ Testable

Each module can be tested independently
Clear inputs and outputs
Mocking is straightforward

✅ Extensible

Adding new prompts is simple
New template logic is centralized
Framework additions follow patterns

✅ Readable

Main generator is ~50 lines
Logic is organized by purpose
Comments explain the "why"

Development Workflow

Making Changes

1. Edit generator logic in the appropriate module:
   Prompts: generators/app/lib/prompts.js
   Template logic: generators/app/lib/template-manager.js
   Orchestration: generators/app/index.js
2. Edit templates in generators/app/templates/
3. Run npm link to test locally
4. Test with yo ml-container-creator
5. Run npm test before committing

Testing

Unit tests in test/ directory
Focus on TemplateManager and core logic
Run security audit before tests: npm run pretest
Use npm run test:watch for development

Adding New Features

Adding a New Prompt

Add prompt definition to appropriate phase in prompts.js
Update template logic if it affects file generation
Add test cases for the new option

Adding a New Template

Create template file with EJS variables
Add exclusion logic if conditional
Test with different configurations

Adding a New Framework

Add to prompt choices in prompts.js
Add template exclusion logic to template-manager.js
Create framework-specific templates
Add tests for the new configuration

AWS/SageMaker Context

SageMaker BYOC Requirements

Container must expose port 8080
Must implement /ping (health check) and /invocations (inference) endpoints
Model artifacts typically stored in /opt/ml/model/
Environment variables: SM_MODEL_DIR, SM_NUM_GPUS, etc.
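
To make the endpoint contract concrete, here is a minimal sketch of a server that answers /ping and /invocations on port 8080. It is written in JavaScript only to match the generator's language; the generated projects serve models with Flask, FastAPI, vLLM, or SGLang instead. The JSON request format and the `predict` callback are assumptions.

```javascript
// Pure request router for the two endpoints SageMaker calls.
// `predict` is a hypothetical inference function supplied by the caller.
function handleRequest(method, path, body, predict) {
  if (method === 'GET' && path === '/ping') {
    return { status: 200, body: '' }; // 200 signals a healthy container
  }
  if (method === 'POST' && path === '/invocations') {
    try {
      const result = predict(JSON.parse(body));
      return { status: 200, body: JSON.stringify(result) };
    } catch (err) {
      // Malformed JSON or a failing predict call is reported as a 400.
      return { status: 400, body: JSON.stringify({ error: err.message }) };
    }
  }
  return { status: 404, body: '' };
}

// Wraps the router in Node's built-in HTTP server, listening on port 8080.
function startServer(predict, port = 8080) {
  const http = require('node:http');
  return http
    .createServer((req, res) => {
      let body = '';
      req.on('data', (chunk) => { body += chunk; });
      req.on('end', () => {
        const out = handleRequest(req.method, req.url, body, predict);
        res.writeHead(out.status, { 'Content-Type': 'application/json' });
        res.end(out.body);
      });
    })
    .listen(port);
}
```

Separating the router from the socket handling lets the contract be checked locally, for instance with `handleRequest('GET', '/ping', '', (x) => x)`, before building the image.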

Deployment Flow

Build Docker image locally
Push to Amazon ECR
Create SageMaker model from ECR image
Deploy to SageMaker endpoint
Test endpoint with sample data
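
The push and model-creation steps both refer to the same ECR image URI. A small helper, assuming the standard commercial-region hostname format (account.dkr.ecr.region.amazonaws.com), shows how it is composed. The function name and the `latest` default tag are hypothetical.

```javascript
// Builds the ECR image URI that is pushed to ECR and then referenced when
// creating the SageMaker model. Other AWS partitions use different domains.
function ecrImageUri({ accountId, region, repository, tag = 'latest' }) {
  if (!/^\d{12}$/.test(accountId)) {
    throw new Error('accountId must be a 12-digit AWS account ID');
  }
  return `${accountId}.dkr.ecr.${region}.amazonaws.com/${repository}:${tag}`;
}
```

Because the region is part of the hostname, the repository has to exist in the same region the endpoint is deployed to.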

Common Patterns

Adding a New Framework

Add to SUPPORTED_OPTIONS.frameworks
Add model format choices in prompting
Create template variations if needed
Update validation logic
Test all combinations

Adding a New Model Server

Add to SUPPORTED_OPTIONS.modelServer
Create server-specific templates in code/
Update ignore patterns
Add appropriate dependencies to requirements.txt template
Update Dockerfile if needed

Dependencies

Runtime (Generated Projects)

Python 3.8+
Docker 20+
AWS CLI 2+
Framework-specific: scikit-learn, xgboost, tensorflow, vllm, sglang

Development (Generator)

Node.js 24+
Yeoman
ESLint
Mocha

Security Considerations

Run npm audit before tests (automated in pretest)
Use npm-force-resolutions for dependency overrides
Keep dependencies updated via overrides in package.json
Never commit AWS credentials

Troubleshooting

Generator Issues

Run npm link after changes
Clear Yeoman cache: rm -rf ~/.config/configstore/insight-yo.json
Check Node version: node --version (must be 24+)

Generated Project Issues

Test locally before deploying: docker build and docker run
Check SageMaker logs in CloudWatch
Verify IAM role has necessary permissions
Ensure ECR repository exists in target region
