Generator Architecture
The generator is a standalone Node.js CLI built with Commander.js. The entry point is bin/cli.js, which parses options and delegates to src/app.js. Business logic modules live in generators/app/lib/.
Entry Point Flow
bin/cli.js → Commander option parsing, subcommand routing
└─ src/app.js → Orchestrates the four generation phases
   └─ generators/app/lib/ → Business logic modules
bin/cli.js
Defines all CLI options and subcommands (bootstrap, mcp, registry, configure), and calls run() from src/app.js as the default action.
src/app.js
The run(projectName, options) function orchestrates the full generation workflow. It also exports writeProject() and postGenerate() for use in tests and programmatic invocations.
src/copy-tpl.js
EJS template copying utility. Walks the template directory, renders each file through EJS, and writes to the destination — respecting ignore patterns.
src/prompt-adapter.js
Bridges the prompt definitions (which use Inquirer.js-compatible prompt objects) to @inquirer/prompts for interactive input collection.
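The bridging step can be pictured as a translation from a legacy Inquirer.js question object to the name and config of a standalone @inquirer/prompts function. The sketch below is only an illustration: adaptPrompt and PROMPT_FOR_TYPE are hypothetical names, the type mapping covers just a few common prompt types, and the real src/prompt-adapter.js may handle fields such as when or validate differently.

```javascript
// Hypothetical mapping from legacy Inquirer.js "type" values to the names of
// the standalone functions exported by @inquirer/prompts.
const PROMPT_FOR_TYPE = {
  input: 'input',
  confirm: 'confirm',
  list: 'select',
  checkbox: 'checkbox',
  password: 'password'
}

// Translate one legacy question into { name, promptName, config }.
// The caller would invoke the matching prompt function with `config`
// and store the answer under `name`.
function adaptPrompt(question) {
  const promptName = PROMPT_FOR_TYPE[question.type ?? 'input']
  if (!promptName) {
    throw new Error(`Unsupported prompt type: ${question.type}`)
  }
  // Standalone prompts take no "type", "name", or "when" field, so strip them.
  const { type, name, when, ...config } = question
  return { name, promptName, config }
}

const adapted = adaptPrompt({
  type: 'list',
  name: 'region',
  message: 'AWS region?',
  choices: ['us-east-1', 'us-west-2']
})
```

Keeping the translation pure, with no prompt library imported, makes the mapping easy to unit test; the interactive call itself stays in one small place.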
Generation Phases
Phase 1: Initializing
Loads configuration from all sources and initializes the registry system.
- Commander options are converted from camelCase to kebab-case for ConfigManager compatibility.
- ConfigManager.loadConfiguration() merges values from 8 sources in precedence order: CLI options, CLI arguments, environment variables, CLI config file, custom config file (config/mcp.json), package.json section, MCP servers, and generator defaults. It also queries configured MCP servers via McpClient.
- ConfigurationManager loads the three registries through RegistryLoader, which reads catalog JSON files from servers/*/catalogs/ and transforms them into internal data shapes.
- ValidationEngine is initialized with accelerator validators (CUDA, Neuron, ROCm, CPU) for later use.
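As a rough illustration of the camelCase to kebab-case step, the sketch below converts the keys of a Commander-style options object. The helper names toKebabCase and kebabCaseKeys are hypothetical, and the instanceType key and its value are illustrative; the actual conversion code in the generator may differ.

```javascript
// Hypothetical helper: insert a hyphen before each uppercase letter,
// then lowercase the whole key (skipPrompts -> skip-prompts).
function toKebabCase(key) {
  return key.replace(/([A-Z])/g, '-$1').toLowerCase()
}

// Return a new object with every key converted; values are left untouched.
function kebabCaseKeys(options) {
  return Object.fromEntries(
    Object.entries(options).map(([key, value]) => [toKebabCase(key), value])
  )
}

// Commander exposes --skip-prompts as options.skipPrompts.
const converted = kebabCaseKeys({ skipPrompts: true, instanceType: 'ml.g5.xlarge' })
// converted is { 'skip-prompts': true, 'instance-type': 'ml.g5.xlarge' }
```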
Phase 2: Prompting
If --skip-prompts is set, ConfigManager.getFinalConfiguration() returns the merged config directly. Otherwise:
- PromptRunner.run() executes prompts in phases: Infrastructure (region, deployment target, instance type, HyperPod/async/batch settings, build target), Core ML (deployment config, engine, model format, model name, base image, HF token), Modules (sample model, testing), and Project (name, directory).
- Prompt definitions live in prompts.js. Each prompt group is an exported array. The deployment config prompt presents a flat list of 15 architecture-backend values (e.g., transformers-vllm, triton-fil). DeploymentConfigResolver.decompose() splits these into architecture and backend fields.
- PromptRunner queries MCP servers for instance type and region choices before presenting those prompts, merging MCP-provided choices into the prompt options.
- Prompt answers are merged with the base config via ConfigManager.getFinalConfiguration(promptAnswers).
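The decomposition of an architecture-backend value can be sketched as a split on the first hyphen. This is a stand-in written for illustration, not the code of DeploymentConfigResolver.decompose(); in particular, splitting on the first hyphen (so that a backend name could itself contain hyphens) is an assumption.

```javascript
// Hypothetical stand-in for DeploymentConfigResolver.decompose().
function decompose(deploymentConfig) {
  // Split on the first hyphen only (assumption: architecture names have no hyphen).
  const separator = deploymentConfig.indexOf('-')
  if (separator <= 0 || separator === deploymentConfig.length - 1) {
    throw new Error(`Invalid deployment config: ${deploymentConfig}`)
  }
  return {
    architecture: deploymentConfig.slice(0, separator),
    backend: deploymentConfig.slice(separator + 1)
  }
}

const vllm = decompose('transformers-vllm')
// vllm is { architecture: 'transformers', backend: 'vllm' }
const fil = decompose('triton-fil')
// fil is { architecture: 'triton', backend: 'fil' }
```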
Phase 3: Writing
- TemplateManager.validate() checks that the deployment config, build target, deployment target, instance type, and region are all within supported values. It also enforces GPU requirements for GPU-only backends.
- CommentGenerator produces Dockerfile comments (accelerator info, validation status, troubleshooting).
- All templates are copied with copyTpl(), processing EJS variables. A small set of ignore patterns excludes architecture-specific subdirectories (triton/, diffusors/, hyperpod/) that are handled separately.
- A four-way switch on architecture (http, transformers, triton, diffusors) deletes files that don't belong to the selected architecture and, for triton and diffusors, copies architecture-specific templates (Dockerfile, model repository, serve scripts).
- Shell scripts in do/ and deploy/ get chmod 755.
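The architecture switch described in the list above can be sketched as a function that returns a plan: which files to delete and which template subdirectory, if any, to copy afterwards. Everything here is illustrative. planArchitectureStep is a hypothetical name, the file lists passed in are made-up placeholders rather than the generator's real layout, and the real code acts on the file system directly instead of returning a plan.

```javascript
const ARCHITECTURES = ['http', 'transformers', 'triton', 'diffusors']

// filesByArchitecture maps an architecture to the files only it needs
// (hypothetical input; the generator hard-codes its own lists).
function planArchitectureStep(architecture, filesByArchitecture) {
  const keep = new Set(filesByArchitecture[architecture] ?? [])
  const plan = { remove: [], copyFrom: null }

  // Delete files owned by other architectures, unless the selected one shares them.
  for (const other of ARCHITECTURES) {
    if (other === architecture) continue
    for (const file of filesByArchitecture[other] ?? []) {
      if (!keep.has(file)) plan.remove.push(file)
    }
  }

  switch (architecture) {
    case 'http':
    case 'transformers':
      break // nothing extra to copy
    case 'triton':
    case 'diffusors':
      // These subdirectories were skipped by the ignore patterns during the bulk copy.
      plan.copyFrom = `${architecture}/`
      break
    default:
      throw new Error(`Unsupported architecture: ${architecture}`)
  }
  return plan
}

// Placeholder file names, for illustration only.
const layout = {
  http: ['code/serve.py'],
  triton: ['model_repository/config.pbtxt'],
  diffusors: ['code/pipeline.py']
}
const plan = planArchitectureStep('triton', layout)
// plan is { remove: ['code/serve.py', 'code/pipeline.py'], copyFrom: 'triton/' }
```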
Phase 4: Post-generate
Runs train_abalone.py if a sample model was requested (http and eligible triton backends only). Sets executable permissions on generated scripts.
Key Modules
| Module | Location | Purpose |
|---|---|---|
| config-manager.js | generators/app/lib/ | 8-level configuration precedence, MCP integration, parameter matrix |
| prompt-runner.js | generators/app/lib/ | Phased prompt execution, MCP choice injection, catalog data loading |
| prompts.js | generators/app/lib/ | All prompt definitions, instance type registry from catalog, project name generation |
| template-manager.js | generators/app/lib/ | Validates deployment config, build target, deployment target, instance type, region, GPU requirements |
| configuration-manager.js | generators/app/lib/ | Orchestrates registry loading, framework/model matching, HuggingFace enrichment, env var validation |
| registry-loader.js | generators/app/lib/ | Adapter layer: reads catalog JSON from servers/*/catalogs/ and transforms into internal shapes |
| deployment-config-resolver.js | generators/app/lib/ | Decomposes transformers-vllm into {architecture: 'transformers', backend: 'vllm'} |
| mcp-client.js | generators/app/lib/ | Spawns MCP server processes, performs handshake, calls get_ml_config tool |
| validation-engine.js | generators/app/lib/ | Validates accelerator compatibility (framework requirements vs. instance capabilities) |
| deployment-registry.js | generators/app/lib/ | CRUD operations for the local deployment registry (~/.mcc-registry/) |
| copy-tpl.js | src/ | EJS template rendering and file copying with glob ignore patterns |
| prompt-adapter.js | src/ | Bridges prompt definitions to @inquirer/prompts |
Configuration Flow
The configuration precedence system is documented in the Configuration user guide. From a code perspective, the flow is:
- The ConfigManager constructor builds a parameter matrix defining which parameters are accepted from which sources.
- loadConfiguration() applies sources in reverse precedence order (lowest first), so higher-precedence sources overwrite lower ones.
- MCP servers are queried during loading. McpClient spawns each configured server as a child process, performs the MCP handshake, and calls the get_ml_config tool. Returned values and choices are stored separately: values merge into the config, choices are injected into prompt options.
- getFinalConfiguration(promptAnswers) merges prompt answers (lowest precedence) with the accumulated config and applies DeploymentConfigResolver to decompose the deploymentConfig string into architecture and backend.
- _ensureTemplateVariables() in src/app.js fills in defaults for any missing fields, merges environment variables from catalog sources with a five-layer precedence (catalog defaults, framework profile, model entry, model profile, CLI overrides), and enriches transformer models with HuggingFace data.
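The reverse-precedence loading step can be sketched as a loop that applies the lowest-precedence source first and lets each later source overwrite it. The source keys below (cliOptions, generatorDefaults, and so on) are hypothetical identifiers chosen to mirror the eight sources listed under Phase 1, and the sketch leaves out the parameter matrix that restricts which parameters each source may set.

```javascript
// Highest precedence first, mirroring the order listed for loadConfiguration().
// The key names are hypothetical.
const PRECEDENCE = [
  'cliOptions',
  'cliArguments',
  'environment',
  'cliConfigFile',
  'customConfigFile',
  'packageJson',
  'mcpServers',
  'generatorDefaults'
]

function mergeByPrecedence(sources) {
  const config = {}
  // Walk from lowest to highest precedence so later assignments win.
  for (const name of [...PRECEDENCE].reverse()) {
    for (const [key, value] of Object.entries(sources[name] ?? {})) {
      if (value !== undefined) config[key] = value
    }
  }
  return config
}

// Example values are illustrative only.
const merged = mergeByPrecedence({
  generatorDefaults: { region: 'us-east-1', 'instance-type': 'ml.m5.large' },
  environment: { region: 'eu-west-1' },
  cliOptions: { region: 'us-west-2' }
})
// merged is { region: 'us-west-2', 'instance-type': 'ml.m5.large' }
```

The same lowest-first loop would also fit the five-layer environment variable merge, with the layer list swapped for catalog defaults through CLI overrides.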
Subcommands
The CLI registers subcommands via Commander:
| Subcommand | Handler | Purpose |
|---|---|---|
| bootstrap [action] | src/lib/ (Phase 2) | One-time AWS infrastructure setup |
| mcp <action> | src/lib/ (Phase 2) | Query configured MCP servers |
| registry <action> | src/lib/ (Phase 2) | Deployment registry operations |
| configure | src/lib/ (Phase 2) | Generate config file |
Testing
Tests invoke writeProject() directly with a prepared answers object and a temporary destination directory. This avoids interactive prompts and allows deterministic validation of generated output.
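import os from 'node:os'
import path from 'node:path'
import { writeProject } from '../src/app.js'

// TEMPLATE_DIR and answers are assumed to be prepared by the test setup
const destDir = path.join(os.tmpdir(), 'test-output')
await writeProject(TEMPLATE_DIR, destDir, answers)
// Assert on generated files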
Further Reading
- Template System — EJS templates, do/ script branching, Dockerfile conditionals
- MCP Server Development — Adding catalog entries, creating new servers
- Registries and Catalogs — Catalog JSON format and data flow
- Configuration — User-facing configuration reference