Configuration¶

ML Container Creator supports multiple configuration methods with a clear precedence order, from interactive prompts to fully automated CLI usage.

Precedence¶

Configuration sources are applied in strict precedence order (highest to lowest):

Priority	Source	Description	Example
1	CLI Options	Command-line flags	`--deployment-config=http-flask`
2	CLI Arguments	Positional arguments	`ml-container-creator my-project`
3	Environment Variables	Shell environment	`export AWS_REGION=us-east-1`
4	CLI Config File	`--config` specified file	`--config=production.json`
5	Custom Config File	`config/mcp.json`	Auto-discovered in current directory
6	Package.json Section	`"ml-container-creator": {...}`	Project-specific defaults
7	Generator Defaults	Built-in defaults	`awsRegion: "us-east-1"`
8	Interactive Prompts	User input (fallback)	CLI prompts

When the same parameter is set in more than one source, the value from the higher-precedence source wins.
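The override rule behaves like a layered lookup in which the first source that defines a parameter wins. The Python sketch below illustrates that idea with `collections.ChainMap`. It is not the generator's implementation; the layer names and sample values are assumptions chosen for illustration.

```python
from collections import ChainMap

# Hypothetical layers, listed from highest to lowest precedence.
cli_options = {"deploymentConfig": "http-flask"}
environment = {"awsRegion": "us-west-2"}
config_file = {"awsRegion": "eu-west-1", "instanceType": "ml.m5.large"}
defaults = {"awsRegion": "us-east-1", "includeTesting": True}

# ChainMap searches its maps left to right, so earlier (higher-priority)
# layers shadow later ones without mutating any of them.
resolved = ChainMap(cli_options, environment, config_file, defaults)

print(resolved["awsRegion"])       # us-west-2: the environment beats the config file and defaults
print(resolved["instanceType"])    # ml.m5.large: only the config file sets it
print(resolved["includeTesting"])  # True: falls through to the defaults
```

Keeping the layers separate, rather than merging them into one dictionary, makes it easy to see which source supplied a given value.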

Parameter Reference¶

This table shows every parameter, its CLI flag, which configuration sources support it, and whether it is required.

Parameter	CLI Option	Env Var	Config File	Package.json	Default	Required
Core
Deployment Config	`--deployment-config`	--	yes	--	--	yes
Engine	`--engine`	--	yes	--	--	no
Framework	`--framework`	--	yes	--	--	no
Model Server	`--model-server`	--	yes	--	--	no
Model Format	`--model-format`	--	yes	--	--	yes
Modules
Include Sample	`--include-sample`	--	yes	--	`false`	yes
Include Testing	`--include-testing`	--	yes	--	`true`	yes
Infrastructure
Deployment Target	`--deployment-target`	--	yes	--	--	yes
Build Target	`--build-target`	--	yes	--	--	yes
Instance Type	`--instance-type`	`ML_INSTANCE_TYPE`	yes	--	--	yes
CodeBuild Compute	`--codebuild-compute-type`	--	yes	--	`BUILD_GENERAL1_MEDIUM`	no
AWS Region	`--region`	`AWS_REGION`	yes	yes	`us-east-1`	no
AWS Role ARN	`--role-arn`	`AWS_ROLE`	yes	yes	--	no
HyperPod EKS
Cluster	`--hyperpod-cluster`	--	yes	--	--	no
Namespace	`--hyperpod-namespace`	--	yes	--	--	no
Replicas	`--hyperpod-replicas`	--	yes	--	`1`	no
FSx Volume Handle	`--fsx-volume-handle`	--	yes	--	--	no
Project
Project Name	`--project-name`	--	yes	yes	--	yes
Project Directory	`--project-dir`	--	yes	yes	`.`	yes
System
Config File	`--config`	`ML_CONTAINER_CREATOR_CONFIG`	--	yes	--	no
Skip Prompts	`--skip-prompts`	--	--	--	`false`	no

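Core and module parameters (deployment-config, engine, framework, model-server, model-format, include-sample, include-testing) are not supported via environment variables or package.json. Those two sources accept only the settings marked in the table: environment variables cover instance type, region, role ARN, and the config file path; package.json covers region, role ARN, project name, project directory, and the config file path.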

Deployment Configs¶

The --deployment-config flag bundles the architecture and model server into a single value:

Config	Architecture	Backend	Use Case
`http-flask`	HTTP	Flask	Traditional ML with Flask server
`http-fastapi`	HTTP	FastAPI	Traditional ML with FastAPI server
`transformers-vllm`	Transformers	vLLM	LLM serving with vLLM
`transformers-sglang`	Transformers	SGLang	LLM serving with SGLang
`transformers-tensorrt-llm`	Transformers	TensorRT-LLM	LLM serving with TensorRT-LLM
`transformers-lmi`	Transformers	LMI	LLM serving with Large Model Inference
`transformers-djl`	Transformers	DJL	LLM serving with Deep Java Library
`triton-fil`	Triton	FIL	Tree models (XGBoost, LightGBM) on Triton
`triton-onnxruntime`	Triton	ONNX Runtime	ONNX models on Triton
`triton-tensorflow`	Triton	TensorFlow	TensorFlow models on Triton
`triton-pytorch`	Triton	PyTorch	PyTorch models on Triton
`triton-vllm`	Triton	vLLM	LLM serving on Triton
`triton-tensorrtllm`	Triton	TensorRT-LLM	LLM serving on Triton with TensorRT-LLM
`triton-python`	Triton	Python	Custom Python models on Triton

For traditional ML configs (http-flask, http-fastapi), also specify --engine to set the ML framework.
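The identifiers in the table all follow an `<architecture>-<backend>` pattern, with the architecture ending at the first hyphen. The Python sketch below splits a value on that basis. Whether the generator parses the flag this way is an assumption, and the function names `split_deployment_config` and `needs_engine` are hypothetical.

```python
from typing import Tuple

def split_deployment_config(value: str) -> Tuple[str, str]:
    """Split '<architecture>-<backend>' at the first hyphen."""
    architecture, separator, backend = value.partition("-")
    if not (architecture and separator and backend):
        raise ValueError(f"Expected '<architecture>-<backend>', got {value!r}")
    return architecture, backend

def needs_engine(value: str) -> bool:
    """Per the note above, the http-* configs are the ones that also take --engine."""
    architecture, _ = split_deployment_config(value)
    return architecture == "http"

print(split_deployment_config("transformers-tensorrt-llm"))  # ('transformers', 'tensorrt-llm')
print(needs_engine("http-fastapi"))                          # True
print(needs_engine("triton-fil"))                            # False
```

Splitting only at the first hyphen keeps multi-part backends such as `tensorrt-llm` intact.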

Model Formats¶

Framework	Supported Formats	Default
sklearn	`pkl`, `joblib`	`pkl`
xgboost	`json`, `model`, `ubj`	`json`
tensorflow	`keras`, `h5`, `SavedModel`	`keras`
transformers	N/A (models loaded from HuggingFace Hub)	--
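The table can be read as a lookup from framework to allowed formats, with the first format acting as the default. The Python sketch below encodes it that way. The function name `pick_model_format` is hypothetical and the generator's own validation may be structured differently; the error wording follows the example in the Parameter Validation section.

```python
from typing import Optional

# Formats per framework, copied from the table; the first entry is the default.
SUPPORTED_FORMATS = {
    "sklearn": ["pkl", "joblib"],
    "xgboost": ["json", "model", "ubj"],
    "tensorflow": ["keras", "h5", "SavedModel"],
}

def pick_model_format(engine: str, requested: Optional[str] = None) -> str:
    """Return the requested format if the engine supports it, else the default."""
    formats = SUPPORTED_FORMATS.get(engine)
    if formats is None:
        raise ValueError(f"No model formats listed for engine '{engine}'")
    if requested is None:
        return formats[0]
    if requested not in formats:
        raise ValueError(f"Unsupported model format '{requested}' for engine '{engine}'")
    return requested

print(pick_model_format("xgboost"))            # json
print(pick_model_format("sklearn", "joblib"))  # joblib
```

The transformers row is left out of the lookup because those models are loaded from HuggingFace Hub and take no format.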

Configuration Methods¶

Interactive Mode¶

The default. Run the generator and answer the prompts:

ml-container-creator

CLI Options¶

Use command-line flags for non-interactive generation:

ml-container-creator my-project \
  --deployment-config=http-flask \
  --engine=sklearn \
  --model-format=pkl \
  --deployment-target=managed-inference \
  --instance-type=ml.m5.large \
  --build-target=codebuild \
  --skip-prompts

In the example above, my-project is the project name, passed as a positional argument (priority 2 in the precedence chain). It can also be set with --project-name.

Environment Variables¶

Set infrastructure parameters via the shell environment:

export ML_INSTANCE_TYPE="ml.g5.2xlarge"
export AWS_REGION="us-west-2"
export AWS_ROLE="arn:aws:iam::123456789012:role/SageMakerRole"

ml-container-creator --deployment-config=transformers-vllm --skip-prompts

Only four environment variables are supported: ML_INSTANCE_TYPE, AWS_REGION, AWS_ROLE, and ML_CONTAINER_CREATOR_CONFIG. Core parameters must come from CLI options or config files.
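As a rough illustration, the four variables can be thought of as a fixed mapping onto configuration keys. In the Python sketch below, `instanceType`, `awsRegion`, and `awsRoleArn` are the key names used in the config file example on this page; `configFile` and the function name `config_from_env` are hypothetical.

```python
import os
from typing import Dict, Mapping

# Environment variable -> config key. "configFile" is a hypothetical key name.
ENV_TO_KEY = {
    "ML_INSTANCE_TYPE": "instanceType",
    "AWS_REGION": "awsRegion",
    "AWS_ROLE": "awsRoleArn",
    "ML_CONTAINER_CREATOR_CONFIG": "configFile",
}

def config_from_env(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the supported variables, skipping unset or empty ones."""
    return {key: environ[name] for name, key in ENV_TO_KEY.items() if environ.get(name)}

sample = {"AWS_REGION": "us-west-2", "ML_INSTANCE_TYPE": "ml.g5.2xlarge", "HOME": "/root"}
print(config_from_env(sample))
# {'instanceType': 'ml.g5.2xlarge', 'awsRegion': 'us-west-2'}
```

Any variable outside the mapping, such as `HOME` in the sample, is ignored.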

Configuration Files¶

Three file-based sources are supported, in descending precedence:

CLI config file (--config flag or ML_CONTAINER_CREATOR_CONFIG env var):

ml-container-creator --config=production.json --skip-prompts

Custom config file (config/mcp.json, auto-discovered):

{
  "projectName": "my-ml-project",
  "deploymentConfig": "http-flask",
  "engine": "sklearn",
  "modelFormat": "pkl",
  "includeSampleModel": false,
  "includeTesting": true,
  "deploymentTarget": "managed-inference",
  "buildTarget": "codebuild",
  "instanceType": "ml.m5.large",
  "awsRegion": "us-east-1",
  "awsRoleArn": "arn:aws:iam::123456789012:role/SageMakerRole"
}

Package.json section (infrastructure and project settings only):

{
  "name": "my-project",
  "ml-container-creator": {
    "awsRegion": "us-west-2",
    "awsRoleArn": "arn:aws:iam::123456789012:role/MyProjectRole",
    "projectName": "my-ml-service"
  }
}
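To make the "infrastructure and project settings only" restriction concrete, the Python sketch below reads the `ml-container-creator` section of a package.json document and keeps only the keys the parameter table marks as supported there. It is an assumption-laden illustration: `projectDir` and `configFile` are hypothetical key names, and the documentation does not say whether the generator ignores or rejects other keys (this sketch drops them).

```python
import json
from typing import Any, Dict

# Keys assumed to be accepted in package.json; "projectDir" and "configFile"
# are hypothetical names, the others appear in the example above.
ALLOWED_KEYS = {"awsRegion", "awsRoleArn", "projectName", "projectDir", "configFile"}

def package_json_settings(package_json_text: str) -> Dict[str, Any]:
    """Return the supported settings from the "ml-container-creator" section."""
    package = json.loads(package_json_text)
    section = package.get("ml-container-creator", {})
    return {key: value for key, value in section.items() if key in ALLOWED_KEYS}

text = """{
  "name": "my-project",
  "ml-container-creator": {
    "awsRegion": "us-west-2",
    "projectName": "my-ml-service",
    "deploymentConfig": "http-flask"
  }
}"""
print(package_json_settings(text))
# {'awsRegion': 'us-west-2', 'projectName': 'my-ml-service'}
```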

CLI Commands¶

Beyond project generation, MCC provides configuration management commands:

Command	Description
`ml-container-creator configure`	Interactive configuration file setup
`ml-container-creator generate-empty-config`	Create an empty config file template
`ml-container-creator help`	Show all options and examples

HuggingFace Authentication¶

When deploying transformer models, you may need to authenticate with HuggingFace to access private or gated models. Public models like openai/gpt-oss-20b do not require authentication.

Authentication is required for:

Private models in your HuggingFace account
Gated models requiring license agreement (e.g., Llama 2, Llama 3)
Avoiding rate limits on public models

Providing Your Token¶

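CLI option (the single quotes stop the shell from expanding $HF_TOKEN, so the generator receives the reference itself):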

ml-container-creator my-llm-project \
  --deployment-config=transformers-vllm \
  --model-name=meta-llama/Llama-2-7b-hf \
  --hf-token='$HF_TOKEN' \
  --skip-prompts

Config file:

{
  "deploymentConfig": "transformers-vllm",
  "modelName": "meta-llama/Llama-2-7b-hf",
  "hfToken": "$HF_TOKEN"
}

Interactive prompt: When you enter a custom model ID during generation, you will be prompted for a token. You can enter the token directly, reference $HF_TOKEN, or leave it empty for public models.
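The three kinds of input described here (empty, a `$HF_TOKEN` reference, or a literal token) can be sketched as a small resolution function. The Python below is illustrative only: `resolve_hf_token` is a hypothetical name, and the generator may resolve the reference at a different stage. The error text matches the symptom listed under Troubleshooting Authentication.

```python
import os
from typing import Mapping, Optional

def resolve_hf_token(value: Optional[str],
                     environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Turn a configured hfToken value into the token to use, or None."""
    if not value:
        return None  # empty: public model, no authentication
    if value == "$HF_TOKEN":
        token = environ.get("HF_TOKEN")
        if not token:
            raise RuntimeError("HF_TOKEN environment variable not set")
        return token
    return value  # a literal token entered directly

print(resolve_hf_token(""))                                       # None
print(resolve_hf_token("$HF_TOKEN", {"HF_TOKEN": "hf_example"}))  # hf_example
```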

Security¶

Tokens are baked into the Docker image. Anyone with access to the image can extract the token via docker inspect.

Use $HF_TOKEN (environment variable reference) in config files and CI/CD pipelines instead of literal tokens.
Never commit tokens to version control.
Use read-only tokens with minimal permissions.
Rotate tokens periodically. Generate new ones at huggingface.co/settings/tokens.

Troubleshooting Authentication¶

Symptom	Cause	Fix
"Repository not found" or "Access denied"	Invalid token, expired token, or license not accepted	Verify token at huggingface.co; accept model license
"HF_TOKEN environment variable not set"	`$HF_TOKEN` referenced but not exported	`export HF_TOKEN=hf_...`
Container builds but fails at runtime	Model requires auth but no token provided	Rebuild with `--hf-token`

Validation¶

The generator validates configuration at multiple levels:

Parameter Validation¶

Invalid, incompatible, or missing parameters are rejected with an error message:

# Invalid deployment config
ml-container-creator --deployment-config=invalid --skip-prompts
# Error: invalid not implemented yet.

# Incompatible model format
ml-container-creator --deployment-config=http-flask --engine=sklearn --model-format=json --skip-prompts
# Error: Unsupported model format 'json' for engine 'sklearn'

# Invalid ARN
ml-container-creator --role-arn=invalid-arn --skip-prompts
# Error: Invalid AWS Role ARN format

# Missing required parameter
ml-container-creator --skip-prompts
# Error: Required parameter 'deploymentConfig' is missing

Some combinations are rejected with validation errors: a traditional ML engine with an LLM deployment config, a model format with a transformer config, or a sample model with a transformer config.
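The three incompatible combinations can be written down as explicit checks. The Python sketch below assumes that "transformer config" means a deployment config whose identifier starts with `transformers-`; the function name `find_conflicts` and the message wording are hypothetical and do not reproduce the generator's actual output.

```python
from typing import List, Optional

def find_conflicts(deployment_config: str,
                   engine: Optional[str] = None,
                   model_format: Optional[str] = None,
                   include_sample: bool = False) -> List[str]:
    """Return one message per incompatible combination (empty list if none)."""
    # Assumption: only "transformers-*" configs count as transformer configs.
    if not deployment_config.startswith("transformers-"):
        return []
    conflicts = []
    if engine is not None:
        conflicts.append(f"engine '{engine}' cannot be combined with '{deployment_config}'")
    if model_format is not None:
        conflicts.append(f"model format '{model_format}' cannot be combined with '{deployment_config}'")
    if include_sample:
        conflicts.append(f"a sample model cannot be combined with '{deployment_config}'")
    return conflicts

print(len(find_conflicts("transformers-vllm", engine="sklearn", include_sample=True)))  # 2
print(find_conflicts("http-flask", engine="sklearn", model_format="pkl"))               # []
```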

Schema-Driven Validation¶

Schema-driven validation checks generated AWS API payloads against the actual AWS service model (service-2.json) files. It catches issues that parameter validation cannot: enum values that AWS has deprecated, type mismatches in nested structures, missing required fields for specific API operations, and cross-cutting consistency problems between your Dockerfile, deploy scripts, and configuration.

Setup¶

Download the AWS service models into your local schema registry:

ml-container-creator bootstrap sync-schemas

This downloads service models for SageMaker, IAM, ECR, and S3 from the AWS SDK source and stores them at ~/.ml-container-creator/schemas/. Re-run periodically to pick up new enum values and API changes.

When Validation Runs¶

Schema validation runs at two points:

At generation time (non-blocking): After the generator produces deploy scripts, it validates the constructed payloads and prints any issues as warnings. Generation still completes; the warnings are informational.

At pre-deploy time (blocking): Run ./do/validate before deploying to catch all issues, including those introduced by manual edits to do/config after generation.

# Run full schema validation
./do/validate

# JSON output for CI integration
./do/validate --format=json

# Include smart-mode validators (future MCP integration)
./do/validate --smart

What It Checks¶

Check	Example Issue Caught
Enum values	`InferenceAmiVersion` set to a value AWS no longer accepts
Type mismatches	`InitialInstanceCount` set to a string instead of integer
Required fields	`EndpointConfigName` missing from CreateEndpointConfig payload
Pattern constraints	Role ARN not matching `arn:aws:iam::\d{12}:role/.+`
Range constraints	`VolumeSizeInGB` below minimum or above maximum
GPU consistency	`NumberOfAcceleratorDevicesRequired` doesn't match instance GPU count
Tensor parallelism	`VLLM_TENSOR_PARALLEL_SIZE`, inference component (IC) GPU count, and instance GPU count do not all match
CUDA compatibility	Base image requires CUDA 12 but instance only supports CUDA 11
Model source requirements	`jumpstart-hub` source without `HubAccessConfig.HubContentArn`
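Three of the simpler checks in the table (pattern, type, and required-field checks) can be sketched in a few lines of Python. This is a hand-written illustration, not the validator itself: the real checks are driven by the downloaded service models, and the helper names here are hypothetical. The regular expression is the one shown in the table.

```python
import re
from typing import Any, Dict, List

# Pattern constraint for role ARNs, as given in the table above.
ROLE_ARN_PATTERN = re.compile(r"arn:aws:iam::\d{12}:role/.+")

def role_arn_is_valid(arn: str) -> bool:
    """Pattern check: the whole string must match, not just a prefix."""
    return ROLE_ARN_PATTERN.fullmatch(arn) is not None

def is_integer(value: Any) -> bool:
    """Type check; bool is a subclass of int in Python, so exclude it explicitly."""
    return isinstance(value, int) and not isinstance(value, bool)

def missing_fields(payload: Dict[str, Any], required: List[str]) -> List[str]:
    """Required-field check: names from `required` that the payload lacks."""
    return [name for name in required if name not in payload]

print(role_arn_is_valid("arn:aws:iam::123456789012:role/SageMakerRole"))  # True
print(role_arn_is_valid("invalid-arn"))                                   # False
print(is_integer("1"))                                                    # False
print(missing_fields({"ProductionVariants": []},
                     ["EndpointConfigName", "ProductionVariants"]))       # ['EndpointConfigName']
```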

Exit Codes¶

Code	Meaning
0	Validation passed (no errors, may have warnings)
1	Validation failed (one or more errors found)
2	Validation could not run (schema registry missing)
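In a CI job, the three exit codes usually map to different follow-up actions. The Python sketch below shows one way to wrap the command; the outcome labels and function names are hypothetical, and only the command, the `--format=json` flag, and the exit codes come from this page.

```python
import subprocess
from typing import Sequence

# Exit code -> outcome label, following the table above.
OUTCOMES = {
    0: "passed",         # no errors, warnings possible
    1: "failed",         # one or more errors found
    2: "could-not-run",  # schema registry missing
}

def interpret_exit_code(code: int) -> str:
    """Translate a validator exit code into a label; unlisted codes are 'unknown'."""
    return OUTCOMES.get(code, "unknown")

def run_validation(command: Sequence[str] = ("./do/validate", "--format=json")) -> str:
    """Run the validator from the project root and classify its exit code."""
    completed = subprocess.run(list(command), check=False)
    return interpret_exit_code(completed.returncode)

print(interpret_exit_code(0))  # passed
print(interpret_exit_code(2))  # could-not-run
```

A pipeline might block deployment on `failed`, and on `could-not-run` first run `ml-container-creator bootstrap sync-schemas` and retry.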

Keeping Schemas Current¶

The schema registry becomes stale as AWS adds new enum values and instance types. If schemas are older than 30 days, validation prints a warning:

⚠️  Schema registry is 45 days old. Run `ml-container-creator bootstrap sync-schemas` to update.

Suppress this warning with --ignore-staleness if you're working offline.
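The 30-day rule amounts to a simple age comparison. The Python sketch below shows the idea; how the registry records its last sync time is not documented here, so the `last_synced` argument and the function name `staleness_warning` are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def staleness_warning(last_synced: datetime, now: datetime,
                      max_age_days: int = 30) -> Optional[str]:
    """Return a warning if the schemas are older than max_age_days, else None."""
    age_days = (now - last_synced).days
    if age_days <= max_age_days:
        return None
    return (f"Schema registry is {age_days} days old. "
            "Run `ml-container-creator bootstrap sync-schemas` to update.")

now = datetime(2025, 3, 1, tzinfo=timezone.utc)
print(staleness_warning(now - timedelta(days=45), now))
# Schema registry is 45 days old. Run `ml-container-creator bootstrap sync-schemas` to update.
print(staleness_warning(now - timedelta(days=10), now))  # None
```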

Catalog Validation¶

Validate that catalog entries use valid AWS enum values:

npm run validate:catalogs

This checks fields like inferenceAmiVersion in model-servers.json against the SageMaker service model's enum set. Run this as a CI gate when updating catalog files.
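Conceptually, the catalog check compares each entry's field against an enum list taken from the service model. The Python sketch below shows that comparison on stand-in data. It assumes the service model keeps enums under a top-level `shapes` object; the shape name, the enum values, the catalog layout, and the function names are all hypothetical and do not reflect the real model-servers.json.

```python
from typing import Any, Dict, Iterable, List, Set

def enum_values(service_model: Dict[str, Any], shape_name: str) -> Set[str]:
    """Read a shape's enum list from a service-2.json style document."""
    return set(service_model["shapes"][shape_name].get("enum", []))

def invalid_entries(entries: Iterable[Dict[str, Any]], field: str,
                    allowed: Set[str]) -> List[str]:
    """Names of catalog entries whose field is set to a value outside the enum."""
    return [entry["name"] for entry in entries
            if field in entry and entry[field] not in allowed]

# Stand-in data: shape name, enum values, and catalog layout are hypothetical.
service_model = {"shapes": {"ExampleAmiVersion": {"type": "string",
                                                  "enum": ["ami-v2", "ami-v3"]}}}
catalog = [
    {"name": "server-a", "inferenceAmiVersion": "ami-v3"},
    {"name": "server-b", "inferenceAmiVersion": "ami-v1"},
    {"name": "server-c"},  # field not set, so nothing to check
]
allowed = enum_values(service_model, "ExampleAmiVersion")
print(invalid_entries(catalog, "inferenceAmiVersion", allowed))  # ['server-b']
```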

Skipping Validation¶

Pass --no-validate to the generator to skip schema validation at generation time:

ml-container-creator my-project --deployment-config=transformers-vllm --no-validate --skip-prompts