ML Container Creator
ML Container Creator (MCC) is the bring-your-own-container (BYOC) toolkit for Amazon SageMaker AI, covering the path from model selection to a running endpoint in one workflow. You select a model, a serving framework, and a deployment target, and MCC generates a complete project with a Dockerfile, serving code, lifecycle scripts, and tests. You can then iterate from the same CLI: fine-tune, hot-swap adapters, benchmark, and operate.
MCC validates specific model + server + instance combinations end to end, from build through fine-tuning and adapter serving. If your configuration is in the Supported Models catalog, every lifecycle step has been validated for it. If it is not, MCC still generates a project, but you take on the validation yourself.
What It Supports
Serving Architectures
| Architecture | Backends | Use Case |
|---|---|---|
| Transformers | vLLM, SGLang, TensorRT-LLM, LMI, DJL | Large language models |
| HTTP | Flask, FastAPI | Predictive models (sklearn, XGBoost, TensorFlow) |
| Triton | FIL, ONNX Runtime, TensorFlow, PyTorch, vLLM, TensorRT-LLM, Python | Multi-framework model serving via NVIDIA Triton |
| Diffusors | vLLM | Diffusion models (image generation) |
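The table above can be treated as a lookup when sanity-checking an architecture and backend pairing before generating a project. The following is a minimal sketch, not part of MCC: `backends_for` and `supports` are hypothetical helper names, the lowercase architecture keys are illustrative labels rather than MCC flags, and "ONNX Runtime" is written as `ONNX-Runtime` so that each backend stays a single word.

```shell
# Hypothetical lookup built from the Serving Architectures table.
# The lowercase keys are illustrative labels, not MCC flags or config values.
backends_for() {
  case "$1" in
    transformers) echo "vLLM SGLang TensorRT-LLM LMI DJL" ;;
    http)         echo "Flask FastAPI" ;;
    triton)       echo "FIL ONNX-Runtime TensorFlow PyTorch vLLM TensorRT-LLM Python" ;;
    diffusors)    echo "vLLM" ;;
    *)            return 1 ;;  # unknown architecture
  esac
}

# Succeeds only if the backend ($2) is listed for the architecture ($1).
supports() {
  local b
  for b in $(backends_for "$1"); do
    [ "$b" = "$2" ] && return 0
  done
  return 1
}
```

For example, `supports transformers SGLang` succeeds, while `supports http vLLM` fails because vLLM is not listed for the HTTP architecture.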
Deployment Targets
| Target | Description |
|---|---|
| Managed Inference | SageMaker AI real-time endpoints |
| Async Inference | S3-based asynchronous processing with SNS notifications |
| Batch Transform | S3-to-S3 dataset processing |
| HyperPod EKS | Kubernetes deployment on SageMaker AI HyperPod clusters |
Full Lifecycle
Every generated project includes `do/` scripts for the complete iteration loop:

```shell
./do/build        # Build container
./do/push         # Push to ECR
./do/deploy       # Deploy to SageMaker AI
./do/test         # Validate inference
./do/tune         # Fine-tune (managed serverless)
./do/adapter add  # Hot-swap LoRA adapter
./do/benchmark    # Latency + throughput measurement
./do/clean        # Tear down everything
```
See Deployment & Inference for the full script reference.
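Because each lifecycle step is a separate script, the steps can be chained so that a failure stops the loop before later steps run. The following is a minimal sketch under stated assumptions: `run_lifecycle` is a hypothetical wrapper, not something MCC generates, and it assumes each `do/` script exits non-zero on failure and takes no required arguments.

```shell
# Hypothetical wrapper: run the named do/ steps in order inside a generated
# project directory, stopping at the first step that fails.
# Assumes each do/ script signals failure with a non-zero exit code.
run_lifecycle() {
  local project_dir="$1"
  shift
  local step
  for step in "$@"; do
    echo "==> do/${step}"
    # Run in a subshell so the caller's working directory is unchanged.
    if ! (cd "$project_dir" && "./do/${step}"); then
      echo "do/${step} failed; stopping" >&2
      return 1
    fi
  done
}
```

A typical call would be `run_lifecycle ./my-project build push deploy test`, where `./my-project` stands in for the path of a generated project.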
Quick Start
See Getting Started for prerequisites, installation details, and a full walkthrough.
Documentation
User Guide
- Getting Started: Install MCC and deploy your first model
- How It Works: Architecture, prompt flow, and generated project structure
- Configuration: CLI flags, environment variables, config files, and MCP
- Deployment & Inference: Build paths, deployment targets, and lifecycle scripts
- Fine-Tuning: Managed tuning with `do/tune` and LoRA adapter serving
- Examples: End-to-end walkthroughs for each architecture
- Troubleshooting: Common issues and solutions
Operations Guide
- CI Integration: Automated E2E validation across golden-path models
- Benchmarking: Performance measurement with SageMaker AI Benchmarking
- Contributing: Development setup and contribution workflow
License
Apache-2.0. See CONTRIBUTING for security issue reporting.