Skip to content

Getting Started

This guide will walk you through installing ML Container Creator (MCC), configuring MCC for first-time use, and the various methods to creating your first SageMaker-ready container.

Prerequisites

Before you begin, ensure you have:

Verify Prerequisites

# Check Node.js version
node --version  # Should be 24.11.1 or higher

# Check Python version
python --version  # Should be 3.8 or higher

# Check Docker (local builds only)
docker --version

# Check AWS CLI
aws --version

# Verify AWS credentials
aws sts get-caller-identity

IAM Policy Document

Coming Soon

Documentation for this feature is in progress.

Installation

# Install Yeoman
npm install -g yo

# Install the generator
git clone https://github.com/awslabs/ml-container-creator.git
cd ml-container-creator

# Install Dependencies and Link Generator
npm install
npm link

# Verify installation
yo --generators
# Should show "ml-container-creator" in the list

# Run tests to verify setup (for contributors)
npm test

# Generate your project
yo ml-container-creator

Predictive ML

Let's create a simple scikit-learn model container. All predictive ML containers feature a very basic regression model trained on the Abalone dataset. These sample models are for demonstration purposes only. The instructions assume you have a model object saved locally to be copied to your container in the format specified for your chosen framework.

For LLMs, check out the section on Generative AI.

Step 1: Prepare Your Model

First, save a trained model in the format you plan to use for your deployment. Each predictive framework supports different model format types. Check the "supported frameworks" page for more details.

Coming Soon

Link to predictive framework support page.

In this example, we'll use the sample Abalone classifier. This deployment option trains a light-weight and overly simplistic regression model using the selected predictive ML framework, and saves the model object in the format specified. The model file is automatically included in the container files for simplicity.

Step 2: Generate Container Project

Run the generator using the yo command and selecting the generator from the provided list. Alternatively, specify the generator inline: yo ml-container-creator. You'll be prompted with questions. Each option creates conditional branching logic custom to the selected values. For a basic scikit-learn container using the default regression model, follow the prompts as defined below. You'll want to do this in a new directory.

(base) frgud@842f5776eab6 ml-container-creator % mkdir scikit-test
(base) frgud@842f5776eab6 ml-container-creator % cd scikit-test 
(base) frgud@842f5776eab6 scikit-test % yo ml-container-creator scikit-test-project

πŸ“š Registry System Initialized
   β€’ Framework Registry: Loaded
   β€’ Model Registry: Loaded
   β€’ Instance Accelerator Mapping: Loaded
   β€’ Environment Variable Validation: Enabled

βš™οΈ  Configuration will be collected from prompts and merged with:
   β€’ Project name: scikit-test-project

πŸ”§ Core Configuration
βœ” Which ML framework are you using? sklearn
βœ” In which format is your model serialized? pkl
βœ” Which model server are you serving with? flask

πŸ“¦ Module Selection
βœ” Include sample Abalone classifier? Yes
βœ” Include test suite? Yes
βœ” Test type? local-model-cli, local-model-server, hosted-model-endpoint

πŸ’ͺ Infrastructure & Performance
βœ” Deployment target? sagemaker
βœ” Instance type? CPU-optimized (ml.m6g.large)
βœ” Target AWS region? us-east-1
βœ” AWS IAM Role ARN for SageMaker execution (optional)? <IAM ROLE>

⚠️  Warning: Building locally for SageMaker deployment
   Building this image locally may result in `exec format error` when deploying
   to SageMaker if your local architecture differs from the target instance.
   Ensure you have set the appropriate --platform flag in your Dockerfile
   (e.g., --platform=linux/amd64 for x86_64 instances, --platform=linux/arm64 for ARM).
   Consider using CodeBuild for architecture-independent builds.


πŸ“‹ Project Configuration

πŸš€ Manual Deployment

☁️ The following steps assume authentication to an AWS account.

πŸ’° The following commands will incur charges to your AWS account.
         ./build_and_push.sh -- Builds the image and pushes to ECR.
         ./deploy.sh -- Deploys the image to a SageMaker AI Managed Inference Endpoint.
                 deploy.sh needs a valid IAM Role ARN as a parameter.
   create scikit-test-project/Dockerfile
   create scikit-test-project/nginx-predictors.conf
   create scikit-test-project/requirements.txt
   create scikit-test-project/code/model_handler.py
   create scikit-test-project/code/serve.py
   create scikit-test-project/code/serving.properties
   create scikit-test-project/code/start_server.py
   create scikit-test-project/deploy/build_and_push.sh
   create scikit-test-project/deploy/deploy.sh
   create scikit-test-project/sample_model/test_inference.py
   create scikit-test-project/sample_model/train_abalone.py
   create scikit-test-project/test/test_endpoint.sh
   create scikit-test-project/test/test_local_image.sh
   create scikit-test-project/test/test_model_handler.py
   create scikit-test-project/code/flask/gunicorn_config.py
   create scikit-test-project/code/flask/wsgi.py

No change to package.json was detected. No package manager install will be executed.

πŸ€– Training sample model...
This will generate the model file needed for Docker build.
Model trained and saved. Test score: 0.531
Model saved.
βœ… Sample model training completed successfully!
πŸ“ Model file saved in: /User/frgud/../scikit-test/scikit-test-project/sample_model

Project Structure

Your generated project contains:

scikit-test-project/
β”œβ”€β”€ Dockerfile              # Container definition
β”œβ”€β”€ requirements.txt        # Python dependencies
β”œβ”€β”€ nginx-predictors.conf   # Nginx configuration
β”œβ”€β”€ code/
β”‚   β”œβ”€β”€ model.pkl               # Your trained model
β”‚   β”œβ”€β”€ model_handler.py        # Model loading and inference
β”‚   β”œβ”€β”€ serve.py                # Flask server
β”‚   └── flask/                  # Flask-specific code
|       β”œβ”€β”€ gunicorn_config.py    # Gunicorn Config (Flask only)
|       └── wsgi.py               # Creates Flask app
β”œβ”€β”€ deploy/
β”‚   β”œβ”€β”€ build_and_push.sh  # Build and push to ECR
β”‚   └── deploy.sh          # Deploy to SageMaker
└── test/
    β”œβ”€β”€ test_endpoint.sh       # Test hosted endpoint
    β”œβ”€β”€ test_local_image.sh    # Test local container
    └── test_model_handler.py  # Unit tests

Step 3: Add Your Model

If you have your own model saved in the pkl format, modify the generated Dockerfile accordingly.

# COPY sample_model/abalone_model.pkl /opt/ml/model/
COPY my_local_model/model.pkl /opt/ml/model/
The Dockerfile will be different based on the values selected in Step 2, but the step remains the same so long as you are supplying your own already trained model.

Model Source Locations

At this time, predictive models can only be sourced from local storage.

Step 4: Test the Container

This particular flow yields three different tests that can be run. The first is a test of the image build process, the second tests the model's ability to receive requests and respond, and the third tests the deployed endpoint using the AWS SageMaker CLI.

Testing Data

Data used to test these capabilities come directly from the Abalone dataset. It is not a true test of a model's predictive performance. Rather it is a test of the container's to serve inference, the model loader's predict method, and the endpoint's ability to receive requests on the /invocations endpoint. If you provide your own model, you may have to modify these scripts with your own test data.

Test Container Locally:

(base) frgud@842f5776eab6 scikit-test-project % ./test/test_local_image.sh 
Building Docker image...
[+] Building 34.9s (9/20)                                                             docker:desktop-linux
...
View build details: docker-desktop://dashboard/build/desktop-linux/desktop-linux/ztddsyganjri4bak78mgdjb5g
Stopping any existing container...
Starting container on port 8080...
95af750298500a09416ca1e2f8fd83adde66387be933c98bd3b897ff2a4383db
Waiting for container to start...
Testing health check endpoint...
{"status":"healthy"}

Testing inference endpoint...
{"predictions":[12.86]}

Container logs:
[2026-01-29 22:33:10 +0000] [7] [INFO] Starting gunicorn 23.0.0
[2026-01-29 22:33:10 +0000] [7] [INFO] Listening at: http://0.0.0.0:8080 (7)
[2026-01-29 22:33:10 +0000] [7] [INFO] Using worker: sync
[2026-01-29 22:33:11 +0000] [21] [INFO] Booting worker with pid: 21
INFO:serve:Loading model from /opt/ml/model
INFO:model_handler:Loading model from /opt/ml/model/abalone_model.pkl
[2026-01-29 22:33:11 +0000] [22] [INFO] Booting worker with pid: 22
INFO:serve:Loading model from /opt/ml/model
INFO:model_handler:Loading model from /opt/ml/model/abalone_model.pkl
[2026-01-29 22:33:11 +0000] [23] [INFO] Booting worker with pid: 23
INFO:serve:Loading model from /opt/ml/model
INFO:model_handler:Loading model from /opt/ml/model/abalone_model.pkl
[2026-01-29 22:33:11 +0000] [55] [INFO] Booting worker with pid: 55
INFO:serve:Loading model from /opt/ml/model
INFO:model_handler:Loading model from /opt/ml/model/abalone_model.pkl
INFO:model_handler:SKLearn model loaded successfully
INFO:serve:Model loaded successfully
INFO:model_handler:SKLearn model loaded successfully
INFO:serve:Model loaded successfully
INFO:model_handler:SKLearn model loaded successfully
INFO:serve:Model loaded successfully
INFO:model_handler:SKLearn model loaded successfully
INFO:serve:Model loaded successfully
192.168.65.1 - - [29/Jan/2026:22:33:20 +0000] "GET /ping HTTP/1.1" 200 21 "-" "curl/8.7.1"
/usr/local/lib/python3.12/site-packages/sklearn/utils/validation.py:2749: UserWarning: X does not have valid feature names, but RandomForestRegressor was fitted with feature names
  warnings.warn(
192.168.65.1 - - [29/Jan/2026:22:33:20 +0000] "POST /invocations HTTP/1.1" 200 24 "-" "curl/8.7.1"

Cleaning up...
sklearn-test
sklearn-test
Test complete!
(base) frgud@842f5776eab6 scikit-test-project % 
This test builds the container locally and stands it up as a temporary process behind localhost:8080. It then sends a request to the /ping endpoint, followed by a request to the /invocations endpoint.

Test Model Handler:

(base) frgud@842f5776eab6 scikit-test-project % python ./test/test_model_handler.py --model-path ./sample_model --input-data '[[1, 0.455, 0.365, 0.095, 0.514, 0.2245, 0.101, 0.15]]'
Loading model from: ./sample_model
INFO:model_handler:Loading model from ./sample_model/abalone_model.pkl
INFO:model_handler:SKLearn model loaded successfully
Running inference...
/Users/frgud/.local/share/mise/installs/python/3.12.11/lib/python3.12/site-packages/sklearn/utils/validation.py:2749: UserWarning: X does not have valid feature names, but RandomForestRegressor was fitted with feature names
  warnings.warn(

Result:
{
  "predictions": [
    12.86
  ]
}
(base) frgud@842f5776eab6 scikit-test-project % 
This test executes the Python code that runs within the container. If you require custom input preprocessing and post-processing, you will have to modify the code/model_handler.py file. The model handler test can be extended to test how the container receives and responds to inference requests at the model layer.

Test Endpoint:

We will come back to this once we deploy the model in the next step.

Step 5: Deploy to SageMaker

5.1: Build and Push to ECR

# Build and push Docker image to ECR
(base) frgud@842f5776eab6 scikit-test-project % ./deploy/build_and_push.sh
Building Docker image for project: scikit-test-project...
[+] Building 0.3s (21/21) 
...
View build details: docker-desktop://dashboard/build/desktop-linux/desktop-linux/uz31052ank7nn4xgmv2pltmm1
Logging into ECR...

WARNING! Your credentials are stored unencrypted in '/Users/frgud/.docker/config.json'.
Configure a credential helper to remove this warning. See
https://docs.docker.com/go/credential-store/

Login Succeeded
πŸ“¦ Checking ECR repository...
{
    "repositories": [
        {
            "repositoryArn": "<REPOSITORY_ARN>",
            "registryId": "<REGISTRY_ID>",
            "repositoryName": "ml-container-creator",
            "repositoryUri": "<ECR_URI>/ml-container-creator",
            "createdAt": "2025-10-29T22:02:34.906000-04:00",
            "imageTagMutability": "MUTABLE",
            "imageScanningConfiguration": {
                "scanOnPush": false
            },
            "encryptionConfiguration": {
                "encryptionType": "AES256"
            }
        }
    ]
}
Pushing images to ECR repository: ml-container-creator...
The push refers to repository [<ECR_URI>/ml-container-creator]
...
βœ… Images successfully pushed to ECR:
  - <ECR_URI>:scikit-test-project-20260129-175045 (timestamped build)
  - <ECR_URI>:scikit-test-project-latest (project latest)
  - <ECR_URI>:latest (global latest)
(base) frgud@842f5776eab6 scikit-test-project % 

5.2: Deploy to SageMaker AI

(base) frgud@842f5776eab6 scikit-test-project % ./deploy/deploy.sh 
Using configured execution role: <EXECUTION_ROLE_ARN>
πŸš€ Deploying locally built image to SageMaker...
Pulling image from ECR...
latest: Pulling from ml-container-creator
Digest: <IMAGE_DIGEST>
Status: Image is up to date for <ECR_URI>/ml-container-creator:latest
<ECR_URI>/ml-container-creator:latest
Creating SageMaker model: scikit-test-project-model-1769727239
{
    "ModelArn": "arn:aws:sagemaker:us-east-1:<ACCOUNT_NO>:model/scikit-test-project-model-1769727239"
}
Creating endpoint configuration: scikit-test-project-endpoint-config-1769727239
{
    "EndpointConfigArn": "arn:aws:sagemaker:us-east-1:<ACCOUNT_NO>:endpoint-config/<ENDPOINT_CONFIG_NAME>"
}
Creating endpoint: <ENDPOINT_NAME>
{
    "EndpointArn": "arn:aws:sagemaker:us-east-1:<ACCOUNT_NO>:endpoint/<ENDPOINT_NAME>"
}
Waiting for endpoint to be in service...
Deployment complete!
Endpoint name: <ENDPOINT_NAME>
You can test the endpoint with:
./test_endpoint.sh <ENDPOINT_NAME>
(base) frgud@842f5776eab6 scikit-test-project % 

5.3: Test the Endpoint

(base) frgud@842f5776eab6 scikit-test-project % ./test/test_endpoint.sh <ENDPOINT_NAME>
Testing SageMaker endpoint: <ENDPOINT_NAME>
Checking endpoint status...
InService
Testing inference endpoint...
{
    "ContentType": "application/json",
    "InvokedProductionVariant": "primary"
}
Response:
{
  "predictions": [
    12.86
  ]
}

Cleaning up files...
Test complete!
(base) frgud@842f5776eab6 scikit-test-project % 

Generative AI

Step 1: Generate Container Project

Just as with the predictive scenario, we can use MCC to generate container assets for transformer-based architectures and LLMs. In this example, we'll use MCC to deploy openai/gpt-oss-20b onto a SageMaker AI managed inference endpoint using the SGLang serving framework.

(base) frgud@842f5776eab6 transformers-test % yo ml-container-creator sglang-gptoss-test 

πŸ“š Registry System Initialized
   β€’ Framework Registry: Loaded
   β€’ Model Registry: Loaded
   β€’ Instance Accelerator Mapping: Loaded
   β€’ Environment Variable Validation: Enabled

βš™οΈ  Configuration will be collected from prompts and merged with:
   β€’ Project name: sglang-gptoss-test

πŸ”§ Core Configuration
βœ” Which ML framework are you using? transformers
βœ” Which model do you want to use? openai/gpt-oss-20b
βœ” Which model server are you serving with? sglang

πŸ” Fetching model information for: openai/gpt-oss-20b
   βœ… Found on HuggingFace Hub

πŸ“‹ Model Information:
   β€’ Model ID: openai/gpt-oss-20b
   β€’ Chat Template: ❌ Not available
     (Chat endpoints may require manual configuration)
   β€’ Sources: HuggingFace_Hub_API

πŸ“¦ Module Selection
βœ” Include test suite? Yes
βœ” Test type? hosted-model-endpoint

πŸ’ͺ Infrastructure & Performance
βœ” Deployment target? codebuild (recommended)
βœ” CodeBuild compute type? BUILD_GENERAL1_MEDIUM
βœ” Instance type? GPU-optimized (ml.g6.12xlarge)
βœ” Target AWS region? us-east-1
βœ” AWS IAM Role ARN for SageMaker execution (optional)? <EXECUTION_ROLE_ARN>

πŸ“‹ Project Configuration

πŸš€ Manual Deployment

☁️ The following steps assume authentication to an AWS account.

πŸ’° The following commands will incur charges to your AWS account.
         ./build_and_push.sh -- Builds the image and pushes to ECR.
         ./deploy.sh -- Deploys the image to a SageMaker AI Managed Inference Endpoint.
                 deploy.sh needs a valid IAM Role ARN as a parameter.
   create sglang-gptoss-test/Dockerfile
   create sglang-gptoss-test/IAM_PERMISSIONS.md
   create sglang-gptoss-test/buildspec.yml
   create sglang-gptoss-test/code/serve
   create sglang-gptoss-test/code/serving.properties
   create sglang-gptoss-test/deploy/deploy.sh
   create sglang-gptoss-test/deploy/submit_build.sh
   create sglang-gptoss-test/deploy/upload_to_s3.sh
   create sglang-gptoss-test/test/test_endpoint.sh

No change to package.json was detected. No package manager install will be executed.
(base) frgud@842f5776eab6 transformers-test % 

Project Structure

Your generated project contains:

sglang-gptoss-test/
    β”œβ”€β”€ Dockerfile                    # Container definition with SGLang runtime
    β”œβ”€β”€ IAM_PERMISSIONS.md            # Required AWS IAM policies for AWS CodeBuild deployment
    β”œβ”€β”€ buildspec.yml                 # AWS CodeBuild configuration for CI/CD
    β”œβ”€β”€ code/
    β”‚   β”œβ”€β”€ serve                     # Shell entrypoint script that launches SGLang server
    β”‚   └── serving.properties        # SGLang server configuration (model ID, port, etc.)
    β”œβ”€β”€ deploy/
    β”‚   β”œβ”€β”€ deploy.sh                 # Creates SageMaker model and endpoint
    β”‚   β”œβ”€β”€ submit_build.sh           # Triggers CodeBuild to build and push Docker image
    β”‚   └── upload_to_s3.sh           # Uploads model artifacts to S3 (if needed)
    └── test/
        └── test_endpoint.sh          # Tests the deployed SageMaker endpoint with sample requests

Step 2. Build the Container

The transformer based projects used managed containers from framework providers to build the final container. These containers take significantly longer to build given how large they are. It is recommended to use AWS CodeBuild for transformer-based containers. Building with AWS CodeBuild helps reduce the likelihood of architecture mismatches as well.

(base) frgud@842f5776eab6 sglang-gptoss-test % ./deploy/submit_build.sh 
πŸ—οΈ  Submitting CodeBuild job...
Project: sglang-gptoss-test-llm-build-20260129
Region: us-east-1
Compute Type: BUILD_GENERAL1_MEDIUM
ECR Repository: ml-container-creator
πŸ“¦ Checking ECR repository...
βœ… ECR repository already exists: ml-container-creator
πŸ” Checking CodeBuild service role...
Creating CodeBuild service role: sglang-gptoss-test-llm-build-20260129-service-role
{
    "Role": {
        "Path": "/",
        "RoleName": "<ROLE_NAME>",
        "RoleId": "<IAM ROLE ID>",
        "Arn": "arn:aws:iam::<ACCOUNT_NO>:role/sglang-gptoss-test-llm-build-20260129-service-role",
        "CreateDate": "2026-01-29T23:15:56+00:00",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "codebuild.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }
    }
}
βœ… CodeBuild service role created successfully
⏳ Waiting for IAM role to propagate...
πŸ—οΈ  Checking CodeBuild project...
Creating CodeBuild project: sglang-gptoss-test-llm-build-20260129
Creating CodeBuild project...
βœ… CodeBuild project created successfully
Project ARN: arn:aws:codebuild:us-east-1:<ACCOUNT_NO>:project/sglang-gptoss-test-llm-build-20260129
⏳ Waiting for CodeBuild project to be available...
βœ… Project creation verified: sglang-gptoss-test-llm-build-20260129
πŸš€ Starting CodeBuild job...
Using project name: sglang-gptoss-test-llm-build-20260129
πŸ“ Uploading source code from current directory...
Creating source archive...
βœ… Source archive created:  16K
πŸ“€ Uploading source to S3...
upload: ../../../../../../tmp/sglang-gptoss-test-source.zip to s3://codebuild-source-<ACCOUNT_NO>-us-east-1/sglang-gptoss-test/source-20260129-181616.zip
πŸš€ Starting CodeBuild job with source from S3...
Using project name: 'sglang-gptoss-test-llm-build-20260129'
S3 source location: s3://codebuild-source-<ACCOUNT_NO>-us-east-1/sglang-gptoss-test/source-20260129-181616.zip
Starting build...
Build started with ID: sglang-gptoss-test-llm-build-20260129:e49cb662-7e0a-485d-9bcc-eb0f23e4f8ac
πŸ“Š You can monitor the build at: https://us-east-1.console.aws.amazon.com/codesuite/codebuild/projects/sglang-gptoss-test-llm-build-20260129/build/sglang-gptoss-test-llm-build-20260129:e49cb662-7e0a-485d-9bcc-eb0f23e4f8ac

⏳ Monitoring build progress...
πŸ“‹ Build status: IN_PROGRESS | Phase: PROVISIONING
πŸ“‹ Build status: IN_PROGRESS | Phase: BUILD
πŸ“‹ Build status: SUCCEEDED | Phase: COMPLETED

βœ… Build completed successfully!
🐳 Docker image available at: <ACCOUNT_NO>.dkr.ecr.us-east-1.amazonaws.com/ml-container-creator:latest

Next steps:
  1. Run './deploy/deploy.sh' to deploy to SageMaker
  2. Or use the ECR image URI in your own deployment process
(base) frgud@842f5776eab6 sglang-gptoss-test % 

Step 3. Deploy to SageMaker AI

Transformer based containers typically require GPU to successfully deploy. Take care to provision your transformer-based container onto the appropriate instance type. The deployment script is populated with a "best-guess" instance type, but you may try experimenting with the deployment instance based on your unique workload requirements.

3.1: Deploy

(base) frgud@842f5776eab6 sglang-gptoss-test % ./deploy/deploy.sh 
Using configured execution role: <ROLE_ARN>
πŸš€ Deploying CodeBuild-generated image to SageMaker...
πŸ” Verifying ECR image exists...
βœ… ECR image found: <ACCOUNT_NO>.dkr.ecr.us-east-1.amazonaws.com/ml-container-creator:latest
Creating SageMaker model: sglang-gptoss-test-model-1769729378
{
    "ModelArn": "arn:aws:sagemaker:us-east-1:<ACCOUNT_NO>:model/sglang-gptoss-test-model-1769729378"
}
Creating endpoint configuration: sglang-gptoss-test-endpoint-config-1769729378
{
    "EndpointConfigArn": "arn:aws:sagemaker:us-east-1:<ACCOUNT_NO>:endpoint-config/<ENDPOINT_CONFIG_NAME>"
}
Creating endpoint: sglang-gptoss-test-endpoint-1769729378
{
    "EndpointArn": "arn:aws:sagemaker:us-east-1:<ACCOUNT_NO>:endpoint/<ENDPOINT_NAME>"
}
Waiting for endpoint to be in service...
Deployment complete!
Endpoint name: <ENDPOINT_NAME>
You can test the endpoint with:
./test_endpoint.sh <ENDPOINT_NAME> openai/gpt-oss-20b
(base) frgud@842f5776eab6 sglang-gptoss-test % 

3.1: Test

(base) frgud@842f5776eab6 sglang-gptoss-test % ./test/test_endpoint.sh <ENDPOINT_NAME> openai/gpt-oss-20b

Testing SageMaker endpoint: <ENDPOINT_NAME>
Checking endpoint status...
InService
Testing inference endpoint...
{
    "ContentType": "application/json",
    "InvokedProductionVariant": "primary"
}
Response:
{
  "id": "5e7ce6ccc0f04cb8abd320b27b508ff5",
  "object": "chat.completion",
  "created": 1769730729,
  "model": "openai/gpt-oss-20b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "<|channel|>analysis<|message|>We need to respond to user greeting. It's a friendly question, \"Hello, how are you?\" We can respond politely: \"I'm good, thanks! How can I help you today?\" The user didn't ask a question; it's just a greeting. So respond accordingly.<|end|><|start|>assistant<|channel|>final<|message|>I’m doing greatβ€”thanks for asking! How can I help you today?",
        "reasoning_content": null,
        "tool_calls": null
      },
      "logprobs": null,
      "finish_reason": "stop",
      "matched_stop": 200002
    }
  ],
  "usage": {
    "prompt_tokens": 73,
    "total_tokens": 153,
    "completion_tokens": 80,
    "prompt_tokens_details": null,
    "reasoning_tokens": 0
  },
  "metadata": {
    "weight_version": "default"
  }
}

Cleaning up files...
Test complete!
(base) frgud@842f5776eab6 sglang-gptoss-test % 

Configuration Options

The example above used interactive prompts, but ML Container Creator supports multiple configuration methods for different workflows:

Quick CLI Generation

Skip prompts entirely using CLI options:

# Generate sklearn project with CLI options
yo ml-container-creator iris-classifier \
  --framework=sklearn \
  --model-server=flask \
  --model-format=pkl \
  --include-testing \
  --skip-prompts

Environment Variables

Set deployment-specific variables:

export AWS_REGION=us-west-2
export ML_INSTANCE_TYPE=gpu-enabled
yo ml-container-creator --framework=transformers --model-server=vllm --skip-prompts

Configuration Precedence

Configuration sources are applied in order (highest to lowest priority):

  1. CLI Options (--framework=sklearn)
  2. CLI Arguments (yo ml-container-creator my-project)
  3. Environment Variables (AWS_REGION=us-east-1)
  4. Config Files (--config=prod.json or ml-container.config.json)
  5. Package.json ("ml-container-creator": {...})
  6. Generator Defaults
  7. Interactive Prompts (fallback)

For complete configuration documentation, see the Configuration Guide.

Cleanup

To avoid ongoing charges, delete your SageMaker endpoint:

# Delete endpoint
aws sagemaker delete-endpoint --endpoint-name <ENDPOINT_NAME>

# Delete endpoint configuration
aws sagemaker delete-endpoint-config --endpoint-config-name <ENDPOINT_CONFIG_NAME>

# Delete model
aws sagemaker delete-model --model-name <MODEL_NAME>