Deployment
Prerequisites
- Set up or have access to an AWS account.
- Ensure that your AWS account has the appropriate permissions. Resource creation during the AWS CDK deployment expects Administrator or Administrator-like permissions, to include resource creation and mutation permissions. Installation will not succeed if this profile does not have permissions to create and edit arbitrary resources for the system. This level of permissions is not required for the runtime of LISA. This is only necessary for deployment and subsequent updates.
- If using the chat UI, have your Identity Provider (IdP) information available, and access.
- If using an existing VPC, have its information available.
- Familiarity with AWS Cloud Development Kit (CDK) and infrastructure-as-code principles is a plus.
- AWS CDK and Model Management both leverage AWS Systems Manager Agent (SSM) parameter store. Confirm that SSM is approved for use by your organization before beginning. If you're new to CDK, review the AWS CDK Documentation and consult with your AWS support team.
Software
- AWS CLI installed and configured
- Python 3.11
- Node.js 20
- Docker installed and running
- Sufficient disk space for model downloads and conversions
TIP:
To minimize version conflicts and ensure a consistent deployment environment, we recommend executing the following steps on a dedicated EC2 instance. However, LISA can be deployed from any machine that meets the prerequisites listed above.
Deployment Steps
Step 1: Clone the Repository
Ensure you're working with the latest stable release of LISA:
git clone -b main --single-branch <path-to-lisa-repo>
cd lisa
Step 2: Set Up Environment Variables
Create and configure your config-custom.yaml
file:
cp example_config.yaml config-custom.yaml
Set the following environment variables:
export PROFILE=my-aws-profile # Optional, can be left blank
export DEPLOYMENT_NAME=my-deployment
export ENV=dev # Options: dev, test, or prod
export CDK_DOCKER=finch # Optional, only required if not using docker as container engine
Step 3: Set Up Python and TypeScript Environments
Install system dependencies and set up both Python and TypeScript environments:
# Install system dependencies
sudo apt-get update
sudo apt-get install -y jq
# Install Python packages
pip3 install --user --upgrade pip
pip3 install yq huggingface_hub s5cmd
# Set up Python environment
make createPythonEnvironment
# Activate your python environment
# The command is the output from the previous make command)
# Install Python Requirements
make installPythonRequirements
# Set up TypeScript environment
make createTypeScriptEnvironment
make installTypeScriptRequirements
Step 4: Configure LISA
Edit the config-custom.yaml
file to customize your LISA deployment. Key configurations include:
- AWS account and region settings
- Authentication settings
- Model bucket name
Step 5: Stage Model Weights
LISA requires model weights to be staged in the S3 bucket specified in your config-custom.yaml
file, assuming the S3 bucket follows this structure:
s3://<bucket-name>/<hf-model-id-1>
s3://<bucket-name>/<hf-model-id-1>/<file-1>
s3://<bucket-name>/<hf-model-id-1>/<file-2>
...
s3://<bucket-name>/<hf-model-id-2>
Example:
s3://<bucket-name>/mistralai/Mistral-7B-Instruct-v0.2
s3://<bucket-name>/mistralai/Mistral-7B-Instruct-v0.2/<file-1>
s3://<bucket-name>/mistralai/Mistral-7B-Instruct-v0.2/<file-2>
...
To automatically download and stage the model weights defined by the ecsModels
parameter in your config-custom.yaml
, use the following command:
make modelCheck
This command verifies if the model's weights are already present in your S3 bucket. If not, it downloads the weights, converts them to the required format, and uploads them to your S3 bucket. Ensure adequate disk space is available for this process.
WARNING As of LISA 3.0, the
ecsModels
parameter inconfig-custom.yaml
is solely for staging model weights in your S3 bucket. Previously, before models could be managed through the API or via the Model Management section of the Chatbot, this parameter also dictated which models were deployed.
NOTE For air-gapped systems, before running
make modelCheck
you should manually download model artifacts and place them in amodels
directory at the project root, using the structure:models/<model-id>
.
NOTE This process is primarily designed and tested for HuggingFace models. For other model formats, you will need to manually create and upload safetensors.
Step 6: Configure Identity Provider
In the config-custom.yaml
file, configure the authConfig
block for authentication. LISA supports OpenID Connect (OIDC) providers such as AWS Cognito or Keycloak. Required fields include:
authority
: URL of your identity providerclientId
: Client ID for your applicationadminGroup
: Group name for users with model management permissionsuserGroup
: Group name for regular LISA usersjwtGroupsProperty
: Path to the groups field in the JWT tokenadditionalScopes
(optional): Extra scopes for group membership information
IDP Configuration examples using AWS Cognito and Keycloak can be found: IDP Configuration Examples
Step 7: Configure LiteLLM
We utilize LiteLLM under the hood to allow LISA to respond to the OpenAI specification. For LiteLLM configuration, a key must be set up so that the system may communicate with a database for tracking all the models that are added or removed using the Model Management API. The key must start with sk-
and then can be any arbitrary string. We recommend generating a new UUID and then using that as the key. Configuration example is below.
litellmConfig:
db_key: sk-00000000-0000-0000-0000-000000000000 # needed for db operations, create your own key # pragma: allowlist-secret
Step 8: Set Up SSL Certificates (Development Only)
WARNING: THIS IS FOR DEV ONLY When deploying for dev and testing you can use a self-signed certificate for the REST API ALB. You can create this by using the script: gen-cert.sh
and uploading it to IAM
.
export REGION=<your-region>
export DOMAIN=<your-domain> #Optional if not running in 'aws' partition
./scripts/gen-certs.sh
aws iam upload-server-certificate --server-certificate-name <cert-name> --certificate-body file://scripts/server.pem --private-key file://scripts/server.key
Update your config-custom.yaml
with the certificate ARN:
restApiConfig:
sslCertIamArn: arn:<aws-partition>:iam::<account-number>:server-certificate/<certificate-name>
Step 9: Customize Model Deployment
In the ecsModels
section of config-custom.yaml
, allow our deployment process to pull the model weights for you.
During the deployment process, LISA will optionally attempt to download your model weights if you specify an optional ecsModels
array, this will only work in non ADC regions. Specifically, see the ecsModels
section of the example_config.yaml file. Here we define the model name, inference container, and baseImage:
ecsModels:
- modelName: your-model-name
inferenceContainer: tgi
baseImage: ghcr.io/huggingface/text-generation-inference:2.0.1
Step 10: Bootstrap CDK (If Not Already Done)
If you haven't bootstrapped your AWS account for CDK:
make bootstrap
ADC Region Deployment Tips
Amazon Dedicated Cloud (ADC) regions are isolated AWS environments designed for government customers' most sensitive workloads. These regions have restricted internet access and limited external dependencies, requiring special deployment considerations for LISA.
There are two deployment approaches for ADC regions:
- Pre-built Resources (Recommended): Build all components in a commercial region, then transfer to ADC
- In-Region Building: Configure LISA to use ADC-accessible repositories for building components
Approach 1: Pre-built Resources (Recommended)
This approach builds all necessary components in a commercial region with full internet access, then transfers them to the ADC region.
Step 1: Build Components in Commercial Region
- Set up LISA in a commercial AWS region with internet access
- Build all components:bashThis generates:
make buildArchive
- Lambda function zip files in
./dist/layers/*.zip
- Docker images exported as
./dist/images/*.tar
files
- Lambda function zip files in
Step 2: Transfer to ADC Region
Upload Docker images to ECR in your ADC region:
bash# Load and tag images docker load -i lisa-rest-api.tar docker tag lisa-rest-api:latest <adc-account-id>.dkr.ecr.<adc-region>.amazonaws.com/lisa-rest-api:latest # Push to ADC ECR aws ecr get-login-password --region <adc-region> | docker login --username AWS --password-stdin <adc-account-id>.dkr.ecr.<adc-region>.amazonaws.com docker push <adc-account-id>.dkr.ecr.<adc-region>.amazonaws.com/lisa-rest-api:latest
You'll want to repeat this for lisa-batch-ingestion, as well as any of the LISA base model hosting containers (lisa-vllm, lisa-tgi, lisa-tei)
Transfer built artifacts to ADC environment
Step 3: Configure LISA for Pre-built Resources
Update your config-custom.yaml
in the ADC region:
# Lambda layers from pre-built archives
lambdaLayerAssets:
authorizerLayerPath: './dist/layers/AimlAdcLisaAuthLayer.zip'
commonLayerPath: './dist/layers/AimlAdcLisaCommonLayer.zip'
fastapiLayerPath: './dist/layers/AimlAdcLisaFastApiLayer.zip'
ragLayerPath: './dist/layers/AimlAdcLisaRag.zip'
sdkLayerPath: './dist/layers/AimlAdcLisaSdk.zip'
# Lambda functions
lambdaPath: './dist/layers/AimlAdcLisaLambda.zip'
# Pre-built web assets
webAppAssetsPath: './dist/lisa-web'
documentsPath: './dist/docs'
ecsModelDeployerPath: './dist/ecs_model_deployer'
vectorStoreDeployerPath: './dist/vector_store_deployer'
# Container images from ECR
batchIngestionConfig:
type: external
code: <adc-account-id>.dkr.ecr.<adc-region>.amazonaws.com/lisa-batch-ingestion:latest
restApiConfig:
imageConfig:
type: external
code: <adc-account-id>.dkr.ecr.<adc-region>.amazonaws.com/lisa-rest-api:latest
Approach 2: In-Region Building
This approach configures LISA to build components using repositories accessible from within the ADC region.
Prerequisites
- ADC-accessible package repositories (PyPI mirror, npm registry, container registry)
- ADC-accessible container registries
- Network connectivity to required build dependencies
Configuration
Update your config-custom.yaml
to point to ADC-accessible repositories:
# Configure pip to use ADC-accessible PyPI mirror
pipConfig:
indexUrl: https://your-adc-pypi-mirror.com/simple
trustedHost: your-adc-pypi-mirror.com
# Configure npm to use ADC-accessible registry
npmConfig:
registry: https://your-adc-npm-registry.com
# Use ADC-accessible base images for LISA-Serve and Batch Ingestion
baseImage: <adc-registry>/python:3.11
You'll also want any model hosting base containers available, e.g. vllm/vllm-openai:latest and ghcr.io/huggingface/text-embeddings-inference:latest
To utilize the prebuilt hosting model containers with self-hosted models, select type: ecr
in the Model Deployment > Container Configs.
Deployment Steps
Once your configuration is complete:
Bootstrap CDK (if not already done):
bashmake bootstrap
Deploy LISA:
bashmake deploy
Deploy specific stacks if needed:
bashmake deploy STACK=LisaServe
List available stacks:
bashmake listStacks
Testing Your Deployment
After deployment completes (10-15 minutes), test with:
pytest lisa-sdk/tests --url <rest-url-from-cdk-output> --verify <path-to-server.crt>
Troubleshooting ADC Deployments
- Build failures: Ensure all dependencies are accessible from ADC region
- Container pull errors: Verify ECR repositories exist and have correct permissions
- Lambda deployment issues: Check that lambda zip files are properly formatted and accessible
- Network connectivity: Confirm VPC configuration allows required outbound connections