
Troubleshooting Questions

Common issues and solutions when using the GenAI IDP Accelerator.

Permission Errors

"AccessDenied" when deploying with Terraform

Symptoms:

Error: AccessDenied: User is not authorized to perform action

Solutions:

  1. Check IAM permissions:

aws sts get-caller-identity
aws iam get-user

  2. Verify required permissions:

  • IAM: Create/manage roles and policies
  • Lambda: Create/update functions
  • S3: Create/manage buckets
  • DynamoDB: Create/manage tables
  • API Gateway: Create/manage APIs

  3. Use administrator access temporarily (remove it once deployment succeeds):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "*",
      "Resource": "*"
    }
  ]
}
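If you do grant broad access to unblock a deployment, it helps to audit policies for wildcards afterwards. A minimal sketch (pure stdlib, no AWS calls; the helper name is illustrative) that flags over-broad statements in a policy document like the one above:

```python
import json

def find_wildcard_statements(policy_json):
    """Return Allow statements that grant '*' actions or resources."""
    policy = json.loads(policy_json)
    flagged = []
    for stmt in policy.get("Statement", []):
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        # Policy documents allow a single string where a list is expected
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        if stmt.get("Effect") == "Allow" and ("*" in actions or "*" in resources):
            flagged.append(stmt)
    return flagged

admin_policy = """{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow", "Action": "*", "Resource": "*"}
  ]
}"""
```

Running `find_wildcard_statements(admin_policy)` flags the single statement above, while a policy scoped to specific actions and ARNs passes clean.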

"User is not authorized to invoke Bedrock model"

Symptoms:

Error: AccessDeniedException: Your account is not authorized to invoke this model

Solutions:

  1. Request model access:

  • Go to the Amazon Bedrock console
  • Navigate to "Model access"
  • Request access for the required models

  2. Check region availability:

  • Bedrock models aren't available in all regions
  • Use supported regions such as us-east-1 or us-west-2

  3. Verify the model ID:

# Correct model ID format
model_id = "anthropic.claude-3-sonnet-20240229-v1:0"
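Typos in the model ID are a common cause of this error. A quick sanity check can catch them before a request is made (the regex below is a heuristic based on the `provider.model-name-vN` convention, not an official Bedrock specification):

```python
import re

# Rough shape: provider.model-name...-vN[:K]  (heuristic, not authoritative)
MODEL_ID_PATTERN = re.compile(r"^[a-z0-9]+\.[a-z0-9-]+-v\d+(:\d+)?$")

def looks_like_model_id(model_id):
    return bool(MODEL_ID_PATTERN.match(model_id))
```

This rejects obviously malformed IDs (missing provider prefix, missing version suffix) while accepting the documented format.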

Resource Limits

"LimitExceededException" errors

Symptoms:

Error: LimitExceededException: Account has reached the maximum number of functions

Solutions:

  1. Check service quotas:

aws service-quotas get-service-quota \
  --service-code lambda \
  --quota-code L-B99A9384

  2. Request quota increases:

  • Go to the AWS Service Quotas console
  • Find the relevant service and quota
  • Submit an increase request

  3. Clean up unused resources:

# List Lambda functions not modified since 2024-01-01
aws lambda list-functions --query 'Functions[?LastModified<`2024-01-01`]'
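The same filtering can be done client-side on the `list-functions` output. A sketch that operates on the JSON structure the CLI returns (no AWS call here; the sample entries are made up):

```python
def stale_functions(functions, cutoff="2024-01-01"):
    """Filter list-functions entries whose LastModified predates cutoff.

    LastModified is an ISO-8601 string, so lexicographic comparison
    on the date prefix is sufficient.
    """
    return [f["FunctionName"] for f in functions if f["LastModified"] < cutoff]

sample = [
    {"FunctionName": "old-processor", "LastModified": "2023-06-15T10:00:00.000+0000"},
    {"FunctionName": "new-processor", "LastModified": "2024-03-01T10:00:00.000+0000"},
]
```

Here `stale_functions(sample)` returns only `old-processor`, giving you a candidate list for cleanup.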

"ThrottlingException" from AWS services

Symptoms:

Error: ThrottlingException: Rate exceeded

Solutions:

  1. Implement exponential backoff:

import time
import random

from botocore.exceptions import ClientError

def retry_with_backoff(func, max_retries=3):
    for attempt in range(max_retries):
        try:
            return func()
        except ClientError as e:
            # boto3 surfaces throttling as a ClientError with this error code
            if e.response["Error"]["Code"] != "ThrottlingException":
                raise
            if attempt == max_retries - 1:
                raise
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(wait_time)

  2. Reduce request rate:

  • Add delays between API calls
  • Use batch operations where possible
  • Implement queue-based processing
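The retry pattern can be exercised without AWS by raising a stand-in exception from a stub. A self-contained sketch (all names here are illustrative; the injectable `sleep` keeps the demo instant):

```python
import time
import random

class ThrottlingError(Exception):
    """Stand-in for a throttling failure from any API."""

def retry_with_backoff(func, max_retries=3, sleep=time.sleep):
    for attempt in range(max_retries):
        try:
            return func()
        except ThrottlingError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter: 1-2s, 2-3s, 4-5s, ...
            sleep((2 ** attempt) + random.uniform(0, 1))

calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ThrottlingError()
    return "ok"

result = retry_with_backoff(flaky, max_retries=5, sleep=lambda s: None)
```

The stub fails twice, then succeeds on the third attempt, so the retry loop returns "ok" after two backoffs.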

Deployment Failures

Terraform state lock conflicts

Symptoms:

Error: Error acquiring the state lock

Solutions:

  1. Wait for lock release:

  • Another Terraform operation may be running
  • Wait 10-15 minutes for automatic release

  2. Check lock status:

aws dynamodb get-item \
  --table-name terraform-locks \
  --key '{"LockID":{"S":"your-state-file"}}'

  3. Force unlock (use carefully):

terraform force-unlock LOCK_ID

"Resource already exists" errors

Symptoms:

Error: ResourceAlreadyExistsException: Resource already exists

Solutions:

  1. Import existing resource:

terraform import aws_s3_bucket.documents existing-bucket-name

  2. Use different resource names:

resource "aws_s3_bucket" "documents" {
  bucket = "${var.environment}-idp-documents-${random_id.suffix.hex}"
}

  3. Check for naming conflicts:

  • S3 bucket names must be globally unique
  • Lambda function names must be unique per region/account
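When generating suffixed names as above, it's easy to violate S3's naming rules (3-63 characters; lowercase letters, digits, hyphens, and dots; must start and end with a letter or digit). A sketch that checks a candidate name before `terraform apply`:

```python
import re

# 3-63 chars, lowercase/digits/dots/hyphens, alphanumeric at both ends
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def is_valid_bucket_name(name):
    # Covers the length and character rules; not every edge case
    # (e.g. IP-address-shaped names are also disallowed by S3)
    return bool(BUCKET_NAME_RE.match(name))
```

Uppercase letters, underscores, and too-short names are all rejected, which catches the most common interpolation mistakes.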

Runtime Errors

Lambda function timeouts

Symptoms:

Task timed out after 15.00 seconds

Solutions:

  1. Increase timeout:

resource "aws_lambda_function" "processor" {
  timeout = 300  # 5 minutes
}

  2. Optimize function performance:

# Initialize clients outside the handler so they are reused across warm invocations
import boto3

s3_client = boto3.client('s3')
textract_client = boto3.client('textract')

def lambda_handler(event, context):
    # Use pre-initialized clients
    pass

  3. Increase memory allocation:

resource "aws_lambda_function" "processor" {
  memory_size = 1024  # More memory also means more CPU
}

"Document format not supported" errors

Symptoms:

Error: InvalidParameterException: Document format not supported

Solutions:

  1. Check supported formats:

  • PDF, PNG, JPEG, TIFF only
  • Maximum file size: 10MB (sync), 500MB (async)

  2. Validate file before processing:

import os

SUPPORTED_EXTENSIONS = {'.pdf', '.png', '.jpg', '.jpeg', '.tif', '.tiff'}

def validate_document(file_name):
    ext = os.path.splitext(file_name)[1].lower()
    return ext in SUPPORTED_EXTENSIONS

  3. Convert unsupported formats:

  • Use Lambda layers with image processing libraries
  • Convert to a supported format before processing
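Extension checks alone miss the size limits above. A combined pre-flight check (the 10MB figure is the synchronous limit quoted above; the `size` parameter is an assumption added so the logic can be tested without a real file):

```python
import os

SUPPORTED_EXTENSIONS = {'.pdf', '.png', '.jpg', '.jpeg', '.tif', '.tiff'}
MAX_SYNC_BYTES = 10 * 1024 * 1024  # 10MB synchronous limit

def preflight_check(path, size=None):
    """Return (ok, reason). size overrides os.path.getsize for testing."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return False, f"unsupported extension: {ext}"
    size = os.path.getsize(path) if size is None else size
    if size > MAX_SYNC_BYTES:
        return False, "file exceeds 10MB sync limit; use the async API"
    return True, "ok"
```

Rejecting oversized files up front avoids a round trip that would fail server-side anyway.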

Performance Issues

Slow document processing

Symptoms:

  • Long processing times
  • Frequent timeouts

Solutions:

  1. Optimize Lambda configuration:

resource "aws_lambda_function" "processor" {
  memory_size = 2048  # Higher memory for better performance
  timeout     = 900   # 15 minutes max
}

  2. Implement parallel processing:

import concurrent.futures

def process_documents_parallel(documents):
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        futures = [executor.submit(process_document, doc) for doc in documents]
        results = [future.result() for future in futures]
    return results

  3. Use asynchronous processing:

  • Use Textract async APIs for large documents
  • Implement Step Functions for complex workflows
  • Use SQS for queue-based processing
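The thread-pool pattern can be tried end-to-end with a stub worker (`process_document` below is a placeholder, not part of the accelerator):

```python
import concurrent.futures

def process_document(doc):
    # Placeholder for real per-document work (OCR, extraction, ...)
    return {"name": doc, "pages": len(doc)}

def process_documents_parallel(documents, max_workers=5):
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves input order in its results
        return list(executor.map(process_document, documents))

results = process_documents_parallel(["a.pdf", "invoice.pdf"])
```

Note that threads help with I/O-bound work (API calls); CPU-bound steps would need a process pool instead.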

High memory usage

Symptoms:

Runtime.OutOfMemoryError: JavaScript heap out of memory

Solutions:

  1. Increase Lambda memory:

memory_size = 10240  # Maximum available (10GB)

  2. Optimize memory usage:

# Process documents in chunks
def process_large_document(document_text):
    chunk_size = 10000  # characters
    chunks = [document_text[i:i+chunk_size]
              for i in range(0, len(document_text), chunk_size)]

    results = []
    for chunk in chunks:
        result = process_chunk(chunk)
        results.append(result)

    return combine_results(results)
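With stub implementations of `process_chunk` and `combine_results` (both hypothetical here), the chunking logic can be verified in isolation:

```python
def process_chunk(chunk):
    # Stand-in for real per-chunk work (e.g. a model call)
    return len(chunk)

def combine_results(results):
    return sum(results)

def process_large_document(document_text, chunk_size=10):
    chunks = [document_text[i:i+chunk_size]
              for i in range(0, len(document_text), chunk_size)]
    return combine_results(process_chunk(c) for c in chunks)
```

Because each chunk is processed and discarded, peak memory stays proportional to `chunk_size` rather than the full document.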

Network Issues

VPC connectivity problems

Symptoms:

  • Lambda functions can't reach AWS services
  • Timeout errors when calling APIs

Solutions:

  1. Check VPC configuration:

# Ensure a NAT Gateway for private subnets
resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id
}

  2. Use VPC endpoints:

resource "aws_vpc_endpoint" "s3" {
  vpc_id       = aws_vpc.main.id
  service_name = "com.amazonaws.${var.region}.s3"
}

  3. Check security groups:

resource "aws_security_group" "lambda" {
  egress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

Data Issues

Inconsistent processing results

Symptoms:

  • Different results for same document
  • Missing or incorrect extracted data

Solutions:

  1. Improve prompt consistency:

CONSISTENT_PROMPT = """
Extract the following information from this document:
1. Document type
2. Date (format: YYYY-MM-DD)
3. Amount (format: $X.XX)
4. Parties involved

Return as JSON with these exact keys: type, date, amount, parties
"""

  2. Implement validation:

def validate_extraction_result(result):
    required_fields = ['type', 'date', 'amount', 'parties']
    return all(field in result for field in required_fields)

  3. Add error handling:

def process_with_fallback(document):
    try:
        result = primary_processing(document)
        if validate_extraction_result(result):
            return result
    except Exception as e:
        logger.warning(f"Primary processing failed: {e}")

    # Fall back to simpler processing
    return fallback_processing(document)
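Model responses usually arrive as JSON strings, so parsing and validation go together. A sketch tying the prompt's expected keys to the validator (the sample response below is fabricated):

```python
import json

REQUIRED_FIELDS = ['type', 'date', 'amount', 'parties']

def parse_and_validate(raw_response):
    """Return the parsed dict, or None if malformed or incomplete."""
    try:
        result = json.loads(raw_response)
    except json.JSONDecodeError:
        return None
    if not all(field in result for field in REQUIRED_FIELDS):
        return None
    return result

sample = '{"type": "invoice", "date": "2024-05-01", "amount": "$120.00", "parties": ["Acme", "Globex"]}'
```

Returning None on any failure gives the caller a single branch point for triggering fallback processing.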

Monitoring and Debugging

How to debug Lambda functions

  1. Enable detailed logging:

import logging

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

def lambda_handler(event, context):
    logger.debug(f"Received event: {event}")
    result = None  # ... processing logic
    logger.debug(f"Processing result: {result}")

  2. Use X-Ray tracing:

resource "aws_lambda_function" "processor" {
  tracing_config {
    mode = "Active"
  }
}

  3. Monitor CloudWatch metrics:

  • Duration, Errors, Throttles
  • Memory utilization
  • Concurrent executions

How to trace API requests

  1. Enable API Gateway logging:

resource "aws_api_gateway_stage" "main" {
  xray_tracing_enabled = true

  access_log_settings {
    destination_arn = aws_cloudwatch_log_group.api_gateway.arn
    format = jsonencode({
      requestId      = "$context.requestId"
      ip             = "$context.identity.sourceIp"
      caller         = "$context.identity.caller"
      user           = "$context.identity.user"
      requestTime    = "$context.requestTime"
      httpMethod     = "$context.httpMethod"
      resourcePath   = "$context.resourcePath"
      status         = "$context.status"
      protocol       = "$context.protocol"
      responseLength = "$context.responseLength"
    })
  }
}

  2. Use correlation IDs:

import json
import logging
import uuid

logger = logging.getLogger()

def lambda_handler(event, context):
    correlation_id = str(uuid.uuid4())
    logger.info(f"Processing request {correlation_id}")

    # Pass the correlation ID through the processing chain
    result = process_document(event, correlation_id)

    return {
        'statusCode': 200,
        'headers': {'X-Correlation-ID': correlation_id},
        'body': json.dumps(result)
    }

For more troubleshooting help, see: