# Nova 2 Model Support

## Overview

This document explains how the lexical-graph toolkit supports Amazon Nova 2 series models in AWS Bedrock, including the architecture, implementation details, and usage patterns.
## Background

### The Problem

Amazon Nova 2 series models (Lite, Micro, Pro, Premier, Pro Preview) were released after LlamaIndex's `BedrockConverse` class was implemented. LlamaIndex maintains a hardcoded list of supported models in `llama_index/llms/bedrock_converse/utils.py`, and Nova 2 models are not included in this list. This causes model validation to fail when attempting to use Nova 2 models.

Additionally, Nova 2 models require the inference profile format (e.g., `us.amazon.nova-2-lite-v1:0`) instead of direct model IDs for on-demand throughput, which adds another layer of complexity.
### The Solution

Rather than waiting for LlamaIndex to update their model list or monkey-patching their validation logic, we implemented a custom `DirectBedrockLLM` class that:

- Uses boto3's `bedrock-runtime` client directly, bypassing LlamaIndex's model validation
- Implements LlamaIndex's `LLM` interface for compatibility with existing code
- Properly handles credential management through `GraphRAGConfig.session`
- Supports pickling for multiprocessing workflows
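As a sketch of what "using the `bedrock-runtime` client directly" looks like, the helper below builds the request shape the Bedrock Converse API expects. This is illustrative only; `build_converse_request` is not part of the toolkit.

```python
# Illustrative sketch, not the toolkit's actual code: the request shape
# that a direct bedrock-runtime Converse call uses.
def build_converse_request(model_id: str, prompt: str,
                           temperature: float = 0.0,
                           max_tokens: int = 4096) -> dict:
    """Assemble the keyword arguments for client.converse(**request)."""
    return {
        "modelId": model_id,
        "messages": [
            {"role": "user", "content": [{"text": prompt}]},
        ],
        "inferenceConfig": {
            "temperature": temperature,
            "maxTokens": max_tokens,
        },
    }

request = build_converse_request("us.amazon.nova-2-lite-v1:0", "Summarize this document.")
```

A class built this way would pass the payload to `client.converse(**request)` on a `bedrock-runtime` client obtained from `GraphRAGConfig.session`, with no model validation step in between.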
## Architecture

### Component Overview

```
GraphRAGConfig
├── _to_llm() method
│   ├── Checks if model is in NOVA_2_MODELS list
│   ├── If yes → DirectBedrockLLM
│   └── If no  → BedrockConverse (LlamaIndex)
└── session property
    └── Provides boto3 session for AWS authentication
        ├── IRSA in EKS (IAM Roles for Service Accounts)
        └── SSO locally (AWS profiles)

DirectBedrockLLM
├── Implements LLM interface
├── Uses boto3 bedrock-runtime client
├── Gets credentials from GraphRAGConfig.session
└── Supports pickling via __getstate__/__setstate__
```

### Decision Logic
The `_to_llm()` method in `GraphRAGConfig` determines which LLM implementation to use:
**DirectBedrockLLM** is used when:

- Model ID is in the `NOVA_2_MODELS` list
- Includes both model ID format (`amazon.nova-2-*`) and inference profile format (`us.amazon.nova-2-*`)

**BedrockConverse** (LlamaIndex) is used for:

- All other Bedrock models (Claude, Titan, Cohere, etc.)
- Any model NOT in the `NOVA_2_MODELS` list
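The decision rule can be sketched as a standalone check. This is a hypothetical helper for illustration; the real logic lives inside `_to_llm()`, and the model list here is abbreviated.

```python
# Abbreviated copy of the model list from config.py.
NOVA_2_MODELS = [
    'amazon.nova-2-lite-v1:0',
    'us.amazon.nova-2-lite-v1:0',
    # ... remaining model IDs and inference profile formats
]

def select_llm_implementation(model_id: str) -> str:
    """Return the class name _to_llm() would pick for a given model ID."""
    return 'DirectBedrockLLM' if model_id in NOVA_2_MODELS else 'BedrockConverse'
```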
## Implementation Details

### Supported Nova 2 Models

The following Nova 2 models are supported (defined in `config.py`):

```python
NOVA_2_MODELS = [
    # Model IDs
    'amazon.nova-2-lite-v1:0',
    'amazon.nova-2-micro-v1:0',
    'amazon.nova-2-pro-v1:0',
    'amazon.nova-2-premier-v1:0',
    'amazon.nova-2-pro-preview-20251202-v1:0',
    # Inference profile formats (required for on-demand throughput)
    'us.amazon.nova-2-lite-v1:0',
    'us.amazon.nova-2-micro-v1:0',
    'us.amazon.nova-2-pro-v1:0',
    'us.amazon.nova-2-premier-v1:0',
    'us.amazon.nova-2-pro-preview-20251202-v1:0',
]
```

### DirectBedrockLLM Class
Located in `lexical-graph/src/graphrag_toolkit/lexical_graph/bedrock_llm.py`.

**Key Features:**

- **LlamaIndex Compatibility**: Implements the `LLM` interface from LlamaIndex
- **Credential Management**: Gets boto3 session from `GraphRAGConfig.session`
- **Pickling Support**: Excludes client from pickle, recreates on unpickle
- **Lazy Client Creation**: Client property creates client on-demand from session
**Pickling Implementation:**

```python
def __getstate__(self):
    """Exclude client from pickle - will be recreated from GraphRAGConfig.session"""
    state = self.__dict__.copy()
    state['_client'] = None
    return state

def __setstate__(self, state):
    """Restore state and recreate client from GraphRAGConfig.session"""
    self.__dict__.update(state)
    self._client = None  # Will be lazily created via property

@property
def client(self):
    """Lazy client creation from GraphRAGConfig.session"""
    if self._client is None:
        from graphrag_toolkit.lexical_graph.config import GraphRAGConfig
        self._client = GraphRAGConfig.session.client('bedrock-runtime')
    return self._client
```

This approach ensures:

- The client is not pickled (which would fail)
- The client is recreated with proper credentials after unpickling
- Pickling works seamlessly in multiprocessing environments
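The pattern can be demonstrated end to end with a minimal stand-in. `LazyClientLLM` and `UnpicklableClient` below are hypothetical classes used only to illustrate why excluding the client makes the object picklable:

```python
import pickle

# UnpicklableClient stands in for the botocore client, which cannot be pickled.
class UnpicklableClient:
    def __reduce__(self):
        raise TypeError("cannot pickle client")

class LazyClientLLM:
    def __init__(self):
        self._client = UnpicklableClient()

    def __getstate__(self):
        state = self.__dict__.copy()
        state['_client'] = None   # drop the client before pickling
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._client = None       # recreated lazily on first use

llm = LazyClientLLM()
restored = pickle.loads(pickle.dumps(llm))   # succeeds: client was excluded
```

Pickling the client directly would raise `TypeError`; pickling the wrapper succeeds because `__getstate__` drops it, and the lazy property rebuilds it in the child process.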
### Configuration Integration

The `_to_llm()` method in `GraphRAGConfig` handles model selection:

```python
def _to_llm(self, llm: LLMType):
    if isinstance(llm, LLM):
        return llm

    # ... session setup ...

    if _is_json_string(llm):
        config = json.loads(llm)
        model_id = config['model']

        # Check if this is a Nova 2 model
        if model_id in NOVA_2_MODELS:
            from graphrag_toolkit.lexical_graph.bedrock_llm import DirectBedrockLLM
            logger.info(f"Using DirectBedrockLLM for Nova 2 model: {model_id}")
            return DirectBedrockLLM(
                model=model_id,
                temperature=config.get('temperature', 0.0),
                max_tokens=config.get('max_tokens', 4096)
            )

        # Use BedrockConverse for other models
        return BedrockConverse(...)
    else:
        # Check if this is a Nova 2 model
        if llm in NOVA_2_MODELS:
            from graphrag_toolkit.lexical_graph.bedrock_llm import DirectBedrockLLM
            logger.info(f"Using DirectBedrockLLM for Nova 2 model: {llm}")
            return DirectBedrockLLM(
                model=llm,
                temperature=0.0,
                max_tokens=4096
            )

        # Use BedrockConverse for other models
        return BedrockConverse(...)
```

### Explicit Import and Instantiation
To use Nova 2 multimodal embeddings, you must explicitly import and instantiate the class:

```python
from graphrag_toolkit.lexical_graph import GraphRAGConfig
from graphrag_toolkit.lexical_graph.utils.bedrock_utils import Nova2MultimodalEmbedding

GraphRAGConfig.embed_model = Nova2MultimodalEmbedding('amazon.nova-2-multimodal-embeddings-v1:0')
GraphRAGConfig.embed_dimensions = 3072
```

### Advanced Configuration
```python
from graphrag_toolkit.lexical_graph import GraphRAGConfig
from graphrag_toolkit.lexical_graph.utils.bedrock_utils import Nova2MultimodalEmbedding

embedding = Nova2MultimodalEmbedding(
    model_name='amazon.nova-2-multimodal-embeddings-v1:0',
    embed_dimensions=3072,
    embed_purpose='TEXT_RETRIEVAL',
    truncation_mode='END'
)

GraphRAGConfig.embed_model = embedding
GraphRAGConfig.embed_dimensions = 3072
```

## IAM Permissions
### Cross-Region Bedrock Access

Nova 2 models use inference profiles, which require specific IAM permissions:

```python
# In infrastructure/platform/stacks/argo_workflow_access_stack.py
iam.PolicyStatement(
    effect=iam.Effect.ALLOW,
    actions=[
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
    ],
    resources=[
        # Inference profiles (without account ID)
        "arn:aws:bedrock:*::inference-profile/*",
        # Inference profiles (with account ID)
        f"arn:aws:bedrock:*:{account}:inference-profile/*",
        # Specific inference profile
        "arn:aws:bedrock:us-east-1::inference-profile/us.amazon.nova-2-lite-v1:0",
        # Foundation models
        "arn:aws:bedrock:*::foundation-model/*",
    ]
)
```

### Why Both ARN Patterns?
AWS Bedrock inference profiles can have ARNs with or without account IDs:

- `arn:aws:bedrock:*::inference-profile/*` - Cross-account inference profiles
- `arn:aws:bedrock:*:{account}:inference-profile/*` - Account-specific inference profiles
Including both ensures compatibility with all inference profile types.
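For illustration, a hypothetical helper (not in the codebase) that emits both patterns for a given account:

```python
# Hypothetical helper: build the two inference-profile ARN patterns
# that the IAM policy should include.
def inference_profile_arns(account_id: str) -> list:
    return [
        "arn:aws:bedrock:*::inference-profile/*",               # no account ID
        f"arn:aws:bedrock:*:{account_id}:inference-profile/*",  # account-scoped
    ]
```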
## Credential Management

### Local Development (SSO)

```shell
# Login to AWS SSO
aws sso login --profile primary

# Set profile and region
export AWS_PROFILE=primary
export AWS_REGION=us-east-1

# Run extraction
python extract_script.py
```

### EKS (IRSA)
In EKS, the service account is annotated with an IAM role:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argo-workflows-server
  namespace: argo-workflows
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::188967239867:role/ArgoWorkflowAccessRole
```

`GraphRAGConfig.session` automatically uses IRSA credentials when running in EKS.
## Validation

### Successful Workflow Example

```shell
# Submit test workflow
argo submit infrastructure/argo-workflows/templates/extract-bee-test-workflow.yaml \
  -n argo-workflows \
  --watch

# Check logs
argo logs extract-bee-test-6z8nx -n argo-workflows

# Output shows:
# [GraphRAGConfig] Using DirectBedrockLLM for Nova 2 model: us.amazon.nova-2-lite-v1:0
# Successfully extracted 22 JSON files
```

### Verification Steps
- **Check model selection**: Look for the log message indicating DirectBedrockLLM usage
- **Verify output**: Check S3 for extracted JSON files
- **Validate credentials**: Ensure no authentication errors in logs
- **Test pickling**: Verify multiprocessing works without serialization errors
## Comparison: Before vs After

### Previous Implementation (Problematic)

Issues:

- Client injection hacks in `llm_cache.py`
- Manual boto3 client creation bypassing proper credential management
- Monkey-patching to work around pickling issues
- Didn't respect IRSA/SSO authentication
- Fragile and hard to maintain
### Current Implementation (Clean)

Benefits:

- Clean separation of concerns
- Each LLM class manages its own client
- Proper credential management through `GraphRAGConfig.session`
- No hacks or workarounds
- Proper pickling support via `__getstate__`/`__setstate__`
- Works seamlessly with IRSA in EKS and SSO locally
- Extensible: easy to add more models or custom LLM implementations
- Maintainable architecture
## Adding New Models

To add support for new models that aren't in LlamaIndex's supported list:

1. **Add to the `NOVA_2_MODELS` list** (or create a new list):

   ```python
   # In config.py
   NEW_MODELS = [
       'amazon.new-model-v1:0',
       'us.amazon.new-model-v1:0',
   ]
   ```

2. **Update the `_to_llm()` logic:**

   ```python
   if model_id in NOVA_2_MODELS or model_id in NEW_MODELS:
       return DirectBedrockLLM(...)
   ```

3. **Update IAM permissions if needed:**

   ```python
   resources=[
       "arn:aws:bedrock:*::inference-profile/us.amazon.new-model-v1:0",
   ]
   ```

## Troubleshooting
### Model Not Found Error

**Symptom:** `ValueError: Model 'amazon.nova-2-lite-v1:0' is not supported`

**Solution:** Ensure the model is in the `NOVA_2_MODELS` list and that you're using the inference profile format (`us.amazon.nova-2-lite-v1:0`).
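As an illustration of the second half of the fix, a hypothetical helper that converts a bare Nova 2 model ID to its inference profile form:

```python
# Hypothetical helper: prepend the regional prefix ("us.") to get the
# inference profile ID required for on-demand throughput.
def to_inference_profile(model_id: str, region_prefix: str = "us") -> str:
    if model_id.startswith(f"{region_prefix}."):
        return model_id  # already in inference profile form
    return f"{region_prefix}.{model_id}"

to_inference_profile("amazon.nova-2-lite-v1:0")  # → "us.amazon.nova-2-lite-v1:0"
```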
### Pickling Errors

**Symptom:** `TypeError: cannot pickle 'botocore.client.BedrockRuntime' object`

**Solution:** Verify `DirectBedrockLLM` is being used (check logs for the "Using DirectBedrockLLM" message).
### Authentication Errors

**Symptom:** `UnauthorizedOperation` or `AccessDenied`

**Solution:**

- Local: Run `aws sso login --profile primary`
- EKS: Verify the IAM role has correct permissions and the service account annotation is present
### Cross-Region Access Denied

**Symptom:** `AccessDenied` when using inference profiles

**Solution:** Ensure the IAM policy includes both ARN patterns:

- `arn:aws:bedrock:*::inference-profile/*`
- `arn:aws:bedrock:*:{account}:inference-profile/*`
## Files Modified

### Core Implementation

- `lexical-graph/src/graphrag_toolkit/lexical_graph/bedrock_llm.py` - NEW: DirectBedrockLLM class
- `lexical-graph/src/graphrag_toolkit/lexical_graph/config.py` - UPDATED: Model selection logic
- `lexical-graph/src/graphrag_toolkit/lexical_graph/__init__.py` - UPDATED: Export DirectBedrockLLM
- `lexical-graph/src/graphrag_toolkit/lexical_graph/utils/llm_cache.py` - FIXED: Removed client injection hack
- `lexical-graph/src/graphrag_toolkit/lexical_graph/utils/bedrock_patch.py` - DELETED: Obsolete monkey-patch approach
### Infrastructure

- `infrastructure/platform/stacks/argo_workflow_access_stack.py` - UPDATED: IAM permissions
- `infrastructure/argo-workflows/templates/extract-bee-test-workflow.yaml` - UPDATED: Use Nova 2 model
- `infrastructure/post-deployment/scripts/images/refresh-lexical-graph-bee.sh` - Build script
## Conclusion
The Nova 2 model support implementation provides a clean, maintainable solution for using Amazon's latest models in the lexical-graph toolkit. By implementing a custom LLM class that bypasses LlamaIndex's model validation while maintaining compatibility with the LlamaIndex interface, we achieve:
- Full support for Nova 2 series models
- Proper credential management (IRSA/SSO)
- Multiprocessing compatibility
- Clean architecture without hacks
- Easy extensibility for future models
This approach is significantly better than the previous implementation and provides a solid foundation for supporting new Bedrock models as they are released.