Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is a technique that enhances language models by combining retrieval and generation. Instead of relying solely on pre-trained knowledge, RAG first retrieves relevant external documents (e.g., from a database, search engine, or vector store) and then uses them to generate more accurate and context-aware responses.
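The retrieve-then-generate pattern can be sketched in a few lines. This is a toy word-overlap retriever standing in for a real embedding model and vector store (OpenSearch, PGVector, etc.); the function names are illustrative, not part of any LISA API.

```python
# Minimal sketch of retrieve-then-generate, with toy word-overlap scoring
# in place of real embedding similarity.
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many words they share with the query."""
    words = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved context so the model answers from it."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

A production system replaces `retrieve` with an embedding-model query against a vector store; the prompt-assembly step stays essentially the same.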
RAG Repositories and Collections
LISA RAG introduces a hierarchical architecture for managing RAG content through repositories and collections:
Repository: The top-level container that defines the underlying vector store implementation (OpenSearch, PGVector, or Bedrock Knowledge Base), embedding model, and access controls. Repositories are created and managed by administrators. Repository access can be restricted to specific enterprise groups.
Collection: Within repositories, collections support a logical grouping of documents. One repository can support many collections. Collection access can be restricted to specific enterprise groups. Collections enable flexible organization of content with their own chunking strategies, metadata tags, and access controls. Administrators create and manage collections via API or UI. Users can view and upload documents within a collection using LISA's Document Library and RAG file upload.
Architecture Overview
The repository-collection model provides a two-tier organizational structure analogous to filing cabinets (repositories) containing organized drawers (collections). This architecture enables:
- Multi-Backend Support: Unified interface across OpenSearch, PGVector, and AWS Bedrock Knowledge Base implementations
- Configuration Isolation: Each collection maintains independent chunking strategies, embedding models, and access controls
- Scalable Organization: Organize documents by department, project, content type, or security classification without infrastructure changes
- Backward Compatibility: Existing repositories automatically include a default collection based on the embedding model ID
Key Benefits
- Dynamic Management: Create, update, and delete collections via API without infrastructure changes
- Optimized Chunking: Configure chunking strategies per collection to match content type (legal documents, code, customer support tickets)
- Granular Access Control: Enforce user group-based permissions at both the repository and collection level
- Multi-tenancy: Within repositories, further manage access by restricting collection access (e.g., by enterprise groups for specific organizations, departments, or teams)
- Enhanced Metadata: Tag documents with collection-specific metadata for powerful filtering
- Flexible Embedding Models: Each collection can use its own embedding model, optimizing retrieval for specific document types
Document Ingestion Methods
Customers have two methods to load files into repositories configured with LISA:
- Manual Upload: Load files via the chat assistant user interface (UI) or the API
- Automated Pipeline: (admin only) Configure LISA's ingestion pipelines for automated document processing
Configuration
Chat Assistant UI
Files loaded via the chat assistant UI are limited in size and are processed through a batch job. The status of the job can be viewed within the RAG File Upload dialog. When uploading documents through the UI, you can select a specific collection within a repository. If no collection is specified, documents are ingested into the default collection, which uses the embedding model associated with the parent repository.
Automated Document Repository Ingestion Pipeline
LISA's automated document ingestion pipeline supports larger files and a broader set of file types. Supported file types include PDF, DOCX, and plain text. The individual file size limit is 50 MB. LISA's pipelines support fixed-size chunking or no chunking. For customers using Amazon Bedrock Knowledge Bases, LISA supports all chunking strategies offered by the service. LISA's automated ingestion pipelines provide customers with a flexible, scalable solution for loading documents into configured repositories and collections.
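A client can pre-check files against the limits stated above (PDF, DOCX, plain text; 50 MB per file) before uploading. The function name and the use of `.txt` for plain text are illustrative assumptions, not part of the LISA API.

```python
import os

# Limits stated in this section: 50 MB per file; PDF, DOCX, plain text.
MAX_BYTES = 50 * 1024 * 1024
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt"}  # .txt assumed for plain text

def eligible_for_pipeline(filename: str, size_bytes: int) -> bool:
    """Return True if the file passes the documented type and size limits."""
    ext = os.path.splitext(filename)[1].lower()
    return ext in ALLOWED_EXTENSIONS and 0 < size_bytes <= MAX_BYTES
```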
Customers can set up multiple ingestion pipelines for a repository. For each pipeline they define:
- The target repository and collection
- Embedding model (inherited from repository if not defined)
- Chunking strategy (can be customized per pipeline)
- Ingestion trigger (event-based or daily schedule)
- S3 bucket and prefix to monitor
Pipelines can be configured at both the repository level (for default collection ingestion) and at the collection level (for targeted ingestion). Each pipeline can run based on an event trigger or daily schedule. Pre-processing converts files into the necessary format, then processing ingests the files with the specified embedding model and loads the data into the designated collection within the repository.
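A minimal validation of the per-pipeline fields listed above can be sketched as follows; the trigger values ("event", "daily") and the required s3Bucket mirror the RagRepositoryPipeline schema later in this document, while the function itself is hypothetical, not part of the LISA API.

```python
# Illustrative check of a pipeline definition dict against the
# RagRepositoryPipeline schema: s3Bucket is required; trigger must be
# 'event' or 'daily' (defaulting to 'event').
def validate_pipeline(pipeline: dict) -> list[str]:
    errors = []
    if not pipeline.get("s3Bucket"):
        errors.append("s3Bucket is required")
    if pipeline.get("trigger", "event") not in ("event", "daily"):
        errors.append("trigger must be 'event' or 'daily'")
    return errors
```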
LISA also supports deleting files and content from repositories, as well as listing file names and ingestion dates. When autoRemove is enabled, deleting a document from the repository also removes it from S3, and vice versa.
Benefits
The automated ingestion pipeline provides:
- Flexibility: Accommodates various data sources and formats
- Efficiency: Streamlines the document ingestion process with pre-processing and intelligent indexing
- Customization: Allows customers to choose and easily switch between preferred vector stores
- Integration: Leverages existing LISA capabilities while extending functionality
Use Cases
Common use cases for automated ingestion include:
- Large-scale document ingestion for enterprise customers
- Integration of external mission-critical data sources
- Customized knowledge base creation for specific industries or applications
- Department or project-specific document collections with isolated access
- Content-type optimized chunking strategies (legal, technical, conversational)
NOTE: Event ingestion requires Amazon EventBridge to be enabled on the S3 bucket. You can enable this in the bucket's properties configuration page in the AWS Management Console.
Managing Collections
Collection Lifecycle
Collections can be created, updated, and deleted through the LISA UI or API. Each collection maintains:
- Chunking Strategy: Optimized for the content type (fixed size or none)
- Embedding Model: Inherited from repository or customized per collection
- Access Control: User group restrictions inherited from the repository or customized per collection
- Metadata Tags: Custom tags for organizing and filtering documents
- Privacy Settings: Collections can be marked as private for restricted visibility
- Ingestion Pipelines: Dedicated pipelines for automated document ingestion
Collections support flexible chunking configuration with multiple override levels:
- Default Strategy: Inherited from the repository configuration
- Collection Strategy: Override at the collection level for content-specific optimization
- Pipeline Strategy: Further override at the ingestion pipeline level
- API Override: Optionally allow a per-document chunking strategy via the API (controlled by the allowChunkingOverride flag)
Default Collections
Every repository includes a default collection based on the embedding model ID. This ensures backward compatibility with existing LISA deployments (pre v6.0). When no collection is specified during document ingestion or retrieval, the default collection is used.
Default collections provide:
- Automatic Creation: Generated automatically during repository creation with no additional configuration
- Zero Downtime Migration: Existing documents remain accessible through default collections without database migrations
- Optional Adoption: Collections are completely optional—repositories continue to function without explicit collection configuration
- Preserved Documents: All existing documents remain accessible through default collections after upgrade
Document Lifecycle Management
LISA implements intelligent document lifecycle management that respects how content is created and maintained:
- Ingestion Type Tracking: The system distinguishes between LISA-managed documents, pipeline-generated content, and user-managed documents in Bedrock Knowledge Bases
- Asynchronous Deletion: Collection deletion operations execute asynchronously with optimized cleanup strategies per repository type:
- OpenSearch: Drops the entire index before document deletion
- PGVector: Drops the collection table/schema
- Bedrock Knowledge Base: Performs bulk document deletion
- Document Preservation: User-managed documents in Bedrock Knowledge Bases are automatically preserved during collection operations, ensuring external content is not inadvertently removed
- Status Tracking: Collections maintain status indicators (ACTIVE, DELETE_IN_PROGRESS, DELETE_FAILED) for monitoring lifecycle operations
Collection Permissions
Collection access is controlled through user groups:
- Repository-level Groups: Collections inherit allowed groups from their parent repository by default
- Collection-level Groups: Collections can override with their own group restrictions for finer control
- Admin Access: Administrators have full access to all collections across all repositories
- User Collection Creation: Repositories can be configured to allow or restrict user-created collections via the allowUserCollections flag
Configuration Examples
RAG repositories and collections are configurable through the chat assistant web UI or programmatically via the API, allowing customers to tailor the ingestion process to their specific needs.
Creating a Repository
Repositories are created by administrators and define the underlying vector store implementation, embedding model, and default access controls.
Request Example:
```bash
curl -s -H 'Authorization: Bearer <your_token>' -XPOST -d @repository.json https://<apigw_endpoint>/repository
```

repository.json:

```json
{
  "repositoryId": "my-rag-repository",
  "repositoryName": "My RAG Repository",
  "type": "pgvector",
  "embeddingModelId": "amazon.titan-embed-text-v1",
  "rdsConfig": {
    "username": "postgres"
  },
  "allowedGroups": ["engineering", "data-science"],
  "metadata": {
    "tags": ["production", "customer-docs"]
  },
  "allowUserCollections": true,
  "pipelines": [
    {
      "chunkingStrategy": {
        "type": "fixed",
        "size": 512,
        "overlap": 51
      },
      "trigger": "event",
      "s3Bucket": "my-ingestion-bucket",
      "s3Prefix": "documents/",
      "autoRemove": true
    }
  ]
}
```

Response Fields:
- status: "success" if the state machine was started successfully
- executionArn: The state machine ARN used to deploy the repository
Creating a Collection
Collections can be created by users with appropriate permissions within an existing repository.
Request Example:
```bash
curl -s -H 'Authorization: Bearer <your_token>' -XPOST -d @collection.json https://<apigw_endpoint>/repository/my-rag-repository/collection
```

collection.json:

```json
{
  "name": "Legal Documents",
  "description": "Collection for legal contracts and agreements",
  "chunkingStrategy": {
    "type": "fixed",
    "size": 512,
    "overlap": 51
  },
  "allowChunkingOverride": false,
  "metadata": {
    "tags": ["legal", "contracts", "confidential"]
  },
  "allowedGroups": ["legal-team", "compliance"],
  "private": true,
  "pipelines": [
    {
      "s3Bucket": "legal-docs-bucket",
      "s3Prefix": "contracts/",
      "trigger": "event",
      "autoRemove": true
    }
  ]
}
```

Response Fields:
- collectionId: Unique identifier for the created collection (UUID)
- repositoryId: Parent repository identifier
- name: User-friendly collection name
- embeddingModel: Inherited from parent repository
- createdBy: User ID of collection creator
- createdAt: Creation timestamp (ISO 8601)
- status: Collection status (ACTIVE)
Listing Collections
Retrieve all collections accessible to the current user within a repository.
Request Example:
```bash
curl -s -H 'Authorization: Bearer <your_token>' \
  'https://<apigw_endpoint>/repository/my-rag-repository/collections?page=1&pageSize=20&sortBy=name&sortOrder=asc'
```

Query Parameters:
- page: Page number (default: 1)
- pageSize: Items per page (default: 20, max: 100)
- filter: Filter by name or description (optional)
- sortBy: Sort field - name, createdAt, or updatedAt (default: createdAt)
- sortOrder: Sort order - asc or desc (default: desc)
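Assembling the listing URL from these parameters can be sketched as follows, applying the documented defaults and the pageSize cap of 100; the helper function is illustrative, not part of any LISA client library.

```python
from urllib.parse import urlencode

# Build the collections listing URL with the documented defaults
# (page=1, pageSize=20 capped at 100, sortBy=createdAt, sortOrder=desc).
def collections_url(endpoint: str, repository_id: str, page: int = 1,
                    page_size: int = 20, sort_by: str = "createdAt",
                    sort_order: str = "desc") -> str:
    query = urlencode({"page": page, "pageSize": min(page_size, 100),
                       "sortBy": sort_by, "sortOrder": sort_order})
    return f"{endpoint}/repository/{repository_id}/collections?{query}"
```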
UI Components
RAG Repository Management (Admin)
Administrators access repository management through the Admin Configurations page. This interface provides:
- Create, update, and delete repositories
- Configure vector store implementation (OpenSearch, PGVector, Bedrock Knowledge Base)
- Set default embedding models and chunking strategies
- Define repository-level access controls
- Configure metadata tags
- Enable or disable user-created collections
RAG Collection Library
The Collection Library is accessible from the Document Library page and provides:
- Browse collections within accessible repositories
- Create new collections (if permitted)
- Update collection settings
- Delete collections (if permitted)
- View collection metadata and statistics
- Filter documents within a collection
Collections are organized in a tree structure, similar to folders, making it intuitive to navigate and manage documents.
Chat Interface
The chat interface includes repository and collection selection:
- Select a repository from available options
- Choose a specific collection within the repository
- Default collection is used if none specified
- Embedding model is automatically determined by the collection
Document Library
The Document Library displays documents organized by collection:
- Tree view showing repository → collection → documents hierarchy
- Filter and search within specific collections
- Upload documents to selected collections
- View document metadata including collection assignment
- Delete documents with optional S3 removal (when autoRemove is enabled)
LISA Configuration Schema
BedrockDataSource
Object containing the following properties:
| Property | Description | Type |
|---|---|---|
id (*) | The ID of the Bedrock Knowledge Base data source | string |
name (*) | The name of the Bedrock Knowledge Base data source | string |
s3Uri (*) | The S3 URI of the data source | string (regex: /^s3:\/\/[a-z0-9][a-z0-9.-]*[a-z0-9](\/.*)?$/) |
(*) Required.
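The s3Uri pattern from the table above can be exercised directly with Python's re module; the helper function is illustrative.

```python
import re

# The s3Uri regex from the BedrockDataSource schema: lowercase bucket
# name bounded by alphanumerics, followed by an optional key.
S3_URI = re.compile(r"^s3://[a-z0-9][a-z0-9.-]*[a-z0-9](/.*)?$")

def is_valid_s3_uri(uri: str) -> bool:
    return S3_URI.match(uri) is not None
```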
BedrockKnowledgeBaseInstanceConfig
Object containing the following properties:
| Property | Description | Type |
|---|---|---|
knowledgeBaseId (*) | The ID of the Bedrock Knowledge Base | string |
dataSources (*) | Array of data sources in this Knowledge Base | Array of at least 1 BedrockDataSource items |
(*) Required.
ChunkingStrategy
Union of the following possible types:
FixedSizeChunkingStrategy
Object containing the following properties:
| Property | Description | Type | Default |
|---|---|---|---|
type (*) | Fixed size chunking strategy type | 'fixed' | |
size | Size of each chunk in characters | number (≥100, ≤10000) | 512 |
overlap | Overlap between chunks in characters | number (≥0) | 51 |
(*) Required.
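A minimal character-based implementation of the fixed-size strategy above (defaults: size 512, overlap 51) looks like this; LISA's actual chunker may differ in how it treats boundaries and whitespace.

```python
# Split text into chunks of `size` characters, each sharing `overlap`
# characters with the previous chunk.
def fixed_chunks(text: str, size: int = 512, overlap: int = 51) -> list[str]:
    if not 0 <= overlap < size:
        raise ValueError("require 0 <= overlap < size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```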
NoneChunkingStrategy
Object containing the following properties:
| Property | Description | Type |
|---|---|---|
type (*) | No chunking - documents ingested as-is | 'none' |
(*) Required.
OpenSearchExistingClusterConfig
Object containing the following properties:
| Property | Description | Type |
|---|---|---|
endpoint (*) | Existing OpenSearch Cluster endpoint | string (min length: 1) |
(*) Required.
OpenSearchNewClusterConfig
Object containing the following properties:
| Property | Description | Type | Default |
|---|---|---|---|
dataNodes | The number of data nodes (instances) to use in the Amazon OpenSearch Service domain. | number (≥1) | 2 |
dataNodeInstanceType | The instance type for your data nodes | string | 'r7g.large.search' |
masterNodes | The number of instances to use for the master node | number (≥0) | 0 |
masterNodeInstanceType | The hardware configuration of the computer that hosts the dedicated master node | string | 'r7g.large.search' |
volumeSize | The size (in GiB) of the EBS volume for each data node. The minimum and maximum size of an EBS volume depends on the EBS volume type and the instance type to which it is attached. | number (≥20) | 20 |
volumeType | The EBS volume type to use with the Amazon OpenSearch Service domain | Native enum | 'gp3' |
multiAzWithStandby | Indicates whether Multi-AZ with Standby deployment option is enabled. | boolean | false |
All properties are optional.
RagRepositoryConfig
Configuration schema for a RAG repository. Defines the vector store implementation (OpenSearch, PGVector, or Bedrock Knowledge Base), embedding model, access controls, and ingestion pipelines.
Object containing the following properties:
| Property | Description | Type | Default |
|---|---|---|---|
repositoryId (*) | A unique identifier for the repository, used in API calls and the UI. It must be distinct across all repositories. | string (min length: 1, regex: /^[a-z0-9-]{3,20}/, regex: /^(?!-).*(?<!-)$/) | |
repositoryName | The user-friendly name displayed in the UI. | string | |
description | Description of the repository. | string | |
embeddingModelId | The default embedding model used when this repository is selected. | string | |
type (*) | The vector store designated for this repository. | Native enum | |
opensearchConfig | OpenSearchExistingClusterConfig or OpenSearchNewClusterConfig | ||
rdsConfig | Configuration schema for RDS Instances needed for LiteLLM scaling or PGVector RAG operations. The optional fields can be omitted to create a new database instance, otherwise fill in all fields to use an existing database instance. | RdsInstanceConfig | |
bedrockKnowledgeBaseConfig | BedrockKnowledgeBaseInstanceConfig | ||
pipelines | Rag ingestion pipeline for automated inclusion into a vector store from S3 | Array of RagRepositoryPipeline items | [] |
allowedGroups | The groups provided by the Identity Provider that have access to this repository. If no groups are specified, access is granted to everyone. | Array<string (min length: 1)> | [] |
metadata | Metadata for the repository including tags and custom fields. | RagRepositoryMetadata | |
status | Current deployment status of the repository | Native enum | |
(*) Required.
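The repositoryId constraints from the table above can be checked as follows, read as: lowercase alphanumerics and hyphens, 3 to 20 characters, no leading or trailing hyphen. Using fullmatch anchors the length bound, a slightly stricter reading than the unanchored source regex.

```python
import re

def is_valid_repository_id(repo_id: str) -> bool:
    # Character set and length from the schema; the lookbehind-based
    # hyphen rule is expressed here as explicit startswith/endswith checks.
    return (re.fullmatch(r"[a-z0-9-]{3,20}", repo_id) is not None
            and not repo_id.startswith("-")
            and not repo_id.endswith("-"))
```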
RagRepositoryMetadata
Object containing the following properties:
| Property | Description | Type | Default |
|---|---|---|---|
tags | Tags for categorizing and organizing the repository. | Array<string> | [] |
customFields | Custom metadata fields for the repository. | Object with dynamic keys of type string and values of type any (optional & nullable) |
All properties are optional.
RagRepositoryPipeline
Object containing the following properties:
| Property | Description | Type | Default |
|---|---|---|---|
chunkSize | The size of the chunks used for document segmentation. | number | 512 |
chunkOverlap | The size of the overlap between chunks. | number | 51 |
chunkingStrategy | Chunking strategy for documents in this pipeline. | ChunkingStrategy | |
embeddingModel | The embedding model used for document ingestion in this pipeline. | string | |
collectionId | The collection ID to ingest documents into. | string | |
s3Bucket (*) | The S3 bucket monitored by this pipeline for document processing. | string | |
s3Prefix | The prefix within the S3 bucket monitored for document processing. | string (regex: /^(?!.*(?:^|\/)\.\.?(\/|$)).*/, regex: /^([a-zA-Z0-9!_.*'()/=-]+\/)*[a-zA-Z0-9!_.*'()/=-]*$/, regex: /^(?!\/).*/) | '' |
trigger | The event type that triggers document ingestion. | 'daily' | 'event' | 'event' |
autoRemove | Enable removal of document from vector store when deleted from S3. This will also remove the file from S3 if file is deleted from vector store through API/UI. | boolean | true |
(*) Required.
RdsInstanceConfig
Configuration schema for RDS Instances needed for LiteLLM scaling or PGVector RAG operations.
The optional fields can be omitted to create a new database instance, otherwise fill in all fields to use an existing database instance.
Object containing the following properties:
| Property | Description | Type | Default |
|---|---|---|---|
username | The username used for database connection. | string | 'postgres' |
passwordSecretId | The SecretsManager Secret ID that stores the existing database password. | string | |
dbHost | The database hostname for the existing database instance. | string | |
dbName | The name of the database for the database instance. | string | 'postgres' |
dbPort | The port of the existing database instance or the port to be opened on the database instance. | number | 5432 |
All properties are optional.