Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is a technique that enhances language models by combining retrieval and generation. Instead of relying solely on pre-trained knowledge, RAG first retrieves relevant external documents (e.g., from a database, search engine, or vector store) and then uses them to generate more accurate and context-aware responses.
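The retrieve-then-generate pattern can be sketched in a few lines. This is an illustrative toy, not LISA's implementation: the word-overlap retriever stands in for a real embedding model and vector store, and `build_prompt` stands in for however the context is actually passed to the model.

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    # Toy retriever: rank documents by word overlap with the query.
    # A real RAG system compares embedding vectors in a vector store instead.
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    # Augment the generation prompt with the retrieved context before
    # handing it to the language model.
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query
```

The prompt built from the retrieved context is then sent to the language model, which grounds its answer in that context rather than in pre-trained knowledge alone.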
Customers can load files into vector stores configured with LISA in two ways: manually via the chatbot user interface (UI), or via an automated ingestion pipeline.
Configuration
Chat UI
Files loaded via the chatbot UI are limited by Lambda's service limits on document file size and volume.
Automated Document Vector Store Ingestion Pipeline
The Automated Document Ingestion Pipeline is designed to enhance LISA's RAG capabilities. Documents loaded via a pipeline are not subject to the Lambda limits noted above, further expanding LISA's ingestion capabilities. The pipeline supports the following document file types: PDF, DOCX, and plain text, with an individual file size limit of 50 MB.
This feature provides customers with a flexible, scalable solution for loading documents into configured vector stores.
Customers can set up multiple ingestion pipelines. For each pipeline, they define the vector store, the embedding model, and the ingestion trigger. Each pipeline can run on an event trigger or on a daily schedule. When a pipeline runs, pre-processing first converts files into the necessary format; processing then ingests the files with the specified embedding model and loads the data into the designated vector store. This feature leverages LISA's existing chunking and vectorizing capabilities.
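The chunking step above can be sketched as follows. `chunk_text` is a hypothetical helper, not LISA's actual implementation; it uses the pipeline defaults for chunkSize (512) and chunkOverlap (51) documented in the schema below.

```python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 51) -> list[str]:
    # Split a document into fixed-size chunks; consecutive chunks share
    # `chunk_overlap` characters so context is not lost at chunk boundaries.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

Each chunk would then be embedded with the pipeline's configured embedding model and written to the designated vector store.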
LISA also supports deleting files and content from a vector store, as well as listing the file names and dates ingested.
Benefits
- Flexibility: Accommodates various data sources and formats
- Efficiency: Streamlines the document ingestion process with pre-processing and intelligent indexing
- Customization: Allows customers to choose and easily switch between preferred vector stores
- Integration: Leverages existing LISA capabilities while extending functionality
Use Cases
- Large-scale document ingestion for enterprise customers
- Integration of external mission-critical data sources
- Customized knowledge base creation for specific industries or applications
This new Automated Document Ingestion Pipeline significantly expands LISA's capabilities, providing customers with a powerful tool for managing and utilizing their document-based knowledge more effectively.
NOTE: Event ingestion requires Amazon EventBridge to be enabled on the S3 bucket. You can enable this in the bucket's properties configuration page in the AWS Management Console.
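Besides the console, EventBridge notifications can be enabled with the AWS CLI, as sketched below. The bucket name is a placeholder, and the command requires appropriate S3 permissions.

```shell
# Enable EventBridge delivery for S3 object events on the ingestion bucket.
# An empty EventBridgeConfiguration turns the integration on.
aws s3api put-bucket-notification-configuration \
  --bucket my-ingestion-bucket \
  --notification-configuration '{"EventBridgeConfiguration": {}}'
```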
Configuration Example
RAG repositories and Automated Ingestion Pipelines are configurable through the chatbot web UI or programmatically via the API for managing RAG repositories, allowing customers to tailor the ingestion process to their specific needs.
Request Example:
curl -s -H 'Authorization: Bearer <your_token>' -XPOST -d @body.json https://<apigw_endpoint>/models
body.json:
{
  "ragConfig": {
    "repositoryId": "my-vector-store",
    "repositoryName": "My Vector Store",
    "type": "pgvector",
    "rdsConfig": {
      "username": "postgres"
    },
    "pipelines": [
      {
        "chunkOverlap": 51,
        "chunkSize": 256,
        "embeddingModel": "titan-embed-text-v1",
        "trigger": "event",
        "s3Bucket": "my-ingestion-bucket",
        "s3Prefix": "/some/path/to/watch"
      }
    ]
  }
}
Explanation of Response Fields:
- status: "success" if the state machine was started successfully.
- executionArn: The ARN of the state machine execution used to deploy the vector store.
LISA Configuration Schema
OpenSearchExistingClusterConfig
Object containing the following properties:
Property | Description | Type |
---|---|---|
endpoint (*) | Existing OpenSearch Cluster endpoint | string (min length: 1) |
(*) Required.
OpenSearchNewClusterConfig
Object containing the following properties:
Property | Description | Type | Default |
---|---|---|---|
dataNodes | The number of data nodes (instances) to use in the Amazon OpenSearch Service domain. | number (≥1) | 2 |
dataNodeInstanceType | The instance type for your data nodes | string | 'r7g.large.search' |
masterNodes | The number of instances to use for the master node | number (≥0) | 0 |
masterNodeInstanceType | The hardware configuration of the computer that hosts the dedicated master node | string | 'r7g.large.search' |
volumeSize | The size (in GiB) of the EBS volume for each data node. The minimum and maximum size of an EBS volume depends on the EBS volume type and the instance type to which it is attached. | number (≥20) | 20 |
volumeType | The EBS volume type to use with the Amazon OpenSearch Service domain | Native enum | 'gp3' |
multiAzWithStandby | Indicates whether Multi-AZ with Standby deployment option is enabled. | boolean | false |
All properties are optional.
RagRepositoryConfig
Configuration schema for a RAG repository. Defines settings for the backing vector store (OpenSearch or PGVector) and any ingestion pipelines.
Object containing the following properties:
Property | Description | Type | Default |
---|---|---|---|
repositoryId (*) | A unique identifier for the repository, used in API calls and the UI. It must be distinct across all repositories. | string (min length: 1, regex: /^[a-z0-9-]{1,63}/ , regex: /^(?!-).*(?<!-)$/ ) | |
repositoryName | The user-friendly name displayed in the UI. | string | |
type (*) | The vector store designated for this repository. | Native enum ('opensearch', 'pgvector') | |
opensearchConfig | Configuration for the OpenSearch vector store. | OpenSearchExistingClusterConfig or OpenSearchNewClusterConfig | |
rdsConfig | Configuration schema for RDS Instances needed for LiteLLM scaling or PGVector RAG operations. The optional fields can be omitted to create a new database instance, otherwise fill in all fields to use an existing database instance. | RdsInstanceConfig | |
pipelines | RAG ingestion pipelines for automated inclusion into a vector store from S3. | Array of RagRepositoryPipeline items | [] |
allowedGroups | The groups provided by the Identity Provider that have access to this repository. If no groups are specified, access is granted to everyone. | Array<string (_min length: 1_)> | [] |
(*) Required.
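The repositoryId constraints above can be checked with a short helper. `valid_repository_id` is a hypothetical illustration of the schema's two regexes, not part of LISA:

```python
import re


def valid_repository_id(repo_id: str) -> bool:
    # Mirrors the schema's constraints: 1-63 lowercase alphanumerics or
    # hyphens, with no leading or trailing hyphen.
    return (
        re.fullmatch(r"[a-z0-9-]{1,63}", repo_id) is not None
        and not repo_id.startswith("-")
        and not repo_id.endswith("-")
    )
```

For example, "my-vector-store" passes, while an empty string, a name with uppercase letters, or a name starting or ending with a hyphen is rejected.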
RagRepositoryPipeline
Object containing the following properties:
Property | Description | Type | Default |
---|---|---|---|
chunkSize | The size of the chunks used for document segmentation. | number | 512 |
chunkOverlap | The size of the overlap between chunks. | number | 51 |
embeddingModel (*) | The embedding model used for document ingestion in this pipeline. | string | |
s3Bucket (*) | The S3 bucket monitored by this pipeline for document processing. | string | |
s3Prefix (*) | The prefix within the S3 bucket monitored for document processing. | string | |
trigger | The event type that triggers document ingestion. | 'daily' or 'event' | 'event' |
autoRemove | Enable removal of documents from the vector store when they are deleted from S3. This also removes a file from S3 if it is deleted from the vector store through the API/UI. | boolean | true |
(*) Required.
RdsInstanceConfig
Configuration schema for RDS Instances needed for LiteLLM scaling or PGVector RAG operations.
The optional fields can be omitted to create a new database instance, otherwise fill in all fields to use an existing database instance.
Object containing the following properties:
Property | Description | Type | Default |
---|---|---|---|
username | The username used for database connection. | string | 'postgres' |
passwordSecretId | The SecretsManager Secret ID that stores the existing database password. | string | |
dbHost | The database hostname for the existing database instance. | string | |
dbName | The name of the database for the database instance. | string | 'postgres' |
dbPort | The port of the existing database instance or the port to be opened on the database instance. | number | 5432 |
All properties are optional.