OpenSearch Vectors
Unstable API
0.8.0
@project-lakechain/opensearch-vector-storage-connector
The OpenSearch vector storage connector enables developers to automatically index document events and their associated vector embeddings into an Amazon OpenSearch Service domain or an Amazon OpenSearch Serverless collection.
🗄️ Indexing Documents
To use the OpenSearch vector storage connector, import it in your CDK stack and connect it to a data source that provides document embeddings.
💁 You specify an index definition describing the index that the connector will create in your OpenSearch database to store document events and embeddings.
Index Definition
The index definition allows you to configure the index attributes that will be used by the connector. Below is a description of the attributes that can be configured.
Attribute | Description |
---|---|
indexName | The name of the index to create. |
knnMethod | The KNN method (only hnsw is currently supported). |
knnEngine | The KNN engine (faiss or nmslib). |
spaceType | The space type (l2, l1, innerproduct, cosinesimil, linf). |
dimensions | The number of dimensions of the vectors. |
parameters | Additional parameters for the KNN method (for example, ef_construction and m for hnsw). |
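To make the attributes above more concrete, the sketch below shows how such a definition could translate into an OpenSearch k-NN index creation body. It is a minimal illustration, not the connector's actual implementation: the `IndexDefinition` type, the `toCreateIndexBody` helper, the `embeddings` field name, and the sample values are all assumptions introduced here. Only the `knn_vector` mapping structure comes from OpenSearch itself.

```typescript
// Hypothetical types mirroring the attributes in the table above.
type KnnEngine = 'faiss' | 'nmslib';
type SpaceType = 'l2' | 'l1' | 'innerproduct' | 'cosinesimil' | 'linf';

interface IndexDefinition {
  indexName: string;
  knnMethod: 'hnsw';
  knnEngine: KnnEngine;
  spaceType: SpaceType;
  dimensions: number;
  parameters: Record<string, number>;
}

// Builds a request body for creating a k-NN index in OpenSearch.
// The vector field name ("embeddings") is an assumption for illustration.
function toCreateIndexBody(def: IndexDefinition, vectorField = 'embeddings') {
  if (!Number.isInteger(def.dimensions) || def.dimensions <= 0) {
    throw new Error('dimensions must be a positive integer');
  }
  return {
    settings: { index: { knn: true } },
    mappings: {
      properties: {
        [vectorField]: {
          type: 'knn_vector',
          dimension: def.dimensions,
          method: {
            name: def.knnMethod,
            engine: def.knnEngine,
            space_type: def.spaceType,
            parameters: def.parameters
          }
        }
      }
    }
  };
}

// Sample values chosen for illustration only.
const definition: IndexDefinition = {
  indexName: 'vector-index',
  knnMethod: 'hnsw',
  knnEngine: 'nmslib',
  spaceType: 'l2',
  dimensions: 1536,
  parameters: { ef_construction: 512, m: 16 }
};

const body = toCreateIndexBody(definition);
console.log(JSON.stringify(body, null, 2));
```

The dimensions value must match the size of the vectors produced by the embedding model feeding the connector, since OpenSearch rejects vectors whose length differs from the mapped dimension.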
🌐 Endpoints
This middleware supports instances of IDomain, ICollection, or an Amazon OpenSearch Serverless CfnCollection, which you can pass to the withEndpoint method.
🏗️ Architecture
This middleware uses a Lambda function to index documents in batches into an OpenSearch domain or an OpenSearch Serverless collection.
💁 By default, this connector indexes documents in batches of up to 10 and waits up to 20 seconds to fill a batch.
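The effect of combining a batch size with a batching window can be illustrated with a small simulation. This is a sketch under stated assumptions, not the connector's code: the `batchBySizeOrWindow` function and the `Timed` type are hypothetical, and the real batching is handled by the service that invokes the Lambda function. The simulation assumes events are sorted by arrival time and closes a batch when it reaches 10 items or when 20 seconds have passed since its first item.

```typescript
// Hypothetical wrapper pairing an item with its arrival time.
interface Timed<T> {
  item: T;
  arrivedAtMs: number;
}

// Groups events (sorted by arrival time) into batches that close when
// they reach maxSize items or when windowMs has elapsed since the
// first item of the batch arrived.
function batchBySizeOrWindow<T>(
  events: Timed<T>[],
  maxSize = 10,
  windowMs = 20_000
): T[][] {
  const batches: T[][] = [];
  let current: T[] = [];
  let windowStart = 0;

  for (const { item, arrivedAtMs } of events) {
    // Close the pending batch if its window has expired.
    if (current.length > 0 && arrivedAtMs - windowStart >= windowMs) {
      batches.push(current);
      current = [];
    }
    // A new batch starts its window with its first item.
    if (current.length === 0) {
      windowStart = arrivedAtMs;
    }
    current.push(item);
    // Close the batch as soon as it is full.
    if (current.length === maxSize) {
      batches.push(current);
      current = [];
    }
  }
  // Flush whatever remains at the end of the stream.
  if (current.length > 0) {
    batches.push(current);
  }
  return batches;
}

// Twelve documents arriving one second apart, then a late one at t = 60s.
const arrivals: Timed<string>[] = [
  ...Array.from({ length: 12 }, (_, i) => ({ item: `doc-${i}`, arrivedAtMs: i * 1000 })),
  { item: 'doc-late', arrivedAtMs: 60_000 }
];

const batchSizes = batchBySizeOrWindow(arrivals).map((b) => b.length);
console.log(batchSizes); // [ 10, 2, 1 ]
```

In this run the first batch closes because it is full, the second because its window expired before the late document arrived, and the third at the end of the stream. A larger batch reduces the number of indexing requests, while a shorter window reduces the delay before a document becomes searchable.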
🏷️ Properties
Supported Inputs
Mime Type | Description |
---|---|
*/* | This middleware supports any type of document. Note that documents whose metadata contains no embeddings are filtered out. |
Supported Outputs
This middleware does not produce any output.
Supported Compute Types
Type | Description |
---|---|
CPU | This middleware only supports CPU compute. |
📖 Examples
- Bedrock OpenSearch Pipeline - An example showcasing an embedding pipeline using Amazon Bedrock and OpenSearch.
- Cohere OpenSearch Pipeline - An example showcasing an embedding pipeline using Cohere models on Bedrock and OpenSearch.