Anthropic
Unstable API
0.8.0
@project-lakechain/bedrock-text-processors
The Anthropic text processor lets you use large language models provided by Anthropic on Amazon Bedrock within your pipelines. With this construct, you can apply prompt engineering techniques to transform text documents, for example through summarization, translation, or information extraction.
Text Generation
To start using Anthropic models in your pipelines, import the AnthropicTextProcessor construct in your CDK stack and specify the text model you want to use.
For example, the Anthropic text processor can be used to summarize input documents uploaded to an S3 bucket.
Tip - The Claude v3 family of models is multi-modal and supports both text and image documents as input.
Model Selection
You can select the specific Anthropic model to use with this middleware using the .withModel API.
You can choose among the following models; see the Bedrock documentation for more information.
Model Name | Model identifier |
---|---|
ANTHROPIC_CLAUDE_INSTANT_V1 | anthropic.claude-instant-v1 |
ANTHROPIC_CLAUDE_V2 | anthropic.claude-v2 |
ANTHROPIC_CLAUDE_V2_1 | anthropic.claude-v2:1 |
ANTHROPIC_CLAUDE_V3_HAIKU | anthropic.claude-3-haiku-20240307-v1:0 |
ANTHROPIC_CLAUDE_V3_SONNET | anthropic.claude-3-sonnet-20240229-v1:0 |
ANTHROPIC_CLAUDE_V3_5_SONNET | anthropic.claude-3-5-sonnet-20240620-v1:0 |
ANTHROPIC_CLAUDE_V3_OPUS | anthropic.claude-3-opus-20240229-v1:0 |
Region Selection
You can specify the AWS region in which to invoke Amazon Bedrock using the .withRegion API. This can be helpful if Amazon Bedrock is not yet available in your deployment region.
By default, the middleware uses the region in which it is deployed.
Model Parameters
You can forward specific parameters to the text models using the .withModelParameters method. The supported parameters are described below.
Parameter | Description | Min | Max | Default |
---|---|---|---|---|
temperature | Controls the randomness of the generated text. | 0 | 1 | N/A |
max_tokens | The maximum number of tokens to generate. | 1 | 4096 | 4096 |
top_p | The cumulative probability of the top tokens to sample from. | 0 | 1 | N/A |
top_k | The number of top tokens to sample from. | 1 | 100000000 | N/A |
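The ranges in the table can be checked before deployment. The following is a minimal sketch, not part of the middleware: `validateModelParameters` and `BOUNDS` are hypothetical names, the bounds are copied from the table above, and treating `max_tokens` and `top_k` as integers is an assumption based on both being token counts.

```typescript
// Parameters accepted by .withModelParameters, per the table above.
interface ModelParameters {
  temperature?: number;
  max_tokens?: number;
  top_p?: number;
  top_k?: number;
}

// Bounds copied from the table. The `integer` flag is an assumption.
const BOUNDS: Record<keyof ModelParameters, { min: number; max: number; integer: boolean }> = {
  temperature: { min: 0, max: 1, integer: false },
  max_tokens: { min: 1, max: 4096, integer: true },
  top_p: { min: 0, max: 1, integer: false },
  top_k: { min: 1, max: 100000000, integer: true }
};

// Hypothetical helper: returns one message per invalid parameter,
// or an empty array when every supplied parameter is within bounds.
function validateModelParameters(params: ModelParameters): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(BOUNDS) as (keyof ModelParameters)[]) {
    const value = params[key];
    if (value === undefined) {
      continue; // Omitted parameters fall back to the model defaults.
    }
    const { min, max, integer } = BOUNDS[key];
    if (!Number.isFinite(value) || value < min || value > max) {
      errors.push(`${key} must be between ${min} and ${max}, got ${value}`);
    } else if (integer && !Number.isInteger(value)) {
      errors.push(`${key} must be an integer, got ${value}`);
    }
  }
  return errors;
}
```

Calling `validateModelParameters({ temperature: 0.5, max_tokens: 4096 })` returns an empty array, while an out-of-range value such as `{ temperature: 1.5 }` yields a single error message.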
Prompts
The Anthropic text processor exposes an interface for specifying prompts to the underlying model. A prompt is a piece of text that guides the model on how to generate its output. With this middleware, you can define three types of prompts for the Anthropic model.
Type | Method | Optional | Description |
---|---|---|---|
User prompt | .withPrompt | No | The user prompt is text that provides instructions to the model. |
System prompt | .withSystemPrompt | Yes | The system prompt is text that provides context to the model. |
Assistant Prefill | .withAssistantPrefill | Yes | The assistant prefill is text that directly guides the model on how to further complete its output. |
For example, combining a user prompt with an assistant prefill can guide the model into outputting valid JSON.
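To make the three prompt types concrete, the sketch below shows how they could map onto a request body for the Anthropic Messages API on Amazon Bedrock. This illustrates the underlying request format and is not the middleware's actual implementation: `buildRequest`, `parseJsonCompletion`, and the `Prompts` type are hypothetical names, and placing the document before the user prompt is an assumption.

```typescript
interface TextBlock { type: 'text'; text: string; }
interface Message { role: 'user' | 'assistant'; content: TextBlock[]; }

// Subset of the Anthropic Messages request body used on Amazon Bedrock.
interface MessagesRequest {
  anthropic_version: string;
  max_tokens: number;
  system?: string;
  messages: Message[];
}

// Hypothetical container for the values passed to .withPrompt,
// .withSystemPrompt and .withAssistantPrefill.
interface Prompts {
  prompt: string;
  systemPrompt?: string;
  assistantPrefill?: string;
}

function buildRequest(document: string, prompts: Prompts, maxTokens = 4096): MessagesRequest {
  const messages: Message[] = [{
    role: 'user',
    content: [
      { type: 'text', text: document },
      { type: 'text', text: prompts.prompt }
    ]
  }];
  // A trailing assistant message acts as a prefill: the model continues from it.
  if (prompts.assistantPrefill) {
    messages.push({
      role: 'assistant',
      content: [{ type: 'text', text: prompts.assistantPrefill }]
    });
  }
  const request: MessagesRequest = {
    anthropic_version: 'bedrock-2023-05-31',
    max_tokens: maxTokens,
    messages
  };
  if (prompts.systemPrompt) {
    request.system = prompts.systemPrompt;
  }
  return request;
}

// The completion does not repeat the prefill, so it is prepended before parsing.
function parseJsonCompletion(prefill: string, completion: string): unknown {
  return JSON.parse(prefill + completion);
}
```

With a prefill of `{`, the model is steered to continue a JSON object, and `parseJsonCompletion('{', '"title": "Report"}')` reassembles the full document before parsing it.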
Composite Events
In addition to handling single documents, the Anthropic text processor also supports composite events as input. This means that it can take multiple text and image documents and compile them into a single input to the model.
This is useful in map-reduce pipelines where you use the Reducer to combine multiple semantically related documents into a single input, for example multiple pages of a PDF document that you would like the model to summarize as a whole while keeping the context between pages.
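As an illustration of what compiling several documents into a single input can look like, the sketch below turns a list of text and image documents into one array of Anthropic Messages content blocks. The `InputDocument` shape and the `toContentBlocks` helper are hypothetical simplifications, not Lakechain's composite event schema, and wrapping each text in a `<document>` tag is only one possible convention.

```typescript
// Content blocks accepted in an Anthropic Messages user message.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'image'; source: { type: 'base64'; media_type: string; data: string } };

// Hypothetical, simplified view of a document carried by a composite event.
type InputDocument =
  | { kind: 'text'; name: string; text: string }
  | { kind: 'image'; mimeType: string; base64: string };

// Compiles all documents, in order, followed by the user prompt.
function toContentBlocks(documents: InputDocument[], prompt: string): ContentBlock[] {
  const blocks: ContentBlock[] = documents.map((doc): ContentBlock =>
    doc.kind === 'text'
      ? { type: 'text', text: `<document name="${doc.name}">\n${doc.text}\n</document>` }
      : { type: 'image', source: { type: 'base64', media_type: doc.mimeType, data: doc.base64 } }
  );
  blocks.push({ type: 'text', text: prompt });
  return blocks;
}
```

Keeping the documents in their original order preserves the context between pages, and placing the prompt last lets the instruction refer to everything that precedes it.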
Architecture
This middleware runs on AWS Lambda using the ARM64 architecture, and integrates with Amazon Bedrock to generate text based on the given prompt and input documents.
Properties
Supported Inputs
The supported inputs depend on the selected model: the Claude v3 and v3.5 models are multi-modal and support both text and images, while the Claude Instant and Claude v2 models only support text. The following table lists the supported inputs for each model.
Model | Supported Inputs |
---|---|
ANTHROPIC_CLAUDE_INSTANT_V1 | Text |
ANTHROPIC_CLAUDE_V2 | Text |
ANTHROPIC_CLAUDE_V2_1 | Text |
ANTHROPIC_CLAUDE_V3_HAIKU | Text, Image |
ANTHROPIC_CLAUDE_V3_SONNET | Text, Image |
ANTHROPIC_CLAUDE_V3_5_SONNET | Text, Image |
ANTHROPIC_CLAUDE_V3_OPUS | Text, Image |
Text Inputs
Below is a list of supported text inputs.
Mime Type | Description |
---|---|
text/plain | UTF-8 text documents. |
text/markdown | Markdown documents. |
text/csv | CSV documents. |
text/html | HTML documents. |
application/x-subrip | SubRip subtitles. |
text/vtt | Web Video Text Tracks (WebVTT) subtitles. |
application/json | JSON documents. |
application/xml | XML documents. |
Image Inputs
Below is a list of supported image inputs.
Mime Type | Description |
---|---|
image/jpeg | JPEG images. |
image/png | PNG images. |
image/gif | GIF images. |
image/webp | WebP images. |
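The model and MIME type tables above can be combined into a single lookup, for instance to filter documents before they reach the processor. This is a minimal sketch: `acceptsInput` and the constant names are hypothetical, the values are copied from the tables, and composite events are deliberately left out.

```typescript
type ModelName =
  | 'ANTHROPIC_CLAUDE_INSTANT_V1'
  | 'ANTHROPIC_CLAUDE_V2'
  | 'ANTHROPIC_CLAUDE_V2_1'
  | 'ANTHROPIC_CLAUDE_V3_HAIKU'
  | 'ANTHROPIC_CLAUDE_V3_SONNET'
  | 'ANTHROPIC_CLAUDE_V3_5_SONNET'
  | 'ANTHROPIC_CLAUDE_V3_OPUS';

// Whether each model accepts images in addition to text.
const SUPPORTS_IMAGES: Record<ModelName, boolean> = {
  ANTHROPIC_CLAUDE_INSTANT_V1: false,
  ANTHROPIC_CLAUDE_V2: false,
  ANTHROPIC_CLAUDE_V2_1: false,
  ANTHROPIC_CLAUDE_V3_HAIKU: true,
  ANTHROPIC_CLAUDE_V3_SONNET: true,
  ANTHROPIC_CLAUDE_V3_5_SONNET: true,
  ANTHROPIC_CLAUDE_V3_OPUS: true
};

const TEXT_TYPES: string[] = [
  'text/plain', 'text/markdown', 'text/csv', 'text/html',
  'application/x-subrip', 'text/vtt', 'application/json', 'application/xml'
];

const IMAGE_TYPES: string[] = ['image/jpeg', 'image/png', 'image/gif', 'image/webp'];

// Hypothetical helper: true when the model can take a document of this MIME type.
function acceptsInput(model: ModelName, mimeType: string): boolean {
  if (TEXT_TYPES.indexOf(mimeType) !== -1) {
    return true; // Every listed model supports text inputs.
  }
  return IMAGE_TYPES.indexOf(mimeType) !== -1 && SUPPORTS_IMAGES[model];
}
```

For example, `acceptsInput('ANTHROPIC_CLAUDE_V2', 'image/png')` is false, while the same MIME type is accepted for `ANTHROPIC_CLAUDE_V3_HAIKU`.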
Composite Inputs
The middleware also supports composite events as an input, which can be used to combine multiple text and image documents into a single input for the model.
Mime Type | Description |
---|---|
application/cloudevents+json | Composite events emitted by the Reducer. |
Supported Outputs
Mime Type | Description |
---|---|
text/plain | UTF-8 text documents. |
Supported Compute Types
Type | Description |
---|---|
CPU | This middleware only supports CPU compute. |
Examples
- Claude Summarization Pipeline - Builds a pipeline for text summarization using Amazon Bedrock and Anthropic Claude.
- Audio Recording Summarization Pipeline - Builds a pipeline for summarizing audio recordings using Amazon Transcribe and Amazon Bedrock.