Llama
Unstable API
0.8.0
@project-lakechain/bedrock-text-processors
The Llama text processor allows you to leverage the Llama family of large-language models provided by Meta on Amazon Bedrock within your pipelines. Using this construct, you can apply prompt engineering techniques to transform text documents, including text summarization, text translation, information extraction, and more!
📝 Text Generation
To start using Llama models in your pipelines, you import the LlamaTextProcessor
construct in your CDK stack, and specify the text model you want to use.
💁 The below example demonstrates how to use the Llama text processor to summarize input documents uploaded to an S3 bucket using the Llama 3 70B model.
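The sketch below illustrates such a pipeline: an S3 event trigger feeds uploaded documents into the Llama text processor. The builder calls, identifiers, and the `withPrompt` method are assumptions based on the builder pattern used across Lakechain middlewares; refer to the construct's API reference for the exact surface.

```typescript
import * as s3 from 'aws-cdk-lib/aws-s3';
import { Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { CacheStorage } from '@project-lakechain/core';
import { S3EventTrigger } from '@project-lakechain/s3-event-trigger';
import { LlamaTextProcessor, LlamaModel } from '@project-lakechain/bedrock-text-processors';

export class SummarizationStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // The bucket monitored for new documents.
    const bucket = new s3.Bucket(this, 'Bucket');

    // Cache storage shared by the middlewares in the pipeline.
    const cache = new CacheStorage(this, 'Cache');

    // Emits a document event whenever an object is uploaded.
    const trigger = new S3EventTrigger.Builder()
      .withScope(this)
      .withIdentifier('Trigger')
      .withCacheStorage(cache)
      .withBucket(bucket)
      .build();

    // Summarizes each uploaded document using Llama 3 70B.
    trigger.pipe(new LlamaTextProcessor.Builder()
      .withScope(this)
      .withIdentifier('LlamaTextProcessor')
      .withCacheStorage(cache)
      .withModel(LlamaModel.LLAMA3_70B_INSTRUCT_V1)
      .withPrompt('Give a summary of the following document.')
      .build());
  }
}
```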
🤖 Model Selection
You can select the specific Llama model to use with this middleware using the .withModel
API.
💁 You can choose amongst the following models — see the Bedrock documentation for more information.
Model Name | Model identifier |
---|---|
LLAMA2_13B_CHAT_V1 | meta.llama2-13b-chat-v1 |
LLAMA2_70B_CHAT_V1 | meta.llama2-70b-chat-v1 |
LLAMA3_8B_INSTRUCT_V1 | meta.llama3-8b-instruct-v1:0 |
LLAMA3_70B_INSTRUCT_V1 | meta.llama3-70b-instruct-v1:0 |
LLAMA3_1_8B_INSTRUCT_V1 | meta.llama3-1-8b-instruct-v1:0 |
LLAMA3_1_70B_INSTRUCT_V1 | meta.llama3-1-70b-instruct-v1:0 |
LLAMA3_1_405B_INSTRUCT_V1 | meta.llama3-1-405b-instruct-v1:0 |
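For instance, selecting a model could look like the following sketch, assuming `cache` and `source` are defined elsewhere in your stack and that model identifiers are exposed on a `LlamaModel` enumeration.

```typescript
import { LlamaTextProcessor, LlamaModel } from '@project-lakechain/bedrock-text-processors';

const processor = new LlamaTextProcessor.Builder()
  .withScope(this)
  .withIdentifier('LlamaTextProcessor')
  .withCacheStorage(cache)
  .withSource(source)
  // Selects the Llama 3.1 8B instruct model.
  .withModel(LlamaModel.LLAMA3_1_8B_INSTRUCT_V1)
  .build();
```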
🌐 Region Selection
You can specify the AWS region in which you want to invoke Amazon Bedrock using the .withRegion
API. This can be helpful if Amazon Bedrock is not yet available in your deployment region.
💁 By default, the middleware will use the current region in which it is deployed.
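As a sketch, overriding the invocation region could look like this (the surrounding builder calls and identifiers are assumptions; `us-east-1` is an arbitrary example region).

```typescript
import { LlamaTextProcessor, LlamaModel } from '@project-lakechain/bedrock-text-processors';

const processor = new LlamaTextProcessor.Builder()
  .withScope(this)
  .withIdentifier('LlamaTextProcessor')
  .withCacheStorage(cache)
  .withSource(source)
  .withModel(LlamaModel.LLAMA3_70B_INSTRUCT_V1)
  // Invoke Amazon Bedrock in us-east-1 regardless of
  // the region the middleware is deployed in.
  .withRegion('us-east-1')
  .build();
```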
⚙️ Model Parameters
You can optionally forward specific parameters to the underlying LLM using the .withModelParameters
method. Below is a description of the supported parameters.
💁 See the Bedrock Inference Parameters for more information on the parameters supported by the different models.
Parameter | Description | Min | Max | Default |
---|---|---|---|---|
temperature | Controls the randomness of the generated text. | 0 | 1 | 0.5 |
maxTokens | The maximum number of tokens to generate. | 1 | 2048 | 512 |
topP | The cumulative probability of the top tokens to sample from. | 0 | 1 | 0.9 |
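Passing these parameters could look like the following sketch; the parameter object shape shown here is an assumption matching the table above, and `cache` and `source` are presumed defined elsewhere in the stack.

```typescript
import { LlamaTextProcessor, LlamaModel } from '@project-lakechain/bedrock-text-processors';

const processor = new LlamaTextProcessor.Builder()
  .withScope(this)
  .withIdentifier('LlamaTextProcessor')
  .withCacheStorage(cache)
  .withSource(source)
  .withModel(LlamaModel.LLAMA3_70B_INSTRUCT_V1)
  .withModelParameters({
    temperature: 0.3, // Lower randomness for factual summaries.
    maxTokens: 1024   // Allow longer generated outputs.
  })
  .build();
```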
🏗️ Architecture
This middleware is based on a Lambda compute running on an ARM64 architecture, and integrates with Amazon Bedrock to generate text based on the given prompt and input documents.
🏷️ Properties
Supported Inputs
Mime Type | Description |
---|---|
text/plain | UTF-8 text documents. |
text/markdown | Markdown documents. |
text/csv | CSV documents. |
text/html | HTML documents. |
application/x-subrip | SubRip subtitles. |
text/vtt | Web Video Text Tracks (WebVTT) subtitles. |
application/json | JSON documents. |
application/json+scheduler | Used by the Scheduler middleware. |
Supported Outputs
Mime Type | Description |
---|---|
text/plain | UTF-8 text documents. |
Supported Compute Types
Type | Description |
---|---|
CPU | This middleware only supports CPU compute. |
📖 Examples
- Llama Summarization Pipeline - Builds a pipeline for text summarization using Llama models on Amazon Bedrock.