generative-ai-cdk-constructs

BedrockBatchSfn


Stability: Experimental

All classes are under active development and subject to non-backward compatible changes or removal in any future version. These are not subject to the Semantic Versioning model. This means that while you may use them, you may need to update your source code when upgrading to a newer version of this package.


| Language | Package |
|---|---|
| TypeScript | @cdklabs/generative-ai-cdk-constructs |
| Python | cdklabs.generative_ai_cdk_constructs |


Overview

The BedrockBatchSfn CDK construct simplifies batch inference workflows with Amazon Bedrock by providing a pattern for processing large volumes of data asynchronously. It orchestrates batch processing with AWS Step Functions and AWS Lambda, and automatically handles job creation, status monitoring, and result collection. The construct is particularly valuable for cost-sensitive workloads such as bulk text analysis, embeddings generation, and document summarization, because it takes advantage of Bedrock’s 50% pricing discount for batch operations. By abstracting away asynchronous model invocation and state management, it lets developers focus on application logic while the construct handles the infrastructure and workflow orchestration needed for reliable batch processing at scale.

Usage

This construct implements an AWS Step Functions StateMachineFragment which can be used in your state machines to manage Bedrock batch inference jobs.

It requires Amazon Simple Storage Service (S3) buckets for input and output manifests and an AWS Identity and Access Management (IAM) managed policy that allows inference. You can use a single bucket for both input and output. The policy must have the following permissions for the models and inference profiles you plan to use:

- bedrock:InvokeModel
- bedrock:CreateModelInvocationJob

Here is a minimal deployable pattern definition:

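import {BedrockBatchSfn} from "@cdklabs/generative-ai-cdk-constructs";
import {
  App,
  aws_iam as iam,
  aws_s3 as s3,
  aws_stepfunctions as sfn,
  Duration,
  Stack,
} from "aws-cdk-lib";

// The stack that holds the resources; the construct ID is only an example.
const app = new App();
const stack = new Stack(app, 'BedrockBatchStack');

// A single bucket serves as both the input and the output location.
const batchBucket = new s3.Bucket(stack, 'BedrockBatchBucket');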

const batchPolicy = new iam.ManagedPolicy(stack, 'BatchPolicy', {});

batchPolicy.addStatements(
  new iam.PolicyStatement({
    sid: 'Inference',
    actions: ['bedrock:InvokeModel', 'bedrock:CreateModelInvocationJob'],
    resources: [
      `arn:aws:bedrock:${stack.region}::foundation-model/*`,
    ],
  }),
);

const bedrockBatchSfnFragment = new BedrockBatchSfn(stack, 'AwsBedrockBatchSfn', {
  bedrockBatchInputBucket: batchBucket,
  bedrockBatchOutputBucket: batchBucket,
  bedrockBatchPolicy: batchPolicy,
  timeout: Duration.hours(48),
});

const inputState = new sfn.Pass(stack, 'InputState', {
  parameters: {
    job_name: 'test_job',
    manifest_keys: ['test_key.jsonl'],
    model_id: 'test.model-v1',
  },
});

const outputState = new sfn.Pass(stack, 'OutputState');

const failState = new sfn.Fail(stack, 'FailState', {
  causePath: sfn.JsonPath.stringAt('$.cause'),
  errorPath: sfn.JsonPath.stringAt('$.error'),
});

const chain = inputState
  .next(bedrockBatchSfnFragment)
  .next(outputState);

// Route errors from the fragment's task end states to the Fail state.
bedrockBatchSfnFragment.endStates.forEach((endState) => {
  if (endState instanceof sfn.TaskStateBase) {
    endState.addCatch(failState);
  }
});

const stateMachine = new sfn.StateMachine(stack, 'StateMachine', {
  definitionBody: sfn.DefinitionBody.fromChainable(chain),
});
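
The sample policy above grants the two actions on every foundation model in the stack's Region. If you would rather scope the policy to the specific models and inference profiles you plan to use, the resource ARNs could be built as in the sketch below. The helper names, the account ID, and the profile ID are hypothetical, and the inference-profile ARN shape is an assumption based on the general Bedrock ARN format, so check it against the Amazon Bedrock documentation before relying on it.

```typescript
// Hypothetical helpers; only the foundation-model ARN pattern appears in the sample above.
function foundationModelArn(region: string, modelId: string): string {
  // Foundation model ARNs leave the account field empty.
  return `arn:aws:bedrock:${region}::foundation-model/${modelId}`;
}

function inferenceProfileArn(region: string, accountId: string, profileId: string): string {
  // Assumed ARN shape for an inference profile owned by or visible to the account.
  return `arn:aws:bedrock:${region}:${accountId}:inference-profile/${profileId}`;
}

// Placeholder values for illustration only.
const scopedResources: string[] = [
  foundationModelArn('us-east-1', 'test.model-v1'),
  inferenceProfileArn('us-east-1', '111122223333', 'us.test.model-v1'),
];
```

The resulting array could replace the wildcard entry in the resources list of the policy statement.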

See the API documentation.
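
In the sample above, the Pass state hands the fragment an object with job_name, manifest_keys, and model_id. As a minimal sketch, that shape can be captured in a type and a small builder. The interface name, function name, and validation rules below are illustrative assumptions and are not part of the library; whether the construct accepts further fields is not covered here.

```typescript
// Field names are taken from the Pass state in the sample above.
// BatchJobInput and buildBatchJobInput are hypothetical names.
interface BatchJobInput {
  job_name: string;
  manifest_keys: string[];
  model_id: string;
}

function buildBatchJobInput(
  jobName: string,
  manifestKeys: string[],
  modelId: string,
): BatchJobInput {
  // Assumed sanity checks; the construct may enforce different rules.
  if (jobName.trim() === '') {
    throw new Error('job_name must not be empty');
  }
  if (manifestKeys.length === 0) {
    throw new Error('manifest_keys must contain at least one S3 key');
  }
  if (modelId.trim() === '') {
    throw new Error('model_id must not be empty');
  }
  // Copy the array so later changes by the caller do not affect the payload.
  return { job_name: jobName, manifest_keys: [...manifestKeys], model_id: modelId };
}

const batchJobInput = buildBatchJobInput('test_job', ['test_key.jsonl'], 'test.model-v1');
const batchJobInputJson = JSON.stringify(batchJobInput);
```

A payload built this way mirrors the parameters of the Pass state, so it may be a convenient starting point if you feed the fragment from an execution input instead.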

Architecture

Architecture Diagram

Cost

Please note that you will be responsible for the costs associated with the AWS services used during the execution of this construct. The cost of using this construct varies heavily according to model selection and the size of model inference jobs. As a reference point, we will assume a workload that uses Amazon Nova Pro with 10,000 input tokens and 1,000 output tokens per invocation, 100 records per invocation job and 300 invocation jobs per month.

We recommend creating a budget through AWS Cost Explorer to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this solution.

The following table provides a sample cost breakdown for deploying this solution with the default parameters in the US East (N. Virginia) Region for one month.

| AWS Service | Dimensions | Cost [USD] |
|---|---|---|
| Amazon Bedrock | Amazon Nova Pro with 10,000 input tokens and 1,000 output tokens per invocation, 100 records per invocation job and 300 invocation jobs per month | $168.00 |
| AWS Lambda | 6,000 invocation requests, 128 MB, arm64 architecture, 1 second duration per request | $0.01 |
| AWS Step Functions | 300 workflow requests with 1 state transition each | $0.00 |
| Amazon Simple Storage Service | Temporary storage of Bedrock input and output manifests: 900 PUT requests, 600 GET requests, 1 GB data storage | $0.03 |
| Amazon CloudWatch | < 1 GB logs ingested | $0.50 |
| Total | | $168.54 |

For comparison, with on-demand inference, the Amazon Bedrock usage would cost $336.00.
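
The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers stated in this section (the $336.00 on-demand figure, the 50% batch discount, and the per-service costs from the table) and works in US cents to avoid floating-point rounding. The variable names are illustrative.

```typescript
// All inputs come from the sample cost breakdown; amounts are in US cents.
const onDemandBedrockCents = 33600; // $336.00 with on-demand inference
const batchDiscountPercent = 50;    // batch discount stated in the overview

// Other services, as listed in the table.
const lambdaCents = 1;
const stepFunctionsCents = 0;
const s3Cents = 3;
const cloudWatchCents = 50;

// 100 records per job, 300 jobs per month.
const recordsPerMonth = 100 * 300;

const batchBedrockCents = (onDemandBedrockCents * (100 - batchDiscountPercent)) / 100;
const totalCents =
  batchBedrockCents + lambdaCents + stepFunctionsCents + s3Cents + cloudWatchCents;

// Format an amount in cents as a dollar string with two decimals.
function formatUsd(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

console.log(`Bedrock (batch): ${formatUsd(batchBedrockCents)}`);
console.log(`Monthly total:   ${formatUsd(totalCents)}`);
console.log(`Records/month:   ${recordsPerMonth}`);
```

Under these assumptions the batch Bedrock cost comes to $168.00 and the monthly total to $168.54, which matches the table.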

Security

When you build systems on AWS infrastructure, security responsibilities are shared between you and AWS. This shared responsibility model reduces your operational burden because AWS operates, manages, and controls the components including the host operating system, virtualization layer, and physical security of the facilities in which the services operate. For more information about AWS security, visit AWS Cloud Security.

Supported AWS Regions

This solution uses the Amazon Bedrock service, which is not currently available in all AWS Regions. You must launch this construct in an AWS Region where Amazon Bedrock is available. For the most current availability of AWS services by Region, see the AWS Regional Services List.

Note: You need to explicitly enable access to models before they are available for use in the Amazon Bedrock service. Follow the Amazon Bedrock User Guide for the steps to enable model access.

Quotas

Service quotas, also referred to as limits, are the maximum number of service resources or operations for your AWS account.

Make sure you have sufficient quota for each of the services implemented in this solution. For more information, refer to AWS service quotas.

To view the service quotas for all AWS services on a single page, see the Service endpoints and quotas page in the PDF.

Clean up

When deleting your stack which uses this construct, do not forget to go over the following instructions to avoid unexpected charges:


© Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.