AI Dungeon Game

Module 3: Story API implementation

The StoryApi comprises of a single API generate_story which given a Game and a list of Action’s for context, will progress a story. This API will be implemented as a streaming API in Python/FastAPI and will additionally demonstrate how changes can be made to the generated code to be fit for purpose.

API implementation

To create our API, we first need to install a couple of additional dependencies.

boto3 will be used to call Amazon Bedrock;
uvicorn will be used to start our API when used in conjunction with the Lambda Web Adapter (LWA).
copyfiles is an npm dependency that we will need to support cross-platform copying of files when updating our bundle task.

To install these dependencies, run the following commands:

pnpm nx run dungeon_adventure.story_api:add --args boto3 uvicorn

yarn nx run dungeon_adventure.story_api:add --args boto3 uvicorn

npx nx run dungeon_adventure.story_api:add --args boto3 uvicorn

bunx nx run dungeon_adventure.story_api:add --args boto3 uvicorn

pnpm add -Dw copyfiles

yarn add -D copyfiles

npm install --legacy-peer-deps -D copyfiles

bun install -D copyfiles

Now let’s replace the contents of the following files in packages/story_api/story_api:

main.py
init.py

import json

from boto3 import client
from fastapi.responses import PlainTextResponse, StreamingResponse
from pydantic import BaseModel

from .init import app, lambda_handler

handler = lambda_handler

bedrock = client('bedrock-runtime')

class Action(BaseModel):
    role: str
    content: str

class StoryRequest(BaseModel):
    genre: str
    playerName: str
    actions: list[Action]

async def bedrock_stream(request: StoryRequest):
    messages = [
        {"role": "user", "content": "Continue or create a new story..."}
    ]

    for action in request.actions:
        messages.append({"role": action.role, "content": action.content})

    response = bedrock.invoke_model_with_response_stream(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        body=json.dumps({
            "system":f"""
            You are running an AI text adventure game in the {request.genre} genre.
            Player: {request.playerName}. Return less than 200 characters of text.
            """,
            "messages": messages,
            "max_tokens": 1000,
            "temperature": 0.7,
            "anthropic_version": "bedrock-2023-05-31"
        })
    )

    stream = response.get('body')
    if stream:
        for event in stream:
            chunk = event.get('chunk')
            if chunk:
                message = json.loads(chunk.get("bytes").decode())
                if message['type'] == "content_block_delta":
                    yield message['delta']['text'] or ""
                elif message['type'] == "message_stop":
                    yield "\n"

@app.post("/story/generate",
          openapi_extra={'x-streaming': True},
          response_class=PlainTextResponse)
def generate_story(request: StoryRequest) -> str:
    return StreamingResponse(bedrock_stream(request), media_type="text/plain")

import os
import uuid
from collections.abc import Callable

from aws_lambda_powertools import Logger, Metrics, Tracer
from aws_lambda_powertools.metrics import MetricUnit
from fastapi import FastAPI, Request, Response
from fastapi.openapi.utils import get_openapi
from fastapi.responses import JSONResponse
from fastapi.routing import APIRoute
from mangum import Mangum
from pydantic import BaseModel
from starlette.middleware.exceptions import ExceptionMiddleware

os.environ["POWERTOOLS_METRICS_NAMESPACE"] = "StoryApi"
os.environ["POWERTOOLS_SERVICE_NAME"] = "StoryApi"

logger: Logger = Logger()
metrics: Metrics = Metrics()
tracer: Tracer = Tracer()

class InternalServerErrorDetails(BaseModel):
    detail: str

app = FastAPI(
    title="StoryApi",
    responses={
        500: {"model": InternalServerErrorDetails}
    }
)
lambda_handler = Mangum(app)

# Add tracing
lambda_handler.__name__ = "handler"  # tracer requires __name__ to be set
lambda_handler = tracer.capture_lambda_handler(lambda_handler)
# Add logging
lambda_handler = logger.inject_lambda_context(lambda_handler, clear_state=True)
# Add metrics last to properly flush metrics.
lambda_handler = metrics.log_metrics(lambda_handler, capture_cold_start_metric=True)

# Add exception middleware(s)
app.add_middleware(ExceptionMiddleware, handlers=app.exception_handlers)

@app.exception_handler(Exception)
async def unhandled_exception_handler(request, err):
    logger.exception("Unhandled exception")

    metrics.add_metric(name="Failure", unit=MetricUnit.Count, value=1)

    return JSONResponse(status_code=500,
                        content=InternalServerErrorDetails(
                            detail="Internal Server Error").model_dump())

@app.middleware("http")
async def metrics_handler(request: Request, call_next):
    metrics.add_dimension("route", f"{request.method} {request.url.path}")
    metrics.add_metric(name="RequestCount", unit=MetricUnit.Count, value=1)

    response = await call_next(request)

    if response.status_code == 200:
        metrics.add_metric(name="Success", unit=MetricUnit.Count, value=1)

    return response

# Add correlation id middleware
@app.middleware("http")
async def add_correlation_id(request: Request, call_next):
    # Get correlation id from X-Correlation-Id header
    corr_id = request.headers.get("x-correlation-id")
    if not corr_id and "aws.context" in request.scope:
        # If empty, use request id from aws context
        corr_id = request.scope["aws.context"].aws_request_id
    elif not corr_id:
        # If still empty, use uuid
        corr_id = uuid.uuid4().hex

    # Add correlation id to logs
    logger.set_correlation_id(corr_id)

    response = await call_next(request)

    # Return correlation header in response
    response.headers["X-Correlation-Id"] = corr_id
    return response

class LoggerRouteHandler(APIRoute):
    def get_route_handler(self) -> Callable:
        original_route_handler = super().get_route_handler()

        async def route_handler(request: Request) -> Response:
            # Add fastapi context to logs
            ctx = {
                "path": request.url.path,
                "route": self.path,
                "method": request.method,
            }
            logger.append_keys(fastapi=ctx)
            logger.info("Received request")

            return await original_route_handler(request)

        return route_handler

app.router.route_class = LoggerRouteHandler

def custom_openapi():
    if app.openapi_schema:
        return app.openapi_schema
    for route in app.routes:
        if isinstance(route, APIRoute):
            route.operation_id = route.name
    openapi_schema = get_openapi(
        title=app.title,
        version=app.version,
        openapi_version=app.openapi_version,
        description=app.description,
        routes=app.routes,
    )
    app.openapi_schema = openapi_schema
    return app.openapi_schema

app.openapi = custom_openapi

Analyzing the code above:

We use the x-streaming setting to indicate that this is a streaming API when we eventually generate our client SDK. This will allow us to consume this API in a streaming manner whilst maintaining type-safety!
Our API simply returns a stream of text as defined by both the media_type="text/plain" and the response_class=PlainTextResponse

Infrastructure

The Infrastructure we set up previously assumes that all APIs have an API Gateway integrating with Lambda functions. For our story_api we actually don’t want to use API Gateway as this does not support streaming repsonses. Instead, we will use a Lambda Function URL configured with response streaming.

To support this, we are going to first update our CDK constructs as follows:

story-api.ts
application-stack.ts

import { Duration, Stack, CfnOutput } from 'aws-cdk-lib';
import { IGrantable, Grant } from 'aws-cdk-lib/aws-iam';
import {
  Runtime,
  Code,
  Tracing,
  LayerVersion,
  FunctionUrlAuthType,
  InvokeMode,
  Function,
} from 'aws-cdk-lib/aws-lambda';
import { Construct } from 'constructs';
import url from 'url';
import { RuntimeConfig } from '../../core/runtime-config.js';

export class StoryApi extends Construct {
  public readonly handler: Function;

  constructor(scope: Construct, id: string) {
    super(scope, id);

    this.handler = new Function(this, 'Handler', {
      runtime: Runtime.PYTHON_3_12,
      handler: 'run.sh',
      code: Code.fromAsset(
        url.fileURLToPath(
          new URL(
            '../../../../../../dist/packages/story_api/bundle',
            import.meta.url,
          ),
        ),
      ),
      timeout: Duration.seconds(30),
      tracing: Tracing.ACTIVE,
      environment: {
        AWS_CONNECTION_REUSE_ENABLED: '1',
      },
    });

    const stack = Stack.of(this);
    this.handler.addLayers(
      LayerVersion.fromLayerVersionArn(
        this,
        'LWALayer',
        `arn:aws:lambda:${stack.region}:753240598075:layer:LambdaAdapterLayerX86:24`,
      ),
    );
    this.handler.addEnvironment('PORT', '8000');
    this.handler.addEnvironment('AWS_LWA_INVOKE_MODE', 'response_stream');
    this.handler.addEnvironment('AWS_LAMBDA_EXEC_WRAPPER', '/opt/bootstrap');
    const functionUrl = this.handler.addFunctionUrl({
      authType: FunctionUrlAuthType.AWS_IAM,
      invokeMode: InvokeMode.RESPONSE_STREAM,
      cors: {
        allowedOrigins: ['*'],
        allowedHeaders: [
          'authorization',
          'content-type',
          'x-amz-content-sha256',
          'x-amz-date',
          'x-amz-security-token',
        ],
      },
    });

    new CfnOutput(this, 'StoryApiUrl', { value: functionUrl.url });

    // Register the API URL in runtime configuration for client discovery
    RuntimeConfig.ensure(this).config.apis = {
      ...RuntimeConfig.ensure(this).config.apis!,
      StoryApi: functionUrl.url,
    };
  }

  public grantInvokeAccess(grantee: IGrantable) {
    Grant.addToPrincipal({
      grantee,
      actions: ['lambda:InvokeFunctionUrl'],
      resourceArns: [this.handler.functionArn],
      conditions: {
        StringEquals: {
          'lambda:FunctionUrlAuthType': 'AWS_IAM',
        },
      },
    });
  }
}

import {
  GameApi,
  GameUI,
  StoryApi,
  UserIdentity,
} from ':dungeon-adventure/common-constructs';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { ElectrodbDynamoTable } from '../constructs/electrodb-table.js';
import { PolicyStatement, Effect } from 'aws-cdk-lib/aws-iam';

export class ApplicationStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // The code that defines your stack goes here
    const userIdentity = new UserIdentity(this, 'UserIdentity');

    const electroDbTable = new ElectrodbDynamoTable(this, 'ElectroDbTable');

    const gameApi = new GameApi(this, 'GameApi', {
      integrations: GameApi.defaultIntegrations(this)
        .withDefaultOptions({
          environment: {
            TABLE_NAME: electroDbTable.tableName,
          },
        })
        .build(),
    });

    // Grant read/write access to each procedure's lambda handler according to the permissions it requires
    // Grant read/write access to each handler depending on the permissions it requires
    electroDbTable.grantReadData(gameApi.integrations['actions.query'].handler);
    electroDbTable.grantReadData(gameApi.integrations['games.query'].handler);
    electroDbTable.grantReadWriteData(
      gameApi.integrations['actions.save'].handler,
    );
    electroDbTable.grantReadWriteData(
      gameApi.integrations['games.save'].handler,
    );

    const storyApi = new StoryApi(this, 'StoryApi', {
      integrations: StoryApi.defaultIntegrations(this).build(),
    });
    const storyApi = new StoryApi(this, 'StoryApi');
    storyApi.handler.addToRolePolicy(
      new PolicyStatement({
        effect: Effect.ALLOW,
        actions: ['bedrock:InvokeModelWithResponseStream'],
        resources: [
          'arn:aws:bedrock:*::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        ],
      }),
    );

    // grant our authenticated role access to invoke our APIs
    [storyApi, gameApi].forEach((api) =>
      api.grantInvokeAccess(userIdentity.identityPool.authenticatedRole),
    );

    // Ensure this is instantiated last so our runtime-config.json can be automatically configured
    new GameUI(this, 'GameUI');
  }
}

Now we will update the story_api to support the Lambda Web Adapter deployment.

run.sh
project.json

#!/bin/bash

PATH=$PATH:$LAMBDA_TASK_ROOT/bin \
    PYTHONPATH=$PYTHONPATH:/opt/python:$LAMBDA_RUNTIME_DIR \
    exec python -m uvicorn --port=$PORT story_api.main:app

{
  "name": "dungeon_adventure.story_api",
  ...
  "targets": {
    ...
    "bundle": {
      "cache": true,
      "executor": "nx:run-commands",
      "outputs": ["{workspaceRoot}/dist/packages/story_api/bundle"],
      "options": {
        "commands": [
          "uv export --frozen --no-dev --no-editable --project packages/story_api --package dungeon_adventure.story_api -o dist/packages/story_api/bundle/requirements.txt",
          "uv pip install -n --no-installer-metadata --no-compile-bytecode --python-platform x86_64-manylinux2014 --target dist/packages/story_api/bundle -r dist/packages/story_api/bundle/requirements.txt",
          "copyfiles -f packages/story_api/run.sh dist/packages/story_api/bundle"
        ],
        "parallel": false
      },
      "dependsOn": ["compile"]
    },
    ...
  }
}

Deployment and testing

First, lets build the codebase:

pnpm nx run-many --target build --all

yarn nx run-many --target build --all

npx nx run-many --target build --all

bunx nx run-many --target build --all

If you encounter any lint errors, you can run the following command to automatically fix them.

pnpm nx run-many --target lint --configuration=fix --all

yarn nx run-many --target lint --configuration=fix --all

npx nx run-many --target lint --configuration=fix --all

bunx nx run-many --target lint --configuration=fix --all

Your application can now be deployed by running the following command:

pnpm nx run @dungeon-adventure/infra:deploy dungeon-adventure-infra-sandbox

yarn nx run @dungeon-adventure/infra:deploy dungeon-adventure-infra-sandbox

npx nx run @dungeon-adventure/infra:deploy dungeon-adventure-infra-sandbox

bunx nx run @dungeon-adventure/infra:deploy dungeon-adventure-infra-sandbox

This deployment will take around 2 minutes to complete.

You can also deploy all stacks at once. Click here for more details.

Once the deployment completes, you should see some outputs similar to the following (some values have been redacted):

dungeon-adventure-infra-sandbox
dungeon-adventure-infra-sandbox: deploying... [2/2]

 ✅  dungeon-adventure-infra-sandbox

✨  Deployment time: 354s

Outputs:
dungeon-adventure-infra-sandbox.ElectroDbTableTableNameXXX = dungeon-adventure-infra-sandbox-ElectroDbTableXXX-YYY
dungeon-adventure-infra-sandbox.GameApiEndpointXXX = https://xxx.execute-api.region.amazonaws.com/prod/
dungeon-adventure-infra-sandbox.GameUIDistributionDomainNameXXX = xxx.cloudfront.net
dungeon-adventure-infra-sandbox.StoryApiStoryApiUrlXXX = https://xxx.lambda-url.ap-southeast-2.on.aws/
dungeon-adventure-infra-sandbox.UserIdentityUserIdentityIdentityPoolIdXXX = region:xxx
dungeon-adventure-infra-sandbox.UserIdentityUserIdentityUserPoolIdXXX = region_xxx

We can test our API by either:

Starting a local instance of the FastApi server and invoke the API’s using curl.
Calling the deployed API using sigv4 enabled curl directly

Local
Deployed

Start your local FastAPI server by running the following command:

pnpm nx run dungeon_adventure.story_api:serve

yarn nx run dungeon_adventure.story_api:serve

npx nx run dungeon_adventure.story_api:serve

bunx nx run dungeon_adventure.story_api:serve

Once the FastAPI server is up and running, call it by running the following command:

curl -N -X POST http://127.0.0.1:8000/story/generate \
  -d '{"genre":"superhero", "actions":[], "playerName":"UnnamedHero"}' \
  -H "Content-Type: application/json"

acurl ap-southeast-2 lambda -N -X POST \
  https://xxx.lambda-url.ap-southeast-2.on.aws/story/generate \
  -d '{"genre":"superhero", "actions":[], "playerName":"UnnamedHero"}' \
  -H "Content-Type: application/json"

If the command executes successfully, you should see a response being streamed similar to:

UnnamedHero stood tall, his cape billowing in the wind....

Congratulations. You have built and deployed your first API using FastAPI! 🎉🎉🎉

AI Dungeon Game

Module 3: Story API implementation

API implementation

Infrastructure

Deployment and testing

Deployment command

Sigv4 enabled curl

API Gateway

Streaming Lambda function url

API Gateway

Streaming Lambda function url