
BLIP2 Processor

Unstable API

0.7.0

@project-lakechain/blip2-image-processor

TypeScript

The BLIP2 image processor makes it possible to generate captions for images within a Lakechain pipeline. It deploys an auto-scaled cluster of GPU-enabled containers that process images using the BLIP2 model, so that all processing remains within the customer's AWS environment.



📷 Captioning

To use this middleware, you import it in your CDK stack and specify a VPC in which the cluster will be deployed.

💁 Note that you will need to specify a data source that the BLIP2 processor will use as an input, such as the S3 trigger.

import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import { Construct } from 'constructs';
import { Blip2ImageProcessor } from '@project-lakechain/blip2-image-processor';
import { CacheStorage } from '@project-lakechain/core';

class Stack extends cdk.Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    // Sample VPC.
    const vpc = new ec2.Vpc(this, 'Vpc', {});

    // The cache storage.
    const cache = new CacheStorage(this, 'Cache');

    // Create the BLIP2 processor.
    const blipProcessor = new Blip2ImageProcessor.Builder()
      .withScope(this)
      .withIdentifier('ImageProcessor')
      .withCacheStorage(cache)
      .withVpc(vpc)
      .withSource(source) // 👈 Specify a data source
      .build();
  }
}
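
For example, the data source passed to withSource could be the S3 trigger mentioned above. Below is a minimal sketch, assuming the @project-lakechain/s3-event-trigger package and a bucket created in the same stack; adapt the identifiers to your own pipeline.

import * as s3 from 'aws-cdk-lib/aws-s3';
import { S3EventTrigger } from '@project-lakechain/s3-event-trigger';

// The bucket monitored for new images (assumed to be created in this stack).
const bucket = new s3.Bucket(this, 'Bucket', {});

// Create the S3 trigger monitoring the bucket and use it as the source.
const source = new S3EventTrigger.Builder()
  .withScope(this)
  .withIdentifier('Trigger')
  .withCacheStorage(cache)
  .withBucket(bucket)
  .build();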


Auto-Scaling

The cluster of containers deployed by this middleware will auto-scale based on the number of images that need to be processed. The cluster scales up to a maximum of 5 instances by default, and scales down to zero when there are no images to process.

ℹī¸ You can configure the maximum amount of instances that the cluster can auto-scale to by using the withMaxInstances method.

import { Blip2ImageProcessor } from '@project-lakechain/blip2-image-processor';

const blipProcessor = new Blip2ImageProcessor.Builder()
  .withScope(this)
  .withIdentifier('ImageProcessor')
  .withCacheStorage(cache)
  .withVpc(vpc)
  .withSource(source)
  .withMaxInstances(10) // 👈 Maximum number of instances
  .build();


📄 Output

The BLIP2 image processor does not modify or alter source images in any way. Instead, it enriches the document metadata by setting the description field to the result of the captioning process, and also records the dimensions of the image.


ℹī¸ Below is an example of a CloudEvent emitted by the BLIP2 processor.

{
  "specversion": "1.0",
  "id": "1780d5de-fd6f-4530-98d7-82ebee85ea39",
  "type": "document-created",
  "time": "2023-10-22T13:19:10.657Z",
  "data": {
    "chainId": "6ebf76e4-f70c-440c-98f9-3e3e7eb34c79",
    "source": {
      "url": "s3://bucket/image.png",
      "type": "image/png",
      "size": 245328,
      "etag": "1243cbd6cf145453c8b5519a2ada4779"
    },
    "document": {
      "url": "s3://bucket/image.png",
      "type": "image/png",
      "size": 245328,
      "etag": "1243cbd6cf145453c8b5519a2ada4779"
    },
    "metadata": {
      "description": "A man sitting on a wooden chair in a cozy room.",
      "properties": {
        "kind": "image",
        "attrs": {
          "width": 1280,
          "height": 720
        }
      }
    },
    "callStack": []
  }
}
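
A downstream consumer can read the generated caption directly from this event. The handler below is a hypothetical sketch, assuming the events are delivered to a Lambda function through an SQS subscription (hence the Records array); the field names match the example output above.

// Hypothetical Lambda handler consuming BLIP2 output events from an SQS queue.
export const handler = async (event: { Records: { body: string }[] }) => {
  for (const record of event.Records) {
    // Parse the CloudEvent emitted by the BLIP2 processor.
    const cloudEvent = JSON.parse(record.body);
    const description = cloudEvent.data?.metadata?.description;
    const attrs = cloudEvent.data?.metadata?.properties?.attrs ?? {};
    console.log(`Caption: ${description} (${attrs.width}x${attrs.height})`);
  }
};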


🏗ī¸ Architecture

The BLIP2 image processor requires GPU-enabled instances (g5.2xlarge) to run the BLIP2 image model. It orchestrates an auto-scaled ECS cluster of containers that consume documents from the middleware input queue. The cluster is deployed in the private subnet of the given VPC, and caches the model on EFS storage to optimize cold-starts.

ℹī¸ The average cold-start time for the BLIP2 image processor is around 3 minutes when no instances are running.

[Architecture diagram]



🏷ī¸ Properties


Supported Inputs

| Mime Type   | Description  |
| ----------- | ------------ |
| image/bmp   | Bitmap image |
| image/gif   | GIF image    |
| image/jpeg  | JPEG image   |
| image/png   | PNG image    |
| image/tiff  | TIFF image   |
| image/webp  | WebP image   |
| image/x-pcx | PCX image    |

Supported Outputs

This middleware supports as outputs the same types as the supported inputs.

Supported Compute Types

| Type | Description                                                          |
| ---- | -------------------------------------------------------------------- |
| GPU  | This middleware requires GPU instances to run the BLIP2 image model. |


📖 Examples