
Image Metadata

Unstable API 0.10.0 @project-lakechain/image-metadata-extractor TypeScript

The image metadata extractor enriches document metadata with information about input images, such as their dimensions, dominant color, orientation, and EXIF tags. Subsequent middlewares in the pipeline can then use this metadata, or it can be stored in a database.


📷 Extracting Metadata

To use this middleware, import it in your CDK stack and instantiate it as part of a pipeline.

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { ImageMetadataExtractor } from '@project-lakechain/image-metadata-extractor';
import { CacheStorage } from '@project-lakechain/core';

class Stack extends cdk.Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    const cache = new CacheStorage(this, 'Cache');

    // Extracts metadata from images.
    // `source` is a middleware or trigger created earlier in the stack (not shown here).
    const imageMetadata = new ImageMetadataExtractor.Builder()
      .withScope(this)
      .withIdentifier('ImageMetadata')
      .withCacheStorage(cache)
      .withSource(source) // 👈 Specify a data source
      .build();
  }
}


📄 Output

The image metadata extraction middleware does not modify source images in any way. Instead, it enriches the document metadata with the captured information. Below is an example of metadata captured using this middleware.

{
  "specversion": "1.0",
  "id": "1780d5de-fd6f-4530-98d7-82ebee85ea39",
  "type": "document-created",
  "time": "2023-10-22T13:19:10.657Z",
  "data": {
    "chainId": "6ebf76e4-f70c-440c-98f9-3e3e7eb34c79",
    "source": {
      "url": "s3://bucket/image.png",
      "type": "image/png",
      "size": 245328,
      "etag": "1243cbd6cf145453c8b5519a2ada4779"
    },
    "document": {
      "url": "s3://bucket/image.png",
      "type": "image/png",
      "size": 245328,
      "etag": "1243cbd6cf145453c8b5519a2ada4779"
    },
    "metadata": {
      "authors": [
        "John Doe"
      ],
      "title": "A winter in San Francisco",
      "properties": {
        "kind": "image",
        "attrs": {
          "width": 1920,
          "height": 1080,
          "exif": {
            "Make": "Canon",
            "Model": "Canon EOS 5D Mark IV"
          }
        }
      }
    },
    "callStack": []
  }
}
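
A downstream consumer can read these attributes directly from the event payload. The following is a minimal sketch that assumes only the fields shown in the example above. The `ImageEvent` and `ImageSummary` interfaces and the `describeImage` helper are hypothetical names introduced for illustration and are not part of the Lakechain SDK. The `layout` value is derived from the pixel dimensions and is not the EXIF orientation tag.

```typescript
// Hypothetical types mirroring only the fields shown in the example event.
interface ImageAttrs {
  width?: number;
  height?: number;
  exif?: Record<string, unknown>;
}

interface ImageEvent {
  data: {
    document: { url: string; type: string };
    metadata: {
      title?: string;
      properties?: { kind: string; attrs: ImageAttrs };
    };
  };
}

interface ImageSummary {
  url: string;
  width: number;
  height: number;
  layout: 'landscape' | 'portrait' | 'square';
  camera?: string;
}

// Returns a summary of the image, or undefined when the metadata
// does not describe an image with known dimensions.
function describeImage(event: ImageEvent): ImageSummary | undefined {
  const props = event.data.metadata.properties;
  if (!props || props.kind !== 'image') {
    return undefined;
  }

  const { width, height, exif } = props.attrs;
  if (width === undefined || height === undefined) {
    return undefined;
  }

  // Layout is computed from the dimensions only.
  const layout = width > height ? 'landscape' : width < height ? 'portrait' : 'square';

  // EXIF values are untyped here, so only accept a string camera model.
  const model = exif ? exif['Model'] : undefined;

  return {
    url: event.data.document.url,
    width,
    height,
    layout,
    camera: typeof model === 'string' ? model : undefined
  };
}
```

Treating `width`, `height`, and `exif` as optional keeps the helper safe for images where the extractor could not capture every attribute.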


🏗️ Architecture

This middleware runs in an AWS Lambda function on the ARM64 architecture, and packages several libraries used to extract the metadata of images.

[Figure: architecture diagram]



🏷️ Properties


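Supported Inputs

| Mime Type  | Description |
| ---------- | ----------- |
| image/gif  | GIF image   |
| image/jpeg | JPEG image  |
| image/png  | PNG image   |
| image/tiff | TIFF image  |
| image/webp | WebP image  |
| image/avif | AVIF image  |

Supported Outputs

| Mime Type  | Description |
| ---------- | ----------- |
| image/gif  | GIF image   |
| image/jpeg | JPEG image  |
| image/png  | PNG image   |
| image/tiff | TIFF image  |
| image/webp | WebP image  |
| image/avif | AVIF image  |

Supported Compute Types

| Type | Description                                |
| ---- | ------------------------------------------ |
| CPU  | This middleware only supports CPU compute. |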
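
If you pre-filter documents in your own code, for instance before placing them where the pipeline picks them up, a check against the supported inputs listed above could look like the sketch below. `SUPPORTED_IMAGE_TYPES` and `isSupportedImageType` are hypothetical names for illustration, not Lakechain APIs, and the list is copied from the table rather than read from the middleware.

```typescript
// MIME types copied from the "Supported Inputs" table (assumed to stay in sync manually).
const SUPPORTED_IMAGE_TYPES: ReadonlySet<string> = new Set([
  'image/gif',
  'image/jpeg',
  'image/png',
  'image/tiff',
  'image/webp',
  'image/avif'
]);

// Returns true when the MIME type, ignoring case and any parameters
// such as "; charset=binary", is one of the supported input types.
function isSupportedImageType(mimeType: string): boolean {
  const [essence = ''] = mimeType.split(';');
  return SUPPORTED_IMAGE_TYPES.has(essence.trim().toLowerCase());
}
```

Normalizing the value before the lookup avoids rejecting otherwise valid types that differ only in letter case or carry extra parameters.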


📖 Examples