Hashing
Unstable API
0.8.0
@project-lakechain/hashing-image-processor
The hashing image processor makes it possible to enrich the metadata of images with hash values associated with the visual representation of an image. This middleware supports different hashing algorithms, including average hashing, perceptual hashing, difference hashing, wavelet hashing, and color hashing.
These hashing algorithms can be used to measure how visually different two images are. They are computationally cheaper than comparing vector embeddings, which also capture the semantic content of an image.
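To make the idea concrete, here is a minimal sketch of average hashing in TypeScript. It is an illustration only and not the middleware's implementation: `averageHash` is a hypothetical helper, and it assumes the image has already been downscaled to an 8x8 grayscale thumbnail passed in as 64 brightness values.

```typescript
// Sketch of average hashing on an 8x8 grayscale thumbnail.
// `averageHash` is a hypothetical helper, not part of the Lakechain API.
function averageHash(pixels: number[]): string {
  if (pixels.length !== 64) {
    throw new Error('expected 64 grayscale values (an 8x8 thumbnail)');
  }

  // Mean brightness of the thumbnail.
  let sum = 0;
  for (const p of pixels) {
    sum += p;
  }
  const mean = sum / pixels.length;

  // One bit per pixel (1 when brighter than the mean),
  // packed four bits at a time into hexadecimal characters.
  let hex = '';
  for (let i = 0; i < pixels.length; i += 4) {
    let nibble = 0;
    for (let j = 0; j < 4; j++) {
      nibble = (nibble << 1) | (pixels[i + j] > mean ? 1 : 0);
    }
    hex += nibble.toString(16);
  }
  return hex;
}

// A thumbnail whose left half is black and right half is white.
const halfAndHalf: number[] = [];
for (let row = 0; row < 8; row++) {
  for (let col = 0; col < 8; col++) {
    halfAndHalf.push(col < 4 ? 0 : 255);
  }
}
const exampleHash = averageHash(halfAndHalf); // "0f0f0f0f0f0f0f0f"
```

Because each bit only records whether a pixel is brighter than the mean, small changes such as resizing or mild compression tend to leave most bits unchanged, which is what makes the hash useful for visual comparison.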
An example using average hashing.
Credits Branislav Rodman on Unsplash
#️⃣ Computing Hashes
To use this middleware, you import it in your CDK stack and instantiate it as part of a pipeline.
Selecting Algorithms
You can explicitly select which hashing algorithms to enable when enriching the document metadata with the different types of image hashes.
💁 By default, all hashing algorithms are enabled.
📄 Output
The Hashing image processor does not modify or alter source images in any way. It instead enriches the metadata of processed documents by setting the hash values associated with each of the enabled hashing algorithms.
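Once hash values are available in the metadata, a downstream consumer can compare two images by counting the bits that differ between their hashes. The sketch below assumes the hashes are serialized as hexadecimal strings of equal length, which is an assumption and not something this page specifies. `hammingDistance` is a hypothetical helper, and this comparison suits bit-based hashes such as average, perceptual, or difference hashes.

```typescript
// Hypothetical helper: number of differing bits between two hex-encoded hashes.
// Assumes both hashes come from the same algorithm and have the same length.
function hammingDistance(a: string, b: string): number {
  const hexPattern = /^[0-9a-f]+$/i;
  if (!hexPattern.test(a) || !hexPattern.test(b)) {
    throw new Error('hashes must be non-empty hexadecimal strings');
  }
  if (a.length !== b.length) {
    throw new Error('hashes must have the same length');
  }

  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    // XOR the two 4-bit values, then count the bits that are set.
    let diff = parseInt(a.charAt(i), 16) ^ parseInt(b.charAt(i), 16);
    while (diff > 0) {
      distance += diff & 1;
      diff >>= 1;
    }
  }
  return distance;
}
```

A distance of 0 means the two hashes are identical, and larger values indicate more visual difference. The threshold below which two images count as near-duplicates depends on the algorithm and the dataset, so it is best chosen empirically.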
ℹ️ Below is an example of a CloudEvent emitted by the Hashing image processor.
🏗️ Architecture
This middleware runs within a Lambda compute, and packages the imagehash library to compute the hash values of images.
🏷️ Properties
Supported Inputs
Mime Type | Description |
---|---|
image/jpeg | JPEG image |
image/png | PNG image |
image/bmp | BMP image |
image/webp | WebP image |
Supported Outputs
Mime Type | Description |
---|---|
image/jpeg | JPEG image |
image/png | PNG image |
image/bmp | BMP image |
image/webp | WebP image |
Supported Compute Types
Type | Description |
---|---|
CPU | This middleware only supports CPU compute. |
📖 Examples
- Image Hashing Pipeline - An example showcasing how to compute the hash of images.