TIFF Tile Codec
Version: 1.0
URI: https://awslabs.github.io/osml-imagery-io/codecs/tiff-tile
Codec type: array-to-bytes
Decodes compressed TIFF tiles into NumPy arrays. Supports LZW, JPEG, Deflate, Adobe Deflate, PackBits, and uncompressed tiles, including horizontal differencing predictors and YCbCr-to-RGB conversion for JPEG tiles.
Document Conventions
The key words “MUST”, “MUST NOT”, “SHOULD”, and “MAY” in this document are to be interpreted as described in RFC 2119.
Codec Identifier
The value of the name member in the codec metadata MUST be
https://awslabs.github.io/osml-imagery-io/codecs/tiff-tile.
Encoded Representation
The encoded representation MUST be a single compressed TIFF tile as it appears
in the TIFF file’s data area (the bytes referenced by a TileOffsets/TileByteCounts
entry). The compression format is determined by the compression configuration
parameter and MUST conform to the corresponding algorithm defined in TIFF
Revision 6.0 or the applicable TIFF Technical Note.
When compression is 7 (JPEG), the tile data MAY omit shared quantization and
Huffman tables, which MUST then be provided via the jpeg_tables configuration
parameter.
Rationale: Why Compressed TIFF Tiles Need Metadata
Individual compressed tiles extracted from a TIFF file cannot be decoded in isolation. The compressed tile bytes are an opaque payload: the decoder needs tag metadata (compression algorithm, predictor settings, photometric interpretation, JPEG quantization tables, etc.) that lives in the file's image file directory (IFD), not in the tile data itself. Without this metadata, the decoder cannot determine how to decompress the bytes or interpret the resulting pixel values.
This codec solves the problem by storing the required IFD tag values in its configuration. At decode time it constructs a minimal single-tile TIFF in memory from the configuration and the compressed tile bytes, then hands it to libtiff for decompression.
This approach delegates all decompression complexity (LZW, Deflate, JPEG, predictor reversal, byte-order conversion, color space conversion) to libtiff rather than reimplementing it. See the Synthetic Codestream Codec Pattern design document for further details.
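The synthetic-TIFF idea can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions, not the codec's actual implementation: the function name `build_single_tile_tiff` is hypothetical, the tag selection is a guess at a reasonable minimal set, the layout is little-endian classic TIFF, and the JPEGTables tag (347) is left out for brevity.

```python
import struct

SHORT, LONG = 3, 4                 # TIFF 6.0 field type codes
_FMT = {SHORT: "H", LONG: "I"}     # matching struct format characters


def build_single_tile_tiff(cfg: dict, tile: bytes) -> bytes:
    """Wrap one compressed tile in a little-endian, single-IFD TIFF (hypothetical helper)."""
    spp = cfg["samples_per_pixel"]
    entries = [  # (tag, type, values), kept in ascending tag order as TIFF requires
        (256, LONG, [cfg["tile_width"]]),               # ImageWidth
        (257, LONG, [cfg["tile_height"]]),              # ImageLength
        (258, SHORT, [cfg["bits_per_sample"]] * spp),   # BitsPerSample
        (259, SHORT, [cfg["compression"]]),             # Compression
        (262, SHORT, [cfg["photometric"]]),             # PhotometricInterpretation
        (277, SHORT, [spp]),                            # SamplesPerPixel
        (284, SHORT, [cfg["planar_config"]]),           # PlanarConfiguration
        (317, SHORT, [cfg["predictor"]]),               # Predictor
        (322, LONG, [cfg["tile_width"]]),               # TileWidth
        (323, LONG, [cfg["tile_height"]]),              # TileLength
        (324, LONG, [0]),                               # TileOffsets (patched below)
        (325, LONG, [len(tile)]),                       # TileByteCounts
        (339, SHORT, [cfg["sample_format"]] * spp),     # SampleFormat
    ]
    ifd_size = 2 + 12 * len(entries) + 4   # entry count + entries + next-IFD offset
    extra_start = 8 + ifd_size             # out-of-line values follow the IFD
    extra = b""
    packed = []
    for tag, ftype, values in entries:
        data = struct.pack(f"<{len(values)}{_FMT[ftype]}", *values)
        if len(data) > 4:
            # Values that do not fit in the 4-byte field are stored out of line.
            offset = extra_start + len(extra)
            extra += data + b"\x00" * (len(data) % 2)   # keep word alignment
            data = struct.pack("<I", offset)
        packed.append((tag, ftype, len(values), data.ljust(4, b"\x00")))

    tile_offset = extra_start + len(extra)  # tile bytes come last
    ifd = struct.pack("<H", len(packed))
    for tag, ftype, count, field in packed:
        if tag == 324:
            field = struct.pack("<I", tile_offset)
        ifd += struct.pack("<HHI", tag, ftype, count) + field
    ifd += struct.pack("<I", 0)             # no further IFDs

    header = struct.pack("<2sHI", b"II", 42, 8)  # byte order, magic, first IFD offset
    return header + ifd + extra + tile
```

The resulting buffer is what would be handed to libtiff through memory-backed I/O; the exact tag set the real codec writes may differ from this sketch.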
Configuration Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| compression | | No | | TIFF compression tag value. See table below. |
| bits_per_sample | | No | | Bits per sample per band. |
| samples_per_pixel | | No | | Number of bands. |
| photometric | | No | | Photometric interpretation. |
| planar_config | | No | | Planar configuration. |
| predictor | | No | | Differencing predictor. |
| tile_width | | No | | Tile width in pixels. |
| tile_height | | No | | Tile height in pixels. |
| sample_format | | No | | Sample format. |
| jpeg_tables | | No | | Base64-encoded shared JPEG quantization and Huffman tables (TIFF tag 347). Required when compression is 7 (JPEG) and the tile data omits these tables. |
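As a rough illustration of handling the jpeg_tables parameter, the sketch below decodes the base64 string and checks that the result is framed like a JPEG abbreviated table-specification stream, which starts with an SOI marker and ends with an EOI marker. The helper name `decode_jpeg_tables` is hypothetical, and whether the actual codec performs such a check is not stated here.

```python
import base64


def decode_jpeg_tables(encoded: str) -> bytes:
    """Decode a base64 jpeg_tables value and sanity-check its framing (hypothetical helper)."""
    raw = base64.b64decode(encoded, validate=True)
    # A JPEG tables-only stream begins with SOI (FF D8) and ends with EOI (FF D9).
    if len(raw) < 4 or raw[:2] != b"\xff\xd8" or raw[-2:] != b"\xff\xd9":
        raise ValueError("jpeg_tables is not framed by JPEG SOI/EOI markers")
    return raw
```

Rejecting malformed tables early gives a clearer error than letting the JPEG decoder fail later on a tile that references missing tables.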
Compression Tag Values
| Value | Name | Notes |
|---|---|---|
| 1 | None (uncompressed) | Raw tile bytes; still needs byte-order and planar conversion. |
| 5 | LZW | Supports horizontal differencing predictor (predictor=2). |
| 7 | JPEG | Requires jpeg_tables when the tile data omits the shared quantization and Huffman tables. |
| | Deflate (zlib) | Supports horizontal differencing predictor. |
| 32773 | PackBits | Run-length encoding. |
| | Adobe Deflate | Equivalent to Deflate; legacy tag value. |
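To make the horizontal differencing predictor concrete, here is a small NumPy sketch of what predictor 2 does to unsigned-integer samples and how it is reversed. libtiff performs this step internally for the codec; the two function names are hypothetical and the code assumes a chunky tile of shape (rows, cols, bands).

```python
import numpy as np


def apply_horizontal_predictor(pixels: np.ndarray) -> np.ndarray:
    """Store each sample as the difference from the same band of the previous pixel."""
    out = pixels.copy()
    # Unsigned subtraction wraps modulo 2**bits, which is what the predictor relies on.
    out[:, 1:] = pixels[:, 1:] - pixels[:, :-1]
    return out


def undo_horizontal_predictor(diffs: np.ndarray) -> np.ndarray:
    """Reverse the predictor with a running sum along each row."""
    # Accumulating in the tile's own dtype keeps the modular wrap-around.
    return np.cumsum(diffs, axis=1, dtype=diffs.dtype)


row = np.array([[[10], [12], [9], [250]]], dtype=np.uint8)
diffs = apply_horizontal_predictor(row)
restored = undo_horizontal_predictor(diffs)
```

Smooth imagery produces many small differences, which is why the predictor tends to help LZW and Deflate.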
Sample Format / Bits Per Sample to NumPy dtype Mapping
| Sample Format | Bits Per Sample | NumPy dtype |
|---|---|---|
| | 8 | |
| | 16 | |
| | 32 | |
| | 8 | |
| | 16 | |
| | 32 | |
| | 32 | |
| | 64 | |
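One plausible reading of this mapping is sketched below. It assumes the standard TIFF SampleFormat meanings (1 = unsigned integer, 2 = signed integer, 3 = IEEE floating point) and assumes the eight rows correspond to 8/16/32-bit unsigned, 8/16/32-bit signed, and 32/64-bit float; the helper `tile_dtype` is hypothetical and the actual codec's table may differ.

```python
import numpy as np

# Assumption: standard TIFF SampleFormat codes mapped to NumPy kind characters.
_KIND = {1: "u", 2: "i", 3: "f"}

# Assumption: the supported (sample_format, bits_per_sample) pairs.
_SUPPORTED = {(1, 8), (1, 16), (1, 32), (2, 8), (2, 16), (2, 32), (3, 32), (3, 64)}


def tile_dtype(sample_format: int, bits_per_sample: int) -> np.dtype:
    """Return the NumPy dtype for a sample format and bit depth (hypothetical helper)."""
    if (sample_format, bits_per_sample) not in _SUPPORTED:
        raise ValueError(
            f"unsupported sample_format={sample_format}, bits_per_sample={bits_per_sample}"
        )
    # e.g. "u1" is uint8 and "f4" is float32
    return np.dtype(f"{_KIND[sample_format]}{bits_per_sample // 8}")
```

For the example configuration in this document (sample_format 1, bits_per_sample 8), this sketch yields uint8.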
Algorithm
Decoding
1. Construct a minimal single-tile TIFF buffer in memory: an 8-byte TIFF header, an IFD containing the tag values from the codec configuration, and the compressed tile bytes appended after the IFD.
2. Open the buffer with libtiff’s TIFFClientOpen using memory-backed I/O callbacks.
3. If compression=7 (JPEG) and photometric=6 (YCbCr), set JPEGCOLORMODE_RGB so libtiff performs YCbCr-to-RGB conversion during decode.
4. Call TIFFReadEncodedTile(handle, 0, ...) to decompress the tile. libtiff handles predictor reversal, byte-order conversion, and color space conversion internally.
5. If the decoded tile is smaller than the nominal tile dimensions (edge tile), pad with zeros to the full tile shape.
6. Convert from chunky (pixel-interleaved) to band-sequential (BSQ) format if planar_config=1 and samples_per_pixel > 1.
7. Return an array with shape (samples_per_pixel, tile_height, tile_width) and the dtype corresponding to the sample_format/bits_per_sample combination.
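The post-processing at the end of the decode path (zero-padding an edge tile, then converting chunky to band-sequential layout) could look roughly like the following. It is a sketch that assumes the decompressed samples arrive as a chunky array of shape (rows, cols, bands); the function name `finish_tile` is hypothetical.

```python
import numpy as np


def finish_tile(decoded: np.ndarray, tile_height: int, tile_width: int) -> np.ndarray:
    """Pad a chunky (rows, cols, bands) tile to full size and return it as BSQ."""
    rows, cols, bands = decoded.shape
    # Zero-fill the nominal tile, then copy the valid region into the top-left corner.
    full = np.zeros((tile_height, tile_width, bands), dtype=decoded.dtype)
    full[:rows, :cols] = decoded
    # Move the band axis to the front: (rows, cols, bands) -> (bands, rows, cols).
    return np.ascontiguousarray(np.moveaxis(full, -1, 0))


edge = np.arange(2 * 3 * 2, dtype=np.uint8).reshape(2, 3, 2)  # 2x3 pixels, 2 bands
tile = finish_tile(edge, tile_height=4, tile_width=4)
```

Here `tile` has shape (2, 4, 4): band 0 and band 1 each occupy a full 4x4 plane, with zeros outside the original 2x3 region.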
Encoding
Encoding is not currently specified. See Implementation Notes.
Example Configuration
LZW with horizontal predictor
{
"name": "https://awslabs.github.io/osml-imagery-io/codecs/tiff-tile",
"configuration": {
"compression": 5,
"bits_per_sample": 8,
"samples_per_pixel": 3,
"photometric": 2,
"planar_config": 1,
"predictor": 2,
"tile_width": 256,
"tile_height": 256,
"sample_format": 1
}
}
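As a quick sanity check on a configuration like the one above, the decoded array shape and uncompressed size follow directly from the tile dimensions, band count, and bit depth. The helper below is a hypothetical illustration using only the standard library, not part of the codec's API.

```python
import json

CONFIG_JSON = """
{
  "name": "https://awslabs.github.io/osml-imagery-io/codecs/tiff-tile",
  "configuration": {
    "compression": 5, "bits_per_sample": 8, "samples_per_pixel": 3,
    "photometric": 2, "planar_config": 1, "predictor": 2,
    "tile_width": 256, "tile_height": 256, "sample_format": 1
  }
}
"""


def decoded_shape_and_size(cfg: dict) -> tuple:
    """Return the (bands, height, width) shape and decoded byte count for a tile."""
    shape = (cfg["samples_per_pixel"], cfg["tile_height"], cfg["tile_width"])
    nbytes = shape[0] * shape[1] * shape[2] * cfg["bits_per_sample"] // 8
    return shape, nbytes


cfg = json.loads(CONFIG_JSON)["configuration"]
shape, nbytes = decoded_shape_and_size(cfg)  # (3, 256, 256) and 196608 bytes
```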
References
TIFF Revision 6.0: Tag Image File Format Specification
TIFF Technical Note #2: JPEG-in-TIFF
RFC 2119: Key words for use in RFCs to Indicate Requirement Levels
Implementation Notes
aws.osml.io.zarr_codecs.TiffTileCodec — see API Reference.
Only the decode path is implemented. Calling encode() raises NotImplementedError.