Zarr Codecs
Zarr v3 codec plugins for decoding JPEG 2000, JPEG, TIFF, and uncompressed JBP/NITF imagery.
These codecs implement the zarr-python v3 codec protocol and are registered via Python entry points
for automatic discovery by the Zarr codec registry. They enable reading cloud-hosted NITF and TIFF
imagery through xarray.open_zarr() using Kerchunk indices.
Note
zarr is an optional dependency. Install with pip install osml-imagery-io[zarr] to enable
Zarr codec support.
Codec Classes
Jpeg2000Codec
- class aws.osml.io.zarr_codecs.Jpeg2000Codec(*, main_header=None, resolution_level=0)
Bases: BytesBytesCodec
Zarr v3 bytes-to-bytes codec for JPEG 2000 codestreams.
Registered as: https://awslabs.github.io/osml-imagery-io/codecs/jpeg2000
- Configuration:
main_header: Optional[str] – base64-encoded J2K main header bytes
resolution_level: int – target resolution level (default 0)
- codec_name = 'https://awslabs.github.io/osml-imagery-io/codecs/jpeg2000'
- codec_id = 'https://awslabs.github.io/osml-imagery-io/codecs/jpeg2000'
- compute_encoded_size(input_byte_length, chunk_spec)
Returns input_byte_length unchanged, because the compressed size cannot be predicted.
- Return type:
int
- evolve_from_array_spec(array_spec)
Codec configuration is fixed at construction time.
- to_dict()
Serialize codec configuration to a JSON-compatible dictionary.
- Returns:
dict with ‘name’ and ‘configuration’ keys.
- classmethod from_dict(data)
Construct a Jpeg2000Codec from a serialized configuration dictionary.
Accepts both the {"name": ..., "configuration": {...}} format and a flat configuration dictionary.
- Parameters:
data – Configuration dictionary.
- Returns:
Jpeg2000Codec instance.
- decode(buf, out=None)
Synchronous decode for the numcodecs filter protocol.
For edge tiles, the decoded array may be smaller than the nominal tile dimensions. It is padded to the nominal tile size so that zarr v2’s reshape succeeds.
- encode(buf)
Encoding is not supported.
- get_config()
Return a numcodecs-compatible configuration dict.
- classmethod from_config(config)
Construct from a numcodecs configuration dict.
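The edge-tile padding described for decode() can be sketched in pure Python. This is a simplified illustration, not the codec's implementation: it assumes a single-band, 8-bit, row-major tile held in a bytes object and a zero fill value, whereas the real codec works on NumPy arrays. The helper name `pad_tile` is hypothetical.

```python
def pad_tile(tile: bytes, height: int, width: int,
             nominal_height: int, nominal_width: int, fill: int = 0) -> bytes:
    """Pad a row-major, single-band, 8-bit tile to the nominal tile size.

    ASSUMPTION: padding goes on the right and bottom edges with a constant
    fill value; the fill value used by the real codec is not documented here.
    """
    if len(tile) != height * width:
        raise ValueError("tile length does not match height * width")
    if height > nominal_height or width > nominal_width:
        raise ValueError("decoded tile is larger than the nominal tile")
    row_pad = bytes([fill]) * (nominal_width - width)
    # Extend every decoded row to the nominal width.
    rows = [tile[r * width:(r + 1) * width] + row_pad for r in range(height)]
    # Append full rows of fill until the nominal height is reached.
    rows += [bytes([fill]) * nominal_width] * (nominal_height - height)
    return b"".join(rows)
```

After padding, the buffer holds exactly nominal_height * nominal_width samples, which is the length a fixed-shape chunk reshape expects.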
JpegCodec
- class aws.osml.io.zarr_codecs.JpegCodec(*, bits_per_pixel, num_bands, block_width, block_height, imode, color_space)
Bases: BytesBytesCodec
Zarr v3 bytes-to-bytes codec for JPEG streams.
Registered as: https://awslabs.github.io/osml-imagery-io/codecs/jpeg
- Configuration:
bits_per_pixel: int – 8 or 12
num_bands: int – number of bands
block_width: int – block width in pixels
block_height: int – block height in pixels
imode: str – interleave mode (“B”, “P”, “R”, or “S”)
color_space: str – “MONO”, “RGB”, or “YCbCr601”
- codec_name = 'https://awslabs.github.io/osml-imagery-io/codecs/jpeg'
- codec_id = 'https://awslabs.github.io/osml-imagery-io/codecs/jpeg'
- compute_encoded_size(input_byte_length, chunk_spec)
Returns input_byte_length unchanged, because the compressed size cannot be predicted.
- Return type:
int
- evolve_from_array_spec(array_spec)
Codec configuration is fixed at construction time.
- to_dict()
Serialize codec configuration to a JSON-compatible dictionary.
- Returns:
dict with ‘name’ and ‘configuration’ keys.
- classmethod from_dict(data)
Construct a JpegCodec from a serialized configuration dictionary.
Accepts both the {"name": ..., "configuration": {...}} format and a flat configuration dictionary.
- Parameters:
data – Configuration dictionary.
- Returns:
JpegCodec instance.
- Raises:
ValueError – If required configuration fields are missing.
- decode(buf, out=None)
Synchronous decode for the numcodecs filter protocol.
- encode(buf)
Encoding is not supported.
- get_config()
Return a numcodecs-compatible configuration dict.
- classmethod from_config(config)
Construct from a numcodecs configuration dict.
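Because every JpegCodec configuration field is required, from_dict raises ValueError when one is missing. A minimal sketch of that kind of check follows; the helper `check_required` and the error message wording are hypothetical, and only the field names are taken from the class signature above.

```python
# Field names copied from the JpegCodec constructor signature.
REQUIRED_FIELDS = (
    "bits_per_pixel", "num_bands", "block_width",
    "block_height", "imode", "color_space",
)


def check_required(config: dict) -> dict:
    """Return config unchanged, or raise ValueError naming the missing fields.

    ASSUMPTION: the message format is illustrative; the library's own
    wording may differ.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in config]
    if missing:
        raise ValueError("missing required configuration fields: " + ", ".join(missing))
    return config
```

Reporting all missing fields at once, rather than stopping at the first, tends to make a malformed Kerchunk index easier to repair.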
JbpBlockCodec
- class aws.osml.io.zarr_codecs.JbpBlockCodec(*, num_bands, block_height, block_width, nbpp, imode, pvtype)
Bases: BytesBytesCodec
Zarr v3 bytes-to-bytes codec for uncompressed JBP/NITF/NSIF image blocks.
Registered as: https://awslabs.github.io/osml-imagery-io/codecs/jbp-block
- Configuration:
num_bands: int – number of bands
block_height: int – block height in pixels
block_width: int – block width in pixels
nbpp: int – bits per pixel per band
imode: str – NITF interleave mode (“B”, “P”, “R”, or “S”)
pvtype: str – NITF pixel value type (“INT”, “SI”, “R”, or “C”)
- codec_name = 'https://awslabs.github.io/osml-imagery-io/codecs/jbp-block'
- codec_id = 'https://awslabs.github.io/osml-imagery-io/codecs/jbp-block'
- compute_encoded_size(input_byte_length, chunk_spec)
Returns input_byte_length unchanged, because the compressed size cannot be predicted.
- Return type:
int
- evolve_from_array_spec(array_spec)
Codec configuration is fixed at construction time.
- to_dict()
Serialize codec configuration to a JSON-compatible dictionary.
- Returns:
dict with ‘name’ and ‘configuration’ keys.
- classmethod from_dict(data)
Construct a JbpBlockCodec from a serialized configuration dictionary.
Accepts both the {"name": ..., "configuration": {...}} format and a flat configuration dictionary.
- Parameters:
data – Configuration dictionary.
- Returns:
JbpBlockCodec instance.
- Raises:
ValueError – If required configuration fields are missing.
- decode(buf, out=None)
Synchronous decode for the numcodecs filter protocol.
- encode(buf)
Encoding is not supported.
- get_config()
Return a numcodecs-compatible configuration dict.
- classmethod from_config(config)
Construct from a numcodecs configuration dict.
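For uncompressed blocks, the configuration fully determines both the expected byte length and a plausible output dtype. The sketch below illustrates that relationship. The mapping table and both helper names are assumptions made for illustration; the library's actual dtype selection may differ, particularly for complex data.

```python
# ASSUMPTION: illustrative mapping from (pvtype, nbpp) to NumPy dtype names.
# "C" is treated as a pair of 32-bit floats stored in 64 bits.
DTYPE_NAMES = {
    ("INT", 8): "uint8", ("INT", 16): "uint16",
    ("INT", 32): "uint32", ("INT", 64): "uint64",
    ("SI", 8): "int8", ("SI", 16): "int16",
    ("SI", 32): "int32", ("SI", 64): "int64",
    ("R", 32): "float32", ("R", 64): "float64",
    ("C", 64): "complex64",
}


def dtype_name(pvtype: str, nbpp: int) -> str:
    """Look up the dtype name for a pixel value type and bit depth."""
    try:
        return DTYPE_NAMES[(pvtype, nbpp)]
    except KeyError:
        raise ValueError(f"unsupported pvtype/nbpp combination: {pvtype}/{nbpp}") from None


def expected_block_bytes(num_bands: int, block_height: int,
                         block_width: int, nbpp: int) -> int:
    """Byte length of one uncompressed block with byte-aligned samples."""
    if nbpp % 8 != 0:
        raise ValueError("this sketch only handles byte-aligned sample sizes")
    return num_bands * block_height * block_width * (nbpp // 8)
```

Comparing `expected_block_bytes(...)` against the length of the chunk fetched from storage is one way to detect a truncated range read before decoding.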
TiffTileCodec
- class aws.osml.io.zarr_codecs.TiffTileCodec(*, compression=1, bits_per_sample=8, samples_per_pixel=1, photometric=1, planar_config=1, predictor=1, tile_width=256, tile_height=256, sample_format=1, jpeg_tables=None)
Bases: BytesBytesCodec
Zarr v3 bytes-to-bytes codec for TIFF tile codestreams.
Registered as: https://awslabs.github.io/osml-imagery-io/codecs/tiff-tile
- Configuration:
compression: int – TIFF compression tag value (default 1)
bits_per_sample: int – bits per sample (default 8)
samples_per_pixel: int – number of bands (default 1)
photometric: int – photometric interpretation (default 1)
planar_config: int – planar configuration (default 1)
predictor: int – differencing predictor (default 1)
tile_width: int – tile width in pixels (default 256)
tile_height: int – tile height in pixels (default 256)
sample_format: int – sample format (default 1)
jpeg_tables: str | None – base64-encoded JPEG tables (default None)
- codec_name = 'https://awslabs.github.io/osml-imagery-io/codecs/tiff-tile'
- codec_id = 'https://awslabs.github.io/osml-imagery-io/codecs/tiff-tile'
- compute_encoded_size(input_byte_length, chunk_spec)
Returns input_byte_length unchanged, because the compressed size cannot be predicted.
- Return type:
int
- evolve_from_array_spec(array_spec)
Codec configuration is fixed at construction time.
- to_dict()
Serialize codec configuration to a JSON-compatible dictionary.
- Returns:
dict with ‘name’ and ‘configuration’ keys.
- classmethod from_dict(data)
Construct a TiffTileCodec from a serialized configuration dictionary.
Accepts both the {"name": ..., "configuration": {...}} format and a flat configuration dictionary.
- Parameters:
data – Configuration dictionary.
- Returns:
TiffTileCodec instance.
- decode(buf, out=None)
Synchronous decode for the numcodecs filter protocol.
- encode(buf)
Encoding is not supported.
- get_config()
Return a numcodecs-compatible configuration dict.
- classmethod from_config(config)
Construct from a numcodecs configuration dict.
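The two input shapes accepted by from_dict, and the way constructor defaults fill in omitted fields, can be sketched as below. The defaults are copied from the TiffTileCodec signature above; the helper `normalize_config` is hypothetical and only approximates what the classmethod might do internally.

```python
# Defaults copied from the TiffTileCodec constructor signature.
TIFF_TILE_DEFAULTS = {
    "compression": 1, "bits_per_sample": 8, "samples_per_pixel": 1,
    "photometric": 1, "planar_config": 1, "predictor": 1,
    "tile_width": 256, "tile_height": 256, "sample_format": 1,
    "jpeg_tables": None,
}


def normalize_config(data: dict) -> dict:
    """Return a flat configuration with defaults applied.

    Accepts either {"name": ..., "configuration": {...}} or a flat dict.
    ASSUMPTION: a stray "name" key in the flat form is simply ignored.
    """
    if "configuration" in data:
        supplied = dict(data["configuration"])
    else:
        supplied = {k: v for k, v in data.items() if k != "name"}
    # Supplied values override the defaults.
    return {**TIFF_TILE_DEFAULTS, **supplied}
```

Both `normalize_config({"compression": 5})` and the nested form carrying the same configuration yield the same flat dictionary, which is the property that lets zarr v3 metadata and numcodecs-style configs share one constructor path.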
Decode Binding Functions
decode_jpeg2000
- aws.osml.io.zarr_codecs.decode_jpeg2000(codestream, main_header=None, resolution_level=0)
Decode a JPEG 2000 codestream into a NumPy array.
If main_header is provided, reconstructs a single-tile codestream: [main_header] + [codestream] + [EOC marker 0xFF 0xD9]. Otherwise decodes codestream directly as a complete J2K codestream.
Returns an ndarray with shape (bands, height, width) and appropriate dtype.
- Parameters:
codestream – Compressed JPEG 2000 tile-part or complete codestream bytes.
main_header – Optional J2K main header bytes for single-tile reconstruction.
resolution_level – Target resolution level (0 = full resolution, default 0).
- Returns:
NumPy ndarray with shape (bands, height, width).
- Raises:
ValueError – If the codestream is invalid or decoding fails.
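The single-tile reconstruction described above is plain byte concatenation. The sketch below shows it together with the base64 decoding that a codec configuration's main_header string would need first. The helper name `rebuild_codestream` is hypothetical, and the actual decoding is left to the library.

```python
import base64
from typing import Optional

EOC_MARKER = b"\xff\xd9"  # JPEG 2000 end-of-codestream marker


def rebuild_codestream(codestream: bytes,
                       main_header_b64: Optional[str] = None) -> bytes:
    """Return bytes forming a complete single-tile J2K codestream.

    Without a header the input is assumed to be complete already and is
    returned unchanged.
    """
    if main_header_b64 is None:
        return codestream
    main_header = base64.b64decode(main_header_b64)
    # [main_header] + [tile-part bytes] + [EOC]
    return main_header + codestream + EOC_MARKER
```

Storing the main header once in the codec configuration means each chunk reference only needs to cover the tile-part bytes, which keeps range reads against cloud storage small.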
decode_jpeg
- aws.osml.io.zarr_codecs.decode_jpeg(data, bits_per_pixel, num_bands, block_width, block_height, imode, color_space)
Decode a JPEG stream into a NumPy array.
Returns an ndarray with shape (num_bands, block_height, block_width) in band-sequential (BSQ) format.
- Parameters:
data – Compressed JPEG bytes.
bits_per_pixel – Bits per pixel (8 or 12).
num_bands – Number of image bands.
block_width – Block width in pixels.
block_height – Block height in pixels.
imode – Interleave mode string (“B”, “P”, “R”, or “S”).
color_space – Color space string (“MONO”, “RGB”, or “YCbCr601”).
- Returns:
NumPy ndarray with shape (num_bands, block_height, block_width).
- Raises:
ValueError – If parameters are invalid or decoding fails.
decode_jbp_block
- aws.osml.io.zarr_codecs.decode_jbp_block(data, num_bands, block_height, block_width, nbpp, imode, pvtype)
Decode an uncompressed JBP/NITF/NSIF image block into a NumPy array.
Performs interleave conversion (imode → BSQ) and big-endian to native-endian byte swap.
Returns an ndarray with shape (num_bands, block_height, block_width) and the appropriate dtype for the given nbpp and pvtype.
- Parameters:
data – Raw pixel bytes from a JBP/NITF image block.
num_bands – Number of image bands.
block_height – Block height in pixels.
block_width – Block width in pixels.
nbpp – Number of bits per pixel per band (8, 16, 32, or 64).
imode – NITF interleave mode string (“B”, “P”, “R”, or “S”).
pvtype – NITF pixel value type string (“INT”, “SI”, “R”, or “C”).
- Returns:
NumPy ndarray with shape (num_bands, block_height, block_width).
- Raises:
ValueError – If parameters are invalid or data length mismatches.
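To make the interleave conversion and byte swap concrete, here is a pure-Python sketch for one case only: 16-bit unsigned samples stored pixel-interleaved (imode "P"). It uses the standard struct module instead of NumPy and returns nested lists indexed as bands[band][row][col]. The function name is hypothetical and the library's implementation will differ.

```python
import struct


def pixel_interleaved_to_bsq(data: bytes, num_bands: int,
                             height: int, width: int) -> list:
    """Decode big-endian uint16 samples from pixel-interleaved order to BSQ."""
    count = num_bands * height * width
    if len(data) != count * 2:
        raise ValueError("data length mismatches block dimensions")
    # ">" reads big-endian; the resulting Python ints carry no byte order.
    samples = struct.unpack(f">{count}H", data)
    # In pixel-interleaved order all bands of one pixel are adjacent.
    return [
        [
            [samples[(row * width + col) * num_bands + band]
             for col in range(width)]
            for row in range(height)
        ]
        for band in range(num_bands)
    ]
```

For a 1 x 2 block with two bands, the input sample order (b0p0, b1p0, b0p1, b1p1) becomes two bands of two samples each, matching the (num_bands, block_height, block_width) layout the function documents.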
decode_tiff_tile
- aws.osml.io.zarr_codecs.decode_tiff_tile(data, compression, bits_per_sample, samples_per_pixel, photometric, planar_config, predictor, tile_width, tile_height, sample_format, jpeg_tables=None)
Decode a TIFF-compressed tile into a NumPy array.
Constructs a synthetic single-tile TIFF in memory from the codec configuration parameters and compressed tile bytes, then decodes it using libtiff’s TIFFReadEncodedTile.
Returns an ndarray with shape (bands, tile_height, tile_width) in band-sequential (BSQ) format.
- Parameters:
data – Compressed tile bytes.
compression – TIFF compression type (e.g. 5=LZW, 7=JPEG, 8=Deflate).
bits_per_sample – Bits per sample (8, 16, 32, or 64).
samples_per_pixel – Number of bands.
photometric – Photometric interpretation (1=MinIsBlack, 2=RGB, 6=YCbCr).
planar_config – Planar configuration (1=chunky, 2=separate).
predictor – Predictor type (1=none, 2=horizontal, 3=floating-point).
tile_width – Tile width in pixels.
tile_height – Tile height in pixels.
sample_format – Sample format (1=uint, 2=int, 3=float).
jpeg_tables – Optional JPEG quantization/Huffman tables for JPEG tiles.
- Returns:
NumPy ndarray with shape (bands, tile_height, tile_width).
- Raises:
ValueError – If parameters are invalid or decoding fails.
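The predictor parameter is the least self-explanatory of these fields. With predictor=2 (horizontal differencing), each sample in a row is stored as its difference from the previous sample of the same band. libtiff reverses this internally during decoding, so callers never need to do it themselves; the sketch below exists only to show what the value means, assuming integer samples in chunky (planar_config=1) order. The function name is hypothetical.

```python
def undo_horizontal_predictor(row: list, samples_per_pixel: int,
                              bits_per_sample: int = 8) -> list:
    """Reverse TIFF horizontal differencing for one row of integer samples."""
    mask = (1 << bits_per_sample) - 1
    out = list(row)
    # The first pixel is stored as-is; each later sample adds the value of
    # the same band in the previous pixel, wrapping at the sample width.
    for i in range(samples_per_pixel, len(out)):
        out[i] = (out[i] + out[i - samples_per_pixel]) & mask
    return out
```

Smooth imagery produces many small differences, which is why this predictor usually improves LZW and Deflate compression ratios.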