JPEG 2000 Codec
Version: 1.0
URI: https://awslabs.github.io/osml-imagery-io/codecs/jpeg2000
Codec type: array-to-bytes
Decodes JPEG 2000 (Part 1) and HTJ2K (Part 15) codestreams into NumPy arrays. Supports both complete codestreams and single-tile codestream reconstruction from a shared main header and per-tile tile-part bytes.
This codec is format-agnostic — it decodes any valid J2K codestream regardless of whether
the source data originated from NITF (IC=C8/M8/CD/MD), standalone .jp2/.j2k files,
or TIFF containers.
Document Conventions
The key words “MUST”, “MUST NOT”, “SHOULD”, and “MAY” in this document are to be interpreted as described in RFC 2119.
Codec Identifier
The value of the name member in the codec metadata MUST be
https://awslabs.github.io/osml-imagery-io/codecs/jpeg2000.
Encoded Representation
The encoded representation MUST be a valid JPEG 2000 codestream conforming to
ISO/IEC 15444-1 (Part 1) or ISO/IEC 15444-15 (Part 15, HTJ2K). The codestream
begins with an SOC marker (0xFF4F) and ends with an EOC marker (0xFFD9).
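The framing requirement above can be checked cheaply before handing bytes to a decoder. The following is a minimal sketch; the helper name is_complete_codestream is hypothetical and not part of the codec, and the check looks only at the first and last two bytes, so it says nothing about whether the contents are valid.

```python
SOC = b"\xff\x4f"  # Start of codestream marker
EOC = b"\xff\xd9"  # End of codestream marker


def is_complete_codestream(data: bytes) -> bool:
    """Return True if data starts with SOC and ends with EOC.

    This is a framing check only; it does not parse or validate
    the marker segments in between.
    """
    return len(data) >= 4 and data[:2] == SOC and data[-2:] == EOC
```

A chunk that begins with SOT (0xFF90) instead of SOC fails this check, which is the case where main_header is needed.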
When main_header is provided in the configuration, the encoded representation is
the tile-part data only (beginning with SOT). The codec reconstructs a complete
codestream by prepending the main header and appending EOC.
Rationale: Why Tile-Parts Are Not Self-Contained
JPEG 2000 codestreams support internal tiling, but the tiles are not self-contained. Each tile’s compressed data (the “tile-part”) carries the entropy-coded wavelet coefficients for that tile, not the parameters needed to interpret them. The decoding parameters (tile dimensions, quantization step sizes, wavelet decomposition levels, component counts) live in the codestream’s main header, in the SIZ, COD, and QCD marker segments. A decoder cannot reconstruct pixels from a tile-part alone.
Additionally, JPEG 2000 supports progression orders (RLCP, RPCL) that interleave tile-parts from different tiles. Instead of writing all of tile 0’s data then all of tile 1’s data, the encoder writes resolution level 0 for every tile, then resolution level 1 for every tile, and so on. A single tile’s compressed bytes may be scattered across multiple non-contiguous locations in the file. The filesystem layer handles gathering these byte ranges — the codec receives the concatenated tile-part bytes.
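To illustrate the gathering step, the sketch below walks a complete codestream, separates the main header from the tile-parts, and groups tile-part bytes by tile index using the Isot and Psot fields of each SOT marker segment. The function name split_codestream is hypothetical and is not part of this codec or of the filesystem layer. It assumes a well-formed codestream that ends with EOC and performs no error recovery.

```python
import struct
from collections import defaultdict

SOC, SOT, EOC = 0xFF4F, 0xFF90, 0xFFD9


def split_codestream(data: bytes):
    """Return (main_header, {tile_index: concatenated tile-part bytes})."""
    if struct.unpack_from(">H", data, 0)[0] != SOC:
        raise ValueError("codestream does not start with SOC")

    # Walk the main-header marker segments until the first SOT.
    pos = 2
    while True:
        marker, length = struct.unpack_from(">HH", data, pos)
        if marker == SOT:
            break
        # The length field counts its own 2 bytes but not the marker.
        pos += 2 + length
    main_header = data[:pos]

    tiles = defaultdict(bytearray)
    while pos + 2 <= len(data):
        marker = struct.unpack_from(">H", data, pos)[0]
        if marker == EOC:
            break
        if marker != SOT:
            raise ValueError(f"expected SOT at offset {pos}")
        # SOT segment: marker, Lsot, Isot (tile index), Psot (tile-part length).
        _, _lsot, isot, psot = struct.unpack_from(">HHHI", data, pos)
        # Psot == 0 means the tile-part runs up to the trailing EOC.
        end = pos + psot if psot else len(data) - 2
        tiles[isot] += data[pos:end]
        pos = end
    return main_header, {index: bytes(buf) for index, buf in tiles.items()}
```

When tile-parts of different tiles are interleaved, the bytes collected for one tile come from non-contiguous offsets, which is why the codec expects them already concatenated.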
This codec solves the header problem by inlining the shared main header (base64-encoded, typically 100–500 bytes) in the codec configuration. At decode time the codec reconstructs a minimal single-tile codestream on the fly: [main_header bytes] + [tile-part bytes] + [EOC marker (0xFF 0xD9)].
This approach has precedent in the JPEG 2000 ecosystem. JPIP (the JPEG 2000 Interactive Protocol, ISO/IEC 15444-9) streams individual tile-parts to clients that already hold the main header.
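The reconstruction described in this section amounts to a byte concatenation. The following is a minimal sketch, not the codec's actual implementation: the function name reconstruct_codestream is hypothetical, and it assumes main_header is plain standard Base64 text.

```python
import base64
from typing import Optional

EOC = b"\xff\xd9"  # End of codestream marker


def reconstruct_codestream(chunk: bytes, main_header_b64: Optional[str] = None) -> bytes:
    """Build a decodable codestream from one chunk.

    Without a main header the chunk is assumed to be a complete
    codestream already and is returned unchanged.
    """
    if main_header_b64 is None:
        return chunk
    # Assumption: the configuration value is standard Base64.
    main_header = base64.b64decode(main_header_b64)
    return main_header + chunk + EOC
```

Because the header is shared by every tile, it is decoded from the configuration once per call here; a real implementation might cache the decoded bytes.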
Configuration Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| main_header | | No | | Base64-encoded J2K main header bytes (SOC, SIZ, COD, QCD markers). When present, the codec reconstructs a single-tile codestream by prepending the header to the chunk bytes and appending an EOC marker. When absent, the chunk MUST be a complete codestream. |
| resolution_level | | No | | Target resolution level. |
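The effect of resolution_level on output size can be sketched as follows. This assumes the common JPEG 2000 decoder convention that level 0 is full resolution and each further level halves both dimensions, rounding up; the source does not state which convention the codec uses. The helper reduced_shape is hypothetical and ignores image offsets.

```python
def reduced_shape(bands: int, height: int, width: int, resolution_level: int = 0):
    """Return the (bands, height, width) shape at a reduced resolution.

    Assumption: level 0 is full resolution and each level divides
    the spatial dimensions by two, rounding up.
    """
    if resolution_level < 0:
        raise ValueError("resolution_level must be non-negative")
    scale = 1 << resolution_level
    # -(-n // scale) is integer ceiling division.
    return (bands, -(-height // scale), -(-width // scale))
```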
Algorithm
Decoding
1. If main_header is present in the configuration, base64-decode it to obtain the raw header bytes.
2. Reconstruct a minimal single-tile codestream: [main_header bytes] + [chunk bytes] + [EOC marker (0xFF 0xD9)].
3. If no main_header is present, use the chunk bytes directly as a complete codestream.
4. Decode the codestream at the specified resolution_level.
5. Return an array with shape (bands, height, width) and dtype matching the codestream’s bit depth and signedness (e.g., 8-bit unsigned -> uint8, 16-bit signed -> int16).
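The band count, image size, bit depth, and signedness used in the last step are all recorded in the SIZ marker segment, which immediately follows SOC in the main header. The sketch below reads those fields with the standard library only. The names describe_siz and dtype_name are hypothetical. Taking the depth from the first component and promoting non-byte-aligned depths (for example 12-bit) to the next wider integer type are assumptions; the source gives only the 8-bit and 16-bit examples.

```python
import struct


def dtype_name(bit_depth: int, signed: bool) -> str:
    """Pick the narrowest integer dtype name that holds bit_depth bits."""
    for bits in (8, 16, 32):
        if bit_depth <= bits:
            return ("int" if signed else "uint") + str(bits)
    raise ValueError(f"unsupported bit depth: {bit_depth}")


def describe_siz(main_header: bytes) -> dict:
    """Read image geometry and sample type from the SIZ marker segment."""
    if main_header[:4] != b"\xff\x4f\xff\x51":
        raise ValueError("expected SOC followed by SIZ")
    # Lsiz, Rsiz, Xsiz, Ysiz, XOsiz, YOsiz, XTsiz, YTsiz, XTOsiz, YTOsiz, Csiz
    fields = struct.unpack_from(">HH8IH", main_header, 4)
    _lsiz, _rsiz, xsiz, ysiz, xosiz, yosiz, xtsiz, ytsiz, _xto, _yto, csiz = fields
    # Ssiz of the first component: bit 7 is the sign flag,
    # the low 7 bits hold (bit depth - 1).
    ssiz = main_header[42]
    signed = bool(ssiz & 0x80)
    bit_depth = (ssiz & 0x7F) + 1
    return {
        "bands": csiz,
        "height": ysiz - yosiz,
        "width": xsiz - xosiz,
        "tile_height": ytsiz,
        "tile_width": xtsiz,
        "dtype": dtype_name(bit_depth, signed),
    }
```

The returned dtype string can be passed to numpy.dtype when allocating the output array.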
Encoding
Encoding is not currently specified. See Implementation Notes.
Example Configuration
{
  "name": "https://awslabs.github.io/osml-imagery-io/codecs/jpeg2000",
  "configuration": {
    "main_header": "base64:ff4f...encoded main header...",
    "resolution_level": 0
  }
}
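A reader of such metadata might check it before use. The sketch below is one possible validation, under stated assumptions: the function name validate_codec_metadata is hypothetical, treating a missing resolution_level as 0 is an assumption (the default is not given in the source), and requiring a non-negative integer is likewise assumed. Only the name check is a stated MUST.

```python
CODEC_URI = "https://awslabs.github.io/osml-imagery-io/codecs/jpeg2000"


def validate_codec_metadata(meta: dict) -> dict:
    """Check codec metadata and return the normalized configuration."""
    if meta.get("name") != CODEC_URI:
        raise ValueError("unexpected codec name")
    config = meta.get("configuration") or {}

    # Both fields are optional.
    header = config.get("main_header")
    if header is not None and not isinstance(header, str):
        raise TypeError("main_header must be a string")

    # Assumption: a missing level means 0; bool is rejected although it subclasses int.
    level = config.get("resolution_level", 0)
    if isinstance(level, bool) or not isinstance(level, int) or level < 0:
        raise ValueError("resolution_level must be a non-negative integer")

    return {"main_header": header, "resolution_level": level}
```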
References
ISO/IEC 15444-1:2019 — JPEG 2000 image coding system: Core coding system
ISO/IEC 15444-15:2019 — JPEG 2000 image coding system: High-Throughput JPEG 2000 (HTJ2K)
RFC 2119 — Key words for use in RFCs to Indicate Requirement Levels
Implementation Notes
Implementation class: aws.osml.io.zarr_codecs.Jpeg2000Codec (see API Reference).
Only the decode path is implemented. Calling encode() raises NotImplementedError.