# Working with Pixels

## The Simple Path

For most pixel workflows, the convenience functions give you NumPy arrays directly without thinking about blocks or assets:

```python
from aws.osml.io import imread, tiles

# Full image as a CHW NumPy array
pixels = imread("image.ntf")
print(pixels.shape)  # (3, 1024, 1024): (bands, height, width)
print(pixels.dtype)  # uint8

# Process a large image in tiles without loading it all into memory
for tile in tiles("large_image.tif", tile_size=(256, 256)):
    process(tile.data)  # tile.data is a CHW NumPy array
```

The sections below cover the details of how pixel data is represented, how to convert between channel orderings for different libraries, and how to work with in-memory image buffers for writing.

## Image Data Arrays

Block data is returned as a [NumPy](https://numpy.org/) `ndarray`. NumPy is the standard array interface shared by machine learning frameworks (PyTorch, TensorFlow), computer vision libraries (OpenCV, Pillow), and the broader scientific Python ecosystem. Returning pixel data as an ndarray means you can pass blocks directly into these tools without an intermediate copy or conversion step.

For pixels wider than 8 bits (e.g. 16-bit or 32-bit imagery), the library automatically converts from the format's stored byte order to the native byte order of your platform. NITF files store multi-byte values in big-endian order; on a little-endian machine the bytes are swapped during decode so the resulting ndarray is ready to use without manual conversion.

The NumPy dtype is selected automatically based on the image's pixel type: an 8-bit unsigned image produces a `uint8` array, a 16-bit signed image produces `int16`, a 32-bit float produces `float32`, and so on.

Each array has shape `(bands, rows, cols)`, a channels-first (CHW) layout. This matches the convention used by PyTorch and many deep learning pipelines, where a batch of images is shaped `(N, C, H, W)`.
Channels-first ordering is also convenient for remote sensing workflows where analysis steps typically operate on a subset of spectral bands.

Libraries differ in the channel ordering they expect:

| Library | Format | Shape |
|---------|--------|-------|
| osml-imagery-io | Channels First (CHW) | `(bands, rows, cols)` |
| PyTorch | Channels First (NCHW) | `(batch, channels, height, width)` |
| OpenCV | Channels Last (HWC) | `(rows, cols, channels)` |
| Pillow | Channels Last (HWC) | `(height, width, channels)` |

Use `np.transpose` to reorder the axes of a block for channels-last libraries:

```python
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from aws.osml.io import IO

with IO.open(["image.ntf"], "r") as dataset:
    image = dataset.get_asset("image:0")
    block_chw = image.get_block(0, 0, resolution_level=0)

# Convert to channels-last for display
block_hwc = np.transpose(block_chw, (1, 2, 0))

# Display with matplotlib
plt.imshow(block_hwc)
plt.title("Block (0, 0)")
plt.show()

# Or convert to a PIL Image for further manipulation
pil_image = Image.fromarray(block_hwc)
```

## Creating an Image from Scratch

`BufferedImageAssetProvider` and `BufferedMetadataProvider` let you build images and their associated metadata entirely in memory. `BufferedMetadataProvider` is a mutable key-value store for encoding hints and format fields, such as compression type (`IC`) and interleave mode (`IMODE`). `BufferedImageAssetProvider` holds the pixel data and implements the same `ImageAssetProvider` interface used by file-backed images, so in-memory images can be passed to any API that accepts an image asset, including the writer. This is useful for creating synthetic test data, assembling mosaics, or building images from processed results.
You can populate the image all at once with `set_full_image()`:

```python
from aws.osml.io import BufferedImageAssetProvider, BufferedMetadataProvider, PixelType
import numpy as np

metadata = BufferedMetadataProvider()
metadata["IC"] = "NC"
metadata["IMODE"] = "B"

# randint's upper bound is exclusive, so 256 covers the full uint8 range
image_data = np.random.randint(0, 256, (3, 512, 512), dtype=np.uint8)

provider = BufferedImageAssetProvider.create(
    key="synthetic_image",
    num_columns=512,
    num_rows=512,
    num_bands=3,
    block_width=256,
    block_height=256,
    pixel_type=PixelType.UInt8,
    metadata=metadata,
)
provider.set_full_image(image_data)
```

For large images or sparse data, set blocks individually with `set_block()` instead of loading the full image into memory:

```python
# Reuses the imports and `metadata` from the previous example
provider = BufferedImageAssetProvider.create(
    key="tiled_image",
    num_columns=1024,
    num_rows=1024,
    num_bands=3,
    block_width=256,
    block_height=256,
    pixel_type=PixelType.UInt8,
    metadata=metadata,
)

# A 1024x1024 image with 256x256 blocks has a 4x4 block grid
for row in range(4):
    for col in range(4):
        block = np.random.randint(0, 256, (3, 256, 256), dtype=np.uint8)
        provider.set_block(row, col, block)
```

## JPEG Color Space Handling in TIFF

TIFF files compressed with JPEG (compression code 7) store RGB pixel data internally as YCbCr. This is a requirement of the JPEG-in-TIFF specification (TIFF Technical Note 2); libtiff sets `PhotometricInterpretation` to YCbCr (6) for images with 3 or more bands and performs the RGB-to-YCbCr conversion automatically during encoding.

On the read side, libtiff converts YCbCr back to RGB during decoding. The pixel data returned by `get_block()` is always RGB, and the `PhotometricInterpretation` reported in metadata reflects the decoded color space (RGB), not the on-disk storage format. Callers never need to handle YCbCr data directly.

On the write side, callers provide standard RGB pixel data and select JPEG compression by setting encoding hint `"259"` to `7`. libtiff handles the RGB-to-YCbCr conversion as part of JPEG encoding.
JPEG quality is configurable via encoding hint `"65537"` (values 1–100, default 75).

For single-band (grayscale) images, JPEG compression uses `PhotometricInterpretation` MinIsBlack (1) and no color space conversion occurs.

The YCbCr conversion is an internal codec detail, similar to JPEG quantization. It is lossy: writing and reading back a JPEG-compressed TIFF will not produce an exact pixel match. Use PSNR or similar metrics to evaluate fidelity rather than exact comparison.

```python
from aws.osml.io import IO, BufferedImageAssetProvider, BufferedMetadataProvider, PixelType
import numpy as np

# Write a JPEG-compressed TIFF
metadata = BufferedMetadataProvider()
metadata["259"] = 7  # JPEG compression
metadata["65537"] = 85  # Quality 85

# randint's upper bound is exclusive, so 256 covers the full uint8 range
image_data = np.random.randint(0, 256, (3, 256, 256), dtype=np.uint8)

provider = BufferedImageAssetProvider.create(
    key="rgb_image",
    num_columns=256,
    num_rows=256,
    num_bands=3,
    block_width=256,
    block_height=256,
    pixel_type=PixelType.UInt8,
    metadata=metadata,
)
provider.set_full_image(image_data)

with IO.open(["output.tif"], "w", "tiff") as writer:
    writer.add_asset(provider)

# Read it back: pixels are RGB, not YCbCr
with IO.open(["output.tif"], "r") as reader:
    image = reader.get_asset("image:0")
    block = image.get_block(0, 0, resolution_level=0)
    print(block.dtype)  # uint8
    print(block.shape)  # (3, 256, 256), RGB channels
```

## Indexed (Palette Color) Images

Some image formats store pixel values as indices into a color lookup table rather than direct color values. Both TIFF and NITF support this concept, though the mechanisms differ. In all cases, `ImageAssetProvider` returns the raw index values as stored in the file; it does not apply lookup tables automatically.

The library does not perform palette expansion because many workflows need the raw indices. Classification maps and thematic rasters use each index to represent a land cover class or category, not a display color.
Applying the lookup table to produce RGB pixels is a separate processing step.

### TIFF Palette Color

In TIFF files, palette color is indicated by `PhotometricInterpretation = 3`. Each pixel is a single-byte index and the actual RGB colors are defined in a separate `ColorMap` tag. A palette-color TIFF will report 1 band of `uint8` data, and `get_block()` returns the index array, not the expanded RGB pixels.

```python
from aws.osml.io import IO

with IO.open(["indexed_image.tif"], "r") as dataset:
    image = dataset.get_asset("image:0")
    print(image.num_bands)  # 1
    print(image.pixel_value_type)  # PixelType.UInt8

    block = image.get_block(0, 0, resolution_level=0)
    print(block.shape)  # (1, rows, cols): index values, not RGB
```

### NITF Lookup Tables (LUTs)

NITF files support per-band lookup tables through the image subheader fields `NLUTSn`, `NELUTn`, and `LUTDnm` (JBP §5.13.2.28–5.13.2.30). The most common case is `IREP=RGB/LUT`: a single-band image where each pixel is an index and three LUTs (red, green, blue) define the color mapping. Individual bands in `IREP=MONO` or `IREP=MULTI` images can also carry LUTs when `IREPBANDn=LU`.

LUTs are only valid for uncompressed images (`IC=NC` or `NM`) with integer or binary pixel types (`PVTYPE=INT` or `B`). For compressed formats like JPEG (`IC=C3`) and JPEG 2000 (`IC=C8`), color handling is internal to the codec: the decoder outputs final pixel values directly and `NLUTSn` is always 0. Vector Quantization (`IC=C4`) uses its own codebook-based color lookup mechanism defined in MIL-STD-188-199, which is separate from the subheader LUT fields.

As with TIFF, `get_block()` returns the raw stored values. For an `IREP=RGB/LUT` image, this means 1 band of index values. The LUT data itself is accessible through the image segment's metadata: the subheader fields `NLUTSn`, `NELUTn`, and `LUTDnm` are parsed and available for applications that need to perform the lookup.
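
Since the lookup is left to the application, a minimal NumPy-only sketch of the expansion step may help. It is not part of the library: `apply_lut` is a hypothetical helper, and it assumes you have already assembled the table as a `(3, num_entries)` array with one row each for red, green, and blue. How the entries are read out of the metadata is not shown here, because the key names depend on the format. Note that TIFF `ColorMap` entries are 16-bit values, so they would typically be scaled down to 8 bits first, whereas NITF `LUTDnm` entries are already single bytes.

```python
import numpy as np


def apply_lut(indices: np.ndarray, lut: np.ndarray) -> np.ndarray:
    """Expand a (1, rows, cols) index block into a (3, rows, cols) RGB block.

    Hypothetical helper: `lut` is assumed to be shaped (3, num_entries),
    with one row each for red, green, and blue.
    """
    if indices.ndim != 3 or indices.shape[0] != 1:
        raise ValueError("expected a single-band CHW block")
    if indices.max() >= lut.shape[1]:
        raise ValueError("index exceeds LUT length")
    # Integer-array indexing on the entry axis looks up every pixel at once:
    # lut[:, idx] with idx shaped (rows, cols) yields (3, rows, cols).
    return lut[:, indices[0]]


# Illustrative 4-entry palette: black, red, green, blue
lut = np.array(
    [
        [0, 255, 0, 0],  # red row
        [0, 0, 255, 0],  # green row
        [0, 0, 0, 255],  # blue row
    ],
    dtype=np.uint8,
)

# A tiny 2x2 single-band index block in CHW layout
indices = np.array([[[0, 1], [2, 3]]], dtype=np.uint8)

rgb = apply_lut(indices, lut)
print(rgb.shape)  # (3, 2, 2)
```

The result stays in CHW layout, so it can be transposed to channels-last for display in the same way as any other block.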