# OversightML Imagery IO API Design: Tiled Image Pyramid Access
This document presents the API design for OversightML's low-level access to large tiled image pyramids. The API combines concepts from the National Imagery Transmission Format (NITF) specification with ideas from SpatioTemporal Asset Catalogs (STAC) to provide a framework for geospatial imagery access.
For usage examples and practical guidance, see the [User Guide](../user-guide/index.md).
## Overview
## Core API Structure
The API models **Datasets** as collections of related assets (images, graphics, text, data), each with its own metadata. Assets are accessed by string keys rather than numeric indices, enabling discovery and categorization while remaining flexible enough to represent format-specific data models like the Joint BIIF Profile (JBP).
The `DatasetReader` and `DatasetWriter` abstract classes provide the main entry points, while the `IO` class serves as a factory that selects the appropriate implementation based on file format detection.
```mermaid
classDiagram
direction TB
class DatasetReader {
<<abstract>>
+get_asset(key: str) AssetProvider
+get_asset_keys(asset_type: Optional[AssetType], roles: Optional[List[str]]) List[str]
+has_asset(key: str) bool
+metadata MetadataProvider
+close() None
+__enter__() DatasetReader
+__exit__() None
}
class DatasetWriter {
<<abstract>>
+add_asset(key: str, provider: AssetProvider, title: str, description: str, roles: List[str]) None
+metadata MetadataProvider
+close() None
+__enter__() DatasetWriter
+__exit__() None
}
class IO {
+open(paths: List[str], mode: str, format: Optional[str]) Union[DatasetReader, DatasetWriter]
}
DatasetReader --> AssetProvider : provides
DatasetReader --> MetadataProvider : provides
DatasetWriter --> AssetProvider : consumes
DatasetWriter --> MetadataProvider : consumes
AssetProvider --> MetadataProvider : provides
IO --> DatasetReader : provides
IO --> DatasetWriter : provides
<<abstract>> AssetProvider
<<abstract>> MetadataProvider
```
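As a rough illustration of string-keyed asset access, the sketch below reduces the reader surface from the diagram to asset lookup and backs it with an in-memory dictionary. `StubAsset`, `InMemoryDatasetReader`, the key names such as `image:0`, the use of plain strings for asset types, and the rule that an asset must carry every requested role are all assumptions made for the example, not part of the library.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class StubAsset:
    """Hypothetical stand-in for an AssetProvider: key, type, and roles only."""

    def __init__(self, key: str, asset_type: str, roles: List[str]):
        self.key = key
        self.asset_type = asset_type
        self.roles = roles


class DatasetReader(ABC):
    """Reader surface from the diagram, reduced to asset lookup."""

    @abstractmethod
    def get_asset(self, key: str) -> StubAsset: ...

    @abstractmethod
    def get_asset_keys(
        self, asset_type: Optional[str] = None, roles: Optional[List[str]] = None
    ) -> List[str]: ...

    def has_asset(self, key: str) -> bool:
        return key in self.get_asset_keys()

    def close(self) -> None:
        """Release resources; the in-memory stub holds none."""

    def __enter__(self) -> "DatasetReader":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        self.close()


class InMemoryDatasetReader(DatasetReader):
    """Hypothetical concrete reader used only for this example."""

    def __init__(self, assets: List[StubAsset]):
        self._assets: Dict[str, StubAsset] = {a.key: a for a in assets}

    def get_asset(self, key: str) -> StubAsset:
        return self._assets[key]

    def get_asset_keys(self, asset_type=None, roles=None):
        keys = []
        for key, asset in self._assets.items():
            if asset_type is not None and asset.asset_type != asset_type:
                continue
            # Assumption: an asset matches only if it has every requested role.
            if roles and not set(roles).issubset(asset.roles):
                continue
            keys.append(key)
        return keys


with InMemoryDatasetReader([
    StubAsset("image:0", "image", ["data"]),
    StubAsset("image:1", "image", ["thumbnail"]),
    StubAsset("text:0", "text", ["metadata"]),
]) as reader:
    image_keys = reader.get_asset_keys(asset_type="image")
    data_keys = reader.get_asset_keys(roles=["data"])
    first = reader.get_asset(image_keys[0])
```

Filtering by type and role returns keys rather than assets, so callers can discover what a dataset contains before paying the cost of opening any individual asset.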
## Asset Provider Hierarchy
The Asset Provider hierarchy handles different content types found in geospatial datasets. The base `AssetProvider` class establishes common metadata and organizational elements that all assets share, including keys, titles, descriptions, media types, and roles for discovery and categorization. Specialized providers extend this with type-specific functionality: `ImageAssetProvider` offers blocked access for processing large imagery, `TextAssetProvider` handles encoding and format-specific text retrieval, `DataAssetProvider` provides parsing for structured data like XML and JSON, and `GraphicsAssetProvider` manages vector graphics and annotations. This hierarchy allows datasets to function as self-describing collections.
```mermaid
classDiagram
direction TB
class AssetProvider {
<<abstract>>
+key str
+title str
+description str
+media_type str
+roles List[str]
+asset_type AssetType
+raw_asset BytesIO
+metadata MetadataProvider
+from_bytes(data: bytes, key: str, media_type: str)$ AssetProvider
}
class MetadataProvider {
+raw BytesIO
+entries(prefix: Optional[str]) Dict[str, Any]
+get(key: str, default=None) Any
+keys() list[str]
+values() list[Any]
+items() list[tuple[str, Any]]
}
class ImageAssetProvider {
<<abstract>>
+has_block(block_row: int, block_col: int, resolution_level: int) bool
+get_block(block_row: int, block_col: int, resolution_level: int, bands: Optional[List[int]]) ndarray
+num_resolution_levels int
+num_bands int
+num_rows int
+num_columns int
+num_pixels_per_block_horizontal int
+num_pixels_per_block_vertical int
+num_bits_per_pixel int
+actual_bits_per_pixel int
+pixel_value_type dtype
+pad_pixel_value Number
+image_shape Tuple[int, int, int]
+block_shape Tuple[int, int, int]
+block_grid_size Tuple[int, int]
}
class TextAssetProvider {
<<interface>>
+text str
+encoding str
+format str
}
class DataAssetProvider {
<<interface>>
+mime_type str
}
class GraphicsAssetProvider {
<<interface>>
+raw_asset BytesIO
}
AssetProvider <|-- ImageAssetProvider
AssetProvider <|-- TextAssetProvider
AssetProvider <|-- GraphicsAssetProvider
AssetProvider <|-- DataAssetProvider
AssetProvider --> MetadataProvider : provides
```
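The `MetadataProvider` surface in the diagram resembles a read-only mapping with an extra prefix filter. A minimal sketch of that shape, assuming a plain dictionary as backing storage, might look like the following. `DictMetadataProvider` is a hypothetical name, and the NITF field names used as keys are illustrative only; the library's real key scheme may differ.

```python
from collections.abc import Mapping
from typing import Any, Dict, Iterator, Optional


class DictMetadataProvider(Mapping):
    """Hypothetical read-only metadata provider backed by a plain dict."""

    def __init__(self, fields: Dict[str, Any]):
        self._fields = dict(fields)

    # Mapping derives get(), keys(), values(), and items() from these three.
    def __getitem__(self, key: str) -> Any:
        return self._fields[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._fields)

    def __len__(self) -> int:
        return len(self._fields)

    def entries(self, prefix: Optional[str] = None) -> Dict[str, Any]:
        """Return all fields, or only those whose key starts with `prefix`."""
        if prefix is None:
            return dict(self._fields)
        return {k: v for k, v in self._fields.items() if k.startswith(prefix)}


meta = DictMetadataProvider({"IC": "C8", "IMODE": "B", "NROWS": 2048, "NCOLS": 4096})
compression = meta.get("IC")
missing = meta.get("COMRAT", "n/a")
size_fields = meta.entries(prefix="N")
```

Note that `collections.abc.Mapping` returns view objects from `keys()`, `values()`, and `items()`, whereas the diagram shows lists; wrap them in `list(...)` if list semantics are needed.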
## ImageAssetProvider Hierarchy
The ImageAssetProvider hierarchy supports multiple image compression formats and data sources through a common blocked access interface. Each concrete implementation handles the decoding and access patterns required for its format while presenting a consistent API for blocked image data retrieval. The `BufferedImageAssetProvider` enables in-memory processing workflows, while format-specific providers like `JBPImageAssetProvider`, `J2KImageAssetProvider`, and `TIFFImageAssetProvider` provide lazy decoding and encoding for specific compression schemes and file structures. This design allows applications to work with different image formats—JPEG 2000 compressed imagery in NITF files, standard TIFF pyramids, or data generated in memory—through the same interface.
```mermaid
classDiagram
direction LR
class ImageAssetProvider {
<<abstract>>
+has_block(block_row: int, block_col: int, resolution_level: int) bool
+get_block(block_row: int, block_col: int, resolution_level: int, bands: Optional[List[int]]) ndarray
+num_resolution_levels int
+num_bands int
+num_rows int
+num_columns int
+num_pixels_per_block_horizontal int
+num_pixels_per_block_vertical int
+num_bits_per_pixel int
+actual_bits_per_pixel int
+pixel_value_type dtype
+pad_pixel_value Number
+image_shape Tuple[int, int, int]
+block_shape Tuple[int, int, int]
+block_grid_size Tuple[int, int]
}
class BufferedImageAssetProvider {
+create(key: str, num_columns: int, num_rows: int, num_bands: int, block_width: int, block_height: int, pixel_type: PixelType, actual_bits_per_pixel: Optional[int], metadata: Optional[MetadataProvider], title: Optional[str], description: Optional[str])$ BufferedImageAssetProvider
+set_full_image(data: ndarray) None
+set_full_image_u16(data: ndarray) None
+set_block(block_row: int, block_col: int, data: bytes) None
}
class JBPImageAssetProvider {
+__init__(key: str, file_handle: BinaryIO, ifd_offset: int, title: str, roles: List[str])
}
class J2KImageAssetProvider {
+__init__(key: str, file_handle: BinaryIO, ifd_offset: int, title: str, roles: List[str])
}
class JPEGImageAssetProvider {
+__init__(key: str, file_handle: BinaryIO, ifd_offset: int, title: str, roles: List[str])
}
class TIFFImageAssetProvider {
+__init__(key: str, file_handle: BinaryIO, ifd_offset: int, title: str, roles: List[str])
}
class PNGImageAssetProvider {
+__init__(key: str, file_handle: BinaryIO, ifd_offset: int, title: str, roles: List[str])
}
ImageAssetProvider <|-- JBPImageAssetProvider
ImageAssetProvider <|-- J2KImageAssetProvider
ImageAssetProvider <|-- JPEGImageAssetProvider
ImageAssetProvider <|-- TIFFImageAssetProvider
ImageAssetProvider <|-- PNGImageAssetProvider
ImageAssetProvider <|-- BufferedImageAssetProvider
```
For block access patterns, resolution levels, and pixel data format details, see the [Image Assets](../user-guide/image-assets.md) and [Working with Pixels](../user-guide/working-with-pixels.md) user guides.
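The relationship between image size, block size, and the block grid can be sketched with plain integer arithmetic. The helper names below are hypothetical, and the `(rows, columns)` ordering of the grid tuple is an assumption. The sketch also assumes that edge blocks are returned at full block size with padding, which the `pad_pixel_value` property suggests but this document does not state.

```python
from typing import Iterator, Tuple


def block_grid_size(
    num_rows: int, num_columns: int, block_height: int, block_width: int
) -> Tuple[int, int]:
    """Number of block rows and block columns needed to cover the image."""
    grid_rows = (num_rows + block_height - 1) // block_height  # ceiling division
    grid_cols = (num_columns + block_width - 1) // block_width
    return grid_rows, grid_cols


def valid_extent(
    block_row: int, block_col: int,
    num_rows: int, num_columns: int,
    block_height: int, block_width: int,
) -> Tuple[int, int]:
    """(height, width) of real image pixels inside a block; the rest is padding."""
    height = min(block_height, num_rows - block_row * block_height)
    width = min(block_width, num_columns - block_col * block_width)
    return height, width


def iter_blocks(grid: Tuple[int, int]) -> Iterator[Tuple[int, int]]:
    """Yield (block_row, block_col) pairs in row-major order."""
    for block_row in range(grid[0]):
        for block_col in range(grid[1]):
            yield block_row, block_col


# A 2500 x 3000 pixel image stored in 1024 x 1024 blocks.
grid = block_grid_size(2500, 3000, 1024, 1024)
blocks = list(iter_blocks(grid))
corner = valid_extent(2, 2, 2500, 3000, 1024, 1024)
```

Here the grid is 3 by 3, and the bottom-right block holds only 452 rows and 952 columns of image data.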
## GraphicsAssetProvider
The `GraphicsAssetProvider` interface provides access to vector graphics data within geospatial datasets. In NITF files, graphic segments contain CGM (Computer Graphics Metafile) data representing annotations, overlays, and vector graphics that can be rendered on top of imagery.
### Interface Design
The `GraphicsAssetProvider` trait extends `AssetProvider` without adding any methods of its own. This minimal design reflects three points:
1. Raw CGM data is accessed through the inherited `raw_asset()` method
2. Graphic-specific metadata (display level, attachment level, location, bounds) is accessed via the `metadata()` Mapping interface
3. The library extracts raw CGM bytes but does not parse CGM content—users provide their own CGM parsing libraries
```mermaid
classDiagram
direction TB
class AssetProvider {
<<abstract>>
+key str
+title str
+description str
+media_type str
+roles List[str]
+asset_type AssetType
+raw_asset BytesIO
+metadata MetadataProvider
}
class GraphicsAssetProvider {
<<interface>>
}
class JBPGraphicsAssetProvider {
-key: String
-title: String
-description: String
-roles: Vec~String~
-location: SegmentLocation
-data: Arc~[u8]~
-metadata: Arc~MetadataProvider~
}
AssetProvider <|-- GraphicsAssetProvider
GraphicsAssetProvider <|.. JBPGraphicsAssetProvider
```
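Because CGM parsing is left to the caller, a consumer typically reads the raw bytes and uses the segment metadata to decide how to render them. The sketch below is one possible shape for that, under stated assumptions: `StubGraphicsAsset` is a hypothetical stand-in, the metadata keys `SDLVL` and `SALVL` follow the NITF graphic subheader field names but may not match the library's key scheme, and the byte strings are placeholders rather than valid CGM.

```python
from io import BytesIO
from typing import Any, Callable, Dict, List


class StubGraphicsAsset:
    """Hypothetical stand-in for a GraphicsAssetProvider."""

    def __init__(self, key: str, cgm_bytes: bytes, metadata: Dict[str, Any]):
        self.key = key
        self._cgm_bytes = cgm_bytes
        self.metadata = metadata

    @property
    def raw_asset(self) -> BytesIO:
        # A fresh stream per access, so readers do not share a position.
        return BytesIO(self._cgm_bytes)


def in_display_order(graphics: List[StubGraphicsAsset]) -> List[StubGraphicsAsset]:
    """Sort by display level so lower levels are drawn first (underneath)."""
    return sorted(graphics, key=lambda g: int(g.metadata["SDLVL"]))


def parse_all(
    graphics: List[StubGraphicsAsset], parse_cgm: Callable[[bytes], Any]
) -> Dict[str, Any]:
    """Hand each graphic's raw bytes to a caller-supplied CGM parser."""
    return {g.key: parse_cgm(g.raw_asset.read()) for g in in_display_order(graphics)}


graphics = [
    StubGraphicsAsset("graphic:1", b"\x00\x01\x02", {"SDLVL": "003", "SALVL": "001"}),
    StubGraphicsAsset("graphic:0", b"\x00\x01", {"SDLVL": "002", "SALVL": "001"}),
]
# len stands in for a real CGM parser, which the caller would supply.
parsed = parse_all(graphics, parse_cgm=len)
```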
For usage examples, see the [Graphics Assets](../user-guide/graphics-assets.md) user guide.
## TextAssetProvider
The `TextAssetProvider` interface provides access to text content within geospatial datasets. In NITF files, text segments contain textual data with associated metadata for character encoding and display properties. The interface handles encoding-aware text retrieval and line delimiter normalization.
### Interface Design
The `TextAssetProvider` trait extends `AssetProvider` with text-specific methods for accessing decoded content and encoding information:
```mermaid
classDiagram
direction TB
class AssetProvider {
<<abstract>>
+key str
+title str
+description str
+media_type str
+roles List[str]
+asset_type AssetType
+raw_asset BytesIO
+metadata MetadataProvider
}
class TextAssetProvider {
<<interface>>
+text str
+encoding str
+format str
}
class JBPTextAssetProvider {
-key: String
-title: String
-description: String
-roles: Vec~String~
-location: SegmentLocation
-data: Arc~[u8]~
-metadata: Arc~MetadataProvider~
-txtfmt: String
}
class BufferedTextAssetProvider {
-key: String
-title: String
-description: String
-roles: Vec~String~
-text_content: String
-encoding: String
-metadata: Arc~MetadataProvider~
}
AssetProvider <|-- TextAssetProvider
TextAssetProvider <|.. JBPTextAssetProvider
TextAssetProvider <|.. BufferedTextAssetProvider
```
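Encoding-aware retrieval with line delimiter normalization can be sketched in a few lines. The mapping from NITF `TXTFMT` codes to Python codecs below is an approximation chosen for the example, `decode_text` is a hypothetical helper, and the choice to replace undecodable bytes rather than raise is likewise an assumption.

```python
# Assumed, approximate mapping from NITF TXTFMT codes to Python codecs.
TXTFMT_CODECS = {
    "STA": "ascii",    # basic character set
    "MTF": "ascii",    # message text format
    "UT1": "latin-1",  # extended character set
    "U8S": "utf-8",    # UTF-8 subset
}


def decode_text(raw: bytes, txtfmt: str) -> str:
    """Decode a text segment and normalize its line delimiters to LF."""
    codec = TXTFMT_CODECS.get(txtfmt, "utf-8")  # assumption: fall back to UTF-8
    text = raw.decode(codec, errors="replace")
    # Replace CR/LF pairs first so a pair does not become two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


plain = decode_text(b"LINE ONE\r\nLINE TWO\r\n", "STA")
accented = decode_text(b"caf\xe9\r\n", "UT1")
```

The raw bytes stay available through `raw_asset`, so callers that need the original delimiters or a different decoding policy are not locked into this normalization.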
For usage examples, see the [Text Assets](../user-guide/text-assets.md) user guide.
## Writer API: Why Encoding Hints Use Metadata
The writer side of the API uses `BufferedMetadataProvider` to control how images are encoded when written to disk. This design keeps format-specific parameters out of abstract interfaces:
```mermaid
flowchart LR
A[BufferedMetadataProvider] -->|"set('IC', 'C8')"| B[metadata storage]
B --> C[BufferedImageAssetProvider]
C -->|"metadata()"| D[DatasetWriter]
D -->|"reads IC, IMODE, etc."| E[Encoder Selection]
E --> F[Output File]
```
Routing encoding hints through metadata has four benefits:
1. **Clean abstractions**: `BufferedImageAssetProvider` doesn't need NITF-specific parameters
2. **Seamless copying**: Metadata from a reader can flow directly to a writer
3. **Consistent naming**: The same field names used when reading are used when writing
4. **Format flexibility**: Different output formats read different hint fields
The writer knows what format it's writing, so it knows which metadata fields to look for. This allows the same `BufferedImageAssetProvider` to be written to NITF, GeoTIFF, or other formats by simply changing the writer and the encoding hints.
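To make the hint mechanism concrete, the sketch below shows how a writer might map the NITF `IC` (image compression) field to an encoder. The `IC` codes are standard NITF values, but the encoder names, the `select_encoder` helper, and the default of uncompressed output when `IC` is absent are assumptions for illustration only.

```python
from typing import Mapping, Tuple

# NITF IC codes -> (hypothetical encoder name, whether a block mask is written)
IC_ENCODERS = {
    "NC": ("uncompressed", False),
    "NM": ("uncompressed", True),
    "C3": ("jpeg", False),
    "M3": ("jpeg", True),
    "C8": ("jpeg2000", False),
    "M8": ("jpeg2000", True),
}


def select_encoder(metadata: Mapping[str, str]) -> Tuple[str, bool]:
    """Pick an encoder from the IC hint carried in the asset's metadata."""
    ic = metadata.get("IC", "NC")  # assumption: default to uncompressed
    try:
        return IC_ENCODERS[ic]
    except KeyError:
        raise ValueError(f"unsupported IC value: {ic!r}") from None


# Metadata read from a source image flows to the writer unchanged...
reader_metadata = {"IC": "NC", "IMODE": "B", "NBANDS": "3"}
writer_metadata = dict(reader_metadata)
# ...and a single overridden hint is enough to request transcoding.
writer_metadata["IC"] = "C8"

original = select_encoder(reader_metadata)
transcoded = select_encoder(writer_metadata)
```

A writer for a different format would ignore `IC` and consult its own hint fields instead, which is why the image provider itself never needs to know about them.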
For encoding options, chipping/transcoding workflows, and masked image support, see the [Writing Imagery Assets](../user-guide/image-assets-writing.md) user guide.
## ImageOperation Pattern for Large Image Processing
The ImageOperation pattern applies image processing algorithms to large geospatial imagery without loading entire images into memory. `ImageOperation` itself implements the `ImageAssetProvider` interface, so operations can be chained and composed while keeping the same blocked access pattern as the underlying data sources. The class wraps any callable (such as a scikit-image filter) and applies it block by block as data is requested, which allows integration with existing image processing libraries. Simple per-block operations need only the requested block, while neighborhood-based algorithms also draw on adjacent blocks through the class's caching and block retrieval mechanisms. Together these let processing pipelines scale to large imagery datasets.
```mermaid
classDiagram
direction TB
class ImageOperation {
-input_provider: ImageAssetProvider
-operation_func: Callable
-operation_kwargs: Dict
-cache: Dict[Tuple[int, int], ndarray]
+__init__(key: str, input_provider: ImageAssetProvider, operation_func: Callable, **kwargs)
+has_block(block_row: int, block_col: int, resolution_level: int) bool
+get_block(block_row: int, block_col: int, resolution_level: int, bands: List[int]) ndarray
+from_function(func: Callable, **kwargs) ImageOperation
+chain(other_operation: ImageOperation) ImageOperation
-apply_operation_to_block(block: ndarray) ndarray
-get_neighborhood_blocks(block_row: int, block_col: int, radius: int) List[ndarray]
}
ImageAssetProvider <|-- ImageOperation
ImageOperation --> ImageAssetProvider : consumes
```
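The lazy, block-by-block behavior described above can be sketched without any imaging library. Everything here is a simplified stand-in: `StubImageProvider` and `LazyOperation` are hypothetical, blocks are nested lists instead of arrays, the `then` method takes a function rather than another operation (unlike `chain` in the diagram), and the cache is unbounded and keyed by resolution level as well as block position.

```python
from typing import Callable, Dict, List, Tuple

Block = List[List[int]]


class StubImageProvider:
    """Hypothetical block source that counts how often blocks are read."""

    def __init__(self, blocks: Dict[Tuple[int, int], Block]):
        self._blocks = blocks
        self.reads = 0

    def has_block(self, block_row: int, block_col: int, resolution_level: int = 0) -> bool:
        return resolution_level == 0 and (block_row, block_col) in self._blocks

    def get_block(self, block_row: int, block_col: int, resolution_level: int = 0) -> Block:
        self.reads += 1
        return self._blocks[(block_row, block_col)]


class LazyOperation:
    """Applies a function to blocks on demand and exposes the same block API."""

    def __init__(self, input_provider, operation_func: Callable[..., Block], **kwargs):
        self._input = input_provider
        self._func = operation_func
        self._kwargs = kwargs
        self._cache: Dict[Tuple[int, int, int], Block] = {}

    def has_block(self, block_row: int, block_col: int, resolution_level: int = 0) -> bool:
        return self._input.has_block(block_row, block_col, resolution_level)

    def get_block(self, block_row: int, block_col: int, resolution_level: int = 0) -> Block:
        cache_key = (block_row, block_col, resolution_level)
        if cache_key not in self._cache:
            block = self._input.get_block(block_row, block_col, resolution_level)
            self._cache[cache_key] = self._func(block, **self._kwargs)
        return self._cache[cache_key]

    def then(self, operation_func: Callable[..., Block], **kwargs) -> "LazyOperation":
        # The new operation consumes this one, so stages compose like providers.
        return LazyOperation(self, operation_func, **kwargs)


def scale(block: Block, factor: int) -> Block:
    return [[value * factor for value in row] for row in block]


def clamp(block: Block, upper: int) -> Block:
    return [[min(value, upper) for value in row] for row in block]


source = StubImageProvider({(0, 0): [[1, 2], [3, 4]]})
pipeline = LazyOperation(source, scale, factor=10).then(clamp, upper=25)
reads_before = source.reads       # building the pipeline reads nothing
result = pipeline.get_block(0, 0)
pipeline.get_block(0, 0)          # second request is served from the cache
```

Because each stage exposes `has_block` and `get_block`, a pipeline can be handed to any consumer that expects an image provider, and no pixel data is touched until a block is requested.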
## Format-Specific Implementations
The abstract DatasetReader/DatasetWriter and AssetProvider interfaces enable support for different geospatial formats through concrete implementations. Each format provides its own reader/writer classes and asset providers that handle format-specific encoding details.
The Joint BIIF Profile (JBP) format, which includes NITF and NSIF files, demonstrates how the abstract interfaces work with a multi-asset format that supports various compression schemes. In these formats, multiple assets are stored as segments of a single combined file.
```mermaid
classDiagram
direction TB
class JBPDatasetReader {
-input_path: Path
+__init__(paths: List[Path])
}
class JBPDatasetWriter {
-output_path: Path
+__init__(path: Path)
}
DatasetReader <|-- JBPDatasetReader
DatasetWriter <|-- JBPDatasetWriter
<<abstract>> DatasetReader
<<abstract>> DatasetWriter
```
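One plausible way for a factory such as `IO.open` to choose among format-specific readers is to inspect the leading bytes of the file. This document does not say how detection is actually performed, so the sketch below is an assumption: the byte signatures are the well-known ones for each format, while `detect_format`, `resolve_format`, and the format labels are hypothetical names.

```python
from typing import Optional

# Well-known leading byte signatures -> hypothetical format labels.
SIGNATURES = [
    (b"NITF", "jbp"),
    (b"NSIF", "jbp"),
    (b"II*\x00", "tiff"),                                  # little-endian TIFF
    (b"MM\x00*", "tiff"),                                  # big-endian TIFF
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "jp2"),            # JPEG 2000 file
    (b"\xff\x4f\xff\x51", "j2k"),                          # JPEG 2000 codestream
]


def detect_format(header: bytes) -> Optional[str]:
    """Return a format label for the first matching signature, or None."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return None


def resolve_format(header: bytes, format: Optional[str] = None) -> str:
    """Honor an explicit format if given; otherwise fall back to detection."""
    if format is not None:
        return format
    detected = detect_format(header)
    if detected is None:
        raise ValueError("unrecognized file format")
    return detected


nitf_label = resolve_format(b"NITF02.10" + b" " * 16)
tiff_label = resolve_format(b"II*\x00\x08\x00\x00\x00")
forced_label = resolve_format(b"\x00" * 8, format="jbp")
```

The factory would then map each label to a reader or writer class, for example `jbp` to `JBPDatasetReader`.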
## Parser Infrastructure (PyStructure Classes)
The parser infrastructure provides a data-driven approach to reading and writing binary structures. Instead of hand-coding parsers for each format, structure definitions are loaded from KSY (Kaitai Struct YAML) files and used to parse binary data at runtime. This enables maintainable parsing of formats like NITF headers and TRE extensions.
```mermaid
classDiagram
direction TB
class StructureRegistry {
+__init__()
+add_search_path(path: str) None
+get(name: str) Optional[StructureDefinition]
+list() List[str]
+reload() None
+register(name: str, definition: StructureDefinition) None
+search_paths() List[str]
}
class StructureDefinition {
+id str
+title Optional[str]
+field_names List[str]
}
class StructureAccessor {
+__init__(definition: StructureDefinition, data: bytes)
+__getitem__(path: str) Value
+has(path: str) bool
+fields() List[str]
+raw_view(path: str) bytes
+field_byte_range(path: str) Tuple[int, int]
+data bytes
+definition StructureDefinition
}
class StructureWriter {
+new_fixed(definition: StructureDefinition)$ StructureWriter
+new_streaming(definition: StructureDefinition)$ StructureWriter
+__setitem__(path: str, value: Any) None
+set(path: str, value: Any) None
+is_set(path: str) bool
+finish() bytes
+buffer() bytes
}
class Value {
+as_str() str
+as_int() int
+as_float() float
+as_bytes() bytes
}
StructureRegistry --> StructureDefinition : provides
StructureAccessor --> StructureDefinition : uses
StructureAccessor --> Value : returns
StructureWriter --> StructureDefinition : uses
```
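The data-driven idea can be illustrated with a deliberately small sketch in which a structure is just an ordered list of fixed-width fields. This is not the library's implementation: `FixedDefinition` and `FixedAccessor` are hypothetical, the definition is written inline instead of being loaded from a KSY file, only fixed-width text fields are handled, and the `(start, end)` form of the byte range is an assumption. The field names and widths match the first four fields of the NITF 2.1 file header.

```python
from typing import Dict, List, Tuple


class FixedDefinition:
    """Hypothetical structure definition: ordered fixed-width fields."""

    def __init__(self, id: str, fields: List[Tuple[str, int]]):
        self.id = id
        self.field_names = [name for name, _ in fields]
        self._ranges: Dict[str, Tuple[int, int]] = {}
        offset = 0
        for name, size in fields:
            self._ranges[name] = (offset, offset + size)
            offset += size
        self.size = offset

    def has_field(self, name: str) -> bool:
        return name in self._ranges

    def byte_range(self, name: str) -> Tuple[int, int]:
        return self._ranges[name]


class FixedAccessor:
    """Reads fields out of a byte buffer using a definition, not custom code."""

    def __init__(self, definition: FixedDefinition, data: bytes):
        if len(data) < definition.size:
            raise ValueError("data is shorter than the structure definition")
        self.definition = definition
        self.data = data

    def has(self, path: str) -> bool:
        return self.definition.has_field(path)

    def field_byte_range(self, path: str) -> Tuple[int, int]:
        return self.definition.byte_range(path)

    def raw_view(self, path: str) -> bytes:
        start, end = self.field_byte_range(path)
        return self.data[start:end]

    def __getitem__(self, path: str) -> str:
        return self.raw_view(path).decode("ascii")


# First four fields of the NITF 2.1 file header.
header_definition = FixedDefinition(
    "nitf_file_header_prefix",
    [("FHDR", 4), ("FVER", 5), ("CLEVEL", 2), ("STYPE", 4)],
)
accessor = FixedAccessor(header_definition, b"NITF02.1003BF01")
version = accessor["FVER"]
complexity_level = int(accessor["CLEVEL"])
clevel_range = accessor.field_byte_range("CLEVEL")
```

Supporting a new structure then means adding a definition, not writing a new parser, which is the maintainability benefit the section describes.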
For parser usage examples and structure definition authoring, see the [Metadata](../user-guide/metadata.md) user guide.