Datasets and the IO Interface
The Simple Path
For most tasks you don’t need to think about datasets or assets at all. The convenience functions handle file opening, asset selection, and cleanup for you:
from aws.osml.io import imread, imsave, iminfo
# Read → NumPy array
pixels = imread("image.ntf")
# Inspect without reading pixels
info = iminfo("image.ntf")
print(f"{info.width}x{info.height}, {info.bands} bands, {info.dtype}")
# Save — format inferred from extension
imsave("output.tif", pixels)
When you need more control — multi-segment files, per-asset metadata, specific compression parameters, or write workflows that involve multiple assets — the full dataset API described below gives you direct access to everything in the file.
Opening a Dataset
The IO class is the entry point for reading and writing imagery files. It auto-detects
the format (NITF, TIFF/GeoTIFF, PNG, etc.) and returns a DatasetReader or DatasetWriter:
from aws.osml.io import IO
# Read mode — format auto-detected from extension
with IO.open(["image.ntf"], "r") as dataset:
    print(type(dataset))  # DatasetReader
# Write mode — format specified explicitly
with IO.open(["output.tif"], "w", "geotiff") as writer:
    print(type(writer))  # DatasetWriter
Use the context manager (with) to ensure file handles are released when you’re done.
Input Sources
IO.open() and the convenience functions (imread, imsave, iminfo, tiles)
accept two kinds of input:
File paths (recommended for large files)
Pass a string path (or list of paths for multi-file pyramids). The library memory-maps the file, so only the pages you access are loaded into RAM. This is the most performant option — the operating system efficiently manages loading imagery from disk into memory without requiring the entire file to be resident.
from aws.osml.io import imread
pixels = imread("large_image.ntf")
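To make the memory-mapping behavior concrete, here is a minimal standard-library sketch. It is not this library's implementation; the scratch file and its size are illustrative assumptions. It maps a file and reads only its last four bytes, so the operating system only needs to page in the region that is actually touched.

```python
import mmap
import os
import tempfile

SIZE = 4 * 1024 * 1024  # 4 MiB scratch file standing in for a large image

# Write a file that is all zeros except for a 4-byte marker at the end.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * (SIZE - 4) + b"TAIL")
    path = tmp.name

try:
    with open(path, "rb") as f:
        # Length 0 maps the whole file; nothing is copied until it is accessed.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            mapped_len = len(mapped)
            # Slicing copies only these four bytes out of the mapping.
            tail = mapped[SIZE - 4:SIZE]
finally:
    os.remove(path)

print(mapped_len, tail)
```

The same idea applies to imagery: requesting one block touches only the pages that back that block.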
Python file-like objects
Any object with a standard .read() / .write() interface works — io.BytesIO,
fsspec handles, HTTP response bodies, or any duck-typed object with the required
methods. This is convenient when you already have bytes in memory or want to
encode directly to a buffer without touching the filesystem.
import io
import numpy as np
from aws.osml.io import imread, imsave
# Read from an in-memory buffer
# (download_image_bytes() stands in for your own code that returns PNG bytes)
png_bytes = download_image_bytes()
pixels = imread(io.BytesIO(png_bytes), format="png")
# Write directly to a buffer
# randint's upper bound is exclusive, so 256 covers the full uint8 range
data = np.random.randint(0, 256, (3, 256, 256), dtype=np.uint8)
buffer = io.BytesIO()
imsave(buffer, data, format="jpeg")
Trade-offs
Stream sources are read entirely into memory via a single .read() call. For
large files (multi-GB NITF imagery) this can be problematic:
Memory pressure — the full file must fit in RAM, unlike memory-mapped paths which load pages on demand.
Latency for remote files — if the stream backs cloud storage (e.g., an fsspec S3 handle), the entire file must be downloaded before decoding begins.
For efficient access to large remote imagery without downloading the full file, use the VirtualiZarr tile-based access path. It issues HTTP range requests for only the tiles you need:
import zarr
import numpy as np
from aws.osml.io.multi_reference_fs import MultiReferenceFileSystem
from zarr.storage._fsspec import FsspecStore
fs = MultiReferenceFileSystem(
    fo="s3://bucket/image.ntf.tile_index.json",
    template_overrides={"base": "s3://bucket/imagery/"},
    asynchronous=True,
    remote_options={"asynchronous": True},
    skip_instance_cache=True,
)
store = FsspecStore(fs=fs, read_only=True, path="")
root = zarr.open_group(store, mode="r", zarr_format=2)
# Read only the tiles you need — no full-file download
tile = np.asarray(root["0/data"][0:3, 0:256, 0:256])
See Cloud Imagery Access via Zarr for the full workflow.
Alternatively, download the remote file to a local path first to get memory-mapped performance:
import tempfile
from aws.osml.io import imread
with tempfile.NamedTemporaryFile(suffix=".ntf") as tmp:
    # remote_file is any readable binary stream, e.g. an fsspec handle
    tmp.write(remote_file.read())
    tmp.flush()
    pixels = imread(tmp.name)
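Note that calling .read() with no arguments buffers the whole remote file in memory once before it reaches disk. If that is a concern, a chunked copy keeps peak memory bounded. The sketch below uses only the standard library; spool_to_tempfile is a hypothetical helper name, and the BytesIO object stands in for a real remote stream.

```python
import io
import os
import shutil
import tempfile


def spool_to_tempfile(stream, suffix=".ntf", chunk_size=1024 * 1024):
    """Copy a readable binary stream to a named temporary file in fixed-size chunks."""
    tmp = tempfile.NamedTemporaryFile(suffix=suffix)
    # copyfileobj reads at most chunk_size bytes at a time.
    shutil.copyfileobj(stream, tmp, length=chunk_size)
    tmp.flush()
    return tmp


# Stand-in for a remote handle; replace with your own stream.
remote_file = io.BytesIO(b"\x01" * 3_000_000)

with spool_to_tempfile(remote_file) as tmp:
    spooled_size = os.path.getsize(tmp.name)
    # At this point tmp.name could be passed to imread() as a regular path.

print(spooled_size)
```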
The format parameter
When working with streams, the library cannot infer the image format from a file
extension. The format parameter is required for all stream operations:
# Raises ValueError — no format specified
imread(io.BytesIO(data))
# Works
imread(io.BytesIO(data), format="png")
Supported format strings: "nitf", "tiff", "png", "j2k", "jpeg".
When using file paths, format remains optional — the library infers it from the
file extension. Recognized NITF extensions include .ntf, .nitf, .nsif, .nsf,
and .hr1 through .hr8 (High Resolution Elevation products).
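The inference rule can be pictured as a small lookup. The following is a sketch under stated assumptions, not the library's code: guess_format is a hypothetical helper, the NITF extensions come from the list above, and the non-NITF extension mappings are illustrative guesses.

```python
import io
import os

# NITF extensions named in the text: .ntf, .nitf, .nsif, .nsf, .hr1 through .hr8
_NITF_EXTENSIONS = {".ntf", ".nitf", ".nsif", ".nsf"} | {f".hr{i}" for i in range(1, 9)}
# Assumed mappings for the other supported format strings (illustrative only).
_OTHER_EXTENSIONS = {
    ".tif": "tiff", ".tiff": "tiff", ".png": "png",
    ".j2k": "j2k", ".jpg": "jpeg", ".jpeg": "jpeg",
}


def guess_format(source, format=None):
    """Return an explicit format if given, otherwise infer it from a path extension."""
    if format is not None:
        return format
    if not isinstance(source, (str, os.PathLike)):
        # Streams carry no filename, so there is nothing to infer from.
        raise ValueError("format is required for stream sources")
    ext = os.path.splitext(os.fspath(source))[1].lower()
    if ext in _NITF_EXTENSIONS:
        return "nitf"
    if ext in _OTHER_EXTENSIONS:
        return _OTHER_EXTENSIONS[ext]
    raise ValueError(f"cannot infer format from extension {ext!r}")


print(guess_format("scene.NTF"))                        # nitf
print(guess_format("dted.hr3"))                         # nitf
print(guess_format(io.BytesIO(b""), format="png"))      # png
```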
When streams are a good fit
The file is small enough to fit in memory (PNG thumbnails, JPEG tiles, small NITF chips)
You already have the bytes in memory (HTTP response bodies, message payloads)
You want to encode output directly to a buffer without a temporary file (tile server responses)
You are using fsspec handles for moderate-sized files from cloud storage
Dataset Structure
The dataset model in this library is inspired by the SpatioTemporal Asset Catalog (STAC) specification. STAC defines a common structure for describing and cataloging geospatial assets — any file that represents information about the Earth captured at a certain place and time. The core building block in STAC is the Item, a GeoJSON feature that groups one or more related Assets (the actual data files) together with shared metadata such as spatial extent, temporal range, and provenance.
This library adopts the same conceptual model: a single Dataset maps to a STAC Item and
may contain multiple named assets. Just as a STAC Item for a satellite scene might include
separate assets for each spectral band, a thumbnail, a metadata sidecar, and ML-derived
annotations, a Dataset opened by this library can contain multiple images, structured
data payloads (e.g. SICD/SIDD XML), text reports, and vector graphic overlays — all
accessed through a uniform interface. The key insight is that real-world geospatial
products are rarely a single file; they are bundles of related assets that share a common
spatial and temporal context.
By aligning with the STAC data model, datasets produced or consumed by this library are
straightforward to publish as STAC Items and integrate with the broader STAC ecosystem of
catalogs, search APIs, and tooling. The library does not implement the STAC JSON format
itself, but the structural alignment means the mapping between an in-memory Dataset and
a STAC Item is direct: each asset key corresponds to a STAC Asset entry, asset types map
to STAC roles, and dataset-level metadata carries the information needed to populate Item
properties.
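As a rough illustration of that mapping, the sketch below builds a minimal STAC Item dictionary from asset keys and roles. The helper name to_stac_item, the href, the identifier, and the timestamp are all hypothetical; the library itself does not emit STAC JSON, and a real Item would also carry geometry and richer properties.

```python
def to_stac_item(item_id, href, asset_roles, datetime_utc):
    """Build a minimal STAC Item dict; each asset key becomes one STAC Asset entry."""
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "geometry": None,  # a real Item would carry the footprint here
        "properties": {"datetime": datetime_utc},
        "links": [],
        "assets": {
            key: {"href": href, "roles": list(roles)}
            for key, roles in asset_roles.items()
        },
    }


# Illustrative values only.
item = to_stac_item(
    item_id="example-scene",
    href="s3://bucket/imagery/cog.tif",
    asset_roles={"image:0": ["data"], "image:0:overview:1": ["overview"]},
    datetime_utc="2024-01-01T00:00:00Z",
)
print(sorted(item["assets"]))
```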
Each asset within a dataset has a type and a key that uniquely identifies it:
| Asset Type | Description | Examples |
|---|---|---|
| Image | Raster imagery with blocked access | Satellite photos, SAR data |
| Data | Structured data payloads | SICD/SIDD XML, overflow TREs |
| Text | Plain text content | Mission reports, annotations |
| Graphics | Vector graphics | CGM overlays |
Asset Roles
Every asset also carries one or more semantic roles that describe its purpose. Roles are aligned with the STAC asset roles convention — short strings that communicate what an asset is for, independent of the underlying file format.
| Role | Meaning | Assigned To |
|---|---|---|
| data | Full-resolution image data | TIFF full-res IFDs, NITF image segments, JPEG, PNG |
| overview | Reduced-resolution image | COG overview IFDs, multi-file R-set images |
|  | Metadata asset | NITF text segments, data extension segments |
|  | Graphic/annotation overlay | NITF graphic segments |
Roles are the primary way to distinguish between different kinds of assets without parsing key strings. See Image Pyramids below for how roles are used to separate full-resolution images from reduced-resolution overviews.
Image Pyramids
An image pyramid is a set of representations of the same image at progressively lower resolutions. Pyramids enable efficient multi-scale access — a viewer can load a low-resolution overview for navigation and fetch full-resolution tiles only for the region of interest.
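For a sense of scale, the sketch below lists the dimensions of each pyramid level. It assumes every level halves the previous one and rounds up, which is a common convention but not something this page specifies; pyramid_shapes is a hypothetical helper.

```python
import math


def pyramid_shapes(width, height, min_size=256):
    """List (width, height) per level, halving until the larger side is <= min_size."""
    shapes = [(width, height)]
    while max(width, height) > min_size:
        # Round up so odd dimensions never collapse to zero.
        width, height = math.ceil(width / 2), math.ceil(height / 2)
        shapes.append((width, height))
    return shapes


print(pyramid_shapes(4096, 4096, min_size=1024))
# [(4096, 4096), (2048, 2048), (1024, 1024)]
```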
There are three ways multi-resolution data can be represented in geospatial imagery:
1. Block-level resolution levels: A single image whose compressed blocks can be decoded at multiple resolutions (e.g. JPEG 2000 wavelet decomposition). The block grid stays the same; each block just produces fewer pixels at higher level numbers. See Reading Blocks for details.
2. Embedded overviews: A single file containing multiple images at different resolutions. Cloud Optimized GeoTIFFs (COGs) store reduced-resolution overview images as additional IFDs alongside the full-resolution image.
3. Multi-file pyramids: Separate files for each resolution level. NITF R-sets are a common example: image.ntf is the full resolution, and image.ntf.r1 through image.ntf.rN are progressively reduced overviews.
This library exposes cases 2 and 3 through the same uniform interface: each resolution
level becomes a separate image asset with its own key and role. The full-resolution
image has role data, and each overview has role overview.
Overview Asset Keys
Overview keys follow the pattern image:{parent}:overview:{level}, where {parent}
is the index of the full-resolution image and {level} is the overview number:
from aws.osml.io import IO, AssetType
with IO.open(["cog.tif"], "r") as dataset:
    for key in dataset.get_asset_keys(asset_type=AssetType.Image):
        asset = dataset.get_asset(key)
        print(f"{key}: {asset.num_columns}x{asset.num_rows}, roles={asset.roles}")
# image:0: 4096x4096, roles=['data']
# image:0:overview:1: 2048x2048, roles=['overview']
# image:0:overview:2: 1024x1024, roles=['overview']
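Roles remain the preferred way to tell full-resolution images from overviews, but when you need the level number itself (for example, to sort overviews) the key pattern is simple to parse. The following is a sketch in plain Python; parse_image_key and ImageKey are hypothetical names, not part of the library.

```python
import re
from typing import NamedTuple, Optional


class ImageKey(NamedTuple):
    parent: int                    # index of the full-resolution image
    overview_level: Optional[int]  # None for the full-resolution image itself


_KEY_PATTERN = re.compile(r"^image:(\d+)(?::overview:(\d+))?$")


def parse_image_key(key):
    """Split 'image:{parent}' or 'image:{parent}:overview:{level}' into its parts."""
    match = _KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"not an image asset key: {key!r}")
    parent, level = match.groups()
    return ImageKey(int(parent), int(level) if level is not None else None)


keys = ["image:0:overview:2", "image:0", "image:0:overview:1"]
# Sort full resolution first, then overviews by increasing level.
ordered = sorted(
    keys,
    key=lambda k: (parse_image_key(k).parent, parse_image_key(k).overview_level or 0),
)
print(ordered)
# ['image:0', 'image:0:overview:1', 'image:0:overview:2']
```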
Each overview is a fully functional image asset — you can read blocks, check dimensions, and access metadata just like a full-resolution image:
# Use roles to separate full-res from overviews
data_keys = dataset.get_asset_keys(asset_type=AssetType.Image, roles=["data"])
overview_keys = dataset.get_asset_keys(asset_type=AssetType.Image, roles=["overview"])
# Read a block from an overview
overview = dataset.get_asset("image:0:overview:1")
block = overview.get_block(0, 0, resolution_level=0)
This is different from the resolution_level parameter on get_block(). Block-level
resolution levels are a decompression feature that produces smaller versions of the
same block. Overview assets are separate images with their own tile grids and
dimensions. The two mechanisms are complementary — an overview image that uses JPEG
2000 compression could itself support multiple block-level resolution levels.
Multi-File Pyramids
When a dataset spans multiple files at different resolutions, pass all files to
IO.open() as a list. The library detects the R-set naming convention (.rN suffix)
and exposes each file as an overview asset, producing the same key and role structure
as embedded overviews:
with IO.open(["image.ntf", "image.ntf.r1", "image.ntf.r2"], "r") as dataset:
    for key in dataset.get_asset_keys(asset_type=AssetType.Image):
        asset = dataset.get_asset(key)
        print(f"{key}: {asset.num_columns}x{asset.num_rows}, roles={asset.roles}")
# image:0: 4096x4096, roles=['data']
# image:0:overview:1: 2048x2048, roles=['overview']
# image:0:overview:2: 1024x1024, roles=['overview']
The first path is always the full-resolution base image. The overview level is extracted from the filename, not inferred from list order — these two calls produce identical results:
IO.open(["image.ntf", "image.ntf.r1", "image.ntf.r2"], "r")
IO.open(["image.ntf", "image.ntf.r2", "image.ntf.r1"], "r")
R-set detection is format-agnostic. Each file in the list is opened with its own auto-detected format reader, so users are free to select other encodings for the overview files if desired.
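The order independence of the two calls above follows from reading the level out of the .rN suffix. A minimal sketch of that idea is shown below; assign_rset_keys is a hypothetical helper, and raising on a non-base path without a suffix is an assumption, since the library's actual behavior in that case is not described here.

```python
import re

_RSET_SUFFIX = re.compile(r"\.r(\d+)$")


def assign_rset_keys(paths):
    """Map asset keys to paths: the first path is the base, the rest are keyed by .rN."""
    keys = {"image:0": paths[0]}
    for path in paths[1:]:
        match = _RSET_SUFFIX.search(path)
        if match is None:
            # Assumed behavior for this sketch only.
            raise ValueError(f"no .rN suffix on {path!r}")
        keys[f"image:0:overview:{int(match.group(1))}"] = path
    return keys


in_order = assign_rset_keys(["image.ntf", "image.ntf.r1", "image.ntf.r2"])
shuffled = assign_rset_keys(["image.ntf", "image.ntf.r2", "image.ntf.r1"])
print(in_order == shuffled)  # True
```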
Note
R-sets are a de facto industry convention used by some data providers and image analysis tools. They are not part of the JBP/NITF specification — there is no internal metadata linking an R-set file to its parent. The relationship is purely by filename convention.
Some things to keep in mind with multi-file pyramids:
The caller must provide the full list of paths explicitly. IO.open() does not scan the filesystem for sibling .rN files.
When only one path is provided, behavior is identical to the single-file case.
R-set overviews are associated with image:0 (the primary image segment). If the base file contains multiple image segments, the R-sets apply to the primary image only.
The same multi-path pattern works for writing; see Writing Multi-File R-Set Pyramids.
Streams and Explicit Roles
When sources are streams rather than file paths, there are no filenames to parse for
.rN suffixes. The roles parameter tells the library the purpose of each source
explicitly:
import io
from aws.osml.io import IO, AssetType
base_stream = io.BytesIO(base_bytes)
overview1_stream = io.BytesIO(overview1_bytes)
overview2_stream = io.BytesIO(overview2_bytes)
with IO.open(
    [base_stream, overview1_stream, overview2_stream],
    "r",
    format="nitf",
    roles=[["data"], ["overview:1"], ["overview:2"]],
) as dataset:
    # Same asset key structure as file-path R-sets
    for key in dataset.get_asset_keys(asset_type=AssetType.Image):
        asset = dataset.get_asset(key)
        print(f"{key}: {asset.num_columns}x{asset.num_rows}")
# image:0: 4096x4096
# image:0:overview:1: 2048x2048
# image:0:overview:2: 1024x1024
The roles parameter assigns semantic roles to each source in a multi-source
dataset:
| First argument |  | Description |
|---|---|---|
| Single source |  | Roles for the single source |
| List of sources |  | One inner list per source (must match list length) |
Role strings:
| Role string | Meaning |
|---|---|
| "data" | Base image (full resolution). If omitted, the first source is treated as the base. |
| "overview:N" | R-set overview at level N (N ≥ 1). Maps to the |
When roles is required:
List of streams: always required (no filenames to detect from). Omitting roles raises ValueError.
List of file paths with roles: explicit roles override .rN filename detection.
List of file paths without roles: falls back to .rN detection (common convention).
# Paths with explicit roles — bypasses .rN detection
IO.open(["base.ntf", "ovr.ntf"], "r", roles=[["data"], ["overview:1"]])
# Paths without roles — uses .rN detection
IO.open(["image.ntf", "image.ntf.r1"], "r")
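The three rules can be summarized as a small decision function. This is a sketch of the documented behavior under stated assumptions, not the library's code: resolve_roles is a hypothetical helper, and the handling of a non-base path that lacks an .rN suffix is assumed.

```python
import io
import os
import re

_RSET_SUFFIX = re.compile(r"\.r(\d+)$")


def resolve_roles(sources, roles=None):
    """Return one list of role strings per source."""
    if roles is not None:
        # Explicit roles always win, for streams and paths alike.
        if len(roles) != len(sources):
            raise ValueError("roles must contain one inner list per source")
        return [list(r) for r in roles]
    if not all(isinstance(s, (str, os.PathLike)) for s in sources):
        # Streams have no filename to inspect.
        raise ValueError("roles is required when sources are streams")
    resolved = [["data"]]  # the first path is the full-resolution base
    for path in sources[1:]:
        match = _RSET_SUFFIX.search(os.fspath(path))
        if match is None:
            raise ValueError(f"no .rN suffix on {path!r}")  # assumed behavior
        resolved.append([f"overview:{int(match.group(1))}"])
    return resolved


detected = resolve_roles(["image.ntf", "image.ntf.r1"])
explicit = resolve_roles(["base.ntf", "ovr.ntf"], roles=[["data"], ["overview:1"]])
try:
    resolve_roles([io.BytesIO(b""), io.BytesIO(b"")])
    stream_error = False
except ValueError:
    stream_error = True

print(detected, explicit, stream_error)
```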
Discovering Assets
Use get_asset_keys() to list available assets, then get_asset() to retrieve
a specific one. You can filter by asset type, by role, or both:
from aws.osml.io import IO, AssetType
with IO.open(["complex_dataset.ntf"], "r") as dataset:
    # List keys by asset type
    image_keys = dataset.get_asset_keys(asset_type=AssetType.Image)
    text_keys = dataset.get_asset_keys(asset_type=AssetType.Text)
    data_keys = dataset.get_asset_keys(asset_type=AssetType.Data)
    graphics_keys = dataset.get_asset_keys(asset_type=AssetType.Graphics)
    print(f"Images: {len(image_keys)}, Text: {len(text_keys)}, "
          f"Data: {len(data_keys)}, Graphics: {len(graphics_keys)}")
    # Retrieve a specific asset
    image = dataset.get_asset("image:0")
Filtering by Role
The roles parameter on get_asset_keys() lets you filter assets by their semantic
purpose. This is useful when a dataset contains both full-resolution images and
overviews:
with IO.open(["cog.tif"], "r") as dataset:
    # Only full-resolution images
    data_keys = dataset.get_asset_keys(asset_type=AssetType.Image, roles=["data"])
    # Only overview images
    overview_keys = dataset.get_asset_keys(asset_type=AssetType.Image, roles=["overview"])
    # All image assets (no role filter)
    all_keys = dataset.get_asset_keys(asset_type=AssetType.Image)
When roles is omitted or None, all assets matching the asset_type filter are
returned. When both asset_type and roles are provided, both filters apply — only
assets that match the type and have at least one of the requested roles are returned.
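In other words, the type filter and the role filter are combined with AND, while the requested roles are combined with OR. The sketch below models that rule with stand-in classes; AssetType and Asset here are local stand-ins rather than the library's types, and the "text:0" key and "metadata" role are illustrative placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class AssetType(Enum):  # local stand-in for the library's AssetType
    Image = "image"
    Text = "text"


@dataclass
class Asset:
    key: str
    asset_type: AssetType
    roles: list


def filter_keys(assets, asset_type=None, roles=None):
    """Keep assets matching the type (if given) AND having any requested role (if given)."""
    wanted = set(roles) if roles is not None else None
    return [
        a.key
        for a in assets
        if (asset_type is None or a.asset_type is asset_type)
        and (wanted is None or wanted & set(a.roles))
    ]


assets = [
    Asset("image:0", AssetType.Image, ["data"]),
    Asset("image:0:overview:1", AssetType.Image, ["overview"]),
    Asset("text:0", AssetType.Text, ["metadata"]),  # placeholder key and role
]

print(filter_keys(assets, AssetType.Image, ["data"]))              # ['image:0']
print(filter_keys(assets, AssetType.Image, ["data", "overview"]))  # both image keys
print(filter_keys(assets, AssetType.Image))                        # both image keys
```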
NITF files can contain all four asset types. TIFF files contain only image assets —
each IFD (Image File Directory) in the file becomes a separate image asset keyed as
"image:0", "image:1", etc. Cloud Optimized GeoTIFFs additionally expose overview
IFDs as "image:0:overview:1", "image:0:overview:2", etc. PNG files contain a
single image keyed as "image:0". Text, data, and graphics asset queries will return
empty lists for TIFF and PNG datasets.
Dataset-Level Metadata
Every dataset exposes a metadata property with file-level fields. See the
Metadata section for details:
with IO.open(["image.ntf"], "r") as dataset:
    file_metadata = dataset.metadata.entries()