Getting Started

The osml-imagery-toolkit is a Python library for processing satellite and aerial imagery. It provides sensor models, display normalization, image pyramids, tiling, orthorectification, and utilities for working with features.

What This Library Does

Remote sensing imagery arrives as raw sensor measurements — high bit-depth photon counts, complex-valued radar returns, or tiled mosaics in vendor-specific formats. Working with this data requires common utilities that bridge the gap between raw sensor output and usable products:

  • Sensor models that relate every pixel to a geographic position on the Earth’s surface

  • Display processing that maps high-dynamic-range measurements to viewable 8-bit images

  • Multi-resolution pyramids for efficient zoom-level access to gigapixel imagery

  • Chip extraction for cutting self-contained subsets with correct metadata

  • Orthorectification for projecting imagery onto map-accurate grids

  • Feature geolocation and projection for converting between pixel coordinates and geographic coordinates on detections and annotations

The toolkit is organized into packages that can be used independently or composed together:

  • metadata — Extracts TREs, GeoKeys, and DES XML from imagery files and constructs sensor models automatically.

  • photogrammetry — Implements sensor models (RPC, RSM, SICD, SIDD, projective, affine) for pixel-to-world coordinate conversion.

  • elevation — Loads DEM tiles, computes raster offsets, and builds elevation models for terrain-corrected geolocation.

  • image_processing — Display chains, chipping, pyramids, orthorectification, resampling, and SAR complex-to-display conversion.

  • formats — Auto-generated Python dataclasses for SICD and SIDD XML schemas, used internally by metadata parsers.

  • features — Bridges pixel-space and geographic-space features: geolocates ML detections to map coordinates and projects known geographic annotations into image pixel space for overlay.

Prerequisites

Note

The toolkit requires Python 3.10 or later and installs entirely via pip. No system-level native libraries are needed — the osml-imagery-io dependency ships self-contained binary wheels.

Installation

Install from PyPI:

pip install osml-imagery-toolkit
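
To confirm the environment after installing, a short check along these lines can help. It is a minimal sketch that uses only the Python standard library; the helper names python_is_supported and installed_version are illustrative and not part of the toolkit.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # minimum version stated in the prerequisites


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter meets the stated minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(dist_name="osml-imagery-toolkit"):
    """Return the installed distribution version, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Toolkit version:", installed_version() or "not installed")
```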

For development, clone the repository and use Hatch to manage the environment:

git clone https://github.com/awslabs/osml-imagery-toolkit.git
cd osml-imagery-toolkit
pip install hatch
hatch env create        # creates the default virtualenv with dev deps
hatch run test          # run the full test suite
hatch run lint:check    # run linting

Tip

Hatch manages isolated environments per task. Use hatch run test for testing, hatch run lint:check for linting, and hatch run docs:build for documentation. Run hatch env show to see all available environments.

Guided Tour

The following snippets illustrate the major capabilities. Each capability is covered in more depth on the pages listed under What’s Next.

Convert to a Displayable Image

Satellite sensors capture at bit depths (11-16 bits) far exceeding what a monitor can render, and SAR sensors produce complex-valued I/Q data. The display chain automatically classifies the image modality and builds an appropriate pixel processing pipeline that maps raw measurements to 8-bit RGB output:

from aws.osml.io import IO
from aws.osml.image_processing import DisplayChainFactory, MappedImageProvider

with IO.open("image.ntf", "r") as reader:
    source = reader.get_asset("image:0")
    chain = DisplayChainFactory.build(source)

    display = MappedImageProvider(
        source, chain,
        source_bands=chain.input_bands,
        num_bands=chain.output_bands,
    )
    tile = display.get_block(0, 0)  # uint8 RGB output
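
To make the idea of mapping high bit-depth measurements to 8 bits concrete, here is a conceptual sketch of a linear percentile stretch in plain Python. It does not show the toolkit's display chain, whose internals are not described here; the function names, the 2%/98% defaults, and the sample values are all assumptions chosen for illustration.

```python
def percentile(sorted_values, fraction):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    index = round(fraction * (len(sorted_values) - 1))
    return sorted_values[index]


def stretch_to_uint8(pixels, low=0.02, high=0.98):
    """Linearly map raw values to 0..255, clipping both tails."""
    ordered = sorted(pixels)
    lo = percentile(ordered, low)
    hi = percentile(ordered, high)
    if hi <= lo:  # flat image: avoid dividing by zero
        return [0 for _ in pixels]
    scale = 255.0 / (hi - lo)
    # Values below lo clamp to 0 and values above hi clamp to 255.
    return [max(0, min(255, round((p - lo) * scale))) for p in pixels]


raw = [0, 100, 500, 1000, 1500, 2047]  # made-up 11-bit samples
print(stretch_to_uint8(raw, low=0.0, high=1.0))
```

A production display chain typically does more than this (band selection, modality-specific remapping for SAR magnitude, and so on), which is why the toolkit builds the chain for you from the source image.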

Build an Image Pyramid

Large satellite images can exceed 100,000 pixels per side. Serving them at multiple zoom levels requires pre-computed reduced-resolution overviews (R-Sets). The pyramid builder generates these in a single pass over the source tiles:

from aws.osml.io import IO
from aws.osml.image_processing import PyramidBuilder

with IO.open("source.tif", "r") as reader:
    source = reader.get_asset("image:0")
    builder = PyramidBuilder(source, min_size=256)

    with IO.open("output.tif", "w", "geotiff") as writer:
        builder.build_and_write(writer, base_key="image:0")
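
The level structure of such a pyramid can be worked out with simple arithmetic. The sketch below halves the image dimensions until the longer side fits within min_size. This stopping rule is an assumption made for illustration; how PyramidBuilder interprets min_size may differ.

```python
def pyramid_levels(width, height, min_size=256):
    """Return [(level, width, height), ...], halving until both sides fit min_size."""
    levels = [(0, width, height)]  # level 0 is the full-resolution image
    level = 0
    while max(width, height) > min_size:
        # Round up so odd dimensions never lose their last row or column.
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        level += 1
        levels.append((level, width, height))
    return levels


for level, w, h in pyramid_levels(100_000, 100_000):
    print(f"R{level}: {w} x {h}")
```

Under this rule a 100,000 x 100,000 image yields ten levels, R0 through R9, with the smallest at 196 x 196 pixels.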

Extract Chips

ML inference pipelines and human review tools need small, encoded image tiles with correct geospatial metadata. The chip factory reads from the pyramid, applies an optional display chain, and encodes the result with derived metadata:

from aws.osml.io import IO
from aws.osml.image_processing import ChipFactory, TiledImagePyramid, PixelWindow

with IO.open("image.ntf", "r") as reader:
    pyramid = TiledImagePyramid.from_dataset(reader)

    factory = ChipFactory(source=pyramid, output_format="png")
    chip_bytes = factory.create_chip(PixelWindow(0, 0, 512, 512))
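
Inference pipelines usually need a full set of windows that cover the image rather than a single chip. The generator below is a sketch of one way to enumerate them, with optional overlap between neighbors. It yields plain (col, row, width, height) tuples; that ordering is an assumption for this example and is not taken from the PixelWindow definition.

```python
def chip_windows(image_width, image_height, chip_size=512, overlap=0):
    """Yield (col, row, width, height) windows that cover the whole image."""
    if not 0 <= overlap < chip_size:
        raise ValueError("overlap must be in [0, chip_size)")
    stride = chip_size - overlap
    for row in range(0, image_height, stride):
        for col in range(0, image_width, stride):
            # Edge windows are truncated so they never extend past the image.
            width = min(chip_size, image_width - col)
            height = min(chip_size, image_height - row)
            yield (col, row, width, height)


for window in chip_windows(1024, 600):
    print(window)
```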

Geolocate Pixels

Every pixel in a remote sensing image corresponds to a specific point on the Earth. The photogrammetry package supports RPC, RSM, SICD, SIDD, projective, and affine sensor models:

from aws.osml.io import IO
from aws.osml.metadata import load_sensor_model
from aws.osml.photogrammetry import ImageCoordinate
from math import degrees

with IO.open("image.ntf", "r") as dataset:
    sensor_model = load_sensor_model(dataset)

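    # Map the pixel at x=512, y=384 to a world coordinate.
    world = sensor_model.image_to_world(ImageCoordinate([512, 384]))
    # The world coordinate is converted from radians to degrees for display.
    print(f"{degrees(world.latitude):.6f}N, {degrees(world.longitude):.6f}E")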
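
Of the model types listed above, the affine model is the simplest to reason about. The following standalone sketch shows the underlying math for a six-parameter affine mapping and its inverse. It is not the toolkit's implementation; the function names and the parameter ordering (which follows the common GDAL geotransform convention) are assumptions for illustration.

```python
def image_to_world(transform, col, row):
    """Apply an affine transform (x0, dx/dcol, dx/drow, y0, dy/dcol, dy/drow)."""
    x0, dx_dcol, dx_drow, y0, dy_dcol, dy_drow = transform
    lon = x0 + dx_dcol * col + dx_drow * row
    lat = y0 + dy_dcol * col + dy_drow * row
    return lon, lat


def world_to_image(transform, lon, lat):
    """Invert the affine transform by solving the 2x2 linear system."""
    x0, dx_dcol, dx_drow, y0, dy_dcol, dy_drow = transform
    det = dx_dcol * dy_drow - dx_drow * dy_dcol
    if det == 0:
        raise ValueError("transform is not invertible")
    col = (dy_drow * (lon - x0) - dx_drow * (lat - y0)) / det
    row = (-dy_dcol * (lon - x0) + dx_dcol * (lat - y0)) / det
    return col, row


# Made-up north-up image: origin at 10E, 50N with 0.001 degree pixels.
transform = (10.0, 0.001, 0.0, 50.0, 0.0, -0.001)
lon, lat = image_to_world(transform, 512, 384)
print(f"{lat:.6f}N, {lon:.6f}E")
print(world_to_image(transform, lon, lat))
```

Models such as RPC or SICD replace this linear mapping with rational polynomials or radar geometry, and they generally need an elevation value to resolve a pixel to a unique ground point.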

Orthorectify

Raw satellite imagery contains perspective distortion and terrain displacement. The warping engine removes these effects, producing north-up, map-aligned tiles:

from aws.osml.image_processing import (
    MapTileSetFactory, OrthoGridBuilder, WarpedImageProvider, WarpGridOptions,
)

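# This snippet assumes sensor_model, source, and source_pyramid already exist,
# e.g. created with load_sensor_model, reader.get_asset, and
# TiledImagePyramid.from_dataset as in the snippets above.
tile_set = MapTileSetFactory.get_for_id("WebMercatorQuad")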

grid_builder = OrthoGridBuilder(
    tile_set=tile_set,
    tile_matrix=16,
    sensor_model=sensor_model,
    source_width=source.num_columns,
    source_height=source.num_rows,
    options=WarpGridOptions.TERRAIN_CORRECTED,
    num_source_levels=source_pyramid.num_levels,
)

warped = WarpedImageProvider(source_pyramid, grid_builder)

min_row, min_col, max_row, max_col = grid_builder.tile_limits
for r in range(min_row, max_row + 1):
    for c in range(min_col, max_col + 1):
        ortho_block = warped.get_block(r, c)
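
For intuition about the row and column indices in a WebMercatorQuad grid, the standard Web Mercator tiling formula can be written in a few lines. This sketch is independent of the toolkit; lonlat_to_tile is a hypothetical helper, and the toolkit computes its own tile limits from the sensor model.

```python
from math import asinh, floor, pi, radians, tan


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return (column, row) of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # tiles per side at this zoom level
    col = floor((lon_deg + 180.0) / 360.0 * n)
    row = floor((1.0 - asinh(tan(radians(lat_deg))) / pi) / 2.0 * n)
    # Clamp so the antimeridian and latitudes beyond about 85.05 degrees stay in range.
    col = min(max(col, 0), n - 1)
    row = min(max(row, 0), n - 1)
    return col, row


print(lonlat_to_tile(-77.0365, 38.8977, 16))
```

Row 0 is the northernmost row, so row numbers increase southward, which matches the top-to-bottom order in which image rows are usually stored.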

Work with Features

The features package converts between pixel-space annotations and geographic features. Geolocate ML detections to map coordinates:

from aws.osml.features import Geolocator, ImagedFeaturePropertyAccessor

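# Assumes sensor_model was loaded as in the Geolocate Pixels snippet and that
# detections holds the pixel-space ML detection features to geolocate.
geolocator = Geolocator(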
    property_accessor=ImagedFeaturePropertyAccessor(),
    sensor_model=sensor_model,
)
geolocator.geolocate_features(detections)
# detections now have GeoJSON "geometry" with lon/lat coordinates

Or project known geographic features into image pixel space for overlay:

from aws.osml.features import Projector, ImagedFeaturePropertyAccessor

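# Assumes sensor_model was loaded as in the Geolocate Pixels snippet, width and
# height are the image dimensions in pixels, and reference_features holds the
# geographic features to project.
projector = Projector(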
    property_accessor=ImagedFeaturePropertyAccessor(),
    sensor_model=sensor_model,
    image_bounds=(0.0, 0.0, float(width), float(height)),
)
visible = projector.project_features(reference_features)
# visible features now have "imageGeometry" with pixel coordinates
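
The two directions can be illustrated without the toolkit by working on plain GeoJSON-style dictionaries. The sketch below is conceptual only: the helper names, the toy linear mapping that stands in for a sensor model, and the choice to store "imageGeometry" under "properties" are all assumptions for this example.

```python
def pixel_to_lonlat(col, row):
    """Toy stand-in for a sensor model: 0.001 degree pixels from 10E, 50N."""
    return 10.0 + 0.001 * col, 50.0 - 0.001 * row


def lonlat_to_pixel(lon, lat):
    """Inverse of the toy mapping above."""
    return (lon - 10.0) / 0.001, (50.0 - lat) / 0.001


def geolocate_feature(feature):
    """Return a copy of a pixel-space point feature with a lon/lat geometry."""
    col, row = feature["properties"]["imageGeometry"]["coordinates"]
    lon, lat = pixel_to_lonlat(col, row)
    located = dict(feature)
    located["geometry"] = {"type": "Point", "coordinates": [lon, lat]}
    return located


def project_feature(feature, image_bounds):
    """Return a copy with pixel coordinates, or None if outside the image."""
    lon, lat = feature["geometry"]["coordinates"][:2]
    col, row = lonlat_to_pixel(lon, lat)
    min_col, min_row, max_col, max_row = image_bounds
    if not (min_col <= col < max_col and min_row <= row < max_row):
        return None  # the feature does not fall inside this image
    projected = dict(feature)
    properties = dict(feature.get("properties") or {})
    properties["imageGeometry"] = {"type": "Point", "coordinates": [col, row]}
    projected["properties"] = properties
    return projected


detection = {
    "type": "Feature",
    "geometry": None,
    "properties": {"imageGeometry": {"type": "Point", "coordinates": [512, 384]}},
}
print(geolocate_feature(detection)["geometry"])

reference = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [10.1, 49.9]},
    "properties": {"name": "made-up landmark"},
}
print(project_feature(reference, (0.0, 0.0, 1024.0, 1024.0)))
```

Returning None for features that fall outside the image bounds mirrors the filtering that the name visible suggests in the snippet above.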

What’s Next

Page                           Capability
Sensor Models & Geolocation    Pixel ↔ geographic coordinate conversion
Elevation Models               Terrain-aware geolocation using DEMs
Display Processing             Raw sensor data → viewable images
Image Pyramids                 Multi-resolution overviews
Image Chipping                 Encoded chip extraction with metadata
Image Warping                  Orthorectification and reprojection
Features & Annotations         Geolocation, projection, and spatial indexing of vector data