Metadata¶
MetadataProvider¶
- class aws.osml.io.MetadataProvider¶
Bases: object
A read-only metadata provider implementing the collections.abc.Mapping protocol.
MetadataProvider exposes metadata as a dictionary-like object. You can access individual fields with bracket notation (metadata["IC"]), iterate keys, check membership with in, and convert to a plain dict via entries() or dict(metadata).
You typically obtain a MetadataProvider from a DatasetReader or an AssetProvider rather than creating one directly.
Example:
```python
from aws.osml.io import IO

with IO.open(["image.ntf"], "r") as dataset:
    meta = dataset.metadata
    ic = meta["IC"]                # KeyError if missing
    ic = meta.get("IC", "NC")      # default if missing
    all_meta = meta.entries()      # full dict (single Rust call)
    security = meta.entries("FS")  # prefix filter
    for key in meta:
        print(key, meta[key])
```
- entries(name=None)¶
Return metadata as a Python dictionary, optionally filtered by key prefix.
When name is provided, only keys that start with that prefix are included. When omitted, all metadata fields are returned. This is the fast path for bulk export (single Rust→Python crossing).
- get(key, default=None)¶
Retrieve the value for the given key, or a default if absent.
- Parameters:
key (str) – The metadata field name.
default – Value to return if key is not present (default: None).
- Returns:
The value for the key, or the default.
- items()¶
Return a list of (key, value) tuples.
- keys()¶
Return a list of all metadata keys.
- raw¶
The underlying metadata in its original binary format, as a BytesIO object.
- values()¶
Return a list of all metadata values.
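The read-only mapping behaviour described above can be illustrated with a small stand-in. This is a minimal sketch, not the library's implementation: `ReadOnlyMetadata` is a hypothetical dict-backed class, and the field values are made up. It shows how implementing `__getitem__`, `__iter__` and `__len__` on a `collections.abc.Mapping` subclass is enough to get `get()` and `in` for free, and how a prefix-filtered `entries()` could behave.

```python
from collections.abc import Mapping


class ReadOnlyMetadata(Mapping):
    """Hypothetical stand-in for a read-only provider (not the library class)."""

    def __init__(self, fields):
        self._fields = dict(fields)  # private copy; no setter is exposed

    def __getitem__(self, key):
        return self._fields[key]  # raises KeyError when the key is missing

    def __iter__(self):
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)

    def entries(self, name=None):
        """Return a plain dict, optionally limited to keys starting with `name`."""
        if name is None:
            return dict(self._fields)
        return {k: v for k, v in self._fields.items() if k.startswith(name)}


# Illustrative field values only.
meta = ReadOnlyMetadata({"IC": "NC", "FSCLAS": "U", "FSCODE": ""})
default_imode = meta.get("IMODE", "B")  # get() comes from the Mapping mixin
has_ic = "IC" in meta                   # so does membership testing
security = meta.entries("FS")           # prefix filter
```

One difference worth noting: the `Mapping` mixin returns view objects from `keys()`, `items()` and `values()`, whereas the methods documented above are described as returning lists.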
BufferedMetadataProvider¶
- class aws.osml.io.BufferedMetadataProvider(source=None)¶
Bases: MetadataProvider
A mutable metadata provider implementing collections.abc.MutableMapping.
BufferedMetadataProvider extends MetadataProvider with write operations, giving it full dictionary semantics. Use bracket notation to set any native Python type (str, int, float, list, dict, bool, None) and del to remove keys.
Example:
```python
from aws.osml.io import BufferedMetadataProvider

metadata = BufferedMetadataProvider()
metadata["IC"] = "NC"
metadata["IMODE"] = "B"
metadata["33550"] = [0.5, 0.5, 0.0]   # list
metadata["GeoProjectedCRS"] = 32618   # int

del metadata["IC"]
metadata.update({"NPPBH": "256", "NPPBV": "256"})
metadata.clear()
```
- clear()¶
Remove all key-value pairs.
- update(mapping)¶
Bulk update from a Python dict.
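To make the "full dictionary semantics" concrete, here is a minimal sketch of a `collections.abc.MutableMapping` subclass. `BufferedMetadata` is a hypothetical dict-backed stand-in, not the library class, and treating `source` as an initial mapping to copy is an assumption rather than documented behaviour. Once the five abstract methods exist, `update()` and `clear()` are supplied by the mixin.

```python
from collections.abc import MutableMapping


class BufferedMetadata(MutableMapping):
    """Hypothetical dict-backed stand-in, not the library implementation."""

    def __init__(self, source=None):
        # Assumption: `source` seeds the buffer from an existing mapping.
        self._fields = dict(source) if source is not None else {}

    def __getitem__(self, key):
        return self._fields[key]

    def __setitem__(self, key, value):
        self._fields[key] = value

    def __delitem__(self, key):
        del self._fields[key]

    def __iter__(self):
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)


metadata = BufferedMetadata({"IC": "NC"})
metadata["33550"] = [0.5, 0.5, 0.0]                # any native Python value
del metadata["IC"]                                 # remove a key
metadata.update({"NPPBH": "256", "NPPBV": "256"})  # mixin method
snapshot = dict(metadata)                          # copy before clearing
metadata.clear()                                   # mixin method
```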
TIFF Tag Dictionary Key Format¶
For TIFF/GeoTIFF files, metadata uses numeric tag ID strings as keys. Each key
is the string representation of the TIFF tag number as defined in the TIFF 6.0
specification. For example, ImageWidth (tag 256) appears under the key
"256", and Compression (tag 259) appears under "259".
This applies to all IFD-level tags, including GeoTIFF tags such as
GeoKeyDirectory (tag 34735), ModelPixelScale (tag 33550), and private-use
tags (32768+). GeoKey directory contents are not decoded into separate entries;
the raw TIFF tags are stored as-is under their numeric keys.
Dataset-level entries that are not TIFF tags (e.g. "ByteOrder",
"NumberOfDirectories") retain descriptive string keys.
from aws.osml.io import IO

with IO.open(["image.tif"], "r") as dataset:
    meta = dataset.metadata
    width = meta["256"]             # ImageWidth
    height = meta["257"]            # ImageLength
    compression = meta["259"]       # Compression
    byte_order = meta["ByteOrder"]  # dataset-level, not a tag

    # Prefix filtering works on numeric keys
    tags_starting_with_3 = dataset.metadata.entries("3")
    # Returns keys like "322" (TileWidth), "339" (SampleFormat),
    # "34735" (GeoKeyDirectory), etc.
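Because tag keys are numeric strings and dataset-level keys are descriptive, the two groups can be told apart by inspecting the key itself. The following is a small sketch operating on a plain dict such as the one returned by `entries()`; `split_tiff_entries` is a hypothetical helper and the sample values are invented for illustration.

```python
def split_tiff_entries(entries):
    """Separate numeric TIFF tag keys from descriptive dataset-level keys."""
    tags, dataset_level = {}, {}
    for key, value in entries.items():
        if key.isascii() and key.isdigit():
            tags[int(key)] = value      # e.g. "256" -> 256
        else:
            dataset_level[key] = value  # e.g. "ByteOrder"
    return tags, dataset_level


# Invented sample data in the shape described above.
sample = {
    "256": 1024,
    "257": 768,
    "259": 5,
    "ByteOrder": "II",
    "NumberOfDirectories": 1,
}
tags, dataset_level = split_tiff_entries(sample)
```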
For convenient name-based access, use the TagNameResolver helper described
below.
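Since GeoKey directory contents are stored raw, callers who need individual GeoKeys have to decode tag 34735 themselves. The sketch below follows the GeoKeyDirectory layout from the GeoTIFF specification (a four-value header followed by four-value key entries). It assumes the values under "34735", "34736" and "34737" arrive as a flat sequence of integers, a sequence of floats, and a string respectively; the types the library actually returns are not confirmed here, and `decode_geokeys` is a hypothetical helper.

```python
GEO_DOUBLE_PARAMS = 34736
GEO_ASCII_PARAMS = 34737


def decode_geokeys(directory, double_params=(), ascii_params=""):
    """Decode a flat GeoKeyDirectory into a {key_id: value} dict."""
    # Header: KeyDirectoryVersion, KeyRevision, MinorRevision, NumberOfKeys
    _version, _revision, _minor, num_keys = directory[:4]
    keys = {}
    for i in range(num_keys):
        start = 4 + 4 * i
        key_id, location, count, value_offset = directory[start:start + 4]
        if location == 0:
            # The value is stored inline in the entry itself.
            keys[key_id] = value_offset
        elif location == GEO_DOUBLE_PARAMS:
            values = list(double_params[value_offset:value_offset + count])
            keys[key_id] = values[0] if count == 1 else values
        elif location == GEO_ASCII_PARAMS:
            text = ascii_params[value_offset:value_offset + count]
            keys[key_id] = text.rstrip("|")  # "|" terminates each ASCII value
    return keys


# Invented sample: model type, citation, and projected CRS code.
directory = [1, 1, 0, 3,
             1024, 0, 1, 1,
             1026, 34737, 13, 0,
             3072, 0, 1, 32618]
geokeys = decode_geokeys(directory, ascii_params="UTM zone 18N|")
```

With real metadata, the three arguments would presumably come from `meta["34735"]`, `meta.get("34736", ())` and `meta.get("34737", "")`.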
TagNameResolver¶
- class aws.osml.io.tiff.utils.TagNameResolver(tag_dict, custom_mapping=None)¶
Bases: object
Resolve TIFF tag names to numeric IDs for convenient metadata access.
Wraps a Tag_Dictionary (from MetadataProvider.entries()) and provides lookup by human-readable tag name via a configurable name-to-number mapping.
Keys that are not present in the mapping are passed through unchanged, mirroring the behaviour of __iter__(), which exposes unmapped keys directly.
Example:
meta = reader.metadata.entries()
resolver = TagNameResolver(meta)
width = resolver["ImageWidth"]      # looks up key "256"
crs = resolver.by_number(34735)     # direct numeric access
comp = resolver.get("Compression")  # returns None if absent
- VALUE_MAPPING: Dict[int, Dict[str, int]] = {259: {'ccittfax3': 3, 'ccittfax4': 4, 'ccittrle': 2, 'deflate': 8, 'jpeg': 7, 'lzw': 5, 'none': 1, 'ojpeg': 6, 'packbits': 32773}, 262: {'mask': 4, 'minisblack': 1, 'miniswhite': 0, 'palette': 3, 'rgb': 2, 'ycbcr': 6}, 274: {'bottomleft': 4, 'bottomright': 3, 'leftbottom': 8, 'lefttop': 5, 'rightbottom': 7, 'righttop': 6, 'topleft': 1, 'topright': 2}, 284: {'chunky': 1, 'planar': 2}, 317: {'floatingpoint': 3, 'horizontal': 2, 'none': 1}, 339: {'float': 3, 'int': 2, 'uint': 1, 'void': 4}}¶
- DEFAULT_MAPPING: Dict[str, int] = {'Artist': 315, 'BitsPerSample': 258, 'CellLength': 265, 'CellWidth': 264, 'ColorMap': 320, 'Compression': 259, 'Copyright': 33432, 'DateTime': 306, 'DocumentName': 269, 'DotRange': 336, 'ExtraSamples': 338, 'FillOrder': 266, 'FreeByteCounts': 289, 'FreeOffsets': 288, 'GDALMetadata': 42112, 'GDALNoData': 42113, 'GeoAsciiParams': 34737, 'GeoDoubleParams': 34736, 'GeoKeyDirectory': 34735, 'GrayResponseCurve': 291, 'GrayResponseUnit': 290, 'HalftoneHints': 321, 'HostComputer': 316, 'ImageDescription': 270, 'ImageLength': 257, 'ImageWidth': 256, 'InkNames': 333, 'InkSet': 332, 'JPEGTables': 347, 'Make': 271, 'MaxSampleValue': 281, 'MinSampleValue': 280, 'Model': 272, 'ModelPixelScale': 33550, 'ModelTiepoint': 33922, 'ModelTransformation': 34264, 'NewSubfileType': 254, 'NumberOfInks': 334, 'Orientation': 274, 'PageName': 285, 'PageNumber': 297, 'PhotometricInterpretation': 262, 'PlanarConfiguration': 284, 'Predictor': 317, 'PrimaryChromaticities': 319, 'ResolutionUnit': 296, 'RowsPerStrip': 278, 'SMaxSampleValue': 341, 'SMinSampleValue': 340, 'SampleFormat': 339, 'SamplesPerPixel': 277, 'Software': 305, 'StripByteCounts': 279, 'StripOffsets': 273, 'SubIFDs': 330, 'SubfileType': 255, 'TargetPrinter': 337, 'Threshholding': 263, 'TileByteCounts': 325, 'TileLength': 323, 'TileOffsets': 324, 'TileWidth': 322, 'WhitePoint': 318, 'XResolution': 282, 'YResolution': 283}¶
- __getitem__(name)¶
Look up a tag value by human-readable name.
If name is in the mapping it is resolved to the corresponding numeric key. Otherwise name is used directly as the dictionary key, allowing unmapped keys to pass through.
- get(name, default=None)¶
Look up a tag value by name, returning default if not found.
- Return type:
- by_number(tag_number)¶
Retrieve a tag by its numeric key directly.
- __iter__()¶
Iterate over all (resolved_name, value) pairs.
Keys are resolved to human-readable tag names when a mapping exists. Tags without a known name are yielded with their numeric string key.
- __contains__(name)¶
Check if a tag name is present in the metadata.
Returns True when the resolved key exists in the underlying dictionary. For mapped names this checks the numeric key; for unmapped names the raw key is checked directly.
- Return type:
bool
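The lookup and iteration rules described for `__getitem__`, `__contains__` and `__iter__` can be sketched with two small functions over plain dicts. This is an illustration of the documented pass-through behaviour, not the class's actual code; `resolve_key` and `named_items` are hypothetical names, and the mapping is a two-entry subset of DEFAULT_MAPPING.

```python
def resolve_key(name, mapping):
    """Translate a tag name to its numeric string key; pass unmapped names through."""
    return str(mapping[name]) if name in mapping else name


def named_items(tag_dict, mapping):
    """Yield (resolved_name, value) pairs, falling back to the raw key."""
    number_to_name = {str(number): name for name, number in mapping.items()}
    for key, value in tag_dict.items():
        yield number_to_name.get(key, key), value


mapping = {"ImageWidth": 256, "ImageLength": 257}  # subset of DEFAULT_MAPPING
tag_dict = {"256": 1024, "257": 768, "65000": b"\x01", "ByteOrder": "II"}  # invented

width = tag_dict[resolve_key("ImageWidth", mapping)]      # mapped name -> "256"
byte_order = tag_dict[resolve_key("ByteOrder", mapping)]  # unmapped, passed through
has_compression = resolve_key("Compression", mapping) in tag_dict
pairs = list(named_items(tag_dict, mapping))
```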
The TagNameResolver wraps a TIFF Tag_Dictionary and translates human-readable
tag names to their numeric keys. It ships with a default mapping covering
baseline TIFF 6.0 tags, GeoTIFF tags, and common GDAL tags.
from aws.osml.io import IO
from aws.osml.io.tiff.utils import TagNameResolver

with IO.open(["image.tif"], "r") as dataset:
    meta = dataset.metadata.entries()
    tags = TagNameResolver(meta)

    # Name-based lookup
    width = tags["ImageWidth"]       # equivalent to meta["256"]
    scale = tags["ModelPixelScale"]  # equivalent to meta["33550"]

    # Safe access with default
    nodata = tags.get("GDALNoData", "nan")

    # Direct numeric access
    raw_geokeys = tags.by_number(34735)

    # Check presence
    if "Compression" in tags:
        print(tags["Compression"])

    # Custom mapping for vendor-specific tags
    custom = TagNameResolver(meta, custom_mapping={
        "MyVendorTag": 65000,
    })
    vendor_val = custom["MyVendorTag"]
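The VALUE_MAPPING table listed in the class reference maps symbolic value names to numeric codes per tag. How the library itself applies that table is not stated on this page, so the following is only a sketch of one possible use: inverting the table to turn a stored numeric code into a readable name. It uses a copied subset of the table, assumes tag values are plain integers, and `describe_value` is a hypothetical helper.

```python
# Subset copied from the VALUE_MAPPING attribute: {tag_number: {name: code}}.
VALUE_MAPPING = {
    259: {"none": 1, "lzw": 5, "jpeg": 7, "deflate": 8, "packbits": 32773},
    284: {"chunky": 1, "planar": 2},
}


def describe_value(tag_number, raw_value, value_mapping=VALUE_MAPPING):
    """Return the symbolic name for a coded tag value, or the value unchanged."""
    names = {code: name for name, code in value_mapping.get(tag_number, {}).items()}
    return names.get(raw_value, raw_value)


compression = describe_value(259, 5)   # Compression code 5
planar = describe_value(284, 2)        # PlanarConfiguration code 2
width = describe_value(256, 1024)      # no table for ImageWidth, passed through
unknown = describe_value(259, 99)      # unlisted code, passed through
```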