IO Factory
- class aws.osml.io.IO
Bases: object
Entry point for opening geospatial datasets for reading or writing.
The IO class provides a single static method, open, that accepts a file path string, a list of file paths, a file-like object (stream), or a list of file-like objects, and returns either a DatasetReader or a DatasetWriter depending on the requested mode. The file format is auto-detected from the extension and file header bytes when reading from paths; when reading from a stream, the format parameter must be specified explicitly. Both local file paths and file:// URIs are supported.
Example:
```python
import io

from aws.osml.io import IO

# Read mode: single string path (format auto-detected)
with IO.open("image.ntf", "r") as dataset:
    keys = dataset.get_asset_keys()
    asset = dataset.get_asset(keys[0])

# Read mode: from an in-memory byte buffer
# (raw_bytes is assumed to hold PNG-encoded bytes obtained elsewhere)
with IO.open(io.BytesIO(raw_bytes), "r", format="png") as dataset:
    keys = dataset.get_asset_keys()

# Write mode: returns a DatasetWriter
# (provider is assumed to be an asset provider created elsewhere)
with IO.open("output.ntf", "w", "nitf") as writer:
    writer.add_asset("image", provider, "Title", "Description", ["data"])
```
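Because a stream carries no filename, the caller has to supply the format string. The sketch below shows one way a caller might pick that string by inspecting well-known leading signature bytes before handing the stream over. `guess_format` is a hypothetical helper, not part of aws.osml.io, and it is not the library's own detection logic; mapping every TIFF signature to "geotiff" is an assumption made for illustration.

```python
import io
from typing import Optional

# Well-known file signatures (leading "magic" bytes).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")  # little-endian / big-endian classic TIFF
NITF_SIGNATURE = b"NITF"


def guess_format(header: bytes) -> Optional[str]:
    """Hypothetical helper: map leading bytes to a format identifier."""
    if header.startswith(PNG_SIGNATURE):
        return "png"
    if header.startswith(NITF_SIGNATURE):
        return "nitf"
    if header.startswith(TIFF_SIGNATURES):
        # Assumption: any TIFF container is treated as "geotiff" here.
        return "geotiff"
    return None


# Peek at the first bytes, then rewind so the stream can be read from the start.
stream = io.BytesIO(PNG_SIGNATURE + b"rest of file")
fmt = guess_format(stream.read(8))
stream.seek(0)
print(fmt)
```

The resulting string could then be passed as the format argument when opening the stream; if the helper returns None, the caller still needs to know the format by other means.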
- static open(paths, mode='r', format=None, roles=None)
Open a dataset for reading or writing.
The format is auto-detected from the file extension when reading from a file path. When writing to a file, a format string is inferred from the extension or may be provided explicitly. When reading from or writing to a file-like object (stream), the format parameter is required since there is no filename to inspect. Use a context manager (with statement) on the returned object to ensure file handles are released.
- Parameters:
  - paths (str | list[str] | BinaryIO | list[BinaryIO]) – A file path, list of file paths, file-like object, or list of file-like objects. For single-file formats a bare string is accepted ("image.ntf"). For multi-file R-set datasets a list is required. File-like objects must implement .read() for read mode and .write() + .flush() for write mode (e.g., io.BytesIO, fsspec file handles). Accepts local paths, file:// URIs, and s3:// URIs.
  - mode (str) – "r" for reading or "w" for writing. Defaults to "r".
  - format (str or None) – Format identifier (e.g., "nitf", "geotiff", "png"). Required when paths is a stream or list of streams. Required when writing to a file with an unrecognized extension. Optional otherwise.
  - roles (list[str] or list[list[str]] or None) – Explicit role strings for each source. list[str] when paths is a single source, list[list[str]] when paths is a list. Recognised roles: "data" designates the base source; "overview:N" (N >= 1) designates an R-set overview at resolution level N. roles is required when paths is a list of streams (no filename to derive roles from). For a list of file paths, roles is optional; if omitted, the library falls back to .rN filename detection for backward compatibility.
- Returns:
A DatasetReader when mode is "r", or a DatasetWriter when mode is "w".
- Return type:
DatasetReader or DatasetWriter
- Raises:
  - ValueError – If paths is empty, the mode is invalid, the file format is not supported, or format or roles is missing when required.
  - TypeError – If paths has an invalid type, or a file-like object is missing the required methods.
  - IOError – If the file cannot be opened.
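The roles parameter describes two role strings, "data" and "overview:N", and a fallback to .rN filename detection when roles is omitted for a list of paths. The sketch below illustrates what such a filename-based mapping could look like. `roles_from_filenames` is a hypothetical helper written for this illustration; the library's actual detection rules may differ, and the suffix pattern used here (a trailing ".rN" with N >= 1) is an assumption.

```python
import re
from typing import List

# Assumption: an overview file ends in ".rN" with N >= 1 (e.g. "image.ntf.r1").
_RSET_SUFFIX = re.compile(r"\.r([1-9]\d*)$")


def roles_from_filenames(paths: List[str]) -> List[List[str]]:
    """Hypothetical helper: derive one role list per path from its suffix."""
    roles = []
    for path in paths:
        match = _RSET_SUFFIX.search(path)
        if match:
            # A trailing ".rN" marks an R-set overview at resolution level N.
            roles.append([f"overview:{int(match.group(1))}"])
        else:
            # Anything else is treated as the base source.
            roles.append(["data"])
    return roles


print(roles_from_filenames(["image.ntf", "image.ntf.r1", "image.ntf.r2"]))
# → [['data'], ['overview:1'], ['overview:2']]
```

A list built this way has the list[list[str]] shape that roles expects when paths is a list, which is also the shape a caller must construct by hand when passing a list of streams.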
Note
When reading from a stream, the entire content is loaded into memory via .read(). For large files (multi-GB NITF) this is significantly more expensive than the memory-mapped file path. Consider downloading large files to the local filesystem, or using the library's VirtualiZarr-based tile index for cloud-native range-read access.
Example:
```python
import io

from aws.osml.io import IO

# Read mode: single string path
with IO.open("image.ntf", "r") as dataset:
    print(type(dataset))  # DatasetReader

# Read mode: list of paths (R-set, .rN detection)
with IO.open(["image.ntf", "image.ntf.r1"], "r") as dataset:
    print(type(dataset))  # DatasetReader

# Read mode: list of streams with explicit roles
streams = [open("image.ntf", "rb"), open("image.ntf.r1", "rb")]
with IO.open(streams, "r", format="nitf",
             roles=[["data"], ["overview:1"]]) as dataset:
    print(type(dataset))  # DatasetReader
# Close the source files explicitly (closing an already-closed file is a no-op)
for stream in streams:
    stream.close()

# Write mode: to an in-memory buffer
# (provider is assumed to be an asset provider created elsewhere)
buf = io.BytesIO()
with IO.open(buf, "w", "png") as writer:
    writer.add_asset("image", provider, "Title", "Description", ["data"])
```
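The note on stream reads suggests downloading large files to the local filesystem rather than passing a stream. A minimal sketch of that idea, using only the standard library: copy the stream to a temporary file in chunks and open the resulting path instead. `spool_to_tempfile` is a hypothetical helper, not part of aws.osml.io, and the in-memory buffer stands in for a real remote object body.

```python
import io
import os
import shutil
import tempfile
from typing import BinaryIO


def spool_to_tempfile(stream: BinaryIO, suffix: str = "") -> str:
    """Hypothetical helper: copy a stream to a named temporary file.

    Returns the file path; the caller is responsible for deleting it.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        # copyfileobj copies in fixed-size chunks, so the whole payload
        # is never held in memory at once.
        shutil.copyfileobj(stream, tmp, length=1024 * 1024)
        return tmp.name


source = io.BytesIO(b"example payload")  # stand-in for a large remote object
path = spool_to_tempfile(source, suffix=".ntf")
try:
    # The path could now be passed to IO.open(path, "r") instead of the stream.
    print(os.path.getsize(path))
finally:
    os.remove(path)
```

Keeping the original extension in the suffix matters here because path-based opens rely on it for format detection.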