skmap.parallel.blocks.RasterBlockReader
- class RasterBlockReader(reference_file=None)
Bases: object
Thread-parallel reader for large rasters.
If reference_file is not None, builds an R-tree index [1] of the block geometries read from the reference_file on initialization. All rasters read with the initialized reader are assumed to have the same geotransform and block structure as the reference.
- Parameters:
reference_file (str) – Path (or URL) of the reference raster.
For full usage examples please refer to the block processing tutorial notebook [2].
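As a rough illustration of what indexing block geometries is for, the sketch below enumerates the bounding boxes of a raster's blocks and selects those that intersect a query box. It is a minimal stand-in, not skmap's implementation: a linear scan takes the place of the R-tree, coordinates are in pixel space rather than map space, and the names block_bounds and blocks_intersecting, the raster size, and the block size are all hypothetical.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)
BlockId = Tuple[int, int]                 # (block row, block column)


def block_bounds(width: int, height: int, block_size: int) -> Dict[BlockId, BBox]:
    """Hypothetical helper: pixel-space bounding box of every block."""
    bounds: Dict[BlockId, BBox] = {}
    for i, row_off in enumerate(range(0, height, block_size)):
        for j, col_off in enumerate(range(0, width, block_size)):
            # Edge blocks are clipped to the raster extent
            bounds[(i, j)] = (
                col_off,
                row_off,
                min(col_off + block_size, width),
                min(row_off + block_size, height),
            )
    return bounds


def blocks_intersecting(bounds: Dict[BlockId, BBox], query: BBox) -> List[BlockId]:
    """Hypothetical helper: linear scan standing in for an R-tree query."""
    qminx, qminy, qmaxx, qmaxy = query
    return [
        block_id
        for block_id, (minx, miny, maxx, maxy) in bounds.items()
        if minx < qmaxx and qminx < maxx and miny < qmaxy and qminy < maxy
    ]


# A 1000 x 600 pixel raster with 256 x 256 blocks (assumed values)
bounds = block_bounds(width=1000, height=600, block_size=256)
hits = blocks_intersecting(bounds, (200, 100, 300, 300))
```

The scan checks every block on every query. A spatial index answers the same question without visiting every block, which matters when a large raster has many thousands of blocks.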
References
[1] pygeos STRTree
[2] Raster block processing tutorial
Examples
>>> from skmap.parallel.blocks import RasterBlockReader
>>> from skmap.misc import ttprint
>>>
>>> fp = 'https://s3.eu-central-1.wasabisys.com/skmap/lcv/lcv_landcover.hcl_lucas.corine.rf_p_30m_0..0cm_2019_skmap_epsg3035_v0.1.tif'
>>>
>>> ttprint('initializing reader')
>>> reader = RasterBlockReader(fp)
>>> ttprint('reader initialized')
Methods
read_overlay – Thread-parallel reading of large rasters within a bounding geometry.
- read_overlay(src_path, geometry, band=1, geometry_mask=True, max_workers=4, optimize_threadcount=True)
Thread-parallel reading of large rasters within a bounding geometry.
Only blocks that intersect with geometry are read. Returns a generator yielding (data, mask, window) tuples for each block, where data are the stacked pixel values of all rasters at mask==True, mask is the reduced (via bitwise and) block data mask for all rasters, and window is the rasterio.windows.Window [1] for the block within the transform of the reference_file. All rasters read with the initialized reader are assumed to have the same geotransform and block structure as the reference_file used for initialization. If the reader was initialized with reference_file==None, the first file in src_path is used as the reference and the block R-tree is built before yielding data from the first block.
- Parameters:
src_path (Union[str, Iterable[str]]) – Path(s) (or URLs) of the raster file(s) to read.
geometry (dict) – The bounding geometry within which to read raster blocks, given as a dictionary (with the GeoJSON geometry schema).
band (int) – Index of the band to read from all rasters.
geometry_mask (bool) – Whether or not to use the geometry as a data mask. If False, each block's data is returned in its entirety, even if part of it falls outside the geometry.
max_workers (int) – Maximum number of worker threads to use, defaults to multiprocessing.cpu_count().
optimize_threadcount (bool) – Whether or not to optimize the number of workers. If True, the number of worker threads is iteratively increased until the average read time per block stops decreasing or max_workers is reached. If False, max_workers is used as the number of threads.
- Returns:
Generator yielding (data, mask, window) tuples for each block.
- Return type:
For full usage examples please refer to the block processing tutorial notebook [2].
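The optimize_threadcount behaviour described above can be pictured with the following sketch. It is only an illustration under assumptions: how skmap measures block read times is not stated here, so the timing is passed in as a callable, and stopping at the first worker count that fails to improve is an assumed reading of "stops decreasing". The name choose_worker_count and the simulated timings are hypothetical.

```python
from typing import Callable


def choose_worker_count(avg_block_time: Callable[[int], float], max_workers: int) -> int:
    """Hypothetical helper: grow the worker count while blocks keep getting faster.

    avg_block_time(n) is assumed to return the measured average read time
    per block when n worker threads are used.
    """
    best_n = 1
    best_time = avg_block_time(1)
    for n in range(2, max_workers + 1):
        t = avg_block_time(n)
        if t >= best_time:
            # Average time per block stopped decreasing: keep the previous count
            break
        best_n, best_time = n, t
    return best_n


# Simulated measurements (seconds per block), not real benchmark data
simulated = {1: 1.00, 2: 0.55, 3: 0.40, 4: 0.42, 5: 0.30}
chosen = choose_worker_count(simulated.__getitem__, max_workers=8)
```

With these simulated timings the search settles on three workers, because the fourth does not improve the average. A greedy search like this can miss a later improvement (five workers here), which is the trade-off for not probing every count up to max_workers.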
References
[1] Rasterio Window
[2] Raster block processing tutorial
Examples
>>> geom = {
...     'type': 'Polygon',
...     'coordinates': [[
...         [4765389, 2441103],
...         [4764441, 2439352],
...         [4767369, 2438696],
...         [4761659, 2441949],
...         [4765389, 2441103],
...     ]],
... }
>>> block_data_gen = reader.read_overlay(fp, geom)
>>> data, mask, window = next(block_data_gen)
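To make the relationship between data and mask in each yielded tuple more concrete, here is a small NumPy sketch of reducing per-raster masks with a bitwise and, then stacking the values of all rasters at the pixels where the reduced mask is True. The arrays are made up, and the axis order of the stacked result (one row per raster) is an assumption rather than something documented here.

```python
import numpy as np

# Two hypothetical single-band blocks read from different rasters (2 x 3 pixels)
block_a = np.array([[1, 2, 3], [4, 5, 6]])
block_b = np.array([[10, 20, 30], [40, 50, 60]])

# Per-raster validity masks (True = valid pixel)
mask_a = np.array([[True, True, False], [True, True, True]])
mask_b = np.array([[True, False, True], [True, True, True]])

# Reduce the masks with AND: a pixel is kept only if it is valid in every raster
mask = np.logical_and.reduce([mask_a, mask_b])

# Stack the values of all rasters at the pixels where the reduced mask is True
# (assumed layout: one row per raster, one column per valid pixel)
data = np.stack([block_a[mask], block_b[mask]])
```

Because data only holds the pixels where mask is True, the mask is needed to place the values back into the block's two-dimensional shape, for example with `out[mask] = data[0]` on an array of the block's size.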