Example - Reading and Writing with Dask

[1]:
import multiprocessing
# Linux/OSX:
import multiprocessing.popen_spawn_posix
# Windows:
# import multiprocessing.popen_spawn_win32
import threading

from dask.distributed import Client, LocalCluster, Lock

import rioxarray

Tips for using dask locks:

  • Choose the lock to match how your process runs. Each worker needs a lock, so the more fine-grained the lock, the better.

  • The reading and writing processes need the same type of lock. They don’t have to share the same lock, but they do need a lock of the same type.
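One way to keep the read and write locks the same type is to build both from a single helper. The sketch below is only an illustration: `make_lock` is a hypothetical name, not part of rioxarray or dask, and it assumes a threaded run when no client is given.

```python
import threading


def make_lock(name, client=None):
    """Hypothetical helper: build a lock that matches the scheduler in use."""
    if client is None:
        # Threaded scheduler: a plain in-process lock (the name is unused).
        return threading.Lock()
    # Distributed scheduler: a named lock coordinated through the cluster.
    from dask.distributed import Lock

    return Lock(name, client=client)


# Separate locks for reading and writing, but of the same type.
read_lock = make_lock("rio-read")
write_lock = make_lock("rio")
```

With a distributed client, the same calls would become `make_lock("rio-read", client=client)` and `make_lock("rio", client=client)`, which could then be passed as `lock=` when reading and writing.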

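See docs for:

  • Reading with dask: rioxarray.open_rasterio

  • Writing with dask: the rio.to_raster accessor method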

No distributed computing example

Note: Without a lock provided, to_raster does not use dask to write to disk.

[2]:
xds = rioxarray.open_rasterio(
    "../../test/test_data/compare/small_dem_3m_merged.tif",
    chunks=True,
)
xds.rio.to_raster("simple_write.tif", tiled=True)

Multithreaded example

[3]:
xds = rioxarray.open_rasterio(
    "../../test/test_data/compare/small_dem_3m_merged.tif",
    chunks=True,
    lock=False,
    # lock=threading.Lock(), # when too many file handles open
)
xds.rio.to_raster(
    "dask_thread.tif", tiled=True, lock=threading.Lock(),
)
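To see what a `threading.Lock` provides when several threads write at once, here is a standard-library-only sketch. It does not use rioxarray: a list stands in for the output file, and it assumes the lock is held while each chunk is written. The names `write_chunk`, `written`, `active`, and `max_active` are illustrative.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

write_lock = threading.Lock()  # the same kind of lock used for writing above
written = []       # stands in for the output file
active = 0         # writers currently inside the locked section
max_active = 0     # highest number of simultaneous writers observed


def write_chunk(index):
    """Pretend to write one chunk; the lock admits one writer at a time."""
    global active, max_active
    with write_lock:
        active += 1
        max_active = max(max_active, active)
        time.sleep(0.001)  # simulate the time spent writing
        written.append(index)
        active -= 1


# Four threads compete to write eight chunks.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(write_chunk, range(8)))

print(sorted(written), max_active)
```

Every chunk is written, and `max_active` never exceeds 1 because the lock serializes the writers. The chunk computations themselves can still run in parallel; only the write step is serialized.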

Multiple worker example

[4]:
with LocalCluster() as cluster, Client(cluster) as client:
    xds = rioxarray.open_rasterio(
        "../../test/test_data/compare/small_dem_3m_merged.tif",
        chunks=True,
        lock=False,
        # lock=Lock("rio-read", client=client), # when too many file handles open
    )
    xds.rio.to_raster(
        "dask_multiworker.tif",
        tiled=True,
        lock=Lock("rio", client=client),
    )