Finalize Multiprocessing API ahead of documentation #677

Open
@rhugonnet

Description

An issue to converge on the final design of the Multiprocessing API before we release the documentation describing it to users (along with a new GeoUtils minor version!).

Discussion started in #669 and #661 with @vschaffn and @adebardo.

Some ideas:

  1. When the output is a raster: return an (unloaded) Raster object opened on config.outfile at the end of the call? This would make it easy to chain operations, and keep the same syntax as without Multiprocessing!
     config1 = MultiprocConfig(chunk_size=200, outfile="reproj.tif")
     config2 = MultiprocConfig(chunk_size=200, outfile="prox.tif")
     rst = Raster(starting_file)
     rst_reproj = rst.reproject(config=config1)
     rst_reproj_prox = rst_reproj.proximity(config=config2)
  2. When the output is not a raster: the object (subsampled array, interpolated array) is expected to fit in memory, so we return it directly?
  3. Default to a temporary filepath when MultiprocConfig(chunk_size=200) is called without outfile=, so that users can pass the same config argument everywhere and define only the chunk size, which makes chaining operations more practical? (We can probably use Python's tempfile for this?)
     config = MultiprocConfig(chunk_size=200)
     rst = Raster(starting_file)
     rst_reproj = rst.reproject(config=config)
     rst_reproj_prox = rst_reproj.proximity(config=config)
  4. Add multiprocessing configuration to geoutils.config (see https://geoutils.readthedocs.io/en/stable/config.html) so that users can define a global parameter, and don't even have to pass a config argument if they don't want to:
     gu.config["mp.chunksizes"] = (200, 200)
     rst = Raster(starting_file)
     rst_reproj = rst.reproject()
     rst_reproj_prox = rst_reproj.proximity()
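
To make the temporary-file default and the global fallback more concrete, here is a rough sketch of how the two could fit together. Everything in it is hypothetical: `MultiprocConfigSketch`, `GLOBAL_CONFIG`, `resolve_chunk_sizes` and `resolve_outfile` are illustration names, not the existing GeoUtils API, and the "mp.chunksizes" key is only the one proposed above. One point it tries to highlight: if the same config object is reused across chained calls, the temporary path probably has to be generated per call rather than per config, otherwise `proximity` would write to the file it is reading from.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-in for geoutils.config. The "mp.chunksizes" key is the
# one proposed in this issue, not an existing option.
GLOBAL_CONFIG = {"mp.chunksizes": None}


@dataclass
class MultiprocConfigSketch:
    """Illustrative only; not the actual geoutils MultiprocConfig."""

    chunk_size: Optional[int] = None
    outfile: Optional[str] = None

    def resolve_chunk_sizes(self) -> Tuple[int, int]:
        # An explicit chunk_size on the config wins over the global setting.
        if self.chunk_size is not None:
            return (self.chunk_size, self.chunk_size)
        global_value = GLOBAL_CONFIG["mp.chunksizes"]
        if global_value is None:
            raise ValueError("No chunk size set on the config or globally.")
        return global_value

    def resolve_outfile(self) -> str:
        # An explicit outfile is returned unchanged.
        if self.outfile is not None:
            return self.outfile
        # Otherwise create a fresh temporary file on every call, so that two
        # chained operations sharing one config never target the same path.
        fd, path = tempfile.mkstemp(suffix=".tif")
        os.close(fd)
        return path
```

In this sketch the temporary files are never deleted, so the real design would still need to decide who cleans them up (for instance when the returned Raster is closed or at interpreter exit).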

The ideas above concern the Raster.function(config=) API.
I think we can also use this issue to note our ideas on the API of the different map_overlap functions (which will be public) while integrating them for various uses into xDEM and GeoUtils 🙂
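
For readers less familiar with the pattern, here is a minimal pure-Python sketch of what a "map overlap" operation generally does: split the data into chunks, pad each chunk with a few neighbouring values, apply the function, then trim the padding. It is 1D and uses plain lists to stay self-contained; `map_overlap_1d` and `moving_average_3` are hypothetical names for illustration and say nothing about the signatures of the GeoUtils functions.

```python
from typing import Callable, List


def map_overlap_1d(
    func: Callable[[List[float]], List[float]],
    data: List[float],
    chunk_size: int,
    depth: int,
) -> List[float]:
    """Apply func chunk by chunk, padding each chunk with `depth` neighbours."""
    out: List[float] = []
    n = len(data)
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # Extend the chunk on both sides, clipped to the data bounds.
        lo = max(start - depth, 0)
        hi = min(stop + depth, n)
        padded_result = func(data[lo:hi])
        # Keep only the part of the result that belongs to the chunk itself.
        offset = start - lo
        out.extend(padded_result[offset : offset + (stop - start)])
    return out


def moving_average_3(values: List[float]) -> List[float]:
    """3-point moving average; windows are truncated at the edges."""
    result = []
    for i in range(len(values)):
        window = values[max(i - 1, 0) : i + 2]
        result.append(sum(window) / len(window))
    return result
```

With a depth at least as large as the function's footprint (here 1), the chunked result should match applying the function to the whole array at once, which is the property the public API would presumably want to guarantee.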

    Labels

    architecture (need to re-organize or re-structure something), documentation (improvements or additions to documentation)