Error management for out-of-bounds patches #4

Open
tlarcher opened this issue Oct 19, 2022 · 0 comments
Labels
new feature, question (Further information is requested)

Comments

@tlarcher
Collaborator

tlarcher commented Oct 19, 2022

  • Context

Multi-modal datasets like GeoLifeCLEF's comprise numerous bioclimatic and pedologic variables that can be aggregated to construct rasters, scaling up to countries or continents.

These rasters are then divided into smaller patches that are fed into the deep learning module. A patch is obtained by indexing an instance built with the PatchExtractor class constructor, which calls the private method environmental_raster.Raster._extract_patch.

During the process, the Python module rasterio is used to process input data (TIFF, GeoTIFF...) into a custom, usable dataset object.

  • Problem

Let us consider patch_extractor = PatchExtractor("./my_rasters", size=256) where size is the patch size.

Since a patch is generated around its center pixel, which is deduced from the $[x, y]$ input geographic coordinates, indexing patch_extractor with $[x, y]$ can raise an exception when these coordinates are too close to the edge of the raster.

Indeed, for a patch to be correctly built and returned, there needs to be enough space between the $[x, y]$ input geographic coordinates and the horizontal and vertical borders of the raster to query the $|\frac{size}{2}|$ values of the patch on each side of its center (in the pixel-coordinate system).

Otherwise, out-of-bounds values (relative to the raster) will be queried to construct the patch.
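
To make the condition concrete, here is a minimal sketch of the bounds check in plain Python. The helpers geo_to_pixel, patch_window and patch_fits are hypothetical names, not part of the current code base. The sketch assumes a north-up raster with square pixels and a window that starts size // 2 pixels before the center; the actual convention used by _extract_patch may differ.

```python
import math


def geo_to_pixel(x, y, west, north, res):
    """Convert geographic coordinates to (row, col) pixel indices.

    Assumption: north-up raster whose top-left corner is (west, north)
    and whose pixels are squares of side `res` (hypothetical parameters).
    """
    col = math.floor((x - west) / res)
    row = math.floor((north - y) / res)
    return row, col


def patch_window(row, col, size):
    """Return (row_start, row_stop, col_start, col_stop) of the patch window.

    Assumption: the window starts size // 2 pixels before the center pixel.
    """
    half = size // 2
    return row - half, row - half + size, col - half, col - half + size


def patch_fits(height, width, row, col, size):
    """Return True if the whole patch window lies inside the raster."""
    row_start, row_stop, col_start, col_stop = patch_window(row, col, size)
    return (
        row_start >= 0
        and col_start >= 0
        and row_stop <= height
        and col_stop <= width
    )
```

With size=256 on a 1000 x 1000 raster, a center at row 100 gives a window starting at row 100 - 128 = -28, so patch_fits(1000, 1000, 100, 500, 256) is False and the patch would need out-of-bounds values.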

  • Solution
    It is unclear which solution is best suited to every user, but there are several possibilities:
  1. Force the user to index valid input coordinates so that the patch window fits in the raster.
  2. Fill with NaN values. Downside: no distinction between ocean and out-of-bounds pixels.
  3. Apply padding around the raster data (either a mirror or a repeat pattern).
  4. Leave the choice to the user.
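
As a basis for discussion, the four options could be exposed through a single parameter, sketched below with NumPy on a 2D array. The function extract_patch and its mode argument are hypothetical and do not exist in the current code; "reflect" and "edge" are borrowed from the naming of numpy.pad modes and stand for the mirror and repeat patterns. The window convention (starting size // 2 pixels before the center) is also an assumption.

```python
import numpy as np


def _reflect(idx, n):
    """Mirror out-of-range indices back into [0, n) without repeating the edge."""
    if n == 1:
        return np.zeros_like(idx)
    period = 2 * (n - 1)
    idx = np.mod(idx, period)
    return np.where(idx >= n, period - idx, idx)


def extract_patch(data, row, col, size, mode="raise"):
    """Extract a size x size patch centered on pixel (row, col) of a 2D array.

    mode (hypothetical option):
      "raise"   -> option 1, reject coordinates whose window leaves the raster
      "nan"     -> option 2, fill out-of-bounds pixels with NaN
      "reflect" -> option 3, mirror the raster at its borders
      "edge"    -> option 3, repeat the border pixels
    """
    height, width = data.shape
    half = size // 2
    row_start, col_start = row - half, col - half
    rows = np.arange(row_start, row_start + size)
    cols = np.arange(col_start, col_start + size)
    rows_ok = (rows >= 0) & (rows < height)
    cols_ok = (cols >= 0) & (cols < width)

    # Fast path: the window fits, every mode returns the same patch.
    if rows_ok.all() and cols_ok.all():
        return data[row_start:row_start + size, col_start:col_start + size].copy()

    if mode == "raise":
        raise IndexError(
            f"patch of size {size} centered on ({row}, {col}) "
            f"does not fit in a raster of shape {data.shape}"
        )
    if mode == "nan":
        patch = np.full((size, size), np.nan)
        # Copy only the pixels that exist in the raster.
        patch[np.ix_(rows_ok, cols_ok)] = data[np.ix_(rows[rows_ok], cols[cols_ok])]
        return patch
    if mode == "reflect":
        return data[np.ix_(_reflect(rows, height), _reflect(cols, width))]
    if mode == "edge":
        return data[np.ix_(np.clip(rows, 0, height - 1), np.clip(cols, 0, width - 1))]
    raise ValueError(f"unknown mode: {mode!r}")
```

Mapping indices instead of padding the whole raster avoids copying a country-scale array for each patch. Option 4 then amounts to choosing the default value of mode, and the downside of option 2 remains: NaN pixels produced here cannot be told apart from NaN pixels already present in the raster.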
@tlarcher tlarcher added question Further information is requested new feature labels Oct 19, 2022