Add NAIPCluster dataset and datamodule #443

Draft: wants to merge 2 commits into main

Conversation

calebrob6
Member

Dataset of NAIP imagery sampled randomly from the NAIP archive, with masks generated on the fly by a KMeans clustering of pixels.

image

image

@github-actions github-actions bot added the datamodules (PyTorch Lightning datamodules) and datasets (Geospatial or benchmark datasets) labels Mar 1, 2022
@adamjstewart
Collaborator

How is this dataset different than our existing NAIP dataset with a new sampler? Basically, I'm wondering if this should be a sampler instead.

@adamjstewart adamjstewart added this to the 0.3.0 milestone Mar 1, 2022
@calebrob6
Member Author

calebrob6 commented Mar 1, 2022

  • It creates masks by clustering the inputs on the fly (this is also non-trivial if you want the clustering to be a function of a window of pixels vs. a single pixel)
  • The NAIP imagery is pre-sampled, so you don't have to download a bunch of NAIP tiles, and you get a reproducible set of patches to work with
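
To illustrate the first point, here is a minimal sketch of how a window of pixels could be turned into a per-pixel feature vector before clustering. It is not the code from this PR: the function name `window_features`, the `(H, W, C)` array layout, and the reflect padding at the borders are all assumptions made for the example.

```python
import numpy as np


def window_features(image: np.ndarray, radius: int = 1) -> np.ndarray:
    """Stack each pixel's (2*radius+1)^2 neighbourhood into one feature vector.

    image: array of shape (H, W, C)  (assumed layout, not taken from the PR).
    Returns an array of shape (H * W, C * (2*radius+1)**2).
    """
    height, width, _ = image.shape
    # Reflect-pad so that border pixels also have a full window (an assumption).
    padded = np.pad(
        image, ((radius, radius), (radius, radius), (0, 0)), mode="reflect"
    )
    # One shifted copy of the image per offset inside the window.
    shifted = [
        padded[dy : dy + height, dx : dx + width, :]
        for dy in range(2 * radius + 1)
        for dx in range(2 * radius + 1)
    ]
    # Concatenate along the channel axis, then flatten the spatial dimensions.
    stacked = np.concatenate(shifted, axis=-1)
    return stacked.reshape(height * width, -1)
```

With `radius=0` this reduces to the single-pixel case, where every pixel is described only by its own band values. With `radius=1` and a 4-band patch, every pixel is described by 36 values, which is what makes the windowed variant more expensive to fit and apply.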

@adamjstewart
Collaborator

Couldn't that be done as a transform? Then it could be combined with any dataset, not just NAIP.

If we make this a VisionDataset then we're throwing away all geospatial metadata.

@calebrob6
Member Author

(see my edit above)

Hmm, the transform would need to take the model you want to use as input, so that would be a little cumbersome. Roughly, you'd have to do:

# sample a bunch of NAIP imagery
# train a cluster model (using the more complicated logic for including windows if necessary)
# create a transform with that cluster model as input
# create another NAIP dataset with that transform
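
The four steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in this PR: random arrays stand in for NAIP patches, `fit_kmeans` is a small hand-written Lloyd's k-means used to keep the example self-contained (a library KMeans could be swapped in), and `ClusterMask` is a hypothetical transform that operates on a sample dictionary with an `"image"` entry in `(H, W, C)` layout.

```python
import numpy as np


def fit_kmeans(pixels, k, iters=20, seed=0):
    """Minimal Lloyd's k-means; returns centroids of shape (k, D)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(pixels), size=k, replace=False)
    centroids = pixels[start].astype(float)
    for _ in range(iters):
        # Assign every pixel to its nearest centroid.
        dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids


class ClusterMask:
    """Hypothetical transform: adds a "mask" of cluster indices to a sample."""

    def __init__(self, centroids):
        self.centroids = centroids

    def __call__(self, sample):
        image = sample["image"]  # assumed shape (H, W, C)
        pixels = image.reshape(-1, image.shape[-1]).astype(float)
        dists = np.linalg.norm(
            pixels[:, None, :] - self.centroids[None, :, :], axis=-1
        )
        sample["mask"] = dists.argmin(axis=1).reshape(image.shape[:2])
        return sample


# 1. Sample a bunch of imagery (random arrays stand in for NAIP patches).
rng = np.random.default_rng(0)
patches = [rng.random((16, 16, 4)) for _ in range(8)]

# 2. Train a cluster model on the pooled pixels.
pooled = np.concatenate([p.reshape(-1, 4) for p in patches])
centroids = fit_kmeans(pooled, k=5)

# 3. Create a transform with that cluster model as input.
transform = ClusterMask(centroids)

# 4. Apply the transform to samples from a second pass over the data.
sample = transform({"image": patches[0]})
```

The cumbersome part is visible in steps 1 and 2: the transform cannot be constructed until some imagery has already been sampled and a model fitted, so the data has to be visited twice.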

@calebrob6 calebrob6 marked this pull request as draft March 1, 2022 19:55
@calebrob6
Member Author

It isn't urgent that we figure this out (or crucial that this be in torchgeo) -- I'll be using this dataset in my own experiments though so I wanted a branch somewhere.

@adamjstewart adamjstewart modified the milestones: 0.3.0, 0.4.0 Jul 9, 2022
@adamjstewart adamjstewart removed this from the 0.4.0 milestone Jan 24, 2023