azavea / nasa-hyperspectral

An event-driven image processing pipeline for developing our foundational capability to work with HSI data sources.


UMAP dimension reduction

jpolchlo opened this issue

One of the standard methods for dimension reduction is the ISOMAP algorithm. There is an implementation available in sklearn that we should use to reduce our data. Early efforts ran into memory utilization high enough to prevent successful completion, so it may not be entirely straightforward to apply. This issue will be resolved when we can describe how to use ISOMAP to reduce a scene's worth of data.
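For reference, a minimal sketch of the sklearn call in question. The data here is synthetic (a small random array standing in for a flattened scene), and the `n_neighbors` and `n_components` values are illustrative assumptions, not settings anyone on this issue has validated.

```python
import numpy as np
from sklearn.manifold import Isomap

# Synthetic stand-in for a flattened hyperspectral scene: (pixels, bands).
# A real cube of shape (rows, cols, bands) would be reshaped with
# cube.reshape(-1, cube.shape[-1]) first.
rng = np.random.default_rng(0)
n_pixels, n_bands = 500, 234
latent = rng.uniform(size=(n_pixels, 3))
mixing = rng.normal(size=(3, n_bands))
pixels = latent @ mixing + 0.01 * rng.normal(size=(n_pixels, n_bands))

# n_neighbors and n_components are illustrative guesses.
isomap = Isomap(n_neighbors=10, n_components=3)
embedding = isomap.fit_transform(pixels)

print(embedding.shape)  # (500, 3)
```

This runs quickly at 500 pixels. It says nothing about behavior at full-scene scale, which is the hard part.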

Other questions that may be of interest: (1) Can we update this basis as we encounter new scenes to develop a more representative basis? (2) Can we distribute this process, since the computational demands can be fairly high?
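One possible angle on both questions, sketched under assumptions rather than tested on real scenes: fit on a random subsample of pixels, then project everything else with `Isomap.transform`. Each pixel's projection depends only on the fitted model, so chunks could in principle go to separate workers. Note that this reuses a fixed basis and does not update it. Incorporating new scenes into the basis would mean refitting on a pooled sample. The helper names `fake_scene` and `embed_in_chunks`, the sample size, and the chunk size are all hypothetical.

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)
N_BANDS = 234
MIXING = rng.normal(size=(3, N_BANDS))


def fake_scene(n_pixels):
    """Hypothetical stand-in for a flattened scene of shape (pixels, bands)."""
    latent = rng.uniform(size=(n_pixels, 3))
    noise = 0.01 * rng.normal(size=(n_pixels, N_BANDS))
    return latent @ MIXING + noise


def embed_in_chunks(model, pixels, chunk_size=1000):
    """Project pixels with an already fitted model, one chunk at a time.

    Chunks are independent of each other, so each one could be handed
    to a different worker.
    """
    parts = [
        model.transform(pixels[start:start + chunk_size])
        for start in range(0, len(pixels), chunk_size)
    ]
    return np.vstack(parts)


scene = fake_scene(3000)

# Fit on a small random subsample to keep the fit itself cheap.
sample_idx = rng.choice(len(scene), size=400, replace=False)
model = Isomap(n_neighbors=10, n_components=3).fit(scene[sample_idx])

# Project the full scene, then a second scene, onto the same basis.
reduced = embed_in_chunks(model, scene)
reduced_new = embed_in_chunks(model, fake_scene(2000))

print(reduced.shape, reduced_new.shape)
```

Whether a basis fit on a subsample of one scene is representative enough for other scenes is exactly question (1), and this sketch does not answer it.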

Early tests indicate that ISOMAP is too memory hungry to run reliably at the scale of problem we are considering. Attempting a reduction on about 90,000 pixels of 234 bands each failed due to memory demand. UMAP is a similar algorithm with lower resource demands, so we'll try that instead.
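A back-of-the-envelope estimate that may explain the failure. It assumes the implementation holds a dense n-by-n matrix of float64 geodesic distances, which is our understanding of how sklearn's Isomap works but was not measured in this test. The helper name `dense_distance_matrix_bytes` is made up for illustration.

```python
def dense_distance_matrix_bytes(n_points, bytes_per_value=8):
    """Size of a dense n x n matrix, assuming float64 (8 bytes) by default."""
    return n_points * n_points * bytes_per_value


n_pixels = 90_000
n_bands = 234

input_bytes = n_pixels * n_bands * 8  # the input array itself, as float64
matrix_bytes = dense_distance_matrix_bytes(n_pixels)

print(f"input array:     {input_bytes / 1e9:.2f} GB")   # 0.17 GB
print(f"distance matrix: {matrix_bytes / 1e9:.1f} GB")  # 64.8 GB
```

Under that assumption the input is small, and the quadratic pairwise structure is what dominates. Because the cost grows with the square of the pixel count, halving the number of pixels would cut it to roughly a quarter.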

UMAP ran and produced a result. Consider the following, derived from PRISMA:
[image: UMAP result derived from PRISMA]
This wonderful result also took about 6 hours to make. It appears that UMAP is both fragile and slow. We need to continue to pursue other dimension reduction strategies.