Reconstructing 31-band hyperspectral imagery from 3-band RGB input

MXR-U-Nets for Real Time Hyperspectral Reconstruction


Paper on arXiv

In recent times, CNNs have made significant contributions to applications in image generation, super-resolution and style transfer. In this paper, we build upon the work of Howard and Gugger [11], He et al. [10] and Misra, D. [16] and propose a CNN architecture that accurately reconstructs hyperspectral images from their RGB counterparts. We also propose a much shallower version of our best model that has 10% of its memory footprint and 3x faster inference, enabling real-time video applications with only about a 0.5% decrease in performance.

Our work is significantly inspired by Antic, J.’s work [2] in reconstructing RGB bands from grayscale images. We use a modified version of perceptual loss [12] in our network. This kind of loss function has proved useful in style-transfer [12] and super-resolution [13] applications. It makes networks focus on perceptual details in an image. These details are not easily captured by standard evaluation metrics like RMSE, PSNR or MRAE but are readily visible to humans. We use sub-pixel convolutions [23] for upsampling in our decoder. Sub-pixel convolution is an alternative to the deconvolution operation for learned upsampling and is extensively used in super-resolution applications. It performs the convolution in a low-resolution space and upsamples the result, instead of upsampling first. This approach is much more efficient while being mathematically equivalent to deconvolution.
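The "convolve in low resolution, then upsample" idea can be illustrated with a minimal NumPy sketch. This is not the repository's implementation: the function names `pixel_shuffle` and `conv1x1`, the 64-channel feature map, and the upscale factor of 2 are assumptions chosen for illustration, and the channel ordering follows the common convention in which output pixel (h*r + i, w*r + j) of channel c is read from input channel c*r*r + i*r + j.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r)."""
    c_rr, h, w = x.shape
    if c_rr % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)       # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)     # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)  # merge (h, i) and (w, j)


def conv1x1(x, weight):
    """1x1 convolution: mix channels at every pixel. weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


rng = np.random.default_rng(0)
features = rng.standard_normal((64, 8, 8))      # hypothetical decoder features
weight = rng.standard_normal((31 * 2 * 2, 64))  # 31 bands, upscale factor 2

low_res = conv1x1(features, weight)    # convolution at low resolution
high_res = pixel_shuffle(low_res, 2)   # rearrangement does the upsampling
print(high_res.shape)
```

The convolution runs on the 8x8 grid and the upsampling itself is only a reshape, which is where the efficiency gain over upsampling first comes from.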


The dataset was provided in the New Trends in Image Restoration and Enhancement (NTIRE) Challenge on Spectral Reconstruction from RGB Images at CVPR 2020 [4]. The datasets for both competition tracks (Clean and Real World) consist of 450 training images and 10 validation images. The dataset for the Clean track consists of 8-bit uncompressed RGB images and their 31-channel hyperspectral counterparts as ground truth. For the Real World track, the model input is JPEG-compressed 8-bit RGB images. In our experiments, the training and validation data for the models were as provided in the original datasets.

Model Architecture

Unet Block

Results / Limited Ablation Studies


Visualizing Results

Visualization-1 Visualization-zoom

Instructions to run

First, set up the required conda environment with:

conda env create -f environment.yml

Then, make sure you download the dataset from Codalab and put it in the ../Data folder relative to the repository. The folders should be named:

  1. Train_Clean
  2. Validation_Clean
  3. Train_RealWorld
  4. Validation_RealWorld

Each folder should contain the data indicated by its name. All the images in the dataset are provided in .mat format. You should convert them to .tiff files using the tifffile Python module. We will provide the data pre-processing scripts in a few weeks' time.
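Until the official pre-processing scripts are available, the conversion could be sketched roughly as follows. This is a guess at the workflow, not the authors' script: the variable name `cube` inside the .mat files, the float32 output type, the axis order written to disk, and the helper names `tiff_path_for`, `load_cube` and `convert_folder` are all assumptions, so check them against the actual files and against what the training code expects. It also assumes scipy, h5py and tifffile are installed.

```python
from pathlib import Path

import numpy as np


def tiff_path_for(mat_path: Path) -> Path:
    """Return the output path: same folder and stem, .tiff extension."""
    return mat_path.with_suffix(".tiff")


def load_cube(mat_path: Path, key: str = "cube") -> np.ndarray:
    """Load one array from a .mat file. The key name is an assumption."""
    # Optional dependencies are imported lazily, only when a file is read.
    from scipy.io import loadmat

    try:
        return np.asarray(loadmat(str(mat_path))[key])
    except NotImplementedError:
        # MATLAB v7.3 files are HDF5 containers, which scipy cannot read.
        # Note that h5py returns the axes in reverse of MATLAB's order.
        import h5py

        with h5py.File(mat_path, "r") as f:
            return f[key][()]


def convert_folder(folder: Path, key: str = "cube") -> int:
    """Convert every .mat file in `folder`; return the number converted."""
    import tifffile

    count = 0
    for mat_path in sorted(folder.glob("*.mat")):
        cube = load_cube(mat_path, key).astype(np.float32)
        tifffile.imwrite(str(tiff_path_for(mat_path)), cube)
        count += 1
    return count
```

Calling `convert_folder(Path("../Data") / "Train_Clean")`, and likewise for the other three folders, would write a .tiff file next to each .mat file.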

You can then run:


This will start running the experiments.


