Deep Encoding

Introduction

Please check out our PyTorch implementation (recommended, memory efficient).
This repo is a Torch implementation of the Encoding Layer described in the paper:

Deep TEN: Texture Encoding Network [arXiv]
Hang Zhang, Jia Xue, Kristin Dana

@article{zhang2016deep,
  title={Deep TEN: Texture Encoding Network},
  author={Zhang, Hang and Xue, Jia and Dana, Kristin},
  journal={arXiv preprint arXiv:1612.02844},
  year={2016}
}

Traditional methods such as bag-of-words (BoW, left) are structurally similar to the more recent FV-CNN methods (center); in both, each component is optimized in a separate step. In our approach (right), the entire pipeline is learned in an integrated manner, tuning each component for the task at hand (end-to-end texture/material/pattern recognition).
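
To make the Encoding Layer more concrete, here is a minimal pure-Python sketch of its forward pass as formulated in the Deep TEN paper: each descriptor is softly assigned to every codeword, and the weighted residuals are summed per codeword. This is an illustration only, not the CUDA/Lua implementation shipped in this repo. The function and argument names (`encoding_layer`, `descriptors`, `codewords`, `smoothing`) are hypothetical, and in the real layer the codewords and smoothing factors are learned by back-propagation rather than passed in as constants.

```python
import math


def encoding_layer(descriptors, codewords, smoothing):
    """Aggregate N descriptors (each of length D) into K residual vectors.

    descriptors: list of N lists of length D (e.g. CNN feature-map columns)
    codewords:   list of K lists of length D (learnable in the real layer)
    smoothing:   list of K positive floats (learnable in the real layer)
    """
    num_codes = len(codewords)
    dim = len(codewords[0])
    encoded = [[0.0] * dim for _ in range(num_codes)]
    for x in descriptors:
        # Residual of this descriptor with respect to every codeword.
        residuals = [[xd - cd for xd, cd in zip(x, c)] for c in codewords]
        # Soft-assignment logits: -s_k * ||r_k||^2.
        logits = [-s * sum(v * v for v in r)
                  for s, r in zip(smoothing, residuals)]
        # Numerically stable softmax over the K codewords.
        peak = max(logits)
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        # Accumulate the weighted residuals for each codeword.
        for k in range(num_codes):
            weight = exps[k] / total
            for d in range(dim):
                encoded[k][d] += weight * residuals[k][d]
    return encoded


# Toy input: two 2-D descriptors, two codewords, equal smoothing factors.
encoded = encoding_layer(
    descriptors=[[0.0, 0.0], [1.0, 1.0]],
    codewords=[[0.0, 0.0], [1.0, 1.0]],
    smoothing=[1.0, 1.0],
)
print(len(encoded), len(encoded[0]))  # K rows, each of length D
```

The output has a fixed size of K x D regardless of how many descriptors are fed in, which is what allows the layer to accept images of different sizes.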

Installation

On Linux

luarocks install https://raw.githubusercontent.com/zhanghang1989/Deep-Encoding/master/deep-encoding-scm-1.rockspec

On OSX

CC=clang CXX=clang++ luarocks install https://raw.githubusercontent.com/zhanghang1989/Deep-Encoding/master/deep-encoding-scm-1.rockspec

Experiments

The Joint Encoding experiment in Sec. 4.2 runs by default (tested using 1 Titan X GPU). It achieves a 12.89% error rate on the STL-10 dataset, a 49.8% relative improvement over the previous state-of-the-art error rate of 25.67% (Zhao et al., 2015):
```
git clone https://github.com/zhanghang1989/Deep-Encoding
cd Deep-Encoding/experiments
th main.lua
```
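
The 49.8% relative improvement quoted above can be checked from the two error rates. The short sketch below assumes the usual definition, (baseline error - new error) / baseline error; the helper name `relative_improvement` is ours and is not part of this repo.

```python
def relative_improvement(baseline_error, new_error):
    """Reduction in error, expressed as a percentage of the baseline error."""
    return (baseline_error - new_error) / baseline_error * 100.0


previous_sota_error = 25.67   # % error on STL-10, Zhao et al. 2015 (from the text)
joint_encoding_error = 12.89  # % error on STL-10, this experiment (from the text)

improvement = relative_improvement(previous_sota_error, joint_encoding_error)
print(f"{improvement:.1f}% relative improvement")  # prints "49.8% relative improvement"
```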

Training Deep-TEN on MINC-2500 (Sec. 4.1) using 4 GPUs.

Before running the program, please download the pre-trained ResNet-50 Torch model and the MINC-2500 dataset to the minc folder (tested using 4 Titan X GPUs).

 th main.lua -retrain resnet-50.t7 -ft true \
 -netType encoding -nCodes 32 -dataset minc \
 -data minc/ -nClasses 23 -batchSize 64 \
 -nGPU 4 -multisize true

To get comparable results using 2 GPUs, you should change the batch size and the corresponding learning rate:

  th main.lua -retrain resnet-50.t7 -ft true \
  -netType encoding -nCodes 32 -dataset minc \
  -data minc/ -nClasses 23 -batchSize 32 \
  -nGPU 2 -multisize true -LR 0.05
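
One common way to pick the learning rate for a different batch size is the linear scaling heuristic: scale the learning rate by the same factor as the batch size. The sketch below is an illustration under assumptions, not a description of how the values above were chosen. In particular, the base learning rate of 0.1 for the 4-GPU, batch-size-64 run is an assumption (this README does not state it), and `scale_learning_rate` is a hypothetical helper.

```python
def scale_learning_rate(base_lr, base_batch_size, new_batch_size):
    """Linear scaling heuristic: the learning rate follows the batch size."""
    return base_lr * new_batch_size / base_batch_size


# ASSUMPTION: the 4-GPU command trains with LR 0.1; the README does not say so.
assumed_base_lr = 0.1
base_batch_size = 64  # -batchSize in the 4-GPU command
new_batch_size = 32   # -batchSize in the 2-GPU command

new_lr = scale_learning_rate(assumed_base_lr, base_batch_size, new_batch_size)
print(f"-batchSize {new_batch_size} -LR {new_lr:g}")
```

Under that assumption, halving the batch size halves the learning rate, which is consistent with the `-LR 0.05` used in the 2-GPU command.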

Benchmarks

| Dataset | MINC-2500 | FMD | GTOS | KTH | 4D-Light |
|---|---|---|---|---|---|
| FV-SIFT | 46.0 | 47.0 | 65.5 | 66.3 | 58.4 |
| FV-CNN(VD) | 61.8 | 75.0 | 77.1 | 71.0 | 70.4 |
| FV-CNN(VD)_multi | 63.1 | 74.0 | 79.2 | 77.8 | 76.5 |
| FV-CNN(ResNet)_multi | 69.3 | 78.2 | 77.1 | 78.3 | 77.6 |
| Deep-TEN(ours*) | 81.3 | 80.2 ± 0.9 | 84.5 ± 2.9 | 84.5 ± 3.5 | 81.7 ± 1.0 |
| State-of-the-Art | 76.0 ± 0.2 | 82.4 ± 1.4 | 81.4 | 81.1 ± 1.5 | 77.0 ± 1.1 |

Acknowledgements

We thank Wenhan Zhang from the Physics Department, Rutgers University, for discussions of mathematical models. This work was supported by National Science Foundation award IIS-1421134. A GPU used for this research was donated by the NVIDIA Corporation.
