This sample demonstrates deep learning model compression for an image-classification task. It covers the basic steps: model initialization, dataset preparation, and a training loop with training and validation steps over epochs. The sample takes a configuration file that defines the training schedule, hyperparameters, and compression settings.
- Support for Torchvision models (ResNets, VGG, Inception, etc.) and datasets (ImageNet, CIFAR-10, CIFAR-100)
- Support for custom models
- Configuration file examples for sparsity, quantization, filter pruning, and quantization combined with sparsity
- Export to ONNX format, supported by the OpenVINO™ toolkit
- DataParallel and DistributedDataParallel modes
- TensorBoard-compatible output
To work with the sample, install the corresponding Python package dependencies:

```
pip install -r examples/torch/requirements.txt
```
This scenario demonstrates quantization with fine-tuning of MobileNet v2 on the ImageNet dataset.
To prepare the ImageNet dataset, refer to the following tutorial.
- If you did not install the package, add the repository root folder to the `PYTHONPATH` environment variable.
- Go to the `examples/torch/classification` folder.
- Run the following command to start compression with fine-tuning on GPUs (it may take a few epochs to reach the baseline accuracy results):

```
python main.py -m train --config configs/quantization/mobilenet_v2_imagenet_int8.json --data /data/imagenet/ --log-dir=../../results/quantization/mobilenet_v2_int8/
```
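The file passed via `--config` follows NNCF's JSON configuration schema. As a rough sketch of what such a quantization config can contain (the field values below are illustrative, not the contents of the actual `mobilenet_v2_imagenet_int8.json`):

```json
{
    "model": "mobilenet_v2",
    "pretrained": true,
    "input_info": {
        "sample_size": [1, 3, 224, 224]
    },
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            "range": {
                "num_init_samples": 256
            }
        }
    }
}
```

The `input_info` section tells NNCF the shape of a model input so it can trace the model graph, while the `compression` section selects the algorithm and its settings.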
- Use the `--multiprocessing-distributed` flag to run in distributed mode.
- Use the `--resume` flag with the path to a previously saved model to resume training.
- For Torchvision-supported image classification models, set `"pretrained": true` inside the NNCF config JSON file supplied via `--config` to initialize the model to be compressed with Torchvision-supplied pretrained weights. Alternatively:
- Use the `--weights` flag with the path to a compatible PyTorch checkpoint to load all matching weights from the checkpoint into the model. This is useful if you need to start compression-aware training from a previously trained uncompressed (FP32) checkpoint instead of performing compression-aware training from scratch.
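The "matching weights" behavior can be pictured as a name-and-shape merge over state dicts. A minimal sketch of the idea, with state dicts modeled as plain Python dicts of lists and shape compared by length (this is an illustration, not the samples' actual loading code):

```python
def merge_matching(model_state: dict, ckpt_state: dict):
    """Copy checkpoint entries into the model state wherever the parameter
    name exists in both and the shapes (here: list lengths) agree.
    Returns the merged state and the list of skipped checkpoint keys."""
    merged = dict(model_state)
    skipped = []
    for name, value in ckpt_state.items():
        if name in model_state and len(model_state[name]) == len(value):
            merged[name] = value
        else:
            skipped.append(name)
    return merged, skipped

model = {"conv1.weight": [0.0, 0.0], "fc.weight": [0.0, 0.0, 0.0]}
ckpt = {"conv1.weight": [0.5, -0.5], "fc.weight": [1.0], "extra.bias": [1.0]}
merged, skipped = merge_matching(model, ckpt)
# conv1.weight is taken from the checkpoint; fc.weight keeps its original
# values because the shapes differ; extra.bias has no counterpart in the model.
```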
To estimate the test scores of your model checkpoint, use the following command:

```
python main.py -m test --config=configs/quantization/mobilenet_v2_imagenet_int8.json --resume <path_to_trained_model_checkpoint>
```
To validate an FP32 model checkpoint, make sure the compression algorithm settings are empty in the configuration file or `pretrained=True` is set.
**WARNING**: The samples use `torch.load` for checkpoint loading, which in turn relies on pickle facilities by default; these are known to be vulnerable to arbitrary code execution attacks. Only load data you trust.
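The risk comes from pickle's ability to invoke arbitrary callables during deserialization. One generic mitigation (a sketch independent of the samples' own loading code; the whitelist here is only an example) is a restricted `Unpickler` that refuses any global not explicitly allowed:

```python
import io
import pickle

class RestrictedUnpickler(pickle.Unpickler):
    """Refuse to resolve any global not on an explicit whitelist."""
    ALLOWED = {("collections", "OrderedDict")}  # state dicts typically use this

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers reference no globals, so they still load fine:
print(restricted_loads(pickle.dumps({"epoch": 1, "best_acc1": 76.42})))

# A payload referencing an un-whitelisted global is rejected at load time:
try:
    restricted_loads(pickle.dumps(pickle.UnpicklingError))
except pickle.UnpicklingError as e:
    print("rejected:", e)
```

Newer PyTorch releases also accept `torch.load(..., weights_only=True)`, which similarly restricts unpickling to tensor-related types.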
To export a trained model to the ONNX format, use the following command:

```
python main.py -m test --config=configs/quantization/mobilenet_v2_imagenet_int8.json --resume=../../results/quantization/mobilenet_v2_int8/6/checkpoints/epoch_1.pth --to-onnx=../../results/mobilenet_v2_int8.onnx
```
To export a model to the OpenVINO IR and run it using the Intel® Deep Learning Deployment Toolkit, refer to this tutorial.
| Model | Compression algorithm | Dataset | Accuracy (Drop) % | NNCF config file | PyTorch checkpoint |
|---|---|---|---|---|---|
| ResNet-50 | None | ImageNet | 76.16 | resnet50_imagenet.json | - |
| ResNet-50 | INT8 | ImageNet | 76.42 (-0.26) | resnet50_imagenet_int8.json | Link |
| ResNet-50 | INT8 (per-tensor only) | ImageNet | 76.37 (-0.21) | resnet50_imagenet_int8_per_tensor.json | Link |
| ResNet-50 | Mixed, 44.8% INT8 / 55.2% INT4 | ImageNet | 76.2 (-0.04) | resnet50_imagenet_mixed_int_manual.json | Link |
| ResNet-50 | INT8 + Sparsity 61% (RB) | ImageNet | 75.43 (0.73) | resnet50_imagenet_rb_sparsity_int8.json | Link |
| ResNet-50 | INT8 + Sparsity 50% (RB) | ImageNet | 75.55 (0.61) | resnet50_imagenet_rb_sparsity50_int8.json | Link |
| Inception V3 | None | ImageNet | 77.34 | inception_v3_imagenet.json | - |
| Inception V3 | INT8 | ImageNet | 78.25 (-0.91) | inception_v3_imagenet_int8.json | Link |
| Inception V3 | INT8 + Sparsity 61% (RB) | ImageNet | 77.58 (-0.24) | inception_v3_imagenet_rb_sparsity_int8.json | Link |
| MobileNet V2 | None | ImageNet | 71.93 | mobilenet_v2_imagenet.json | - |
| MobileNet V2 | INT8 | ImageNet | 71.35 (0.58) | mobilenet_v2_imagenet_int8.json | Link |
| MobileNet V2 | INT8 (per-tensor only) | ImageNet | 71.3 (0.63) | mobilenet_v2_imagenet_int8_per_tensor.json | Link |
| MobileNet V2 | Mixed, 46.6% INT8 / 53.4% INT4 | ImageNet | 70.92 (1.01) | mobilenet_v2_imagenet_mixed_int_manual.json | Link |
| MobileNet V2 | INT8 + Sparsity 52% (RB) | ImageNet | 71.11 (0.82) | mobilenet_v2_imagenet_rb_sparsity_int8.json | Link |
| SqueezeNet V1.1 | None | ImageNet | 58.24 | squeezenet1_1_imagenet.json | - |
| SqueezeNet V1.1 | INT8 | ImageNet | 58.28 (-0.04) | squeezenet1_1_imagenet_int8.json | Link |
| SqueezeNet V1.1 | INT8 (per-tensor only) | ImageNet | 58.26 (-0.02) | squeezenet1_1_imagenet_int8_per_tensor.json | Link |
| SqueezeNet V1.1 | Mixed, 54.7% INT8 / 45.3% INT4 | ImageNet | 58.9 (-0.66) | squeezenet1_1_imagenet_mixed_int_manual.json | Link |
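Note the sign convention in the "Accuracy (Drop)" column: the drop is baseline top-1 minus compressed top-1, so a negative drop means the compressed model is actually more accurate than the FP32 baseline. A small sketch of that bookkeeping, using values from the rows above:

```python
def accuracy_drop(baseline_top1: float, compressed_top1: float) -> float:
    """Drop is baseline minus compressed; negative means compression helped."""
    return round(baseline_top1 - compressed_top1, 2)

# ResNet-50 INT8: 76.16 baseline vs 76.42 compressed
print(accuracy_drop(76.16, 76.42))  # -> -0.26 (accuracy improved)
# MobileNet V2 INT8: 71.93 baseline vs 71.35 compressed
print(accuracy_drop(71.93, 71.35))  # -> 0.58
```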
As an example of NNCF convolution binarization capabilities, you may use the configs in `examples/torch/classification/configs/binarization` to binarize ResNet-18. Use the same steps/command-line parameters as for quantization (for best results, specify `--pretrained`), except for the actual binarization config path.
| Model | Compression algorithm | Dataset | Accuracy (Drop) % | NNCF config file | PyTorch checkpoint |
|---|---|---|---|---|---|
| ResNet-18 | None | ImageNet | 69.8 | resnet18_imagenet.json | - |
| ResNet-18 | XNOR (weights), scale/threshold (activations) | ImageNet | 61.63 (8.17) | resnet18_imagenet_binarization_xnor.json | Link |
| ResNet-18 | DoReFa (weights), scale/threshold (activations) | ImageNet | 61.61 (8.19) | resnet18_imagenet_binarization_dorefa.json | Link |
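For intuition on the XNOR row: XNOR-style binarization replaces each weight by its sign, scaled by the mean absolute value of the weights in the filter, so each filter is stored as one float plus one bit per weight. A toy sketch of the idea on a flat list of weights (an illustration of the technique, not NNCF's internal code):

```python
def xnor_binarize(weights):
    """XNOR-style binarization: sign(w) scaled by mean(|w|) of the filter."""
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale if w >= 0 else -scale for w in weights]

print(xnor_binarize([0.5, -0.25, 0.25]))  # scale is 1/3
```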
| Model | Compression algorithm | Dataset | Accuracy (Drop) % | GFLOPS | MParams | NNCF config file | PyTorch checkpoint |
|---|---|---|---|---|---|---|---|
| ResNet-50 | None | ImageNet | 76.16 | 8.18 (100%) | 25.50 (100%) | Link | - |
| ResNet-50 | Filter pruning, 40%, geometric median criterion | ImageNet | 75.62 (0.54) | 4.58 (56.00%) | 16.06 (62.98%) | Link | Link |
| ResNet-18 | None | ImageNet | 69.8 | 3.63 (100%) | 11.68 (100%) | Link | - |
| ResNet-18 | Filter pruning, 40%, magnitude criterion | ImageNet | 69.26 (0.54) | 2.75 (75.75%) | 9.23 (79.02%) | Link | Link |
| ResNet-18 | Filter pruning, 40%, geometric median criterion | ImageNet | 69.32 (0.48) | 2.75 (75.75%) | 9.23 (79.02%) | Link | Link |
| ResNet-34 | None | ImageNet | 73.26 | 7.33 (100%) | 21.78 (100%) | Link | - |
| ResNet-34 | Filter pruning, 40%, geometric median criterion | ImageNet | 72.72 (0.54) | 5.06 (69.03%) | 15.47 (71.03%) | Link | Link |
| GoogLeNet | None | ImageNet | 69.72 | 2.99 (100%) | 6.61 (100%) | Link | - |
| GoogLeNet | Filter pruning, 40%, geometric median criterion | ImageNet | 68.89 (0.83) | 1.36 (45.48%) | 3.47 (52.50%) | Link | Link |
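The GFLOPS and MParams columns show the remaining complexity of the pruned model, with the percentage of the unpruned baseline in parentheses; that percentage is simply compressed/baseline. A sketch of the arithmetic, using the ResNet-50 parameter counts from the rows above:

```python
def remaining_fraction(compressed: float, baseline: float) -> str:
    """Percentage of the baseline complexity that remains after pruning."""
    return f"{100 * compressed / baseline:.2f}%"

# ResNet-50 pruned 40%: 16.06 MParams out of a 25.50 MParams baseline
print(remaining_fraction(16.06, 25.50))  # -> 62.98%
```

Recomputing from the rounded values in the table can differ from the printed percentage in the last digit (e.g. 4.58/8.18 gives 55.99% where the table shows 56.00%), presumably because the table's percentages were computed before rounding the absolute numbers.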
| Model | Compression algorithm | Dataset | Accuracy (Drop) % | NNCF config file |
|---|---|---|---|---|
| ResNet-50 | None | ImageNet | 76.16 | resnet50_imagenet.json |
| ResNet-50 | Filter pruning, 52.5%, geometric median criterion | ImageNet | 75.23 (0.93) | resnet50_imagenet_accuracy_aware.json |
| ResNet-18 | None | ImageNet | 69.8 | resnet18_imagenet.json |
| ResNet-18 | Filter pruning, 50%, geometric median criterion | ImageNet | 69.92 (-0.12) | resnet18_imagenet_accuracy_aware.json |