This sample demonstrates deep learning model compression for an image-classification task. It covers the basic steps: model initialization, dataset preparation, and a training loop with training and validation steps over epochs. The sample takes a configuration file that defines the training schedule, hyperparameters, and compression settings.
- Support for Torchvision models (ResNets, VGG, Inception, etc.) and datasets (ImageNet, CIFAR-10, CIFAR-100)
- Support for custom models
- Configuration file examples for sparsity, quantization, filter pruning, and quantization combined with sparsity
- Export to ONNX format, supported by the OpenVINO™ toolkit
- DataParallel and DistributedDataParallel modes
- TensorBoard-compatible output
To work with the sample, install the corresponding Python package dependencies:

```
pip install -r examples/torch/requirements.txt
```
This scenario demonstrates quantization with fine-tuning of MobileNet v2 on the ImageNet dataset.
To prepare the ImageNet dataset, refer to the following tutorial.
- If you did not install the package, add the repository root folder to the `PYTHONPATH` environment variable.
- Go to the `examples/torch/classification` folder.
- Run the following command to start compression with fine-tuning on GPUs (it may take a few epochs to reach the baseline accuracy results):

```
python main.py -m train --config configs/quantization/mobilenet_v2_imagenet_int8.json --data /data/imagenet/ --log-dir=../../results/quantization/mobilenet_v2_int8/
```
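The file passed via `--config` follows NNCF's JSON configuration schema. As a rough sketch of what such a quantization config can contain (the field values below are illustrative, not the contents of the actual `mobilenet_v2_imagenet_int8.json`):

```json
{
    "model": "mobilenet_v2",
    "pretrained": true,
    "input_info": {
        "sample_size": [1, 3, 224, 224]
    },
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            "range": {
                "num_init_samples": 256
            }
        }
    }
}
```

The `input_info` section tells NNCF the shape of a model input so it can trace the model graph, while the `compression` section selects the algorithm and its settings.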
- Use the `--multiprocessing-distributed` flag to run in distributed mode.
- Use the `--resume` flag with the path to a previously saved model to resume training.
- For Torchvision-supported image classification models, set `"pretrained": true` inside the NNCF config JSON file supplied via `--config` to initialize the model to be compressed with Torchvision-supplied pretrained weights. Alternatively:
- Use the `--weights` flag with the path to a compatible PyTorch checkpoint to load all matching weights from the checkpoint into the model. This is useful if you need to start compression-aware training from a previously trained uncompressed (FP32) checkpoint instead of performing compression-aware training from scratch.
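The "matching weights" behavior can be pictured as a name-and-shape merge over state dicts. A minimal sketch of the idea, with state dicts modeled as plain Python dicts of lists and shape compared by length (this is an illustration, not the samples' actual loading code):

```python
def merge_matching(model_state: dict, ckpt_state: dict):
    """Copy checkpoint entries into the model state wherever the parameter
    name exists in both and the shapes (here: list lengths) agree.
    Returns the merged state and the list of skipped checkpoint keys."""
    merged = dict(model_state)
    skipped = []
    for name, value in ckpt_state.items():
        if name in model_state and len(model_state[name]) == len(value):
            merged[name] = value
        else:
            skipped.append(name)
    return merged, skipped

model = {"conv1.weight": [0.0, 0.0], "fc.weight": [0.0, 0.0, 0.0]}
ckpt = {"conv1.weight": [0.5, -0.5], "fc.weight": [1.0], "extra.bias": [1.0]}
merged, skipped = merge_matching(model, ckpt)
# conv1.weight is taken from the checkpoint; fc.weight keeps its original
# values because the shapes differ; extra.bias has no counterpart in the model.
```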
To estimate the test scores of your model checkpoint, use the following command:

```
python main.py -m test --config=configs/quantization/mobilenet_v2_imagenet_int8.json --resume <path_to_trained_model_checkpoint>
```
To validate an FP32 model checkpoint, make sure the compression algorithm settings are empty in the configuration file or `pretrained=True` is set.
**WARNING**: The samples use `torch.load` for checkpoint loading, which in turn relies on pickle facilities by default; these are known to be vulnerable to arbitrary code execution attacks. Only load data you trust.
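The risk comes from pickle's ability to invoke arbitrary callables during deserialization. One generic mitigation (a sketch independent of the samples' own loading code; the whitelist here is only an example) is a restricted `Unpickler` that refuses any global not explicitly allowed:

```python
import io
import pickle

class RestrictedUnpickler(pickle.Unpickler):
    """Refuse to resolve any global not on an explicit whitelist."""
    ALLOWED = {("collections", "OrderedDict")}  # state dicts typically use this

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers reference no globals, so they still load fine:
print(restricted_loads(pickle.dumps({"epoch": 1, "best_acc1": 76.42})))

# A payload referencing an un-whitelisted global is rejected at load time:
try:
    restricted_loads(pickle.dumps(pickle.UnpicklingError))
except pickle.UnpicklingError as e:
    print("rejected:", e)
```

Newer PyTorch releases also accept `torch.load(..., weights_only=True)`, which similarly restricts unpickling to tensor-related types.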
To export a trained model to the ONNX format, use the following command:

```
python main.py -m test --config=configs/quantization/mobilenet_v2_imagenet_int8.json --resume=../../results/quantization/mobilenet_v2_int8/6/checkpoints/epoch_1.pth --to-onnx=../../results/mobilenet_v2_int8.onnx
```
To export a model to the OpenVINO IR and run it using the Intel® Deep Learning Deployment Toolkit, refer to this tutorial.
| Model | Compression algorithm | Dataset | Accuracy (Drop) % | NNCF config file | PyTorch checkpoint |
|---|---|---|---|---|---|
| ResNet-50 | None | ImageNet | 76.16 | resnet50_imagenet.json | - |
| ResNet-50 | INT8 | ImageNet | 76.42 (-0.26) | resnet50_imagenet_int8.json | Link |
| ResNet-50 | INT8 (per-tensor only) | ImageNet | 76.37 (-0.21) | resnet50_imagenet_int8_per_tensor.json | Link |
| ResNet-50 | Mixed, 44.8% INT8 / 55.2% INT4 | ImageNet | 76.2 (-0.04) | resnet50_imagenet_mixed_int_manual.json | Link |
| ResNet-50 | INT8 + Sparsity 61% (RB) | ImageNet | 75.43 (0.73) | resnet50_imagenet_rb_sparsity_int8.json | Link |
| ResNet-50 | INT8 + Sparsity 50% (RB) | ImageNet | 75.55 (0.61) | resnet50_imagenet_rb_sparsity50_int8.json | Link |
| Inception V3 | None | ImageNet | 77.34 | inception_v3_imagenet.json | - |
| Inception V3 | INT8 | ImageNet | 78.25 (-0.91) | inception_v3_imagenet_int8.json | Link |
| Inception V3 | INT8 + Sparsity 61% (RB) | ImageNet | 77.58 (-0.24) | inception_v3_imagenet_rb_sparsity_int8.json | Link |
| MobileNet V2 | None | ImageNet | 71.93 | mobilenet_v2_imagenet.json | - |
| MobileNet V2 | INT8 | ImageNet | 71.35 (0.58) | mobilenet_v2_imagenet_int8.json | Link |
| MobileNet V2 | INT8 (per-tensor only) | ImageNet | 71.3 (0.63) | mobilenet_v2_imagenet_int8_per_tensor.json | Link |
| MobileNet V2 | Mixed, 46.6% INT8 / 53.4% INT4 | ImageNet | 70.92 (1.01) | mobilenet_v2_imagenet_mixed_int_manual.json | Link |
| MobileNet V2 | INT8 + Sparsity 52% (RB) | ImageNet | 71.11 (0.82) | mobilenet_v2_imagenet_rb_sparsity_int8.json | Link |
| SqueezeNet V1.1 | None | ImageNet | 58.24 | squeezenet1_1_imagenet.json | - |
| SqueezeNet V1.1 | INT8 | ImageNet | 58.28 (-0.04) | squeezenet1_1_imagenet_int8.json | Link |
| SqueezeNet V1.1 | INT8 (per-tensor only) | ImageNet | 58.26 (-0.02) | squeezenet1_1_imagenet_int8_per_tensor.json | Link |
| SqueezeNet V1.1 | Mixed, 54.7% INT8 / 45.3% INT4 | ImageNet | 58.9 (-0.66) | squeezenet1_1_imagenet_mixed_int_manual.json | Link |
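Note the sign convention in the "Accuracy (Drop)" column: the drop is baseline top-1 minus compressed top-1, so a negative drop means the compressed model is actually more accurate than the FP32 baseline. A small sketch of that bookkeeping, using values from the rows above:

```python
def accuracy_drop(baseline_top1: float, compressed_top1: float) -> float:
    """Drop is baseline minus compressed; negative means compression helped."""
    return round(baseline_top1 - compressed_top1, 2)

# ResNet-50 INT8: 76.16 baseline vs 76.42 compressed
print(accuracy_drop(76.16, 76.42))  # -> -0.26 (accuracy improved)
# MobileNet V2 INT8: 71.93 baseline vs 71.35 compressed
print(accuracy_drop(71.93, 71.35))  # -> 0.58
```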
As an example of NNCF convolution binarization capabilities, you may use the configs in `examples/torch/classification/configs/binarization` to binarize ResNet-18. Use the same steps/command-line parameters as for quantization (for best results, specify `--pretrained`), except for the actual binarization config path.
| Model | Compression algorithm | Dataset | Accuracy (Drop) % | NNCF config file | PyTorch checkpoint |
|---|---|---|---|---|---|
| ResNet-18 | None | ImageNet | 69.8 | resnet18_imagenet.json | - |
| ResNet-18 | XNOR (weights), scale/threshold (activations) | ImageNet | 61.63 (8.17) | resnet18_imagenet_binarization_xnor.json | Link |
| ResNet-18 | DoReFa (weights), scale/threshold (activations) | ImageNet | 61.61 (8.19) | resnet18_imagenet_binarization_dorefa.json | Link |
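For intuition on the XNOR row: XNOR-style binarization replaces each weight by its sign, scaled by the mean absolute value of the weights in the filter, so each filter is stored as one float plus one bit per weight. A toy sketch of the idea on a flat list of weights (an illustration of the technique, not NNCF's internal code):

```python
def xnor_binarize(weights):
    """XNOR-style binarization: sign(w) scaled by mean(|w|) of the filter."""
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale if w >= 0 else -scale for w in weights]

print(xnor_binarize([0.5, -0.25, 0.25]))  # scale is 1/3
```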
| Model | Compression algorithm | Dataset | Accuracy (Drop) % | GFLOPS | MParams | NNCF config file | PyTorch checkpoint |
|---|---|---|---|---|---|---|---|
| ResNet-50 | None | ImageNet | 76.16 | 8.18 (100%) | 25.50 (100%) | Link | - |
| ResNet-50 | Filter pruning, 40%, geometric median criterion | ImageNet | 75.62 (0.54) | 4.58 (56.00%) | 16.06 (62.98%) | Link | Link |
| ResNet-18 | None | ImageNet | 69.8 | 3.63 (100%) | 11.68 (100%) | Link | - |
| ResNet-18 | Filter pruning, 40%, magnitude criterion | ImageNet | 69.26 (0.54) | 2.75 (75.75%) | 9.23 (79.02%) | Link | Link |
| ResNet-18 | Filter pruning, 40%, geometric median criterion | ImageNet | 69.32 (0.48) | 2.75 (75.75%) | 9.23 (79.02%) | Link | Link |
| ResNet-34 | None | ImageNet | 73.26 | 7.33 (100%) | 21.78 (100%) | Link | - |
| ResNet-34 | Filter pruning, 40%, geometric median criterion | ImageNet | 72.72 (0.54) | 5.06 (69.03%) | 15.47 (71.03%) | Link | Link |
| GoogLeNet | None | ImageNet | 69.72 | 2.99 (100%) | 6.61 (100%) | Link | - |
| GoogLeNet | Filter pruning, 40%, geometric median criterion | ImageNet | 68.89 (0.83) | 1.36 (45.48%) | 3.47 (52.50%) | Link | Link |
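The GFLOPS and MParams columns show the remaining complexity of the pruned model, with the percentage of the unpruned baseline in parentheses; that percentage is simply compressed/baseline. A sketch of the arithmetic, using the ResNet-50 parameter counts from the rows above:

```python
def remaining_fraction(compressed: float, baseline: float) -> str:
    """Percentage of the baseline complexity that remains after pruning."""
    return f"{100 * compressed / baseline:.2f}%"

# ResNet-50 pruned 40%: 16.06 MParams out of a 25.50 MParams baseline
print(remaining_fraction(16.06, 25.50))  # -> 62.98%
```

Recomputing from the rounded values in the table can differ from the printed percentage in the last digit (e.g. 4.58/8.18 gives 55.99% where the table shows 56.00%), presumably because the table's percentages were computed before rounding the absolute numbers.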
| Model | Compression algorithm | Dataset | Accuracy (Drop) % | NNCF config file |
|---|---|---|---|---|
| ResNet-50 | None | ImageNet | 76.16 | resnet50_imagenet.json |
| ResNet-50 | Filter pruning, 52.5%, geometric median criterion | ImageNet | 75.23 (0.93) | resnet50_imagenet_accuracy_aware.json |
| ResNet-18 | None | ImageNet | 69.8 | resnet18_imagenet.json |
| ResNet-18 | Filter pruning, 50%, geometric median criterion | ImageNet | 69.92 (-0.12) | resnet18_imagenet_accuracy_aware.json |