Fix bugs with Accuracy Aware (openvinotoolkit#1021)
### Changes

Fix bugs with the JSON schema; add correct validation of the Accuracy Aware config.
Fix a bug where the `validate_every_n_epochs` parameter was not propagated from the config.
Add logging of validation metrics and of the final compression rate in the adaptive compression level training pipeline.
Improve documentation.
Add type hints; add logging of checkpoint saving.
Update the AA config of ResNet-18 with new, improved metrics.

### Reason for changes

Improving Accuracy Aware training.

### Related tickets

69079, 68302

### Tests

Existing tests were updated accordingly.
kshpv committed Dec 14, 2021
1 parent 0290f72 commit ad626be
Showing 18 changed files with 240 additions and 126 deletions.
2 changes: 1 addition & 1 deletion docs/Usage.md
@@ -262,7 +262,7 @@ If registered module should be ignored by specific algorithms use `ignored_algor
In the example above, the NNCF-compressed models that contain instances of `MyModule` will have the corresponding modules extended with functionality that will allow NNCF to quantize, sparsify or prune the `weight` parameter of `MyModule` before it takes part in `MyModule`'s `forward` calculation.
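For context, a registered custom module might look as follows. This is a hedged sketch: the `register_module` decorator and its `ignored_algorithms` argument follow the surrounding docs, but the exact import path and signature are assumptions.

```python
import torch

# Assumed import path for the decorator referenced in the docs above.
from nncf.torch import register_module

@register_module(ignored_algorithms=["quantization"])  # skip this module for quantization
class MyModule(torch.nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # NNCF will quantize/sparsify/prune this parameter for non-ignored algorithms.
        self.weight = torch.nn.Parameter(torch.randn(out_features, in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.nn.functional.linear(x, self.weight)
```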

### Accuracy-Aware model training
NNCF can apply model compression algorithms while satisfying user-defined accuracy constraints. This is done by executing an internal custom accuracy-aware training loop, which also helps to automate away some of the manual hyperparameter search related to model training, such as setting the total number of epochs, the target compression rate for the model, etc. There are two supported training loops. The first one is called [Early Exit Training](./accuracy_aware_model_training/EarlyExitTraining.md), which aims to finish fine-tuning as soon as the accuracy drop criterion is reached. The second one, [Adaptive Compression Level Training](./accuracy_aware_model_training/AdaptiveCompressionTraining.md), is more sophisticated: it is targeted at the automated discovery of a compression rate for the model that satisfies the user-specified maximal tolerable accuracy drop due to compression. Both training loops can be run with either the PyTorch or the TensorFlow backend with the same user interface (except that in the TF case the Keras API is used for training).

The following function is required to create the accuracy-aware training loop. One has to pass the `NNCFConfig` object and the compression controller (that is returned upon compressed model creation, see above).
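The code block itself is collapsed in this view. Below is a minimal sketch of the call, pieced together from the function name referenced in the Early Exit documentation in this commit; the import path and the `run()` keyword names are assumptions rather than verified API:

```python
# Sketch only: the import path and keyword names below are assumptions.
from nncf.common.accuracy_aware_training import create_accuracy_aware_training_loop

# nncf_config: the NNCFConfig with the "accuracy_aware_training" section filled in.
# compression_ctrl: the controller returned when the compressed model was created.
training_loop = create_accuracy_aware_training_loop(nncf_config, compression_ctrl)

# All framework-specific logic lives in user-defined callbacks passed to run().
model = training_loop.run(
    model,
    train_epoch_fn=train_epoch_fn,                    # fine-tunes the model for one epoch
    validate_fn=validate_fn,                          # returns the target accuracy metric
    configure_optimizers_fn=configure_optimizers_fn,  # returns (optimizer, lr_scheduler)
)
```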
docs/accuracy_aware_model_training/AdaptiveCompressionTraining.md
@@ -1,16 +1,34 @@
# Adaptive Compression Level training loop in NNCF

The Adaptive Compression Level training loop is a meta-algorithm that searches for the highest compression level of the underlying compression algorithm while staying within the user-defined maximum accuracy degradation.
The compression pipeline can consist of several compression algorithms (Algorithms Mixing); however, **performing a compression level search is supported only for a single compression algorithm with an adaptive compression level**. Such algorithms are **Magnitude Sparsity** and **Filter Pruning**. In other words, compression schemes like **Quantization** + **Filter Pruning** or **Quantization** + **Sparsity** are supported, while **Filter Pruning** + **Sparsity** is not, because **Filter Pruning** and **Sparsity** are both algorithms with an adaptive compression level.
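For example, a supported mix puts quantization and filter pruning side by side in the `"compression"` list; a sketch with the per-algorithm parameters omitted:

```
{
    "compression": [
        {
            "algorithm": "quantization"
        },
        {
            "algorithm": "filter_pruning"
        }
    ]
}
```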

The exact compression algorithm to which the compression level search is applied is determined by the `"compression"` config section. The parameters to be set by the user in the `"params"` subsection of `"accuracy_aware_training"` are:
1) `maximal_relative_accuracy_degradation` or `maximal_absolute_accuracy_degradation` - the maximal allowed accuracy metric drop relative to the original model metrics (in percent), or the maximal allowed absolute accuracy metric drop (in units of the original metric),
2) `initial_training_phase_epochs` - the number of epochs to train the model with the compression schedule specified in the `"params"` section of the `"compression"` algorithm,
3) `patience_epochs` - the number of epochs to train the model at a compression rate level set by the search algorithm before switching to another compression rate value,
4) `minimal_compression_rate_step` (Optional; default=0.025) - the minimal compression rate change step value; once the step falls below this value, the training loop is terminated,
5) `initial_compression_rate_step` (Optional; default=0.1) - the initial step value for the compression rate increase/decrease phase of the compression training loop,
6) `compression_rate_step_reduction_factor` (Optional; default=0.5) - the factor used to reduce the compression rate change step in the adaptive compression training loop,
7) `validate_every_n_epochs` (Optional; default=1) - how often (in epochs) the `Runner` should validate the compressed model,
8) `maximal_total_epochs` (Optional; default=1e4) - the maximal number of training epochs; if fine-tuning reaches this number of epochs, the loop finishes fine-tuning and returns the model with the highest compression rate and the smallest accuracy drop.


To launch the adaptive compression training loop, the user should define several functions related to model training, validation and optimizer creation (see [the usage documentation](../Usage.md#accuracy-aware-model-training) for more details) and pass them to the `run` method of an `AdaptiveCompressionTrainingLoop` instance. The training loop logic inside the `AdaptiveCompressionTrainingLoop` is framework-agnostic, while all of the framework specifics are encapsulated inside the corresponding `Runner` objects, which are created and called inside the training loop. The adaptive compression training loop is generally aimed at automatically searching for the optimal compression rate in the model, with the parameters of the search algorithm specified in the configuration file. Below is an example of a filter pruning configuration with added `"accuracy_aware_training"` parameters.
```
{
    "accuracy_aware_training": {
        "mode": "adaptive_compression_level",
        "params": {
            "maximal_relative_accuracy_degradation": 1.0,
            "initial_training_phase_epochs": 100,
            "patience_epochs": 30,
            "minimal_compression_rate_step": 0.025, // Optional
            "initial_compression_rate_step": 0.1, // Optional
            "compression_rate_step_reduction_factor": 0.5, // Optional
            "validate_every_n_epochs": 1, // Optional
            "maximal_total_epochs": 10000 // Optional
        }
    },
    "compression": [
        // ... filter pruning algorithm configuration (collapsed in this view) ...
    ]
}
```

## How the Adaptive Compression Level training loop works

The first step is the **Initial Training Phase**: the model is trained for the specified number of epochs with the initial compression rate level/schedule set by the user in the standard NNCF manner (the initial pruning rate schedule in the example above is an exponential schedule with a target pruning rate of 0.1).

The second step is **finding the optimal compression rate**: the next compression rate value is determined by the search algorithm, and the model is fine-tuned at that rate for `"patience_epochs"` epochs. The process continues until the search algorithm terminates. The returned model is the one with the highest compression rate encountered that satisfies the accuracy drop criterion - the accuracy drop of the compressed model should not exceed `"maximal_relative_accuracy_degradation"` or `"maximal_absolute_accuracy_degradation"`.

## Compression rate search algorithm

The default compression rate search algorithm changes the compression rate level by a step value that decreases throughout training.
Training is terminated once the compression rate step value falls below the minimal value determined by the `"minimal_compression_rate_step"` parameter, which can be specified in the `"params"` of the `"accuracy_aware_training"` section.
The initial value of the compression rate step is given by the `"initial_compression_rate_step"` parameter.
The step value is multiplied by the `"compression_rate_step_reduction_factor"` value whenever the direction of the compression rate change flips at a point where a new compression rate is selected.
That is, if too big an increase in compression rate resulted in accuracy metrics below the user-defined criterion, the compression rate is reduced by a smaller step in an attempt to restore the accuracy; vice versa, if a decrease was sufficient to satisfy the accuracy criterion, the compression rate is increased by a smaller step to check whether this higher compression rate could also yield tolerable accuracy values.
This sequential search is limited by the minimal step granularity given by `"minimal_compression_rate_step"`.
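To make the search dynamics concrete, here is a minimal, framework-free sketch of the step-halving logic described above. Function and variable names are illustrative, not NNCF API; `evaluate_accuracy_drop` stands in for fine-tuning at a given rate for `"patience_epochs"` and measuring the drop.

```python
def compression_rate_search(evaluate_accuracy_drop, max_allowed_drop,
                            initial_rate=0.1, initial_step=0.1,
                            minimal_step=0.025, step_reduction_factor=0.5,
                            max_steps=100):
    """Illustrative step-halving search over compression rates (not NNCF API)."""
    rate, step, direction = initial_rate, initial_step, +1
    best_rate = None
    for _ in range(max_steps):  # safety guard, similar in spirit to maximal_total_epochs
        if step < minimal_step:  # minimal granularity reached -> terminate
            break
        rate = min(max(rate + direction * step, 0.0), 1.0)
        if evaluate_accuracy_drop(rate) <= max_allowed_drop:
            # Criterion satisfied: remember the rate and push compression higher.
            best_rate = rate if best_rate is None else max(best_rate, rate)
            new_direction = +1
        else:
            # Criterion violated: back off to restore accuracy.
            new_direction = -1
        if new_direction != direction:
            step *= step_reduction_factor  # refine the step on a direction flip
            direction = new_direction
    return best_rate
```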

4 changes: 3 additions & 1 deletion docs/accuracy_aware_model_training/EarlyExitTrainig.md
@@ -7,7 +7,9 @@ Note: since the EarlyExit training does not control any compression parameter th

This training loop supports any combination of NNCF compression algorithms.

There are only two main parameters of the Early Exit training loop: `maximal_relative_accuracy_degradation` or `maximal_absolute_accuracy_degradation` - the relative (in percent) or absolute (in units of the original metric) accuracy drop with respect to the original, uncompressed model that the user can tolerate; and `maximal_total_epochs` - the maximal number of training epochs; if fine-tuning reaches this number of epochs, the loop finishes fine-tuning and returns the model with the least accuracy drop.
Additionally, the user can specify `validate_every_n_epochs` - how often (in epochs) the `Runner` should validate the compressed model.


Below is an example of the config file that needs to be provided to `create_accuracy_aware_training_loop` (see [the usage documentation](../Usage.md#accuracy-aware-model-training) for more details).

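The example config is collapsed in this view. A minimal sketch of what it plausibly looks like, mirroring the adaptive config above; the `"early_exit"` mode string and the `"compression"` contents are assumptions:

```
{
    "accuracy_aware_training": {
        "mode": "early_exit",
        "params": {
            "maximal_relative_accuracy_degradation": 1.0,
            "maximal_total_epochs": 100,
            "validate_every_n_epochs": 1 // Optional
        }
    },
    "compression": {
        "algorithm": "quantization"
    }
}
```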
2 changes: 1 addition & 1 deletion examples/torch/classification/README.md
@@ -118,4 +118,4 @@ As an example of NNCF convolution binarization capabilities, you may use the con
|ResNet-50|None|ImageNet|76.16|[resnet50_imagenet.json](configs/quantization/resnet50_imagenet.json)|
|ResNet-50|Filter pruning, 52.5%, geometric median criterion|ImageNet|75.23 (0.93)|[resnet50_imagenet_accuracy_aware.json](configs/pruning/resnet50_imagenet_accuracy_aware.json)|
|ResNet-18|None|ImageNet|69.8|[resnet18_imagenet.json](configs/binarization/resnet18_imagenet.json)|
|ResNet-18|Filter pruning, 60%, geometric median criterion|ImageNet|69.2 (-0.6)|[resnet18_imagenet_accuracy_aware.json](configs/pruning/resnet18_imagenet_accuracy_aware.json)|
examples/torch/classification/configs/pruning/resnet18_imagenet_accuracy_aware.json
@@ -32,7 +32,7 @@
"params": {
"maximal_relative_accuracy_degradation": 1.0,
"initial_training_phase_epochs": 100,
"patience_epochs": 30
"patience_epochs": 100
}
},
"compression": {
Expand Down