Fix input quantization in case of embeddings #97

Merged: 1 commit merged into openvinotoolkit:develop from fix_input_quantization on Aug 11, 2020

Conversation

vshampor
Contributor

No description provided.

@vshampor vshampor requested a review from a team August 10, 2020 16:25
@vshampor vshampor merged commit 357f96b into openvinotoolkit:develop Aug 11, 2020
@vshampor vshampor deleted the fix_input_quantization branch October 9, 2020 10:33
vshampor added a commit that referenced this pull request Nov 6, 2020
* Allow sharing activation quantizers in different graph points (#67)

* Update version and docs on develop (#77)

* Update 3rd party integration patches (#79)

* Doc updates (#84)

* Add info on export to Usage.md

* Fix third party headers

* Fix import in transformers patch (#85)

* Fix percentile per-channel init (#86)

Fixes: #83

* Omit nodes called during debugging from entering NNCF graph (#87)

* Enable custom range initializers for overridden scopes in schema (#89)

* Enable custom quantization configs and initializers for overridden scopes in schema

* code style

* remove range config duplication

* obsolete import

* Fix model saving in transformers patch (#91)

* Patch TracedTensor's __repr__ method instead of torch.Tensor's (#92)

* Fix mmdetection patch (#93)

* Update mmdetection patch to v2.3.0 (#95)

* Allow registering user modules as NNCF modules for weight quantization (#99)

* Assign latest tensor shape during ForwardTraceOnly() (#96)

* Enable GPT2 ops (#98)

* Fix HW config scenario with ops missing in HW config definition (#94)

* Fix input quantization in case of embeddings (#97)

* Added sanity tests for third party integration (#45)

* Expose quantizer linking through config (#100)

* Add citing section to frontpage README (#103)

* Fix bad rebase in asymmetric quantization ONNX export (#104)

* Use default quantizer configuration for op weights not specified in HW config (#105)

* Update transformers to v3.0.2 (#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (#109)

* Add unified scales in HW config operation (via quantizer linking) (#108)

* Add quantization metric (#33)

* Make HW config parsing conform to the implicit rules (#111)

(except for the "any supported quantization for the ops in config
without specified quantizations" case, because it needs config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (#113)

* Use sequential sampling for evaluation across example scripts (#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (#115)

* Properly handle ops in HW config without quantization configs associated (#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (#121)

* make criterion optional in the signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (#122)

* Fix pretrained (#125)

* Mark Convs as non-depthwise for 1 input channel case (#126)

* Add non-RELU activations to fusable patterns (#124)

* Fixed Pylint warnings (#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (#132)

* Add per layer initialization of  ranges. (#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (#138)

* Fix PyLint. (#139)

* Introduced compression ratio parameter for Mixed Precision init (#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing the optimal mixed-precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to the fully INT8 one. The total compression for the model is the sum of the compression for each quantized layer, which is the product of the layer's (Conv, Deconv, Linear) FLOPs and the number of bits used for its quantization. The ratio is used to estimate the performance boost for the quantized model; it is a better proxy for the amount of computation than the number of parameters multiplied by bitwidth (see the sketch after this commit's notes).

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision
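A minimal sketch of the ratio described above (not the NNCF implementation; the convention that the ratio compares the fully INT8 bit complexity against the mixed-precision one is an assumption based on the description):

```python
def compression_ratio(layer_flops, layer_bits):
    """Bit complexity of a fully INT8 assignment divided by that of the
    mixed-precision assignment, over the quantized Conv/Deconv/Linear layers."""
    mixed = sum(f * b for f, b in zip(layer_flops, layer_bits))
    int8 = sum(f * 8 for f in layer_flops)
    return int8 / mixed

# Example: quantizing the heavier of two layers to INT4 gives roughly 1.7x
print(compression_ratio([1e9, 2e8], [4, 8]))
```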

* Fix bug with NoCompressionAlgorithmController (#150)

* Set data loading workers to 0 across tests to force single process (#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (#166)

* Fix command line creation for test_compression_training (#167)

* Improve eval test code (#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (#158)

* Fix third_party_sanity and eval test bugs (#169)

* Fix mmdetection dataset search path for SSD (#176)

* Test stability (#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (#178)

* Fixed disabling gradients of quantizers for HAWQ (#184)

* Corrected default values in range initializers (#183)

- Correct minimum and maximum values for mean_min_max no longer skip the check for statistics that were not collected, which prevents initialization with inf values.
- Percentile init no longer crashes by default

* Refactor imports in setup.py (#182)

Important for CI

* Fix security issues with imports (#185)

* Fix paths to COCO in mmdetection third party sanity tests (#186)

* Build graphs within the torch.no_grad() context (#187)

Should reduce memory usage during create_compressed_model
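The gist of the change, sketched under the assumption that graph building boils down to a tracing forward pass (the helper below is hypothetical, not NNCF code):

```python
import torch

def build_nncf_graph(model: torch.nn.Module, dummy_input: torch.Tensor):
    # The tracing forward needs no autograd state, so running it under
    # no_grad() avoids retaining activations for backward and lowers
    # peak memory during create_compressed_model.
    with torch.no_grad():
        model(dummy_input)  # graph-capturing hooks would fire here
```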

* Fix security issues directly in code (#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (#190)

* Make default install support non-GPU cases (#193)

* Fixed backward compatibility test (#195)

* Improve quantizer setup for hanging batchnorm nodes (#192)

* Do not merge subgraphs if the subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

This will manifest itself when there is a batch norm operation that
was not merged into any previous op, i.e. one that should accept quantized
input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (#171)

* Add more security fixes (#197)

* Removed double logging to stdout (#198)

* ignore frozen layers during filter pruning (#200)

* Use latest matplotlib version (#206)

* Use propagation based mode by default (#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize_inputs option.

* Add operator metatypes for 'sigmoid' and 'add' operator (#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Introduced `enabled` parameter for Quantizers (#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Update documentation (#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* To fix cpu_only part (#221)

* Update the cpu_only part of the Dockerfile; fix an issue with setup.py install with the --cpu-only option; fix README.md

* apply remarks

* Fix register_operator (#224)

* Add per-layer sparsity. (#127)

* Do not call _quantize_inputs for propagation based mode (#229)

* Consistent bitwidth for activations and weight in propagation mode (#191)

* Added sota eval tests via AC (#142)

* Refactored HAWQ: split functionality into separate files (#232)

* Allow quantizing modules that share their weights for multiple operations (#235)

* Filter quantizers that directly act upon integer inputs (#228)

* Add support for a sparsity freeze epoch in magnitude sparsity. (#218)

* Liberal bitwidth assignment mode by default on precision initialization (#222)

* Fix AdaptiveSparsityScheduler. (#236)

* Fix threesigma init (#240)

* Build extensions in a temporary folder (#239)

* Criterion generalization for HAWQ algorithm (#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* fix quantization range initialization in case of 1 scale channel (#241)

Fixes quantization range initialization in the case of a single scale channel, so that initialization no longer uses only one slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...).
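A small sketch of the corrected reduction (illustrative only; the function name and the (N, C, ...) layout are assumptions):

```python
import torch

def minmax_for_range_init(x: torch.Tensor, scale_channels: int):
    if scale_channels == 1:
        # Single scale channel: reduce over the whole batch,
        # not just the first slice x[0].
        return x.min(), x.max()
    # Per-channel case: one (min, max) pair per scale channel.
    flat = x.transpose(0, 1).reshape(scale_channels, -1)
    return flat.min(dim=1).values, flat.max(dim=1).values
```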

* Patch Semantic Segmentation Application to export onnx and test with resume flag (#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (#220)

* Fixed skipping of OpenVINO tests and preinstall (#246)

* Corrected handling of the barrier during graph traversal (#249)

* Extend input handling flexibility (#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling of Inception outputs in the classification sample (#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range; make formulas depend only on one value (levels), thereby reducing the chance of a mistake (see the sketch after this commit's notes)

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges
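The "one value" idea could look like the sketch below (an assumption-based illustration; the actual calculate_level_ranges may differ in signature and return order):

```python
def calculate_level_ranges(levels: int, signed: bool = True):
    # Everything derives from `levels` alone: 256 signed levels give
    # [-128, 127], while 256 unsigned levels give [0, 255].
    if signed:
        return -(levels // 2), levels // 2 - 1
    return 0, levels - 1
```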

* Fix bug with export to Q/DQ (#248)

* Fix bug with export to Q/DQ

Adds a workaround in export processing for our old checkpoints.
Adds exception raising when exporting per-channel Q/DQ layers, since PyTorch
ONNX export supports only per-tensor Q/DQ.

* Fix Pylint

* Update layers.py

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (#253)

* Update documentation for release v1.5.0 (#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
vshampor added a commit that referenced this pull request Jan 29, 2021
* Fix input quantization in case of embeddings (#97)

* Added sanity tests for third party integration (#45)

* Expose quantizer linking through config (#100)

* Add citing section to frontpage README (#103)

* Fix bad rebase in asymmetric quantization ONNX export (#104)

* Use default quantizer configuration for op weights not specified in HW config (#105)

* Update transformers to v3.0.2 (#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (#109)

* Add unified scales in HW config operation (via quantizer linking) (#108)

* Add quantization metric (#33)

* Make HW config parsing conform to the implicit rules (#111)

(except for the "any supported quantization for the ops in config
without specified quantizations" case, because it needs config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (#113)

* Use sequential sampling for evaluation across example scripts (#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (#115)

* Properly handle ops in HW config without quantization configs associated (#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (#121)

* make criterion optional in the signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (#122)

* Fix pretrained (#125)

* Mark Convs as non-depthwise for 1 input channel case (#126)

* Add non-RELU activations to fusable patterns (#124)

* Fixed Pylint warnings (#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (#132)

* Add per layer initialization of  ranges. (#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (#138)

* Fix PyLint. (#139)

* Introduced compression ratio parameter for Mixed Precision init (#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing the optimal mixed-precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to the fully INT8 one. The total compression for the model is the sum of the compression for each quantized layer, which is the product of the layer's (Conv, Deconv, Linear) FLOPs and the number of bits used for its quantization. The ratio is used to estimate the performance boost for the quantized model; it is a better proxy for the amount of computation than the number of parameters multiplied by bitwidth.

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (#150)

* Set data loading workers to 0 across tests to force single process (#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (#166)

* Fix command line creation for test_compression_training (#167)

* Improve eval test code (#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (#158)

* Fix third_party_sanity and eval test bugs (#169)

* Fix mmdetection dataset search path for SSD (#176)

* Test stability (#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (#178)

* Fixed disabling gradients of quantizers for HAWQ (#184)

* Corrected default values in range initializers (#183)

- Correct minimum and maximum values for mean_min_max no longer skip the check for statistics that were not collected, which prevents initialization with inf values.
- Percentile init no longer crashes by default

* Refactor imports in setup.py (#182)

Important for CI

* Fix security issues with imports (#185)

* Fix paths to COCO in mmdetection third party sanity tests (#186)

* Build graphs within the torch.no_grad() context (#187)

Should reduce memory usage during create_compressed_model

* Fix security issues directly in code (#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (#190)

* Make default install support non-GPU cases (#193)

* Fixed backward compatibility test (#195)

* Improve quantizer setup for hanging batchnorm nodes (#192)

* Do not merge subgraphs if the subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

This will manifest itself when there is a batch norm operation that
was not merged into any previous op, i.e. one that should accept quantized
input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (#171)

* Add more security fixes (#197)

* Removed double logging to stdout (#198)

* ignore frozen layers during filter pruning (#200)

* Use latest matplotlib version (#206)

* Use propagation based mode by default (#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize_inputs option.

* Add operator metatypes for 'sigmoid' and 'add' operator (#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Grouping of pruning modules + clusterisation classes

* Small fixes

* Introduced `enabled` parameter for Quantizers (#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Added model analysis file

* Update documentation (#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* Fixes for grads + batch norms

* To fix cpu_only part (#221)

* Update the cpu_only part of the Dockerfile; fix an issue with setup.py install with the --cpu-only option; fix README.md

* apply remarks

* Fix register_operator (#224)

* Add per-layer sparsity. (#127)

* Do not call _quantize_inputs for propagation based mode (#229)

* Consistent bitwidth for activations and weight in propagation mode (#191)

* Added sota eval tests via AC (#142)

* Refactored HAWQ: split functionality into separate files (#232)

* Allow quantizing modules that share their weights for multiple operations (#235)

* Filter quantizers that directly act upon integer inputs (#228)

* Add support for a sparsity freeze epoch in magnitude sparsity. (#218)

* Liberal bitwidth assignment mode by default on precision initialization (#222)

* Fix AdaptiveSparsityScheduler. (#236)

* Fix threesigma init (#240)

* Build extensions in a temporary folder (#239)

* Refactoring + added step with model analysis

* Criterion generalization for HAWQ algorithm (#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* Fixes for pruning info

* fix quantization range initialization in case of 1 scale channel (#241)

Fixes quantization range initialization in the case of a single scale channel, so that initialization no longer uses only one slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...).

* Patch Semantic Segmentation Application to export onnx and test with resume flag (#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (#220)

* Fixed skipping of OpenVINO tests and preinstall (#246)

* Small cleanup + refactoring

* Corrected handling of the barrier during graph traversal (#249)

* Extend input handling flexibility (#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling of Inception outputs in the classification sample (#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range; make formulas depend only on one value (levels), thereby reducing the chance of a mistake

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges

* step 1

* Fix bug with export to Q/DQ (#248)

* Fix bug with export to Q/DQ

Adds a workaround in export processing for our old checkpoints.
Adds exception raising when exporting per-channel Q/DQ layers, since PyTorch
ONNX export supports only per-tensor Q/DQ.

* Fix Pylint

* Update layers.py

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (#253)

* Update documentation for release v1.5.0 (#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

* Fixed old tests

* Add test for pruning groups checks

* Fix pylint + small cleanup

* More clarification about `bits` parameter in docs (#263)

* Make the customer happy by showing the param name that is wrong (#259)

* kernel changes

* Add pruning sample tests. (#268)

* Change an operation order in create_compressed_model (#265)

* Introduce additional evaluation of loss function to SSD application

* Expanded table, skipped unsupported models (#234)

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Mlflow log (#243)

* mlflow logging

* something

* some changes

* Some fixes and clear up

* Symbolic link update

* Final Updates

* Little fixes

* Little fixes(one more)

* Test mlflow off

* Deleted hardcoded log dir

* Generalization

* Clear up

* Fixes

* code fixes

* Common classification functions carry out

* Metrics logging changes

* Fix comments

* Fix pylint

* Fix pylint

* Fix last linter warnings

* Cpu nms kernels replaced by torch func

* Extended test for model analysis

* Clean up

* Small pylint + comments fixes

* Fix gradients zeroing + prune batch norms by default

* Fix prune batch norm default

* Fix test

* is cuda

* Compress in eval mode (#257)

* Pruning of ConvTranspose (#274)

* Add pruning of ConvTranspose

* Rename to target_weight_dim_for_compression

* fixes

* Fix zero_grad

* get_op_types_of_pruned_modules

* Fixed collecting metrics.json for incomplete eval test (#279)

* Added Unet Mapillary AC configs (#281)

* Added flag for collecting quickly computed stats (#287)

* Remove __getattr__ from SampleConfig (#292)

A newer `addict` version uses custom private attributes for its internal
workings, and __getattr__ disrupted them. It was quite useless anyway.

* Fix H/W on an image in the mock coco dataset (#291)

* Set proper workdir path for Mask-RCNN (#294)

* Proper BN momentum parameter and train mode setting in BN adaptation (#288)

* proper BN momentum parameter and train mode setting in BN adaptation

* use the training mode switcher context manager for BN adaptation inference

* Testing OPs quantization by synthetic tests (#297)

Also
* Made LeakyReLU an input_quantizable OP
* Removed extra dot-files for ManyNonEvalModules test case

* Revised mixed-precision related content (#300)

* Moved mixed_precision configs to a separate folder
* Minimized the set of parameters in this config, removing as many as possible and letting them take their default values.

* Remove .dot extension in the HW config test case descriptor (#303)

* Switch to VOC2012 in eval mode (#295)

* Updated pruning configs and results (#305)

* Don't call MLFlow if it's not enabled (#304)

Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.

* Add input/output-names parameters to export_model function. (#296)
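A hedged usage sketch for the new parameters (the names follow the commit title; `ctrl` stands for the compression controller returned by create_compressed_model, and the exact signature may differ):

```python
def export_onnx(ctrl):
    # The given names become the input/output names of the exported ONNX graph.
    ctrl.export_model("model.onnx",
                      input_names=["image"],
                      output_names=["probs"])
```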

* Fixed paths to mixed-precision configs (#306)

* Correct configs for mixed precision models (#307)

After #300 the *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy

* Removed custom SqueezeNet model for better user experience (#308)

* Correct configs for mixed precision models

After #300 the *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy

* Removed custom SqueezeNet model for better user experience

Originally we had a modified copy of the SqueezeNet model to work around a bug in the ONNX exporter when converting MaxPool with ceil_mode=True.
This bug is no longer relevant for torch 1.5, and there is an almost identical SqueezeNet model in torchvision > 0.6.
That's why the custom SqueezeNet was deleted: it is no longer needed, and removing it avoids confusion.

There are no changes in the corresponding NNCF graph.
Previously trained checkpoints for the custom SqueezeNet can be loaded and evaluated with the SqueezeNet from torchvision. The INT8 model has the same accuracy; the mixed-precision model differs by at most ~0.01.

* Added ResNet-18 magnitude Filter Pruning config and snapshot (#311)

* Added ResNet-18 magnitude Filter Pruning config and snapshot

* Adjusted checkpoint validation

* Move the epoch_step() method call to the beginning of the epoch. (#231)

* Move the epoch_step() method call to the beginning of the epoch (see the sketch after this commit's notes).

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.
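With the reordering above, a typical training loop would look roughly like this (a sketch with placeholder names, not sample code from the repo):

```python
def train(model, train_loader, optimizer, criterion, compression_ctrl, num_epochs):
    for _ in range(num_epochs):
        compression_ctrl.scheduler.epoch_step()   # now called at the start of the epoch
        for inputs, targets in train_loader:
            compression_ctrl.scheduler.step()     # now called before the training step
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets) + compression_ctrl.loss()
            loss.backward()
            optimizer.step()
```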

* Rename value of target_device from "NONE" to "TRIAL" (#314)

* Move the epoch_step() method call to the beginning of the epoch. (#231)

* Move the epoch_step() method call to the beginning of the epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename target_device "NONE" to "TRIAL".

* Fix NMS CUDA extensions import for CPU only case (#316)

* Made initialization depend on the number of samples. (#309)

* Wrapped MLFlow for safe access (#313)

* Introduced a separate batch size for initialization (#315)

* A separate data_loader is registered for initialization via `register_default_init_args`
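A sketch of the registration step (the import path follows NNCF 1.x-era docs and is an assumption):

```python
from nncf.initialization import register_default_init_args  # path is an assumption

def attach_init_loader(nncf_config, init_loader):
    # The loader registered here feeds only the initialization passes
    # (e.g. range init) and may use its own batch size; the criterion
    # argument became optional in #121.
    return register_default_init_args(nncf_config, init_loader)
```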

* WA for Python 3.6 on CI (#321)

* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (#322)

* Show subprocess log in test assertion stacktrace (#325)

* Adjust ICNet compressed target values (#326)

* Do not replace parameter during symmetric range init (#327)

The initialization using the controller method may occur *after*
the optimizer has received the list of the model's parameters, so replacing
the parameter as a whole during such initialization would break the
gradient updates.
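The pitfall in isolation (plain PyTorch, not NNCF code):

```python
import torch
from torch import nn

scale = nn.Parameter(torch.ones(1))
opt = torch.optim.SGD([scale], lr=0.1)  # the optimizer captures this object

# Wrong: rebinding to a fresh Parameter leaves the optimizer updating
# the old, now-orphaned tensor.
# scale = nn.Parameter(torch.tensor([2.0]))

# Right: write the initialized value into the existing parameter in place.
with torch.no_grad():
    scale.copy_(torch.tensor([2.0]))
```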

* Increase number of epochs in sanity test runs (#324)

Should uncover more bugs.

* Replace the rest of num_init_steps entries with num_init_samples (#328)

* Use PyTorch 1.7 (#223)

* Move epoch_step and step to the beginning of epoch for staged worker (#318)

* Use torch 1.7.0 for third party sanity tests (#333)

* Fix mixing Cyrillic and Latin letters (#335)

* Fix statistics calculation in local-mode sparsity. (#337)

* Fk/update packages versions (#338)

* Adding definitive versions of required packages, move to python3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only (fix 01)

* Update accuracy target values after switching to torch 1.7.0 (#334)

* Change tensorboardX to pytorch.utils.tensorboard (#332)

* Change tensorboardX to tensorboard

* Add tensorboard version

* Add domain in onnx-model for custom operations. (#323)

* Corrected grouping of activation quantizers (#339)

Unmerged FQs for activations should be placed in different groups if an unmerged activation FQ on a branch comes directly after another activation FQ (a common input for different branches).
start->FQ_A  Conv
        \   /
       POST_HOOK
         /    \
  PRE_HOOK    PRE_HOOK
    |           \
  div          MaxPool   here->|FQ_A|
                  \     /
                POST_HOOK

* Adjust thresholds due to new torchvision FP32 checkpoints acc. drop (#342)

* Changed AC configs for SSD models (#341)

* Revert "Fk/update packages versions (#338)" (#343)

This reverts commit 8c17e0c.

* Fk/update packages versions (#344)

* Adding definitive versions of required packages, move to python3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only (fix 01)

* Add pandas to requirements

* Adding definitive versions of required packages, move to python3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only (fix 01)

* Add pandas to requirements

* Fix mistake in tensorboard name

* Fix per-layer sparsity. Add stub scheduler. (#340)

* fix config path (#346)

* Add Embedding to the CPU HW config definition (#347)

* Added separate execution of OV tests to start parallelizing (#282)

* Remove no_empty_cache in an attempt to fix sporadic CI failures (#348)

* Add an option to optimize logarithms of quantizer scales instead of scales directly (#329)

* Add scale_log parameter for quantization. It allows increasing convergence speed for high scales and increasing accuracy for low scales (see the sketch after this commit's notes).

* add _ to make some variable "hidden"

* variant of setter for scale

* add setter for input_range for asymmetric quantizer

* scale_log_flag is used outside to print status, so I brought back .scale_log_flag instead of ._scale_log_flag

* made scale_log_flag read-only

* add test for the scale_log parameter.

* Update test_scale_log.py

* add missing-key check for load_state_dict; whitespace fixes

* remove quantizer.scale = torch.nn.Parameter() to avoid torch error

* fix test_unified_scales_are_identical_in_onnx fail due to unable to set Parameter by property

* remove useless init method

* split long line

* fix test_unified_scales

* Update test_scale_log.py

* update ref file by replace scale -> _scale_tensor

* Update README.md

* Update README.md

* Update layers.py

* fix HookAutoRemove

* Improvements

Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
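The log-domain idea of #329, distilled into a standalone sketch (the class below is an illustration, not the NNCF quantizer):

```python
import torch
from torch import nn

class LogScale(nn.Module):
    """Trains the logarithm of a quantizer scale; the scale itself stays
    readable and writable through a property, as the notes above describe."""

    def __init__(self, init_scale: float = 1.0):
        super().__init__()
        self._scale_log = nn.Parameter(torch.tensor(float(init_scale)).log())

    @property
    def scale(self) -> torch.Tensor:
        return self._scale_log.exp()

    @scale.setter
    def scale(self, value):
        with torch.no_grad():
            self._scale_log.copy_(torch.as_tensor(value, dtype=torch.float).log())
```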

* Fixed protobuf error (#349)

* Add quantization support for nn.EmbeddingBag (#330)

* Add quantization support for nn.EmbeddingBag

* Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT

* Add synthetic model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag

* Remove duplicated synthetic model test of nn.Embedding

* Add EmbeddingBag to the CPU HW config definition

* replace TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc

* Handle network input nodes to NNCFEmbeddingBag

* Fix pylint warnings
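At its core, weight quantization for an embedding op means fake-quantizing the weight table before the lookup; the following is a self-contained sketch of that idea (not the NNCFEmbeddingBag implementation):

```python
import torch
from torch import nn
import torch.nn.functional as F

def fake_quant(w: torch.Tensor, levels: int = 256) -> torch.Tensor:
    # Symmetric per-tensor fake quantization, for illustration only
    # (straight-through gradient handling omitted).
    scale = w.abs().max().clamp(min=1e-8) / (levels // 2 - 1)
    return (w / scale).round().clamp(-(levels // 2), levels // 2 - 1) * scale

class QuantEmbeddingBag(nn.EmbeddingBag):
    def forward(self, input, offsets=None, per_sample_weights=None):
        return F.embedding_bag(input, fake_quant(self.weight), offsets,
                               mode=self.mode,
                               per_sample_weights=per_sample_weights)
```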

* Vpu config revision (#356)

* Revised VPU config

* More extreme ratio for VPU config to test INT2 bitwidth assignment

Also updated reference graphs

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>

* Renamed case-sensitive files to prevent git issue on Windows (#357)

After checking out a fresh develop, git confuses files whose names differ only in capitalization.
As a result such a file can't be discarded.
`git config --global core.ignorecase true` doesn't help either.

* Update mmdet patch (#354)

* Update mmdet patch

* Update configs and meta

* Add export tests

* Update test

* Update package installation

* Compression statistics before training (#345)

* Compression statistics before training

* Compression statistics before training

* print_statistics sanity test

* Object detection test fixes

* is_main_process aligning

* pylint disabling

* Pruning refactoring to work with FLOPs target too (#320)

* Added pruning_flops_target param and all necessary functions

* Added tests

* Pylint fixed

* Fixed comments: BatchNorm deleted from flops calculations and small refactoring

* Fix tests

* Delete bias from FLOPs calc + test reverting

* Fix bug with mmdet patch (#363)

* Fix bug with mmdet patch

* Fix bugs

* Fix pylint

* Added ONNX Q-DQ converting parameters (#362)

* Revert "Added ONNX Q-DQ converting parameters (#362)" (#368)

This reverts commit b0504e9.

* Beta directory (#364)

* create beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)

* update documentation

* updated checkpoint links

* nncf-tensorflow alpha

* Use PyLint 2.6+ (#370)

* Fix missing default value (#373)

* Enable batch norm adaptation by default (#360)

* Remove immediate failure when trying to use NNCF with torch 1.5.0 (#372)

* Add pre post processing test  (#374)

* Fix missing default value

* Add pre_post processing tests

* Relax upper-bound threshold for mixed precision ResNet50 (#375)

* Use a reduced number of BN adaptation samples for sanity testing (#378)

* Dropped last data point in all DataLoaders to prevent issue with BN (#379)

There is a small chance that the last batch has a size of 1, which leads to an error:
```
ValueError: Expected more than 1 value per channel when training
```
We caught this error in sanity tests with CIFAR10: out of 1000 data points, there are 333 batches with batch_size=3 and a final batch with batch_size=1. Training may thus fail at the end of an epoch, which is unacceptable for bigger datasets.
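The guard itself is a one-line change on the loader side; a minimal sketch:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.randn(1000, 3, 32, 32), torch.randint(0, 10, (1000,)))
# drop_last=True discards the trailing size-1 batch that would break
# BatchNorm statistics in training mode.
loader = DataLoader(ds, batch_size=3, drop_last=True)  # 333 batches, none of size 1
```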

* Fix eval failures due to BN adaptation enabled by default (#377)

* Reduce BN adaptation samples count in HAWQ sanity configs (#380)

* Fix object detection sample. (#383)

* Added Q-DQ ONNX converting parameter (#369)

* Links to models were updated (#386)

* include_mask flag for tfds decoder was added (#385)

* include_mask flag for tfds decoder was added

* Support of the input_info param was added (#388)

* change VOC dataset namings (#387)

* Configure device by common function for all samples (#391)

* Reduced num_init_samples for range init to accelerate sanity tests (#392)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader) (#390)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Add pruned ssd300 and unet_mapillary (#393)

* Print flops pruning level in statistic (#367)

* Print flops pruning level in statistic

* Calculate current flops after update masks

* Fix: missed transpose convolution

* add test_calculation_of_flops

* Fix compute_flops_hook for nn.linear

* Add comment for compute_flops_hook

* Add AutoML-based mixed-precision initialization mode - AutoQ (#250)

* Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision

* Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and
  initialize layer-wise quantization bitwidths, prior to NNCF quantization-aware fine-tuning

* The mixed-precision initialization is optimized towards minimal accuracy drop given
  a user-specified model size constraint

* Supported precision depends on target HW (VPU 8/4/2) or user-specified precision space
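For orientation, requesting an AutoQ-style precision initializer might look like the config fragment below, written here as a Python dict (the field names are indicative assumptions; consult the AutoQ documentation for the exact schema):

```python
autoq_config_fragment = {
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            "precision": {
                "type": "autoq",            # AutoML-based mixed-precision init
                "compression_ratio": 0.15,  # user-specified model size constraint
            }
        },
    }
}
```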

* Fix path to unet_mapillary_pruning_geometric_median checkpoint (#397)

* Fix pruning l2norm (#310)

* Fix pruning l2norm

* Use register_module for l2norm

* Add filtering by algorithm for registered modules

* Add condition to add _registred_name in registered module

* resolve comments

* fix pylint

* Update reference dot files

* Separate the examples and test Python package requirements from NNCF (#384)

* converted relative imports to absolute imports (#396)

* Add ac configs for pruned unet and ssd300 (#399)

* Add ac configs for pruned unet and ssd300

* Add batch 32 for ssd300_vgg_voc_pruning_geometric_median

* Added proper license for DDPG-related code (#398)

* Add some explanations to make doc clearer (#395)

* Add some explanations to make doc clearer

* docs cleanup

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>

* Simplify paths to configs (#400)

* Path to config was fixed

* Paths to configs were simplified

* Add ssd_mobilenet_voc_sparsity_int8 config (#404)

* Use links to config files for NNCF READMEs (#407)

* Combined package (#410)

* beta.nncf package

* removed pytest.ini

* Return pandas to the list of requirements (#405)

* Remove NNCF package dependency on tensorboard (#411)

* Small scheduler fixes (#412)

* Add step to pruning schedulers and algo + delete redundant pruning rate setting

* Fix tests

* Revert same pruning rate changes

* Add pruning_init in test_calculation_of_flops

Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>

* [TF] Minor fixes (#403)

* Minor fixes

* Pylint issues were fixed

* Extra line was removed

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

* [TF] Add handling of non-distributed strategy (#401)

* Default strategy was added

* cpu-only flag was disabled for Mask R-CNN training

* Fixed non-distributed mode for the object detection sample

* Merging and pre hooks (#302)

* Add pre-hook functionality to quantization

* Add quantizer merging logic to the propagation mode

* Properly update and merge quantizers between quantizable layers

* Move adjacent quantizer group creation closer to the builder stage

* Store affected op node key in the propagating quantizer

* Refactor quantization to jointly quantize weights and activations

* Fix clearing constraint sets during liberal activation bitwidth assignment

* Add initial version of build-time range init

* Make HAWQ work with heterogeneous quantizer configurations

* Finalize the switch to build-time range init

* Properly compare quantizer configs for requantization purposes

* Fix quantizer ordering once again

* Improve HAWQ bitwidth reference graph formatting

* Add NNCF network clean view tests

* Fix errors

* Use statistics approach for the runtime range init

* Add tests for separate statistic collectors

* Extend range init setting tests

* Fix rebasing issues

* Switch AutoQ to setting compatible configs instead of bitwidths

* Ref HAWQ file adjustments after fixing experimental controller init

* Relax requirements packages versions (#415)

* using common registry (#414)

* fixed sanity tests for samples (#417)

* Common NNCFConfig (#413)

* using common config

* added jsonschema to requirements

* Fix third-party sanity tests (#420)

* Fix NoCompressionAlgorithmBuilder (#426)

* fixed issues with paths (#425)

* 00.0: Updating NNCF GitHub dockerfiles against the latest changes (#436)

* Change thresholds for pruned ssd300 (#435)

diff_fp32_min from -1.2 to -4.8

* Use one of the registered JSON meta-schemae (#439)

Fixes: #416

* Use non-recursive BFS for graph traversal (#440)

* Use non-recursive BFS for graph traversal

Python does not handle deep recursion stacks well.

* Use DFS by default, after all

* Add AC config for SSD300_mobilenet on voc. (#441)

* Minor fixes for HAWQ (#442)

Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the `dump_precision_init_data` option.
Corrected printing of the chosen bitwidth configuration.

* Init on same device by default (#438)

* Use model's own device for initialization by default

* Adjust init args documentation

* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices

* Use cuda for precision init tests

* Remove extra entries from MANIFEST.in (#452)

* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (#450)

* Changed working logic with json metrics (#447)

* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (#454)

* Fix metric value for ssd300_mobilenet_voc. (#453)

* Do not follow symlinks when opening files (#451)

* Correctly construct Q-DQ config for E2E tests (#456)

* Update documentation for the v1.6.0 release (#457)

* Add torch.load warnings and path resolution (#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>
vshampor added a commit that referenced this pull request Jan 29, 2021
* Fix input quantization in case of embeddings (#97)

* Added sanity tests for third party integration (#45)

* Expose quantizer linking through config (#100)

* Add citing section to frontpage README (#103)

* Fix bad rebase in asymmetric quantization ONNX export (#104)

* Use default quantizer configuration for op weights not specified in HW config (#105)

* Update transformers to v3.0.2 (#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (#109)

* Add unified scales in HW config operation (via quantizer linking) (#108)

* Add quantization metric (#33)

* Make HW config parsing conform to the implicit rules (#111)

(except for the "any supported quantization for the ops in config
without specified quantizations", because they need config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (#113)

* Use sequential sampling for evaluation across example scripts (#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (#115)

* Properly handle ops in HW config without quantization configs associated (#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (#121)

* make optional criterion in signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (#122)

* Fix pretrained (#125)

* Mark Convs as non-depthwise for 1 input channel case (#126)

* Add non-RELU activations to fusable patterns (#124)

* Fixed Pylint warnings (#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (#132)

* Add per layer initialization of  ranges. (#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (#138)

* Fix PyLint. (#139)

* Introduced compression ratio parameter for Mixed Precision init (#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing optimal mixed precision configuration for a given ratio.

Compression ratio of mixed precision quantization is calculated by relation to fully INT8 one.
Total compression for the model is sum of compression for each quantized layer, which is multiplication the layer's (Conv, Deconv, Linear) FLOPS and number of bits for its quantization. The ratio is used for estimation of performance boost for quantized model It's a better proxy for amount of calculation then number of parameters multiplied by bitwidth

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (#150)

* Set data loading workers to 0 across tests to force single process (#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (#166)

* Fix command line creation for test_compression_training (#167)

* Improve eval test code (#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (#158)

* Fix third_party_sanity and eval test bugs (#169)

* Fix mmdetection dataset search path for SSD (#176)

* Test stability (#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (#178)

* Fixed disabling gradients of quantizers for HAWQ (#184)

* Corrected default values in range initializers (#183)

- Right minimal and maximum values for mean_min_max doesn't skip check for not collected statistics and prevents from initializing by inf values.
- Percentile init doesn't crash by default

* Refactor imports in setup.py (#182)

Important for CI

* Fix security issues with imports (#185)

* Fix paths to COCO in mmdetection third party sanity tests (#186)

* Build graphs within the torch.no_grad() context (#187)

Should reduce memory usage during create_compressed_model

* Fix security issues directly in code (#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (#190)

* Make default install support non-GPU cases (#193)

* Fixed backward compatibility test (#195)

* Improve quantizer setup for hanging batchnorm nodes (#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

Will manifest itself in case there is a batch norm operation that
was not merged to any previous op, i.e. should accept quantized
input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (#171)

* Add more security fixes (#197)

* Removed double logging to stdout (#198)

* ignore frozen layers during filter pruning (#200)

* Use latest matplotlib version (#206)

* Use propagation based mode by default (#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize inputs  option.

* Add operator metatypes for 'sigmoid' and 'add' operator (#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Grouping of pruning modules + clusterisation classes

* Small fixes

* Introduced `enabled` parameter for Quantizers (#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Added model analysis file

* Update documentation (#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* Fixes for grads + batch norms

* To fix cpu_only part (#221)

* To update cpu_only part dockerfile; fix issue with setup.py install with --cpy-only opt; fix README.md

* apply remarks

* Fix register_operator (#224)

* Add per-layer sparsity. (#127)

* Do not call _quantize_inputs for propagation based mode (#229)

* Consistent bitwidth for activations and weight in propagation mode (#191)

* Added sota eval tests via AC (#142)

* Refactored HAWQ: split functionality into separate files (#232)

* Allow quantizing modules that share their weights for multiple operations (#235)

* Filter quantizers that directly act upon integer inputs (#228)

* Add support sparsity freeze epoch for magnitude sparsity. (#218)

* Liberal bitwidth assignment mode by default on precision initialization (#222)

* Fix AdaptiveSparsityScheduler. (#236)

* Fix threesigma init (#240)

* Build extensions in a temporary folder (#239)

* Refactoring + added step with model analysis

* Criterion generalization for HAWQ algorithm (#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* Fixes for pruning info

* fix quantization range initialization in case of 1 scale channel (#241)

fix quantization range initialization in case of 1 scale channel to avoid initialization only by single slice of data (data[0]) and ignoring the other (data[1], data[2],.....)

* Patch Semantic Segmentation Application to export onnx and test with resume flag (#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (#220)

* Fixed skip Openvino tests and preinstall (#246)

* Small cleanup + refactoring

* Corrected handling of barrier on the graph traverse (#249)

* Extend input handling flexibility (#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling Inception outputs in classification sample (#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range, Make formulas dependent only from one value - levels, thereby reducing the chance to make a mistake

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges

* step 1

* Fix bug with export to Q/DQ (#248)

* Fix bug with export to Q/DQ

Add hack of export processing for our old checkpoints
Add Exception raising for exporting per-channel Q/DQ layers, as PyTorch
ONNX exporting supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AssymetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AssymetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (#253)

* Update documentation for release v1.5.0 (#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

* Fixed old tests tests

* Add test for pruning groups checks

* Fix pylint + small cleanup

* More clarification about `bits` parameter in docs (#263)

* make customer happy to see param name that is wrong (#259)

* kernel chainges

* Add pruning sample tests. (#268)

* Change an operation order in create_compressed_model (#265)

* Introduce additional evaluation of loss function to SSD application

* Expanded table, skiped unsupported models (#234)

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Mlflow log (#243)

* mlflow logging

* something

* some changes

* Some fixes and clear up

* Symbolic link update

* Final Updates

* Little fixes

* Little fixes(one more)

* Test mlflow off

* Deleted hardcoded log dir

* Generalization

* Clear up

* Fixes

* code fixes

* Common classification functions carry out

* Metrics logging changes

* Fix comments

* Fix pylint

* Fix pylint

* Fix last linter warnings

* Cpu nms kernels replaced by torch func

* Extended test for model analysis

* Clean up

* Small pylint + comments fixes

* Fix gradients zeroing + prune batch norms by default

* Fix prune batch norm default

* Fix test

* is cuda

* Compress in eval mode (#257)

* Pruning of ConvTranspose (#274)

* Add pruning of ConvTranspose

* Rename to target_weight_dim_for_compression

* fixes

* Fix zero_grad

* get_op_types_of_pruned_modules

* Fixed collecting metrics.json for incomplete eval test (#279)

* Added Unet Mapillary AC configs (#281)

* Added flag for collection quickly computed stats (#287)

* Remove __getattr__ from SampleConfig (#292)

Newer `addict` version uses custom private attributes for internal
working and __getattr__ disrupted it. It was quite useless anyway.

* Fix H/W on an image in the mock coco dataset (#291)

* Set proper workdir path for Mask-RCNN (#294)

* Proper BN momentum parameter and train mode setting in BN adaptation (#288)

* proper BN momenta parameter and train mode setting in BN adaptation

* use training mode switcher context maganer for BN adaptation inference

* Testing OPs quantization by synthetic tests (#297)

Also
* Made LeakyRELU as input_quantizable OP
* Removed extra dot-files for ManyNonEvalModules test case

* Revised mixed-precision related content (#300)

* Moved mixed_precision configs to the separate folder
* Minimized the scope of parameters in this config removing as much as possible and let them be the defaults ones.

* Remove .dot extension in the HW config test case descriptor (#303)

* Switch to VOC2012 in eval mode (#295)

* Updated pruning configs and results (#305)

* Don't call MLFlow if it's not enabled (#304)

Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.

* Add input/output-names parameters to export_model function. (#296)

* Fixed paths to mixed-precision configs (#306)

* Correct configs for mixed precision models (#307)

After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings
That's why manual configs should be used to achieve a target accuracy

* Removed custom SqueezeNet model for better user experience (#308)

* Correct configs for mixed precision models

After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings
That's why manual configs should be used to achieve a target accuracy

* Removed custom SqueezeNet model for better user experience

Originally we had a modified copy of SqueezeNet model to workaround a bug in ONNX exporter with converting MaxPool with ceil_mode=True.
This bug isn't actual now for torch 1.5 and there's almost identical SqueezeNet model in torchivision > 0.6.
That's why custom SqueezeNet was deleted as not needed to remove confusion.

There's no changes in the corresponding NNCF graph.
Previously trained checkpoints for custom SqueezeNet can be loaded and evaluated with SqueezeNet from torchvision. INT8 model has the same accuracy, mixed model is differ only by ~0.01 in maximum.

* Added ResNet-18 magnitude Filter Pruning config and snapshot (#311)

* Added ResNet-18 magnitude Filter Pruning config and snapshot

* Adjusted checkpoint validation

* Move call epoch_step() method to begin of epoch. (#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename value of target_device from "NONE" to "TRIAL" (#314)

* Move call epoch_step() method to begin of epoch. (#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename target_device "NONE" to "TRIAL".

* Fix NMS CUDA extensions import for CPU only case (#316)

* Made initialization depending on the number of samples. (#309)

* Wrapped MLFlow for safe access (#313)

* Introduced a separate batch size for initialization (#315)

* Separate data_loader is registered for initialization via `register_default_init_args`

* WA for Python 3.6 on CI (#321)

* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (#322)

* Show subprocess log in test assertion stacktrace (#325)

* Adjust ICNet compressed target values (#326)

* Do not replace parameter during symmetric range init (#327)

The initialization via the controller method may occur *after*
the optimizer has received the list of the model's parameters, so replacing
the parameter as a whole during such initialization would break the
gradient updates.
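
The gist of the fix, as a minimal illustration (not the actual NNCF code):

```
import torch
import torch.nn as nn

quantizer = nn.Module()
quantizer.scale = nn.Parameter(torch.ones(1))
optimizer = torch.optim.SGD(quantizer.parameters(), lr=0.1)

# Replacing the Parameter object would detach it from the optimizer:
#   quantizer.scale = nn.Parameter(torch.tensor([2.0]))  # breaks gradient updates

# Updating the registered Parameter in place keeps the optimizer's reference valid:
with torch.no_grad():
    quantizer.scale.copy_(torch.tensor([2.0]))
```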

* Increase number of epochs in sanity test runs (#324)

Should uncover more bugs.

* Replace the rest of num_init_steps entries with num_init_samples (#328)

* Use PyTorch 1.7 (#223)

* Move epoch_step and step to the beginning of epoch for staged worker (#318)

* Use torch 1.7.0 for third party sanity tests (#333)

* Fix mixing of Cyrillic and Latin letters (#335)

* Fix statistics calculation in local-mode sparsity. (#337)

* Fk/update packages versions (#338)

* Add definitive versions of required packages, move to Python 3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only.fix01

* Update accuracy target values after switching to torch 1.7.0 (#334)

* Change tensorboardX to torch.utils.tensorboard (#332)

* Change tensorboardX to tensorboard

* Add tensorboard version
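
Usage after the switch is essentially a one-line import change; a minimal sketch (the log directory is illustrative):

```
from torch.utils.tensorboard import SummaryWriter  # instead of: from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir="runs/compression")
writer.add_scalar("compression/sparsity_level", 0.5, global_step=0)
writer.close()
```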

* Add domain in onnx-model for custom operations. (#323)

* Corrected grouping of activation quantizers (#339)

Unmerged activation FQs should be placed in different groups if an unmerged activation FQ on a branch directly follows another activation FQ (a common input for different branches).
```
start->FQ_A  Conv
        \   /
       POST_HOOK
         /    \
  PRE_HOOK    PRE_HOOK
    |           \
  div          MaxPool   here->|FQ_A|
                  \     /
                POST_HOOK
```

* Adjust thresholds due to new torchvision FP32 checkpoints acc. drop (#342)

* Changed AC configs for SSD models (#341)

* Revert "Fk/update packages versions (#338)" (#343)

This reverts commit 8c17e0c.

* Fk/update packages versions (#344)

* Add definitive versions of required packages, move to Python 3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only.fix01

* Add requirement for pandas

* Add definitive versions of required packages, move to Python 3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only.fix01

* Add requirement for pandas

* Fix mistake in tensorboard name

* Fix per-layer sparsity. Add stub scheduler. (#340)

* fix config path (#346)

* Add Embedding to the CPU HW config definition (#347)

* Added separate execution of OV tests to enable parallelization (#282)

* Remove no_empty_cache in an attempt to fix sporadic CI failures (#348)

* Add an option to optimize logarithms of quantizer scales instead of scales directly (#329)

* add a scale_log parameter for quantization; it allows increasing convergence speed for high scales and accuracy for low scales (see the sketch after this commit list)

* add _ prefix to make some variables "hidden"

* variant of setter for scale

* add setter for input_range for the asymmetric quantizer

* scale_log_flag is used outside to print status, so I've brought back .scale_log_flag instead of ._scale_log_flag

* made scale_log_flag read-only

* add test for the scale_log parameter

* Update test_scale_log.py

* add missing key check for load_state_dict; whitespace fixes

* remove quantizer.scale = torch.nn.Parameter() to avoid torch error

* fix test_unified_scales_are_identical_in_onnx fail due to unable to set Parameter by property

* remove useless init method

* split long line

* fix test_unified_scales

* Update test_scale_log.py

* update ref file by replace scale -> _scale_tensor

* Update README.md

* Update README.md

* Update layers.py

* fix HookAutoRemove

* Improvements

Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
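
A minimal sketch of the scale_log idea inferred from the commits above (names are illustrative; the real NNCF quantizer is more involved):

```
import torch
import torch.nn as nn

class LogScaleHolder(nn.Module):
    """Stores log(scale) as the trainable parameter instead of the scale itself."""

    def __init__(self, init_scale=1.0):
        super().__init__()
        # Gradient steps on _log_scale act multiplicatively on the scale, which
        # improves convergence for very high scales and accuracy for low ones.
        self._log_scale = nn.Parameter(torch.tensor(float(init_scale)).log())

    @property
    def scale(self):
        return self._log_scale.exp()
```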

* Fixed protobuf error (#349)

* Add quantization support for nn.EmbeddingBag (#330)

* Add quantization support for nn.EmbeddingBag

* Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT

* Add synthetic model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag

* Remove duplicated synthetic model test of nn.Embedding

* Add EmbeddingBag to the CPU HW config definition

* replace TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc

* Handle network input nodes to NNCFEmbeddingBag

* Fix pylint warnings
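
A hedged usage sketch: with this change, `nn.EmbeddingBag` weights become quantizable like other NNCF modules (the model and config path are illustrative, and the config would still need input info and init args as usual):

```
import torch.nn as nn
from nncf import NNCFConfig, create_compressed_model

model = nn.Sequential(nn.EmbeddingBag(num_embeddings=1000, embedding_dim=64))
nncf_config = NNCFConfig.from_json("int8_config.json")  # hypothetical config
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)
```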

* Vpu config revision (#356)

* Revised VPU config

* More extreme ratio for VPU config to test INT2 bitwidth assignment

Also updated reference graphs

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>

* Renamed case-sensitive files to prevent git issue on Windows (#357)

After checking out a fresh develop, git confuses files with and without a capital letter.
As a result such a file can't be discarded.
`git config --global core.ignorecase true` doesn't help either.

* Update mmdet patch (#354)

* Update mmdet patch

* Update configs and meta

* Add export tests

* Update test

* Update package installation

* Compression statistics before training (#345)

* Compression statistics before training

* Compression statistics before training

* print_statistics sanity test

* Object detection test fixes

* is_main_process aligning

* pylint disabling

* Pruning refactoring to work with FLOPs target too (#320)

* Added pruning_flops_target param and all necessary functions

* Added tests

* Pylint fixed

* Fixed comments: BatchNorm deleted from flops calculations and small refactoring

* Fix tests

* Delete bias from FLOPs calc + test reverting

* Fix bug with mmdet patch (#363)

* Fix bug with mmdet patch

* Fix bugs

* Fix pylint

* Added ONNX Q-DQ converting parameters (#362)

* Revert "Added ONNX Q-DQ converting parameters (#362)" (#368)

This reverts commit b0504e9.

* Beta directory (#364)

* create beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)

* update documentation

* updated checkpoint links

* nncf-tensorflow alpha

* Use PyLint 2.6+ (#370)

* Fix missing default value (#373)

* Enable batch norm adaptation by default (#360)

* Remove immediate failure when trying to use NNCF with torch 1.5.0 (#372)

* Add pre post processing test  (#374)

* Fix missing default value

* Add pre_post processing tests

* Relax upper-bound threshold for mixed precision ResNet50 (#375)

* Use a reduced number of BN adaptation samples for sanity testing (#378)

* Dropped last data point in all DataLoaders to prevent issue with BN (#379)

There is a small chance that the last data point has a batch size equal to 1, which leads to an error:
```
ValueError: Expected more than 1 value per channel when training
```
We caught this error in sanity tests with CIFAR10. The mock dataset has 1000 data points:
333 batches with batch_size=3 and a final one with batch_size=1. Training may fail at the end of an epoch, which is not acceptable for bigger datasets.
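
The fix boils down to dropping the incomplete final batch, e.g.:

```
from torch.utils.data import DataLoader

# train_dataset is illustrative; drop_last=True ensures BatchNorm never sees
# a single-sample batch in training mode.
train_loader = DataLoader(train_dataset, batch_size=3, shuffle=True, drop_last=True)
```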

* Fix eval failures due to BN adaptation enabled by default (#377)

* Reduce BN adaptation samples count in HAWQ sanity configs (#380)

* Fix object detection sample. (#383)

* Added Q-DQ ONNX converting parameter (#369)

* Links to models were updated (#386)

* include_mask flag for tfds decoder was added (#385)

* include_mask flag for tfds decoder was added

* Support of the input_info param was added (#388)

* change VOC dataset namings (#387)

* Configure device by common function for all samples (#391)

* Reduced num_init_samples for range init to accelerate sanity tests (#392)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader) (#390)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Add pruned ssd300 and unet_mapillary (#393)

* Print FLOPs pruning level in statistics (#367)

* Print FLOPs pruning level in statistics

* Calculate current FLOPs after updating masks

* Fix: missed transposed convolution

* add test_calculation_of_flops

* Fix compute_flops_hook for nn.Linear

* Add a comment for compute_flops_hook; a hedged sketch of such a hook follows this list
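
A sketch in that spirit (names and the exact formula are illustrative; biases and BatchNorm are excluded, matching #320):

```
import torch
import torch.nn as nn

flops = {}

def compute_flops_hook(module, inputs, output):
    # MACs per sample: output elements times per-element kernel work (bias excluded)
    out_elems = output.numel() // output.shape[0]
    kernel_ops = module.in_channels // module.groups
    for k in module.kernel_size:
        kernel_ops *= k
    flops[id(module)] = out_elems * kernel_ops

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1))  # toy model for illustration
for m in model.modules():
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        m.register_forward_hook(compute_flops_hook)

model(torch.randn(1, 3, 32, 32))  # hooks fire on forward
```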

* Add AutoML-based mixed-precision initialization mode - AutoQ (#250)

* Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision

* Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and
  initialize layer-wise quantization bitwidths prior to NNCF quantization-aware fine-tuning

* The mixed-precision initialization is optimized towards minimal accuracy drop given
  a user-specified model size constraint

* Supported precision depends on target HW (VPU 8/4/2) or user-specified precision space
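
An illustrative configuration fragment for this mode (the key names are assumptions based on the description above, not a verified schema for this NNCF version):

```
autoq_config_fragment = {
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            "precision": {
                "type": "autoq",            # assumed selector for the AutoQ mode
                "bits": [2, 4, 8],          # e.g. a VPU-like precision space
                "compression_ratio": 0.15,  # user-specified model size constraint
            }
        }
    }
}
```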

* Fix path to unet_mapillary_pruning_geometric_median checkpoint (#397)

* Fix pruning l2norm (#310)

* Fix pruning l2norm

* Use register_module for l2norm

* Add filtering by algorithm for registered modules

* Add condition to add _registred_name in registered module

* resolve comments

* fix pylint

* Update reference dot files

* Separate the examples and test Python package requirements from NNCF (#384)

* converted relative imports to absolute imports (#396)

* Add ac configs for pruned unet and ssd300 (#399)

* Add ac configs for pruned unet and ssd300

* Add batch 32 for ssd300_vgg_voc_pruning_geometric_median

* Added proper license for DDPG-related code (#398)

* Add some explanations to make doc clearer (#395)

* Add some explanations to make doc clearer

* docs cleanup

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>

* Simplify paths to configs (#400)

* Path to config was fixed

* Paths to configs were simplified

* Add ssd_mobilenet_voc_sparsity_int8 config (#404)

* Use links to config files for NNCF READMEs (#407)

* Combined package (#410)

* beta.nncf package

* removed pytest.ini

* Return pandas to the list of requirements (#405)

* Remove NNCF package dependency on tensorboard (#411)

* Small scheduler fixes (#412)

* Add step to pruning schedulers and algo + delete redundant pruning rate setting

* Fix tests

* Revert same pruning rate changes

* Add pruning_init in test_calculation_of_flops

Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>

* [TF] Minor fixes (#403)

* Minor fixes

* Pylint issues were fixed

* Extra line was removed

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

* [TF] Add handling of non-distributed strategy (#401)

* Default strategy was added

* cpu-only flag was disabled for Mask R-CNN training

* Fixed non-distributed mode for the object detection sample

* Merging and pre hooks (#302)

* Add pre-hook functionality to quantization

* Add quantizer merging logic to the propagation mode

* Properly update and merge quantizers between quantizable layers

* Move adjacent quantizer group creation closer to the builder stage

* Store affected op node key in the propagating quantizer

* Refactor quantization to jointly quantize weights and activations

* Fix clearing constraint sets during liberal activation bitwidth assignment

* Add initial version of build-time range init

* Make HAWQ work with heterogeneous quantizer configurations

* Finalize the switch to build-time range init

* Properly compare quantizer configs for requantization purposes

* Fix quantizer ordering once again

* Improve HAWQ bitwidth reference graph formatting

* Add NNCF network clean view tests

* Fix errors

* Use statistics approach for the runtime range init

* Add tests for separate statistic collectors

* Extend range init setting tests

* Fix rebasing issues

* Switch AutoQ to setting compatible configs instead of bitwidths

* Ref HAWQ file adjustments after fixing experimental controller init

* Relax required package versions (#415)

* using common registry (#414)

* fixed sanity tests for samples (#417)

* Common NNCFConfig (#413)

* using common config

* added jsonschema to requirements

* Fix third-party sanity tests (#420)

* Fix NoCompressionAlgorithmBuilder (#426)

* fixed issues with paths (#425)

* 00.0:Updating NNCF github dockerfiles against last changes (#436)

* Change thresholds for pruned ssd300 (#435)

diff_fp32_min from -1.2 to -4.8

* Use one of the registered JSON meta-schemae (#439)

Fixes: #416

* Use non-recursive BFS for graph traversal (#440)

* Use non-recursive BFS for graph traversal

Python does not handle deep recursion stacks well.

* Use DFS by default, after all
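
A minimal sketch of the non-recursive traversal idea: an explicit deque replaces the call stack, so graph depth no longer risks hitting Python's recursion limit, and switching `popleft()` to `pop()` turns the BFS into a DFS, matching the follow-up commit.

```
from collections import deque

def traverse(start_node, get_successors, visit):
    visited = {start_node}
    queue = deque([start_node])
    while queue:
        node = queue.popleft()  # popleft() -> BFS; pop() -> DFS
        visit(node)
        for succ in get_successors(node):
            if succ not in visited:
                visited.add(succ)
                queue.append(succ)
```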

* Add AC config for SSD300_mobilenet on voc. (#441)

* Minor fixes for HAWQ (#442)

Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the `dump_precision_init_data` option.
Corrected the printing of the chosen bitwidth configuration.

* Init on same device by default (#438)

* Use the model's own device for initialization by default

* Adjust init args documentation

* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices

* Use cuda for precision init tests
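
The core of "use the model's own device" can be sketched as follows (illustrative, not the actual NNCF helper):

```
import torch

def get_model_device(model: torch.nn.Module) -> torch.device:
    try:
        return next(model.parameters()).device
    except StopIteration:  # a parameter-less model: fall back to CPU
        return torch.device("cpu")
```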

* Remove extra entries from MANIFEST.in (#452)

* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (#450)

* Changed the logic of working with JSON metrics (#447)

* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (#454)

* Fix metric value for ssd300_mobilenet_voc. (#453)

* Do not follow symlinks when opening files (#451)

* Correctly construct Q-DQ config for E2E tests (#456)

* Update documentation for the v1.6.0 release (#457)

* Add torch.load warnings and path resolution (#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>
vshampor added a commit that referenced this pull request Jan 29, 2021
* Release v1.5.0 of NNCF to master (#254)

* Fixed Pylint warnings (#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (#132)

* Add per-layer initialization of ranges. (#116)

* Add prepare_for_export() to perform pre-export steps for CompressionAlgorithmController; update for CompositeCompressionAlgorithmController (#138)

* Fix PyLint. (#139)

* Introduced compression ratio parameter for Mixed Precision init (#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing the optimal mixed-precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to the fully INT8 one.
Total compression for the model is the sum of the compression for each quantized layer, which is the product of the layer's (Conv, Deconv, Linear) FLOPs and the number of bits used for its quantization. The ratio is used to estimate the performance boost of the quantized model; it is a better proxy for the amount of computation than the number of parameters multiplied by bitwidth.
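
In my own notation (an illustration of the sentence above, not a formula quoted from the code): with F_l the FLOPs of quantized layer l and b_l its bitwidth, the ratio relative to the fully INT8 model is

```
R = \frac{\sum_{l} F_l \, b_l}{\sum_{l} F_l \cdot 8}
```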

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (#150)

* Set data loading workers to 0 across tests to force single process (#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (#166)

* Fix command line creation for test_compression_training (#167)

* Improve eval test code (#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (#158)

* Fix third_party_sanity and eval test bugs (#169)

* Fix mmdetection dataset search path for SSD (#176)

* Test stability (#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (#178)

* Fixed disabling gradients of quantizers for HAWQ (#184)

* Corrected default values in range initializers (#183)

- Correct minimum and maximum values for mean_min_max no longer skip the check for uncollected statistics, which prevents initialization with inf values.
- Percentile init doesn't crash by default

* Refactor imports in setup.py (#182)

Important for CI

* Fix security issues with imports (#185)

* Fix paths to COCO in mmdetection third party sanity tests (#186)

* Build graphs within the torch.no_grad() context (#187)

Should reduce memory usage during create_compressed_model

* Fix security issues directly in code (#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (#190)

* Make default install support non-GPU cases (#193)

* Fixed backward compatibility test (#195)

* Improve quantizer setup for hanging batchnorm nodes (#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

Will manifest itself in case there is a batch norm operation that
was not merged to any previous op, i.e. should accept quantized
input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (#171)

* Add more security fixes (#197)

* Removed double logging to stdout (#198)

* ignore frozen layers during filter pruning (#200)

* Use latest matplotlib version (#206)

* Use propagation based mode by default (#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize inputs option.

* Add operator metatypes for 'sigmoid' and 'add' operator (#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Introduced `enabled` parameter for Quantizers (#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Update documentation (#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* To fix cpu_only part (#221)

* Update the cpu_only part of the dockerfile; fix issue with setup.py install with the --cpu-only opt; fix README.md

* apply remarks

* Fix register_operator (#224)

* Add per-layer sparsity. (#127)

* Do not call _quantize_inputs for propagation based mode (#229)

* Consistent bitwidth for activations and weight in propagation mode (#191)

* Added sota eval tests via AC (#142)

* Refactored HAWQ: split functionality into separate files (#232)

* Allow quantizing modules that share their weights for multiple operations (#235)

* Filter quantizers that directly act upon integer inputs (#228)

* Add support sparsity freeze epoch for magnitude sparsity. (#218)

* Liberal bitwidth assignment mode by default on precision initialization (#222)

* Fix AdaptiveSparsityScheduler. (#236)

* Fix threesigma init (#240)

* Build extensions in a temporary folder (#239)

* Criterion generalization for HAWQ algorithm (#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* fix quantization range initialization in case of 1 scale channel (#241)

fix quantization range initialization in the case of 1 scale channel, to avoid initializing only from a single slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...)

* Patch Semantic Segmentation Application to export onnx and test with resume flag (#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (#220)

* Fixed skip Openvino tests and preinstall (#246)

* Corrected handling of barrier on the graph traverse (#249)

* Extend input handling flexibility (#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling Inception outputs in classification sample (#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range; make formulas depend on only one value, levels, thereby reducing the chance of a mistake

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges

* Fix bug with export to Q/DQ (#248)

* Fix bug with export to Q/DQ

Add hack of export processing for our old checkpoints
Add Exception raising for exporting per-channel Q/DQ layers, as PyTorch
ONNX exporting supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (#253)

* Update documentation for release v1.5.0 (#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
vshampor added a commit that referenced this pull request Jun 22, 2021
* Release v1.6.0 of NNCF to master (#461)

* Grouping of pruning modules + clusterisation classes

* Small fixes

* Added model analysis file

* Fixes for grads + batch norms

* Refactoring + added step with model analysis

* Fixes for pruning info

* Small cleanup + refactoring

* step 1

* Fixed old tests

* Add test for pruning groups checks

* Fix pylint + small cleanup

* More clarification about `bits` parameter in docs (#263)

* make it clear to the user which param name is wrong (#259)

* kernel changes

* Add pruning sample tests. (#268)

* Change an operation order in create_compressed_model (#265)

* Introduce additional evaluation of loss function to SSD application

* Expanded table, skipped unsupported models (#234)

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Mlflow log (#243)

* mlflow logging

* something

* some changes

* Some fixes and clear up

* Symbolic link update

* Final Updates

* Little fixes

* Little fixes(one more)

* Test mlflow off

* Deleted hardcoded log dir

* Generalization

* Clear up

* Fixes

* code fixes

* Common classification functions moved out

* Metrics logging changes

* Fix comments

* Fix pylint

* Fix pylint

* Fix last linter warnings

* CPU NMS kernels replaced by a torch function

* Extended test for model analysis

* Clean up

* Small pylint + comments fixes

* Fix gradients zeroing + prune batch norms by default

* Fix prune batch norm default

* Fix test

* is cuda

* Compress in eval mode (#257)

* Pruning of ConvTranspose (#274)

* Add pruning of ConvTranspose

* Rename to target_weight_dim_for_compression

* fixes

* Fix zero_grad

* get_op_types_of_pruned_modules

* Fixed collecting metrics.json for incomplete eval test (#279)

* Added Unet Mapillary AC configs (#281)

* Added a flag for collecting quickly computed stats (#287)

* Remove __getattr__ from SampleConfig (#292)

Newer `addict` version uses custom private attributes for internal
working and __getattr__ disrupted it. It was quite useless anyway.

* Fix H/W on an image in the mock coco dataset (#291)

* Set proper workdir path for Mask-RCNN (#294)

* Proper BN momentum parameter and train mode setting in BN adaptation (#288)

* proper BN momenta parameter and train mode setting in BN adaptation

* use training mode switcher context maganer for BN adaptation inference

* Testing OPs quantization by synthetic tests (#297)

Also
* Made LeakyRELU as input_quantizable OP
* Removed extra dot-files for ManyNonEvalModules test case

* Revised mixed-precision related content (#300)

* Moved mixed_precision configs to the separate folder
* Minimized the scope of parameters in this config removing as much as possible and let them be the defaults ones.

* Remove .dot extension in the HW config test case descriptor (#303)

* Switch to VOC2012 in eval mode (#295)

* Updated pruning configs and results (#305)

* Don't call MLFlow if it's not enabled (#304)

Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.

* Add input/output-names parameters to export_model function. (#296)

* Fixed paths to mixed-precision configs (#306)

* Correct configs for mixed precision models (#307)

After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings
That's why manual configs should be used to achieve a target accuracy

* Removed custom SqueezeNet model for better user experience (#308)

* Correct configs for mixed precision models

After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings
That's why manual configs should be used to achieve a target accuracy

* Removed custom SqueezeNet model for better user experience

Originally we had a modified copy of SqueezeNet model to workaround a bug in ONNX exporter with converting MaxPool with ceil_mode=True.
This bug isn't actual now for torch 1.5 and there's almost identical SqueezeNet model in torchivision > 0.6.
That's why custom SqueezeNet was deleted as not needed to remove confusion.

There's no changes in the corresponding NNCF graph.
Previously trained checkpoints for custom SqueezeNet can be loaded and evaluated with SqueezeNet from torchvision. INT8 model has the same accuracy, mixed model is differ only by ~0.01 in maximum.

* Added ResNet-18 magnitude Filter Pruning config and snapshot (#311)

* Added ResNet-18 magnitude Filter Pruning config and snapshot

* Adjusted checkpoint validation

* Move call epoch_step() method to begin of epoch. (#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename value of target_device from "NONE" to "TRIAL" (#314)

* Move call epoch_step() method to begin of epoch. (#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename target_device "NONE" to "TRIAL".

* Fix NMS CUDA extensions import for CPU only case (#316)

* Made initialization depending on the number of samples. (#309)

* Wrapped MLFlow for safe access (#313)

* Introduced a separate batch size for initialization (#315)

* Separate data_loader is registered for initialization via `register_default_init_args`

* WA for Python 3.6 on CI (#321)

* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (#322)

* Show subprocess log in test assertion stacktrace (#325)

* Adjust ICNet compressed target values (#326)

* Do not replace  parameter during symmetric range init (#327)

The initialization using the controller method may occur *after*
the optimizer received the list of model's parameters, so replacing
the parameter as a whole during such initialization will break the
gradient updates.

* Increase number of epochs in sanity test runs (#324)

Should uncover more bugs.

* Replace the rest of num_init_steps entries with num_init_samples (#328)

* Use PyTorch 1.7 (#223)

* Move epoch_step and step to the beginning of epoch for staged worker (#318)

* Use torch 1.7.0 for third party sanity tests (#333)

* Fix mixing cyrillic and latin letters (#335)

* Fix calculate statistics in local mode sparsity. (#337)

* Fk/update packages versions (#338)

* Adding definitive version of required packages, move to python3.8, update ReadMe

* Add difinitive versions of packages only

* Add difinitive versions of packages only.fix01

* Update accuracy target values after switching to torch 1.7.0 (#334)

* Change tensorboardX to pytorch.utils.tensorboard (#332)

* Change tensorboardX to tensorboard

* Add tensorboard version

* Add domain in onnx-model for custom operations. (#323)

* Corrected grouping of activation quantizers (#339)

Not merged FQ's for activations should be in different groups, if unmerged activation FQ on the branch goes directly after another FQ for activation (common input for different branches).
start->FQ_A  Conv
        \   /
       POST_HOOK
         /    \
  PRE_HOOK    PRE_HOOK
    |           \
  div          MaxPool   here->|FQ_A|
                  \     /
                POST_HOOK

* Adjust thresholds due to new torchvision FP32 checkpoints acc. drop (#342)

* Changed AC configs for SSD models (#341)

* Revert "Fk/update packages versions (#338)" (#343)

This reverts commit 8c17e0c.

* Fk/update packages versions (#344)

* Adding definitive version of required packages, move to python3.8, update ReadMe

* Add difinitive versions of packages only

* Add difinitive versions of packages only.fix01

* Add to requiremet for pandas

* Adding definitive version of required packages, move to python3.8, update ReadMe

* Add difinitive versions of packages only

* Add difinitive versions of packages only.fix01

* Add to requiremet for pandas

* Fix mistake in tensorboard name

* Fix per-layer sparsity. Add stub scheduler. (#340)

* fix config path (#346)

* Add Embedding to the CPU HW config definition (#347)

* Added separate execution of OV tests to start parallelizing (#282)

* Remove no_empty_cache in an attempt to fix sporadic CI failures (#348)

* Add an option to optimize logarithms of quantizer scales instead of scales directly (#329)

* add scale_log parameter for quantization; it allows increasing convergence speed for high scales and improving accuracy for low scales.
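
The idea, as a minimal sketch (class and attribute names are illustrative, not NNCF's actual implementation): the trainable parameter stores log(scale), and the effective scale is recovered by exponentiation, which keeps the scale positive and makes gradient steps multiplicative in scale space.

```python
import torch
import torch.nn as nn

class LogScale(nn.Module):
    """Stores log(scale) as the trainable parameter instead of scale itself."""

    def __init__(self, init_scale: float = 1.0):
        super().__init__()
        self._scale_log = nn.Parameter(torch.log(torch.tensor(float(init_scale))))

    @property
    def scale(self) -> torch.Tensor:
        # exponentiate to recover the effective (always positive) scale
        return torch.exp(self._scale_log)

    @scale.setter
    def scale(self, value):
        # keep the stored parameter consistent when scale is assigned directly
        with torch.no_grad():
            self._scale_log.copy_(torch.log(torch.as_tensor(float(value))))
```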

* add _ to make some variable "hidden"

* variant of setter for scale

* add setter for input_range for asymmetric quantizer

* scale_log_flag is used outside to print status, so I've brought back .scale_log_flag instead of ._scale_log_flag

* made scale_log_flag read only

* add test for scale_log parameter.

* Update test_scale_log.py

* add missing key check for load_state_dict; whitespace fixes

* remove quantizer.scale = torch.nn.Parameter() to avoid torch error

* fix test_unified_scales_are_identical_in_onnx fail due to unable to set Parameter by property

* remove useless init method

* split long line

* fix test_unified_scales

* Update test_scale_log.py

* update ref file by replacing scale -> _scale_tensor

* Update README.md

* Update README.md

* Update layers.py

* fix HookAutoRemove

* Improvements

Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

* Fixed protobuf error (#349)

* Add quantization support for nn.EmbeddingBag (#330)

* Add quantization support for nn.EmbeddingBag

* Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT

* Add synthetic model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag

* Remove duplicated synthetic model test of nn.Embedding

* Add EmbeddingBag to the CPU HW config definition

* replace TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc

* Handle network input nodes to NNCFEmbeddingBag

* Fix pylint warnings

* Vpu config revision (#356)

* Revised VPU config

* More extreme ratio for VPU config to test INT2 bitwidth assignment

Also updated reference graphs

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>

* Renamed case-sensitive files to prevent git issue on Windows (#357)

After checking out a fresh develop, git confuses files with/without a capital letter.
As a result such a file can't be discarded.
`git config --global core.ignorecase true` doesn't help either

* Update mmdet patch (#354)

* Update mmdet patch

* Update configs and meta

* Add export tests

* Update test

* Update package installation

* Compression statistics before training (#345)

* Compression statistics before training

* Compression statistics before training

* print_statistics sanity test

* Object detection test fixes

* is_main_process aligning

* pylint disabling

* Pruning refactoring to work with FLOPs target too (#320)

* Added pruning_flops_target param and all necessary functions

* Added tests

* Pylint fixed

* Fixed comments: BatchNorm deleted from flops calculations and small refactoring

* Fix tests

* Delete bias from FLOPs calc + test reverting
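
A hypothetical config fragment for the FLOPs-driven mode described above (the `pruning_flops_target` name comes from the commits; the surrounding schema is an assumption):

```python
# prune until roughly 50% of the original FLOPs are removed,
# instead of targeting a share of pruned filters
nncf_config_dict = {
    "compression": {
        "algorithm": "filter_pruning",
        "params": {
            "pruning_flops_target": 0.5,
        },
    }
}
```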

* Fix bug with mmdet patch (#363)

* Fix bug with mmdet patch

* Fix bugs

* Fix pylint

* Added ONNX Q-DQ converting parameters (#362)

* Revert "Added ONNX Q-DQ converting parameters (#362)" (#368)

This reverts commit b0504e9.

* Beta directory (#364)

* create beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)

* update documentation

* updated checkpoint links

* nncf-tensorflow alpha

* Use PyLint 2.6+ (#370)

* Fix missing default value (#373)

* Enable batch norm adaptation by default (#360)

* Remove immediate failure when trying to use NNCF with torch 1.5.0 (#372)

* Add pre post processing test  (#374)

* Fix missing default value

* Add pre_post processing tests

* Relax upper-bound threshold for mixed precision ResNet50 (#375)

* Use a reduced number of BN adaptation samples for sanity testing (#378)

* Dropped last data point in all DataLoaders to prevent issue with BN (#379)

There is a small chance that the last data point has a batch size of 1, which leads to an error:
```
ValueError: Expected more than 1 value per channel when training
```
We caught this error in sanity tests with CIFAR10. The dataset has 1000 data points:
there are 333 data points with batch_size=3 and the last one with batch_size=1. Training may fail at the end of an epoch, which is not acceptable for bigger datasets.
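
In PyTorch terms the fix amounts to the DataLoader's `drop_last` flag, e.g.:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1000, 3, 32, 32), torch.randint(0, 10, (1000,)))

# drop the final incomplete batch so BatchNorm never sees a batch of size 1
train_loader = DataLoader(dataset, batch_size=3, shuffle=True, drop_last=True)
```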

* Fix eval failures due to BN adaptation enabled by default (#377)

* Reduce BN adaptation samples count in HAWQ sanity configs (#380)

* Fix object detection sample. (#383)

* Added Q-DQ ONNX converting parameter (#369)

* Links to models were updated (#386)

* include_mask flag for tfds decoder was added (#385)

* include_mask flag for tfds decoder was added

* Support of the input_info param was added (#388)

* change VOC dataset namings (#387)

* Configure device by common function for all samples (#391)

* Reduced num_init_samples for range init to accelerate sanity tests (#392)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader) (#390)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Add pruned ssd300 and unet_mapillary (#393)

* Print flops pruning level in statistic (#367)

* Print flops pruning level in statistic

* Calculate current flops after update masks

* Fix: missed transpose convolution

* add test_calculation_of_flops

* Fix compute_flops_hook for nn.Linear

* Add comment for compute_flops_hook
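
A rough sketch of what such a forward hook might compute (the hook name mirrors the commits, but the implementation here is illustrative; the formulas are the standard MAC counts):

```python
import torch
import torch.nn as nn

flops_per_module = {}

def compute_flops_hook(module, inputs, output):
    # MAC counts for the common weighted layers; a transposed convolution
    # would use the *input* spatial size instead of the output one
    if isinstance(module, nn.Linear):
        flops_per_module[module] = module.in_features * module.out_features
    elif isinstance(module, nn.Conv2d):
        out_h, out_w = output.shape[2:]
        kernel_ops = (module.in_channels // module.groups
                      * module.kernel_size[0] * module.kernel_size[1])
        flops_per_module[module] = kernel_ops * module.out_channels * out_h * out_w

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.Flatten(), nn.Linear(8 * 32 * 32, 10))
handles = [m.register_forward_hook(compute_flops_hook) for m in model.modules()]
model(torch.randn(1, 3, 32, 32))
print(sum(flops_per_module.values()))
for h in handles:
    h.remove()
```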

* Add AutoML-based mixed-precision initialization mode - AutoQ (#250)

* Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision

* Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and
  initialize layer-wise quantization bitwidths prior to NNCF quantization-aware fine-tuning

* The mixed-precision initialization is optimized towards minimal accuracy drop given
  a user-specified model size constraint

* Supported precision depends on target HW (VPU 8/4/2) or user-specified precision space
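
A hypothetical config fragment for enabling this mode (the `"type": "autoq"` initializer and the parameter names are assumptions inferred from the description above):

```python
nncf_config_dict = {
    "target_device": "VPU",  # supported precisions are taken from the target HW
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            "precision": {
                "type": "autoq",            # DDPG-driven mixed-precision search
                "compression_ratio": 0.15,  # user-specified model-size constraint
            }
        },
    },
}
```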

* Fix path to unet_mapillary_pruning_geometric_median checkpoint (#397)

* Fix pruning l2norm (#310)

* Fix pruning l2norm

* Use register_module for l2norm

* Add filter by algorithms for registered modules

* Add condition to add _registred_name in registered module

* resolve comments

* fix pylint

* Update reference dot files

* Separate the examples and test Python package requirements from NNCF (#384)

* converted relative imports to absolute imports (#396)

* Add ac configs for pruned unet and ssd300 (#399)

* Add ac configs for pruned unet and ssd300

* Add batch 32 for ssd300_vgg_voc_pruning_geometric_median

* Added proper license for DDPG-related code (#398)

* Add some explanations to make doc clearer (#395)

* Add some explanations to make doc clearer

* docs cleanup

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>

* Simplify paths to configs (#400)

* Path to config was fixed

* Paths to configs were simplified

* Add ssd_mobilenet_voc_sparsity_int8 config (#404)

* Use links to config files for NNCF READMEs (#407)

* Combined package (#410)

* beta.nncf package

* removed pytest.ini

* Return pandas to the list of requirements (#405)

* Remove NNCF package dependency on tensorboard (#411)

* Small scheduler fixes (#412)

* Add step to pruning schedulers and algo + delete redundant pruning rate setting

* Fix tests

* Revert same pruning rate changes

* Add pruning_init in test_calculation_of_flops

Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>

* [TF] Minor fixes (#403)

* Minor fixes

* Pylint issues were fixed

* Extra line was removed

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

* [TF] Add handling of non-distributed strategy (#401)

* Default strategy was added

* cpu-only flag was disabled for Mask R-CNN training

* Fixed non-distributed mode for the object detection sample

* Merging and pre hooks (#302)

* Add pre-hook functionality to quantization

* Add quantizer merging logic to the propagation mode

* Properly update and merge quantizers between quantizable layers

* Move adjacent quantizer group creation closer to the builder stage

* Store affected op node key in the propagating quantizer

* Refactor quantization to jointly quantize weights and activations

* Fix clearing constraint sets during liberal activation bitwidth assignment

* Add initial version of build-time range init

* Make HAWQ work with heterogeneous quantizer configurations

* Finalize the switch to build-time range init

* Properly compare quantizer configs for requantization purposes

* Fix quantizer ordering once again

* Improve HAWQ bitwidth reference graph formatting

* Add NNCF network clean view tests

* Fix errors

* Use statistics approach for the runtime range init

* Add tests for separate statistic collectors

* Extend range init setting tests

* Fix rebasing issues

* Switch AutoQ to setting compatible configs instead of bitwidths

* Ref HAWQ file adjustments after fixing experimental controller init

* Relax requirements packages versions (#415)

* using common registry (#414)

* fixed sanity tests for samples (#417)

* Common NNCFConfig (#413)

* using common config

* added jsonschema to requirements

* Fix third-party sanity tests (#420)

* Fix NoCompressionAlgorithmBuilder (#426)

* fixed issues with paths (#425)

* 00.0: Updating NNCF GitHub dockerfiles against the latest changes (#436)

* Change thresholds for pruned ssd300 (#435)

diff_fp32_min from -1.2 to -4.8

* Use one of the registered JSON meta-schemae (#439)

Fixes: #416

* Use non-recursive BFS for graph traversal (#440)

* Use non-recursive BFS for graph traversal

Python does not handle deep recursion stacks well.

* Use DFS by default, after all
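
An iterative traversal along these lines, with an explicit deque so deep graphs cannot overflow Python's recursion limit (the graph representation is illustrative): popping from the left gives BFS, popping from the right gives DFS.

```python
from collections import deque

def traverse(graph, start, bfs=True):
    """Iterative traversal; `graph` maps a node to a list of successor nodes."""
    visited, order = {start}, []
    worklist = deque([start])
    while worklist:
        node = worklist.popleft() if bfs else worklist.pop()
        order.append(node)
        for succ in graph.get(node, []):
            if succ not in visited:
                visited.add(succ)
                worklist.append(succ)
    return order

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
assert traverse(graph, "a", bfs=True) == ["a", "b", "c", "d"]
```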

* Add AC config for SSD300_mobilenet on voc. (#441)

* Minor fixes for HAWQ (#442)

Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the `dump_precision_init_data` option.
Corrected printing of the chosen bitwidth configuration.

* Init on same device by default (#438)

* Use model's own device for initialization by default

* Adjust init args documentation

* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices

* Use cuda for precision init tests

* Remove extra entries from MANIFEST.in (#452)

* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (#450)

* Changed the logic for working with JSON metrics (#447)

* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (#454)
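
For context, registering a custom weighted module so NNCF quantizes its weight might look roughly like this (the decorator usage and import path are a sketch; the Conv1D definition mimics the transformers-style layer):

```python
import torch
import torch.nn as nn
from nncf import register_module  # import path is an assumption for this NNCF version

@register_module()
class Conv1D(nn.Module):
    """A transformers-style Conv1D layer whose `weight` NNCF should quantize."""

    def __init__(self, nf: int, nx: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(nx, nf))
        self.bias = nn.Parameter(torch.zeros(nf))
        nn.init.normal_(self.weight, std=0.02)

    def forward(self, x):
        size_out = x.size()[:-1] + (self.weight.size(-1),)
        return (x.view(-1, x.size(-1)) @ self.weight + self.bias).view(size_out)
```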

* Fix metric value for ssd300_mobilenet_voc. (#453)

* Do not follow symlinks when opening files (#451)

* Correctly construct Q-DQ config for E2E tests (#456)

* Update documentation for the v1.6.0 release (#457)

* Add torch.load warnings and path resolution (#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>

* accuracy aware draft

* refactor to introduce TrainingRunner for training loop control

* move accuracy aware loop to common

* address comments

* update accuracy aware

* add support for TF samples

* refactor keras API sample

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>
kshpv pushed a commit to kshpv/nncf that referenced this pull request Oct 11, 2022
kshpv added a commit to kshpv/nncf that referenced this pull request Oct 11, 2022
* Release v1.5.0 of NNCF to master (openvinotoolkit#254)

* Allow sharing activation quantizers in different graph points (openvinotoolkit#67)

* Update version and docs on develop (openvinotoolkit#77)

* Update 3rd party integration patches (openvinotoolkit#79)

* Doc updates (openvinotoolkit#84)

* Add info on export to Usage.md

* Fix third party headers

* Fix import in transformers patch (openvinotoolkit#85)

* Fix percentile per-channel init (openvinotoolkit#86)

Fixes: openvinotoolkit#83

* Omit nodes called during debugging from entering NNCF graph (openvinotoolkit#87)

* Enable custom range initializers for overridden scopes in schema (openvinotoolkit#89)

* Enable custom quantization configs and initializers for overridden scopes in schema

* code style

* remove range config duplication

* obsolete import

* Fix model saving in transformers patch (openvinotoolkit#91)

* Patch TracedTensor's __repr__ method instead of torch.Tensor's (openvinotoolkit#92)

* Fix mmdetection patch (openvinotoolkit#93)

* Update mmdetection patch to v2.3.0 (openvinotoolkit#95)

* Allow registering user modules as NNCF modules for weight quantization (openvinotoolkit#99)

* Assign latest tensor shape during ForwardTraceOnly() (openvinotoolkit#96)

* Enable GPT2 ops (openvinotoolkit#98)

* Fix HW config scenario with ops missing in HW config definition (openvinotoolkit#94)

* Fix input quantization in case of embeddings (openvinotoolkit#97)

* Added sanity tests for third party integration (openvinotoolkit#45)

* Expose quantizer linking through config (openvinotoolkit#100)

* Add citing section to frontpage README (openvinotoolkit#103)

* Fix bad rebase in asymmetric quantization ONNX export (openvinotoolkit#104)

* Use default quantizer configuration for op weights not specified in HW config (openvinotoolkit#105)

* Update transformers to v3.0.2 (openvinotoolkit#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (openvinotoolkit#109)

* Add unified scales in HW config operation (via quantizer linking) (openvinotoolkit#108)

* Add quantization metric (openvinotoolkit#33)

* Make HW config parsing conform to the implicit rules (openvinotoolkit#111)

(except for the "any supported quantization for the ops in config
without specified quantizations", because they need config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (openvinotoolkit#113)

* Use sequential sampling for evaluation across example scripts (openvinotoolkit#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (openvinotoolkit#115)

* Properly handle ops in HW config without quantization configs associated (openvinotoolkit#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (openvinotoolkit#121)

* make optional criterion in signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (openvinotoolkit#122)

* Fix pretrained (openvinotoolkit#125)

* Mark Convs as non-depthwise for 1 input channel case (openvinotoolkit#126)

* Add non-RELU activations to fusable patterns (openvinotoolkit#124)

* Fixed Pylint warnings (openvinotoolkit#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (openvinotoolkit#132)

* Add per layer initialization of  ranges. (openvinotoolkit#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (openvinotoolkit#138)

* Fix PyLint. (openvinotoolkit#139)

* Introduced compression ratio parameter for Mixed Precision init (openvinotoolkit#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing the optimal mixed-precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to the fully INT8 one.
Total compression for the model is the sum of compression over each quantized layer, computed by multiplying the layer's (Conv, Deconv, Linear) FLOPS by the number of bits used for its quantization. The ratio is used to estimate the performance boost of the quantized model; it's a better proxy for the amount of computation than the number of parameters multiplied by bitwidth
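
In code, the described ratio reduces to something like the following sketch (per-layer numbers are purely illustrative; whether the ratio is defined as INT8/mixed or its inverse is a convention not asserted here):

```python
# (flops, bits) per quantized layer
layers = [(1.2e9, 8), (0.8e9, 4), (0.5e9, 2)]

int8_complexity = sum(flops * 8 for flops, _ in layers)
mixed_complexity = sum(flops * bits for flops, bits in layers)

# with this definition, values above 1.0 mean the mixed-precision
# configuration is "cheaper" than the fully INT8 one
compression_ratio = int8_complexity / mixed_complexity
print(f"compression ratio vs INT8: {compression_ratio:.2f}")
```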

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (openvinotoolkit#150)

* Set data loading workers to 0 across tests to force single process (openvinotoolkit#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (openvinotoolkit#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (openvinotoolkit#166)

* Fix command line creation for test_compression_training (openvinotoolkit#167)

* Improve eval test code (openvinotoolkit#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (openvinotoolkit#158)

* Fix third_party_sanity and eval test bugs (openvinotoolkit#169)

* Fix mmdetection dataset search path for SSD (openvinotoolkit#176)

* Test stability (openvinotoolkit#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (openvinotoolkit#178)

* Fixed disabling gradients of quantizers for HAWQ (openvinotoolkit#184)

* Corrected default values in range initializers (openvinotoolkit#183)

- Correct minimum and maximum values for mean_min_max no longer skip the check for uncollected statistics, which prevents initializing with inf values.
- Percentile init doesn't crash by default

* Refactor imports in setup.py (openvinotoolkit#182)

Important for CI

* Fix security issues with imports (openvinotoolkit#185)

* Fix paths to COCO in mmdetection third party sanity tests (openvinotoolkit#186)

* Build graphs within the torch.no_grad() context (openvinotoolkit#187)

Should reduce memory usage during create_compressed_model
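
The gist of the change, as a pattern (the function and names are illustrative): tracing the model under `torch.no_grad()` skips autograd bookkeeping, so no activations are kept alive for a backward pass that never happens.

```python
import torch
import torch.nn as nn

def build_graph(model: nn.Module, dummy_input: torch.Tensor):
    # only the sequence of executed ops matters here, not gradients
    with torch.no_grad():
        model(dummy_input)
    # ...the recorded op sequence would be assembled into a graph here

build_graph(nn.Linear(4, 2), torch.randn(1, 4))
```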

* Fix security issues directly in code (openvinotoolkit#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (openvinotoolkit#190)

* Make default install support non-GPU cases (openvinotoolkit#193)

* Fixed backward compatibility test (openvinotoolkit#195)

* Improve quantizer setup for hanging batchnorm nodes (openvinotoolkit#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

Will manifest itself in case there is a batch norm operation that
was not merged into any previous op, i.e. one that should accept
quantized input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (openvinotoolkit#171)

* Add more security fixes (openvinotoolkit#197)

* Removed double logging to stdout (openvinotoolkit#198)

* ignore frozen layers during filter pruning (openvinotoolkit#200)

* Use latest matplotlib version (openvinotoolkit#206)

* Use propagation based mode by default (openvinotoolkit#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize inputs  option.

* Add operator metatypes for 'sigmoid' and 'add' operator (openvinotoolkit#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Introduced `enabled` parameter for Quantizers (openvinotoolkit#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Update documentation (openvinotoolkit#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* To fix cpu_only part (openvinotoolkit#221)

* To update the cpu_only part of the dockerfile; fix issue with setup.py install with the --cpu-only opt; fix README.md

* apply remarks

* Fix register_operator (openvinotoolkit#224)

* Add per-layer sparsity. (openvinotoolkit#127)

* Do not call _quantize_inputs for propagation based mode (openvinotoolkit#229)

* Consistent bitwidth for activations and weight in propagation mode (openvinotoolkit#191)

* Added sota eval tests via AC (openvinotoolkit#142)

* Refactored HAWQ: split functionality into separate files (openvinotoolkit#232)

* Allow quantizing modules that share their weights for multiple operations (openvinotoolkit#235)

* Filter quantizers that directly act upon integer inputs (openvinotoolkit#228)

* Add support sparsity freeze epoch for magnitude sparsity. (openvinotoolkit#218)

* Liberal bitwidth assignment mode by default on precision initialization (openvinotoolkit#222)

* Fix AdaptiveSparsityScheduler. (openvinotoolkit#236)

* Fix threesigma init (openvinotoolkit#240)

* Build extensions in a temporary folder (openvinotoolkit#239)

* Criterion generalization for HAWQ algorithm (openvinotoolkit#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* fix quantization range initialization in case of 1 scale channel (openvinotoolkit#241)

fix quantization range initialization in the case of 1 scale channel, to avoid initializing from only a single slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...)

* Patch Semantic Segmentation Application to export onnx and test with resume flag (openvinotoolkit#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (openvinotoolkit#220)

* Fixed skip Openvino tests and preinstall (openvinotoolkit#246)

* Corrected handling of barrier on the graph traverse (openvinotoolkit#249)

* Extend input handling flexibility (openvinotoolkit#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling Inception outputs in classification sample (openvinotoolkit#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (openvinotoolkit#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range; make formulas dependent on only one value (levels), thereby reducing the chance of a mistake

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges
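
A sketch of deriving the level bounds from the single `levels` value for a symmetric quantizer (the concrete convention is an assumption: levels=256 mapping to [-128, 127] when signed and [0, 255] otherwise):

```python
def calculate_level_ranges(levels: int, signed: bool = True):
    """Derive integer level bounds from the single `levels` value."""
    if signed:
        level_low = -(levels // 2)         # 256 -> -128
        level_high = levels // 2 - 1       # 256 ->  127
    else:
        level_low, level_high = 0, levels - 1  # 256 -> [0, 255]
    return level_low, level_high

assert calculate_level_ranges(256) == (-128, 127)
assert calculate_level_ranges(256, signed=False) == (0, 255)
```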

* Fix bug with export to Q/DQ (openvinotoolkit#248)

* Fix bug with export to Q/DQ

Add a hack for export processing of our old checkpoints.
Add exception raising for exporting per-channel Q/DQ layers, as PyTorch
ONNX export supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (openvinotoolkit#253)

* Update documentation for release v1.5.0 (openvinotoolkit#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (openvinotoolkit#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

* Add AC config for SSD300_mobilenet on voc. (openvinotoolkit#441)

* Minor fixes for HAWQ (openvinotoolkit#442)

Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the `dump_precision_init_data` option.
Corrected printing of the chosen bitwidth configuration.

* Init on same device by default (openvinotoolkit#438)

* Use model's own device for initialization by default

* Adjust init args documentation

* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices

* Use cuda for precision init tests

* Remove extra entries from MANIFEST.in (openvinotoolkit#452)

* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (openvinotoolkit#450)

* Changed the logic for working with JSON metrics (openvinotoolkit#447)

* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (openvinotoolkit#454)

* Fix metric value for ssd300_mobilenet_voc. (openvinotoolkit#453)

* Do not follow symlinks when opening files (openvinotoolkit#451)

* Correctly construct Q-DQ config for E2E tests (openvinotoolkit#456)

* Update documentation for the v1.6.0 release (openvinotoolkit#457)

* Add torch.load warnings and path resolution (openvinotoolkit#458)

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
kshpv added a commit to kshpv/nncf that referenced this pull request Oct 11, 2022
* Release v1.5.0 of NNCF to master (openvinotoolkit#254)

* Allow sharing activation quantizers in different graph points (openvinotoolkit#67)

* Update version and docs on develop (openvinotoolkit#77)

* Update 3rd party integration patches (openvinotoolkit#79)

* Doc updates (openvinotoolkit#84)

* Add info on export to Usage.md

* Fix third party headers

* Fix import in transformers patch (openvinotoolkit#85)

* Fix percentile per-channel init (openvinotoolkit#86)

Fixes: openvinotoolkit#83

* Omit nodes called during debugging from entering NNCF graph (openvinotoolkit#87)

* Enable custom range initializers for overridden scopes in schema (openvinotoolkit#89)

* Enable custom quantization configs and initializers for overridden scopes in schema

* code style

* remove range config duplication

* obsolete import

* Fix model saving in transformers patch (openvinotoolkit#91)

* Patch TracedTensor's __repr__ method instead of torch.Tensor's (openvinotoolkit#92)

* Fix mmdetection patch (openvinotoolkit#93)

* Update mmdetection patch to v2.3.0 (openvinotoolkit#95)

* Allow registering user modules as NNCF modules for weight quantization (openvinotoolkit#99)

* Assign latest tensor shape during ForwardTraceOnly() (openvinotoolkit#96)

* Enable GPT2 ops (openvinotoolkit#98)

* Fix HW config scenario with ops missing in HW config definition (openvinotoolkit#94)

* Fix input quantization in case of embeddings (openvinotoolkit#97)

* Added sanity tests for third party integration (openvinotoolkit#45)

* Expose quantizer linking through config (openvinotoolkit#100)

* Add citing section to frontpage README (openvinotoolkit#103)

* Fix bad rebase in asymmetric quantization ONNX export (openvinotoolkit#104)

* Use default quantizer configuration for op weights not specified in HW config (openvinotoolkit#105)

* Update transformers to v3.0.2 (openvinotoolkit#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (openvinotoolkit#109)

* Add unified scales in HW config operation (via quantizer linking) (openvinotoolkit#108)

* Add quantization metric (openvinotoolkit#33)

* Make HW config parsing conform to the implicit rules (openvinotoolkit#111)

(except for the "any supported quantization for the ops in config
without specified quantizations", because they need config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (openvinotoolkit#113)

* Use sequential sampling for evaluation across example scripts (openvinotoolkit#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (openvinotoolkit#115)

* Properly handle ops in HW config without quantization configs associated (openvinotoolkit#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (openvinotoolkit#121)

* make optional criterion in signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (openvinotoolkit#122)

* Fix pretrained (openvinotoolkit#125)

* Mark Convs as non-depthwise for 1 input channel case (openvinotoolkit#126)

* Add non-RELU activations to fusable patterns (openvinotoolkit#124)

* Fixed Pylint warnings (openvinotoolkit#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (openvinotoolkit#132)

* Add per layer initialization of  ranges. (openvinotoolkit#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (openvinotoolkit#138)

* Fix PyLint. (openvinotoolkit#139)

* Introduced compression ratio parameter for Mixed Precision init (openvinotoolkit#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing the optimal mixed-precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to the fully INT8 one.
Total compression for the model is the sum of compression over each quantized layer, computed by multiplying the layer's (Conv, Deconv, Linear) FLOPS by the number of bits used for its quantization. The ratio is used to estimate the performance boost of the quantized model; it's a better proxy for the amount of computation than the number of parameters multiplied by bitwidth

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (openvinotoolkit#150)

* Set data loading workers to 0 across tests to force single process (openvinotoolkit#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (openvinotoolkit#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (openvinotoolkit#166)

* Fix command line creation for test_compression_training (openvinotoolkit#167)

* Improve eval test code (openvinotoolkit#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (openvinotoolkit#158)

* Fix third_party_sanity and eval test bugs (openvinotoolkit#169)

* Fix mmdetection dataset search path for SSD (openvinotoolkit#176)

* Test stability (openvinotoolkit#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (openvinotoolkit#178)

* Fixed disabling gradients of quantizers for HAWQ (openvinotoolkit#184)

* Corrected default values in range initializers (openvinotoolkit#183)

- Correct minimum and maximum values for mean_min_max no longer skip the check for uncollected statistics, which prevents initializing with inf values.
- Percentile init doesn't crash by default

* Refactor imports in setup.py (openvinotoolkit#182)

Important for CI

* Fix security issues with imports (openvinotoolkit#185)

* Fix paths to COCO in mmdetection third party sanity tests (openvinotoolkit#186)

* Build graphs within the torch.no_grad() context (openvinotoolkit#187)

Should reduce memory usage during create_compressed_model

* Fix security issues directly in code (openvinotoolkit#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (openvinotoolkit#190)

* Make default install support non-GPU cases (openvinotoolkit#193)

* Fixed backward compatibility test (openvinotoolkit#195)

* Improve quantizer setup for hanging batchnorm nodes (openvinotoolkit#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

Will manifest itself in case there is a batch norm operation that
was not merged into any previous op, i.e. one that should accept
quantized input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (openvinotoolkit#171)

* Add more security fixes (openvinotoolkit#197)

* Removed double logging to stdout (openvinotoolkit#198)

* ignore frozen layers during filter pruning (openvinotoolkit#200)

* Use latest matplotlib version (openvinotoolkit#206)

* Use propagation based mode by default (openvinotoolkit#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize inputs  option.

* Add operator metatypes for 'sigmoid' and 'add' operator (openvinotoolkit#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Introduced `enabled` parameter for Quantizers (openvinotoolkit#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Update documentation (openvinotoolkit#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* To fix cpu_only part (openvinotoolkit#221)

* To update the cpu_only part of the dockerfile; fix issue with setup.py install with the --cpu-only opt; fix README.md

* apply remarks

* Fix register_operator (openvinotoolkit#224)

* Add per-layer sparsity. (openvinotoolkit#127)

* Do not call _quantize_inputs for propagation based mode (openvinotoolkit#229)

* Consistent bitwidth for activations and weight in propagation mode (openvinotoolkit#191)

* Added sota eval tests via AC (openvinotoolkit#142)

* Refactored HAWQ: split functionality into separate files (openvinotoolkit#232)

* Allow quantizing modules that share their weights for multiple operations (openvinotoolkit#235)

* Filter quantizers that directly act upon integer inputs (openvinotoolkit#228)

* Add support sparsity freeze epoch for magnitude sparsity. (openvinotoolkit#218)

* Liberal bitwidth assignment mode by default on precision initialization (openvinotoolkit#222)

* Fix AdaptiveSparsityScheduler. (openvinotoolkit#236)

* Fix threesigma init (openvinotoolkit#240)

* Build extensions in a temporary folder (openvinotoolkit#239)

* Criterion generalization for HAWQ algorithm (openvinotoolkit#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* fix quantization range initialization in case of 1 scale channel (openvinotoolkit#241)

fix quantization range initialization in the case of 1 scale channel, to avoid initializing from only a single slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...)

* Patch Semantic Segmentation Application to export onnx and test with resume flag (openvinotoolkit#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (openvinotoolkit#220)

* Fixed skip Openvino tests and preinstall (openvinotoolkit#246)

* Corrected handling of barrier on the graph traverse (openvinotoolkit#249)

* Extend input handling flexibility (openvinotoolkit#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling Inception outputs in classification sample (openvinotoolkit#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (openvinotoolkit#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range; make formulas dependent on only one value (levels), thereby reducing the chance of a mistake

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges

* Fix bug with export to Q/DQ (openvinotoolkit#248)

* Fix bug with export to Q/DQ

Add a hack for export processing of our old checkpoints.
Add exception raising for exporting per-channel Q/DQ layers, as PyTorch
ONNX export supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (openvinotoolkit#253)

* Update documentation for release v1.5.0 (openvinotoolkit#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (openvinotoolkit#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

* Release v1.6.0 of NNCF to master (openvinotoolkit#461)

* Fix input quantization in case of embeddings (openvinotoolkit#97)

* Added sanity tests for third party integration (openvinotoolkit#45)

* Expose quantizer linking through config (openvinotoolkit#100)

* Add citing section to frontpage README (openvinotoolkit#103)

* Fix bad rebase in asymmetric quantization ONNX export (openvinotoolkit#104)

* Use default quantizer configuration for op weights not specified in HW config (openvinotoolkit#105)

* Update transformers to v3.0.2 (openvinotoolkit#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (openvinotoolkit#109)

* Add unified scales in HW config operation (via quantizer linking) (openvinotoolkit#108)

* Add quantization metric (openvinotoolkit#33)

* Make HW config parsing conform to the implicit rules (openvinotoolkit#111)

(except for the "any supported quantization for the ops in config
without specified quantizations", because they need config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (openvinotoolkit#113)

* Use sequential sampling for evaluation across example scripts (openvinotoolkit#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (openvinotoolkit#115)

* Properly handle ops in HW config without quantization configs associated (openvinotoolkit#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (openvinotoolkit#121)

* make optional criterion in signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (openvinotoolkit#122)

* Fix pretrained (openvinotoolkit#125)

* Mark Convs as non-depthwise for 1 input channel case (openvinotoolkit#126)

* Add non-RELU activations to fusable patterns (openvinotoolkit#124)

* Fixed Pylint warnings (openvinotoolkit#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (openvinotoolkit#132)

* Add per layer initialization of  ranges. (openvinotoolkit#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (openvinotoolkit#138)

* Fix PyLint. (openvinotoolkit#139)

* Introduced compression ratio parameter for Mixed Precision init (openvinotoolkit#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing the optimal mixed-precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to the fully INT8 one.
Total compression for the model is the sum of compression over each quantized layer, computed by multiplying the layer's (Conv, Deconv, Linear) FLOPS by the number of bits used for its quantization. The ratio is used to estimate the performance boost of the quantized model; it's a better proxy for the amount of computation than the number of parameters multiplied by bitwidth

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (openvinotoolkit#150)

* Set data loading workers to 0 across tests to force single process (openvinotoolkit#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (openvinotoolkit#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (openvinotoolkit#166)

* Fix command line creation for test_compression_training (openvinotoolkit#167)

* Improve eval test code (openvinotoolkit#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (openvinotoolkit#158)

* Fix third_party_sanity and eval test bugs (openvinotoolkit#169)

* Fix mmdetection dataset search path for SSD (openvinotoolkit#176)

* Test stability (openvinotoolkit#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (openvinotoolkit#178)

* Fixed disabling gradients of quantizers for HAWQ (openvinotoolkit#184)

* Corrected default values in range initializers (openvinotoolkit#183)

- Correct minimum and maximum values for mean_min_max no longer skip the check for uncollected statistics, which prevents initializing with inf values.
- Percentile init doesn't crash by default

* Refactor imports in setup.py (openvinotoolkit#182)

Important for CI

* Fix security issues with imports (openvinotoolkit#185)

* Fix paths to COCO in mmdetection third party sanity tests (openvinotoolkit#186)

* Build graphs within the torch.no_grad() context (openvinotoolkit#187)

Should reduce memory usage during create_compressed_model

* Fix security issues directly in code (openvinotoolkit#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (openvinotoolkit#190)

* Make default install support non-GPU cases (openvinotoolkit#193)

* Fixed backward compatibility test (openvinotoolkit#195)

* Improve quantizer setup for hanging batchnorm nodes (openvinotoolkit#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

Will manifest itself in case there is a batch norm operation that
was not merged into any previous op, i.e. one that should accept
quantized input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (openvinotoolkit#171)

* Add more security fixes (openvinotoolkit#197)

* Removed double logging to stdout (openvinotoolkit#198)

* ignore frozen layers during filter pruning (openvinotoolkit#200)

* Use latest matplotlib version (openvinotoolkit#206)

* Use propagation based mode by default (openvinotoolkit#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize inputs  option.

* Add operator metatypes for 'sigmoid' and 'add' operator (openvinotoolkit#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Grouping of pruning modules + clustering classes

* Small fixes

* Introduced `enabled` parameter for Quantizers (openvinotoolkit#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Added model analysis file

* Update documentation (openvinotoolkit#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* Fixes for grads + batch norms

* To fix cpu_only part (openvinotoolkit#221)

* To update the cpu_only part of the dockerfile; fix issue with setup.py install with the --cpu-only opt; fix README.md

* apply remarks

* Fix register_operator (openvinotoolkit#224)

* Add per-layer sparsity. (openvinotoolkit#127)

* Do not call _quantize_inputs for propagation based mode (openvinotoolkit#229)

* Consistent bitwidth for activations and weight in propagation mode (openvinotoolkit#191)

* Added sota eval tests via AC (openvinotoolkit#142)

* Refactored HAWQ: split functionality into separate files (openvinotoolkit#232)

* Allow quantizing modules that share their weights for multiple operations (openvinotoolkit#235)

* Filter quantizers that directly act upon integer inputs (openvinotoolkit#228)

* Add support sparsity freeze epoch for magnitude sparsity. (openvinotoolkit#218)

* Liberal bitwidth assignment mode by default on precision initialization (openvinotoolkit#222)

* Fix AdaptiveSparsityScheduler. (openvinotoolkit#236)

* Fix threesigma init (openvinotoolkit#240)

* Build extensions in a temporary folder (openvinotoolkit#239)

* Refactoring + added step with model analysis

* Criterion generalization for HAWQ algorithm (openvinotoolkit#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* Fixes for pruning info

* fix quantization range initialization in case of 1 scale channel (openvinotoolkit#241)

fix quantization range initialization in the case of 1 scale channel, to avoid initializing from only a single slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...)

* Patch Semantic Segmentation Application to export onnx and test with resume flag (openvinotoolkit#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (openvinotoolkit#220)

* Fixed skip Openvino tests and preinstall (openvinotoolkit#246)

* Small cleanup + refactoring

* Corrected handling of barrier on the graph traverse (openvinotoolkit#249)

* Extend input handling flexibility (openvinotoolkit#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling Inception outputs in classification sample (openvinotoolkit#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (openvinotoolkit#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range; make formulas dependent on only one value (levels), thereby reducing the chance of a mistake

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges

* step 1

* Fix bug with export to Q/DQ (openvinotoolkit#248)

* Fix bug with export to Q/DQ

Add a hack for export processing of our old checkpoints.
Add exception raising for exporting per-channel Q/DQ layers, as PyTorch
ONNX export supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (openvinotoolkit#253)

* Update documentation for release v1.5.0 (openvinotoolkit#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (openvinotoolkit#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

* Fixed old tests

* Add test for pruning groups checks

* Fix pylint + small cleanup

* More clarification about `bits` parameter in docs (openvinotoolkit#263)

* make customer happy by showing the param name that is wrong (openvinotoolkit#259)

* kernel changes

* Add pruning sample tests. (openvinotoolkit#268)

* Change an operation order in create_compressed_model (openvinotoolkit#265)

* Introduce additional evaluation of the loss function in the SSD application

* Expanded table, skipped unsupported models (openvinotoolkit#234)

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Mlflow log (openvinotoolkit#243)

* mlflow logging

* something

* some changes

* Some fixes and clear up

* Symbolic link update

* Final Updates

* Little fixes

* Little fixes (one more)

* Test mlflow off

* Deleted hardcoded log dir

* Generalization

* Clear up

* Fixes

* code fixes

* Common classification functions carry out

* Metrics logging changes

* Fix comments

* Fix pylint

* Fix pylint

* Fix last linter warnings

* CPU NMS kernels replaced by a torch func

* Extended test for model analysis

* Clean up

* Small pylint + comments fixes

* Fix gradients zeroing + prune batch norms by default

* Fix prune batch norm default

* Fix test

* is cuda

* Compress in eval mode (openvinotoolkit#257)

* Pruning of ConvTranspose (openvinotoolkit#274)

* Add pruning of ConvTranspose

* Rename to target_weight_dim_for_compression

* fixes

* Fix zero_grad

* get_op_types_of_pruned_modules

* Fixed collecting metrics.json for incomplete eval test (openvinotoolkit#279)

* Added Unet Mapillary AC configs (openvinotoolkit#281)

* Added flag for collecting quickly computed stats (openvinotoolkit#287)

* Remove __getattr__ from SampleConfig (openvinotoolkit#292)

Newer `addict` versions use custom private attributes for their internal
workings, and __getattr__ disrupted that. It was quite useless anyway.

* Fix H/W on an image in the mock coco dataset (openvinotoolkit#291)

* Set proper workdir path for Mask-RCNN (openvinotoolkit#294)

* Proper BN momentum parameter and train mode setting in BN adaptation (openvinotoolkit#288)

* proper BN momentum parameter and train-mode setting in BN adaptation

* use the training-mode switcher context manager for BN adaptation inference
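
A minimal sketch of such a context manager (NNCF's actual helper may differ): it switches the model to train mode so BN running statistics get updated during the adaptation forward passes, then restores each submodule's original mode.

```python
import contextlib
import torch.nn as nn

@contextlib.contextmanager
def training_mode_switcher(model: nn.Module, is_training: bool = True):
    # remember each submodule's original mode, switch, and restore afterwards
    saved = {m: m.training for m in model.modules()}
    model.train(is_training)
    try:
        yield
    finally:
        for module, mode in saved.items():
            module.train(mode)
```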

* Testing OPs quantization by synthetic tests (openvinotoolkit#297)

Also
* Made LeakyReLU an input_quantizable OP
* Removed extra dot-files for ManyNonEvalModules test case

* Revised mixed-precision related content (openvinotoolkit#300)

* Moved mixed_precision configs to a separate folder
* Minimized the scope of parameters in these configs, removing as many as possible and letting them take their default values.

* Remove .dot extension in the HW config test case descriptor (openvinotoolkit#303)

* Switch to VOC2012 in eval mode (openvinotoolkit#295)

* Updated pruning configs and results (openvinotoolkit#305)

* Don't call MLFlow if it's not enabled (openvinotoolkit#304)

Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.

* Add input/output-names parameters to export_model function. (openvinotoolkit#296)

* Fixed paths to mixed-precision configs (openvinotoolkit#306)

* Correct configs for mixed precision models (openvinotoolkit#307)

After openvinotoolkit#300 the *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy.

* Removed custom SqueezeNet model for better user experience (openvinotoolkit#308)

* Correct configs for mixed precision models

After openvinotoolkit#300 the *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy.

* Removed custom SqueezeNet model for better user experience

Originally we had a modified copy of the SqueezeNet model to work around a bug in the ONNX exporter when converting MaxPool with ceil_mode=True.
This bug is no longer relevant for torch 1.5, and there's an almost identical SqueezeNet model in torchvision > 0.6.
That's why the custom SqueezeNet was deleted as no longer needed, to remove confusion.

There are no changes in the corresponding NNCF graph.
Previously trained checkpoints for the custom SqueezeNet can be loaded and evaluated with the SqueezeNet from torchvision. The INT8 model has the same accuracy; the mixed-precision model differs by at most ~0.01.

* Added ResNet-18 magnitude Filter Pruning config and snapshot (openvinotoolkit#311)

* Added ResNet-18 magnitude Filter Pruning config and snapshot

* Adjusted checkpoint validation

* Move call epoch_step() method to begin of epoch. (openvinotoolkit#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename value of target_device from "NONE" to "TRIAL" (openvinotoolkit#314)

* Move call epoch_step() method to begin of epoch. (openvinotoolkit#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename target_device "NONE" to "TRIAL".

* Fix NMS CUDA extensions import for CPU only case (openvinotoolkit#316)

* Made initialization depending on the number of samples. (openvinotoolkit#309)

* Wrapped MLFlow for safe access (openvinotoolkit#313)

* Introduced a separate batch size for initialization (openvinotoolkit#315)

* A separate data_loader is registered for initialization via `register_default_init_args` (a usage sketch follows)
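
A hedged usage sketch; the import path, argument order, and names are assumptions that may differ between NNCF versions, and `train_dataset`, `nncf_config`, and `criterion` are placeholders:

```python
import torch
from nncf.initialization import register_default_init_args  # path may vary by version

# Register a dedicated loader with its own (possibly larger) batch size
# for range/BN initialization, independent of the training loader.
init_loader = torch.utils.data.DataLoader(train_dataset, batch_size=256, shuffle=False)
nncf_config = register_default_init_args(nncf_config, init_loader, criterion=criterion)
```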

* WA for Python 3.6 on CI (openvinotoolkit#321)

* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (openvinotoolkit#322)

* Show subprocess log in test assertion stacktrace (openvinotoolkit#325)

* Adjust ICNet compressed target values (openvinotoolkit#326)

* Do not replace the parameter during symmetric range init (openvinotoolkit#327)

The initialization via the controller method may occur *after*
the optimizer has received the list of the model's parameters, so replacing
the parameter as a whole during such initialization would break the
gradient updates.
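
A sketch of the pattern behind the fix (variable names are illustrative): the optimizer holds references to the original Parameter objects, so initialization must update them in place.

```python
import torch

scale = torch.nn.Parameter(torch.ones(1))
optimizer = torch.optim.SGD([scale], lr=0.1)

initialized_value = torch.tensor([3.5])

# Rebinding (scale = torch.nn.Parameter(initialized_value)) would create a
# new object the optimizer never sees; instead, copy into the existing one:
with torch.no_grad():
    scale.copy_(initialized_value)
```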

* Increase number of epochs in sanity test runs (openvinotoolkit#324)

Should uncover more bugs.

* Replace the rest of num_init_steps entries with num_init_samples (openvinotoolkit#328)

* Use PyTorch 1.7 (openvinotoolkit#223)

* Move epoch_step and step to the beginning of epoch for staged worker (openvinotoolkit#318)

* Use torch 1.7.0 for third party sanity tests (openvinotoolkit#333)

* Fix mixing of Cyrillic and Latin letters (openvinotoolkit#335)

* Fix calculate statistics in local mode sparsity. (openvinotoolkit#337)

* Fk/update packages versions (openvinotoolkit#338)

* Adding definitive versions of required packages, move to Python 3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only (fix01)

* Update accuracy target values after switching to torch 1.7.0 (openvinotoolkit#334)

* Change tensorboardX to pytorch.utils.tensorboard (openvinotoolkit#332)

* Change tensorboardX to tensorboard

* Add tensorboard version

* Add domain in onnx-model for custom operations. (openvinotoolkit#323)

* Corrected grouping of activation quantizers (openvinotoolkit#339)

Unmerged activation FQs should be placed in different groups if an unmerged activation FQ on a branch directly follows another activation FQ (a common input for different branches):
start->FQ_A  Conv
        \   /
       POST_HOOK
         /    \
  PRE_HOOK    PRE_HOOK
    |           \
  div          MaxPool   here->|FQ_A|
                  \     /
                POST_HOOK

* Adjust thresholds due to new torchvision FP32 checkpoints acc. drop (openvinotoolkit#342)

* Changed AC configs for SSD models (openvinotoolkit#341)

* Revert "Fk/update packages versions (openvinotoolkit#338)" (openvinotoolkit#343)

This reverts commit 8c17e0c.

* Fk/update packages versions (openvinotoolkit#344)

* Adding definitive versions of required packages, move to Python 3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only (fix01)

* Add pandas to requirements

* Adding definitive versions of required packages, move to Python 3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only (fix01)

* Add pandas to requirements

* Fix mistake in tensorboard name

* Fix per-layer sparsity. Add stub scheduler. (openvinotoolkit#340)

* fix config path (openvinotoolkit#346)

* Add Embedding to the CPU HW config definition (openvinotoolkit#347)

* Added separated execution OV tests to start parraleling (openvinotoolkit#282)

* Remove no_empty_cache in an attempt to fix sporadic CI failures (openvinotoolkit#348)

* Add an option to optimize logarithms of quantizer scales instead of scales directly (openvinotoolkit#329)

* add scale_log parameter for quantization; it allows increasing convergence speed for high scales and improving accuracy for low scales (see the sketch after this commit's notes)

* add _ prefix to make some variables "hidden"

* variant of setter for scale

* add setter for input_range for asymmetric quantizer

* scale_log_flag is used outside to print status, so brought back .scale_log_flag instead of ._scale_log_flag

* made scale_log_flag read only

* add test for scale_log parameter.

* Update test_scale_log.py

* add missing key check for load_state_dict; whitespace fixes

* remove quantizer.scale = torch.nn.Parameter() to avoid torch error

* fix test_unified_scales_are_identical_in_onnx failure due to being unable to set a Parameter via a property

* remove useless init method

* split long line

* fix test_unified_scales

* Update test_scale_log.py

* update ref file by replacing scale -> _scale_tensor

* Update README.md

* Update README.md

* Update layers.py

* fix HookAutoRemove

* Improvements

Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
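
For illustration, a minimal sketch of the log-scale parameterization described above; the class and attribute names are assumptions, not NNCF's actual layer:

```python
import torch

class LogScaleQuantizerStub(torch.nn.Module):
    """Optimizes log(scale) instead of the scale itself: gradient steps on
    the log are multiplicative on the scale, improving convergence for large
    scales and resolution for small ones."""

    def __init__(self, init_scale: float = 1.0):
        super().__init__()
        self._scale_log = torch.nn.Parameter(
            torch.log(torch.tensor(float(init_scale))))

    @property
    def scale(self) -> torch.Tensor:
        # exp() keeps the effective scale strictly positive.
        return torch.exp(self._scale_log)

    @scale.setter
    def scale(self, value: torch.Tensor):
        # Assign a plain tensor, not a Parameter: nn.Module.__setattr__
        # intercepts Parameter assignments before the property setter runs.
        with torch.no_grad():
            self._scale_log.copy_(torch.log(value))
```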

* Fixed protobuf error (openvinotoolkit#349)

* Add quantization support for nn.EmbeddingBag (openvinotoolkit#330)

* Add quantization support for nn.EmbeddingBag

* Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT

* Add synthetic model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag

* Remove duplicated synthetic model test of nn.Embedding

* Add EmbeddingBag to the CPU HW config definition

* replace TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc

* Handle network input nodes to NNCFEmbeddingBag

* Fix pylint warnings

* Vpu config revision (openvinotoolkit#356)

* Revised VPU config

* More extreme ratio for VPU config to test INT2 bitwidth assignment

Also updated reference graphs

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>

* Renamed case-sensitive files to prevent git issue on Windows (openvinotoolkit#357)

After checking out a fresh develop, git confuses files whose names differ only in letter case.
As a result such a file can't be discarded.
`git config --global core.ignorecase true` doesn't help either

* Update mmdet patch (openvinotoolkit#354)

* Update mmdet patch

* Update configs and meta

* Add export tests

* Update test

* Update package installation

* Compression statistics before training (openvinotoolkit#345)

* Compression statistics before training

* Compression statistics before training

* print_statistics sanity test

* Object detection test fixes

* is_main_process aligning

* pylint disabling

* Pruning refactoring to work with FLOPs target too (openvinotoolkit#320)

* Added pruning_flops_target param and all necessary functions

* Added tests

* Pylint fixed

* Fixed comments: BatchNorm deleted from flops calculations and small refactoring

* Fix tests

* Delete bias from FLOPs calc + test reverting

* Fix bug with mmdet patch (openvinotoolkit#363)

* Fix bug with mmdet patch

* Fix bugs

* Fix pylint

* Added ONNX Q-DQ converting parameters (openvinotoolkit#362)

* Revert "Added ONNX Q-DQ converting parameters (openvinotoolkit#362)" (openvinotoolkit#368)

This reverts commit b0504e9.

* Beta directory (openvinotoolkit#364)

* create beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)

* update documentation

* updated checkpoint links

* nncf-tensorflow alpha

* Use PyLint 2.6+ (openvinotoolkit#370)

* Fix missing default value (openvinotoolkit#373)

* Enable batch norm adaptation by default (openvinotoolkit#360)

* Remove immediate failure when trying to use NNCF with torch 1.5.0 (openvinotoolkit#372)

* Add pre post processing test  (openvinotoolkit#374)

* Fix missing default value

* Add pre_post processing tests

* Relax upper-bound threshold for mixed precision ResNet50 (openvinotoolkit#375)

* Use a reduced number of BN adaptation samples for sanity testing (openvinotoolkit#378)

* Dropped last data point in all DataLoaders to prevent issue with BN (openvinotoolkit#379)

There is a small chance that the last data point has a batch size of 1, which leads to an error:
```
ValueError: Expected more than 1 value per channel when training
```
We caught this error in sanity tests with CIFAR10. The mock dataset has 1000 data points:
333 batches with batch_size=3 and a final one with batch_size=1. Training may therefore fail at the end of an epoch, which is unacceptable for bigger datasets.
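
A sketch of the corresponding change (`train_dataset` is a placeholder): PyTorch's `drop_last=True` discards the final incomplete batch.

```python
from torch.utils.data import DataLoader

train_loader = DataLoader(train_dataset, batch_size=3, shuffle=True,
                          drop_last=True)  # never yields a size-1 batch
```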

* Fix eval failures due to BN adaptation enabled by default (openvinotoolkit#377)

* Reduce BN adaptation samples count in HAWQ sanity configs (openvinotoolkit#380)

* Fix object detection sample. (openvinotoolkit#383)

* Added Q-DQ ONNX converting parameter (openvinotoolkit#369)

* Links to models were updated (openvinotoolkit#386)

* include_mask flag for tfds decoder was added (openvinotoolkit#385)

* include_mask flag for tfds decoder was added

* Support of the input_info param was added (openvinotoolkit#388)

* change VOC dataset namings (openvinotoolkit#387)

* Configure device by common function for all samples (openvinotoolkit#391)

* Reduced num_init_samples for range init to accelerate sanity tests (openvinotoolkit#392)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader) (openvinotoolkit#390)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader)

* Add pruned ssd300 and unet_mapillary (openvinotoolkit#393)

* Print flops pruning level in statistic (openvinotoolkit#367)

* Print FLOPs pruning level in statistics (see the hook sketch below)

* Calculate current flops after update masks

* Fix: missed transpose convolution

* add test_calculation_of_flops

* Fix compute_flops_hook for nn.linear

* Add comment for compute_flops_hook
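
A hedged sketch of a per-module FLOPs hook, not the exact compute_flops_hook from this PR; the transpose-convolution count is approximate (it ignores stride overlap):

```python
import torch

flops_per_module = {}

def compute_flops_hook(module, inputs, output):
    # Multiply-accumulates per output element, times the output size.
    if isinstance(module, (torch.nn.Conv2d, torch.nn.ConvTranspose2d)):
        macs_per_out = (module.in_channels // module.groups
                        * module.kernel_size[0] * module.kernel_size[1])
        flops_per_module[module] = 2 * output.numel() * macs_per_out
    elif isinstance(module, torch.nn.Linear):
        flops_per_module[module] = 2 * output.numel() * module.in_features

# Usage: handle = conv_layer.register_forward_hook(compute_flops_hook)
```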

* Add AutoML-based mixed-precision initialization mode - AutoQ (openvinotoolkit#250)

* Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision

* Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and
  initialize layer-wise quantization bitwidth, prior to NNCF quantization-aware fine-tuning

* The mixed-precision initialization is optimized towards minimal accuracy drop given
  a user-specified model size constraint

* Supported precision depends on target HW (VPU 8/4/2) or user-specified precision space

* Fix path to unet_mapillary_pruning_geometric_median checkpoint (openvinotoolkit#397)

* Fix pruning l2norm (openvinotoolkit#310)

* Fix pruning l2norm

* Use register_module for l2norm

* Add filter by algorithms for registered modules

* Add condition to add _registred_name in registered module

* resolve comments

* fix pylint

* Update reference dot files

* Separate the examples and test Python package requirements from NNCF (openvinotoolkit#384)

* converted relative imports to absolute imports (openvinotoolkit#396)

* Add ac configs for pruned unet and ssd300 (openvinotoolkit#399)

* Add ac configs for pruned unet and ssd300

* Add batch 32 for ssd300_vgg_voc_pruning_geometric_median

* Added proper license for DDPG-related code (openvinotoolkit#398)

* Add some explanations to make doc clearer (openvinotoolkit#395)

* Add some explanations to make doc clearer

* docs cleanup

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>

* Simplify paths to configs (openvinotoolkit#400)

* Path to config was fixed

* Paths to configs were simplified

* Add ssd_mobilenet_voc_sparsity_int8 config (openvinotoolkit#404)

* Use links to config files for NNCF READMEs (openvinotoolkit#407)

* Combined package (openvinotoolkit#410)

* beta.nncf package

* removed pytest.ini

* Return pandas to the list of requirements (openvinotoolkit#405)

* Remove NNCF package dependency on tensorboard (openvinotoolkit#411)

* Small scheduler fixes (openvinotoolkit#412)

* Add step to pruning schedulers and algo + delete redundant pruning rate setting

* Fix tests

* Revert same pruning rate changes

* Add pruning_init in test_calculation_of_flops

Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>

* [TF] Minor fixes (openvinotoolkit#403)

* Minor fixes

* Pylint issues were fixed

* Extra line was removed

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

* [TF] Add handling of non-distributed strategy (openvinotoolkit#401)

* Default strategy was added

* cpu-only flag was disabled for Mask R-CNN training

* Fixed non-distributed mode for the object detection sample

* Merging and pre hooks (openvinotoolkit#302)

* Add pre-hook functionality to quantization

* Add quantizer merging logic to the propagation mode

* Properly update and merge quantizers between quantizable layers

* Move adjacent quantizer group creation closer to the builder stage

* Store affected op node key in the propagating quantizer

* Refactor quantization to jointly quantize weights and activations

* Fix clearing constraint sets during liberal activation bitwidth assignment

* Add initial version of build-time range init

* Make HAWQ work with heterogeneous quantizer configurations

* Finalize the switch to build-time range init

* Properly compare quantizer configs for requantization purposes

* Fix quantizer ordering once again

* Improve HAWQ bitwidth reference graph formatting

* Add NNCF network clean view tests

* Fix errors

* Use statistics approach for the runtime range init

* Add tests for separate statistic collectors

* Extend range init setting tests

* Fix rebasing issues

* Switch AutoQ to setting compatible configs instead of bitwidths

* Ref HAWQ file adjustments after fixing experimental controller init

* Relax requirements packages versions (openvinotoolkit#415)

* using common registry (openvinotoolkit#414)

* fixed sanity tests for samples (openvinotoolkit#417)

* Common NNCFConfig (openvinotoolkit#413)

* using common config

* added jsonschema to requirements

* Fix third-party sanity tests (openvinotoolkit#420)

* Fix NoCompressionAlgorithmBuilder (openvinotoolkit#426)

* fixed issues with paths (openvinotoolkit#425)

* 00.0:Updating NNCF github dockerfiles against last changes (openvinotoolkit#436)

* Change thresholds for pruned ssd300 (openvinotoolkit#435)

diff_fp32_min from -1.2 to -4.8

* Use one of the registered JSON meta-schemas (openvinotoolkit#439)

Fixes: openvinotoolkit#416

* Use non-recursive BFS for graph traversal (openvinotoolkit#440)

* Use non-recursive BFS for graph traversal

Python does not handle deep recursion stacks well.

* Use DFS by default, after all
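
A minimal sketch of the iterative pattern (generic, not NNCF's graph classes): an explicit deque sidesteps Python's default recursion limit of about 1000 frames, and switching the pop side toggles between BFS and DFS.

```python
from collections import deque

def traverse(graph, start, bfs=True):
    """graph: dict mapping a node to an iterable of its neighbors."""
    frontier = deque([start])
    visited = {start}
    order = []
    while frontier:
        node = frontier.popleft() if bfs else frontier.pop()  # queue vs. stack
        order.append(node)
        for neighbor in graph.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(neighbor)
    return order
```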

* Add AC config for SSD300_mobilenet on voc. (openvinotoolkit#441)

* Minor fixes for HAWQ (openvinotoolkit#442)

Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the `dump_precision_init_data` option.
Corrected the printing of the chosen bitwidth configuration.

* Init on same device by default (openvinotoolkit#438)

* Use model's own device for initialization by default

* Adjust init args documentation

* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices

* Use cuda for precision init tests

* Remove extra entries from MANIFEST.in (openvinotoolkit#452)

* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (openvinotoolkit#450)

* Changed working logic with json metrics (openvinotoolkit#447)

* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (openvinotoolkit#454)

* Fix metric value for ssd300_mobilenet_voc. (openvinotoolkit#453)

* Do not follow symlinks when opening files (openvinotoolkit#451)

* Correctly construct Q-DQ config for E2E tests (openvinotoolkit#456)

* Update documentation for the v1.6.0 release (openvinotoolkit#457)

* Add torch.load warnings and path resolution (openvinotoolkit#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>

* accuracy aware draft

* refactor to introduce TrainingRunner for training loop control

* move accuracy aware loop to common

* address comments

* update accuracy aware

* add support for TF samples

* refactor keras API sample

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>