
Refactor NNCF compression statistics reporting API #688

Merged
14 commits merged into openvinotoolkit:develop on May 24, 2021

Conversation

andrey-churkin
Contributor

No description provided.

@andrey-churkin added the NNCF Common label Apr 30, 2021
@andrey-churkin requested review from alexsu52, vshampor and a team April 30, 2021 01:46
@andrey-churkin marked this pull request as draft April 30, 2021 01:47
"""
Returns a representation of the statistics as built-in data types.

:return: A representation of the statistics as built-in data types.
Contributor

Should add a note on what the string keys in the dict represent.

Contributor Author

The as_dict() method was removed during refactoring, so this comment no longer applies.

'mask_pruning_level': self.mask_pruning_level,
'filter_pruning_level': self.filter_pruning_level,
}
return summary
Contributor

Note that this is essentially a copy-and-paste of the corresponding self.__dict__. If you decide to take this approach, it would perhaps be easier to write a common Statistics helper that iterates over self.__dict__, puts built-in types (int, float and str) into the returned dict as-is, stores the result of .as_dict() for Statistics entries, and ignores everything else (emitting a debug message with a warning if such an entry is encountered).
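
For illustration, a minimal sketch of such a shared helper might look like this (the base-class name, the handled field types and the logger setup are assumptions for the example, not the actual NNCF code):

import logging

logger = logging.getLogger(__name__)


class Statistics:
    def as_dict(self) -> dict:
        retval = {}
        for name, value in self.__dict__.items():
            if isinstance(value, (int, float, str)):
                # Built-in scalar types go into the returned dict as-is.
                retval[name] = value
            elif isinstance(value, Statistics):
                # Nested statistics objects are converted recursively.
                retval[name] = value.as_dict()
            else:
                # Everything else is ignored with a debug-level warning.
                logger.debug('as_dict: skipping field %s of unsupported type %s',
                             name, type(value).__name__)
        return retval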

Contributor Author

The as_dict() method was removed during refactoring. I implemented a helper function that prepares the statistics for TensorBoard instead.
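
For reference, a sketch of the kind of helper meant here, assuming the statistics arrive as a (possibly nested) dict of scalars; the function name and tag layout are illustrative, not the actual NNCF implementation:

def prepare_for_tensorboard(statistics: dict, prefix: str = 'compression') -> dict:
    """Flattens nested statistics into '<prefix>/<name>' -> scalar pairs,
    since add_scalar cannot handle nested dicts."""
    flat = {}
    for name, value in statistics.items():
        tag = '{}/{}'.format(prefix, name)
        if isinstance(value, dict):
            # Recurse into nested dicts, extending the tag path.
            flat.update(prepare_for_tensorboard(value, prefix=tag))
        elif isinstance(value, (int, float)):
            flat[tag] = value
        # Non-scalar entries (strings, tables, etc.) cannot be logged with
        # add_scalar and are skipped here.
    return flat

# Usage in the training loop, with tb being a SummaryWriter-like object:
# for tag, value in prepare_for_tensorboard(stats).items():
#     tb.add_scalar(tag, value, global_step)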

@@ -453,7 +453,7 @@ def train_epoch(train_loader, model, criterion, criterion_fn, optimizer, compres
config.tb.add_scalar("train/top1", top1.avg, i + global_step)
config.tb.add_scalar("train/top5", top5.avg, i + global_step)

-for stat_name, stat_value in compression_ctrl.statistics(quickly_collected_only=True).items():
+for stat_name, stat_value in compression_ctrl.statistics(quickly_collected_only=True).as_dict().items():
Contributor

Can the tb.add_scalar handle nested dicts? Because your as_dict structure is, in general, nested.

Contributor Author

tb.add_scalar does not support nested dicts.

@@ -221,7 +220,7 @@ def run(config):
**validation_kwargs)

logger.info('evaluation...')
-print_statistics(compression_ctrl.statistics())
+logger.info(compression_ctrl.statistics().as_str())
Contributor

Suggested change:
-logger.info(compression_ctrl.statistics().as_str())
+compression_statistics = compression_ctrl.statistics()
+logger.info(compression_statistics.as_str())

If you accept this change, please update the same code snippets below too.

Contributor Author

Done.

})

return raw_pruning_statistics
# TODO(andrey-churkin): Should be calculated
Contributor

Could you file a ticket for this TODO?

Contributor Author

I created a ticket (55672) to track the progress of this task.

sparsity_levels = tf.keras.backend.batch_get_value(sparsity_levels)
weights_percentages = [weights_number / total_weights_number * 100
for weights_number in weights_numbers]
weights_percentages = tf.keras.backend.batch_get_value(weights_percentages)
mask_sparsity = list(zip(mask_names, weights_shapes, sparsity_levels, weights_percentages))
raw_sparsity_statistics['sparsity_statistic_by_layer'] = []

# TODO(andrey-churkin): Why we use `mask_name` instead of `layer_name`?
Contributor

@daniil-lyakhov Could you comment?

Collaborator

@daniil-lyakhov May 14, 2021

It's the name taken from the magnitude sparsity raw_statistic method. I used the wrapped_layer.name + '_rb_mask' name because it contains the name of the layer, whereas mask.name consists of weight_attribute + '_mask' and is therefore less user friendly. So mask_name does not actually contain mask.name values; it contains their more user-friendly equivalents.

Contributor Author

Seems like a bug because NNCFWrapper can contain multiple operations with weights.


target_level = self.loss.target_sparsity_rate
# TODO(andrey-churkin): Should be calculated when the distributed mode will be supported
masks_consistency = 1.0
Contributor

We covered this case in tests, so this statistic does not make sense for the TF backend.

Contributor Author

@vshampor Do we really want to report the masks_consistency statistic to users?

Contributor Author

@vshampor Could you please answer the question above?

Contributor

I had no idea what mask consistency was before you drew my attention to it, and it is not immediately obvious to me what this parameter means in the context of sparsity or how it impacts the sparsity outputs. I think we can omit reporting it for now.

# TODO(andrey-churkin): Should be calculated when the distributed mode will be supported
masks_consistency = 1.0

# TODO(andrey-churkin): Check that `mean_sparse_prob` is calculated correctly
Contributor

Could you check it in this PR or file a ticket?

Contributor Author

Fixed in ac19dc6.

@@ -23,6 +24,38 @@
ModelType = TypeVar('ModelType')


class CompositeStatistics(Statistics):
Contributor

If we look at this in terms of the main usage scenario, which I think is accessing statistics values, we see that CompressionAlgorithmController returns a Statistics object while CompositeCompressionAlgorithmController returns a CompositeStatistics object. Currently, if a developer wants to get sparsity_level, for example, they have to write the following code:

statistics = compression_ctrl.statistics()
if isinstance(statistics, CompositeStatistics):
    items = statistics.child_statistics
else:
    items = [statistics]

for item in items:
    if isinstance(item, (MagnitudeSparsityStatistics, RBSparsityStatistics, ConstSparsityStatistics)):
          sparsity_level = item.model_statistics.sparsity_level
    

Before these changes it was like this:

statistics = compression_ctrl.statistics()
sparsity_level = statistics['sparsity_level']

I was expecting the following:

statistics = compression_ctrl.statistics()
sparsity_level = statistics.magnitude_sparsity.sparsity_level
#or
magnitude_sparsity_statistics = statistics.get_statistics('magnitude_sparsity')
sparsity_level = magnitude_sparsity_statistics.sparsity_level

Can we make the statistics() method return a single type of object, i.e. a statistics container? @andrey-churkin, @vshampor, could you look once more at the proposal in this PR in terms of UX/DX?

Contributor Author

Thank you very much for your suggestion. I fixed it in b5398b0.

@andrey-churkin
Contributor Author

Important changes:

  • The nncf/api/composite_compression.py file was moved to nncf/common/composite_compression.py.
  • The statistics() method was removed from CompressionLoss class.
  • The NNCFStatistics class was introduced. All controllers should return an instance of this class from the statistics() method; see the usage sketch below.
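
A rough usage sketch based on the examples in this thread. The as_str() method, the magnitude_sparsity attribute and the model_statistics.sparsity_level field follow the snippets quoted above; the assumption that the attribute evaluates to None when the algorithm is not applied is mine and is not confirmed by this PR:

def log_compression_statistics(compression_ctrl, logger):
    statistics = compression_ctrl.statistics()  # NNCFStatistics instance
    logger.info(statistics.as_str())            # human-readable report

    # Programmatic access to a particular algorithm's statistics.
    sparsity_stats = statistics.magnitude_sparsity
    if sparsity_stats is not None:  # assumed: None if the algorithm is not applied
        return sparsity_stats.model_statistics.sparsity_level
    return None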

@vshampor
Contributor

Jenkins please retry a build

Contributor

@asenina left a comment

👍

@andrey-churkin
Contributor Author

Jenkins please retry a build

@andrey-churkin
Contributor Author

@vshampor Could you please look at this PR?

@vshampor merged commit 018d5dc into openvinotoolkit:develop May 24, 2021
@l-bat mentioned this pull request May 25, 2021
kshpv pushed a commit to kshpv/nncf that referenced this pull request Oct 11, 2022
* Refactor NNCF compression statistics reporting API

* Test was updated

* Fix pylint

* Refactoring

* Minor fixes

* Minor fixes

* Minor updates

* Minor updates

* Fixed mean_sparse_prob calculation

* Refactoring

* Typo was fixed

* Test was updated

* Test was updated

* Fix pylint