Refactor NNCF compression statistics reporting API #688
Conversation
nncf/api/compression.py
""" | ||
Returns a representation of the statistics as built-in data types. | ||
|
||
:return: A representation of the statistics as built-in data types. |
Should add a note on what the string keys in the dict represent.
The `as_dict()` method was removed during refactoring, so this comment no longer applies.
nncf/common/pruning/statistics.py
    'mask_pruning_level': self.mask_pruning_level,
    'filter_pruning_level': self.filter_pruning_level,
}
return summary
Note that this is essentially a copy-and-paste of the corresponding `self.__dict__`. If you decide to take this approach, then perhaps it would be easier to write a common `Statistics` function that iterates over `self.__dict__`, puts built-in types (int, float and str) into the return dict as-is, puts the result of `.as_dict()` for `Statistics` entries, and ignores the rest (emitting a debug-level warning if anything else is encountered).
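For illustration, the suggested generic traversal could be sketched as follows (the class names and the skip-with-warning behavior are placeholders, not NNCF's actual implementation):

```python
import logging

logger = logging.getLogger(__name__)


class Statistics:
    """Hypothetical base class with a generic as_dict() over self.__dict__."""

    def as_dict(self):
        result = {}
        for name, value in self.__dict__.items():
            if isinstance(value, (int, float, str)):
                result[name] = value  # built-in types go into the dict as-is
            elif isinstance(value, Statistics):
                result[name] = value.as_dict()  # recurse into nested statistics
            else:
                # Anything else is ignored, with a debug message
                logger.debug('Skipping non-serializable attribute: %s', name)
        return result


class PrunedModelStatistics(Statistics):
    """Illustrative subclass; the real statistics classes differ."""

    def __init__(self, mask_pruning_level, filter_pruning_level):
        self.mask_pruning_level = mask_pruning_level
        self.filter_pruning_level = filter_pruning_level
```

With this in place, subclasses would not need to hand-write their own dict construction.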
The `as_dict()` method was removed during refactoring. I implemented a helper function that prepares the statistics for TensorBoard.
examples/classification/main.py
@@ -453,7 +453,7 @@ def train_epoch(train_loader, model, criterion, criterion_fn, optimizer, compres
         config.tb.add_scalar("train/top1", top1.avg, i + global_step)
         config.tb.add_scalar("train/top5", top5.avg, i + global_step)

-        for stat_name, stat_value in compression_ctrl.statistics(quickly_collected_only=True).items():
+        for stat_name, stat_value in compression_ctrl.statistics(quickly_collected_only=True).as_dict().items():
Can `tb.add_scalar` handle nested dicts? Because your `as_dict()` structure is, in general, nested.
`tb.add_scalar` does not support nested dicts.
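If nested statistics still need to reach TensorBoard, one common workaround is to flatten the dict before reporting. A minimal sketch, assuming the statistics dict contains only nested dicts and scalar leaves (the helper name is mine, not part of NNCF):

```python
def flatten_statistics(stats, prefix=''):
    """Flatten a nested statistics dict into {'a/b/c': scalar} pairs."""
    flat = {}
    for name, value in stats.items():
        key = f'{prefix}/{name}' if prefix else name
        if isinstance(value, dict):
            # Recurse into nested dicts, extending the tag with '/'
            flat.update(flatten_statistics(value, key))
        elif isinstance(value, (int, float)):
            # Only scalar leaves can be passed to add_scalar
            flat[key] = value
    return flat
```

Usage would then be e.g. `for tag, value in flatten_statistics(stats_dict).items(): config.tb.add_scalar(tag, value, step)`.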
@@ -221,7 +220,7 @@ def run(config):
                  **validation_kwargs)

     logger.info('evaluation...')
-    print_statistics(compression_ctrl.statistics())
+    logger.info(compression_ctrl.statistics().as_str())
Suggested change:
-logger.info(compression_ctrl.statistics().as_str())
+compression_statistics = compression_ctrl.statistics()
+logger.info(compression_statistics.as_str())

If you accept this change, please update the same code snippets below too.
Done.
})

return raw_pruning_statistics
# TODO(andrey-churkin): Should be calculated
Could you file a ticket for this TODO?
I created a ticket (55672) to track the progress of this task.
sparsity_levels = tf.keras.backend.batch_get_value(sparsity_levels)
weights_percentages = [weights_number / total_weights_number * 100
                       for weights_number in weights_numbers]
weights_percentages = tf.keras.backend.batch_get_value(weights_percentages)
mask_sparsity = list(zip(mask_names, weights_shapes, sparsity_levels, weights_percentages))
raw_sparsity_statistics['sparsity_statistic_by_layer'] = []

# TODO(andrey-churkin): Why we use `mask_name` instead of `layer_name`?
@daniil-lyakhov Could you comment?
It's the name from the magnitude sparsity `raw_statistic` method. I used the `wrapped_layer.name + '_rb_mask'` name because it contains the name of the layer, whereas `mask.name` consists of `weight_attribute + '_mask'` and is thus less user-friendly. So `mask_names` does not actually contain the `mask.name` values, but their user-friendly counterparts.
Seems like a bug, because an `NNCFWrapper` can contain multiple operations with weights.
target_level = self.loss.target_sparsity_rate
# TODO(andrey-churkin): Should be calculated when the distributed mode will be supported
masks_consistency = 1.0
We covered this case in tests, so this statistic does not make sense for the TF backend.
@vshampor Do we really want to report the `masks_consistency` statistic to users?
@vshampor Could you please answer the question above?
I had no idea what mask consistency was before you drew my attention to it, and it is not immediately obvious to me what this parameter means in the context of sparsity or how it impacts the sparsity outputs. I think we can omit reporting it for now.
# TODO(andrey-churkin): Should be calculated when the distributed mode will be supported
masks_consistency = 1.0

# TODO(andrey-churkin): Check that `mean_sparse_prob` is calculated correctly
Could you check it in this PR or file a ticket?
Fixed in ac19dc6.
nncf/api/composite_compression.py
@@ -23,6 +24,38 @@
 ModelType = TypeVar('ModelType')


+class CompositeStatistics(Statistics):
If we look at this in terms of the main usage scenario, which I guess is access to statistics values, we will see that `CompressionAlgorithmController` returns a `Statistics` object, but `CompositeCompressionAlgorithmController` returns a `CompositeStatistics` object. Currently, if a developer wants to get `sparsity_level`, for example, they have to write the following code:

statistics = compression_ctrl.statistics()
if isinstance(statistics, CompositeStatistics):
    items = statistics.child_statistics
else:
    items = [statistics]
for item in items:
    if isinstance(item, (MagnitudeSparsityStatistics, RBSparsityStatistics, ConstSparsityStatistics)):
        sparsity_level = item.model_statistics.sparsity_level

Before these changes it was like this:

statistics = compression_ctrl.statistics()
sparsity_level = statistics['sparsity_level']

I was expecting the following:

statistics = compression_ctrl.statistics()
sparsity_level = statistics.magnitude_sparsity.sparsity_level
# or
magnitude_sparsity_statistics = statistics.get_statistics('magnitude_sparsity')
sparsity_level = magnitude_sparsity_statistics.sparsity_level

Can we make the `statistics` method return a single type of object, a statistics container? @andrey-churkin, @vshampor, could you look once more at the proposal in this PR in terms of UX/DX?
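For illustration, a container along the lines proposed above could be sketched roughly as follows (all class and method names here are hypothetical, not NNCF's actual API):

```python
class NNCFStatistics:
    """Hypothetical single container for per-algorithm statistics objects."""

    def __init__(self):
        self._storage = {}

    def register(self, algorithm_name, stats):
        # Store an algorithm's statistics under a well-known name,
        # e.g. 'magnitude_sparsity'
        self._storage[algorithm_name] = stats

    def __getattr__(self, name):
        # Attribute-style access: statistics.magnitude_sparsity
        try:
            return self._storage[name]
        except KeyError:
            raise AttributeError(name)

    def get_statistics(self, algorithm_name):
        # Dict-style access; returns None if the algorithm is absent
        return self._storage.get(algorithm_name)
```

A composite controller would then register each child's statistics into one container, so callers always receive the same type regardless of how many algorithms are composed.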
Thank you very much for your suggestion. I fixed it in b5398b0.
Important changes:
Jenkins please retry a build
👍
Jenkins please retry a build
@vshampor Could you please look at this PR?
* Refactor NNCF compression statistics reporting API
* Test was updated
* Fix pylint
* Refactoring
* Minor fixes
* Minor fixes
* Minor updates
* Minor updates
* Fixed mean_sparse_prob calculation
* Refactoring
* Typo was fixed
* Test was updated
* Test was updated
* Fix pylint
No description provided.