Filter Pruning scheduler does not work as expected #353

Closed
AlexKoff88 opened this issue Dec 14, 2020 · 3 comments
@AlexKoff88
Contributor

I observed strange behavior when fine-tuning MobileNet v2 with filter pruning. I set "pruning_target" to 0.3 and "pruning_steps" to 15, but the desired ratio was not reached after 15 epochs; it was only achieved after the 25th epoch.

Please treat this with high priority.

Here is the config:

{
    "model": "mobilenet_v2",
    "pretrained": true,
    "batch_size": 512,
    "epochs": 100,
    "input_info": {
        "sample_size": [1, 3, 224, 224]
    },
    "optimizer": {
        "type": "SGD",
        "base_lr": 0.1,
        "weight_decay": 1e-5,
        "schedule_type": "multistep",
        "steps": [20, 40, 60, 80],
        "optimizer_params": {
            "momentum": 0.9,
            "nesterov": true
        }
    },
    "compression": [
        {
            "algorithm": "filter_pruning",
            "pruning_init": 0.1,
            "params": {
                "schedule": "exponential",
                "pruning_target": 0.3,
                "pruning_steps": 15,
                "weight_importance": "geometric_median"
            }
        }
    ]
}
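For context, the expectation behind this config is that the pruning rate ramps from "pruning_init" (0.1) to "pruning_target" (0.3) over the first 15 epochs and then stays constant. The snippet below is only a minimal illustrative sketch of such an exponential ramp, not the actual NNCF scheduler code; the function name and formula are assumptions for illustration.

    # Illustrative sketch (NOT the actual NNCF scheduler): one way an exponential
    # schedule can ramp the pruning rate from pruning_init to pruning_target
    # over pruning_steps epochs and then hold it constant.
    import math

    def exponential_pruning_rate(epoch: int,
                                 pruning_init: float = 0.1,
                                 pruning_target: float = 0.3,
                                 pruning_steps: int = 15) -> float:
        """Pruning rate scheduled for a given epoch (illustrative formula)."""
        if epoch >= pruning_steps:
            return pruning_target
        progress = epoch / pruning_steps
        # Exponential interpolation chosen so that rate(0) == pruning_init
        # and rate(pruning_steps) == pruning_target.
        k = math.log(pruning_target / pruning_init)
        return pruning_init * math.exp(k * progress)

    # With the config above, the rate should reach 0.3 at epoch 15 and stay there.
    for e in (0, 5, 10, 15, 25):
        print(f"epoch {e:2d}: pruning rate {exponential_pruning_rate(e):.3f}")

The reported issue is that the weights only actually reached the 0.3 sparsity level around epoch 25, i.e. well after the schedule had already saturated.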
@mkaglins
Contributor

@AlexKoff88 do you observe this only with the exponential scheduler, or are other schedulers affected as well?

@AlexKoff88
Contributor Author

@mkaglins, I have no idea; you could check whether this is a generic issue or a scheduler-specific one.

@mkaglins
Contributor

Problem analysis summary:

  1. The problem affects only the exponential and exponential_with_bias schedulers.
  2. The problem is caused by the momentum parameter of the optimizer: momentum statistics accumulated in earlier epochs (with smaller pruning rates) are added to the weights in later epochs (with higher pruning rates) and make the pruned weights non-zero again.
  3. After a couple of training steps this effect vanishes and the pruning rate of the weights becomes equal to the pruning rate of the masks.

As a solution, it was decided to apply the pruning masks on every training step, which significantly speeds up the decay of the non-zero momentum elements (see the sketch below). A warning about this effect will also be added to the release notes.
Changes are done in #365
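To make point 2 more concrete, here is a minimal, self-contained PyTorch sketch (illustrative only, not NNCF code): stale SGD momentum revives weights that were just zeroed by a pruning mask, and re-applying the mask after each optimizer step keeps them at zero. The tensor names and the toy loss are assumptions for illustration.

    # Illustrative sketch (not NNCF code): momentum accumulated before a weight is
    # masked pushes it away from zero on later steps; re-applying the mask after
    # every optimizer step prevents this.
    import torch

    torch.manual_seed(0)
    w = torch.nn.Parameter(torch.randn(4))
    opt = torch.optim.SGD([w], lr=0.1, momentum=0.9)
    mask = torch.tensor([1.0, 1.0, 0.0, 0.0])  # "prune" the last two weights

    # A few steps before pruning build up momentum statistics for all weights.
    for _ in range(5):
        opt.zero_grad()
        (w ** 2).sum().backward()
        opt.step()

    # Prune once, as if the scheduler raised the pruning rate at an epoch boundary.
    with torch.no_grad():
        w.mul_(mask)

    # One more step: the masked weights get zero gradient, but the stale momentum
    # buffer still moves them away from zero.
    opt.zero_grad()
    ((w * mask) ** 2).sum().backward()
    opt.step()
    print("without re-masking:", w.data)  # last two entries are non-zero again

    # Re-applying the mask after the step (the approach adopted for the fix)
    # keeps the pruned weights at zero while the momentum decays.
    with torch.no_grad():
        w.mul_(mask)
    print("with re-masking:   ", w.data)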
