
Issue trying to use MoBY SSL pretrained model with Swin-T backbone #181

Giles-Billenness opened this issue Mar 11, 2022 · 4 comments

Giles-Billenness commented Mar 11, 2022

When trying to load the checkpoint produced by SSL pretraining with MoBY (only 1 epoch, just to test), I get the error below after using the --pretrained flag pointed at the checkpoint (I've tried both ckpt_epoch_0.pth and checkpoint.pth). I am trying to transfer the self-supervised weights, which use the same backbone, to this architecture for fine-tuning.

```
[2022-03-11 20:31:06 swin_tiny_patch4_window7_224](utils.py 47): INFO ==============> Loading weight **********/MOBY SSL SWIN/moby__swin_tiny__patch4_window7_224__odpr02_tdpr0_cm099_ct02_queue4096_proj2_pred2/default/ckpt_epoch_0.pth for fine-tuning......
Traceback (most recent call last):
  File "/content/Swin-Transformer/main.py", line 357, in <module>
    main(config)
  File "/content/Swin-Transformer/main.py", line 131, in main
    load_pretrained(config, model_without_ddp, logger)
  File "/content/Swin-Transformer/utils.py", line 70, in load_pretrained
    relative_position_bias_table_current = model.state_dict()[k]
KeyError: 'encoder.layers.0.blocks.0.attn.relative_position_bias_table'
Killing subprocess 3303
```
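
A quick way to see what is going on is to inspect the checkpoint keys directly; a minimal sketch (using the ckpt_epoch_0.pth file from the log above):

```python
import torch

# Load the MoBY checkpoint on CPU and look at the raw key names.
checkpoint = torch.load('ckpt_epoch_0.pth', map_location='cpu')
state_dict = checkpoint['model']

# The backbone weights sit under an 'encoder.' prefix, e.g.
# 'encoder.layers.0.blocks.0.attn.relative_position_bias_table', whereas the
# classification model in this repo expects the keys without that prefix,
# hence the KeyError in load_pretrained.
print([k for k in state_dict if 'relative_position_bias_table' in k][:3])
```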

Giles-Billenness commented Mar 14, 2022

I was able to (more or less) load the checkpoint by changing the loading code in `utils.py`:

1. Adding the two marked lines near the top of `load_pretrained`, so that the `encoder.` prefix MoBY puts on the backbone weights is stripped before loading:

```python
def load_pretrained(config, model, logger):
    logger.info(f"==============> Loading weight {config.MODEL.PRETRAINED} for fine-tuning......")
    checkpoint = torch.load(config.MODEL.PRETRAINED, map_location='cpu')
    state_dict = checkpoint['model']

    # added: MoBY checkpoints store the backbone under an 'encoder.' prefix;
    # strip it and keep only the encoder weights
    if sorted(list(state_dict.keys()))[0].startswith('encoder'):
        state_dict = {k.replace('encoder.', ''): v for k, v in state_dict.items()
                      if k.startswith('encoder.')}
```
2. Commenting out the classifier-head remapping code (the MoBY checkpoint has no `head.weight`/`head.bias`), so the head is always re-initialized to zero using the lines from the original `else` branch:

```python
    # check classifier, if not match, then re-init classifier to zero
    # head_bias_pretrained = state_dict['head.bias']
    # Nc1 = head_bias_pretrained.shape[0]
    # Nc2 = model.head.bias.shape[0]
    # if (Nc1 != Nc2):
    #     if Nc1 == 21841 and Nc2 == 1000:
    #         logger.info("loading ImageNet-22K weight to ImageNet-1K ......")
    #         map22kto1k_path = f'data/map22kto1k.txt'
    #         with open(map22kto1k_path) as f:
    #             map22kto1k = f.readlines()
    #         map22kto1k = [int(id22k.strip()) for id22k in map22kto1k]
    #         state_dict['head.weight'] = state_dict['head.weight'][map22kto1k, :]
    #         state_dict['head.bias'] = state_dict['head.bias'][map22kto1k]
    #     else:
    # always re-init the head; the SSL checkpoint carries no classifier weights
    torch.nn.init.constant_(model.head.bias, 0.)
    torch.nn.init.constant_(model.head.weight, 0.)
    # del state_dict['head.weight']
    # del state_dict['head.bias']
    logger.warning(f"Error in loading classifier head, re-init classifier head to 0")

    msg = model.load_state_dict(state_dict, strict=False)
    logger.warning(msg)

    logger.info(f"=> loaded successfully '{config.MODEL.PRETRAINED}'")

    del checkpoint
    torch.cuda.empty_cache()
```
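
As an alternative to patching `utils.py`, the prefix stripping could also be done once offline and `--pretrained` pointed at the converted file. A minimal sketch (the output filename is just an example; the step-2 change to the head check is still needed because the MoBY checkpoint has no `head.weight`/`head.bias`):

```python
import torch

def convert_moby_checkpoint(src_path, dst_path):
    """Strip the 'encoder.' prefix from a MoBY checkpoint so the keys match
    the plain SwinTransformer backbone used by this repo."""
    checkpoint = torch.load(src_path, map_location='cpu')
    state_dict = checkpoint['model']
    # Keep only the online encoder weights and drop the prefix; any other
    # MoBY-specific keys (momentum encoder, projection/prediction heads,
    # if present) are discarded.
    backbone = {k[len('encoder.'):]: v for k, v in state_dict.items()
                if k.startswith('encoder.')}
    torch.save({'model': backbone}, dst_path)

convert_moby_checkpoint('ckpt_epoch_0.pth', 'moby_swin_tiny_backbone.pth')
```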

After these changes it loads, but some keys are missing; see the log below:

```
_IncompatibleKeys(missing_keys=['layers.0.blocks.0.attn.relative_position_index', 'layers.0.blocks.1.attn_mask', 'layers.0.blocks.1.attn.relative_position_index', 
                                'layers.1.blocks.0.attn.relative_position_index', 'layers.1.blocks.1.attn_mask', 'layers.1.blocks.1.attn.relative_position_index', 
                                'layers.2.blocks.0.attn.relative_position_index', 'layers.2.blocks.1.attn_mask', 'layers.2.blocks.1.attn.relative_position_index', 
                                'layers.2.blocks.2.attn.relative_position_index', 'layers.2.blocks.3.attn_mask', 'layers.2.blocks.3.attn.relative_position_index', 
                                'layers.2.blocks.4.attn.relative_position_index', 'layers.2.blocks.5.attn_mask', 'layers.2.blocks.5.attn.relative_position_index', 
                                'layers.3.blocks.0.attn.relative_position_index', 'layers.3.blocks.1.attn.relative_position_index', 'head.weight', 'head.bias'], unexpected_keys=[])
```

Giles-Billenness (Author) commented

I also found that, after these changes, loading the pre-trained models from this repo (such as the ImageNet checkpoint swin_tiny_patch4_window7_224.pth) produces a similar log:

```
_IncompatibleKeys(missing_keys=['layers.0.blocks.0.attn.relative_position_index', 'layers.0.blocks.1.attn_mask', 'layers.0.blocks.1.attn.relative_position_index', 
                                'layers.1.blocks.0.attn.relative_position_index', 'layers.1.blocks.1.attn_mask', 'layers.1.blocks.1.attn.relative_position_index', 
                                'layers.2.blocks.0.attn.relative_position_index', 'layers.2.blocks.1.attn_mask', 'layers.2.blocks.1.attn.relative_position_index', 
                                'layers.2.blocks.2.attn.relative_position_index', 'layers.2.blocks.3.attn_mask', 'layers.2.blocks.3.attn.relative_position_index', 
                                'layers.2.blocks.4.attn.relative_position_index', 'layers.2.blocks.5.attn_mask', 'layers.2.blocks.5.attn.relative_position_index', 
                                'layers.3.blocks.0.attn.relative_position_index', 'layers.3.blocks.1.attn.relative_position_index'], unexpected_keys=[])
```

Giles-Billenness (Author) commented

Ah, these keys are all re-initialized when the model is constructed anyway (relative_position_index and attn_mask are buffers that the Swin blocks recompute in __init__), so it doesn't matter. I might open a PR to allow the use of MoBY pre-trained models.
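
For anyone who wants to double-check that, a small sketch: only missing keys that are actual parameters matter (here just the classifier head, which is re-initialized anyway). `msg` below is the `_IncompatibleKeys` value returned by `load_state_dict` in the modified `load_pretrained`:

```python
# msg = model.load_state_dict(state_dict, strict=False)  # as in load_pretrained above
buffer_names = {name for name, _ in model.named_buffers()}
param_names = {name for name, _ in model.named_parameters()}

recomputed = [k for k in msg.missing_keys if k in buffer_names]  # safe to ignore
learnable = [k for k in msg.missing_keys if k in param_names]    # e.g. head.weight / head.bias

print(f"missing buffers (rebuilt in __init__): {len(recomputed)}")
print(f"missing parameters (re-initialized):   {learnable}")
```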

kavin-du commented

Same issue when using MoBY pretrained models.
