Skip to content

Commit

Permalink
[Feature] Support MaskFormer(NeurIPS'2021) in MMSeg 1.x (open-mmlab#2215
Browse files Browse the repository at this point in the history
)

* [Feature] Support MaskFormer(NeurIPS'2021) in MMSeg 1.x

* add mmdet try except logic

* refactor config files

* add readme

* fix config

* update models & logs

* add MMDET installation and fix info

* fix comments

* fix

* fix config norm optimizer setting

* update models & logs & unittest

* add docstring of MaskFormerHead

* wait for mmdet 3.0.0rc4

* replace seg_mask with seg_logits & add docstring for batch_input_shape

* use mmdet3.0.0rc4

* fix readme and modify config comments

* add mmdet installation in pr_stage_test.yml

* update mmcv version in pr_stage_test.yml

* add mmdet in build_cpu of pr_stage_test.yml

* modify mmdet& mmcv installation in merge_stage_test.yml

* fix typo

* update test.yml

* update test.yml
  • Loading branch information
MengzhangLI committed Dec 1, 2022
1 parent 925faea commit 933e4d3
Show file tree
Hide file tree
Showing 16 changed files with 724 additions and 11 deletions.
9 changes: 6 additions & 3 deletions .circleci/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -61,8 +61,9 @@ jobs:
command: |
pip install git+https://github.com/open-mmlab/mmengine.git@main
pip install -U openmim
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
pip install -r requirements/tests.txt -r requirements/optional.txt
- run:
name: Build and install
Expand Down Expand Up @@ -96,18 +97,20 @@ jobs:
command: |
git clone -b main --depth 1 https://github.com/open-mmlab/mmengine.git /home/circleci/mmengine
git clone -b dev-1.x --depth 1 https://github.com/open-mmlab/mmclassification.git /home/circleci/mmclassification
git clone -b dev-3.x --depth 1 https://github.com/open-mmlab/mmdetection.git /home/circleci/mmdetection
- run:
name: Build Docker image
command: |
docker build .circleci/docker -t mmseg:gpu --build-arg PYTORCH=<< parameters.torch >> --build-arg CUDA=<< parameters.cuda >> --build-arg CUDNN=<< parameters.cudnn >>
docker run --gpus all -t -d -v /home/circleci/project:/mmseg -v /home/circleci/mmengine:/mmengine -v /home/circleci/mmclassification:/mmclassification -w /mmseg --name mmseg mmseg:gpu
docker run --gpus all -t -d -v /home/circleci/project:/mmseg -v /home/circleci/mmengine:/mmengine -v /home/circleci/mmclassification:/mmclassification -v /home/circleci/mmdetection:/mmdetection -w /mmseg --name mmseg mmseg:gpu
- run:
name: Install mmseg dependencies
command: |
docker exec mmseg pip install -e /mmengine
docker exec mmseg pip install -U openmim
docker exec mmseg mim install 'mmcv>=2.0.0rc1'
docker exec mmseg mim install 'mmcv>=2.0.0rc3'
docker exec mmseg pip install -e /mmclassification
docker exec mmseg pip install -e /mmdetection
docker exec mmseg pip install -r requirements/tests.txt -r requirements/optional.txt
- run:
name: Build and install
Expand Down
12 changes: 8 additions & 4 deletions .github/workflows/merge_stage_test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -44,8 +44,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
Expand Down Expand Up @@ -92,8 +93,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
Expand Down Expand Up @@ -155,8 +157,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
Expand Down Expand Up @@ -187,8 +190,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
Expand Down
9 changes: 6 additions & 3 deletions .github/workflows/pr_stage_test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -40,8 +40,9 @@ jobs:
run: |
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
Expand Down Expand Up @@ -92,8 +93,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
Expand Down Expand Up @@ -124,8 +126,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
Expand Down
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -139,6 +139,7 @@ Supported methods:
- [x] [Segmenter (ICCV'2021)](configs/segmenter)
- [x] [SegFormer (NeurIPS'2021)](configs/segformer)
- [x] [K-Net (NeurIPS'2021)](configs/knet)
- [x] [MaskFormer (NeurIPS'2021)](configs/maskformer)

Supported datasets:

Expand Down
1 change: 1 addition & 0 deletions README_zh-CN.md
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,7 @@ MMSegmentation 是一个基于 PyTorch 的语义分割开源工具箱。它是 O
- [x] [Segmenter (ICCV'2021)](configs/segmenter)
- [x] [SegFormer (NeurIPS'2021)](configs/segformer)
- [x] [K-Net (NeurIPS'2021)](configs/knet)
- [x] [MaskFormer (NeurIPS'2021)](configs/maskformer)

已支持的数据集:

Expand Down
60 changes: 60 additions & 0 deletions configs/maskformer/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
# MaskFormer

[MaskFormer: Per-Pixel Classification is Not All You Need for Semantic Segmentation](https://arxiv.org/abs/2107.06278)

## Introduction

<!-- [ALGORITHM] -->

<a href="https://github.com/facebookresearch/MaskFormer/">Official Repo</a>

<a href="https://github.com/open-mmlab/mmdetection/blob/dev-3.x/mmdet/models/dense_heads/maskformer_head.py#L21">Code Snippet</a>

## Abstract

<!-- [ABSTRACT] -->

Modern approaches typically formulate semantic segmentation as a per-pixel classification task, while instance-level segmentation is handled with an alternative mask classification. Our key insight: mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks in a unified manner using the exact same model, loss, and training procedure. Following this observation, we propose MaskFormer, a simple mask classification model which predicts a set of binary masks, each associated with a single global class label prediction. Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results. In particular, we observe that MaskFormer outperforms per-pixel classification baselines when the number of classes is large. Our mask classification-based method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.

<!-- [IMAGE] -->

<div align=center>
<img src="https://user-images.githubusercontent.com/24582831/199215459-ea507126-aafe-4823-8eb1-ae6487509d5c.png" width="90%"/>
</div>

```bibtex
@article{cheng2021per,
title={Per-pixel classification is not all you need for semantic segmentation},
author={Cheng, Bowen and Schwing, Alex and Kirillov, Alexander},
journal={Advances in Neural Information Processing Systems},
volume={34},
pages={17864--17875},
year={2021}
}
```

### Usage

- MaskFormer model needs to install [MMDetection](https://github.com/open-mmlab/mmdetection) first.

```shell
pip install "mmdet>=3.0.0rc4"
```

## Results and models

### ADE20K

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| ---------- | --------- | --------- | ------- | -------- | -------------- | ----- | ------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| MaskFormer | R-50-D32 | 512x512 | 160000 | 3.29 | 42.20 | 44.29 | - | [config](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724-cbd39cc1.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724.json) |
| MaskFormer | R-101-D32 | 512x512 | 160000 | 4.12 | 34.90 | 45.11 | - | [config](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512/maskformer_r101-d32_8xb2-160k_ade20k-512x512_20221031_223053-c8e0931d.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512/maskformer_r101-d32_8xb2-160k_ade20k-512x512_20221031_223053.json) |
| MaskFormer | Swin-T | 512x512 | 160000 | 3.73 | 40.53 | 46.69 | - | [config](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512_20221114_232813-03550716.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512_20221114_232813.json) |
| MaskFormer | Swin-S | 512x512 | 160000 | 5.33 | 26.98 | 49.36 | - | [config](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512_20221115_114710-5ab67e58.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512_20221115_114710.json) |

Note:

- All experiments of MaskFormer are implemented with 8 V100 (32G) GPUs with 2 samplers per GPU.
- The results of MaskFormer are relatively not stable. The accuracy (mIoU) of model with `R-101-D32` is from 44.7 to 46.0, and with `Swin-S` is from 49.0 to 49.8.
- The ResNet backbones utilized in MaskFormer models are standard `ResNet` rather than `ResNetV1c`.
- Test time augmentation is not supported in MMSegmentation 1.x version yet, we would add "ms+flip" results as soon as possible.
101 changes: 101 additions & 0 deletions configs/maskformer/maskformer.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
Collections:
- Name: MaskFormer
Metadata:
Training Data:
- Usage
- ADE20K
Paper:
URL: https://arxiv.org/abs/2107.06278
Title: 'MaskFormer: Per-Pixel Classification is Not All You Need for Semantic
Segmentation'
README: configs/maskformer/README.md
Code:
URL: https://github.com/open-mmlab/mmdetection/blob/dev-3.x/mmdet/models/dense_heads/maskformer_head.py#L21
Version: dev-3.x
Converted From:
Code: https://github.com/facebookresearch/MaskFormer/
Models:
- Name: maskformer_r50-d32_8xb2-160k_ade20k-512x512
In Collection: MaskFormer
Metadata:
backbone: R-50-D32
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 23.7
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
Training Memory (GB): 3.29
Results:
- Task: Semantic Segmentation
Dataset: ADE20K
Metrics:
mIoU: 44.29
Config: configs/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724-cbd39cc1.pth
- Name: maskformer_r101-d32_8xb2-160k_ade20k-512x512
In Collection: MaskFormer
Metadata:
backbone: R-101-D32
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 28.65
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
Training Memory (GB): 4.12
Results:
- Task: Semantic Segmentation
Dataset: ADE20K
Metrics:
mIoU: 45.11
Config: configs/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512/maskformer_r101-d32_8xb2-160k_ade20k-512x512_20221031_223053-c8e0931d.pth
- Name: maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512
In Collection: MaskFormer
Metadata:
backbone: Swin-T
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 24.67
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
Training Memory (GB): 3.73
Results:
- Task: Semantic Segmentation
Dataset: ADE20K
Metrics:
mIoU: 46.69
Config: configs/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512_20221114_232813-03550716.pth
- Name: maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512
In Collection: MaskFormer
Metadata:
backbone: Swin-S
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 37.06
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
Training Memory (GB): 5.33
Results:
- Task: Semantic Segmentation
Dataset: ADE20K
Metrics:
mIoU: 49.36
Config: configs/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512_20221115_114710-5ab67e58.pth
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
_base_ = './maskformer_r50-d32_8xb2-160k_ade20k-512x512.py'

model = dict(
backbone=dict(
depth=101,
init_cfg=dict(type='Pretrained',
checkpoint='torchvision://resnet101')))
Loading

0 comments on commit 933e4d3

Please sign in to comment.