Mask2Former (CVPR'2022)
@inproceedings{cheng2021mask2former,
  title={Masked-attention Mask Transformer for Universal Image Segmentation},
  author={Bowen Cheng and Ishan Misra and Alexander G. Schwing and Alexander Kirillov and Rohit Girdhar},
  booktitle={CVPR},
  year={2022}
}
Segmentor | Pretrain | Backbone | Crop Size | Schedule | Train/Eval Set | mIoU | Download |
---|---|---|---|---|---|---|---|
Mask2Former | ImageNet-1k-224x224 | Swin-T | 512x512 | LR/POLICY/BS/EPOCH: 0.0001/poly/16/130 | train/val | 49.20% | cfg \| model \| log |
Mask2Former | ImageNet-1k-224x224 | Swin-S | 512x512 | LR/POLICY/BS/EPOCH: 0.0001/poly/16/130 | train/val | 51.37% | cfg \| model \| log |
Mask2Former | ImageNet-22k-384x384 | Swin-B | 640x640 | LR/POLICY/BS/EPOCH: 0.0001/poly/16/130 | train/val | 54.13% | cfg \| model \| log |
Mask2Former | ImageNet-22k-384x384 | Swin-L | 640x640 | LR/POLICY/BS/EPOCH: 0.0001/poly/16/130 | train/val | 56.30% | cfg \| model \| log |

Segmentor | Pretrain | Backbone | Crop Size | Schedule | Train/Eval Set | mIoU | Download |
---|---|---|---|---|---|---|---|
Mask2Former | ImageNet-1k-224x224 | Swin-T | 512x1024 | LR/POLICY/BS/EPOCH: 0.0001/poly/16/500 | train/val | 82.10% | cfg \| model \| log |
Mask2Former | ImageNet-1k-224x224 | Swin-S | 512x1024 | LR/POLICY/BS/EPOCH: 0.0001/poly/16/500 | train/val | 82.65% | cfg \| model \| log |
Mask2Former | ImageNet-22k-384x384 | Swin-B | 512x1024 | LR/POLICY/BS/EPOCH: 0.0001/poly/16/500 | train/val | 83.62% | cfg \| model \| log |
Mask2Former | ImageNet-22k-384x384 | Swin-L | 512x1024 | LR/POLICY/BS/EPOCH: 0.0001/poly/16/500 | train/val | 83.79% | cfg \| model \| log |
You can also download the model weights from the following sources:
- BaiduNetdisk: https://pan.baidu.com/s/1gD-NJJWOtaHCtB0qHE79rA with access code s757
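
The Schedule column in the tables packs four values into one string: learning rate, LR policy, batch size, and epoch count. As a reading aid, here is a minimal sketch of how such a cell could be unpacked. The helper names `parse_schedule` and `poly_lr` are hypothetical and not part of this repository. The `poly_lr` function shows the generic polynomial decay formula; the `power=0.9` default and per-step granularity are assumptions, so check the linked cfg files for the values actually used.

```python
from typing import Dict, Union


def parse_schedule(cell: str) -> Dict[str, Union[float, int, str]]:
    """Split a Schedule cell such as 'LR/POLICY/BS/EPOCH: 0.0001/poly/16/130'."""
    keys_part, values_part = cell.split(":")
    keys = [k.strip().lower() for k in keys_part.split("/")]
    values = [v.strip() for v in values_part.split("/")]
    if len(keys) != len(values):
        raise ValueError(f"malformed schedule cell: {cell!r}")
    raw = dict(zip(keys, values))
    return {
        "lr": float(raw["lr"]),
        "policy": raw["policy"],
        "batch_size": int(raw["bs"]),
        "epochs": int(raw["epoch"]),
    }


def poly_lr(base_lr: float, step: int, total_steps: int, power: float = 0.9) -> float:
    """Generic polynomial decay. power=0.9 is an assumed default, not read from the configs."""
    return base_lr * (1.0 - step / total_steps) ** power


if __name__ == "__main__":
    schedule = parse_schedule("LR/POLICY/BS/EPOCH: 0.0001/poly/16/130")
    print(schedule)
    # Learning rate at the start of training under the assumed decay
    print(poly_lr(schedule["lr"], step=0, total_steps=schedule["epochs"]))
```

Under this formula the learning rate starts at the base value and decays to zero at the final step; whether the repository steps per iteration or per epoch is not stated here.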