Pixel-Level Image Understanding with Semantic Segmentation and Panoptic Segmentation Hengshuang Zhao The Chinese University of Hong Kong May 29, 2019

Page 1: Pixel-Level Image Understanding with Semantic Segmentation ...valser.org/webinar/slide/slides/20190529/2019.5.29 赵恒爽.pdfMay 29, 2019  · Pixel-Level Image Understanding with

Pixel-Level Image Understanding with Semantic Segmentation and Panoptic Segmentation

Hengshuang Zhao

The Chinese University of Hong Kong

May 29, 2019

Part I: Semantic Segmentation

Semantic Segmentation

Original Image Per-Pixel Annotation

person

horse

car

background

Images adapted from PASCAL VOC 2012

Images adapted from ADE20K

Fully Convolutional Network

FCN [Long et al. 2015]

Conditional Random Field

DeepLabV1 [Chen et al. 2015], DPN [Liu et al. 2015], CRF-RNN [Zheng et al. 2015]

Encoder-Decoder

UNet [Ronneberger et al. 2015], DeconvNet [Noh et al. 2015], SegNet [Badrinarayanan et al. 2015], LRR [Ghiasi et al. 2016], RefineNet [Lin et al. 2017], FRRN [Pohlen et al. 2017]

Atrous Convolution / Dilated Convolution

DeepLabV1 [Chen et al. 2015], Dilation [Yu and Koltun 2016]
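
A minimal sketch of the idea, in plain Python and in 1D for brevity (the function names are illustrative and not taken from the cited papers): a dilated convolution leaves gaps of `dilation` between kernel taps, so a 3-tap kernel with dilation 2 covers 5 input positions without any extra weights.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input extent covered by a kernel whose taps are `dilation` apart."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D correlation with `dilation` spacing between taps."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Tap k reads the input at offset k * dilation from the window start.
        out.append(sum(w * signal[start + k * dilation]
                       for k, w in enumerate(kernel)))
    return out


signal = [0, 1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]
dense = dilated_conv1d(signal, kernel, dilation=1)   # each tap 1 apart
atrous = dilated_conv1d(signal, kernel, dilation=2)  # each tap 2 apart
```

With the same three weights, the dilated version compares samples four positions apart instead of two, which is how these networks enlarge the receptive field while keeping feature-map resolution.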

Context Aggregation

Pooling: ParseNet [Liu et al. 2015], PSPNet [Zhao et al. 2017], DeepLabV2 [Chen et al. 2016]

Large Kernel: GCN [Peng et al. 2017]
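
The pooling-based approach can be sketched roughly as follows, assuming a single-channel feature map stored as nested lists. This is a simplification in the spirit of pyramid pooling, not the PSPNet implementation: the bin sizes, the nearest-neighbour upsampling, and all function names are assumptions, and the per-level convolutions are omitted.

```python
import math


def adaptive_avg_pool(feature, bins):
    """Average the map over a bins x bins grid of regions."""
    h, w = len(feature), len(feature[0])
    pooled = []
    for bi in range(bins):
        r0, r1 = (bi * h) // bins, math.ceil((bi + 1) * h / bins)
        row = []
        for bj in range(bins):
            c0, c1 = (bj * w) // bins, math.ceil((bj + 1) * w / bins)
            cells = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Resize a bins x bins map back to h x w by nearest neighbour."""
    bins = len(pooled)
    return [[pooled[(r * bins) // h][(c * bins) // w] for c in range(w)]
            for r in range(h)]


def pyramid_context(feature, bin_sizes=(1, 2)):
    """Return the original map plus one context channel per pyramid level."""
    h, w = len(feature), len(feature[0])
    channels = [feature]
    for bins in bin_sizes:
        channels.append(upsample_nearest(adaptive_avg_pool(feature, bins), h, w))
    return channels


feature = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
ctx = pyramid_context(feature)
```

The 1-bin level gives every pixel the global average (as in global pooling), while the 2-bin level gives each pixel the average of its quadrant.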

Neural Architecture Search

Search for backbone: Auto-DeepLab [Liu et al. 2019]

Search for head: DPC [Chen et al. 2018]

Attention Mechanism

Spatial attention (dot product): Transformer [Vaswani et al. 2017], Non-Local-Net [Wang et al. 2018], OCNet [Yuan et al. 2018], DANet [Fu et al. 2018], CCNet [Huang et al. 2018]

Channel reweighting: SENet [Hu et al. 2018], EncNet [Zhang et al. 2018], DFN [Yu et al. 2018]
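
The two families can be contrasted with a small sketch. It assumes features are given as a list of N position vectors with C channels each. The learned query/key/value projections of the cited attention models are left out, and the gating function in `channel_reweight` is a hypothetical stand-in for a learned gating network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention(features):
    """Each position aggregates all positions, weighted by dot-product similarity."""
    out = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) for key in features]
        weights = softmax(scores)
        out.append([sum(w * f[ch] for w, f in zip(weights, features))
                    for ch in range(len(query))])
    return out


def channel_reweight(features):
    """Scale each channel by a gate computed from its global average.

    Assumption: gate = sigmoid(channel mean); real models learn this mapping.
    """
    n, channels = len(features), len(features[0])
    means = [sum(f[ch] for f in features) / n for ch in range(channels)]
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    return [[f[ch] * gates[ch] for ch in range(channels)] for f in features]


features = [[1.0, 0.0], [0.0, 1.0]]
attended = spatial_attention(features)
reweighted = channel_reweight(features)
```

Spatial attention mixes information across positions, whereas channel reweighting keeps each position in place and only rescales its channels.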

Point-wise Spatial Attention Network (PSANet)

• Conv & Dilated Conv: Fixed grid, information flow restricted inside local regions

• Pooling Operation: Fixed weights at each position, applied in a non-adaptive manner

• Feature Correlation: Relative position information ignored

• Point-wise Spatial Attention:

• Long-range context aggregation for dense prediction

• Bidirectional information propagation

• Self-adaptively learned and location-sensitive masks
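
A rough sketch of what bidirectional propagation could look like, assuming N flattened positions and two N x N attention masks that are already given. The 1/N normalisation and the mask indexing are assumptions about the PSANet formulation; the function names are illustrative. In the collect direction position i decides how much to gather from each j, while in the distribute direction each j decides how much it sends to i.

```python
def collect(features, mask):
    """z_i = (1/N) * sum_j mask[i][j] * x_j (position i gathers from every j)."""
    n, channels = len(features), len(features[0])
    return [[sum(mask[i][j] * features[j][ch] for j in range(n)) / n
             for ch in range(channels)]
            for i in range(n)]


def distribute(features, mask):
    """z_i = (1/N) * sum_j mask[j][i] * x_j (every j sends to position i)."""
    n, channels = len(features), len(features[0])
    return [[sum(mask[j][i] * features[j][ch] for j in range(n)) / n
             for ch in range(channels)]
            for i in range(n)]


features = [[1.0], [2.0], [4.0]]            # three positions, one channel
uniform = [[1.0] * 3 for _ in range(3)]     # everyone gathers equally
only_first_sends = [[3.0, 3.0, 3.0],        # position 0 broadcasts itself
                    [0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]]

collected = collect(features, uniform)
distributed = distribute(features, only_first_sends)
```

The only difference between the two branches is which index of the mask belongs to the receiving position, which is what makes the propagation bidirectional.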

Point-wise Spatial Attention Network

Point-wise Spatial Attention Network

Information collection branch

Information distribution branch

Over-completed Compact

Point-wise Spatial Attention Network

Information collection branch

Information distribution branch

Over-completed Compact

Feature fusion: local & global

Attention Mask Generation
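
One way to read the "over-completed" and "compact" labels, sketched under assumptions: for an H x W feature map, each position predicts a (2H-1) x (2W-1) map of weights indexed by relative offset, with the centre standing for zero offset. The H x W window that falls inside the image is then cropped out as the compact mask for that position. The function name and layout below are hypothetical.

```python
def compact_mask(over_complete, row, col, height, width):
    """Crop the H x W window of an over-completed map valid for (row, col).

    over_complete[height - 1 + dr][width - 1 + dc] holds the weight for the
    position at relative offset (dr, dc) from (row, col).
    """
    return [[over_complete[height - 1 + r - row][width - 1 + c - col]
             for c in range(width)]
            for r in range(height)]


H, W = 2, 3
# Demo map that stores each cell's own relative offset, to make the crop visible.
offsets = [[(a - (H - 1), b - (W - 1)) for b in range(2 * W - 1)]
           for a in range(2 * H - 1)]

mask_for_corner = compact_mask(offsets, 1, 2, H, W)  # bottom-right position
mask_for_origin = compact_mask(offsets, 0, 0, H, W)  # top-left position
```

Because the weights are indexed by relative offset, the same predictor can serve every position, and the crop makes the resulting mask location-sensitive.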

Incorporation with FCN

Result on ADE20K and VOC 2012

ADE20K: information aggregation approaches

ADE20K: result on val set

PASCAL VOC 2012: result on val set

PASCAL VOC 2012: result on val set

Result on Cityscapes

result on val set

result on test set (trained with fine set)

result on test set (trained with fine + coarse sets)

Visual Prediction on ADE20K

Visual Prediction on VOC 2012

Visual Prediction on Cityscapes

Mask Visualization

Part II: Panoptic Segmentation

Semantic Segmentation

semantic segmentation: instances indistinguishable

Instance Segmentation

instance segmentation: stuff regions left unsolved

Panoptic Segmentation

panoptic segmentation: both stuff and things are solved, instances distinguishable
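
To make the difference between the three outputs concrete, here is a toy sketch. The class names, the 2 x 4 label maps, and the convention that instance id 0 means "no instance" are all made up for illustration.

```python
THING_CLASSES = {"person"}

# Semantic segmentation: one class per pixel, two people merged into one region.
semantic = [["sky", "sky", "sky", "sky"],
            ["person", "person", "person", "road"]]

# Instance segmentation: ids only for things; stuff pixels stay unlabeled (0).
instance_ids = [[0, 0, 0, 0],
                [1, 1, 2, 0]]


def to_panoptic(semantic, instance_ids, thing_classes):
    """Give every pixel a (class, instance id) pair; stuff always gets id 0."""
    return [[(cls, inst if cls in thing_classes else 0)
             for cls, inst in zip(sem_row, id_row)]
            for sem_row, id_row in zip(semantic, instance_ids)]


def distinct_segments(label_map):
    """Count the distinct labels that appear in a 2D label map."""
    return len({label for row in label_map for label in row})


panoptic = to_panoptic(semantic, instance_ids, THING_CLASSES)
n_semantic = distinct_segments(semantic)   # sky, person, road
n_panoptic = distinct_segments(panoptic)   # sky, person #1, person #2, road
```

The semantic map cannot tell the two people apart and the instance map says nothing about sky or road, while the panoptic map covers every pixel and separates the instances.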

Heuristic Combination

Mask R-CNN [He et al. 2017]

PSPNet [Zhao et al. 2017]

Instance

Semantic

redundant computation for independent models

Heuristic Combination

Mask R-CNN [He et al. 2017]

PSPNet [Zhao et al. 2017]

Instance

Semantic

Heuristic Merge

heuristic merge logic is not end-to-end trainable
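
The slides do not spell out the merge rule, so the following is only one plausible heuristic, with hypothetical names and thresholds: paste instance masks in order of decreasing confidence, drop an instance if most of it is already covered, then fill the remaining pixels with stuff classes from the semantic prediction.

```python
def heuristic_merge(instances, semantic, stuff_classes, min_keep=0.5):
    """Merge instance masks and a semantic map into (class, id) labels.

    instances: list of dicts with "cls", "score" and "mask" (list of (r, c)).
    Pixels that neither branch can explain stay None (void).
    """
    h, w = len(semantic), len(semantic[0])
    panoptic = [[None] * w for _ in range(h)]
    next_id = 1
    for inst in sorted(instances, key=lambda d: d["score"], reverse=True):
        if not inst["mask"]:
            continue
        free = [(r, c) for r, c in inst["mask"] if panoptic[r][c] is None]
        # Hand-tuned rule: skip instances mostly hidden by better-scored ones.
        if len(free) < min_keep * len(inst["mask"]):
            continue
        for r, c in free:
            panoptic[r][c] = (inst["cls"], next_id)
        next_id += 1
    for r in range(h):
        for c in range(w):
            if panoptic[r][c] is None and semantic[r][c] in stuff_classes:
                panoptic[r][c] = (semantic[r][c], 0)
    return panoptic


semantic = [["road", "person", "person", "road"]]
instances = [
    {"cls": "person", "score": 0.9, "mask": [(0, 1), (0, 2)]},
    {"cls": "person", "score": 0.6, "mask": [(0, 1), (0, 2), (0, 3)]},
]
merged = heuristic_merge(instances, semantic, stuff_classes={"road"})
```

The sorting, the overlap threshold, and the fill order are all fixed by hand and involve hard assignments, which is why a merge like this cannot be trained end to end.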

heuristic combination

our end-to-end output

Unified Panoptic Segmentation Network (UPSNet)

Unified Backbone Network: Save Computation!

Pixel-wise Classification: Consistent Estimation!

Semantic & Instance Head

Semantic Head: FPN with Deformable Conv

Instance Head: Same as Mask R-CNN

Panoptic Head

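[Figure: panoptic head. Mask logits Y_i from the instance head are resized/padded and combined with the thing logits X_thing from the semantic head to give X_mask_i (N_inst channels, H x W). Stuff logits X_stuff from the semantic head give N_stuff channels, H x W. The panoptic logits stack X_stuff and X_mask, plus 1 channel of logits for Unknown computed from two max operations.]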
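
A hedged sketch of how such a parameter-free head could assemble its output, assuming the per-instance logits X_mask are already resized and combined. The rule used for the unknown channel, max over X_thing minus max over X_mask, is an assumption drawn from the two "max" labels in the figure; the function names are illustrative.

```python
def panoptic_logits(x_stuff, x_mask, x_thing):
    """Stack stuff channels, instance channels and one unknown channel.

    Each argument is a list of H x W maps (nested lists of floats).
    """
    h, w = len(x_stuff[0]), len(x_stuff[0][0])
    unknown = [[max(ch[r][c] for ch in x_thing) - max(ch[r][c] for ch in x_mask)
                for c in range(w)]
               for r in range(h)]
    return x_stuff + x_mask + [unknown]


def argmax_channels(logits):
    """Per-pixel index of the channel with the largest logit."""
    h, w = len(logits[0]), len(logits[0][0])
    return [[max(range(len(logits)), key=lambda k: logits[k][r][c])
             for c in range(w)]
            for r in range(h)]


x_stuff = [[[2.0, 0.0, 0.0]]]    # N_stuff = 1, H x W = 1 x 3
x_mask = [[[0.0, 3.0, -1.0]]]    # N_inst = 1
x_thing = [[[0.0, 3.0, 2.5]]]    # one thing class from the semantic head

logits = panoptic_logits(x_stuff, x_mask, x_thing)
prediction = argmax_channels(logits)
```

In this toy input the third pixel looks like a thing to the semantic head but is claimed by no instance, so the unknown channel wins there. Channel 0 is the stuff class, channel 1 the instance, and channel 2 unknown.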

Performance Comparison

[Figure: bar charts comparing UPSNet and MR-CNN-PSP. Panels: Results on COCO (800 x 1300); Results on Cityscapes (1024 x 2048).]

Detailed Result

result on COCO

result on Cityscapes

result on internal data

run time comparison

Visual Prediction

result on COCO

result on Cityscapes

Code Resource

I. Semantic Segmentation:

• Caffe:
  • https://github.com/hszhao/PSPNet
  • https://github.com/hszhao/PSANet
  • https://github.com/hszhao/ICNet

• PyTorch:
  • https://github.com/hszhao/semseg (new)
  • highly optimized codebase with better reimplementation results

II. Panoptic Segmentation:

• PyTorch:
  • https://github.com/uber-research/UPSNet
  • the first open-sourced codebase for unified end-to-end panoptic segmentation

Remaining Problems

I. Semantic Segmentation:

• imbalanced classes: long-tail distribution

• confusing classes: using the human confusion matrix (e.g., on ADE20K) as a prior

• data augmentation: adaptive augmentation or auto augmentation

• hard mining: effective but not elegant

• robustness and generalization: one model for different datasets

• accuracy and efficiency: can both be achieved?

II. Panoptic Segmentation:

• introduce parameters into the panoptic head (e.g., 3D Conv)

• new frameworks with a single panoptic head
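
For the hard mining item in the semantic segmentation list above, a minimal sketch of one common form, assuming per-pixel losses are already computed: average only the hardest fraction of pixels. The function name and the `keep_ratio` parameter are hypothetical, not taken from the talk.

```python
def hard_pixel_loss(pixel_losses, keep_ratio=0.25):
    """Mean loss over the `keep_ratio` fraction of pixels with the highest loss."""
    k = max(1, int(len(pixel_losses) * keep_ratio))
    hardest = sorted(pixel_losses, reverse=True)[:k]
    return sum(hardest) / k


losses = [0.1, 0.2, 2.0, 0.05, 1.0, 0.3, 0.1, 0.25]
mined = hard_pixel_loss(losses, keep_ratio=0.25)   # averages the top 2 of 8
plain = sum(losses) / len(losses)                  # ordinary mean, for contrast
```

The hand-set keep ratio is the kind of tuning knob that makes this effective in practice yet hard to call elegant.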

Thanks!