matteo ruggero ronchipresentations.cocodataset.org/coco17-keypoints-overview.pdf · matteo ruggero...

Matteo Ruggero Ronchi

COCO and Places Visual Recognition Challenges WorkshopSunday, October 29th, Venice, Italy

2017 Keypoints Challenge

Dataset COCO

/ 20

Multiple Perspectives, Instances, Sizes, Occlusions:

3

COCO Keypoints Dataset (I)

• 17 types of keypoints.• 58,945 images.• 156,165 annotated people.• 1,710,498 total keypoints.

Overall Statistics (train/val):

/ 204

COCO Keypoints Dataset (II)

• Avg of ~2 annotated people per image.• Up to 13 annotated people per image.

Multi-Instance Dataset:

Number of annotations

Num

ber o

f im

ages

/ 205

COCO Keypoints Dataset (III)

Distribution of the number of keypoints:

0

10,000

20,000

30,000

40,000

50,000

60,000

70,000

1 2-5 6-10 11-15 16-17Number of keypoints

Num

ber o

f ins

tanc

es

/ 206

Evaluating Keypoint Predictions

Bounding Box IoU Mask IoUObject

Keypoint Similarity

How to measure localization accuracy:

/ 207

Keypoints Evaluation Metric

Object Keypoint Similarity (OKS):

/ 208

COCO Keypoints Task

Simultaneous detection and keypoint estimation:OKS = 0.5 OKS = 0.95

prec

isio

n

0.2

0.4

0.6

0.8

1

1

recall0.2 0.4 0.6 0.8 1

recall0.2 0.4 0.6 0.8

OKS .5

Ground Truth

OKS .95

/ 209

0%

10%

20%

30%

40%

50%

60%

70%

80%

Megvii

Oks*+

Bangb

angre

n+

G-RMI*+

FAIR*

SJTU+

Samsu

ng-pos

e*

CMU-Pos

e*

METU*

Jessi

e333

21*

Gnxr9*

Lwhl*

2017 Keypoints Challenge Leaderboard (I)

COCO AP (average over all OKS)

* Single model method

+11.6% absolute~20% relative

5 teams in ~3% AP

72.1 71.4 70.6 69.1 68.9 68.8 63.6 60.5

+ Used external keypoints training dataset

/ 2010

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

Megvii Oks

Bangb

angre

nG-R

MIFA

IRSJT

U

Samsu

ng-pos

eMETU

Jessi

e333

21Gnx

r9Lw

hl

AP 50 AP 75

2017 Keypoints Challenge Leaderboard (II)

Better performance at looser localization thresholds:

~20% above COCO AP

/ 2011

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

Megvii Oks

Bangb

angre

nG-R

MIFA

IRSJT

U

Samsu

ng-pos

eMETU

Jessi

e333

21Gnx

r9Lw

hl

Large Instances Medium Instances

2017 Keypoints Challenge Leaderboard (III)

Instance scale is an important factor:

/ 20

50%

60%

70%

80%

90%

Megvii Oks Bangbangren G-RMIFAIR SJTU Samsung-pose

12

Performance Breakdown over Keypoints

COCO AP varies across keypointsall

face

uppe

r

torso

lower

limbs

76.3% avg COCO AP~8% spread

67.3% avg COCO AP~11% spread

/ 2013

A Closer Look at Errors

[1] Ronchi et al, “Benchmarking and Error Diagnosis in Multi-Instance Pose Estimation”, ICCV17 www.github.com/matteorr/coco-analyze

http://www.github.com/matteorr/coco-analyze

/ 2014

A Closer Look at Errors (I)

Taxonomy of Errors for Multi-Instance Pose Estimation:

JITTER

SCORING

INVERSION SWAP MISS



/ 2015

A Closer Look at Errors (II)

Fine-grained Precision-Recall Curves


0 10

Recall

Prec

isio

n

1

@OKS .85: (AP) .651

Inversion: .858.794

Jitter:

Swap:Miss:

.953

.769

False Pos.:False Neg.:

.9801.00

Opt. Score: .972

Team Megvii

0 10

Recall

Prec

isio

n

1

@OKS .85: (AP) .578

Inversion: .857.767

Jitter:

Swap:Miss:

.953

.728

False Pos.:False Neg.:

.9701.00

Opt. Score: .970

Team FAIR


/ 20

JitterG-RMI (10.6%)

7%9%

20%

12% 13%13%

8%10%

7%

MissG-RMI (4.4%)

14%15%

13%20%

15%

9%6%

5%3%

16

Localization Errors

Best performance for each type of Localization Error

SwapJessie33321 (0.9%)

11%11%

14%17% 19%

20%

3%3%

2%

4%1%3%

11%

81%

GoodG-RMI (81.1%)

Good

Jitter

Inversion Swap

Miss

Nose

Eyes

Ears

Shoulders

Elbows

Wrists

Hips

Knees

Ankles

InversionBangbangren (2.4%)

18%

20%

27%9%5%

17%

1%3%

/ 2017

Occlusion and Crowding Benchmarks (I)

COCO Benchmarks of image complexity:

N. of Keypoints

N. o

f Ove

rlaps

2 [1, 5] 2 [6, 10] 2 [11, 15] 2 [16, 17]

0

2[1,2]

� 3

N. of Keypoints

9098 15968 29165 12246

4876 11735 16384 4636

243 644 780 193

N. o

f Ove

rlaps

2[1,2]

0

�3

2 [16, 17]2 [11, 15]2 [1, 5] 2 [6, 10]

• Occlusion: number of visible keypoints

• Crowding: number of overlapping instances (IoU > 0.1)

/ 2018

Occlusion and Crowding Benchmarks (II)

Overall challenge performance is saturated by the easiest benchmarks:

Team Megvii

/ 2019

Summary of Findings

• About 20% relative AP improvement over last year’s challenge.• Very small performance gap between top entries.• Single model performance is on par with ensembles.• Single performance metrics do not capture the complex causes of

diverse errors.• We need to broaden current benchmarks with challenging images

(high occlusion / low number of keypoints).

2017 Keypoint Challenge Take-aways:

/ 2020

2017 COCO Keypoints Challenge

Megvii 1st

Oks 2nd

Bangbangren 3rd

G-RMI 4th

FAIR 5th

SJTU 6th

Samsung-pose 7th

Invited Speakers:• Team Megvii / (10:50am - 11:05am)

• Team Oks / (11:05am - 11:20am)

Team Position

matteo ruggero ronchipresentations.cocodataset.org/coco17-keypoints-overview.pdf · matteo ruggero...

Documents