matteo ruggero ronchipresentations.cocodataset.org/coco17-keypoints-overview.pdf · matteo ruggero...
TRANSCRIPT
Matteo Ruggero Ronchi
COCO and Places Visual Recognition Challenges WorkshopSunday, October 29th, Venice, Italy
2017 Keypoints Challenge
Dataset COCO
/ 20
Multiple Perspectives, Instances, Sizes, Occlusions:
3
COCO Keypoints Dataset (I)
• 17 types of keypoints.• 58,945 images.• 156,165 annotated people.• 1,710,498 total keypoints.
Overall Statistics (train/val):
/ 204
COCO Keypoints Dataset (II)
• Avg of ~2 annotated people per image.• Up to 13 annotated people per image.
Multi-Instance Dataset:
Number of annotations
Num
ber o
f im
ages
/ 205
COCO Keypoints Dataset (III)
Distribution of the number of keypoints:
0
10,000
20,000
30,000
40,000
50,000
60,000
70,000
1 2-5 6-10 11-15 16-17Number of keypoints
Num
ber o
f ins
tanc
es
/ 206
Evaluating Keypoint Predictions
Bounding Box IoU Mask IoUObject
Keypoint Similarity
How to measure localization accuracy:
/ 207
Keypoints Evaluation Metric
Object Keypoint Similarity (OKS):
/ 208
COCO Keypoints Task
Simultaneous detection and keypoint estimation:OKS = 0.5 OKS = 0.95
prec
isio
n
0.2
0.4
0.6
0.8
1
1
recall0.2 0.4 0.6 0.8 1
recall0.2 0.4 0.6 0.8
OKS .5
Ground Truth
OKS .95
/ 209
0%
10%
20%
30%
40%
50%
60%
70%
80%
Megvii
Oks*+
Bangb
angre
n+
G-RMI*+
FAIR*
SJTU+
Samsu
ng-pos
e*
CMU-Pos
e*
METU*
Jessi
e333
21*
Gnxr9*
Lwhl*
2017 Keypoints Challenge Leaderboard (I)
COCO AP (average over all OKS)
* Single model method
+11.6% absolute~20% relative
5 teams in ~3% AP
72.1 71.4 70.6 69.1 68.9 68.8 63.6 60.5
+ Used external keypoints training dataset
/ 2010
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
Megvii Oks
Bangb
angre
nG-R
MIFA
IRSJT
U
Samsu
ng-pos
eMETU
Jessi
e333
21Gnx
r9Lw
hl
AP 50 AP 75
2017 Keypoints Challenge Leaderboard (II)
Better performance at looser localization thresholds:
~20% above COCO AP
/ 2011
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
Megvii Oks
Bangb
angre
nG-R
MIFA
IRSJT
U
Samsu
ng-pos
eMETU
Jessi
e333
21Gnx
r9Lw
hl
Large Instances Medium Instances
2017 Keypoints Challenge Leaderboard (III)
Instance scale is an important factor:
/ 20
50%
60%
70%
80%
90%
Megvii Oks Bangbangren G-RMIFAIR SJTU Samsung-pose
12
Performance Breakdown over Keypoints
COCO AP varies across keypointsall
face
uppe
r
torso
lower
limbs
76.3% avg COCO AP~8% spread
67.3% avg COCO AP~11% spread
/ 2013
A Closer Look at Errors
[1] Ronchi et al, “Benchmarking and Error Diagnosis in Multi-Instance Pose Estimation”, ICCV17 www.github.com/matteorr/coco-analyze
/ 2014
A Closer Look at Errors (I)
Taxonomy of Errors for Multi-Instance Pose Estimation:
JITTER
SCORING
INVERSION SWAP MISS
[1] Ronchi et al, “Benchmarking and Error Diagnosis in Multi-Instance Pose Estimation”, ICCV17 www.github.com/matteorr/coco-analyze
/ 2015
A Closer Look at Errors (II)
Fine-grained Precision-Recall Curves
[1] Ronchi et al, “Benchmarking and Error Diagnosis in Multi-Instance Pose Estimation”, ICCV17 www.github.com/matteorr/coco-analyze
0 10
Recall
Prec
isio
n
1
@OKS .85: (AP) .651
Inversion: .858.794
Jitter:
Swap:Miss:
.953
.769
False Pos.:False Neg.:
.9801.00
Opt. Score: .972
Team Megvii
0 10
Recall
Prec
isio
n
1
@OKS .85: (AP) .578
Inversion: .857.767
Jitter:
Swap:Miss:
.953
.728
False Pos.:False Neg.:
.9701.00
Opt. Score: .970
Team FAIR
/ 20
JitterG-RMI (10.6%)
7%9%
20%
12% 13%13%
8%10%
7%
MissG-RMI (4.4%)
14%15%
13%20%
15%
9%6%
5%3%
16
Localization Errors
Best performance for each type of Localization Error
SwapJessie33321 (0.9%)
11%11%
14%17% 19%
20%
3%3%
2%
4%1%3%
11%
81%
GoodG-RMI (81.1%)
Good
Jitter
Inversion Swap
Miss
Nose
Eyes
Ears
Shoulders
Elbows
Wrists
Hips
Knees
Ankles
InversionBangbangren (2.4%)
18%
20%
27%9%5%
17%
1%3%
/ 2017
Occlusion and Crowding Benchmarks (I)
COCO Benchmarks of image complexity:
N. of Keypoints
N. o
f Ove
rlaps
2 [1, 5] 2 [6, 10] 2 [11, 15] 2 [16, 17]
0
2[1,2]
� 3
N. of Keypoints
9098 15968 29165 12246
4876 11735 16384 4636
243 644 780 193
N. o
f Ove
rlaps
2[1,2]
0
�3
2 [16, 17]2 [11, 15]2 [1, 5] 2 [6, 10]
• Occlusion: number of visible keypoints
• Crowding: number of overlapping instances (IoU > 0.1)
/ 2018
Occlusion and Crowding Benchmarks (II)
Overall challenge performance is saturated by the easiest benchmarks:
Team Megvii
/ 2019
Summary of Findings
• About 20% relative AP improvement over last year’s challenge.• Very small performance gap between top entries.• Single model performance is on par with ensembles.• Single performance metrics do not capture the complex causes of
diverse errors.• We need to broaden current benchmarks with challenging images
(high occlusion / low number of keypoints).
2017 Keypoint Challenge Take-aways:
/ 2020
2017 COCO Keypoints Challenge
Megvii 1st
Oks 2nd
Bangbangren 3rd
G-RMI 4th
FAIR 5th
SJTU 6th
Samsung-pose 7th
Invited Speakers:• Team Megvii / (10:50am - 11:05am)
• Team Oks / (11:05am - 11:20am)
Team Position