Juergen Gall

Analyzing Human Behavior in Video Sequences

Analyzing Human Behavior

Videos

Low-level features, e.g., gradients, optical flow

Analyzing Human Behavior

Human Pose

09.10.2017 Juergen Gall – Institute of Computer Science III – Computer Vision Group

21 Actions from HMDB

928 clips, 33183 frames

HMDB51 (Kuehne et al, ICCV 2011)

Puppet Annotation

Joint-annotated HMDB (JHMDB)

[ H. Jhuang et al. Towards Understanding Action Recognition. ICCV 2013 ]

[ http://jhmdb.is.tue.mpg.de ]

Study with Annotated Data (2013)

• Large potential gain for pose feature

• Not with existing 2d human pose methods

Gain over the baseline when ground truth (GT) is given:

• Low level, given puppet flow: + ~11%

• Mid level, given puppet mask: + ~9%

• High level, given joint positions (pose features): + ~20%

[ H. Jhuang et al. Towards Understanding Action Recognition. ICCV 2013 ]

[ http://jhmdb.is.tue.mpg.de ]

CNNs for Pose Estimation

Stack CNNs:

[ S.-E. Wei et al. Convolutional Pose Machines. CVPR 2016 ]

Coupled Action Recognition and Pose Estimation

[ U. Iqbal et al. Pose for Action – Action for Pose. FG 2017 ]

Pose Estimation in Videos

Video datasets for human pose in unconstrained videos do not exist.

Unconstrained means:

• Publicly available content from the Internet (e.g., YouTube)

• Multiple persons in a video (no assumption about position)

• Arbitrary number of visible joints (truncation and occlusion)

• Large scale variations (unknown scale)

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]

Pose-Track Dataset

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]

Challenge ICCV 2017

[ http://posetrack.net/workshops/iccv2017 ]

Pose Track: Simultaneous Pose Estimation and Tracking

Estimate pose + person association over time:

• Predict body joints (CNN trained on MPII Pose)

• Build a graph with temporal and spatial edges

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]
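
The graph construction named in the bullets above can be sketched roughly as follows. This is only an illustration: the `Detection` record, the `max_frame_gap` parameter, and the choice to connect only detections of the same joint type across frames are assumptions of the sketch, not details given on the slides.

```python
from itertools import combinations
from typing import NamedTuple


class Detection(NamedTuple):
    """Hypothetical record for one detected body joint."""
    frame: int    # index of the video frame
    joint: str    # joint type, e.g. "head"
    x: float
    y: float
    score: float  # detector confidence


def build_graph(detections, max_frame_gap=1):
    """Return (spatial_edges, temporal_edges) as lists of index pairs.

    Spatial edges link detections within the same frame.
    Temporal edges link detections of the same joint type in different
    frames that are at most max_frame_gap apart (an assumption here).
    """
    spatial, temporal = [], []
    for i, j in combinations(range(len(detections)), 2):
        a, b = detections[i], detections[j]
        gap = abs(a.frame - b.frame)
        if gap == 0:
            spatial.append((i, j))
        elif gap <= max_frame_gap and a.joint == b.joint:
            temporal.append((i, j))
    return spatial, temporal


# Toy input: one person seen in two consecutive frames.
dets = [
    Detection(0, "head", 10, 10, 0.9),
    Detection(0, "neck", 10, 20, 0.8),
    Detection(1, "head", 11, 10, 0.9),
    Detection(1, "neck", 11, 20, 0.7),
]
spatial_edges, temporal_edges = build_graph(dets)
```

The two edge lists are kept separate because the later slides score spatial and temporal pairs with different features.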

Pose Track: Simultaneous Pose Estimation and Tracking

Unaries: Confidences of detected joints

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]

Pose Track: Simultaneous Pose Estimation and Tracking

Spatial binaries: Extract a square bounding box around each detection

Two cases:

• Different joint type:

• Logistic regression based on distance and orientation

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]
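
A minimal sketch of a logistic regression on distance and orientation, as named on the slide for pairs of different joint types. The feature encoding (distance plus cosine and sine of the offset angle) and all weight values are hypothetical placeholders; in practice the weights would be learned from annotated poses.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def spatial_pair_probability(p, q, weights, bias):
    """Probability that detections p and q (x, y) belong to the same person.

    Features: Euclidean distance and the orientation of the offset vector,
    encoded as (cos, sin) so that the angle wrap-around causes no jump.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    distance = math.hypot(dx, dy)
    orientation = math.atan2(dy, dx)
    features = (distance, math.cos(orientation), math.sin(orientation))
    z = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)


# Hypothetical weights: penalize distance, prefer offsets pointing downward
# in image coordinates (e.g. head -> neck).
HYPOTHETICAL_WEIGHTS = (-0.05, 0.0, 1.0)
HYPOTHETICAL_BIAS = 1.0

near = spatial_pair_probability((100, 100), (100, 120),
                                HYPOTHETICAL_WEIGHTS, HYPOTHETICAL_BIAS)
far = spatial_pair_probability((100, 100), (100, 400),
                               HYPOTHETICAL_WEIGHTS, HYPOTHETICAL_BIAS)
```

With these placeholder weights, the nearby pair receives a clearly higher probability than the distant one.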

Pose Track: Simultaneous Pose Estimation and Tracking

Spatial binaries: Extract a square bounding box around each detection

Two cases:

• Same joint type:

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]

Pose Track: Simultaneous Pose Estimation and Tracking

Temporal binaries: Compute optical flow (DeepMatching)

Logistic regression:

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]
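
One way a flow-based temporal term could look is sketched below. It does not call DeepMatching; the flow vector is simply passed in, and the single feature (distance between the flow-warped detection and the candidate in the other frame) as well as the `weight` and `bias` values are assumptions made for illustration.

```python
import math


def temporal_pair_probability(p, q, flow_at_p, weight=-0.2, bias=2.0):
    """Probability that p (frame f) and q (frame f') are the same joint.

    p, q      : (x, y) positions of the two detections
    flow_at_p : (u, v) optical flow vector at p, pointing from f to f'
    weight, bias : hypothetical logistic regression parameters
    """
    # Move p along the flow, then measure how far it lands from q.
    warped = (p[0] + flow_at_p[0], p[1] + flow_at_p[1])
    residual = math.hypot(q[0] - warped[0], q[1] - warped[1])
    return 1.0 / (1.0 + math.exp(-(bias + weight * residual)))


consistent = temporal_pair_probability((50, 50), (55, 50), (5, 0))
inconsistent = temporal_pair_probability((50, 50), (90, 50), (5, 0))
```

A candidate that agrees with the flow gets a high probability, while one far from the warped position is suppressed.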

Pose Track: Simultaneous Pose Estimation and Tracking

Solve integer linear program:

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]
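
The program itself is not preserved in this transcript. To give a feel for this kind of problem, the toy below brute-forces a very small binary program over edge variables only: each edge gets the cost log((1 - p) / p) from its pairwise probability p, and assignments that violate transitivity are rejected. It omits the unary (detection) variables and is a stand-in for a real ILP solver, not the formulation from the paper.

```python
import math
from itertools import combinations, product


def is_transitive(n, x):
    """Check: if u~v and v~w are joined, then u~w must be joined too."""
    def joined(a, b):
        return x[(min(a, b), max(a, b))]

    for a, b, c in combinations(range(n), 3):
        for u, v, w in ((a, b, c), (b, c, a), (c, a, b)):
            if joined(u, v) + joined(v, w) - 1 > joined(u, w):
                return False
    return True


def solve_tiny_partition(n, prob):
    """Exhaustively minimize the sum of cost[e] * x[e] over binary x.

    prob maps each pair (a, b) with a < b to the probability that the
    two detections belong together. Only feasible for very small n.
    """
    edges = list(combinations(range(n), 2))
    cost = {e: math.log((1.0 - prob[e]) / prob[e]) for e in edges}
    best_value, best_x = math.inf, None
    for bits in product((0, 1), repeat=len(edges)):
        x = dict(zip(edges, bits))
        if not is_transitive(n, x):
            continue
        value = sum(cost[e] * x[e] for e in edges)
        if value < best_value:
            best_value, best_x = value, x
    return best_x, best_value


# Two strong links and one weak link: transitivity forces all-or-nothing
# for the triangle, and joining all three is still the cheapest option.
pair_prob = {(0, 1): 0.9, (0, 2): 0.2, (1, 2): 0.9}
solution, value = solve_tiny_partition(3, pair_prob)
```

Exhaustive search grows exponentially with the number of edges, which is why an actual integer linear programming solver is needed for real videos.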

Pose Track: Simultaneous Pose Estimation and Tracking

To obtain plausible poses, constraints are added:

• Spatial transitivity:

• Temporal transitivity:

• Spatio-temporal transitivity:

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]
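
The formulas for these constraints did not survive in this transcript. As a reminder of the general pattern only, a transitivity constraint in graph partitioning is commonly written as below. The symbol $y_{ab} \in \{0, 1\}$, indicating that detections $a$ and $b$ are assigned to the same person, is notation assumed here and is not taken from the slides.

```latex
y_{ab} + y_{bc} - 1 \le y_{ac} \qquad \text{for all detections } a, b, c
```

In words: if $a$ is joined with $b$ and $b$ with $c$, then $a$ must be joined with $c$. Judging from the bullet labels alone, the three variants would differ in whether the detections come from one frame or from several frames.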

Pose Track: Simultaneous Pose Estimation and Tracking

To obtain plausible poses, constraints are added:

Spatio-temporal consistency:

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]

Pose Track: Simultaneous Pose Estimation and Tracking

Estimate pose + person association over time:

• Predict body joints (CNN trained on MPII Pose)

• Build a graph with temporal and spatial edges

• Partition spatio-temporal graph

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]

Pose Track: Evaluation

• Pose estimation accuracy (mAP)

• Person association (MOTA)

[ U. Iqbal et al. Pose-Track: Joint Multi-Person

Pose Estimation and Tracking. CVPR 2017 ]
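
For reference, MOTA (multiple object tracking accuracy) is usually defined as one minus the ratio of all errors (misses, false positives, identity switches) to the number of ground-truth objects. A small sketch with made-up counts, assuming the errors have already been accumulated over all frames:

```python
def mota(false_negatives, false_positives, id_switches, num_ground_truth):
    """Multiple object tracking accuracy from accumulated error counts.

    Can be negative when the tracker makes more errors than there are
    ground-truth objects.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_ground_truth


# Made-up counts for illustration: 20 errors on 100 ground-truth joints.
example_mota = mota(false_negatives=10, false_positives=5,
                    id_switches=5, num_ground_truth=100)
```

How the counts are obtained (matching threshold, per-joint versus per-person evaluation) is a protocol decision that this sketch leaves open.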

Video Analysis for Studying the Behavior of Mice

Recurrent Neural Networks

• Gated units (LSTM/GRU)

Weakly Supervised Learning

• Fully supervised:

• Weakly supervised (transcripts)

[ A. Richard et al. Weakly Supervised Action Learning with RNN

based Fine-to-Coarse Modeling. CVPR 2017 ]

Weakly Supervised Learning

• Represent an activity a like “spoon_powder” by latent sub-activities s_1^(a), s_2^(a), s_3^(a), …

• Optimal number of sub-activities is unknown:

• Many sub-activities for long activities

• Few sub-activities for short activities

s_1^(a) s_2^(a) s_3^(a) s_4^(a) s_5^(a) s_6^(a)
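
The slides do not say how the number of sub-activities is chosen. One simple heuristic consistent with "many for long, few for short" is to make it proportional to the mean length of the activity, as sketched below. The parameter `frames_per_state`, the mean lengths, and the activity name "take_cup" are hypothetical.

```python
def num_sub_activities(mean_length_frames, frames_per_state=10, min_states=1):
    """Number of latent sub-activities, proportional to activity length.

    frames_per_state is a hypothetical tuning parameter: roughly how many
    frames each sub-activity is expected to cover.
    """
    return max(min_states, mean_length_frames // frames_per_state)


# Made-up mean lengths (in frames) for two activities.
mean_lengths = {"spoon_powder": 62, "take_cup": 14}
states_per_activity = {
    name: num_sub_activities(length) for name, length in mean_lengths.items()
}
```

Under these made-up numbers the long activity receives six sub-activities and the short one a single sub-activity.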

Model

• RNN with Gated Recurrent Units (GRU)
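
As a reminder of what a gated recurrent unit computes, here is a scalar GRU step in plain Python. Real models use weight matrices and vectors; the scalar parameters and their values below are arbitrary placeholders. Note that some references swap the roles of z and 1 - z in the last line.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gru_step(x, h, p):
    """One GRU update for scalar input x and scalar hidden state h."""
    z = sigmoid(p["w_z"] * x + p["u_z"] * h + p["b_z"])  # update gate
    r = sigmoid(p["w_r"] * x + p["u_r"] * h + p["b_r"])  # reset gate
    # Candidate state: the reset gate controls how much of h is used.
    h_candidate = math.tanh(p["w_h"] * x + p["u_h"] * (r * h) + p["b_h"])
    # Blend old state and candidate according to the update gate.
    return (1.0 - z) * h + z * h_candidate


# Arbitrary placeholder parameters, not trained values.
params = {"w_z": 0.5, "u_z": 0.1, "b_z": 0.0,
          "w_r": 0.3, "u_r": 0.2, "b_r": 0.0,
          "w_h": 1.0, "u_h": 0.5, "b_h": 0.0}

hidden = 0.0
hidden_states = []
for frame_feature in [0.2, -1.0, 3.0, 0.5]:
    hidden = gru_step(frame_feature, hidden, params)
    hidden_states.append(hidden)
```

Because the new state is a convex combination of the old state and a tanh output, it stays within (-1, 1) when started from zero.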

Model

• Hidden Markov Models (HMMs) enforce a fixed order of sub-activities: s_1^(a), s_2^(a), s_3^(a), …

• HMMs use probabilities of RNN as input
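
How a fixed order can be enforced on per-frame probabilities may be easier to see in code. The sketch below runs a Viterbi-style dynamic program over a left-to-right state chain, where each frame either stays in the current sub-activity or advances to the next one. Transition probabilities and state priors are left out (uniform transitions are assumed), so this is a simplification rather than the model from the paper; the input probabilities are made up.

```python
import math


def align_fixed_order(log_probs):
    """Best monotone assignment of frames to states 0, 1, ..., S-1.

    log_probs[t][s] is the log-probability of state s at frame t, e.g.
    taken from an RNN output. Every state is visited at least once.
    """
    num_frames, num_states = len(log_probs), len(log_probs[0])
    if num_frames < num_states:
        raise ValueError("need at least one frame per state")
    neg_inf = -math.inf
    score = [[neg_inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]
    score[0][0] = log_probs[0][0]  # the sequence must start in state 0
    for t in range(1, num_frames):
        for s in range(num_states):
            stay = score[t - 1][s]
            advance = score[t - 1][s - 1] if s > 0 else neg_inf
            if advance > stay:
                best, back[t][s] = advance, s - 1
            else:
                best, back[t][s] = stay, s
            score[t][s] = best + log_probs[t][s]
    # Backtrack from the last state at the last frame.
    path = [num_states - 1]
    for t in range(num_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


# Made-up per-frame probabilities for two sub-activities.
frame_probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8], [0.1, 0.9]]
frame_log_probs = [[math.log(p) for p in row] for row in frame_probs]
alignment = align_fixed_order(frame_log_probs)
```

The returned list gives one state index per frame; with the toy input the switch from the first to the second sub-activity happens at the third frame.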

Model

• Hidden Markov Model (HMM) for each activity

Model

• The transcripts define the order of activities:
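
Assuming each activity has its own chain of sub-activities, a transcript can be turned into one long state sequence by concatenating the chains in transcript order, roughly as follows. The activity "take_cup" and the state counts are made up for illustration.

```python
def expand_transcript(transcript, num_states):
    """Concatenate the sub-activity chains in the order of the transcript.

    transcript : list of activity names in temporal order
    num_states : dict mapping activity name to its number of sub-activities
    Returns a list of (activity, sub_activity_index) pairs, 1-based.
    """
    states = []
    for activity in transcript:
        states.extend(
            (activity, k) for k in range(1, num_states[activity] + 1)
        )
    return states


# Made-up transcript and state counts.
state_sequence = expand_transcript(
    ["take_cup", "spoon_powder"],
    {"take_cup": 1, "spoon_powder": 3},
)
```

The resulting sequence fixes the order in which states may occur, so only the frame boundaries between them remain to be estimated.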

Weakly Supervised Learning

[ A. Richard et al. Weakly Supervised Action Learning with RNN

based Fine-to-Coarse Modeling. CVPR 2017 ]

Results

• Accuracy on unseen sequences (video without transcript)

[ A. Richard et al. Weakly Supervised Action Learning with RNN

based Fine-to-Coarse Modeling. CVPR 2017 ]

Results

• Accuracy on unseen sequences (video with transcript)

[ A. Richard et al. Weakly Supervised Action Learning with RNN

based Fine-to-Coarse Modeling. CVPR 2017 ]

Research Unit - Anticipating Human Behavior

20.09.2016 Research Unit 2535 - Anticipating Human Behavior

[ https://pages.iai.uni-bonn.de/FOR2535 ]

Thank you for your attention.
