Crowd Scene Understanding with Coherent Recurrent Neural Networks
Hang Su, Yinpeng Dong, Jun Zhu
May 22, 2016
Hang Su, Yinpeng Dong, Jun Zhu IJCAI 2016 May 22, 2016 1 / 26
Outline
1 Introduction
2 LSTM Recap
3 Coherent LSTM
4 Experimental Results
Background
A crowd scene is a public place where a large group of people has gathered, such as a university campus or the sidewalk of a busy street.
Groups are the main entities that make up a crowd.
When pedestrians walk in a crowded space, their trajectories are influenced by other pedestrians and by obstacles.
Applications
Understanding collective behaviors in crowd scenes has a wide range of applications in
Video Surveillance
Crowd Management
Avoiding Tragic Accidents
Problem Formulation
Obtain reliable tracklets from each scene using KLT trackers. At any time instant $t$, the $i$-th person is represented by his/her coordinates $(x_i(t), y_i(t))$.
Predict future trajectories of pedestrians and use the extracted hidden features for other classification tasks.
[Figure: diagram of LSTM units coupled by coherent regularization and used for motion prediction]
Challenge
Crowd spatio-temporal patterns exhibit nonlinear dynamics
  Limit cycles
  Quasi-periodicity
  Chaos
Collective effect (or coherent motion)
  Pedestrians tend to form groups
  Intra-group properties and inter-group properties
Previous Work
Traditional approaches such as the Social Force model
  Optimize an energy function
  Hand-crafted functions
  Hard to generalize
Probabilistic forecasting such as Gaussian processes
Recurrent neural networks
  N-LSTM (CVPR 2016)
LSTM
Structure
  Input / Output / Forget gate
  Memory state $c_t$
Advantage
  Prevents the vanishing gradient problem
  Nonlinear characteristic
  Generalization
LSTM
$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \tag{1}$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f) \tag{2}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{3}$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) \tag{4}$$
$$h_t = o_t \odot \tanh(c_t) \tag{5}$$
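Eqs. (1) to (5) can be sketched as a single time step in plain Python. This is a minimal illustration and not the authors' implementation: the function name `lstm_step`, the layout of the parameter dictionary `p`, and the choice of element-wise (diagonal) peephole weights `W_ci`, `W_cf`, `W_co` are assumptions made for the example.

```python
import math

def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following Eqs. (1)-(5).

    p is a dict (hypothetical layout) holding the matrices W_x*, W_h*,
    the peephole vectors W_ci, W_cf, W_co (assumed diagonal) and biases b_*.
    """
    n = len(h_prev)
    # Input plus recurrent contribution for each of the four equations.
    a = {g: [u + v for u, v in zip(matvec(p["W_x" + g], x_t),
                                    matvec(p["W_h" + g], h_prev))]
         for g in "ifco"}
    i_t = [sigmoid(a["i"][k] + p["W_ci"][k] * c_prev[k] + p["b_i"][k])
           for k in range(n)]                                   # Eq. (1)
    f_t = [sigmoid(a["f"][k] + p["W_cf"][k] * c_prev[k] + p["b_f"][k])
           for k in range(n)]                                   # Eq. (2)
    c_t = [f_t[k] * c_prev[k] + i_t[k] * math.tanh(a["c"][k] + p["b_c"][k])
           for k in range(n)]                                   # Eq. (3)
    o_t = [sigmoid(a["o"][k] + p["W_co"][k] * c_t[k] + p["b_o"][k])
           for k in range(n)]                                   # Eq. (4)
    h_t = [o_t[k] * math.tanh(c_t[k]) for k in range(n)]        # Eq. (5)
    return h_t, c_t
```

Note that the output gate in Eq. (4) looks at the updated memory $c_t$, while the input and forget gates look at $c_{t-1}$; the sketch keeps that ordering.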
Why Coherent LSTM?
LSTM can model individual behaviors but does not capture the interactions among people in a group.
Individuals are always willing to engage with “seed” groups and form spatially coherent structures.
When the neighboring relationships of individuals remain invariant over time and the correlation of their velocities remains high, they tend to have similar hidden states.
The trajectories of pedestrians not only follow their past trend but are also influenced by the current environment.
cLSTM Unit
$$c_t = f_t \odot c_{t-1} + \sum_{j \in N} \lambda_j(t)\, f_t^j \odot c_{t-1}^j + i_t \odot \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{6}$$
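The memory update of Eq. (6) differs from the plain LSTM update only by the sum over neighboring tracklets. A minimal sketch, assuming the gates, the candidate term, and the neighbors' quantities have already been computed; the function name `clstm_cell_update` and the argument layout are hypothetical.

```python
def clstm_cell_update(f_t, c_prev, i_t, g_t, neighbors):
    """Memory update of Eq. (6) for one tracklet.

    f_t, c_prev, i_t : forget gate, previous memory, input gate (lists of floats)
    g_t              : candidate tanh(W_xc x_t + W_hc h_{t-1} + b_c), precomputed
    neighbors        : list of (lam_j, f_j, c_prev_j) for the tracklets j in the
                       same coherent group, where lam_j is the dependency
                       coefficient of tracklet j at the current time
    """
    c_t = []
    for k in range(len(c_prev)):
        own = f_t[k] * c_prev[k]                      # first term of Eq. (6)
        shared = sum(lam * f_j[k] * c_j[k]            # coherent regularization term
                     for lam, f_j, c_j in neighbors)
        c_t.append(own + shared + i_t[k] * g_t[k])    # plus gated candidate
    return c_t
```

With an empty `neighbors` list the function reduces to the standard LSTM memory update.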
[Figure: cLSTM unit diagram showing the input, forget, and output gates, the cell, the inputs $x_t$ and $h_{t-1}$, the output $h_t$, and the coherent regularization]
Coherent Motion Modeling
Use coherent filtering [Zhou et al., 2012a] [Shao et al., 2014] to discover the coherent groups.
The dependency relationship between two tracklets within the same group is measured as:
$$\tau_j(t) = \frac{v_i(t) \cdot v_j(t)}{\lVert v_i(t) \rVert \, \lVert v_j(t) \rVert} \tag{7}$$
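Eq. (7) is the cosine of the angle between the two velocity vectors. A small sketch follows; estimating velocity by differencing consecutive positions and returning 0 for a stationary tracklet are assumptions of this example, not statements from the slides.

```python
import math

def velocity(p_prev, p_curr):
    """Frame-to-frame displacement used as a velocity estimate (assumption)."""
    return (p_curr[0] - p_prev[0], p_curr[1] - p_prev[1])

def velocity_correlation(v_i, v_j):
    """Eq. (7): normalized dot product of two 2-D velocity vectors."""
    dot = v_i[0] * v_j[0] + v_i[1] * v_j[1]
    norm = math.hypot(v_i[0], v_i[1]) * math.hypot(v_j[0], v_j[1])
    if norm == 0.0:
        return 0.0  # assumption: a stationary tracklet counts as uncorrelated
    return dot / norm
```

The value is 1 for tracklets moving in the same direction, 0 for perpendicular motion, and -1 for opposite directions, regardless of speed.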
Dependency Coefficient
The dependency coefficient between the $i$-th and $j$-th tracklets in Eq. (6) is defined as
$$\lambda_j(t) = \frac{1}{Z_i} \exp\left(\frac{\tau_j(t) - 1}{2\sigma^2}\right) \in (0, 1], \tag{8}$$
$Z_i$: normalization constant corresponding to the $i$-th tracklet.
$\lambda_j(t) \simeq 1$ if $v_i(t) \simeq v_j(t)$, which implies that tracklets $i$ and $j$ are similar.
Coherent regularization encourages the tracklets to learn similar feature distributions by sharing information across tracklets within a coherent group.
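The dependency coefficient of Eq. (8) maps the velocity correlation to a weight that decays as the two velocities diverge. In the sketch below, the bandwidth `sigma` and the normalization constant `z_i` are left as free parameters; their default values are placeholders chosen for the example, not values from the slides.

```python
import math

def dependency_coefficient(tau, sigma=1.0, z_i=1.0):
    """Eq. (8): weight of a neighboring tracklet given its correlation tau.

    tau   : velocity correlation from Eq. (7), in [-1, 1]
    sigma : bandwidth (placeholder default)
    z_i   : normalization constant of tracklet i (placeholder default)
    """
    return math.exp((tau - 1.0) / (2.0 * sigma ** 2)) / z_i
```

With `z_i = 1`, perfectly aligned velocities (`tau = 1`) give a weight of exactly 1, and the weight shrinks monotonically as `tau` decreases.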
Framework
Unsupervised encoder-decoder cLSTM framework:
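$$h_T = \mathrm{cLSTM}_e(x_T, h_{T-1}), \tag{9}$$
$$\hat{x}_t = \mathrm{cLSTM}_{dr}(h_t, \hat{x}_{t+1}), \quad \text{where } t \in [1, T], \tag{10}$$
$$\hat{x}_t = \mathrm{cLSTM}_{dp}(h_t, \hat{x}_{t-1}), \quad \text{where } t > T. \tag{11}$$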
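[Figure: encoder-decoder cLSTM framework with coherent regularization. The encoder reads $x_1, x_2, x_3$ and yields the learnt hidden features $h_T$; the reconstruction decoder outputs $\hat{x}_3, \hat{x}_2, \hat{x}_1$ and the prediction decoder outputs $\hat{x}_4, \hat{x}_5, \hat{x}_6$.]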
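The control flow of Eqs. (9) to (11) can be sketched with the three recurrent units passed in as plain callables. Everything below is illustrative: the function names, the convention that a decoder step returns `(x_hat, next_hidden)`, and the way the two decoders are seeded (`x_seed` for reconstruction, the last observed point for prediction) are assumptions that the slides do not specify.

```python
def encode(xs, h0, enc_step):
    """Eq. (9): fold the observed points x_1..x_T into the hidden feature h_T."""
    h = h0
    for x in xs:
        h = enc_step(x, h)
    return h

def decode(h_start, x_seed, n_steps, dec_step):
    """Shared loop for Eqs. (10) and (11): each step takes the current hidden
    state and the previously emitted point, and emits the next point."""
    h, x_hat, out = h_start, x_seed, []
    for _ in range(n_steps):
        x_hat, h = dec_step(h, x_hat)
        out.append(x_hat)
    return out

def run_framework(xs, n_future, h0, enc_step, rec_step, pred_step, x_seed):
    """Encode once, then reconstruct the input and predict the future."""
    h_T = encode(xs, h0, enc_step)
    # The reconstruction decoder emits x_T, ..., x_1; reverse to time order.
    reconstruction = decode(h_T, x_seed, len(xs), rec_step)[::-1]
    # The prediction decoder emits x_{T+1}, x_{T+2}, ...
    prediction = decode(h_T, xs[-1], n_future, pred_step)
    return h_T, reconstruction, prediction
```

Both decoders start from the same `h_T`, which is why that vector can serve as the learnt feature for the downstream tasks.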
Crowd Scene Profiling
Solve critical tasks in crowd scene analysis:
Estimating group state
  Gas, Solid, Pure Fluid and Impure Fluid
  Softmax classification using the features learnt from the unsupervised cLSTM.
Crowd video classification
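The softmax classification step mentioned above can be sketched as a linear layer followed by a softmax over the four group states. The weights `W` and biases `b` would be trained on labelled groups; here they are hypothetical inputs, and the function name `classify_group_state` is an assumption of the example.

```python
import math

GROUP_STATES = ["Gas", "Solid", "Pure Fluid", "Impure Fluid"]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify_group_state(feature, W, b):
    """Return the most probable group state and the class probabilities.

    feature : learnt hidden feature of a group (list of floats)
    W, b    : one weight row and one bias per class (hypothetical, trained elsewhere)
    """
    scores = [sum(w * x for w, x in zip(row, feature)) + bias
              for row, bias in zip(W, b)]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return GROUP_STATES[best], probs
```

The same pattern would apply to crowd video classification with a different label list.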
Datasets and Settings
CUHK Crowd Dataset
  http://www.ee.cuhk.edu.hk/~xgwang/CUHKcrowd.html
  Scenes: streets, shopping malls, airports and parks
  More than 400 sequences and more than 200,000 tracklets
Settings
  128 hidden units in cLSTM
  2/3 of tracklets as the input and 1/3 as the predicted tracklets, to evaluate the performance.
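One way to read the 2/3 and 1/3 setting is as a split along time within each tracklet: the first two thirds of the points are observed and the remaining third is held out for prediction. That reading is an assumption of this sketch, as is the function name `split_tracklet`.

```python
def split_tracklet(points):
    """Split one tracklet into an observed prefix (2/3) and a held-out suffix (1/3).

    points : list of (x, y) coordinates ordered by time
    """
    n_obs = (2 * len(points)) // 3   # integer arithmetic, rounds down
    return points[:n_obs], points[n_obs:]
```

For a tracklet of 9 points this yields 6 observed points and 3 points to predict.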
Future Path Forecasting
Table 1: Error of Path Prediction (pixels)

| Kalman Filter | Un-coherent LSTM | Coherent LSTM |
| --- | --- | --- |
| 9.32 ± 1.99 | 6.64 ± 1.76 | 4.37 ± 0.93 |
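
The slides do not state how the prediction error in pixels is computed. A common choice, assumed here purely for illustration, is the mean Euclidean distance between predicted and ground-truth points; the function name `mean_displacement_error` is hypothetical.

```python
import math

def mean_displacement_error(predicted, actual):
    """Average Euclidean distance (in pixels) between matching points.

    predicted, actual : equal-length lists of (x, y) coordinates
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("paths must be non-empty and of equal length")
    total = sum(math.hypot(px - ax, py - ay)
                for (px, py), (ax, ay) in zip(predicted, actual))
    return total / len(predicted)
```

Averaging this quantity over all test tracklets, and reporting its spread, would give numbers in the same form as the table entries.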
Group State Estimation
Gas: Particles move in different directions without forming collective behaviors
Solid: Particles move in the same direction with relative positions unchanged
Pure Fluid: Particles move towards the same direction with ever-changing relative positions
Impure Fluid: Particles move in a pure fluid style with invasion of particles from other groups
[Figure: example scenes for (a) Gas, (b) Solid, (c) Pure Fluid, (d) Impure Fluid]
Group State Estimation
Confusion matrices of estimating group states using different methods: (a) collective transition [Shao et al., 2014]; (b) prediction LSTM; (c) reconstruction LSTM; (d) un-coherent LSTM; and (e) coherent LSTM.
[Figure: confusion matrices, panels (a) to (e)]
Crowd Video Classification
All video clips are annotated into 8 classes: 1) Highly mixed pedestrian walking; 2) Crowd walking following a mainstream and well organized; 3) Crowd walking following a mainstream but poorly organized; 4) Crowd merge; 5) Crowd split; 6) Crowd crossing in opposite directions; 7) Intervened escalator traffic; and 8) Smooth escalator traffic.
Thanks for your time!
Questions?