FOCUS: Clustering Crowdsourced Videos by Line-of-Sight Puneet Jain, Justin Manweiler, Arup Acharya, and Kirk Beaty

Post on 01-Apr-2015


TRANSCRIPT

Page 1: Clustering Crowdsourced Videos by Line-of-Sight FOCUS: Clustering Crowdsourced Videos by Line-of-Sight Puneet Jain, Justin Manweiler, Arup Acharya, and

FOCUS: Clustering Crowdsourced Videos by Line-of-Sight

Puneet Jain, Justin Manweiler, Arup Acharya, and Kirk Beaty

Page 2

Clustered by shared subject

Page 3

CHALLENGES

Page 4

CAN IMAGE PROCESSING SOLVE THIS PROBLEM?

Page 5

[Figure: views from Camera 1, Camera 2, Camera 3, and Camera 4]

LOGICAL similarity does not imply VISUAL similarity

Page 6

VISUAL similarity does not imply LOGICAL similarity

Page 7

CAN SMARTPHONE SENSING SOLVE THIS PROBLEM?

Page 8

Sensors are noisy, hard to distinguish subjects…

Why not triangulate?

Page 9

GPS-COMPASS Line-of-Sight
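
To make the GPS-compass idea concrete, here is a minimal sketch of triangulating a shared subject from two cameras. It assumes positions already projected to a local flat plane in meters (x = east, y = north) and compass bearings in degrees clockwise from north. The function name `intersect_lines_of_sight` and the sample coordinates are hypothetical and are not taken from the talk.

```python
import math

def intersect_lines_of_sight(p1, bearing1_deg, p2, bearing2_deg, eps=1e-9):
    """Intersect two line-of-sight rays on a flat local plane.

    p1, p2: (east_m, north_m) camera positions (assumed already projected
    from GPS coordinates). Bearings are degrees clockwise from north.
    Returns the (east_m, north_m) crossing point, or None when the rays
    are parallel or would only cross behind one of the cameras.
    """
    b1 = math.radians(bearing1_deg)
    b2 = math.radians(bearing2_deg)
    d1 = (math.sin(b1), math.cos(b1))  # unit direction of camera 1
    d2 = (math.sin(b2), math.cos(b2))  # unit direction of camera 2

    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < eps:
        return None  # parallel lines of sight never meet

    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (dx * d2[1] - dy * d2[0]) / cross  # distance along ray 1
    t2 = (dx * d1[1] - dy * d1[0]) / cross  # distance along ray 2
    if t1 < 0 or t2 < 0:
        return None  # crossing point lies behind a camera

    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])

# Two cameras 100 m apart, both looking inward at 45 degrees.
subject = intersect_lines_of_sight((0.0, 0.0), 45.0, (100.0, 0.0), 315.0)
# The same pair with a 10-degree compass error on the first camera.
noisy = intersect_lines_of_sight((0.0, 0.0), 55.0, (100.0, 0.0), 315.0)
```

Comparing `subject` with `noisy` shows how a modest compass error moves the estimated subject position by several meters even at short range, which is the weakness of sensing alone that the talk points to.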

Page 10

INSIGHT

Page 11

Simplifying Insight 1

Don’t need to visually identify the actual SUBJECT; can use the background as a PROXY

[Figure labels: hard to identify; easy to identify]

Page 12

Simplifying Insight 2

Don’t need to directly match videos; can compare all to a predefined visual MODEL

[Figure label: same basic structure persists]

Page 13

Simplifying Insight 3

Line-of-sight (triangulation) is almost enough, just not via sensing (alone)

Page 14

FOCUS: Fast Optical Clustering of live User Streams

Sensing

Cloud

Vision

Page 15

FOCUS Cloud Video Analytics

Video Streams (Android, iOS, etc.)

Video Extraction

Image processing, Computer vision

Hadoop/HDFS: Failover, elasticity

Clustered Videos

Users Select & Watch Organized Streams

Watching Live (home: 2, away: 1)

Change Angle

Change Focus

Page 16

FOCUS Cloud Video Analytics

Video Extraction

Image processing, Computer vision

pre-defined reference “model”

Hadoop/HDFS: Failover, elasticity

Clustered Videos

Users Select & Watch Organized Streams

Watching Live (home: 2, away: 1)

Change Angle

Change Focus

Page 17

Model construction technique based on “Photo Tourism: Exploring image collections in 3D,” Snavely et al., SIGGRAPH 2006

• multi-view reconstruction

• keypoint extraction

estimates camera POSE and content in field-of-view

Multi-view Stereo Reconstruction

Page 18

Visualizing Camera Pose

Page 19

~ 1 second at 90th percentile

~ 18 seconds at 90th percentile

• multi-view reconstruction

• keypoint extraction

• frame-by-frame video to model alignment

• sensory inputs

• Given a pre-defined 3D model, align incoming video frames to the model

• Also known as camera pose estimation

Page 20

• multi-view reconstruction

• keypoint extraction

• integration of sensory inputs

Gyroscope provides “diff” from vision initial position

[Timeline figure: frames 0, 1, 2, 3, 4, t - 1, t - 2; legend: Sampled Frame, Gyroscope]

Filesize ≈ 1/Blur
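
As a rough illustration of the two ideas on this slide, the sketch below (1) carries a vision-derived heading forward by summing gyroscope readings until the next vision fix, and (2) picks the frame to send to vision by treating a larger encoded file size as a proxy for less blur. Everything here is an assumption for illustration: the single-axis (yaw-only) model, the `(rate, dt)` sample format, and the names `propagate_yaw` and `pick_sharpest` do not come from the talk.

```python
def propagate_yaw(vision_yaw_deg, gyro_samples):
    """Apply the gyroscope "diff" to the last vision-derived heading.

    gyro_samples: iterable of (rate_deg_per_s, dt_s) pairs recorded since
    the vision fix (assumed format). Returns the heading in [0, 360).
    """
    yaw = vision_yaw_deg
    for rate_deg_per_s, dt_s in gyro_samples:
        yaw += rate_deg_per_s * dt_s  # simple rectangular integration
    return yaw % 360.0

def pick_sharpest(frames):
    """Pick the frame whose encoded size is largest.

    frames: list of (frame_index, encoded_size_bytes). Blurry frames tend
    to compress to fewer bytes, so size stands in for sharpness here.
    """
    return max(frames, key=lambda frame: frame[1])[0]

# Vision placed the camera at a 90-degree heading; the user then pans at
# 10 degrees per second for half a second.
heading = propagate_yaw(90.0, [(10.0, 0.1)] * 5)

# Three candidate frames from one sampling window (sizes are made up).
best_frame = pick_sharpest([(0, 41_000), (1, 52_000), (2, 38_000)])
```

Gyroscope integration drifts over time, so a scheme like this would only be trusted between vision fixes, which matches the slide's framing of the gyroscope as a "diff" from a vision-provided starting point.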

Page 21

Field-of-view

Using POSE + model POINT CLOUD, FOCUS geometrically identifies the set of model points in the background of the view

• multi-view reconstruction

• keypoint extraction

• pairwise model image analysis
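
The geometric step can be sketched as a visibility test over the point cloud. The version below is a simplification under stated assumptions: it approximates the camera's field-of-view as a cone around the viewing direction and ignores occlusion. The function `visible_points`, its parameters, and the toy point cloud are hypothetical.

```python
import math

def visible_points(cam_pos, view_dir, half_fov_deg, cloud,
                   max_range=float("inf")):
    """Return ids of model points inside a cone-shaped field-of-view.

    cam_pos: (x, y, z) camera position from the estimated pose.
    view_dir: (x, y, z) viewing direction (need not be normalized).
    half_fov_deg: half of the field-of-view angle, in degrees.
    cloud: dict mapping point id -> (x, y, z) model coordinates.
    """
    norm = math.sqrt(sum(c * c for c in view_dir))
    axis = tuple(c / norm for c in view_dir)
    cos_limit = math.cos(math.radians(half_fov_deg))

    seen = set()
    for point_id, point in cloud.items():
        to_point = tuple(p - c for p, c in zip(point, cam_pos))
        dist = math.sqrt(sum(c * c for c in to_point))
        if dist == 0 or dist > max_range:
            continue  # skip the camera's own position and far points
        cos_angle = sum(a * b for a, b in zip(axis, to_point)) / dist
        if cos_angle >= cos_limit:
            seen.add(point_id)  # within the cone around the view axis
    return seen

# Toy point cloud: two points ahead of the camera, one to the side,
# one behind it.
cloud = {
    "a": (10.0, 0.0, 0.0),
    "b": (10.0, 2.0, 0.0),
    "c": (0.0, 10.0, 0.0),
    "d": (-5.0, 0.0, 0.0),
}
in_view = visible_points((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 30.0, cloud)
```

A real system would more likely test against the rectangular view frustum given by the camera intrinsics; the cone keeps the example short while preserving the idea of turning a pose into a set of model points.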

Page 22

[Figure: images 1, 2, and 3]

Similarity between image 1 & 2 = 18

Similarity between image 1 & 3 = 13

Finding the similarity across videos as the size of the point cloud set intersection

• multi-view reconstruction

• keypoint extraction

• pairwise model image analysis
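
The similarity measure described here reduces to counting shared model points. A minimal sketch, assuming each video has already been reduced to the set of model point ids in its field-of-view (the video names and point ids below are made up for illustration):

```python
from itertools import combinations

def similarity(points_a, points_b):
    """Similarity of two views = number of model points they share."""
    return len(set(points_a) & set(points_b))

def pairwise_similarity(views):
    """Score every unordered pair of views.

    views: dict mapping a video id to the set of model point ids visible
    in that video. Returns {(id_a, id_b): shared_point_count}.
    """
    return {
        (a, b): similarity(views[a], views[b])
        for a, b in combinations(sorted(views), 2)
    }

views = {
    "v1": {1, 2, 3, 4, 5},
    "v2": {3, 4, 5, 6},
    "v3": {5, 9},
}
scores = pairwise_similarity(views)
```

Because the comparison runs over point ids rather than pixels, two videos of the same background taken from different seats can still score highly, which is the point of matching against the model instead of matching videos directly.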

Page 23

Clustering “similar” videos

[Graph figure: videos 1, 2, and 3 as nodes; edges labeled Similarity Score]

Application of Modularity Maximization

high modularity implies:

• high correlation among the members of a cluster

• minor correlation with the members of other clusters
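
To show what "high modularity" means numerically, the sketch below scores candidate clusterings of a small similarity graph using the standard weighted modularity formula, Q = Σ_c [ w_in(c)/m − (deg(c)/2m)² ], where m is the total edge weight. The talk does not say which maximization algorithm FOCUS uses, so this example only evaluates a handful of candidate partitions and keeps the best one. The graph, weights, and function name are illustrative assumptions.

```python
from collections import defaultdict

def modularity(weights, communities):
    """Weighted modularity of a partition of an undirected graph.

    weights: dict {(u, v): similarity_score}, each edge listed once.
    communities: list of sets of nodes covering every node in the graph.
    """
    two_m = 2.0 * sum(weights.values())
    degree = defaultdict(float)
    for (u, v), w in weights.items():
        degree[u] += w
        degree[v] += w

    label = {node: idx
             for idx, group in enumerate(communities)
             for node in group}

    # Weight of edges that stay inside a cluster (counted in both directions).
    intra = sum(2.0 * w for (u, v), w in weights.items()
                if label[u] == label[v])
    # Expected intra-cluster weight if edges were placed at random.
    expected = sum(sum(degree[n] for n in group) ** 2
                   for group in communities) / two_m
    return (intra - expected) / two_m

# Toy similarity graph: a-b and c-d overlap strongly, b-c only weakly.
weights = {("a", "b"): 5, ("c", "d"): 4, ("b", "c"): 1}
candidates = [
    [{"a", "b"}, {"c", "d"}],
    [{"a", "b", "c", "d"}],
    [{"a"}, {"b", "c"}, {"d"}],
]
best = max(candidates, key=lambda part: modularity(weights, part))
```

Here the two-cluster split scores highest, matching the slide's description: strong similarity inside each cluster and only a weak link between clusters. Exhaustively scoring partitions does not scale, so a practical system would rely on a heuristic maximizer instead.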

Page 24

RESULTS

Page 25

Collegiate Football Stadium

• Stadium: 33K seats, 56K maximum attendance

• Model: 190K points, 412 images (2896 x 1944 resolution)

• Android App on Samsung Galaxy Nexus, S3

• 325 videos captured, 15-30 seconds each

Page 26

Line-of-Sight Accuracy (visual)

Page 27

Line-of-Sight Accuracy

In >80% of the cases, line-of-sight estimation is off by < 40 meters

GPS/Compass LOS estimation is off by < 260 meters for the same percentage

Page 28

FOCUS Performance

75% true positives

Trigger GPS/Compass failover techniques

Page 29

Natural Questions

• What if a 3D model is not available?
  – Online model generation from the first few uploads

• Don't stadiums look very different on a game day?
  – Rigid structures in the background persist

• Where won’t it work?
  – Natural or dynamic environments are hard

Page 30

Conclusion

• Computer vision and image processing are often computation-hungry, restricting real-time deployment

• Mobile sensing provides powerful metadata and can often reduce the computational burden

• Computer vision + mobile sensing + geometry, along with the right set of big-data tools, can enable many real-time applications

• FOCUS demonstrates one such fusion, a ripe area for further research

Page 31

Thank You

http://cs.duke.edu/~puneet