harvesting crowdsourced mobile videos under bandwidth constraint
TRANSCRIPT
Harvesting Crowdsourced Mobile Videos under Bandwidth Constraint
Hien To
Big Mobile Video Data
• YouTube statistics: 20% mobile videos; 3 hours of video are uploaded per minute
• An online media management system to collect, organize, share, and search mobile videos using geo-tagged metadata
MediaQ helped PBS cover the Presidential Inauguration on Jan. 20, 2013
Universities in Germany, China, South Korea, Singapore, Hong Kong and Saudi Arabia plan to release the system to their students for research and other purposes
Funding from the National Science Foundation, Google and Northrop Grumman
MediaQ was covered in the article “The Future of Citizen Journalism” http://magazine.viterbi.usc.edu/fall-2014/whats-next/the-future-of-citizen-journalism/
http://mediaq.usc.edu/ [Kim et al., MMSys’14]
[Figure: MediaQ system architecture]
Client Side: Mobile App, Web App
Server Side Web Services: Uploading API; Search and Video Playing API; GeoCrowd API; User API
Server Side: Query processing; GeoCrowd Engine; Account Management; Transcoding; Visual Analytics; Keyword Tagging; Video Analysis
Data Storage: Video repository; Metadata repository; Databases (MySQL, MongoDB); R-tree index
Rich video metadata – W4
• A video of Angela at USC on the day of the 2011 USC-UCLA football game
Where? Place (USC)
When? Time (05/03/2011)
What? What about (USC-UCLA football game)
Who? Person (Angela)
Video Frame Model
Sensors: GPS, Compass, Camera, WiFi
Field Of View (FOV) model [1]
• p: camera location
• d⃗: camera view orientation
• α: viewable angle
• R: viewable distance
• time: timestamp
[1] A. S. Ay, R. Zimmermann, and S. H. Kim. Viewable Scene Modeling for Geospatial Video Search. In ACM Intl. Conf. on MM, pages 309–318, 2008.
• Where and when metadata
A video frame is represented as an FOV
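To make the FOV model concrete, here is a minimal sketch of a point-in-FOV test, the basic operation behind answering "which frames see this location?". It assumes planar coordinates in metres and a compass-style heading in degrees (0 = north, clockwise); the names `FOV` and `fov_contains` are hypothetical and are not taken from MediaQ.

```python
import math
from dataclasses import dataclass


@dataclass
class FOV:
    x: float        # camera location p, easting in metres (planar assumption)
    y: float        # camera location p, northing in metres
    heading: float  # camera view orientation, degrees, 0 = north, clockwise
    alpha: float    # viewable angle in degrees
    R: float        # viewable distance in metres
    time: float     # timestamp in seconds


def fov_contains(f: FOV, px: float, py: float) -> bool:
    """Return True if point (px, py) lies inside the pie-slice region of f."""
    dx, dy = px - f.x, py - f.y
    dist = math.hypot(dx, dy)
    if dist > f.R:
        return False            # beyond the viewable distance
    if dist == 0:
        return True             # the camera location itself
    # Bearing of the point as seen from the camera, measured from north.
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    # Smallest absolute angular difference between bearing and heading.
    diff = abs((bearing - f.heading + 180.0) % 360.0 - 180.0)
    return diff <= f.alpha / 2.0
```

For example, a camera at the origin facing north with α = 60° and R = 100 m sees a point 50 m due north but not a point 50 m due east.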
Video Frame Model
• Who metadata: face counting, face recognition
Intel Viewmont Coprocessor, GPU
Video Frame Model
• What metadata: manual keyword tagging, automatic keyword tagging [2]
Geographic Information Systems
[2] Z. Shen, S. Arslan Ay, S. H. Kim, and R. Zimmermann. Automatic Tag Generation and Ranking for Sensor-rich Outdoor Videos. In 19th ACM Intl. Conference on Multimedia, pages 93–102, 2011.
Problem Statement
Maximizing Information from Crowdsourced Mobile Videos under Bandwidth Constraint
Mobile Users
Server
Akdogan, A., To, H., Kim, S. H., & Shahabi, C. (2014). A Benchmark to Evaluate Mobile Video Upload to Cloud Infrastructures. In Big Data Benchmarks, Performance Optimization, and Emerging Hardware (pp. 57-70). Springer International Publishing.
Constraints on the client side: Bandwidth k1, Bandwidth k2
Crowdsourced Mobile Videos
Constraint on the server side: Bandwidth K
OpenSignal
http://opensignal.com/
Crowdsources data on wireless coverage
Needs in Disaster Response
2010 Haiti earthquake 2011 Tōhoku earthquake and tsunami
Communication systems, e.g., road sensors and cell towers, are disrupted by disasters
Authorities can assess the damage resulting from a disaster across a large geographical area
Needs in Other Kinds of Disaster
• Cable cuts cause immediate and long-lasting network outages
Service providers prioritize videos to be uploaded
• Root causes: human error or malicious acts; accidents and acts of nature
E.g., vehicles running into aerial poles, ship anchors, fires, floods, fallen trees, shark attacks, deer, gophers, squirrels, etc.
AAG cable cuts, Vietnam
[1] http://all.net/CID/Attack/papers/CableCuts.html
[2] http://tuoitrenews.vn/business/25268/cable-cut-hitting-vietnams-internet-to-be-fixed-by-jan-23-operator
Needs in Popular Events
Boston Marathon
[1] http://www.wired.com/2013/04/boston-crowdsourced/
New England Patriots v. Seattle Seahawks
Network outages during popular events, e.g., New Year, demonstrations, Super Bowl
Roadmap
Ying Lu, Cyrus Shahabi, and Seon Ho Kim, An Efficient Index Structure for Large-scale Geo-tagged Video Databases, ACM SIGSPATIAL GIS, 2014
Approaches for capturing, integrating and storing the data associated with disasters
Trade-off between short-term fixes (i.e., one time snapshot) and comprehensive long-term solutions (i.e., multiple time snapshots)
Sharing information while maintaining high levels of security and privacy
Approach to One Time Snapshot
[Figure: Video 1 and Video 2; moving trajectories and view orientations; video coverage from metadata]
Due to limited bandwidth, it is infeasible to collect all crowdsourced videos at one time. However, the server can first obtain video metadata, e.g., location, size, and FOVs, in real time. Leveraging this rich metadata, the server later decides which videos/frames to upload, and in what order.
[Figure: SC-Server with Worker 1, Worker 2, and Worker 3, each labeled W4]
Location/Region Entropy
Diversity of a location l
• Measures the diversity of unique visitors of a location
• A location has high entropy if many users were observed at the location with equal proportion

Total number of visits: $Freq(l) = |O_l|$
Number of visits by worker u: $UserCount(l) = |O_{u,l}|$
Location entropy:
$$LE(l) = -\sum_{u \in U_l} P_l(u) \log P_l(u), \quad \text{where } P_l(u) = \frac{|O_{u,l}|}{|O_l|}$$
Region entropy:
$$RE(r) = -\sum_{u \in U_r} P_r(u) \log P_r(u), \quad \text{where } P_r(u) = \frac{|O_{u,r}|}{|O_r|}$$
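As a small illustration of the entropy definition, the sketch below computes the entropy of one location from a list of visits, one worker id per visit. The function name `location_entropy` is hypothetical, and the natural logarithm is an assumption, since the slides do not state the base.

```python
import math
from collections import Counter


def location_entropy(visits):
    """Entropy of a location given one worker id per observed visit.

    len(visits) plays the role of |O_l|, and the per-worker counts
    play the role of |O_{u,l}|.
    """
    total = len(visits)
    if total == 0:
        return 0.0
    counts = Counter(visits)
    # P_l(u) = |O_{u,l}| / |O_l|; natural log is an assumption.
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```

A location visited four times by one worker has entropy 0, while one visited once each by two workers has entropy log 2; the same function applies to a region if the visits of all its locations are pooled.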
Importance of a Video
Evaluate the value of a video by the region it covers:
$$V(v_i) = RE(v_i.r)\, Priority(v_i.r)\, Area(v_i.r)$$
• Historical region entropy: prefer videos whose covered regions are visited by many workers many times, e.g., schools, hospitals. RE can be computed from any existing location-based data, e.g., Foursquare.
• The priority of an area, e.g., nuclear plant areas are more important than residential areas.
• Videos that capture large geographical areas are important.
[Figure: coverage areas of FOVs f1, f2, and f3 over the USC Campus and a nuclear plant]
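The value of a video can be sketched as a function of its covered region. This is a minimal illustration that assumes the three factors are multiplied; the `Region` type, its field names, and the numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Region:
    entropy: float   # RE(r), historical region entropy
    priority: float  # Priority(r), e.g. set by the authority
    area: float      # Area(r), e.g. in square kilometres


def video_value(r: Region) -> float:
    """Value of a video whose coverage region is r.

    Assumption: the factors combine as a product.
    """
    return r.entropy * r.priority * r.area
```

With made-up numbers, a low-entropy but high-priority region such as `Region(1.0, 5.0, 0.5)` outranks a busier but low-priority region such as `Region(2.0, 1.0, 0.5)`.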
Optimization at Video Level
$$\text{Maximize } \sum_{i=1}^{|V|} d(v_i)\, V(v_i) \quad \text{s.t. } \sum_{i=1}^{|V|} d(v_i)\, v_i.s \le k$$
• {v1, v2, …}: video list
• k: bandwidth
• vi.r, vi.s: coverage region and size of vi
• d(vi): 0/1 decision, whether or not to select vi
• V(vi): value of vi
This is a 0/1 knapsack problem, which is NP-hard. A greedy algorithm achieves a 0.5-approximation ratio.
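One standard greedy scheme with a 0.5-approximation guarantee for 0/1 knapsack is sketched below: fill the budget in order of value per unit size, then compare the result with the single most valuable video that fits and keep the better of the two. This is a textbook variant shown for illustration, not necessarily the exact algorithm behind the talk; the function name and the tuple layout are hypothetical.

```python
def greedy_select(videos, k):
    """Pick videos to upload under bandwidth budget k.

    videos: list of (video_id, value, size) tuples, with size > 0.
    Returns the list of selected video ids.
    """
    # Videos larger than the whole budget can never be uploaded.
    feasible = [v for v in videos if 0 < v[2] <= k]
    # Highest value per unit of bandwidth first.
    by_density = sorted(feasible, key=lambda v: v[1] / v[2], reverse=True)

    chosen, used, total = [], 0.0, 0.0
    for vid, value, size in by_density:
        if used + size <= k:
            chosen.append(vid)
            used += size
            total += value

    # Safeguard that yields the 0.5 ratio: compare with the best single video.
    best_single = max(feasible, key=lambda v: v[1], default=None)
    if best_single is not None and best_single[1] > total:
        return [best_single[0]]
    return chosen
```

Without the final comparison, a tiny high-density video could block one large, very valuable video and make the density-only greedy arbitrarily bad.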
Approach to Multiple Time Snapshots
[Figure: Video 1 and Video 2; moving trajectories and view orientations; video coverage from metadata]
The server can adaptively select new videos based on the metadata, the collected videos, and the policies, e.g., collect/crowdsource more data in
• Sparse-video areas
• Specific regions of interest
[Figure: SC-Server with Worker 1, Worker 2, and Worker 3, each labeled W4]
Optimization at Video Level
• This scenario would be useful for disaster response, where the videos are likely to be diverse across a large geographical area
• However, in popular events, e.g., demonstrations, marathons, football games, and concerts, videos’ coverage areas overlap
• Thus, there is a need for optimization at the video-frame level
[Figure: video timeline from 0 to 600 frames, marking the video frames of interest]
Optimization at Frames Level
[Figure: FOVs f1, f3, f4]
Each video includes a list of FOVs (fields of view)
$$\text{Maximize } \sum_{i=1}^{|F|} d(f_i)\, V(f_i) \quad \text{s.t. } \sum_{i=1}^{|F|} d(f_i)\, f_i.s \le k$$
• {f1, f2, …}: FOV list
• k: bandwidth
• fi.s: size of frame i
• d(fi): 0/1 decision, whether or not to select fi
• V(fi): value of a frame
This is a 0/1 knapsack problem, which is NP-hard.
This scenario requires running algorithms on phones, e.g., frame/feature extraction and object detection (OpenCV on Android).
Importance of a Frame
Evaluate the value of a frame:
$$V(f_i) = RE(f_i.r)\, Quality(f_i)\, Area(f_i)$$
• fi.r: coverage region of fi
• RE(fi.r): historical location entropy of fi’s region
• Quality(fi): frame quality in [0, 1], based on brightness, noise, objects, etc.
• Area(fi): a frame that captures a large geographical area is more important
Minimize Redundant Coverage
• In the figure, FOVs f1 and f2 cover almost the same region, i.e., the red area is covered redundantly
Overlapped FOVs
• Each FOV fi includes
  • A set of cells C(fi) = {c1, c2, …}
  • Each cell cj has a value: location entropy
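One way to obtain the cell set C(fi) is to lay a uniform grid over the plane and keep every cell whose centre falls inside the FOV. The sketch below does this under assumptions that are not in the slides: planar coordinates in metres, a compass heading in degrees (0 = north, clockwise), and a hypothetical 10 m cell size. The function name `fov_cells` is also hypothetical.

```python
import math


def fov_cells(px, py, heading, alpha, R, cell=10.0):
    """Return the set of grid cells (col, row) whose centres lie inside an FOV.

    (px, py): camera location; heading: view orientation in degrees;
    alpha: viewable angle in degrees; R: viewable distance.
    """
    cells = set()
    # Only cells inside the bounding box of the viewing circle can qualify.
    lo_c, hi_c = math.floor((px - R) / cell), math.floor((px + R) / cell)
    lo_r, hi_r = math.floor((py - R) / cell), math.floor((py + R) / cell)
    for c in range(lo_c, hi_c + 1):
        for r in range(lo_r, hi_r + 1):
            cx, cy = (c + 0.5) * cell, (r + 0.5) * cell  # cell centre
            dx, dy = cx - px, cy - py
            dist = math.hypot(dx, dy)
            if dist > R:
                continue
            bearing = math.degrees(math.atan2(dx, dy)) % 360.0
            diff = abs((bearing - heading + 180.0) % 360.0 - 180.0)
            if dist == 0 or diff <= alpha / 2.0:
                cells.add((c, r))
    return cells
```

Two FOVs cover a region redundantly to the extent that their cell sets intersect, so the overlap can be measured as `fov_cells(...) & fov_cells(...)`.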
Minimize FOV Overlaps
Select k frames from the dataset such that
$$\text{Maximize } \sum_{c_j \in C} d(c_j)\, LE(c_j.l)\, W(c_j)$$
$$\text{s.t. } \sum_{f_i \in F} d(f_i) \le k$$
$$\sum_{f_i :\, c_j \in C(f_i)} d(f_i) \ge d(c_j) \quad \forall c_j \in C$$
• The objective maximizes the weighted sum of the covered cells
• The first constraint: no more than k frames are selected
• The second constraint: if a cell cj is selected, at least one FOV fi that covers cj is selected
• F = {f1, f2, …}: FOV list
• k: the number of FOVs
• d(fi): 0/1 decision, whether or not to select fi
• cj.l: location of cell j (i.e., its center)
• LE(cj.l): historical location entropy of cell j
• W(cj): cell importance in [0, 1], based on objects of interest, e.g., humans
This is a Weighted Maximum Coverage Problem, which is NP-hard. A greedy algorithm achieves a 0.63-approximation ratio (1 − 1/e ≈ 0.63).
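The greedy algorithm for weighted maximum coverage repeatedly picks the FOV that adds the largest weight of not-yet-covered cells. The sketch below assumes each cell's weight has already been computed (for instance as LE(cj.l) times W(cj)); the function name and the dictionary layout are hypothetical.

```python
def greedy_max_coverage(cells_of, cell_weight, k):
    """Select at most k FOVs that greedily maximize covered cell weight.

    cells_of: dict mapping FOV id -> set of cell ids, i.e. C(f_i).
    cell_weight: dict mapping cell id -> non-negative weight.
    Returns (selected FOV ids in pick order, total covered weight).
    """
    covered, selected = set(), []
    for _ in range(k):
        best_id, best_gain = None, 0.0
        for fid, cells in cells_of.items():
            if fid in selected:
                continue
            # Marginal gain: weight of cells this FOV would newly cover.
            gain = sum(cell_weight.get(c, 0.0) for c in cells - covered)
            if gain > best_gain:
                best_id, best_gain = fid, gain
        if best_id is None:
            break  # no remaining FOV adds any weight
        selected.append(best_id)
        covered |= cells_of[best_id]
    return selected, sum(cell_weight.get(c, 0.0) for c in covered)
```

Because gains are computed only over uncovered cells, an FOV that mostly duplicates an already selected one scores low, which is how redundant coverage is discouraged.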
Important Cells from Objects of Interest
[Figure: FOV f2 with its interesting cells, labeled with focal length, average human height, and distance from origin]
Consider FOV’s direction
• In Fig 1, although f1 and f2 cover almost the same region, they provide different angles of the same scene
• To capture direction, one approach is to associate each cell with a number of directions, e.g., N, S, E, W. Then each FOV covers the cell from a particular direction, e.g., Fig 2.
Fig 1. Overlapped FOVs f1 and f2, but different angles
Fig 2. Overlapped FOVs f1 and f2
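A simple way to fold direction into the coverage problem is to treat each (cell, direction) pair as its own coverage element. The sketch below assumes the direction is taken from the camera's view orientation and quantized into four 90° sectors; both choices, and the function names, are assumptions made for illustration.

```python
def heading_to_direction(heading_deg):
    """Quantize a compass heading (0 = north, clockwise) into N, E, S, or W."""
    sector = int(((heading_deg % 360.0) + 45.0) // 90.0) % 4
    return "NESW"[sector]


def directed_cells(cells, heading_deg):
    """Tag every cell of an FOV with the direction it is viewed from."""
    d = heading_to_direction(heading_deg)
    return {(c, d) for c in cells}
```

Two FOVs that cover the same cells from opposite headings then produce disjoint element sets, so a coverage-based selection would count both views as new information.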
References
• Hien To, Seon Ho Kim, Cyrus Shahabi. Effectively Crowdsourcing the Acquisition and Analysis of Visual Data for Disaster Response. In Proceedings of the 2015 IEEE International Conference on Big Data (IEEE Big Data 2015), Santa Clara, CA, USA, October 29-November 1, 2015 (Acceptance rate ~18%)