the computer science behind youtube


Upload: wei-tsang-ooi

Post on 20-Feb-2017


TRANSCRIPT

Page 1: The Computer Science behind YouTube

the computer science behind YouTube

Page 2: The Computer Science behind YouTube

Ooi Wei Tsang
Department of Computer Science

Page 3: The Computer Science behind YouTube
Page 4: The Computer Science behind YouTube
Page 5: The Computer Science behind YouTube

how to make video download or upload more efficient?

Page 6: The Computer Science behind YouTube

24 hours of video uploaded every minute (April 2010)

Page 7: The Computer Science behind YouTube

700,000 videos viewed every minute (October 2009)

Page 8: The Computer Science behind YouTube

700,000 x 480 seconds of video viewed per minute

(average length: 8 minutes)

Page 9: The Computer Science behind YouTube

700,000 x 480 x 328 kilobits of video served per minute

(average bitrate: 328 kbps)

Page 10: The Computer Science behind YouTube

about 229 GB of video served per second
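The multiplication on the preceding slides can be checked in a few lines. This is a quick sketch that assumes decimal units (1 kilobit = 1000 bits, 1 GB = 10^9 bytes); the constant and function names are illustrative only. With these figures the product comes to roughly 230 gigabytes per second.

```python
# Figures taken from the slides (October 2009 estimates).
VIDEOS_PER_MINUTE = 700_000      # videos viewed every minute
AVG_LENGTH_SECONDS = 8 * 60      # average length: 8 minutes = 480 seconds
AVG_BITRATE_KBPS = 328           # average bitrate in kilobits per second


def served_bytes_per_second():
    """Return the average number of bytes served per second."""
    kilobits_per_minute = VIDEOS_PER_MINUTE * AVG_LENGTH_SECONDS * AVG_BITRATE_KBPS
    bits_per_minute = kilobits_per_minute * 1000   # assumption: 1 kilobit = 1000 bits
    bytes_per_minute = bits_per_minute / 8
    return bytes_per_minute / 60


rate = served_bytes_per_second()
print(f"{rate / 1e9:.1f} GB per second")        # 229.6 GB per second
print(f"{rate * 60 / 1e12:.1f} TB per minute")  # 13.8 TB per minute
```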

Page 11: The Computer Science behind YouTube

how to make video download or upload more efficient?

Page 12: The Computer Science behind YouTube

700,000 x 480 x 328 kilobits of video served per minute

Page 13: The Computer Science behind YouTube

quality

bytes / second

Page 14: The Computer Science behind YouTube

quality

bytes / second

better compression

Page 15: The Computer Science behind YouTube

how does video compression work?

Page 16: The Computer Science behind YouTube

pixel: location (150,40) color (1,129,166)

Page 17: The Computer Science behind YouTube

group pixels into 8x8 blocks

Page 18: The Computer Science behind YouTube

principle: don't store details that our eyes cannot see
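The slides do not name a technique for this principle. One common approach, used in JPEG and MPEG-family codecs, is to transform each 8x8 block with a discrete cosine transform (DCT) and then round away the small coefficients. The sketch below assumes that approach, and its single rounding step of 16 is a simplification (real codecs use a table of steps per frequency).

```python
import math

N = 8  # block size, as on the slide


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of lists."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def quantize(coeffs, step=16):
    """Divide by `step` and round; small coefficients (fine detail) become 0."""
    return [[round(c / step) for c in row] for row in coeffs]


def count_nonzero(quantized):
    return sum(1 for row in quantized for c in row if c != 0)


# A flat block and a smooth block whose brightness rises gently left to right.
flat = [[100] * N for _ in range(N)]
smooth = [[100 + 2 * y for y in range(N)] for _ in range(N)]

print(count_nonzero(quantize(dct_2d(flat))), "of", N * N, "values kept (flat block)")
print(count_nonzero(quantize(dct_2d(smooth))), "of", N * N, "values kept (smooth block)")
```

Only the non-zero values need to be stored, which is where the saving comes from: smooth regions collapse to a handful of numbers instead of 64.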

Page 19: The Computer Science behind YouTube
Page 20: The Computer Science behind YouTube

principle: store differences
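A minimal sketch of storing differences, assuming each frame is a flat list of pixel values and that only exactly changed pixels are recorded. Real codecs go further (for example, they compensate for motion), so this is an illustration of the idea rather than how any particular codec works; the function names are hypothetical.

```python
def encode_delta(prev_frame, frame):
    """Store only the pixels that changed, as {pixel index: new value}."""
    return {i: new
            for i, (old, new) in enumerate(zip(prev_frame, frame))
            if old != new}


def decode_delta(prev_frame, delta):
    """Rebuild a frame from the previous frame plus the stored changes."""
    frame = list(prev_frame)
    for i, value in delta.items():
        frame[i] = value
    return frame


frame1 = [10] * 64           # an 8x8 frame, all pixels the same
frame2 = list(frame1)
frame2[5] = 200              # two pixels change in the next frame
frame2[6] = 210

delta = encode_delta(frame1, frame2)
assert decode_delta(frame1, delta) == frame2
print(len(delta), "of", len(frame2), "pixels stored")  # 2 of 64 pixels stored
```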

Page 21: The Computer Science behind YouTube
Page 22: The Computer Science behind YouTube
Page 23: The Computer Science behind YouTube

compression:

how to store information in the fewest bits

how to throw away bits without losing information

Page 24: The Computer Science behind YouTube
Page 25: The Computer Science behind YouTube

135.4 million US users in January 2010

Page 26: The Computer Science behind YouTube

12.7 billion videos watched in January 2010

Page 27: The Computer Science behind YouTube

[diagram: viewer, Internet, Web server]

Page 28: The Computer Science behind YouTube

[diagram: viewer, Internet, Web server, viewer]

I want this

Page 29: The Computer Science behind YouTube

[diagram: viewer, Internet, Web server, viewer]

here it is

Page 30: The Computer Science behind YouTube

[diagram: viewer, Internet, Web server, viewer]

I want to share this

Page 31: The Computer Science behind YouTube

[diagram: many viewers, Internet, one Web server]

I want to share this (repeated)

I want to watch this (repeated)

Page 32: The Computer Science behind YouTube

how to support many, many viewers and not have them wait too long?

Page 33: The Computer Science behind YouTube

[diagram: many viewers, Internet, many Web servers]

have many, many servers

Page 34: The Computer Science behind YouTube

[diagram: many viewers, Internet, many Web servers]

distribute all over the world

Page 35: The Computer Science behind YouTube

[diagram: many viewers, Internet, many Web servers]

popular videos are duplicated

Page 36: The Computer Science behind YouTube

where to put video servers?

which video to store where?

Page 37: The Computer Science behind YouTube

who has this video?

who is closest to the user?
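The two questions above can be sketched as a lookup: filter the servers that hold the video, then pick the nearest one. The server table, its coordinates, and the use of geographic distance as a stand-in for "closest" are all assumptions made for illustration; a real system would more likely measure network latency and server load.

```python
import math

# Hypothetical servers: name -> (latitude, longitude, ids of videos held).
SERVERS = {
    "singapore": (1.35, 103.82, {"cat", "news"}),
    "tokyo": (35.68, 139.69, {"cat"}),
    "london": (51.51, -0.13, {"cat", "football"}),
}


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(a))


def pick_server(video_id, user_lat, user_lon):
    """Return the nearest server that has the video, or None if none has it."""
    candidates = [
        (distance_km(user_lat, user_lon, lat, lon), name)
        for name, (lat, lon, videos) in SERVERS.items()
        if video_id in videos           # who has this video?
    ]
    return min(candidates)[1] if candidates else None   # who is closest?


# A viewer in Kuala Lumpur (about 3.14 N, 101.69 E).
print(pick_server("cat", 3.14, 101.69))        # singapore
print(pick_server("football", 3.14, 101.69))   # london
print(pick_server("missing", 3.14, 101.69))    # None
```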

Page 38: The Computer Science behind YouTube

need to adapt to:

network conditions, popularity of videos

Page 39: The Computer Science behind YouTube

distributed systems

how can multiple computers, communicating over a network, achieve a common goal?

Page 40: The Computer Science behind YouTube
Page 41: The Computer Science behind YouTube

how to identify an interesting and informative frame as the thumbnail?

frame 1 frame 900

Page 42: The Computer Science behind YouTube
Page 43: The Computer Science behind YouTube
Page 44: The Computer Science behind YouTube

need to understand the content of the video

Page 45: The Computer Science behind YouTube

(0,0): (1,244,39)
(0,1): (1,243,38)
(0,2): (2,243,27)
(0,3): (95,34,221)
:
:

is there a variety in the colors?

easy
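To show why this question is the easy one, here is a small sketch that measures color variety directly from the pixel values. It assumes a frame is a list of (r, g, b) tuples with components in 0..255, and it defines "variety" as the fraction of coarse color bins in use, which is just one possible measure; the function names are hypothetical.

```python
def color_variety(pixels, levels=4):
    """Fraction of coarse color bins used by at least one pixel.

    Each of r, g, b is reduced to `levels` ranges, giving levels**3 bins.
    """
    step = 256 // levels
    used = {(r // step, g // step, b // step) for r, g, b in pixels}
    return len(used) / levels ** 3


def pick_thumbnail(frames):
    """Return the index of the frame with the highest color variety."""
    return max(range(len(frames)), key=lambda i: color_variety(frames[i]))


# Three nearly identical greens (the first pixels shown on the slide) ...
dull = [(1, 244, 39), (1, 243, 38), (2, 243, 27)]
# ... versus a frame that also contains purple, white, and black.
varied = dull + [(95, 34, 221), (250, 250, 250), (0, 0, 0)]

print(color_variety(dull))             # 0.015625
print(color_variety(varied))           # 0.0625
print(pick_thumbnail([dull, varied]))  # 1
```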

Page 46: The Computer Science behind YouTube

(0,0): (1,244,39)
(0,1): (1,243,38)
(0,2): (2,243,27)
(0,3): (95,34,221)
:
:

is there a face inside?

is it the same face?

doable

Page 47: The Computer Science behind YouTube

(0,0): (1,244,39)
(0,1): (1,243,38)
(0,2): (2,243,27)
(0,3): (95,34,221)
:
:

is the scene an important one, based on the text description?

very hard

Page 48: The Computer Science behind YouTube

same content

Page 49: The Computer Science behind YouTube

need to detect duplicates

Page 50: The Computer Science behind YouTube

but length, size, quality, colors, content

may change slightly

Page 51: The Computer Science behind YouTube
Page 52: The Computer Science behind YouTube
Page 53: The Computer Science behind YouTube
Page 54: The Computer Science behind YouTube

one possible method:

compute color histogram for each frame

Page 55: The Computer Science behind YouTube
Page 56: The Computer Science behind YouTube
Page 57: The Computer Science behind YouTube

construct sequence of color histograms for videos

Page 58: The Computer Science behind YouTube

perform approximate matching
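The method outlined on the last few slides (one color histogram per frame, a sequence of histograms per video, then approximate matching) might look like the sketch below. The bin count, the L1 distance, the sliding alignment, and the tolerance of 0.2 are all assumptions chosen for illustration; the slides do not specify them.

```python
def histogram(frame, levels=4):
    """Normalized coarse color histogram of one frame (list of (r, g, b))."""
    step = 256 // levels
    counts = [0] * levels ** 3
    for r, g, b in frame:
        counts[(r // step) * levels * levels + (g // step) * levels + (b // step)] += 1
    return [c / len(frame) for c in counts]


def signature(video):
    """Sequence of histograms, one per frame."""
    return [histogram(frame) for frame in video]


def distance(h1, h2):
    """L1 distance between two normalized histograms (0 = same, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def is_near_duplicate(video_a, video_b, tolerance=0.2):
    """Slide the shorter signature along the longer one and accept if some
    alignment has a mean per-frame distance within `tolerance`."""
    a, b = signature(video_a), signature(video_b)
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    if not short:
        return False
    for offset in range(len(long_) - len(short) + 1):
        mean = sum(distance(s, long_[offset + i])
                   for i, s in enumerate(short)) / len(short)
        if mean <= tolerance:
            return True
    return False


red, blue, green = [(255, 0, 0)] * 4, [(0, 0, 255)] * 4, [(0, 255, 0)] * 4
original = [red, blue, green]
# A shorter copy with slightly shifted colors, as after re-encoding.
copy = [[(250, 5, 5)] * 4, [(5, 5, 250)] * 4]
other = [green, green]

print(is_near_duplicate(original, copy))   # True
print(is_near_duplicate(original, other))  # False
```

This version compares every alignment, so its cost grows with the product of the two video lengths; making that fast is a separate problem.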

Page 59: The Computer Science behind YouTube

how to do it fast?

Page 60: The Computer Science behind YouTube

with recognition of video content, we can also

Page 61: The Computer Science behind YouTube

detect inappropriate materials

Page 62: The Computer Science behind YouTube

content-based search (e.g., find video of Cats the musical)

Page 63: The Computer Science behind YouTube

detect copyrighted video

Page 64: The Computer Science behind YouTube

more relevant video

Page 65: The Computer Science behind YouTube

multimedia content analysis

how to reason about the content of images, video, and audio?

Page 66: The Computer Science behind YouTube
Page 67: The Computer Science behind YouTube

need to predict what a user likes to watch

Page 68: The Computer Science behind YouTube

observe pattern: many users who viewed X also viewed Y

if a user is viewing X, the user might like Y too.
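A minimal sketch of this pattern: count how often each pair of videos appears in the same user's viewing history, then suggest the videos most often seen together with X. The viewing histories and the function names are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def co_view_counts(histories):
    """Count, for each ordered pair (x, y), the users who viewed both."""
    counts = Counter()
    for videos in histories.values():
        for x, y in combinations(sorted(videos), 2):
            counts[(x, y)] += 1
            counts[(y, x)] += 1
    return counts


def also_viewed(histories, x, top=3):
    """Videos most often viewed by the same users who viewed x."""
    counts = co_view_counts(histories)
    related = [(n, y) for (a, y), n in counts.items() if a == x]
    related.sort(key=lambda item: (-item[0], item[1]))  # most shared viewers first
    return [y for _, y in related[:top]]


# Hypothetical viewing histories: user -> set of videos viewed.
histories = {
    "ann": {"X", "Y"},
    "bob": {"X", "Y"},
    "cy": {"X", "Z"},
    "dee": {"Y", "Z"},
}

print(also_viewed(histories, "X"))  # ['Y', 'Z']
```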

Page 69: The Computer Science behind YouTube
Page 70: The Computer Science behind YouTube
Page 71: The Computer Science behind YouTube
Page 72: The Computer Science behind YouTube
Page 73: The Computer Science behind YouTube
Page 74: The Computer Science behind YouTube

User U may like video V if

there is a short path between U and V
there are many paths between U and V
not going through popular videos
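These conditions can be sketched on a graph whose edges connect users to the videos they viewed. The code below counts the shortest possible paths from U to an unseen video V (U, a video U viewed, another user who viewed it, V) and skips paths that pass through popular videos. The data, the function name, and the popularity cut-off of 3 viewers are assumptions made for illustration.

```python
def path_score(views, user, video, popular_threshold=3):
    """Count paths user -> viewed video -> other user -> video,
    ignoring intermediate videos with `popular_threshold` or more viewers.

    views: dict mapping each user to the set of videos they viewed.
    """
    viewers = {}
    for u, vids in views.items():
        for v in vids:
            viewers.setdefault(v, set()).add(u)

    seen = views.get(user, set())
    if video in seen:
        return 0  # already watched, nothing to recommend

    paths = 0
    for mid in seen:
        if len(viewers[mid]) >= popular_threshold:
            continue  # a popular video says little about individual taste
        for other in viewers[mid]:
            if other != user and video in views[other]:
                paths += 1
    return paths


# Hypothetical data: everyone has seen "hit"; "a" and "b" are niche videos.
views = {
    "U": {"a", "b", "hit"},
    "p": {"a", "V", "hit"},
    "q": {"b", "V", "hit"},
    "r": {"hit", "W"},
    "s": {"hit", "W"},
}

print(path_score(views, "U", "V"))  # 2
print(path_score(views, "U", "W"))  # 0
```

W is linked to U only through the popular video, so it scores 0, while V is reached through two niche videos.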

Page 75: The Computer Science behind YouTube

Netflix gave out $1M to improve its recommendations

Page 76: The Computer Science behind YouTube

machine learning

how to recognize patterns and learn information from data

Page 77: The Computer Science behind YouTube
Page 78: The Computer Science behind YouTube

each video has many properties:

who has viewed it
who likes it
who commented what about it
who tags it with what
:
(and when, where ..)

Page 79: The Computer Science behind YouTube

how to find:

top 10 most viewed videos in Singapore

Page 80: The Computer Science behind YouTube

how to find:

all videos tagged with “cat” “favorited” by me

Page 81: The Computer Science behind YouTube

top 10 most watched videos liked by viewers who also liked video X
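The three lookups on these slides can be written as database queries. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table layout, the column names, the sample rows, and the use of "v1" as video X are all assumptions for illustration, not a description of how YouTube stores its metadata.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE views(user TEXT, video TEXT, country TEXT);
CREATE TABLE tags(video TEXT, tag TEXT);
CREATE TABLE favorites(user TEXT, video TEXT);
CREATE TABLE likes(user TEXT, video TEXT);
""")
db.executemany("INSERT INTO views VALUES (?, ?, ?)", [
    ("me", "v1", "SG"), ("ann", "v1", "SG"), ("bob", "v1", "SG"),
    ("ann", "v2", "SG"), ("bob", "v2", "SG"), ("cy", "v2", "US"),
    ("me", "v3", "SG"), ("cy", "v3", "US"),
])
db.executemany("INSERT INTO tags VALUES (?, ?)",
               [("v1", "cat"), ("v2", "cat"), ("v3", "dog")])
db.executemany("INSERT INTO favorites VALUES (?, ?)",
               [("me", "v2"), ("me", "v3"), ("ann", "v1")])
db.executemany("INSERT INTO likes VALUES (?, ?)",
               [("ann", "v1"), ("ann", "v2"), ("bob", "v1"), ("bob", "v3")])

# Top 10 most viewed videos in Singapore.
top_in_sg = db.execute("""
    SELECT video, COUNT(*) AS n FROM views
    WHERE country = ? GROUP BY video ORDER BY n DESC, video LIMIT 10
""", ("SG",)).fetchall()

# All videos tagged "cat" and favorited by me.
my_cat_favorites = db.execute("""
    SELECT t.video FROM tags t JOIN favorites f ON f.video = t.video
    WHERE t.tag = ? AND f.user = ? ORDER BY t.video
""", ("cat", "me")).fetchall()

# Top 10 most watched videos liked by viewers who also liked video X ("v1").
liked_with_x = db.execute("""
    SELECT v.video, COUNT(*) AS n FROM views v
    WHERE v.video != ? AND v.video IN (
        SELECT l2.video FROM likes l1 JOIN likes l2 ON l1.user = l2.user
        WHERE l1.video = ?)
    GROUP BY v.video ORDER BY n DESC, v.video LIMIT 10
""", ("v1", "v1")).fetchall()

print(top_in_sg)         # [('v1', 3), ('v2', 2), ('v3', 1)]
print(my_cat_favorites)  # [('v2',)]
print(liked_with_x)      # [('v2', 3), ('v3', 2)]
```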

Page 82: The Computer Science behind YouTube

~250,000,000 videos on YouTube

(my estimate)

Page 83: The Computer Science behind YouTube

700,000 videos viewed every minute (October 2009)

Page 84: The Computer Science behind YouTube

need to store, search, retrieve, and update metadata about video efficiently and consistently

Page 85: The Computer Science behind YouTube

database

how to manage large amounts of data

Page 86: The Computer Science behind YouTube

compression
distributed systems
media content analysis
machine learning
database

Page 87: The Computer Science behind YouTube

computational methods to store, retrieve, transmit, process, display, and interact with information

Page 88: The Computer Science behind YouTube

computer science

Page 89: The Computer Science behind YouTube

the computer science behind YouTube

Page 90: The Computer Science behind YouTube

Thank You