Post on 20-Mar-2018

How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics

Jan Neumann, Comcast Labs DC, May 10th, 2017

2

Comcast Applied Artificial Intelligence Lab

• Smart Home
• Smart TV
• Smart Internet
• Media & Video Analytics
• Deep Learning
• Data Science
• Recommendations & Search
• Voice & NLP

3

Today: How Comcast Uses AI to Evolve and Reinvent the TV Experience

• Smart Home
• Smart TV
• Smart Internet
• Media & Video Analytics
• Deep Learning
• Data Science
• Recommendations & Search
• Voice & NLP

4

AI for Content Discovery – Voice Search

Netflix, Live TV, Online Video

5

X1 Smart TV with Voice

• Query: “HBO”

[Diagram: Voice remote → ASR → query → NLP modules (including the Answer Selector) → action → Set-top Box → TV]

6

Open NLP: Multiple Domains with Voice

[Diagram: query → Domain Selector → one Answer Selector per domain (TV, Home, ..., Customer Care, News) → response]

7

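Open NLP: Multiple Domains with Voice

• Query: “turn on the heat”
• Domains: TV, Home, Customer Care, News
• Domain Selector scores: 0.80, 0.15, 0.02, 0.03 (two scores exceed the threshold, so two domains are selected)
• Threshold = 0.10
• Selected = {TV, Home}, Precision = 100%
• Applicable = {TV, Home}, Recall = 100%

[Diagram: query → Domain Selector → Answer Selectors of the selected domains → response]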

8

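Open NLP: Multiple Domains with Voice

• Query: “Show me my password”
• Domains: TV, Home, Customer Care, News
• Domain Selector scores: 0.03, 0.04, 0.03, 0.90 (only the Customer Care score exceeds the threshold)
• Threshold = 0.10
• Selected = {CustomerCare}, Precision = 100%
• Applicable = {CustomerCare}, Recall = 100%

[Diagram: query → Domain Selector → Answer Selector of the selected domain → response]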
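
The thresholding in the two examples above can be sketched as follows. This is a minimal illustration, not Comcast's implementation: the function names are hypothetical, and the assignment of each score to a domain is an assumption, since the slides give the scores and the selected set but no explicit per-domain mapping.

```python
def select_domains(scores, threshold=0.10):
    """Return every domain whose selector score is at least the threshold."""
    return {domain for domain, score in scores.items() if score >= threshold}


def precision_recall(selected, applicable):
    """Precision and recall of the selected set against the applicable set."""
    hits = len(selected & applicable)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(applicable) if applicable else 0.0
    return precision, recall


# "turn on the heat": the score-to-domain assignment is assumed for illustration.
heat_scores = {"Home": 0.80, "TV": 0.15, "CustomerCare": 0.02, "News": 0.03}
heat_selected = select_domains(heat_scores)
heat_pr = precision_recall(heat_selected, {"TV", "Home"})

# "Show me my password": again an assumed assignment.
pw_scores = {"TV": 0.03, "Home": 0.04, "News": 0.03, "CustomerCare": 0.90}
pw_selected = select_domains(pw_scores)
pw_pr = precision_recall(pw_selected, {"CustomerCare"})
```

A low threshold favors recall: a query that plausibly belongs to several domains is forwarded to all of them, and the per-domain Answer Selectors decide what to return.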

9

Domain Selector in Practice

• Cascade of Deep Learning Models of increasing complexity

[Diagram: query “HBO” → Entity Detection (Service) → Simple Model → Complex Model; the YES/NO branches of each stage lead to SEND TO DOMAIN or DO NOT SEND TO DOMAIN]

10

Domain Selector in Practice

• Cascade of Deep Learning Models of increasing complexity

[Diagram: query “Show me funny comedies” → Entity Detection (Service) → Simple Model → Complex Model; the YES/NO branches of each stage lead to SEND TO DOMAIN or DO NOT SEND TO DOMAIN]
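
A cascade of this kind can be sketched as below. The branching logic is one reading of the diagram, not something the slides confirm: a YES at any stage sends the query to the domain, a NO defers to the next, more expensive stage, and a NO at the last stage means the query is not sent. The three stage functions are hypothetical stand-ins for trained models, and the slides do not say which stage actually handles each example query.

```python
def run_cascade(query, stages):
    """Run stages from cheapest to most expensive.

    Each stage returns True (YES: send to domain) or False (NO: defer to
    the next stage). If every stage says NO, the query is not sent.
    Returns (decision, name_of_deciding_stage).
    """
    for name, stage in stages:
        if stage(query):
            return True, name
    return False, None


# Hypothetical stand-ins; real stages would be trained models.
KNOWN_SERVICES = {"hbo", "netflix"}


def entity_detection(query):
    # Cheapest check: the query is exactly a known service name.
    return query.lower().strip() in KNOWN_SERVICES


def simple_model(query):
    # Placeholder for a small classifier: keyword match.
    return any(w in query.lower().split() for w in ("comedies", "movies", "shows"))


def complex_model(query):
    # Placeholder for a larger, slower model; here it never says YES.
    return False


TV_STAGES = [
    ("entity_detection", entity_detection),
    ("simple_model", simple_model),
    ("complex_model", complex_model),
]
```

The point of the ordering is cost: most queries are resolved by the cheap stages, so the expensive model only runs on the remainder.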

11

X1 Smart TV with Voice

• Query: “who plays the oracle in matrix”

[Diagram: Voice remote → ASR → query → NLP modules → action → Set-top Box → TV; the NLP modules send the Question (text) to QA and receive an Answer (id or text)]

12

First-order Question Answering

• Given:
– Question in natural-language form q
– Structured knowledge base that contains a list of facts
– [ subject – relation – (attribute) – object ]
• Return:
– Answer to q
• Assuming:
– q is answerable by a single fact.
– Source entity is mentioned in q.
– Answer is a neighbor of the source entity node.

[Diagram: subject → object edges with an optional attribute: “Matrix” → “Keanu Reeves” (attribute “Neo”); “Tom Hanks” → “9/1/1956”]
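
As a minimal sketch of the setting above, a knowledge base can be held as a list of facts and a first-order question answered by matching a single fact. The relation names (`actor`, `birth`) and the function name are hypothetical; only the entities come from the slides, and the birth fact uses the year-only form that appears later in the deck.

```python
from collections import namedtuple

# A fact follows the slide's [subject - relation - (attribute) - object] layout.
Fact = namedtuple("Fact", ["subject", "relation", "attribute", "obj"])

# Toy knowledge base; relation names are assumptions made for illustration.
FACTS = [
    Fact("Matrix", "actor", "Neo", "Keanu Reeves"),
    Fact("Tom Hanks", "birth", None, "1956"),
]


def first_order_answers(subject, relation, attribute=None):
    """Return objects of facts matching subject and relation (and attribute, if given)."""
    return [
        f.obj
        for f in FACTS
        if f.subject == subject
        and f.relation == relation
        and (attribute is None or f.attribute == attribute)
    ]
```

For example, `first_order_answers("Matrix", "actor", "Neo")` looks up the single neighbor of the "Matrix" node that plays "Neo".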

13

Question Answering with Knowledge Graph

• Question: “How old is Tom Hanks?”
• Extract Entities: [e1, …, eN] names/titles
• Predict Relation: relation r
• Structured Query: Subj=e1, Obj=?, Rel=r
• Knowledge Graph Search: e1 | r | e2
• Generate Answer: text answer
• Train: subj | rel | obj

14

Question Answering with Knowledge Graph

• Question: “How old is Tom Hanks?”
• Extract Entities: [e1, …, eN] names/titles → Tom Hanks
• Predict Relation: relation r → birth
• Structured Query: Subj=e1, Obj=e2, Rel=r → Subj=Tom Hanks, Rel=birth, Obj=?
• Knowledge Graph Search: e1 | r | e2 → Tom Hanks | birth | 1956
• Generate Answer: text answer
• Train: subj | rel | obj
• Text answers shown: “Tom Hanks is 55 years old.” and “Tom Hanks is 59 years old”
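
The steps on this slide can be sketched end to end as below. Everything here is a simplified stand-in: the entity list, the cue phrases and the function names are hypothetical, and simple string matching replaces the learned entity extractor and relation predictor. Only the triple Tom Hanks | birth | 1956 is taken from the slide.

```python
KNOWN_ENTITIES = ["Tom Hanks", "Matrix"]               # hypothetical entity list
RELATION_CUES = {"how old": "birth", "born": "birth"}  # hypothetical cue phrases
KNOWLEDGE_GRAPH = {("Tom Hanks", "birth"): "1956"}     # triple from the slide


def extract_entities(question):
    """Stand-in for the entity extractor: substring match on known names."""
    return [e for e in KNOWN_ENTITIES if e.lower() in question.lower()]


def predict_relation(question):
    """Stand-in for the relation classifier: first matching cue phrase."""
    for cue, relation in RELATION_CUES.items():
        if cue in question.lower():
            return relation
    return None


def answer(question, current_year):
    entities = extract_entities(question)
    relation = predict_relation(question)
    if not entities or relation is None:
        return None
    subject = entities[0]
    # Structured query: Subj=subject, Rel=relation, Obj=?
    obj = KNOWLEDGE_GRAPH.get((subject, relation))
    if obj is None:
        return None
    if relation == "birth" and "how old" in question.lower():
        # Year-level arithmetic only; ignores whether the birthday has passed.
        return f"{subject} is {current_year - int(obj)} years old."
    return f"{subject}: {relation} = {obj}"
```

The generated age depends on the year passed in, so the same stored fact yields different text answers over time.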

15

Question Answering with Knowledge Graph using Recurrent Neural Networks (RNNs)

• Question: “where Tom Hanks was born”
• Entity Detection ~ Tagging: [e1, …, eN] names/titles
– where / NA, Tom / Subj, Hanks / Subj, was / NA, born / NA
• Relation Prediction ~ Classification: relation r
– “where Tom Hanks was born” → place of birth
• Structured Query: subj=e, obj=?, attr=?, rel=r

[Diagram: each RNN reads the question word by word and carries a memory between steps]
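
Once a tagger has labeled each word, the subject entity is recovered by grouping consecutive `Subj` tokens. The sketch below shows only that decoding step, using the tags from the slide; the function name is hypothetical and the tagger itself is not modeled.

```python
def tagged_spans(tokens, tags, target="Subj"):
    """Group consecutive tokens that carry the target tag into entity strings."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == target:
            current.append(token)
        elif current:
            spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


tokens = ["where", "Tom", "Hanks", "was", "born"]
tags = ["NA", "Subj", "Subj", "NA", "NA"]
subjects = tagged_spans(tokens, tags)
```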

16

Recurrent Neural Networks

[Diagram: an RNN tagging the words “washington heights”; layers: word (input), hidden (with memory), output; output scores 0.39, 0.61 and 0.89, 0.11 over the labels LOC and PER]
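
The recurrence behind such a diagram can be sketched in plain Python. This is a generic Elman-style forward pass, offered as an illustration rather than the architecture used in the talk; the weights are random and untrained, so the scores it produces are meaningless and only show how the hidden state (memory) carries information from one word to the next. The vocabulary, layer sizes and names are assumptions.

```python
import math
import random


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_tag_scores(inputs, w_in, w_mem, w_out):
    """Elman-style RNN forward pass.

    hidden_t = tanh(w_in @ input_t + w_mem @ hidden_{t-1})
    output_t = softmax(w_out @ hidden_t)
    """
    hidden = [0.0] * len(w_mem)  # memory starts empty
    outputs = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_mem, hidden))]
        hidden = [math.tanh(p) for p in pre]
        outputs.append(softmax(matvec(w_out, hidden)))
    return outputs


# Untrained, randomly initialised weights; sizes and one-hot vocabulary are assumed.
rng = random.Random(0)
VOCAB = {"washington": [1.0, 0.0], "heights": [0.0, 1.0]}
HIDDEN, LABELS = 3, ["LOC", "PER"]
w_in = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(HIDDEN)]
w_mem = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(HIDDEN)]
w_out = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in LABELS]

scores = rnn_tag_scores([VOCAB["washington"], VOCAB["heights"]], w_in, w_mem, w_out)
```

Each entry of `scores` is a probability distribution over the labels for one word; the second entry depends on the first word through the hidden state.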

17

AI for Content Discovery – Automatic Content Analysis

Netflix, Live TV, Online Video

18

Most metadata is at the asset level

• Genres
• Credits
• Synopsis
• Keywords

19

Much more data exists within the asset

• Chapters
• Moments
• Annotations

[Diagram: Movie → Chapter → Scene → Shot → Frame]

20

Why is this useful?

• “Who is in this scene?”
• “What are the best moments on TV?”
• In-game highlight navigation
• Search & Recommendations

21

How does Automatic Content Analysis work?

[Diagram: Video → Computer Vision, Audio Analysis, Natural Language Processing → AI & Machine Learning → Chaptering, Scene-level Annotations, Frame-level Annotations]

22

Why is it possible now?

• Big Data
• Better Algorithms (Deep learning)
• Cloud/GPU Computing

[Chart: Large-scale image recognition performance]

23

Super-human accuracy in speech and image recognition!

• Big Data
• Better Algorithms (Deep learning)
• Cloud/GPU Computing

[Chart: Large-scale image recognition performance]

24

New experiences!

• Big Data
• Better Algorithms (Deep learning)
• Cloud/GPU Computing

25

Example Application: In-Game Highlights

• Place highlights over games recorded onto customers’ DVRs for football, baseball, hockey, basketball and soccer.
• The “In-Game Highlights” feature for NFL was released on Comcast X1 last fall.

“I’ll record as many games as I can. When I don’t want to watch the whole game, it’s a great way to do it.” – Customer Testimonial

26

AI for Content Discovery – Personalization

Netflix, Live TV, Online Video

27

Personalized Entertainment Experiences

“What is popular right now?” + “What do you like?” = Personalized Recommendations

28

What should I watch right now?

Deep learning-based recommender system for Live TV
- Training a joint embedding space to combine the scores
- Channel- and Program-based recommendations
- Time-dependent recommendations
- Trending/popular and personal favorite channels, programs, sports teams
- Rich content descriptions from automatic content analysis

[Diagram: Favorite Channels, Favorite Programs, Collaborative Filtering, Trending Popularity, Content Descriptions → Live TV Recommender System]
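
To make the idea of combining several signals concrete, the sketch below blends per-signal scores with a fixed weighted sum. This is a deliberately simple stand-in and not the method described on the slide, which trains a joint embedding space; the signal names, programs, scores and weights are all hypothetical.

```python
def blend_scores(signal_scores, weights):
    """Combine per-signal scores for each program into one ranked list."""
    combined = {}
    for signal, scores in signal_scores.items():
        for program, score in scores.items():
            combined[program] = combined.get(program, 0.0) + weights[signal] * score
    return sorted(combined, key=lambda p: combined[p], reverse=True)


# Hypothetical signals, programs, scores and weights, for illustration only.
signal_scores = {
    "favorite_channels": {"News at 6": 0.9, "Football Live": 0.2},
    "collaborative_filtering": {"Football Live": 0.7, "Cooking Show": 0.5},
    "trending": {"Football Live": 0.9, "News at 6": 0.1},
}
weights = {"favorite_channels": 0.5, "collaborative_filtering": 0.3, "trending": 0.2}
ranking = blend_scores(signal_scores, weights)
```

A learned model would replace the hand-set weights and could also condition on time of day, which a static weighted sum cannot express.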

29

Deep Learning Infrastructure

Netflix, Live TV, Online Video

30

Deep Learning Infrastructure

• Deep Learning Frameworks
– Keras, TensorFlow, Theano, PyTorch, Caffe (older models)
• All deployments using nvidia-docker
– Thanks to the Nvidia solutions team for help with best practices
• All deep learning training done on multi-GPU servers
– Nvidia Tesla (Production) and 8x Titan X (Dev) GPUs
– Nvidia DGX-1 for large-scale training (video and NLP)
• Next steps
– Container scheduler: Kubernetes and HashiCorp Nomad
– Network compression/simplification for increased efficiency (TensorRT)

31

Deep Learning-based ML is applied everywhere at Comcast

Machine Learning, Data Science, Big Data, AI: Improving Customer Experience Everywhere at Comcast/NBCU

• High Speed Internet
• Video
• IP Telephony
• Home Security / Automation
• Universal Parks
• Media Properties

For more info see: dclabs.comcast.com
