The Future of Search: How Measuring Satisfaction Will Enhance Our Personal AIs and Our Lives
Heidi Young, VP of Engineering, Ozlo
Seattle Interactive 2016


TRANSCRIPT

Page 1:

The Future of Search: How Measuring Satisfaction Will Enhance Our Personal AIs and Our Lives

Heidi Young, VP of Engineering

Ozlo

Page 2:

Who am I?

Search Junkie, Data Scientist, Engineer

Currently building Ozlo!!!

Page 3:

What is Ozlo?

Next generation assistant

Ozlo is leveraging artificial intelligence, machine learning, and natural language processing to power the next generation of search.

Ozlo is in the early stages of learning to understand a wide range of human goals and activities, and the words and ideas that connect them, so it can help users find what they actually need.

Page 4:

AI Assistant and Chatbot Landscape

Assistants (e.g. Siri)

Platforms for exposing chatbots (e.g. Alexa Skills Store, Bot Store, Skype Bot Store)

Building a chatbot or assistant

Page 5:

AI Assistant and Chatbot Landscape

https://twitter.com/ashevat/status/786690547733889024/photo/1

Page 6:

AI Assistant and Chatbot Landscape

https://twitter.com/davidjbland/status/725119174368976897

Page 7:

Why all the hype then?

We've moved to mobile, where messaging is the natural method of communication.

We're moving to connected smart devices and expect our interactions to be natural to our surroundings.

Page 8:

Why all the hype then?

There's a good chunk of information-seeking tasks that search engines don't handle well in their current form.

Say wha?

And they aren't the really hard ones that you're thinking of (i.e. research travel, buy a house).

Page 9:

Conversational UI

Page 10:

Why is conversational a better experience?

It isn't, for a lot of things.

Alexa, buy me some pants

I can’t buy pants. So I’ve added it to your shopping list.

😒

I want to order a pizza

Great! What kind of toppings would you like?

Pepperoni and sausage with extra cheese

And what kind of crust?

Thin crust

What size pizza would you like?

😒

On average: 73 taps with a conversational UI vs. 16 taps with a conventional filtering UI

Page 11:

Why is conversational a better experience?

It isn't, for a lot of things:

• Rich, robust filtering
• A highly visual experience
• A lot of variety

Page 12:

Answer? The most natural interaction for the task.

The bar should be:

What kind of response would you expect from a really knowledgeable friend?

Are there any good movies playing?

Here are some: …

Anything more kid friendly?

How about these? …

Which of these is playing around 9pm?

This is the only one playing around 9pm, near you…

Great! Can you get me a ticket?

Here’s a link to buy it on Fandango

Page 13:

Information Task Modes

Remember
• Simple facts
• Simple 1-2 sentence answers
• Clean, cut, dried

Understand
• Obtaining knowledge from a multitude of sources
• Constructing meaning from different content sources

Analyze
• Breaking material into constituent parts
• Determine relationships
• Make decisions

https://www.microsoft.com/en-us/research/wp-content/uploads/2015/08/fp286-bailey.pdf

Page 14:

Information Task Modes

https://www.microsoft.com/en-us/research/wp-content/uploads/2015/08/fp286-bailey.pdf

In typical web search tasks, users have expectations for how many queries they'll issue and how many documents they'll review.

Page 15:

Back to that hype thing…

https://www.microsoft.com/en-us/research/wp-content/uploads/2015/08/fp286-bailey.pdf

Chatbots and AIs of today are primarily focused on stuff that's pretty easy to get with an existing app or search engine.

But our expectation is that they can do these harder tasks.

Page 16:

Understand or Analyze Type of Task

What’s a good place to watch the game nearby?

A point of interest

That is rated highly, or is popular, or is known for this type of task

Implies a sports bar, or a point of interest that typically has a television with sports available

Close to your current location

Depending on where you're located, this could mean within walking distance or a 20-minute drive, depending on the density of POIs and the sparsity of available content

VERY IMPORTANT!!!

There is not ONE right answer to this question

It is a subjective question. Depending on your content sources, results can vary widely.

It requires a lot of synthesis across multiple sources, and likely means presenting multiple sources, not a definitive answer.
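
To make that interpretation step concrete, here is a minimal sketch of how a query like "a good place to watch the game nearby" might be decomposed into structured constraints. The class, field names, and thresholds are illustrative assumptions, not Ozlo's actual representation.

```python
from dataclasses import dataclass

@dataclass
class PlaceIntent:
    """Hypothetical structured reading of 'a good place to watch the game nearby'."""
    poi_categories: list      # e.g. ["sports bar", "restaurant with TVs"]
    quality_signals: list     # e.g. ["highly rated", "popular", "known for watching sports"]
    max_distance_km: float    # "nearby" expands when POIs or available content are sparse

def interpret_query(poi_density_per_km2: float) -> PlaceIntent:
    # "Nearby" is relative: walking distance in a dense area,
    # roughly a 20-minute drive where POIs or content are sparse.
    radius_km = 1.5 if poi_density_per_km2 > 50 else 15.0
    return PlaceIntent(
        poi_categories=["sports bar", "restaurant with TVs"],
        quality_signals=["highly rated", "popular", "known for watching sports"],
        max_distance_km=radius_km,
    )
```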

Page 17:

What you really want

[Slide graphic: a mixed result set (Place A: great sports bar nearby; Place B: romantic restaurant nearby; Place C: coffeeshop nearby) with the poor fits crossed out, versus a result set where every option fits (Place A: great sports bar nearby; Place D: restaurant known for sports and TVs)]

Page 18:

Some existing experiences

Alexa

Google Assistant via Allo

Page 19:

What might a good experience look like?

Present evidence as to why those are good options

Present multiple options, but not so many that it’s overwhelming

Establish that you were heard and that it understood what you actually meant (i.e. sports bars, nearby)

Offer the most likely refinements and follow-on prompts (a rough sketch of such a response follows below)
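
As a rough illustration only (these class and field names are assumptions, not Ozlo's API), a response that meets the criteria above might be structured like this:

```python
from dataclasses import dataclass

@dataclass
class Option:
    name: str
    evidence: str          # why this is a good option, e.g. "known for game-day crowds, 12 TVs"

@dataclass
class AssistantResponse:
    acknowledgment: str    # shows the user was heard and understood
    options: list          # a handful of Option objects, not an overwhelming list
    refinements: list      # likely follow-on prompts

response = AssistantResponse(
    acknowledgment="Here are sports bars near you that show the game:",
    options=[
        Option("Place A", "Highly rated sports bar, 0.4 miles away"),
        Option("Place D", "Restaurant known for sports and TVs"),
    ],
    refinements=["Only walking distance", "Which ones take reservations?"],
)
```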

Page 20:

Successful Measurement of Conversational UIs

Page 21:

To measure, we must understand

The National Communication Association publishes a rating scale to assess skills in interpersonal settings during conversation.

1 = Inadequate: awkward, disruptive, leaving a negative impression
5 = Excellent: smooth, controlled, leaving a positive impression

Attentiveness: attention to, and concern for, the conversational partner

Composure: confidence, assertiveness

Expressiveness: articulation, animation, variation

Coordination: non-disruptive negotiation of speaking turns

Page 22:

What do REAL messaging conversations look like?

New vs. continuing conversations

Identifying satisfaction of each sub-conversation

Page 23:

How we think about things at Ozlo

Negative conversations

Bottom line: How did the conversation end?

Negative indicators, implicit AND explicit

We:

1. Identify conversation boundaries
2. Assign a positive or negative assessment to each interaction
3. Mark the conversation as negative if it "ended" negatively (see the sketch after this list)
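
A minimal sketch of that three-step pipeline, assuming a simple time-gap heuristic for conversation boundaries. The helper names, the 30-minute gap, and the placeholder assessment logic are illustrative assumptions, not Ozlo's implementation.

```python
from datetime import timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed heuristic: a long silence starts a new conversation

def split_conversations(messages):
    """Step 1: identify conversation boundaries by time gap between messages."""
    conversations, current = [], []
    for msg in messages:  # each msg: {"ts": datetime, "text": str, ...}
        if current and msg["ts"] - current[-1]["ts"] > SESSION_GAP:
            conversations.append(current)
            current = []
        current.append(msg)
    if current:
        conversations.append(current)
    return conversations

def assess_interaction(msg) -> bool:
    """Step 2: positive/negative assessment of a single interaction (placeholder logic)."""
    return not msg.get("explicit_negative", False)

def ended_negatively(conversation, last_n=3) -> bool:
    """Step 3: the conversation is negative if it ended on negative interactions."""
    return any(not assess_interaction(m) for m in conversation[-last_n:])
```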

Page 24:

What's a negative-ending conversation?

Conversations that contain one of the following in the last N messages of the interaction:

1. Explicit negative feedback

2. Highly latent

3. Not well understood

4. No follow on


Page 25:

What's a negative-ending conversation?

Negative ending: Explicit negative feedback
Specific signal: Thumbs down
Roughly maps to NCA ratings for: Composure (i.e. didn't understand, results could be better), Attentiveness (i.e. oddly worded response, didn't understand), Expressiveness (i.e. oddly worded response)

Negative ending: Highly latent
Specific signal: Response takes more than 1 second
Roughly maps to NCA ratings for: Coordination (i.e. controlling the flow of the conversation, "never leave me hanging")

Negative ending: Not well understood
Specific signal: "Didn't understand" responses, low confidence scores
Roughly maps to NCA ratings for: Composure, Expressiveness

Negative ending: No follow-on
Specific signal: Lack of prompts displayed, lack of engagement for non-Q&A questions
Roughly maps to NCA ratings for: Coordination, Attentiveness
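
To make these signals concrete, here is a rough sketch of how the last N messages might be scanned for the four indicators. The field names and the confidence cutoff are illustrative assumptions; only the ">1 second" latency threshold comes from the slide.

```python
LATENCY_THRESHOLD_S = 1.0    # from the slide: highly latent means more than 1 second
CONFIDENCE_THRESHOLD = 0.5   # assumed cutoff for "not well understood"

def negative_ending_signals(messages, last_n=3):
    """Return which negative-ending signals appear in the last N messages of a conversation."""
    recent = messages[-last_n:]
    return {
        "explicit_negative_feedback": any(m.get("thumbs_down") for m in recent),
        "highly_latent": any(m.get("response_latency_s", 0) > LATENCY_THRESHOLD_S for m in recent),
        "not_well_understood": any(m.get("confidence", 1.0) < CONFIDENCE_THRESHOLD for m in recent),
        "no_follow_on": not any(m.get("prompts_shown") for m in recent),
    }

def is_negative_ending(messages, last_n=3) -> bool:
    return any(negative_ending_signals(messages, last_n).values())
```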

Page 26:

Why this over DAUs?

It’s not one over the other

DAUs/MAUs are lagging indicators

We must optimize for in-the-moment interactions

Measuring negatively ending conversations allows us to react in the moment, and to aggregate and set targets
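
As an illustration of "aggregate and set targets": once each conversation is flagged, the flags roll up into a rate that can be tracked against a target. The metric name and the 10% target below are assumptions for the sketch, not Ozlo's actual numbers.

```python
def negative_ending_rate(conversation_outcomes) -> float:
    """Share of conversations flagged as negatively ending, for a given time window.

    conversation_outcomes: list of booleans, e.g. the output of the
    is_negative_ending() sketch above applied to each conversation.
    """
    if not conversation_outcomes:
        return 0.0
    return sum(conversation_outcomes) / len(conversation_outcomes)

# Hypothetical target, purely for illustration: keep negatively ending conversations under 10%.
TARGET_NEGATIVE_RATE = 0.10

def meets_target(conversation_outcomes) -> bool:
    return negative_ending_rate(conversation_outcomes) <= TARGET_NEGATIVE_RATE
```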

Page 27:

Will this result in better AI experiences?

Still early.

This is how we learn, reinforce good behavior

Once we successfully measure, we can optimize!

Page 28:

Questions?