Smart Data Webinar: Deep QA (Question/Answer) - Lessons from Watson and Jeopardy!
TRANSCRIPT
Copyright (c) 2016 by STORM Insights Inc. All Rights reserved. 7/17/2015
Deep QA (Question/Answer) - Lessons From Watson and Jeopardy!
October 13, 2016
Adrian Bowles, PhD
Founder, STORM Insights, Inc.
Deep Question/Answering - Lessons from Watson & Jeopardy!
The Game
The Challenge
Scope of the problem
DeepQA Architecture & Processes
Software, Hardware & Resources
Next Steps
Playing the Game:
Answers must be given in the form of a question
Last contestant to answer correctly chooses the next question
Correct responses must satisfy the demands of both the clue and the category
Jeopardy: Six categories, 5 questions for each category, $100-500 based on difficulty
Double Jeopardy: Six categories, 5 questions for each category, $200-1,000 based on difficulty, and 3 hidden questions allow the person who chooses them to bet everything they have at that point in the game
Final Jeopardy: Player must have a positive balance from the previous round to play. Players see the category and then decide - secretly - how much to wager. The question is presented. 30 seconds to answer.
Wikipedia, The Free Encyclopedia. October 12, 2016, 02:40 UTC. Available at: https://en.wikipedia.org/w/index.php?title=Jeopardy!&oldid=743931483. Accessed October 12, 2016.
Challenges of Jeopardy! for Machines:
Open domain, broad use of language - Jeopardy! questions often involve puns, ambiguity…
IBM reviewed a sample of 20,000 questions and found 2,500 distinct lexical answer types (LATs)
No single LAT accounted for more than 3% of the total
For each category, there could be thousands of questions
Best players provide correct answers ~85% of the time
Best players know what they don’t know - base their bets on their confidence
~3 seconds to answer questions
Constraint: Players may only use the data/knowledge they have on arrival - no lifelines, resources…
Winning Jeopardy! requires a contestant to answer ~70% of the questions, with 80%+ precision.
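The trade-off between how many questions a player attempts and how often those attempts are right can be sketched in a few lines. This is a minimal illustration, not Watson's method: the function name, the confidence threshold, and the ten sample results are all hypothetical.

```python
from typing import List, Tuple


def precision_and_coverage(
    results: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (precision, fraction answered) when only questions whose
    confidence is at or above `threshold` are attempted.

    `results` holds one (confidence, was_correct) pair per question.
    """
    attempted = [correct for conf, correct in results if conf >= threshold]
    coverage = len(attempted) / len(results) if results else 0.0
    precision = sum(attempted) / len(attempted) if attempted else 0.0
    return precision, coverage


# Hypothetical results for ten questions (invented for illustration).
results = [
    (0.95, True), (0.90, True), (0.85, True), (0.80, False), (0.75, True),
    (0.70, True), (0.60, True), (0.40, False), (0.30, True), (0.10, False),
]

precision, coverage = precision_and_coverage(results, threshold=0.5)
print(f"answered {coverage:.0%} of questions with {precision:.0%} precision")
```

With this invented data, buzzing only when confidence is at least 0.5 means answering 7 of 10 questions and getting 6 of those right. Lowering the threshold raises the share answered but lowers precision.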
Predicting lexical answer types in open domain question and answering (QA) systems, US 20130035931 A1, 2013, Ferrucci, Gliozzo, Kalyanpur
Jeopardy! Constraints: Precision, Speed, Confidence
Business Constraints: Quality, Speed, Cost
Formalizing the Decision-Making Process
Boyd’s Loop - John Boyd (1927-1997): Observe, Orient, Decide, Act; World Model
Continuous Learning: Look for Similar Solved Problems; Accept or Create Problem Statement; Generate Hypotheses; Identify Evidence in Corpus; Score Evidence; Score Hypotheses; Present Results; Get Feedback; Train Model
Automating QA
Machine Learning, NLU, NLG, Information Retrieval, Reasoning, Knowledge Representation
Generate Hypotheses; Evidence: Gather, Evaluate, Weigh; Decide
What they didn’t do:
Build a database of question/answer pairs
Build a formal model of the world
Build a search engine
What they did:
DeepQA - “a massively parallel probabilistic evidence-based architecture.”*
Develop reusable NLU tech to analyze text
Analyze sources - structured and unstructured - to capture background knowledge
Apply knowledge representation and reasoning (KRR) to the resulting structured knowledge
Use machine learning to generate and score hypotheses
* Building Watson: An Overview of the DeepQA Project, AI Magazine, Fall 2010 Issue, Ferrucci, Brown, Chu-Carroll, Fan, Gondek, Kalyanpur, Lally, Murdock, Nyberg, Prager, Schlaefer, Welty.
Massively Parallel Probabilistic Evidence-based Architecture
Content Acquisition
Building the corpus
For Jeopardy! this had to be completed before the game commenced.
Ingested encyclopedias, dictionaries, thesauri, newswire articles, literary works, databases, taxonomies, ontologies…
In real-world use, we can identify and use new resources based on the problem at hand.
Question Analysis
What is being asked?
Question classification: any words with double meanings? Puzzle question, factoid…?
Detect focus, LAT, relations
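To make the idea of a lexical answer type (LAT) concrete, here is a deliberately naive sketch that guesses the LAT from the word following "this" or "these" in a clue. The regular expression and the function name `guess_lat` are assumptions made for illustration. A real question-analysis component relies on parsing and far richer rules than this.

```python
import re
from typing import Optional

# Naive heuristic (an assumption, not Watson's approach): the word that
# follows "this" or "these" often names the type of answer being sought.
_LAT_PATTERN = re.compile(r"\b(?:this|these)\s+([a-z]+)", re.IGNORECASE)


def guess_lat(clue: str) -> Optional[str]:
    """Return a guessed lexical answer type, or None if the heuristic fails."""
    match = _LAT_PATTERN.search(clue)
    return match.group(1).lower() if match else None


print(guess_lat("Chile shares its longest land border with this country."))
print(guess_lat("They're the two states you could be reentering "
                "if you're crossing Florida's northern border."))
```

The first clue yields "country". The second yields nothing, because its answer type ("states") is not introduced by "this" or "these", which shows why a single surface pattern is not enough.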
Relation-detection
“They’re the two states you could be reentering if you’re crossing Florida’s northern border.”
Category: Head North
borders(Florida, ?x, north)
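Once a clue has been reduced to a relation with an unbound variable, candidate answers can be looked up in structured data. The sketch below assumes a tiny hand-entered fact table and a hypothetical `borders` function in which `None` plays the role of the unbound variable. It shows the shape of the lookup only, not how DeepQA stores or queries its knowledge.

```python
from typing import List, NamedTuple, Optional


class Border(NamedTuple):
    region: str
    neighbor: str
    direction: str  # direction of `neighbor` as seen from `region`


# Toy fact table, entered by hand for illustration only.
FACTS = [
    Border("Florida", "Georgia", "north"),
    Border("Florida", "Alabama", "north"),
    Border("Georgia", "Tennessee", "north"),
    Border("Georgia", "Florida", "south"),
]


def borders(
    region: Optional[str], neighbor: Optional[str], direction: Optional[str]
) -> List[Border]:
    """Return every fact matching the pattern; None is an unbound variable."""
    pattern = (region, neighbor, direction)
    return [
        fact for fact in FACTS
        if all(p is None or p == v for p, v in zip(pattern, fact))
    ]


# borders(Florida, ?x, north): bind ?x to each matching neighbor.
candidates = [fact.neighbor for fact in borders("Florida", None, "north")]
print(candidates)
```

With this table the query binds the variable to Georgia and Alabama, the two states the clue is asking for.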
Hypothesis Generation & Scoring
Plug each candidate answer into the question and try to prove it correct, with a degree of confidence supported by the evidence.
Scoring may use a variety of relationships:
temporal
spatial
geospatial
taxonomic classification
correlation between candidate and question…
Evaluating Potential Answers
Watson scores evidence in multiple dimensions
What works for a factoid question may not work for a puzzle question.
“Chile shares its longest land border with this country.”
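Scoring evidence in multiple dimensions can be sketched as a weighted combination of per-dimension scores. Everything below is assumed for illustration: the dimension names, the two candidates' scores, and the weights are invented, and a weighted sum is only one simple way to combine them.

```python
from typing import Dict

# Hypothetical weights per evidence dimension (invented for illustration).
WEIGHTS: Dict[str, float] = {
    "geospatial": 0.4,
    "passage_support": 0.3,
    "source_reliability": 0.2,
    "popularity": 0.1,
}

# Hypothetical evidence profiles for two candidate answers to the Chile clue.
EVIDENCE: Dict[str, Dict[str, float]] = {
    "Argentina": {"geospatial": 0.9, "passage_support": 0.7,
                  "source_reliability": 0.8, "popularity": 0.4},
    "Bolivia": {"geospatial": 0.3, "passage_support": 0.4,
                "source_reliability": 0.5, "popularity": 0.9},
}


def combined_score(profile: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum over dimensions; a missing dimension contributes 0."""
    return sum(weight * profile.get(dim, 0.0) for dim, weight in weights.items())


scores = {name: combined_score(profile, WEIGHTS) for name, profile in EVIDENCE.items()}
best = max(scores, key=scores.get)
most_popular = max(EVIDENCE, key=lambda name: EVIDENCE[name]["popularity"])
print(best, most_popular)
```

In this made-up profile a single dimension (popularity) favors Bolivia, while the combined evidence favors Argentina. A different question type could use a different set of weights.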
Merging & Ranking
Identifying the most likely answer based on confidence scores.
Answer scores are merged before ranking and confidence estimation.
Uses a machine learning approach, trained on prior data, to decide between candidates when confidence scores in different categories produce “too close to call” results.
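The merge-then-rank step can be illustrated with a small sketch. The alias table, the noisy-OR rule for combining scores of equivalent answers, and the fixed `margin` used to flag a "too close to call" result are all assumptions chosen for brevity. In particular, the fixed margin is a stand-in for the learned model the slide describes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical alias table mapping surface forms to one canonical answer.
ALIASES: Dict[str, str] = {
    "jfk": "John F. Kennedy",
    "j.f.k.": "John F. Kennedy",
    "john f. kennedy": "John F. Kennedy",
}


def canonical(answer: str) -> str:
    """Map an answer string to its canonical form when an alias is known."""
    cleaned = answer.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


def merge_and_rank(
    scored: List[Tuple[str, float]], margin: float = 0.05
) -> Tuple[List[Tuple[str, float]], bool]:
    """Merge equivalent answers, rank them, and flag near-ties.

    Scores of equivalent answers are combined with a noisy-OR
    (1 - product of (1 - score)), an assumption made for this sketch.
    """
    merged: Dict[str, float] = defaultdict(float)
    for answer, score in scored:
        name = canonical(answer)
        merged[name] = 1.0 - (1.0 - merged[name]) * (1.0 - score)
    ranked = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    too_close = len(ranked) > 1 and (ranked[0][1] - ranked[1][1]) < margin
    return ranked, too_close


ranked, too_close = merge_and_rank(
    [("JFK", 0.5), ("John F. Kennedy", 0.4), ("Richard Nixon", 0.6)]
)
print(ranked[0][0], too_close)
```

Before merging, "Richard Nixon" has the highest individual score. After the two Kennedy variants are merged, their combined score moves ahead, which is why merging has to happen before ranking.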
Wikipedia, The Free Encyclopedia. October 12, 2016, 17:06 UTC. Available at: https://en.wikipedia.org/w/index.php?title=Watson_(computer)&oldid=744021754. Accessed October 12, 2016.
Software
Apache Hadoop - http://hadoop.apache.org
Apache UIMA - Unstructured Information Management Architecture - http://uima.apache.org
IBM DB2
Linux (SUSE Linux Enterprise Server 11)
Resources
WordNet(R)
Princeton University "About WordNet." WordNet. Princeton University. 2010. <http://wordnet.princeton.edu>
Copyright (c) 2016 by STORM Insights Inc. All Rights Reserved. 9/28/2011
Hardware
IBM Power 750: 90 servers, 32 cores/server, 2880 cores in 10 racks
16 TB RAM
~80 teraFLOPS
80,000,000,000,000 FLOPS
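The hardware figures on this slide are consistent with each other, as a quick arithmetic check shows. The variable names are illustrative, and the per-core figure is a derived average rather than a number from the slide.

```python
servers = 90
cores_per_server = 32
total_cores = servers * cores_per_server  # slide states 2880 cores

teraflops = 80
flops = teraflops * 10**12  # 1 teraFLOPS = 10^12 FLOPS

# Derived average, not a figure from the slide.
gflops_per_core = flops / total_cores / 10**9

print(f"{total_cores} cores, {flops:,} FLOPS, "
      f"about {gflops_per_core:.1f} GFLOPS per core")
```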
Next Steps…
For more information:
Twitter @ajbowles
Skype ajbowles
Upcoming Webinar Dates & Topics
November 10 Emerging Hardware Choices for Modern AI Data Management
December 8 Leverage the IoT to Build a Smart Data Ecosystem
2017 Webinar Themes
Technology Trends
Market Trends
Communicating, Learning, Understanding, Reasoning, Planning