Lecture 1: Introduction
Kai-Wei Chang
CS @ University of Virginia
Course webpage: http://kwchang.net/teaching/NLP16
CS6501 – Natural Language Processing
Announcements
Waiting list: Start attending the first few meetings
of the class as if you are registered. Given that
some students will drop the class, some space
will free up.
We will use Piazza as an online discussion
platform. Please enroll.
Staff
Instructor: Kai-Wei Chang
Email: [email protected]
Office: R412 Rice Hall
Office hour: 2:00 – 3:00, Tue (after class).
Additional office hour: 3:00 – 4:00, Thu
TA: Wasi Ahmad
Email: [email protected]
Office: R432 Rice Hall
Office hour: 4:00 – 5:00, Mon
This lecture
Course Overview
What is NLP? Why is it important?
What will you learn from this course?
Course Information
What are the challenges?
Key NLP components
What is NLP
Wiki: Natural language processing (NLP) is
a field of computer science, artificial
intelligence, and computational linguistics
concerned with the interactions between
computers and human (natural) languages.
Go beyond keyword matching
Identify the structure and meaning of
words, sentences, texts and conversations
Deep understanding of broad language
NLP is all around us
Machine translation
Facebook translation, image credit: Meedan.org
Statistical machine translation
Image credit: Julia Hockenmaier, Intro to NLP
Dialog Systems
Sentiment/Opinion Analysis
Text Classification
Other applications?
www.wired.com
Question answering
credit: ifunny.com
'Watson' computer wins at 'Jeopardy'
Question answering
Go beyond search
Natural language instruction
https://youtu.be/KkOCeAtKHIc?t=1m28s
Digital personal assistant
Semantic parsing – understand tasks
Entity linking – “my wife” = “Kellie” in the phone
book
credit: techspot.com
More on natural language instruction
Information Extraction
Unstructured text to database entries
Yoav Artzi: Natural language processing
Language Comprehension
Q: Who wrote Winnie the Pooh?
Q: Where did Chris live?
Christopher Robin is alive and well. He is the same
person that you read about in the book, Winnie the Pooh.
As a boy, Chris lived in a pretty home called Cotchfield
Farm. When Chris was three years old, his father wrote
a poem about him. The poem was printed in a magazine
for others to read. Mr. Robin then wrote a book
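A minimal sketch of why this task needs more than keyword matching. The passage is split into sentences by hand, and the helper names `tokens` and `best_match` are illustrative assumptions, not part of any course material. The baseline simply returns the sentence that shares the most words with the question.

```python
import re

# Passage from the slide, split into sentences by hand.
SENTENCES = [
    "Christopher Robin is alive and well.",
    "He is the same person that you read about in the book, Winnie the Pooh.",
    "As a boy, Chris lived in a pretty home called Cotchfield Farm.",
    "When Chris was three years old, his father wrote a poem about him.",
    "The poem was printed in a magazine for others to read.",
    "Mr. Robin then wrote a book",
]


def tokens(text):
    """Lowercase a string and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def best_match(question, sentences):
    """Return the sentence sharing the most distinct words with the question."""
    q = tokens(question)
    return max(sentences, key=lambda s: len(q & tokens(s)))


# The baseline picks the sentence that mentions "Winnie the Pooh",
# which does not say who wrote the book.
print(best_match("Who wrote Winnie the Pooh?", SENTENCES))
```

The selected sentence overlaps on "Winnie", "the", and "Pooh" but never names an author. Answering correctly requires linking "Mr. Robin" to Chris's father and "a book" to Winnie the Pooh, which word overlap cannot do.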
What will you learn from this course?
The NLP Pipeline
Key components for
understanding text
NLP systems/applications
Current techniques & limitations
Build realistic NLP tools
What’s not covered by this course
Speech recognition – no signal processing
Natural language generation
Details of ML algorithms / theory
Text mining / information retrieval
This lecture
Course Overview
What is NLP? Why is it important?
What will you learn from this course?
Course Information
What are the challenges?
Key NLP components
Overview
New course, first time being offered
Comments are welcome
Aimed at first- or second-year PhD students
Lecture + Seminar
No course prerequisites, but I assume
programming experience (for the final project)
basics of probability, calculus, and linear
algebra (HW0)
Grading
No exam & HW -- hooray
Lectures & forum
Participate in discussion (additional credits)
Review quizzes (25%): 3 quizzes
Critical review report (10%)
Paper presentation (15%)
Final project (50%)
Quizzes
Format
Multiple choice questions
Fill-in-the-blank
Short answer questions
Each quiz: ~20 min in class
Schedule: see course website
Closed book, Closed notes, Closed laptop
Critical review report
1 page maximum
Pick one paper from the suggested list
Summarize the paper (use your own words)
Provide detailed comments
What can be improved
Potential future directions
Other related work
Some students will be selected to present
their critical reviews
Paper presentation
Each group has 2~3 students
Pick one paper from the suggested
readings, or your favorite paper
Cannot be the same as critical review report
Can be related to your final project
Register your choice early
15 min presentation + 2 min Q&A
Graded by the instructor, the TA, and other
students
Final Project
Work in groups (2~3 students)
Project proposal
Written report, 2 page maximum
Project report (35%)
< 8 pages, ACL format
Due 2 days before the final presentation
Project presentation (15%)
5-min in-class presentation (tentative)
Late Policy
Credit of 48 hours for all the assignments
Including proposal and final project
No accumulation
No more grace period
No make-up exam
unless there is an emergency
Cheating/Plagiarism
No. Ask if you have concerns
UVA Honor Code:
http://www.virginia.edu/honor/
Lectures and office hours
Participation is highly appreciated!
Ask questions if you are still confused
Feedback is welcome
Lead the discussion in this class
Enroll in Piazza
https://piazza.com/virginia/fall2016/cs6501004
Topics of this class
Fundamental NLP problems
Machine learning & statistical approaches
for NLP
NLP applications
Recent trends in NLP
What to Read?
Natural Language Processing: ACL, NAACL, EACL, EMNLP, CoNLL, Coling, TACL
aclweb.org/anthology
Machine learning: ICML, NIPS, ECML, AISTATS, ICLR, JMLR, MLJ
Artificial Intelligence: AAAI, IJCAI, UAI, JAIR
Questions?
This lecture
Course Overview
What is NLP? Why is it important?
What will you learn from this course?
Course Information
What are the challenges?
Key NLP components
Challenges – ambiguity
Word sense ambiguity
Challenges – ambiguity
Word sense / meaning ambiguity
Credit: http://stuffsirisaid.com
Challenges – ambiguity
PP attachment ambiguity
Credit: Mark Liberman, http://languagelog.ldc.upenn.edu/nll/?p=17711
Challenges – ambiguity
Ambiguous headlines:
Include your children when baking cookies
Hospitals are Sued by 7 Foot Doctors
Iraqi Head Seeks Arms
Safety Experts Say School Bus Passengers
Should Be Belted
Challenges – ambiguity
Pronoun reference ambiguity
Credit: http://www.printwand.com/blog/8-catastrophic-examples-of-word-choice-mistakes
Challenges – language is not static
Language grows and changes
e.g., cyber lingo
LOL Laugh out loud
G2G Got to go
BFN Bye for now
B4N Bye for now
Idk I don’t know
FWIW For what it’s worth
LUWAMH Love you with all my heart
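A minimal sketch of why a fixed abbreviation lexicon struggles when language keeps changing. The table is copied from the slide. The function name `expand`, the all-caps heuristic, and the test abbreviation `TBH` are assumptions added for illustration only.

```python
# Abbreviation table copied from the slide.
CYBER_LINGO = {
    "lol": "laugh out loud",
    "g2g": "got to go",
    "bfn": "bye for now",
    "b4n": "bye for now",
    "idk": "I don't know",
    "fwiw": "for what it's worth",
    "luwamh": "love you with all my heart",
}


def expand(message, lexicon=CYBER_LINGO):
    """Expand known abbreviations word by word.

    Returns the expanded text and a list of tokens that look like
    abbreviations but are missing from the lexicon.
    """
    expanded, unknown = [], []
    for word in message.split():
        key = word.lower()
        if key in lexicon:
            expanded.append(lexicon[key])
        else:
            expanded.append(word)
            # Hypothetical heuristic: an all-caps token is probably an abbreviation.
            if word.isupper() and len(word) > 1:
                unknown.append(word)
    return " ".join(expanded), unknown


text, unknown = expand("IDK what TBH means G2G")
print(text)
print(unknown)
```

Any abbreviation coined after the table was written passes through unexpanded, so a static lexicon needs constant updating to keep up with usage.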
Challenges – language is compositional
Carefully Slide
Challenges – language is compositional
小心:
Carefully
Careful
Take
Care
Caution
地滑:
Slide
Landslip
Wet Floor
Smooth
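A minimal sketch of how word-by-word translation multiplies candidates. The gloss lists are taken line by line from the slide, so "Take" and "Care" are treated as separate glosses, which is an assumption about the original layout. The function name `word_by_word` is hypothetical.

```python
from itertools import product

# Candidate glosses for each part of the sign, listed as they appear on the slide.
GLOSSES = {
    "小心": ["Carefully", "Careful", "Take", "Care", "Caution"],
    "地滑": ["Slide", "Landslip", "Wet Floor", "Smooth"],
}


def word_by_word(parts, glosses=GLOSSES):
    """Return every translation formed by choosing one gloss per part."""
    options = (glosses[part] for part in parts)
    return [" ".join(choice) for choice in product(*options)]


candidates = word_by_word(["小心", "地滑"])
print(len(candidates))                    # 5 glosses x 4 glosses = 20 candidates
print(candidates[0])                      # "Carefully Slide", the mistranslation
print("Caution Wet Floor" in candidates)  # the intended reading is also present
```

Both the mistranslation and the intended reading appear among the candidates. Choosing between them depends on how the parts combine in context, which a per-word lookup alone cannot decide.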
Challenges – scale
Examples:
Bible (King James version): ~700K words
Penn Treebank: ~1M words from the Wall Street Journal
Newswire collection: 500M+ words
Wikipedia: 2.9 billion words (English)
Web: several billion words
This lecture
Course Overview
What is NLP? Why is it important?
What will you learn from this course?
Course Information
What are the challenges?
Key NLP components
Part of speech tagging
Syntactic (Constituency) parsing
Syntactic structure => meaning
Image credit: Julia Hockenmaier, Intro to NLP
Dependency Parsing
Semantic analysis
Word sense disambiguation
Semantic role labeling
Credit: Ivan Titov
Christopher Robin is alive and well. He is the
same person that you read about in the book,
Winnie the Pooh. As a boy, Chris lived in a
pretty home called Cotchfield Farm. When
Chris was three years old, his father wrote a
poem about him. The poem was printed in a
magazine for others to read. Mr. Robin then
wrote a book
Q: [Chris] = [Mr. Robin] ?
Slide modified from Dan Roth
Co-reference Resolution
Questions?