welcome to the course! introduction to natural language processing
DESCRIPTION
Welcome to the course! Introduction to Natural Language Processing. Professors: Marta Gatius Vila Horacio Rodríguez Hontoria Hours per week 2 theory hours + 1 problem/laboratory hour Main goal Understand the fundamental concepts of NLP Most well-known techniques and theories - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/1.jpg)
NLP Introduction 1
Welcome to the course!Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Professors:
Marta Gatius Vila
Horacio Rodríguez Hontoria Hours per week
2 theory hours + 1 problem/laboratory hour Main goal
Understand the fundamental concepts of NLP
Most well-known techniques and theories Most relevant existing resources. Most relevant applications
![Page 2: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/2.jpg)
NLP Introduction 2
Welcome to the course!Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Content1. Introduction to Natural Language
Processing 2. Applications.3. Language models.4. Morphology and lexicons.5. Syntactic processing.6. Semantic and pragmatic processing. 7. Generation
![Page 3: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/3.jpg)
NLP Introduction 3
Welcome to the course!Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Assesment• Exams
Partial exam- November, the 8th
Final exam – Final exams period- It will include all the course contents
• Project – Groups of two students
Course grade = maximum ( partial exam grade*0.15 + final exam grade*0.45, final exam grade* 0.6) + exercises grade *0.4
![Page 4: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/4.jpg)
NLP Introduction 4
Related (or the same) disciplines:•Computational Linguistics•Natural Language Processing, NLP,•Linguistic Engineering, LE•Human Language Technology, HL
Welcome to the course!Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
![Page 5: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/5.jpg)
NLP Introduction 5
Linguistic Engineering (LE)
• LE consists of the application of linguistic knowledge to the development of computer systems able to recognize, understand, interpretate and generate of human language in all its forms.
• LE includes:• Formal models (representations of knowledge
of language at the different levels)• Theories and algorithms• Techniques and Tools• Resources (Lingware)• Applications
![Page 6: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/6.jpg)
NLP Introduction 6
Linguistic knowledge levels
– Phonetics and phonology. Language models– Morphology: Meaningful components of words.
Lexicone.g., doors is plural
– Syntax: Structural relationships between words. Grammar
e.g., an utterance is a question or a statement– Semantics: Meaning of words and how they
combine. Grammar, domain knowledgee.g., open the door
– Pragmatics: How language is used to accomplish goals. Domain and Dialogue Knowledge
e.g., to be polite– Discourse: How single utterances are structured.
Dialogue models
![Page 7: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/7.jpg)
NLP Introduction 7
Examples of applications involving language models at those different levels Intelligent agents (i.e. HAL from the
movie 2001: A space Odyssey) Web-based question answered Machine translation engines
Foundations of LE lie in: Linguistics Mathematics Elecctrical engineering Phychology
Linguistic Engineering (LE)
![Page 8: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/8.jpg)
NLP Introduction 8
Exciting time for Increase in computer resources availabe The rise of the Web (a massive source of
information) Wireless mobile access Intelligent phones
Revolutionary applications are currently in use
Coversational agents guiding the user making travel reservations
Speech systems for cars
Cross-language information retrieval and tanslation (i.e. Google)
Automate systems to analyze students essays (i.e. Pearson)
Linguistic Engineering (LE)
![Page 9: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/9.jpg)
NLP Introduction 9
Components of the Technology
TEXT SPEECH IMAGE
INPUT
OUTPUT
TEXT SPEECH IMAGE
LINGUISTIC RESOURCES
Recognize andValidate
Analyze and Understanding Apply Generate
![Page 10: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/10.jpg)
NLP Introduction 10
This course is focused on Language Understanding
• Different levels of understanding • Incremental analysis• shallow and partial analysis• Looking for the interest focus (spotting)• In depth analysis of interest focus
• Linguistic, statistical, ML, hybrid approaches
• Main problems: ambiguity, unseen words, ungrammatical text
![Page 11: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/11.jpg)
NLP Introduction 11
Language generation
• Content planning • Semantic representation of the text• What to say, how to say
• Form planning Presentation of content
Using rethorical elements
![Page 12: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/12.jpg)
NLP Introduction 12
Dialogue
• Need of a high level of understanding• Involve additional processes• Identification of the illocutionary
content of speaker utterances• Speech acts
• assertions, orders, askings, questions, etc.
• Direct and indirect speech acts
![Page 13: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/13.jpg)
NLP Introduction 13
NLP Basics 1
• Why NLP is difficult?• Language is alive (changing)• Ambiguity• Complexity• Knowledge imprecise, probabilistic,
fuzzy• World (common sense) knowledge
is needed• Language is embedded into a
system of social interaction
![Page 14: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/14.jpg)
NLP Introduction 14
NLP Basics 2
• Phonetical ambiguity• Lexical ambiguity• Syntactic ambiguity• Semantic ambiguity• Pragmatic ambiguity.
Reference
Ambiguity
![Page 15: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/15.jpg)
NLP Introduction 15
• Multiple alternative linguistic structures can be built – I made her duck
• I cooked waterfowl for her• I cooked waterfowl belonging to her• I created the (plaster?) duck she owns• I caused her to quickly lowed her head or body• I waved my magic wand and turned her into
undifferentiated waterfowl– Ambiguities in the sentence
• Duck can be noun(waterfowl) or a verb (go down) -> syntactic ambiguity
• Her can be a dative pronoun or a possessive pronoun -> syntactic ambiguity
• Make can be create or cook -> semantic ambiguity
Resolving ambiguous input
![Page 16: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/16.jpg)
NLP Introduction 16
NLP Basics 3
Lexical ambiguity
• Words are (sometimes) polysemous• Frequent words are more ambiguous
![Page 17: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/17.jpg)
NLP Introduction 17
NLP Basics 4
Syntactic ambiguity
• Grammars are usually ambiguous• In general more than one parse tree is
correct for a sentence given a grammar
• Some kind of ambiguity (as pp-attachment) are at some level predictable
![Page 18: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/18.jpg)
NLP Introduction 18
NLP Basics 5
Semantic ambiguity
• More than one semantic interpretationsis possible.
• Peter gave a cake to the children• One cake for all them?• One cake for each?
![Page 19: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/19.jpg)
NLP Introduction 19
NLP Basics 6
Pragmatic ambiguity. Reference
• Later he asked her to put it above• Later? When?• He?• Her?• It?• Above what?
![Page 20: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/20.jpg)
NLP Introduction 20
Pragmatic Ambiguity
![Page 21: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/21.jpg)
NLP Introduction 21
Pragmatic Ambiguity(II)
![Page 22: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/22.jpg)
NLP Introduction 22
Which kind of ambiguity?
![Page 23: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/23.jpg)
NLP Introduction 23
Resolving ambiguous input
– Using models and algorithms– Using data-driven methods – Semantic-guided processing
• Restricting the domain. Considering only the language needed for accessing several services
• Using context knowledge. ( Shallow or Partial analysis)
![Page 24: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/24.jpg)
NLP Introduction 24
NLP Basics 7
Two type of models
• Racionalist model. Noam Chomsky• Most of the knowledge needed for NLP can
be acquired previously, prescripted and used as initial knowledge for NLP.
• Empiricist model. Zellig Harris• Linguistic knowledge can be inferred from
the experience, through textual corpora by simple means as the association or the generalization.
• Firth “we can know a word by the company it owns"
![Page 25: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/25.jpg)
NLP Introduction 25
NLP Basics 8
• fonetics
• fonology
• lexic
• morfology
• sintaxis
• logics
• semantics
• pragmatics
• illocution
• discourse
Levels of linguistic description
![Page 26: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/26.jpg)
NLP Introduction 26
Small number of formal models and theories:
• State machine• Rule systems• Logic• Probabilistic models• Vector space models
NLP Basics 9
![Page 27: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/27.jpg)
NLP Introduction 27
State machines
• Formal models that consits of state, transitions and input representations
• Variations• Deterministic/non deterministic• Finite-state automata• Finite-state transducers
![Page 28: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/28.jpg)
NLP Introduction 28
Rule systems
• Grammar formalisms• Regular grammars• Context free grammars• Feature grammars
• Probabilistics variants of them• Used for phonology, morphology and
sintax
![Page 29: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/29.jpg)
NLP Introduction 29
Logic
• First order logic (Predicate calculus)• Related formalism
• Lambda calculus• Features structures• Semantic primitives
• Used for modelling semantics and pragmatics but also for lexical semantics
![Page 30: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/30.jpg)
NLP Introduction 30
Probabilistic models
• State machine, rule systems and logic systems can be augmented with probabilistic.
• State machine aumented with probabilistics become markov model and hidden Markov model. • Used in different processes: part-of-speech
tagging, speech recognition,dialogue understanding, text-to-speech and macine translation.
• Ability to solve ambiguity problems
![Page 31: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/31.jpg)
NLP Introduction 31
Vector-space models
- Based on linear algebra- Underlie information retrieval and many
treatment of word meaning
![Page 32: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/32.jpg)
NLP Introduction 32
• Architecture based on layers• Each layer owns specific classes in
charge of solving some problems.• The objects of a layer request services
to other objects of the same layer or of the layer of the immediate inferior level.
• The objects of a layer provide services to other objects of the same layer or of the layer of the immediate superior level.
• Architecture based on pipes & filters• Each filter enriches in some way the
input stream and send it to the output stream
Architecture of NLP systems
![Page 33: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/33.jpg)
NLP Introduction 33
• Three levels of increasing complexity• Basic level
• Tasks needed by most NLP systems:• (paragraph, sentence) Segmenters, language
identificators, NER, NEC, NERC, tokenizers, morphological analizers, POS taggers, WSD, parsers, chunkers, semantic analyzers...
• Intermediate level• Tasks implying the performance of basic
components.• Document classification, Automatic summarization,
IE, IR, ...
• Application level• NLP appplications, such as conversational systems.
NLP taks
![Page 34: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/34.jpg)
NLP Introduction 34
Language processing and intelligence
The ability to process language is related to intelligence machines.
Turing test (1950) consists of convincing the interrogator the machine is a person ( The machine tries to answer questions as a human would). Q: Please write me a sonnet on the topic of
the Forth Bridge A: Count me on this one. I never could write
poetry. Q: Add 34957 to 70764 A: (Pause) 105621
![Page 35: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/35.jpg)
NLP Introduction 35
Language processing and intelligence2Eliza program (Weizenbaum, 1966)
Very simple program (based on pattern-maching).
I does not understand humans but it seems it does.User: You are like my father in some waysEliza: what resemblance you see User: you don’t argue with meEliza: Why do you think I don’t argue with youUser: You are afraid of meEliza: Does it please to belive I am afraid of
you?
![Page 36: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/36.jpg)
NLP Introduction 36
Language processing and intelligence3
Loebner Prize competition based on Turing test.Some programs fool judges some of the time (Shieber, 1994)
There are fun web robots based trying to look human (Alice)
There are dialogue systems (conversational systems) helping people use many different type of applications
![Page 37: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/37.jpg)
NLP Introduction 37
Relevant resources
• Conference and journals focus on LE: ACL, EACL, COLING, AI conferences.
• Competitions: TREC, CLEF, MUC, ACE, TAC
• Available resources:– Corpora, Ontologies– WordNet, EuroWordNet, Balkanet,– FrameNet, VerbNet, Propbank,
OntoNotes
![Page 38: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/38.jpg)
NLP Introduction 38
Resources for language understanding
• General Lexicons• Dictionaries• Specialized Lexicons• Ontologies• Grammars• Textual Corpora• Internet as an information source
![Page 39: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/39.jpg)
NLP Introduction 39
General Lexicons
• Word repositories• lemmaries, formaries, lists of words,
phrasal lexicons...
• Knowledge on words• Phonology• Morphology: POS, agreement..• Sintax: category, sub-categorization,
subcategorization, argument structure, valency, co-occurrence patterns...
• Semantics: semantic class, selectional restrictions...
– Pragmatics: use, register, domain, ...
![Page 40: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/40.jpg)
NLP Introduction 40
Dictionaries
• MRDs• types: general, normative, learner,
mono/bilingual...• size, content, organization
• entry, sense, ralations, ...
• Lexical databases• ej. Acquilex LDB
• Other sources: enciclopaedias, thesaurus,...
![Page 41: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/41.jpg)
NLP Introduction 41
Specialized Lexicons
• Onomasticae• terminoligical databases• Gazetteers• dictionaries of locutions, idioms,...• Wordnets• Acronyms, idioms, jaergon• Date, numbers, quantities+units,
currencies...
![Page 42: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/42.jpg)
NLP Introduction 42
morpholexical relations. U. Las Palmas (O. Santana)
![Page 43: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/43.jpg)
NLP Introduction 43
Ej: using Gazetteers in Q&A systems
• Multitext (U.Waterloo)• Clarke et al, 2001, 2002
• Structured data• biographies (25,000), Trivial Q&A (330,000),
Country locations (800), acronyms (112,000), cities (21,000), animals (500), previous TREC Q&A (1393), ...
• 1 Tb of Web data• Altavista
• AskMSR (Microsoft)• Brill, 2002
![Page 44: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/44.jpg)
NLP Introduction 44
Grammars
• morphological Grammars • syntactic Grammars
• constituent • dependency• case• transformational• systemic
• Phrase-strucure vs de Unification Grammars • Probabilistic Grammars • Coverage, language, tagsets
![Page 45: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/45.jpg)
NLP Introduction 45
Ontologies
• Lexical vs Conceptual Ontologies• General vs domain restricted Ontologies• Task Ontologies, metaontologies• Content, granularity, relations• Interlinguas: KIF, PIF• CYC, Frame-Ontology, WordNet,
EuroWordNet, GUM, MikroKosmos• Protegé
![Page 46: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/46.jpg)
NLP Introduction 46
Raw Corpora
• Textual vs Speech• Size (1Mw - 1Gw - 1TW)• Few estructure (if any) • Provide information not available in a
more treatable way:• colocations, argumental structure ,
context of occurrence, grammatical induction, lexical relations, selectional restrictions, idioms, examples of use,...
![Page 47: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/47.jpg)
NLP Introduction 47
• pos tagged (all tags or disambiguated)• lemma• sense (granularity of tagset, WN)• parenthised
• parsed
• Paralel corpora• Balanced, pyramidal, oportunistic
corpora
Tagged Corpora
![Page 48: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/48.jpg)
NLP Introduction 48
• Brown Corpus• ACL/DCI (Wall Street Journal, Hansard, ...)• ACL/ECI (European Corpus Initiative)• USA-LDC (Linguistic Data Consortium)• LOB (ICAME, International Computer Archive of
Modern English)• BNC (British National Corpus)• SEC (Lancaster Spoken English Corpus)• Penn Treebank• Susanne• SemCor• Trésor de la Langue Française (TLF)
Some examples of Corpora 1
![Page 49: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/49.jpg)
NLP Introduction 49
• Oficina del Español en la Sociedad de la Información OESI• http://www.cervantes.es/default.htm
• CREA, RAE. 200 Mw.• CRATER, (sp, en, fr), U.A.Madrid. 5.5Mw. aligned,
POS tagged• ALBAYZIN. Speech, isolated sentences, queries to a
geographic database • LEXESP, 5Mw, Pos taged, lemmatized• Ancora, Spanish & Catalan, Extremelly rich
annotation, 500Kw• IEC in the framework of DCC (catalan)
Some examples of Spanish Corpora 2
![Page 50: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/50.jpg)
NLP Introduction 50
example of Ancora treebank
![Page 51: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/51.jpg)
NLP Introduction 51
Internet as an information source 1
• Huge volume• > 2,000 Million pages, tenths of Tb, • expansion (doubles size each two years)
• Heterogeneity• content, language (70% Englsih), formats• redundancy• Hidden Web
• General Information servers• (Medialinks)• 14,000 servers (5,000 newspapers, 70 in Spain)
![Page 52: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/52.jpg)
NLP Introduction 52
• Internet today• documents HTML• built for human use (visualization) • Many pages automatically generated by
applications• Access through
• known URLs • searchers (or meta-searchers) of general
purpose• specific searchers for a site
• Limitations• access (by applications) to HTML codified text
(often bad) • building (and maintaining!) wrappers
Internet as an information source 2
![Page 53: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/53.jpg)
NLP Introduction 53
• Web2.0• Software agents
• crawlers, spiders, softbots, infobots ...
• Wacki• Baroni, 2008
• Wikipedia
Internet as an information source 3
![Page 54: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/54.jpg)
NLP Introduction 54
Applications
• Two main areas• Massive management of textual information
sources• for human use• for automatic collection of linguistic resources
• Person/Machine interaction
![Page 55: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/55.jpg)
NLP Introduction 55
Massive management of textual information sources
• Machine Translation• Information Management
• Automatic Summarization• Information {Retrieval, Extraction, Filtering
Routing, Harvesting, Mining}• Document Classification• Question Answering• Conceptual searchers
![Page 56: Welcome to the course! Introduction to Natural Language Processing](https://reader035.vdocument.in/reader035/viewer/2022062304/5681458d550346895db279bb/html5/thumbnails/56.jpg)
NLP Introduction 56
• Aligned corpora (various levels)• grammars• gazetteers• morphology• selectional restrictions• Subcategorization patterns• Topic Signatures
automatic collection of linguistic resources