Natural Language Processing Lecture 6 : Revision

Upload: emery-harrington

Post on 31-Dec-2015

Natural Language Processing

Lecture 6 : Revision

04/19/23 NLP 2

Revision

1. What is a part of speech? Give 4 examples.
• A part of speech describes what kind of word a word is and how it is used. The same word can be a noun in one sentence and a verb or adjective in the next.
• Examples: noun, verb, preposition, pronoun

2. Complete the following sentences.
• A closed class is a class that contains a relatively fixed set of words.
• An open class is a class that contains a constantly changing set of words.
• Transitive verb: a verb that takes a direct object complement.
• Intransitive verb: a verb that does not take a direct object.

3. Give 3 examples of closed classes.
• Articles: a, an, the
• Conjunctions: and, but, or, ...
• Demonstratives: this, that, these, ...

4. Give the definition of tagging.
• Tagging: the process of assigning a part-of-speech tag to each word in a corpus.

5. Describe the three tagging methodologies.
• Rule-based tagging: uses linguistic rules to assign tags.
• Stochastic tagging: based on the probability of a certain tag occurring given the various possibilities. Requires a training corpus.
• Transformation-based tagging: a combination of the rule-based and stochastic tagging methodologies.
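The stochastic idea can be sketched in Python as a most-frequent-tag baseline. This is only a minimal illustration: the toy corpus, the tag names, and the functions `train` and `tag` are assumptions made for the example, not material from the lecture.

```python
from collections import Counter, defaultdict

# Hypothetical toy training corpus: sentences as lists of (word, tag) pairs.
TRAIN = [
    [("the", "Det"), ("cat", "N"), ("sits", "V")],
    [("the", "Det"), ("boy", "N"), ("saw", "V"), ("the", "Det"), ("saw", "N")],
    [("a", "Det"), ("boy", "N"), ("saw", "V"), ("a", "Det"), ("cat", "N")],
]

def train(corpus):
    """Count how often each word appears with each tag."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for word, word_tag in sentence:
            counts[word.lower()][word_tag] += 1
    return counts

def tag(words, counts, default="N"):
    """Give each word its most frequent training tag (default for unseen words)."""
    result = []
    for word in words:
        seen = counts.get(word.lower())
        best = seen.most_common(1)[0][0] if seen else default
        result.append((word, best))
    return result

COUNTS = train(TRAIN)
TAGGED = tag(["The", "boy", "saw", "a", "dog"], COUNTS)
```

Here "saw" is tagged V because it was seen twice as a verb and once as a noun in training, which shows why this approach requires a training corpus.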

6. What is the role of regular expressions?
• A regular expression is a formula in a special language that is used for specifying regular languages.

7. What is the definition of:
• String: a sequence of symbols (letters) defined over an alphabet
• Language: a set of strings

8. What do the following regular expressions specify?

• /[3-9]*[A-Z]/ : zero or more digits from 3 to 9 followed by a capital letter

• \D : /[^0-9]/ : any non-digit

• \w : /[A-Za-z0-9_]/ : any alphanumeric character or underscore
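These patterns can be tried out with Python's `re` module. The sketch below is illustrative only: the test strings and the names `matches`, `CHECKS`, `NON_DIGITS` and `WORD_CHARS` are assumptions, and `re.ASCII` is used so that `\w` is restricted to ASCII letters, digits and the underscore.

```python
import re

# [3-9]*[A-Z]: zero or more digits between 3 and 9, then one capital letter.
PATTERN = re.compile(r"[3-9]*[A-Z]")

def matches(text):
    """True if the whole string matches PATTERN."""
    return PATTERN.fullmatch(text) is not None

CHECKS = {
    "B": matches("B"),          # zero digits, then a capital letter
    "3459Z": matches("3459Z"),  # digits in 3-9, then a capital letter
    "12A": matches("12A"),      # 1 and 2 are outside the range 3-9
    "34b": matches("34b"),      # lowercase letter at the end
}

# \D keeps every non-digit; \w keeps alphanumerics and the underscore.
NON_DIGITS = re.findall(r"\D", "a1_b2 ")
WORD_CHARS = re.findall(r"\w", "a1_b2 ", flags=re.ASCII)
```

With these inputs the space survives in `NON_DIGITS` but not in `WORD_CHARS`, while the underscore survives in both.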

9. Give the role of Finite State Automata.
• FSAs recognize the regular languages represented by regular expressions.

10. An FSA is a 5-tuple consisting of:
• Q: a set of states
• Σ: an alphabet of symbols
• q0: a start state
• F: a set of final states in Q
• δ(q,i): a transition function mapping Q x Σ to Q
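The five components map directly onto a small data structure. The following Python sketch uses a hypothetical automaton (words over {a, b} that end in b) chosen only for illustration; the names `Q`, `SIGMA`, `START`, `F`, `DELTA` and `accepts` are assumptions, not part of the lecture.

```python
# Hypothetical example FSA: accepts words over {a, b} that end in "b".
Q = {"q0", "q1"}                 # Q: set of states
SIGMA = {"a", "b"}               # alphabet of symbols
START = "q0"                     # q0: start state
F = {"q1"}                       # F: set of final states, a subset of Q
DELTA = {                        # transition function: (state, symbol) -> state
    ("q0", "a"): "q0",
    ("q0", "b"): "q1",
    ("q1", "a"): "q0",
    ("q1", "b"): "q1",
}

def accepts(word):
    """Run the automaton on word and accept if it stops in a final state."""
    state = START
    for symbol in word:
        if symbol not in SIGMA:
            return False         # symbol outside the alphabet
        state = DELTA[(state, symbol)]
    return state in F
```

For example, `accepts("ab")` holds while `accepts("aba")` does not, since the second word leaves the automaton in the non-final state q0.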

11. Give 3 words accepted by the following FSA.

• abba, baba, aaaaba, baaaabbaba

12. What is syntax?
• Syntax is the study of the rules governing the way words are combined to form sentences in a language.

13. What is the role of:
• Syntactic analysis: is concerned with the construction of sentences.
• Syntactic structure: indicates how the words are related to each other.
• Lexicon: indicates the syntactic category of words.
• Grammar (typically a Context Free Grammar): specifies the legitimate concatenations of constituents (a set of rules).

15. Describe the 4 types of grammar.

16. Complete the sentences:
• Groups of words that belong together are called constituents.
• The component that determines the properties of the constituent is the head, and the constituent can be referred to as a phrase.

17. Draw a labeled tree diagram for the following English phrases.
a. The ancient pyramids
b. in the early evening
c. Drove a car

a. The ancient pyramids

[NP [Det The] [Adj ancient] [N pyramids]]

b. in the early evening

[PP [P in] [NP [Det the] [Adj early] [N evening]]]

c. Drove a car

[VP [V Drove] [NP [Det a] [N car]]]

18. Rewrite the following sentences with Phrase Structure Rules.
• The cat sits on a mat.
• Peter told the truth.

The cat sits on a mat

S → NP VP
NP → Det N
VP → V PP
PP → P NP

[S [NP [Det The] [N cat]] [VP [V sits] [PP [P on] [NP [Det a] [N mat]]]]]

Peter told the truth

S → N VP
VP → V NP
NP → Det N

[S [N Peter] [VP [V told] [NP [Det the] [N truth]]]]
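Phrase structure rules of this kind can be applied mechanically. The sketch below is a minimal top-down parser in Python, assuming the rules S → NP VP, S → N VP, NP → Det N, VP → V PP, VP → V NP and PP → P NP together with a small lexicon; the names `RULES`, `LEXICON`, `parse` and `parse_sentence` are illustrative assumptions. It only works for grammars without left recursion.

```python
# Phrase structure rules: each category maps to its possible right-hand sides.
RULES = {
    "S": [["NP", "VP"], ["N", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
}

# Lexicon: syntactic category of each word (lowercased).
LEXICON = {
    "the": "Det", "a": "Det",
    "cat": "N", "mat": "N", "peter": "N", "truth": "N",
    "sits": "V", "told": "V",
    "on": "P",
}

def parse(symbol, words, i):
    """Yield (bracketed_tree, next_position) for each way symbol can start at i."""
    if symbol not in RULES:  # lexical category: consume exactly one word
        if i < len(words) and LEXICON.get(words[i].lower()) == symbol:
            yield f"[{symbol} {words[i]}]", i + 1
        return
    for rhs in RULES[symbol]:
        # Expand the right-hand side left to right, keeping all partial results.
        partial = [([], i)]
        for child in rhs:
            partial = [
                (trees + [tree], k)
                for trees, j in partial
                for tree, k in parse(child, words, j)
            ]
        for trees, j in partial:
            yield f"[{symbol} {' '.join(trees)}]", j

def parse_sentence(sentence):
    """Return every bracketed tree for S that covers the whole sentence."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]

CAT_TREES = parse_sentence("The cat sits on a mat")
PETER_TREES = parse_sentence("Peter told the truth")
```

Each sentence receives exactly one tree under these rules, printed in the labeled bracket notation `[Category children...]`.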

19. Draw the trees for the following sentences:
• The boy saw the man with the telescope
• The children put the toy in the box

The boy saw the man with the telescope

[S [NP [Det The] [N boy]] [VP [V saw] [NP [Det the] [N man]] [PP [P with] [NP [Det the] [N telescope]]]]]

The children put the toy in the box

[S [NP [Det The] [N children]] [VP [V put] [NP [Det the] [N toy]] [PP [P in] [NP [Det the] [N box]]]]]

Exercises

1. Build a non-deterministic finite automaton that recognizes the words over the alphabet {a, b} that end with bab.
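One possible solution, sketched in Python, uses four states: state 0 loops on every symbol and may also guess, on reading b, that the final "bab" has started. The state numbering and the names `NFA_DELTA`, `NFA_START`, `NFA_FINAL` and `nfa_accepts` are assumptions made for this illustration, and other automata solve the exercise equally well.

```python
# Transition relation of the NFA: (state, symbol) -> set of possible next states.
NFA_DELTA = {
    (0, "a"): {0},
    (0, "b"): {0, 1},   # non-deterministic choice: stay, or guess "bab" starts here
    (1, "a"): {2},
    (2, "b"): {3},
}
NFA_START = 0
NFA_FINAL = {3}

def nfa_accepts(word):
    """Simulate the NFA by tracking the set of states it could be in."""
    current = {NFA_START}
    for symbol in word:
        current = {
            nxt
            for state in current
            for nxt in NFA_DELTA.get((state, symbol), set())
        }
    return bool(current & NFA_FINAL)
```

Tracking the set of reachable states avoids backtracking: the word is accepted if at least one of the possible runs ends in the final state 3.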

2. Give the transition table.

[Figure: state diagram of an automaton with states 1 to 5 and transitions labeled a and b]

      a            b
1     1-2-3-4-5    4-5
2     3-5
3     2
4     5            4
5

3. Write the transition table of the following automaton. What is the recognized language?

      0    1
A     B    A
B     B    C
C     C    C

The recognized language is: 1*0+1(0|1)*
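The table and the stated language can be cross-checked by brute force. The Python sketch below assumes that A is the start state and C the only final state; the diagram is not part of this transcript, so both are inferred from the stated language rather than confirmed. The names `TABLE`, `dfa_accepts` and `agree_up_to` are illustrative.

```python
import re
from itertools import product

# Transition table from the exercise: TABLE[state][symbol] -> next state.
TABLE = {
    "A": {"0": "B", "1": "A"},
    "B": {"0": "B", "1": "C"},
    "C": {"0": "C", "1": "C"},
}
START_STATE = "A"       # assumption: start state inferred, not given in the text
FINAL_STATES = {"C"}    # assumption: final state inferred from the language

LANGUAGE = re.compile(r"1*0+1(0|1)*")

def dfa_accepts(word):
    """Follow the table symbol by symbol and test the last state."""
    state = START_STATE
    for symbol in word:
        state = TABLE[state][symbol]
    return state in FINAL_STATES

def agree_up_to(max_len):
    """Check that the automaton and the regex agree on all words up to max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            word = "".join(letters)
            if dfa_accepts(word) != (LANGUAGE.fullmatch(word) is not None):
                return False
    return True
```

Calling `agree_up_to(8)` compares the two descriptions on every binary word of length at most 8, which is a sanity check rather than a proof of equivalence.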

4. Determine the language for each regular expression:

L(001) = {001}: the single word 001

L(0|10*) = {0, 1, 10, 100, 1000, ...}: the word 0, or any word consisting of a 1 followed by zero or more 0s

L(0*10*) = {1, 01, 10, 100, 001, 010, 00001, ...}: all words that contain exactly one 1

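L((ΣΣ)*) = {w | w is a string of even length}: all words of even length over Σ

L((0(0|1))*) = {ε, 00, 01, 0000, 0001, 0100, 0101, ...}: all words of even length in which every odd-numbered symbol is 0. The language "ε or all words that begin with 0" is instead L((0(0|1)*)*).

L((0|ε)(1|ε)) = {01, 0, 1, ε}

L((0|ε|b)(1|a)) = {01, 0a, 1, a, b1, ba}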
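Answers to exercises of this kind can be checked by enumerating short words and testing them against the expression. The Python sketch below is one way to do that; the helper `language` and the variable names are assumptions, and an empty alternative such as `(0|)` stands in for ε.

```python
import re
from itertools import product

def language(pattern, alphabet, max_len):
    """All words over alphabet of length <= max_len that fully match pattern."""
    regex = re.compile(pattern)
    words = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if regex.fullmatch(word):
                words.append(word)
    return words

# The empty string "" in the results represents ε.
L_OPT = language(r"(0|)(1|)", "01", 3)          # (0|ε)(1|ε)
L_MIX = language(r"(0|b|)(1|a)", "01ab", 3)     # (0|ε|b)(1|a)
L_ONE = language(r"0*10*", "01", 3)             # 0*10*
L_PAIRS = language(r"(0(0|1))*", "01", 4)       # (0(0|1))*
```

Because only words up to `max_len` are listed, the result is a finite sample of each language; the first two languages are finite, so their samples are complete.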