06-25433 – Logic Programming

Definite Clause Grammar

Context-Free Grammar (CFG) is introduced as a way of writing rules about structured knowledge. Definite Clause Grammar (DCG) is Prolog’s in-built notation for writing CFGs.

12 - Definite Clause Grammar 1

This lecture …

Writing DCGs is introduced showing:

– the basic framework;

– embedding calls to “ordinary” Prolog;

– building structures.

DCGs suffer from problems with left-recursive rules.

DCGs are a general programming tool with applications beyond language parsing.

Type testing by scanning

Imagine we have a Prolog term and want to decide whether it is:

• an integer (e.g. 1, 123, …)

• an atom (e.g. a, abc, aBC12)

• a variable (e.g. Butter, _123)

Also, assume the term is given as a list of single-character atoms, e.g. ['1', '2', '3'] or [a, 'B', 'C', '1', '2'].

What do we know about atoms, integers and variables?

(For the time being) an atom begins with a lowercase letter, which can be followed by any number of upper- or lowercase letters, digits or ‘_’:

atom ::= lowercase other_symb

other_symb ::= lowercase other_symb |
               uppercase other_symb |
               digit other_symb |
               underscore other_symb |
               ""                       % i.e. nothing

Writing this in Prolog

term(atom) -->
    lower_case, remaining_terms.

lower_case -->
    [Letter],
    { Letter @>= 'a',
      Letter @=< 'z' }.

… continued

Code within { … } is treated as “normal” Prolog code.

@>=, @>, @=< and @< compare two terms in the standard order of terms (atoms are ordered alphabetically).

Writing this in Prolog

remaining_terms -->
    ( lower_case ;
      upper_case ;
      under_score ;
      digit ),
    remaining_terms.

remaining_terms -->
    [].

“;” is another way of expressing OR-choice. It is best used only when the options are deterministic.
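The rules so far leave upper_case, under_score and digit undefined. A minimal sketch of the complete scanner follows. It assumes (this is not taken from the lecture) that the missing rules are written by analogy with lower_case and that the input is a list of single-character atoms.

```prolog
% Sketch of the complete atom scanner.
% upper_case//0, under_score//0 and digit//0 are assumed definitions,
% written by analogy with lower_case//0.
term(atom) -->
    lower_case, remaining_terms.

lower_case  --> [Letter], { Letter @>= a,   Letter @=< z }.
upper_case  --> [Letter], { Letter @>= 'A', Letter @=< 'Z' }.
digit       --> [Letter], { Letter @>= '0', Letter @=< '9' }.
under_score --> ['_'].

remaining_terms -->
    ( lower_case ; upper_case ; under_score ; digit ),
    remaining_terms.
remaining_terms -->
    [].

% ?- phrase(term(Type), [a, 'B', 'C', '1', '2']).
% Type = atom
%
% ?- phrase(term(Type), ['B', a]).
% fails: the first symbol is not lowercase
```

phrase/2 succeeds only if the whole list is consumed, which is why no explicit end-of-input test is needed.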

What does this mean?

lower_case -->
    [Letter],
    { Letter @>= 'a',
      Letter @=< 'z' }.

is Prolog short-hand for writing:

lower_case([Letter|S], S) :-
    Letter @>= 'a',
    Letter @=< 'z'.
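Because the translated rule is an ordinary predicate with two extra list arguments, it can be called directly. The sketch below shows the same query made both ways; the example input lists are illustrative only.

```prolog
lower_case -->
    [Letter],
    { Letter @>= a,
      Letter @=< z }.

% Calling the translated predicate directly: the first extra argument
% is the input list, the second is whatever is left after the rule
% has consumed its part.
% ?- lower_case([a, b, c], Rest).
% Rest = [b, c]

% phrase/3 makes the same call without relying on the argument order
% chosen by the translation.
% ?- phrase(lower_case, [a, b, c], Rest).
% Rest = [b, c]
```

Using phrase/2 or phrase/3 is generally the safer habit, since it does not depend on how a particular Prolog system lays out the hidden arguments.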

What does this mean?

term(atom) -->
    lower_case, remaining_terms.

is Prolog short-hand for writing:

term(atom, S0, S) :-
    lower_case(S0, S1),
    remaining_terms(S1, S).

What does this mean?

The two remaining_terms rules are short-hand for:

remaining_terms(S0, S) :-
    (   lower_case(S0, S1)
    ;   upper_case(S0, S1)
    ;   under_score(S0, S1)
    ;   digit(S0, S1)
    ),
    remaining_terms(S1, S).

remaining_terms(S, S).

What is Prolog doing?

Meta-interpreting

This means writing code in one form and having it compiled (automatically) into another, runnable, form.

Meta-interpreting is usually used to allow domain experts to write knowledge in a user-friendly way but compile it into machine-friendly code.

This is similar to how we transformed formulas in logic.
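Many Prolog systems let you look at the result of this translation. The sketch below uses expand_term/2 and listing/1, which are available in SWI-Prolog and several other systems. The grammar itself (greeting and who) is a hypothetical example, and the exact shape of the generated clause differs between systems, so no output is shown.

```prolog
% Hypothetical two-rule grammar, used only to show the translation.
greeting --> [hello], who.
who --> [world].

% Translate a rule without loading it; Clause is bound to the
% ordinary Prolog clause that the system would store:
% ?- expand_term((greeting --> [hello], who), Clause).

% Or inspect what was stored when the grammar above was loaded
% (greeting//0 becomes greeting/2):
% ?- listing(greeting/2).
```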

Prolog’s in-built grammar rule notation

Definite Clause Grammar (DCG) is an in-built notation that looks like a CFG. DCGs can be executed as Prolog programs. This means that DCGs run exactly like Prolog: top-down and depth-first. (DCGs can also be used as a rule base to be used by another Prolog program – e.g. a chart parser.)

A first anatomy of DCGs - 1

A rule is written:

left_hand --> right_side1, right_side2, dict_entry.

We can write words directly into rules as follows:

left_hand -->
    right_side1,
    [noddy],
    right_side2.

A first anatomy of DCGs - 2

Dictionary entries are written as:

dict_entry --> [the].

dict_entry --> [river,avon].
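To see these pieces working together, here is a small runnable sketch. The definitions of right_side1 and right_side2 are hypothetical (the lecture leaves them open); here each is assumed to be a single dictionary entry.

```prolog
% right_side1//0 and right_side2//0 are hypothetical placeholders.
left_hand -->
    right_side1,
    [noddy],          % a word written directly into the rule
    right_side2.

right_side1 --> dict_entry.
right_side2 --> dict_entry.

dict_entry --> [the].
dict_entry --> [river, avon].   % one entry spanning two tokens

% ?- phrase(left_hand, [the, noddy, river, avon]).
% succeeds
%
% ?- phrase(left_hand, [the, river, avon]).
% fails: the literal word noddy is missing
```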

What a DCG is compiled into - 1

Our rules become:

left_hand(S0, S) :-
    right_side1(S0, S1),
    right_side2(S1, S2),
    dict_entry(S2, S).

left_hand(S0, S) :-
    right_side1(S0, S1),
    'C'(S1, noddy, S2),
    right_side2(S2, S).

What a DCG is compiled into - 2

Dictionary entries become:

dict_entry(S0, S) :-
    'C'(S0, the, S).

and there is an in-built fact:

'C'([Token|S], Token, S).
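The effect of this fact can be tried out by hand. In the sketch below, connects/3 is a hypothetical stand-in with the same definition as 'C'/3, used so that the example does not depend on whether a given Prolog system still provides 'C'/3 (some newer systems translate terminals into plain list unification instead). The two-token entry is translated by analogy with the one shown above.

```prolog
% connects/3 is a hypothetical stand-in for the in-built 'C'/3 fact:
% it takes one token off the front of the input list.
connects([Token|S], Token, S).

% dict_entry --> [the].
dict_entry(S0, S) :-
    connects(S0, the, S).

% dict_entry --> [river, avon].   (assumed translation: one call per token)
dict_entry(S0, S) :-
    connects(S0, river, S1),
    connects(S1, avon, S).

% ?- dict_entry([the, cat], Rest).
% Rest = [cat]
%
% ?- dict_entry([river, avon, flows], Rest).
% Rest = [flows]
```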

Using DCG in a program checker

One of the strengths of declarative languages such as Prolog and Haskell is the ease with which programs can be written to manipulate other programs – or themselves.

This program checks that there are clauses for every subgoal in a program.

The general idea

Given a clause such as:

read_text(Current_Word) :-
    look_up(Current_Word),
    read(Next_Word),
    read_text(Next_Word).

check that there is a rule or fact for each subgoal (unless the subgoal is built-in, like read/1).

Design

At the highest level:

1. Open a file, read in a program to a list and close the file;

2. Parse each clause, listing goals (heads) and subgoals (from the bodies of rules);

3. Check that each subgoal has a definition and report to the user.

Open a file, read in a program to a list and close the file

The code for this is on the WWW and described in the notes.

It is fairly straightforward for anyone who knows how to open, read and close files in another language.

The important point is that the output is a list of clauses:

[skills(fred, jones, 'C++'),
 (happy_student(_6016) :-
     module_reg(_6016, prolog))]
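The lecture's own file-reading code is not reproduced here. As a rough illustration only, the sketch below shows one way this step could be written with the standard open/3, read/2 and close/1 predicates. The names read_program/2 and read_clauses/2 are hypothetical, not the ones used in the course notes.

```prolog
% read_program(+File, -Clauses): hypothetical helper that reads every
% clause in File into a list, in file order.
read_program(File, Clauses) :-
    open(File, read, Stream),
    read_clauses(Stream, Clauses),
    close(Stream).

% read/2 returns the atom end_of_file when the input is exhausted.
read_clauses(Stream, Clauses) :-
    read(Stream, Term),
    (   Term == end_of_file
    ->  Clauses = []
    ;   Clauses = [Term|Rest],
        read_clauses(Stream, Rest)
    ).

% ?- read_program('my_program.pl', Clauses).   % file name is illustrative
```

This sketch does not close the stream if read/2 raises a syntax error; production code would need to handle that case.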

Parse each clause, listing goals (heads) and subgoals (from the bodies of rules)

This is easy using DCG. A fact contributes a goal and leaves the list of subgoals unchanged:

clause(Goals0, Goals,
       Sub_Goals, Sub_Goals) -->
    [Fact],
    {
      % check this isn’t a rule
      Fact \= (_ :- _),
      % extract the fact as a goal
      add_goal(Fact, Goals0, Goals)
    }.

A rule contributes its head as a goal and the terms in its body as subgoals:

clause(Goals0, Goals,
       Sub_Goals0, Sub_Goals) -->
    [(Head :- Body)],
    {
      % extract the head as a goal
      add_goal(Head, Goals0, Goals),
      % extract the subgoals
      body(Sub_Goals0, Sub_Goals, Body)
    }.
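add_goal/3 is used above but its definition is not shown. One plausible sketch follows. It assumes that goals are recorded as Name/Arity pairs (which matches the Sub_Goal/Arity form used when checking) and that duplicates are not stored twice. Both are assumptions rather than the lecture's actual code.

```prolog
% add_goal(+Term, +Goals0, -Goals): assumed definition.
% Records the predicate indicator of Term unless it is already listed.
add_goal(Term, Goals0, Goals) :-
    functor(Term, Name, Arity),
    (   memberchk(Name/Arity, Goals0)
    ->  Goals = Goals0
    ;   Goals = [Name/Arity|Goals0]
    ).

% ?- add_goal(skills(fred, jones, 'C++'), [], Goals).
% Goals = [skills/3]
%
% ?- add_goal(skills(a, b, c), [skills/3], Goals).
% Goals = [skills/3]
```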

Processing bodies

The body of a Prolog rule is a conjunction of terms:

body(Sub_Goals0, Sub_Goals,
     (Body, Bodies)) :-
    add_goal(Body, Sub_Goals0,
             Sub_Goals1),
    body(Sub_Goals1, Sub_Goals, Bodies).

body(Sub_Goals0, Sub_Goals, Body) :-
    Body \= (_,_),
    add_goal(Body, Sub_Goals0,
             Sub_Goals).

Parsing clauses

This follows a very common pattern in parsing with DCGs:

clauses(Goals0, Goals,
        Sub_Goals0, Sub_Goals) -->
    clause(Goals0, Goals1,
           Sub_Goals0, Sub_Goals1),
    clauses(Goals1, Goals,
            Sub_Goals1, Sub_Goals).

clauses(Goals, Goals,
        Sub_Goals, Sub_Goals) --> [].
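Putting the parsing rules together gives a program that can be run on the example clause list. In the sketch below, clause//4, clauses//4 and body/3 follow the lecture, with shortened variable names. add_goal/3 is an assumed definition that stores Name/Arity pairs without duplicates.

```prolog
% Assumed helper: record a term's predicate indicator once.
add_goal(Term, Goals0, Goals) :-
    functor(Term, Name, Arity),
    (   memberchk(Name/Arity, Goals0)
    ->  Goals = Goals0
    ;   Goals = [Name/Arity|Goals0]
    ).

% Walk a conjunction, collecting each conjunct as a subgoal.
body(SGs0, SGs, (Body, Bodies)) :-
    add_goal(Body, SGs0, SGs1),
    body(SGs1, SGs, Bodies).
body(SGs0, SGs, Body) :-
    Body \= (_, _),
    add_goal(Body, SGs0, SGs).

% A fact adds a goal; a rule adds a goal and its subgoals.
clause(Gs0, Gs, SGs, SGs) -->
    [Fact],
    { Fact \= (_ :- _),
      add_goal(Fact, Gs0, Gs) }.
clause(Gs0, Gs, SGs0, SGs) -->
    [(Head :- Body)],
    { add_goal(Head, Gs0, Gs),
      body(SGs0, SGs, Body) }.

% Thread both accumulators through the whole list of clauses.
clauses(Gs0, Gs, SGs0, SGs) -->
    clause(Gs0, Gs1, SGs0, SGs1),
    clauses(Gs1, Gs, SGs1, SGs).
clauses(Gs, Gs, SGs, SGs) --> [].

% ?- phrase(clauses([], Goals, [], Sub_Goals),
%           [skills(fred, jones, 'C++'),
%            (happy_student(X) :- module_reg(X, prolog))]).
% Goals = [happy_student/1, skills/3],
% Sub_Goals = [module_reg/2]
```

Here the "tokens" being parsed are whole Prolog clauses rather than words, which illustrates the point that DCGs are not restricted to natural language.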

Check that each subgoal has a definition and report to the user

For each subgoal, check that there is a corresponding goal in the goal list.

This is very similar to checking history lists. The main checking code is:

% subgoal is not in the goal list
not_member(Goals, Sub_Goal/Arity),
% check subgoal is not a built-in
functor(Predicate, Sub_Goal, Arity),
\+ predicate_property(Predicate, built_in)
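The fragment above can be wrapped in a predicate that enumerates the undefined subgoals. The sketch below is an assumption about how that might look: undefined_subgoal/3 is a hypothetical name, and not_member/2 is given a simple assumed definition. predicate_property/2 is not part of the ISO standard, so its behaviour may vary; the results shown are what SWI-Prolog gives.

```prolog
:- use_module(library(lists)).

% Assumed definition: succeeds if Item does not occur in List.
not_member(List, Item) :-
    \+ memberchk(Item, List).

% undefined_subgoal(+Goals, +Sub_Goals, -Indicator): hypothetical wrapper.
% Enumerates subgoals that have no clause and are not built in.
undefined_subgoal(Goals, Sub_Goals, Sub_Goal/Arity) :-
    member(Sub_Goal/Arity, Sub_Goals),
    % subgoal is not in the goal list
    not_member(Goals, Sub_Goal/Arity),
    % check subgoal is not a built-in
    functor(Predicate, Sub_Goal, Arity),
    \+ predicate_property(Predicate, built_in).

% For the read_text/1 example: read_text/1 is defined, read/1 is
% built in, so only look_up/1 is reported.
% ?- findall(U, undefined_subgoal([read_text/1],
%                                 [read_text/1, read/1, look_up/1], U), Us).
% Us = [look_up/1]
```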

The basic idea of Context-Free Grammar

A CFG has several parts:

[Figure: an annotated grammar with these parts labelled: the grammar, a grammar rule, the left-hand and right-hand sides of a rule, a left-hand symbol, a right-hand symbol, the dictionary, and a dictionary entry.]

Context-Free Grammar (CFG) - 1

Context-free grammar is a formalism for writing rules that describe things that are structured.

This is a grammar for a sentence:

S → NP VP
NP → determiner noun
VP → verb PP
PP → preposition NP

We will use abbreviations in our grammar: prep and det.

Context-Free Grammar (CFG) - 2

and this is the lexicon:

determiner → the
noun → cat
noun → mat
preposition → on
verb → sat

Context-Free Grammar (CFG) - 3

Applying these rules we get:

S
  NP
    det   the
    noun  cat
  VP
    verb  sat
    PP
      prep  on
      NP
        det   the
        noun  mat

A Definite Clause Grammar (DCG) - 1

DCG allows us to write CFGs in Prolog that look almost exactly like CFGs:

s --> np, vp.
np --> det, noun.
vp --> verb, pp.
pp --> prep, np.

det --> [the].
noun --> [cat].
noun --> [mat].
prep --> [on].
verb --> [sat].
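The grammar above can be loaded as it stands and queried with phrase/2. The sketch below repeats it with two example queries. Because this grammar is not recursive, it can also be run "backwards" to enumerate every sentence it accepts.

```prolog
s --> np, vp.
np --> det, noun.
vp --> verb, pp.
pp --> prep, np.

det --> [the].
noun --> [cat].
noun --> [mat].
prep --> [on].
verb --> [sat].

% Recognise a sentence:
% ?- phrase(s, [the, cat, sat, on, the, mat]).
% succeeds
%
% Generate sentences (four in total, since each of the two noun
% positions can be cat or mat):
% ?- phrase(s, Sentence).
% Sentence = [the, cat, sat, on, the, cat] ;
% ...
```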

Definite Clause Grammar (DCG) - 2

We can add extra arguments to DCGs, e.g. to build a phrase structure (syntax) tree:

s(s(NP, VP)) --> np(NP), vp(VP).
np(np(Det, Noun)) --> det(Det), noun(Noun).
etc.

det(det(the)) --> [the].
noun(noun(cat)) --> [cat].
etc.

Demo 2
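A complete version of the tree-building grammar might look like the sketch below. The rules for s, np, det and noun(cat) are the ones shown above. The rules for vp, pp, verb, prep and noun(mat) stand in for the "etc." and are assumptions written in the same pattern.

```prolog
s(s(NP, VP))        --> np(NP), vp(VP).
np(np(Det, Noun))   --> det(Det), noun(Noun).
vp(vp(Verb, PP))    --> verb(Verb), pp(PP).     % assumed, by analogy
pp(pp(Prep, NP))    --> prep(Prep), np(NP).     % assumed, by analogy

det(det(the))       --> [the].
noun(noun(cat))     --> [cat].
noun(noun(mat))     --> [mat].                  % assumed, by analogy
prep(prep(on))      --> [on].                   % assumed, by analogy
verb(verb(sat))     --> [sat].                  % assumed, by analogy

% ?- phrase(s(Tree), [the, cat, sat, on, the, mat]).
% Tree = s(np(det(the), noun(cat)),
%          vp(verb(sat),
%             pp(prep(on), np(det(the), noun(mat)))))
```

Each rule builds its own node from the nodes returned by its right-hand side, so the tree is assembled by unification as the parse proceeds.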

Problems with Prolog’s depth-first search

As with all Prolog programs, left-recursive rules cause problems: the first np rule below calls np again before consuming any input, so the parser can loop forever.

% left recursive
np(np(NP1, NP2)) -->
    np(NP1),
    noun(NP2).

np(np(Det)) -->
    det(Det).

det(det(the)) --> [the].
noun(noun(car)) --> [car].

Working around left-recursive rules - 1

Method 1 - remove the left-recursive rule by renaming:

np(np(NP1, NP2)) -->
    np1(NP1),
    noun(NP2).

np1(np(Det)) --> det(Det).

det(det(the)) --> [the].
noun(noun(car)) --> [car].
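The renamed grammar can be run as it stands. The sketch below repeats it with example queries. Note that renaming changes what is accepted: np now covers exactly one determiner followed by one noun.

```prolog
np(np(NP1, NP2)) -->
    np1(NP1),         % renamed, so np no longer calls itself first
    noun(NP2).

np1(np(Det)) --> det(Det).

det(det(the)) --> [the].
noun(noun(car)) --> [car].

% ?- phrase(np(Tree), [the, car]).
% Tree = np(np(det(the)), noun(car))
%
% ?- phrase(np(Tree), [the, car, car]).
% fails (and terminates): only one noun is allowed after renaming
```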

Working around left-recursive rules - 2

Method 2

– Keep a list of the points reached in the parse.

– Examine the list to ensure that you are not repeating a point.

np(np(NP1, NP2), History0, History, S0, S) :-
    \+ memb(entry(np, S0), History0),
    np(NP1, [entry(np, S0)|History0],
       History1, S0, S1),
    noun(NP2, [entry(noun, S1)|History1],
         History, S1, S).

Working around left-recursive rules - 3

Method 2 (continued)

np(np(Det), History0, History) -->
    det(Det, History0, History).
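A runnable version of Method 2 needs memb/2 and history-carrying versions of det and noun, none of which are shown. The sketch below fills them in with assumed definitions: memb/2 as plain list membership, and dictionary entries that pass the history through unchanged.

```prolog
% Assumed definition of memb/2: ordinary list membership.
memb(X, [X|_]).
memb(X, [_|Tail]) :- memb(X, Tail).

% From the lecture: refuse to try np again at an input position
% where an np attempt is already in progress.
np(np(NP1, NP2), History0, History, S0, S) :-
    \+ memb(entry(np, S0), History0),
    np(NP1, [entry(np, S0)|History0], History1, S0, S1),
    noun(NP2, [entry(noun, S1)|History1], History, S1, S).
np(np(Det), History0, History) -->
    det(Det, History0, History).

% Assumed dictionary entries: the history is passed through unchanged.
det(det(the), History, History) --> [the].
noun(noun(car), History, History) --> [car].

% ?- phrase(np(Tree, [], _), [the, car]).
% Tree = np(np(det(the)), noun(car))
```

The first np clause is an ordinary Prolog clause because it needs to inspect the input position S0 directly; the second can stay in DCG notation, and both define the same five-argument predicate.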

Summary

DCGs are a powerful and convenient implementation of CFG in Prolog.

They allow rules which specify

– structure building

– arbitrary embedded Prolog code

Because DCGs use Prolog’s depth-first search, they have problems with left-recursive rules, but these can be worked around by renaming rules or by keeping a history list.

DCGs are more than a language parsing tool – they have uses in a wide variety of programs.
