incremental, systematic acquiring of knowledge (using closure algorithms)
Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms). Roger L. Costello, February 1, 2014. Example #1. Problem Statement. Problem: What cities can you get to from Boston?
![Page 1: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/1.jpg)
Incremental, Systematic Acquiring of Knowledge
(using Closure Algorithms)
Roger L. Costello
February 1, 2014
![Page 2: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/2.jpg)
Example #1
![Page 3: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/3.jpg)
Problem Statement
• Problem: What cities can you get to from Boston?

    <Flights>
        <Flight>
            <From>Boston</From>
            <To>Miami</To>
        </Flight>
        <Flight>
            <From>Miami</From>
            <To>Houston</To>
        </Flight>
        <Flight>
            <From>Cleveland</From>
            <To>Akron</To>
        </Flight>
        <Flight>
            <From>Boston</From>
            <To>Denver</To>
        </Flight>
    </Flights>

It is easy to eyeball this sample XML and see that from Boston you can get to Miami, Houston (via Miami), and Denver.
But if the XML document had thousands of flights, it would be impossible to eyeball it to get the answer.
![Page 4: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/4.jpg)
Need a Strategy
• We need a systematic, incremental approach to obtaining the answer.
• That’s what closure algorithms give us.
![Page 5: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/5.jpg)
Closure algorithms
• A closure algorithm enables the incremental, systematic acquiring of knowledge.
• Closure algorithms are characterized by two components:
– An initialization, which is an assessment of what we know initially.
– An inference rule, which is a rule telling how knowledge from several places is to be combined.
• The inference rule(s) are repeated until nothing changes any more.
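The two components and the stopping condition can be sketched generically in Python. This is only an illustrative sketch, not code from the slides: the names `closure`, `initial`, and `infer` are assumptions, and knowledge is modelled as a plain set of facts.

```python
def closure(initial, infer):
    """Repeat the inference rule until a round adds nothing new.

    `initial` is the initialization (what we know at the start).
    `infer` is the inference rule: it takes the set of known facts
    and returns the facts that follow from them.
    Both names are hypothetical, chosen for this sketch.
    """
    known = set(initial)
    while True:
        new_facts = infer(known) - known  # one round of the inference rule
        if not new_facts:                 # nothing changed: we are done
            return known
        known |= new_facts


# Toy usage: start with 1; if n is known and n < 20, then 2 * n is known.
doubles = closure({1}, lambda known: {2 * n for n in known if n < 20})
print(sorted(doubles))
```

The loop terminates only when the set of possible facts is finite, which holds for both examples in these slides.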
![Page 6: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/6.jpg)
Initial Knowledge
This is what we know initially:
– All the flights
– From Boston we can get to Miami and Denver since there are direct flights to those cities.

    <Flights>
        <Flight>
            <From>Boston</From>
            <To>Miami</To>
        </Flight>
        <Flight>
            <From>Miami</From>
            <To>Houston</To>
        </Flight>
        <Flight>
            <From>Cleveland</From>
            <To>Akron</To>
        </Flight>
        <Flight>
            <From>Boston</From>
            <To>Denver</To>
        </Flight>
    </Flights>

![Page 7: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/7.jpg)
Inference Rule
• If we can get to the city identified in <From>, then we can get to the city identified in <To>.
![Page 8: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/8.jpg)
Initial knowledge

| Flight | Can get there from Boston? |
| --- | --- |
| `<Flight><From>Boston</From><To>Miami</To></Flight>` | Yes |
| `<Flight><From>Miami</From><To>Houston</To></Flight>` | |
| `<Flight><From>Cleveland</From><To>Akron</To></Flight>` | |
| `<Flight><From>Boston</From><To>Denver</To></Flight>` | Yes |

• Go through the XML and for each <Flight> with a direct flight from Boston, mark it as “Can get there from Boston”.
![Page 9: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/9.jpg)
Build on top of our knowledge
• Apply the inference rule to gain more knowledge.
• This is round two (round one was marking the initial knowledge).

| Flight | Can get there from Boston? |
| --- | --- |
| `<Flight><From>Boston</From><To>Miami</To></Flight>` | Yes |
| `<Flight><From>Miami</From><To>Houston</To></Flight>` | Yes (since we can get to Miami) |
| `<Flight><From>Cleveland</From><To>Akron</To></Flight>` | |
| `<Flight><From>Boston</From><To>Denver</To></Flight>` | Yes |

![Page 10: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/10.jpg)
Round Three
• A third round yields no additional cities.

| Flight | Can get there from Boston? |
| --- | --- |
| `<Flight><From>Boston</From><To>Miami</To></Flight>` | Yes |
| `<Flight><From>Miami</From><To>Houston</To></Flight>` | Yes |
| `<Flight><From>Cleveland</From><To>Akron</To></Flight>` | |
| `<Flight><From>Boston</From><To>Denver</To></Flight>` | Yes |

![Page 11: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/11.jpg)
Problem/Answer
• Problem: What cities can you get to from Boston?
• Answer: We can get to these cities: {Miami, Denver, Houston}
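The three rounds of Example #1 can be reproduced with a short Python sketch. It is an illustration under assumptions rather than code from the slides: the function name `reachable_from` and the round counter are hypothetical, and the XML is parsed with the standard-library `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

FLIGHTS_XML = """
<Flights>
  <Flight><From>Boston</From><To>Miami</To></Flight>
  <Flight><From>Miami</From><To>Houston</To></Flight>
  <Flight><From>Cleveland</From><To>Akron</To></Flight>
  <Flight><From>Boston</From><To>Denver</To></Flight>
</Flights>
"""


def reachable_from(origin, flights_xml):
    """Return (cities reachable from origin, number of rounds performed)."""
    flights = [(f.findtext("From"), f.findtext("To"))
               for f in ET.fromstring(flights_xml).iter("Flight")]

    # Round one (initialization): cities with a direct flight from origin.
    reachable = {to for frm, to in flights if frm == origin}
    rounds = 1

    # Later rounds (inference rule): if we can get to <From>,
    # then we can get to <To>. Stop when a round adds no city.
    while True:
        rounds += 1
        new_cities = {to for frm, to in flights if frm in reachable} - reachable
        if not new_cities:
            return reachable, rounds
        reachable |= new_cities


cities, rounds = reachable_from("Boston", FLIGHTS_XML)
print(sorted(cities), rounds)
```

For the sample document this yields Denver, Houston, and Miami after three rounds, the last of which adds nothing. Because each round scans every flight, the same code works unchanged on a document with thousands of flights.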
![Page 12: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/12.jpg)
Example #2
![Page 13: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/13.jpg)
Clean XML Schema
• Problem: Are there any element declarations in the adjacent XML Schema that, if used, do not result in a valid XML instance document (e.g., perhaps because they loop infinitely)? In other words, are there any non-productive element declarations?

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="Document">
            <xs:complexType>
                <xs:choice>
                    <xs:sequence>
                        <xs:element ref="A" />
                        <xs:element ref="B" />
                    </xs:sequence>
                    <xs:sequence>
                        <xs:element ref="D" />
                        <xs:element ref="E" />
                    </xs:sequence>
                </xs:choice>
            </xs:complexType>
        </xs:element>
        <xs:element name="A" type="xs:string" />
        <xs:element name="B">
            <xs:complexType mixed="true">
                <xs:sequence>
                    <xs:element ref="C" />
                </xs:sequence>
            </xs:complexType>
        </xs:element>
        <xs:element name="C" type="xs:string" />
        <xs:element name="D">
            <xs:complexType mixed="true">
                <xs:sequence>
                    <xs:element ref="F" />
                </xs:sequence>
            </xs:complexType>
        </xs:element>
        <xs:element name="E" type="xs:string" />
        <xs:element name="F">
            <xs:complexType mixed="true">
                <xs:sequence>
                    <xs:element ref="D" />
                </xs:sequence>
            </xs:complexType>
        </xs:element>
    </xs:schema>

![Page 14: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/14.jpg)
Concise Notation
• For brevity, let’s depict the element declarations this way:

    Document → A B | D E
    A → string
    B → string C
    C → string
    D → string F
    E → string
    F → string D

A → string means the value of element A is a string.
B → string C means the value of element B is a string and a child element C (i.e., B has mixed content).
![Page 15: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/15.jpg)
Terminology
These are rules. Each rule has a left-hand side and a right-hand side (the two sides are separated by an arrow →). Document, A, B, …, F are non-terminal symbols. string is a terminal symbol.

    Document → A B | D E
    A → string
    B → string C
    C → string
    D → string F
    E → string
    F → string D

![Page 16: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/16.jpg)
Find the productive rules
• We find the non-productive rules by finding the productive ones.
• A rule is productive if its right-hand side consists of symbols all of which are productive.
• Symbols that are productive:
– Terminal symbols are productive since they produce values.
– A non-terminal is productive if there is a productive rule for it.
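The definition above can be turned into a small Python sketch. The representation is an assumption made for illustration: each rule is a pair of a left-hand side and a list of right-hand-side symbols, with the two alternatives for Document written as two separate rules. The names `RULES`, `TERMINALS`, and `productive_symbols` are hypothetical.

```python
# Each rule is (left-hand side, right-hand side symbols).
RULES = [
    ("Document", ["A", "B"]),
    ("Document", ["D", "E"]),
    ("A", ["string"]),
    ("B", ["string", "C"]),
    ("C", ["string"]),
    ("D", ["string", "F"]),
    ("E", ["string"]),
    ("F", ["string", "D"]),
]
TERMINALS = {"string"}


def productive_symbols(rules, terminals):
    """Return every productive symbol (terminals and non-terminals)."""
    productive = set(terminals)  # initialization: terminals are productive
    changed = True
    while changed:               # repeat until nothing changes any more
        changed = False
        for lhs, rhs in rules:
            # Inference rule: a rule whose right-hand side is entirely
            # productive makes its left-hand side productive.
            if lhs not in productive and all(s in productive for s in rhs):
                productive.add(lhs)
                changed = True
    return productive


print(sorted(productive_symbols(RULES, TERMINALS) - TERMINALS))
```

This variant updates the set while scanning, so a pass may use facts found earlier in the same pass. It can therefore finish in fewer passes than the round-by-round tables, but it reaches the same final set.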
![Page 17: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/17.jpg)
Initial knowledge

| Rule | Is it productive? |
| --- | --- |
| Document → A B | |
| Document → D E | |
| A → string | Yes |
| B → string C | |
| C → string | Yes |
| D → string F | |
| E → string | Yes |
| F → string D | |

• Go through the grammar and for each rule for which we know that all its right-hand side members are productive, mark the rule and the non-terminal it defines as “productive”.
![Page 18: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/18.jpg)
Build on top of our knowledge
• Apply the inference rule to gain more knowledge.

| Rule | Is it productive? |
| --- | --- |
| Document → A B | |
| Document → D E | |
| A → string | Yes |
| B → string C | Yes (since “string” and C are productive) |
| C → string | Yes |
| D → string F | |
| E → string | Yes |
| F → string D | |

![Page 19: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/19.jpg)
Round three

| Rule | Is it productive? |
| --- | --- |
| Document → A B | Yes (since A and B are productive) |
| Document → D E | |
| A → string | Yes |
| B → string C | Yes (since “string” and C are productive) |
| C → string | Yes |
| D → string F | |
| E → string | Yes |
| F → string D | |

![Page 20: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/20.jpg)
Round four
• A fourth round yields nothing new.

| Rule | Is it productive? |
| --- | --- |
| Document → A B | Yes (since A and B are productive) |
| Document → D E | |
| A → string | Yes |
| B → string C | Yes (since “string” and C are productive) |
| C → string | Yes |
| D → string F | |
| E → string | Yes |
| F → string D | |

![Page 21: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/21.jpg)
Recap
• We now know that Document, A, B, C, and E are productive and D, F, and the rule Document → D E are not productive.

| Rule | Is it productive? |
| --- | --- |
| Document → A B | Yes (since A and B are productive) |
| Document → D E | |
| A → string | Yes |
| B → string C | Yes (since C is productive) |
| C → string | Yes |
| D → string F | |
| E → string | Yes |
| F → string D | |

![Page 22: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/22.jpg)
Remove non-productive rules
• We have pursued all possible avenues for productivity and have not found any for D, F, or the second rule for Document. That means the rules for D and F, and the second rule for Document, can be removed from the XML Schema.
The XML Schema after removing non-productive rules

| Rule | Is it productive? |
| --- | --- |
| Document → A B | Yes (since A and B are productive) |
| A → string | Yes |
| B → string C | Yes (since C is productive) |
| C → string | Yes |
| E → string | Yes |

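The removal step can be sketched as a simple filter in Python. This is an illustrative sketch under assumptions: rules are written as pairs of a left-hand side and a list of right-hand-side symbols, the set `PRODUCTIVE` is copied by hand from the recap rather than computed, and the name `remove_non_productive` is hypothetical.

```python
# Each rule is (left-hand side, right-hand side symbols).
RULES = [
    ("Document", ["A", "B"]),
    ("Document", ["D", "E"]),
    ("A", ["string"]),
    ("B", ["string", "C"]),
    ("C", ["string"]),
    ("D", ["string", "F"]),
    ("E", ["string"]),
    ("F", ["string", "D"]),
]

# Productive symbols, taken from the recap (assumed, not recomputed here).
PRODUCTIVE = {"string", "Document", "A", "B", "C", "E"}


def remove_non_productive(rules, productive):
    """Keep only the rules whose right-hand side is entirely productive."""
    return [(lhs, rhs) for lhs, rhs in rules
            if all(symbol in productive for symbol in rhs)]


for lhs, rhs in remove_non_productive(RULES, PRODUCTIVE):
    print(lhs, "→", " ".join(rhs))
```

As in the table above, the rule for E survives because E is productive, even though no remaining rule refers to it.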
![Page 23: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/23.jpg)
Cleaned XML Schema
• Problem: Does the XML Schema have any non-productive element declarations?
• Answer: Yes, and the cleaned XML Schema is shown in the adjacent box.

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="Document">
            <xs:complexType>
                <xs:sequence>
                    <xs:element ref="A" />
                    <xs:element ref="B" />
                </xs:sequence>
            </xs:complexType>
        </xs:element>
        <xs:element name="A" type="xs:string" />
        <xs:element name="B">
            <xs:complexType mixed="true">
                <xs:sequence>
                    <xs:element ref="C" />
                </xs:sequence>
            </xs:complexType>
        </xs:element>
        <xs:element name="C" type="xs:string" />
        <xs:element name="E" type="xs:string" />
    </xs:schema>

![Page 24: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/24.jpg)
Bottom-up process
• Removing the non-productive rules is a bottom-up process: only at the bottom level, where the terminal symbols live, can we know at the outset what is productive.
![Page 25: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/25.jpg)
Why is it called “closure”?
• We have seen two examples where new knowledge was incrementally, systematically acquired using a closure algorithm.
• Question: Where does the term “closure” come from?
• Answer: Applying the inference rule yields knowledge about existing symbols, without going outside the universe of discourse.
– In the last example, applying the inference rule resulted in new knowledge about the elements that are productive in the XML Schema. Conversely, if the inference rule were to yield knowledge about elements in a completely different XML Schema, then the algorithm would not be closed.
In a closure algorithm the universe of discourse is self-contained: no information outside of the universe is generated.
![Page 26: Incremental, Systematic Acquiring of Knowledge (using Closure Algorithms)](https://reader035.vdocument.in/reader035/viewer/2022062410/56816664550346895dd9f63c/html5/thumbnails/26.jpg)
Acknowledgement
• Some of the information in these slides comes from this fabulous book: Parsing Techniques by Dick Grune and Ceriel Jacobs.