Assignment # 1 - farrukhjabeen.weebly.com/uploads/1/1/6/2/1162932/p1comp581.pdf
Farrukh Jabeen
COMP 581 Mathematical Methods in AI
Due Date: September 7, 2009
Assignment # 1
1. a) What is the "Boolean Satisfiability Problem (SAT)"? b) Why is this an important
problem in the theory of algorithms? Explain your answers and give supporting
examples.
Answer:
Satisfiability is the problem of determining if the variables of a given Boolean formula can be assigned in
such a way as to make the formula evaluate to TRUE. Equally important is to determine whether no such
assignments exist, which would imply that the function expressed by the formula is identically FALSE for all
possible variable assignments. In this latter case, we would say that the function is unsatisfiable; otherwise
it is satisfiable. To emphasize the binary nature of this problem, it is frequently referred to as Boolean or
propositional satisfiability. The shorthand "SAT" is also commonly used to denote it, with the implicit
understanding that the function and its variables are all binary-valued.
The Boolean satisfiability (SAT) problem plays a central role in combinatorial optimization and, in
particular, in NP-completeness. Any NP-complete problem, such as the max-cut problem, the graph coloring
problem, and the TSP, can be translated in polynomial time into a SAT problem. The SAT problem plays a
central role in solving large-scale computational problems, such as planning and scheduling, integrated
circuit design, computer architecture design, computer graphics, image processing, and finding the folding
state of proteins.
Propositional satisfiability (SAT) is a prototypical combinatorially hard problem which is as expressive as a
general constraint satisfaction problem. The importance of SAT
in artificial intelligence is clear from several practical applications and from several workshops, international
competitions, and special issues of journals devoted to this problem.
The practical applications of SAT include circuit design, diagnosis and verification, job scheduling and
planning. Several combinatorial problems have been encoded
as SAT and solved efficiently. These include map coloring and Steiner tree finding. Classical planning
problem can be solved efficiently, when encoded and solved as a SAT
problem.
Local search-based methods have gained significant attention in the last decade. These are iterative
improvement methods, and they are incomplete. These methods have solved many SAT and constraint
satisfaction problems (CSPs) orders of magnitude faster than global search-based methods.
Local search methods start with an initial assignment to the variables and then repeatedly choose
assignments that increase the number of satisfied constraints. To perform simulated annealing or to escape
from a local maximum, the methods sometimes choose assignments that do not lead to any improvement. The
next assignment, which improves the value of the score function, is generally chosen by examining several
candidate assignments.
These candidates are obtained by assigning a different value to the same variable or by assigning different
values to multiple variables. In the first case, any candidate assignment differs from the current assignment
in the value of only one variable.
Several local search-based methods have been reported. All of these generate candidate
assignments by changing the value of only one variable: given an n-variable SAT instance, n candidate
assignments can be generated, each by flipping the value of one variable. Flipping the values of multiple
variables in an iteration to generate the next assignment can be useful for several reasons. Multiple flips
can escape local maxima; evaluating the effects of multiple flips allows a solver to look ahead; and
multiple flips can reduce the number of iterations needed to solve a problem and/or the solving time.
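The single-flip local search described above can be sketched in a few lines of code. The following is a minimal GSAT-style loop, not taken from the original text; the clause representation (lists of signed integers) and all parameter names are illustrative assumptions:

```python
import random

def num_satisfied(clauses, assignment):
    """Count how many clauses the assignment satisfies.
    A clause is a list of literals: +j means x_j, -j means NOT x_j."""
    return sum(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses)

def flip(assignment, j):
    """Return a copy of the assignment with variable j flipped."""
    new = dict(assignment)
    new[j] = not new[j]
    return new

def gsat(clauses, n_vars, max_flips=100, seed=0):
    """Greedy single-flip local search: from a random start, repeatedly
    move to the one-flip neighbour satisfying the most clauses.
    Incomplete: may fail even when a satisfying assignment exists."""
    rng = random.Random(seed)
    a = {j: rng.choice([True, False]) for j in range(1, n_vars + 1)}
    for _ in range(max_flips):
        if num_satisfied(clauses, a) == len(clauses):
            return a  # all clauses satisfied
        # the n candidate assignments, each differing in one variable
        a = max((flip(a, j) for j in range(1, n_vars + 1)),
                key=lambda cand: num_satisfied(clauses, cand))
    return None  # gave up -- this says nothing about unsatisfiability

# (x1 OR x2) AND (NOT x1 OR x2) AND (x1 OR NOT x2) is satisfied by x1=x2=TRUE
print(gsat([[1, 2], [-1, 2], [1, -2]], n_vars=2))  # -> {1: True, 2: True}
```

Returning None after the flip budget is exhausted is exactly the incompleteness mentioned above: the method cannot prove unsatisfiability.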
There are different formulations of the SAT problem, but the most common one (which we discuss next)
consists of two components.
• A set of n Boolean variables {x1, x2, ..., xn}, representing statements that can either be TRUE (= 1)
or FALSE (= 0). The negation (the logical NOT) of a variable x is denoted by ¬x. For example,
¬TRUE = FALSE. A variable or its negation is called a literal.
• A set of m distinct clauses {C1, C2, ..., Cm} of the form Ci = zi1 ∨ zi2 ∨ ... ∨ zik, where the z's are
literals and ∨ denotes the logical OR operator. For example, 0 ∨ 1 = 1.
The binary vector x = (x1, x2, ..., xn) is called a truth assignment, or simply an assignment. Thus, xi = 1
assigns truth to xi and xi = 0 assigns truth to ¬xi, for each i = 1, ..., n. The simplest SAT problem can
now be formulated as: find a truth assignment x such that all clauses are true.
Denoting the logical AND operator by ∧, we can represent the above SAT problem via the single
formula
F1 = C1 ∧ C2 ∧ ... ∧ Cm,
where each Ck consists of literals connected with only ∨ operators. Such a SAT formula is said to be in
conjunctive normal form (CNF). An alternative SAT formulation concerns formulas of the type
F2 = C1 ∨ C2 ∨ ... ∨ Cm,
where the clauses are of the form Ci = zi1 ∧ zi2 ∧ ... ∧ zik. Such a SAT problem is then said to be in
disjunctive normal form (DNF). In this case, a truth assignment x is sought that satisfies at least one of
the clauses, which is usually a much simpler problem.
Example
As an illustration of the SAT problem and the corresponding SAT counting problem, consider the
following toy example of coloring the nodes of the graph in the figure below. Is it possible to color the
nodes either black or white in such a way that no two adjacent nodes have the same color? If so, how
many such colorings are there?
[Figure (not reproduced): a small graph. Can the graph be colored with two colors so that no two
adjacent nodes have the same color?]
We can translate this graph coloring problem into a SAT problem in the following way. Let xj be the
Boolean variable representing the statement "the j-th node is colored black". Obviously, xj is either
TRUE or FALSE, and we wish to assign truth to either xj or ¬xj for each j = 1, ..., 5. The restriction
that adjacent nodes cannot have the same color can be translated into a number of clauses that must all
hold. For example, "node 1 and node 3 cannot both be black" can be translated as the clause
C1 = ¬x1 ∨ ¬x3. Similarly, the statement "at least one of node 1 and node 3 must be black" is
translated as C2 = x1 ∨ x3. The same holds for all other pairs of adjacent nodes. The clauses can now
be summarized as in the following table. Here, in the left-hand table, for each clause Ci a 1 in column j
means that the clause contains xj, a -1 means that the clause contains the negation ¬xj, and a 0 means
that the clause contains neither of them. Now call the corresponding matrix A = (aij) the clause matrix.
For example, a76 = -1 and a42 = 0. An alternative representation of the clause matrix is to list, for each
clause, only the indices of all Boolean variables present in that clause; in addition, each index that
corresponds to a negation of a variable is preceded by a minus sign.
[Table (not reproduced): a SAT table and an alternative representation of the clause matrix.]
Now let x = (x1, x2, ..., x5) be a truth assignment. The question is whether there exists an x such
that all clauses {Ck} are satisfied. To see whether a single clause Ci is satisfied, one must compare the
truth assignment for each variable in that clause with the values 1, -1, and 0 in the clause matrix A,
which indicate whether the literal corresponds to the variable, to its negation, or that neither appears in
the clause. If, for example, xj = 0 and aij = -1, then the literal ¬xj is TRUE. The entire clause is TRUE
if it contains at least one true literal. Define the clause value Ci(x) = 1 if clause Ci is TRUE with truth
assignment x and Ci(x) = 0 if it is FALSE. Then it is easy to see that
Ci(x) = max_j max{0, (2xj - 1) aij},
assuming that at least one aij is nonzero for clause Ci (otherwise, the clause can be deleted). For example,
for the truth assignment (0,1,0,1,0) the corresponding clause values are given in the rightmost column of
the left-hand table. We see that the second and fourth clauses are violated. However, the assignment
(1,1,0,1,0) does indeed yield all clauses true, and this therefore gives a way in which the nodes can be
colored: 1 = black, 2 = black, 3 = white, 4 = black, 5 = white. It is easy to see that (0,0,1,0,1) is the only
other assignment that renders all the clauses true.
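The clause-value formula can be checked directly in code. The small clause matrix below is invented for illustration, since the table from the original figure is not reproduced here:

```python
def clause_values(A, x):
    """Clause values C_i(x) = max_j max(0, (2*x_j - 1) * a_ij):
    clause i has value 1 iff it contains at least one true literal."""
    return [max(max(0, (2 * xj - 1) * aij) for xj, aij in zip(x, row))
            for row in A]

# Hypothetical clause matrix for (x1 OR NOT x2) AND (NOT x1 OR x2):
A = [[1, -1],
     [-1, 1]]
print(clause_values(A, [1, 1]))  # -> [1, 1]: both clauses TRUE
print(clause_values(A, [1, 0]))  # -> [1, 0]: second clause violated
```

Note how the formula works: when aij = 1 and xj = 1, or aij = -1 and xj = 0, the term (2xj - 1)aij equals 1 (a true literal); in every other case the inner max clips it to 0.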
The problem of deciding whether there exists a valid assignment, and indeed providing such a vector, is
called the SAT assignment problem. Finding a coloring in the above example is a particular instance of the
SAT assignment problem. A SAT assignment problem in which each clause contains exactly K literals is
called a K-SAT problem. It is well known that 2-SAT problems are easy and can be solved in polynomial
time, while K-SAT problems for K ≥ 3 are NP-hard. A more difficult problem is to find the maximum
number of clauses that can be satisfied by one truth assignment. This is called the Max-SAT problem.
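For small instances, both the SAT assignment problem and the Max-SAT problem can be solved by exhaustively enumerating all 2^n truth assignments. This is a toy sketch only (practical solvers are far more sophisticated), using the same signed-integer clause encoding as an assumption:

```python
from itertools import product

def max_sat(clauses, n_vars):
    """Brute-force Max-SAT: try all 2^n assignments and return the one
    satisfying the most clauses, together with that count.
    Clauses are lists of literals: +j for x_j, -j for NOT x_j."""
    best, best_count = None, -1
    for bits in product([False, True], repeat=n_vars):
        a = dict(enumerate(bits, start=1))
        count = sum(any((lit > 0) == a[abs(lit)] for lit in c)
                    for c in clauses)
        if count > best_count:
            best, best_count = a, count
    return best, best_count

# (x1) AND (NOT x1) is unsatisfiable, yet any assignment satisfies 1 clause:
print(max_sat([[1], [-1]], n_vars=1)[1])  # -> 1
```

If the returned count equals the number of clauses, the instance is satisfiable and the returned assignment solves the SAT assignment problem as well.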
2. What is the "Frame Problem" in Artificial Intelligence? Explain your answer and
give specific examples.
Answer:
The term "artificial intelligence" was coined by John McCarthy. While there does not appear to be any
consensus about the definition of AI, we can perhaps categorize the various definitions under three headings:
Strong AI: AI has as its goal the creation of machines with conscious intelligence.
Weak AI: AI has as its goal the creation of machines which behave as if they were intelligent.
Experimental Cognition: AI is the study of human intellectual capabilities through the use of mental
models implemented on a computer.
In artificial intelligence, the frame problem was initially formulated as the problem of expressing a
dynamical domain in logic without explicitly specifying which conditions are not affected by an action.
The name "frame problem" derives from a common technique used by animated cartoon makers called
framing where the currently moving parts of the cartoon are superimposed on the "frame," which depicts
the background of the scene, which does not change. In the logical context, actions are typically specified
by what they change, with the implicit assumption that everything else (the frame) remains unchanged.
In other words, when we carry out an action, the world changes. In fact, some aspects of the world
change, but others stay the same. Determining which stay the same is known as the frame problem.
An effect axiom states what changes when the robot carries out a particular action in a particular
situation. It does not make any statement about what does not change. For example, when the robot
carries out a Take action, it does not find itself in a different room. This kind of fact is expressed in a
frame axiom, which asserts that a property unaffected by an action still holds after the action.
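A frame axiom of the kind just described might be written as follows. This is a standard situation-calculus sketch supplied for illustration; the predicate names are assumptions, not taken from the original text:

```latex
% Frame axiom: a Take action leaves the robot's location unchanged.
\forall r, s \;\; \big( In(Robot, r, s) \rightarrow In(Robot, r, Result(Take, s)) \big)
```

One such axiom is needed for every (property, action) pair that leaves the property unchanged, which is exactly what makes the number of frame axioms explode.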
Many such frame axioms are needed if we are to describe all of the effects that do not result from
carrying out a particular action. The problem of needing enormous numbers of frame axioms is known as
the representational frame problem.
In any complex situation, we have to frame the problem the right way in order to achieve a solution.
"Framing" the problem means identifying what is relevant and what is irrelevant, and determining which bits
of knowledge in our extensive repository can be brought to bear to arrive at a solution. But to properly frame
the problem, we must understand the problem.
As human beings, we routinely solve complex problems. Our success in that process is so frequent and so
seemingly effortless, in fact, that it is easy to take the ability for granted. It is only when we begin trying to
capture that expertise in automated systems (e.g. robots), and try to give them the intelligence to deal with
new situations, that we discover how truly remarkable that capability is.
In all of these cases, people are effectively reorganizing their problem-solving heuristics in order to rapidly
hit on the "best" problem-solving strategy in any given situation. Currently, of course, our robotic systems
tend to be "frozen" rather than "self-organizing". Once programmed, the program tends to operate in
exactly the same way as time goes on -- the situations may change, and the results generated by the
program may differ as a result, but the program itself never changes.
So clearly, a program that could "organize itself" would have the ability to address the frame problem. In
other words, it could get smarter over time. But that, in itself, does not solve the problem, because any one
attempt to find a solution can still take an inordinately long time. The "Robot Example" illustrates that
problem nicely.
• Robot, R1, is created. Its task is to go into a room and get its power supply. It goes into the room,
finds its power supply in a wagon, and pulls it out of the room. Unfortunately, there is also a bomb in
the wagon. BOOM! The wagon and robot go up in smoke.
• That's not good, so a "Robot Decision-maker", R1D1, is constructed. It is able to think out the
consequences of its actions in advance, so it won't pull out the bomb with the wagon. The new robot
goes into the room and immediately engages its deductive engines. It deduces that the floor is under
the wagon, and the wagon is under the roof, so therefore the floor is under the roof. Then it starts
deducing that... BOOM! The bomb goes off before the program can complete its exhaustive set of
deductions.
• That's not good either, so a "Relevancy-detecting Robot Decision-maker", R2D1, is produced. This
time, the robot won't spend time on useless deductions, but will focus on what's important! So it, too, is
told to enter the room and retrieve its power supply. This version sits outside the room without moving
for an awfully long time, until finally... BOOM! The power supply is destroyed by the bomb. After
debugging the output, it turns out that the program had figured out that the position of the floor wasn't
relevant, and that the color of the drapes wasn't relevant, and that was about as far as it got in the time
available. At least this time the robot was saved, but the "analysis paralysis" induced by the attempt to
determine relevancy prevented the robot from taking any sort of action.
The robot we need to solve the problem, of course, is the "R2D2" -- a Relevancy-detecting Robot Decision-
maker that acts Decisively! The frame problem suggests that R2D2 is impossible to construct. Impossible,
that is, if R2D2 must stand or fail on its own.
3. Explain the following methods of reasoning: a) deduction; b) induction (note: this
is not mathematical induction, but inductive reasoning); c) abduction.
Answer:
Abduction means determining the precondition: using the conclusion and the rule to support the claim
that the precondition could explain the conclusion.
Consider this example. It is logical to assert: "It rains; if it rains, the floor is wet; hence, the floor is wet."
But any reasonable person can see the problem in making statements like: "The floor is wet; if it rains, the
floor is wet; hence, it rains." In order to make inferences to the best explanation, the researcher needs a
set of plausible explanations, and thus abduction is usually formulated in the following mode:
The surprising phenomenon, X, is observed.
Among hypotheses A, B, and C, A is capable of explaining X.
Hence, there is a reason to pursue A.
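The contrast between the two directions of inference can be made concrete with a toy rule base. This sketch is purely illustrative; the rule and facts are invented for the example:

```python
# A single rule: "if it rains, the floor is wet" (rain -> wet_floor)
rules = {"rain": "wet_floor"}

def deduce(facts, rules):
    """Deduction: given a rule and its precondition, conclude the effect.
    Truth-preserving: the conclusions are guaranteed by the premises."""
    return {effect for cause, effect in rules.items() if cause in facts}

def abduce(observation, rules):
    """Abduction: given an observed effect, collect the preconditions
    that could explain it. These are plausible hypotheses, not
    guaranteed truths -- a wet floor does not prove that it rained."""
    return {cause for cause, effect in rules.items()
            if effect == observation}

print(deduce({"rain"}, rules))     # -> {'wet_floor'}
print(abduce("wet_floor", rules))  # -> {'rain'} (a hypothesis to pursue)
```

Deduction runs the rule forward from precondition to conclusion; abduction runs it backward, which is exactly why its output is only "a reason to pursue A", never a proof.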
Applications of abduction
Abduction can be applied to quantitative research, especially Exploratory Data
Analysis (EDA) and exploratory statistics (ES), such as factor rotation in Exploratory Factor Analysis and
path searching in Structural Equation Modeling. The whole notion of a controlled experiment is covertly
based on the logic of abduction. In a controlled experiment, the researchers control alternate explanations
and test the condition generated from the most plausible hypothesis. However, abduction shares more
common ground with EDA than with controlled experiments. In EDA, after observing some surprising facts,
we exploit them and check the predicted values against the observed values and residuals. Although there
may be more than one convincing pattern, we "abduct" only those that are more plausible for subsequent
controlled experimentation. Since experimentation is hypothesis-driven and EDA is data-driven, the logic
behind them is quite different. The abductive reasoning of EDA goes from data to hypotheses, while the
inductive reasoning of experimentation goes from hypothesis to expected data. By the same token, in
Exploratory Factor Analysis and Structural Equation Modeling, there might be more than one possible way
to achieve a fit between the data and the model; again, the researcher must "abduct" a plausible set of
variables and paths for model building.
In EDA, the role of the researcher is to explore the data in as many ways as possible until a plausible "story"
of the data emerges. EDA is not “fishing” significant results from all possible angles during research: it is
not trying out everything.
There are millions of possible explanations to a phenomenon. Due to the economy of research, we cannot
afford to falsify every possibility. We don't have to know everything to know something. By the same token,
we don't have to screen every false thing to dig out the authentic one. During the process of abduction, the
researcher should be guided by the elements of generality to extract a proper mode of perception.
Deduction means determining the conclusion: drawing logical consequences from premises. An inference
is endorsed as deductively valid when the truth of all premises guarantees the truth of the conclusion.
For instance,
First premise: All the beans from the bag are white (True).
Second premise: These beans are from this bag (True).
Conclusion: Therefore, these beans are white (True).
Limitations of deduction
There are several limitations of deductive logic. First, deductive logic confines the
conclusion to a binary answer (True/False). A typical example is the rejection, or failure to reject, of the
null hypothesis.
Second, this kind of reasoning cannot lead to the discovery of knowledge that is not already embedded in the
premise. In some cases the premise may even be tautological--true by definition. Brown (1963) illustrated
this weakness by using an example in economics: An entrepreneur seeks maximization of profits. The
maximum profits will be gained when marginal revenue equals marginal cost. An entrepreneur will operate
his business at the equilibrium between marginal cost and marginal revenue.
The above deduction simply tells that a rational man would like to make more money.
There is a similar example in cognitive psychology:
Human behaviors are rational.
One of several options is more efficient in achieving the goal.
A rational human will take the option that directs him to achieve his goal (Anderson,
1990).
The above two deductive inferences simply provide examples that a rational man will do
rational things. The specific rational behaviors have been included in the bigger set of generic rational
behaviors. Since deduction facilitates analysis based upon existing knowledge rather than generating new
knowledge, Josephson and Josephson (1994) viewed deduction as truth preserving and abduction as truth
producing.
Russell and Whitehead (1910) attempted to develop a self-sufficient logical-mathematical system. In their
view, not only can mathematics be reduced to logic, but logic is also the foundation of mathematics.
However, mathematical logic relies on many unproven premises and assumptions, and statistical
conclusions are considered true only given that all the premises and assumptions applied are true.
Deduction alone is a necessary condition, but not a sufficient condition of knowledge. Peirce (1934/1960)
warned that deduction is applicable only to the ideal state of things. In other words, deduction alone can be
applied to a well-defined problem, but not an ill-defined problem, which is more likely to be encountered by
researchers. Nevertheless, deduction performs the function of clarifying the relation of logical implications.
When well-defined categories result from abduction, premises can be generated for deductive reasoning.
Induction means determining the rule. It is learning the rule after numerous examples of the
conclusion following the precondition. Example: "The grass has been wet every time it has rained.
Thus, when it rains, the grass gets wet." Scientists are commonly associated with this style of
reasoning.
It allows inferring a from multiple instantiations of b when a entails b. Induction is the process
of inferring probable antecedents as a result of observing multiple consequents. An inductive
statement requires perception for it to be true. For example, the statement "it is snowing outside" is
unverified until one looks or goes outside to see whether it is true or not. Induction requires sense
experience.
Limitations of induction
Hume (1777/1912) argued that things are inconclusive by induction because in infinity
there are always new cases and new evidence. Induction can be justified, if and only if, instances of which
we have no experience resemble those of which we have experience. Thus, the problem of induction is also
known as “the skeptical problem about the future” (Hacking, 1975).
We never know when a regression line will turn flat, go down, or go up. Even inductive reasoning using
numerous accurate data points and high-powered computing can go wrong, because predictions are made
only under certain specified conditions.
Take the modern economy as another example. Due to American economic problems in the early 1980s,
quite a few reputable economists made gloomy predictions about the U.S. economy, such as the takeover
of the American economic and technological throne by Japan. By the end of the decade, Roberts (1989)
concluded that those economists were wrong; contrary to those forecasts, in the 1980s the U.S. enjoyed
the longest economic expansion in its history. In the 1990s, the economic positions of the two nations
changed: Japan experienced recession while America experienced expansion.
Induction suggests the possible outcome in relation to events in the long run; this is not definable for an
individual event. To make a judgment for a single event based on probability, such as "your chance to
survive this surgery is 75 percent," is nonsense. In actuality, the patient will either live or die. In a single
event, not only is the probability undefinable, but the explanatory power is also absent. Induction yields a
general statement that explains the events observed in general, but not the facts observed.
If we observe thousands of stones, trees and flowers, we never reach a point at which we observe a
molecule. After we heat many iron bars, we can conclude the empirical fact that metals will bend when they
are heated. But we will never discover the physics of expansion coefficients in this way.
Indeed, superficial empirical-based induction could lead to wrong conclusions. For example, by repeated
observations, it seems that heavy bodies (e.g. metal, stone) fall faster than lighter bodies (paper, feather).
This Aristotelian belief had misled European scientists for over a thousand years. Galileo argued that indeed
both heavy and light objects fall at the same speed.
We don't know the real probability due to our finite existence. However, given a large number of cases, we
can approximate the actual probability. We don't have to know everything to know something. Also, we
don't have to know every case to get an approximation. This approximation is sufficient to fix our beliefs
and lead us to further inquiry.
4. Explain the strengths and weaknesses of using formal logic as a knowledge
representation method. Give specific examples.
Answer:
Knowledge representation and reasoning plays a central role in Artificial Intelligence.
Research in Artificial Intelligence started off by trying to identify the general mechanisms responsible for
intelligent behavior. However, it quickly became obvious that general and powerful methods are not enough
to get the desired result, namely, intelligent behavior. Almost all tasks a human can perform which are
considered to require intelligence are also based on a huge amount of knowledge. For instance,
understanding and producing natural language heavily relies on knowledge about the language, about the
structure of the world, about social relationships etc.
One way to address the problem of representing knowledge and reasoning about it is to use some form of
logic.
Two perspectives on logic are possible. The first perspective, taken by McCarthy (1968), is that logic should
be used to represent knowledge. That is, we use logic as the representational and reasoning tool inside the
computer. Newell (1982) on the other hand proposed in his seminal paper on the knowledge level to use
logic as a formal tool to analyze knowledge. Of course, these two views are not incompatible. Furthermore,
once we accept that formal logic should be used as a tool for analyzing knowledge, it is a natural
consequence to use logic for representing knowledge and for reasoning about it as well.
Saying that logic is used as the main formal tool does not say which kind of logic is used. In fact, a large
variety of logics have been employed or developed in order to solve knowledge representation and reasoning
problems. Often, one started with an unclearly specified problem, developed some kind of knowledge
representation formalism without a formal semantics, and only later started to provide a formal semantics.
Using this semantics, one could then analyze the complexity of the reasoning problems and develop sound
and complete reasoning algorithms. This is called the logical method, which proved to be very fruitful in
the past and has a lot of potential for the future.
One good example of the evolution of knowledge representation formalisms is the development of
description logics, which have their roots in so-called structured inheritance network formalisms. These
networks were originally developed in order to represent word meanings. A concept node connects to other
concept nodes using roles. Moreover, the roles can be structured as well.
Determination of decidability and complexity, as well as the design of decision algorithms, are based on the
rigorous formalization of the initial ideas. In particular, it is not just one logic that is used to derive these
results; it is the logical method that led to the success. One starts with a specification of how expressions
of the language or formalism are to be interpreted in formal terms. Based on that, one can specify when a
set of formulae logically implies a formula. Then one can start to find similar formalisms (e.g. modal logics)
and prove equivalences, and/or one can specify a method to derive logically entailed sentences and prove
it to be correct and complete.
Another interesting area where the logical method has been applied is the development
of the so-called non-monotonic logics. These are based on the intuition that sometimes a logical
consequence should be retracted if new evidence becomes known. For example, we may assume that our car
will not be moved by somebody else after we have parked it. However, if new information becomes known,
such as the fact that the car is not at the place where we have parked it, we are ready to drop the assumption
that our car has not been moved.
This general reasoning pattern was used quite regularly in early AI systems, but it took a while before it was
analyzed from a logical point of view. In 1980, a special issue of the Artificial Intelligence journal appeared,
presenting different approaches to non-monotonic reasoning, in particular Reiter’s (1980) default logic and
McCarthy’s (1980) circumscription approach.
A disappointing fact about nonmonotonic logics appears to be that it is very difficult to formalize a domain
such that one gets the intended conclusions. In particular, in the area of reasoning about actions, McDermott
(1987) has demonstrated that the straightforward formalization of an easy temporal projection problem (the
"Yale shooting problem") does not lead to the desired consequences. However, it is possible to get around
this problem: once all underlying assumptions are spelled out, this and other problems can be solved
(Sandewall 1994).
It took more than a decade before people started to analyze the computational complexity
(of the propositional versions) of these logics. As it turned out, these logics are usually somewhat more
difficult than ordinary propositional logic (Gottlob 1992). This, however, seems tolerable, since we get
many more conclusions than in standard propositional logic.
Right at the same time, the tight connection between nonmonotonic logic and belief revision (Gärdenfors
1988) was noticed. Belief revision -- modeling the evolution of beliefs over time -- is just one way to
describe how the set of nonmonotonic consequences evolves over time, which leads to a very tight
connection on the formal level between these two forms of nonmonotonicity (Nebel 1991). Again, all these
results and insights are mainly based on the logical method of knowledge representation.
As mentioned, it is the idea of providing knowledge representation formalisms with formal (logical)
semantics that enables us to communicate their meaning, to analyze their formal properties, to determine
their computational complexity, and to devise reasoning algorithms.
While the research area of knowledge representation is dominated by the logical approach, this does not
mean that all approaches to knowledge representation must be based on logic. Probabilistic (Pearl 1988) and
decision theoretic approaches, for instance, have become very popular lately. Nowadays a number of
approaches aim at unifying decision theoretic and logical accounts by introducing a qualitative version of
decision theoretic concepts. Other approaches aim at tightly integrating decision theoretic concepts such as
Markov decision processes with logical approaches, for instance. Although this is not pure logic, the two
latter approaches demonstrate the generality of the logical method: specify the formal meaning and analyze!
5. Research the "MIU Puzzle" (originally defined in D. Hofstadter's book "Gödel,
Escher, Bach"). a) Answer the question: Can MU be produced from MI using the
given rules? Explain your answer. b) In what sense is this a "typographical" system?
c) What is Gödel's Incompleteness Theorem, and what does the MIU system have to
do with it?
Suppose there are the symbols M, I, and U which can be combined to produce strings of symbols called
"words". The MU puzzle asks one to start with the "axiomatic" word MI and transform it into the word MU
using in each step one of the following transformation rules:
1. Add a U to the end of any string ending in I. For example: MI to MIU.
2. Double any string after the M (that is, change Mx, to Mxx). For example: MIU to MIUIU.
3. Replace any III with a U. For example: MUIIIU to MUUU.
4. Remove any UU. For example: MUUU to MU.
Using these four rules is it possible to change MI into MU in a finite number of steps?
The production rules can be written in a more schematic way. Suppose x and y behave as variables
(standing for strings of symbols). Then the production rules can be written as:
1. xI → xIU
2. Mx → Mxx
3. xIIIy → xUy
4. xUUy → xy
Is it possible to obtain the word MU using these rules?
Solution
The puzzle's solution is no. It is impossible to change the string MI into MU by repeatedly applying the given
rules.
In this case, one can look at the total number of I's in a string. Only the second and third rules change this
number. In particular, rule two will double it, while rule three will reduce it by 3. Now, the invariant
property is that the number of I's is never divisible by 3:
• In the beginning, the number of I's is 1, which is not divisible by 3.
• Doubling a number that is not divisible by 3 does not make it divisible by 3.
• Subtracting 3 from a number that is not divisible by 3 does not make it divisible by 3 either.
Thus, the goal of MU, with zero I's, cannot be achieved, because 0 is divisible by 3.
In the language of modular arithmetic, the number n of I's obeys the congruence

n ≡ 2^a (mod 3),

where a counts how often the second rule is applied (rule 3 subtracts 3 and rules 1 and 4 leave n unchanged, so neither affects n mod 3).
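The invariant argument can be checked empirically. The Python sketch below (the function names `successors` and `reachable` are my own) enumerates every string derivable from MI up to a chosen length and confirms that the I-count of every derivable string stays off the multiples of 3, so MU never appears. The bounded search is only corroboration; the modular argument above is the actual proof.

```python
def successors(s):
    """All strings reachable from s in one application of the four MIU rules."""
    out = set()
    if s.endswith("I"):                      # rule 1: xI -> xIU
        out.add(s + "U")
    if s.startswith("M"):                    # rule 2: Mx -> Mxx
        out.add("M" + s[1:] * 2)
    for i in range(len(s) - 2):              # rule 3: xIIIy -> xUy
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):              # rule 4: xUUy -> xy
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out

def reachable(start="MI", max_len=12):
    """Breadth-first search over all derivable strings of at most max_len symbols."""
    seen = {start}
    frontier = {start}
    while frontier:
        frontier = {t for s in frontier for t in successors(s)
                    if len(t) <= max_len} - seen
        seen |= frontier
    return seen

words = reachable()
# Every derivable string keeps an I-count that is not a multiple of 3:
assert all(w.count("I") % 3 != 0 for w in words)
assert "MU" not in words
```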
b) In what sense is this a "typographical" system?
The MIU-puzzle is in fact merely a puzzle about natural numbers in typographical disguise. If we could only
find a way to transfer it to the domain of number theory, we might be able to solve it.
If we try counting the numbers of I's contained in theorems, we will soon notice that it seems never to be
0. In other words, it seems that no matter how much lengthening and shortening is involved, we can never
work in such a way that all I's are eliminated. Let us call the number of I's in any string the "I-count" of
that string. Note that the I-count of the axiom MI is 1. We can do more than show that the I-count can't
be 0 -- we can show that the I-count can never be any multiple of 3.
To begin with, notice that rules 1 and 4 (see above) leave the I-count totally undisturbed. Therefore we
need only think about rules 2 and 3. As far as rule 3 is concerned, it diminishes the I-count by exactly three.
After an application of this rule, the I-count of the output might conceivably be a multiple of 3 -- but only if
the I-count of the input was also. Rule 3, in short, never creates a multiple of 3 from scratch. It can only
create one when it began with one. The same holds for rule 2, which doubles the I-count. The reason is
that if 3 divides 2n, then -- because 3 does not divide 2 -- it must divide n (a simple fact from the theory of
numbers). Neither rule 2 nor rule 3 can create a multiple of 3 from scratch.
But this is the key to the MU-puzzle! Here is what we know :
The I-count begins at 1 (the axiom MI ), and that is not a multiple of 3.
Two of the rules do not affect the I-count at all.
The two remaining rules which do affect the I-count do so in such a way as never to create a multiple of 3
unless given one initially.
The conclusion -- and a typically hereditary one it is, too -- is that the I-count can never become any
multiple of 3. In particular, 0 is a forbidden value of the I-count [because 0 is a multiple of 3, -- 0 x 3 = 0].
Hence, MU is not a theorem of the MIU-system.
Notice that, even as a puzzle about I-counts, this problem was still plagued by the crossfire of lengthening
and shortening rules. Zero became the goal. I-counts could increase (rule 2 ), could decrease (rule 3 ). Until
we analyzed the situation, we might have thought that, with enough switching back and forth between the
rules, we might eventually hit 0. Now, according to a simple number-theoretical argument, we know that
that is impossible.
Not all problems of the type which the MU-puzzle symbolizes are so easy to solve as this one. But we have
seen that at least one such puzzle could be embedded within, and solved within, number theory [When we
speak of "number theory" (which we often will indicate by "N") we mean its non-formalized version. The
language in this version is thus plain English or some other spoken language. Its formalized version is the
"Theoria Numerorum Typographica", or "TNT" for short (without quotation marks). Number theory is the
"meaning" or (meaningful) interpretation of TNT, that is, TNT describes number theory. Both number
theory and TNT concern the positive integers and zero (which are called "natural numbers") and their
properties. When setting up TNT, all symbols, operations, etc. in number theory are translated into special
TNT signs, in such a way that a minimum of such TNT signs is obtained.]. We are now going to see that
there is a way to embed all problems about any formal system, in number theory. To illustrate it, we will
use the MIU-system.
We begin by considering the notation of the MIU-system. We shall map each symbol onto a new symbol :
M <==> 3
I <==> 1
U <==> 0
The correspondence was chosen arbitrarily. The only reason for it is that each symbol looks a little like the
number it is mapped onto. Each number is called the Gödel number of the corresponding letter. Now
MU <==> 30
MIIU <==> 3110
MUU <==> 300
etc.
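This digit-substitution mapping is easy to mechanize. The following Python sketch (the helper name `godel_number` is mine) reproduces the correspondences above.

```python
# The arbitrary symbol-to-digit correspondence from the text: M -> 3, I -> 1, U -> 0.
GODEL = {"M": "3", "I": "1", "U": "0"}

def godel_number(s):
    """Map an MIU string to its Gödel number by substituting each letter's digit."""
    return int("".join(GODEL[c] for c in s))

assert godel_number("MU") == 30
assert godel_number("MIIU") == 3110
assert godel_number("MUU") == 300
```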
Let us now take a look at a typical derivation in the MIU-system, written simultaneously in both notations :
(1) MI -------------- axiom ---------- 31
(2) MII ------------- rule 2 ---------- 311
(3) MIIII ------------ rule 2 ---------- 31111
(4) MUI ------------ rule 3 --------- 301
(5) MUIU ---------- rule 1----------- 3010
(6) MUIUUIU ----- rule 2 ---------- 3010010
(7) MUIIU --------- rule 4 --------- 30110
The left-hand column is obtained by applying our four familiar typographic rules. The right-hand column,
too, could be thought of as having been generated by a similar set of typographic rules. Yet the right-hand
column has a dual nature. Now we explain what this means.
Seeing Things Both Typographically and Arithmetically
We could say of the fifth string ('3010') that it was made from the fourth ('301'), by appending a '0' on the
right. On the other hand we could equally well view the transition as caused by an arithmetical operation
-- multiplication by 10, to be exact. When natural numbers are written in the decimal system,
multiplication by 10 and putting a '0' on the right are indistinguishable operations. We can take advantage
of this to write an arithmetical rule which corresponds to typographical rule I :
Arithmetical Rule Ia : A number whose decimal expansion ends on the right in '1' can be multiplied by 10.
We can eliminate the reference to the symbols in the decimal expansion by arithmetically describing the
rightmost digit :
Arithmetical Rule Ib : A number whose remainder when divided by 10 is 1, can be multiplied by 10.
Now we could have stuck with a purely typographical rule here as well, such as the following one :
Typographical Rule I : From any theorem whose rightmost symbol is '1' a new theorem can be made, by
appending '0' to the right of that '1'. [Just using numerals instead of letters]
They would have the same effect. This is why the right-hand column has a "dual nature" : It can be viewed
either as a series of typographical operations changing one pattern of symbols into another, or as a series
of arithmetical operations changing one magnitude into another. But there are powerful reasons for being
more interested in the arithmetical version.
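The "dual nature" can be made concrete. The sketch below (Python; the function names are mine) implements Typographical Rule I on numerals and Arithmetical Rule Ib on numbers and checks that they coincide.

```python
def rule1_typographic(s):
    """Typographical Rule I: append '0' to the right of a numeral ending in '1'."""
    assert s.endswith("1"), "rule applies only to numerals ending in '1'"
    return s + "0"

def rule1_arithmetic(n):
    """Arithmetical Rule Ib: a number with remainder 1 mod 10 may be multiplied by 10."""
    assert n % 10 == 1, "rule applies only to numbers congruent to 1 mod 10"
    return n * 10

# The two views agree: appending '0' in decimal notation IS multiplication by 10.
assert rule1_typographic("301") == "3010"
assert rule1_arithmetic(301) == 3010
assert int(rule1_typographic("301")) == rule1_arithmetic(301)
```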
Each typographical rule of this kind tells how certain digits are to be shifted, changed, dropped, or inserted
in any number represented decimally, and can therefore be restated as an arithmetical rule.
More briefly :
Typographical rules for manipulating numerals are actually arithmetical rules for operating on numbers.
This simple observation is at the heart of Gödel's method, and it will have an absolutely shattering effect. It
tells us that once we have a Gödel-numbering for any formal system, we can straightaway form a set of
arithmetical rules which complete the Gödel isomorphism. The upshot is that we can transfer the study of
any formal system -- in fact the study of all formal systems -- into number theory.
What is Gödel's Incompleteness Theorem, and what does the MIU system have to do
with it?
· Gödel's first incompleteness theorem states that no consistent, effectively axiomatized set of axioms for
arithmetic is complete. In other words, it means that there is at least one true statement in the system that
cannot be derived within the system.
· Gödel’s second incompleteness theorem states that if we have such a set of axioms T,
then T's consistency cannot be proven within T.
To answer the question we will go back to the MIU-system.
Since this is a formal system, it consists of a fixed alphabet and a set of rules. Our formal system, the MIU-system, consists of only three letters of the alphabet: M, I, U. The strings (which mean strings of letters) of the MIU-system are the strings that are composed of only those three letters. For example:
MU   UIM   MUUMUU   UIIUMIUUIMUIIUMIUUIMUIIU
are strings of the MIU-system.
SYMBOLS: M, I, U
AXIOM: MI
RULES: (In the following, x is merely a variable)
1. If xI is a theorem, so is xIU.
2. If Mx is a theorem, so is Mxx.
3. In any theorem, III can be replaced by U.
4. UU can be dropped from any theorem.
Now the question is: can we produce the string "MU" in this system? The axiom "MI" is granted initially, and we want to produce "MU" using this axiom and the rules:
MI → ... → MU
The reason we want to do this is to find out mathematically whether "MU" is derivable or not.
As shown above:
1) The I-count begins at 1 (not a multiple of 3).
2) Rules 1 and 4 do not affect the I-count at all.
3) Rules 2 and 3 affect the I-count in such a way that they never create a multiple of 3 unless given one initially.
It follows that the I-count can never be a multiple of 3, and thus we can never derive MU from MI. In other words, "MU is not a theorem of the MIU-system". As a result, we can see that 'MU' is a well-formed string of the system, but it is not derivable, so the system is incomplete in this sense. This is analogous to Gödel's first incompleteness theorem, which says that in any sufficiently strong consistent formal system there is at least one true statement that cannot be derived or proved within the system.
6. Formal Logic is known for its use of symbolic representation of concepts, such as
A → B. In what sense are numbers symbols? Give examples.
Formal logic : the branch of logic that examines patterns of reasoning to determine which ones necessarily
result in valid, or formally correct, conclusions.
In the history of formal logic, different symbols have been used at different times
and by different authors.
In mathematical logic, a Gödel numbering is a function that assigns to each symbol and well-formed
formula of some formal language a unique natural number, called its Gödel number. The concept was first
used by Kurt Gödel for the proof of his incompleteness theorem.
A Gödel numbering can be interpreted as an encoding in which a number is assigned to each symbol of a
mathematical notation, after which a sequence of natural numbers can then represent a sequence of
strings. These sequences of natural numbers can again be represented by single natural numbers,
facilitating their manipulation in formal theories of arithmetic.
Gödel used a system of Gödel numbering based on prime factorization. He first assigned a unique natural
number to each basic symbol in the formal language of arithmetic he was dealing with.
To encode an entire formula, which is a sequence of symbols, Gödel used the following system. Given a
sequence x1 x2 x3 ... xn of positive integers, the Gödel encoding of the sequence is the product of the first n
primes raised to their corresponding values in the sequence:

enc(x1, x2, ..., xn) = 2^x1 · 3^x2 · 5^x3 · ... · pn^xn
According to the fundamental theorem of arithmetic, any number obtained in this way can be uniquely
factored into prime factors, so it is possible to recover the original sequence from its Gödel number (for
any given number n of symbols to be encoded).
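This prime-power encoding and its inverse can be sketched as follows (Python; all function names are mine, and the decoder assumes a sequence of positive integers, as in the text, so no exponent is zero).

```python
def primes():
    """Yield 2, 3, 5, 7, ... by trial division (adequate for short sequences)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def godel_encode(seq):
    """enc(x1, ..., xn) = 2^x1 * 3^x2 * 5^x3 * ... for positive integers xi."""
    result = 1
    for p, x in zip(primes(), seq):
        result *= p ** x
    return result

def godel_decode(n):
    """Recover the sequence by stripping each prime's exponent in turn
    (unique by the fundamental theorem of arithmetic)."""
    seq = []
    for p in primes():
        if n == 1:
            break
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        seq.append(e)
    return seq

assert godel_encode([1, 2, 3]) == 2**1 * 3**2 * 5**3   # = 2250
assert godel_decode(2250) == [1, 2, 3]
```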
Gödel specifically used this scheme at two levels: first, to encode sequences of symbols representing
formulas, and second, to encode sequences of formulas representing proofs. This allowed him to show a
correspondence between statements about natural numbers and statements about the provability of
theorems about natural numbers, the key observation of the proof.
There are more sophisticated (but more concise) ways to construct a Gödel numbering for sequences.
Artificial intelligence systems are centrally concerned with representing and manipulating knowledge.
Some knowledge, particularly in mathematics and the sciences, is expressed as numbers and formulae:
expressions that consist of collections of numbers and arithmetic operations. Mathematics, when it
involves more abstract reasoning processes (such as proving, algebraic transformation, and the symbolic
solution of integro-differential equations), requires a more general and powerful language, in which
concepts and relationships are represented by symbols and strings of symbols.
In fact, numbers and formulae are really just collections of symbols. Numbers are symbols whose
properties are defined over the set of arithmetic operations, and arithmetic operations are themselves
represented by symbols or strings of symbols (such as +, -, /, x). Thus the tools of artificial intelligence
consist of languages, processes, and constructs that allow the acquisition, representation, storage,
transformation, and other manipulation of concepts and relationships; this connects closely with the study
of language theory, including higher-order computer languages and compiler theory.
Internal to a computer a symbol is just a sequence of bits that can be distinguished from other symbols.
Some symbols have a fixed interpretation, e.g., symbols that represent numbers and symbols that
represent characters. Symbols that do not have fixed meaning appear in many programming languages.
Java, starting from Java 1.5, calls them enumeration types. Lisp calls them atoms. Usually, they are
implemented as indexes into a symbol table that gives the name to print out. The only operation needed
on these symbols is equality to determine if two symbols are the same or not.
7. Do some research on the programming language called Prolog. Explain how
Prolog uses backward chaining as its solution method.
All programming languages have both declarative (definitional) and imperative (computational) components. Prolog is referred to as a declarative language because all program statements are definitional. In particular, a Prolog program consists of facts and rules which serve to define relations (in the mathematical sense) on sets of values. The imperative component of Prolog is its execution engine based on unification and resolution, a mechanism for recursively extracting sets of data values implicit in the facts and rules of a program.
Prolog rules consist of: a consequent (the head), and an antecedent (the body)
(connected by the ’:-’ symbol)
• The body of a rule may consist of one or more clauses. Each clause is a subgoal.
• The head is true if the subgoals making up the body are true.
Example: "X is weird if X is vegetarian and eats steak, or if X is a trainspotter":
weird(X) :- vegetarian(X), eats_steak(X).
weird(X) :- is_trainspotter(X).
Like facts, rules are also predicates; these two rules together constitute the predicate "weird/1".
The subgoals of a rule are separated by ',' (AND) or occasionally ';' (OR), and each rule is terminated with
a full stop. The scope of a variable is limited to that rule; there are no global variables in Prolog. All facts
and rules have global scope.
Backward chaining
Once knowledge is represented in a form that Prolog can use (facts and rules), a reasoning procedure is
required to draw conclusions from the knowledge base.
No need to program this ourselves – it is built into Prolog’s inference engine, which by default works by
backward chaining.
This means that we start with a hypothesis and reason backwards from the hypothesis
trying to find facts that support it. For instance, in a database about grants for listed
buildings we might want to ask: |?- eligible_for_grant(jim).
The Prolog interpreter will then attempt to prove the goal statement, given the assumptions in the
KB(knowledge base).
• This may mean ascertaining that:
Jim’s property is listed
Jim does not have huge funds at his disposal
• Ascertaining that Jim does not have huge funds at his disposal might entail making sure that:
— Jim does not have lots of investments, and
— Jim does not earn > £75,000 a year
Thus Prolog is working backwards from the goal statement to the facts.
Prolog can be made to work in a forward-chaining fashion but this requires more programming effort.
In Prolog implies is written as :- and is read from right to left.
The universal quantifier is assumed for all variables and a comma is used for AND.
Example:
∀x Cat(x) ∧ Tame(x) ⇒ Pet(x) would be written as
pet(X) :- cat(X), tame(X).
Sentences are apparently written 'backwards' to reflect the fact that Prolog uses backward chaining. Thus,
the above sentence could be read as "to prove that X is a pet, prove that X is a cat and that X is tame."
(Sentences in Prolog end in a period. Variables must start with a capital letter or an underscore, and objects
and relations begin with a lower-case letter.) A semicolon is used for OR.
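The backward-chaining procedure described in this section can be illustrated with a miniature interpreter. The Python sketch below is not Prolog's actual engine (it handles only one-argument predicates and a single variable, with no general unification), and the predicate names in the grant example are invented for illustration.

```python
# Illustrative knowledge base in the spirit of the listed-buildings example.
# Facts are (predicate, argument) pairs; these names are hypothetical.
facts = {("listed", "jim"), ("few_investments", "jim"), ("modest_income", "jim")}

# Each rule pairs a head predicate with a body of subgoal predicates,
# mirroring Prolog's  head :- subgoal1, subgoal2.
rules = [
    ("modest_funds", ["few_investments", "modest_income"]),
    ("eligible_for_grant", ["listed", "modest_funds"]),
]

def prove(goal):
    """Backward chaining: a goal holds if it is a fact, or if some rule's head
    matches it and every subgoal in that rule's body can itself be proved."""
    if goal in facts:
        return True
    pred, arg = goal
    for head, body in rules:
        # Reason backwards: to prove the head, recursively prove each subgoal.
        if head == pred and all(prove((sub, arg)) for sub in body):
            return True
    return False

# Start from the hypothesis (the query) and work back to the facts:
assert prove(("eligible_for_grant", "jim"))       # jim satisfies every subgoal
assert not prove(("eligible_for_grant", "bob"))   # no supporting facts for bob
```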
References: http://en.wikipedia.org/wiki/Boolean_satisfiability_problem
http://books.google.com/books?id=1-ffZVmazvwC&pg=PA280&lpg=PA280&dq=importance+of+Satisfiability+problem&source=bl&ots=-pd1h2yE_l&sig=jY96918MWW7oP_RQMDuederuozM&hl=en&ei=utmOSsaOJ4nIsQPHgvnfCQ&sa=X&oi=book_result&ct=result&resnum=2#v=onepage&q=&f=false
http://en.wikipedia.org/wiki/Frame_problem
http://www.augustana.ab.ca/~mohrj/courses/2004.fall/csc110/lecture_notes/AI.html
http://www.treelight.com/software/collaboration/darwinframe.html
http://www.newworldencyclopedia.org/entry/Abductive_reasoning
http://www.creative-wisdom.com/teaching/WBI/abduction5.pdf
http://books.google.com/books?id=LcOLqodW28EC&pg=PA427&lpg=PA427&dq=frame+problem+artificial+intelligence&source=bl&ots=sUwegDQAK3&sig=hh5hsw04j4SHuSzVNT-0OreXpt4&hl=en&ei=YDaQSpv6DoeQsgOruaAM&sa=X&oi=book_result&ct=result&resnum=1#v=onepage&q=frame%20problem%20artificial%20intelligence&f=false
http://www.informatik.uni-freiburg.de/~ki/papers/nebel-iesbs-01.pdf
http://www.iwriteiam.nl/MIUpuzzle.html
http://en.wikipedia.org/wiki/MU_puzzle
http://www-scf.usc.edu/~hyuen/writing/essays/OpusMagnum.pdf
http://www.metafysica.nl/nature/insect/nomos_64c.html#fsnss
http://rudar.ruc.dk/bitstream/1800/3868/1/Group%206-Philosophy%20of%20Logic%20and%20Artificial%20Intelligence-Final%20Hand%20In(5.1.2009).pdf
http://people.cs.ubc.ca/~poole/aibook/html/ArtInt_74.html
https://www.cs.kent.ac.uk/teaching/08/modules/CO/8/84/Aliy/LECTURE_NOTES/Prolog2_rules.pdf
http://www.soe.ucsc.edu/classes/cmps112/Spring03/languages/prolog/PrologIntro.pdf
http://www2.cs.uidaho.edu/~tsoule/cs470/PrologH1.html
http://www.cs.uwm.edu/~mali/conferences/mfsat-ictai.pdf
http://users.ox.ac.uk/~jrlucas/Godel/simplex.html
http://www.yourdictionary.com/formal-logic
http://www.lulu.com/items/volume_65/190000/190528/10/print/forallx090604.pdf
http://en.wikipedia.org/wiki/Godel_number
http://www.shsu.edu/~mth_jaj/math467/bowyer.pdf