Assignment # 1 - farrukhjabeen.weebly.com/uploads/1/1/6/2/1162932/p1comp581.pdf
Farrukh Jabeen
COMP 581 Mathematical Methods in AI
Due Date: September 7, 2009
Assignment # 1
1. a) What is the "Boolean Satisfiability Problem (SAT)"? b) Why is this an important
problem in the theory of algorithms? Explain your answers and give supporting
examples.
Answer:
Satisfiability is the problem of determining if the variables of a given Boolean formula can be assigned in
such a way as to make the formula evaluate to TRUE. Equally important is to determine whether no such
assignments exist, which would imply that the function expressed by the formula is identically FALSE for all
possible variable assignments. In this latter case, we would say that the function is unsatisfiable; otherwise
it is satisfiable. To emphasize the binary nature of this problem, it is frequently referred to as Boolean or
propositional satisfiability. The shorthand "SAT" is also commonly used to denote it, with the implicit
understanding that the function and its variables are all binary-valued.
The Boolean satisfiability (SAT) problem plays a central role in combinatorial optimization and, in
particular, in NP-completeness. Any NP-complete problem, such as the max-cut problem, the graph coloring
problem, and the TSP, can be translated in polynomial time into a SAT problem. The SAT problem plays a
central role in solving large-scale computational problems, such as planning and scheduling, integrated
circuit design, computer architecture design, computer graphics, image processing, and finding the folding
state of proteins.
Propositional satisfiability (SAT) is a prototypical combinatorially hard problem which is as expressive as a
general constraint satisfaction problem. The importance of SAT
in artificial intelligence is clear from several practical applications and from several workshops, international
competitions, and special issues of journals devoted to this problem.
The practical applications of SAT include circuit design, diagnosis and verification, job scheduling and
planning. Several combinatorial problems have been encoded
as SAT and solved efficiently. These include map coloring and Steiner tree finding. Classical planning
problem can be solved efficiently, when encoded and solved as a SAT
problem.
Local search-based methods have gained significant attention in the last decade. These are iterative
improvement methods, and they are incomplete. These methods have solved many SAT and constraint
satisfaction problems (CSPs) orders of magnitude faster than global search-based methods.
Local search methods start with an initial assignment to the variables and then repeatedly choose
assignments that increase the number of satisfied constraints. To perform simulated annealing or to escape
from a local maximum, the methods sometimes choose assignments that do not lead to any improvement. The
next assignment, which improves the value of the score function, is generally chosen by examining several
candidate assignments.
These candidates are obtained by assigning a different value to the same variable or by assigning different
values to multiple variables. In the first case, any candidate assignment differs from the current assignment
in the value of only one variable.
Several local search-based methods have been reported. All of these generate candidate
assignments by changing the value of only one variable: given an n-variable SAT instance, n candidate
assignments can be generated, each by flipping the value of one variable. Flipping the values of multiple
variables in an iteration to generate the next assignment can be useful for several reasons. Multiple flips
can escape local maxima; evaluating the effects of multiple flips allows a solver to look ahead; and
multiple flips can reduce the number of iterations needed to solve a problem and/or the solving time.
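The single-flip local search described above can be sketched in a few lines of code. The following is a minimal GSAT-style loop, not taken from the original text; the clause representation (lists of signed integers) and all parameter names are illustrative assumptions:

```python
import random

def num_satisfied(clauses, assignment):
    """Count how many clauses the assignment satisfies.
    A clause is a list of literals: +j means x_j, -j means NOT x_j."""
    return sum(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses)

def flip(assignment, j):
    """Return a copy of the assignment with variable j flipped."""
    new = dict(assignment)
    new[j] = not new[j]
    return new

def gsat(clauses, n_vars, max_flips=100, seed=0):
    """Greedy single-flip local search: from a random start, repeatedly
    move to the one-flip neighbour satisfying the most clauses.
    Incomplete: may fail even when a satisfying assignment exists."""
    rng = random.Random(seed)
    a = {j: rng.choice([True, False]) for j in range(1, n_vars + 1)}
    for _ in range(max_flips):
        if num_satisfied(clauses, a) == len(clauses):
            return a  # all clauses satisfied
        # the n candidate assignments, each differing in one variable
        a = max((flip(a, j) for j in range(1, n_vars + 1)),
                key=lambda cand: num_satisfied(clauses, cand))
    return None  # gave up -- this says nothing about unsatisfiability

# (x1 OR x2) AND (NOT x1 OR x2) AND (x1 OR NOT x2) is satisfied by x1=x2=TRUE
print(gsat([[1, 2], [-1, 2], [1, -2]], n_vars=2))  # -> {1: True, 2: True}
```

Returning None after the flip budget is exhausted is exactly the incompleteness mentioned above: the method cannot prove unsatisfiability.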
There are different formulations of the SAT problem, but the most common one (which we discuss next)
consists of two components.
• A set of n Boolean variables {x1, x2, ..., xn}, representing statements that can either be TRUE (= 1)
or FALSE (= 0). The negation (the logical NOT) of a variable x is denoted by ¬x. For example,
¬TRUE = FALSE. A variable or its negation is called a literal.
• A set of m distinct clauses {C1, C2, ..., Cm} of the form Ci = zi1 ∨ zi2 ∨ ... ∨ zik, where the z's are
literals and ∨ denotes the logical OR operator. For example, 0 ∨ 1 = 1.
The binary vector x = (x1, x2, ..., xn) is called a truth assignment, or simply an assignment. Thus, xi = 1
assigns truth to xi and xi = 0 assigns truth to ¬xi, for each i = 1, ..., n. The simplest SAT problem can
now be formulated as: find a truth assignment x such that all clauses are true.
Denoting the logical AND operator by ∧, we can represent the above SAT problem via the single
formula
F1 = C1 ∧ C2 ∧ ... ∧ Cm,
where each Ck consists of literals connected with only ∨ operators. Such a SAT formula is said to be in
conjunctive normal form (CNF). An alternative SAT formulation concerns formulas of the type
F2 = C1 ∨ C2 ∨ ... ∨ Cm,
where the clauses are of the form Ci = zi1 ∧ zi2 ∧ ... ∧ zik. Such a SAT problem is then said to be in
disjunctive normal form (DNF). In this case, a truth assignment x is sought that satisfies at least one of
the clauses, which is usually a much simpler problem.
Example
As an illustration of the SAT problem and the corresponding SAT counting problem, consider the
following toy example of coloring the nodes of the graph in the figure below. Is it possible to color the
nodes either black or white in such a way that no two adjacent nodes have the same color? If so, how
many such colorings are there?
[Figure (not reproduced): a small graph. Can the graph be colored with two colors so that no two
adjacent nodes have the same color?]
We can translate this graph coloring problem into a SAT problem in the following way. Let xj be the
Boolean variable representing the statement "the j-th node is colored black". Obviously, xj is either
TRUE or FALSE, and we wish to assign truth to either xj or ¬xj for each j = 1, ..., 5. The restriction
that adjacent nodes cannot have the same color can be translated into a number of clauses that must all
hold. For example, "node 1 and node 3 cannot both be black" can be translated as the clause
C1 = ¬x1 ∨ ¬x3. Similarly, the statement "at least one of node 1 and node 3 must be black" is
translated as C2 = x1 ∨ x3. The same holds for all other pairs of adjacent nodes. The clauses can now
be summarized as in the following table. Here, in the left-hand table, for each clause Ci a 1 in column j
means that the clause contains xj, a -1 means that the clause contains the negation ¬xj, and a 0 means
that the clause contains neither of them. Now call the corresponding matrix A = (aij) the clause matrix.
For example, a76 = -1 and a42 = 0. An alternative representation of the clause matrix is to list, for each
clause, only the indices of all Boolean variables present in that clause; in addition, each index that
corresponds to a negation of a variable is preceded by a minus sign.
[Table (not reproduced): a SAT table and an alternative representation of the clause matrix.]
Now let x = (x1, x2, ..., x5) be a truth assignment. The question is whether there exists an x such
that all clauses {Ck} are satisfied. To see whether a single clause Ci is satisfied, one must compare the
truth assignment for each variable in that clause with the values 1, -1, and 0 in the clause matrix A,
which indicate whether the literal corresponds to the variable, to its negation, or that neither appears in
the clause. If, for example, xj = 0 and aij = -1, then the literal ¬xj is TRUE. The entire clause is TRUE
if it contains at least one true literal. Define the clause value Ci(x) = 1 if clause Ci is TRUE with truth
assignment x and Ci(x) = 0 if it is FALSE. Then it is easy to see that
Ci(x) = max_j max{0, (2xj - 1) aij},
assuming that at least one aij is nonzero for clause Ci (otherwise, the clause can be deleted). For example,
for the truth assignment (0,1,0,1,0) the corresponding clause values are given in the rightmost column of
the left-hand table. We see that the second and fourth clauses are violated. However, the assignment
(1,1,0,1,0) does indeed yield all clauses true, and this therefore gives a way in which the nodes can be
colored: 1 = black, 2 = black, 3 = white, 4 = black, 5 = white. It is easy to see that (0,0,1,0,1) is the only
other assignment that renders all the clauses true.
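The clause-value formula can be checked directly in code. The small clause matrix below is invented for illustration, since the table from the original figure is not reproduced here:

```python
def clause_values(A, x):
    """Clause values C_i(x) = max_j max(0, (2*x_j - 1) * a_ij):
    clause i has value 1 iff it contains at least one true literal."""
    return [max(max(0, (2 * xj - 1) * aij) for xj, aij in zip(x, row))
            for row in A]

# Hypothetical clause matrix for (x1 OR NOT x2) AND (NOT x1 OR x2):
A = [[1, -1],
     [-1, 1]]
print(clause_values(A, [1, 1]))  # -> [1, 1]: both clauses TRUE
print(clause_values(A, [1, 0]))  # -> [1, 0]: second clause violated
```

Note how the formula works: when aij = 1 and xj = 1, or aij = -1 and xj = 0, the term (2xj - 1)aij equals 1 (a true literal); in every other case the inner max clips it to 0.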
The problem of deciding whether there exists a valid assignment, and indeed providing such a vector, is
called the SAT assignment problem. Finding a coloring in the above example is a particular instance of the
SAT assignment problem. A SAT assignment problem in which each clause contains exactly K literals is
called a K-SAT problem. It is well known that 2-SAT problems are easy and can be solved in polynomial
time, while K-SAT problems for K ≥ 3 are NP-hard. A more difficult problem is to find the maximum
number of clauses that can be satisfied by one truth assignment. This is called the Max-SAT problem.
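For small instances, both the SAT assignment problem and the Max-SAT problem can be solved by exhaustively enumerating all 2^n truth assignments. This is a toy sketch only (practical solvers are far more sophisticated), using the same signed-integer clause encoding as an assumption:

```python
from itertools import product

def max_sat(clauses, n_vars):
    """Brute-force Max-SAT: try all 2^n assignments and return the one
    satisfying the most clauses, together with that count.
    Clauses are lists of literals: +j for x_j, -j for NOT x_j."""
    best, best_count = None, -1
    for bits in product([False, True], repeat=n_vars):
        a = dict(enumerate(bits, start=1))
        count = sum(any((lit > 0) == a[abs(lit)] for lit in c)
                    for c in clauses)
        if count > best_count:
            best, best_count = a, count
    return best, best_count

# (x1) AND (NOT x1) is unsatisfiable, yet any assignment satisfies 1 clause:
print(max_sat([[1], [-1]], n_vars=1)[1])  # -> 1
```

If the returned count equals the number of clauses, the instance is satisfiable and the returned assignment solves the SAT assignment problem as well.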
2. What is the "Frame Problem" in Artificial Intelligence? Explain your answer and
give specific examples.
Answer:
The term "artificial intelligence" was coined by John McCarthy. While there does not appear to be any
consensus about the definition of AI, we can perhaps categorize the various definitions under three headings:
Strong AI: AI has as its goal the creation of machines with conscious intelligence.
Weak AI: AI has as its goal the creation of machines which behave as if they were intelligent.
Experimental Cognition: AI is the study of human intellectual capabilities through the use of mental
models implemented on a computer.
In artificial intelligence, the frame problem was initially formulated as the problem of expressing a
dynamical domain in logic without explicitly specifying which conditions are not affected by an action.
The name "frame problem" derives from a common technique used by animated cartoon makers called
framing where the currently moving parts of the cartoon are superimposed on the "frame," which depicts
the background of the scene, which does not change. In the logical context, actions are typically specified
by what they change, with the implicit assumption that everything else (the frame) remains unchanged.
In other words, when we carry out an action, the world changes. In fact, some aspects of the world
change, but others stay the same. Determining which stay the same is known as the frame problem.
An effect axiom states what changes when the robot carries out a particular action in a particular
situation. It does not make any statement about what does not change. For example, when the robot
carries out a Take action, it does not find itself in a different room. This kind of fact is expressed in a
frame axiom, which asserts that a property unaffected by an action still holds after the action.
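A frame axiom of the kind just described might be written as follows. This is a standard situation-calculus sketch supplied for illustration; the predicate names are assumptions, not taken from the original text:

```latex
% Frame axiom: a Take action leaves the robot's location unchanged.
\forall r, s \;\; \big( In(Robot, r, s) \rightarrow In(Robot, r, Result(Take, s)) \big)
```

One such axiom is needed for every (property, action) pair that leaves the property unchanged, which is exactly what makes the number of frame axioms explode.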
Many such frame axioms are needed if we are to describe all of the effects that do not result from
carrying out a particular action. The problem of needing enormous numbers of frame axioms is known as
the representational frame problem.
In any complex situation, we have to frame the problem the right way in order to achieve a solution.
"Framing" the problem means identifying what is relevant and what is irrelevant, and determining which bits
of knowledge in our extensive repository can be brought to bear to arrive at a solution. But to properly frame
the problem, we must understand the problem.
As human beings, we routinely solve complex problems. Our success in that process is so frequent and so
seemingly effortless, in fact, that it is easy to take the ability for granted. It is only when we begin trying to
capture that expertise in automated systems (e.g. robots), and try to give them the intelligence to deal with
new situations, that we discover how truly remarkable that capability is.
In all of these cases, people are effectively reorganizing their problem-solving heuristics in order to rapidly
hit on the "best" problem-solving strategy in any given situation. Currently, of course, our robotic systems
tend to be "frozen" rather than "self-organizing". Once programmed, the program tends to operate in
exactly the same way as time goes on -- the situations may change, and the results generated by the
program may differ as a result, but the program itself never changes.
So clearly, a program that could "organize itself" would have the ability to address the frame problem. In
other words, it could get smarter over time. But that, in itself, does not solve the problem, because any one
attempt to find a solution can still take an inordinately long time. The "Robot Example" illustrates that
problem nicely.
• Robot, R1, is created. Its task is to go into a room and get its power supply. It goes into the room,
finds its power supply in a wagon, and pulls it out of the room. Unfortunately, there is also a bomb in
the wagon. BOOM! The wagon and robot go up in smoke.
• That's not good, so a "Robot Decision-maker", R1D1, is constructed. It is able to think out the
consequences of its actions in advance, so it won't pull out the bomb with the wagon. The new robot
goes into the room and immediately engages its deductive engines. It deduces that the floor is under
the wagon, and the wagon is under the roof, so therefore the floor is under the roof. Then it starts
deducing that... BOOM! The bomb goes off before the program can complete its exhaustive set of
deductions.
• That's not good either, so a "Relevancy-detecting Robot Decision-maker", R2D1, is produced. This
time, the robot won't spend time on useless deductions, but will focus on what's important! So it, too, is
told to enter the room and retrieve its power supply. This version sits outside the room without moving
for an awfully long time, until finally... BOOM! The power supply is destroyed by the bomb. After
debugging the output, it turns out that the program had figured out that the position of the floor wasn't
relevant, and that the color of the drapes wasn't relevant, and that was about as far as it got in the time
available. At least this time the robot was saved, but the "analysis paralysis" induced by the attempt to
determine relevancy prevented the robot from taking any sort of action.
The robot we need to solve the problem, of course, is the "R2D2" -- a Relevancy-detecting Robot Decision-
maker that acts Decisively! The frame problem suggests that R2D2 is impossible to construct. Impossible,
that is, if R2D2 must stand or fail on its own.
3. Explain the following methods of reasoning: a) deduction; b) induction (note: this
is not mathematical induction, but inductive reasoning); c) abduction.
Answer:
Abduction means determining the precondition: using the conclusion and the rule to support the claim
that the precondition could explain the conclusion.
Consider this example. It is logical to assert: "It rains; if it rains, the floor is wet; hence, the floor is wet."
But any reasonable person can see the problem in making statements like: "The floor is wet; if it rains, the
floor is wet; hence, it rains." In order to make inferences to the best explanation, the researcher needs a
set of plausible explanations, and thus abduction is usually formulated in the following mode:
The surprising phenomenon, X, is observed.
Among hypotheses A, B, and C, A is capable of explaining X.
Hence, there is a reason to pursue A.
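The contrast between the two directions of inference can be made concrete with a toy rule base. This sketch is purely illustrative; the rule and facts are invented for the example:

```python
# A single rule: "if it rains, the floor is wet" (rain -> wet_floor)
rules = {"rain": "wet_floor"}

def deduce(facts, rules):
    """Deduction: given a rule and its precondition, conclude the effect.
    Truth-preserving: the conclusions are guaranteed by the premises."""
    return {effect for cause, effect in rules.items() if cause in facts}

def abduce(observation, rules):
    """Abduction: given an observed effect, collect the preconditions
    that could explain it. These are plausible hypotheses, not
    guaranteed truths -- a wet floor does not prove that it rained."""
    return {cause for cause, effect in rules.items()
            if effect == observation}

print(deduce({"rain"}, rules))     # -> {'wet_floor'}
print(abduce("wet_floor", rules))  # -> {'rain'} (a hypothesis to pursue)
```

Deduction runs the rule forward from precondition to conclusion; abduction runs it backward, which is exactly why its output is only "a reason to pursue A", never a proof.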
Applications of abduction
Abduction can be applied to quantitative research, especially Exploratory Data
Analysis (EDA) and exploratory statistics (ES), such as factor rotation in Exploratory Factor Analysis and
path searching in Structural Equation Modeling. The whole notion of a controlled experiment is covertly
based on the logic of abduction. In a controlled experiment, the researchers control alternate explanations
and test the condition generated from the most plausible hypothesis. However, abduction shares more
common ground with EDA than with controlled experiments. In EDA, after observing some surprising facts,
we exploit them and check the predicted values against the observed values and residuals. Although there
may be more than one convincing pattern, we "abduct" only those that are more plausible for subsequent
controlled experimentation. Since experimentation is hypothesis-driven and EDA is data-driven, the logic
behind them is quite different. The abductive reasoning of EDA goes from data to hypotheses, while the
inductive reasoning of experimentation goes from hypothesis to expected data. By the same token, in
Exploratory Factor Analysis and Structural Equation Modeling, there might be more than one possible way
to achieve a fit between the data and the model; again, the researcher must "abduct" a plausible set of
variables and paths for model building.
In EDA, the role of the researcher is to explore the data in as many ways as possible until a plausible "story"
of the data emerges. EDA is not “fishing” significant results from all possible angles during research: it is
not trying out everything.
There are millions of possible explanations to a phenomenon. Due to the economy of research, we cannot
afford to falsify every possibility. We don't have to know everything to know something. By the same token,
we don't have to screen every false thing to dig out the authentic one. During the process of abduction, the
researcher should be guided by the elements of generality to extract a proper mode of perception.
Deduction means determining the conclusion: drawing logical consequences from premises. An inference
is endorsed as deductively valid when the truth of all premises guarantees the truth of the conclusion.
For instance,
First premise: All the beans from the bag are white (True).
Second premise: These beans are from this bag (True).
Conclusion: Therefore, these beans are white (True).
Limitations of deduction
There are several limitations of deductive logic. First, deductive logic confines the
conclusion to a binary answer (True/False). A typical example is the rejection, or failure to reject, of the
null hypothesis.
Second, this kind of reasoning cannot lead to the discovery of knowledge that is not already embedded in the
premise. In some cases the premise may even be tautological--true by definition. Brown (1963) illustrated
this weakness by using an example in economics: An entrepreneur seeks maximization of profits. The
maximum profits will be gained when marginal revenue equals marginal cost. An entrepreneur will operate
his business at the equilibrium between marginal cost and marginal revenue.
The above deduction simply tells that a rational man would like to make more money.
There is a similar example in cognitive psychology:
Human behaviors are rational.
One of several options is more efficient in achieving the goal.
A rational human will take the option that directs him to achieve his goal (Anderson,
1990).
The above two deductive inferences simply provide examples that a rational man will do
rational things. The specific rational behaviors have been included in the bigger set of generic rational
behaviors. Since deduction facilitates analysis based upon existing knowledge rather than generating new
knowledge, Josephson and Josephson (1994) viewed deduction as truth preserving and abduction as truth
producing.
Russell and Whitehead (1910) attempted to develop a self-sufficient logical-mathematical system. In their
view, not only can mathematics be reduced to logic, but logic is also the foundation of mathematics.
However, mathematical logic relies on many unproven premises and assumptions, and statistical
conclusions are considered true only given that all the premises and assumptions applied are true.
Deduction alone is a necessary condition, but not a sufficient condition of knowledge. Peirce (1934/1960)
warned that deduction is applicable only to the ideal state of things. In other words, deduction alone can be
applied to a well-defined problem, but not an ill-defined problem, which is more likely to be encountered by
researchers. Nevertheless, deduction performs the function of clarifying the relation of logical implications.
When well-defined categories result from abduction, premises can be generated for deductive reasoning.
Induction means determining the rule. It is learning the rule after numerous examples of the
conclusion following the precondition. Example: "The grass has been wet every time it has rained.
Thus, when it rains, the grass gets wet." Scientists are commonly associated with this style of
reasoning.
It allows inferring a from multiple instantiations of b when a entails b. Induction is the process
of inferring probable antecedents as a result of observing multiple consequents. An inductive
statement requires perception for it to be true. For example, the statement "it is snowing outside" is
unverified until one looks or goes outside to see whether it is true or not. Induction requires sense
experience.
Limitations of induction
Hume (1777/1912) argued that things are inconclusive by induction because in infinity
there are always new cases and new evidence. Induction can be justified, if and only if, instances of which
we have no experience resemble those of which we have experience. Thus, the problem of induction is also
known as “the skeptical problem about the future” (Hacking, 1975).
We never know when a regression line will turn flat, go down, or go up. Even inductive reasoning using
numerous accurate data points and high-powered computing can go wrong, because predictions are made
only under certain specified conditions.
Take the modern economy as another example. Due to American economic problems in the early 1980s,
quite a few reputable economists made gloomy predictions about the U.S. economy, such as the takeover
of the American economic and technological throne by Japan. By the end of the decade, Roberts (1989)
concluded that those economists were wrong; contrary to those forecasts, in the 1980s the U.S. enjoyed
the longest economic expansion in its history. In the 1990s, the economic positions of the two nations
changed: Japan experienced recession while America experienced expansion.
Induction suggests the possible outcome in relation to events in the long run; this is not definable for an
individual event. To make a judgment for a single event based on probability, such as "your chance to
survive this surgery is 75 percent," is nonsense. In actuality, the patient will either live or die. In a single
event, not only is the probability undefinable, but the explanatory power is also absent. Induction yields a
general statement that explains the events observed in general, but not the facts observed.
If we observe thousands of stones, trees and flowers, we never reach a point at which we observe a
molecule. After we heat many iron bars, we can conclude the empirical fact that metals will bend when they
are heated. But we will never discover the physics of expansion coefficients in this way.
Indeed, superficial empirical-based induction could lead to wrong conclusions. For example, by repeated
observations, it seems that heavy bodies (e.g. metal, stone) fall faster than lighter bodies (paper, feather).
This Aristotelian belief had misled European scientists for over a thousand years. Galileo argued that indeed
both heavy and light objects fall at the same speed.
We don't know the real probability due to our finite existence. However, given a large number of cases, we
can approximate the actual probability. We don't have to know everything to know something. Also, we
don't have to know every case to get an approximation. This approximation is sufficient to fix our beliefs
and lead us to further inquiry.
4. Explain the strengths and weaknesses of using formal logic as a knowledge
representation method. Give specific examples.
Answer:
Knowledge representation and reasoning plays a central role in Artificial Intelligence.
Research in Artificial Intelligence started off by trying to identify the general mechanisms responsible for
intelligent behavior. However, it quickly became obvious that general and powerful methods are not enough
to get the desired result, namely, intelligent behavior. Almost all tasks a human can perform which are
considered to require intelligence are also based on a huge amount of knowledge. For instance,
understanding and producing natural language heavily relies on knowledge about the language, about the
structure of the world, about social relationships etc.
One way to address the problem of representing knowledge and reasoning about it is to use some form of
logic.
Two perspectives on logic are possible. The first perspective, taken by McCarthy (1968), is that logic should
be used to represent knowledge. That is, we use logic as the representational and reasoning tool inside the
computer. Newell (1982) on the other hand proposed in his seminal paper on the knowledge level to use
logic as a formal tool to analyze knowledge. Of course, these two views are not incompatible. Furthermore,
once we accept that formal logic should be used as a tool for analyzing knowledge, it is a natural
consequence to use logic for representing knowledge and for reasoning about it as well.
Saying that logic is used as the main formal tool does not say which kind of logic is used. In fact, a large
variety of logics have been employed or developed in order to solve knowledge representation and reasoning
problems. Often, one started with an unclearly specified problem, developed some kind of knowledge
representation formalism without a formal semantics, and only later started to provide a formal semantics.
Using this semantics, one could then analyze the complexity of the reasoning problems and develop sound
and complete reasoning algorithms. This is called the logical method, which proved to be very fruitful in
the past and has a lot of potential for the future.
One good example of the evolution of knowledge representation formalisms is the development of
description logics, which have their roots in so-called structured inheritance network formalisms. These
networks were originally developed in order to represent word meanings. A concept node connects to other
concept nodes using roles. Moreover, the roles can be structured as well.
Determination of decidability and complexity, as well as the design of decision algorithms, are based on the
rigorous formalization of the initial ideas. In particular, it is not just one logic that is used to derive these
results; it is the logical method that led to the success. One starts with a specification of how expressions
of the language or formalism are to be interpreted in formal terms. Based on that, one can specify when a
set of formulae logically implies a formula. Then one can start to find similar formalisms (e.g. modal logics)
and prove equivalences, and/or one can specify a method to derive logically entailed sentences and prove
it to be correct and complete.
Another interesting area where the logical method has been applied is the development
of the so-called non-monotonic logics. These are based on the intuition that sometimes a logical
consequence should be retracted if new evidence becomes known. For example, we may assume that our car
will not be moved by somebody else after we have parked it. However, if new information becomes known,
such as the fact that the car is not at the place where we have parked it, we are ready to drop the assumption
that our car has not been moved.
This general reasoning pattern was used quite regularly in early AI systems, but it took a while before it was
analyzed from a logical point of view. In 1980, a special issue of the Artificial Intelligence journal appeared,
presenting different approaches to non-monotonic reasoning, in particular Reiter’s (1980) default logic and
McCarthy’s (1980) circumscription approach.
A disappointing fact about nonmonotonic logics appears to be that it is very difficult to formalize a domain
such that one gets the intended conclusions. In particular, in the area of reasoning about actions, McDermott
(1987) has demonstrated that the straightforward formalization of an easy temporal projection problem (the
"Yale shooting problem") does not lead to the desired consequences. However, it is possible to get around
this problem: once all underlying assumptions are spelled out, this and other problems can be solved
(Sandewall 1994).
It took more than a decade before people started to analyze the computational complexity
(of the propositional versions) of these logics. As it turned out, these logics are usually somewhat more
difficult than ordinary propositional logic (Gottlob 1992). This, however, seems tolerable, since we get
many more conclusions than in standard propositional logic.
Right at the same time, the tight connection between nonmonotonic logic and belief revision (Gärdenfors
1988) was noticed. Belief revision -- modeling the evolution of beliefs over time -- is just one way to
describe how the set of nonmonotonic consequences evolves over time, which leads to a very tight
connection on the formal level between these two forms of nonmonotonicity (Nebel 1991). Again, all these
results and insights are mainly based on the logical method of knowledge representation.
As mentioned, it is the idea of providing knowledge representation formalisms with formal (logical)
semantics that enables us to communicate their meaning, to analyze their formal properties, to determine
their computational complexity, and to devise reasoning algorithms.
While the research area of knowledge representation is dominated by the logical approach, this does not
mean that all approaches to knowledge representation must be based on logic. Probabilistic (Pearl 1988) and
decision theoretic approaches, for instance, have become very popular lately. Nowadays a number of
approaches aim at unifying decision theoretic and logical accounts by introducing a qualitative version of
decision theoretic concepts. Other approaches aim at tightly integrating decision theoretic concepts such as
Markov decision processes with logical approaches, for instance. Although this is not pure logic, the two
latter approaches demonstrate the generality of the logical method: specify the formal meaning and analyze!
5. Research the "MIU Puzzle" (originally defined in D. Hofstadter's book "Gödel,
Escher, Bach"). a) Answer the question: Can MU be produced from MI using the
given rules? Explain your answer. b) In what sense is this a "typographical" system?
c) What is Gödel's Incompleteness Theorem, and what does the MIU system have to
do with it?
Suppose there are the symbols M, I, and U which can be combined to produce strings of symbols called
"words". The MU puzzle asks one to start with the "axiomatic" word MI and transform it into the word MU
using in each step one of the following transformation rules:
1. Add a U to the end of any string ending in I. For example: MI to MIU.
2. Double any string after the M (that is, change Mx, to Mxx). For example: MIU to MIUIU.
3. Replace any III with a U. For example: MUIIIU to MUUU.
4. Remove any UU. For example: MUUU to MU.
Using these four rules is it possible to change MI into MU in a finite number of steps?
The production rules can be written in a more schematic way. Suppose x and y behave as variables
(standing for strings of symbols). Then the production rules can be written as:
1. xI → xIU
2. Mx → Mxx
3. xIIIy → xUy
4. xUUy → xy
Is it possible to obtain the word MU using these rules?
Solution
The puzzle's solution is no. It is impossible to change the string MI into MU by repeatedly applying the given
rules.
In this case, one can look at the total number of I's in a string. Only the second and third rules change this
number. In particular, rule two will double it, while rule three will reduce it by 3. Now, the invariant
property is that the number of I's is never divisible by 3:
• In the beginning, the number of I's is 1, which is not divisible by 3.
• Doubling a number that is not divisible by 3 does not make it divisible by 3.
• Subtracting 3 from a number that is not divisible by 3 does not make it divisible by 3 either.
Thus, the goal of MU, with zero I's, cannot be achieved, because 0 is divisible by 3.
In the language of modular arithmetic, the number n of I's obeys the congruence

n ≡ 2^a (mod 3),

where a counts how often the second rule is applied (rule 3 subtracts 3 and rules 1 and 4 leave n unchanged, so neither affects n mod 3).
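The invariant argument can be checked empirically. The Python sketch below (the function names `successors` and `reachable` are my own) enumerates every string derivable from MI up to a chosen length and confirms that the I-count of every derivable string stays off the multiples of 3, so MU never appears. The bounded search is only corroboration; the modular argument above is the actual proof.

```python
def successors(s):
    """All strings reachable from s in one application of the four MIU rules."""
    out = set()
    if s.endswith("I"):                      # rule 1: xI -> xIU
        out.add(s + "U")
    if s.startswith("M"):                    # rule 2: Mx -> Mxx
        out.add("M" + s[1:] * 2)
    for i in range(len(s) - 2):              # rule 3: xIIIy -> xUy
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):              # rule 4: xUUy -> xy
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out

def reachable(start="MI", max_len=12):
    """Breadth-first search over all derivable strings of at most max_len symbols."""
    seen = {start}
    frontier = {start}
    while frontier:
        frontier = {t for s in frontier for t in successors(s)
                    if len(t) <= max_len} - seen
        seen |= frontier
    return seen

words = reachable()
# Every derivable string keeps an I-count that is not a multiple of 3:
assert all(w.count("I") % 3 != 0 for w in words)
assert "MU" not in words
```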
b) In what sense is this a "typographical" system?
The MIU-puzzle is in fact merely a puzzle about natural numbers in typographical disguise. If we could only
find a way to transfer it to the domain of number theory, we might be able to solve it.
If we try counting the numbers of I's contained in theorems, we will soon notice that it seems never to be
0. In other words, it seems that no matter how much lengthening and shortening is involved, we can never
work in such a way that all I's are eliminated. Let us call the number of I's in any string the "I-count" of
that string. Note that the I-count of the axiom MI is 1. We can do more than show that the I-count can't
be 0 -- we can show that the I-count can never be any multiple of 3.
To begin with, notice that rules 1 and 4 (see above) leave the I-count totally undisturbed. Therefore we
need only think about rules 2 and 3. As far as rule 3 is concerned, it diminishes the I-count by exactly three.
After an application of this rule, the I-count of the output might conceivably be a multiple of 3 -- but only if
the I-count of the input was also. Rule 3, in short, never creates a multiple of 3 from scratch. It can only
create one when it began with one. The same holds for rule 2, which doubles the I-count. The reason is
that if 3 divides 2n, then -- because 3 does not divide 2 -- it must divide n (a simple fact from the theory of
numbers). Neither rule 2 nor rule 3 can create a multiple of 3 from scratch.
But this is the key to the MU-puzzle! Here is what we know :
The I-count begins at 1 (the axiom MI ), and that is not a multiple of 3.
Two of the rules do not affect the I-count at all.
The two remaining rules which do affect the I-count do so in such a way as never to create a multiple of 3
unless given one initially.
The conclusion -- and a typically hereditary one it is, too -- is that the I-count can never become any
multiple of 3. In particular, 0 is a forbidden value of the I-count [because 0 is a multiple of 3, -- 0 x 3 = 0].
Hence, MU is not a theorem of the MIU-system.
Notice that, even as a puzzle about I-counts, this problem was still plagued by the crossfire of lengthening
and shortening rules. Zero became the goal. I-counts could increase (rule 2 ), could decrease (rule 3 ). Until
we analyzed the situation, we might have thought that, with enough switching back and forth between the
rules, we might eventually hit 0. Now, according to a simple number-theoretical argument, we know that
that is impossible.
Not all problems of the type which the MU-puzzle symbolizes are so easy to solve as this one. But we have
seen that at least one such puzzle could be embedded within, and solved within, number theory [When we
speak of "number theory" (which we often will indicate by "N") we mean its non-formalized version. The
language in this version is thus plain English or some other spoken language. Its formalized version is the
"Theoria Numerorum Typographica", or "TNT" for short (without quotation marks). Number theory is the
"meaning" or (meaningful) interpretation of TNT, that is, TNT describes number theory. Both number
theory and TNT concern the positive integers and zero (which are called "natural numbers") and their
properties. When setting up TNT, all symbols, operations, etc. in number theory are translated into special
TNT signs, in such a way that a minimum of such TNT signs is obtained.]. We are now going to see that
there is a way to embed all problems about any formal system, in number theory. To illustrate it, we will
use the MIU-system.
We begin by considering the notation of the MIU-system. We shall map each symbol onto a new symbol :
M <==> 3
I <==> 1
U <==> 0
The correspondence was chosen arbitrarily. The only reason for it is that each symbol looks a little like the
number it is mapped onto. Each number is called the Gödel number of the corresponding letter. Now
MU <==> 30
MIIU <==> 3110
MUU <==> 300
etc.
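This digit-substitution mapping is easy to mechanize. The following Python sketch (the helper name `godel_number` is mine) reproduces the correspondences above.

```python
# The arbitrary symbol-to-digit correspondence from the text: M -> 3, I -> 1, U -> 0.
GODEL = {"M": "3", "I": "1", "U": "0"}

def godel_number(s):
    """Map an MIU string to its Gödel number by substituting each letter's digit."""
    return int("".join(GODEL[c] for c in s))

assert godel_number("MU") == 30
assert godel_number("MIIU") == 3110
assert godel_number("MUU") == 300
```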
Let us now take a look at a typical derivation in the MIU-system, written simultaneously in both notations :
(1) MI -------------- axiom ---------- 31
(2) MII ------------- rule 2 ---------- 311
(3) MIIII ------------ rule 2 ---------- 31111
(4) MUI ------------ rule 3 --------- 301
(5) MUIU ---------- rule 1----------- 3010
(6) MUIUUIU ----- rule 2 ---------- 3010010
(7) MUIIU --------- rule 4 --------- 30110
The left-hand column is obtained by applying our four familiar typographic rules. The right-hand column,
too, could be thought of as having been generated by a similar set of typographic rules. Yet the right-hand
column has a dual nature. Now we explain what this means.
Seeing Things Both Typographically and Arithmetically
We could say of the fifth string ('3010') that it was made from the fourth ('301'), by appending a '0' on the
right. On the other hand we could equally well view the transition as caused by an arithmetical operation
-- multiplication by 10, to be exact. When natural numbers are written in the decimal system,
multiplication by 10 and putting a '0' on the right are indistinguishable operations. We can take advantage
of this to write an arithmetical rule which corresponds to typographical rule I :
Arithmetical Rule Ia : A number whose decimal expansion ends on the right in '1' can be multiplied by 10.
We can eliminate the reference to the symbols in the decimal expansion by arithmetically describing the
rightmost digit :
Arithmetical Rule Ib : A number whose remainder when divided by 10 is 1, can be multiplied by 10.
Now we could have stuck with a purely typographical rule here as well, such as the following one :
Typographical Rule I : From any theorem whose rightmost symbol is '1' a new theorem can be made, by
appending '0' to the right of that '1'. [Just using numerals instead of letters]
They would have the same effect. This is why the right-hand column has a "dual nature" : It can be viewed
either as a series of typographical operations changing one pattern of symbols into another, or as a series
of arithmetical operations changing one magnitude into another. But there are powerful reasons for being
more interested in the arithmetical version.
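The "dual nature" can be made concrete. The sketch below (Python; the function names are mine) implements Typographical Rule I on numerals and Arithmetical Rule Ib on numbers and checks that they coincide.

```python
def rule1_typographic(s):
    """Typographical Rule I: append '0' to the right of a numeral ending in '1'."""
    assert s.endswith("1"), "rule applies only to numerals ending in '1'"
    return s + "0"

def rule1_arithmetic(n):
    """Arithmetical Rule Ib: a number with remainder 1 mod 10 may be multiplied by 10."""
    assert n % 10 == 1, "rule applies only to numbers congruent to 1 mod 10"
    return n * 10

# The two views agree: appending '0' in decimal notation IS multiplication by 10.
assert rule1_typographic("301") == "3010"
assert rule1_arithmetic(301) == 3010
assert int(rule1_typographic("301")) == rule1_arithmetic(301)
```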
Each typographical rule of this kind tells how certain digits are to be shifted, changed, dropped, or inserted
in any number represented decimally, and can therefore be restated as an arithmetical rule.
More briefly :
Typographical rules for manipulating numerals are actually arithmetical rules for operating on numbers.
This simple observation is at the heart of Gödel's method, and it will have an absolutely shattering effect. It
tells us that once we have a Gödel-numbering for any formal system, we can straightaway form a set of
arithmetical rules which complete the Gödel isomorphism. The upshot is that we can transfer the study of
any formal system -- in fact the study of all formal systems -- into number theory.
What is Gödel's Incompleteness Theorem, and what does the MIU system have to do
with it?
· Gödel's first incompleteness theorem states that no consistent, effectively axiomatized set of axioms for
arithmetic is complete. In other words, it means that there is at least one true statement in the system that
cannot be derived within the system.
· Gödel’s second incompleteness theorem states that if we have such a set of axioms T,
then T's consistency cannot be proven within T.
To answer the question we will go back to the MIU-system.
Since this is a formal system, it consists of a fixed alphabet and a set of rules. Our formal system, the MIU-system, consists of only three letters of the alphabet: M, I, U. The strings (which mean strings of letters) of the MIU-system are the strings that are composed of only those three letters. For example:
MU   UIM   MUUMUU   UIIUMIUUIMUIIUMIUUIMUIIU
are strings of the MIU-system.
SYMBOLS: M, I, U
AXIOM: MI
RULES: (In the following, x is merely a variable)
1. If xI is a theorem, so is xIU.
2. If Mx is a theorem, so is Mxx.
3. In any theorem, III can be replaced by U.
4. UU can be dropped from any theorem.
Now the question is: can we produce the string "MU" in this system? The axiom "MI" is granted initially, and we want to produce "MU" using this axiom and the rules:
MI → ... → MU
The reason we want to do this is to find out mathematically whether "MU" is derivable or not.
As shown above:
1) The I-count begins at 1 (not a multiple of 3).
2) Rules 1 and 4 do not affect the I-count at all.
3) Rules 2 and 3 affect the I-count in such a way that they never create a multiple of 3 unless given one initially.
It follows that the I-count can never be a multiple of 3, and thus we can never derive MU from MI. In other words, "MU is not a theorem of the MIU-system". As a result, we can see that 'MU' is a well-formed string of the system, but it is not derivable, so the system is incomplete in this sense. This is analogous to Gödel's first incompleteness theorem, which says that in any sufficiently strong consistent formal system there is at least one true statement that cannot be derived or proved within the system.
6. Formal Logic is known for its use of symbolic representation of concepts, such as
A → B. In what sense are numbers symbols? Give examples.
Formal logic : the branch of logic that examines patterns of reasoning to determine which ones necessarily
result in valid, or formally correct, conclusions.
In the history of formal logic, different symbols have been used at different times
and by different authors.
In mathematical logic, a Gödel numbering is a function that assigns to each symbol and well-formed
formula of some formal language a unique natural number, called its Gödel number. The concept was first
used by Kurt Gödel for the proof of his incompleteness theorem.
A Gödel numbering can be interpreted as an encoding in which a number is assigned to each symbol of a
mathematical notation, after which a sequence of natural numbers can then represent a sequence of
strings. These sequences of natural numbers can again be represented by single natural numbers,
facilitating their manipulation in formal theories of arithmetic.
Gödel used a system of Gödel numbering based on prime factorization. He first assigned a unique natural
number to each basic symbol in the formal language of arithmetic he was dealing with.
To encode an entire formula, which is a sequence of symbols, Gödel used the following system. Given a
sequence x1 x2 x3 ... xn of positive integers, the Gödel encoding of the sequence is the product of the first n
primes raised to their corresponding values in the sequence:

enc(x1, x2, ..., xn) = 2^x1 · 3^x2 · 5^x3 · ... · pn^xn
According to the fundamental theorem of arithmetic, any number obtained in this way can be uniquely
factored into prime factors, so it is possible to recover the original sequence from its Gödel number (for
any given number n of symbols to be encoded).
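This prime-power encoding and its inverse can be sketched as follows (Python; all function names are mine, and the decoder assumes a sequence of positive integers, as in the text, so no exponent is zero).

```python
def primes():
    """Yield 2, 3, 5, 7, ... by trial division (adequate for short sequences)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def godel_encode(seq):
    """enc(x1, ..., xn) = 2^x1 * 3^x2 * 5^x3 * ... for positive integers xi."""
    result = 1
    for p, x in zip(primes(), seq):
        result *= p ** x
    return result

def godel_decode(n):
    """Recover the sequence by stripping each prime's exponent in turn
    (unique by the fundamental theorem of arithmetic)."""
    seq = []
    for p in primes():
        if n == 1:
            break
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        seq.append(e)
    return seq

assert godel_encode([1, 2, 3]) == 2**1 * 3**2 * 5**3   # = 2250
assert godel_decode(2250) == [1, 2, 3]
```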
Gödel specifically used this scheme at two levels: first, to encode sequences of symbols representing
formulas, and second, to encode sequences of formulas representing proofs. This allowed him to show a
correspondence between statements about natural numbers and statements about the provability of
theorems about natural numbers, the key observation of the proof.
There are more sophisticated (but more concise) ways to construct a Gödel numbering for sequences.
Artificial intelligence systems are centrally concerned with representing and manipulating knowledge.
Some knowledge, particularly in mathematics and the sciences, is expressed as numbers and formulae:
expressions that consist of collections of numbers and arithmetic operations. Mathematics, when it
involves more abstract reasoning processes (such as proving, algebraic transformation, and the symbolic
solution of integro-differential equations), requires a more general and powerful language, in which
concepts and relationships are represented by symbols and strings of symbols.
In fact, numbers and formulae are really just collections of symbols. Numbers are symbols whose
properties are defined over the set of arithmetic operations, and arithmetic operations are themselves
represented by symbols or strings of symbols (such as +, -, /, x). Thus the tools of artificial intelligence
consist of languages, processes, and constructs that allow the acquisition, representation, storage,
transformation, and other manipulation of concepts and relationships; this connects closely with the study
of language theory, including higher-order computer languages and compiler theory.
Internal to a computer a symbol is just a sequence of bits that can be distinguished from other symbols.
Some symbols have a fixed interpretation, e.g., symbols that represent numbers and symbols that
represent characters. Symbols that do not have fixed meaning appear in many programming languages.
Java, starting from Java 1.5, calls them enumeration types. Lisp calls them atoms. Usually, they are
implemented as indexes into a symbol table that gives the name to print out. The only operation needed
on these symbols is equality to determine if two symbols are the same or not.
7. Do some research on the programming language called Prolog. Explain how
Prolog uses backward chaining as its solution method.
All programming languages have both declarative (definitional) and imperative (computational) components. Prolog is referred to as a declarative language because all program statements are definitional. In particular, a Prolog program consists of facts and rules which serve to define relations (in the mathematical sense) on sets of values. The imperative component of Prolog is its execution engine based on unification and resolution, a mechanism for recursively extracting sets of data values implicit in the facts and rules of a program.
Prolog rules consist of: a consequent (the head), and an antecedent (the body)
(connected by the ’:-’ symbol)
• The body of a rule may consist of one or more clauses. Each clause is a subgoal.
• The head is true if the subgoals making up the body are true.
Example: "X is weird if X is vegetarian and eats steak, or if X is a trainspotter":
weird(X) :- vegetarian(X), eats_steak(X).
weird(X) :- is_trainspotter(X).
Like facts, rules are also predicates; these two rules together constitute the predicate "weird/1".
The subgoals of a rule are separated by ',' (AND) or occasionally ';' (OR), and each rule is terminated with
a full stop. The scope of a variable is limited to that rule; there are no global variables in Prolog. All facts
and rules have global scope.
Backward chaining
Once knowledge is represented in a form that Prolog can use (facts and rules), a reasoning procedure is
required to draw conclusions from the knowledge base.
No need to program this ourselves – it is built into Prolog’s inference engine, which by default works by
backward chaining.
This means that we start with a hypothesis and reason backwards from the hypothesis
trying to find facts that support it. For instance, in a database about grants for listed
buildings we might want to ask: |?- eligible_for_grant(jim).
The Prolog interpreter will then attempt to prove the goal statement, given the assumptions in the
KB(knowledge base).
• This may mean ascertaining that:
Jim’s property is listed
Jim does not have huge funds at his disposal
• Ascertaining that Jim does not have huge funds at his disposal might entail making sure that:
— Jim does not have lots of investments, and
— Jim does not earn > £75,000 a year
Thus Prolog is working backwards from the goal statement to the facts.
Prolog can be made to work in a forward-chaining fashion but this requires more programming effort.
In Prolog implies is written as :- and is read from right to left.
The universal quantifier is assumed for all variables and a comma is used for AND.
Example:
∀x Cat(x) ∧ Tame(x) ⇒ Pet(x) would be written as
pet(X) :- cat(X), tame(X).
Sentences are apparently written 'backwards' to reflect the fact that Prolog uses backward chaining. Thus,
the above sentence could be read as "to prove that X is a pet, prove that X is a cat and that X is tame."
(Sentences in Prolog end in a period. Variables must start with a capital letter or an underscore, and objects
and relations begin with a lower-case letter.) A semicolon is used for OR.
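The backward-chaining procedure described in this section can be illustrated with a miniature interpreter. The Python sketch below is not Prolog's actual engine (it handles only one-argument predicates and a single variable, with no general unification), and the predicate names in the grant example are invented for illustration.

```python
# Illustrative knowledge base in the spirit of the listed-buildings example.
# Facts are (predicate, argument) pairs; these names are hypothetical.
facts = {("listed", "jim"), ("few_investments", "jim"), ("modest_income", "jim")}

# Each rule pairs a head predicate with a body of subgoal predicates,
# mirroring Prolog's  head :- subgoal1, subgoal2.
rules = [
    ("modest_funds", ["few_investments", "modest_income"]),
    ("eligible_for_grant", ["listed", "modest_funds"]),
]

def prove(goal):
    """Backward chaining: a goal holds if it is a fact, or if some rule's head
    matches it and every subgoal in that rule's body can itself be proved."""
    if goal in facts:
        return True
    pred, arg = goal
    for head, body in rules:
        # Reason backwards: to prove the head, recursively prove each subgoal.
        if head == pred and all(prove((sub, arg)) for sub in body):
            return True
    return False

# Start from the hypothesis (the query) and work back to the facts:
assert prove(("eligible_for_grant", "jim"))       # jim satisfies every subgoal
assert not prove(("eligible_for_grant", "bob"))   # no supporting facts for bob
```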
References: http://en.wikipedia.org/wiki/Boolean_satisfiability_problem
http://books.google.com/books?id=1-ffZVmazvwC&pg=PA280&lpg=PA280&dq=importance+of+Satisfiability+problem&source=bl&ots=-pd1h2yE_l&sig=jY96918MWW7oP_RQMDuederuozM&hl=en&ei=utmOSsaOJ4nIsQPHgvnfCQ&sa=X&oi=book_result&ct=result&resnum=2#v=onepage&q=&f=false
http://en.wikipedia.org/wiki/Frame_problem
http://www.augustana.ab.ca/~mohrj/courses/2004.fall/csc110/lecture_notes/AI.html
http://www.treelight.com/software/collaboration/darwinframe.html
http://www.newworldencyclopedia.org/entry/Abductive_reasoning
http://www.creative-wisdom.com/teaching/WBI/abduction5.pdf
http://books.google.com/books?id=LcOLqodW28EC&pg=PA427&lpg=PA427&dq=frame+problem+artificial+intelligence&source=bl&ots=sUwegDQAK3&sig=hh5hsw04j4SHuSzVNT-0OreXpt4&hl=en&ei=YDaQSpv6DoeQsgOruaAM&sa=X&oi=book_result&ct=result&resnum=1#v=onepage&q=frame%20problem%20artificial%20intelligence&f=false
http://www.informatik.uni-freiburg.de/~ki/papers/nebel-iesbs-01.pdf
http://www.iwriteiam.nl/MIUpuzzle.html
http://en.wikipedia.org/wiki/MU_puzzle
http://www-scf.usc.edu/~hyuen/writing/essays/OpusMagnum.pdf
http://www.metafysica.nl/nature/insect/nomos_64c.html#fsnss
http://rudar.ruc.dk/bitstream/1800/3868/1/Group%206-Philosophy%20of%20Logic%20and%20Artificial%20Intelligence-Final%20Hand%20In(5.1.2009).pdf
http://people.cs.ubc.ca/~poole/aibook/html/ArtInt_74.html
https://www.cs.kent.ac.uk/teaching/08/modules/CO/8/84/Aliy/LECTURE_NOTES/Prolog2_rules.pdf
http://www.soe.ucsc.edu/classes/cmps112/Spring03/languages/prolog/PrologIntro.pdf
http://www2.cs.uidaho.edu/~tsoule/cs470/PrologH1.html
http://www.cs.uwm.edu/~mali/conferences/mfsat-ictai.pdf
http://users.ox.ac.uk/~jrlucas/Godel/simplex.html
http://www.yourdictionary.com/formal-logic
http://www.lulu.com/items/volume_65/190000/190528/10/print/forallx090604.pdf
http://en.wikipedia.org/wiki/Godel_number
http://www.shsu.edu/~mth_jaj/math467/bowyer.pdf