1
Computational intelligence: an F-matrix view
Qianchuan Zhao
Center for Intelligent and Networked Systems
Tsinghua University
Beijing 100084, China
Presented to: SFI summer school at Qingdao
July 8, 2004
2
Joint work with
• Prof. Yu-Chi Ho, Dr. David Pepyne, Prof. Da-Zhong Zheng, Prof. Bruce Krogh, Prof. Qiang Lu, Mr. Kai Sun, Dr. Ke Yang, Mr. Qingshan Jia
3
Acknowledgement
• National Science Foundation of China grants 60074012 and 60274011, funding from the Ministry of Education (China), and a Tsinghua University (China) Fundamental Research Funding Grant.
4
Computational Intelligence
• Methods inspired by natural intelligence (Genetic Algorithms, Swarm Intelligence, Simulated Annealing, Quantum Computing)
• Methods inspired by the structure of the human brain (Artificial Neural Networks)
• Methods inspired by how humans reason (Fuzzy Logic)
5
Outline
• Optimization
• Modeling strategies
• General search strategies
• General design strategies
• Complexity in behavior of dynamic systems
6
Outline
• Optimization
• Modeling strategies
• General search strategies
• General design strategies
• Complexity in behavior of dynamic systems
7
Optimization
An optimization problem is to maximize (or minimize) a performance index over a search space, subject to some constraints.
$\max_x f(x)$
Subject to: $g(x) \le 0$
8
Complexity in evaluating objective function
An objective function f is complex to evaluate if it can only be evaluated by simulation.
9
Outline
• Optimization
• Modeling strategies
• General search strategies
• General design strategies
• Complexity in behavior of dynamic systems
10
Modeling of optimization problems
• Encoding
• Filtering
• Surrogate
• Goal softening
11
Representing solutions
• Encoding
Use strings or numbers to represent a solution to the optimization problem as input, so that optimization algorithms can proceed.
Solutions should be obtainable by decoding the outputs of the optimization algorithms.
12
Example
• TSP (traveling salesman problem):
Find a minimum cost tour of n cities with each city visited once and only once.
[Figure: four cities, labeled 1, 2, 3, 4]
The sequence of nodes x=1234 is a solution.
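The encoding idea can be sketched in a few lines: the string "1234" is decoded into a city order and its tour cost is evaluated. The distance matrix `DIST` and the helper names are hypothetical and only serve the illustration; they do not come from the slides.

```python
def decode(code):
    """Decode a string such as "1234" into a list of 0-based city indices."""
    return [int(ch) - 1 for ch in code]

def is_valid_tour(tour, n_cities):
    """A tour is valid if every city is visited once and only once."""
    return sorted(tour) == list(range(n_cities))

def tour_cost(tour, dist):
    """Cost of visiting the cities in order and returning to the start."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

# Hypothetical symmetric distances between cities 1..4 (assumed values).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

tour = decode("1234")
cost_1234 = tour_cost(tour, DIST)
```

Any permutation string decodes to a valid tour, so a search algorithm can operate on strings alone and leave cost evaluation to `tour_cost`.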
13
Example
• Buffer allocation example:
A solution is a vector of ten buffers.
Alternatively, by taking the constraints into account, a solution can be defined as a vector of 4 variables (B0, B4, B5, B8).
14
Filtering
• Solve the original problem by stages.
At the first stage, easy constraints are applied to narrow the solution space down to a smaller space.
At the second stage, hard constraints are handled only within that smaller space.
15
16
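Example
Traditional function optimization
Max f(x)
Subject to: x ∈ [0,1]
f is a continuously differentiable function.
Method: at the first stage, obtain the set of stationary points by solving df(x)/dx = 0 for x ∈ R; at the second stage, solve Max f(x) over the stationary points that lie in [0,1] together with the endpoints {0,1}.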
Example
Islanding operation for power systems:
Under local failures, to avoid collapse of the entire power system, it is separated into several small islands which can operate in safe conditions.
18
Example
Islanding operation for power systems [Zhao03a] [Sun03]:
The balance of static power supply and load in each island is a necessary condition for each island to operate safely.
First stage: obtain the solution set by searching for all separation operations that keep the static power balance.
Second stage: search within that set, by simulation, for the truly proper separation operations.
19
A power system
20
21
22
Surrogate
• Exploration
Learning by example:
Predict complex constraints/objective function with ANN
• Average
Noisy observations from Monte Carlo simulation
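The averaging idea can be illustrated with a small sketch: a noisy simulation is replicated many times and the sample mean serves as the surrogate for the true performance. The simulator `noisy_simulation`, its true value x(1 - x), and the noise level are assumptions made only for this example.

```python
import random

def noisy_simulation(x, rng):
    # Hypothetical simulator: true performance x * (1 - x) plus Gaussian noise.
    return x * (1 - x) + rng.gauss(0.0, 0.1)

def averaged_estimate(x, replications, rng):
    # Surrogate for f(x): the sample mean over independent replications.
    total = sum(noisy_simulation(x, rng) for _ in range(replications))
    return total / replications

rng = random.Random(0)
estimate = averaged_estimate(0.5, 10_000, rng)
```

The standard error shrinks only as one over the square root of the number of replications, which is why accurate evaluation by simulation is expensive.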
23
24
Example
• Q-learning
• Neural dynamic programming
25
Goal softening
• Instead of asking for the best solution with certainty, we ask for a good-enough solution with high probability
26
Example
• Ordinal Optimization
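As a rough sketch of goal softening in the ordinal-optimization style, the code below estimates how often a set selected from noisy observations contains at least one truly good-enough design. The performance model (design i has true cost i), the Gaussian noise, and all parameter names are assumptions for illustration, not the method of [Ho92].

```python
import random

def alignment_probability(n_designs, g, s, noise_sd, trials, seed=0):
    """Estimate P(selected top-s by noisy observation hits the true top-g)."""
    rng = random.Random(seed)
    true_cost = list(range(n_designs))  # assumed: design i has true cost i
    good_enough = set(range(g))         # the g truly best designs
    hits = 0
    for _ in range(trials):
        observed = [c + rng.gauss(0.0, noise_sd) for c in true_cost]
        ranked = sorted(range(n_designs), key=lambda i: observed[i])
        if good_enough.intersection(ranked[:s]):
            hits += 1
    return hits / trials

p_noise_free = alignment_probability(100, 5, 5, 0.0, 20)
p_noisy = alignment_probability(100, 5, 5, 50.0, 200)
```

Without noise the observed order equals the true order, so the selected set always contains a good-enough design; with noise the same question is answered only with some probability.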
27
F-matrix [Ho02]
[Figure: the F-matrix. Rows are the solutions x1, x2, ..., x|X|; columns are the problem instances f1, f2, ..., f|F|; each entry is a performance value from {y1, y2, ..., y|Y|}.]
The number of all different problem instances is $|Y|^{|X|}$.
Note the sum for each row is the same.
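The two statements above can be checked by brute force on a small case. The sketch below builds every column of the F-matrix for a given number of solutions and set of performance values; the function name `f_matrix` is introduced here for illustration, and the column order it produces need not match the order drawn on the slides.

```python
from itertools import product

def f_matrix(n_solutions, values):
    """Return the F-matrix as a list of rows, one row per solution."""
    # Each column is one problem instance: an assignment of a value to every solution.
    columns = list(product(values, repeat=n_solutions))
    return [[column[i] for column in columns] for i in range(n_solutions)]

rows = f_matrix(3, [0, 1])
n_instances = len(rows[0])
row_sums = [sum(row) for row in rows]
```

With three solutions and binary values there are 2^3 = 8 columns, and every row sums to 4.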
28
F-matrix
[Figure: the F-matrix with binary performance values. Rows are x1, x2, ..., x|X|; columns are f1, f2, ..., f|F|; each entry is 0 or 1.]
29
F-matrix
• Assumptions:
a) Finite world assumption: finite search space and finite set of performance values.
b) There is no constraint.
c) Only P (polynomial) solutions can be searched.
30
Outline
• Optimization
• Modeling strategies
• General search strategies
• General design strategies
• Complexity in behavior of dynamic systems
31
General search strategies
• Neighborhood search
• Random guess
• Parallel search
• Hybrid search
• Hill climbing
• Backtracking
32
Neighborhood search (one dimension)
[Figure: the binary F-matrix with rows x1, x2, ..., x|X| and columns f1, f2, ..., f|F|]
The designs in a neighborhood can be listed next to one another as nearby designs.
33
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
34
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
NS   X   X   2   1   2   1   1   1
Total computation effort consumed:
2+2+2+1+2+1+1+1 = 12
Number of successes: 6
35
Random guess
[Figure: the binary F-matrix with rows x1, x2, ..., x|X| and columns f1, f2, ..., f|F|]
Unlike neighborhood search, random guess jumps stochastically across the entire search space.
36
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
37
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
RG   X   2   X   1   2   1   1   1
Total computation effort consumed:
2+2+2+1+2+1+1+1 = 12
Number of successes: 6
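The step counts for neighborhood search and random guess can be reproduced with a short sketch. It assumes a budget of two evaluations per instance, that neighborhood search visits x1 then x2, and that the random guess happened to visit x1 then x3; those visiting orders are inferred from the step-count rows, and the function name `search` is introduced here.

```python
# The 3 x 8 binary F-matrix from the example.
F = {
    "x1": [0, 0, 0, 1, 0, 1, 1, 1],
    "x2": [0, 0, 1, 0, 1, 0, 1, 1],
    "x3": [0, 1, 0, 0, 1, 1, 0, 1],
}

def search(order, budget=2):
    """Return (steps per instance, total effort, number of successes)."""
    steps, effort, successes = [], 0, 0
    for j in range(8):
        hit = None
        for step, name in enumerate(order[:budget], start=1):
            if F[name][j] == 1:
                hit = step
                break
        steps.append("X" if hit is None else hit)
        effort += budget if hit is None else hit  # a miss uses the whole budget
        successes += hit is not None
    return steps, effort, successes

ns = search(["x1", "x2"])  # neighborhood search
rg = search(["x1", "x3"])  # one possible random guess
```

Both orders spend the same total effort and succeed on the same number of instances, only on different instances.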
38
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
Columns grouped by number of 1s: S0 = {f1}, S1 = {f2, f3, f4}, S2 = {f5, f6, f7}, S3 = {f8}
39
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
NS   X   X   2   1   2   1   1   1
RG   X   2   X   1   2   1   1   1
Columns grouped by number of 1s: S0 = {f1}, S1 = {f2, f3, f4}, S2 = {f5, f6, f7}, S3 = {f8}
40
$\sum_{k=0}^{n} \sum_{f \in S_k} \delta_{f;\tilde{P}}$
n is the number of solutions.
$S_k$ is the set of binary strings with exactly k 1s.
$\delta_{f;\tilde{P}}$ is 1 if the problem instance f has outcome 1 for at least one solution in the randomly picked P solutions {x'1, x'2, …, x'P}, and 0 otherwise.
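A brute-force sketch of this count, under the assumption that an instance is a binary string of length n and that a pick of P solutions succeeds on an instance when at least one picked position holds a 1. The function name `successes_by_k` is introduced for illustration.

```python
from itertools import combinations, product

def successes_by_k(n, picked):
    """counts[k] = number of instances in S_k on which the pick succeeds."""
    counts = [0] * (n + 1)
    for f in product([0, 1], repeat=n):
        if any(f[i] == 1 for i in picked):
            counts[sum(f)] += 1
    return counts

n, P = 3, 2
per_pick = {
    picked: successes_by_k(n, picked)
    for picked in combinations(range(n), P)
}
```

For n = 3 and P = 2 every pick yields the same per-group counts and the same total of 6, matching the worked example with three solutions.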
41
Parallel search
[Figure: the binary F-matrix with rows x1, x2, ..., x|X| and columns f1, f2, ..., f|F|]
Parallel search allows several search procedures to work simultaneously.
42
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
P1 is a search procedure
P2 is another search procedure
P12: the iterative search process
For each search step of both procedures, results are reported to the controller.
[Figure: P1 and P2 running side by side and reporting to a controller c]
43
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
P1   X   X   X   1   X   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
P2   X   1   X   X   1   1   X   1
44
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
P1   X   X   X   1   X   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
P2   X   1   X   X   1   1   X   1
P12  X   2   X   2   2   2   2   2
Total computation effort consumed:
2+2+2+2+2+2+2+2 = 16
Number of successes: 6
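The combined row P12 can be reproduced with a small sketch, assuming that P1 evaluates only x1, P2 evaluates only x3, and each instance costs two evaluations because both procedures take one step. These roles are read off the P1 and P2 rows; the helper names are introduced here.

```python
X1 = [0, 0, 0, 1, 0, 1, 1, 1]  # row evaluated by P1
X3 = [0, 1, 0, 0, 1, 1, 0, 1]  # row evaluated by P2

def single(row):
    """One procedure, one step: success exactly where the row holds a 1."""
    return [1 if v == 1 else "X" for v in row]

def parallel(row_a, row_b):
    """Two procedures step together; the controller sees both results."""
    return [2 if max(a, b) == 1 else "X" for a, b in zip(row_a, row_b)]

p1, p2, p12 = single(X1), single(X3), parallel(X1, X3)
total_effort = 2 * len(p12)
successes = sum(v != "X" for v in p12)
```

The parallel search succeeds wherever either procedure succeeds, which is the maximum of the two individual success indicators.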
45
$\sum_{k=0}^{n} \sum_{f \in S_k} \max\{\delta_{f;P_1}, \delta_{f;P_2}\} = \sum_{k=0}^{n} \sum_{f \in S_k} \delta_{f;P_1 P_2}$
n is the number of solutions.
$S_k$ is the set of binary strings with exactly k 1s.
$\delta_{f;P_1}$ is 1 if the problem instance f has outcome 1 for at least one design in the P1 designs (decided by neighborhood search). $\delta_{f;P_2}$ is defined similarly.
46
Hybrid
[Figure: the binary F-matrix with rows x1, x2, ..., x|X| and columns f1, f2, ..., f|F|]
Simple search strategies can also be combined.
47
$\sum_{k=0}^{n} \sum_{f \in S_k} \max\{\delta_{f;\tilde{P}_1}, \delta_{f;\tilde{P}_2}\} = \sum_{k=0}^{n} \sum_{f \in S_k} \delta_{f;\tilde{P}_1 \tilde{P}_2}$
n is the number of solutions.
$S_k$ is the set of binary strings with exactly k 1s.
$\delta_{f;\tilde{P}_1}$ is 1 if the problem instance f has outcome 1 for at least one design in the randomly picked P1 designs.
$\delta_{f;\tilde{P}_2}$ is defined similarly.
48
Hill climbing
The purpose of hill climbing is to find the maximum outcome of a given instance by searching in an increasing direction. If it finds the maximum, we say it makes a hit.
[Figure: the F-matrix with rows x1, x2, ..., x|X|, columns f1, f2, ..., f|F|, and entries from {y1, y2, ..., y|Y|}]
49
      f
x1    0
x2    1
x3    2
x4    3
x5    2
x6    4
…
xn    0
50
     f1  f2  f3  f4  f5
x1   0   1   0   1   2
x2   1   0   1   2   0
x3   0   1   2   0   1
x4   1   2   0   1   0
x5   2   0   1   0   1
HC   3   2   1   X   X
NS   5   4   3   2   1
51
Backtracking
The purpose of backtracking is to return to an earlier point in the search history and pick a different search direction, so that the algorithm can traverse the whole solution space.
52
      f
x1    0
x2    1
x3    2
x4    3
x5    2
x6    0
…
xn    4
53
     f1  f2  f3  f4  f5
x1   0   1   0   1   2
x2   1   0   1   2   0
x3   0   1   2   0   1
x4   1   2   0   1   0
x5   2   0   1   0   1
HC+BT  3   2   1   4   5
NS     5   4   3   2   1
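The step counts in the 5 x 5 example can be reproduced under a few assumptions that are inferred from the numbers rather than stated on the slides: hill climbing starts at x3 and moves toward x5 while the outcome strictly increases; backtracking is modeled as the visiting order x3, x4, x5, x2, x1; neighborhood search visits x1 through x5 in order. A hit is counted at the step where the maximum outcome is first visited.

```python
# Rows x1..x5 of the example; COLUMNS[j] lists instance f(j+1) over x1..x5.
ROWS = [
    [0, 1, 0, 1, 2],
    [1, 0, 1, 2, 0],
    [0, 1, 2, 0, 1],
    [1, 2, 0, 1, 0],
    [2, 0, 1, 0, 1],
]
COLUMNS = [list(col) for col in zip(*ROWS)]

def hill_climb(column, start=2):
    """Climb rightward from `start`; return steps to the maximum or None."""
    best = max(column)  # used only to decide whether the run counts as a hit
    pos, steps = start, 1
    while column[pos] != best:
        nxt = pos + 1
        if nxt >= len(column) or column[nxt] <= column[pos]:
            return None  # stuck on a local maximum: a miss
        pos, steps = nxt, steps + 1
    return steps

def first_hit(column, order):
    """Steps until a fixed visiting order first reaches the maximum."""
    best = max(column)
    for step, i in enumerate(order, start=1):
        if column[i] == best:
            return step
    return None

hc = [hill_climb(c) for c in COLUMNS]
hc_bt = [first_hit(c, [2, 3, 4, 1, 0]) for c in COLUMNS]
ns = [first_hit(c, [0, 1, 2, 3, 4]) for c in COLUMNS]
```

Hill climbing alone misses f4 and f5, while hill climbing with backtracking and neighborhood search both always hit and spend the same total effort of 15 steps on these five instances.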
54
Types of search problems
• Easy
• Hard
55
Number of difficult instances
Easy problem instances: the number of good outcomes k in the instance is large enough:
$k/n > \alpha$,
where $\alpha$ is a threshold level such as 1%.
Easy instances can be solved by random search. The total number of easy instances is $\sum_{k > \alpha n} |S_k|$.
The number of difficult instances is $\sum_{k \le \alpha n} |S_k|$.
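Since the set of binary strings with exactly k 1s has size C(n, k), both counts can be computed directly. In this sketch the threshold is written as `alpha` (a name assumed here) and passed as an exact fraction to avoid floating-point edge cases at the boundary.

```python
from fractions import Fraction
from math import comb

def count_instances(n, alpha):
    """Return (easy, difficult) instance counts for threshold alpha."""
    difficult = sum(comb(n, k) for k in range(n + 1) if k <= alpha * n)
    easy = 2 ** n - difficult
    return easy, difficult

easy, difficult = count_instances(100, Fraction(1, 100))
```

For n = 100 and a 1% threshold only 101 of the 2^100 instances are difficult, yet those few are the ones random search cannot be expected to solve.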
56
Example
• Difficult problem: a problem that includes some difficult instances
Boolean satisfiability problem (SAT): is there an x such that f(x) = 1?
     f1  f2  f3  f4  f5  f6  f7  f8
x1   0   0   0   1   0   1   1   1
x2   0   0   1   0   1   0   1   1
x3   0   1   0   0   1   1   0   1
57
Find a needle in a Haystack problem
Guess an N-bit password.
     f1  f2  …   f8
x1   0   0       1
x2   0   0       0
x3   0   0       0
x4   0   0       0
x5   0   0       0
x6   0   0       0
x7   0   1       0
x8   1   0       0
On average about $2^{N-1}$ search steps are needed!
(N = 3 example)
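A quick check of the average, assuming the password is equally likely to be any of the 2^N candidates and the candidates are tried in a fixed order without repetition. The function name `average_steps` is introduced for illustration.

```python
def average_steps(n_bits):
    """Expected number of guesses until the single correct password is found."""
    n = 2 ** n_bits
    # The needle sits at position 1..n with equal probability 1/n.
    return sum(range(1, n + 1)) / n

avg_3_bits = average_steps(3)
```

The exact expectation is (2^N + 1) / 2, which is 4.5 for N = 3 and is dominated by the 2^(N-1) term as N grows.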
58
Summary
• Neighborhood search, random guess, parallel search, hybrid search, and hill climbing plus backtracking are equivalent when no problem information is available.
• In other words, there is no universal search strategy: the No Free Lunch Theorem [Wolpert97].
• Problem specific knowledge discovery should be honored.
59
Outline
• Optimization
• Modeling strategies
• General search strategies
• General design strategies
• Complexity in behavior of dynamic systems
60
Design Problem
• Obtain engineering systems with
  – High performance
  – Robustness
  – Safety
  – High level of security
  – Low cost
61
Design problem
[Figure: the binary F-matrix with rows x1, x2, ..., x|X| and columns f1, f2, ..., f|F|; a subset of the columns is marked "Possible columns"]
Find a specific row so that outcomes for all possible columns (called planned columns) are acceptable.
62
General design strategies
• Modular design
• Hierarchical design
• Small world design
63
Some facts from F-matrix
• The determination of planned columns is based on designer’s knowledge about the problem. It may be inaccurate.
• Catastrophes are not avoidable
• Designs for complex systems are robust yet fragile.
64
[Figure: the binary F-matrix with rows x1, x2, ..., x|X| and columns f1, f2, ..., f|F|; a subset of the columns is marked "Planned columns"]
When an unplanned column occurs, the design may not give good outcomes.
65
Example
• The design of airbags: although airbags generally increase safety, they may kill children, which is not intended. This is an unexpected situation that arises when the design is implemented.
66
Gamma matrix [Ho04]
• All designs, when encoded as l-bit strings, can form a Gamma matrix
[Figure: the Gamma matrix. Rows are the bits b1, b2, ..., bl; columns are the designs x1, x2, ..., x|X|; each entry is 0 or 1.]
67
Modular design
[Figure: two modules, M1 and M2]
68
Modularity in solution
     x1  x2  x3  x4  x5  x6  x7  x8
b1   0   0   0   0   1   1   1   1
b2   0   0   1   1   0   0   1   1
b3   0   1   0   1   0   1   0   1

     x1  x2  x3  x4  x5  x6  x7  x8
f1   0   1   1   1   1   2   2   2

s1 = 1 if b1 = 1
s23 = 1 if b2 = 1 or b3 = 1
f = s1 + s23
69
     x1  x2  x3  x4  x5  x6  x7  x8
b1   0   0   0   0   1   1   1   1
b2   0   0   1   1   0   0   1   1
b3   0   1   0   1   0   1   0   1
[Figure: the Gamma matrix shown as the b1 block + the (b2, b3) block]
70
Case of module failure
     f0  f1  f2
x1   0   0   0
x2   1   1   0
x3   1   1   0
x4   1   1   0
x5   0   1   1
x6   1   2   1
x7   1   2   1
x8   1   2   1
f0: error in b1; f2: error in b2b3

     x1  x2  x3  x4  x5  x6  x7  x8
b1   0   0   0   0   1   1   1   1
b2   0   0   1   1   0   0   1   1
b3   0   1   0   1   0   1   0   1
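The modular example can be reproduced from the rule f = s1 + s23. In this sketch a module failure is modeled as forcing that module's contribution to 0; the function name and the `fail_*` flags are introduced for illustration.

```python
from itertools import product

def modular_f(b1, b2, b3, fail_m1=False, fail_m23=False):
    """f = s1 + s23, with a failed module contributing 0."""
    s1 = 0 if fail_m1 else int(b1 == 1)
    s23 = 0 if fail_m23 else int(b2 == 1 or b3 == 1)
    return s1 + s23

# Designs x1..x8 as (b1, b2, b3), in the column order of the Gamma matrix.
designs = list(product([0, 1], repeat=3))

f1 = [modular_f(*d) for d in designs]                 # no failure
f0 = [modular_f(*d, fail_m1=True) for d in designs]   # error in b1
f2 = [modular_f(*d, fail_m23=True) for d in designs]  # error in b2b3
```

Because the two modules are independent, a failure in one lowers the outcome by at most that module's own contribution.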
71
When the designs of two modules are coupled
     x1  x2  x3  x4  x5  x6  x7  x8
b1   0   0   0   0   1   1   1   1
b2   0   0   1   1   0   0   1   1
b3   0   1   0   1   0   1   0   1

     x1  x2  x3  x4  x5  x6  x7  x8
f1   0   2   2   2   1   1   1   1

s1 = 1 for x2, x3, x4, x5
s23 = 1 for x2, x3, x4, x6, x7, x8
f = s1 + s23
72
Case of module failure for coupled design
     f0  f1  f2
x1       0
x2   1   2   0
x3   1   2   0
x4   1   2   0
x5       1
x6       1
x7       1
x8       1
f0: error in b1; f2: error in b2b3

     x1  x2  x3  x4  x5  x6  x7  x8
b1   0   0   0   0   1   1   1   1
b2   0   0   1   1   0   0   1   1
b3   0   1   0   1   0   1   0   1
73
Hierarchical design
• The system is organized in a tree structure.
[Figure: a tree with M2 at the top and M1, M3 below it]
The high-level modules depend on the low-level modules, but the low-level modules do not depend on the high-level modules. The system functions well only when the high-level module works well.
74
     x1  x2  x3  x4  x5  x6  x7  x8
b1   0   0   0   0   1   1   1   1
b2   0   0   1   1   0   0   1   1
b3   0   1   0   1   0   1   0   1

s1 = 1 for b1 = 1
s2 = 1 for b2 = 1
s3 = 1 for b3 = 1
f = s2 (s1 + s3)

     x1  x2  x3  x4  x5  x6  x7  x8
f1   0   0   0   1   0   0   1   2

The system requires that the high-level module M2 and at least one low-level module (M1 or M3) work.
75
Case of module failure for hierarchical design
     f0  f1  f2
x1       0
x2       0
x3       0
x4       1
x5       0
x6       0
x7       1
x8   1   2   0
f0: error in b1 or b3; f2: error in b2

     x1  x2  x3  x4  x5  x6  x7  x8
b1   0   0   0   0   1   1   1   1
b2   0   0   1   1   0   0   1   1
b3   0   1   0   1   0   1   0   1
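The hierarchical rule f = s2 (s1 + s3) and its failure cases for design x8 can be checked with a short sketch. A failed module is modeled as contributing 0; the function name and the `failed` argument are assumptions made for illustration.

```python
from itertools import product

def hierarchical_f(b1, b2, b3, failed=()):
    """f = s2 * (s1 + s3); modules named in `failed` contribute 0."""
    s1 = 0 if "b1" in failed else b1
    s2 = 0 if "b2" in failed else b2
    s3 = 0 if "b3" in failed else b3
    return s2 * (s1 + s3)

designs = list(product([0, 1], repeat=3))  # x1..x8 as (b1, b2, b3)
nominal = [hierarchical_f(*d) for d in designs]

x8 = (1, 1, 1)
low_level_failure = hierarchical_f(*x8, failed=("b1",))
high_level_failure = hierarchical_f(*x8, failed=("b2",))
```

Losing a low-level module halves the outcome of x8, while losing the high-level module M2 drops it to 0.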
76
Small world design
• The system is organized in an asymmetric flat pattern. Some modules (head modules) contribute more than other modules.
[Figure: three modules, M1, M2, and M3, in a flat arrangement]
77
A small world design
     x1  x2  x3  x4  x5  x6  x7  x8
b1   0   0   0   0   1   1   1   1
b2   0   0   1   1   0   0   1   1
b3   0   1   0   1   0   1   0   1

     x1  x2  x3  x4  x5  x6  x7  x8
f1   0   1   2   3   1   2   3   4

s1 = 1 for b1 = 1
s2 = 1 for b2 = 1
s3 = 1 for b3 = 1
f = s1 + 2s2 + s3
78
A small world design
     f0  f1  f2
x1       0
x2       1
x3       2
x4       3
x5       1
x6       2
x7       3
x8   3   4   2
f0: error in b1 or b3; f2: error in b2

     x1  x2  x3  x4  x5  x6  x7  x8
b1   0   0   0   0   1   1   1   1
b2   0   0   1   1   0   0   1   1
b3   0   1   0   1   0   1   0   1
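The small world rule f = s1 + 2 s2 + s3 can be checked the same way, with a failed module again modeled as contributing 0. The function name and the `failed` argument are assumptions made for illustration.

```python
from itertools import product

def small_world_f(b1, b2, b3, failed=()):
    """f = s1 + 2*s2 + s3; the head module M2 carries double weight."""
    s1 = 0 if "b1" in failed else b1
    s2 = 0 if "b2" in failed else b2
    s3 = 0 if "b3" in failed else b3
    return s1 + 2 * s2 + s3

designs = list(product([0, 1], repeat=3))  # x1..x8 as (b1, b2, b3)
nominal = [small_world_f(*d) for d in designs]

x8 = (1, 1, 1)
ordinary_failure = small_world_f(*x8, failed=("b1",))
head_failure = small_world_f(*x8, failed=("b2",))
```

For x8 a failure of an ordinary module costs one unit, whereas a failure of the head module costs two, so the design degrades gracefully under random failures but is more sensitive to the loss of its head.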
79
Example
• This is consistent with what happens in a small world network.
For example, when a famous website like Google is hit by a DoS attack, the average number of hops needed to find a web page on the internet increases substantially. But if a DoS attack hits only a hospital website, we generally will not notice much change when searching the web.
80
Summary
• The benefits of distributed design are:
It helps achieve robustness, so that a design degrades gradually in performance when random failures happen.
It helps to implement the system when central control is too expensive to design or implement.
81
Finding good design for problems with modularity
• Hill climbing or neighborhood search can be used to improve solution quality incrementally. How to divide the system into a proper set of modules, and which module to design first, is a search problem that requires trial and error.
82
Outline
• Optimization
• Modeling strategies
• General search strategies
• General design strategies
• Complexity in behavior of dynamic systems
83
Complexity in behavior of dynamic systems
• Reachability problem
• Attraction points
84
Some facts
• The halting problem for Turing machines is undecidable.
• Reachability of given state in a discrete event simulation model is NP-hard [Jacobson99].
85
Some facts
Attraction point problem of Dynamic Boolean Networks (DBN) is NP-hard [Zhao03].
86
References
[Ho92] YC Ho, RS Sreenivas and P Vakili, Ordinal Optimization of Discrete Event Dynamic Systems, Journal of Discrete Event Dynamic Systems, Vol. 2, pp. 61-68, 1992.
[Ho02] YC Ho and DL Pepyne, Simple Explanation of the No-Free-Lunch Theorem and Its Implications, JOTA, Vol. 115, No. 3, 2002.
[Ho03] YC Ho, QC Zhao and DL Pepyne, The No Free Lunch Theorem, Complexity, and Computer Security, IEEE Trans. Automat. Contr., 48(5), 783-793, 2003.
[Ho04] YC Ho and DL Pepyne, Conceptual Framework for Optimization and Distributed Intelligence, submitted to CDC04.
[Jacobson99] SH Jacobson, On the complexity of verifying structural properties of discrete event simulation models, Operations Research, 47(3), 476-481, 1999.
[Sun03] K Sun, DZ Zheng and Q Lu, Splitting Strategies for Islanding Operation of Large-Scale Power Systems Using OBDD-Based Methods, IEEE Transactions on Power Systems, 18(2), 912-923, 2003.
87
References
[Watts98] DJ Watts and SH Strogatz, Collective dynamics of ‘small-world’ networks, Nature 393, 1998, pp. 440-442.
[Wolpert97] Wolpert, D.H. and W.G. Macready, No Free Lunch Theorems for Optimization, IEEE TEC, Vol. 1, No. 1, April 1997.
[Zhao03a] QC Zhao, K Sun, DZ Zheng, J Ma and Q Lu, A Study of System Splitting Strategies for Island Operation of Power System: A Two-phase Method Based on OBDDs, IEEE Transactions on Power Systems, 18(4), 1556-1565, 2003.
[Zhao03b] QC Zhao, Inseparability of min-max systems is co-NP hard, Chinese Control Conference, 454-458, 2003.
[Zhao04] QC Zhao, YC Ho and QS Jia, Vector Ordinal Optimization, Journal of Optimization Theory and Applications, to be published.
88
Thanks!