DSM Workshop

Upload: peter-matthews

Post on 06-Apr-2018


8/3/2019 Dsm Workshop

    1 Introduction

This paper explores the application of machine learning techniques to create Design Structure Matrices (DSMs). This provides a means of creating DSMs with a reduced need for recourse to domain experts, thereby reducing both the time and cost of generating DSMs. By drawing out the similarities between the DSM and the structure used by a Bayesian Belief Net (BBN), it is argued that a BBN is a general form of DSM. Once this link has been established, it becomes possible to use machine learning tools developed for inducing BBNs from previous data to provide similar functionality for DSMs.

A review of the DSM in Section 2 provides a working definition for this paper. Bayesian Belief Nets are described in Section 3, including a brief overview of the machine learning algorithm used to induce a BBN from data. The link between DSM and BBN is then drawn in Section 4, and this is illustrated in Section 5 using the Pittsburgh Bridge design database. Finally, Section 6 provides an overview of the work needed to specialise the machine learning approach for inducing DSMs from previous product data.

    2 Design Structure Matrix Review

The Design Structure Matrix (DSM) provides a means for guiding a designer through the design process of a particular product (Steward, 1981). This is achieved by representing the causal relationships between aspects of the design in a matrix: the set of tasks to be completed is listed along both axes, and where task i has a causal impact on task j, a mark is entered into the (i, j)th entry of the matrix.

Given a DSM, it is possible to rearrange the task order (i.e. the rows and columns) so that the matrix attains either an upper triangular form or a block diagonal form. If the matrix can be transformed into an upper triangular shape, this provides a linear order in which to perform the tasks. As the matrix is upper triangular, the task on any line will not impact any previous tasks, and hence the top row represents the first task to be undertaken. If the matrix can be arranged as a series of blocks along the diagonal, each block represents an independent set of tasks. Not only can these blocks be tackled in any order, but they can also be performed asynchronously, as there will be no impact on the other blocks.
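Finding the upper triangular arrangement amounts to a topological sort of the task graph. As a minimal sketch (the paper itself works in Matlab; this Python function and its names are illustrative only), a DSM can be held as a 0/1 matrix where entry (i, j) marks task i impacting task j, and a sequential order recovered when one exists:

```python
def sequence_tasks(dsm, tasks):
    """Reorder tasks so that the DSM becomes upper triangular.

    dsm[i][j] == 1 means task i has a causal impact on task j.
    Raises ValueError when a cycle (coupled block) prevents a
    purely sequential ordering.
    """
    n = len(tasks)
    # For each task j, count how many other tasks impact it.
    indegree = [sum(dsm[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    order = []
    while ready:
        i = ready.pop(0)
        order.append(i)
        # Completing task i releases every task it impacts.
        for j in range(n):
            if dsm[i][j]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    if len(order) < n:
        raise ValueError("cyclic dependencies: no sequential order exists")
    return [tasks[i] for i in order]

# Task A impacts B and C; B impacts C: the only valid sequence is A, B, C.
dsm = [[0, 0, 1], [1, 0, 1], [0, 0, 0]]
print(sequence_tasks(dsm, ["B", "A", "C"]))  # ['A', 'B', 'C']
```

A coupled block leaves some tasks with a non-zero count, which is exactly the case where the DSM must instead be arranged in block diagonal form.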

In reality, design matrices cannot usually be arranged in such neat formats. It will often be the case that these formats can only be approximately achieved, with a small number of stray entries. Provided there are not too many, these strays represent weak links to the remainder of the design. As these are known, it is possible to identify where special attention is required.

    2.1 Issues

There are two main issues with generating a DSM: identifying the tasks and populating the matrix. It is important to identify the correct set of tasks: too few and the DSM will serve little purpose; too many and the matrix becomes unwieldy or wastes effort in later identifying redundant tasks. This issue is related to the matrix population issue: if there are N tasks, there are N² cells in the matrix to be considered. This represents a significant problem where the design domain contains many tasks that need to be included in the DSM.

Using machine learning helps towards solving the second of these issues. The next section describes an algorithm that solves a similar problem in creating Bayesian Belief Networks. The first does remain a problem; however, it is now cost effective to collect a larger set of tasks and post-process the machine learning algorithm's results to identify redundant tasks.

    3 Bayesian Belief Networks

Bayesian, or conditional, probability provides a measure of how likely an event will be given that some other event has occurred. When this probability differs from the event's prior probability, the two events are said to be conditionally linked.

    Bayesian Belief Networks (BBNs) are a graphical means for representing the conditional

    links between a set of stochastically observed variables. The graphs are directed acyclic

graphs (DAGs) where each node is a stochastic variable and the edges represent the existence of a Bayesian link. These graphs can be either constructed manually or induced

    computationally. Figure 1 is an example of such a graph.

    3.1 Bayesian Probability

The Bayesian, or conditional, probability of an event A happening given that B has occurred is mathematically defined as follows:

P(A|B) = P(A ∩ B) / P(B)    (1)
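Definition (1) can be illustrated with a frequency estimate over joint observations. The function and the trial data below are invented for illustration and are not part of the paper:

```python
def conditional_probability(observations, a, b):
    """Estimate P(a | b) via equation (1): among the observations
    containing b, the fraction that also contain a."""
    n_b = sum(1 for obs in observations if b in obs)
    n_ab = sum(1 for obs in observations if a in obs and b in obs)
    if n_b == 0:
        raise ValueError("b never occurs, so P(a | b) is undefined")
    return n_ab / n_b

# Eight joint observations of events "A" and "B":
trials = [{"A", "B"}, {"A", "B"}, {"B"}, {"B"},
          {"A"}, set(), set(), set()]
# P(A ∩ B) = 2/8 and P(B) = 4/8, so P(A | B) = 2/4 = 0.5,
# while the prior P(A) = 3/8 — a difference suggesting a conditional link.
print(conditional_probability(trials, "A", "B"))  # 0.5
```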


Figure 1: Example of a BBN: the flow of the network is from left to right.

This effectively restricts the considered outcomes to those where both A and B have occurred, and normalises with respect to how frequently B occurs. Where there is a significant difference between the conditional probability and the prior probability P(A), it can be inferred that there is a conditional link between the two.

    3.2 Applying Bayes to Networks

This simple concept becomes a powerful tool when combined with a network of conditional probability values (or, potentially, probability distribution functions). The most common case is where only the probabilities of the terminal nodes of the DAG are known, along with the conditional probabilities. The aim is to compute the probability of a downstream node A:

P(A) = Σ_B P(A ∩ B)    (2)
     = Σ_B P(A|B) P(B)    (3)

    This equation sums over all possible input events, and uses the known conditional

    probabilities to determine the probability of the node A. Where the domain is large, this

    will represent a large computational saving, as only the parent nodes need to be considered.

    For example, in Figure 1, the parent nodes of A would be considered, which will need to

    recursively consider the parents of B. However, at each stage this is independent of the

    rest of the network.
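The marginalisation in equations (2)-(3) can be sketched directly. The two-state parent B and its probability tables below are hypothetical values chosen only to make the sum concrete:

```python
def marginal(p_a_given_b, p_b):
    """Equation (3): P(A) = sum over states b of P(A | B=b) * P(B=b)."""
    return sum(p_a_given_b[b] * p_b[b] for b in p_b)

# Hypothetical parent node B with two states:
p_b = {"high": 0.3, "low": 0.7}
p_a_given_b = {"high": 0.9, "low": 0.2}

# P(A) = 0.9 * 0.3 + 0.2 * 0.7 = 0.41
print(marginal(p_a_given_b, p_b))
```

Only B's distribution and the conditional table are needed, which is the computational saving noted above: the rest of the network never enters the sum.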


    3.3 Learning BBNs

    Learning can be divided into two main types: parameter learning and structure learning.

    In parameter learning, the model is given and the aim is to fit this model to a dataset.

    Structure learning involves identifying how the model variables are related to each other.

    It is this second type of learning that is performed for learning BBNs.

    Structure learning requires searching the graph space spanned by the variables (nodes).

    The number of possible graphs (i.e. edges connecting these nodes) explodes as the number

    of nodes increases. This represents an important computational challenge, and there are

several research groups dedicated to this area. For the purposes of this paper, only a brief overview of the adopted method is given.

The Bayes Net Toolbox (Murphy, 2001) provides a number of structure learning algorithms. The algorithm selected for this paper was the Markov chain Monte Carlo (MCMC) search. This algorithm has the advantage of requiring the least amount of prior information (the other algorithms require a node ordering or knowledge of the conditional dependencies). The algorithm uses a greedy search heuristic, starting with an empty graph. The search proceeds iteratively, at each step searching the neighbourhood of the previous graph by examining a sample of graphs that differ from it by one edge (see Figure 2). The best graph is selected based on how likely it is that it would generate the given dataset.
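The one-edge neighbourhood described above can be sketched as follows. This is a simplified illustration, not the BNT implementation: the scoring step that ranks candidates by dataset likelihood is omitted and would be supplied by some score(graph, data) function.

```python
def is_dag(nodes, edges):
    """True if the directed edge set over `nodes` is acyclic (Kahn's algorithm)."""
    indeg = {n: 0 for n in nodes}
    for _, v in edges:
        indeg[v] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while ready:
        u = ready.pop()
        seen += 1
        for x, v in edges:
            if x == u:
                indeg[v] -= 1
                if indeg[v] == 0:
                    ready.append(v)
    return seen == len(nodes)

def neighbourhood(nodes, edges):
    """All graphs differing from `edges` by exactly one added or
    removed directed edge, keeping only those that remain DAGs."""
    result = []
    for u in nodes:
        for v in nodes:
            if u == v:
                continue
            e = (u, v)
            candidate = edges - {e} if e in edges else edges | {e}
            if is_dag(nodes, candidate):
                result.append(candidate)
    return result

# From the empty two-node graph, both single-edge DAGs are neighbours;
# from A->B, only edge removal survives the acyclicity check.
print(neighbourhood(["A", "B"], set()))
print(neighbourhood(["A", "B"], {("A", "B")}))
```

A greedy step then simply picks max(neighbourhood(nodes, g), key=score) and repeats for a preset number of iterations.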

After a preset number of iterations, the search terminates. This provides a BBN that represents a reasonable approximation to the true conditional relationships of the observed domain. There are a number of limitations. First, the graph will be a DAG even if the true model had a cyclic set of nodes. Second, the search method tends to generate fully connected graphs even in cases where the domain would be better, and more simply, represented by multiple independent graphs.

    4 Linking DSM and BBN

It is a trivial effort to link the representations of DSMs and BBNs. To transform a DSM to a graph structure, use graph nodes to represent the design parameters. For each entry (i, j) in the DSM, attach a directed edge from node i to node j in the graph. Conversely, to transform a BBN into a DSM, first list all the nodes as the column/row labels. Then, for each row, mark the cells corresponding to the nodes that node points to.
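Both directions of the transformation fit in a few lines. As a sketch (function names and the list-of-lists matrix layout are choices made here, not the paper's):

```python
def bbn_to_dsm(nodes, edges):
    """Build a DSM from a BBN edge set: dsm[i][j] = 1 iff there
    is a directed edge from nodes[i] to nodes[j]."""
    idx = {n: k for k, n in enumerate(nodes)}
    dsm = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        dsm[idx[u]][idx[v]] = 1
    return dsm

def dsm_to_bbn(nodes, dsm):
    """The inverse transformation: recover the directed edge set
    from the marked cells of the DSM."""
    return {(nodes[i], nodes[j])
            for i in range(len(nodes))
            for j in range(len(nodes)) if dsm[i][j]}

nodes = ["B", "A"]
print(bbn_to_dsm(nodes, {("B", "A")}))  # [[0, 1], [0, 0]]
```

The two functions are exact inverses, which is what justifies treating the BBN as a general form of the DSM.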

The principal interest in this paper is the transformation from BBN to DSM. However,

Figure 2: Searching the space of potential BBNs.

it must be noted that as BBNs are acyclic, the resulting DSM will not contain any cyclic dependencies. This will result in a matrix that can always be arranged as a triangular matrix, even if in reality the true underlying model should have a cycle. Further, with the MCMC BBN search algorithm used, it is unlikely that truly independent sets of variables will be identified. However, even with these two shortcomings, this approach does provide a rapid means for generating an initial DSM.

    5 Illustration

To illustrate the transformation of product data into a DSM, the Bridges of Pittsburgh dataset will be used (Blake and Merz, 1998). This database was compiled by Reich et al. (1994) for testing and illustrating various machine learning techniques, and has the advantage of being publicly available. The process of analysing this dataset required five steps: data acquisition; data cleaning and formatting; passing through the BNT learning algorithm; visualising the BBN; and transforming into a DSM.

    The data was downloaded from the UCI Machine Learning data repository (Blake and

    Merz, 1998) and took the following format: label, river, erected, purpose, length,

    lanes, clear-g, t-or-d, material, span, rel-l, type. The first variable, label,

    was redundant and hence deleted. A number of bridges contained missing observations,


Figure 3: Left: BBN as generated by the BNT; Right: the BBN translated as a DSM. Labels are: (1) river, (2) erected, (3) purpose, (4) length, (5) lanes, (6) clear-g, (7) t-or-d, (8) material, (9) span, (10) rel-l, (11) type.

    which the BNT is not capable of handling and so these entries were also deleted. This left

    a total of 70 bridges. The entries in each column were then discretised into integers. This

    formed a matrix which was then loaded into Matlab. This matrix was then passed as the

    set of observations to the BNT MCMC learning algorithm. On completion, the learning

    algorithm returns a DAG which is passed to a graph visualisation function. These results

    are presented in Figure 3.
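The cleaning and discretisation steps above (performed in Matlab for the paper) can be sketched in Python. The '?' missing-value marker follows the UCI convention; the function name and the integer-coding scheme are assumptions made for illustration:

```python
def prepare_observations(rows):
    """Clean raw bridge records for structure learning: drop the
    identifier column, remove records with missing values (marked
    '?' in the UCI data), and discretise each remaining column
    into integer codes starting at 1."""
    rows = [r[1:] for r in rows]              # drop the `label` column
    rows = [r for r in rows if "?" not in r]  # drop incomplete bridges
    n_cols = len(rows[0])
    codes = [{} for _ in range(n_cols)]       # per-column value -> integer map
    data = []
    for r in rows:
        data.append([codes[c].setdefault(r[c], len(codes[c]) + 1)
                     for c in range(n_cols)])
    return data

# Three toy records: the second is dropped for its missing value.
rows = [["E1", "M", "1818"], ["E2", "A", "?"], ["E3", "M", "1819"]]
print(prepare_observations(rows))  # [[1, 1], [1, 2]]
```

The resulting integer matrix is the form of observation matrix the BNT MCMC learner expects.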

    From the graph, a DSM is generated. The following can be inferred from this matrix,

    amongst other heuristics:

1. River type (node 1, a final node) determines what construction method to use (T-or-D, node 7) and the Span (node 9).

2. The material (node 8) is effectively the last design variable that would be determined here.

It is interesting to note that the two design variables that cannot be set by the designer (River (1) and Length (4)) are at the very end of the BBN. This is expected, as these values are to be back-propagated through the network, driving the search for the remaining design variable values.


    6 Further work

Two main shortcomings were identified: the inability to generate cyclic graphs and the difficulty of identifying independent sets of variables. Addressing these will require modifying the graph search heuristics.

This work explored the potential of using a widely available machine learning algorithm to analyse a product database and generate a DSM for the given product family. Matthews and Lowe (2003) report similar work, providing a rapid probabilistic change propagation design search tool. While preliminary results appear satisfactory, it is vital to apply these methods to more domains for verification purposes.

    References

Blake, C. L. and Merz, C. J. (1998). UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html.

Matthews, P. C. and Lowe, D. R. (2003). Inducing change propagation models using previous designs, in A. Folkeson, K. Gralen, M. Norell and U. Sellgren (eds), Proceedings of the 14th International Conference on Engineering Design, Design Society, Stockholm.

Murphy, K. P. (2001). The Bayes Net Toolbox for Matlab, in E. J. Wegman, A. Braverman, A. Goodman and P. Smyth (eds), Computing Science and Statistics, Vol. 33, Interface Foundation of North America, pp. 331–351.

Reich, Y., Fenves, S. J. and Subrahmanian, E. (1994). Flexible extraction of practical knowledge from bridge databases, Proceedings of the First Congress on Computing in Civil Engineering, Washington, DC, pp. 1014–1021.

Steward, D. V. (1981). The Design Structure System: A method for managing the design of complex systems, IEEE Transactions on Engineering Management 28: 71–74.