Decision Support Systems
Artificial Neural Networks for Data Mining
Modified from Decision Support Systems and Business Intelligence Systems 9E.

Learning Objectives
· Understand the concept and definitions of artificial neural networks (ANN)
· Learn the different types of neural network architectures
· Learn the advantages and limitations of ANN
· Understand how backpropagation learning works in feedforward neural networks
· Understand the step-by-step process of how to use neural networks
· Appreciate the wide variety of applications of neural networks and the problem types they solve

Opening Vignette:
“Predicting Gambling Referenda with Neural Networks”
· Decision situation
· Proposed solution
· Results
· Answer and discuss the case questions

Neural Network Concepts
· Neural networks (NN): a brain metaphor for information processing
· Neural computing
· Artificial neural network (ANN)
· Many uses for ANN: pattern recognition, forecasting, prediction, and classification
· Many application areas: finance, marketing, manufacturing, and so on

Biological Neural Networks
[Figure: Two interconnected brain cells (neurons), with the soma, axon, synapse, and dendrites labeled]

Processing Information in ANN
[Figure: A single neuron (processing element, PE) with inputs and outputs. Inputs x1, x2, ..., xn carry weights w1, w2, ..., wn into the neuron (or PE), which applies a summation followed by a transfer function to produce output Y, passed on as Y1, Y2, ..., Yn.]
Summation: $S = \sum_{i=1}^{n} X_i W_i$
Transfer function: $f(S)$

Elements of ANN
· Processing element (PE)
· Network architecture
  · Hidden layers
  · Parallel processing
· Network information processing
  · Inputs
  · Outputs
  · Connection weights
  · Summation function

Elements of ANN
[Figure: Neural network with one hidden layer. Inputs x1, x2, x3 enter the input layer, pass through a hidden layer of PEs, and produce output Y1 at the output layer. Each PE computes a weighted sum (S) followed by a transfer function (f).]

Elements of ANN
Summation Function for a Single Neuron (a) and Several Neurons (b)
PE: Processing Element (or neuron)
(a) Single neuron: inputs x1, x2 with weights w1, w2
$Y = X_1 W_1 + X_2 W_2$
(b) Multiple neurons: inputs x1, x2 with weights w11, w21, w12, w22, w23
$Y_1 = X_1 W_{11} + X_2 W_{21}$
$Y_2 = X_1 W_{12} + X_2 W_{22}$
$Y_3 = X_2 W_{23}$
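
The summation functions for case (b) can be sketched in a few lines of Python. The input values and weights below are made-up numbers for illustration; only the wiring (x1 feeds Y1 and Y2, x2 feeds all three neurons) comes from the slide.

```python
# Weighted sums for the multiple-neuron case (b).
# Input values and weights are illustrative assumptions, not from the slides.

def weighted_sum(inputs, weights):
    """Summation function: S = sum of X_i * W_i over the connected inputs."""
    return sum(x * w for x, w in zip(inputs, weights))

x1, x2 = 1.0, 2.0  # hypothetical inputs

# Weight names follow the slide: w<input><neuron>, e.g. w21 links x2 to Y1.
w11, w21 = 0.5, 0.25
w12, w22 = -1.0, 0.75
w23 = 2.0

y1 = weighted_sum([x1, x2], [w11, w21])  # Y1 = X1*W11 + X2*W21
y2 = weighted_sum([x1, x2], [w12, w22])  # Y2 = X1*W12 + X2*W22
y3 = weighted_sum([x2], [w23])           # Y3 = X2*W23 (x1 is not connected)

print(y1, y2, y3)  # 1.0 0.5 4.0
```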

Elements of ANN
· Transformation (Transfer) Function
  · Linear function
  · Sigmoid (logical activation) function [0, 1]
  · Hyperbolic tangent function [-1, 1]
[Figure: A processing element (PE) with inputs X1 = 3, X2 = 1, X3 = 2 and weights W1 = 0.2, W2 = 0.4, W3 = 0.1]
Summation function: $Y = 3(0.2) + 1(0.4) + 2(0.1) = 1.2$
Transfer function: $Y_T = 1/(1 + e^{-1.2}) = 0.77$
Threshold value?
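
The worked example above can be checked with a short Python sketch. The inputs and weights are the ones on the slide. The 0.5 threshold at the end is only an assumed value, since the slide leaves the threshold as an open question.

```python
import math

def sigmoid(s):
    """Sigmoid transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))

inputs = [3, 1, 2]           # X1, X2, X3 from the slide
weights = [0.2, 0.4, 0.1]    # W1, W2, W3 from the slide

# Summation function: Y = 3(0.2) + 1(0.4) + 2(0.1)
y = sum(x * w for x, w in zip(inputs, weights))

# Transfer function: Y_T = 1 / (1 + e^(-Y))
y_t = sigmoid(y)

print(round(y, 2), round(y_t, 2))  # 1.2 0.77

# The hyperbolic tangent alternative maps the same sum into (-1, 1).
y_tanh = math.tanh(y)

# Hypothetical threshold: the neuron "fires" if Y_T reaches it.
THRESHOLD = 0.5  # assumed value, not given in the slides
fires = y_t >= THRESHOLD
```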

Neural Network Architectures
· Several ANN architectures exist
  · Feedforward
  · Recurrent
  · Associative memory
  · Probabilistic
  · Self-organizing feature maps
  · Hopfield networks

Neural Network Architectures
Recurrent Neural Networks

Neural Network Architectures
· Architecture of a neural network is driven by the task it is intended to address
  · Classification, regression, clustering, general optimization, association
· Most popular architecture: feedforward, multi-layered perceptron with backpropagation learning algorithm
  · Used for both classification and regression type problems

Learning in ANN
· A process by which a neural network learns the underlying relationship between inputs and outputs, or just among the inputs
· Supervised learning
  · For prediction type problems (e.g., backpropagation)
· Unsupervised learning
  · For clustering type problems (e.g., adaptive resonance theory)

A Taxonomy of ANN Learning Algorithms
Learning Algorithms
· Discrete/binary input
  · Supervised: Simple Hopfield, Outerproduct AM, Hamming Net
  · Unsupervised: ART-1, Carpenter/Grossberg
· Continuous input
  · Supervised: Delta rule, Gradient descent, Competitive learning, Neocognitron, Perceptron
  · Unsupervised: ART-3, SOFM (or SOM), Other clustering algorithms
Architectures
· Supervised
  · Recurrent: Hopfield
  · Feedforward: Nonlinear vs. linear, Backpropagation, ML perceptron, Boltzmann
· Unsupervised
  · Estimator: SOFM (or SOM)
  · Extractor: ART-1, ART-2

A Supervised Learning Process
[Figure: Flowchart. The ANN model computes an output and checks whether the desired output is achieved. If yes, learning stops; if no, the weights are adjusted and the output is computed again.]
Three-step process:
1. Compute temporary outputs
2. Compare outputs with desired targets
3. Adjust the weights and repeat the process

How a Network Learns
· Example: single neuron that learns the inclusive OR operation
· Learning parameters:
  · Learning rate
  · Momentum
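
A minimal sketch of the inclusive OR example, assuming a step transfer function and a delta-rule style weight update. The initial weights (0.1, 0.3), the threshold (0.5), and the learning rate (0.2) are illustrative choices, not values stated in the slides. Momentum is left out to keep the sketch short.

```python
# Single threshold neuron trained on inclusive OR with a delta-rule update.
# Initial weights, threshold, and learning rate are illustrative assumptions.

OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

def predict(weights, x, threshold=0.5):
    s = sum(w * xi for w, xi in zip(weights, x))  # summation function
    return 1 if s > threshold else 0              # step transfer function

def train(weights, learning_rate=0.2, max_epochs=50):
    weights = list(weights)
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, target in OR_DATA:
            error = target - predict(weights, x)  # compare with desired target
            if error != 0:
                errors += 1
                # Adjust each weight in proportion to its input and the error.
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
        if errors == 0:  # desired outputs achieved: stop learning
            return weights, epoch
    return weights, max_epochs

weights, epochs = train([0.1, 0.3])
print([predict(weights, x) for x, _ in OR_DATA])  # [0, 1, 1, 1]
```

A larger learning rate reaches the target in fewer passes here, but on harder problems it can overshoot; that trade-off is what the learning-rate parameter controls.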
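
Backpropagation Learning
[Figure: Backpropagation of error for a single neuron. Inputs x1, x2, ..., xn with weights w1, w2, ..., wn feed a neuron (or PE) that applies a summation and a transfer function to produce output Yi; the error term a(Zi - Yi) is fed back to the weights.]
Summation: $S = \sum_{i=1}^{n} X_i W_i$
Transfer function: $Y = f(S)$
Error: $a(Z_i - Y_i)$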

Backpropagation Learning
The learning algorithm procedure:
1. Initialize weights with random values and set other network parameters
2. Read in the inputs and the desired outputs
3. Compute the actual output
4. Compute the error (difference between the actual and desired output)
5. Change the weights by working backward through the hidden layers
6. Repeat steps 2-5 until weights stabilize
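
The six steps above can be sketched for a tiny network with one hidden layer. This is an illustration under stated assumptions, not the textbook's implementation: the XOR training data, the sigmoid transfer function, the network size, the learning rate, and the fixed epoch count (used in place of a "weights stabilize" check) are all choices made for the example.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def train_xor(hidden=3, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

    # Step 1: random initial weights (last entry of each row is a bias).
    w_h = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(hidden + 1)]

    def forward(x):
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
        y = sigmoid(sum(w_o[j] * h[j] for j in range(hidden)) + w_o[hidden])
        return h, y

    def mse():
        return sum((z - forward(x)[1]) ** 2 for x, z in data) / len(data)

    initial = mse()
    for _ in range(epochs):                # Step 6: repeat (fixed count here)
        for x, z in data:                  # Step 2: inputs and desired output
            h, y = forward(x)              # Step 3: actual output
            err = z - y                    # Step 4: error
            # Step 5: propagate the error backward, output layer first.
            delta_o = err * y * (1 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j])
                       for j in range(hidden)]
            for j in range(hidden):
                w_o[j] += lr * delta_o * h[j]
            w_o[hidden] += lr * delta_o
            for j in range(hidden):
                w_h[j][0] += lr * delta_h[j] * x[0]
                w_h[j][1] += lr * delta_h[j] * x[1]
                w_h[j][2] += lr * delta_h[j]
    return initial, mse(), forward

initial_mse, final_mse, forward = train_xor()
print(initial_mse, final_mse)
```

Whether the network fully learns XOR depends on the random start and the settings; the sketch only illustrates how the error drives the weight changes.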

Development Process of an ANN

An MLP ANN Structure for the Box-Office Prediction Problem
[Figure: MLP network with an input layer (27 PEs), hidden layer I (18 PEs), hidden layer II (16 PEs), and an output layer (9 PEs)]
Inputs:
· MPAA Rating (5) (G, PG, PG13, R, NR)
· Competition (3) (High, Medium, Low)
· Star Value (3) (High, Medium, Low)
· Genre (10) (Sci-Fi, Action, ... )
· Technical Effects (3) (High, Medium, Low)
· Sequel (2) (Yes, No)
· Number of Screens (Positive Integer)
Outputs:
· Class 1 - FLOP (BO < 1M)
· Class 2 (1M < BO < 10M)
· Class 3 (10M < BO < 20M)
· Class 4 (20M < BO < 40M)
· Class 5 (40M < BO < 65M)
· Class 6 (65M < BO < 100M)
· Class 7 (100M < BO < 150M)
· Class 8 (150M < BO < 200M)
· Class 9 - BLOCKBUSTER (BO > 200M)
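
One way to read the 27 input PEs is as one PE per category value plus one PE for the numeric input. The sketch below checks that count and shows how a box-office figure could be mapped to one of the nine output classes. The helper names are hypothetical, and assigning a value that falls exactly on a boundary to the higher class is an assumption, since the slide uses strict inequalities on both sides.

```python
# PEs contributed by each input (counts taken from the slide).
INPUT_WIDTHS = {
    "MPAA Rating": 5,
    "Competition": 3,
    "Star Value": 3,
    "Genre": 10,
    "Technical Effects": 3,
    "Sequel": 2,
    "Number of Screens": 1,
}

def one_hot(value, categories):
    """Encode a categorical value as one PE per category (hypothetical helper)."""
    return [1 if value == c else 0 for c in categories]

# Upper bounds, in millions, of classes 1 to 8; class 9 is everything above.
CLASS_BOUNDS_M = [1, 10, 20, 40, 65, 100, 150, 200]

def box_office_class(bo_millions):
    """Map box-office receipts (in millions) to an output class 1..9."""
    for k, upper in enumerate(CLASS_BOUNDS_M, start=1):
        if bo_millions < upper:
            return k
    return 9

print(sum(INPUT_WIDTHS.values()))                        # 27
print(one_hot("PG13", ["G", "PG", "PG13", "R", "NR"]))   # [0, 0, 1, 0, 0]
print(box_office_class(0.5), box_office_class(50), box_office_class(250))  # 1 5 9
```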

Testing a Trained ANN Model
· Data is split into three parts
  · Training (~60%)
  · Validation (~20%)
  · Testing (~20%)
· k-fold cross validation
  · Less bias
  · Time consuming
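
Both schemes can be sketched with the standard library alone. The function names, the fixed seed, and the 100-record toy dataset are assumptions for illustration; a real project would more likely use a library routine.

```python
import random

def three_way_split(records, train=0.6, valid=0.2, seed=0):
    """Shuffle, then cut into training, validation, and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

def k_fold(records, k):
    """Yield (train, test) pairs; each record lands in exactly one test fold."""
    folds = [records[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]

data = list(range(100))  # toy dataset
train_set, valid_set, test_set = three_way_split(data)
print(len(train_set), len(valid_set), len(test_set))  # 60 20 20

folds = list(k_fold(data, 5))
print(len(folds), len(folds[0][0]), len(folds[0][1]))  # 5 80 20
```

With k folds the model is trained k times, which is where the extra time goes; in exchange, every record is used for testing exactly once.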

Sensitivity Analysis on ANN Models
· A common criticism for ANN: the lack of explainability
  · The black-box syndrome!
· Answer: sensitivity analysis
  · Conducted on a trained ANN
  · The inputs are perturbed while the relative change on the output is measured/recorded
  · Results illustrate the relative importance of input variables
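
The perturb-and-measure idea can be sketched as follows. Here `toy_model` is a hypothetical stand-in for a trained ANN (any function from an input vector to a number would do), and the one-at-a-time perturbation of size 0.1 is an assumed scheme, not one prescribed by the slides.

```python
def sensitivity(model, baseline, delta=0.1):
    """Perturb one input at a time and record the relative output change."""
    base_out = model(baseline)
    changes = []
    for i in range(len(baseline)):
        perturbed = list(baseline)
        perturbed[i] += delta
        changes.append(abs(model(perturbed) - base_out))
    total = sum(changes)
    # Normalize so the scores read as relative importance.
    return [c / total for c in changes] if total else changes

def toy_model(x):
    """Hypothetical stand-in for a trained network."""
    return 3.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]

scores = sensitivity(toy_model, [1.0, 1.0, 1.0])
print([round(s, 2) for s in scores])  # [0.75, 0.25, 0.0]
```

The first input dominates the output of the stand-in model, the third has no effect, and the scores reflect that without looking inside the model.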

Sensitivity Analysis on ANN Models
[Figure: Systematically perturbed inputs are fed to a trained ANN (“the black-box”), and the change in outputs is observed]
For a good example: sensitivity analysis reveals the most important injury severity factors in traffic accidents

A Sample Neural Network Project
Bankruptcy Prediction
· A comparative analysis of ANN versus logistic regression (a statistical method)
· Inputs
  · X1: Working capital/total assets
  · X2: Retained earnings/total assets
  · X3: Earnings before interest and taxes/total assets
  · X4: Market value of equity/total debt
  · X5: Sales/total assets

A Sample Neural Network Project
Bankruptcy Prediction
· Data was obtained from Moody's Industrial Manuals
  · Time period: 1975 to 1982
  · 129 firms (65 of which went bankrupt during the period and 64 nonbankrupt)
· Different training and testing proportions are used/compared
  · 90/10 versus 80/20 versus 50/50
  · Resampling is used to create 60 data sets

A Sample Neural Network Project
Bankruptcy Prediction
[Figure: Network with inputs x1 to x5 and two outputs, BR = 1 and NBR = 1]
Network Specifics
· Feedforward MLP
· Backpropagation
· Varying learning and momentum values
· 5 input neurons (1 for each financial ratio)
· 10 hidden neurons
· 2 output neurons (1 indicating a bankrupt firm and the other indicating a nonbankrupt firm)

A Sample Neural Network Project
Bankruptcy Prediction - Results

Other Popular ANN Paradigms
Self Organizing Maps (SOM)
[Figure: SOM lattice fed by Input 1, Input 2, Input 3]
· First introduced by the Finnish Professor Teuvo Kohonen
· Applies to clustering type problems

Other Popular ANN Paradigms
Self Organizing Maps (SOM)
SOM Algorithm
1. Initialize each node's weights
2. Present a randomly selected input vector to the lattice
3. Determine the most resembling (winning) node
4. Determine the neighboring nodes
5. Adjust the winning and neighboring nodes (make them more like the input vector)
6. Repeat steps 2-5 until a stopping criterion is reached
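
A minimal sketch of these steps on a one-dimensional lattice. The lattice size, the Gaussian neighborhood, the linearly decaying learning rate and radius, the fixed step count as stopping criterion, and the toy inputs are all assumptions made for the example.

```python
import math
import random

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def train_som(inputs, n_nodes=4, steps=200, lr0=0.5, radius0=2.0, seed=0):
    rng = random.Random(seed)
    dim = len(inputs[0])
    # Step 1: initialize each node's weights at random.
    nodes = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for t in range(steps):                     # Step 6: fixed number of steps
        frac = 1.0 - t / steps
        lr = lr0 * frac                        # learning rate decays over time
        radius = max(radius0 * frac, 0.5)      # neighborhood shrinks over time
        x = rng.choice(inputs)                 # Step 2: random input vector
        # Step 3: the winning node is the one closest to the input.
        winner = min(range(n_nodes), key=lambda j: sq_dist(nodes[j], x))
        for j in range(n_nodes):
            lattice_dist = abs(j - winner)
            if lattice_dist <= radius:         # Step 4: neighbors on the lattice
                influence = math.exp(-(lattice_dist ** 2) / (2 * radius ** 2))
                # Step 5: move the node toward the input vector.
                nodes[j] = [w + lr * influence * (xi - w)
                            for w, xi in zip(nodes[j], x)]
    return nodes

# Toy inputs forming two loose groups (illustrative values).
inputs = [[0.1, 0.1], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
nodes = train_som(inputs)
print(nodes)
```

After training, each input can be assigned to its closest node, which is how the map is used for clustering.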

Other Popular ANN Paradigms
Self Organizing Maps (SOM)
· Applications of SOM
  · Customer segmentation
  · Bibliographic classification
  · Image-browsing systems
  · Medical diagnosis
  · Interpretation of seismic activity
  · Speech recognition
  · Data compression

Other Popular ANN Paradigms
Hopfield Networks
· First introduced by John Hopfield
· Highly interconnected neurons
· Applies to solving complex computational problems (e.g., optimization problems)
[Figure: Hopfield network of interconnected neurons, with input and output]

Application Types of ANN
· Classification
  · Feedforward networks (MLP), radial basis function, and probabilistic NN
· Regression
  · Feedforward networks (MLP), radial basis function
· Clustering
  · Adaptive Resonance Theory (ART) and SOM
· Association
  · Hopfield networks
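
To illustrate the association use case, here is a small sketch of a Hopfield network recalling a stored pattern from a corrupted copy. It assumes bipolar (+1/-1) states, outer-product (Hebbian) weights with no self-connections, and one-neuron-at-a-time updates; the stored pattern is made up for the example.

```python
def train_hopfield(patterns):
    """Outer-product weights with a zero diagonal (no self-connections)."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, max_sweeps=10):
    """Update neurons one at a time until the state stops changing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            s = sum(w[i][j] * state[j] for j in range(n))
            new = 1 if s >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state

stored = [1, -1, 1, -1, 1, -1]     # illustrative pattern
w = train_hopfield([stored])
noisy = [1, -1, -1, -1, 1, -1]     # third element flipped
recalled = recall(w, noisy)
print(recalled)  # [1, -1, 1, -1, 1, -1]
```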

Advantages of ANN
· Able to deal with highly nonlinear relationships
· Not restricted by normality and/or independence assumptions
· Can handle a variety of problem types
· Usually provides better results compared to its statistical counterparts
· Handles both numerical and categorical variables

Disadvantages of ANN
· They are deemed to be black-box solutions, lacking explainability
· It is hard to find optimal values for the large number of network parameters
  · Optimal design is still an art: requires expertise and extensive experimentation
· It is hard to handle a large number of variables
· Training may take a long time for large datasets, which may require case sampling