![Page 1: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/1.jpg)
From Neuroinformatics to Bioinformatics: Methods for Data Analysis
David Horn
Spring 2006
Weizmann Institute of Science
Course website: http://horn.tau.ac.il/course06.html
Teaching assistant: Roy Varshavsky
![Page 2: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/2.jpg)
Bibliography:
• Hertz, Krogh, Palmer: Introduction to the Theory of Neural Computation. 1991
• Bishop: Neural Networks for Pattern Recognition. 1995
• Ripley: Pattern Recognition and Neural Networks. 1996
• Duda, Hart, Stork: Pattern Classification. 2001
• Baldi and Brunak: Bioinformatics. 2001
• Hastie, Tibshirani, Friedman: The Elements of Statistical Learning. 2001
• Shawe-Taylor and Cristianini: Kernel Methods for Pattern Analysis. 2004
![Page 3: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/3.jpg)
Neural Introduction
Transparencies are based on some material available on the web:
• G. Orr: Neural Networks. 1999 (see my website for pointer)
• Y. Peng: Introduction to Neural Networks CMSC 2004
• J. Feng: Neural Networks. Sussex
• Duda-Hart-Stork website
and on some of the books in the bibliography.
![Page 4: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/4.jpg)
![Page 5: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/5.jpg)
![Page 6: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/6.jpg)
![Page 7: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/7.jpg)
Introduction
Why ANN
• Some tasks are done easily (effortlessly) by humans but are hard for conventional paradigms, that is, an algorithmic approach on a Von Neumann machine
  • Pattern recognition (old friends, hand-written characters)
  • Content addressable recall
  • Approximate, common sense reasoning (driving, playing piano, baseball player)
• These tasks are often ill-defined, experience based, and hard to treat with logic
![Page 8: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/8.jpg)
Introduction
Von Neumann machine
• One or a few high speed (ns) processors with considerable computing power
• One or a few shared high speed buses for communication
• Sequential memory access by address
• Problem-solving knowledge is separated from the computing component
• Hard to be adaptive

Human Brain
• Large number (10^11) of low speed processors (ms) with limited computing power
• Large number (10^15) of low speed connections
• Content addressable recall (CAM)
• Problem-solving knowledge resides in the connectivity of neurons
• Adaptation by changing the connectivity
![Page 9: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/9.jpg)
The brain - that's my second most favourite organ! - Woody Allen
![Page 10: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/10.jpg)
Some of the wonders of the brain: what it can do with 10^11 neurons and 10^15 synapses
• Its performance tends to degrade gracefully under partial damage. In contrast, most programs and engineered systems are brittle: if you remove some arbitrary parts, very likely the whole will cease to function.
• It can learn (reorganize itself) from experience. This means that partial recovery from damage is possible if healthy units can learn to take over the functions previously carried out by the damaged areas.
• It performs massively parallel computations extremely efficiently. For example, complex visual perception occurs within less than 100 ms, that is, 10 processing steps!
• It supports our intelligence and self-awareness. (Nobody knows yet how this occurs.)
![Page 11: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/11.jpg)
![Page 12: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/12.jpg)
![Page 13: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/13.jpg)
The brain has some architecture…
![Page 14: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/14.jpg)
![Page 15: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/15.jpg)
![Page 16: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/16.jpg)
• Biological neural activity
  • Each neuron has a body, an axon, and many dendrites
  • A neuron can be in one of two states: firing and rest
  • A neuron fires if the total incoming stimulus exceeds the threshold
  • Synapse: thin gap between the axon of one neuron and a dendrite of another
    • Signal exchange
    • Synaptic strength/efficiency
![Page 17: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/17.jpg)
![Page 18: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/18.jpg)
McCulloch and Pitts neurons
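A threshold unit of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the original 1943 formulation: the function names and the particular weights and thresholds are assumptions chosen so that single units reproduce AND, OR and NOT.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (return 1) if the weighted input sum exceeds the threshold, else rest (0)."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


# Illustrative weight/threshold choices (assumptions, not taken from the slides).
def and_gate(a, b):
    # Both excitatory inputs must be on: 1 + 1 = 2 > 1.5
    return threshold_unit([a, b], [1, 1], threshold=1.5)


def or_gate(a, b):
    # One excitatory input is enough: 1 > 0.5
    return threshold_unit([a, b], [1, 1], threshold=0.5)


def not_gate(a):
    # A single inhibitory connection: 0 > -0.5 fires, -1 > -0.5 does not
    return threshold_unit([a], [-1], threshold=-0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))
```

Changing only the threshold turns the same two-input unit from AND into OR, which is the sense in which the computation lives in the weights and thresholds rather than in the unit itself.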
![Page 19: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/19.jpg)
Introduction
What is an (artificial) neural network
• A set of nodes (units, neurons, processing elements)
  • Each node has input and output
  • Each node performs a simple computation by its node function
• Weighted connections between nodes
  • Connectivity gives the structure/architecture of the net
  • What can be computed by a NN is primarily determined by the connections and their weights
• A very much simplified version of networks of neurons in animal nervous systems
![Page 20: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/20.jpg)
Introduction
ANN
• Nodes
  • input
  • output
  • node function
• Connections
  • connection strength

Bio NN
• Cell body
  • signal from other neurons
  • firing frequency
  • firing mechanism
• Synapses
  • synaptic strength

• Highly parallel, simple local computation (at neuron level) achieves global results as an emerging property of the interaction (at network level)
• Pattern directed (individual nodes have meaning only in the context of a pattern)
• Fault-tolerant/graceful degrading
• Learning/adaptation plays an important role
![Page 21: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/21.jpg)
History of NN
• Pitts & McCulloch (1943)
  • First mathematical model of biological neurons
  • All Boolean operations can be implemented by these neuron-like nodes (with different thresholds and excitatory/inhibitory connections)
  • Competitor to the Von Neumann model for a general purpose computing device
  • Origin of automata theory
• Hebb (1949)
  • Hebbian rule of learning: increase the connection strength between neurons i and j whenever both i and j are activated
  • Or: increase the connection strength between nodes i and j whenever both nodes are simultaneously ON or OFF
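The two statements of the Hebbian rule can be written as weight updates. The sketch below is illustrative only: the function names, the binary {0, 1} coding of activity, and the learning rate `eta` are assumptions, not part of the slides.

```python
def hebb_both_active(w, x_i, x_j, eta=0.1):
    """First form: strengthen the connection only when both units are active (1)."""
    return w + eta * x_i * x_j


def hebb_same_state(w, x_i, x_j, eta=0.1):
    """Second form: strengthen the connection when both units are ON or both are OFF."""
    return w + eta if x_i == x_j else w


w = 0.0
for x_i, x_j in [(1, 1), (1, 0), (0, 0)]:
    print((x_i, x_j),
          "both active:", hebb_both_active(w, x_i, x_j),
          "same state:", hebb_same_state(w, x_i, x_j))
```

The two forms differ only on the (OFF, OFF) case, where the second one still increases the weight.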
![Page 22: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/22.jpg)
History of NN
• Early boom (50’s – early 60’s)
• Rosenblatt (1958)
  • Perceptron: network of threshold nodes for pattern classification
  • Perceptron learning rule
  • Perceptron convergence theorem: everything that can be represented by a perceptron can be learned
• Widrow and Hoff (1960, 1962)
  • Learning rule based on gradient descent (with differentiable unit)
• Minsky’s attempt to build a general purpose machine with Pitts/McCulloch units
(Figure: perceptron with inputs x1, x2, ..., xn)
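The slide names the perceptron learning rule without stating it. A minimal sketch of the standard error-correction form, w ← w + η (t − y) x, is given below; the function names, the learning rate, the bias term and the AND training set are illustrative assumptions.

```python
def predict(weights, bias, x):
    """Threshold node: output 1 if the weighted sum plus bias is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, eta=1.0, max_epochs=100):
    """Repeat error-correction updates until one full pass makes no mistakes."""
    n_inputs = len(samples[0][0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, target in samples:
            y = predict(weights, bias, x)
            if y != target:
                errors += 1
                # Move the weights toward (target=1) or away from (target=0) the input.
                weights = [w + eta * (target - y) * xi for w, xi in zip(weights, x)]
                bias += eta * (target - y)
        if errors == 0:
            break
    return weights, bias


# AND is linearly separable, so the convergence theorem applies to it.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
print(weights, bias)
```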
![Page 23: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/23.jpg)
History of NN
• The setback (mid 60’s – late 70’s)
• Serious problems with the perceptron model (Minsky’s book, 1969)
  • Single layer perceptrons cannot represent (learn) simple functions such as XOR
  • Multiple layers of non-linear units may have greater power, but there is no learning rule for such nets
  • Scaling problem: connection weights may grow infinitely
  • The first two problems were overcome by later efforts in the 80’s, but the scaling problem persists
• Death of Rosenblatt (1971)
• Thriving of the Von Neumann machine and AI
![Page 24: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/24.jpg)
History of NN
• Renewed enthusiasm and flourishing (since mid-80’s)
• New techniques
  • Backpropagation learning for multi-layer feed forward nets (with non-linear, differentiable node functions)
  • Thermodynamic models (Hopfield net, Boltzmann machine, etc.)
  • Unsupervised learning
• Impressive applications (character recognition, speech recognition, text-to-speech transformation, process control, associative memory, etc.)
• Traditional approaches face difficult challenges
• Caution:
  • Don’t underestimate difficulties and limitations
  • Poses more problems than solutions
![Page 25: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/25.jpg)
ANN Neuron Models
General neuron model
Weighted input summation
• Each node has one or more inputs from other nodes, and one output to other nodes
• Input/output values can be
  • Binary {0, 1}
  • Bipolar {-1, 1}
  • Continuous
• All inputs to one node come in at the same time and remain activated until the output is produced
• Weights are associated with links
• $f(net)$ is the node function; $net = \sum_{i=1}^{n} w_i x_i$ is the most popular choice
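The general neuron model, a weighted input summation followed by a node function, can be sketched as follows. The function names are hypothetical and the identity node function is used only to keep the example short.

```python
def net_input(weights, inputs):
    """Weighted input summation: net = sum_i w_i * x_i."""
    return sum(w * x for w, x in zip(weights, inputs))


def node_output(weights, inputs, f):
    """Apply the node function f to the net input."""
    return f(net_input(weights, inputs))


weights = [0.5, -1.0, 2.0]   # one weight per incoming link
inputs = [1, 1, 1]           # binary inputs arriving at the same time
print(node_output(weights, inputs, f=lambda net: net))
```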
![Page 26: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/26.jpg)
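Node Function
• Identity function: $f(net) = net$
• Constant function: $f(net) = c$
• Step (threshold) function: the output jumps between two constant values at $net = c$, where $c$ is called the threshold
• Ramp function: the output rises linearly between two saturation levels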
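A possible parameterization of the step and ramp functions is sketched below. Only the threshold `c` comes from the slide; the output levels `a` and `b`, the upper breakpoint `d`, and the choice to output `b` exactly at the threshold are assumptions made for illustration.

```python
def step(net, c=0.0, a=0.0, b=1.0):
    """Step (threshold) function: output a below the threshold c, b at or above it."""
    return b if net >= c else a


def ramp(net, c=0.0, d=1.0, a=0.0, b=1.0):
    """Ramp function: a for net <= c, b for net >= d, linear interpolation in between."""
    if net <= c:
        return a
    if net >= d:
        return b
    return a + (net - c) * (b - a) / (d - c)


for net in (-1.0, 0.25, 0.5, 2.0):
    print(net, step(net), ramp(net))
```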
![Page 27: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/27.jpg)
Node Function
• Sigmoid function
  • S-shaped
  • Continuous and everywhere differentiable
  • Rotationally symmetric about some point (net = c)
  • Asymptotically approaches saturation points
• Examples:
  • When y = 0 and z = 0: a = 0, b = 1, c = 0.
  • When y = 0 and z = -0.5: a = -0.5, b = 0.5, c = 0.
  • Larger x gives a steeper curve
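A logistic form consistent with the parameter values quoted above is $f(net) = z + 1/(1 + e^{-x \cdot net + y})$, reading a and b as the lower and upper saturation values. Both the formula and that reading are assumed here rather than taken from the slides; the sketch simply checks that this form reproduces the two quoted examples.

```python
import math


def sigmoid(net, x=1.0, y=0.0, z=0.0):
    """Assumed logistic form: z shifts the curve vertically, x sets the steepness."""
    return z + 1.0 / (1.0 + math.exp(-x * net + y))


# y = 0, z = 0: saturates at 0 and 1, symmetric about net = 0
print(sigmoid(-20.0), sigmoid(0.0), sigmoid(20.0))

# y = 0, z = -0.5: saturates at -0.5 and 0.5, symmetric about net = 0
print(sigmoid(-20.0, z=-0.5), sigmoid(0.0, z=-0.5), sigmoid(20.0, z=-0.5))

# Larger x gives a steeper curve around the point of symmetry
print(sigmoid(1.0, x=1.0), sigmoid(1.0, x=5.0))
```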
![Page 28: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/28.jpg)
Node Function
• Gaussian function
  • Bell-shaped (radial basis)
  • Continuous
  • f(net) asymptotically approaches 0 (or some constant) when |net| is large
  • Single maximum (when net = μ)
• Example: Gaussian function
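The properties listed for the Gaussian node function can be checked on the usual normalized form. The parameter names `mu` (position of the maximum) and `sigma` (width) are assumptions, as is the normalization constant.

```python
import math


def gaussian(net, mu=0.0, sigma=1.0):
    """Bell-shaped node function with a single maximum at net = mu."""
    return math.exp(-0.5 * ((net - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


# Maximum at net = mu, symmetric around it, and close to 0 for large |net|
for net in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(net, gaussian(net))
```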
![Page 29: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/29.jpg)
Network Architecture
• (Asymmetric) Fully Connected Networks
  • Every node is connected to every other node
  • Connection may be excitatory (positive), inhibitory (negative), or irrelevant (≈ 0)
  • Most general
  • Symmetric fully connected nets: weights are symmetric ($w_{ij} = w_{ji}$)
• Input nodes: receive input from the environment
• Output nodes: send signals to the environment
• Hidden nodes: no direct interaction with the environment
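A fully connected net can be stored as a square weight matrix, with entry `W[i][j]` holding the weight of the connection between nodes i and j. The sketch below uses that representation, which is an assumption, together with hypothetical helper names and a small tolerance for treating a weight as zero.

```python
def connection_type(w, tol=1e-12):
    """Classify a connection by the sign of its weight."""
    if w > tol:
        return "excitatory"
    if w < -tol:
        return "inhibitory"
    return "irrelevant"


def is_symmetric(W, tol=1e-12):
    """True if w_ij = w_ji for every pair of nodes."""
    n = len(W)
    return all(abs(W[i][j] - W[j][i]) <= tol for i in range(n) for j in range(n))


asymmetric = [
    [0.0, 0.8, -0.3],
    [0.2, 0.0, 0.0],
    [-0.3, 0.5, 0.0],
]
symmetric = [
    [0.0, 0.8, -0.3],
    [0.8, 0.0, 0.5],
    [-0.3, 0.5, 0.0],
]
print(is_symmetric(asymmetric), is_symmetric(symmetric))
print(connection_type(asymmetric[0][1]), connection_type(asymmetric[0][2]))
```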
![Page 30: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/30.jpg)
Network Architecture
• Layered Networks
  • Nodes are partitioned into subsets, called layers
  • No connections lead from nodes in layer j to those in layer k if j > k
  • Inputs from the environment are applied to nodes in layer 0 (input layer)
  • Nodes in the input layer are placeholders with no computation occurring (i.e., their node functions are the identity function)
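The layering constraint is easy to test on an explicit list of connections. In this sketch the dictionary `layer_of` and the `(source, destination)` pair representation are assumptions; note that the constraint as stated still permits connections within a layer and connections that skip layers.

```python
def is_layered(layer_of, connections):
    """True if no connection leads from a node in layer j to a node in layer k with j > k."""
    return all(layer_of[src] <= layer_of[dst] for src, dst in connections)


layer_of = {"in1": 0, "in2": 0, "h1": 1, "out": 2}
forward_only = [("in1", "h1"), ("in2", "h1"), ("in1", "out"), ("h1", "out")]
with_feedback = forward_only + [("out", "in1")]

print(is_layered(layer_of, forward_only))    # skip connection in1 -> out is allowed
print(is_layered(layer_of, with_feedback))   # out -> in1 goes from layer 2 back to layer 0
```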
![Page 31: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/31.jpg)
Network Architecture
• Feedforward Networks
  • A connection is allowed from a node in layer i only to nodes in layer i + 1
  • Most widely used architecture
  • Conceptually, nodes at higher levels successively abstract features from preceding layers
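Because each layer feeds only the next one, the output of a feedforward net can be computed layer by layer. The sketch below assumes a nested-list weight format, a logistic node function and no bias terms; all names and the example weights are illustrative.

```python
import math


def logistic(net):
    """A common differentiable node function (an illustrative choice)."""
    return 1.0 / (1.0 + math.exp(-net))


def forward(layers, inputs, f=logistic):
    """Propagate inputs through the net.

    layers[k] holds one weight list per node in layer k + 1,
    with one weight for each node in layer k.
    """
    activations = list(inputs)  # layer 0 nodes are placeholders (identity)
    for layer_weights in layers:
        activations = [
            f(sum(w * a for w, a in zip(node_weights, activations)))
            for node_weights in layer_weights
        ]
    return activations


layers = [
    [[1.0, -1.0], [0.5, 0.5]],  # layer 1: two hidden nodes, two inputs each
    [[1.0, 1.0]],               # layer 2: one output node
]
print(forward(layers, [1.0, 0.0]))
```

Passing `f=lambda net: net` turns every node into a linear unit, which makes the layer-by-layer arithmetic easy to verify by hand.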
![Page 32: From Neuroinformatics to Bioinformatics: Methods for Data Analysis David Horn Spring 2006 Weizmann Institute of Science Course website:](https://reader030.vdocument.in/reader030/viewer/2022032704/56649d6b5503460f94a49c0e/html5/thumbnails/32.jpg)
Network Architectures
• Acyclic Networks
  • Connections do not form directed cycles
  • Multi-layered feedforward nets are acyclic
• Recurrent Networks
  • Nets with directed cycles
  • Much harder to analyze than acyclic nets
• Modular nets
  • Consist of several modules, each of which is itself a neural net for a particular sub-problem
  • Sparse connections between modules
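Whether a net is acyclic or recurrent can be decided by searching its connection graph for a directed cycle. The depth-first search below is one standard way to do this; the node labels and the `(source, destination)` pair representation are assumptions.

```python
def has_directed_cycle(nodes, connections):
    """Depth-first search; meeting a node that is still being explored means a cycle."""
    successors = {n: [] for n in nodes}
    for src, dst in connections:
        successors[src].append(dst)

    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {n: UNSEEN for n in nodes}

    def visit(n):
        state[n] = IN_PROGRESS
        for m in successors[n]:
            if state[m] == IN_PROGRESS:
                return True  # back edge: directed cycle found
            if state[m] == UNSEEN and visit(m):
                return True
        state[n] = DONE
        return False

    return any(state[n] == UNSEEN and visit(n) for n in nodes)


nodes = [0, 1, 2, 3]
feedforward = [(0, 2), (1, 2), (2, 3)]
recurrent = feedforward + [(3, 2)]

print(has_directed_cycle(nodes, feedforward))  # acyclic
print(has_directed_cycle(nodes, recurrent))    # contains the cycle 2 -> 3 -> 2
```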