14 Neural Networks



A relatively new type of information processing which has so far seen only limited use in chemistry.

Originally developed as a way to mimic the human brain, this research area now includes a range of parallel distributed processing methods.

    Neural networks

Potential advantages of the approach
• Ability to learn new things with an existing net.
• Can automatically adjust to new stimuli.
• Exact rules are not required.
• Adding poor data degrades the net slowly.

Current disadvantages
• Not many applications in chemistry to date.
• Only useful for classification, so far.
• Typically slow to develop and train.

    Neural networks

We will only be introducing some general concepts in this unit.

• Network components
• Network topology
• General training approaches
• Some example network types

Hopefully, this will give you at least a basic understanding of the area.

    Network nodes

    A network consists of a series of nodes.

    Network nodes

Depending on the network and node type, not all types of functions may be used.

Each node forms a net input as a weighted sum of the outputs of the previous units:

net = Σ_m w_m · o_m + bias

where
o_m = output from previous unit
w_m = weight applied to o_m
bias = additional weight applied to select inputs or layers

Responses are then fed to an activation function.

    Logistic activation function

This is one of the most common activation function types: f(net) = 1 / (1 + e^(-net)).

The function output will result in a simple on/off condition.
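As a minimal sketch (in Python, not from the course materials, with made-up weights and inputs), a single node implementing this weighted sum and logistic activation might look like:

import math

def node_output(inputs, weights, bias):
    """One network node: weighted sum of previous-unit outputs
    plus bias, passed through the logistic activation."""
    net = sum(w * o for w, o in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-net))  # logistic squashes net to (0, 1)

# Example with three made-up inputs and weights
print(node_output(inputs=[0.2, 0.9, 0.4], weights=[1.5, -0.8, 0.3], bias=0.1))

Outputs near 1 or near 0 then act as the on/off condition described above.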


    Activation functions

Other types of functions have been used, but for our limited coverage one example is enough.

The object of the function is to sum the responses from the previous units to which it is linked and produce an output response, usually on (1) or off (0).

This output is then sent to other nodes.

Types of nodes

Input
Information obtained from an external source.

Output
Information sent to an external source.

Dual
Combination input and output node.

Hidden
Node in an internal layer. No external interaction.

A simple network

The goal is to present the network with a range of input patterns and obtain the desired output.

One of the most common approaches for training a network is backpropagation of a feed-forward neural network.

Let's review the basic steps in this approach.

    Neural network training

[Diagram: forward-propagation phase followed by a back-propagation phase]

This process is repeated for each pattern in the training set, in series.

The entire set is evaluated repeatedly until the net is trained to a satisfactory level.
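As an illustration only (none of this code is from the course), here is a small numpy sketch of the two phases on a toy problem; the network size, learning rate, and XOR-style data are arbitrary assumptions:

import numpy as np

rng = np.random.default_rng(0)

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy network: 2 inputs -> 3 hidden -> 1 output (all sizes arbitrary)
W1 = rng.normal(scale=0.5, size=(2, 3)); b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 1)); b2 = np.zeros(1)
lr = 0.5

# Stand-in training patterns (XOR) and targets
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

for epoch in range(5000):          # whole set evaluated repeatedly
    for x, t in zip(X, T):         # each pattern presented in series
        # forward-propagation phase
        h = logistic(x @ W1 + b1)
        y = logistic(h @ W2 + b2)
        # back-propagation phase: error pushed back through the layers
        dy = (y - t) * y * (1 - y)         # logistic derivative at the output
        dh = (dy @ W2.T) * h * (1 - h)     # error at the hidden layer
        W2 -= lr * np.outer(h, dy); b2 -= lr * dy
        W1 -= lr * np.outer(x, dh); b1 -= lr * dh

print(logistic(logistic(X @ W1 + b1) @ W2 + b2).round(2))

Each pattern updates the weights immediately, and the outer loop keeps re-presenting the whole set, matching the two-phase cycle on the slide.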

Neural network training

As with other classification methods, you can use separate training and evaluation sets or cross-validation.

It is assumed that the patterns presented contain information that can be used for classification.

Other methods of data evaluation can be used prior to training to ensure that a good net can be produced.


    Types of networks

Many types of network models have been proposed and studied. We will just look at a few examples.

• Backpropagation
• Dynamic learning vector quantization
• Self-organizing maps

This should give you an idea as to how widely the approaches can vary.

    Free Neural Network Software

What could be better than free? Here are the two systems we'll be looking at.

    tlearn

http://crl.ucsd.edu/innate/tlearn.html
Mac, Windows and UNIX versions.

Backpropagation-type networks.

PCA and clustering data also provided.

A bit dated but simple to use.

We'll use it for one of our examples.

    JavaNNS

    Java Neural Network Simulator

    http://www.ra.cs.uni-tuebingen.de/software/JavaNNS/welcome_e.html

    Works on Windows, Mac and UNIX platforms.

Supports several types of NN models, including:
• backpropagation
• counter propagation
• dynamic learning vector quantization
• Kohonen
• ART
• autoassociative memory
and many others.

    Backpropagation

This is one of the most common approaches, which we have already outlined.

A typical network consists of:
• Input layer - one function / variable
• Output layer - one function / class
• Hidden layers - optional

During training, the weights are updated after each training pattern is presented.

    Backpropagation

[Diagram: input layer (6 variables), two hidden layers, output layer (4 classes)]

    Arson example

Same data set we looked at before, with 5 classes and 19 variables.

    Variables were range normalized to 0-1.

    Classes were encoded as:

Class 1 - 000
Class 2 - 001
Class 3 - 010
Class 4 - 011
Class 5 - 100
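In Python, this encoding could be written as a simple lookup table (a hypothetical helper for illustration, not part of tlearn):

# Binary target patterns for the five arson classes (from the slide)
CLASS_CODES = {1: [0, 0, 0], 2: [0, 0, 1], 3: [0, 1, 0],
               4: [0, 1, 1], 5: [1, 0, 0]}

def encode(class_label):
    """Return the 3-bit target pattern for a class label."""
    return CLASS_CODES[class_label]

print(encode(4))  # [0, 1, 1]

Note that three output nodes are enough for five classes with this binary coding, rather than one node per class.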

We'll use tlearn for this example.


    Arson example

    Files used for design of the network, data and classes

    Arson example

[Diagram: input layer, hidden layer, output layer]

The .cf file is used to specify a three-layer network.

    Dendrograms

Shows if the data tends to fall into clusters and be classified.

    PCA plots

It's pretty much the same as what we observed when the Arson data file was used earlier.

3D PCA plots

Arson example

The program was then instructed to go through 50,000 training cycles.


Arson example

The activity display will indicate which inputs cause each node to fire and its impact on the final pattern.

    Arson example

Finally, we can evaluate how well our network works.

Here we just used the original training set to see how well each sample is classified.

    JavaNNS example

Let's return to the Iris problem and see if a neural network can do the classification.

To make the software happy, the data was normalized on a 0-1 scale, on a variable-by-variable basis.
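Variable-by-variable range normalization of this kind could be sketched as follows (a generic example, not JavaNNS code; the sample numbers are made up):

import numpy as np

def range_normalize(X):
    """Scale each column (variable) of X onto the 0-1 range."""
    mins, maxs = X.min(axis=0), X.max(axis=0)
    return (X - mins) / (maxs - mins)

# Two made-up variables measured on three samples
X = np.array([[5.1, 3.5],
              [7.0, 3.2],
              [6.3, 3.3]])
print(range_normalize(X))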

    Classes were assigned as:

0 0 I. Setosa
0 1 I. Versicolor
1 0 I. Virginica

Initial network with random weights.

    Setting options


Process for 50,000 cycles

Results

Example results for the three classes. The brighter the green, the bigger the value.

Input or hidden units that do not contribute to the classification can be removed.

    Data subset and results.

    Dynamic learning vector quantization

This approach attempts to find natural groupings in a set of data.

    The assumption is that a data vector can be found

    that best classifies related samples.

Vectors are selected that not only best classify related samples but also maximize the distance between unrelated ones.

The end result is very similar to clustering based on PCA and Varimax rotation - very SIMCA-like.

    Topology of a DLVQ - initial

This example shows an initial setup with 42 input variables, 5 possible classes and an output (answer) layer.

[Diagram: input layer (42 units), class vectors, output layer]


    DLVQ steps

The normalized training set is loaded and the mean vector for each class is determined.

Each pattern in the training set is then evaluated against the reference vectors.

Vectors are adjusted towards samples in a class and away from other samples.

The process is repeated until the number of correctly classified samples no longer increases.
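A much-simplified Python sketch of these steps follows. It shows only the core LVQ-style adjustment (initialize with class means, then pull the winning vector toward same-class samples and push it away otherwise); the "dynamic" part of DLVQ, which adds new vectors as needed, is omitted, and all names are invented for illustration.

import numpy as np

def init_class_vectors(patterns, labels):
    """Start each class vector at the mean of its class (step 1)."""
    return {c: np.mean([x for x, l in zip(patterns, labels) if l == c], axis=0)
            for c in set(labels)}

def dlvq_epoch(patterns, labels, vectors, lr=0.05):
    """One pass over the training set (steps 2-3)."""
    for x, label in zip(patterns, labels):
        # the nearest class vector "claims" this pattern
        winner = min(vectors, key=lambda c: np.linalg.norm(x - vectors[c]))
        if winner == label:
            vectors[winner] += lr * (x - vectors[winner])  # toward the sample
        else:
            vectors[winner] -= lr * (x - vectors[winner])  # away from it
    return vectors

Step 4 would simply wrap dlvq_epoch in a loop that stops once the count of correctly classified samples stops improving.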

    DLVQ

Once completed, the final class vectors may vary in size. Some classes may be easier to identify than others.

    DLVQ

In order for this to work, the input patterns within each class must share some similarities.

There must also be something different about each class.

The model must be rebuilt if a new class is discovered or a training sample was misidentified.

    Self-organizing maps

    Self-organizing maps (SOM) are also

    called Kohonen feature maps.

    This is an unsupervised learning

    method.

    It will cluster related input data.

It will maintain spatial ordering, so you have some idea as to the relationships between your patterns.

    Self-organizing maps

SOM systems consist of two layers of units:
• A one-dimensional input layer
• A two-dimensional competitive layer organized as a 2-D grid of units.

There are no hidden or output layers.

[Diagram: input layer connected to the competitive layer]

    Self-organizing maps

    Training

The competitive layer is initialized with normalized vectors.

The input pattern vectors are presented to the competitive layer and the best match (nearest unit) is chosen as the winner.

Topological ordering is achieved using a spatial neighborhood relationship between competitive units.
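A compact Python sketch of this training loop is given below (illustrative only; the grid size, learning rate, and decay schedule are arbitrary assumptions, not values from the slides):

import numpy as np

rng = np.random.default_rng(1)

def train_som(data, grid=(8, 8), epochs=100, lr0=0.5, radius0=3.0):
    """Minimal SOM training: find the winning (nearest) unit for each
    pattern, then pull it and its grid neighborhood toward the pattern."""
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))  # normalized init
    coords = np.indices(grid).transpose(1, 2, 0)       # each unit's grid position
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                  # learning rate shrinks over time
        radius = radius0 * (1 - frac) + 0.01   # so does the neighborhood
        for x in data:
            dist = np.linalg.norm(weights - x, axis=2)   # weight-space distances
            win = np.unravel_index(dist.argmin(), grid)  # best match = winner
            # spatial neighborhood on the 2-D grid gives topological ordering
            gdist = np.linalg.norm(coords - np.array(win), axis=2)
            nb = np.exp(-(gdist ** 2) / (2 * radius ** 2))[..., None]
            weights += lr * nb * (x - weights)  # neighbors move too
    return weights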


    Self-organizing maps

    Once trained, each sample will produce a pattern

    map on the competitive layer.

    It can be used to visualize how your samples

    relate to each other.
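Continuing the sketch above (and assuming its train_som function), the pattern map is just each sample's winning unit; plotting these grid coordinates shows which samples land near each other:

def map_samples(data, weights):
    """Project each sample onto its winning competitive unit."""
    grid = weights.shape[:2]
    return [np.unravel_index(np.linalg.norm(weights - x, axis=2).argmin(), grid)
            for x in data]

# e.g. weights = train_som(data); positions = map_samples(data, weights)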