
CSEN 1005 - Neural Networks
Introduction and Course Information

Hazem M. Abbas, C7.210

German University in Cairo
Media Engineering and Technology

[email protected]/Courses/CourseEdition.aspx?crdEdId=348

Winter 2015

H. Abbas (GUC) CSEN1005–NN 2015 1 / 31

Outline

1. Class Organization and Information
   - Course Targets

2. Introduction to Neural Nets
   - Motivation
   - Biological background
   - Learning and Adaptation






Literature

Textbook
- Rudolf Kruse, Christian Borgelt, Frank Klawonn, Christian Moewes, Matthias Steinbrecher, Pascal Held, Computational Intelligence: A Methodological Introduction, Springer, 2014. (available online)

Other References
1. Raul Rojas, Neural Networks, Springer, 1996. (available online)
2. Martin T. Hagan, Howard B. Demuth, Mark H. Beale, Neural Network Design, PWS Publishing, 1996.
3. Christopher Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford, 1995.
4. Simon O. Haykin, Neural Networks and Learning Machines, 2009.


Marking & Pre-Requisites

Marking
- Final: 40
- Midterm: 30
- Class Work: 30
  - 3 quizzes (2 best of 3): 20
  - Project: 10

Pre-requisites
- Basic calculus, linear algebra and probability
- Programming skills
- Algorithms and Data Structures
- Matlab and/or Python and/or R


Course Policies

- It is very important to attend lectures.
- Take notes: things might not be on the slides.
- Check the course website regularly for announcements and material.
- Office hours are mainly after class, but feel free to drop by any time.


Contents and Syllabus

- Basics of neural network computing, in contrast to algorithmic approaches, traditional AI problem solving, and the von Neumann architecture.
- Important neural network models, such as Adaline and Perceptron; feedforward and feedback networks; recurrent networks; self-organizing networks (Kohonen's model and the ART models of Grossberg); and thermodynamic networks (Hopfield model, Boltzmann/Gauss/Cauchy machines).
- Learning methods, such as Hebbian learning, the Perceptron learning theorem, back-propagation learning, and unsupervised competitive learning.
- Analysis of mathematical properties of some network models will be given, and their limitations discussed.
- Applications and practical considerations of these techniques will be covered.
- Hands-on experience through a sequence of computer projects.


Course Outcomes

After completing this course, students should be able to:
- Demonstrate an understanding of the basic concepts and principles of neural computation as an approach to intelligent problem-solving
- Describe the commonly used neural network architectures and learning algorithms
- Distinguish classes of problems to which neural networks offer solutions superior to other methods
- Design a neural network to solve a particular problem






Why (Artificial) Neural Networks?

(Neuro-)Biology / (Neuro-)Physiology / Psychology:
- Exploit similarity to real (biological) neural networks.
- Build models to understand nerve and brain operation by simulation.

Computer Science / Engineering / Economics:
- Mimic certain cognitive capabilities of human beings.
- Solve learning/adaptation, prediction, and optimization problems.

Physics / Chemistry:
- Use neural network models to describe physical phenomena.
- Special case: spin glasses (alloys of magnetic and non-magnetic metals).


Neural Networks vs. AI

Physical-Symbol System Hypothesis [Newell and Simon 1976]

A physical-symbol system has the necessary and sufficient means for general intelligent action.

Neural networks process simple signals, not symbols.

So why study neural networks (in addition to Artificial Intelligence)?

- Symbol-based representations work well for inference tasks, but are fairly bad for perception tasks.
- Symbol-based expert systems tend to get slower with growing knowledge; human experts tend to get faster.
- Neural networks allow for highly parallel information processing.
- There are several successful applications in industry and finance.


Biological background

- ANNs are derived from the biological neural networks that constitute the brains of humans and animals.
- Very complex structure; computational capacity superior to that of computers.
- Computational unit: the neuron
  - Neuron switching time ~ 0.001 second
  - Number of neurons ~ 10^10
  - Connections per neuron ~ 10^4 to 10^5
- Scene recognition time ~ 0.1 second
  - At ~ 0.001 second per switch, this leaves room for only about 100 sequential inference steps, which does not seem like enough
  - → much parallel computation

Properties of artificial neural nets (ANNs):
- Many neuron-like threshold switching units
- Many weighted interconnections among units
- Highly parallel, distributed process
- Emphasis on tuning weights automatically
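The "100 inference steps" figure is simply the recognition time divided by the switching time. A minimal sketch of that arithmetic, using the rounded values quoted on the slide (the variable names are illustrative only):

```python
# Rounded values quoted on the slide.
neuron_switching_time_s = 0.001   # about 1 ms per switching event
scene_recognition_time_s = 0.1    # about 0.1 s to recognize a scene

# Upper bound on strictly sequential neuron-to-neuron steps in that time.
max_serial_steps = round(scene_recognition_time_s / neuron_switching_time_s)
print(max_serial_steps)
```

Since a chain of only about 100 sequential steps is available, most of the work has to happen in parallel.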


Biological background

The neuron consists of:
1. Cell body
2. Axon: the output wire of the neuron.
3. Dendrites: the input wires.
4. Synapses: form the connections with other neurons.

Signal transmission:
1. Electrical: through cell body, axon, dendrites
2. Chemical: through synapses


Biological Background

Computational unit: The neuron



(Very) simplified description of neural information processing

- Axon terminal releases chemicals, called neurotransmitters.
- These act on the membrane of the receptor dendrite to change its polarization. (The inside is usually 70 mV more negative than the outside.)
- Decrease in potential difference: excitatory synapse
- Increase in potential difference: inhibitory synapse
- If there is enough net excitatory input, the axon is depolarized.
- The resulting action potential travels along the axon. (Speed depends on the degree to which the axon is covered with myelin.)
- When the action potential reaches the terminal buttons, it triggers the release of neurotransmitters.


Working principle of the neuron (1)

An electrical potential (caused by the input signals) builds up within the neuron, which fires an electrical spike when saturated. (This involves very complex chemistry too.)


Working principle of the neuron (2)

- Excitatory synapses increase the ability of the target neuron to fire.
- Inhibitory synapses decrease the ability to fire.
- Binary mode of operation: either a neuron fires, or it does not!
- Refractory period ~ 1 ms: neurons need a period of relaxation (recovery) before a new spike can be fired. ⇒ firing frequency ~ 1 kHz (cf. a computer's clock rate of ~ 3-4 GHz)
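The binary "fires or does not fire" behaviour is often abstracted as a threshold unit. The sketch below is such a textbook abstraction, not a biological model: positive weights stand in for excitatory synapses, negative weights for inhibitory ones, and the function name, weights, and threshold are all hypothetical values chosen for illustration.

```python
def threshold_unit(inputs, weights, threshold):
    """Return 1 (fire) if the weighted input sum reaches the threshold, else 0."""
    net_input = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net_input >= threshold else 0

# Hypothetical weights: two excitatory inputs (+1.0) and one inhibitory input (-2.0).
weights = [1.0, 1.0, -2.0]
threshold = 1.5

fires_without_inhibition = threshold_unit([1, 1, 0], weights, threshold)
fires_with_inhibition = threshold_unit([1, 1, 1], weights, threshold)
print(fires_without_inhibition, fires_with_inhibition)
```

With both excitatory inputs active the unit fires; switching on the inhibitory input pulls the weighted sum back below the threshold.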



The brain is a massively parallel architecture!

- A rat has ~ 10^10 neurons; a human brain has ~ 10^12 neurons and ~ 10^14 to 10^15 synapses.
- Typically, each neuron is connected to hundreds of other neurons.
- There are short-range and long-range connections, as well as many feedback loops.
- Very complex structure (not yet entirely known).
- NNs operate in a decentralized, asynchronous way, i.e. there is no central system clock. ⇒ They can do very complex tasks, such as image processing and object recognition.
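As a rough consistency check, the slide's own estimates for the human brain can be divided to get an order of magnitude for synapses per neuron. The figures below are the slide's rounded numbers, not measurements, and the variable names are illustrative.

```python
# Order-of-magnitude estimates taken from the slide.
human_neurons = 10**12
synapses_low, synapses_high = 10**14, 10**15

synapses_per_neuron_low = synapses_low // human_neurons
synapses_per_neuron_high = synapses_high // human_neurons
print(synapses_per_neuron_low, synapses_per_neuron_high)
```

The result, between a hundred and a thousand synapses per neuron, is in line with "hundreds of other neurons".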


Learning

Learning is constructing or modifying representations of what is beingexperienced.

Learning in ANNs is:
1. Modifying synaptic weights.
2. Adding or removing synapses.

Hebbian learning:
- The first attempt to model biological learning in neural networks.
- Theory formulated in 1949 by D. O. Hebb.
- Correlated firing of neurons: cells that fire together, wire together.


Hebbian Learning

Hebbian learning
Synapses between neurons that fire (almost) simultaneously will be strengthened:

$$\frac{d\omega_{ij}}{dt} = \eta \, x_i \, x_j$$

where
- $x_i$, $x_j$ are the signals propagating through the postsynaptic and presynaptic neurons,
- $\omega_{ij}$ is the synaptic strength, and
- $\eta$ is the learning rate.
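One way to simulate the Hebbian rule numerically is a forward-Euler step, $\omega_{ij} \leftarrow \omega_{ij} + \eta \, x_i \, x_j \, \Delta t$. The sketch below assumes that discretization; the function name `hebbian_step`, the step size `dt`, the learning rate, and the activity values are illustrative choices, not taken from the slides.

```python
def hebbian_step(w_ij, x_i, x_j, eta=0.1, dt=1.0):
    """One forward-Euler step of d(w_ij)/dt = eta * x_i * x_j."""
    return w_ij + eta * x_i * x_j * dt

w = 0.0
# Hypothetical activity pairs (x_i, x_j): the weight changes only when both are active.
for x_i, x_j in [(1, 1), (1, 0), (0, 1), (1, 1)]:
    w = hebbian_step(w, x_i, x_j)
print(w)
```

With non-negative activities this rule can only strengthen a synapse, which is why a variant that also allows weakening is of interest.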


Modified Hebbian Learning

Modified Hebbian learning (synaptic weakening):

$$\frac{d\omega_{ij}}{dt} = \eta \, (x_i - \hat{x}_i) \, (x_j - \hat{x}_j)$$

where $\hat{x}_i$, $\hat{x}_j$ denote the time averages of $x_i$ and $x_j$.

Synaptic weights are depressed if there is
1. presynaptic activation in the absence of postsynaptic activation, or
2. postsynaptic activation in the absence of presynaptic activation.

Strong physiological evidence
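The effect of subtracting the time averages can be seen on short recorded activity traces. The sketch below assumes a forward-Euler sum over the trace and computes each time average over the whole trace; both are simplifying assumptions, and the function names and binary traces are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def modified_hebbian_update(xs_i, xs_j, eta=0.1, dt=1.0):
    """Sum eta * (x_i - mean_i) * (x_j - mean_j) * dt over two activity traces."""
    mean_i, mean_j = mean(xs_i), mean(xs_j)
    return sum(eta * (a - mean_i) * (b - mean_j) * dt for a, b in zip(xs_i, xs_j))

# Hypothetical binary activity traces of the two neurons.
correlated = modified_hebbian_update([1, 0, 1, 0], [1, 0, 1, 0])
anticorrelated = modified_hebbian_update([1, 0, 1, 0], [0, 1, 0, 1])
print(correlated, anticorrelated)
```

Neurons that are active together give a positive weight change, while activity in one neuron without the other gives a negative change, matching the two depression cases listed above.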


Architecture of Artificial Neural Networks

Feedforward neural networks (FFNNs)

Recurrent neural networks (RNNs)


Architecture of ANNs

Components of ANNs
- Input elements: mediate the signals.
- Hidden layer(s): carry out the computations.
- Output layer: computes the output.
- Connection weights: correspond to synapses.

Time domains:
1. Discrete-time NNs (DTNNs): function approximation, data classification.
2. Continuous-time NNs (CTNNs): control applications (a continuously varying signal is needed).
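How these components fit together in a feedforward network can be sketched as a single forward pass. The example assumes logistic (sigmoid) units with bias terms, which the slides do not specify, and every weight and bias value is a hypothetical number chosen for illustration.

```python
import math

def layer(inputs, weights, biases):
    """Compute one layer: one logistic unit per row of connection weights."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        net = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-net)))
    return outputs

# Hypothetical weights for a network with 2 inputs, 2 hidden units, 1 output.
hidden_weights = [[0.5, -0.5], [1.0, 1.0]]
hidden_biases = [0.0, -1.0]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

x = [1.0, 0.0]                                     # input elements pass the signal on
hidden = layer(x, hidden_weights, hidden_biases)   # hidden layer carries out the computation
y = layer(hidden, output_weights, output_biases)   # output layer computes the output
print(hidden, y)
```

A recurrent network would additionally feed some outputs back as inputs on the next time step; that case is not shown here.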


Training of ANNs

Supervised learning
For each input vector presented to the ANN there must be a corresponding desired output signal (training example). Strongly guided learning.

Unsupervised learning
Only a long-term feedback signal is available. Robotics is a classical example. Weakly guided learning.

Reinforcement learning
The learning system receives a reward signal and tries to learn to maximize it.


Supervised learning examples

Supervised learning: have labeled examples of the correct behavior,
e.g. handwritten digit classification with the MNIST dataset

- Task: given an image of a digit, predict the digit class
- 70,000 images of handwritten digits labeled by humans
- 60,000 used to train the classifier, 10,000 to test its performance
- This dataset is the "fruit fly" of neural net research
- Current best algorithm has only 0.23% error rate!
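To put the quoted numbers in perspective, the split and the error rate can be turned into a count of misclassified digits. The sketch assumes the 0.23% error rate is measured on the 10,000 test images; the variable names are illustrative.

```python
total_images = 70_000
train_images = 60_000
test_images = total_images - train_images

error_rate_percent = 0.23   # error rate quoted on the slide
# Assumption: the error rate refers to the held-out test images.
misclassified = round(test_images * error_rate_percent / 100)
print(test_images, misclassified)
```

Under that assumption, the classifier gets only about two dozen of the 10,000 test digits wrong.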


Supervised learning examples

What makes a "2"?


Supervised learning examples

Object recognition:

ImageNet dataset: thousands of categories, millions of labeled images. Lots of variability in viewpoint, lighting, etc.


Unsupervised learning examples

- Unsupervised learning: no labeled examples; instead, looking for interesting patterns in the data
- E.g. visualization of documents: the algorithm was given 800,000 newswire stories and learned to represent these documents as points in two-dimensional space
- Colors are based on human labels, but these weren't given to the algorithm


What are neural networks?

- Most of the biological details aren't essential, so we use vastly simplified models of neurons.
- While neural nets originally drew inspiration from the brain, nowadays we mostly think about math, statistics, etc.
- Neural networks are collections of thousands (or millions) of these simple processing units that together perform useful computations.
- Deep learning emphasizes that the algorithms often involve hierarchies with many stages of processing.
- Representation learning typically maps the raw data into some other space which makes the relationships between different things more explicit.


Deep learning

- Deep learning: many layers (stages) of processing
- E.g. this network, which recognizes objects in images:

- Each of the boxes consists of many neurons similar to the one on the previous slide.


What is a representation in "Representation Learning"?

In your past computer science courses, you may have learned about various data structures for representing words, documents, etc.
- arrays of characters
- dictionaries of word counts
- tries (i.e. trees of prefixes)

How you represent your data determines what questions are easy to answer.
- E.g. a dict of word counts is good for questions like "What is the most common word in Hamlet?"

Simple data structures aren't enough to do higher-level semantic reasoning, e.g.
- Did this reviewer like the book?
- Alice liked Harry Potter. Will she like The Hunger Games?
- Translate this book into French.
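The word-count dictionary mentioned above can be sketched with Python's standard `collections.Counter`. The helper name `most_common_word` and the tokenizing regular expression are illustrative choices, and the sample is only a short stand-in for the full text of Hamlet.

```python
from collections import Counter
import re

def most_common_word(text):
    """Return (word, count) for the most frequent word, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)   # dict subclass mapping word -> count
    return counts.most_common(1)[0]

# Short stand-in sample; the full play would be processed the same way.
sample = ("To be, or not to be, that is the question: "
          "whether 'tis nobler in the mind to suffer")
print(most_common_word(sample))
```

This representation makes frequency questions easy, but it discards word order and meaning, so it gives no direct handle on questions such as whether a reviewer liked a book.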


What is a representation in "Representation Learning"?

In a good representation, mathematical relationships between the vectors should encode semantic relationships between the things we care about. For instance,
- Measure similarity between words using the dot product of their vectors (or dissimilarity using Euclidean distance)
- Represent a web page with the average of its word vectors
- Complete analogies like "Paris is to France as London is to ___" by doing arithmetic on word vectors

It's very hard to construct representations like these by hand, so we need to learn them from data.
- This is a big part of what neural nets do, whether it's supervised, unsupervised, or reinforcement learning!
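The three vector operations listed above can be illustrated on toy data. The vectors below are made up by hand purely so that the arithmetic works out exactly (three "country" coordinates plus one "is a capital" coordinate); real word vectors are learned from data and are far higher-dimensional. All function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def average(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

# Hand-made toy vectors: [France, England, Germany, is-a-capital].
word_vectors = {
    "France":  [1.0, 0.0, 0.0, 0.0],
    "Paris":   [1.0, 0.0, 0.0, 1.0],
    "England": [0.0, 1.0, 0.0, 0.0],
    "London":  [0.0, 1.0, 0.0, 1.0],
    "Germany": [0.0, 0.0, 1.0, 0.0],
    "Berlin":  [0.0, 0.0, 1.0, 1.0],
}

def complete_analogy(a, b, c, vectors):
    """Solve 'a is to b as c is to ?' via b - a + c and a nearest-neighbour lookup."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return min(candidates, key=lambda w: euclidean(vectors[w], target))

similarity = dot(word_vectors["Paris"], word_vectors["London"])
page_vector = average([word_vectors["Paris"], word_vectors["France"]])
answer = complete_analogy("Paris", "France", "London", word_vectors)
print(similarity, page_vector, answer)
```

Here the analogy resolves to "England" only because the toy vectors were constructed that way; the point of representation learning is to obtain vectors with this kind of structure automatically.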

