

CSEN 1005 - Neural Networks
Introduction and Course Information

Hazem M. Abbas, C7.210

German University in Cairo
Media Engineering and Technology

[email protected]
met.guc.edu.eg/Courses/CourseEdition.aspx?crdEdId=348

Winter 2015

H. Abbas (GUC) CSEN1005–NN 2015 1 / 31


Outline

1. Class Organization and Information
   - Course Targets

2. Introduction to Neural Nets
   - Motivation
   - Biological background
   - Learning and Adaptation


Literature

Textbook
Rudolf Kruse, Christian Borgelt, Frank Klawonn, Christian Moewes, Matthias Steinbrecher, Pascal Held, Computational Intelligence: A Methodological Introduction, Springer, 2014. (available online)

Other References
1. Raul Rojas, Neural Networks, Springer, 1996. (available online)
2. Martin T. Hagan, Howard B. Demuth, Mark H. Beale, Neural Network Design, PWS Publishing, 1996.
3. Christopher Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford, 1995.
4. Simon O. Haykin, Neural Networks and Learning Machines, 2009.


Marking & Pre-Requisites

Marking
- Final: 40
- Midterm: 30
- Class Work: 30
  - 3 quizzes (best 2 of 3): 20
  - Project: 10

Pre-requisites
- Basic calculus, linear algebra, and probability
- Programming skills
- Algorithms and Data Structures
- Matlab and/or Python and/or R


Course Policies

- It is very important to attend lectures.
- Take notes: things might not be on the slides.
- Check the course website regularly for any announcements or material.
- Office hours are mainly after class, but feel free to drop by any time.


Contents and Syllabus

- Basics of neural network computing, in contrast to algorithmic approaches, traditional AI problem solving, and the Von Neumann architecture.
- Important neural network models, such as Adaline and Perceptron; feedforward and feedback networks; recurrent networks; self-organizing networks (Kohonen's model and the ART models of Grossberg); and thermodynamic networks (Hopfield model, Boltzmann/Gauss/Cauchy machines).
- Learning methods, such as Hebbian learning, the Perceptron learning theorem, back-propagation learning, and unsupervised competitive learning.
- Analysis of mathematical properties of some network models will be given, and their limitations discussed.
- Applications and practical considerations of these techniques will be covered.
- Hands-on experience through a sequence of computer projects.


Course Outcomes

After completing this course, students should be able to:
- Demonstrate an understanding of the basic concepts and principles of neural computation as an approach to intelligent problem-solving
- Describe the commonly used neural network architectures and learning algorithms
- Distinguish classes of problems to which neural networks offer solutions superior to other methods
- Design a neural network to solve a particular problem


Why (Artificial) Neural Networks?

(Neuro-)Biology / (Neuro-)Physiology / Psychology:
- Exploit similarity to real (biological) neural networks.
- Build models to understand nerve and brain operation by simulation.

Computer Science / Engineering / Economics:
- Mimic certain cognitive capabilities of human beings.
- Solve learning/adaptation, prediction, and optimization problems.

Physics / Chemistry:
- Use neural network models to describe physical phenomena.
- Special case: spin glasses (alloys of magnetic and non-magnetic metals).


Neural Networks vs. AI

Physical-Symbol System Hypothesis [Newell and Simon 1976]

A physical-symbol system has the necessary and sufficient means for general intelligent action.

Neural networks process simple signals, not symbols.

So why study neural networks (in addition to Artificial Intelligence)?

- Symbol-based representations work well for inference tasks, but are fairly bad for perception tasks.
- Symbol-based expert systems tend to get slower with growing knowledge, whereas human experts tend to get faster.
- Neural networks allow for highly parallel information processing.
- There are several successful applications in industry and finance.


Biological background

- ANNs are derived from the biological neural networks that constitute the brains of humans and animals.
- Very complex structure, computational capacity superior to that of computers.
- Computational unit: the neuron
- Neuron switching time ~ 0.001 second
- Number of neurons ~ $10^{10}$
- Connections per neuron ~ $10^{4}$ to $10^{5}$
- Scene recognition time ~ 0.1 second
- 100 inference steps does not seem like enough → much parallel computation

Properties of artificial neural nets (ANNs):
- Many neuron-like threshold switching units
- Many weighted interconnections among units
- Highly parallel, distributed process
- Emphasis on tuning weights automatically
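The "100 inference steps" figure follows from dividing the scene recognition time by the neuron switching time. A minimal sketch of that arithmetic, using the approximate values from the slide (the variable names are illustrative, not from the lecture):

```python
# Approximate figures quoted on the slide.
neuron_switching_time_s = 0.001   # about 1 ms per switching event
scene_recognition_time_s = 0.1    # about 0.1 s to recognize a scene

# Upper bound on strictly sequential switching steps that fit into
# one scene recognition; round() absorbs floating-point noise.
max_sequential_steps = round(scene_recognition_time_s / neuron_switching_time_s)

print(max_sequential_steps)  # 100
```

With so few sequential steps available, the work has to be spread across many neurons operating at the same time.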


Biological background

The neuron consists of:
1. Cell body
2. Axon: the output wire of the neuron.
3. Dendrites: the input wires.
4. Synapses: form the connections with other neurons.

Signal transmission:
1. Electrical: through cell body, axon, dendrites
2. Chemical: through synapses


Biological Background

Computational unit: The neuron


Biological Background

(Very) simplified description of neural information processing

- Axon terminal releases chemicals, called neurotransmitters.
- These act on the membrane of the receptor dendrite to change its polarization. (The inside is usually 70 mV more negative than the outside.)
- Decrease in potential difference: excitatory synapse
- Increase in potential difference: inhibitory synapse
- If there is enough net excitatory input, the axon is depolarized.
- The resulting action potential travels along the axon. (Speed depends on the degree to which the axon is covered with myelin.)
- When the action potential reaches the terminal buttons, it triggers the release of neurotransmitters.


Working principle of the neuron (1)

An electrical potential (caused by the input signals) is built up within the neuron, which fires an electrical spike when saturated. (Involves very complex chemistry too.)


Working principle of the neuron (2)

- Excitatory synapses increase the ability of the target neuron to fire.
- Inhibitory synapses decrease the ability to fire.
- Binary mode of operation: either a neuron fires, or it does not!
- Refractory period ~ 1 ms: neurons need a period of relaxation (recovery) before a new spike can be fired. ⇒ Maximum firing frequency ~ 1 kHz (cf. a computer's clock rate of ~ 3-4 GHz).
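The working principle above can be sketched as a simple threshold unit. This is an illustrative simplification, not a biological model: positive weights stand in for excitatory synapses, negative weights for inhibitory ones, and the function name, weights, and threshold are hypothetical values chosen for the example.

```python
def threshold_neuron(inputs, weights, threshold):
    """Return 1 if the unit fires, 0 otherwise (binary mode of operation)."""
    # Positive weights act like excitatory synapses, negative ones like inhibitory synapses.
    net_input = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net_input >= threshold else 0


# Hypothetical unit: two excitatory synapses and one inhibitory synapse.
weights = [1.0, 1.0, -2.0]
threshold = 1.5

both_excitatory_active = threshold_neuron([1, 1, 0], weights, threshold)  # net input 2.0
inhibitor_also_active = threshold_neuron([1, 1, 1], weights, threshold)   # net input 0.0

# A refractory period of about 1 ms bounds the firing frequency at about 1 kHz.
refractory_period_s = 0.001
max_firing_rate_hz = 1.0 / refractory_period_s

print(both_excitatory_active, inhibitor_also_active)
```

The unit fires when only the excitatory inputs are active and stays silent once the inhibitory input is added.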


Biological Background

The brain is a massively parallel architecture!

- A rat has ~ $10^{10}$ neurons; a human brain has ~ $10^{12}$ neurons and ~ $10^{14}$ to $10^{15}$ synapses.
- Typically, each neuron is connected to hundreds of other neurons.
- There are short-range and long-range connections, as well as many feedback loops.
- Very complex structure (not yet entirely known).
- NNs operate in a decentralized, asynchronous way, i.e. there is no central system clock. ⇒ They can do very complex tasks, e.g. image processing, object recognition, etc.


Learning

Learning is constructing or modifying representations of what is being experienced.

Learning in ANNs is:
1. Modifying synaptic weights.
2. Adding or removing synapses.

Hebbian learning:
- The first attempt to model biological learning in neural networks.
- Theory formulated in 1949 by D. O. Hebb.
- Correlated firing of neurons: cells that fire together, wire together.


Hebbian Learning

Hebbian learning
Synapses between neurons that fire (almost) simultaneously will be strengthened:

$$\frac{d\omega_{ij}}{dt} = \eta \, x_i \, x_j$$

where $x_i$, $x_j$ are the signals propagating through the postsynaptic and presynaptic neurons, $\omega_{ij}$ is the synaptic strength, and $\eta$ is the learning rate.
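One way to see the rule in action is to discretize it in time. The sketch below applies a forward-Euler step, $\omega_{ij} \leftarrow \omega_{ij} + \Delta t \, \eta \, x_i x_j$; the step size `dt`, the function name, and the sample signal values are assumptions made for illustration.

```python
def hebbian_step(w_ij, x_i, x_j, eta, dt=1.0):
    """One forward-Euler step of d(w_ij)/dt = eta * x_i * x_j."""
    return w_ij + dt * eta * x_i * x_j


eta = 0.1
w = 0.0

# Both neurons are active in three consecutive steps: the synapse is strengthened.
for _ in range(3):
    w = hebbian_step(w, x_i=1.0, x_j=1.0, eta=eta)

# Only one neuron is active: the product x_i * x_j is zero, so the weight is unchanged.
w_unchanged = hebbian_step(w, x_i=1.0, x_j=0.0, eta=eta)

print(w, w_unchanged)
```

Note that with non-negative signals this plain rule can only increase the weight, which is what motivates the modified rule with synaptic weakening.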


Modified Hebbian Learning

Modified Hebbian learning (synaptic weakening):

$$\frac{d\omega_{ij}}{dt} = \eta \, (x_i - \hat{x}_i)(x_j - \hat{x}_j)$$

where $\hat{x}_i$, $\hat{x}_j$ denote the time averages of $x_i$, $x_j$.

Synaptic weights are depressed if there is a
1. presynaptic activation in the absence of postsynaptic activation, or
2. postsynaptic activation in the absence of presynaptic activation.

Strong physiological evidence.
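A minimal discrete-time sketch of the modified rule follows. It assumes the time averages $\hat{x}_i$, $\hat{x}_j$ are taken over the whole recorded activity window and that each time step has unit length; both are simplifications chosen for this example, and the function name is hypothetical.

```python
def modified_hebbian(x_i_seq, x_j_seq, eta, w_init=0.0):
    """Accumulate eta * (x_i - mean_i) * (x_j - mean_j) over a recorded window."""
    # Assumption: the time average is the mean over the whole window.
    mean_i = sum(x_i_seq) / len(x_i_seq)
    mean_j = sum(x_j_seq) / len(x_j_seq)
    w = w_init
    for x_i, x_j in zip(x_i_seq, x_j_seq):
        w += eta * (x_i - mean_i) * (x_j - mean_j)
    return w


# Neurons that fire together: the weight increases.
correlated = modified_hebbian([1, 0, 1, 0], [1, 0, 1, 0], eta=0.1)

# One neuron fires only when the other is silent: the weight is depressed.
anticorrelated = modified_hebbian([1, 0, 1, 0], [0, 1, 0, 1], eta=0.1)

print(correlated, anticorrelated)
```

The second case corresponds to presynaptic activation in the absence of postsynaptic activation (and vice versa), so the weight ends up negative.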


Architecture of Artificial Neural Networks

Feedforward neural networks (FFNNs)

Recurrent neural networks (RNNs)


Architecture of ANNs

Components of ANNs
- Input elements: Mediate the signals.
- Hidden layer(s): Carry out the computations.
- Output layer: Computes the output.
- Connection weights: Correspond to synapses.

Time domains:
1. Discrete-time NNs (DTNNs): Function approximation, data classification.
2. Continuous-time NNs (CTNNs): Control applications (a continuously varying signal is needed).
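The components listed above can be put together in a small feedforward pass. This is a sketch under stated assumptions: the sigmoid activation, the bias terms, and all weight values are illustrative choices that do not come from the slides.

```python
import math


def sigmoid(z):
    """Smooth activation function (an assumption; the slides mention threshold units)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Compute one layer; weights[k] holds the connection weights into unit k."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def feedforward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input elements -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Hypothetical network: 2 inputs, 2 hidden units, 1 output unit.
hidden_w = [[0.5, -0.5], [1.0, 1.0]]
hidden_b = [0.0, -1.0]
output_w = [[1.0, -1.0]]
output_b = [0.0]

y = feedforward([1.0, 0.0], hidden_w, hidden_b, output_w, output_b)
print(y)
```

Each call to `feedforward` is one discrete-time evaluation, so this corresponds to the DTNN case.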


Training of ANNs

Supervised learning
For each input vector presented to the ANN, there must be a corresponding desired output signal (training example). Strongly guided learning.

Unsupervised learning
No desired output signal is available; the network looks for interesting patterns in the data.

Reinforcement learning
Only a long-term feedback signal is available: the learning system receives a reward signal and tries to learn to maximize it. Robotics is a classical example. Weakly guided learning.
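To make "a desired output signal for each input vector" concrete, here is a minimal supervised-learning sketch. It trains a single linear unit with the delta (LMS) rule, which is an assumption chosen for illustration; the slide does not prescribe a particular algorithm, and the training examples are made up.

```python
def train_linear_unit(examples, eta=0.1, epochs=200):
    """Fit weights so that sum(w * x) approximates the desired output (delta/LMS rule)."""
    n_inputs = len(examples[0][0])
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, desired in examples:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = desired - output  # the supervision signal
            weights = [w + eta * error * x for w, x in zip(weights, inputs)]
    return weights


# Hypothetical training examples: (input vector, desired output),
# generated from the target relation y = 2 * x1 - x2.
examples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]

weights = train_linear_unit(examples)
print(weights)
```

The error term is only available because every input comes with a desired output; unsupervised and reinforcement learning have to do without it.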


Supervised learning examples

Supervised learning: have labeled examples of the correct behavior
e.g. Handwritten digit classification with the MNIST dataset

- Task: given an image of a digit, predict the digit class
- 70,000 images of handwritten digits labeled by humans
- 60,000 used to train the classifier, 10,000 to test its performance
- This dataset is the "fruit fly" of neural net research
- Current best algorithm has only a 0.23% error rate!
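As a rough illustration of what the quoted error rate means, the sketch below computes an error rate from predicted and true labels and converts 0.23% into a count of misclassified test images. The helper name and the tiny label lists are hypothetical.

```python
def error_rate(predicted, actual):
    """Fraction of examples whose predicted label differs from the true label."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


# Toy example: one of four predictions is wrong.
toy_error = error_rate([1, 2, 3, 4], [1, 2, 3, 5])

# A 0.23% error rate on the 10,000 MNIST test images.
test_set_size = 10_000
misclassified = round(0.0023 * test_set_size)

print(toy_error, misclassified)  # 0.25 23
```

So the best result quoted on the slide corresponds to about 23 wrong answers out of 10,000 test digits.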


Supervised learning examples

What makes a "2"?


Supervised learning examples
Object Recognition:

ImageNet dataset: thousands of categories, millions of labeled images. Lots of variability in viewpoint, lighting, etc.


Unsupervised learning examples
- Unsupervised learning: no labeled examples; instead, looking for interesting patterns in the data
- E.g. visualization of documents: the algorithm was given 800,000 newswire stories and learned to represent these documents as points in two-dimensional space
- Colors are based on human labels, but these weren't given to the algorithm


What are neural networks?
- Most of the biological details aren't essential, so we use vastly simplified models of neurons.
- While neural nets originally drew inspiration from the brain, nowadays we mostly think about math, statistics, etc.
- Neural networks are collections of thousands (or millions) of these simple processing units that together perform useful computations.
- Deep learning emphasizes that the algorithms often involve hierarchies with many stages of processing.
- Representation learning typically maps the raw data into some other space which makes the relationships between different things more explicit.


Deep learning
- Deep learning: many layers (stages) of processing
- E.g. this network which recognizes objects in images:
- Each of the boxes consists of many neurons similar to the one on the previous slide.


What is a representation in "Representation Learning"?

- In your past computer science courses, you may have learned about various data structures for representing words, documents, etc.
  - arrays of characters
  - dictionaries of word counts
  - tries (i.e. trees of prefixes)
- How you represent your data determines what questions are easy to answer.
  - E.g. a dict of word counts is good for questions like "What is the most common word in Hamlet?"
- Simple data structures aren't enough to do higher-level semantic reasoning, e.g.
  - Did this reviewer like the book?
  - Alice liked Harry Potter. Will she like The Hunger Games?
  - Translate this book into French.
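The word-count example can be sketched in a few lines with a dictionary-like counter. The helper name and the sample sentence (a single line rather than the full text of Hamlet) are stand-ins for illustration.

```python
from collections import Counter


def most_common_word(text):
    """Return the most frequent word in the text and its count."""
    words = text.lower().split()
    counts = Counter(words)  # dict of word counts
    return counts.most_common(1)[0]


sample = "the play's the thing wherein i'll catch the conscience of the king"
word, count = most_common_word(sample)

print(word, count)  # the 4
```

The same representation gives no direct handle on a question like "Did this reviewer like the book?", which is the limitation the slide points out.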


What is a representation in "Representation Learning"?

- In a good representation, mathematical relationships between the vectors should encode semantic relationships between the things we care about. For instance,
  - Measure similarity between words using the dot product of their vectors (or dissimilarity using Euclidean distance)
  - Represent a web page with the average of its word vectors
  - Complete analogies like "Paris is to France as London is to ___" by doing arithmetic on word vectors
- It's very hard to construct representations like these by hand, so we need to learn them from data
  - This is a big part of what neural nets do, whether it's supervised, unsupervised, or reinforcement learning!
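The three vector operations listed above can be sketched with toy data. The vectors below are hand-made and hypothetical (learned word vectors would come from data, as the slide stresses); the four axes loosely stand for "is a capital", "France", "England", and "Germany".

```python
import math

# Hypothetical hand-made word vectors, for illustration only.
vectors = {
    "paris":   [1.0, 1.0, 0.0, 0.0],
    "france":  [0.0, 1.0, 0.0, 0.0],
    "london":  [1.0, 0.0, 1.0, 0.0],
    "england": [0.0, 0.0, 1.0, 0.0],
    "berlin":  [1.0, 0.0, 0.0, 1.0],
    "germany": [0.0, 0.0, 0.0, 1.0],
}


def dot(u, v):
    """Similarity between two word vectors."""
    return sum(a * b for a, b in zip(u, v))


def euclidean(u, v):
    """Dissimilarity between two word vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average(vecs):
    """Represent a document by the average of its word vectors."""
    return [sum(component) / len(vecs) for component in zip(*vecs)]


def analogy(a, b, c):
    """Solve 'a is to b as c is to ?' via the nearest word to b - a + c."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return min(candidates, key=lambda w: euclidean(vectors[w], target))


answer = analogy("paris", "france", "london")
page_vector = average([vectors["paris"], vectors["france"]])

print(answer, page_vector)
```

With these toy vectors the analogy resolves to "england"; with real data, the point of representation learning is that such vectors are learned rather than written by hand.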
