
Page 1:

Neurobiology and Brain Sciences

2010

Page 2:

Types of Machine Learning

1. Unsupervised Learning:

• Only network inputs are available to the learning algorithm.

• The network is given only unlabeled examples.

• The network learns to categorize (cluster) the inputs.

• Example: the Hebbian plasticity rule

W_i(n+1) = W_i(n) + a · X_i(n) · Y(n)

where W_i is the weight of the i-th synapse, X is the presynaptic activity, Y is the postsynaptic activity, n indexes the synaptic changes (input patterns), and a is the amplitude of learning.
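A minimal sketch of this update in Python with NumPy; the linear output unit, the random input stream and all numerical values are our assumptions for illustration:

    import numpy as np

    a = 0.01                      # learning rate (amplitude of learning)
    rng = np.random.default_rng(0)

    W = 0.1 * rng.random(5)       # small initial weights W_i
    for n in range(100):          # n indexes the input patterns
        X = rng.random(5)         # presynaptic activities X_i(n)
        Y = W @ X                 # postsynaptic activity Y(n) of a linear unit
        W = W + a * X * Y         # W_i(n+1) = W_i(n) + a * X_i(n) * Y(n)

    print(W)                      # weights grow along correlated input directions

Note that the pure Hebbian rule only strengthens weights; without a normalization or decay term the weights grow without bound.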

Page 3:

Hebbian Rules

• In 1949, Hebb postulated that the changes in a synapse are proportional to the correlation between the firing of the neurons connected through that synapse (the pre- and postsynaptic neurons):

“Neurons that fire together, wire together”

• Examples:

Classical conditioning

Spike-timing-dependent synaptic plasticity (STDP)

Page 4:

Synaptic Plasticity and Memory

– Learning: activating an activity pattern across the cells that represents events in the world causes changes in the synaptic strengths of the neural network.

– Memory retrieval: re-activation of the modified connections, triggered by exposure to part of the learned pattern.

LTP as a Hebbian mechanism for learning and memory:

– It contains early and late phases, separate processes such that only one of them can be blocked pharmacologically.

– It is specific and associative.

– Learning correlates with LTP (classical conditioning, fear conditioning).

– Blocking LTP (via NMDA-receptor blockade) correlates with blocked learning and memory retrieval (Morris Water Maze).

– Artificially induced LTP (electrical stimulation alone) can substitute for the sensory stimulation that leads to learning and memory.

Page 5:

Application of the Hebbian learning rule: The linear associator

• The activation of each neuron in the output layer is given by a sum of weighted inputs.

• The strength of each connection is calculated from the product of the pre- and postsynaptic activities, scaled by a "learning rate" a (which determines how fast connection weights change).

Δwij = a * g[i] * f[j].

• The linear associator stores associations between a pattern of neural activations in the input layer f and a pattern of activations in the output layer g.

• Once the associations have been stored in the connection weights between layer f and layer g, the pattern in layer g can be “recalled” by presentation of the input pattern in layer f.
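A minimal sketch of the associator in Python with NumPy; the specific patterns f and g and the learning rate are illustrative assumptions:

    import numpy as np

    a = 1.0                           # learning rate
    f = np.array([1.0, 0.0, 1.0])     # input-layer activation pattern f
    g = np.array([0.0, 1.0])          # output-layer activation pattern g

    # Store the association: Δw_ij = a * g[i] * f[j] for every connection
    W = a * np.outer(g, f)

    # Recall: each output neuron sums its weighted inputs
    print(W @ f)                      # [0., 2.], proportional to the stored g

The recalled pattern is proportional, not identical, to g; in practice the input patterns are normalized so that the recall reproduces g exactly.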

Page 6:

Types of Machine Learning

2. Reinforcement Learning:

• The network is only provided with a grade, or score, which indicates network performance.

• The network learns how to act given an observation of the world. Every action has some impact on the environment, and the environment provides feedback in the form of rewards that guides the learning algorithm.

• Reinforcement learning differs from supervised learning in that correct input/output pairs are never presented, and sub-optimal actions aren’t explicitly corrected.

• Formally, the basic reinforcement learning model consists of: a set of environment states S; a set of actions A; and a set of scalar "rewards" in ℝ.
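A minimal sketch of that formal model in Python; the two states, two actions, the transition rule and the reward function are toy assumptions, chosen only to show the agent–environment loop:

    import random

    S = [0, 1]                     # set of environment states
    A = ["left", "right"]          # set of actions

    def reward(s, a):              # scalar "reward" for acting in state s
        return 1.0 if s == 1 and a == "right" else 0.0

    def step(s, a):                # toy environment dynamics
        return 1 if a == "right" else 0

    s = random.choice(S)
    total = 0.0
    for t in range(10):
        a = random.choice(A)       # an (untrained) policy picks an action
        total += reward(s, a)      # the environment returns only a score...
        s = step(s, a)             # ...and a new state; no correct answer is shown
    print("return:", total)

The learner never sees a correct input/output pair, only the accumulated reward signal.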


Page 7:

Types of Machine Learning (deducing a function from training data)

3. Supervised Learning:

• The network is provided with a set of examples of proper network behavior (inputs/targets).

- The experimenter needs to determine the type of training examples.

- The training set needs to be characteristic of the real-world use of the function.

- Determine the input feature representation of the learned function (what and how many features in the vector).

• The network generates a function that maps inputs to desired outputs.

• Example: the Perceptron


{p1, t1}, {p2, t2}, …, {pQ, tQ}
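Concretely, such a training set {p1, t1}, …, {pQ, tQ} might be represented like this in Python; the fruit feature vectors are borrowed from the perceptron demonstration later in the deck:

    # Each pair holds an input vector p_q and its desired target t_q
    training_set = [
        ((1, 1, 0), 1),   # {p1, t1}
        ((1, 0, 1), 1),   # {p2, t2}
        ((0, 0, 0), 0),   # {p3, t3}
        ((1, 1, 1), 1),   # {pQ, tQ}
    ]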

Page 8:

Application of Supervised Learning: Binary Classification

• Given learning data: (x1, y1), (x2, y2), …, (xn, yn)

• A model is constructed:

X → Model → y ∈ {0, 1}

• The output y is a linear combination of the inputs x1 … xm.

[Diagram: inputs x1, x2, …, xm feed an output unit y through weights w1, w2, …, wm.]

Page 9:

The Perceptron

h = ∑j Wj Xj

Y = (1 + sgn(h)) / 2 = (1 + sgn(W·X)) / 2

[Diagram: inputs x1, x2, …, xm feed the output y through weights w1, w2, …, wm.]

Y – output; h – sum of scaled inputs; W – synaptic weight; X – input

sgn(h) = +1 if h > 0, else −1 (so Y is 1 if h > 0, else 0)
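A direct transcription of these formulas in Python with NumPy; the example weights and input are arbitrary:

    import numpy as np

    def sgn(h):
        return 1.0 if h > 0 else -1.0    # as defined above

    def perceptron(W, X):
        h = np.dot(W, X)                 # h = sum_j W_j * X_j
        return (1.0 + sgn(h)) / 2.0      # Y in {0, 1}

    W = np.array([0.5, -0.3])
    X = np.array([1.0, 1.0])
    print(perceptron(W, X))              # 1.0, since W·X = 0.2 > 0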

Page 10:

Geometrical interpretation

W·X = W1 X1 + W2 X2

[Diagram: the perceptron with inputs X1, X2, weights W1, W2 and output Y, and the vectors W = (W1, W2) and X = (X1, X2) drawn in the plane.]

Page 11:

Geometrical interpretation

[Diagram: the vectors W and X drawn in the plane, at angles φW and φX to the X1 axis.]

W·X = W1 X1 + W2 X2
    = |W||X| (cos φW cos φX + sin φW sin φX)
    = |W||X| cos(φW − φX)

The weighted sum is therefore proportional to the cosine of the angle between the weight vector W and the input vector X.
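A quick numeric check of this identity in Python with NumPy; the two vectors are arbitrary:

    import numpy as np

    W = np.array([3.0, 1.0])
    X = np.array([1.0, 2.0])

    phi_W = np.arctan2(W[1], W[0])
    phi_X = np.arctan2(X[1], X[0])

    lhs = W @ X
    rhs = np.linalg.norm(W) * np.linalg.norm(X) * np.cos(phi_W - phi_X)
    print(lhs, rhs)                  # both 5.0, up to floating-point error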

Page 12:

Geometrical interpretation

Y = (1 + sgn(W·X)) / 2 = (1 + sgn(cos(φW − φX))) / 2

Y = 1 exactly when the angle between W and X is less than 90°, i.e. when the input falls on the weight vector's side of the plane.

[Diagram: the perceptron with inputs X1, X2, weights W1, W2 and output Y; the plane is split by the line perpendicular to W.]

Page 13:

The Perceptron

• A single layer perceptron can only learn linearly separable problems (see the sketch after this list).

• A single layer perceptron of N units can only learn N patterns.

• More than one layer of perceptrons can learn any Boolean function.

• Overtraining: with continued training, accuracy usually rises, then falls.
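The first point can be checked directly; XOR is the classic non-linearly-separable example (this brute-force check is our illustration, not from the slides):

    import itertools
    import numpy as np

    # XOR truth table: no line w1*x1 + w2*x2 = theta separates the classes
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

    def separates(w1, w2, theta):
        return all((w1 * x1 + w2 * x2 > theta) == bool(t)
                   for (x1, x2), t in data)

    grid = np.linspace(-2.0, 2.0, 41)
    print(any(separates(w1, w2, th)
              for w1, w2, th in itertools.product(grid, grid, grid)))
    # False: no weights on this grid implement XOR with a single layer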

Page 14:

Perceptron Learning Demonstration

Page 15:

Perceptron Learning Demonstration

Input features:

Taste: Sweet = 1, Not_Sweet = 0

Seeds: Edible = 1, Not_Edible = 0

Skin: Edible = 1, Not_Edible = 0

Output: sweet fruit = 1, not sweet fruit = 0

Page 16:

We start with no knowledge: all three weights are 0.0.

If ∑ > 0.4 then fire.

[Diagram: Taste, Seeds and Skin inputs feed the output unit through weights 0.0, 0.0, 0.0.]

Page 17:

Perceptron Learning

• To train the perceptron, we will show it each example and have it categorize each one.

• Since it’s starting with no knowledge, it is going to make mistakes.

• When it makes a mistake, we are going to adjust the weights to make that mistake less likely in the future.

• When we adjust the weights, we’re going to take relatively small steps to be sure we don’t over-correct and create new problems.


Page 18:

1. We show it a banana: Taste = 1, Seeds = 1, Skin = 0.

[Diagram: inputs 1, 1, 0 feed the output unit through weights 0.0, 0.0, 0.0. If ∑ > 0.4 then fire. Output: 0.]

In this case we have:

[(1 * 0) = 0] + [(1 * 0) = 0] + [(0 * 0) = 0] = 0

Since that is less than the threshold (0.4), we responded “no.”

Is that correct? No.

Since we got it wrong, we need to change the weights using the delta rule:

∆w = learning rate * (overall teacher - overall output) * node output

Page 19:

∆w = learning rate * (overall teacher - overall output) * node output

1. Learning rate: We set that ourselves. It has to be large enough that learning happens in a reasonable amount of time, but small enough that we don't overshoot. (Let's pick 0.25.)

2. (overall teacher - overall output): The teacher knows the correct answer (e.g., that a banana should be a sweet fruit).

In this case, the teacher says 1, the output is 0, so (1 - 0) = 1.

3. Node output: That’s what came out of the node whose weight we’re adjusting.

First node (taste): ∆w = 0.25 × 1 × 1 = 0.25.


Page 20:

The Delta Rule

∆w = learning rate * (overall teacher - overall output) * node output

• If we get the categorization right, (overall teacher - overall output) will be zero (the right answer minus itself). In other words, if we get it right, we won't change any of the weights.

• If we get the categorization wrong, (overall teacher - overall output) will either be -1 or +1:

- If we said “yes” when the answer was “no,” we’re too high on the weights and we will get a (teacher - output) of -1 which will result in reducing the weights.

- If we said “no” when the answer was “yes,” we’re too low on the weights and this will cause them to be increased.


Page 21:

The Delta Rule

∆w = learning rate * (overall teacher - overall output) * node output

• If the node whose weight we’re adjusting is “0”, then it didn’t participate in making the decision. In that case, it shouldn’t be adjusted. Multiplying by zero will make that happen.

• If the node whose weight we’re adjusting is “1”, then it did participate and we should change the weight (up or down as needed).
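A minimal Python sketch of one such update; the function name is ours, while the rule and the banana numbers are from the slides:

    def delta_rule(weights, inputs, teacher, output, learning_rate=0.25):
        # Δw = learning_rate * (teacher - output) * node output, per weight
        return [w + learning_rate * (teacher - output) * x
                for w, x in zip(weights, inputs)]

    # Banana: inputs (Taste, Seeds, Skin) = (1, 1, 0), teacher 1, output 0
    print(delta_rule([0.0, 0.0, 0.0], [1, 1, 0], teacher=1, output=0))
    # -> [0.25, 0.25, 0.0]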

Page 22:

How do we change the weights for a banana?

Feature   Learning rate   (teacher - output)   Node output   ∆w
taste     0.25            1                    1             +0.25
seeds     0.25            1                    1             +0.25
skin      0.25            1                    0             0

[Diagram: before, the weights were 0.0, 0.0, 0.0; after the update they are 0.25 (Taste), 0.25 (Seeds), 0.0 (Skin). If ∑ > 0.4 then fire.]

Page 23:

2. We show it a pear: Taste = 1, Seeds = 0, Skin = 1.

∑ = (1 × 0.25) + (0 × 0.25) + (1 × 0.0) = 0.25, which is below the threshold (0.4), so the output is 0. The teacher says 1, so we got it wrong again.

[Diagram: inputs 1, 0, 1 feed the output unit through weights 0.25, 0.25, 0.0. If ∑ > 0.4 then fire. Output: 0.]

Page 24:

We change the weights for a pear:

Feature   Learning rate   (teacher - output)   Node output   ∆w
taste     0.25            1                    1             +0.25
seeds     0.25            1                    0             0
skin      0.25            1                    1             +0.25

Adjusted weights for a pear:

[Diagram: the weights are now 0.50 (Taste), 0.25 (Seeds), 0.25 (Skin). If ∑ > 0.4 then fire.]

Page 25:

3. We show it a lemon: Taste = 0, Seeds = 0, Skin = 0.

∑ = 0, the unit does not fire, and the output is 0. The teacher also says 0, so this one is correct.

[Diagram: inputs 0, 0, 0 feed the output unit through weights 0.50, 0.25, 0.25. If ∑ > 0.4 then fire. Output: 0.]

Page 26:

We change the weights for a lemon:

Feature   Learning rate   (teacher - output)   Node output   ∆w
taste     0.25            0                    0             0
seeds     0.25            0                    0             0
skin      0.25            0                    0             0

Adjusted weights for a lemon: nothing changes.

[Diagram: the weights remain 0.50, 0.25, 0.25. If ∑ > 0.4 then fire.]

Page 27:

4. We show it a strawberry: Taste = 1, Seeds = 1, Skin = 1.

∑ = (1 × 0.50) + (1 × 0.25) + (1 × 0.25) = 1.0, which is above the threshold, so the unit fires and the output is 1. The teacher says 1, so this is correct too.

[Diagram: inputs 1, 1, 1 feed the output unit through weights 0.50, 0.25, 0.25. If ∑ > 0.4 then fire. Output: 1.]

Page 28:

We change the weights for a strawberry:

Feature   Learning rate   (teacher - output)   Node output   ∆w
taste     0.25            0                    1             0
seeds     0.25            0                    1             0
skin      0.25            0                    1             0

Adjusted weights for a strawberry: again, nothing changes.

[Diagram: the weights remain 0.50, 0.25, 0.25. If ∑ > 0.4 then fire.]

Page 29:

5. We show it a green apple: Taste = 0, Seeds = 0, Skin = 1.

∑ = (0 × 0.50) + (0 × 0.25) + (1 × 0.25) = 0.25, below the threshold, so the output is 0, which is correct for a fruit that is not sweet.

[Diagram: inputs 0, 0, 1 feed the output unit through weights 0.50, 0.25, 0.25. If ∑ > 0.4 then fire. Output: 0.]

The perceptron can now classify every example correctly.
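The whole demonstration fits in a few lines of Python; the fruit list and loop structure are ours, while the threshold (0.4), learning rate (0.25) and feature values come from the slides:

    examples = [                      # (Taste, Seeds, Skin), teacher
        ("banana",      (1, 1, 0), 1),
        ("pear",        (1, 0, 1), 1),
        ("lemon",       (0, 0, 0), 0),
        ("strawberry",  (1, 1, 1), 1),
        ("green apple", (0, 0, 1), 0),
    ]

    weights = [0.0, 0.0, 0.0]         # we start with no knowledge
    threshold, lr = 0.4, 0.25

    for name, x, teacher in examples:
        s = sum(w * xi for w, xi in zip(weights, x))
        output = 1 if s > threshold else 0
        # delta rule: weights change only on mistakes
        weights = [w + lr * (teacher - output) * xi
                   for w, xi in zip(weights, x)]
        print(name, "->", output, "weights:", weights)

    # Final weights [0.5, 0.25, 0.25] classify all five examples correctly.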

Page 30:

Decision Making

• Neuroanatomical substrates of decision making:

Orbitofrontal cortex (within the prefrontal cortex): Responsible for processing, evaluating and filtering the social and emotional information needed for appropriate decision making. It is thought to be involved through its on-line, rapid evaluation of stimulus-reinforcement associations, that is, learning to link a stimulus and action with its reinforcing properties.

Anterior cingulate cortex: Controls and selects appropriate behavior, and monitors the organism's errors and incorrect responses.

Dorsolateral prefrontal cortex (DLPFC): Monitors errors and makes appropriate choices during decision making; performs cost-benefit analysis in working memory.

Basal ganglia-thalamocortical circuits (BGTC) and frontoparietal networks: Direct attention toward relevant, as opposed to irrelevant, information during goal-related decision making.

Page 31:

Decision Making

• Neuroanatomical substrates of decision making:

The dopaminergic system: Appears to be a primary substrate for the representation of decision utility. Increased firing of dopamine neurons has been documented when people are faced with unexpected rewards and in response to stimuli that predict future rewards.

The ventral striatum: The center where "data" from the prefrontal cortex, amygdala and hippocampus are integrated. It plays a critical role in representing the magnitude of anticipated reward.

The amygdala: Involved in emotion and learning, and responsible for producing fear responses. It plays a key role in representing the utility of gains and the disutility of losses.

Page 32:

Decision Making

• Factors that impact decision making:

Expertise: with expertise come differences in the function and structure of brain regions required for decision making and task completion.

- London black-cab drivers, who are required to learn and memorize London's streets, show a different distribution of hippocampal volume compared to ordinary drivers.

- Physics experts use a "working forwards" strategy to solve problems, making decisions using the information given in the problem to derive a solution. In contrast, neophytes to physics typically employ a "working backwards" strategy, in which they start from the perceived goal state or decision and backtrack.

Age: with age come changes in the recruitment of specific brain regions for task completion during decision making. Older adults will often compensate for age-related declines in prefrontal structure and function by recruiting additional prefrontal regions and more posterior regions.

Sex: a bias toward faster decision making in men in situations of uncertainty and limited feedback.

Page 33:

Neural Activity Correlates of Decision Making

• Neural correlates of decision variables in parietal cortex (M.L. Platt & P.W. Glimcher, 1999):

The gain (or reward) a monkey can expect to realize from an eye-movement response modulates the activity of neurons in the lateral intraparietal area (LIP). In addition, the activity of these neurons is sensitive to the probability that a particular response will result in a gain.

Page 34:

Neural Activity Correlates of Decision Making

• “Neurons in the orbitofrontal cortex encode economic value” (C. Padoa-Schioppa & J.A. Assad, 2006):

- Neurons in the orbitofrontal cortex (OFC) encode the value of offered and chosen goods.

- OFC neurons encode value independently of visuospatial factors and motor responses. (If a monkey chooses between A and B, neurons in the OFC encode the value of the two goods independently of whether A is presented on the right and B on the left, or vice versa).

Conclusion: economic choice is essentially choice between goods rather than choice between actions.

Page 35:

Neural Activity Correlates of Decision Making

• “Microstimulation of macaque area LIP affects decision-making in a motion discrimination task” (TD Hanks, J Ditterich & MN Shadlen, 2006):

- In each experiment, they identified a cluster of LIP cells with overlapping response fields (RFs).

- Choices toward the stimulated RF were faster with microstimulation, while choices in the opposite direction were slower.

- Microstimulation never directly evoked saccades, nor did it change reaction times in a simple saccade task.

- These results demonstrate that the discharge of LIP neurons is causally related to decision formation in the discrimination task.