
Page 1: Natural Computation II Virginia de Sa desa at cogscidesa/oldpublic_html/118b/lec2out.pdf · be the sum of n independent random variables with the same distribution (IID) with finite

Cogsci 118B

Natural Computation II

Virginia de Sa
desa at cogsci

Probability Mass function

Discrete random variables have probability mass functions P associated with them

$$\sum_a P(a) = 1$$

where the sum runs over all possible outcomes a

A discrete random variable D has an associated probability mass function (pmf) P, defined over the set of outcomes U: P(D = a) = P(a), a ∈ U.

Example: die roll, U = {1, 2, 3, 4, 5, 6}. What is the pmf of this random variable?

Property:

$$\sum_{a \in U} P(D = a) = \sum_{a \in U} P(a) = 1$$

[figure from J. Triesch: bar plot of P(a) against a = 1, ..., 6 for the die roll, a uniform distribution]

Jochen Triesch, UC San Diego, http://cogsci.ucsd.edu/~triesch
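The die-roll example can be written out in a few lines. This is only an illustrative sketch: the names `U`, `pmf` and `p_even` are chosen here, and exact fractions are used so that the sum-to-one property holds without rounding.

```python
from fractions import Fraction

# Outcomes U of a fair six-sided die (the example on the slide).
U = [1, 2, 3, 4, 5, 6]

# Uniform pmf: every outcome gets the same probability 1/|U|.
pmf = {a: Fraction(1, len(U)) for a in U}

# Defining property of a pmf: the probabilities sum to 1 over all outcomes.
total = sum(pmf.values())

# The probability of an event is the sum of the pmf over its outcomes,
# e.g. the event "the roll is even".
p_even = sum(p for a, p in pmf.items() if a % 2 == 0)

print(total)   # 1
print(p_even)  # 1/2
```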

Continuous probability density

For continuous random variables it does not make sense to talk about the probability of a particular value (which is equal to 0)

Instead we talk about probability density

p(x) is a probability density over a continuous variable

$$\Pr(x \in [a, b]) = \int_a^b p(x)\,dx$$

$$p(x) = \lim_{\Delta x \to 0} \frac{P(x < X \le x + \Delta x)}{\Delta x}$$

e.g. probability density of heights of females

$$\int_{-\infty}^{\infty} p(x)\,dx = 1$$
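A rough numerical sketch of these statements, assuming (purely for illustration) that heights follow a normal density with mean 162 cm and standard deviation 7 cm. These parameter values, and the helper names `normal_pdf` and `integrate`, are made up for the example and do not come from the lecture.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)

def integrate(f, a, b, steps=10_000):
    """Trapezoidal-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Hypothetical height parameters in cm (assumptions for this sketch).
MU, SIGMA = 162.0, 7.0

def p(x):
    return normal_pdf(x, MU, SIGMA)

# The total area under the density is (numerically) 1.
area_total = integrate(p, MU - 10 * SIGMA, MU + 10 * SIGMA)

# Pr(x in [155, 170]) is the area under p between the two limits.
prob_155_170 = integrate(p, 155.0, 170.0)

# A single value has probability 0: the interval [160, 160] has no width.
prob_point = integrate(p, 160.0, 160.0)

print(round(area_total, 6), round(prob_155_170, 4), prob_point)
```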

Expected Value

Expected value or mean

$$E(X) = \sum_x x\,P(x)$$

$$E(X) = \int x\,p(x)\,dx$$

$$E(f(X)) = \int f(x)\,p(x)\,dx$$
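For a discrete variable the same definitions can be evaluated directly. The sketch below reuses a fair die as the example; the function name `expectation` is an illustrative choice, and the discrete sum plays the role of the integral in E(f(X)).

```python
from fractions import Fraction

# pmf of a fair six-sided die (illustrative example).
pmf = {a: Fraction(1, 6) for a in range(1, 7)}

def expectation(pmf, f=lambda x: x):
    """E(f(X)) = sum over x of f(x) * P(x); with the default f this is E(X)."""
    return sum(f(x) * p for x, p in pmf.items())

mean = expectation(pmf)                       # E(X)
mean_sq = expectation(pmf, lambda x: x * x)   # E(X^2)

print(mean)     # 7/2
print(mean_sq)  # 91/6
```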

Variance

$$\mathrm{Var}(X) = E\big[(X - E(X))^2\big] = \sum_x P(x)\,(x - E(X))^2$$

$$\mathrm{Var}(X) = \int (x - E(X))^2\,p(x)\,dx$$

Note we can prove that

$$\mathrm{Var}(X) = E\big[(X - E(X))^2\big] = E(X^2) - (E(X))^2$$
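One way to see the identity is to expand the square and use linearity of expectation. The shorthand µ = E(X) is introduced here only to keep the lines short.

```latex
\begin{align*}
\mathrm{Var}(X) &= E\big[(X - \mu)^2\big], \qquad \mu = E(X) \\
                &= E\big[X^2 - 2\mu X + \mu^2\big] \\
                &= E(X^2) - 2\mu\,E(X) + \mu^2 \\
                &= E(X^2) - 2\mu^2 + \mu^2 \\
                &= E(X^2) - (E(X))^2
\end{align*}
```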

Cumulative distribution function

pdf

$$p(x) = \lim_{\Delta x \to 0} \frac{P(x < X \le x + \Delta x)}{\Delta x}$$

cdf

$$\Phi(x) = P(X \le x) = \int_{-\infty}^{x} p(x')\,dx'$$

Examples on board

• 0 ≤ Φ(x) ≤ 1

• cdf is monotonically non-decreasing

• probability of X being between A and B = P(A ≤ X ≤ B) = Φ(B) − Φ(A) = $\int_A^B p(x)\,dx$

• pdf is the derivative of the cdf (cdf is the integral of the pdf)
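The last two properties can be checked numerically. The sketch below assumes a standard normal variable as the example; the limits A = -1, B = 2 and the point x0 = 0.5 are arbitrary illustrative choices.

```python
import math

def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def cdf(x):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def integrate(f, a, b, steps=10_000):
    """Trapezoidal-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

A, B = -1.0, 2.0

# P(A <= X <= B) computed two ways: as a cdf difference and as an area.
via_cdf = cdf(B) - cdf(A)
via_integral = integrate(pdf, A, B)

# The pdf is the derivative of the cdf: compare a central difference
# of the cdf at x0 with the pdf at x0.
x0, h = 0.5, 1e-5
slope = (cdf(x0 + h) - cdf(x0 - h)) / (2 * h)

print(round(via_cdf, 6), round(via_integral, 6))
print(round(slope, 6), round(pdf(x0), 6))
```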

Gaussian Distribution

[figures from J. Triesch, two slides]

The Normal Density

[figure]

Pattern densities are commonly modeled by normal densities for several reasons:

• Central Limit Theorem: the sum of a large number of independent random variables is approximately normally distributed

• It’s analytically tractable!

• It’s been well studied

• It has the maximum entropy of all distributions with a given mean and variance

Univariate normal density

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

has mean = µ

variance = $\sigma^2$

has roughly 95% of its area within 2 standard deviations on either side of the mean (this is relevant for t-tests).
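The "roughly 95%" figure can be checked with Python's standard library. The values of `mu` and `sigma` below are arbitrary assumptions; the area within two standard deviations is the same for any choice.

```python
from statistics import NormalDist

# Arbitrary illustrative parameters; the result does not depend on them.
mu, sigma = 100.0, 15.0
dist = NormalDist(mu, sigma)

# Area within 2 standard deviations on either side of the mean.
within_2sd = dist.cdf(mu + 2 * sigma) - dist.cdf(mu - 2 * sigma)

# The multiple of sigma that captures exactly 95% of the area.
z95 = NormalDist().inv_cdf(0.975)

print(round(within_2sd, 4))  # 0.9545
print(round(z95, 2))         # 1.96
```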

Little Quiz

What is

$$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x-5)^2}{2}}\,dx = \ ?$$

What is

$$\int_{-\infty}^{\infty} e^{-\frac{(x-5)^2}{2}}\,dx = \ ?$$

What is

$$\int_{-\infty}^{5} e^{-\frac{(x-5)^2}{2}}\,dx = \ ?$$

What is

$$\int_{-\infty}^{\infty} x\, e^{-\frac{(x-5)^2}{2}}\,dx = \ ?$$
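The quiz can be checked numerically. The sketch below approximates each integral with a trapezoidal rule over a wide finite interval, which is an assumption that the tails beyond it are negligible. The closed forms in the comments follow from reading the integrand as an unnormalized normal density with mean 5 and variance 1.

```python
import math

def integrate(f, a, b, steps=40_000):
    """Trapezoidal-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

def bump(x):
    """Unnormalized normal density with mean 5 and variance 1."""
    return math.exp(-((x - 5) ** 2) / 2)

# Finite stand-ins for -infinity and +infinity (20 standard deviations out).
LO, HI = -15.0, 25.0
ROOT_2PI = math.sqrt(2 * math.pi)

q1 = integrate(lambda x: bump(x) / ROOT_2PI, LO, HI)  # normalized density: 1
q2 = integrate(bump, LO, HI)                          # sqrt(2*pi)
q3 = integrate(bump, LO, 5.0)                         # half of it, by symmetry
q4 = integrate(lambda x: x * bump(x), LO, HI)         # mean * sqrt(2*pi) = 5*sqrt(2*pi)

print(round(q1, 4), round(q2, 4), round(q3, 4), round(q4, 4))
```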

Multivariate Gaussians

Central Limit Theorem

Let

$$S_n = X_1 + X_2 + \dots + X_n$$

be the sum of n independent random variables with the same distribution (IID) with finite non-zero variance $\sigma^2$ and mean µ

Then $S_n$ is approximately normally distributed with mean $E(S_n) = n\mu$ and variance $\mathrm{Var}(S_n) = n\sigma^2$

$$\lim_{n \to \infty} P\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le x\right) = \Phi(x)$$

where Φ(x) is the probability that a standard normal variable is less than x
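A small simulation can illustrate the theorem. Everything specific here is an assumption of the sketch: the X_i are taken to be uniform on [0, 1] (mean 1/2, variance 1/12), n = 30, and the random seed is fixed only to make the run repeatable.

```python
import math
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

n = 30           # number of terms in each sum (illustrative choice)
trials = 20_000  # number of simulated sums

# Mean and standard deviation of a single Uniform(0, 1) variable.
mu = 0.5
sigma = math.sqrt(1.0 / 12.0)

def standardized_sum():
    """Draw S_n and return (S_n - n*mu) / (sigma * sqrt(n))."""
    s_n = sum(random.random() for _ in range(n))
    return (s_n - n * mu) / (sigma * math.sqrt(n))

z = [standardized_sum() for _ in range(trials)]

# Compare the empirical P(Z <= x) with the standard normal cdf Phi(x).
x = 1.0
empirical = sum(v <= x for v in z) / trials
theoretical = NormalDist().cdf(x)

print(round(empirical, 3), round(theoretical, 3))
```

Although the individual terms are far from Gaussian, the two printed numbers should be close.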


Signal Detection Theory

Consider the case of a doctor trying to determine whether a biopsy is cancerous or not.

Example setting: consider a radiologist trying to detect a tumor in an X-ray.

The parts of the X-ray picture will be represented by certain activity patterns in the radiologist’s brain, upon which the radiologist makes the decision “tumor” or “no tumor”.

[slide from J. Triesch]

For now consider the case where the doctor is looking at a one-dimensional measurement (and for the next few slides assume that the data are normally distributed for both the cancerous and non-cancerous cases, with different means but the same variance).

Signal Detection Theory

Let’s consider the simple case that the radiologist decides “tumor” if the firing rate (internal response) is above a certain threshold (criterion):

Probabilities related to hit, miss, false alarm, correct rejection:

hit rate: $p(x > x^* \mid \text{yes})$

missed detection: $p(x < x^* \mid \text{yes})$

false alarm: $p(x > x^* \mid \text{no})$

correct rejection: $p(x < x^* \mid \text{no})$

[slide from J. Triesch]

define: hit rate, false alarm rate, miss rate, correct rejection rate
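Under the equal-variance Gaussian assumption stated earlier, the four rates are areas under the two class densities on either side of the threshold. The means, standard deviation and threshold below are hypothetical values picked for this sketch.

```python
from statistics import NormalDist

# Hypothetical parameters: internal response x is N(mu_no, sigma) without a
# tumor and N(mu_yes, sigma) with a tumor (same variance, different means).
mu_no, mu_yes, sigma = 0.0, 1.5, 1.0
x_star = 0.75  # decision threshold: say "tumor" when x > x_star

no = NormalDist(mu_no, sigma)
yes = NormalDist(mu_yes, sigma)

hit_rate = 1.0 - yes.cdf(x_star)          # p(x > x* | yes)
miss_rate = yes.cdf(x_star)               # p(x < x* | yes)
false_alarm_rate = 1.0 - no.cdf(x_star)   # p(x > x* | no)
correct_rejection_rate = no.cdf(x_star)   # p(x < x* | no)

print(round(hit_rate, 4), round(miss_rate, 4))
print(round(false_alarm_rate, 4), round(correct_rejection_rate, 4))
```

Within each class the two rates sum to 1, so the hit rate and the false alarm rate together determine all four.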

Discriminability

We may want a measure of how separable the two classes are independent of our decision threshold

$$d' = \frac{|\mu_2 - \mu_1|}{\sigma}$$

intrinsic measure of discriminability (independent of decision threshold)

Depending on the overlap of the distributions and on the criterion, hits and false alarms will be more or less likely.

Note: d′ does not depend on the criterion c*; it measures the inherent difficulty of the task. But what c* should we choose? (see blackboard)

[slide from J. Triesch]
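The independence from the criterion can be illustrated numerically. For equal-variance Gaussian classes, d′ can be recovered from a hit rate and a false alarm rate as z(hit) − z(false alarm), where z is the inverse of the standard normal cdf. The parameter values and the helper name `d_prime_from_rates` are assumptions of this sketch, and the signal mean is taken to be the larger one.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def d_prime_from_rates(hit_rate, false_alarm_rate):
    """d' = z(hit) - z(false alarm), with z the inverse standard normal cdf."""
    return STD_NORMAL.inv_cdf(hit_rate) - STD_NORMAL.inv_cdf(false_alarm_rate)

# Hypothetical class parameters (equal variance, mu2 > mu1).
mu1, mu2, sigma = 0.0, 1.5, 1.0
d_true = abs(mu2 - mu1) / sigma

noise = NormalDist(mu1, sigma)
signal = NormalDist(mu2, sigma)

# Try several criteria: the rates change, the recovered d' does not.
estimates = []
for c in (0.0, 0.75, 1.5):
    hit = 1.0 - signal.cdf(c)
    false_alarm = 1.0 - noise.cdf(c)
    estimates.append(d_prime_from_rates(hit, false_alarm))
    print(c, round(hit, 3), round(false_alarm, 3), round(estimates[-1], 3))
```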

Changing the criterion (cutoff)

Changing the criterion will change the number of hits and false alarms

[figure from J. Triesch]

Signal Detection Theory

[figure]

ROC curve

A plot of hit rate vs false alarm rate

Plot the pink area (vertical axis) against the black area (horizontal axis) from the previous slide as the threshold is moved

[figure: ROC curve]

ROC curves are commonly used by people studying sensory perception to measure the discriminability between different stimuli
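An ROC curve can be traced by sweeping the threshold and recording (false alarm rate, hit rate) at each setting. The sketch below assumes equal-variance Gaussian classes with hypothetical parameters. The area under the curve is compared with Φ(d′/√2), the known closed form for that case.

```python
import math
from statistics import NormalDist

# Hypothetical equal-variance Gaussian classes.
mu1, mu2, sigma = 0.0, 1.0, 1.0
noise = NormalDist(mu1, sigma)
signal = NormalDist(mu2, sigma)

# Sweep the threshold from high to low so the false alarm rate increases.
lo, hi, steps = mu1 - 5 * sigma, mu2 + 5 * sigma, 1100
thresholds = [hi - i * (hi - lo) / steps for i in range(steps + 1)]

# Each threshold gives one ROC point (false alarm rate, hit rate).
roc = [(1.0 - noise.cdf(t), 1.0 - signal.cdf(t)) for t in thresholds]

# Area under the ROC curve by the trapezoidal rule.
auc = 0.0
for (x0, y0), (x1, y1) in zip(roc, roc[1:]):
    auc += (x1 - x0) * (y0 + y1) / 2.0

# Closed form for equal-variance Gaussians: AUC = Phi(d' / sqrt(2)).
d_prime = abs(mu2 - mu1) / sigma
auc_closed_form = NormalDist().cdf(d_prime / math.sqrt(2))

print(round(auc, 4), round(auc_closed_form, 4))
```

A larger d′ bows the curve further toward the top-left corner, while moving the threshold only moves a point along the same curve.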

Why don’t we use classification error?

Operating Characteristic

Generalizes ROC to when the two categories are not Gaussian

[figure]
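The same threshold sweep works for any pair of class densities. As one made-up illustration, suppose the internal response is exponentially distributed with rate 1 when no signal is present and rate 0.5 when it is. These distributions are assumptions of the sketch, not from the lecture. The tail probabilities are then available in closed form, and the curve satisfies hit = sqrt(false alarm).

```python
import math

# Hypothetical non-Gaussian classes: exponential densities on x >= 0.
RATE_NO, RATE_YES = 1.0, 0.5

def false_alarm_rate(t):
    """P(x > t | no) for an exponential density with rate RATE_NO."""
    return math.exp(-RATE_NO * t)

def hit_rate(t):
    """P(x > t | yes) for an exponential density with rate RATE_YES."""
    return math.exp(-RATE_YES * t)

# Sweep the threshold and collect (false alarm rate, hit rate) points.
thresholds = [0.1 * i for i in range(101)]
curve = [(false_alarm_rate(t), hit_rate(t)) for t in thresholds]

for f, h in curve[::25]:
    print(round(f, 4), round(h, 4))
```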