computational neuroscience · coursera: computational neuroscience class notes
TRANSCRIPT
John Larkin · 12/22/16
Coursera: Computational Neuroscience Class Notes
Computational Neuroscience Course Highlights:
• Some light neurobiology
• PCA and eigenbases
• Backpropagation
• Circuit analysis for neuro models
• Eigenfaces
WEEK 1 – Introduction to Computational Neuroscience

1.1 Course Introduction

• Descriptive models
o How do neurons respond to stimuli, and how is that quantitatively encoded?
o How can we extract info from neurons (decoding)?
• How can we simulate a single neuron?
• Why do brain circuits operate the way they do?

At the end of the course…
• should be able to quantitatively describe what is going on with a neuron or a network
• simulate behavior of neurons
• formulate computational models of neurons

1.2 Descriptive Models

• Goal: explain how brains generate behaviors
• Going to characterize what nervous systems do, how they function, and why they operate in particular ways
o Descriptive models (what)
o Mechanistic models (how the neural system does what it does)
o Interpretive models (why)
• Output from a brain cell → action potential
• Def: receptive field:
o Specific properties of a sensory stimulus that generate a strong response from the cell
• Retina – layer of tissue at the back of the eyes
o Inverted image projected onto the back of the eyes
o Retinal ganglion cells – convey information about the image to other parts of the brain
• Information from the retina is passed to the Lateral Geniculate Nucleus (LGN), which then passes information to the Primary Visual Cortex V1.
• Center-surround LGN receptive fields, displaced along a line, give rise to the preferred orientation of a primary visual cortex cell
1.3 Mechanistic and Interpretive Models

• Efficient coding hypothesis – suppose the goal is to represent images as faithfully as possible using neurons with receptive fields
• Given image I, we can reconstruct it with a linear combination of receptive fields multiplied by the respective neural responses: Î(x) = Σ_i r_i f_i(x)
• We care about minimizing the total squared pixel-wise error and also making sure the responses are as independent as possible
• Idea is to start with random receptive fields and then run the coding algorithm on natural image patches
o What is the efficient coding algorithm?
§ Sparse coding
§ Independent component analysis
§ Predictive coding
• Conclusion: the brain may be trying to find faithful and efficient representations of the natural environment
1.4 The Personality of Neurons

Essentially neurobio 101
• Main character: cortical neuron
o Very small, about 25 microns
• Visual cortex
o Axons form the pyramidal tract in the motor system
• Neuron doctrine
o Neuron is the fundamental structural and functional unit
o Neurons are discrete cells
o Information flows from dendrites to the axon via the cell body
• Dendrites are like the inputs
• EPSP – excitatory post-synaptic potential
• A bunch of these get fed into the dendrites, and essentially the summation of these is what can trigger the action potential
• If some threshold is reached, then we have this action potential, which is the output
• Def: neuron
o Leaky bag of charged liquid
o Neuron insides enclosed within a cell membrane
§ Cell membrane is a lipid bilayer
§ Impermeable to charged ion species
§ BUT there are ionic channels
• The ionic channels let ions flow in and out
o Maintains a potential difference across the membrane
o The difference in ion concentrations leads to a resting potential of about −70 mV
• Ionic channels
o Voltage-gated: probability of opening depends on membrane voltage
o Chemically-gated: binding of a chemical causes the channel to open
o Mechanically-gated: sensitive to pressure or stretch
• Synapses
o Junctions between neurons
o Changes in local membrane potential
• Voltage-gated channels cause action potentials
o Depolarization opens sodium channels
o Really about the sodium and potassium balance
o The downward phase of the action potential comes from the potassium channels (the sodium channels inactivate)
• The wrapping of part of the axon is called the myelin sheath
• The myelination of axons allows for fast long-range spike communication
• The action potential hops from one non-myelinated region to the next
o These non-myelinated regions are called nodes of Ranvier
o This is essentially an active wire → lossless signal propagation
1.5 Making Connections: Synapses

• Synapse – connection between two neurons
o Electrical synapses – gap junctions
§ Helpful for when you need to synchronize
§ Neurons fire simultaneously
o Chemical synapses – neurotransmitters
§ Basis for learning and memory
§ Changes the way the other neuron is affected simply by changing receptor density
o Can be excitatory or inhibitory
§ Def: excitatory
• Tends to increase the postsynaptic membrane potential
• Tends to excite the membrane
• Neurotransmitter could be: glutamate
§ Def: inhibitory
• Tends to decrease the postsynaptic membrane potential
§ So there is a spike, release of neurotransmitter, ion channels open, sodium influx, depolarization
• Synapses are the basis for memory and learning
• Allow for learning through: synaptic plasticity
o Hebbian Plasticity
§ If a neuron repeatedly takes part in firing another neuron, then the synapse between those neurons is strengthened
§ “Neurons that fire together, wire together!”
§ Evidence: long term potentiation (LTP)
• Experimentally observed increase in synaptic strength
§ Long term depression (LTD)
• Experimentally observed decrease in synaptic strength
§ LTD is generally confirmed with a decrease in EPSP size (and LTP with an increase)
o Synaptic plasticity depends on spike timing!
o If input is after output → LTD
o If input is before output → LTP
1.6 Time to Network: Brain Areas and their Function

• Mainly two types of nervous systems
• Peripheral Nervous System (PNS)
o Two main components
o Somatic – nerves connecting to voluntary skeletal muscles and sensory receptors
o Ex. Moving your arm and hand to shake a friend's hand → utilized the SOMATIC nervous system
§ Afferent nerve fibers (incoming)
• Axons that carry info from the periphery to the CNS (central nervous system)
§ Efferent nerve fibers (outgoing)
• Carry info from the CNS to the periphery
o Autonomic
§ Nerves that connect to the heart, blood vessels, etc.
§ Guilty of the “fight or flight” reaction
• Central Nervous System (CNS)
o Spinal Cord + Brain
o Spinal Cord
§ Local feedback loops → reflex arc
• Ex: jumping up when you step on a nail
• Or jerking away from a hot surface
§ Descending motor control signals → activate spinal motor neurons
• Ex: the brain tells your body to walk; your spinal neurons are the ones that control this. So this way you can walk and also talk.
§ Ascending sensory axons
• Convey sensory information from muscles and skin to the brain
o BRAIN
§ Regions:
§ Hindbrain – medulla oblongata, pons, cerebellum
• Medulla oblongata
o Breathing, muscle tone
• Pons
o Connected to the cerebellum
o Involved in sleep and arousal
• Cerebellum
o EQUILIBRIUM
o Language and attention
o Coordination and timing of voluntary movements
§ Midbrain and Reticular Formation
• Midbrain
o Eye movements, visual and auditory reflexes
• Reticular Formation
o Modulates muscle reflexes
o Regulates sleep
o Wakefulness and arousal
§ (near center) Thalamus and Hypothalamus
• Thalamus
o “Relay station” for all sensory information to the cortex
o Regulates sleep and wakefulness
• Hypothalamus
o Right below the thalamus
o BASIC NEEDS (the four f's) <- lol:
§ FIGHTING
§ FLEEING
§ FEEDING
§ MATING
§ Cerebrum
• Consists of the cerebral cortex, basal ganglia, hippocampus, and amygdala
• Perception, motor control, cognitive functions, emotions, memory and learning
• Cerebral Cortex
o Layered sheet of neurons
o 1/8th of an inch thick
o 30 billion neurons, with about 10,000 synapses each
o 300 trillion connections in total
o Six layers of neurons
• Neural vs Digital Computing
o The brain is massively parallelized
o Adaptive connectivity
o Digital computing:
§ More sequential, via CPUs with fixed connectivity
o Large computational analogs
§ Information storage: physical/chemical structure of neurons and synapses
§ Information transmission: electrical/chemical signaling
§ Primary computing elements: neurons
§ Computational basis: unknown
WEEK 2 – Neural Encoding and Decoding

2.1 What is the Neural Code?

• Tool for recording from the brain: fMRI
o Functional magnetic resonance imaging
o Measures spatial perturbations in the magnetic field
§ The changes are caused by blood oxygenation
§ As blood flows around, you can infer the underlying neural activity
• EEGs also just show activity for a bunch of neurons
• Calcium imaging is another way to read the neural code
• What is the actual neural code?
o Let's look at the retina
o Retina – sheet of cells at the back of the eyeball
§ Takes light from the lens and converts it to electrical signals
o Raster plot – way of visualizing multiple trials
o Each neuron encodes a bit of the movie (from the experiment)
• Two questions:
o Encoding: how does a stimulus cause a pattern of responses?
§ Stimulus → response
§ P(response | stimulus): encoding
o Decoding: what do the responses tell us about the stimulus?
§ Response → stimulus
§ P(stimulus | response): decoding
• The neuron's response is quantified as some type of average firing rate of generating a spike
• Tuning curve
o Firing rate vs orientation of the light stimulus
o Looks roughly Gaussian
• There are higher orders of spatial recognition
• MRIs highlight different regions when subjects are shown faces vs houses
• Tuning curves can be difficult to record
• Building up complex selectivity
o Brain areas build up the complexity of stimulus representation
o Geometric in retina and thalamus, to V1 (oriented edges), and then V4
o Higher order areas are less sensitive to details such as color or location
o This is the idea behind hierarchical features in a feedforward way
2.2 Neural Encoding: Simple Models

• Basic coding model
o Linear response
§ r(t) = θ·s(t) (or maybe delayed: r(t) = θ·s(t − τ))
§ just going to be delayed and scaled by a little bit
o Temporal filtering (convolution)
§ We expect the response to depend on the combination of recent inputs
§ r(t) = Σ_{k=0}^{n} s_{t−k} f_k
§ this is like convolution
§ in fact the exact definition. See Cheever's page for a refresher.
§ Examples (see the sketch at the end of this section):
• Running average
• Leaky average
o Spatial filtering
§ Connected with receptive fields
§ Temporal (from before): r(t) = Σ_k s_{t−k} f_k
§ Spatial: r(x, y) = Σ_{x′,y′=−n}^{n} s_{x−x′, y−y′} f_{x′,y′}
§ The receptive field is f. How similar the stimulus is to the receptive field is expressed by filtering with f
§ Often our receptive field f is going to be a difference of Gaussians
§ A difference of Gaussians really just picks up the edges
o Spatiotemporal filtering
§ Both space and time together are going to be best
§ We need a combination
o Another solution is to have a linear filter and a nonlinearity
§ Something like:
§ r(t) = g(∫ s(t − τ) f(τ) dτ)
§ How do you find the components of the model?
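A minimal numpy sketch of the temporal-filtering and linear-nonlinear ideas above; the stimulus, the filter time constant, and the choice of nonlinearity g are my own illustrative assumptions, not values from the course:

```python
import numpy as np

dt = 0.001                      # time step (s)
t = np.arange(0, 1, dt)
s = np.random.randn(len(t))     # white-noise stimulus

# A "leaky average" filter: exponentially decaying weights over recent inputs
tau = 0.02                      # decay time constant (s), assumed
f = np.exp(-np.arange(0, 0.1, dt) / tau)
f /= f.sum()                    # normalize so the filter averages rather than scales

# Convolution implements r(t) = sum_k s_{t-k} f_k; trim to the stimulus length
r_linear = np.convolve(s, f)[:len(t)]

# Linear-nonlinear version: pass the filtered signal through a nonlinearity g
g = lambda x: 1.0 / (1.0 + np.exp(-5 * x))   # sigmoid, an assumed choice of g
r_ln = g(r_linear)
```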
2.3 Neural Encoding: Feature Selection

• A good basic coding model: combination of a linear filter and a nonlinear input-output function
• One problem is dimensionality
• Need to find the feature that drives the neuron
• Just enough so we can learn what really drives the cell
• Start with s(t) and discretize
• What is the right stimulus to use?
o Gaussian white noise
o We choose a new Gaussian number at each frequency
o The prior distribution is the distribution of the stimulus
o Multivariate Gaussian – Gaussian no matter how we look at it
• Determining linear features → one good way is to take the average
o The vector through this average → spike-triggered average
o Then we can take all of the other points and project them along that axis
• Linear filtering = convolution = projection
• Looking for a stimulus feature f, which is a vector in high-dimensional stimulus space
• Summary: find a feature by (see the sketch below):
o Stimulating with white noise
o Using reverse correlation to compute the spike-triggered average
o This is a good approximation to our feature
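A small simulated example of that recipe; the “true” filter, the threshold toy neuron, and all sizes are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, window = 50_000, 100          # stimulus length and STA window (samples)

# A made-up "true" filter for the toy neuron
true_f = np.exp(-np.arange(window) / 20.0) * np.sin(np.arange(window) / 5.0)

s = rng.standard_normal(n_steps)       # Gaussian white-noise stimulus

# Toy neuron: filter the stimulus, spike when the projection crosses a threshold
proj = np.convolve(s, true_f)[:n_steps]
spike_times = np.nonzero(proj > 2.0)[0]
spike_times = spike_times[spike_times >= window]   # need a full history window

# Reverse correlation: average the stimulus segment preceding each spike
sta = np.mean([s[t - window + 1 : t + 1] for t in spike_times], axis=0)
# 'sta' recovers the neuron's feature (true_f, time-reversed, up to scale)
```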
• Still, though, how do we compute the input/output function w.r.t. the feature?
• P(spike | stimulus) → P(spike | component of the stimulus extracted by the linear filter)
• Then use Bayes' Rule
• P(spike | s1) = P(s1 | spike) · P(spike) / P(s1)
o Remember, the denominator is called the prior
o And P(s1 | spike) is the spike-conditional distribution
o P(spike) is independent of the stimulus
• P(spike | s1) = P(s1 | spike) · P(spike) / P(s1)
o Let's assume a random filter → if the blue and red distributions (prior and spike-conditional) don't differ, the filter is uninformative
o If they do differ, then we might have filtered out the right feature
o What we want to see is a nice difference between the prior and the spike-conditional distribution
o This means that our input/output curve will be interesting and we can predict high firing rates
• Let's add the possibility of multiple features
• This essentially means there are several filters
• We could use PCA!! Ahhh
o This way we get the main dimensionality
o As the video puts it: a general, famous, and kind of magical tool for discovering low-dimensional structure
o The components correspond to an orthogonal set of vectors that span the cloud
o The important dimensions are some unknown linear combination of dimensions
o Gives a new basis set to represent the data → lots of compression
o Here, it is going to be some basis of our features
o Tangent: eigenfaces!!
§ We can represent almost any new face as a sum of different eigenfaces
• PCA picks out the dimension with the largest amount of variance
• Then we project the rest of the data into the feature space
• We're trying to find interesting features in the retina
• We find an “on” and an “off” feature
• Using this technique, we can plot our data on the two feature axes and we can find the on and the off features
onandtheofffeatures• NOTE:thetwofeaturesarenottheonandofffeaturethemselves,buttheyallowa
coordinatesystemwherewecanseethestructure2.4NeuralEncoding:Variability
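A quick sketch of using PCA to pull candidate features out of a spike-triggered stimulus ensemble; the data here are random stand-ins, purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n_spikes, dim = 2000, 100
windows = rng.standard_normal((n_spikes, dim))     # stand-in for real stimulus segments

# Center the ensemble, then eigendecompose its covariance matrix
centered = windows - windows.mean(axis=0)
cov = centered.T @ centered / (n_spikes - 1)
eigvals, eigvecs = np.linalg.eigh(cov)             # ascending eigenvalues

# The leading eigenvectors (largest variance) are the candidate features
features = eigvecs[:, ::-1][:, :2]                 # top two components
projections = centered @ features                  # data in the feature plane
# Plotting 'projections' is where structure (e.g. on/off clusters) would show up
```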
2.4 Neural Encoding: Variability

• Recall the Gaussian function: P(x) = A·exp(−(x − x₀)² / (2σ²))
• When we use something like PCA, we are making sure that we have a stimulus that's as symmetric as possible with respect to coordinate transformations
• But what if we don't use PCA, and we just look at the prior and the conditional distribution and ask: can I find a filter? Meaning, when I project the stimulus onto it, are the distribution and the prior as different as possible?
o Standard for measuring the difference between two probability distributions:
o KULLBACK-LEIBLER DIVERGENCE (DKL)
§ D_KL(P(s), Q(s)) = ∫ ds P(s) log₂(P(s)/Q(s))
§ So we just want to maximize this
§ Kind of turns into an optimization problem
o Maximally informative dimensions
§ Choose the filter to maximize the DKL between the spike-conditional and prior distributions
§ So we just vary our filter around, to maximize the DKL
§ Trying to find a stimulus component that is as informative as possible
§ This is a really powerful technique because it can generalize beyond Gaussian stimuli
§ HOWEVER, A DOWNSIDE IS THAT THIS IS A VERY TOUGH OPTIMIZATION PROBLEM AND GLOBAL OPTIMIZATION IS TRICKY
• Finding relevant features
o Single filter determined by the conditional average
o Family of filters from PCA
o Information theoretic methods that use the whole distribution
• An assumption that we make is that every spike is independent of the others
o Bernoulli trials
o So kind of like coin flipping
o Dividing the time sample into multiple time bins
o Sequence of n time bins where n = T/∆t
o Binomial distribution
§ p = probability of firing
§ Distribution: P_n(k) = (n choose k) p^k (1 − p)^(n−k)
§ The n choose k is there because we don't care about the way we're arranging those k spikes
§ Average: np, or rT
§ Variance: np(1 − p)
§ Fano factor: F = 1 − p ≈ 1 for small p
§ Interval distribution: P(T) = r·exp(−rT)
§ Fano factor – tests if something is a Poisson distribution or not
§ If the Fano factor == 1: it is consistent with Poisson
§ Here, we have defined r as the rate, i.e. the probability of firing per unit of time
§ T is our time
§ We do some calculations and the binomial → Poisson
§ Example problem:
• Suppose that while a stimulus is present, a neuron's mean firing rate is r = 4 spikes/second. If this neuron's spiking is characterized by Poisson spiking, then the probability that the neuron fires k spikes in T seconds is given by:
• P(k) = (rT)^k · exp(−rT) / k!
• What is the probability that, when this stimulus is shown for one second, the neuron does not fire any spikes?
• p(0) = (4·1)⁰ · e^(−4) / 0! = e^(−4)
§ Intervals between spikes have an exponential distribution
• Two strong traits of Poisson (see the sketch below):
o Fano factor == 1
o Interval distribution: exponential distribution of times
• So then we can plot the variance of the spike count vs the mean count and look at the slope
• If distributed Poisson, the slopes should all be 1. So the variance vs the mean count should have a slope of about 1
• The Poisson nature of firing gives a model for the randomness contributed by background noise
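A small simulation to check the two Poisson traits; the rate matches the worked example (4 spikes/s), everything else is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(2)
rate, dt, T, n_trials = 4.0, 0.001, 1.0, 1000   # 4 spikes/s, 1 ms bins, 1 s trials

# Bernoulli approximation: in each small bin, spike with probability r*dt
spikes = rng.random((n_trials, int(T / dt))) < rate * dt

counts = spikes.sum(axis=1)
fano = counts.var() / counts.mean()             # should be close to 1

# Inter-spike intervals pooled over trials: should look exponential, mean 1/r
isis = np.concatenate([np.diff(np.nonzero(trial)[0]) * dt for trial in spikes])
print(f"Fano factor ~ {fano:.2f}, mean ISI ~ {isis.mean():.3f} (expect {1/rate})")

# p(0 spikes in 1 s) at r = 4 Hz: matches the worked example, e^-4
print("P(no spikes):", np.mean(counts == 0), "vs", np.exp(-rate * T))
```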
• Poisson assumes spike times are independent
• Real neurons have a refractory period that prevents the cell from spiking again immediately
• Generalized linear model:
• Exponential nonlinearity → able to find all parameters of the model, using an optimization scheme that is globally convergent
• More generality, but the model is now more complex in another way
• GLM = generalized linear model
• Time-rescaling theorem
o Use the Poisson nature to test whether we have captured everything
o We can take our output spike intervals and scale them by the firing rate that's predicted
o Take the interval times and scale them by the firing rate
o These new scaled intervals should be distributed like a pure Poisson process
o I.e., as a single clean exponential
QUIZ 2

1. A cosine function is not a linear filtering system
2. The definition of a spike-triggered average for a neuron: I answered “the set of stimuli preceding a spike, each averaged over time.”
a. I got this wrong. The correct answer is “the averaged stimulus values over a given time before a spike that elicit a spike.” That should have been obvious from the python script, but alas…
3. The sampling rate is 500 samples/s, and the sampling period is the inverse of the sampling frequency, so the period is 1 s / 500 = 0.002 s = 2 ms.
4. # of time steps in our average vector is 300 ms / width between intervals = 300 ms / 2 ms (from #3) = 150.
5. Just len(num_spikes) = 53583
6. See corresponding code
7. Leaky integration? Because we can see that things are decaying away prior to the spike
8. We can kind of think of this neuron like a capacitor. I had to look this one up because I wasn't sure. But yeah, so it's kind of charging up, right? So the best thing is going to be a constant positive value, because then it will gradually charge up and the neuron will fire.
9. PCA is the best of the ways

WEEK 3 – Extracting Information from Neurons: Neural Decoding

3.1 Neural Decoding and Signal Detection Theory
• Really going to choose between two cases:
o Single neuron
o Range of choices, where there are a few neurons that might be affected by the stimulus
• Also, how do we decode in real time?
• Famous experiment to determine how noisy sensory information is interpreted
o Monkey would focus on a screen
o Watch a pattern of random dots move across the screen
o Monkey trained → follow the dots. Tracking the dot patterns.
o The dot pattern is noisy. Hard to tell which way it's going.
o The fraction that the monkey actually gets right is a function of the coherence. It looks almost like a sigmoidal function.
• Signal Detection Theory
o We can generate some graphs
o r is the number of spikes in a single trial
o Two probability distributions, approximately normal
o P(r|−) and P(r|+)
o We want to map some range of r to a decision
o This means some threshold z
o Putting the threshold at the intersection between the two Gaussians would maximize the percentage correct
o P_corr = P(+)·P(r ≥ z|+) + P(−)·(1 − P(r ≥ z|−))
o False alarms: P(r ≥ z|−)
o Good calls (hits): P(r ≥ z|+)
o These probabilities p(r|−) and p(r|+) are known as the likelihoods
o Choosing the maximum likelihood
• Likelihood ratio
o Putting a threshold on the likelihood ratio
o Choose plus whenever p(r|+)/p(r|−) > 1
o This is the most efficient statistic to use; it has the most power for its size
o This is called the Neyman-Pearson Lemma
o https://en.wikipedia.org/wiki/Neyman%E2%80%93Pearson_lemma
o Really cool lemma actually (see the sketch below)
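A toy sketch of thresholding the likelihood ratio between two Gaussian response distributions; the means, sigma, and priors are invented:

```python
import numpy as np
from scipy.stats import norm

mu_minus, mu_plus, sigma = 5.0, 7.0, 1.0     # P(r|-) ~ N(5,1), P(r|+) ~ N(7,1)
p_plus = 0.5                                  # prior P(+), assumed equal priors

def likelihood_ratio(r):
    return norm.pdf(r, mu_plus, sigma) / norm.pdf(r, mu_minus, sigma)

# Decision rule: choose "+" when the likelihood ratio exceeds 1
# (with equal priors this is the ML rule; costs/priors shift the threshold)
r_obs = 6.3
choice = "+" if likelihood_ratio(r_obs) > 1 else "-"

# Percent correct at threshold z: P(+)P(r>=z|+) + P(-)(1 - P(r>=z|-))
z = (mu_minus + mu_plus) / 2                  # crossing point for equal sigmas
p_correct = p_plus * (1 - norm.cdf(z, mu_plus, sigma)) + \
            (1 - p_plus) * norm.cdf(z, mu_minus, sigma)
print(choice, p_correct)
```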
• There seems to be a close correspondence between the decoded neural response and the monkey's behavior
• So why do we have so many neurons? TBD
• Log odds!! Ah, Zucker talked about this in mobile
• So we have
• l(s) = p(s|tiger) / p(s|breeze)
• log l(s) = log p(s|tiger) − log p(s|breeze)
• Firing rates ramp up until a certain sure decision
• But back to our trial…
• What is the actual probability that we have a tiger? It's really low! We need to take into account the priors.
• The wind or a tiger?
o Rods in your eyes can respond to light, even to a single photon
o So if we adjust our probability distributions, then we can pick out instances when there is a significant difference in firing rate
o Building in cost
o We have multiple loss functions
• Loss Functions
o Loss_minus = L_minus · P[+|r]
o Loss_plus = L_plus · P[−|r]
o Cut your losses: answer plus when Loss_plus < Loss_minus
o New criterion for the likelihood ratio:
§ p(r|+)/p(r|−) > (L_+ · P[−]) / (L_− · P[+])
3.2 Population Coding and Bayesian Estimation (kind of a tough one to get through)

• Crickets are sensitive to wind. Like wicked sensitive.
• All because of the cricket cercal cells.
• These neurons respond with peaks in one of four directions, each at 45˚ to the animal's body axis: left and right, front and back.
• The curves are approximately cosine, so the neurons respond to the cosine of the angle. A neuron's firing rate is proportional to the projection of the wind velocity onto its preferred direction (see the decoding sketch at the end of this section).
• Bayesian Inference
o p(s|r) = p(r|s) · p(s) / p(r)
o a posteriori distribution = likelihood function × prior distribution / marginal distribution
o Maximum likelihood: choose the s* which maximizes p(r|s)
• Decoding an arbitrary continuous stimulus
o Assume independence
o Assume Poisson firing
§ Spikes are random and independent
§ P_T(k) = (rT)^k · exp(−rT) / k!
§ Then we want r_a, the firing rate of neuron a in response to stimulus s
§ P(r_a|s) = (f_a(s)T)^(r_a T) · exp(−f_a(s)T) / (r_a T)!
§ P(r|s) = Π_{a=1}^{N} (f_a(s)T)^(r_a T) · exp(−f_a(s)T) / (r_a T)!, because we're assuming independence
§ We can take the log
§ The math got pretty hairy, so the original notes pasted in photos of the derivation (not reproduced here)
§ And then we take the derivative of the log likelihood and set it equal to zero to find the most likely value
§ OkIdidn’twanttowritealltheequationsoutinaworddocsoherearethepictures
§ Thismethodtakescareofweightingthembasedonthevariance• Limitations
o Tuningcurve/meanfiringrateo Correlations
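A tiny population-vector style decoder for the four cosine-tuned cercal cells; preferred directions, peak rate, and noise level are my own illustrative choices (the lecture's ML decoder would weight by variance, which this simple version does not):

```python
import numpy as np

preferred = np.deg2rad([45, 135, 225, 315])        # four preferred directions
r_max = 40.0                                       # peak rate (Hz), assumed

def rates(theta):
    """Cosine tuning, rectified: r_a = r_max * max(cos(theta - theta_a), 0)."""
    return r_max * np.maximum(np.cos(theta - preferred), 0.0)

def decode(r):
    """Population vector: sum preferred-direction unit vectors weighted by rates."""
    x = np.sum(r * np.cos(preferred))
    y = np.sum(r * np.sin(preferred))
    return np.arctan2(y, x)

true_theta = np.deg2rad(70.0)
r = rates(true_theta) + np.random.default_rng(3).normal(0, 2.0, 4)  # noisy rates
print(np.rad2deg(decode(r)))   # should come out near 70 degrees
```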
3.3 Reading Minds: Stimulus Reconstruction … should go back and rewatch

• One day – play back our dreams?
• Extend the model to handle stimuli varying continuously in time.
• We want to find the estimator s_bayes that gives us the best possible estimate
• Introduce an error function L(s, s_bayes)
• Least squares cost. So just L(s, s_bayes) = (s − s_bayes)²
• Solution: s_bayes = ∫ ds p(s|r) s
• Reading minds: fMRI
o Output predicted on BOLD signals (blood oxygen level dependent signals)
o It therefore has a delay
o ^ that's one way
o Another way is a motion energy filter
3.4 Fred Rieke on Visual Processing in the Retina

• A few rods out of 1000s are contributing signals
• All rods are generating noise
• Averaging would be a disaster
• We have access to the rods' signal and noise properties
• So we see evidence for a nonlinear threshold between the rods and the rod-bipolar cells
• Vision is working under conditions where the vast majority of rods are generating only noise
o Want to scale the distributions to take into account the prior probability

QUIZ 3
• Stimulus s. Can be one of two values, s1 or s2. Firing rate response r. Under stimulus s1 the response rate is roughly Gaussian ~N(5, 0.5²); under s2, ~N(7, 1²).
• It is twice as bad to mistakenly think that it is s2 rather than s1.
o So this is saying something about where we're thresholding.
• “The disease is very rare. The prior probability of being positive for the disease is therefore very low. MAP (maximum a posteriori) takes this into account; MLE does not. The mathematics differ in that MAP includes a term for the prior.”
o From my stats.stackexchange question I asked about this
WEEK 4 – Information Theory and Neural Coding

4.1 Information and Entropy

• Going to start by talking about entropy and information
• How to compute information for neural spike trains
• And what can this tell us about coding
• OK, so back to the monkey example:
o Information quantifies surprise
o Some overall probability p that there's a spike
o P(1) = p
o P(0) = 1 − p
o Information(1) = −log₂ p
o Information(0) = −log₂(1 − p)
• Why does the information have this form?
• Each bit of information specifies the location to within a factor of 2
• What we're really doing is multiplying the probabilities
• Entropy – average information of a random variable
o Measures variability
o Units are in bits
o Entropy counts the yes/no questions
o Entropy = −Σ_i p_i log₂ p_i
o Or in the continuous case: −∫ dx p(x) log₂ p(x)
• This is essentially just a binary search
• Worked example: H = −Σ p_i log₂ p_i with p_i = 1/8, so H = −Σ_{i=1}^{8} (1/8)·log₂(1/8) = −8 · (1/8) · (−3) = 3
• Three questions to find the car (in the example), and that's exactly the entropy
• Maximize the entropy (see the sketch below)
o Compute the entropy as a function of the probability p
o What does having a large entropy do for a code?
o Gives the most possibility for representing inputs
o You want to find the value of p such that H has a max
o If p == ½ then the two symbols are used equally often
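A quick sketch confirming that the entropy of a binary code peaks at p = 1/2:

```python
import numpy as np

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2 (1-p), in bits."""
    p = np.asarray(p, dtype=float)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

ps = np.linspace(0.01, 0.99, 99)
H = binary_entropy(ps)
print(ps[np.argmax(H)])      # ~0.5: both symbols used equally often
print(binary_entropy(0.5))   # 1 bit, the maximum for a binary variable
```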
• Entropy tells us about the intrinsic variability of the outputs
• Week 2 was asking how we know what our stimulus was
• But now, we need to incorporate our error chances
• Assume the same error rate for both responses
• How much of the entropy is accounted for by these errors?
• Total entropy: H[R] = −P(r_+) log P(r_+) − P(r_−) log P(r_−)
• Noise entropy: H[R|+] = −q log q − (1 − q) log(1 − q)
• These stimulus-driven entropies are called noise entropies
• The mutual information is the amount of entropy that is used in coding the stimulus
• MI(S, R) = total entropy − average noise entropy
• MI = −Σ_r p(r) log p(r) − Σ_s p(s)·[−Σ_r p(r|s) log p(r|s)]
• Entropy and information
o Fixing p
o Vary the noise probability
o When there is no error, the response is 1-to-1 with the stimulus. The information is just the entropy of the response.
o As the error rate increases, the error probability grows larger and larger.
o If p(r|s) = p(r), the mutual information MI of r and s is zero, because this is saying r and s are independent, and therefore no information is gained
o If the response is perfectly predicted, then the MI is 1 bit (in this binary example), because the total information is conveyed
• Mutual information measures the relationship
o The information quantifies how independent R and S are.
o Going to use the Kullback-Leibler divergence.
§ This is a measure of the difference between two probability distributions
§ Normally it is between a “true” distribution and a theoretical distribution
§ https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
o D_KL(P, Q) = ∫ dx P(x) log(P(x)/Q(x))
o Going to generalize so that the distributions are functions of s and r, so we need to integrate over both s and r
o MI = ∫ ds dr P(s, r) log(P(s, r)/(P(s)P(r))) = ∫ ds dr P(s, r) log(P(r|s)/P(r))
o = ∫ ds dr P(s, r)·[log P(r|s) − log P(r)]
o = −∫ ds dr P(s, r) log P(r) + ∫ ds dr P(s) P(r|s) log P(r|s)
o The first bit we can just integrate over s
o The second term is going to be minus the average entropy of P(r|s)
o This gives us exactly what is expected
o I(S, R) = H[R] − Σ_s P(s) H[R|s]
• Calculating mutual information (see the sketch below)
• Take one stimulus s and repeat many times → obtain P(R|s)
• Compute the variability due to noise: noise entropy → H[R|s]
• Repeat for all s and average → Σ_s P(s) H[R|s]
• Compute P(R) = Σ_s P(s) P(R|s) and the total entropy
4.2 Calculating Information in Spike Trains

• Two methods: single spikes vs lots of spikes (words)
• Mutual information = diff(total response entropy, mean noise entropy)
• Methodology:
o Divide up the voltage train into letters of size ∆t and words of length T
o Essentially then just have a 1 if we have a spike, 0 if not
o From this, compute p(w_i)
o H(w) = −Σ_i p(w_i) log p(w_i)
o How to sample P(S) → average over time
o For each time, we're given a set of words P(w|s(t))
o Then we have an average entropy
o Choose the length of the repeated sample long enough so that we sample the noise adequately
• Information in single spikes is similar to what we just saw in the previous lecture
• After a bit of math and some assumptions, the information per spike is:
• I(r, s) = (1/T) ∫₀ᵀ dt (r(t)/r̄) log₂(r(t)/r̄)
• No explicit stimulus dependence (NO NEED FOR A CODING/DECODING MODEL)
• The rate r does not have to mean the rate of spikes → can be the rate of any event
• Limitations of information:
o Spike precision; blurs r(t)
o Mean spike rate
4.3 Coding Principles

• Natural stimuli
o Huge dynamic range
o Power law scaling
• Efficient coding:
o In order to have a maximum entropy output, a good encoder should match its outputs to the distribution of its inputs
o Should be able to stretch its input axis (IN REAL TIME) so that it can accommodate the variations in the overall scaling
• Feature adaptation
o The power spectrum and the signal-to-noise ratio are large factors for the predicted receptive field at certain light levels.
o The center becomes broader at low light levels
o Choose the filter to maximize the Kullback-Leibler divergence between the spike-conditional and prior distributions
• Redundancy reduction
o Neural systems should be trying to encode as efficiently as possible
o Maximizing the entropy should take into account the joint entropy, not just the marginals added together
o Correlations can be good → error correction + correlations help discrimination
• Neuron populations should be as SPARSE as possible
o Let's say we write down a set of basis functions, φ_i
o Any image can be expressed as a weighted sum
o I(x) = Σ_i a_i φ_i(x) + ε(x)
o Want to penalize having too many active coefficients per image
o A Fourier basis represents things in sines and cosines… but it is not necessarily sparse, because the power spectrum is broad
o Sparse code – each image excites a minimum number of basis functions
• Classic and State-of-the-Art Methods:
o Models for how stimuli are coded in spikes
o Models for decoding the stimulus from neural responses
o Information theory
o A very quick glance at how coding strategies might shape other things
WEEK 5 – Computing in Carbon

5.1 – Modeling Neurons

• About to delve into circuit diagrams
• Differential equations (largely first order)
• Hodgkin-Huxley model
o Should be a review from biomedical signals
• Basic review of circuit diagrams
• Membrane patch
o We have a lipid bilayer
§ Acts like a capacitor
o Pores
o Channels
• Cell battery
o Outside the cell: higher sodium, chloride and calcium concentrations
o Inside: higher potassium levels
o Concentration gradient = battery
§ Nernst equation: E = (kT/zq) · ln([outside]/[inside])
• Currents flow through ion channels

5.2 – Spikes

• What makes a neuron compute?
• Neuron responds to steps and thresholds
o Uncover the non-linearity
• A gate has subunits that all need to be open for things to go through
• Gating depends on subunit state
o P_K = n^4
o n is the open probability
o 1 − n is closed
• Review of biomedical signals
• Independent probability of each subunit being open
• Hodgkin and Huxley's Nobel-winning equations
o Specifies conductances for the different channels
o The time constant dictates how rapidly each gating variable responds to a voltage change
• Hodgkin-Huxley Model
o Goes two different directions:
o Biophysical realm → ion channel physics, additional channels
o Simplified models → fundamental dynamics, analytical tractability
5.3 – Simplified Model Neurons

• Can one build a large model with lots of neurons?
• Capturing the basics:
o Force the model to be linear
o dV/dt = f(V) + I(t)   # nonlinear because of f(V)
o dV/dt = −a(V − V₀) + I(t)
o Like a passive membrane
o C_m dV/dt = −g_L(V − E_L) + I(t)
o ^ Integrate-and-fire model: spike and reset when V hits threshold (see the sketch at the end of this section)
• Exponential integrate-and-fire neuron
• The theta neuron
o Great for periodic spiking
o One dimensional
o dθ/dt = (1 − cos θ) + (1 + cos θ)·I(t)
• Two dimensional models
o Need a phase plane diagram
o Can find the nullclines – the curves where each derivative is equal to zero
o A fixed point is going to be an intersection between the two nullclines
• Various neurons have different firing rates and oscillations
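A minimal leaky integrate-and-fire simulation of the model above; the parameter values are typical textbook numbers, not from the course:

```python
import numpy as np

dt = 0.1e-3                      # 0.1 ms time step
T = 0.5                          # 0.5 s of simulation
C_m, g_L = 1e-9, 50e-9           # 1 nF capacitance, 50 nS leak conductance
E_L, V_th, V_reset = -70e-3, -54e-3, -70e-3   # volts
I_ext = 1.0e-9                   # constant injected current (A), assumed

V = E_L
spike_times = []
for step in range(int(T / dt)):
    dV = (-g_L * (V - E_L) + I_ext) / C_m    # C_m dV/dt = -g_L (V - E_L) + I
    V += dV * dt
    if V >= V_th:                # threshold crossing: record spike, reset
        spike_times.append(step * dt)
        V = V_reset

print(f"{len(spike_times)} spikes in {T} s")
```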
5.4 – A Forest of Dendrites (should review)

• Real neurons are brutal to model
• Inject current at the cell body and record the effect in the dendrites
• So we're looking at the soma to see the response to an input at some location
• Inputs that come in at different parts of the dendrite can have very different effects
• Theoretical basis for dendrite communication
o PDEs!
o Linear cables
o Voltage V is a function of both x and t
o Essentially a bunch of circuits distributed along a cable
o Now there is a spatial derivative that has to be taken into account
o Essentially the diffusion equation, but with an additional V_m/r_m term
o Time constant: τ_m = r_m·c_m
o Space constant: λ = sqrt(r_m/r_L)
o r_m is the membrane resistance
• The solution decays rapidly as a function of space
o The geometry can be extremely complicated → cable equation
o Ion channels
o Solution: divide and conquer (compartmental models)
o Each compartment = one dV/dt equation
o If branches obey a certain branching ratio, one can replace each pair of branches with a single cable segment of equivalent surface area and electrotonic length
• Ion channels introduce the nonlinearity
• Dendrites can add a lot to neuronal computation
o Logical operations
o Low-pass filtering, attenuation
o Coincidence detection
o Segregation, amplification
• Example:
o Delay lines in sound localization
Eric Shea-Brown on Neural Correlations and Synchrony

• This guy seems good
• Encoding via spikes
• Eye → optic nerve → lateral geniculate nucleus (LGN) → visual cortex
• Tuning curve – firing rate as a function of the angle of some stimulus
• Got a bunch of neurons, have a tuning curve, also variance around that mean
• Two statistics
• Still can quantify similar statistics
• Pairwise correlation → departure from independence
o Label the spike counts
o Pearson correlation coefficient
o Or just correlation coefficient
o Then you ask if that number is 0 or non-zero
o Correlation can degrade signal encoding
• Turns out that you can apply this technique to numerous neurons
o Compute the signal-to-noise ratio
§ Mean / variance
o SNR → going to grow with M (number of neurons)
o Then also observe the correlation coefficient
• Pairwise couplings of the entire population

WEEK 6 – Computing with Networks

6.1 Modeling Connections between Neurons
• Linear filter model of a synapse
• See online notes for this lecture
• Just listened to the audio

6.2 Introduction to Network Models

• Learned that neurons use synapses to connect
• Learned how to model with differential equations
• FEEDFORWARD VS RECURRENT
• Modelling networks
o Spiking neurons
§ Pro: learning based on spike timing
§ Pro: spike correlations
§ Con: computationally expensive
o Firing-rate outputs (real-valued outputs)
§ Greater efficiency, scales well to large networks
§ Ignores spike timing
o How are they related?
• Synapse b
• Input spike train ρ_b(t)
• ρ_b(t) = Σ_i δ(t − t_i)
• g_b(t) = g_{b,max} Σ_{t_i<t} K(t − t_i) = g_{b,max} ∫_{−∞}^{t} K(t − τ) ρ_b(τ) dτ
• From single synapse to multiple synapses:
o Each synapse has a synaptic weight
o Assume no nonlinear interactions
o Then the total synaptic current is
o I_s(t) = Σ_{b=1}^{N} w_b ∫_{−∞}^{t} K(t − τ) ρ_b(τ) dτ
o We go from spike train to firing rate (replace ρ_b with the rate u_b)
o This would fail if there were correlations or synchrony
• Suppose the synaptic filter K is exponential
• Firing-rate-based network model
• Output firing rate changes like: τ_r dv/dt = −v + F(I_s(t))
• Input current changes like: τ_s dI_s/dt = −I_s + w·u
• Weight matrix w
• To get the steady state, we need to set both of these equal to zero
• Static input: v_ss = F(w·u)
• THE RICH DYNAMICS THAT ARE ACTUALLY IN THE SYNAPTIC CURRENT ARE REPLACED WITH A SIGMOIDAL FUNCTION FOR ARTIFICIAL NEURAL NETWORKS
• THAT'S ONE OF THE BIG DISTINCTIONS
• HENCE ARTIFICIAL
• BIG ASSUMPTION THAT THE SYNAPSES ARE RELATIVELY FAST
• Multiple output neurons
• Then we have an input vector and an output vector
• v is now a vector. W becomes our weight matrix.
• This has all been FEEDFORWARD NETWORKS
• τ dv/dt = −v + F(Wu + Mv)
• For feedforward networks, M is a matrix of zeros
• There's no pass-back without recurrent connections!!
• Linear Feedforward Network
o Steady state: v_ss = Wu
• Edge detectors in the brain
o Primary visual cortex (V1)
o Receptive fields in V1 do edge detection
6.3 The Fascinating World of Recurrent Networks

• Want to find out how the output v(t) behaves for different M
• Eigenvectors to the rescue!
• τ dv/dt = −v + h + Mv
• Idea: use the eigenvectors of M to solve the differential equation for v
• Suppose the N×N matrix M is symmetric
• If M is symmetric, M has N orthogonal eigenvectors e_i and N eigenvalues λ_i
• It is useful for them to be orthonormal, because then we can write our output vector using the eigenvectors
o v(t) = Σ_i c_i(t) e_i
o Complete expression (see the sketch after this list): c_i(t) = (h·e_i)/(1 − λ_i) · (1 − exp(−t(1 − λ_i)/τ)) + c_i(0)·exp(−t(1 − λ_i)/τ)
o If any of the λ_i is greater than 1 → the network explodes
o If all of them are less than 1, the network is stable and v(t) converges to some steady state value
• Network performs winner-takes-all input selection
• Gain modulation in the nonlinear network
o Adding a constant amount to the input h multiplies the output
• Memory in the nonlinear network
o The network maintains some short-term memory
• Nonsymmetric recurrent networks
o Network of excitatory and inhibitory neurons
• Linear stability analysis
o Stability matrix
o THIS IS JUST THE JACOBIAN MATRIX
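A small sketch of the eigenvector solution for a linear recurrent network, cross-checked by integrating the ODE directly; the matrix and input are illustrative:

```python
import numpy as np

tau = 10e-3
M = np.array([[0.0, 0.8],
              [0.8, 0.0]])          # symmetric recurrent weights
h = np.array([1.0, 0.2])            # steady feedforward input

eigvals, eigvecs = np.linalg.eigh(M)
assert np.all(eigvals < 1), "lambda_i >= 1 would make the network explode"

# Steady state along each eigenvector: c_i = (h . e_i) / (1 - lambda_i)
c = (eigvecs.T @ h) / (1 - eigvals)
v_ss = eigvecs @ c

# Cross-check by integrating tau dv/dt = -v + h + M v
v, dt = np.zeros(2), 1e-4
for _ in range(20000):               # 2 s, plenty of time to converge
    v += dt / tau * (-v + h + M @ v)
print(v_ss, v)                       # the two should agree
```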
NOTE: could not figure out one on the quiz. Posted on Stack Overflow:
http://stackoverflow.com/questions/41492020/finding-the-steady-state-output-of-a-linear-recurrent-network

WEEK 7 – Networks that Learn: Plasticity in the Brain & Learning

7.1 Synaptic Plasticity, Hebb's Rule, and Statistical Learning
• Long term potentiation (LTP) – experimentally observed increase in synaptic strength that lasts for hours or days
• Long term depression (LTD) – experimentally observed decrease in synaptic strength that lasts for hours or days
• Hebb's Learning Rule
o If neuron A takes part in firing neuron B, then the synapse from A to B is strengthened
o Formulation as a mathematical model
§ Let's start with a linear feedforward model
§ We have a synaptic weight vector w
§ Basic Hebb rule
§ τ_w dw/dt = u·v
§ Discretization: w_{i+1} = w_i + ε·u·v
§ The Hebb rule only increases synaptic weights (LTP)
o Learning rules like this are NOT stable
o w grows without bound
o The covariance rule can both increase and decrease weights
• Start with the averaged Hebb rule: τ_w dw/dt = Q·w, where Q is the input correlation matrix
• Solve this equation to find w(t) using eigenvectors
• Substitute into the Hebb rule differential equation and simplify as before
• The synaptic weight vector is a linear combination of the eigenvectors
• It has terms that depend exponentially on the eigenvalues of the correlation matrix
• For large t, the largest-eigenvalue term dominates
• For Oja's rule: w(t) → e₁/√α
• Thus we have shown the brain can do statistics
• Hebbian learning implements principal component analysis (PCA)
• Hebbian learning learns a weight vector aligned with the principal eigenvector of the input correlation/covariance matrix (see the sketch below)
o DIRECTION OF MAXIMUM VARIANCE
7.2 Introduction to Unsupervised Learning

• Can neurons learn to represent clusters?
• Feedforward network with two neurons
• Most active neuron in the network
o The one whose weight vector is closest to the input
o We can show that by looking at the Euclidean distance between the vectors
o Given a new input, we can set the weight vector to the running average of all inputs IN THAT CLUSTER
o Then you pick the most active neuron
• Competitive learning and self-organizing maps
o Also known as Kohonen maps
o Given an input, pick the winning neuron
o Update weights for that neuron AND the other neurons in the neighborhood of the winning neuron
§ What do we mean by neighborhood?
§ We have locations assigned, and the neighboring ones are literally neighbors on a 2D grid
• Unsupervised learning
o We have causes v
o Data points u
o You kind of assume that there are multiple Gaussians given by some prior
o Mixture-of-Gaussians model
o Goal: learn a good generative model for the data you are seeing
§ Mimic the data generation process
o General approach:
§ Given data u, need to
• Estimate causes v
• Learn parameters G
• Algorithm for learning the parameters
• Expectation-Maximization algorithm (see the sketch below):
o Iterating through the expectation step
o Then the maximization step
o E step – compute the posterior distribution of v for each u
§ p(v|u; G) = p(u|v; G)·p(v; G) / p(u; G)
§ soft competition
o M step – change parameters G using the results from E
§ Just updating the mean, variance, and the prior
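A compact EM sketch for a 1D mixture of two Gaussians, matching the E/M steps above; the data and initialization are synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)
u = np.concatenate([rng.normal(-2, 0.7, 300), rng.normal(3, 1.2, 200)])

# Parameters G: means, variances, and mixing priors for the two causes
mu, var, prior = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])

def gauss(x, m, v):
    return np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

for _ in range(50):
    # E step: posterior p(v|u; G) for each data point ("soft competition")
    joint = prior * gauss(u[:, None], mu, var)          # shape (n, 2)
    resp = joint / joint.sum(axis=1, keepdims=True)
    # M step: re-estimate mean, variance, and prior from the responsibilities
    Nk = resp.sum(axis=0)
    mu = (resp * u[:, None]).sum(axis=0) / Nk
    var = (resp * (u[:, None] - mu) ** 2).sum(axis=0) / Nk
    prior = Nk / len(u)

print(mu, var, prior)   # should recover roughly means (-2, 3), vars (0.5, 1.4), priors (0.6, 0.4)
```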
• Hebb’slearningruleimplementsprincipalcomponentanalysis• Howdowelearnmodelsofnaturalimages?
o Eigenvectorsorprincipalcompenentso TurkandPentlando Eigenvectorsoftheinputcovariancematrixo Anyfaceimagecanjustbealinearsummationoftheeigenfaceso It’sabasis
o This could be great for compression! But it's not great for extracting the local components
§ Edges in a scene
§ Can't get those from an eigenvector analysis
• Define the generative model: likelihood
o Linear model
o u = G·v + noise
o You're generating the likelihood based on a probabilistic model
o A lot of machine learning algorithms and things in engineering want to maximize the log of the likelihood
o log p(u|v; G) = −(1/2)·|u − Gv|² + c
o If you MINIMIZE the squared reconstruction error, you are MAXIMIZING the likelihood of the data
o Prior
§ Can make some assumptions
§ Assume the causes v_i are independent
§ For any input, we want only a few causes v_i to be active
§ SPARSE DISTRIBUTION
• Also called a super-Gaussian distribution
• Very sharp peak, heavy tails
• Formed by taking the exponential of a sparseness function g
• p(v) = c · Π_i exp(g(v_i))
o Bayesian approach to finding v and learning G
§ Going to maximize the posterior probability of the causes
§ Equivalently, maximize the log posterior
§ F(v, G) = −(1/2)·|u − Gv|² + Σ_i g(v_i) + K
• maximize F with respect to v, keeping G fixed
• maximize F with respect to G, keeping v from above
• This is similar to the EM algorithm
• Normally, we just use gradient ascent
• dF/dv = G^T(u − Gv) + g′(v)
• firing rate dynamics:
• τ dv/dt = G^T(u − Gv) + g′(v)
• The first term is the prediction error; Gv is the prediction
• It converges to a stable value
o Learning the synaptic weights G
§ τ_G dG/dt = (u − Gv)·v^T
§ This is the Hebbian term
§ This is almost identical to Oja's rule for learning
§ Whyisn’tthisnetworkjustdoingprincipalcomponentanalysislikeOja’srule?
§ Answer:Networkistryingtocomputeasparserepresentationoftheimage
§ LearningGforNaturalImages• Thebasisvectorsareabunchofbars• Likeonehotvectors• Theg_ilooklikelocaledgeorbarfeaturesSIMILARTO
RECEPTIVEFIELDSINPRIMARYVISUALCORTEX
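A rough sketch of the sparse coding loop (inference dynamics plus Hebbian-like learning of G) under the assumptions above; g(v) = −λ|v| is one common sparseness choice, and the sizes, step sizes, and column normalization are my own additions to keep it stable:

```python
import numpy as np

rng = np.random.default_rng(6)
n_pixels, n_causes, lam = 64, 32, 0.2
G = rng.standard_normal((n_pixels, n_causes)) * 0.1

def infer(u, G, steps=200, dt=0.05):
    """Run the firing-rate dynamics tau dv/dt = G^T (u - G v) + g'(v)."""
    v = np.zeros(n_causes)
    for _ in range(steps):
        v += dt * (G.T @ (u - G @ v) - lam * np.sign(v))
    return v

eta = 0.01
for _ in range(500):                    # training loop over random "images"
    u = rng.standard_normal(n_pixels)   # stand-in for a natural image patch
    v = infer(u, G)
    G += eta * np.outer(u - G @ v, v)   # Hebbian-like update: (u - Gv) v^T
    G /= np.maximum(np.linalg.norm(G, axis=0), 1e-8)  # renormalize basis columns

# On natural image patches, the columns of G develop localized, oriented
# edge/bar structure; on random noise like this they stay unstructured.
```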
WEEK 8 – Learning from Supervision and Rewards

8.1 Neurons as Classifiers and Supervised Learning

• The classification problem
• Example: classifying images as faces
o What if we just group them as +1 and −1? Could we draw a line to separate those groups?
• Recall: the idealized neuron
o It's essentially thresholding
o Inputs: u_i; synaptic weights: w_i; and if Σ_i w_i·u_i > μ then we have an output spike
• This is called the “perceptron”
o We have inputs that are either +1 or −1
o We can build the equation Σ_i w_i·u_i − μ = 0, which is a hyperplane formula
o Perceptrons can classify
• So the question becomes: how do we learn the weights and the threshold?
• Perceptron learning rule (see the sketch below):
o Adjust w_i and μ according to the output error (v^d − v):
o Δw_i = ε(v^d − v)·u_i: for positive input, increases the weight if the error is positive, decreases the weight if the error is negative
o Δμ = −ε(v^d − v): decreases the threshold if the error is positive and increases it if the error is negative
• Great! So can perceptrons learn any function?
• Let's think about XOR:
o Can't really do it
o There's no line we can draw
o Perceptrons can only classify linearly separable data
• However, we can use multilayer perceptrons
• What about continuous outputs?
o Sigmoid functions!
o Output: v = g(w^T u) = g(Σ_i w_i·u_i)
o Sigmoid output function: g(a) = 1 / (1 + e^(−βa))
o Takes a range of −∞ to ∞ and compresses everything to (0, 1)
o β controls the slope
• Learning multilayer sigmoid networks
o You learn weights that minimize the output error
o E(W, w) = (1/2) Σ_i (d_i − v_i)²
o Use gradient descent!!
o How do we change the weights for the hidden layer?
o Backpropagation learning rule
o Δw_{jk} = −ε ∂E/∂w_{jk}
o The answer essentially lies in the chain rule from calculus
o Example:
o ∂E/∂w_{jk} = (∂E/∂x_j) · (∂x_j/∂w_{jk})
o The error propagates down through the neural network
o We should see the entire hidden layer affected (see the sketch below)
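A minimal backprop sketch for a two-layer sigmoid network learning XOR, the case a single perceptron can't handle; the architecture and hyperparameters are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(8)
g = lambda a: 1.0 / (1.0 + np.exp(-a))            # sigmoid with beta = 1

U = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)  # bias column
d = np.array([0.0, 1.0, 1.0, 0.0])                # XOR targets

W1 = rng.standard_normal((3, 4))                  # (inputs + bias) -> 4 hidden units
W2 = rng.standard_normal(4)                       # hidden -> output
b2 = 0.0

for _ in range(10000):
    h = g(U @ W1)                                 # hidden layer activations
    v = g(h @ W2 + b2)                            # network output
    # Backpropagation: chain rule, dE/dw_jk = dE/dx_j * dx_j/dw_jk
    delta_out = (v - d) * v * (1 - v)             # from E = 1/2 sum (d - v)^2
    delta_hid = np.outer(delta_out, W2) * h * (1 - h)
    W2 -= 0.5 * h.T @ delta_out
    b2 -= 0.5 * delta_out.sum()
    W1 -= 0.5 * U.T @ delta_hid

print(np.round(v, 2))                             # should approach [0, 1, 1, 0]
```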
8.2 Reinforcement Learning: Predicting Rewards

• We learn by trial and error. Rewards are part of this.
• We have some state, some reward, and some action
• We need to pick the action that will maximize our future reward
• Pavlov and his dog
o Classic conditioning experiments
o Training: bell → food
o After: bell → salivate
o But how do we predict rewards delivered some time after the stimulus?
• Want to have some neuron that predicts the expected total future reward
• Key idea: utilize dynamic programming
o We don't know our future rewards, so we need to approximate
o Learn the weights according to which v(t) is calculated
o Temporal difference (TD) learning rule (see the sketch below)
§ Δw(τ) = ε·[r(t) + v(t + 1) − v(t)]·u(t − τ)
§ We have a temporal difference because we have our future prediction and our current prediction
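A small sketch of the TD rule above on a toy trial (stimulus at one time, reward later); the task setup is invented:

```python
import numpy as np

n_steps, eps, n_trials = 20, 0.2, 500
stim_t, reward_t = 5, 15

w = np.zeros(n_steps)                          # one weight per delay tau
for _ in range(n_trials):
    u = np.zeros(n_steps); u[stim_t] = 1.0     # stimulus trace
    r = np.zeros(n_steps); r[reward_t] = 1.0   # delayed reward
    v = lambda t: w[: t + 1] @ u[t::-1]        # v(t) = sum_tau w(tau) u(t - tau)
    for t in range(n_steps - 1):
        delta = r[t] + v(t + 1) - v(t)         # TD error
        w[: t + 1] += eps * delta * u[t::-1]   # credit the preceding stimulus

v_final = [w[: t + 1] @ u[t::-1] for t in range(n_steps)]
print(np.round(v_final, 2))  # prediction rises at the stimulus (t=5), drops after the reward (t=15)
```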
8.3 Reinforcement Learning: Time for Action

• How does the brain use reward information to actually select the actions?
• Learn a state-to-action mapping, or a policy
o π(u) = a
o Should maximize the expected total future reward
o ⟨ Σ_{τ=0}^{T−t} r(t + τ) ⟩
• However, that's using a random policy
• Values should act as surrogate immediate rewards → a locally optimal choice leads to a globally optimal policy
• Markov Environment
o The next state only depends on the current state and the current action
o This is closely related to dynamic programming
• Putting it all together: Actor-Critic Learning
o Two separate components:
§ Actor (selects action and maintains the policy)
§ Critic (maintains the value of each state)
o 1. Critic Learning (Policy Evaluation)
§ Value of state u: v(u) = w(u)
§ w(u) ← w(u) + ε·[r(u) + v(u′) − v(u)]
o 2. Actor Learning (Policy Improvement)
§ P(a; u) = exp(β·Q_a(u)) / Σ_{a′} exp(β·Q_{a′}(u))
§ Probabilistically select an action a at state u
§ This is like a softmax function
§ It lets us explore all possibilities
o Then we repeat steps 1 and 2

End of class