computational neuroscience · coursera: computational neuroscience class notes
TRANSCRIPT
John Larkin · 12/22/16
Coursera: Computational Neuroscience Class Notes
Computational Neuroscience Course Highlights:
• Some light neurobiology
• PCA and eigenbases
• Backpropagation
• Circuit analysis for neuro models
• Eigenfaces
WEEK 1 – Introduction to Computational Neuroscience

1.1 Course Introduction

• Descriptive models
o How do neurons respond to stimuli, and how is that quantitatively encoded?
o How can we extract info from neurons (decoding)?
• How can we simulate a single neuron?
• Why do brain circuits operate the way they do?

At the end of the course…
• should be able to quantitatively describe what is going on with a neuron or a network
• simulate behavior of neurons
• formulate computational models of neurons

1.2 Descriptive Models

• Goal: explain how brains generate behaviors
• Going to characterize what nervous systems do, how they function, and why they operate in particular ways
o Descriptive models (what)
o Mechanistic models (how the neural system does what it does)
o Interpretive models (why)
• Output from a brain cell → action potential
• Def: receptive field:
o Specific properties of a sensory stimulus that generate a strong response from the cell
• Retina – layer of tissue at the back of the eyes
o Inverted image projected onto the back of the eyes
o Retinal ganglion cells – convey information about the image to other parts of the brain
• Information from the retina is passed to the Lateral Geniculate Nucleus (LGN), which then passes information to the Primary Visual Cortex V1.
• Center-surround LGN receptive fields, displaced along a line, give rise to the preferred orientation of a primary visual cortex cell
1.3 Mechanistic and Interpretive Models

• Efficient coding hypothesis – suppose the goal is to represent images as faithfully as possible using neurons with receptive fields
• Given image I, we can reconstruct it with a linear combination of receptive fields multiplied by the respective neural responses: Î(x) = Σ_i r_i f_i(x)
• We care about minimizing the total squared pixel-wise error and also making sure the responses are as independent as possible
• Idea is to start with random receptive fields and then run the coding algorithm on natural image patches
o What is the efficient coding algorithm?
§ Sparse coding
§ Independent component analysis
§ Predictive coding
• Conclusion: the brain may be trying to find faithful and efficient representations of the natural environment
1.4 The Personality of Neurons

Essentially neurobio 101
• Main character: cortical neuron
o Very small, about 25 microns
• Visual cortex
o Axons form the pyramidal tract in the motor system
• Neuron doctrine
o Neuron is the fundamental structural and functional unit
o Neurons are discrete cells
o Information flows from dendrites to the axon via the cell body
• Dendrites are like the inputs
• EPSP – excitatory post-synaptic potential
• A bunch of these get fed into the dendrites, and essentially the summation of these is what can trigger the action potential
• If some threshold is reached, then we have this action potential, which is the output
• Def: neuron
o Leaky bag of charged liquid
o Neuron insides enclosed within a cell membrane
§ Cell membrane is a lipid bilayer
§ Impermeable to charged ion species
§ BUT there are ionic channels
• The ionic channels let ions flow in and out
o Maintains a potential difference across the membrane
o The difference in ion concentrations leads to a resting potential of about −70 mV
• Ionic channels
o Voltage-gated: probability of opening depends on membrane voltage
o Chemically-gated: binding of a chemical causes the channel to open
o Mechanically-gated: sensitive to pressure or stretch
• Synapses
o Junctions between neurons
o Changes in local membrane potential
• Voltage-gated channels cause action potentials
o Depolarization opens sodium channels
o Really about the sodium and potassium balance
o The downward phase of the action potential comes from the potassium channels (the sodium channels inactivate)
• The wrapping of part of the axon is called the myelin sheath
• The myelination of axons allows for fast long-range spike communication
• The action potential hops from one non-myelinated region to the next
o These non-myelinated regions are called nodes of Ranvier
o This is essentially an active wire → lossless signal propagation
1.5 Making Connections: Synapses

• Synapse – connection between two neurons
o Electrical synapses – gap junctions
§ Helpful for when you need to synchronize
§ Neurons fire simultaneously
o Chemical synapses – neurotransmitters
§ Basis for learning and memory
§ Changes the way the other neuron is affected simply by changing receptor density
o Can be excitatory or inhibitory
§ Def: excitatory
• Tends to increase the postsynaptic membrane potential
• Tends to excite the membrane
• Neurotransmitter could be: glutamate
§ Def: inhibitory
• Tends to decrease the postsynaptic membrane potential
§ So there is a spike, release of neurotransmitter, ion channels open, sodium influx, depolarization
• Synapses are the basis for memory and learning
• Allow for learning through: synaptic plasticity
o Hebbian Plasticity
§ If a neuron repeatedly takes part in firing another neuron, then the synapse between those neurons is strengthened
§ “Neurons that fire together, wire together!”
§ Evidence: long term potentiation (LTP)
• Experimentally observed increase in synaptic strength
§ Long term depression (LTD)
• Experimentally observed decrease in synaptic strength
§ LTD is generally confirmed with a decrease in EPSP size (and LTP with an increase)
o Synaptic plasticity depends on spike timing!
o If input is after output → LTD
o If input is before output → LTP
1.6 Time to Network: Brain Areas and their Function

• Mainly two types of nervous systems
• Peripheral Nervous System (PNS)
o Two main components
o Somatic – nerves connecting to voluntary skeletal muscles and sensory receptors
o Ex. Moving your arm and hand to shake a friend's hand → utilized the SOMATIC nervous system
§ Afferent nerve fibers (incoming)
• Axons that carry info from the periphery to the CNS (central nervous system)
§ Efferent nerve fibers (outgoing)
• Carry info from the CNS to the periphery
o Autonomic
§ Nerves that connect to the heart, blood vessels, etc.
§ Guilty of the “fight or flight” reaction
• Central Nervous System (CNS)
o Spinal Cord + Brain
o Spinal Cord
§ Local feedback loops → reflex arc
• Ex: jumping up when you step on a nail
• Or jerking away from a hot surface
§ Descending motor control signals → activate spinal motor neurons
• Ex: the brain tells your body to walk; your spinal neurons are the ones that control this. So this way you can walk and also talk.
§ Ascending sensory axons
• Convey sensory information from muscles and skin to the brain
o BRAIN
§ Regions:
§ Hindbrain – medulla oblongata, pons, cerebellum
• Medulla oblongata
o Breathing, muscle tone
• Pons
o Connected to the cerebellum
o Involved in sleep and arousal
• Cerebellum
o EQUILIBRIUM
o Language and attention
o Coordination and timing of voluntary movements
§ Midbrain and Reticular Formation
• Midbrain
o Eye movements, visual and auditory reflexes
• Reticular Formation
o Modulates muscle reflexes
o Regulates sleep
o Wakefulness and arousal
§ (near center) Thalamus and Hypothalamus
• Thalamus
o “Relay station” for all sensory information to the cortex
o Regulates sleep and wakefulness
• Hypothalamus
o Right below the thalamus
o BASIC NEEDS (the four f's) <- lol:
§ FIGHTING
§ FLEEING
§ FEEDING
§ MATING
§ Cerebrum
• Consists of the cerebral cortex, basal ganglia, hippocampus, and amygdala
• Perception, motor control, cognitive functions, emotions, memory and learning
• Cerebral Cortex
o Layered sheet of neurons
o 1/8th of an inch thick
o 30 billion neurons, with about 10,000 synapses each
o 300 trillion connections in total
o Six layers of neurons
• Neural vs Digital Computing
o The brain is massively parallelized
o Adaptive connectivity
o Digital computing:
§ More sequential, via CPUs with fixed connectivity
o Large computational analogs
§ Information storage: physical/chemical structure of neurons and synapses
§ Information transmission: electrical/chemical signaling
§ Primary computing elements: neurons
§ Computational basis: unknown
WEEK 2 – Neural Encoding and Decoding

2.1 What is the Neural Code?

• Tool for recording from the brain: fMRI
o Functional magnetic resonance imaging
o Measures spatial perturbations in the magnetic field
§ The changes are caused by blood oxygenation
§ As blood flows around, you can infer the underlying neural activity
• EEGs also just show activity for a bunch of neurons
• Calcium imaging is another way to read the neural code
• What is the actual neural code?
o Let's look at the retina
o Retina – sheet of cells at the back of the eyeball
§ Takes light from the lens and converts it to electrical signals
o Raster plot – way of visualizing multiple trials
o Each neuron encodes a bit of the movie (from the experiment)
• Two questions:
o Encoding: how does a stimulus cause a pattern of responses?
§ Stimulus → response
§ P(response | stimulus): encoding
o Decoding: what do the responses tell us about the stimulus?
§ Response → stimulus
§ P(stimulus | response): decoding
• The neuron's response is quantified as some type of average firing rate of generating a spike
• Tuning curve
o Firing rate vs orientation of the light stimulus
o Looks roughly Gaussian
• There are higher orders of spatial recognition
• MRIs highlight different regions when subjects are shown faces vs houses
• Tuning curves can be difficult to record
• Building up complex selectivity
o Brain areas build up the complexity of stimulus representation
o Geometric in retina and thalamus, to V1 (oriented edges), and then V4
o Higher order areas are less sensitive to details such as color or location
o This is the idea behind hierarchical features in a feedforward way
2.2 Neural Encoding: Simple Models

• Basic coding model
o Linear response
§ r(t) = θ·s(t) (or maybe delayed: r(t) = θ·s(t − τ))
§ just going to be delayed and scaled by a little bit
o Temporal filtering (convolution)
§ We expect the response to depend on the combination of recent inputs
§ r(t) = Σ_{k=0}^{n} s_{t−k} f_k
§ this is like convolution
§ in fact the exact definition. See Cheever's page for a refresher.
§ Examples (see the sketch at the end of this section):
• Running average
• Leaky average
o Spatial filtering
§ Connected with receptive fields
§ Temporal (from before): r(t) = Σ_k s_{t−k} f_k
§ Spatial: r(x, y) = Σ_{x′,y′=−n}^{n} s_{x−x′, y−y′} f_{x′,y′}
§ The receptive field is f. How similar the stimulus is to the receptive field is expressed by filtering with f
§ Often our receptive field f is going to be a difference of Gaussians
§ A difference of Gaussians really just picks up the edges
o Spatiotemporal filtering
§ Both space and time together are going to be best
§ We need a combination
o Another solution is to have a linear filter and a nonlinearity
§ Something like:
§ r(t) = g(∫ s(t − τ) f(τ) dτ)
§ How do you find the components of the model?
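A minimal numpy sketch of the temporal-filtering and linear-nonlinear ideas above; the stimulus, the filter time constant, and the choice of nonlinearity g are my own illustrative assumptions, not values from the course:

```python
import numpy as np

dt = 0.001                      # time step (s)
t = np.arange(0, 1, dt)
s = np.random.randn(len(t))     # white-noise stimulus

# A "leaky average" filter: exponentially decaying weights over recent inputs
tau = 0.02                      # decay time constant (s), assumed
f = np.exp(-np.arange(0, 0.1, dt) / tau)
f /= f.sum()                    # normalize so the filter averages rather than scales

# Convolution implements r(t) = sum_k s_{t-k} f_k; trim to the stimulus length
r_linear = np.convolve(s, f)[:len(t)]

# Linear-nonlinear version: pass the filtered signal through a nonlinearity g
g = lambda x: 1.0 / (1.0 + np.exp(-5 * x))   # sigmoid, an assumed choice of g
r_ln = g(r_linear)
```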
2.3 Neural Encoding: Feature Selection

• A good basic coding model: combination of a linear filter and a nonlinear input-output function
• One problem is dimensionality
• Need to find the feature that drives the neuron
• Just enough so we can learn what really drives the cell
• Start with s(t) and discretize
• What is the right stimulus to use?
o Gaussian white noise
o We choose a new Gaussian number at each frequency
o The prior distribution is the distribution of the stimulus
o Multivariate Gaussian – Gaussian no matter how we look at it
• Determining linear features → one good way is to take the average
o The vector through this average → spike-triggered average
o Then we can take all of the other points and project them along that axis
• Linear filtering = convolution = projection
• Looking for a stimulus feature f, which is a vector in high-dimensional stimulus space
• Summary: find a feature by (see the sketch below):
o Stimulating with white noise
o Using reverse correlation to compute the spike-triggered average
o This is a good approximation to our feature
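A small simulated example of that recipe; the “true” filter, the threshold toy neuron, and all sizes are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, window = 50_000, 100          # stimulus length and STA window (samples)

# A made-up "true" filter for the toy neuron
true_f = np.exp(-np.arange(window) / 20.0) * np.sin(np.arange(window) / 5.0)

s = rng.standard_normal(n_steps)       # Gaussian white-noise stimulus

# Toy neuron: filter the stimulus, spike when the projection crosses a threshold
proj = np.convolve(s, true_f)[:n_steps]
spike_times = np.nonzero(proj > 2.0)[0]
spike_times = spike_times[spike_times >= window]   # need a full history window

# Reverse correlation: average the stimulus segment preceding each spike
sta = np.mean([s[t - window + 1 : t + 1] for t in spike_times], axis=0)
# 'sta' recovers the neuron's feature (true_f, time-reversed, up to scale)
```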
• Still, though, how do we compute the input/output function w.r.t. the feature?
• P(spike | stimulus) → P(spike | component of the stimulus extracted by the linear filter)
• Then use Bayes' Rule
• P(spike | s1) = P(s1 | spike) · P(spike) / P(s1)
o Remember, the denominator is called the prior
o And P(s1 | spike) is the spike-conditional distribution
o P(spike) is independent of the stimulus
• P(spike | s1) = P(s1 | spike) · P(spike) / P(s1)
o Let's assume a random filter → if the blue and red distributions (prior and spike-conditional) don't differ, the filter is uninformative
o If they do differ, then we might have filtered out the right feature
o What we want to see is a nice difference between the prior and the spike-conditional distribution
o This means that our input/output curve will be interesting and we can predict high firing rates
• Let's add the possibility of multiple features
• This essentially means there are several filters
• We could use PCA!! Ahhh
o This way we get the main dimensionality
o As the video puts it: a general, famous, and kind of magical tool for discovering low-dimensional structure
o The components correspond to an orthogonal set of vectors that span the cloud
o The important dimensions are some unknown linear combination of dimensions
o Gives a new basis set to represent the data → lots of compression
o Here, it is going to be some basis of our features
o Tangent: eigenfaces!!
§ We can represent almost any new face as a sum of different eigenfaces
• PCA picks out the dimension with the largest amount of variance
• Then we project the rest of the data into the feature space
• We're trying to find interesting features in the retina
• We find an “on” and an “off” feature
• Using this technique, we can plot our data on the two feature axes and we can find the on and the off features
onandtheofffeatures• NOTE:thetwofeaturesarenottheonandofffeaturethemselves,buttheyallowa
coordinatesystemwherewecanseethestructure2.4NeuralEncoding:Variability
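A quick sketch of using PCA to pull candidate features out of a spike-triggered stimulus ensemble; the data here are random stand-ins, purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n_spikes, dim = 2000, 100
windows = rng.standard_normal((n_spikes, dim))     # stand-in for real stimulus segments

# Center the ensemble, then eigendecompose its covariance matrix
centered = windows - windows.mean(axis=0)
cov = centered.T @ centered / (n_spikes - 1)
eigvals, eigvecs = np.linalg.eigh(cov)             # ascending eigenvalues

# The leading eigenvectors (largest variance) are the candidate features
features = eigvecs[:, ::-1][:, :2]                 # top two components
projections = centered @ features                  # data in the feature plane
# Plotting 'projections' is where structure (e.g. on/off clusters) would show up
```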
2.4 Neural Encoding: Variability

• Recall the Gaussian function: P(x) = A·exp(−(x − x₀)² / (2σ²))
• When we use something like PCA, we are making sure that we have a stimulus that's as symmetric as possible with respect to coordinate transformations
• But what if we don't use PCA, and we just look at the prior and the conditional distribution and ask: can I find a filter? Meaning, when I project the stimulus onto it, are the distribution and the prior as different as possible?
o Standard for measuring the difference between two probability distributions:
o KULLBACK-LEIBLER DIVERGENCE (DKL)
§ D_KL(P(s), Q(s)) = ∫ ds P(s) log₂(P(s)/Q(s))
§ So we just want to maximize this
§ Kind of turns into an optimization problem
o Maximally informative dimensions
§ Choose the filter to maximize the DKL between the spike-conditional and prior distributions
§ So we just vary our filter around, to maximize the DKL
§ Trying to find a stimulus component that is as informative as possible
§ This is a really powerful technique because it can generalize beyond Gaussian stimuli
§ HOWEVER, A DOWNSIDE IS THAT THIS IS A VERY TOUGH OPTIMIZATION PROBLEM AND GLOBAL OPTIMIZATION IS TRICKY
• Finding relevant features
o Single filter determined by the conditional average
o Family of filters from PCA
o Information theoretic methods that use the whole distribution
• An assumption that we make is that every spike is independent of the others
o Bernoulli trials
o So kind of like coin flipping
o Dividing the time sample into multiple time bins
o Sequence of n time bins where n = T/∆t
o Binomial distribution
§ p = probability of firing
§ Distribution: P_n(k) = (n choose k) p^k (1 − p)^(n−k)
§ The n choose k is there because we don't care about the way we're arranging those k spikes
§ Average: np, or rT
§ Variance: np(1 − p)
§ Fano factor: F = 1 − p ≈ 1 for small p
§ Interval distribution: P(T) = r·exp(−rT)
§ Fano factor – tests if something is a Poisson distribution or not
§ If the Fano factor == 1: it is consistent with Poisson
§ Here, we have defined r as the rate, i.e. the probability of firing per unit of time
§ T is our time
§ We do some calculations and the binomial → Poisson
§ Example problem:
• Suppose that while a stimulus is present, a neuron's mean firing rate is r = 4 spikes/second. If this neuron's spiking is characterized by Poisson spiking, then the probability that the neuron fires k spikes in T seconds is given by:
• P(k) = (rT)^k · exp(−rT) / k!
• What is the probability that, when this stimulus is shown for one second, the neuron does not fire any spikes?
• p(0) = (4·1)⁰ · e^(−4) / 0! = e^(−4)
§ Intervals between spikes have an exponential distribution
• Two strong traits of Poisson (see the sketch below):
o Fano factor == 1
o Interval distribution: exponential distribution of times
• So then we can plot the variance of the spike count vs the mean count and look at the slope
• If distributed Poisson, the slopes should all be 1. So the variance vs the mean count should have a slope of about 1
• The Poisson nature of firing gives a model for the randomness contributed by background noise
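A small simulation to check the two Poisson traits; the rate matches the worked example (4 spikes/s), everything else is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(2)
rate, dt, T, n_trials = 4.0, 0.001, 1.0, 1000   # 4 spikes/s, 1 ms bins, 1 s trials

# Bernoulli approximation: in each small bin, spike with probability r*dt
spikes = rng.random((n_trials, int(T / dt))) < rate * dt

counts = spikes.sum(axis=1)
fano = counts.var() / counts.mean()             # should be close to 1

# Inter-spike intervals pooled over trials: should look exponential, mean 1/r
isis = np.concatenate([np.diff(np.nonzero(trial)[0]) * dt for trial in spikes])
print(f"Fano factor ~ {fano:.2f}, mean ISI ~ {isis.mean():.3f} (expect {1/rate})")

# p(0 spikes in 1 s) at r = 4 Hz: matches the worked example, e^-4
print("P(no spikes):", np.mean(counts == 0), "vs", np.exp(-rate * T))
```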
• Poisson assumes spike times are independent
• Real neurons have a refractory period that prevents the cell from spiking again immediately
• Generalized linear model:
• Exponential nonlinearity → able to find all parameters of the model, using an optimization scheme that is globally convergent
• More generality, but the model is now more complex in another way
• GLM = generalized linear model
• Time-rescaling theorem
o Use the Poisson nature to test whether we have captured everything
o We can take our output spike intervals and scale them by the firing rate that's predicted
o Take the interval times and scale them by the firing rate
o These new scaled intervals should be distributed like a pure Poisson process
o I.e., as a single clean exponential
QUIZ 2

1. A cosine function is not a linear filtering system
2. The definition of a spike-triggered average for a neuron: I answered “the set of stimuli preceding a spike, each averaged over time.”
a. I got this wrong. The correct answer is “the averaged stimulus values over a given time before a spike that elicit a spike.” That should have been obvious from the python script, but alas…
3. The sampling rate is 500 samples/s, and the sampling period is the inverse of the sampling frequency, so the period is 1 s / 500 = 0.002 s = 2 ms.
4. # of time steps in our average vector is 300 ms / width between intervals = 300 ms / 2 ms (from #3) = 150.
5. Just len(num_spikes) = 53583
6. See corresponding code
7. Leaky integration? Because we can see that things are decaying away prior to the spike
8. We can kind of think of this neuron like a capacitor. I had to look this one up because I wasn't sure. But yeah, so it's kind of charging up, right? So the best thing is going to be a constant positive value, because then it will gradually charge up and the neuron will fire.
9. PCA is the best of the ways

WEEK 3 – Extracting Information from Neurons: Neural Decoding

3.1 Neural Decoding and Signal Detection Theory
• Really going to choose between two cases:
o Single neuron
o Range of choices, where there are a few neurons that might be affected by the stimulus
• Also, how do we decode in real time?
• Famous experiment to determine how noisy sensory information is interpreted
o Monkey would focus on a screen
o Watch a pattern of random dots move across the screen
o Monkey trained → follow the dots. Tracking the dot patterns.
o The dot pattern is noisy. Hard to tell which way it's going.
o The fraction that the monkey actually gets right is a function of the coherence. It looks almost like a sigmoidal function.
• Signal Detection Theory
o We can generate some graphs
o r is the number of spikes in a single trial
o Two probability distributions, approximately normal
o P(r|−) and P(r|+)
o We want to map some range of r to a decision
o This means some threshold z
o Putting the threshold at the intersection between the two Gaussians would maximize the percentage correct
o P_corr = P(+)·P(r ≥ z|+) + P(−)·(1 − P(r ≥ z|−))
o False alarms: P(r ≥ z|−)
o Good calls (hits): P(r ≥ z|+)
o These probabilities p(r|−) and p(r|+) are known as the likelihoods
o Choosing the maximum likelihood
• Likelihood ratio
o Putting a threshold on the likelihood ratio
o Choose plus whenever p(r|+)/p(r|−) > 1
o This is the most efficient statistic to use; it has the most power for its size
o This is called the Neyman-Pearson Lemma
o https://en.wikipedia.org/wiki/Neyman%E2%80%93Pearson_lemma
o Really cool lemma actually (see the sketch below)
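A toy sketch of thresholding the likelihood ratio between two Gaussian response distributions; the means, sigma, and priors are invented:

```python
import numpy as np
from scipy.stats import norm

mu_minus, mu_plus, sigma = 5.0, 7.0, 1.0     # P(r|-) ~ N(5,1), P(r|+) ~ N(7,1)
p_plus = 0.5                                  # prior P(+), assumed equal priors

def likelihood_ratio(r):
    return norm.pdf(r, mu_plus, sigma) / norm.pdf(r, mu_minus, sigma)

# Decision rule: choose "+" when the likelihood ratio exceeds 1
# (with equal priors this is the ML rule; costs/priors shift the threshold)
r_obs = 6.3
choice = "+" if likelihood_ratio(r_obs) > 1 else "-"

# Percent correct at threshold z: P(+)P(r>=z|+) + P(-)(1 - P(r>=z|-))
z = (mu_minus + mu_plus) / 2                  # crossing point for equal sigmas
p_correct = p_plus * (1 - norm.cdf(z, mu_plus, sigma)) + \
            (1 - p_plus) * norm.cdf(z, mu_minus, sigma)
print(choice, p_correct)
```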
• There seems to be a close correspondence between the decoded neural response and the monkey's behavior
• So why do we have so many neurons? TBD
• Log odds!! Ah, Zucker talked about this in mobile
• So we have
• l(s) = p(s|tiger) / p(s|breeze)
• log l(s) = log p(s|tiger) − log p(s|breeze)
• Firing rates ramp up until a certain sure decision
• But back to our trial…
• What is the actual probability that we have a tiger? It's really low! We need to take into account the priors.
• The wind or a tiger?
o Rods in your eyes can respond to light, even to a single photon
o So if we adjust our probability distributions, then we can pick out instances when there is a significant difference in firing rate
o Building in cost
o We have multiple loss functions
• Loss Functions
o Loss_minus = L_minus · P[+|r]
o Loss_plus = L_plus · P[−|r]
o Cut your losses: answer plus when Loss_plus < Loss_minus
o New criterion for the likelihood ratio:
§ p(r|+)/p(r|−) > (L_+ · P[−]) / (L_− · P[+])
3.2 Population Coding and Bayesian Estimation (kind of a tough one to get through)

• Crickets are sensitive to wind. Like wicked sensitive.
• All because of the cricket cercal cells.
• These neurons respond with peaks in one of four directions, each at 45˚ to the animal's body axis: left and right, front and back.
• The curves are approximately cosine, so the neurons respond to the cosine of the angle. A neuron's firing rate is proportional to the projection of the wind velocity onto its preferred direction (see the decoding sketch at the end of this section).
• Bayesian Inference
o p(s|r) = p(r|s) · p(s) / p(r)
o a posteriori distribution = likelihood function × prior distribution / marginal distribution
o Maximum likelihood: choose the s* which maximizes p(r|s)
• Decoding an arbitrary continuous stimulus
o Assume independence
o Assume Poisson firing
§ Spikes are random and independent
§ P_T(k) = (rT)^k · exp(−rT) / k!
§ Then we want r_a, the firing rate of neuron a in response to stimulus s
§ P(r_a|s) = (f_a(s)T)^(r_a T) · exp(−f_a(s)T) / (r_a T)!
§ P(r|s) = Π_{a=1}^{N} (f_a(s)T)^(r_a T) · exp(−f_a(s)T) / (r_a T)!, because we're assuming independence
§ We can take the log
§ The math got pretty hairy, so the original notes pasted in photos of the derivation (not reproduced here)
§ And then we take the derivative of the log likelihood and set it equal to zero to find the most likely value
§ OkIdidn’twanttowritealltheequationsoutinaworddocsoherearethepictures
§ Thismethodtakescareofweightingthembasedonthevariance• Limitations
o Tuningcurve/meanfiringrateo Correlations
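A tiny population-vector style decoder for the four cosine-tuned cercal cells; preferred directions, peak rate, and noise level are my own illustrative choices (the lecture's ML decoder would weight by variance, which this simple version does not):

```python
import numpy as np

preferred = np.deg2rad([45, 135, 225, 315])        # four preferred directions
r_max = 40.0                                       # peak rate (Hz), assumed

def rates(theta):
    """Cosine tuning, rectified: r_a = r_max * max(cos(theta - theta_a), 0)."""
    return r_max * np.maximum(np.cos(theta - preferred), 0.0)

def decode(r):
    """Population vector: sum preferred-direction unit vectors weighted by rates."""
    x = np.sum(r * np.cos(preferred))
    y = np.sum(r * np.sin(preferred))
    return np.arctan2(y, x)

true_theta = np.deg2rad(70.0)
r = rates(true_theta) + np.random.default_rng(3).normal(0, 2.0, 4)  # noisy rates
print(np.rad2deg(decode(r)))   # should come out near 70 degrees
```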
3.3 Reading Minds: Stimulus Reconstruction … should go back and rewatch

• One day – play back our dreams?
• Extend the model to handle stimuli varying continuously in time.
• We want to find the estimator s_bayes that gives us the best possible estimate
• Introduce an error function L(s, s_bayes)
• Least squares cost. So just L(s, s_bayes) = (s − s_bayes)²
• Solution: s_bayes = ∫ ds p(s|r) s
• Reading minds: fMRI
o Output predicted on BOLD signals (blood oxygen level dependent signals)
o It therefore has a delay
o ^ that's one way
o Another way is a motion energy filter
3.4 Fred Rieke on Visual Processing in the Retina

• A few rods out of 1000s are contributing signals
• All rods are generating noise
• Averaging would be a disaster
• We have access to the rods' signal and noise properties
• So we see evidence for a nonlinear threshold between the rods and the rod-bipolar cells
• Vision is working under conditions where the vast majority of rods are generating only noise
o Want to scale the distributions to take into account the prior probability

QUIZ 3
• Stimulus s. Can be one of two values, s1 or s2. Firing rate response r. Under stimulus s1 the response rate is roughly Gaussian ~N(5, 0.5²); under s2, ~N(7, 1²).
• It is twice as bad to mistakenly think that it is s2 rather than s1.
o So this is saying something about where we're thresholding.
• “The disease is very rare. The prior probability of being positive for the disease is therefore very low. MAP (maximum a posteriori) takes this into account; MLE does not. The mathematics differ in that MAP includes a term for the prior.”
o From my stats.stackexchange question I asked about this
WEEK 4 – Information Theory and Neural Coding

4.1 Information and Entropy

• Going to start by talking about entropy and information
• How to compute information for neural spike trains
• And what can this tell us about coding
• OK, so back to the monkey example:
o Information quantifies surprise
o Some overall probability p that there's a spike
o P(1) = p
o P(0) = 1 − p
o Information(1) = −log₂ p
o Information(0) = −log₂(1 − p)
• Why does the information have this form?
• Each bit of information specifies the location to within a factor of 2
• What we're really doing is multiplying the probabilities
• Entropy – average information of a random variable
o Measures variability
o Units are in bits
o Entropy counts the yes/no questions
o Entropy = −Σ_i p_i log₂ p_i
o Or in the continuous case: −∫ dx p(x) log₂ p(x)
• This is essentially just a binary search
• Worked example: H = −Σ p_i log₂ p_i with p_i = 1/8, so H = −Σ_{i=1}^{8} (1/8)·log₂(1/8) = −8 · (1/8) · (−3) = 3
• Three questions to find the car (in the example), and that's exactly the entropy
• Maximize the entropy (see the sketch below)
o Compute the entropy as a function of the probability p
o What does having a large entropy do for a code?
o Gives the most possibility for representing inputs
o You want to find the value of p such that H has a max
o If p == ½ then the two symbols are used equally often
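A quick sketch confirming that the entropy of a binary code peaks at p = 1/2:

```python
import numpy as np

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2 (1-p), in bits."""
    p = np.asarray(p, dtype=float)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

ps = np.linspace(0.01, 0.99, 99)
H = binary_entropy(ps)
print(ps[np.argmax(H)])      # ~0.5: both symbols used equally often
print(binary_entropy(0.5))   # 1 bit, the maximum for a binary variable
```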
• Entropy tells us about the intrinsic variability of the outputs
• Week 2 was asking how we know what our stimulus was
• But now, we need to incorporate our error chances
• Assume the same error rate for both responses
• How much of the entropy is accounted for by these errors?
• Total entropy: H[R] = −P(r_+) log P(r_+) − P(r_−) log P(r_−)
• Noise entropy: H[R|+] = −q log q − (1 − q) log(1 − q)
• These stimulus-driven entropies are called noise entropies
• The mutual information is the amount of entropy that is used in coding the stimulus
• MI(S, R) = total entropy − average noise entropy
• MI = −Σ_r p(r) log p(r) − Σ_s p(s)·[−Σ_r p(r|s) log p(r|s)]
• Entropy and information
o Fixing p
o Vary the noise probability
o When there is no error, the response is 1-to-1 with the stimulus. The information is just the entropy of the response.
o As the error rate increases, the error probability grows larger and larger.
o If p(r|s) = p(r), the mutual information MI of r and s is zero, because this is saying r and s are independent, and therefore no information is gained
o If the response is perfectly predicted, then the MI is 1 bit (in this binary example), because the total information is conveyed
• Mutual information measures the relationship
o The information quantifies how independent R and S are.
o Going to use the Kullback-Leibler divergence.
§ This is a measure of the difference between two probability distributions
§ Normally it is between a “true” distribution and a theoretical distribution
§ https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
o D_KL(P, Q) = ∫ dx P(x) log(P(x)/Q(x))
o Going to generalize so that the distributions are functions of s and r, so we need to integrate over both s and r
o MI = ∫ ds dr P(s, r) log(P(s, r)/(P(s)P(r))) = ∫ ds dr P(s, r) log(P(r|s)/P(r))
o = ∫ ds dr P(s, r)·[log P(r|s) − log P(r)]
o = −∫ ds dr P(s, r) log P(r) + ∫ ds dr P(s) P(r|s) log P(r|s)
o The first bit we can just integrate over s
o The second term is going to be minus the average entropy of P(r|s)
o This gives us exactly what is expected
o I(S, R) = H[R] − Σ_s P(s) H[R|s]
• Calculating mutual information (see the sketch below)
• Take one stimulus s and repeat many times → obtain P(R|s)
• Compute the variability due to noise: noise entropy → H[R|s]
• Repeat for all s and average → Σ_s P(s) H[R|s]
• Compute P(R) = Σ_s P(s) P(R|s) and the total entropy
4.2 Calculating Information in Spike Trains

• Two methods: single spikes vs lots of spikes (words)
• Mutual information = diff(total response entropy, mean noise entropy)
• Methodology:
o Divide up the voltage train into letters of size ∆t and words of length T
o Essentially then just have a 1 if we have a spike, 0 if not
o From this, compute p(w_i)
o H(w) = −Σ_i p(w_i) log p(w_i)
o How to sample P(S) → average over time
o For each time, we're given a set of words P(w|s(t))
o Then we have an average entropy
o Choose the length of the repeated sample long enough so that we sample the noise adequately
• Information in single spikes is similar to what we just saw in the previous lecture
• After a bit of math and some assumptions, the information per spike is:
• I(r, s) = (1/T) ∫₀ᵀ dt (r(t)/r̄) log₂(r(t)/r̄)
• No explicit stimulus dependence (NO NEED FOR A CODING/DECODING MODEL)
• The rate r does not have to mean the rate of spikes → can be the rate of any event
• Limitations of information:
o Spike precision; blurs r(t)
o Mean spike rate
4.3 Coding Principles

• Natural stimuli
o Huge dynamic range
o Power law scaling
• Efficient coding:
o In order to have a maximum entropy output, a good encoder should match its outputs to the distribution of its inputs
o Should be able to stretch its input axis (IN REAL TIME) so that it can accommodate the variations in the overall scaling
• Feature adaptation
o The power spectrum and the signal-to-noise ratio are large factors for the predicted receptive field at certain light levels.
o The center becomes broader at low light levels
o Choose the filter to maximize the Kullback-Leibler divergence between the spike-conditional and prior distributions
• Redundancy reduction
o Neural systems should be trying to encode as efficiently as possible
o Maximizing the entropy should take into account the joint entropy, not just the marginals added together
o Correlations can be good → error correction + correlations help discrimination
• Neuron populations should be as SPARSE as possible
o Let's say we write down a set of basis functions, φ_i
o Any image can be expressed as a weighted sum
o I(x) = Σ_i a_i φ_i(x) + ε(x)
o Want to penalize having too many active coefficients per image
o A Fourier basis represents things in sines and cosines… but it is not necessarily sparse, because the power spectrum is broad
o Sparse code – each image excites a minimum number of basis functions
• Classic and State-of-the-Art Methods:
o Models for how stimuli are coded in spikes
o Models for decoding the stimulus from neural responses
o Information theory
o A very quick glance at how coding strategies might shape other things
WEEK 5 – Computing in Carbon

5.1 – Modeling Neurons

• About to delve into circuit diagrams
• Differential equations (largely first order)
• Hodgkin-Huxley model
o Should be a review from biomedical signals
• Basic review of circuit diagrams
• Membrane patch
o We have a lipid bilayer
§ Acts like a capacitor
o Pores
o Channels
• Cell battery
o Outside the cell: higher sodium, chloride and calcium concentrations
o Inside: higher potassium levels
o Concentration gradient = battery
§ Nernst equation: E = (kT/zq) · ln([outside]/[inside])
• Currents flow through ion channels

5.2 – Spikes

• What makes a neuron compute?
• Neuron responds to steps and thresholds
o Uncover the non-linearity
• A gate has subunits that all need to be open for things to go through
• Gating depends on subunit state
o P_K = n^4
o n is the open probability
o 1 − n is closed
• Review of biomedical signals
• Independent probability of each subunit being open
• Hodgkin and Huxley's Nobel-winning equations
o Specifies conductances for the different channels
o The time constant dictates how rapidly each gating variable responds to a voltage change
• Hodgkin-Huxley Model
o Goes two different directions:
o Biophysical realm → ion channel physics, additional channels
o Simplified models → fundamental dynamics, analytical tractability
5.3 – Simplified Model Neurons

• Can one build a large model with lots of neurons?
• Capturing the basics:
o Force the model to be linear
o dV/dt = f(V) + I(t)   # nonlinear because of f(V)
o dV/dt = −a(V − V₀) + I(t)
o Like a passive membrane
o C_m dV/dt = −g_L(V − E_L) + I(t)
o ^ Integrate-and-fire model: spike and reset when V hits threshold (see the sketch at the end of this section)
• Exponential integrate-and-fire neuron
• The theta neuron
o Great for periodic spiking
o One dimensional
o dθ/dt = (1 − cos θ) + (1 + cos θ)·I(t)
• Two dimensional models
o Need a phase plane diagram
o Can find the nullclines – the curves where each derivative is equal to zero
o A fixed point is going to be an intersection between the two nullclines
• Various neurons have different firing rates and oscillations
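A minimal leaky integrate-and-fire simulation of the model above; the parameter values are typical textbook numbers, not from the course:

```python
import numpy as np

dt = 0.1e-3                      # 0.1 ms time step
T = 0.5                          # 0.5 s of simulation
C_m, g_L = 1e-9, 50e-9           # 1 nF capacitance, 50 nS leak conductance
E_L, V_th, V_reset = -70e-3, -54e-3, -70e-3   # volts
I_ext = 1.0e-9                   # constant injected current (A), assumed

V = E_L
spike_times = []
for step in range(int(T / dt)):
    dV = (-g_L * (V - E_L) + I_ext) / C_m    # C_m dV/dt = -g_L (V - E_L) + I
    V += dV * dt
    if V >= V_th:                # threshold crossing: record spike, reset
        spike_times.append(step * dt)
        V = V_reset

print(f"{len(spike_times)} spikes in {T} s")
```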
5.4 – A Forest of Dendrites (should review)

• Real neurons are brutal to model
• Inject current at the cell body and record the effect in the dendrites
• So we're looking at the soma to see the response to an input at some location
• Inputs that come in at different parts of the dendrite can have very different effects
• Theoretical basis for dendrite communication
o PDEs!
o Linear cables
o Voltage V is a function of both x and t
o Essentially a bunch of circuits distributed along a cable
o Now there is a spatial derivative that has to be taken into account
o Essentially the diffusion equation, but with an additional V_m/r_m term
o Time constant: τ_m = r_m·c_m
o Space constant: λ = sqrt(r_m/r_L)
o r_m is the membrane resistance
• The solution decays rapidly as a function of space
o The geometry can be extremely complicated → cable equation
o Ion channels
o Solution: divide and conquer (compartmental models)
o Each compartment = one dV/dt equation
o If branches obey a certain branching ratio, one can replace each pair of branches with a single cable segment of equivalent surface area and electrotonic length
• Ion channels introduce the nonlinearity
• Dendrites can add a lot to neuronal computation
o Logical operations
o Low-pass filtering, attenuation
o Coincidence detection
o Segregation, amplification
• Example:
o Delay lines in sound localization
Eric Shea-Brown on Neural Correlations and Synchrony

• This guy seems good
• Encoding via spikes
• Eye → optic nerve → lateral geniculate nucleus (LGN) → visual cortex
• Tuning curve – firing rate as a function of the angle of some stimulus
• Got a bunch of neurons, have a tuning curve, also variance around that mean
• Two statistics
• Still can quantify similar statistics
• Pairwise correlation → departure from independence
o Label the spike counts
o Pearson correlation coefficient
o Or just correlation coefficient
o Then you ask if that number is 0 or non-zero
o Correlation can degrade signal encoding
• Turns out that you can apply this technique to numerous neurons
o Compute the signal-to-noise ratio
§ Mean / variance
o SNR → going to grow with M (number of neurons)
o Then also observe the correlation coefficient
• Pairwise couplings of the entire population

WEEK 6 – Computing with Networks

6.1 Modeling Connections between Neurons
• Linear filter model of a synapse
• See online notes for this lecture
• Just listened to the audio

6.2 Introduction to Network Models

• Learned that neurons use synapses to connect
• Learned how to model with differential equations
• FEEDFORWARD VS RECURRENT
• Modelling networks
o Spiking neurons
§ Pro: learning based on spike timing
§ Pro: spike correlations
§ Con: computationally expensive
o Firing-rate outputs (real-valued outputs)
§ Greater efficiency, scales well to large networks
§ Ignores spike timing
o How are they related?
• Synapse b
• Input spike train ρ_b(t)
• ρ_b(t) = Σ_i δ(t − t_i)
• g_b(t) = g_{b,max} Σ_{t_i<t} K(t − t_i) = g_{b,max} ∫_{−∞}^{t} K(t − τ) ρ_b(τ) dτ
• From single synapse to multiple synapses:
o Each synapse has a synaptic weight
o Assume no nonlinear interactions
o Then the total synaptic current is
o I_s(t) = Σ_{b=1}^{N} w_b ∫_{−∞}^{t} K(t − τ) ρ_b(τ) dτ
o We go from spike train to firing rate (replace ρ_b with the rate u_b)
o This would fail if there were correlations or synchrony
• Suppose the synaptic filter K is exponential
• Firing-rate-based network model
• Output firing rate changes like: τ_r dv/dt = −v + F(I_s(t))
• Input current changes like: τ_s dI_s/dt = −I_s + w·u
• Weight matrix w
• To get the steady state, we need to set both of these equal to zero
• Static input: v_ss = F(w·u)
• THE RICH DYNAMICS THAT ARE ACTUALLY IN THE SYNAPTIC CURRENT ARE REPLACED WITH A SIGMOIDAL FUNCTION FOR ARTIFICIAL NEURAL NETWORKS
• THAT'S ONE OF THE BIG DISTINCTIONS
• HENCE ARTIFICIAL
• BIG ASSUMPTION THAT THE SYNAPSES ARE RELATIVELY FAST
• Multiple output neurons
• Then we have an input vector and an output vector
• v is now a vector. W becomes our weight matrix.
• This has all been FEEDFORWARD NETWORKS
• τ dv/dt = −v + F(Wu + Mv)
• For feedforward networks, M is a matrix of zeros
• There's no pass-back without recurrent connections!!
• Linear Feedforward Network
o Steady state: v_ss = Wu
• Edge detectors in the brain
o Primary visual cortex (V1)
o Receptive fields in V1 do edge detection
6.3 The Fascinating World of Recurrent Networks

• Want to find out how the output v(t) behaves for different M
• Eigenvectors to the rescue!
• τ dv/dt = −v + h + Mv
• Idea: use the eigenvectors of M to solve the differential equation for v
• Suppose the N×N matrix M is symmetric
• If M is symmetric, M has N orthogonal eigenvectors e_i and N eigenvalues λ_i
• It is useful for them to be orthonormal, because then we can write our output vector using the eigenvectors
o v(t) = Σ_i c_i(t) e_i
o Complete expression (see the sketch after this list): c_i(t) = (h·e_i)/(1 − λ_i) · (1 − exp(−t(1 − λ_i)/τ)) + c_i(0)·exp(−t(1 − λ_i)/τ)
o If any of the λ_i is greater than 1 → the network explodes
o If all of them are less than 1, the network is stable and v(t) converges to some steady state value
• Network performs winner-takes-all input selection
• Gain modulation in the nonlinear network
o Adding a constant amount to the input h multiplies the output
• Memory in the nonlinear network
o The network maintains some short-term memory
• Nonsymmetric recurrent networks
o Network of excitatory and inhibitory neurons
• Linear stability analysis
o Stability matrix
o THIS IS JUST THE JACOBIAN MATRIX
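A small sketch of the eigenvector solution for a linear recurrent network, cross-checked by integrating the ODE directly; the matrix and input are illustrative:

```python
import numpy as np

tau = 10e-3
M = np.array([[0.0, 0.8],
              [0.8, 0.0]])          # symmetric recurrent weights
h = np.array([1.0, 0.2])            # steady feedforward input

eigvals, eigvecs = np.linalg.eigh(M)
assert np.all(eigvals < 1), "lambda_i >= 1 would make the network explode"

# Steady state along each eigenvector: c_i = (h . e_i) / (1 - lambda_i)
c = (eigvecs.T @ h) / (1 - eigvals)
v_ss = eigvecs @ c

# Cross-check by integrating tau dv/dt = -v + h + M v
v, dt = np.zeros(2), 1e-4
for _ in range(20000):               # 2 s, plenty of time to converge
    v += dt / tau * (-v + h + M @ v)
print(v_ss, v)                       # the two should agree
```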
NOTE: could not figure out one on the quiz. Posted on Stack Overflow:
http://stackoverflow.com/questions/41492020/finding-the-steady-state-output-of-a-linear-recurrent-network

WEEK 7 – Networks that Learn: Plasticity in the Brain & Learning

7.1 Synaptic Plasticity, Hebb's Rule, and Statistical Learning
• Long term potentiation (LTP) – experimentally observed increase in synaptic strength that lasts for hours or days
• Long term depression (LTD) – experimentally observed decrease in synaptic strength that lasts for hours or days
• Hebb's Learning Rule
o If neuron A takes part in firing neuron B, then the synapse from A to B is strengthened
o Formulation as a mathematical model
§ Let's start with a linear feedforward model
§ We have a synaptic weight vector w
§ Basic Hebb rule
§ τ_w dw/dt = u·v
§ Discretization: w_{i+1} = w_i + ε·u·v
§ The Hebb rule only increases synaptic weights (LTP)
o Learning rules like this are NOT stable
o w grows without bound
o The covariance rule can both increase and decrease weights
• Start with the averaged Hebb rule: τ_w dw/dt = Q·w, where Q is the input correlation matrix
• Solve this equation to find w(t) using eigenvectors
• Substitute into the Hebb rule differential equation and simplify as before
• The synaptic weight vector is a linear combination of the eigenvectors
• It has terms that depend exponentially on the eigenvalues of the correlation matrix
• For large t, the largest-eigenvalue term dominates
• For Oja's rule: w(t) → e₁/√α
• Thus we have shown the brain can do statistics
• Hebbian learning implements principal component analysis (PCA)
• Hebbian learning learns a weight vector aligned with the principal eigenvector of the input correlation/covariance matrix (see the sketch below)
o DIRECTION OF MAXIMUM VARIANCE
7.2 Introduction to Unsupervised Learning

• Can neurons learn to represent clusters?
• Feedforward network with two neurons
• Most active neuron in the network
o The one whose weight vector is closest to the input
o We can show that by looking at the Euclidean distance between the vectors
o Given a new input, we can set the weight vector to the running average of all inputs IN THAT CLUSTER
o Then you pick the most active neuron
• Competitive learning and self-organizing maps
o Also known as Kohonen maps
o Given an input, pick the winning neuron
o Update weights for that neuron AND the other neurons in the neighborhood of the winning neuron
§ What do we mean by neighborhood?
§ We have locations assigned, and the neighboring ones are literally neighbors on a 2D grid
• Unsupervised learning
o We have causes v
o Data points u
o You kind of assume that there are multiple Gaussians given by some prior
o Mixture-of-Gaussians model
o Goal: learn a good generative model for the data you are seeing
§ Mimic the data generation process
o General approach:
§ Given data u, need to
• Estimate causes v
• Learn parameters G
• Algorithm for learning the parameters
• Expectation-Maximization algorithm (see the sketch below):
o Iterating through the expectation step
o Then the maximization step
o E step – compute the posterior distribution of v for each u
§ p(v|u; G) = p(u|v; G)·p(v; G) / p(u; G)
§ soft competition
o M step – change parameters G using the results from E
§ Just updating the mean, variance, and the prior
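A compact EM sketch for a 1D mixture of two Gaussians, matching the E/M steps above; the data and initialization are synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)
u = np.concatenate([rng.normal(-2, 0.7, 300), rng.normal(3, 1.2, 200)])

# Parameters G: means, variances, and mixing priors for the two causes
mu, var, prior = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])

def gauss(x, m, v):
    return np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

for _ in range(50):
    # E step: posterior p(v|u; G) for each data point ("soft competition")
    joint = prior * gauss(u[:, None], mu, var)          # shape (n, 2)
    resp = joint / joint.sum(axis=1, keepdims=True)
    # M step: re-estimate mean, variance, and prior from the responsibilities
    Nk = resp.sum(axis=0)
    mu = (resp * u[:, None]).sum(axis=0) / Nk
    var = (resp * (u[:, None] - mu) ** 2).sum(axis=0) / Nk
    prior = Nk / len(u)

print(mu, var, prior)   # should recover roughly means (-2, 3), vars (0.5, 1.4), priors (0.6, 0.4)
```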
• Hebb’slearningruleimplementsprincipalcomponentanalysis• Howdowelearnmodelsofnaturalimages?
o Eigenvectorsorprincipalcompenentso TurkandPentlando Eigenvectorsoftheinputcovariancematrixo Anyfaceimagecanjustbealinearsummationoftheeigenfaceso It’sabasis
o This could be great for compression! But it's not great for extracting the local components
§ Edges in a scene
§ Can't get those from an eigenvector analysis
• Define the generative model: likelihood
o Linear model
o u = G·v + noise
o You're generating the likelihood based on a probabilistic model
o A lot of machine learning algorithms and things in engineering want to maximize the log of the likelihood
o log p(u|v; G) = −(1/2)·|u − Gv|² + c
o If you MINIMIZE the squared reconstruction error, you are MAXIMIZING the likelihood of the data
o Prior
§ Can make some assumptions
§ Assume the causes v_i are independent
§ For any input, we want only a few causes v_i to be active
§ SPARSE DISTRIBUTION
• Also called a super-Gaussian distribution
• Very sharp peak, heavy tails
• Formed by taking the exponential of a sparseness function g
• p(v) = c · Π_i exp(g(v_i))
o Bayesian approach to finding v and learning G
§ Going to maximize the posterior probability of the causes
§ Equivalently, maximize the log posterior
§ F(v, G) = −(1/2)·|u − Gv|² + Σ_i g(v_i) + K
• maximize F with respect to v, keeping G fixed
• maximize F with respect to G, keeping v from above
• This is similar to the EM algorithm
• Normally, we just use gradient ascent
• dF/dv = G^T(u − Gv) + g′(v)
• firing rate dynamics:
• τ dv/dt = G^T(u − Gv) + g′(v)
• The first term is the prediction error; Gv is the prediction
• It converges to a stable value
o Learning the synaptic weights G
§ τ_G dG/dt = (u − Gv)·v^T
§ This is the Hebbian term
§ This is almost identical to Oja's rule for learning
§ Whyisn’tthisnetworkjustdoingprincipalcomponentanalysislikeOja’srule?
§ Answer:Networkistryingtocomputeasparserepresentationoftheimage
§ LearningGforNaturalImages• Thebasisvectorsareabunchofbars• Likeonehotvectors• Theg_ilooklikelocaledgeorbarfeaturesSIMILARTO
RECEPTIVEFIELDSINPRIMARYVISUALCORTEX
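A rough sketch of the sparse coding loop (inference dynamics plus Hebbian-like learning of G) under the assumptions above; g(v) = −λ|v| is one common sparseness choice, and the sizes, step sizes, and column normalization are my own additions to keep it stable:

```python
import numpy as np

rng = np.random.default_rng(6)
n_pixels, n_causes, lam = 64, 32, 0.2
G = rng.standard_normal((n_pixels, n_causes)) * 0.1

def infer(u, G, steps=200, dt=0.05):
    """Run the firing-rate dynamics tau dv/dt = G^T (u - G v) + g'(v)."""
    v = np.zeros(n_causes)
    for _ in range(steps):
        v += dt * (G.T @ (u - G @ v) - lam * np.sign(v))
    return v

eta = 0.01
for _ in range(500):                    # training loop over random "images"
    u = rng.standard_normal(n_pixels)   # stand-in for a natural image patch
    v = infer(u, G)
    G += eta * np.outer(u - G @ v, v)   # Hebbian-like update: (u - Gv) v^T
    G /= np.maximum(np.linalg.norm(G, axis=0), 1e-8)  # renormalize basis columns

# On natural image patches, the columns of G develop localized, oriented
# edge/bar structure; on random noise like this they stay unstructured.
```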
WEEK 8 – Learning from Supervision and Rewards

8.1 Neurons as Classifiers and Supervised Learning

• The classification problem
• Example: classifying images as faces
o What if we just group them as +1 and −1? Could we draw a line to separate those groups?
• Recall: the idealized neuron
o It's essentially thresholding
o Inputs: u_i; synaptic weights: w_i; and if Σ_i w_i·u_i > μ then we have an output spike
• This is called the “perceptron”
o We have inputs that are either +1 or −1
o We can build the equation Σ_i w_i·u_i − μ = 0, which is a hyperplane formula
o Perceptrons can classify
• So the question becomes: how do we learn the weights and the threshold?
• Perceptron learning rule (see the sketch below):
o Adjust w_i and μ according to the output error (v^d − v):
o Δw_i = ε(v^d − v)·u_i: for positive input, increases the weight if the error is positive, decreases the weight if the error is negative
o Δμ = −ε(v^d − v): decreases the threshold if the error is positive and increases it if the error is negative
• Great! So can perceptrons learn any function?
• Let's think about XOR:
o Can't really do it
o There's no line we can draw
o Perceptrons can only classify linearly separable data
• However, we can use multilayer perceptrons
• What about continuous outputs?
o Sigmoid functions!
o Output: v = g(w^T u) = g(Σ_i w_i·u_i)
o Sigmoid output function: g(a) = 1 / (1 + e^(−βa))
o Takes a range of −∞ to ∞ and compresses everything to (0, 1)
o β controls the slope
• Learning multilayer sigmoid networks
o You learn weights that minimize the output error
o E(W, w) = (1/2) Σ_i (d_i − v_i)²
o Use gradient descent!!
o How do we change the weights for the hidden layer?
o Backpropagation learning rule
o Δw_{jk} = −ε ∂E/∂w_{jk}
o The answer essentially lies in the chain rule from calculus
o Example:
o ∂E/∂w_{jk} = (∂E/∂x_j) · (∂x_j/∂w_{jk})
o The error propagates down through the neural network
o We should see the entire hidden layer affected (see the sketch below)
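A minimal backprop sketch for a two-layer sigmoid network learning XOR, the case a single perceptron can't handle; the architecture and hyperparameters are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(8)
g = lambda a: 1.0 / (1.0 + np.exp(-a))            # sigmoid with beta = 1

U = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)  # bias column
d = np.array([0.0, 1.0, 1.0, 0.0])                # XOR targets

W1 = rng.standard_normal((3, 4))                  # (inputs + bias) -> 4 hidden units
W2 = rng.standard_normal(4)                       # hidden -> output
b2 = 0.0

for _ in range(10000):
    h = g(U @ W1)                                 # hidden layer activations
    v = g(h @ W2 + b2)                            # network output
    # Backpropagation: chain rule, dE/dw_jk = dE/dx_j * dx_j/dw_jk
    delta_out = (v - d) * v * (1 - v)             # from E = 1/2 sum (d - v)^2
    delta_hid = np.outer(delta_out, W2) * h * (1 - h)
    W2 -= 0.5 * h.T @ delta_out
    b2 -= 0.5 * delta_out.sum()
    W1 -= 0.5 * U.T @ delta_hid

print(np.round(v, 2))                             # should approach [0, 1, 1, 0]
```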
8.2 Reinforcement Learning: Predicting Rewards

• We learn by trial and error. Rewards are part of this.
• We have some state, some reward, and some action
• We need to pick the action that will maximize our future reward
• Pavlov and his dog
o Classic conditioning experiments
o Training: bell → food
o After: bell → salivate
o But how do we predict rewards delivered some time after the stimulus?
• Want to have some neuron that predicts the expected total future reward
• Key idea: utilize dynamic programming
o We don't know our future rewards, so we need to approximate
o Learn the weights according to which v(t) is calculated
o Temporal difference (TD) learning rule (see the sketch below)
§ Δw(τ) = ε·[r(t) + v(t + 1) − v(t)]·u(t − τ)
§ We have a temporal difference because we have our future prediction and our current prediction
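A small sketch of the TD rule above on a toy trial (stimulus at one time, reward later); the task setup is invented:

```python
import numpy as np

n_steps, eps, n_trials = 20, 0.2, 500
stim_t, reward_t = 5, 15

w = np.zeros(n_steps)                          # one weight per delay tau
for _ in range(n_trials):
    u = np.zeros(n_steps); u[stim_t] = 1.0     # stimulus trace
    r = np.zeros(n_steps); r[reward_t] = 1.0   # delayed reward
    v = lambda t: w[: t + 1] @ u[t::-1]        # v(t) = sum_tau w(tau) u(t - tau)
    for t in range(n_steps - 1):
        delta = r[t] + v(t + 1) - v(t)         # TD error
        w[: t + 1] += eps * delta * u[t::-1]   # credit the preceding stimulus

v_final = [w[: t + 1] @ u[t::-1] for t in range(n_steps)]
print(np.round(v_final, 2))  # prediction rises at the stimulus (t=5), drops after the reward (t=15)
```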
8.3 Reinforcement Learning: Time for Action

• How does the brain use reward information to actually select the actions?
• Learn a state-to-action mapping, or a policy
o π(u) = a
o Should maximize the expected total future reward
o ⟨ Σ_{τ=0}^{T−t} r(t + τ) ⟩
• However, that's using a random policy
• Values should act as surrogate immediate rewards → a locally optimal choice leads to a globally optimal policy
• Markov Environment
o The next state only depends on the current state and the current action
o This is closely related to dynamic programming
• Putting it all together: Actor-Critic Learning
o Two separate components:
§ Actor (selects action and maintains the policy)
§ Critic (maintains the value of each state)
o 1. Critic Learning (Policy Evaluation)
§ Value of state u: v(u) = w(u)
§ w(u) ← w(u) + ε·[r(u) + v(u′) − v(u)]
o 2. Actor Learning (Policy Improvement)
§ P(a; u) = exp(β·Q_a(u)) / Σ_{a′} exp(β·Q_{a′}(u))
§ Probabilistically select an action a at state u
§ This is like a softmax function
§ It lets us explore all possibilities
o Then we repeat steps 1 and 2

End of class