basics of bayesian formalism - caltech astronomygeorge/aybi199/mahabal_bayes.pdf · 2011-05-12 ·...


Basics of Bayesian Formalism

Ashish Mahabal (PQ, CSS, JPL collabs)

Ay/Bi 199 Caltech, 12 May 2011

Advantages of Bayesian Networks

•  Handling of incomplete data – Real-world cases

•  Learning causal connections – What variable caused what

•  Incorporating domain knowledge – Experts can weigh in at different points

•  Memorizing (aka overfitting) avoided – No holdout necessary

•  Tue May 17: Moghaddam, Bayesian Methods

Nonparametric Bayes and Gaussian Processes

Ambitious Outline

•  Basic astronomy classification trivia
•  Time and place for Bayesian techniques
•  Basic concepts related to belief
•  Logic and probability theory
•  The inversion formula
•  Application to astronomy

Astronomical Classification and the time domain

•  Moving objects (asteroids, TNOs, KBOs)
•  SNe (cosmological standard candles, endpoints of stellar evolution)
•  GRB orphan afterglows (constraining beaming models)
•  Variable stars (stellar astrophysics, galactic structure)
•  AGN (QSOs, fuelling mechanisms, lifetimes)
•  Blazars, Cosmic Rays, …

Rapid follow-up is key to understanding

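Towards Automated Event Classification

[Diagram: inputs feed an Event Classification Engine, which outputs classification probabilities]

Inputs:
•  Event parameters: m1(t), m2(t), …; α, δ, µ, …; image shape, …
•  Colors, light curves, etc.
•  Expert and ML generated priors
•  Contextual information

Output: P(SN Ia) = …, P(SN II) = …, P(AGN) = …, P(CV) = …, P(dM) = …, ….

A necessity for large synoptic surveys

Classification probabilities (evolving, iterated)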

Basic astronomy classification trivia: Colors

•  Magnitude as basic observation (flux)
•  Color as flux ratio
•  Color-color diagram as a diagnostic
•  Ambiguity

Basic astronomy classification trivia

•  Attach probabilities through priors to various classes and determine what class a newly looked at object belongs to
•  Bayesian techniques allow us to do this in a rational manner even when some of the data is uncertain or missing


Bayesian techniques

•  Bayesian methods provide a formalism for reasoning about partial beliefs under conditions of uncertainty

Belief is going to be a crucial word

A: World will end in 2012
p(A|K): belief about A given a body of knowledge K. Often written simply as p(A).

•  p(NOT A): belief that A will not happen
•  When K changes, p(A) and p(NOT A) change accordingly

In general:
•  0 ≤ p(A) ≤ 1
•  p(sure proposition) = 1
•  p(A or B) = p(A) + p(B) when A and B are mutually exclusive
•  p(A) + p(NOT A) = 1
•  p(A) = p(A,B) + p(A,NOT B)
   – p(A,B) == p(A and B)

In general:
   – For Bi = B1, B2, B3, …, Bn mutually exclusive and exhaustive, p(A) = Σ p(A,Bi) = p(A,B1) + p(A,B2) + .. + p(A,Bn)

We will revisit these later. Catchword: Belief

An example

•  A: outcome of 2 dice is equal
•  B1: 2nd die is 1
•  B2..B6: 2nd die is 2..6
•  Bi = B1..B6 form a partition (are exhaustive)
•  p(A) = Σ p(A,Bi) = p(A,B1) + .. + p(A,B6) = 1/36 + .. + 1/36 = 1/6
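The dice example can be checked by brute-force enumeration. The following is a minimal Python sketch, assuming two fair six-sided dice; the helper name p_joint is hypothetical and used only for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (die 1, die 2) of two fair dice (assumption).
outcomes = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, len(outcomes))

def p_joint(i):
    """p(A, B_i): both dice show the same face AND the 2nd die shows i."""
    return sum(p_outcome for d1, d2 in outcomes if d1 == d2 and d2 == i)

# Marginalize over the partition B_1..B_6: p(A) = sum_i p(A, B_i).
joint_terms = [p_joint(i) for i in range(1, 7)]
p_A = sum(joint_terms)

print(joint_terms[0], p_A)  # 1/36 1/6
```

Each joint term is 1/36 and the six terms add up to 1/6, matching the sum on the slide.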

About belief

•  Rules have exceptions
•  Ignoring exceptions leads to uncertainty
•  If we consider all exceptions, we may not be able to proceed
•  Middle way is to summarize exceptions

That's where Bayes formalism leads us

An example

Suppose I have a bird. What can be said about its ability to fly?

Birds fly. So the bird under question should be able to fly.

•  What if it's a penguin?
•  Dead?
•  With wings cut?
•  Made of paper?

Summary: Most birds can fly. Need to make it quantitative

Logical approach?

•  Assign numerical values to uncertainty and combine them like truth values
•  But sources of uncertainty are not independent and it is not easy to evaluate the effect of additional evidence, e.g. in the last example, if we are told that the said bird has existed for 1 year, how do we take that into consideration?

What is p(B|A)?

•  Not "Given A, the probability that B is true". That is true only if B does not (also) depend on anything else. Because if we knew other things, maybe the probability will be different.

If of 10000 species 10 can't fly, then: p(can't fly|bird) = 1/1000 (blanket statement). That says nothing about the case where we know something else (here, e.g. that the bird has existed for 1 year).

Verification of irrelevancy is crucial

•  Rule: What goes up must come down
   – A: foo comes down
   – B: foo goes up
   – P(A|B) = 1

Is that really true?

What if upward velocity > escape velocity? Where did escape velocity come from? Escape velocity was always there

Intensional vs. extensional

•  These rule-based (extensional) systems are computationally convenient, but semantically inconvenient.
•  The opposite is true of Bayes formalism: it's declarative and model based

[Figure: A and B, with labels 1-m, n, m]

P(B|A) = m => In all worlds that satisfy A, those also satisfying B are a fraction m

Partitioned Conditionals and Marginals

A bit more about logical systems

•  A: The grass is wet
•  B: It rained

A gives credibility to B. In a rule-based system, that weight increases irrevocably.

If, later, C: The sprinkler was on,

and D: The neighbor's grass is dry,

it is difficult to connect A and C

[Figure: network with nodes B (It rained), C (Sprinklers on), E (Neighbor's sprinklers off), A (Wet grass), D (Neighbor's grass dry)]

If anything, C becomes less credible once we know A to be true

Chaining?

•  Logic: If A then B and if B then C => if A then C
•  In plausible reasoning, it can lead to problems:
   – If the ground is wet, then it rained
   – If the sprinklers are on, the grass is wet

Does that mean: if sprinklers are on, it rained?

If anything, it takes away support for such a suggestion.

A system of rules produces coherent updates if and only if rules form a directed tree, i.e. no two rules may stem from the same premise. Wet grass here points to two possible explanations.

[Figure: nodes It rained, Grass is wet, Sprinklers on]

Why networks?

To make p(B|A) meaningful, we have to show:
•  Other items in the knowledge base are irrelevant to B
•  Better still, make the unignorable items quickly identifiable and accessible

Neighboring nodes in a graph allow that. What is not local does not matter!

Networks are also used in AI etc. but BN have clear semantics. Most features can be derived from the knowledge base.

These play a central role in uncertainty formalisms:
•  BN
•  causal nets
•  influence diagrams
•  Constraint networks

More important terms

•  Likelihood: in blackjack the likelihood of getting 10 is higher (because it can be 10, J, Q or K)
•  Conditioning: P(A|C) = p(A,C)/p(C)

[Figure: P(A|C) annotated: A = belief, | = conditioning bar, C = knowledge/context]

P(fly(a)|bird(a)) = HIGH
P(fly(a)|bird(a), sick(a)) = LOW (non-zero!)

(retraction possible)

P(fly|bird) = α*p(fly|bird, sick) + (1-α)*p(fly|bird, NOT sick), where α = p(sick|bird), 0 < α < 1

•  Relevance: potential change in belief due to change in knowledge
•  Causation:

[Figure: causal network with nodes rain, sprinkler, wet pavement, banana peel, fall due to slipping]

Once falling is observed it's irrelevant why the pavement was wet

Case study: college plans (Heckerman, 1995, MSR-TR-95-06)

•  Sex (SEX: M, F)
•  Socioeconomic Status (SES: L, M, U, H)
•  IQ (IQ: L, M, U, H)
•  Parental encouragement (PE: L, H)
•  College plans (CP: Y, N)

Hidden variable

Using the terms just explained we can use probability for describing qualitative phenomena.

One can see if refinements, extensions are possible. If it is based on theory, we can understand exactly what adjustments need to be made.

The inversion formula

p(H|e) = p(e|H)*p(H)/p(e)
Posterior = likelihood * prior / normalizing constant

p(e) = p(e|H)*p(H) + p(e|NOT H)*p(NOT H)

The formula comes from:

p(A|B) = p(A,B)/p(B) and p(B|A) = p(A,B)/p(A)

Assessing p(H|e)

In a gambling room someone calls 12. Is it from a pair of dice, or from a roulette wheel?

•  For dice: p(e|H) = p(12|dice) = 1/36
•  For roulette: p(e|H) = p(12|roulette) = 1/38

Thus if there are more than 38/36 roulette wheels per pair of dice in the room, roulette is the more likely source

Assessing p(roulette|12) directly would have been much more difficult.
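A small Python sketch of the inversion formula applied to this example. The two likelihoods are the ones on the slide; taking the priors proportional to the number of roulette wheels and pairs of dice in the room is an assumption, and the function name posterior_roulette is hypothetical.

```python
from fractions import Fraction

def posterior_roulette(n_roulette, n_dice):
    """p(roulette | 12) via Bayes' rule.

    Assumption: the prior for each source is proportional to how many
    roulette wheels / pairs of dice are in the room.
    """
    prior_r = Fraction(n_roulette, n_roulette + n_dice)
    prior_d = 1 - prior_r
    like_r = Fraction(1, 38)  # p(12 | roulette), from the slide
    like_d = Fraction(1, 36)  # p(12 | dice), from the slide
    evidence = like_r * prior_r + like_d * prior_d  # p(12)
    return like_r * prior_r / evidence

print(posterior_roulette(1, 1))    # 18/37, dice slightly favored
print(posterior_roulette(38, 36))  # 1/2, the break-even ratio
```

With equal numbers of each, dice remain slightly more probable; the posterior crosses 1/2 exactly when the ratio of wheels to pairs of dice reaches 38/36.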

3 prisoner problem

1 prisoner. 3 doors. 2 lead to death, 1 to escape. S/He is asked to choose one. Once he has indicated his choice, one of the other doors is indicated to be leading to death. He is given a chance to switch to the third door. Should he?

What if there are 1000 doors (999 leading to death)? Should he switch?
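The door question can be answered by exact enumeration. This Python sketch rests on an assumption the slide does not spell out: the guard knows where the escape door is and reveals all but one of the unchosen doors as death doors (1 door for 3 doors, 998 doors for 1000). The function name win_probabilities is hypothetical.

```python
from fractions import Fraction

def win_probabilities(n_doors):
    """Exact p(escape) when staying vs. switching.

    Assumption: the guard knowingly reveals n_doors - 2 death doors among
    the doors the prisoner did not pick, leaving one other door closed.
    """
    stay = switch = cases = 0
    for escape in range(n_doors):    # where the escape door really is
        for pick in range(n_doors):  # the prisoner's first choice
            cases += 1
            if pick == escape:
                stay += 1            # first pick was right: staying wins
            else:
                switch += 1          # the only other closed door is the escape
    return Fraction(stay, cases), Fraction(switch, cases)

print(win_probabilities(3))     # (Fraction(1, 3), Fraction(2, 3))
print(win_probabilities(1000))  # (Fraction(1, 1000), Fraction(999, 1000))
```

Under this assumption switching wins whenever the first pick was wrong, which is why the advantage becomes overwhelming with 1000 doors.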

The Bayesian twist

Of three prisoners A, B, C only one is going to be hanged and the other two pardoned.

A says to guard: Give this letter to one of the pardoned ones.

An hour later, A asks the guard: tell me who did you give it to?

The guard answers "B". A reasons: So either C will be hanged, or I will be. Prob. that I will be is 50%. Is he right?

With G_A = A will be hanged and I_B = B will be released:
p(G_A|I_B) = p(I_B|G_A)*p(G_A)/p(I_B) = 1*(1/3)/(2/3) = 1/2

So, applying Bayesian logic incorrectly leads to false results. The error here is misinterpreting the context.

Rather than I_B = B will be released, use I'_B = Guard said B will be released:
p(G_A|I'_B) = p(I'_B|G_A)*p(G_A)/p(I'_B) = (1/2)*(1/3)/(1/2) = 1/3
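Both calculations can be reproduced from one explicit joint distribution over (who is hanged, whom the guard names). This is a Python sketch under two assumptions: each prisoner is equally likely to be hanged, and when A is the one to be hanged the guard names B or C with equal probability (this is what the factor 1/2 for p(I'_B|G_A) on the slide implies). The helper prob is hypothetical.

```python
from fractions import Fraction

half, third = Fraction(1, 2), Fraction(1, 3)

# Joint p(hanged, guard_says). The guard only ever names a pardoned B or C.
joint = {
    ("A", "B"): third * half,  # A hanged: guard picks B or C at random
    ("A", "C"): third * half,
    ("B", "C"): third,         # B hanged: guard must name C
    ("C", "B"): third,         # C hanged: guard must name B
}

def prob(event):
    """Total probability of all (hanged, says) outcomes satisfying event."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

# A's faulty reasoning: condition on I_B = "B will be released".
p_wrong = prob(lambda h, s: h == "A") / prob(lambda h, s: h != "B")

# Correct context: condition on I'_B = "the guard said B".
p_right = (prob(lambda h, s: h == "A" and s == "B")
           / prob(lambda h, s: s == "B"))

print(p_wrong, p_right)  # 1/2 1/3
```

The two answers differ only in which event is conditioned on, which is the point of the slide: the evidence is what the guard said, not the bare fact that B is pardoned.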

Case of 1000?

•  1000 prisoners of which 1 is to be put to death
•  A finds a list of 998 to be released without his name on it
•  What should our belief be that he will be the one being put to death?

Case of 1000?

•  List of 998 to be released
•  Query associated with the printout: 998 right handers
•  A is left-handed
•  ?? Difficult to treat such ignorance in general.

•  Multivalued hypothesis
•  Uncertain evidence
•  Virtual (intangible) evidence
•  Predicting future events
•  Multiple causes, explaining away
•  Patterns of plausible reasoning
•  …

Naïve Bayes

•  x: feature vector of event parameters
•  y: object class that gives rise to x (1 ≤ y ≤ k)
•  Certain features of x known:
   – Position
   – Flux at observed wavelength
•  Others will be unknown
   – Color
   – Change in mag/flux over time baselines

Naïve Bayes (contd.)

•  Assumption: conditioned on y, x is decomposable into B distinct independent classes (labeled x_b)
•  This helps with the curse of dimensionality
•  Also allows us to deal with missing values
•  Alternate parallel supplemental supervised classification
   – Artificial Neural Networks (ANN)
   – Support Vector Machines (SVM)
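How the independence assumption lets missing values be skipped can be sketched in a few lines of Python. Everything concrete here is a made-up assumption for illustration only: the two classes, the two feature blocks, their bins, and all probabilities carry no astrophysical meaning, and naive_bayes_posterior is a hypothetical helper.

```python
def naive_bayes_posterior(priors, likelihoods, observed):
    """Posterior p(y | observed feature blocks) under naive Bayes.

    priors:      {class: p(y)}
    likelihoods: {feature: {class: {bin: p(x_b | y)}}}
    observed:    {feature: bin}; features not listed are treated as
                 missing and simply contribute no factor.
    """
    scores = {}
    for y, p_y in priors.items():
        score = p_y
        for feature, value in observed.items():
            score *= likelihoods[feature][y][value]
        scores[y] = score
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}

# Illustrative numbers only (assumptions, not survey values).
priors = {"SN": 0.5, "CV": 0.5}
likelihoods = {
    "r-i": {"SN": {"blue": 0.2, "red": 0.8},
            "CV": {"blue": 0.7, "red": 0.3}},
    "near_galaxy": {"SN": {"yes": 0.9, "no": 0.1},
                    "CV": {"yes": 0.2, "no": 0.8}},
}

# Color not yet measured: only the context feature is used.
print(naive_bayes_posterior(priors, likelihoods, {"near_galaxy": "yes"}))
# Color arrives later: the same function now uses both blocks.
print(naive_bayes_posterior(priors, likelihoods,
                            {"near_galaxy": "yes", "r-i": "blue"}))
```

Dropping a missing block is equivalent to summing that block's likelihood over all its bins, which is 1 for every class; that is why no imputation step is needed.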

Follow-up (for missing values)

•  Such that it will help discriminate better
•  Serve probabilities so that consumers can choose their types of transients
•  Widest possible models

Choosing follow-up configs

[Figure: r-i color; hi-z quasar, blue star]

Transient classification mantra

•  Obtain a couple of epochs in one or more filters
•  Assign probabilities for different classes
•  Choose observations (filters, wavelengths) for best discrimination
•  Feed the new observations back in
•  Revise probabilities, choose observations, …
•  Based on confirmed class (how?) revise priors
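The "feed back in, revise" part of the loop amounts to using each posterior as the prior for the next observation. The Python sketch below shows only that step, not the choice of the most discriminating follow-up. The class names are taken from the slides; the flat starting prior and the likelihood values are assumptions, and update is a hypothetical helper.

```python
def update(belief, likelihood):
    """One Bayes step.

    belief:     {class: current probability}
    likelihood: {class: p(new observation | class)}
    """
    unnorm = {c: belief[c] * likelihood[c] for c in belief}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}

# Flat starting prior over the classes (assumption).
belief = {"SN": 0.25, "CV": 0.25, "Blazar": 0.25, "Rest": 0.25}

# Likelihood of each successive follow-up observation under each class
# (made-up numbers for illustration).
observations = [
    {"SN": 0.6, "CV": 0.3, "Blazar": 0.3, "Rest": 0.1},
    {"SN": 0.7, "CV": 0.1, "Blazar": 0.4, "Rest": 0.2},
]

history = [belief]
for likelihood in observations:
    belief = update(belief, likelihood)  # posterior becomes the next prior
    history.append(belief)

for step, b in enumerate(history):
    print(step, {c: round(p, 3) for c, p in b.items()})
```

If the observations are conditionally independent given the class, updating one at a time gives the same result as multiplying all likelihoods at once, so probabilities can be served and revised as data arrive.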

Summary

•  Modeling is all important (to predict/explore/explain)
•  Local dependencies, irrelevancies to be evaluated
•  Priors, likelihoods to be obtained
•  Directed Acyclic Graph to be constructed
•  Data define network
•  No "training" necessary

Bayesian Network Toolbox http://bnt.googlecode.com

A Bayesian Network

[Figure: network with nodes Cloudy, Rain, Sprinkler, Wet]

Creating a DAG

Multivalued nodes

Making the bnet

Naming parameters

Conditional probability distribution

Entering evidence
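The code on these slides did not survive extraction, so here is a plain-Python sketch (not BNT) of what entering evidence does on the Cloudy/Sprinkler/Rain/Wet network. The edges Cloudy→Sprinkler, Cloudy→Rain, Sprinkler→Wet, Rain→Wet and all conditional probability values below are assumptions: they are the numbers commonly used with this textbook network, not values recovered from the slides. The helpers bern, joint and query are hypothetical.

```python
from itertools import product

# Assumed CPTs; every entry is the probability of the node being True.
P_C = {True: 0.5, False: 0.5}               # p(Cloudy)
P_S_given_C = {True: 0.1, False: 0.5}       # p(Sprinkler | Cloudy)
P_R_given_C = {True: 0.8, False: 0.2}       # p(Rain | Cloudy)
P_W_given_SR = {(True, True): 0.99, (True, False): 0.9,
                (False, True): 0.9, (False, False): 0.0}  # p(Wet | S, R)

def bern(p_true, value):
    """Probability of a binary value given p(value is True)."""
    return p_true if value else 1.0 - p_true

def joint(c, s, r, w):
    """Joint probability factorized along the DAG."""
    return (P_C[c] * bern(P_S_given_C[c], s) * bern(P_R_given_C[c], r)
            * bern(P_W_given_SR[(s, r)], w))

def query(target, evidence):
    """p(target=True | evidence) by summing the joint over all 16 states."""
    names = ("C", "S", "R", "W")
    num = den = 0.0
    for values in product([True, False], repeat=4):
        state = dict(zip(names, values))
        if any(state[k] != v for k, v in evidence.items()):
            continue  # state inconsistent with the entered evidence
        p = joint(*values)
        den += p
        if state[target]:
            num += p
    return num / den

print(round(query("S", {"W": True}), 4))             # 0.4298
print(round(query("S", {"W": True, "R": True}), 4))  # 0.1945
```

With these assumed numbers, learning that it rained lowers the belief in the sprinkler from about 0.43 to about 0.19: the "explaining away" pattern discussed earlier for the wet grass example.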

The Catalina Real-Time Transient Survey

[Figure: Catalina Survey Fast Transient (a flare star), 02 Nov 2007 UT: 4 individual exposures, separated by 10 min, and the baseline coadd]

CRTS is a search for transients being done at Caltech, piggybacking on the data from the search for near-Earth, potentially hazardous asteroids (the latter is led by S. Larson, E. Beshore, et al. at UAz LPL). The survey uses the 24-inch Schmidt on Mt. Bigelow, and a single, unfiltered 4k x 4k CCD (and also telescopes at Mt. Lemmon and Siding Spring). Coverage of well over 1000 deg²/night

Basic astronomy classification trivia: context-based information

•  Galactic latitude – Galacticness
•  Proximity to a galaxy – SN

CV, SN, Blazars, Rest

-4 -> 4 (10 bins each)

spectra, light curves

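•  N = 5;
•  dag = zeros(N,N);
•  C = 1; g = 2; c1 = 3; c2 = 4; c3 = 5;
•  dag(C,[g c1 c2 c3]) = 1;  % the class node is the parent of all four feature nodes
•  discrete_nodes = 1:N;
•  node_sizes = [4 10 10 10 10];  % 4 classes; 10 bins for each feature
•  bnet = mk_bnet(dag, node_sizes, 'names', {'class','galactic_latitude','g-r','r-i','i-z'}, 'discrete', discrete_nodes);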
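As a rough cross-check of why this structure helps with dimensionality, the same DAG can be written down in plain Python (not BNT) and its free parameters counted. The node names and sizes are those of the slide; the counting helper free_parameters is hypothetical.

```python
names = ["class", "galactic_latitude", "g-r", "r-i", "i-z"]
node_sizes = [4, 10, 10, 10, 10]  # 4 classes; 10 bins per feature
N = len(names)

# dag[parent][child] = 1; the class node (index 0) points to every feature.
dag = [[0] * N for _ in range(N)]
for child in range(1, N):
    dag[0][child] = 1

def free_parameters(dag, node_sizes):
    """Independent CPT entries: for each node,
    (number of parent configurations) * (node size - 1)."""
    total = 0
    for node in range(len(node_sizes)):
        parent_configs = 1
        for parent in range(len(node_sizes)):
            if dag[parent][node]:
                parent_configs *= node_sizes[parent]
        total += parent_configs * (node_sizes[node] - 1)
    return total

# Free parameters of an unrestricted joint table over all five variables.
full_joint = 1
for size in node_sizes:
    full_joint *= size
full_joint -= 1

print(free_parameters(dag, node_sizes), full_joint)  # 147 39999
```

The class-to-features structure needs 147 numbers where an unrestricted joint table would need 39999, which is what makes priors and likelihoods practical to obtain.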

