Post on 20-Dec-2015
Bayesianness, cont’d
Part 2 of... 4?
Administrivia
•CSUSC (CS UNM Student Conference)
•March 1, 2007 (all day)
•That’s a Thursday...
•Thoughts?
Bayesian class: general idea
•Find probability distributions that describe the classes of data
•Find decision surface in terms of those probability distributions
•Bayesian decision rule: Bayes optimality
•Want to pick the class that minimizes expected cost
•Simplest case: cost==misclassification
•Expected cost == expected misclassification rate
5 minutes of math
•For 0/1 cost, the expected cost reduces to:
•To minimize, pick the class that minimizes:
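A sketch of the standard argument behind these two bullets. The notation is an assumption here ($g$ for the classifier, $y$ for the true class, $C_i$ for class $i$); the slide's own symbols may differ.

```latex
\begin{align*}
R(g) &= \int \Pr\bigl(g(x) \neq y \mid x\bigr)\, p(x)\, dx
      = \int \bigl[\,1 - \Pr\bigl(y = g(x) \mid x\bigr)\bigr]\, p(x)\, dx \\[4pt]
g(x) &= \arg\min_{i}\; \bigl[\,1 - \Pr(C_i \mid x)\bigr]
      \qquad \text{for each } x
\end{align*}
```

Because $p(x) \ge 0$, the integral is smallest when the bracketed term is made as small as possible separately at every $x$.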
Bayes optimal decisions
•Final rule: for 0/1 loss (accuracy), the optimal decision rule is:
•Equivalently, it’s sometimes useful to use the log odds ratio test:
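In its usual form, the rule picks the class $C_i$ with the largest posterior $P(C_i \mid x) \propto p(x \mid C_i)\,P(C_i)$. For two classes, the log odds test picks $C_1$ when $\log\frac{p(x \mid C_1)P(C_1)}{p(x \mid C_2)P(C_2)} > 0$. A minimal Python sketch, assuming the class-conditional likelihoods at the query point are already known; the function names `bayes_decide` and `log_odds` are hypothetical, chosen only for illustration:

```python
import numpy as np

def bayes_decide(likelihoods, priors):
    """Return the index of the class with the largest posterior.

    likelihoods[i] = p(x | C_i) and priors[i] = P(C_i). The evidence p(x)
    is the same for every class, so it is dropped from the argmax.
    """
    joint = np.asarray(likelihoods, dtype=float) * np.asarray(priors, dtype=float)
    return int(np.argmax(joint))

def log_odds(lik1, lik2, prior1, prior2):
    """Log odds of class 1 versus class 2; a positive value favors class 1."""
    return np.log(lik1 * prior1) - np.log(lik2 * prior2)
```

For two classes the two functions agree: `bayes_decide` returns index 0 exactly when `log_odds` is positive.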
Bayesian learning process
•So where do the probability distributions come from?
•The art of Bayesian data modeling is:
•Deciding what probability models to use
•Figuring out how to find the parameters
•In Bayesian learning, the “learning” is (almost) all in finding the parameters
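As one illustration of "finding the parameters", the sketch below fits a 1-D Gaussian to each class by maximum likelihood. This is an assumption about the simplest case: it ignores any prior on the parameters, and the function name `fit_gaussian_per_class` is hypothetical.

```python
import numpy as np

def fit_gaussian_per_class(x, y):
    """Estimate a class prior, mean, and variance for each label in y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    params = {}
    for label in np.unique(y):
        xc = x[y == label]  # samples belonging to this class
        params[label.item()] = {
            "prior": xc.size / x.size,  # fraction of samples in this class
            "mean": xc.mean(),
            "var": xc.var(),            # maximum-likelihood variance (divides by n)
        }
    return params
```

Calling `fit_gaussian_per_class(values, labels)` returns a dictionary keyed by class label, which is everything the decision rule needs once a Gaussian model has been assumed.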
Back to the H/W data
•Gaussian (a.k.a. normal or bell curve) is a reasonable assumption for this data
•Other distributions better for other data
•Can make reasonable guesses about means
•Probably not -3 kg or 2 million light-years
•Assumptions like these are called:
•Model assumptions (Gaussian)
•Parameter priors (means)
•How do we incorporate these into learning?
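One standard way to fold a prior on the mean into learning, which may not be the route this lecture takes, is the conjugate Gaussian update. The sketch assumes the data variance `sigma2` is known and that the prior on the mean is itself Gaussian with mean `mu0` and variance `tau2`; all of these names are hypothetical.

```python
def posterior_mean(samples, sigma2, mu0, tau2):
    """Posterior mean and variance of a Gaussian mean with known data variance.

    sigma2: known variance of the data.
    mu0, tau2: mean and variance of the Gaussian prior on the unknown mean.
    """
    n = len(samples)
    if n == 0:
        return mu0, tau2  # no data: the posterior is just the prior
    xbar = sum(samples) / n
    denom = sigma2 + n * tau2
    post_mean = (sigma2 * mu0 + n * tau2 * xbar) / denom
    post_var = (sigma2 * tau2) / denom
    return post_mean, post_var
```

The posterior mean is a weighted average of the prior guess and the sample mean, and more data shifts the weight toward the sample mean.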
Prior knowledge
5 minutes of math...
•Our friend the Gaussian distribution
•In 1 dimension: $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
•Mean: μ
•Std deviation: σ
•Both parameters scalar
•Usually, we talk about variance rather than std dev: σ²
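A minimal sketch of evaluating the standard 1-D Gaussian density with mean `mu` and std deviation `sigma`; the function name `gaussian_pdf` is hypothetical.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density at x of a 1-D Gaussian with mean mu and std deviation sigma."""
    z = (x - mu) / sigma  # distance from the mean in units of sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
```

The density peaks at `x == mu` and is symmetric around it; a larger `sigma` lowers and widens the curve.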
Gaussian: the pretty picture
•Location parameter: μ
•Scale parameter: σ
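5 minutes of math...
•In d dimensions: $p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)$
•Where:
•Mean vector: $\boldsymbol{\mu}$ (d × 1)
•Covariance matrix: $\Sigma$ (d × d)
•Determinant of covariance: $|\Sigma|$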
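A sketch of the d-dimensional Gaussian density in NumPy, assuming `cov` is a symmetric positive-definite d × d matrix; the function name `mvn_pdf` is hypothetical.

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Density at x of a d-dimensional Gaussian with mean mu and covariance cov."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = mu.size
    diff = x - mu
    # Solve cov @ v = diff instead of forming the inverse explicitly.
    maha = diff @ np.linalg.solve(cov, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * maha) / norm)
```

With `d = 1` and `cov = [[sigma**2]]` this reduces to the 1-D Gaussian density, which makes a convenient sanity check.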
Exercise:
•For the 1-d Gaussian:
•Given two classes, with means μ1 and μ2 and std devs σ1 and σ2
•Find a description of the decision point if the std devs are the same but the means differ
•And if the means are the same but the std devs differ
•For the d-dim Gaussian:
•What shapes are the isopotentials (contours of constant density)? Why?
•Repeat the above exercise for the d-dim Gaussian
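One way to check answers to the 1-d part numerically: set the two log posteriors equal, which gives a quadratic in x, and solve it. The sketch assumes equal class priors unless others are passed in; the function name `decision_points` is hypothetical.

```python
import math

def decision_points(mu1, sigma1, mu2, sigma2, prior1=0.5, prior2=0.5):
    """Points x where two 1-D Gaussian classes have equal posterior."""
    # Coefficients of a*x^2 + b*x + c = log posterior(1) - log posterior(2).
    a = 1.0 / (2 * sigma2**2) - 1.0 / (2 * sigma1**2)
    b = mu1 / sigma1**2 - mu2 / sigma2**2
    c = (mu2**2 / (2 * sigma2**2) - mu1**2 / (2 * sigma1**2)
         + math.log(sigma2 / sigma1) + math.log(prior1 / prior2))
    if math.isclose(a, 0.0, abs_tol=1e-12):
        # Equal std devs: the quadratic term vanishes, leaving a line.
        if math.isclose(b, 0.0, abs_tol=1e-12):
            return []  # same mean and std dev: no isolated crossing point
        return [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # the posteriors never cross
    r = math.sqrt(disc)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])
```

With equal priors and equal std devs the function returns a single point, the midpoint of the two means. With equal means and different std devs it returns two points placed symmetrically around the shared mean.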