Learning, Bayesian Probability, Graphical Models and Abduction (Slides)


Slide 1/31

    Learning, Bayesian Probability,

    Graphical Models, and Abduction

    David Poole

    University of British Columbia


Slide 2/31

    Induction and abduction

    [Concept-map figure linking the talk's themes: Bayesian networks,
    probabilistic Horn abduction, logical abduction, Bayesian learning,
    neural networks, statistical learning, the MDL principle, logic
    programming, and possible-worlds semantics]

Slide 3/31

    Overview

    Causal and evidential modelling & reasoning

    Bayesian networks, Bayesian conditioning & abduction

    Noise, overfitting, and Bayesian learning


Slide 4/31

    Causal & Evidential Modelling

    Causal modelling:

        causes (of interest) → effects (observed)

        vision: scene → image

        diagnosis: disease → symptoms

        learning: model → data

Slide 5/31

    Evidential modelling:

        effects → causes

        vision: image → scene

        diagnosis: symptoms → diseases

        learning: data → model

Slide 6/31

    Causal & Evidential Reasoning

        observation → (evidential reasoning) → cause → (causal reasoning) → prediction

Slide 7/31

    Reasoning & Modelling Strategies

    How do we do causal and evidential reasoning, given the modelling strategies?

        Evidential modelling & only evidential reasoning (Mycin, neural networks).

        Model evidentially + causally (problem: consistency, redundancy,
        knowledge acquisition).

        Model causally; use different reasoning strategies for causal &
        evidential reasoning (deduction + abduction, or Bayes theorem).

Slide 8/31

    Bayes Rule

    [de Moivre 1718, Bayes 1763, Laplace 1774]

        P(h | e) = P(e | h) P(h) / P(e)

    Proof:

        P(h ∧ e) = P(e | h) P(h) = P(h | e) P(e)
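    As a quick numeric sketch of using the rule for evidential reasoning (all
    numbers below are illustrative assumptions, not from the slides):

        # Bayes rule: P(h|e) = P(e|h) P(h) / P(e), with P(e) expanded by
        # the law of total probability. Numbers are made up for illustration.
        p_h = 0.01            # prior P(h), e.g. P(disease)
        p_e_given_h = 0.9     # likelihood P(e|h), e.g. P(symptom | disease)
        p_e_given_not_h = 0.05
        p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
        p_h_given_e = p_e_given_h * p_h / p_e
        print(p_h_given_e)    # ~0.154: evidential reasoning from a causal model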

Slide 9/31

    Lesson #1

    You should know the difference between

    evidential & causal modelling

    evidential & causal reasoning

    There seems to be a relationship between Bayes theorem and abduction used for the same task.


Slide 10/31

    Overview

    Causal and evidential modelling & reasoning

    Bayesian networks, Bayesian conditioning & abduction

    Noise, overfitting, and Bayesian learning


Slide 11/31

    Bayesian Networks

    Graphical representation of independence.

    DAGs with nodes representing random variables.

    Embed the independence assumption:

        If b1, ..., bn are the parents of a, then
        P(a | b1, ..., bn, V) = P(a | b1, ..., bn)
        if V is not a descendant of a.
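    A minimal sketch of the factorization this assumption licenses: the joint is
    the product of each variable given its parents. The two-node network and its
    numbers here are hypothetical:

        # Joint probability as a product of P(node | parents), following the
        # Bayesian-network independence assumption. Tiny invented network:
        # lamp_works -> projector_lamp_on.
        parents = {"lamp_works": [], "projector_lamp_on": ["lamp_works"]}
        cpt = {
            "lamp_works": {(): 0.95},                            # P(lamp_works)
            "projector_lamp_on": {(True,): 0.9, (False,): 0.0},  # P(on | works)
        }

        def joint(assign):
            # P(assignment) = product over nodes of P(node | its parents)
            p = 1.0
            for node, pars in parents.items():
                p_true = cpt[node][tuple(assign[q] for q in pars)]
                p *= p_true if assign[node] else 1 - p_true
            return p

        print(joint({"lamp_works": True, "projector_lamp_on": True}))  # 0.855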

Slide 12/31

    Bayesian Network for Overhead Projector

    [Network figure; its nodes include: power_in_building, projector_plugged_in,
    projector_switch_on, power_in_wire, power_in_projector, lamp_works,
    projector_lamp_on, mirror_working, screen_lit_up, light_switch_on,
    room_light_on, alan_reading_book, ray_is_awake, ray_says_"screen is dark"]

Slide 13/31

    Bayesian networks as logic programs

        projector_lamp_on ←
            power_in_projector ∧
            lamp_works ∧
            projector_working_ok.

        projector_lamp_on ←
            power_in_projector ∧
            lamp_works ∧
            working_with_faulty_lamp.

    (projector_working_ok and working_with_faulty_lamp are possible hypotheses
    with associated probabilities)

Slide 14/31

    Probabilities of hypotheses

        P(projector_working_ok)
            = P(projector_lamp_on | power_in_projector ∧ lamp_works)
        (provided as part of the Bayesian network)

        P(¬projector_working_ok) = 1 - P(projector_working_ok)

Slide 15/31

    What do these logic programs mean?

    There is a possible world for each assignment of truth values to the
    possible hypotheses:

        {projector_working_ok,  working_with_faulty_lamp}
        {projector_working_ok,  ¬working_with_faulty_lamp}
        {¬projector_working_ok, working_with_faulty_lamp}
        {¬projector_working_ok, ¬working_with_faulty_lamp}

    The probability of a possible world is the product of the probabilities of
    the associated hypotheses.

    The logic program specifies what else is true in each possible world.

Slide 16/31

    Probabilistic logic programs & abduction

    The semantics is abductive in nature: the set of explanations of a
    proposition characterizes the possible worlds in which it is true
    (assume possible hypotheses and their negations).

        P(g) = Σ_{E an explanation of g} P(E)

        P(E) = Π_{h ∈ E} P(h)    (the P(h) are given with the logic program)
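    A small sketch of that semantics: sum the probabilities of a goal's
    explanations, each a product over its hypotheses. The explanations below
    (using negations to keep them mutually exclusive) and all probabilities
    are illustrative:

        # P(g) = sum over explanations E of g of P(E);
        # P(E) = product of the probabilities of the hypotheses in E.
        from math import prod

        prob = {
            "projector_working_ok": 0.85,       # invented numbers
            "not_projector_working_ok": 0.15,
            "working_with_faulty_lamp": 0.05,
        }

        def p_goal(explanations):
            # Explanations are assumed mutually exclusive, so they add.
            return sum(prod(prob[h] for h in E) for E in explanations)

        explanations_of_lamp_on = [
            {"projector_working_ok"},
            {"not_projector_working_ok", "working_with_faulty_lamp"},
        ]
        print(p_goal(explanations_of_lamp_on))  # 0.85 + 0.15*0.05 = 0.8575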

Slide 17/31

    Conditional Probabilities

        P(g | e) = P(g ∧ e) / P(e)
        (numerator: explain g ∧ e; denominator: explain e)

    Given evidence e, explain e, then try to explain g from these explanations.
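    Continuing the sketch above, conditioning is just two explanation sums
    (hypothesis names and probabilities again invented, and each explanation
    list assumed mutually exclusive):

        # P(g|e) = P(g and e) / P(e), both computed from explanations.
        from math import prod

        prob = {"h1": 0.85, "h2": 0.05}  # illustrative hypothesis probabilities

        def p_sum(explanations):
            return sum(prod(prob[h] for h in E) for E in explanations)

        def p_given(expl_g_and_e, expl_e):
            return p_sum(expl_g_and_e) / p_sum(expl_e)

        # With hypothetical explanation sets:
        print(p_given([{"h1", "h2"}], [{"h1"}]))  # 0.0425 / 0.85, i.e. ~0.05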

Slide 18/31

    Lesson #2

    Bayesian conditioning = abduction.

    The evidence of Bayesian conditioning is what is to be explained.

    Condition on all information obtained since the knowledge base was built:

        P(h | e ∧ k) = P_k(h | e)

Slide 19/31

    Overview

    Causal and evidential modelling & reasoning

    Bayesian networks, Bayesian conditioning & abduction

    Noise, overfitting, and Bayesian learning


Slide 20/31

    Induction

        data → (evidential reasoning) → model → (causal reasoning) → prediction

Slide 21/31

    Potential Confusion

    Often the data is about some evidential reasoning task, e.g.,
    classification, diagnosis, recognition.

    Example: We can do Bayesian learning where the hypotheses are decision
    trees, neural networks, parametrized distributions, or Bayesian networks.

    Example: We can do hill-climbing learning with cross validation where the
    hypotheses are decision trees, neural networks, parametrized distributions,
    or Bayesian networks.

Slide 22/31

    Noise and Overfitting

    Most data contains noise (errors, inadequate attributes, spurious
    correlations).

    Overfitting: the learned model fits random correlations in the data.

    Example: A more detailed decision tree always fits the data better, but
    usually smaller decision trees provide better predictive value.

    Need a tradeoff between model simplicity and fit to data.
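    A quick empirical sketch of that decision-tree example, assuming
    scikit-learn is available (the dataset, noise level, and depths are
    arbitrary choices):

        # Deeper trees always fit the training data at least as well,
        # but test accuracy typically peaks at a modest depth.
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=400, flip_y=0.2, random_state=0)
        Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

        for depth in (1, 3, 5, 10, None):  # None: grow until leaves are pure
            tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
            tree.fit(Xtr, ytr)
            print(depth, tree.score(Xtr, ytr), tree.score(Xte, yte))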

Slide 23/31

    Overfitting and Bayes theorem

        P(h | e) = P(e | h) P(h) / P(e)

        P(e | h): fit to data
        P(h): bias
        P(e): normalizing constant

Slide 24/31

    Minimum description length principle

    Choose the best hypothesis given the evidence:

        arg max_h P(h | e)
            = arg max_h P(e | h) P(h) / P(e)
            = arg max_h P(e | h) P(h)
            = arg min_h (-log2 P(e | h) - log2 P(h))

        -log2 P(e | h): the number of bits to describe the data in terms of
        the model

        -log2 P(h): the number of bits to describe the model
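    A minimal sketch of MDL-style selection: score each hypothesis by data bits
    plus model bits and keep the smallest (the hypotheses and their
    probabilities are invented):

        # MDL: pick h minimizing -log2 P(e|h) - log2 P(h), i.e.
        # (bits for the data given the model) + (bits for the model).
        from math import log2

        # Map each hypothesis to (P(e|h), P(h)); numbers are illustrative.
        hypotheses = {"small_tree": (0.02, 0.30), "big_tree": (0.08, 0.001)}

        def description_length(p_e_given_h, p_h):
            return -log2(p_e_given_h) - log2(p_h)

        best = min(hypotheses, key=lambda h: description_length(*hypotheses[h]))
        print(best)  # small_tree: the better fit doesn't pay for the model bits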

Slide 25/31

    Graphical Models for Learning

    Idea: model → data

    Example: parameter estimation for the probability of heads
    (from [Buntine, JAIR, 94])

    [Plate figure: a parameter node θ with arcs to heads_1, heads_2, ..., heads_N;
    the plate replicates the heads_i node N times]

Slide 26/31

    Abductive Version of Parameter Learning

        heads(C) ←
            turns_heads(C, θ) ∧ prob_heads(θ).

        tails(C) ←
            turns_tails(C, θ) ∧ prob_heads(θ).

    Possible hypotheses, for each toss C and each θ ∈ [0, 1]:

        {turns_heads(C, θ), turns_tails(C, θ)}    {prob_heads(θ) : θ ∈ [0, 1]}

        Prob(turns_heads(C, θ)) = θ
        Prob(turns_tails(C, θ)) = 1 - θ
        Prob(prob_heads(θ)) = 1    (uniform on [0, 1])
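    A sketch of what this abductive parameter learning computes, using a
    discretized grid for θ (the grid size and observed tosses are arbitrary):
    the posterior over prob_heads(θ) after conditioning on the data.

        # Posterior over the heads parameter theta after observing tosses,
        # starting from a uniform prior. Grid and counts are illustrative.
        n_heads, n_tails = 3, 2   # e.g. heads(c1), tails(c2), tails(c3), ...

        grid = [i / 100 for i in range(101)]
        # Each explanation fixes theta plus one hypothesis per toss, so its
        # weight is theta^n_heads * (1 - theta)^n_tails times the prior.
        weights = [t**n_heads * (1 - t)**n_tails for t in grid]
        total = sum(weights)
        posterior = [w / total for w in weights]

        mean = sum(t * p for t, p in zip(grid, posterior))
        print(round(mean, 3))  # ~0.571, the Beta(4, 3) posterior mean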

Slide 27/31

    Explaining Data

    If you observe:

        heads(c1), tails(c2), tails(c3), heads(c4), heads(c5), . . .

    For each θ ∈ [0, 1] there is an explanation:

        {prob_heads(θ), turns_heads(c1, θ), turns_tails(c2, θ),
         turns_tails(c3, θ), turns_heads(c4, θ), turns_heads(c5, θ), . . .}

Slide 28/31

    Abductive neural-network learning

    [Network figure: inputs i1, i2, i3; hidden units h1, h2; outputs o1, o2, o3]

        prop(X, o2, V) ←
            prop(X, h1, V1) ∧
            prop(X, h2, V2) ∧
            param(p8, P1) ∧ param(p10, P2) ∧    (param is abducible)
            V = 1 / (1 + e^-(V1*P1 + V2*P2)).   (sigmoid of the additive input)
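    The body of that rule is just a sigmoid unit; a direct rendering, with the
    hidden-unit values and abduced weights invented for illustration:

        # prop(X, o2, V): V is the sigmoid of the weighted sum of the hidden
        # units, where the weights p8, p10 are the abduced parameters.
        from math import exp

        def sigmoid_unit(v1, v2, p1, p2):
            return 1.0 / (1.0 + exp(-(v1 * p1 + v2 * p2)))

        print(sigmoid_unit(0.7, 0.2, 1.5, -2.0))  # ~0.657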

Slide 29/31

    Lesson #3

    Abductive view of Bayesian learning:

        rules imply the data from parameters or possible representations

        parameter values or possible representations are abducible

        rules contain logical variables for the data items

Slide 30/31

    Evidential versus causal modelling

                                Neural Nets         Bayesian Nets
        modelling               evidential          causal
                                sigmoid additive    linear Gaussian
                      (related by Bayes theorem)
        evidential reasoning    direct              abduction
        causal reasoning        none                direct
        context changes         fragile             robust
        learning                easier              more difficult

Slide 31/31

    Conclusion

    Bayesian reasoning is abduction.

    A Bayesian network is a causal modelling language: abduction + probability.

    What logic-based abduction can learn from the Bayesians:

        Handle noise and overfitting.

        Conditioning: explain everything.

        Algorithms: exploit sparse structure, exploit extreme distributions,
        or use stochastic simulation.

    What the Bayesians can learn:

        Richer representations.