open universes and nuclear weapons
1
Open universes and nuclear weapons
Stuart Russell, Computer Science Division, UC Berkeley
2
Outline
- Why we need expressive probabilistic languages
- BLOG combines probability and first-order logic
- Application to global seismic monitoring for the Comprehensive Nuclear-Test-Ban Treaty (CTBT)
3
The world has things in it!!
Expressive language => concise models => fast learning, sometimes fast reasoning. E.g., rules of chess:
- 1 page in first-order logic: On(color, piece, x, y, t)
- ~100,000 pages in propositional logic: WhiteKingOnC4Move12
- ~10^38 pages as an atomic-state model: each state an opaque board string like "R.B.KB.RPPP..PPP..N..N.....PP....q.pp..Q..n..n..ppp..pppr.b.kb.r"
[Note: chess is a tiny problem compared to the real world]
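The size gap above can be made concrete with a back-of-envelope count. The sketch below assumes 2 colors, 6 piece types, an 8x8 board, and a (hypothetical) 100-move horizon; the point is only that one quantified relation replaces tens of thousands of propositional symbols.

```python
# Back-of-envelope count: encoding chess positions propositionally needs one
# symbol per (color, piece type, square, move) combination, while first-order
# logic needs a single relation On(color, piece, x, y, t).
colors, piece_types, squares, moves = 2, 6, 64, 100
propositions = colors * piece_types * squares * moves  # 76,800 symbols
first_order_relations = 1
print(propositions)
```

The atomic-state count is worse still: roughly one "page" per reachable board string.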
4
Brief history of expressiveness

              atomic     propositional   first-order/relational
logic                    5th C B.C.      19th C
probability   17th C     20th C          21st C (be patient!)
11
First-order probabilistic languages
Gaifman [1964]: possible worlds with objects and relations; probabilities attached to (infinitely many) sentences
Halpern [1990]: probabilities within sentences; constraints on distributions over first-order possible worlds
Poole [1993], Sato [1997], Koller & Pfeffer [1998], various others: KB defines the distribution exactly (cf. Bayes nets); assumes unique names and domain closure, like Prolog and databases (Herbrand semantics)
12
Herbrand vs full first-order
GivenFather(Bill,William) and Father(Bill,Junior)How many children does Bill have?
13
Herbrand vs full first-order
GivenFather(Bill,William) and Father(Bill,Junior)How many children does Bill have?
Herbrand semantics:2
14
Herbrand vs full first-order
GivenFather(Bill,William) and Father(Bill,Junior)How many children does Bill have?
Herbrand semantics:2First-order logical semantics:
Between 1 and ∞
15
Possible worlds
- Propositional
- First-order + unique names, domain closure
- First-order open-universe
[figures: each row shows candidate worlds over objects A, B, C, D; in the open-universe row, worlds also differ in how many objects exist and which objects the symbols A, B, C, D denote]
18
Open-universe models
Essential for learning about what exists, e.g., vision, NLP, information integration, tracking, life
[Note the GOFAI Gap: logic-based systems going back to Shakey assumed that perceived objects would be named correctly]
Key question: how to define distributions over an infinite, heterogeneous set of worlds?
Bayes nets build propositional worlds
19
BurglaryBurglary
AlarmAlarm
EarthquakeEarthquake
Bayes nets build propositional worlds
20
BurglaryBurglary
AlarmAlarm
EarthquakeEarthquake
Burglary
Bayes nets build propositional worlds
21
BurglaryBurglary
AlarmAlarm
EarthquakeEarthquake
Burglarynot Earthquake
Bayes nets build propositional worlds
22
BurglaryBurglary
AlarmAlarm
EarthquakeEarthquake
Burglarynot EarthquakeAlarm
23
Open-universe models in BLOG
Construct worlds using two kinds of steps, proceeding in topological order:
- Dependency statements: set the value of a function or relation on a tuple of (quantified) arguments, conditioned on parent values
- Number statements: add some objects to the world, conditioned on what objects and relations exist so far
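The two kinds of steps can be sketched as a toy forward sampler. This is illustrative only (not the BLOG engine); the particular priors (uniform count, Bernoulli(0.9) honesty) are stand-ins.

```python
import random

def sample_world(rng):
    """Toy open-universe world construction, in topological order:
    a number statement adds objects, then dependency statements
    set each object's attributes conditioned on parent values."""
    world = {"objects": [], "honest": {}}
    # Number statement: #Object ~ (some prior over how many things exist)
    n = rng.randint(1, 5)
    for i in range(n):
        obj = f"obj{i}"
        world["objects"].append(obj)
        # Dependency statement: Honest(obj) ~ Boolean[0.9]()
        world["honest"][obj] = rng.random() < 0.9
    return world

w = sample_world(random.Random(0))
```

Because the number of objects is itself sampled, different worlds contain different numbers of things, which is exactly what closed-universe languages rule out.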
25
Semantics
Every well-formed* BLOG model specifies a unique proper probability distribution over open-universe possible worlds; equivalent to an infinite contingent Bayes net
* No infinite receding ancestor chains, no conditioned cycles, all expressions finitely evaluable
26
Example: Citation Matching
[Lashkari et al 94] Collaborative Interface Agents, Yezdi Lashkari, Max Metral, and Pattie Maes, Proceedings of the Twelfth National Conference on Articial Intelligence, MIT Press, Cambridge, MA, 1994.
Metral M. Lashkari, Y. and P. Maes. Collaborative interface agents. In Conference of the American Association for Artificial Intelligence, Seattle, WA, August 1994.
Are these descriptions of the same object?
Core task in CiteSeer, Google Scholar, and over 300 companies in the record-linkage industry
27
(Simplified) BLOG model

#Researcher ~ NumResearchersPrior();
Name(r) ~ NamePrior();
#Paper(FirstAuthor = r) ~ NumPapersPrior(Position(r));
Title(p) ~ TitlePrior();
PubCited(c) ~ Uniform({Paper p});
Text(c) ~ NoisyCitationGrammar(Name(FirstAuthor(PubCited(c))), Title(PubCited(c)));
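The citation model above can be read as a forward sampler. The sketch below mirrors its structure in Python; every prior here (the name list, title variants, per-researcher paper counts, the crude character-dropping noise) is an illustrative stand-in, not the real NoisyCitationGrammar.

```python
import random

def sample_citations(rng, n_citations=3):
    # #Researcher ~ NumResearchersPrior()
    researchers = [f"r{i}" for i in range(rng.randint(1, 4))]
    # Name(r) ~ NamePrior()
    name = {r: rng.choice(["Lashkari", "Metral", "Maes"]) for r in researchers}
    # #Paper(FirstAuthor = r) ~ NumPapersPrior(Position(r))
    papers = [(r, rng.choice(["Collaborative Interface Agents",
                              "Collaborative interface agents"]))
              for r in researchers for _ in range(rng.randint(1, 3))]
    citations = []
    for _ in range(n_citations):
        author, title = rng.choice(papers)  # PubCited(c) ~ Uniform({Paper p})
        # Text(c) ~ noisy rendering of the cited paper's author and title
        text = f"{name[author]}. {title}."
        if rng.random() < 0.3:              # crude typo noise for illustration
            text = text.replace("a", "", 1)
        citations.append(text)
    return citations

cits = sample_citations(random.Random(0))
```

Inference runs this process in reverse: given observed citation strings, infer how many papers exist and which citations co-refer.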
33
Citation Matching Results
Four data sets of ~300-500 citations, referring to ~150-300 papers
[bar chart: error (fraction of clusters not recovered correctly), roughly 0 to 0.25, on the four data sets (Reinforce, Face, Reason, Constraint), comparing Phrase Matching [Lawrence et al. 1999], Generative Model + MCMC [Pasula et al. 2002], and Conditional Random Field [Wellner et al. 2004]]
34
Example: Sibyl attacks
- Typically between 100 and 10,000 real people
- About 90% are honest and have one login ID
- Dishonest people own between 10 and 1000 logins
- Transactions may occur between logins:
  - If two logins are owned by the same person (sibyls), a transaction is highly likely
  - Otherwise, a transaction is less likely (depending on the honesty of each login's owner)
- A login may recommend another after a transaction:
  - Sibyls with the same owner usually recommend each other
  - Otherwise, the probability of recommendation depends on the honesty of the two owners
35
#Person ~ LogNormal[6.9, 2.3]();
Honest(x) ~ Boolean[0.9]();
#Login(Owner = x) ~ if Honest(x) then 1 else LogNormal[4.6, 2.3]();
Transaction(x,y) ~ if Owner(x) = Owner(y) then SibylPrior()
    else TransactionPrior(Honest(Owner(x)), Honest(Owner(y)));
Recommends(x,y) ~ if Transaction(x,y) then
    if Owner(x) = Owner(y) then Boolean[0.99]()
    else RecPrior(Honest(Owner(x)), Honest(Owner(y)));

Evidence: lots of transactions and recommendations, maybe some Honest(.) assertions
Query: Honest(x)
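The number statements in the sibyl model can be sketched as a forward sampler. The LogNormal parameters below match the slide; the caps on counts are added only to keep this sketch bounded, and the whole thing is illustrative rather than the actual model.

```python
import random

def sample_sibyl_world(rng):
    # #Person ~ LogNormal[6.9, 2.3]()  (median exp(6.9) ~ 1000 people;
    # capped here purely to keep the sketch small)
    n_people = min(500, max(1, int(rng.lognormvariate(6.9, 2.3))))
    honest = {}  # person id -> Honest(x) ~ Boolean[0.9]()
    owner = {}   # login id -> person id
    for p in range(n_people):
        honest[p] = rng.random() < 0.9
        # #Login(Owner = x) ~ if Honest(x) then 1 else LogNormal[4.6, 2.3]()
        n_logins = 1 if honest[p] else min(1000, max(1, int(rng.lognormvariate(4.6, 2.3))))
        for _ in range(n_logins):
            owner[len(owner)] = p
    return honest, owner

honest, owner = sample_sibyl_world(random.Random(0))
```

Given observed transactions and recommendations among logins, inference must recover how many people exist, which logins share an owner, and which owners are honest.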
40
Example: classical data association
[animation: radar blips appearing over successive time steps; track hypotheses associate blips with unknown numbers of aircraft]
46
#Aircraft(EntryTime = t) ~ NumAircraftPrior();
Exits(a, t) if InFlight(a, t) then ~ Bernoulli(0.1);
InFlight(a, t)
    if t < EntryTime(a) then = false
    elseif t = EntryTime(a) then = true
    else = (InFlight(a, t-1) & !Exits(a, t-1));
State(a, t)
    if t = EntryTime(a) then ~ InitState()
    elseif InFlight(a, t) then ~ StateTransition(State(a, t-1));
#Blip(Source = a, Time = t)
    if InFlight(a, t) then ~ NumDetectionsCPD(State(a, t));
#Blip(Time = t) ~ NumFalseAlarmsPrior();
ApparentPos(r)
    if (Source(r) = null) then ~ FalseAlarmDistrib()
    else ~ ObsCPD(State(Source(r), Time(r)));
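The radar model above is a forward simulation: aircraft enter, move, and sometimes exit; in-flight aircraft generate noisy blips; stations also generate false alarms. The sketch below mirrors that structure with made-up rates and a one-dimensional state, purely for illustration of the model's shape.

```python
import random

def simulate(rng, horizon=10):
    """Toy forward simulation of the radar/blip model.
    Returns blips as (time, apparent_pos, source); source=None for false alarms."""
    blips = []
    n_aircraft = rng.randint(1, 3)           # #Aircraft ~ NumAircraftPrior()
    for a in range(n_aircraft):
        t = rng.randint(0, horizon - 1)      # entry time
        state = rng.uniform(0.0, 100.0)      # State(a, t) ~ InitState()
        while t < horizon:
            if rng.random() < 0.9:           # detection probability (stand-in)
                # ApparentPos(r) ~ ObsCPD(...): true state plus Gaussian noise
                blips.append((t, state + rng.gauss(0.0, 1.0), a))
            if rng.random() < 0.1:           # Exits(a, t) ~ Bernoulli(0.1)
                break
            state += rng.gauss(1.0, 0.5)     # StateTransition
            t += 1
    for t in range(horizon):                 # #Blip(Time = t) ~ false alarms
        if rng.random() < 0.2:
            blips.append((t, rng.uniform(0.0, 100.0), None))
    return blips

blips = simulate(random.Random(0))
```

Data association is the inverse problem: given only the (time, apparent_pos) pairs, infer how many aircraft there are and which blips each one produced.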
47
Inference
Theorem: BLOG inference algorithms (rejection sampling, importance sampling, MCMC) converge to correct posteriors for any well-formed* model, for any first-order query
- Current generic MCMC engine is quite slow
- Applying compiler technology
- Developing user-friendly methods for specifying piecemeal MCMC proposals
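As a concrete (closed-universe) illustration of MCMC over possible worlds, here is a minimal Metropolis-Hastings sampler on the earlier burglary/earthquake example. The CPT numbers are invented for illustration; real BLOG inference works the same way but over far richer, open-universe state spaces.

```python
import random

# World: (burglary, earthquake) booleans; evidence: alarm = True.
P_B, P_E = 0.01, 0.02

def p_alarm(b, e):
    # Made-up alarm CPT for illustration
    return 0.95 if b else (0.3 if e else 0.001)

def unnorm_posterior(w):
    # prior(world) * likelihood(alarm=True | world), unnormalized
    b, e = w
    return (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E) * p_alarm(b, e)

def mh(n_steps, rng):
    w = (False, False)
    count_b = 0
    for _ in range(n_steps):
        # Proposal: flip one coordinate chosen uniformly at random
        i = rng.randrange(2)
        w2 = (not w[0], w[1]) if i == 0 else (w[0], not w[1])
        if rng.random() < min(1.0, unnorm_posterior(w2) / unnorm_posterior(w)):
            w = w2
        count_b += w[0]
    return count_b / n_steps  # estimate of P(Burglary | alarm)

est = mh(50_000, random.Random(0))
```

With these numbers the exact posterior P(Burglary | alarm) is about 0.58, and the chain's estimate converges to it.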
48
CTBT
Bans testing of nuclear weapons on Earth
- Allows for outside inspection of 1000 km²
- 182/195 states have signed; 153/195 have ratified
- Needs 9 more ratifications, including the US and China
- US Senate refused to ratify in 1998: "too hard to monitor"
49
2053 nuclear explosions
51
254 monitoring stations
53
Vertically Integrated Seismic Analysis
The problem is hard:
- ~10,000 "detections" per day, 90% false
- CTBT system (SEL3) finds 69% of significant events, plus about twice as many spurious (nonexistent) events
- 16 human analysts find more events, correct existing ones, throw out spurious events, and generate the LEB ("ground truth")
- Unreliable below magnitude 4 (1 kT)
Solve it by global probabilistic inference:
- NET-VISA finds around 88% of significant events
64
Generative model for IDC arrival data
- Events occur in time and space with magnitude:
  - Natural spatial distribution: a mixture of Fisher-Binghams
  - Man-made spatial distribution: uniform
  - Time distribution: Poisson with given spatial intensity
  - Magnitude distribution: Gutenberg-Richter (exponential)
  - Aftershock distribution (not yet implemented)
- Travel time according to the IASPEI91 model, plus a Laplacian error distribution for each of 14 phases
- Detection depends on magnitude, distance, station*
- Detected azimuth and slowness, plus Laplacian error
- False detections with a station-dependent distribution
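The event side of this model is easy to sketch: event times follow a Poisson process, magnitudes follow Gutenberg-Richter (exponential above a minimum magnitude), and observed arrival times add a Laplacian residual to the predicted travel time. All rates and scales below are hypothetical stand-ins for the calibrated IDC values.

```python
import math
import random

EVENT_RATE = 0.01       # events per second (hypothetical)
TIME_DURATION = 3600.0  # a one-hour window
MIN_MAG = 2.0

def sample_events(rng):
    """Poisson process in time; Gutenberg-Richter magnitudes."""
    events, t = [], 0.0
    while True:
        t += rng.expovariate(EVENT_RATE)   # exponential inter-arrival gaps
        if t > TIME_DURATION:
            break
        # Magnitude ~ Exponential(log 10) + MIN_MAG (Gutenberg-Richter)
        events.append((t, rng.expovariate(math.log(10)) + MIN_MAG))
    return events

def observed_arrival(rng, predicted_travel_time, scale=1.0):
    # Laplace(0, scale) residual, sampled as the difference of two
    # independent exponentials with rate 1/scale
    residual = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return predicted_travel_time + residual

events = sample_events(random.Random(0))
```

In the real model the predicted travel time comes from the IASPEI91 tables per phase; here it is just a number passed in.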
65
# SeismicEvents ~ Poisson[TIME_DURATION*EVENT_RATE];
IsEarthQuake(e) ~ Bernoulli(.999);
EventLocation(e) ~ If IsEarthQuake(e) then EarthQuakeDistribution()
    Else UniformEarthDistribution();
Magnitude(e) ~ Exponential(log(10)) + MIN_MAG;
Distance(e,s) = GeographicalDistance(EventLocation(e), SiteLocation(s));
IsDetected(e,p,s) ~ Logistic[SITE_COEFFS(s,p)](Magnitude(e), Distance(e,s));
#Arrivals(site = s) ~ Poisson[TIME_DURATION*FALSE_RATE(s)];
#Arrivals(event=e, site) = If IsDetected(e,s) then 1 else 0;
Time(a) ~ If (event(a) = null) then Uniform(0,TIME_DURATION)
    else IASPEI(EventLocation(event(a)), SiteLocation(site(a)), Phase(a)) + TimeRes(a);
TimeRes(a) ~ Laplace(TIMLOC(site(a)), TIMSCALE(site(a)));
Azimuth(a) ~ If (event(a) = null) then Uniform(0, 360)
    else GeoAzimuth(EventLocation(event(a)), SiteLocation(site(a))) + AzRes(a);
AzRes(a) ~ Laplace(0, AZSCALE(site(a)));
Slow(a) ~ If (event(a) = null) then Uniform(0,20)
    else IASPEI-SLOW(EventLocation(event(a)), SiteLocation(site(a))) + SlowRes(site(a));
66
Forward model structure
[diagram, shown repeatedly with different parts highlighted: seismic events -> propagation -> picks at stations 1 and 2, with "detected at station?" nodes and per-station noise sources]
- Seismic event: type, time, location, depth, magnitude, phase
- Propagation: travel time, amplitude decay
- Station picks (if detected): arrival time*, amplitude*, azimuth*, slowness*, phase* (* = measured with error)
- Station noise: false detections
74
Travel-time residual (station 6)
76
Detection probability as a function of distance (station 6, mb 3.5)
P phase S phase
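The detection curves above correspond to the IsDetected statement in the BLOG model: a logistic function of magnitude and distance with per-station, per-phase coefficients. The coefficients below are hypothetical, not station 6's fitted values; they only illustrate the functional form.

```python
import math

def p_detect(magnitude, distance_deg, coeffs=(-1.0, 2.0, -0.05)):
    """Logistic detection probability: rises with magnitude,
    falls with distance. Coefficients are illustrative stand-ins."""
    b0, b_mag, b_dist = coeffs
    z = b0 + b_mag * magnitude + b_dist * distance_deg
    return 1.0 / (1.0 + math.exp(-z))
```

For example, with these made-up coefficients an mb 3.5 event is much more likely to be detected at 10 degrees than at 90 degrees.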
78
Overall Pick Error
79
Overall Azimuth Error
80
Phase confusion matrix
81
Fraction of LEB events missed
82
Fraction of LEB events missed
83
Event distribution: LEB vs SEL3
84
Event distribution: LEB vs NET-VISA
85
Why does NET-VISA work?
- Multiple empirically calibrated seismological models; improving model structure and quality improves the results
- Sound Bayesian combination of evidence: measured arrival times, phase labels, azimuths, etc. are NOT taken literally; absence of detections provides negative evidence
- More detections per event than SEL3 or LEB
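"Absence of detections provides negative evidence" is just Bayes' rule: if a station that usually detects a real event stays silent, the posterior probability of the event drops. A toy calculation with made-up numbers (the detection and false-alarm rates are assumptions, not calibrated values):

```python
# P(event | no detection) via Bayes' rule, with illustrative numbers
p_event = 0.01
p_det_given_event = 0.8     # station usually detects a real event (assumption)
p_det_given_no_event = 0.1  # false-alarm rate (assumption)

p_nodet = ((1 - p_det_given_event) * p_event
           + (1 - p_det_given_no_event) * (1 - p_event))
p_event_given_nodet = (1 - p_det_given_event) * p_event / p_nodet
# The silent station pulls the event probability well below its prior.
```

Combining this effect across many stations is what lets NET-VISA suppress spurious events instead of taking every detection at face value.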
86
Example of using extra detections
87
NEIC event (3.0) missed by LEB
88
NEIC event (3.7) missed by LEB
89
NEIC event (2.6) missed by LEB
90
Why does NET-VISA not work (perfectly)?
- Needs hydroacoustic data for mid-ocean events
- Weaknesses in the model:
  - Travel-time residuals for all phases along a single path are assumed to be uncorrelated
  - Each phase arrival is assumed to generate at most one detection; in fact, multiple detections occur
  - Arrival detectors use high SNR thresholds and look only at the local signal to make hard decisions
91
Detection-based and signal-based monitoring
[diagram: events -> detections -> waveform signals; SEL3 and NET-VISA operate on detections, SIG-VISA on the waveform signals themselves]
92
Summary
- Expressive probability models are very useful
- BLOG provides a generative language for defining first-order, open-universe models
- Inference via MCMC over possible worlds (other methods welcome!)
- The CTBT application is typical of multi-sensor monitoring applications that need vertical integration and involve data association
- Scaling up inference is the next step