![Page 1: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/1.jpg)
CSCI 1300
Artificial Intelligence Lecture
Mike Mozer
December 4, 2003
![Page 2: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/2.jpg)
Computer Science
Operating Systems
Programming Languages
Networking
Security
Theory
Artificial Intelligence
![Page 3: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/3.jpg)
Artificial Intelligence
Natural Language Understanding
Speech Recognition
Computer Vision
Robotics
Reasoning
Planning
Machine Learning
![Page 4: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/4.jpg)
Machine Learning
Supervised Learning
• spam filters (hotmail.com)
• ALVINN (autonomous vehicle navigation)
Unsupervised Learning
• collaborative filtering (amazon.com)
• fault monitoring
Reinforcement Learning
• TD-Gammon (champion backgammon playing program)
• elevator controller
• adaptive home lighting/heating control
![Page 5: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/5.jpg)
Reinforcement Learning: A Simple Example
Suppose you are in one of two states
• hungry
• sleepy
Suppose you can take one of two actions
• go to Turley’s
• lie on bed
Reward contingencies
• hungry -> go to Turley’s: reward
• hungry -> lie on bed: no reward
• sleepy -> go to Turley’s: no reward
• sleepy -> lie on bed: reward
Reward depends on what action you take in a given state.
![Page 6: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/6.jpg)
Reinforcement Learning: A Simple Example
How do you learn to take the correct action?
Trial and error!
Through experience, system can learn to predict the reward that will be obtained for some action given the current state:
reward(action | state)
This is also notated as “Q(state, action)”
Given the expected reward, agent can choose best action:
if Q(hungry, Turley’s) > Q(hungry, lie on bed) then go to Turley’s
else lie on bed
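A minimal sketch of the trial-and-error idea, assuming a reward of 1 for the rewarded state-action pairs and 0 otherwise, with Q(state, action) estimated as a running average of observed reward. The names `REWARD`, `learn_q`, and `best_action`, the purely random action choice, and the number of trials are illustrative assumptions, not part of the lecture.

```python
import random

STATES = ["hungry", "sleepy"]
ACTIONS = ["go to Turley's", "lie on bed"]

# Reward contingencies from the slide (1 = reward, 0 = no reward; values assumed).
REWARD = {
    ("hungry", "go to Turley's"): 1.0,
    ("hungry", "lie on bed"): 0.0,
    ("sleepy", "go to Turley's"): 0.0,
    ("sleepy", "lie on bed"): 1.0,
}

def learn_q(trials=200, seed=0):
    """Estimate Q(state, action) as the average reward seen for each pair."""
    rng = random.Random(seed)
    q = {pair: 0.0 for pair in REWARD}
    counts = {pair: 0 for pair in REWARD}
    for _ in range(trials):
        state = rng.choice(STATES)
        action = rng.choice(ACTIONS)  # pure exploration: try actions at random
        reward = REWARD[(state, action)]
        counts[(state, action)] += 1
        # Incremental running average of the observed reward.
        q[(state, action)] += (reward - q[(state, action)]) / counts[(state, action)]
    return q

def best_action(q, state):
    """Pick the action with the highest estimated reward in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])

q = learn_q()
print(best_action(q, "hungry"))
print(best_action(q, "sleepy"))
```

Because the rewards in this toy problem are immediate and deterministic, a plain average is enough; the delayed and occasional rewards of real problems are what call for more than this.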
![Page 7: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/7.jpg)
Reinforcement Learning in the Real World
Issues
• Delayed reinforcement (e.g., car accident due to worn tires)
• Occasional reinforcement (e.g., chess playing)
• Short term versus long term rewards (e.g., skipping class)
• Exploration versus exploitation (e.g., trying new restaurants)
• Partially observable state (e.g., viral infection)
• Multiple agents (e.g., multiple elevators)
[Figure: timeline over time intervals 1 to 7, showing the state (s1 ... s7), action (a1 ... a7), and instantaneous reinforcement (r1 ... r7) at each interval]
![Page 8: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/8.jpg)
Elevator Control
![Page 9: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/9.jpg)
Elevator Control
![Page 10: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/10.jpg)
Q learning (Watkins, 1989; Watkins & Dayan, 1992)
Q(x,u): If action u is taken in state x, what is the minimum cost we can expect to obtain?
Policy based on Q values:
$$\pi(x_t) = \begin{cases} \arg\min_u Q(x_t, u) & \text{with probability } (1 - \theta) \\ \text{random} & \text{with probability } \theta \end{cases}$$
where θ is the exploration rate.
Incremental update rule for Q values:
$$Q(x_t, u_t) \leftarrow (1 - \alpha)\, Q(x_t, u_t) + \alpha \min_u \left[ c_t + \lambda\, Q(x_{t+1}, u) \right]$$
where α is the learning rate and λ is the discount factor.
Given fully observable state, infinite exploration, etc., guaranteed to converge on optimal policy.
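The policy and update rule for Q learning can be sketched in tabular form as follows. This is an illustration under stated assumptions, not the controller used in the lecture: the four-position chain environment, the `chain_step` function, the episode and horizon counts, and the parameter values are all hypothetical. Since Q is defined as a cost, both the greedy choice and the look-ahead take a minimum.

```python
import random

def q_learning(step, states, actions, episodes=2000, horizon=20,
               alpha=0.1, lam=0.9, theta=0.1, seed=0):
    """Tabular Q learning for cost minimization.

    step(x, u) -> (cost, next_state) is a hypothetical environment function.
    alpha = learning rate, lam = discount factor, theta = exploration rate.
    """
    rng = random.Random(seed)
    q = {(x, u): 0.0 for x in states for u in actions}
    for _ in range(episodes):
        x = rng.choice(states)
        for _ in range(horizon):
            # Policy: random with probability theta, otherwise greedy (argmin).
            if rng.random() < theta:
                u = rng.choice(actions)
            else:
                u = min(actions, key=lambda a: q[(x, a)])
            cost, x_next = step(x, u)
            # Move Q toward the cost plus the discounted best future Q.
            target = cost + lam * min(q[(x_next, a)] for a in actions)
            q[(x, u)] = (1 - alpha) * q[(x, u)] + alpha * target
            x = x_next
    return q

# Toy environment (assumed): positions 0..3; arriving at position 3 is free,
# arriving anywhere else costs 1.
def chain_step(x, u):
    x_next = min(x + 1, 3) if u == "right" else max(x - 1, 0)
    cost = 0.0 if x_next == 3 else 1.0
    return cost, x_next

ACTIONS = ["left", "right"]
q = q_learning(chain_step, states=[0, 1, 2, 3], actions=ACTIONS)
policy = {x: min(ACTIONS, key=lambda a: q[(x, a)]) for x in range(4)}
print(policy)
```

In this toy chain the learned greedy policy should head toward position 3 from every state, since that is the only cost-free place to be.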
![Page 11: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/11.jpg)
The Adaptive House
Michael Mozer+*, Robert Dodier#, Debra Miller*, Marc Anderson*, Josh Anderson✩, Diane Lukianow✩, Dan Bertini#, Tom Moyer�, Matt Bronder*, Charles Myers✩, Michael Colagrosso*, Tom Pennell*, Robert Cruickshank#, James Ries✩, Brian Daugherty*, Erik Skorpen✩, Mark Fontenot�, Joel Sloss✩, Okechukwu Ikeako✩, Lucky Vidmar*, Paul Kooros✩, Matthew Weeks✩
University of Colorado
* Department of Computer Science
+ Institute of Cognitive Science
# Department of Civil, Environmental, and Architectural Engineering
✩ Department of Electrical and Computer Engineering
� Department of Mechanical Engineering
� Department of Aerospace Engineering
http://www.cs.colorado.edu/~mozer/adaptive-house
![Page 12: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/12.jpg)
The adaptive house
Not a programmable house, but a house that programs itself.
House adapts to the lifestyle of the inhabitants.
House monitors environmental state and senses actions of inhabitant.
House learns inhabitants’ schedules, preferences, and occupancy patterns.
House uses this information to achieve two objectives:
(1) anticipate inhabitant needs
(2) conserve energy
Domain: home comfort systems
• air heating
• lighting
• water heating
• ventilation
![Page 13: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/13.jpg)
The adaptive house
Residence in Marshall, Colorado, outside of Boulder
![Page 14: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/14.jpg)
Some of the gang
![Page 15: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/15.jpg)
Great room
![Page 16: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/16.jpg)
Bedrooms and bathrooms
![Page 17: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/17.jpg)
Sensors
![Page 18: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/18.jpg)
Sensors
![Page 19: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/19.jpg)
Water heater
![Page 20: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/20.jpg)
Furnace
![Page 21: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/21.jpg)
Controls
![Page 22: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/22.jpg)
Computers
![Page 23: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/23.jpg)
Training signals
Actions performed by inhabitant specify setpoints ➜ anticipation of inhabitant desires
Gas and electricity costs ➜ energy conservation
![Page 24: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/24.jpg)
A reinforcement learning framework
Each constraint has an associated cost:
• discomfort cost if inhabitant preferences are neglected
• energy cost depends on device and intensity setting
The optimal control policy minimizes
$$J(t_0) = E\left[ \lim_{\kappa \to \infty} \frac{1}{\kappa} \sum_{t = t_0 + 1}^{t_0 + \kappa} d(x_t) + e(u_t) \right]$$
where
• t = index over nonoverlapping time intervals
• t0 = current time interval
• ut = control decision for interval t
• xt = environmental state during interval t
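To make the objective J concrete, here is a small sketch that estimates it over a finite window by averaging d(x_t) + e(u_t) across intervals. It assumes that d is the discomfort cost and e is the energy cost, matching the two costs listed on the slide; the single-light cost functions and all numeric values are hypothetical.

```python
def average_cost(states, decisions, discomfort, energy):
    """Finite-horizon estimate of J: mean of d(x_t) + e(u_t) over the intervals."""
    if not states or len(states) != len(decisions):
        raise ValueError("need at least one interval and one decision per state")
    total = sum(discomfort(x) + energy(u) for x, u in zip(states, decisions))
    return total / len(states)

# Hypothetical cost functions for a single light (all values assumed).
def discomfort(x):
    # x = (occupied, light_on): 1 cent if someone is left in the dark.
    occupied, light_on = x
    return 1.0 if occupied and not light_on else 0.0

def energy(u):
    # u = light level in [0, 1]; assume 0.2 cents per interval at full power.
    return 0.2 * u

states = [(True, True), (True, False), (False, False), (False, True)]
decisions = [1.0, 0.0, 0.0, 1.0]
print(average_cost(states, decisions, discomfort, energy))
```

The example window contains one interval of discomfort (occupied but dark) and one of wasted energy (lit but empty), the two failure modes the controller has to trade off.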
![Page 25: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/25.jpg)
ACHE (Adaptive Control of Home Environments)
Separate control system for each task
• air temperature regulation: furnace, space heaters, fans, dampers, blinds
• lighting regulation: wall sconces, overhead lights
• water temperature regulation: hot water heater
[Figure: block diagram of ACHE. Labels: environmental state; inhabitant actions and energy costs; device setpoints]
![Page 26: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/26.jpg)
General architecture of ACHE
[Figure: block diagram. Components: state transformation, occupancy model, predictors, setpoint generator, device regulator. Signals: instantaneous environmental state, state representation, occupied zones, future state information, setpoint profile, decision]
![Page 27: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/27.jpg)
Lighting control
What makes lighting control a challenge?
• Twenty-two banks of lights, each with 16 intensity levels; seven banks of lights in great room alone
• Motion-triggered lighting does not work
• Lighting moods
• Two constraints must be satisfied simultaneously: maintaining lighting according to inhabitant preferences, and conserving energy
• Range of time scales involved
• Sluggishness of system
![Page 28: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/28.jpg)
Resolving the sluggishness dilemma
Anticipator: Neural network that predicts which zone(s) will become occupied in the next two seconds
Input
• 1, 3, and 6 second average of motion signals (36)
• instantaneous and 2 second average of door status (20)
• instantaneous, 1 second, and 3 second average of sound level (33)
• current zone occupancy status and durations (16)
• time of day (2)
Output
• p(zone i becomes occupied in next 2 seconds | currently unoccupied) (8)
Runs every 250 ms
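The input and output counts for the anticipator pin down the shape of the network: 36 + 20 + 33 + 16 + 2 = 107 inputs and 8 outputs, one per zone. The sketch below wires up an untrained network of that shape. The single hidden layer, its size, the sigmoid units, and the random weights are assumptions, since the slides do not describe the network's internals.

```python
import math
import random

# Feature group sizes from the slide; they sum to the network's input width.
INPUT_GROUPS = {
    "motion averages (1, 3, 6 s)": 36,
    "door status (instantaneous, 2 s)": 20,
    "sound level (instantaneous, 1 s, 3 s)": 33,
    "zone occupancy status and durations": 16,
    "time of day": 2,
}
N_INPUTS = sum(INPUT_GROUPS.values())
N_ZONES = 8
N_HIDDEN = 10  # assumed; the slide does not give a hidden-layer size

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_weights(n_out, n_in, rng):
    """Small random weights, one row per output unit, plus a trailing bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]

def layer(weights, inputs):
    """Fully connected sigmoid layer; the last weight in each row is the bias."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in weights]

def anticipate(features, w_hidden, w_out):
    """Return one value in (0, 1) per zone for a full feature vector."""
    if len(features) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} features, got {len(features)}")
    return layer(w_out, layer(w_hidden, features))

rng = random.Random(0)
w_hidden = init_weights(N_HIDDEN, N_INPUTS, rng)
w_out = init_weights(N_ZONES, N_HIDDEN, rng)
probs = anticipate([0.0] * N_INPUTS, w_hidden, w_out)
print(N_INPUTS, len(probs))
```

With untrained weights the outputs carry no meaning; they only become occupancy probabilities once the network has been trained on the occupancy model's signal.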
![Page 29: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/29.jpg)
Training anticipator
Occupancy model provides training signal
Two types of errors
• miss
• false alarm
Training procedure
• Given partially trained net, collect misses and false alarms.
• Retrain net when 200 additional examples collected.
• TD algorithm for misses
[Figure: training examples. Labels: state(t – 2000 ms), state(t – 1750 ms), ..., state(t – 250 ms); zone i becomes occupied; state(t); zone i vacant]
[Figure: plot of hit/(miss+fa) versus number of training examples (0 to 60000)]
![Page 30: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/30.jpg)
Examples of anticipator performance
![Page 31: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/31.jpg)
Lighting controller costs
Energy cost: 7.2 cents per kW-hr
Discomfort cost: 1 cent per device whose level is manually adjusted
Anticipator miss cost: 0.1 cent per device that was off and should have been on
Anticipator false alarm cost: 0.1 cent per device that was turned on
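As a worked example of how the lighting controller costs combine, the sketch below adds the four components for one logging period. Summing them into a single total is an assumption (it mirrors the additive discomfort-plus-energy objective), and the function name and usage figures are hypothetical.

```python
# Unit costs from the slide, in cents.
ENERGY_CENTS_PER_KWH = 7.2
DISCOMFORT_CENTS = 1.0    # per device whose level is manually adjusted
MISS_CENTS = 0.1          # per device that was off and should have been on
FALSE_ALARM_CENTS = 0.1   # per device that was turned on

def lighting_cost(kwh_used, manual_adjustments, misses, false_alarms):
    """Total cost in cents for one period, summing the four components."""
    return (ENERGY_CENTS_PER_KWH * kwh_used
            + DISCOMFORT_CENTS * manual_adjustments
            + MISS_CENTS * misses
            + FALSE_ALARM_CENTS * false_alarms)

# Hypothetical evening: 0.5 kW-hr used, 2 manual adjustments, 3 misses, 1 false alarm.
print(lighting_cost(0.5, 2, 3, 1))
```

At these rates one manual adjustment costs as much as ten anticipator errors, so the cost structure penalizes overriding the inhabitant's preferences far more than a mistimed light.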
![Page 32: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/32.jpg)
Results
• about three months of data collection
• events logged only from 19:00 – 06:59
[Figure: plot of discomfort cost and energy cost (cents, 0 to 0.4) versus # events (0 to 8000)]
![Page 33: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/33.jpg)
Air temperature control
[Figure: three panels for Sunday March 6, 2000, each plotted against time of day (0 to 20): furnace (off/on), occupancy (away/home), and p(state change) (0 to 1)]
![Page 34: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/34.jpg)
Comparison of control policies using artificial occupancy data
[Figure: two panels plotting mean cost ($/day) against variability index (0 to 1), one for productivity loss = 1.0 hr. (cost axis 7 to 8.2) and one for productivity loss = 3.0 hr. (cost axis 7 to 10.5). Curves in each panel: constant temperature, Neurothermostat, setback thermostat, occupancy triggered]
![Page 35: CSCI 1300 Artificial Intelligence Lecture Mike Mozer December 4, … · 2003-12-04 · Artificial Intelligence Lecture Mike Mozer December 4, 2003. Computer Science Operating Systems](https://reader033.vdocument.in/reader033/viewer/2022053014/5f11cc0b3785533f8752a194/html5/thumbnails/35.jpg)
Comparison of control policies using real occupancy data

Mean daily cost, by productivity loss ρ

| Policy | ρ = 1 | ρ = 3 |
|---|---|---|
| Neurothermostat | $6.77 | $7.05 |
| constant temperature | $7.85 | $7.85 |
| occupancy triggered | $7.49 | $8.66 |
| setback thermostat | $8.12 | $9.74 |