networks: present to past to future evolution of time in...
TRANSCRIPT
Evolution of Time in Neural
Networks: Present to Past to Future
CSCE 644 (Based on Forum for AI talk 2011)
Yoonsuck ChoeDepartment of Computer Science and Engineering
Texas A&M University
* Joint work with Ji Ryang Chung and Jaerock Kwon
1
What is Time?
No clear understanding (or consensus)
• tensed vs. tenseless
• psychological vs. thermodynamic vs. relativistic
• time and change, their relation
2
What is Time?
Common (psychological) concepts of time:
• Past
• Present
• Future
Present FuturePast
Recollection Prediction
3
Why Time?
• A key to understanding brain function may lie in
understanding time, as it relates to brain function.
• The brain generates (psychological) time!
4
Time and Memory
Without memory, there can be no concept of time:
• No concept of the past
• Thus, no concept of the future
• Only an eternal present.
5
Time, in the Context of Neural
Networks
• Feedforward neural networks:
Have no memory of past input.
• Recurrent neural networks:
Have memory of past input.
e.g., Elman (1991)
6
Feedforward Networks
Input1
Output1
Input2
Output2
Input3
Output3
Input sequence 1
Input1
Output1
Input3
Output3
Input2
Output2
Input sequence 27
Recurrent Networks
Output2 Output3
Input1
Output1
Input2 Input3
Output2 Output3
Input sequence 1
����������������������������������������������������������������������������������������
����������������������������������������������������������������������������������������
Output2
Input3Input2 Input1
Output3Output1
Input sequence 28
Time, in the Context of Neural
Networks• Feedforward nets:
– Reactive
– Living in the eternal present
– No past, no future, no time
• Recurrent nets:
– Contemplative
– Memories of the past
– Dynamic
– Note: The brain is a recurrent nete.g., Elman (1991)
9
Research Questions
Present FuturePast
Recollection Prediction
• [Q1] how did recollection (memory) evolve?
- From feedforward to recurrent architecture
• [Q2] how did prediction evolve?
- Emergence of prediction in recurrent architecture
10
Part I: Recollection
Largely based on Chung et al. (2009)
11
Recollection in Feedforward
Networks?
Is it possible for a feedforward network to show memory
capacity?
• What would be a minimal augmentation?
• Idea: allow material interaction, dropping and
detecting of external markers.
12
Memory Task: Catch the Balls
AB C
D E
B1
speed = 1
B2
speed = 2
agent
5 distance sensors
θ
cf. Beer (2000); Ward and Ward (2006)
• Agent with range sensors move left/right.
• Must catch both falling balls.
• Memory needed when ball goes out of view.13
Three Networks
Evolve three different networks:
• Feedforward
• Recurrent
• Dropper/Detector (with Feedforward net)
14
Feedforward Network
• Stardard feedforward network.
15
Recurrent Network
z −1
ujl
ujl
ujlI(t)
H(t)
O(t)
. . .
H(t−2)
vji
wλ
λ
λ
H(t−1) kj
H(t−N )mem
I1 I3 I4I2 I5
O 1 O 2
H 1 H 2 H 3
H 1 H 2 H 3
H 1 H 2 H 3
H 1 H 2 H 3
• Standard recurrent network (Elman 1991).
16
Feedforward Net + Dropper/Detector
I1
if O3 > θ,
DropMarker = True (1)
else,
DropMarker = False (2)
(1) (2)
I2
I3
I4
I5
I6
I7
H1
H2
H3
O1
O2
O3
Feedforward network plus:
• Extra output to drop markers.
• Extra sensors to detect the markers.17
Results: Feedforward
0
20
40
60
80
100
Fast Right BallFast Left Ball
Catc
h Pe
rform
ance
(%)
Feed-Forward Network
On average, only chance-level performance (50%).
• Always move to the fast ball.
• Randomly pick fast or slow ball and approach it.
18
Results: Recurrent vs. Dropper
0
20
40
60
80
100
Fast Right BallFast Left Ball
Catc
h Pe
rform
ance
(%)
Recurrent Network
0
20
40
60
80
100
Fast Right BallFast Left Ball
Catc
h Pe
rform
ance
(%)
Dropper Network
• No difference in performance between
dropper/detector net (right) and recurrent network
(left).
19
Behavior (Short Sensors)
-60
-40
-20
0
20
40
60
Right 3Right 2Right 1Left 3Left 2Left 1
posit
ion
Trial (time)
Recurrent (Short Memory, SM)Dropper (Short Sensor, SS)
• Slight overshoot and drop the marker.
• Subsequent move repelled away from the marker.20
Behavior (Long Sensors)
-60
-40
-20
0
20
40
60
Right 3Right 2Right 1Left 3Left 2Left 1
posit
ion
Trial (time)
Recurrent (Long Memory, LM)Dropper (Long Sensor, LS)
• Slight overshoot and drop the marker.
• Subsequent move repelled away from the marker.21
Part I Summary
• Reactive, feedforward networks can exhibit
memory-like behavior, when coupled with minimal
material interaction.
• Adding sensors and effectors could have been
easier than adjusting the neural architecture.
• Transition from external olfactory mechanism to
internal memory mechanism?
• Similar results obtained in 2D foraging task (Chung
and Choe 2009).
22
Part II: Prediction
Largely based on Kwon and Choe (2008)
23
Emergence of Prediction in RNN?
.....
A
C
Output
Hidden
Input
feedback
B Activation
x y zx
y
ztime
y
z
x
Can prediction emerge in internal state dynamics?
• Idea: Test if (1) internal state dynamics is
predictable in evolved recurrent nets, and (2) if that
correlates with performance. 24
Task: 2D Pole Balancing
θx
θy
x
y
Anderson (1989)
• Standard 2D pole balancing problem.
• Keep pole upright, within square bounding region.
• Evolve recurrent neural network controllers.
25
Example Internal State Trajectories
Hig
hIS
PLo
wIS
P
• Typical examples of high (top) and low (bottom) ISP.
• High ISP=predictable, Low ISP=unpredictable.
• Note: Both meet the same performance criterion!
26
Measuring Predictability
• Train a simple feedforward network to predict the
internal state trajectories.
• Measure prediction error made by the network.
→ High vs. low internal state predictability (ISP)
27
Experiment: High vs. Low ISP
internal state
analysis
internal stateanalysis
All Controllers High−perform.Controllers
Low ISP
High ISPselectionprocess
evolutionary
1. Train networks to achieve same performance mark.
2. Analyze internal state predictability (ISP).
3. Select top (High ISP) and bottom (Low ISP) five, and
compare their performance in a harder task. 28
Results: Internal State Predictability
(ISP)
• Trained 130 pole balancing agents.
• Chose top 10 highest ISP agents and bottom 10 lowest ISP.
– high ISPs: µ = 95.61% and σ = 5.55%.
– low ISPs: µ = 31.74% and σ = 10.79%.
29
Comparison High ISP and Low ISP
0
10
20
30
40
50
60
70
80
90
100
1 2 3 4 5 6 7 8 9 10
Pre
dic
tio
n S
ucc
ess
Rat
e (
%)
Test Case Number
Comparison of High and Low Predictability
High
Low
• A comparison of the average predictability from two
groups: high ISP and low ISP.
• The predictive success rate of the top 10 and the
bottom 10 agents. 30
Results: Learning Time
0
200
400
600
800
1000
1200
1400
1600
1 2 3 4 5 6 7 8 9 10
Ge
ne
rati
on
Nu
mb
er
Test Case Number
Learning Time
High
Low
• No significant difference in learning time
31
Performance and Int. State Dyn.
0
1000
2000
3000
4000
5000
6000
1 2 3 4 5 6 7 8 9 10
Nu
mb
er
of
Ste
ps
Test Case Number
Performance and Internal State Dynamics
High
Low
• Made the initial conditions in the 2D pole balancing
task harsher.
• Performance of high- and low-ISP groups compared.
• High-ISP group outperforms the low-ISP group in the
changed environment.32
Behavioral Predictability
0
10
20
30
40
50
60
70
80
90
x pos y pos x angle y angle
Pre
dic
tio
n S
ucc
ess
Rat
e (
%)
Behavioral Predictability
High
Low
• Success of high-ISP group may simply be due to
simpler behavioral trajectory.
• However, predictability in behavioral predictability is
no different between high- and low-ISP groups.
33
Examples of cart x and y position
from high ISP
-1.6
-1.4
-1.2
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
-0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 1.2
-0.4
-0.35
-0.3
-0.25
-0.2
-0.15
-0.1
-0.05
0
0.05
0.1
0 0.1 0.2 0.3 0.4 0.5 0.6
-0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
-0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0 0.10
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
0 0.2 0.4 0.6 0.8 1 1.2
• Behavioral trajectories of x and y positions show
complex trajectories.
34
Examples of cart x and y position
from low ISP
-0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
-0.35 -0.3 -0.25 -0.2 -0.15 -0.1 -0.05 0 0.05
-1.2
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
-0.25 -0.2 -0.15 -0.1 -0.05 0 0.05 0.1 0.15 0.2 0.25
-1.2
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
-0.2 -0.15 -0.1 -0.05 0 0.05 0.1
-2
-1.5
-1
-0.5
0
0.5
1
-0.2 0 0.2 0.4 0.6 0.8 1
• Behavioral trajectories of x and y positions show
complex trajectories.
35
Part II Summary
• Simulations show potential evolutionary advantage
of predictive internal dynamics.
• Predictive internal dynamics could be a precondition
for full-blown predictive capability.
36
Wrap-Up
37
Discussion
Memory (Internal)Memory (External)No memory
PastPresent Future
Predictive dynamicsOlfactory system? Hippocampus?
• From external memory to internalized memory (cf.
Rocha 1996).
• Analogous to olfactory vs. hippocampal function?
• Pheronomes (external marker) vs. neuromodulators
(internal marker)?
38
Discussion (cont’d)
• Implications on the evolution of internal properties
invisible to the process evolution.
• Consciousness← Self (subject of consciousness)
← Subject of action← Authorship (property of
action)← 100% predictable (property of
authorship, objectively investigatable) 39
Future Work
Memory (Internal)Memory (External)No memory
PastPresent Future
Predictive dynamicsOlfactory system? Hippocampus?
• Actual evolution from dropper/detector net to recurrent net.
• Actual evolution of predictor that can utilize the predictable
dynamics.
40
Conclusion
• From reactive to contemplative to predictive.
– Recollection: External material interaction can
be a low-cost intermediate step toward recurrent
architecture.
– Prediction: Predictable internal state dynamics
in recurrent neural nets can have an evolutionary
edge, thus prediction can and will evolve.
• Time is essential for neural networks, and neural
networks gives us time.
41
Other Projects
• Brain connectomics project
• Delay, delay compensation, and prediction
• etc.
42
Knife-Edge Scanning MicroscopeLine−scan Camera
Microscope objective Diamond knife
Light source
Specimen
Choe et al. (2008); Mayerich et al. (2008)
• Connectomics for the whole mouse brain.
• 1µm3 resolution, 2TB of data per brain.
43
Delay Comp.: Flash-Lag Effect
FLE Actual PerceivedNijhawan (1994)
Various other FLEs exist (orientation, luminance, etc.).
Delay compensation methods at the synaptic level (Lim
and Choe 2005, 2006, 2008).
44
ReferencesAnderson, C. W. (1989). Learning to control an inverted pendulum using neural networks. IEEE Control Systems Maga-
zine, 9:31–37.
Beer, R. D. (2000). Dynamical approaches to cognitive science. Trends in Cognitive Sciences, 4:91–99.
Choe, Y., Abbott, L. C., Han, D., Huang, P.-S., Keyser, J., Kwon, J., Mayerich, D., Melek, Z., and McCormick, B. H.(2008). Knife-edge scanning microscopy: High-throughput imaging and analysis of massive volumes of biologicalmicrostructures. In Rao, A. R., and Cecchi, G., editors, High-Throughput Image Reconstruction and Analysis:Intelligent Microscopy Applications, 11–37. Boston, MA: Artech House.
Chung, J. R., and Choe, Y. (2009). Emergence of memory-like behavior in reactive agents using external markers. InProceedings of the 21st International Conference on Tools with Artificial Intelligence, 2009. ICTAI ’09, 404–408.
Chung, J. R., Kwon, J., and Choe, Y. (2009). Evolution of recollection and prediction in neural networks. In Proceedingsof the International Joint Conference on Neural Networks, 571–577. Piscataway, NJ: IEEE Press.
Elman, J. L. (1991). Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning,7:195–225.
Kwon, J., and Choe, Y. (2008). Internal state predictability as an evolutionary precursor of self-awareness and agency. InProceedings of the Seventh International Conference on Development and Learning, 109–114. IEEE.
44-1
Lim, H., and Choe, Y. (2005). Facilitatory neural activity compensating for neural delays as a potential cause of the flash-lag effect. In Proceedings of the International Joint Conference on Neural Networks, 268–273. Piscataway, NJ:IEEE Press.
Lim, H., and Choe, Y. (2006). Delay compensation through facilitating synapses and STDP: A neural basis for orien-tation flash-lag effect. In Proceedings of the International Joint Conference on Neural Networks, 8385–8392.Piscataway, NJ: IEEE Press.
Lim, H., and Choe, Y. (2008). Extrapolative delay compensation through facilitating synapses and its relation to theflash-lag effect. IEEE Transactions on Neural Networks, 19:1678–1688.
Mayerich, D., Abbott, L. C., and McCormick, B. H. (2008). Knife-edge scanning microscopy for imaging and reconstructionof three-dimensional anatomical structures of the mouse brain. Journal of Microscopy, 231:134–143.
Nijhawan, R. (1994). Motion extrapolation in catching. Nature, 370:256–257.
Rocha, L. M. (1996). Eigenbehavior and symbols. Systems Research, 13:371–384.
Ward, R., and Ward, R. (2006). 2006 special issue: Cognitive conflict without explicit conflict monitoring in a dynamicalagent. Neural Networks, 19(9):1430–1436.
44-2