networks: present to past to future evolution of time in...

Evolution of Time in Neural

Networks: Present to Past to Future

CSCE 644 (Based on Forum for AI talk 2011)

Yoonsuck ChoeDepartment of Computer Science and Engineering

Texas A&M University

* Joint work with Ji Ryang Chung and Jaerock Kwon

1

What is Time?

No clear understanding (or consensus)

• tensed vs. tenseless

• psychological vs. thermodynamic vs. relativistic

• time and change, their relation

2

What is Time?

Common (psychological) concepts of time:

• Past

• Present

• Future

Present FuturePast

Recollection Prediction

3

Why Time?

• A key to understanding brain function may lie in

understanding time, as it relates to brain function.

• The brain generates (psychological) time!

4

Time and Memory

Without memory, there can be no concept of time:

• No concept of the past

• Thus, no concept of the future

• Only an eternal present.

5

Time, in the Context of Neural

Networks

• Feedforward neural networks:

Have no memory of past input.

• Recurrent neural networks:

Have memory of past input.

e.g., Elman (1991)

6

Feedforward Networks

Input1

Output1

Input2

Output2

Input3

Output3

Input sequence 1

Input1

Output1

Input3

Output3

Input2

Output2

Input sequence 27

Recurrent Networks

Output2 Output3

Input1

Output1

Input2 Input3

Output2 Output3

Input sequence 1

��

��

Output2

Input3Input2 Input1

Output3Output1

Input sequence 28

Time, in the Context of Neural

Networks• Feedforward nets:

– Reactive

– Living in the eternal present

– No past, no future, no time

• Recurrent nets:

– Contemplative

– Memories of the past

– Dynamic

– Note: The brain is a recurrent nete.g., Elman (1991)

9

Research Questions

Present FuturePast

Recollection Prediction

• [Q1] how did recollection (memory) evolve?

- From feedforward to recurrent architecture

• [Q2] how did prediction evolve?

- Emergence of prediction in recurrent architecture

10

Part I: Recollection

Largely based on Chung et al. (2009)

11

Recollection in Feedforward

Networks?

Is it possible for a feedforward network to show memory

capacity?

• What would be a minimal augmentation?

• Idea: allow material interaction, dropping and

detecting of external markers.

12

Memory Task: Catch the Balls

AB C

D E

B1

speed = 1

B2

speed = 2

agent

5 distance sensors

θ

cf. Beer (2000); Ward and Ward (2006)

• Agent with range sensors move left/right.

• Must catch both falling balls.

• Memory needed when ball goes out of view.13

Three Networks

Evolve three different networks:

• Feedforward

• Recurrent

• Dropper/Detector (with Feedforward net)

14

Feedforward Network

• Stardard feedforward network.

15

Recurrent Network

z −1

ujl

ujl

ujlI(t)

H(t)

O(t)

. . .

H(t−2)

vji

wλ

λ

λ

H(t−1) kj

H(t−N )mem

I1 I3 I4I2 I5

O 1 O 2

H 1 H 2 H 3

H 1 H 2 H 3

H 1 H 2 H 3

H 1 H 2 H 3

• Standard recurrent network (Elman 1991).

16

Feedforward Net + Dropper/Detector

I1

if O3 > θ,

DropMarker = True (1)

else,

DropMarker = False (2)

(1) (2)

I2

I3

I4

I5

I6

I7

H1

H2

H3

O1

O2

O3

Feedforward network plus:

• Extra output to drop markers.

• Extra sensors to detect the markers.17

Results: Feedforward

0

20

40

60

80

100

Fast Right BallFast Left Ball

Catc

h Pe

rform

ance

(%)

Feed-Forward Network

On average, only chance-level performance (50%).

• Always move to the fast ball.

• Randomly pick fast or slow ball and approach it.

18

Results: Recurrent vs. Dropper

0

20

40

60

80

100


Catc

h Pe

rform

ance

(%)

Recurrent Network

0

20

40

60

80

100


Catc

h Pe

rform

ance

(%)

Dropper Network

• No difference in performance between

dropper/detector net (right) and recurrent network

(left).

19

Behavior (Short Sensors)

-60

-40

-20

0

20

40

60

Right 3Right 2Right 1Left 3Left 2Left 1

posit

ion

Trial (time)

Recurrent (Short Memory, SM)Dropper (Short Sensor, SS)

• Slight overshoot and drop the marker.

• Subsequent move repelled away from the marker.20

Behavior (Long Sensors)

-60

-40

-20

0

20

40

60

Right 3Right 2Right 1Left 3Left 2Left 1

posit

ion

Trial (time)

Recurrent (Long Memory, LM)Dropper (Long Sensor, LS)

• Slight overshoot and drop the marker.

• Subsequent move repelled away from the marker.21

Part I Summary

• Reactive, feedforward networks can exhibit

memory-like behavior, when coupled with minimal

material interaction.

• Adding sensors and effectors could have been

easier than adjusting the neural architecture.

• Transition from external olfactory mechanism to

internal memory mechanism?

• Similar results obtained in 2D foraging task (Chung

and Choe 2009).

22

Part II: Prediction

Largely based on Kwon and Choe (2008)

23

Emergence of Prediction in RNN?

.....

A

C

Output

Hidden

Input

feedback

B Activation

x y zx

y

ztime

y

z

x

Can prediction emerge in internal state dynamics?

• Idea: Test if (1) internal state dynamics is

predictable in evolved recurrent nets, and (2) if that

correlates with performance. 24

Task: 2D Pole Balancing

θx

θy

x

y

Anderson (1989)

• Standard 2D pole balancing problem.

• Keep pole upright, within square bounding region.

• Evolve recurrent neural network controllers.

25

Example Internal State Trajectories

Hig

hIS

PLo

wIS

P

• Typical examples of high (top) and low (bottom) ISP.

• High ISP=predictable, Low ISP=unpredictable.

• Note: Both meet the same performance criterion!

26

Measuring Predictability

• Train a simple feedforward network to predict the

internal state trajectories.

• Measure prediction error made by the network.

→ High vs. low internal state predictability (ISP)

27

Experiment: High vs. Low ISP

internal state

analysis

internal stateanalysis

All Controllers High−perform.Controllers

Low ISP

High ISPselectionprocess

evolutionary

1. Train networks to achieve same performance mark.

2. Analyze internal state predictability (ISP).

3. Select top (High ISP) and bottom (Low ISP) five, and

compare their performance in a harder task. 28

Results: Internal State Predictability

(ISP)

• Trained 130 pole balancing agents.

• Chose top 10 highest ISP agents and bottom 10 lowest ISP.

– high ISPs: µ = 95.61% and σ = 5.55%.

– low ISPs: µ = 31.74% and σ = 10.79%.

29

Comparison High ISP and Low ISP

0

10

20

30

40

50

60

70

80

90

100

1 2 3 4 5 6 7 8 9 10

Pre

dic

tio

n S

ucc

ess

Rat

e (

%)

Test Case Number

Comparison of High and Low Predictability

High

Low

• A comparison of the average predictability from two

groups: high ISP and low ISP.

• The predictive success rate of the top 10 and the

bottom 10 agents. 30

Results: Learning Time

0

200

400

600

800

1000

1200

1400

1600

1 2 3 4 5 6 7 8 9 10

Ge

ne

rati

on

Nu

mb

er

Test Case Number

Learning Time

High

Low

• No significant difference in learning time

31

Performance and Int. State Dyn.

0

1000

2000

3000

4000

5000

6000

1 2 3 4 5 6 7 8 9 10

Nu

mb

er

of

Ste

ps

Test Case Number

Performance and Internal State Dynamics

High

Low

• Made the initial conditions in the 2D pole balancing

task harsher.

• Performance of high- and low-ISP groups compared.

• High-ISP group outperforms the low-ISP group in the

changed environment.32

Behavioral Predictability

0

10

20

30

40

50

60

70

80

90

x pos y pos x angle y angle

Pre

dic

tio

n S

ucc

ess

Rat

e (

%)

Behavioral Predictability

High

Low

• Success of high-ISP group may simply be due to

simpler behavioral trajectory.

• However, predictability in behavioral predictability is

no different between high- and low-ISP groups.

33

Examples of cart x and y position

from high ISP

-1.6

-1.4

-1.2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

-0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 1.2

-0.4

-0.35

-0.3

-0.25

-0.2

-0.15

-0.1

-0.05

0

0.05

0.1

0 0.1 0.2 0.3 0.4 0.5 0.6

-0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

-0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0 0.10

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

0 0.2 0.4 0.6 0.8 1 1.2

• Behavioral trajectories of x and y positions show

complex trajectories.

34

Examples of cart x and y position

from low ISP

-0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

-0.35 -0.3 -0.25 -0.2 -0.15 -0.1 -0.05 0 0.05

-1.2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

-0.25 -0.2 -0.15 -0.1 -0.05 0 0.05 0.1 0.15 0.2 0.25

-1.2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

-0.2 -0.15 -0.1 -0.05 0 0.05 0.1

-2

-1.5

-1

-0.5

0

0.5

1

-0.2 0 0.2 0.4 0.6 0.8 1

• Behavioral trajectories of x and y positions show

complex trajectories.

35

Part II Summary

• Simulations show potential evolutionary advantage

of predictive internal dynamics.

• Predictive internal dynamics could be a precondition

for full-blown predictive capability.

36

Wrap-Up

37

Discussion

Memory (Internal)Memory (External)No memory

PastPresent Future

Predictive dynamicsOlfactory system? Hippocampus?

• From external memory to internalized memory (cf.

Rocha 1996).

• Analogous to olfactory vs. hippocampal function?

• Pheronomes (external marker) vs. neuromodulators

(internal marker)?

38

Discussion (cont’d)

• Implications on the evolution of internal properties

invisible to the process evolution.

• Consciousness← Self (subject of consciousness)

← Subject of action← Authorship (property of

action)← 100% predictable (property of

authorship, objectively investigatable) 39

Future Work

Memory (Internal)Memory (External)No memory

PastPresent Future

Predictive dynamicsOlfactory system? Hippocampus?

• Actual evolution from dropper/detector net to recurrent net.

• Actual evolution of predictor that can utilize the predictable

dynamics.

40

Conclusion

• From reactive to contemplative to predictive.

– Recollection: External material interaction can

be a low-cost intermediate step toward recurrent

architecture.

– Prediction: Predictable internal state dynamics

in recurrent neural nets can have an evolutionary

edge, thus prediction can and will evolve.

• Time is essential for neural networks, and neural

networks gives us time.

41

Other Projects

• Brain connectomics project

• Delay, delay compensation, and prediction

• etc.

42

Knife-Edge Scanning MicroscopeLine−scan Camera

Microscope objective Diamond knife

Light source

Specimen

Choe et al. (2008); Mayerich et al. (2008)

• Connectomics for the whole mouse brain.

• 1µm3 resolution, 2TB of data per brain.

43

Delay Comp.: Flash-Lag Effect

FLE Actual PerceivedNijhawan (1994)

Various other FLEs exist (orientation, luminance, etc.).

Delay compensation methods at the synaptic level (Lim

and Choe 2005, 2006, 2008).

44

ReferencesAnderson, C. W. (1989). Learning to control an inverted pendulum using neural networks. IEEE Control Systems Maga-

zine, 9:31–37.

Beer, R. D. (2000). Dynamical approaches to cognitive science. Trends in Cognitive Sciences, 4:91–99.

Choe, Y., Abbott, L. C., Han, D., Huang, P.-S., Keyser, J., Kwon, J., Mayerich, D., Melek, Z., and McCormick, B. H.(2008). Knife-edge scanning microscopy: High-throughput imaging and analysis of massive volumes of biologicalmicrostructures. In Rao, A. R., and Cecchi, G., editors, High-Throughput Image Reconstruction and Analysis:Intelligent Microscopy Applications, 11–37. Boston, MA: Artech House.

Chung, J. R., and Choe, Y. (2009). Emergence of memory-like behavior in reactive agents using external markers. InProceedings of the 21st International Conference on Tools with Artificial Intelligence, 2009. ICTAI ’09, 404–408.

Chung, J. R., Kwon, J., and Choe, Y. (2009). Evolution of recollection and prediction in neural networks. In Proceedingsof the International Joint Conference on Neural Networks, 571–577. Piscataway, NJ: IEEE Press.

Elman, J. L. (1991). Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning,7:195–225.

Kwon, J., and Choe, Y. (2008). Internal state predictability as an evolutionary precursor of self-awareness and agency. InProceedings of the Seventh International Conference on Development and Learning, 109–114. IEEE.

44-1

Lim, H., and Choe, Y. (2005). Facilitatory neural activity compensating for neural delays as a potential cause of the flash-lag effect. In Proceedings of the International Joint Conference on Neural Networks, 268–273. Piscataway, NJ:IEEE Press.

Lim, H., and Choe, Y. (2006). Delay compensation through facilitating synapses and STDP: A neural basis for orien-tation flash-lag effect. In Proceedings of the International Joint Conference on Neural Networks, 8385–8392.Piscataway, NJ: IEEE Press.

Lim, H., and Choe, Y. (2008). Extrapolative delay compensation through facilitating synapses and its relation to theflash-lag effect. IEEE Transactions on Neural Networks, 19:1678–1688.

Mayerich, D., Abbott, L. C., and McCormick, B. H. (2008). Knife-edge scanning microscopy for imaging and reconstructionof three-dimensional anatomical structures of the mouse brain. Journal of Microscopy, 231:134–143.

Nijhawan, R. (1994). Motion extrapolation in catching. Nature, 370:256–257.

Rocha, L. M. (1996). Eigenbehavior and symbols. Systems Research, 13:371–384.

Ward, R., and Ward, R. (2006). 2006 special issue: Cognitive conflict without explicit conflict monitoring in a dynamicalagent. Neural Networks, 19(9):1430–1436.

44-2

networks: present to past to future evolution of time in...

Documents