Michael H. Coen
MIT Computer Science and Artificial Intelligence Laboratory
AAAI’07 Talk, July 25, 2007
Learning to Sing Like a Bird: The Self-Supervised Acquisition of Birdsong
AAAI’07 Talk M.H. Coen
Outline
Why do this research?
Background: Cross-Modal Clustering (+ demo)
  A biologically-inspired algorithm for machine learning (Coen 2005, 2006a, 2006b, Coen et al. 2007)
A brief introduction to the zebra finch (Taeniopygia guttata)
An architecture for sensorimotor learning (+ demo)
  A simple, recursive application of cross-modal clustering
  Views motor control as perception backwards
Discussion
Introduction Background Zebra Finches Sensorimotor Learning Discussion
In the grand scheme of things…
Statistical NLP
Deep Blue/Chinook
DARPA Grand Challenge
Optimization
Operations Research
Statistical Machine Learning
Physiology
Neuroscience
Cognitive Science
A fundamental question
Animals solve extremely difficult non-parametric, distribution-free learning problems during development.
How?
Belief: Answering this lets us:
1) Better understand learning in animals
2) Build new types of machine learning systems
Cross-modal clustering briefly…
Using multiple viewpoints (or datasets) that describe the same events makes learning easier
Biological motivation: Perceptual systems share information constantly during “ordinary” perception (Stein and Meredith 1993, Shimojo and Shams 2001, Calvert et al. 2004, Spence and Driver 2004)
In a nutshell, CMC exploits redundancy within correlated datasets to discover unknown categories
How does it work? A simple example
Assume two events in the world: red and blue
Assume two datasets: Mode A and Mode B
[Figure: events in the world, observed by a thought-experiment creature through Mode A and Mode B]
The view from inside the creature…
[Figure: Mode A and Mode B plots; horizontal axis -0.5 to 0.5, vertical axis 0 to 1]
Can we learn the red and blue events by sharing internal perspectives?
Note: We will call these datasets slices
Recovering the categories
1) Iteratively project regions in each dataset onto the other dataset.
2) Merge regions in each dataset whose projections are the closest.
3) Continue…
To play with this online, Google: MIT Artificial Intelligence Demonstrations
http://ai6034.mit.edu/fall06/index.php?title=Demonstrations
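The three steps above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the algorithm from the cited papers: each mode is one-dimensional and pre-cut into ten bins, a region's projection is taken to be the normalized histogram of co-occurring regions in the other mode, closeness is measured with total-variation distance, and merging stops at a hand-picked threshold of 0.5. All function names and parameters are hypothetical. In the toy data, neither mode shows a gap between the red and blue events on its own; only the co-occurrences separate them.

```python
import random
from collections import Counter
from itertools import combinations


def make_observations(n=400, seed=0):
    """Toy data: each world event is observed at once in Mode A and Mode B.

    Returns (a_bin, b_bin) pairs, i.e. each mode is pre-cut into 10 small regions.
    """
    rng = random.Random(seed)
    obs = []
    for _ in range(n):
        if rng.randrange(2) == 0:   # "red" event
            a, b = rng.uniform(-0.29, -0.01), rng.uniform(0.21, 0.49)
        else:                       # "blue" event
            a, b = rng.uniform(0.01, 0.29), rng.uniform(0.51, 0.79)
        obs.append((int((a + 0.5) * 10), int(b * 10)))
    return obs


def projections(obs, own, other, side):
    """Project each region of one mode onto the other mode.

    A region's projection is the normalized histogram of the other mode's
    regions that co-occur with it (an assumption made for this sketch).
    """
    counts = {}
    for pair in obs:
        region = own[pair[side]]
        counts.setdefault(region, Counter())[other[pair[1 - side]]] += 1
    return {r: {k: v / sum(c.values()) for k, v in c.items()}
            for r, c in counts.items()}


def distance(p, q):
    """Total-variation distance between two projections (0 = same, 1 = disjoint)."""
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def merge_closest(obs, own, other, side, threshold):
    """Merge the two regions of `own` whose projections are closest, if close enough."""
    proj = projections(obs, own, other, side)
    candidates = [(distance(proj[x], proj[y]), x, y)
                  for x, y in combinations(sorted(proj), 2)]
    if not candidates:
        return False
    d, keep, drop = min(candidates)
    if d > threshold:
        return False
    for bin_id in list(own):
        if own[bin_id] == drop:
            own[bin_id] = keep
    return True


def cross_modal_cluster(obs, threshold=0.5):
    """Alternate merges between the two modes until no pair is close enough."""
    map_a = {a: a for a, _ in obs}   # bin -> region label, one region per bin at first
    map_b = {b: b for _, b in obs}
    while True:
        merged_a = merge_closest(obs, map_a, map_b, 0, threshold)
        merged_b = merge_closest(obs, map_b, map_a, 1, threshold)
        if not (merged_a or merged_b):
            return map_a, map_b


obs = make_observations()
map_a, map_b = cross_modal_cluster(obs)
print("Mode A bin -> region:", map_a)
print("Mode B bin -> region:", map_b)
```

Each mode starts with six occupied bins and ends with two regions, one per event, although the number of events was never given to the procedure. The fixed threshold is the weakest part of the sketch; it stands in for whatever stopping rule the real method uses.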
What can you learn when you know nothing?
Acquire language
Understand fMRI data
Learn to sing
Sensorimotor learning
The zebra finch (Taeniopygia guttata)
Small, unusually social oscine songbird
Perhaps the most studied bird in neuroscience
Complex vocal harmonics
  People often mistake spectrograms for human speech
FoxP2 gene almost identical to that of humans
  Governs vocal generation
Dynamics of song acquisition
Day 1: Fledgling is born!
First month: Father sings to his children
~Day 20: Males begin singing to themselves
Day 90: Song crystallizes at sexual maturity
An Architecture for Sensorimotor Learning
[Figure: architecture diagram with labels: External World; Events in the world; Sensory Organs; Afferent Processing; Sensory Cortex; Perceptual Processing; Perceptual Slices; Motor Cortex; Motor Control; Innate Exploratory Motor Behaviors; Efferent Processing; Muscles/Effectors]
An Architecture for Sensorimotor Learning
Cross-Modal Clustering happens here!
[Figure: the same architecture diagram, now also showing Motor Slices and Internal Perception (Cartesian Theater)]
An Architecture for Sensorimotor Learning
Cross-Modal Clustering now happens here!
[Figure: the same architecture diagram with Motor Slices and Internal Perception (Cartesian Theater)]
Parental training: a simple example
Self-observation of innate activity
Internal self-observation (Cartesian Theater)
External self-observation (Perceptual channels)
A) Perceptual Grounding from Parent
B) External Self-Observation
C) Internal Self-Observation
Innate Motor Activity
Recursive cross-modal clustering
Acquired intentional motor control
[Figure: Sensory Map, Motor Map, Effector System, External World]
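The idea of acquiring a motor map by self-observation can be illustrated with a deliberately simplified sketch. Everything here is an assumption made for illustration: the effector is a hypothetical one-parameter `synthesizer`, sounds are reduced to a single pitch value, the number of perceptual categories is fixed at three, and nearest-centroid assignment plus averaging stands in for recursive cross-modal clustering. The sketch only shows the flow: ground perception on the parent's song, babble, observe the result, then run perception backwards to imitate.

```python
import random


def synthesizer(m):
    """Hypothetical effector: motor parameter in [0, 1] -> pitch in Hz.

    The learner never inspects this function; it only hears the output.
    """
    return 500.0 + 3000.0 * m


def learn_categories(sounds, k):
    """Perceptual grounding: split sorted 1-D sounds at the k - 1 largest gaps."""
    xs = sorted(sounds)
    gaps = sorted(range(1, len(xs)), key=lambda i: xs[i] - xs[i - 1], reverse=True)
    bounds = [0] + sorted(gaps[:k - 1]) + [len(xs)]
    return [sum(xs[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]


def categorize(sound, centroids):
    """Index of the nearest perceptual category."""
    return min(range(len(centroids)), key=lambda c: abs(sound - centroids[c]))


def babble(centroids, n, rng):
    """Innate exploration: issue random motor commands and file each one
    under the perceptual category of the sound it produced."""
    by_category = {c: [] for c in range(len(centroids))}
    for _ in range(n):
        m = rng.random()
        by_category[categorize(synthesizer(m), centroids)].append(m)
    # Motor map: one representative motor command per perceptual category.
    return {c: sum(ms) / len(ms) for c, ms in by_category.items() if ms}


rng = random.Random(0)
motif = [1000.0, 3000.0, 2000.0, 3000.0]           # the parent's song, as pitches
parent_song = [p + rng.gauss(0, 30) for _ in range(10) for p in motif]

centroids = learn_categories(parent_song, k=3)      # A) grounding from parent
motor_map = babble(centroids, n=300, rng=rng)       # B) and C) self-observation
target = [categorize(p, centroids) for p in motif]  # what the learner hears
imitation = [synthesizer(motor_map[c]) for c in target]
print("target categories:", target)
print("imitated pitches :", [round(s) for s in imitation])
```

Because the motor map is indexed by perceptual category, imitating a heard song reduces to a lookup, which is one way to read "motor control as perception backwards".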
An Architecture for Sensorimotor Learning
[Figure: the architecture diagram with an Articulatory Synthesizer added alongside Motor Slices and Internal Perception (Cartesian Theater)]
Defining Songemes
[Figure: zebra finch song spectrograms (Time vs. Frequency) with feature tracks: AM, FM, Entropy, Amplitude, Mean Frequency, Pitch Goodness, Pitch, Pitch Weight; includes a zoomed view of 550 to 850 msec, 2 to 10 kHz]
A learner for birdsong
Lower level features
Higher level features
A 15-dimensional, highly compact manifold
Some zebra finch slices
[Figure: slices over Goodness of pitch, Pitch, Mean frequency, and Wiener Entropy]
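The slides do not say how the acoustic features are computed. As a rough illustration, two of them can be computed per frame using their standard definitions: Wiener entropy as the log ratio of the geometric to the arithmetic mean of the power spectrum, and mean frequency as the power-weighted average frequency. The sample rate, frame length, and test signals below are illustrative assumptions, and the naive DFT is used only to keep the sketch dependency-free.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Naive DFT power for bins 1 .. N/2 - 1 (DC is skipped)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(1, n // 2)]


def wiener_entropy(power, eps=1e-12):
    """Log of (geometric mean / arithmetic mean) of the power spectrum.

    Close to 0 for noise-like frames, strongly negative for tonal frames.
    eps keeps the logarithm finite when a bin holds no energy.
    """
    p = [v + eps for v in power]
    log_geometric = sum(math.log(v) for v in p) / len(p)
    return log_geometric - math.log(sum(p) / len(p))


def mean_frequency(power, sample_rate, frame_len):
    """Power-weighted average frequency in Hz."""
    freqs = [k * sample_rate / frame_len for k in range(1, frame_len // 2)]
    return sum(f * v for f, v in zip(freqs, power)) / sum(power)


SAMPLE_RATE, FRAME_LEN = 32000, 256   # illustrative values, not from the talk
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 4000 * t / SAMPLE_RATE) for t in range(FRAME_LEN)]
noise = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]

tone_power = power_spectrum(tone)
noise_power = power_spectrum(noise)
print("tone :", wiener_entropy(tone_power),
      mean_frequency(tone_power, SAMPLE_RATE, FRAME_LEN))
print("noise:", wiener_entropy(noise_power),
      mean_frequency(noise_power, SAMPLE_RATE, FRAME_LEN))
```

A pure 4 kHz tone gives a strongly negative Wiener entropy and a mean frequency at the tone, while white noise gives an entropy near zero. Plotting one such feature against another over many frames would yield a two-dimensional slice of the kind shown on this slide.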
Early “bird” babbling
[Figure: two spectrograms of early babbling; Time (0 to 0.9 and 0 to 0.6) vs. Frequency (0 to 2 × 10^4)]
Birdsong mimicry
[Figure: spectrograms of Samba and “Samba’s son”; Time (0 to 0.7) vs. Frequency (0 to 2 × 10^4)]
A word about evaluating empirical experiments…
Contributions
A new architecture for sensorimotor learning
  Entirely self-supervised
  Biologically inspired
  Extremely simple, dimensionally compact
Wide range of applications
  Robotics
  Sensor arrays
  Computational learning
  Dynamic control systems
  Skill acquisition based on observation
Acknowledgments
Ofer Tchernichovski
Whitman Richards
Rodney Brooks
Howard Shrobe
Patrick Winston
Robert Berwick
Gerald Sussman
Adam Kraft
Kobi Gal
Krzysztof Gajos
To play with this online, Google: MIT Artificial Intelligence Demonstrations
http://ai6034.mit.edu/fall06/index.php?title=Demonstrations
Extra slides follow
Acquisition of harmonic complexity
[Figure: five spectrogram panels; vertical ticks 2 to 10, horizontal ticks starting at 0 or 100 and ending between 600 and 1200]
Related work
Unsupervised clustering:
  Language: de Marcken (1996), de Sa and Ballard (1997), Lin (2004)
  Vision: Bartlett (2001), Stauffer (2002)
  Statistical clustering: Dempster et al. (1977), Smyth (1999)
  Blind signal separation: Hyvärinen (2001)
  Neuroscience: Becker and Hinton (1995), Becker (2005), Granger (2003)
  Auditory scene analysis: Slaney et al. (2001)
  Minimal supervision: Blum and Mitchell (1998)
  Co-Clustering (Bi-Clustering, Block Clustering): Friedman, Mosenzon, Slonim, and Tishby (2001), Taskar, Segal, and Koller (2001), Madeira and Oliveira (2004)
Analysis of animal vocalizations:
  Birds (finches and buntings): Kogan and Margoliash (1997)
  Bowhead whales: Mellinger and Clark (1993)
  African elephants: Clemins and Johnson (2003)
  Humans: Guenther and Perkell (2004)
Primary distinctions of our approach:
1. Fully unsupervised
2. Non-parametric:
   Distribution free
   Unknown number of clusters
3. Presumes no domain knowledge
4. Neurologically motivated
Current and future work
Human protolinguistic babbling
  Proficiency of an eight-month-old child
  Entire phonetic structure of English
Building an atlas of modular brain function
  From human and rat fMRI data
  New approaches to clinical treatments for autism
Theoretical investigations
  Convergence properties