Becoming Recursive

or, Recursion as an Epiphenomenon of Distributed Role/Filler Serialization

or, How I Learned to Stop Recurring and Love the Brain

Simon D. Levy, Computer Science Department, Washington & Lee University

Recursion in Human Languages Conference, Illinois State University

27 April 2007

Part I: Background

Two Views on Recursion

1. “Essentialist”: Recursion is a fundamental property of the Faculty of Language in the Narrow Sense / FLN / UG (Hauser, Chomsky, Fitch 2002)

2. “Nominalist”: Recursion is one of several strategies for “the transmission of propositional structures through a serial interface” (Pinker & Bloom 1990)

cf. Power Laws (Physics)

M. E. J. Newman. Power laws, Pareto distributions, and Zipf's law. Contemporary Physics, 46, 323-351 (2005).

Now, just because these simple mechanisms exist, doesn't mean they explain any particular case.... You need to do "differential diagnosis", by identifying other, non-power-law consequences of your mechanism, which other possible explanations don't share. This, we hardly ever do. - C. Shalizi (2007)

Critique of Pure Recursion

If we want to imitate human memory with models, we must take account of the weaknesses of the nervous system as well as its powers. - D. Gabor (1968)

Once again, however, my claim is not that the Pirahã cannot think recursively, but that their syntax is not recursive. - D. Everett (2007)

Part II: Model

Role/Filler Serialization

• Propositional representations built from composing role/filler bindings (Fillmore 1968; Schank 1972)

• Syntax / grammar replaced by a neurally plausible mechanism for serializing recursively-structured propositional representations through role prediction (Chang et al. 2006)

• Syntactic recursion becomes possible when, e.g., noun roles (agent, patient) are generalized to intentional predicates (knows, wants)

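[Figure: two role/filler diagrams over the nodes KNOWS, BILL, LOVES, JOHN, MARY. In the first, the links carry predicate-specific role labels (KNOWER, KNOWN, LOVER, LOVEE); in the second, the same links carry generalized role labels (AGENT, PATIENT).]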

Neurally Plausible Role/Filler Models

• Distributed Representations: massively parallel, gracefully degrading, non-local storage (McClelland et al. 1986)

• Vector Symbolic Architectures (Plate 2003; Kanerva 1994): roles, fillers represented as high-dimensional, low-precision vectors of fixed size

• Efficient (parallel) binding, unbinding, composition through vector arithmetic

• Psychologically realistic model of analogy through vector distance metric
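The vector-distance bullet can be illustrated with a minimal sketch. It assumes random ±1 vectors, binding by elementwise multiplication, and composition by elementwise addition; these are one common VSA choice, not necessarily the operators used in the talk (Plate's HRRs, for instance, bind by circular convolution). The dimensionality, the vocabulary, and every function name here are illustrative assumptions.

```python
import math
import random

DIM = 2000  # assumed dimensionality; the slides only say "high-dimensional"

def random_vector(rng):
    """Random bipolar (+1/-1) vector standing in for a role or a filler."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bind(role, filler):
    """Bind a role to a filler by elementwise multiplication."""
    return [r * f for r, f in zip(role, filler)]

def bundle(*vectors):
    """Compose bindings by elementwise addition (superposition)."""
    return [sum(column) for column in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity, used here as the vector distance metric."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

rng = random.Random(0)
names = ["PRED", "AGENT", "PATIENT", "LOVES", "SEES", "JOHN", "MARY", "SUE", "BILL"]
v = {name: random_vector(rng) for name in names}

def proposition(pred, agent, patient):
    """Fixed-size vector for a three-role proposition."""
    return bundle(bind(v["PRED"], v[pred]),
                  bind(v["AGENT"], v[agent]),
                  bind(v["PATIENT"], v[patient]))

john_loves_mary = proposition("LOVES", "JOHN", "MARY")
john_loves_sue = proposition("LOVES", "JOHN", "SUE")
bill_sees_sue = proposition("SEES", "BILL", "SUE")

# Two of three bindings are shared, so the vectors stay close.
sim_close = cosine(john_loves_mary, john_loves_sue)
# No bindings are shared, so the vectors are nearly orthogonal.
sim_far = cosine(john_loves_mary, bill_sees_sue)
```

Under these assumptions, structures that share bindings end up measurably closer than structures that share none, without any explicit structure matching.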

Vector Symbolic Architectures: Binding, Composition

Vector Symbolic Architectures: Unbinding
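Binding, composition, and unbinding can be sketched together. This is a hedged illustration, not the formulation shown on the original slides: it assumes ±1 vectors, for which elementwise multiplication is its own inverse, and a nearest-neighbor "cleanup" over a small filler vocabulary. The dimensionality and all names are assumptions.

```python
import random

DIM = 2000  # assumed dimensionality
rng = random.Random(1)

def random_vector():
    """Random bipolar (+1/-1) vector."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bind(a, b):
    """Binding: elementwise multiplication (self-inverse for +1/-1 vectors)."""
    return [x * y for x, y in zip(a, b)]

def bundle(*vectors):
    """Composition: elementwise addition of bound pairs."""
    return [sum(column) for column in zip(*vectors)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

roles = {name: random_vector() for name in ("PRED", "AGENT", "PATIENT")}
fillers = {name: random_vector() for name in ("LOVES", "JOHN", "MARY", "BILL")}

# Composition of three role/filler bindings: "John loves Mary".
sentence = bundle(bind(roles["PRED"], fillers["LOVES"]),
                  bind(roles["AGENT"], fillers["JOHN"]),
                  bind(roles["PATIENT"], fillers["MARY"]))

def unbind(structure, role):
    """Multiply by the role again, then clean up to the nearest known filler."""
    noisy = bind(structure, role)  # filler plus crosstalk from the other bindings
    return max(fillers, key=lambda name: dot(noisy, fillers[name]))

recovered = {name: unbind(sentence, vec) for name, vec in roles.items()}
```

Unbinding returns the filler plus noise from the other bindings; the cleanup step is what turns that noisy vector back into a discrete symbol.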

Vector Symbolic Architectures: Recursion
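Recursion in this setting means that a filler can itself be a composite vector of the same fixed size. The sketch below assumes the proposition suggested by the figure labels (Bill knows that John loves Mary) and the same illustrative operators as before (±1 vectors, elementwise multiplication and addition); none of the names come from the talk.

```python
import random

DIM = 2000  # assumed dimensionality
rng = random.Random(2)

def random_vector():
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bind(a, b):
    return [x * y for x, y in zip(a, b)]

def bundle(*vectors):
    return [sum(column) for column in zip(*vectors)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

ROLES = {name: random_vector() for name in ("PRED", "AGENT", "PATIENT")}
WORDS = {name: random_vector() for name in ("KNOWS", "LOVES", "BILL", "JOHN", "MARY")}

def encode(prop):
    """Encode a word name, or a (pred, agent, patient) tuple whose slots may nest."""
    if isinstance(prop, str):
        return WORDS[prop]
    pred, agent, patient = prop
    return bundle(bind(ROLES["PRED"], encode(pred)),
                  bind(ROLES["AGENT"], encode(agent)),
                  bind(ROLES["PATIENT"], encode(patient)))

def unbind(structure, role_name):
    """Noisy filler for the given role (no cleanup applied)."""
    return bind(structure, ROLES[role_name])

def cleanup(noisy):
    """Nearest known word by dot product."""
    return max(WORDS, key=lambda name: dot(noisy, WORDS[name]))

# The PATIENT of KNOWS is itself a proposition; the result is still DIM long.
outer = encode(("KNOWS", "BILL", ("LOVES", "JOHN", "MARY")))

knower = cleanup(unbind(outer, "AGENT"))
embedded = unbind(outer, "PATIENT")        # noisy copy of the embedded clause
lover = cleanup(unbind(embedded, "AGENT"))
lovee = cleanup(unbind(embedded, "PATIENT"))
```

Each level of embedding adds crosstalk to the unbound vector, which is one informal way to see why such models suggest soft rather than hard limits on depth.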

Serializing VSA Representations

• Sequence-processing network (Elman 1990; Dominey et al. 2006) can be trained to predict role-vector sequences for a given language (e.g., AGENT-PRED-PATIENT for English)

• Role vectors unbind fillers

• Associative network maps fillers to words

• Neurally plausible “soft stack” network (Levy 2007) supports fillers requiring further decomposition
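The serialization steps above can be sketched symbolically. This is a deliberately simplified stand-in: dictionaries replace the VSA vectors, a fixed role order replaces the trained sequence-processing network, a word table replaces the associative network, and an ordinary hard stack replaces the "soft stack" network. The LEXICON table and the serialize function are hypothetical names introduced for illustration.

```python
ROLE_ORDER = ("AGENT", "PRED", "PATIENT")  # English order given on the slide

# Hypothetical filler-to-word mapping (stands in for the associative network).
LEXICON = {"BILL": "Bill", "KNOWS": "knows", "JOHN": "John",
           "LOVES": "loves", "MARY": "Mary"}

def serialize(proposition, role_order=ROLE_ORDER):
    """Emit words by walking roles in order, stacking fillers that nest."""
    words = []
    stack = [(proposition, 0)]  # (frame, index of the next role to emit)
    while stack:
        frame, position = stack.pop()
        if position == len(role_order):
            continue  # this frame is finished
        stack.append((frame, position + 1))  # resume here afterwards
        filler = frame[role_order[position]]  # "role unbinds filler"
        if isinstance(filler, dict):
            stack.append((filler, 0))  # filler requires further decomposition
        else:
            words.append(LEXICON[filler])
    return " ".join(words)

bill_knows = {
    "PRED": "KNOWS",
    "AGENT": "BILL",
    "PATIENT": {"PRED": "LOVES", "AGENT": "JOHN", "PATIENT": "MARY"},
}

sentence = serialize(bill_knows)
verb_final = serialize(bill_knows, ("AGENT", "PATIENT", "PRED"))
```

Changing only the role order changes the word order of the output, which is the sense in which the role sequence, rather than a grammar, carries the language-specific part in this sketch.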

Advantages of the Model

• Predicts observed progression from simple, idiosyncratic to complex, recursive constructions in language acquisition (Tomasello 2003)

• “Soft-wired”, learnable, mutable role inventory (Blank & Gasser 1992), generalizable to social & other networks

• Supports both directions of language / culture influence
– Sapir-Whorf
– Immediacy of Experience (Everett 2005)

• Predicts soft limits on depth of embedding in memory, speech (Rohde 2002)

• Neurally plausible implementation (Eliasmith 2004; Dominey et al. 2006)

• Concept / sequence processing distinction supported by neuroscience (Crow 1997)

Part III: Conclusions

Current Work

• Role Production by Analogy in Vector Symbolic Architectures

• Iterated Learning Model (Kirby & Hurford 2002)

References & Related Work

• Blank, D. and M. Gasser (1992) Grounding via Scanning: Cooking up Roles from Scratch. Proceedings of the 1992 Midwest Artificial Intelligence and Cognitive Science Society Conference.

• Crow, T.J. (1997) Is Schizophrenia the Price that Homo Sapiens Pays for Language? Schizophrenia Research, 28: 127-141.

• Chang, F., G.S. Dell, and K. Bock (2006) Becoming Syntactic. Psychological Review, 113, 2, 234-272.

• Dominey, P.F., M. Hoen, and T. Inui (2006) A Neurolinguistic Model of Grammatical Construction Processing. Journal of Cognitive Neuroscience, 18: 2088-2107.

• Eliasmith, C. (2004) Learning Context Sensitive Logical Inference in a Neurobiological Simulation. In S. Levy and R. Gayler, eds., Compositional Connectionism in Cognitive Science. AAAI Fall Symposium. AAAI Press, 17-20.

• Elman, J. (1990) Finding Structure in Time. Cognitive Science, 14: 179-211.

• Everett, D.L. (2007) Cultural Constraints on Grammar in Pirahã: A Reply to Nevins, Pesetsky, and Rodrigues (2007). lingBuzz/000427.

• Everett, D.L. (2005) Cultural Constraints on Grammar and Cognition in Pirahã: Another Look at the Design Features of Human Language. Current Anthropology, August-October, 2005.

• Fillmore, C. J. (1968) The Case for Case. In Bach and Harms, eds., Universals in Linguistic Theory. New York: Holt, Rinehart, and Winston, 1-88.

• Gabor, D. (1968) Improved Holographic Model of Temporal Recall. Nature, 217: 1288-1289.

• Hauser, M.D., N. Chomsky, and W. T. Fitch (2002) The Faculty of Language: What Is It, Who Has It, and How Did It Evolve? Science 22 November 2002: Vol. 298. no. 5598, pp. 1569 – 1579.

• Kanerva, P. (1994) The Spatter Code for Encoding Concepts at Many Levels. In M. Marinaro and P.G. Morasso, eds., ICANN '94: Proceedings International Conference on Artificial Neural Networks (Sorrento, Italy), vol. 1, 226-229. London: Springer-Verlag.

• Kirby, S. and J. Hurford (2002) The emergence of linguistic structure: An overview of the iterated learning model. In A. Cangelosi and D. Parisi, eds., Simulating the Evolution of Language. London: Springer Verlag, 121–148.

• Levy, S.D. (2007). Continuous States and Distributed Symbols: Toward a Biological Theory of Computation (Poster). Proceedings of Unconventional Computation: Quo Vadis?, Santa Fe, NM

• McClelland, J.L., D. E. Rumelhart and G. E. Hinton (1986) The Appeal of Parallel Distributed Processing. In D. E. Rumelhart and J. L. McClelland, eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, Massachusetts: MIT Press.

• Miikkulainen, R. (1996) Subsymbolic Case-Role Analysis of Sentences with Embedded Clauses. Cognitive Science 20 : 47-73.

• Newman, M. E. J. (2005) Power Laws, Pareto Distributions, and Zipf's Law. Contemporary Physics, 46: 323-351.

• Pinker, S. & P. Bloom (1990). Natural language and natural selection. Behavioral and Brain Sciences 13 (4): 707-784.

• Plate, T. (2003) Holographic Reduced Representations. CSLI Lecture Notes Number 150. Stanford, California: CSLI Publications.

• Rohde, D.L.T. (2002) A Connectionist Model of Sentence Comprehension and Production. PhD thesis, School of Computer Science, Carnegie Mellon University.

• Schank, R.C. (1972) Conceptual Dependency: A Theory of Natural Language Understanding. Cognitive Psychology, 3(4), 532-631.

• Shalizi, C. R. (2007) Power Law Distributions, 1/f Noise, Long-Memory Time Series. http://cscs.umich.edu/~crshalizi/notebooks/power-laws.html

• Tomasello, M. (2003). Constructing a Language: A Usage-Based Theory of Language Acquisition. Harvard University Press.
