Learnability and Semantic Universals
Semantic Universals The Learning Model Experiments Conclusion
Learnability and Semantic Universals
Shane Steinert-Threlkeld (joint work-in-progress with Jakub Szymanik)
Cognitive Semantics and Quantities Kick-off Workshop, September 28, 2017
1 / 38
Overview
1 Semantic Universals
2 The Learning Model
3 Experiments
4 Conclusion
2 / 38
Universals in Linguistic Theory
Question
What is the range of variation in human languages? That is: which, out of all of the logically possible languages that humans could speak, do they in fact speak?
3 / 38
Example Universals
Sound (phonology): All languages have consonants and vowels. Every language has at least /a i u/.
Grammar (syntax): All languages have verbs and nouns.
Meaning (semantics): All languages have syntactic constituents (NPs) whose semantic function is to express generalized quantifiers. (Barwise and Cooper 1981)
4 / 38
Determiners
Determiners:
Simple: every, some, few, most, five, . . .
Complex: all but five, fewer than three, at least eight or fewer than five, . . .
Denote type 〈1, 1〉 generalized quantifiers: sets of models of the form 〈M,A,B〉 with A,B ⊆ M
For example:
⟦every⟧ = {〈M,A,B〉 : A ⊆ B}
⟦most⟧ = {〈M,A,B〉 : |A ∩ B| > |A \ B|}
5 / 38
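The two denotations above can be sketched directly as predicates over finite models. A minimal illustration of the definitions (my own code, not from the talk), with the domain and sets modeled as Python sets:

```python
# Sketch: type <1,1> generalized quantifiers as predicates over a finite
# model <M, A, B>, where A and B are subsets of the domain M.

def every(M, A, B):
    # [[every]] = {<M, A, B> : A ⊆ B}
    return A <= B

def most(M, A, B):
    # [[most]] = {<M, A, B> : |A ∩ B| > |A \ B|}
    return len(A & B) > len(A - B)

M = {1, 2, 3, 4}
A = {1, 2, 3}  # e.g. the French people
B = {1, 2}     # e.g. the smokers
print(every(M, A, B))  # False: 3 is in A but not in B
print(most(M, A, B))   # True: |A ∩ B| = 2 > |A \ B| = 1
```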
Monotonicity
Many French people smoke cigarettes ⇒ Many French people smoke
Few French people smoke ⇒ Few French people smoke cigarettes
Q is upward (resp. downward) monotone iff: if 〈M,A,B〉 ∈ Q and B ⊆ B′ (resp. B′ ⊆ B), then 〈M,A,B′〉 ∈ Q
A determiner is monotone if it denotes either an upward or downward monotone generalized quantifier
6 / 38
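On a small finite domain the definition can be checked exhaustively. A hedged sketch (my own illustration; `some` and `no` are stand-in quantifiers, not ones discussed on this slide):

```python
from itertools import combinations

# Brute-force check of upward monotonicity on a fixed finite domain:
# if <M, A, B> ∈ Q and B ⊆ B', then <M, A, B'> ∈ Q.

def subsets(M):
    elems = list(M)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

def upward_monotone(Q, M):
    # Check the implication for all A, B, B' ⊆ M.
    return all(not (B <= B2 and Q(M, A, B)) or Q(M, A, B2)
               for A in subsets(M) for B in subsets(M) for B2 in subsets(M))

M = frozenset({1, 2, 3})
some = lambda M, A, B: len(A & B) > 0  # upward monotone
no = lambda M, A, B: len(A & B) == 0   # downward, not upward, monotone
print(upward_monotone(some, M))  # True
print(upward_monotone(no, M))    # False: enlarging B can falsify "no"
```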
Monotonicity Universal
Monotonicity Universal (Barwise and Cooper 1981)
All simple determiners are monotone.
7 / 38
Quantity
Key intuition: determiners denote general relations between their restrictor and nuclear scope, not ones that depend on the identities of particular elements of the domain
Q is permutation-closed iff: if 〈M,A,B〉 ∈ Q and π is a permutation of M, then π(〈M,A,B〉) ∈ Q
Q is isomorphism-closed iff: if 〈M,A,B〉 ∈ Q and 〈M,A,B〉 ≅ 〈M′,A′,B′〉, then 〈M′,A′,B′〉 ∈ Q
Q is quantitative iff: if 〈M,A,B〉 ∈ Q and A ∩ B, A \ B, B \ A, M \ (A ∪ B) have the same cardinalities as their primed counterparts, then 〈M′,A′,B′〉 ∈ Q
8 / 38
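One way to see the quantitativity condition: a quantitative Q may only inspect the four cardinalities |A ∩ B|, |A \ B|, |B \ A|, |M \ (A ∪ B)|. A small sketch (my own illustration; `contains_0` is an artificial non-quantitative example, not from the talk):

```python
# Quantitativity: truth may depend only on the four zone cardinalities.

def signature(M, A, B):
    return (len(A & B), len(A - B), len(B - A), len(M - (A | B)))

def most(M, A, B):
    # Quantitative by construction: it only inspects the signature.
    n_ab, n_a_minus_b, _, _ = signature(M, A, B)
    return n_ab > n_a_minus_b

def contains_0(M, A, B):
    # NOT quantitative: depends on the identity of a particular element.
    return 0 in (A & B)

# Two models with the same signature (1, 1, 0, 2):
M = {0, 1, 2, 3}
m1 = (M, {0, 1}, {0})
m2 = (M, {1, 2}, {1})
print(signature(*m1) == signature(*m2))  # True: same cardinalities
print(most(*m1) == most(*m2))            # True: quantitative Q agrees
print(contains_0(*m1), contains_0(*m2))  # True False: disagrees
```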
Quantity
Keenan and Stavi 1986, p. 311: “Monomorphemic dets are logical [= permutation-closed].”
Peters and Westerstahl 2006, p. 330: “All lexical quantifier expressions in natural languages denote ISOM [= isomorphism-closed] quantifiers.”
9 / 38
Quantity Universal
Quantity Universal
All simple determiners are quantitative.
10 / 38
Conservativity
Key intuition: the restrictor restricts what the determiner talks about
Many French people smoke cigarettes ≡ Many French people are French people who smoke cigarettes
Q is conservative iff: 〈M,A,B〉 ∈ Q if and only if 〈M,A,A ∩ B〉 ∈ Q
11 / 38
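As with monotonicity, conservativity can be checked by brute force on a small domain. A sketch (my own illustration; `only`, read as "only As are Bs", is the textbook non-conservative case):

```python
from itertools import combinations

# Conservativity: <M, A, B> ∈ Q iff <M, A, A ∩ B> ∈ Q, for all A, B ⊆ M.

def subsets(M):
    elems = list(M)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

def conservative(Q, M):
    return all(Q(M, A, B) == Q(M, A, A & B)
               for A in subsets(M) for B in subsets(M))

M = frozenset({1, 2, 3})
every = lambda M, A, B: A <= B
only = lambda M, A, B: B <= A  # "only As are Bs"
print(conservative(every, M))  # True
print(conservative(only, M))   # False: A ∩ B ⊆ A holds no matter what B is
```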
Conservativity Universal
Conservativity Universal (Barwise and Cooper 1981)
All simple determiners are conservative.
12 / 38
Explaining Universals
Natural Question
Why do the universals hold? What explains the limited range of quantifiers expressed by simple determiners?
13 / 38
Explaining Universals
Natural Question
Why do the universals hold? What explains the limited range of quantifiers expressed by simple determiners?
Answer 1: learnability. (Barwise and Cooper 1981; Keenan and Stavi 1986; Szabolcsi 2010)
The universals greatly restrict the search space that a language learner must explore when learning the meanings of determiners. This makes it easier (possible?) for them to learn such meanings from relatively small input. [Compare: Poverty of the Stimulus argument for UG.]
In a sense this must be true, but: “Likely, the unrestricted space has many hypotheses which are so implausible, they can be ignored quickly and do not affect learning. The hard part of learning, may be choosing between the plausible competitor meanings, not in weeding out a large space of potential meanings.” (Piantadosi 2013, p. 22)
13 / 38
Explaining Universals
Natural Question
Why do the universals hold? What explains the limited range of quantifiers expressed by simple determiners?
Answer 2: learnability. (Peters and Westerstahl 2006)
The universals aid learnability because quantifiers satisfying the universals are easier to learn than those that do not.
Challenge: provide a model of learning which makes good on this promise. [See Tiede 1999; Gierasimczuk 2007; Gierasimczuk 2009 for earlier attempts.]
13 / 38
Overview
1 Semantic Universals
2 The Learning Model
3 Experiments
4 Conclusion
14 / 38
Neural Networks
Source: Nielsen, “Neural Networks and Deep Learning”, Determination Press
15 / 38
Gradient Descent and Back-propagation
The learning framework will be a non-convex optimization problem.
A total loss function, which will be the mean of a ‘local’ error function:
L = (1/N) ∑_i ℓ(ŷ_i, y_i) = (1/N) ∑_i ℓ(NN(~θ, x_i), y_i)
Navigate the landscape of weight space towards lower loss:
~θ_{t+1} ← ~θ_t − α ∇_{~θ} L
Back-propagation: calculate ℓ(·, ·) after a forward pass of the network. ‘Propagate’ the error backwards through the network to calculate the gradient.
16 / 38
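The update rule above can be run by hand on a toy one-parameter model. A minimal sketch (a least-squares loss of my choosing for illustration, not the talk's cross-entropy setup):

```python
# Gradient descent θ ← θ − α∇L on L(θ) = (1/N) Σ (θ·x_i − y_i)², by hand.

def grad(theta, data):
    # dL/dθ = (2/N) Σ (θ·x_i − y_i)·x_i
    return 2 * sum((theta * x - y) * x for x, y in data) / len(data)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # y = 2x, so the optimum is θ = 2
theta, alpha = 0.0, 0.05
for _ in range(200):
    theta = theta - alpha * grad(theta, data)  # θ_{t+1} ← θ_t − α∇L
print(round(theta, 3))  # → 2.0
```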
Recurrent Neural Networks
Source: Olah, “Understanding LSTM Networks”, http://colah.github.io/posts/2015-08-Understanding-LSTMs/
17 / 38
Long Short-Term Memory
Introduced in Hochreiter and Schmidhuber 1997 to solve the exploding/vanishing gradients problem.
Source: Olah, “Understanding LSTM Networks”, http://colah.github.io/posts/2015-08-Understanding-LSTMs/
18 / 38
The Learning Task
Our data will be the following:
Input: 〈Q,M〉 pairs, presented sequentially
Output: T/F, depending on whether M ∈ Q
So, this will be a sequence classification task.
The loss function we minimize is cross-entropy. In our case, with y ∈ {0, 1}, very simple:
ℓ(NN(~θ, x), y) = − ln(NN(~θ, x)_y)
19 / 38
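Concretely, with the network emitting a probability for each of the two truth values, the per-example loss is just the negative log of the probability assigned to the correct label. A small sketch:

```python
import math

# ℓ(NN(θ, x), y) = −ln(NN(θ, x)_y): the negative log of the probability the
# network assigns to the correct truth value y ∈ {0, 1}.

def cross_entropy(probs, y):
    # probs = [P(False), P(True)], e.g. a softmax output; y is the true label.
    return -math.log(probs[y])

print(cross_entropy([0.1, 0.9], 1))  # ≈ 0.105: confident and correct
print(cross_entropy([0.1, 0.9], 0))  # ≈ 2.303: confident and wrong
```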
Data Generation
Algorithm 1: Data Generation
Inputs: max_len, num_data, quants
data ← []
while len(data) < num_data do
    N ∼ Unif([max_len])
    Q ∼ Unif(quants)
    cur_seq ← Unif({A ∩ B, A \ B, B \ A, M \ (A ∪ B)}, N)
    if 〈Q, cur_seq〉 ∉ data then
        data.append(generate_point(〈Q, cur_seq, cur_seq ∈ Q?〉))
    end if
end while
return shuffle(data)
20 / 38
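A hedged Python rendering of Algorithm 1 (the helper names and the "at least 3" example quantifier are mine; the released code may differ). A sequence represents a model: each element records which of the four zones A ∩ B, A \ B, B \ A, M \ (A ∪ B) an individual occupies.

```python
import random

ZONES = ["AB", "AnotB", "BnotA", "neither"]  # the four zones of <M, A, B>

def generate_data(max_len, num_data, quants, seed=0):
    rng = random.Random(seed)
    seen, data = set(), []
    while len(data) < num_data:
        n = rng.randint(1, max_len)                # N ~ Unif([max_len])
        name, q = rng.choice(list(quants.items())) # Q ~ Unif(quants)
        seq = tuple(rng.choice(ZONES) for _ in range(n))
        if (name, seq) not in seen:                # skip duplicate points
            seen.add((name, seq))
            data.append((name, seq, q(seq)))       # label: cur_seq ∈ Q?
    rng.shuffle(data)
    return data

# Example quantifier over sequences: at least 3 individuals in A ∩ B.
at_least_3 = lambda seq: seq.count("AB") >= 3
data = generate_data(max_len=8, num_data=20, quants={"at_least_3": at_least_3})
print(len(data))  # → 20
```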
Overview
1 Semantic Universals
2 The Learning Model
3 Experiments
4 Conclusion
21 / 38
General Set-up
For each proposed semantic universal:
Choose a set of minimally different quantifiers that disagree with respect to the universal
Run some number of trials of training an LSTM to learn those quantifiers
Stop when: the 500-minibatch running mean total accuracy on the test set is > 0.99; or the mean probability assigned to the correct truth-value on the test set is > 0.99
Measure, for each quantifier Q, its convergence point: the first i such that the mean accuracy for Q from i onward is > 0.98
Code and Data Available At:
http://github.com/shanest/quantifier-rnn-learning
22 / 38
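The convergence-point measure is easy to state in code. A sketch, under the assumption that per-step accuracies for a quantifier are available as a list:

```python
# Convergence point: the first index i such that the mean accuracy from
# step i onward exceeds the threshold (0.98 in these experiments).

def convergence_point(accuracies, threshold=0.98):
    for i in range(len(accuracies)):
        tail = accuracies[i:]
        if sum(tail) / len(tail) > threshold:
            return i
    return None  # never converged within the recorded steps

print(convergence_point([0.5, 0.9, 0.99, 1.0, 0.99]))  # → 2
```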
Experiment 1: Monotonicity
Monotonicity Universal (Barwise and Cooper 1981)
All simple determiners are monotone.
Quantifiers: ≥ 4, ≤ 4, = 4
23 / 38
Monotonicity: Results
24 / 38
Monotonicity: Results
25 / 38
Monotonicity: Discussion
≥ 4 easier than = 4: ✓
≥ 4 easier than ≤ 4, and ≤ 4 not easier than = 4: ???
Upward monotone quantifiers are generally cognitively easier than downward monotone ones [Just and Carpenter 1971; Geurts 2003; Geurts and Slik 2005; Deschamps et al. 2015]
= 4 is a conjunction of monotone quantifiers
A better measure of learning rate than convergence point?
26 / 38
Experiment 2: Quantity
Quantity Universal
All simple determiners are quantitative.
Quantifiers: ≥ 3, first3
27 / 38
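The contrast between the two quantifiers can be illustrated with a small sketch (an assumed representation, not the talk's): models as ordered sequences of elements, each tagged for membership in A and B. ≥ 3 is quantitative (permutation-invariant), while first3 is order-sensitive:

```python
# Models as ordered sequences of (in_A, in_B) membership flags.
def at_least_3(seq):
    # Quantitative: depends only on how many elements lie in A ∩ B,
    # so any permutation of seq yields the same truth value.
    return sum(1 for in_a, in_b in seq if in_a and in_b) >= 3

def first_3(seq):
    # Order-sensitive: the first three A-elements, in presentation
    # order, must all be in B.
    first_as = [in_b for in_a, in_b in seq if in_a][:3]
    return len(first_as) == 3 and all(first_as)

seq = [(True, False), (True, True), (True, True), (True, True)]
print(at_least_3(seq))               # True
print(first_3(seq))                  # False: first A-element not in B
print(first_3(list(reversed(seq))))  # True: reordering flips the value
```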
![Page 62: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/62.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Quantity: Results
28 / 38
![Page 63: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/63.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Quantity: Results
29 / 38
![Page 64: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/64.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Quantity: Discussion
Very promising initial results
Harder to learn an order-sensitive quantifier than one that is only sensitive to quantity
30 / 38
![Page 66: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/66.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Experiment 3: Conservativity
Conservativity Universal (Barwise and Cooper 1981)
All simple determiners are conservative.
Quantifiers: nall, nonly [motivated by Hunter and Lidz 2013]
31 / 38
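Conservativity can be checked directly: Q is conservative iff Q(A, B) = Q(A, A ∩ B) on every model. A sketch with the two quantifiers, taking nall to be "not all" (conservative) and nonly to be "not only", i.e. B ⊄ A (non-conservative); these readings of the names are an assumption:

```python
def nall(A, B):
    # "not all": some A is not a B (conservative).
    return not A <= B

def nonly(A, B):
    # "not only": some B is not an A (non-conservative).
    return not B <= A

def conservative_on(Q, A, B):
    # Conservativity instance: Q(A, B) must equal Q(A, A ∩ B).
    return Q(A, B) == Q(A, A & B)

A, B = {1, 2}, {2, 3}
print(conservative_on(nall, A, B))   # True
print(conservative_on(nonly, A, B))  # False: this model witnesses failure
```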
![Page 68: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/68.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Conservativity: Results
32 / 38
![Page 69: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/69.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Conservativity: Results
33 / 38
![Page 70: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/70.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Conservativity: Discussion
No way of ‘breaking the symmetry’ between A ∩ B and B \ A in this model
More boldly: conservativity as a syntactic/structural constraint, not a semantic universal [see Fox 2002; Sportiche 2005; Romoli 2015]
34 / 38
![Page 72: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/72.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Overview
1 Semantic Universals
2 The Learning Model
3 Experiments
4 Conclusion
35 / 38
![Page 73: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/73.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Towards Meeting the Challenge
Challenge: find a model of learnability on which quantifiers satisfying a universal are easier to learn than those that do not satisfy it.
We showed how to train LSTM networks via backpropagation to verify quantifiers.
Initial experiments show that this setup may be able to meet thechallenge.
36 / 38
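The verification setup can be sketched as follows (the representation details are assumptions, not from the talk): each element of a model is one-hot coded by its ‘zone’ (A ∩ B, A \ B, B \ A, or neither), and the network is trained on the sequence of codes paired with the quantifier's truth value as the label:

```python
import random

# The four zones an element can occupy relative to A and B.
ZONES = ["A&B", "A-B", "B-A", "neither"]

def one_hot(zone):
    return [1 if z == zone else 0 for z in ZONES]

def at_least_4(zones):
    return zones.count("A&B") >= 4

def make_example(quantifier, length):
    # One labeled training example: (sequence of one-hot codes, truth value).
    zones = [random.choice(ZONES) for _ in range(length)]
    return [one_hot(z) for z in zones], quantifier(zones)

xs, label = make_example(at_least_4, length=10)
print(len(xs), len(xs[0]))  # sequence length 10, code width 4
```

An LSTM reading such a sequence one code at a time, trained with cross-entropy against the label, is one natural instantiation of the setup.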
![Page 76: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/76.jpg)
Semantic Universals The Learning Model Experiments Conclusion
Future Work
More and larger experiments
Different models, including baselines for this task
More realistic visual input (see Sorodoc, Lazaridou, et al. 2016; Sorodoc, Pezzelle, et al. 2017)
Tools to ‘look inside’ the black box and see what the network is doing
37 / 38
![Page 80: Learnability and Semantic Universals](https://reader035.vdocument.in/reader035/viewer/2022081602/613c5850f237e1331c5115b0/html5/thumbnails/80.jpg)
Semantic Universals The Learning Model Experiments Conclusion
The End
Thank you!
38 / 38