
John Hale: What a Rational Parser Would Do

If we conceptualize a theory of human sentence comprehension as a combination of (1) a grammar, (2) a strategy for using the rules of the grammar, and (3) some architectural facilities like memory, we still have a huge space of possible theories. It would be nice to narrow this class down to just those theories that make sense in relation to the communicative function that sentence comprehension often serves.

This talk examines a smaller class of comprehension theories that strive to finish parsing as soon as possible. These theories would be "rational" on a view of the comprehender as doing his or her best to understand what the speaker means. I shall argue that they correctly derive well-known garden-path phenomena along with the puzzling Local Coherence effects studied by Tabor, Galantucci, and Richardson (2004). Time permitting, I will discuss the relationship between this class of theories and the Entropy Reduction Hypothesis revived in Hale (2006).

Tabor, W., Galantucci, B., & Richardson, D. (2004). Effects of merely local syntactic coherence on sentence processing. Journal of Memory and Language, 50(4), 355-370.

Hale, J. (2006). Uncertainty about the rest of the sentence. Cognitive Science, 30(4), 609-642.


Mark Baker: Two Modalities of Case Assignment: Case in Sakha

Two distinct ideas about how morphological case is assigned exist in the recent generative literature: the standard Chomskian view that case is assigned by designated functional heads to the closest NP via an agreement relationship, and an alternative view in which case is assigned to one NP if there is a second NP in the same local domain (Marantz 1991). I claim that these two ways of assigning case are complementary, based on data from the Turkic language Sakha. Accusative case and dative case in this language are assigned by Marantz-style configurational rules that do not refer directly to functional categories. This is shown by evidence from passives, agentive nominalizations, subject raising, possessor raising, and case assignment in PPs. In contrast, there is evidence that nominative and genitive are assigned by functional heads in the Chomskian way, as shown by the distribution of nominative case and the relationship between case marking and agreement. The two methods of case assignment thus coexist, not only in Universal Grammar, but even in the grammar of a single language. This raises the interesting question of how these two modes of case assignment are distributed typologically in languages of the world.


Tim O'Donnell: Bayesian models of language and stochastic functional programming

Traditionally, linguists have perceived a dichotomy between highly structured rule-based models and "statistical" approaches to language (Pereira 2000). However, over the years computational linguists and others have developed probabilistic models of language built on top of rich representations. More recently this approach has spread to other areas of cognitive science and psychology in the form of so-called "structured Bayesian modeling." Structured Bayes is one approach for allowing probabilistic models to be defined over traditional logical systems such as first-order logic or the lambda calculus. This Fall, Cornell Linguistics hosts Timothy O'Donnell (Harvard) and Noah Goodman (MIT) as they present a pair of tutorial workshops designed to introduce structured Bayesian modeling and its applications to language. O'Donnell and Goodman will present methods for defining and reasoning with Bayesian hierarchical generative models. They will do so using "Church", a programming language designed for succinctly and elegantly expressing Bayesian models.

A wiki for the Church project can be found here.

Funded by a Small Grant from the Cornell Institute for the Social Sciences.

Pereira, F. (2000). Formal grammar and information theory: Together again? Philosophical Transactions of the Royal Society, 358, 1239-1253.

Fragment Grammars: Exploring Computation and Reuse in Language.
Timothy J. O'Donnell, Noah D. Goodman, and Joshua B. Tenenbaum
Technical Report MIT-CSAIL-TR-2009-013
http://hdl.handle.net/1721.1/44963

Church: a language for generative models. N. D. Goodman, V. K. Mansinghka, D. Roy, K. Bonawitz, J. B. Tenenbaum (2008). Uncertainty in Artificial Intelligence 2008.


Beth Levin: When in Means 'into': Implications for a Typology of Motion Events

Talmy's influential proposal that languages fall into two types with respect to the lexicalization of motion events---path languages and manner languages---is now considered too simplistic: many languages, though predominantly of one type, may show properties of the other. Among these are the path languages French, Italian, and Spanish, which allow a handful of manner of motion verbs to take PP complements in the expression of directed motion, counter to their Talmyan type. Although this exceptional behavior has been accommodated through specially annotated lexical entries for the relevant verbs, I propose that a pragmatic account is preferable on empirical and conceptual grounds.

Several recent studies trace differences among path and manner languages to their preposition inventories (Cummins 1998, Jones 1983, Song 1997, Son 2007). English is more permissive in the expression of directed motion because the preposition to allows goals to be semantically composed with manner of motion verbs. In contrast, Romance languages lack a dedicated goal preposition; the preposition a, though often glossed `to' in translations of motion event descriptions, is inherently locative and best glossed `at'. Thus, this preposition alone is typically unable to predicate a result location of manner of motion verbs---the hallmark of a manner language.

The problematic Romance examples, then, are better described as having locative PPs with a directional interpretation. Interestingly, precisely this phenomenon is observed in English and some other Germanic manner languages (Gehrke 2008, Nikitina 2008, Thomas 2004, Tungseth 2008), as in Jill quickly walked in the kitchen. Nikitina (2008) argues that such interpretations arise when adequate pragmatic contextual support is available. Drawing on corpus studies (Baicchi 2005, Kopecka 2009, Martinez Vazquez 2001), I extend this pragmatic explanation to Romance.

The pragmatic account has two benefits. It illuminates observed lexical restrictions on the directional interpretation of locative PPs, including the `squishiness' of the phenomenon in both path and manner languages, which makes lexical diacritic approaches ultimately infeasible. It also assumes the null hypothesis: English and Romance are no different in their verb meanings, and, specifically, all Romance manner of motion verbs simply lexicalize manner.


John Kingston: When does what you know affect what you hear?

What listeners know about their native language can be used to fill in content that isn’t reliably transmitted, and it distorts the perception of foreign sounds. So there is no doubt about whether listeners’ linguistic knowledge affects what they hear (it does), but there is disagreement about when and how that knowledge is applied. In interactive models (McClelland & Elman, 1986), linguistic knowledge applies to all stages in processing as soon as it’s available, while in autonomous models (Norris, McQueen, & Cutler, 2000), only the auditory qualities evoked by the speech signal influence its initial processing. In this talk, I present the results of three experiments that support autonomy. The first showed that two sounds were discriminated no better when they formed part of a word-nonword continuum than when they instead formed part of a word-word continuum. The second showed that an early event-related potential is as robust to phonotactically illegal [dl] followed by phonotactically legal [gl] as to phonotactically legal [dw] followed by phonotactically legal [gw] (cf. Dupoux, Kakehi, Hirose, Pallier, & Mehler, 1999; Dehaene-Lambertz, Dupoux, & Gout, 2000). The third showed that English listeners discriminate the input to place assimilation, [db], from its output, [bb], far less well than a sequence which is not a possible input to place assimilation, [gb], from its ostensible output, [bb], and likewise far less well than French listeners, whose language lacks a place assimilation rule (cf. Darcy, Ramus, Christophe, Kinzler, & Dupoux, in press). However, this difference disappeared when the format of the discrimination task was changed from same-different to four-interval same-different.
The results of all three experiments disconfirm the positive predictions of an interactive model, while the results of the second and third confirm a positive prediction of the autonomous alternative, that listeners or their brains can discriminate sounds that their knowledge renders non-distinct. The answer to the question posed above is, “Later.”

References

Darcy, I., Ramus, F., Christophe, A., Kinzler, K., & Dupoux, E. (in press). Phonological knowledge in compensation for native and non-native assimilation. In F. Kügler, C. Féry, & R. van de Vijver (Eds.), Variation and gradience in phonetics and phonology. Berlin: Mouton De Gruyter.

Dehaene-Lambertz, G., Dupoux, E., & Gout, A. (2000). Electrophysiological correlates of phonological processing: A cross linguistic study. Journal of Cognitive Neuroscience, 12, 635-647.

Dupoux, E., Kakehi, K., Hirose, Y., Pallier, C., & Mehler, J. (1999). Epenthetic vowels in Japanese: A perceptual illusion? Journal of Experimental Psychology: Human Perception and Performance, 25, 1568-1578.

McClelland, J., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1-86.

Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23, 299-370.


Adam Albright: Modeling superadditive markedness interactions

A design feature of both rule-based and constraint-based models of phonology is that processes apply independently of one another: for example, final consonant clusters with disagreeing voicing are always banned in English, causing voicing alternations regardless of whether the word has a simple or complex onset (`caps' /kæp+z/, `claps' /klæp+z/ → [kæps], [klæps]), a round vowel (`copes' /koʊp+z/ → [koʊps]), or any other marked structure. By forcing rules or constraints to apply independently, we exclude the possibility of `superadditive' effects in which the well-formedness of a structure depends on the presence of another structure. In this talk, I argue that when we move beyond alternations and turn to static phonotactics, superadditive effects do seem to occur. For example, English allows words beginning with /bl-/ and /gl-/ clusters, as well as words ending in /-sp/ and /-sk/ clusters, but there are no words with both together (*blesk, *glisp). As it turns out, the rarity or lack of such combinations cannot be predicted from the independent frequencies of /bl-/, /-sp/, etc. I discuss several sources of evidence for superadditive effects: child errors in the acquisition of Dutch syllable structure, underattestation of doubly marked forms in the lexicons of English and Dutch, and acceptability ratings of nonce words in English. In all three cases, it appears that rare or marginal structures become worse in the presence of other rare or marginal structures. Crucially, however, not all combinations are penalized in this way. Relatively common combinations, such as /kr-/ and /-st/, co-occur about as often as expected (crust, crest, etc.), and do not show superadditive effects.
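
The comparison between attested combinations and what the independent frequencies would predict can be illustrated with a small sketch. It is not taken from the talk: the word list is a hypothetical toy lexicon, spelling stands in for phonemic transcription (so "cr" plays the role of /kr-/), and the function name `observed_vs_expected` is invented for the example. Under independence, the expected number of words with both structures is the product of the two individual counts divided by the lexicon size.

```python
# Toy lexicon (hypothetical; spelling is used as a stand-in for transcription).
LEXICON = [
    "blue", "black", "bless", "glad", "glow", "glass",
    "crisp", "wasp", "desk", "task",
    "crust", "rest", "crop", "cream", "list", "mast",
    "map", "dog", "sun", "tree",
]

def observed_vs_expected(lexicon, onsets, codas):
    """Return (observed, expected) counts of words with both structures.

    `expected` assumes the onset and the coda are chosen independently.
    """
    n = len(lexicon)
    n_onset = sum(1 for w in lexicon if w.startswith(onsets))
    n_coda = sum(1 for w in lexicon if w.endswith(codas))
    observed = sum(
        1 for w in lexicon if w.startswith(onsets) and w.endswith(codas)
    )
    expected = n_onset * n_coda / n
    return observed, expected

# Marked onset plus marked coda: bl-/gl- with -sp/-sk.
marked = observed_vs_expected(LEXICON, ("bl", "gl"), ("sp", "sk"))
# Common onset plus common coda: cr- with -st.
common = observed_vs_expected(LEXICON, ("cr",), ("st",))

print("bl-/gl- with -sp/-sk:", marked)
print("cr- with -st:", common)
```

In this toy list the doubly marked combination is unattested although independence predicts about one such word, while the common combination occurs roughly as often as predicted. The numbers only illustrate the calculation and say nothing about the actual English lexicon.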

The challenge, then, is to provide a computational model that penalizes some constraint violations more in the presence of another violation, while allowing others to remain independent. I propose a model in which acceptability judgments arise through a combination of two levels of evaluation: (1) a non-grammatical evaluation of phonotactic probability, which assesses the joint probability of the substrings in a word, and (2) evaluation by a grammar of weighted constraints, further penalizing sequences that violate high-weighted constraints. For grammatically licit combinations such as /kr-/ and /-st/, acceptability is determined by simple joint probability. For grammatically penalized clusters such as /bl-/ or /-sk/, phonotactic probability and grammatical probability combine to yield superadditive effects. I sketch a model in which learners factor out phonotactic probability in learning weights of grammatical constraints.
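
The two levels of evaluation can be made concrete with a minimal sketch. It rests on assumptions that the abstract does not state: the joint probability is computed as a product of independent part probabilities, the grammar contributes a summed weight for each violated constraint, and the two levels are combined by subtracting the penalty from the log probability. All probabilities, weights, and function names below are hypothetical, and the learning procedure is not modeled.

```python
import math

# Hypothetical probabilities of word parts (not estimated from any corpus).
PART_PROB = {"kr-": 0.04, "-st": 0.05, "bl-": 0.02, "-sk": 0.01}

# Hypothetical constraint weights; only the marked clusters are penalized.
CONSTRAINT_WEIGHT = {"bl-": 1.5, "-sk": 2.0}

def phonotactic_log_prob(parts):
    """Level 1: log joint probability, assuming the parts are independent."""
    return sum(math.log(PART_PROB[p]) for p in parts)

def grammar_penalty(parts):
    """Level 2: summed weights of the constraints that the parts violate."""
    return sum(CONSTRAINT_WEIGHT.get(p, 0.0) for p in parts)

def acceptability(parts):
    """Assumed combination: log probability minus the grammatical penalty."""
    return phonotactic_log_prob(parts) - grammar_penalty(parts)

licit = ["kr-", "-st"]    # no grammatical penalty applies
marked = ["bl-", "-sk"]   # both parts are penalized by the grammar

print("licit :", acceptability(licit))
print("marked:", acceptability(marked))
```

For the licit combination the score equals the log joint probability alone, whereas the marked combination is lowered both by its smaller probability and by the grammatical penalty. Whether this particular way of combining the two levels matches the model presented in the talk is an open assumption of the sketch.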