Linguistic generalization on the basis of function and constraints on the basis of statistical preemption. Florent Perek & Adele E. Goldberg. to appear. Cognition.
How do people learn to use language in creative but constrained ways? Experiment 1 investigates linguistic creativity by exposing adult participants to two novel word order constructions that differ in terms of their semantics: One construction exclusively describes actions that have a strong effect; the other construction describes actions with a weaker but otherwise similar effect. One group of participants witnessed novel verbs only appearing in one construction or the other, while another group witnessed a minority of verbs alternating between constructions. Subsequent production and judgment results demonstrate that participants in both conditions extended and accepted verbs in whichever construction best described the intended message. Unlike related previous work, this finding is not naturally attributable to prior knowledge of the likely division of labor between verbs and constructions. In order to investigate how speakers learn to constrain generalizations, Experiment 2 includes one verb (out of 6) that was witnessed in a single construction to describe both strong and weak effects, essentially statistically preempting the use of the other construction. In this case, participants were much more lexically conservative with this verb and other verbs, while they nonetheless displayed an appreciation of the distinct semantics of the constructions with new novel verbs. Results indicate that the need to better express an intended message encourages generalization, while statistical preemption constrains generalization by providing evidence that verbs are restricted in their distribution.
Children probability boost novel classifiers in production but not comprehension. Jessica F. Schwab, Casey Lew-Williams, Adele E. Goldberg. Submitted.
We examined how learners generalize novel gender classifiers in production (6-year-olds and adults) and comprehension (6-year-olds). Participants were exposed to two novel classifiers whose distribution probabilistically correlated with natural gender. One classifier was witnessed twice as frequently as the other. Adults readily recognized the gender-based distinction, and generalized the classifiers appropriately to new male and female puppets. Children, however, displayed no awareness of the gender-based regularity, although they did show a tendency to correctly produce familiar combinations of classifier + noun, and they were even more accurate at judging familiar classifier + noun combinations as preferable to new combinations. Of special interest was that 6-year-olds showed evidence of boosting the probability of one or the other classifier in the production task, but children at this age displayed no preference for either classifier in a judgment task. Thus, children’s probability maximizing may stem from the high task demands of production, rather than reflecting an overall tendency to generalize or follow a rule.
Modeling the Partial Productivity of Constructions. Libby Barak, Adele E. Goldberg. American Association for Artificial Intelligence Spring Symposium (AAAI), 2017. [pdf]
People regularly produce novel sentences that sound native-like (e.g., She googled us the information), while they also recognize that other novel sentences sound odd, even though they are interpretable (e.g., ?She explained us the information). This work offers a Bayesian, incremental model that learns clusters that correspond to grammatical constructions of different type and token frequencies. Without specifying in advance the number of constructions, their semantic contributions, or whether any two constructions compete with one another, the model successfully generalizes when appropriate while identifying and suggesting an alternative when faced with overgeneralization errors. Results are consistent with recent psycholinguistic work that demonstrates that the existence of competing alternatives and the frequencies of those alternatives play a key role in the partial productivity of grammatical constructions. The model also goes beyond the psycholinguistic work in that it investigates a role for constructions’ overall frequency.
The Blowfish effect: subordinate categories are inferred from atypical exemplars of a basic level category. Adele E. Goldberg, Lauren L. Emberson, Isaac N. Treves, submitted.
There is a widespread belief that a novel word for a single object will be interpreted by language learners as a descriptor at the basic level. However, the present studies demonstrate that if a novel label is applied to an atypical exemplar (e.g., a blowfish) of its basic level category (fish), learners are likely to assume a subordinate level interpretation. This “blowfish” effect is shown to be independent of, and as strong as, the “suspicious coincidence” effect, which is replicated (Experiment 1). It also holds regardless of whether the single exemplar is selected intentionally or randomly (Experiment 2).
Experiments test whether sequential vs. simultaneous presentation (1) or identical exemplars (2) modulates the “suspicious coincidence” effect reported by Xu & Tenenbaum (2007). Adele E. Goldberg, Lauren L. Emberson, Isaac N. Treves. Unpublished, but posted on Psych FileDrawer. [pdf]
We replicate Xu & Tenenbaum’s (2007) “suspicious coincidence” effect, regardless of whether three exemplars are presented sequentially or simultaneously, or whether the exemplars are identical to one another or distinct. Our replication of Xu & Tenenbaum is a partial failure to replicate Spencer et al. (2011). Our design differs from previous ones in two ways. First, ours was massively between subjects, using participants on Mechanical Turk, in order to avoid possible effects across trials: each of 511 participants witnessed a single trial. Second, we used instances of categories that were distinct from those of either previous study (dog, fish, flower, bird vs. dog, truck, pepper). Our work does not investigate generalization to the higher, superordinate level, as generalizations to that level on the basis of a single exemplar are uncontroversially rare.
Natural language acquisition relies on appropriate generalization: the ability to produce novel sentences, while learning to restrict productions to acceptable forms in the language. Psycholinguists have proposed various properties that might play a role in guiding appropriate generalizations, looking at learning of verb alternations as a testbed. Several computational cognitive models have explored aspects of this phenomenon, but their results are hard to compare given the high variability in the linguistic properties represented in their input. In this paper, we directly compare two recent approaches, a Bayesian model and a connectionist model, in their ability to replicate human judgments of appropriate generalizations. We find that the Bayesian model more accurately mimics the judgments due to its richer learning mechanism that can exploit distributional properties of the input in a manner consistent with human behaviour.
Subtle Implicit Language Facts Emerge from the Functions of Constructions. Adele E. Goldberg. 2016. Frontiers in Psychology 6. doi: 10.3389/fpsyg.2015.02019 [web-link]
Much has been written about the unlikelihood of innate, syntax-specific, universal knowledge of language (Universal Grammar) on the grounds that it is biologically implausible, unresponsive to cross-linguistic facts, theoretically inelegant, and implausible and unnecessary from the perspective of language acquisition. While relevant, much of this discussion fails to address the sorts of facts that generative linguists often take as evidence in favor of the Universal Grammar Hypothesis: subtle, intricate, knowledge about language that speakers implicitly know without being taught. This paper revisits a few often-cited such cases and argues that, although the facts are sometimes even more complex and subtle than is generally appreciated, appeals to Universal Grammar fail to explain the phenomena. Instead, such facts are strongly motivated by the functions of the constructions involved. The following specific cases are discussed: (a) the distribution and interpretation of anaphoric one, (b) constraints on long-distance dependencies, (c) subject-auxiliary inversion, and (d) cross-linguistic linking generalizations between semantics and syntax.
Ellipsis by constructions. Adele E. Goldberg and Florent Perek. In Handbook of Ellipsis. Edited by Jeroen van Craenenbroeck & Tanja Temmerman. Oxford University Press:
The existence of elliptical constructions in languages is motivated by the Gricean preference to avoid saying more than is necessary. We suggest interpretation is recovered by a semantic pointer function that is independently motivated and consistent with psycholinguistic evidence. This article reviews evidence that ellipsis in language is best understood as licensed by particular constructions, each with its own form and functional properties. Our account of gapping is quite general in that it includes cases of traditional “argument cluster coordination”; it is at the same time more restrictive than other accounts in including a constraint on register. We compare gapping and other ellipsis constructions in French and English with an emphasis on their differences. Finally, we discuss derivational approaches to ellipsis, and conclude that a constructionist approach is more promising than a single, over-arching rule-based approach since it is in a better position to capture distinctions as well as commonalities among ellipsis constructions, within and across languages.
Conventional metaphorical sentences such as She’s a sweet child have been found to elicit greater amygdala activation than matched literal sentences (e.g., She’s a kind child). In the present fMRI study, this finding is strengthened and extended with naturalistic stimuli involving longer passages and a range of conventional metaphors. In particular, a greater number of activation peaks (four) were found in the bilateral amygdala when passages containing conventional metaphors were read than when their matched literal versions were read (a single peak); while the direct contrast between metaphorical and literal passages did not show significant amygdala activation, a parametric analysis revealed that BOLD signal changes in the left amygdala correlated with an increase in metaphoricity ratings across all stories. Moreover, while a measure of complexity was positively correlated with increase in activation of a broad bilateral network mainly involving the temporal lobes, complexity was not predictive of amygdala activity. Thus, the results suggest that amygdala activation is not simply a result of stronger overall activity related to language comprehension, but is more specific to the processing of metaphorical language.
Neural systems involved in processing novel linguistic constructions and their visual referents. Matthew A. Johnson, Nick Turk-Browne, and Adele E. Goldberg. 2015. Language, Cognition, and Neuroscience:
In language, abstract phrasal patterns provide an important source of meaning, but little is known about whether or how such constructions are used to predict upcoming visual scenes. Findings from two fMRI studies indicate that initial exposure to a novel construction allows its semantics to be used for such predictions. Specifically, greater activity in the ventral striatum, a region sensitive to prediction errors, was linked to worse overall comprehension of a novel construction. Moreover, activity in occipital cortex was attenuated when a visual event could be inferred from a learned construction, which may reflect predictive coding of the event. These effects disappeared when predictions were unlikely: that is, when phrases provided no additional information about visual events. These findings support the idea that learners create and evaluate predictions about new instances during comprehension of novel linguistic constructions.
Judgment evidence for statistical preemption: It is relatively better to vanish than to disappear a rabbit, but a lifeguard can equally well backstroke or swim children to shore. Clarice Robenalt & Adele E. Goldberg. 2015. Cognitive Linguistics. 26.3: 467-504. [pdf]
How do speakers know when they can use language creatively and when they cannot? Prior research indicates that higher frequency verbs are more resistant to overgeneralization than lower frequency verbs with similar meaning and argument structure constraints. This result has been interpreted as evidence for conservatism via entrenchment, which proposes that people prefer to use verbs in ways they have heard before, with the strength of dispreference for novel uses increasing with overall verb frequency. This paper investigates whether verb frequency is actually always relevant in judging the acceptability of novel sentences or whether it only matters when there is a readily available alternative way to express the intended message with the chosen verb, as is predicted by statistical preemption. Two experiments are reported in which participants rated novel uses of high and low frequency verbs in argument structure constructions in which those verbs do not normally appear. Separate norming studies were used to divide the sentences into those with and without an agreed-upon preferred alternative phrasing which would compete with the novel use for acceptability. Experiment 2 controls for construction type: all target stimuli are instances of the caused-motion construction. In both experiments, we replicate the stronger dispreference for a novel use with a high frequency verb relative to its lower frequency counterpart, but only for those sentences for which there exists a competing alternative phrasing. When there is no consensus about a preferred way to phrase a sentence, verb frequency is not a predictive factor in sentences’ ratings. We interpret this to mean that while speakers prefer familiar formulations to novel ones, they are willing to extend verbs creatively if there is no readily available alternative way to express the intended meaning.
L2 learners do not take competing alternative expressions into account the way L1 learners do. Clarice Robenalt & Adele E. Goldberg. 2015. Language Learning: [pdf]
The present study replicates the findings in Robenalt & Goldberg (2015, Cognitive Linguistics; immediately above) with a group of native speakers and critically extends the paradigm to non-native speakers. Recent findings in second language acquisition suggest that second language (L2) learners are less able to generate online expectations during language processing, which in turn predicts a reduced ability to differentiate between novel sentences that have a competing alternative and those that do not. We test this prediction and confirm that while L2 speakers display evidence of learning from positive exemplars, they show no evidence of taking competing grammatical alternatives into account, except at the highest quartile of speaking proficiency, in which case L2 judgments align with those of native speakers.
A-adjectives, statistical preemption, and the evidence: Reply to Yang (2015). Adele E. Goldberg and Jeremy K. Boyd. 2015. Language 91 11: 184-197. [pdf]
A certain class of English adjectives known as a-adjectives resists appearing attributively as prenominal modifiers (e.g., ??the afraid boy, ??the asleep man). Boyd & Goldberg (2011) had offered experimental evidence suggesting that the dispreference is learnable on the basis of categorization and statistical preemption: repeatedly witnessing predicative formulations in contexts in which the attributive form would otherwise be appropriate. The present paper addresses Yang’s (2015) counterproposal for how a-adjectives are learned, and his instructive critique of statistical preemption. The counterproposal is that children receive evidence that a-adjectives behave like locative particles in occurring with certain adverbs such as far and right. However, in an analysis of the 450 million word COCA corpus, the suggested adverbial evidence is virtually non-existent (e.g., *far alive; *straight afraid). In fact, these adverbs occur much more frequently with typical adjectives (e.g., far greater, straight alphabetical). Furthermore, relating a-adjectives to locative particles does not provide evidence of the restriction, because locative particles themselves can appear as prenominal modifiers (the down payment, the outside world). The critique of statistical preemption is based on a 4.3 million word corpus analysis of child directed speech that suggests that children cannot amass the requisite evidence before they are three years old. While we clarify which sorts of data are relevant to statistical preemption, we concur that the required data is relatively sparsely represented in the input. In fact, recent evidence suggests that children are not actually cognizant of the restriction until they are roughly ten years old, an indication that input of an order of magnitude more than 4.3 million words may be required. We conclude that a combination of categorization and statistical preemption is consistent with the available evidence of how the restriction on a-adjectives is learned.
One among many: anaphoric one and its relationship to numeral one. Adele E. Goldberg & Laura A. Michaelis. 2016. Cognitive Science: [pdf]
One anaphora (e.g., She has a better one) has been used as a key diagnostic in syntactic analyses of the English noun phrase, and ‘one-replacement’ has also figured prominently in debates about the learnability of language. However, much of this work has been based on faulty premises, as a few perceptive researchers, including Ray Jackendoff, have made clear. Abandoning the view of anaphoric one (a-one) as a form of syntactic replacement allows us to take a fresh look at various uses of the word one. In the present work, we investigate its use as a cardinal number (1-one) in order to better understand its anaphoric use. Like all cardinal numbers, 1-one can only quantify an individuated entity and provides an indefinite reading by default. Owing to unique combinatoric properties, cardinal numbers defy consistent classification as determiners, quantifiers, adjectives or nouns. Once the semantics and distribution of cardinal numbers including 1-one are appreciated, many properties of a-one follow with minimal stipulation. We claim that 1-one and a-one are distinct but very closely related lexemes. When 1-one appears without a noun (e.g., Take one), it is nearly indistinguishable from a-one (e.g., take one)—the only differences being interpretive (1-one foregrounds its cardinality while a-one does not) and prosodic (presence versus absence of primary accent). While we ultimately argue that a family of constructions is required to describe the full range of syntactic contexts in which one appears, the proposed network accounts for properties of a-one by allowing it to share (inherit) most of its syntactic and interpretive constraints from its historical predecessor, 1-one.
Generalizing beyond the input: the functions of the constructions matter. Florent Perek & Adele E. Goldberg. 2015. Journal of Memory and Language 84: 109-127.
A growing emphasis on statistics in language learning raises the question of whether learning a language consists wholly in extracting statistical regularities from the input. In this paper we explore the hypothesis that the functions of learned constructions can lead learners to use language in ways that go beyond the statistical regularities that have been witnessed. The present work exposes adults to two novel word order constructions that differ in terms of their functions: one construction but not the other was exclusively used with pronoun undergoers. In Experiment 1, participants in a lexicalist condition witnessed three novel verbs used exclusively in one construction and three exclusively in the other construction; a distinct group, the alternating condition, witnessed two verbs occurring in both constructions and two other verbs in each of the constructions exclusively. Production and judgment results demonstrate that participants in the alternating condition accepted all verbs in whichever construction was more appropriate, even though they had seen just two out of six verbs alternating. The lexicalist group was somewhat less productive, but even they displayed a tendency to extend verbs to new uses. Thus participants tended to generalize the constructions for use in appropriate discourse contexts, ignoring evidence of verb-specific behavior, especially when even a minority of verbs were witnessed alternating. A second experiment demonstrated that participants’ behavior was not likely due to an inability to learn which verbs had occurred in which constructions. Our results suggest that construction learning involves an interaction of witnessed usage together with the functions of the constructions involved.
Tuning in to the verb-particle construction in English. Adele E. Goldberg. to appear. In Léa Nash and Pollet Samvelian (eds.) Approaches to Complex Predicates:
This work investigates English verb-particle combinations (e.g., put on) and argues that item-specific and general information are needed and should be related within a default inheritance hierarchy. When verb-particle combinations appear within verb phrases, a tripartite phrasal syntax is defended, whether or not the V and P are adjacent (e.g., She put on the wrong shoes; she put the wrong shoes on). The < V NP P > order is motivated as the default word order by explicitly relating a verb-particle construction to the caused-motion construction (e.g., she put the shoes on her feet). Well-known and independently needed processing considerations related to complement length, information status, and semantics motivate system-wide generalizations that can serve to override the default word order. Lexical verb-particle combinations (e.g., a pickup truck; a showdown) and an idiomatic case, V-off, are also briefly discussed as providing further evidence for the need for both item-specific and more general constructions.
Compositionality. Adele E. Goldberg. 2016. In N. Riemer (ed.) Routledge Handbook of Semantics. 419-430.
How do people glean meaning from language? A Principle of Compositionality is generally understood to entail that the meaning of every expression in a language must be a function of the meaning of its immediate constituents and the syntactic rule used to combine them. This paper explores perspectives that range from acceptance of the principle as a truism, to rejection of the principle as false. Controversy has arisen based on the role of extra-constituent linguistic meaning (idioms; certain cases of paradigmatic morphology; constructional meanings; intonation), and context (e.g., metonymy; the resolution of ambiguity and vagueness).