The Interaction of Syntactic and Lexical Information Sources in Language Processing

  • ABSTRACT

    This paper reports the results of a lexical decision experiment and a self-paced reading experiment that investigate the interaction between syntactic and lexical information in on-line language processing, using the noun-verb ambiguity in English. The results of both experiments support the hypothesis that syntactic and lexical information are two independent factors in the process of sentence comprehension, consistent with previous work in the sense-ambiguity processing literature. Our results therefore add to the body of literature demonstrating that the process of language comprehension is guided by numerous independent information sources, rather than by syntactic information alone, as some of the earlier proposals in the field of sentence processing hypothesized.


  • KEYWORD

    syntax, lexical processing, ambiguity resolution, top-down and bottom-up processes in language understanding

  • 1. Introduction

    Research in the language comprehension literature has demonstrated that people use a variety of information sources in the moment-by-moment comprehension of sentences, including lexical information, syntactic structure, the plausibility of the described events, the discourse context, and – for auditory input – prosody (Tanenhaus & Trueswell, 1995; Gibson & Pearlmutter, 1998). An important architectural question in this literature is whether all or only some of these information sources can guide the process of sentence comprehension, where guiding processing means operating independent of other information sources to determine the range of possible interpretations, as opposed to deciding between or among the interpretations suggested by other information sources. An early hypothesis in the literature was that only syntactic information can guide the process of sentence comprehension (Fodor, Bever & Garrett, 1974; Frazier, 1978; Frazier & Fodor, 1978; Fodor, 1983), with information sources such as plausibility and context being used to decide among the syntactic alternatives (e.g., Crain & Steedman, 1985). However, subsequent research has established that information sources other than syntax are available to comprehenders as early as can be measured (e.g., MacDonald, Pearlmutter & Seidenberg, 1994; Trueswell, Tanenhaus & Garnsey, 1994; Tanenhaus et al., 1995) and may in fact guide the process of sentence comprehension (see also Tyler & Marslen-Wilson, 1977 and Marslen-Wilson & Tyler, 1987). In particular, semantic information and the plausibility of the described events (Kuperberg et al., 2003; Kim & Osterhout, 2005) and the local discourse context (Altmann & Kamide, 1999; Grodner, Gibson & Watson, 2005) have been shown to guide sentence processing, independent of syntactic information. The focus of the current paper is to investigate whether lexical information guides sentence comprehension, independent of syntactic information.
Another way to frame this question is in terms of whether syntactic context constrains lexical access: if so, then lexical access is not independent of syntactic processing; if not, then lexical access proceeds independently of the preceding syntactic context.

    Research from the lexical access literature has established that lexical processing proceeds independent of the existing semantic context. Two paradigms have provided the evidence in support of this claim: (a) a semantic priming paradigm, where participants process a sentence and perform a lexical decision task on a word that may be related to one of the words in the sentence; and (b) a reading paradigm, where reaction time is measured for each word in the sentence. Studies using the semantic priming paradigm have revealed that for equi-biased sense-ambiguous words (e.g., for a word like bug, which is ambiguous between insect and recording device senses), both interpretations are initially accessed, independent of the semantic context (Swinney, 1979; Tanenhaus, Leiman & Seidenberg, 1979; Seidenberg et al., 1982). Furthermore, when the context is biased toward the lower-frequency interpretation of an ambiguous word, both the low- and high-frequency interpretations are accessed, but when the context is biased toward the higher-frequency interpretation, only the high-frequency interpretation is accessed (Tabossi, Colombo, & Job, 1987; Tabossi, 1988). These results suggest that semantic context and lexical frequency are two independent sources of information that the processor can use in comprehending words: when one or both constraints support the access of a lexical entry, then access of that lexical entry occurs. When no constraints support a particular lexical entry (e.g., in the case of a low-frequency reading occurring in an unsupportive context), there is no evidence of it being accessed.

    In the reading paradigm, when the context is neutral with regard to the different meanings of an equi-biased sense-ambiguous word, it takes comprehenders longer to process the word, compared to a biased ambiguous word or an unambiguous control word (Duffy, Morris & Rayner, 1988; Rayner, Pacht & Duffy, 1994; Binder & Rayner, 1998; Binder, 2003). This has been explained in terms of competition between the two equally available meanings of the balanced word (e.g., Duffy et al., 1988). Furthermore, when the context is consistent with the subordinate meaning of a biased sense-ambiguous word, it takes comprehenders longer to process the biased ambiguous word, compared to unambiguous controls. This phenomenon has been termed the subordinate bias effect (Rayner et al., 1994), and has been argued to reflect competition between the two meanings of the word, where the dominant meaning is highly available due to its frequency, and the subordinate meaning is made available by the contextual information. These results dovetail nicely with the evidence from the semantic priming paradigm: both sets of results suggest that lexical frequency and semantic context are two independent sources of information that the processor uses in comprehending words in a sentence.

    There has been less research investigating lexical access in different syntactic contexts. Most of the evidence relevant to this question has come from the semantic priming paradigm. In particular, it has been shown that for equi-biased words that are ambiguous with respect to their syntactic category (i.e., words that are ambiguous between a noun and a verb like rose or watch), both interpretations are initially accessed, independent of the syntactic context (Tanenhaus et al., 1979; Seidenberg et al., 1982). Much like the results discussed above had suggested that lexical processing occurs independent of semantic context, this result suggests that lexical processing occurs independent of syntactic context.

    There is little evidence from the literature manipulating lexical frequencies of the different meanings of category-ambiguous words and syntactic context using a reading method. In an early paper investigating these constraints in reading, Frazier & Rayner (1987) examined noun-verb ambiguous words like trains in sentence contexts that were initially consistent with either reading (e.g., The desert trains …) and then were disambiguated towards one syntactic category or the other. It was observed that the ambiguous word was processed faster than its unambiguous control, contrary to the predictions of the independent constraints hypothesis, which would predict slower processing (due to competition), all other factors being equal. However, MacDonald (1993) demonstrated that the relatively fast times for reading the ambiguous words in Frazier & Rayner’s experiment were more likely due to the fact that Frazier & Rayner’s unambiguous control conditions were pragmatically odd in a null context, because of the presence of a demonstrative determiner (e.g., these desert trains …; this desert trains …). MacDonald then showed that when more suitable unambiguous control sentences were used, the temporarily ambiguous words were not processed any faster. MacDonald further argued that a lexically-based processing theory could account for the observed pattern of results. However, MacDonald’s experiments did not evaluate whether lexical information was independent of syntactic context.

    Recently, more direct tests of whether syntactic and lexical constraints are independent have been conducted using a reading method. In one set of eye-tracking studies, Boland & Blodgett (2001) manipulated the syntactic context to be biased to expect either a noun (e.g., She saw his …), or a verb (e.g., She saw him …), for sixteen words that were ambiguous between noun and verb readings (e.g., duck, play), with the bias toward one of the categories varying continuously.1 Boland & Blodgett observed a significant correlation between lexical bias and initial fixation times in the conditions where the syntactic context created an expectation for a noun (more verb-biased items took longer to process), and a marginal correlation in the conditions where the syntactic context created an expectation for a verb (more noun-biased items took longer to process). This evidence suggests that lexical information affects processing difficulty even in cases where syntactic information provides a strong cue to the correct interpretation. However, there were three issues in Boland & Blodgett’s experiments that weaken this interpretation: (a) a relatively small number of noun-verb ambiguous items (sixteen) was included; (b) the result was only reliable in the noun contexts; and (c) two of the sixteen items had to be removed in order for the correlations to be significant (Boland & Blodgett justified the removal of these two items by arguing that the two measures of establishing lexical bias – corpus counts and sentence completion norms – were inconsistent for these items). The generalizability of these results is therefore questionable.

    In another set of eye-tracking reading studies, Folk & Morris (2003) failed to find a subordinate bias effect for verb-biased noun-verb ambiguous words in noun contexts in early eye-tracking measures (although they did find such an effect in second pass times). As in the case of Boland & Blodgett’s studies, however, there were issues with Folk & Morris’s experiments that make interpretation of their results difficult. In particular, in their first experiment, Folk & Morris used rich syntactic and semantic contexts (e.g., Biking through Utah, the cyclist lost a spoke (vs. unambiguous control jacket) in the mountains). Given that not only syntactic information but also semantic information in the preceding context points toward the subordinate reading of the ambiguous word, it is perhaps unsurprising that lexical information alone is not strong enough to override these two information sources. In their second experiment, Folk & Morris used more minimal syntactic contexts (similar to Experiment 2 in the current paper). However, in that experiment the lexical bias of the critical noun-verb ambiguous words may not have been sufficiently strong (the mean bias was .63, with a range of .50-.68; cf. a bias of .83 for verb-biased words and .94 for noun-biased words in Experiment 2 in the current paper). In summary, the lack of a subordinate bias effect in Folk & Morris’s studies may have been due to (a) their use of overly rich semantic contexts in their first experiment; and (b) weak lexical biases in their second experiment. Consequently, their studies are still consistent with the independence of lexical and syntactic constraints.

    More recently, a subordinate bias effect has been observed with respect to a category-ambiguous word: the word that, which is ambiguous between a determiner and a complementizer, and is strongly biased in written text toward the complementizer reading (Tabor, Juliano & Tanenhaus, 1997; Gibson, 2006). Tabor et al. (1997) and Gibson (2006) examined the processing of this word in syntactic environments that are inconsistent with the complementizer interpretation, as in (1a) and (2a):

    Both research groups found that the region consisting of that and the following two words (that skilled surgeon in (1a) and (2a)) was processed more slowly than the corresponding region in the control sentences (those skilled surgeons in (1b) and (2b)). One plausible explanation for these results is in terms of the subordinate bias effect: the complementizer reading of the word that is available due to its frequency, and the determiner reading is made available by the syntactic context, leading to competition between the two interpretations, and consequently slower reading times than in the control cases, where no such competition occurs.2 This evidence is therefore consistent with the hypothesis whereby lexical information and syntactic information are independent factors in on-line sentence comprehension. However, a limitation of these studies is that they are based on reading time data for the ambiguity of a single word, the word that.

    Finally, a recent event-related potentials (ERP) study on category-ambiguous words provides further evidence for the independence of lexical and syntactic constraints. In particular, Thierry et al. (2008) investigated a phenomenon where a word is used in a syntactic context that is inconsistent with the word’s dominant category meaning. Such conversion of one part of speech into another (what the authors call “functional shift”) is a common literary device. Thierry et al. focused on materials from Shakespeare’s writings where this device is used extensively. For example, one item from Thierry et al.’s study was I know you don’t want to speak, but lip something loving in my ear, where the word lip – most frequently used as a noun – is used as a verb, to mean whisper/speak softly. In their functional shift condition, Thierry et al. observed two components that have been argued to reflect syntactic processes: a Left Anterior Negativity (LAN) and a P600 component, a positive-going waveform peaking around 600 ms after the onset of the critical word (see e.g., Kaan, 2007, for an overview). These results suggest that comprehenders access the syntactic-context-inappropriate lexical meaning (e.g., the noun reading of lip in the example above), which leads to difficulty in integrating the word with the preceding syntactic context.

    To summarize the results from the literature reported above: (1) multiple meanings are activated during lexical access under most circumstances (except for accessing the low-frequency meaning of a word that appears in a context that supports its high-frequency meaning), (2) biased ambiguous words can cause processing difficulty when the context supports the low-frequency interpretation, resulting in a subordinate bias effect, (3) with respect to syntactic category ambiguities, the subordinate bias effect has been observed for the category-ambiguous word that (Tabor et al., 1997; Gibson, 2006), but (4) the evidence for a subordinate bias effect in other category ambiguities, such as the noun-verb ambiguity in English, is not as strong (e.g., Boland & Blodgett, 2001; cf. Folk & Morris, 2003). Thus, although previous literature strongly suggests that lexical and syntactic information are likely to guide interpretation independently, the existence of a subordinate bias effect in a productive syntactic category ambiguity has not yet been demonstrated convincingly.

    To address this gap in the literature, the current studies seek to examine the interaction of lexical frequency and syntactic context in sentence processing using a productive syntactic category ambiguity: the noun-verb ambiguity in English. These studies are thus similar to those of Boland & Blodgett and Folk & Morris, but with more items, more sophisticated analyses, and materials presented in minimal syntactic contexts without any preceding semantic / discourse context, in order to narrow in on the interaction of only lexical frequency and syntactic context. If lexical information guides interpretation independent of syntax, then we should observe a subordinate bias effect, much like that observed by Tabor et al. (1997) and Gibson (2006) with respect to the complementizer / determiner ambiguity of that, such that a word with a dominant noun interpretation is read slowly in a verb context, and a word with a dominant verb interpretation is read slowly in a noun context.

    To select a set of materials for our experiments, we initially conducted a study using meta-linguistic judgments (Norming Study 1). Subsequently, we used the materials selected based on the results of Norming Study 1 in an elicited production study (Norming Study 2) and in a corpus study. As will be discussed below, the results from all three studies were highly correlated (rs >.75), suggesting that the lexical biases in the materials used in Experiments 1 and 2 are reliable. We will now describe the methods and the results of the two norming studies and of the corpus study. We will then go on to describe the two experiments.

    1 Boland & Blodgett also varied the discourse context prior to the critical sentence, so that it was biased toward the noun or the verb reading. We focus on the conditions where the discourse context was consistent with the syntactic context, because the data patterns with regard to the relationship between lexical bias and syntactic context are easiest to interpret in these conditions.

    2 Tabor et al. discuss the processing slowdown in examples like (1a) in terms of context-dependent lexical access (similar to proposals by Swinney & Hakes, 1976; Cairns & Hsu, 1980; Carpenter & Daneman, 1981; Simpson, 1981), but other results from the literature on lexical access in context make this interpretation unlikely (Duffy et al., 1988; Rayner et al., 1994, among others). Furthermore, Gibson (2006) provides direct evidence against the context-dependent lexical access interpretation of this particular ambiguity.

    2. Norming Study 1

      >  Methods

    Four independent raters were presented with a list of 240 noun-verb ambiguous items (generated by two of the authors, Fedorenko and Gibson). Each rater was asked to provide a judgment for each word with regard to whether the word is more likely to be a noun, a verb, or equally likely to be both.3

    The goal of the study was to select three subsets of words which would be noun-biased, verb-biased and equi-biased.

      >  Results

    For each word, the number of noun-biased, verb-biased and equi-biased responses was counted. The following criteria were applied to select the three subsets of words. A word was considered noun-biased if it had 3 or 4 noun-biased responses and 0 verb-biased responses. Similarly, a word was considered verb-biased if it had 3 or 4 verb-biased responses and 0 noun-biased responses. A word was considered equi-biased if it had no more than 2 noun-biased responses and no more than 2 verb-biased responses. Based on these criteria, 60 words were selected for each bias group for a total of 180 items.
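    The selection criteria above can be sketched as a small function. This is only an illustration, assuming each of the four raters gives exactly one judgment per word; the function name and return labels are hypothetical. The text does not say how the final 60 words per group were chosen among those meeting a criterion, so the sketch does not attempt that step.

```python
from typing import Optional


def classify_bias(noun_votes: int, verb_votes: int, equi_votes: int) -> Optional[str]:
    """Apply the Norming Study 1 criteria to one word's rater counts.

    Returns a bias label, or None when the counts meet none of the criteria
    (for example 3 noun-biased responses and 1 verb-biased response).
    """
    # Assumption: four raters, one judgment each.
    if noun_votes + verb_votes + equi_votes != 4:
        raise ValueError("expected judgments from exactly four raters")
    if noun_votes >= 3 and verb_votes == 0:
        return "noun-biased"
    if verb_votes >= 3 and noun_votes == 0:
        return "verb-biased"
    if noun_votes <= 2 and verb_votes <= 2:
        return "equi-biased"
    return None
```

    Note that, as stated, the criteria leave some response patterns unclassified, which is why the function can return None.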

    3 In the current studies we focus on the syntactic category ambiguity of the noun-verb ambiguous words and do not take into consideration (a) the degree of semantic relatedness between the noun and the verb readings, or (b) the within-category sense ambiguities for the noun and/or the verb reading. Both of these factors may play a role in the relationship between syntax and lexical information. We decided to ignore differences among noun-verb ambiguous words along these dimensions for two reasons. First, we wanted to use a large set of items (an advantage over previous studies), and having additional constraints on the materials would have decreased the potential set of noun-verb ambiguous words. And second, many decisions about how related some meanings are to one another (either between the noun and the verb reading, or among the different noun or verb readings) are subjective, with the consequence that estimating frequencies for different senses separately is less straightforward than estimating the frequencies of the noun vs. the verb reading (the latter is straightforward because the contexts in which nouns and verbs are used are almost entirely non-overlapping).

    3. Norming Study 2

      >  Methods

    This study used an elicited production task to determine the bias of the ambiguous words. Fifty-six participants from MIT and the surrounding community took part in the study. All were between the ages of 18 and 40, native speakers of English and naive as to the purposes of the study. None had participated in Norming Study 1. All participants were paid for their participation.

    The 180 items selected in Norming Study 1 were used in this study. The list was divided in half, such that each participant only saw 90 critical items (30 noun-biased, 30 verb-biased and 30 equi-biased). This was done due to the time-consuming nature of the task (written sentence generation), such that the participants would not have to spend more than one hour on the experiment. 90 category-unambiguous fillers were included in addition to the critical items (these included 30 verbs, 20 prepositions, 20 adjectives and 20 adverbs; no unambiguous nouns were included because we reasoned that there is an a priori bias to treat single words as nouns in this sentence generation task). Four pseudo-random lists were created (two for each of the two halves of the target items) such that no more than two ambiguous words appeared in a row, and then four additional lists were created by reversing the order of these lists. Thus there were eight experimental lists. Seven participants saw each of the lists.

    Participants were instructed to create a short (4-7 words long) sentence with each of the words. They were told to write the first thing that came to their mind. They were also told that they could change the form of the words: for example, they could make a noun plural and put a verb in the past or future tense. The study took approximately one hour to complete.

      >  Results

    For each word the number of noun and verb uses was counted by hand-parsing the participants’ responses. 6.8% of the responses could not be coded as a noun or a verb use: either the word remained ambiguous or it was used in a different category (e.g., as an adjective). The analysis of the responses revealed a smooth distribution of the lexical bias across the items, as shown in Figure 1. A correlation analysis between the results of the two norming studies revealed a highly significant correlation (r=.82; F(1,178)=356.3; p<.001).

    4. Corpus Study

      >  Methods

    The CELEX database (Baayen et al., 1995) was used to determine the bias of the ambiguous words. As in Norming Study 2, the items selected in Norming Study 1 were used in this study. Five of the items were not found in the CELEX database, so only 175 items were included. Lemma frequencies (normalized out of 1,000,000) were used. The relative biases were calculated by adding the frequency values for the noun and for the verb readings and then dividing the noun and the verb frequency value by the summed value.
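    The relative-bias calculation described above amounts to a simple normalization. The sketch below illustrates it; the function name and the example frequencies are hypothetical and are not taken from CELEX.

```python
from typing import Tuple


def relative_bias(noun_freq: float, verb_freq: float) -> Tuple[float, float]:
    """Return (noun bias, verb bias) as proportions of the summed lemma frequency."""
    total = noun_freq + verb_freq
    if total <= 0:
        raise ValueError("the summed frequency must be positive")
    return noun_freq / total, verb_freq / total


# Hypothetical frequencies per million, for illustration only.
noun_bias, verb_bias = relative_bias(150.0, 50.0)
print(noun_bias, verb_bias)
```

    By construction the two proportions sum to 1, so either one is enough to place an item on the noun-verb bias continuum.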

      >  Results

    Similar to Norming Study 2, the analysis of the relative biases revealed a smooth distribution across the items, as shown in Figure 2.

    A correlation analysis between the results of the Corpus Study and the two Norming Studies revealed highly significant correlations: Corpus Study and Norming Study 1 (r=.75; F(1,173)=221.7; p<.001); Corpus Study and Norming Study 2 (r=.82; F(1,173)=367.5; p<.001) (Figure 3).
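    The degrees of freedom reported with these correlations match the item counts (180 − 2 = 178 and 175 − 2 = 173). Assuming the F statistics come from a simple linear regression of one bias measure on the other (the text does not say so explicitly), F and r are related by F = r²·df / (1 − r²). The sketch below recomputes F from r under that assumption; because the reported r values are rounded to two decimals, the recomputed values only approximate the reported ones.

```python
def f_from_r(r: float, df_error: int) -> float:
    """F(1, df_error) implied by a Pearson correlation r in simple regression."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return (r ** 2) * df_error / (1.0 - r ** 2)


# Corpus Study vs. Norming Study 1: r = .75 with 175 items, so df_error = 173.
print(round(f_from_r(0.75, 173), 1))
```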

    5. Experiments 1 and 2

    As discussed above, if lexical information guides interpretation independent of syntax, then a word with a dominant noun interpretation should be read slowly in a syntactic context biased to expect a verb (e.g., following the infinitival marker to), and a word with a dominant verb interpretation should be read slowly in a syntactic context biased to expect a noun (e.g., following a determiner like the), reflecting the competition between the two interpretations of the word when the information sources are in direct conflict. Experiment 1 uses minimal syntactic contexts in a modified lexical decision paradigm to test these predictions, and Experiment 2 uses sentential contexts in a self-paced reading paradigm.

    In both experiments we decided not to use unambiguous controls, matched in frequency to the subordinate reading, and rather to directly compare the processing of category-ambiguous words – ranging in lexical biases (in Experiment 1) or sampled from the ends of the distribution (in Experiment 2) – in different syntactic contexts. The rationale for not including unambiguous controls was two-fold. First, given the lexico-semantic complexity of many noun-verb ambiguous words in English, it is unclear what the unambiguous controls for the subordinate category meaning should be matched to in terms of frequency (and other nuisance variables). In particular, as discussed above, in addition to the category ambiguity between a noun and a verb reading, most words – in our set and more generally – have many different senses for both the noun and the verb reading. As a result, especially in paradigms with minimal syntactic contexts where there is little control over which sense(s) will be retrieved by the comprehender, it is far from obvious what a proper set of unambiguous controls would look like. Consequently, the results may be difficult to interpret.

    Second, as discussed in the Introduction, some of the earlier results in the literature strongly suggest that the subordinate bias effect reflects interference from the word’s dominant category reading, rather than the difficulty of accessing a low-frequency (subordinate) reading. In particular, Tabor et al. (1997) and Gibson (2006) observed a subordinate bias effect for the category-ambiguous word that in the determiner context, relative to unambiguous controls like those and this. Furthermore, Thierry et al. (2008) observed a Left Anterior Negativity (LAN) and a P600 in cases where a word was used in a syntactic context that was inconsistent with the word’s dominant category reading, suggesting difficulty in integrating the incoming word into the preceding syntactic context. If it were the case that the subordinate bias effect was due to the difficulty of accessing a low-frequency meaning, then an N400 component should instead have been observed, which has been shown to be sensitive to lexical frequency, with lower-frequency words eliciting larger N400s (e.g., Van Petten & Kutas, 1990, 1991; Van Petten, 1993).

    As a result, we focused on including a large set of noun-verb ambiguous words – normed carefully for category biases – in order to seek stronger evidence for a subordinate bias effect in a productive category ambiguity in English (cf. Boland & Blodgett, 2001; Folk & Morris, 2003).

       5.1. Experiment 1

    The logic of this experiment relies on the assumption that the lexical decision speed may be affected by the context in which the word appears (e.g., Swinney, 1979; Tanenhaus et al., 1979).

    Methods

    Participants Sixty-three participants from MIT and the surrounding community took part in the experiment. All were between the ages of 18 and 40, native speakers of English and naïve as to the purposes of the study. Participants were paid for their participation. None participated in either of the Norming Studies.

    Design and Materials The experiment had a 3 x 3 factorial design crossing (1) lexical bias (Noun, Verb, Equi), and (2) context (Noun, Verb, Null). As described above, based on the results of Norming Study 1, 180 noun-verb ambiguous items were selected from the original set of 240 items: 60 Noun-biased items (e.g., air, corner, key, table), 60 Verb-biased items (e.g., build, cut, make, wait) and 60 Equi-biased items (e.g., plan, test, joke, hug; for a complete list of materials, see Appendix A; the norms are available from the authors upon request).

    In addition to the 180 experimental materials, 180 pronounceable nonword fillers were included. The fillers were generated using the ARC Nonword Database (Rastle et al., 2002; available at http://www.maccs.mq.edu.au/~nwdb/) and were matched for length in letters with the experimental materials (p=.97).

    Procedure The participants were told that they would be presented with a series of letter-strings and they would be asked to decide as quickly as possible whether each letter-string was a real word of English or not. They were told to indicate their decision by pressing one of two buttons. The participants were further told that on some trials the words / letter-strings would appear with a determiner (e.g., the) or a verb-particle (e.g., to). They were instructed to ignore the determiner/particle and make the word/nonword decision based on the letter-string appearing after the determiner/particle. Each trial started with a fixation cross appearing in the middle of the screen. Participants were told to press the spacebar to begin the trial. After the button-press, a blank screen appeared for 1,000 ms, followed by the stimulus presented for 400 ms, followed again by a blank screen. Participants were told to press one of two buttons to indicate whether they thought the letter-string they saw was a real word of English or not. After the button press, the next trial began. The experiment used the Linger 2.94 software written by Doug Rohde (available at http://tedlab.mit.edu/~dr/linger). The order of trials was randomized for each participant. The experiment took approximately 35 minutes to complete.

    Results

    Accuracy data Across the nine conditions, participants answered correctly 96.5% of the time. There were no effects or interactions in the accuracy data (Fs<1.5). Table 1 presents the mean accuracies across the nine conditions of Experiment 1.

    Reaction time data Across the nine conditions, the mean reaction time (RT) was 744 ms. In the analyses presented below we only included the trials on which the word/non-word decision was made correctly. The data patterns were similar in the analyses of all the trials. Reaction time data points that were more than three standard deviations away from the mean RT within a condition were excluded from the analysis, affecting 1.42% of the data. Figure 4 presents the mean RTs across the nine conditions of Experiment 1.
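    The exclusion rule described above (dropping RTs more than three standard deviations from the mean RT within a condition) can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the data layout, the function name, and the use of the sample standard deviation are all assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_outliers(trials, n_sd=3.0):
    """Keep (condition, rt_ms) pairs within n_sd SDs of their condition mean."""
    # Group RTs by condition.
    by_condition = defaultdict(list)
    for condition, rt in trials:
        by_condition[condition].append(rt)

    # Compute lower and upper cutoffs per condition.
    bounds = {}
    for condition, rts in by_condition.items():
        m = mean(rts)
        # Assumption: sample SD; a single-trial condition gets SD 0.
        sd = stdev(rts) if len(rts) > 1 else 0.0
        bounds[condition] = (m - n_sd * sd, m + n_sd * sd)

    # Retain only trials inside their condition's cutoffs.
    return [(c, rt) for c, rt in trials if bounds[c][0] <= rt <= bounds[c][1]]
```

    Because the cutoffs are computed separately for each condition, a slow response in an intrinsically slow condition is less likely to be removed than it would be under a single global cutoff.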

    A 3 x 3 ANOVA crossing lexical bias (Noun, Verb, Equi) with context (Noun, Verb, Null) revealed a significant interaction between the two factors (F1(2,248)=5.45; MSe=29720; p < .001). (Note that because lexical bias is a between-items factor, the analysis by items is not meaningful in this design.) This interaction results from the fact that noun-biased items took participants longer to respond to in the verb-context condition (784 ms), compared to the noun-context (735 ms) or the null-context (737 ms) conditions, and verb-biased items took participants longer to respond to in the noun-context condition (755 ms), compared to the verb-context (724 ms) or the null-context (726 ms) conditions. Note that for both noun-biased and verb-biased items the RTs for the null-context conditions were very similar to the RTs for the congruent conditions, suggesting that in the null context participants interpreted the ambiguous word in its more frequent category. An alternative explanation for the absence of facilitation for the noun-biased items in the noun context relative to the null context (or for the verb-biased items in the verb context relative to the null context) is that the materials in the noun and verb contexts consisted of two words, whereas in the null context they consisted of only a single word. As a result, it is possible that the facilitation that would have been observed due to the match between the dominant reading and the context, compared to cases with no context, was counteracted by the differences in the length of the materials in the noun/verb context compared to the null context conditions.
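    The size of the incongruence cost can be read directly off the condition means reported above. The short sketch below only does that arithmetic; the dictionary layout and variable names are hypothetical, and the equi-biased means are omitted because the text does not report them.

```python
# Mean RTs in ms, as reported in the text for Experiment 1.
mean_rt = {
    "noun-biased": {"noun": 735, "verb": 784, "null": 737},
    "verb-biased": {"noun": 755, "verb": 724, "null": 726},
}

# The context that matches each bias group's dominant category.
congruent = {"noun-biased": "noun", "verb-biased": "verb"}
incongruent = {"noun-biased": "verb", "verb-biased": "noun"}

# Incongruence cost: incongruent-context RT minus congruent-context RT.
costs = {
    bias: rts[incongruent[bias]] - rts[congruent[bias]]
    for bias, rts in mean_rt.items()
}
print(costs)  # {'noun-biased': 49, 'verb-biased': 31}
```

    On these means, the cost of an incongruent context is 49 ms for noun-biased items and 31 ms for verb-biased items.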

    The analysis also revealed an unpredicted main effect of bias such that across the different contexts, the verb-biased items were processed faster than the equi-biased items and the noun-biased items. This difference may have resulted from a variety of factors that have been shown to affect word-level processing but were not controlled across the three bias groups (e.g., overall lexical frequency, length, familiarity, imageability, concreteness, age of acquisition, etc.).

    Regression analysis In addition to the analysis of variance, we performed a mixed-effects regression analysis (Gelman & Hill, 2006; Baayen, 2008) on the log reaction times using subjects and items as random effects. Following Boland & Blodgett (2001), we used the difference between the noun-reading log frequency and the verb-reading log frequency values as an independent variable (Log N Freq – Log V Freq), as well as syntactic context. The regression also controlled for word length and log overall frequency.
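
    The lexical bias predictor (Log N Freq – Log V Freq) can be illustrated with a short sketch. The function name is hypothetical, and the natural logarithm is an assumption, since the text does not state the base of the logarithm.

```python
import math

def lexical_bias(noun_freq, verb_freq):
    """Log noun-reading frequency minus log verb-reading frequency.

    Positive values indicate a noun-biased word, negative values a
    verb-biased word, and 0 an equi-biased word. Assumes natural logs and
    strictly positive frequencies (math.log raises ValueError otherwise).
    """
    return math.log(noun_freq) - math.log(verb_freq)
```

    With this definition the predictor is symmetric around zero: a word used as a noun nine times as often as a verb receives the same magnitude of bias as a word with the reverse ratio, with the opposite sign.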

    Only correct responses were analyzed, and data points more than three standard deviations from the mean response time per context condition were removed, accounting for 5% of the data. Analyses were carried out using the R statistical programming language with the packages lme4 (Bates, 2005) and languageR (Baayen, 2008). Significance values were computed using a Markov chain Monte Carlo method, and sampling was run for 50,000 steps.

    The regression was structured to test for an interaction between lexical bias and syntactic context to evaluate the research question. In particular, since a positive lexical bias indicates a noun-biased word, positive lexical biases should increase reaction times in verb contexts. In contrast, negative lexical biases should increase reaction times in noun contexts. All coefficients were standardized. The fixed effects from the regression analysis are shown in Table 2.
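
    One way to write down a model of this kind is given below. It is a sketch under stated assumptions rather than the authors' exact specification: the symbols are introduced here for illustration, the bias slope is assumed to be estimated separately within each context (as the interaction terms reported for Table 2 suggest), the null context is assumed to be the reference level for the context main effects, and only random intercepts for subjects and items are assumed, since random slopes are not described.

```latex
% s indexes subjects, i indexes items.
% N, V, Z are 0/1 indicators for noun, verb and null context on a trial.
% b_i = Log N Freq - Log V Freq for item i (positive = noun-biased).
\log \mathrm{RT}_{si} = \beta_0
  + \beta_N N_{si} + \beta_V V_{si}
  + \left( \gamma_N N_{si} + \gamma_V V_{si} + \gamma_Z Z_{si} \right) b_i
  + \delta_1 \,\mathrm{length}_i + \delta_2 \,\log \mathrm{freq}_i
  + u_s + w_i + \varepsilon_{si},
\qquad
u_s \sim \mathcal{N}(0, \sigma_u^2),\;
w_i \sim \mathcal{N}(0, \sigma_w^2),\;
\varepsilon_{si} \sim \mathcal{N}(0, \sigma^2)
```

    In this notation the predictions stated above correspond to a negative slope in noun contexts and a positive slope in verb contexts, that is, $\gamma_N < 0$ and $\gamma_V > 0$.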

    Consistent with the hypothesis that lexical information can guide interpretation independent of syntax, these results show the effects in the interaction terms. First, the coefficient of NounContext:LexicalBias is significantly negative (p < .001), indicating that a positive lexical bias decreases reaction times in noun contexts, and a negative lexical bias increases reaction times. Second, the coefficient of VerbContext:LexicalBias is significantly positive (p < .02), indicating the opposite pattern for verb contexts. The coefficient of NullContext:LexicalBias is not significant (p > 0.48), indicating that in null contexts, lexical bias did not affect reaction times. In addition, the regression shows significant effects of log overall word frequency and length: more frequent words are read more quickly and longer words are read more slowly. Finally, the regression shows significant main effects of NounContext (p < 0.005) and VerbContext (p < 0.02), indicating that words in these conditions are responded to more slowly than in the null-context condition.

    Discussion

    The ANOVA results of Experiment 1 demonstrated that there is some difficulty associated with processing noun-biased words in a verb context and with processing verb-biased words in a noun context. Furthermore, the regression results demonstrated that this difficulty is modulated by the degree of lexical bias, across the 180 items in the experiment. This pattern of results is as predicted by the hypothesis that lexical information can guide interpretation independent of syntax, but not by the hypothesis whereby lexical access is filtered according to the syntactic context.

       5.2. Experiment 2

    This experiment was conducted to extend the findings from Experiment 1 to a more natural language comprehension task where the critical words were presented in larger syntactic contexts.

    Methods

    Participants Twenty-four participants from MIT and the surrounding community took part in the experiment. All were between the ages of 18 and 40 years old, native speakers of English, and naïve as to the purposes of the study. None participated in either of the two Norming Studies or in Experiment 1. All participants were paid for their participation.

    Design and Materials This experiment had a 2 x 2 design crossing (1) lexical bias (Noun, Verb), and (2) context (Noun, Verb). A subset of the items used in Experiment 1 was used: 24 noun-biased items and 24 verb-biased items, for a total of 48 items. The experimental sentences were constructed such that the noun-context and the verb-context conditions were minimally different prior to the critical ambiguous word, by including sentence-initial phrases like NAME had/wanted/needed a/to. In a small subset of items it was possible to have the regions following the ambiguous word be identical across the two contexts (as in (5)). However, in the majority of items this was not possible without sacrificing plausibility. To minimize the differences in the materials following the ambiguous word and to allow the comparison on the post-ambiguous-word region across the two contexts, the materials were constructed such that the two words after the ambiguous word were always function words (as in (6)).

    Because many of the ambiguous words were short and because it is often difficult to see the effects on a single word in sentence processing, we defined three critical regions: (1) the ambiguous word, (2) the word immediately following the ambiguous word, and (3) the word two words after the ambiguous word.

    In addition to the 48 target items, 96 filler materials were constructed. The filler sentences used constructions similar to those used in the target items (i.e. NAME had/wanted/needed a/to) but used unambiguous nouns and verbs (e.g., Arnold wanted a sweatshirt with MIT written on it and so he went to the university gift shop; Karen planned to confess to the crime but her lawyer was advising against it.)

    Procedure The task was self-paced word-by-word reading with a moving-window display (Just, Carpenter & Woolley, 1982). The experiment was run using the Linger 2.94 software. Each trial began with a series of dashes marking the length and position of the words in the sentence. Participants pressed the spacebar to reveal each word of the sentence. As each new word appeared, the preceding word disappeared. The amount of time the participant spent reading each word was recorded as the time between keypresses.
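
    The reading-time measure described above, the time between successive keypresses, amounts to differencing the keypress timestamps. The sketch below illustrates that computation only; it does not reflect how the Linger software stores its data, and the function name and input format are assumptions.

```python
def reading_times(press_times_ms):
    """Per-word reading times from spacebar timestamps (in ms).

    press_times_ms[0] is assumed to be the press that reveals the first
    word, and each later press reveals the next word (or ends the sentence),
    so n + 1 presses yield n reading times.
    """
    return [later - earlier
            for earlier, later in zip(press_times_ms, press_times_ms[1:])]
```

    For example, presses at 0, 310, 580 and 845 ms give reading times of 310, 270 and 265 ms for the first three words.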

    To make sure the participants read the sentences for meaning, at the end of each trial a comprehension question appeared asking about the propositional content of the sentence they just read. Participants pressed one of two keys to respond “Yes” or “No”. After an incorrect answer, the word “INCORRECT” flashed briefly. Before the experiment started, a short list of practice items and questions was presented in order to familiarize the participants with the task.

    Participants took approximately 35 minutes to complete the experiment.

    Results

    Accuracy data Across the four conditions, participants answered correctly 95.4% of the time. A 2 x 2 ANOVA crossing lexical bias (Noun, Verb) with context (Noun, Verb) revealed an unpredicted effect of context, such that the verb-context conditions were less accurate than the noun-context conditions (F1(1,23)=4.70; MSe=209; p<.05). (As in Experiment 1, because lexical bias is a between-items factor, the analysis by items is not meaningful in this design.) There was also an unpredicted marginal effect of bias, such that participants were less accurate in the noun-biased conditions than the verb-biased conditions (F1(1,23)=3.87; MSe=88; p=.06), and a marginal interaction (F1(1,23)=3.06; MSe=88; p=.09). These effects appear to be driven by the noun-biased/verb-context condition being less accurate than the other three conditions. One possible explanation for the difference among conditions is that the noun-biased words were more biased than the verb-biased words: specifically, the noun-biased words had an average bias of .94 and the verb-biased words had an average bias of .83. The higher bias in the noun-biased conditions combined with the incongruent syntactic context may have resulted in the noun-biased/verb-context condition being less accurate than the other three conditions. Table 3 presents the mean accuracies across the four conditions of Experiment 2.

    Reading time data A 2 x 2 ANOVA crossing lexical bias (Noun, Verb) with context (Noun, Verb) was conducted on different regions of the sentence as shown in (7) (the three critical regions are in bold).

    At the first region (Mary had a/to), there were no significant effects (Fs<1). At the ambiguous word there was only a marginal effect of lexical bias, such that the verb-biased items were read faster (267 ms) than the noun-biased items (281 ms) (F1(1,23)=3.83, MSe=4818, p=.063). This difference may have resulted from a variety of factors that have been shown to affect word-level processing but were not controlled across the two bias groups. At the word immediately following the ambiguous word, the ANOVA revealed a significant interaction between the two factors (F1(1,23)=5.36; MSe=3699; p < .05). This interaction results from the fact that the noun-biased items took the participants longer to read in the verb-context condition (277 ms), compared to the noun-context condition (266 ms), and the verb-biased items took the participants longer to read in the noun-context condition (275 ms), compared to the verb-context condition (260 ms) (see Figure 5). A similar interaction is observed during the following region (the second word following the critical ambiguous word) (F1(1,23)=13.06; MSe=9021; p < .002): the noun-biased items took the participants longer to read in the verb-context condition (291 ms), compared to the noun-context condition (266 ms), and the verb-biased items took the participants longer to read in the noun-context condition (273 ms), compared to the verb-context condition (260 ms) (see Figure 5).

    We do not report comparisons involving later regions in the sentences, because differences in materials across the different conditions make any potential differences hard to interpret.

    Regression analysis As in Experiment 1, we performed a mixed-effects regression analysis on the log reading times using subjects and items as random effects and controlling for log overall frequency and length of the critical word. All coefficients were standardized. We used total reading time on the two words following the critical word as the dependent variable. Analyses were carried out using the R statistical programming language with the packages lme4 (Bates, 2005) and languageR (Baayen, 2008). As in Experiment 1, significance values were computed using a Markov chain Monte Carlo method, and sampling was run for 50,000 steps.
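
    The dependent variable described above can be sketched as follows. The text specifies log reading times and the total reading time on the two words following the critical word; taking the natural log of the summed time is an assumption made here, as are the function name and the list-based input.

```python
import math

def post_critical_log_rt(word_rts_ms, critical_index):
    """Log of the summed reading time on the two words after the critical word.

    word_rts_ms holds per-word reading times (ms) for one trial, and
    critical_index is the 0-based position of the ambiguous word. Raises
    IndexError if fewer than two words follow the critical word.
    """
    first = word_rts_ms[critical_index + 1]
    second = word_rts_ms[critical_index + 2]
    return math.log(first + second)
```

    Summing over the two post-critical words pools the two regions in which the ANOVA showed the bias-by-context interaction into a single measure per trial.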

    As in Experiment 1, we tested for an interaction between lexical bias and syntactic context. The hypothesis whereby lexical information guides interpretation independent of syntactic information predicts that the interaction term NounContext:LexicalBias would be negative, and the interaction term VerbContext:LexicalBias would be positive. This corresponds to positive lexical biases (noun-biased words) slowing reading times in verb contexts, and to negative lexical biases (verb-biased words) slowing reading times in noun contexts. The fixed effects from the regression analysis are shown in Table 4.

    Consistent with the hypothesis that lexical information can guide interpretation independent of syntax, the coefficient of VerbContext:LexicalBias is significantly positive (p < 0.01), indicating that a positive lexical bias increases reading times in verb contexts, and a negative lexical bias decreases reading times. However, although the coefficient of the NounContext:LexicalBias interaction term is in the direction predicted by the hypothesis that lexical information can guide interpretation (negative), it is not significantly different from zero (p > 0.37). This may result from the smaller number of subjects and items in this experiment, compared to Experiment 1.

    Discussion

    The ANOVA results of Experiment 2 demonstrated that in sentential contexts there is some difficulty associated with processing noun-biased words in a verb context and with processing verb-biased words in a noun context, providing further support for the hypothesis that lexical information can guide interpretation independent of syntax. Furthermore, similar to Experiment 1, the regression results demonstrated that this difficulty is modulated by the degree of lexical bias. The fact that the regression results were only significant for the verb-context conditions (unlike in Experiment 1, where the regression results were consistent in both the noun- and verb-context conditions) can perhaps be explained by: (a) the fact that there were only 48 items in this experiment as compared to 180 in Experiment 1, resulting in a less powerful analysis; and (b) the fact that the noun-biased words were more biased than the verb-biased words, with the possible consequence that the verb-biased items may not have resulted in as much difficulty in the noun contexts as the noun-biased items did in the verb contexts.

    Overall, the pattern of results is similar to the pattern of results in Experiment 1, and is as predicted by the hypothesis that lexical information can guide interpretation independent of syntax, but not by the hypothesis whereby lexical access is filtered according to the syntactic context.

    6. Summary and conclusions

    Two experiments were presented – a lexical decision experiment and a self-paced reading experiment – that tested the independence of syntactic and lexical information sources in online language processing. Both experiments provided evidence that syntactic and lexical information guide interpretation independently. In particular, both experiments showed a subordinate bias effect (Duffy et al., 1988; Rayner et al., 1994), such that the dominant lexical meaning was highly available due to its frequency, and the subordinate meaning was made available by the syntactic context, giving rise to elevated RTs for the noun-biased words in a verb context and verb-biased words in a noun context. Furthermore, this set of results generalizes earlier findings from the category-ambiguous word “that” (Tabor et al., 1997; Gibson, 2006) to a productive syntactic-category ambiguity.

    The current results thus fill the gap in the literature for processing category-ambiguous words in biasing contexts, using a reading paradigm. These results provide further evidence against syntactic information being the only source of information that can guide sentence processing. Like context (Altmann & Kamide, 1999; Grodner, Gibson & Watson, 2005) and semantic and plausibility information (Kuperberg et al., 2003; Kim & Osterhout, 2005), lexical information can also guide sentence processing, independent of the syntactic context (Tyler & Marslen-Wilson, 1977; Ford, Bresnan & Kaplan, 1982; Marslen-Wilson & Tyler, 1987; Culicover & Jackendoff, 2005).

  • 1. Altmann G. T. M., Kamide Y. 1999 Incremental interpretation at verbs: Restricting the domain of subsequent reference. [Cognition] Vol.73 P.247-264
  • 2. Baayen R.H. 2008 Analyzing Linguistic Data: A Practical Introduction to Statistics using R.
  • 3. Baayen R.H., Piepenbrock R., Gulikers L. 1995 The CELEX Lexical Database (Release 2).
  • 4. Bates D. M. 2005 Linear Mixed Model Implementation in R. [R News] Vol.5 P.27-30
  • 5. Binder K. S., Morris R. K. 1995 Eye movements and lexical ambiguity resolution: effects of prior encounter and discourse topic. [Journal of Experimental Psychology: Learning, Memory, & Cognition] Vol.21 P.1186-1196
  • 6. Binder K. S., Rayner K. 1998 Contextual strength does not modulate the subordinate bias effect: Evidence from eye fixations and self-paced reading. [Psychonomic Bulletin and Review] Vol.5 P.271-276
  • 7. Binder K.S. 2003 The influence of local and global context: An eye-movement and lexical ambiguity investigation. [Memory & Cognition] Vol.31 P.690-702
  • 8. Boland J.E., Blodgett A. 2001 Understanding the constraints on syntactic generation: Lexical bias and discourse congruency effects on eye movements. [Journal of Memory and Language] Vol.45 P.391-411
  • 9. Cairns H. S., Hsu J. R. 1980 Effects of prior context on lexical access during sentence comprehension: A replication and reinterpretation. [Journal of Psycholinguistic Research] Vol.9 P.319-326
  • 10. Carpenter P. A., Daneman M. 1981 Lexical retrieval and error recovery in reading: A model based on eye fixation. [Journal of Verbal Learning and Verbal Behavior] Vol.20 P.137-160
  • 11. Crain S., Steedman M. 1985 On not being led up the garden path: the use of context by the psychological parser. In: Dowty, D., Karttunen, L., Zwicky, A. (Eds.), Natural Language Parsing: Psychological, Computational, and Theoretical Perspectives. P.320-358
  • 12. Culicover P.W., Jackendoff R. 2005 Simpler Syntax.
  • 13. Duffy S. A., Morris R. K., Rayner K. 1988 Lexical ambiguity and fixation times in reading. [Journal of Memory and Language] Vol.27 P.429-446
  • 14. Fodor J.A. 1983 The modularity of mind.
  • 15. Fodor J. A., Bever T.G., Garrett M.F. 1974 The psychology of language.
  • 16. Folk J., Morris R. 2003 Effects of syntactic category assignment on lexical ambiguity resolution in reading: An eye-movement analysis. [Memory and Cognition] Vol.31 P.87-99
  • 17. Ford M., Bresnan J., Kaplan R. 1982 A competence-based theory of syntactic closure. In: Bresnan, J. (Ed.), The Mental Representation of Grammatical Relations. P.727-796
  • 18. Frazier L. 1978 On Comprehending Sentences: Syntactic Parsing Strategies.
  • 19. Frazier L., Fodor J. D. 1978 The sausage machine: a new two-stage parsing model. [Cognition] Vol.6 P.291-325
  • 20. Frazier L., Rayner K. 1987 Resolution of syntactic category ambiguities: eye movements in parsing lexically ambiguous sentences. [Journal of Memory and Language] Vol.26 P.505-526
  • 21. Gelman A., Hill J. 2008 Data Analysis Using Regression and Multilevel/Hierarchical Models.
  • 22. Gibson E. 2006 The interaction of top-down and bottom-up statistics in the resolution of syntactic category ambiguity. [Journal of Memory and Language] Vol.54 P.363-388
  • 23. Gibson E., Pearlmutter N. 1998 Constraints on sentence comprehension. [Trends in Cognitive Science] Vol.2 P.262-268
  • 24. Grodner D., Gibson E., Watson D. 2005 The influence of contextual contrast on syntactic processing: Evidence for strong-interaction in sentence comprehension. [Cognition] Vol.95 P.275-296
  • 25. Just M.A., Carpenter P.A., Woolley J. D. 1982 Paradigms and processes in reading comprehension. [Journal of Experimental Psychology: General] Vol.111 P.228-238
  • 26. Kaan E. 2007 Event-related potentials and language processing: An overview. [Language and Linguistics Compass] Vol.1 P.571-591
  • 27. Kim A., Osterhout L. 2005 The independence of combinatory semantic processing: Evidence from event-related potentials. [Journal of Memory and Language] Vol.52 P.205-225
  • 28. Kuperberg G. R., Sitnikova T., Caplan D., Holcomb P. 2003 Electrophysiological distinctions in processing conceptual relationships within simple sentences. [Cognitive Brain Research] Vol.17 P.117-129
  • 29. MacDonald M. C. 1993 The interaction of lexical and syntactic ambiguity. [Journal of Memory and Language] Vol.32 P.692-715
  • 30. MacDonald M., Pearlmutter N., Seidenberg M. 1994 The lexical nature of syntactic ambiguity resolution. [Psychological Review] Vol.101 P.676-703
  • 31. Marslen-Wilson W. D., Tyler L. K. 1987 Against modularity. In J.L.Garfield (Ed.), Modularity in Knowledge Representation and Natural Language Understanding.
  • 32. Rastle K., Harrington J., Coltheart M. 2002 358,534 nonwords: The ARC Nonword Database. [Quarterly Journal of Experimental Psychology] Vol.55A P.1339-1362
  • 33. Rayner K., Pacht J. M., Duffy S. A. 1994 Effects of prior encounter and global discourse bias on the processing of lexically ambiguous words: Evidence from eye fixations. [Journal of Memory and Language] Vol.33 P.527-544
  • 34. Seidenberg M., Tanenhaus M., Leiman J., Bienkowski M. 1982 Automatic access of the meanings of ambiguous words in context: Some limitations of the knowledge-based processing. [Cognitive Psychology] Vol.14 P.489-537
  • 35. Simpson G. B. 1981 Meaning dominance and semantic context in the processing of lexical ambiguity. [Journal of Verbal Learning and Verbal Behavior] Vol.20 P.120-136
  • 36. Swinney D. 1979 Lexical access during sentence comprehension: (Re) consideration of context effects. [Journal of Verbal Learning and Verbal Behavior] Vol.18 P.645-659
  • 37. Swinney D., Hakes D. 1976 Effects of prior context upon lexical access during sentence comprehension. [Journal of Verbal Learning and Verbal Behavior] Vol.15 P.681-689
  • 38. Tabor W., Juliano C., Tanenhaus M. K. 1997 Parsing in a dynamical system: An attractor-based account of the interaction of lexical and structural constraints in sentence processing. [Language and Cognitive Processes] Vol.12 P.211-272
  • 39. Tabossi P. 1988 Accessing lexical ambiguity in different types of sentential contexts. [Journal of Memory and Language] Vol.27 P.324-340
  • 40. Tabossi P., Colombo L., Job R. 1987 Accessing lexical ambiguity: Effects of context and dominance. [Psychological Research] Vol.49 P.161-167
  • 41. Tanenhaus M.K., Leiman J.M., Seidenberg M.S. 1979 Evidence for multiple stages in the processing of ambiguous words in syntactic contexts. [Journal of Verbal Learning and Verbal Behavior] Vol.18 P.427-441
  • 42. Tanenhaus M., Spivey-Knowlton M., Eberhard K., Sedivy J. 1995 Integration of visual and linguistic information in spoken language comprehension. [Science] Vol.268 P.1632-1634
  • 43. Tanenhaus M. K., Trueswell J. C. 1995 Sentence comprehension. In J. L. Miller & P. D. Eimas (Eds.), Speech, language and communication. P.217-262
  • 44. Thierry G., Martin C. D., Gonzalez-Diaz V., Rezaie R., Roberts N., Davis P. 2008 Event-related potential characterization of the Shakespearean functional shift in narrative sentence structure. [Neuroimage] Vol.40 P.923-931
  • 45. Trueswell J.C., Tanenhaus M.K., Garnsey S.M. 1994 Semantic influences on parsing: use of thematic role information in syntactic disambiguation. [Journal of Memory and Language] Vol.33 P.285-318
  • 46. Tyler L.K., Marslen-Wilson W.D. 1977 The on-line effects of semantic context on syntactic processing. [Journal of Verbal Learning and Verbal Behavior] Vol.16 P.683-692
  • 47. Van Petten C. 1993 A comparison of lexical and sentence-level context effects and their temporal parameters. [Language and Cognitive Processes] Vol.8 P.485-532
  • 48. Van Petten C., Kutas M. 1990 Interactions between sentence context and word frequency in event-related brain potentials. [Memory and Cognition] Vol.18 P.380-393
  • 49. Van Petten C., Kutas M. 1991 Influences of semantic and syntactic context in open- and closed-class words. [Memory and Cognition] Vol.19 P.95-112
  • [Figure 1.] The distribution of the responses for 180 target items in Norming Study 2, sorted by the percentage of noun responses (plotted on the y-axis). [Note that although the graph contains all 180 data points, the words displayed on the x-axis are a small subset of the 180 words, because it would be impossible to display them all legibly.]
  • [Figure 2.] The distribution of the responses for 175 target items in the Corpus Study, sorted by the noun bias (plotted on the y-axis). [Note that although the graph contains all 175 data points, the words displayed on the x-axis are a small subset of the 175 words, because it would be impossible to display them all legibly.]
  • [Figure 3.] The correlation between the data from Norming Study 2 and the data from the Corpus Study.
  • [Table 1.] Accuracies in percent correct, as a function of lexical bias and context across the nine conditions of Experiment 1 (standard errors in parentheses).
  • [Figure 4.] Reaction times as a function of lexical bias and context across the nine conditions of Experiment 1. The error bars represent standard errors of the mean.
  • [Table 2.] Fixed effects in a mixed model regressing log lexical decision reaction times on context and lexical bias. (Following R notation, interaction terms are denoted using colons.)
  • [Table 3.] Accuracies in percent correct, as a function of lexical bias and context across the four conditions of Experiment 2 (standard errors in parentheses).
  • [Figure 5.] Reading times at the ambiguous word and the two following words as a function of lexical bias and context in Experiment 2. The error bars represent standard errors of the mean.
  • [Table 4.] Fixed effects in a mixed model regressing log reading times on context and lexical bias.