Lexical Fillers Permit Real-Time Gap-Search inside Island Domains
- Author: Boxell Oliver
- Publish: Journal of Cognitive Science, Volume 15, Issue 1, pp. 97-135, March 2014
It has often been reported that lexical fillers (e.g. which house) improve the overall acceptability of many island constraint violations relative to bare fillers (e.g. what). The current study tests, for the first time, whether lexical fillers reduce real-time sensitivity to wh-islands as well. Results from an eyetracking-while-reading study are reported that demonstrate native English speakers’ sensitivity to a plausibility manipulation between a fronted filler phrase and a downstream subcategorizing verb inside a wh-island domain. The effect was found as the verb was encountered in real time, and only when the filler element contained lexical information, not when it was bare. This is taken to show that online sensitivity to the wh-island constraint is reduced when the filler preceding it is lexical. The strengths, weaknesses and overall compatibility of a range of grammatical and processing theories are considered in relation to this finding.
Sentence processing, wh-island constraints, lexical fillers
When processing sentences, fronted “filler” elements are mentally reconstructed in empty “gap” positions which pertain to the syntactic and semantic roles they play in the sentence, see underscores in (1)-(3) below (e.g. Crain & Fodor 1985, Swinney et al. 1988, Frazier & Clifton 1989, Nicol & Swinney 1989, Nicol 1993, Nicol, Fodor & Swinney 1994, Roberts et al. 1997, Chen et al. 2002, Omaki et al. submitted). Specific parts of sentences, known as islands (Chomsky 1962, Ross 1967), are so-called because they are not generally available to host gaps. However, sensitivity to some of these islands is reduced by the lexicality of the filler element (Karttunen 1977, Maling & Zaenen 1982, Pesetsky 1987, 2000, Goodluck et al. 2008, Hofmeister & Sag 2010, Boxell 2012, submitted). Take (1)-(3), where minimal pairs are presented for three types of island constraint violation, with the island domains themselves in brackets. The (a) variants have bare fillers (e.g. who or what), while the (b) ones have additional lexical information (e.g. which book or which juice). As such, the (b) forms should be more acceptable than those in (a).
A number of experimental studies have investigated the reactivation of lexical fillers versus bare fillers in gap positions, although in non-island phrases. Several of these found real-time evidence of lexical fillers taking longer to reconstruct at the gap site (e.g. De Vincenzi 1996, Shapiro et al. 1999, Shapiro 2000, Boxell 2012, submitted), which has generally been interpreted as being a consequence of the greater amount of specific lexical information that requires reactivation and integration into the syntactic, semantic and conceptual structures. Similarly, some studies suggest lexical filler-gap dependencies are generally tougher to process overall than bare filler-gap dependencies. Some (e.g. Kaan et al. 2000 and Shapiro 2000) have suggested that this is because lexical fillers require linking to a discourse context, whilst others (e.g. Goodluck et al. 2008, Donkers et al. 2011) find a relationship between the specificity of semantic detail in the filler and the difficulty of processing. This is thought to result from the need for more fine-grained set-restriction and visualization of the filler’s referent.
Still other studies (e.g. Hofmeister & Sag 2010, Nicenboim 2012, Boxell submitted) have looked at lexical filler reactivation inside island phrases similar to (1) and (3). All three find evidence of a speed-up effect in the real-time processing of island domain regions of the sentences when preceded by a lexical filler compared to a bare filler: Hofmeister & Sag (2010) and Boxell (submitted) for wh-islands in English, and Nicenboim (2012) for complex DP islands in Hebrew. Specifically, Hofmeister & Sag (2010) interpret faster reactivation of the lexical filler at the gap site inside the island domain in the context of the Memory Facilitation Hypothesis (Hofmeister 2007), whereby the process is facilitated by virtue of lexical specificity providing a stronger memory trace relative to a bare filler. Furthermore, the authors posit a link between this real-time effect and an overall amelioration of the acceptability judgments for wh-island violations that they observed in related materials administered in an offline scalar judgment task. Under this approach, the unacceptability of island violations ordinarily results from the exhausting processing costs imposed by island domains (rather than from violations of actual grammatical stipulations). In this vein, the unacceptability of such violations can essentially be reversed by the general memory facilitation effects that lexical fillers contribute during processing. Similarly, others have found that lexical fillers have a strong, early-established discourse prominence that facilitates the processing of structures that require a gap or pronoun referent to be determined on the basis of semantic discourse (Radó 1998, Frazier & Clifton 2002, Diaconescu & Goodluck 2004).
The slower reactivation times at gap sites for lexical fillers found by some authors (e.g. De Vincenzi 1996, Shapiro et al. 1999, Shapiro 2000, Boxell 2012) may appear contradictory to the faster ones that support the Memory Facilitation Hypothesis (Hofmeister 2007, Hofmeister & Sag 2010). However, the Stabilizer Hypothesis (Boxell submitted) is compatible with this apparent discrepancy. This hypothesis is summarized in Figure 1, and is described below.
The Stabilizer Hypothesis says that predicted grammatical violations (such as islands) can be avoided by turning off the full grammatical computation of a sentence, and in its place building a shallower lexically-driven representation. This means that (a proxy of) the meaning of the sentence can still be deduced, while the acceptability of the sentence is preserved by virtue of not having encountered the violation that would have caused a full grammatical representation to crash. Hence, the parse can be considered “stabilized”. However, stabilization is only predicted to activate at a moment in the processing time-course when the following two conditions have been met. First, in order for the parse to be stabilizable at all, it has to have established a large amount of lexical information with a strong memory trace and/or discourse prominence that can be used as a plausibility cue (i.e. as a “stabilizer”). That is, the parse needs to be able to link its lexical elements together so as to deduce its rough meaning without syntactic mediation, should the need arise. Second, stabilization is only thought to become necessary at a moment in the parse when the processor can predict that the full syntactic representation is likely to crash. For instance, in Boxell (submitted) it is suggested that stabilization becomes activated when the input begins to resemble an abstract template of an island violation, such as (4). The subcategorization frame of the verb wonder, coinciding with an active search for a gap site, was proposed to feature quite strongly as a predictor of likely up-coming structures that have the potential to become complement clause wh-islands. Of course, if that complement clause is realized, and it contains a wh-phrase in its left periphery, the wh-island will have been confirmed.
According to the Stabilizer Hypothesis, an island environment would be predicted to co-occur with high levels of activation of the lexical information contained by a preceding lexical filler, since that is the information being used to link the filler with its gap, rather than relying on syntactic detail. However, in non-island environments where the lexical information is not available for stabilization purposes, all of this information must be reconstructed out of a parse where it may have decayed in the memory trace.1 Thus, it was deemed unsurprising that studies which find quicker reactivation for lexical fillers were doing so selectively in island environments (Hofmeister & Sag 2010, Nicenboim 2012, Boxell submitted), while those which find longer reactivation times for lexical fillers did so in non-island environments (e.g. De Vincenzi 1996, Shapiro et al. 1999, Shapiro 2000, Boxell 2012, submitted).
There are a variety of purely formal grammatical accounts that might be sufficient to explain the ameliorative effects of lexical fillers on island domains. Pesetsky (1987), for instance, suggested that lexical fillers, or rather d(iscourse)-linked fillers, are associated with “Q” operators that may (optionally) take scope over their underlying gap sites via a binding operation. This is unlike regular wh-dependencies, which are thought to unfold using within clause movement steps (known as successive cyclicity), following locality constraints like the Minimal Link Condition (Chomsky 1995). This means that bare fillers need to form an intermediate link to any CPs within a long-distance filler-gap dependency, and so when such an intermediate CP is already occupied with another wh-word, as in (5), a wh-island violation is committed. On the other hand, whenever d-linked fillers are found to be exempt from sensitivity to constraints on regular wh-dependency formation, such as wh-islands, Pesetsky attributes this to the Q-operator simply binding the underlying gap position in a single operation. With no links to intermediate CPs required, sensitivity to the wh-island constraint is necessarily diminished, as in (6).
Relativized Minimality (Rizzi 1990) 2 suggests that in a configuration ‘XYZ’, where X is the filler and Z is the gap, Y constitutes an intervening (island) boundary for X-Z only if it is of the same type. Within this framework, then, it could be that a typical Y element is not of the same type as a lexical X-Z dependency, but is the same type as a bare X-Z dependency. An important issue here is how “type” is actually defined. Indeed, recent work has attempted to codify this in terms of featural Relativized Minimality (fRM) (Rizzi 2001), where the amount of interference caused by Y to X-Z is a function of the amount of featural specification Y has in common with X-Z. In the case of the contrast between lexical and bare fillers the key distinction is clearly a +/-N feature, where N stands for all the information carried by the noun in a complex filler (e.g. Villata et al. 2014). Under this approach, wh-islands that are caused by an intermediate CP being occupied by a bare wh-word should cause less disruption to a lexical filler, as in (6), than to another bare one, as in (5). This is because the –N feature is shared between the bare filler and intervening wh-word in (5), but not in (6). The fRM approach, then, also has the potential to capture the contrast under investigation in the present paper.
Finally, there is a range of formal accounts that attempt to explain the lexical filler effects on island sensitivity with an even more abstract level of syntactic representation. For instance, Pesetsky (2000) and Shields (2008) propose that d-linked fillers might use abstract feature movement to satisfy locality constraints, freeing up the surface word order to give the appearance of a constraint violation. Similarly, Van Craenenbroeck (2004, 2010) suggests that complex wh-phrases may satisfy requirements for successive-cyclicity with empty operators.
In sum, there is broad consensus in both the processing and formal grammar literatures that lexical fillers increase the overall acceptability of island constraint violations, and each of these literatures has put forward a range of possible explanations as to why this might be. In spite of this, no experimental study has yet been reported that systematically tests whether lexical fillers reduce sensitivity to island constraints during real-time processing. The current study is an attempt to examine this specific question.
1 Whether or not the memory trace actually decays may depend on the theoretical framework one adopts. The salient point as far as the Stabilizer Hypothesis is concerned is simply that lexical fillers are only predicted to boost the memory trace, and thus facilitate reactivation, when they are used to stabilize grammatical violations. Therefore, the same prediction would not be made for grammatical sentences. This seems to be consistent with the broader research literature on this topic.
2 Thanks go to an anonymous reviewer for this suggestion.
The present study examines, for the first time to my knowledge, whether lexical fillers increase the processor’s ability to conduct a real-time gap-search inside wh-island domains. In other words, it asks whether the processor’s real-time sensitivity to the constraint is reduced by the lexicality of the filler preceding it.
Thirty-two native speakers of British English were recruited from the University of Essex and the surrounding community in the United Kingdom (16 female, 3 left-handed, 29 monolingual, mean age: 21.6 years, SD: 3.01). All participants had normal or corrected-to-normal vision and reported that they did not have any language or other cognitive impairments.
Wh-island regions tend to occur in sentential positions for which the presence or absence of up-coming potential gap sites is highly predictable. This makes testing the relative sensitivity to the constraint as a function of its different filler types very difficult because processing strategies are typically adopted early on that decide whether or not to violate the island on the basis of these predicted gaps. A review of the previous research literature indicates that where up-coming grammatical gap sites outside of an island domain are very likely, sensitivity to the island constraint is strongly observed in real-time. For instance, wh-islands made up of relative clauses modifying subjects show high levels of real-time sensitivity (e.g. Felser et al. 2012, Stowe 1986, Traxler & Pickering 1996). Since subjects are guaranteed to be followed by verb phrases in English, the processor presumably respected these wh-islands because of the guaranteed downstream verb phrase and its possible gap site that would allow for successful termination of the filler-gap dependency. Meanwhile, in scenarios where it is obvious that no further gap sites outside of an island will be available, islands are consistently violated since this is the only way of completing the filler-gap dependency at all. Both of the prior studies on lexical fillers and wh-islands of which I am aware (Hofmeister & Sag 2010, Boxell submitted) used complement clause wh-islands like (1). This means the matrix verb phrase has already been processed by the time the island is encountered and so provides no downstream possibility of a gap site. With no other guaranteed structural possibilities, the processor is forced to carry out a gap search inside the complement clause in order to find any possible gaps at all (making it a good syntactic environment for comparing lexical versus bare filler reactivation inside island domains, as was the objective of these studies).
The current study differs from its predecessors, though, in that a design is used that can in principle test for different levels of sensitivity to the wh-island constraint based on the type of filler preceding it. This would not have been possible in an environment that either forces the processor to violate the constraint irrespective of the filler type preceding it (like complement clause wh-islands), or else is in such a position that respecting the constraint irrespective of the filler type offers the more grammatical and canonical option because of likely alternative gaps downstream (like subject relative clause wh-islands). To resolve this problem, a highly artificial but nonetheless grammatical constituent structure is used in the current study. A complement clause wh-island, hereafter the “inner island”, is embedded inside a relative clause wh-island, hereafter the “outer island”. In short, an island that would normally force violation is embedded inside one that would normally not force violation. This creates a multiple embedded wh-island environment that is long-distance, taxing on working memory, and syntactically highly complex and novel. On encountering the inner island, the processor has to make a rapid call about whether to carry out a gap search there or not, and unlike any of the alternative wh-island structures, it is not possible to rely on what the likely up-coming structure will be, since it is not known. With the predictability of the structure’s gap-filling opportunities reduced (a predictability that may otherwise block gap searches), the non-canonical double-island structure should reveal whether a lexical filler enables a gap search inside such a wh-island as compared with a bare counterpart, something that would be difficult to test using more canonical types of wh-island.
In the main part of the experiment, two factors were manipulated: Filler Type refers to whether the filler was lexical or not (+/-Lexical Filler), and Plausibility refers to whether the relationship between the filler and critical verb inside the island was plausible or not (+/-Plausible). This resulted in four main conditions overall, as illustrated in (5). A full materials list is available in the appendix.
The main region of interest is the verb inside the inner island, namely built. Care was taken to ensure it appeared towards the center of the second line in each critical trial. It is a plausible subcategorizer for a gap site that is filled by an inanimate filler like which stage/what, but is an implausible subcategorizer for an animate one, which singer/who. Note that this verb is followed by an object, namely the correct stand, and a licit gap site downstream. This means the wh-islands are not actually violated, and these sentences are all globally grammatical (albeit highly artificial). We are interested, though, in whether or not the processor posits a link between the fronted filler and the subject-internal verb at the moment of processing when the verb is encountered. The inner island is created by the presence of the wh-word whether occurring at the left edge of the most embedded clause. Note that it is always selected for by the subcategorizer wonder, as in Boxell (submitted), should this be significant for the onset of a stabilization effect. The outer island is formed by virtue of the embedded relative clause which has another wh-word, who, at its left edge. The wh-phrase filler itself either contains a lexical noun, which stage/singer, or it is left bare, what/who.
An additional four WEAKER ISLAND CONDITIONS were added which attempted to reduce the island effects, shown in (6). This was done to check whether the lexicality of the filler modulated sensitivity to the plausibility manipulation more prominently inside the stronger islands in the main conditions in (5). The same four conditions were presented in which the outer-island who wh-word was replaced with the complementizer that. The outer island still exists since relative clauses are inherently island domains, but in (5) the presence of the overt wh-word meant it was more clearly marked out as a wh-island, which is no longer the case in (6). Indeed, (depending on one’s theoretical commitments) the lack of a wh-word may free up the specifier of CP, allowing for successive-cyclicity. The inner-island wh-word whether was replaced with the complementizer if, which more clearly does reduce the island effect. Indeed, replacing a wh-word with a complementizer has been shown to reduce island effects during real-time processing multiple times (e.g. Hofmeister & Sag 2010, Boxell submitted).
Twenty-four items were made and distributed across eight lists using a Latin Square design such that participants only saw one version of each critical item and an equal number of tokens of each condition overall. The items were pseudo-randomized to make certain that examples from the same condition did not occur adjacently. All critical items were followed by a “yes-no” comprehension question which participants responded to using a button on a key pad. This was to motivate their detailed reading of the materials and to monitor their attention to the task. To the critical items, 65 filler items were added which included filler-gap dependencies with and without lexical fillers, with and without island domains, and with and without comprehension questions. Other sentence types were also followed by comprehension questions to ensure they did not become salient and alert participants to the significance of filler-gap dependencies, islands or lexical fillers. Half of the filler items were as long as the critical items, and all but 11 were grammatical.
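The list construction described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the rotation scheme, the function names (`latin_square_lists`, `pseudo_randomize`) and the reshuffle-until-valid strategy are all assumptions, and the 65 filler items are left out for brevity.

```python
import random
from collections import Counter

def latin_square_lists(n_items=24, n_conditions=8):
    """Rotate conditions across items so that each list contains every item
    exactly once and an equal number of tokens of each condition."""
    return [
        [(item, (item + offset) % n_conditions) for item in range(n_items)]
        for offset in range(n_conditions)
    ]

def pseudo_randomize(trials, seed=0, max_tries=100_000):
    """Reshuffle until no two adjacent trials share a condition
    (a hypothetical way of meeting the adjacency constraint)."""
    rng = random.Random(seed)
    order = list(trials)
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(a[1] != b[1] for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no valid order found")

lists = latin_square_lists()
first_list = pseudo_randomize(lists[0])
tokens_per_condition = Counter(cond for _, cond in lists[0])
```

With 24 items and eight conditions (four main, four weaker-island), each list under this scheme holds three tokens of every condition, and each item appears in all eight conditions across the eight lists.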
In order to ensure that the relationship between the fronted lexical fillers and the island-internal verb was (im)plausible as presupposed in the design, a pretest was conducted. Fifteen native English speaking participants were recruited who did not subsequently participate in the experiment (9 female, 1 left-handed, 15 monolingual, mean age: 20.3 years, SD: 2.98). They were asked to rate sentences based on those of the items of the main experiment, as illustrated in (6).
Here the (in)animate nouns are presented as the direct object which they would be temporarily posited as being, should the processor carry out a gap search inside the island domain online. Adaptations of the 24 sentences which were used in the main experiment were made, together with six other candidate items. These were distributed over two lists using a Latin Square design together with 59 filler sentences which had an assortment of different grammaticality and acceptability statuses, many based on the plausibility of the verb’s relationship with its arguments. Participants were asked to rate the sentences on a scale of 1-10, where one was the least well-formed and meaningful and ten was the most. The mean judgment scores for the pairs for each item were compared separately with related-samples t-tests with the α-level Bonferroni adjusted to 0.002. Of the pairs that reached statistical significance at this level, 24 were selected for use in the main experiment.
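The adjusted α-level can be reconstructed under two assumptions that the text does not state explicitly: a family-wise α of 0.05, and one comparison per candidate item (24 + 6 = 30). The sketch below also shows the related-samples t statistic for a single item pair; the ratings in it are invented purely for illustration and are not data from the pretest.

```python
import math
from statistics import mean, stdev

def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha under a Bonferroni correction."""
    return family_alpha / n_comparisons

def paired_t(xs, ys):
    """Related-samples t statistic (df = n - 1) for two matched rating vectors."""
    diffs = [x - y for x, y in zip(xs, ys)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Assumed: family-wise alpha of 0.05 and 30 item-wise comparisons.
alpha = bonferroni_alpha(0.05, 30)   # 0.05 / 30, i.e. 0.002 to three decimals

# Hypothetical ratings for one item pair (NOT pretest data).
plausible = [8, 9, 7, 8, 9]
implausible = [3, 2, 4, 3, 2]
t_value = paired_t(plausible, implausible)
```

Under these assumptions 0.05 / 30 rounds to the reported 0.002, which is at least consistent with one comparison per candidate item.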
The experiment started with the presentation of three practice items. The text for each item was presented in black Courier New font on a white background on a 1680 mm x 1080 mm monitor. A desk-mounted EyeLink 1000 system was used to track participants’ eye fixations with a sampling rate of 1000Hz as they read through the sentences whilst their heads were kept in position with use of a chin-rest. Although reading was done binocularly, only the right eye was tracked. The hardware was calibrated to the features of each eye using a nine-point standardized test in which participants stared at dots as they moved in random succession around the screen. The measurements were subsequently validated by running a second similar test. The calibration was automatically re-checked before each trial by requiring participants to stare at a dot on the screen, and they were only able to move on to the subsequent trial when the eye-tracking was within a degree of the target dot. Re-calibrations were done whenever participants could not fixate within a degree of the target. Participants started each trial looking at the top-left of the display, which prevented fixations being recorded over critical segments of text before reading had even started. They were instructed to read each item to themselves at a normal pace, and they could respond “yes” or “no” to the comprehension questions when they were asked by selecting a button. Participants completed the task in around 40 minutes, and no other tasks were administered in the same experimental session.
In the design of the current experiment, a lack of sensitivity to the island domain would be indicated by an effect of the plausibility manipulation that exists between the filler and the verb which heads a putative gap site inside the island domain. This effect would be triggered by implausible filler-verb relationships increasing reading time at, or following, the critical verb relative to plausible filler-verb relationships.
Taking the hypotheses discussed above, both the Memory Facilitation Hypothesis and the Stabilizer Hypothesis predict that lexical fillers will reduce real-time sensitivity to wh-island domains as a result of some domain-general cognitive easement of processing resources. In the case of the former, it is the stronger memory trace for lexical fillers relative to bare ones that makes lexical fillers easier to reactivate at gap sites. This general reduction in processing cost percolates up to some general improvement in overall acceptability of island constraint violations. In the case of the latter, it is the application of the stronger lexical memory trace and/or discourse prominence of the lexical fillers to replace a full grammatical computation with a shallower, lexically-driven parse. This stabilizer mechanism has been specifically activated so as to reduce the island constraint’s effect on the parse by removing it (as part of the full grammatical computation) from the representation altogether. The key difference between these two theories, then, is that the Memory Facilitation Hypothesis predicts that complex fillers should always facilitate processing, irrespective of syntactic environment (and when this happens to be an island, the constraint will be ameliorated). Meanwhile, the Stabilizer Hypothesis states that lexical fillers are engaged in a particular process that is strategically attempting to avoid a predicted grammatical violation, and so only predicts them to facilitate processing of the parts of constructions that include (putative) grammatical violations. This means that the Memory Facilitation Hypothesis predicts faster reactivation for lexical fillers both inside the island domain and at the licit tail-of-dependency gap, whilst the Stabilizer Hypothesis only predicts such effects inside the island domain itself.
Similarly, the Memory Facilitation Hypothesis predicts just as much facilitation for the lexical fillers in the conditions where the island effects are weakened, as in (6), whilst the Stabilizer Hypothesis predicts stronger effects for the lexical fillers in (5) than in (6).
The third class of theory about the processing of lexical fillers at gap sites says they are harder to reactivate at gap sites because they contain more detailed information (e.g. De Vincenzi 1996, Shapiro et al. 1999, Shapiro 2000). And there are still others who suggest that processing constructions with lexical fillers is generally more taxing overall relative to bare fillers because such referents require linking to some sentence external discourse entity (e.g. Kaan et al. 2000 and Shapiro 2000) or because they require increased set restriction or conceptual visualization (e.g. Goodluck et al. 2008, Donkers et al. 2011). Such theories predict increased reading times across whole lexical filler-gap dependencies in general, relative to bare ones, whether island domains are included or not.
Finally, all of the formal grammatical mechanisms discussed above (e.g. Pesetsky 1987, 2000, Rizzi 1990, 2001, Shields 2008) predict that lexical fillers would be less sensitive to the wh-island domains presented than bare fillers.
Two participants scored below 60% in their comprehension question responses and so were removed altogether (all remaining participants scored above 60%), and another participant was removed for having a skip rate of 44% for the critical region of interest (built). Of the remaining data, the skip rate was 2.52% for the outer island verb, 4.63% for the critical inner island verb, and 3.32% for the tail of the dependency. Twenty-six data points (3.61%) were unusable because of drift.
For the purpose of exposition of the data, analysis for two key regions is highlighted in Figures 2-3 below. Figure 2 presents the critical inner-island verb itself (built), where the predicted patterns were searched for. Secondly, the region following the actual gap at the tail of the dependency (during the show) is shown to see if any potential in-island effects impeded or facilitated final filler reactivation. The total reading times for these regions, that is, the sum of all eye fixations within them, are shown. Additional eye-movement measures for these regions (namely, first-pass, regression path and rereading times, together with the number of regressions in and out) are reported in Table 1, together with those for all the other regions of the critical sentences.
Fixations lower than 80ms and within a degree of another fixation were merged with their closest neighboring fixation. If they could not be merged, they were excluded. Fixations above 650ms were also removed, as were data points greater than 2.5 SDs from a participant’s mean or an item’s mean. Approximately 7.93% of the overall data was removed.
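The cleaning steps just described might be sketched as below. Several details are assumptions of this sketch rather than facts about the analysis: fixations are represented as (horizontal position in degrees, duration in ms) pairs, only temporally adjacent fixations are considered as merge targets, the 650ms cut-off is applied after merging, and the 2.5 SD trim is shown for a single grouping of values (it would be run once by participant and once by item).

```python
from statistics import mean, stdev

def clean_fixations(fixes, min_ms=80, max_ms=650, merge_deg=1.0):
    """fixes: list of (position_in_degrees, duration_ms) in temporal order.
    Short fixations are merged into the nearest adjacent fixation lying
    within merge_deg; unmergeable short fixations are dropped, as are
    fixations longer than max_ms."""
    kept = [[pos, dur] for pos, dur in fixes]
    short = [dur < min_ms for _, dur in fixes]
    for i, (pos, dur) in enumerate(fixes):
        if not short[i]:
            continue
        neighbours = [
            j for j in (i - 1, i + 1)
            if 0 <= j < len(fixes)
            and not short[j]
            and abs(fixes[j][0] - pos) < merge_deg
        ]
        if neighbours:
            nearest = min(neighbours, key=lambda j: abs(fixes[j][0] - pos))
            kept[nearest][1] += dur  # fold the short fixation into its neighbour
    return [
        (pos, dur)
        for (pos, dur), is_short in zip(kept, short)
        if not is_short and dur <= max_ms
    ]

def trim_outliers(values, n_sd=2.5):
    """Drop values more than n_sd standard deviations from the group mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= n_sd * s]

# Hypothetical fixation sequence, for illustration only.
cleaned = clean_fixations([(1.0, 220), (1.4, 60), (5.0, 40), (9.0, 700), (10.0, 300)])
# cleaned == [(1.0, 280), (10.0, 300)]
```

In the example, the 60ms fixation is absorbed by its neighbour 0.4 degrees away, the isolated 40ms fixation is excluded, and the 700ms fixation is removed for exceeding the upper cut-off.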
3.7.1. Main conditions
Figure 2 shows the total reading times in milliseconds for the inner island verb (built) and Figure 3 for the tail of the dependency (during the show). +/-LF (Lexical Filler) refers to whether the filler was lexical or not, and +/-PL (Plausible) refers to whether the critical filler-verb relationship was plausible or not.
At the inner island verb there was a main effect of Plausibility (F1(1, 28) = 4.274, p < 0.05, F2(1, 23) = 4.279, p = 0.05), but no main effect of Filler Type (F1(1, 28) = 3.483, p > 0.05, F2(1, 23) = 3.449, p > 0.05). There was, however, a two-way interaction between Filler Type and Plausibility (F1(1, 28) = 4.842, p < 0.05, F2(1, 23) = 4.796, p < 0.05).
To investigate the source of the interaction between Filler Type and Plausibility further, some planned comparisons were run to compare the +/-Plausible pair for the +Lexical Filler conditions on the one hand, and the –Lexical Filler conditions on the other. There was a significant difference between reading times for the +Lexical Filler pair of conditions (t1 (28) = 2.220, p < 0.05, t2 (23) = 2.184, p < 0.05) but not for the –Lexical Filler ones (t1 (28) = 1.032, p > 0.05, t2 (23) = 1.029, p > 0.05). These comparisons can be explained by the faster reading times for +Plausible as compared with –Plausible in the +Lexical Filler pair, and little or no difference in the –Lexical Filler pair.
The tail of the dependency yielded a main effect of Filler Type (F1(1, 28) = 6.863, p = 0.014, F2(1, 23) = 6.195, p < 0.05), with participants taking longer to read conditions involving lexical fillers than bare ones. There was no main effect of Plausibility, and no interaction between the factors was found (ps > 0.05).
Table 1 reports a range of eye-fixation measurements for all regions of the four main conditions. These measurements include: (i) first pass reading times, which is the sum of fixations after a region has initially been entered until it is first exited; (ii) regression path reading times, which is the sum of first pass and initial rereading of a region until it is first exited to the right; (iii) rereading times, which is the sum of fixations that occur in a region after it was first exited; (iv) total reading times, which gives the overall reading time for a region; (v) regressions in, which refers to the number of times participants looked back into a region after they had moved on to subsequent regions of text; and (vi) regressions out, which refers to the number of times participants looked back to previous regions whilst in a given region before moving onto new regions of text to the right.
With respect to the two regions already presented in Figures 2-3, Table 1 shows numerical trends at the constituent eye-fixation measures. However, the variance at these constituent measures of overall reading time was not sufficient to yield statistically reliable patterns at the p < 0.05 α-level.3 Similarly, no statistically significant data patterns were found for any measures at any of the other non-critical regions of the sentences, as one would expect.
3.7.2. Weaker island conditions
As was illustrated in (6), the four weaker island conditions that manipulated the lexicality of the filler element and the plausibility of its relationship with the inner-island verb were also presented in conditions that should have reduced the island effects. No statistically significant patterns were found at any regions (ps > 0.05) save for the critical verb built and the tail of dependency regions stood by_ and during the show. Table 2 shows the same reading time measures across all regions as are reported for the main materials in Table 1.⁴
The critical verb built revealed a main effect of Plausibility in total reading times (F1 (1,28) = 4.562, p < 0.05; F2 (1,23) = 4.641, p < 0.05) and first pass (F1 (1,28) = 4.391, p < 0.05; F2 (1,23) = 4.311, p < 0.05) but no main effect of Filler Type. (Note that the other measures also show similar numerical trends.) There was no interaction between the two factors at any measurement (ps > 0.05). This reflects the fact that for this set of non-island environments, the plausibility of filler-verb relationships appears to be evaluated at the verb for both lexical and bare filler conditions. Therefore, it must be the presence of the stronger island constraint configuration that prevents the plausibility of the bare fillers’ associations with the critical verb from being evaluated for the main experimental materials, whilst lexical fillers are able to facilitate such an evaluation irrespective of the strength of the structure’s island-hood.
The other finding from the weaker island conditions was at the tail of the dependency (the stood by_ and during the show regions), and echoed findings from the main conditions with the stronger islands. Namely, the lexical filler conditions resulted in longer reading times than their bare counterparts, most likely as a consequence of the greater amount of information that requires reconstruction at the gap site (De Vincenzi 1996, Shapiro et al. 1999, Shapiro 2000). Again, whilst the numerical trends are only statistically significant for total reading times (stood by_: F1 (1,28) = 4.27, p < 0.05; F2 (1,23) = 4.19, p = 0.05; during the show: F1 (1,28) = 7.21, p = 0.01; F2 (1,23) = 6.97, p = 0.01), the same pattern can be observed for the other measures too.
3.7.3. Omnibus analysis
The factor ISLAND TYPE was also added to the analysis, reflecting the stronger island manipulations in (5), reported in Table 1, versus the weaker ones of (6), reported in Table 2. The omnibus analysis therefore had a 2x2x2 design (Filler Type, Plausibility, Island Type). Total reading times at the critical region, namely the inner-island verb built, yielded a main effect of Plausibility (F1 (1,28) = 4.32, p < .05; F2 (1,23) = 4.25, p < .05) and a two-way interaction between Plausibility and Island Type (F1 (1,28) = 4.11, p = .05; F2 (1,23) = 4.09, p = .05). This indicates that the effect of Plausibility differs depending on whether the island manipulation was stronger or weaker. Although main effects of Plausibility were reported in the separate analyses for both the main conditions and the weaker island conditions, the F-statistic itself was bigger for the weaker islands than for the main conditions. This is borne out by comparing the Cohen’s d effect size scores for these two Plausibility main effects, which are indeed bigger for the weaker island conditions (.64) than for the main conditions (.59). The source of the interaction would seem to be, then, that Plausibility has a greater effect overall in the weaker island conditions. This is likely to be because the factor Filler Type modulated the effect of Plausibility in the main conditions, meaning only Lexical Fillers allowed an effect of Plausibility to emerge. In the weaker island materials, the overall effect of Plausibility seems to be made bigger by not being modulated by Filler Type at all, affecting both +Lexical Filler and –Lexical Filler conditions similarly. However, the two-way interaction was not further modulated by Filler Type statistically to give a three-way interaction.
Finally, the omnibus analysis also yielded a main effect of Filler Type at the tail of the dependency, during the show (F1 (1,28) = 10.42, p < 0.01; F2 (1,23) = 9.81, p < 0.01), with the lexical fillers taking significantly longer to reconstruct at the gap site across all conditions relative to the bare ones. No other main effects or interactions were found in the omnibus analysis.
³ The one exception is a main effect of Filler Type that was also found for rereading times at the tail-of-dependency region (F1 (1,28) = 4.262, p < 0.05; F2 (1,23) = 4.121, p = 0.05). Also, note that Filler Type was not entered into the analysis of the which stage/what region, since the segment necessarily contained entirely different words across the +/- LF conditions respectively.
⁴ As with the main materials, Filler Type was not entered into the analysis for the which stage/what region.
The inner-island verb built was read significantly faster when its relationship to the fronted filler was plausible than when it was implausible. In the main conditions, where the island manipulations were designed to be stronger, this applied only when the fillers were lexical, and not when they were bare. In the weaker island materials, this applied whether or not the fillers were lexical. This is taken to mean that real-time sensitivity to the stronger island manipulations was greater when the filler was bare than when it was lexical. On the other hand, the processor’s gap search was not constrained by the weaker island manipulations at all. Thus, the lexicality of the filler was irrelevant in determining whether or not the processor searched through the weaker island domains for a gap site: it did so whether or not the filler was lexical. This tells us that the effects of Filler Type are not just general effects of the lexicality of the filler, but are specifically linked to island-hood (or at least the strength of it).
The other key finding came from the tail of the dependency (e.g. during the show), where the lexical filler conditions took longer to read than the corresponding bare ones (across all conditions). This is consistent with previous studies that find, outside of an island domain, that a lexical filler takes longer to reactivate than a bare one (e.g. De Vincenzi 1996, Shapiro et al. 1999, Shapiro 2000, Boxell 2012, submitted). This is likely to be because the greater amount of information contained within lexical fillers taxes the processor more greatly when it is reactivated and integrated into the structure.
It may also be of interest to note that the tail of the dependency region, and the region preceding it that contains the gap itself (stood by_), both evoked large numbers of regressions out across conditions.⁵ This suggests that on encountering the terminal gap site where the filler-gap dependency should end, the material that is being activated is related to that of a prior region. It is of similar interest to note that the filler region itself (which stage/what) received a larger number of regressions in than any other region of the text. This indicates that it is of relevance to downstream regions, most likely the aforementioned gap site regions, and is therefore the likely destination of many of the regressions out from them. Additionally, notice that the critical inner-island verb (built) patterns rather like the tail of the dependency in this respect, causing a similarly large number of regressions out. And although the difference is not statistically significant, notice that the +LF conditions in the main stronger island conditions (but not in the weaker island conditions) result in more regressions out at this region than the –LF ones. This could be a further indication that it is the presence of a lexical filler that allows a gap-search to be conducted in the island region, and therefore creates more backward eye-movements to the filler.
As discussed in §1.1, there are a range of processing and grammatical theories attempting to account for lexical filler-gap dependencies and/or their reduced sensitivity to island constraints. Each of these will now be evaluated in turn with regard to how well they fit the data reported above.
The Memory Facilitation Hypothesis (Hofmeister 2007) proposed that the strong memory trace for lexical fillers makes them easier to reactivate at a gap site and that this reduction in processing cost percolates up to an overall improvement in acceptability for island violations. This certainly predicts the reported reduction in real-time sensitivity to the island domains that are preceded by lexical fillers. However, the more specific claim of the Memory Facilitation Hypothesis to the effect that the stronger memory trace for lexical fillers should make linking them to downstream positions easier and thus faster was not borne out. There was no indication that linking the lexical filler to the inner-island verb built was faster than for the bare filler conditions, either in the main conditions or in the weaker island conditions. Of course, the bare fillers do not seem to permit a gap search at all when confronted with the stronger islands of the main conditions, but they do for the weaker islands (as evidenced by sensitivity to the plausibility manipulation at the inner-island verb). In sum, whether we compare the lexical filler’s link with the inner-island verb to a bare filler that either does, or does not, itself also forge a link with the inner-island verb, the memory trace of the lexical filler does not facilitate reading times. Furthermore, the findings at the tail of the dependency, where lexical fillers inhibit reactivation relative to bare ones, contradict the Memory Facilitation Hypothesis. On the whole, then, it would seem that many of the more detailed predictions made by the Memory Facilitation Hypothesis are not compatible with the findings of the current study.
An alternative processing theory, the Stabilizer Hypothesis (Boxell submitted), suggests that lexical fillers avail a shallow lexically-driven processing strategy in the place of full grammatical computation as the input begins to resemble an abstract template of an island violation. Under this account, it is the shallower parse that should facilitate processing by making the representation it derives less detailed, but only for the island domain itself. This shallow processing means the island violation itself is never constructed in the grammatical representation, and so the parse is saved from it pre-emptively (or is “stabilized”). This account is supported insofar as the real-time sensitivity to the island domain is reduced when the preceding filler is lexical, which one would expect if it has been parsed more shallowly and therefore no longer constrains the gap search. The Stabilizer Hypothesis can also be reconciled with the finding that reconstruction of the lexical filler at the licit gap site at the tail of the dependency is not facilitated relative to the bare fillers. This is because the shallow parse is only posited to affect grammatical constraint violations (i.e. the island domain) selectively, and not the rest of the sentence. Similarly, the lack of facilitation effects at the built region of the weaker island conditions is also expected under this account, since the islands do not seem to constrain the gap search in those conditions. However, the fact that lexical fillers do not seem to facilitate processing at built inside the stronger islands of the main conditions, relative to bare fillers where no gap search is being carried out inside the island domains at all, might be considered problematic for the Stabilizer Hypothesis. If the island domain is being processed shallowly, one might expect a facilitation effect compared to the bare filler condition, since the latter should still be incurring the cost of creating a full grammatical computation. As with the Memory Facilitation Hypothesis, then, the Stabilizer Hypothesis does not seem to fit the whole profile of the data collected. However, it does seem that more of the detailed predictions are met by the Stabilizer Hypothesis, making it a better fit for the current findings.
The class of theories that suggests lexical fillers are tougher to reactivate because of the increased information that has to be reconstructed in the underlying position is supported as far as the tail of the dependency itself, during the show, is concerned, where the data show exactly this (e.g. De Vincenzi 1996, Shapiro et al. 1999, Shapiro 2000). However, as with the theories already discussed, not all of the findings can be accounted for. Namely, at the built region there is no main effect of Filler Type indicating inhibition for lexical fillers relative to bare fillers. This is true for both the main conditions and the weaker island conditions. Recall that in the weaker island conditions, but not the main conditions, bare fillers are thought to be reactivated in this region. Although the plausibility of the filler’s relationship with built clearly is assessed, one speculative explanation for the lack of inhibition for the lexical filler might be that the processor has not yet fully reconstructed it when the direct object of built, namely the correct stand, is seen (perhaps through parafoveal vision). Because the direct object disconfirms the presence of a gap, this might have cut the reconstruction short before the inhibition effect had time to emerge at this region, in contrast to the full reconstructions found at the licit gap at the tail of the dependency.⁶
On a separate note, it does seem intuitively unlikely that lexical fillers can decrease the online sensitivity of island domains if they really do always impose extra processing costs, rather than alleviating them. As such, perhaps it should be unsurprising that the inhibitory effects of lexical fillers that are known to occur outside of island domains may not occur inside island domains in the same way. Indeed, this very dichotomy is predicted by the Stabilizer Hypothesis, where lexical fillers are supposed to facilitate the processing of island domains, whilst it is perfectly possible for them to be inhibitory elsewhere. This same issue also applies to the final class of processing theories, which posited that constructions with lexical fillers are generally more taxing than those without because of discourse-linking (e.g. Kaan et al. 2000, Shapiro 2000), increased set restriction or conceptual visualization (e.g. Goodluck et al. 2008, Donkers et al. 2011). Again, this is consistent with the findings at the tail of the dependency, where lexical fillers inhibit reactivation times at the gap site. However, such theories clearly do not fit the data of the current experiment overall. Such theories predict increased reading times across whole lexical filler-gap dependencies in general, which is clearly unsupported by the current findings both numerically and statistically (see Tables 1 and 2).
The first formal account reviewed was Pesetsky (1987). Recall that it suggests d-linked fillers (which for our present purposes are the lexical fillers) have a Q operator in the CP to which they have been fronted. This operator takes scope over the gap site using a binding operation, which absolves the filler-gap dependency from the locality constraints that are responsible for successive-cyclicity, and thus, wh-island constraints. Therefore, Pesetsky argues that whenever d-linked fillers are absolved from wh-islands this indicates the filler-gap dependency is being formed via a binding operation rather than via a series of within-clause successive-cyclic movements. If Pesetsky’s account is correct, then it too would be compatible with reduced sensitivity to wh-islands during real-time processing such as is reported by the current study. There may, however, be an independent empirical reason to question this account. Gibson & Warren (2004) developed an experimental paradigm for testing whether or not fillers are represented at intermediate clause boundaries, as is predicted by successive-cyclic movements. Boxell (2012) extended this paradigm to test Pesetsky’s (1987) account of d-linking. The materials compare lexical and bare fillers across conditions of equal linear length. In one condition there was a clearly marked putative wh-island clause boundary, and in the other there was not. The final tail of dependency reactivation downstream was facilitated in the condition where it was preceded by a clause boundary relative to the one where it was not for both lexical and bare fillers. This was taken to mean that the filler had been reactivated at the clause boundary, boosting its prominence in the memory trace and thus facilitating subsequent reactivations of the same element.⁷ Since Boxell (2012) found this for both lexical and bare fillers, it seems that filler-gap dependencies for both filler types are formulated via successive-cyclicity.
Particularly in view of the fact that the clause boundaries were designed to be putative wh-islands, Boxell argues that this finding is contra Pesetsky (1987), under whose account only the bare fillers should have intermediate reactivations. The d-linked ones should have bound their gap sites so as to avoid the island violation. Taken together, whilst Pesetsky’s (1987) account is broadly consistent with the findings of the current study, we may have independent cause to question its validity as a viable account in the first place in light of a prior more specific experimental test of its predictions that failed to find support for it.
The second formal account under discussion was (featural) Relativized Minimality (Rizzi 1990, 2001). This proposes that in a configuration ‘XYZ’, where X is the filler and Z is the gap, Y constitutes an intervening (island) boundary for X-Z only if it is of the same type as them. Featural Relativized Minimality (fRM) developed this idea by encoding the amount of interference caused by Y to X-Z as a function of the amount of featural specification Y has in common with X-Z. According to Rizzi and colleagues (e.g. Villata et al. 2014), a +/-N(oun) feature is thought to capture the lexical/bare filler contrast that the current paper is interested in. Wh-islands caused by an intermediate CP with a bare wh-word, as in the whether islands of the conditions in the current experiment, should cause less disruption to a lexical filler than a bare one. This is a result of the –N feature being shared between the bare fillers and the intervening wh-word, but not between lexical fillers and the intervening wh-word. To this extent, then, fRM can account for the findings of the current experiment. Again, though, there may be some independent cause for concern. Consider (8-9), where lexical fillers continue to improve acceptability of island effects even when the intervening wh-word shares its +N feature. fRM predicts that (9) should have a greater island effect than (8) as a result of the shared +N feature; however, this prediction was recently disconfirmed in an acceptability rating study by Villata et al. (2014). The lexicality of the second wh-phrase (e.g. which woman) significantly increased acceptability when the first wh-phrase was also lexical, as in (9), but not when it was bare, as in (8).
The +/-N feature within the fRM framework, then, does not capture all of the facts. Of course this picture could in principle change if one were able to redefine the featural specification or “type” of X-Z and Y such that it no longer makes some erroneous predictions about lexical and bare fillers’ relationship with wh-islands. However, since Villata et al. (2014) represents the most recent attempt to address the phenomenon within the Relativized Minimality framework, it would seem such a reworking of the theory remains elusive for the moment. In sum, then, this leaves us in the same position we arrived at for Pesetsky’s (1987) account: while the reduced online sensitivity to wh-islands preceded by lexical fillers is in principle explicable with an fRM approach, its poor track record in a separate experiment that explicitly attempted to test its predictions more directly calls into question the overall viability of the approach to begin with.
The final group of grammatical theories (e.g. Pesetsky 2000, Shields 2008, Van Craenenbroeck 2010) all proposed that lexical fillers use subtle abstract features or covert movement to satisfy the locality constraints that cause wh-island effects. This means that the surface word order of structures can give the appearance of a wh-island violation. In principle these accounts are therefore also compatible with reduced real-time sensitivity to the wh-island constraint for lexical filler-gap dependencies. If the formal requirements for obeying the island constraint are met by these subtle grammatical features or movements then the island domain would no longer constrain the search for a gap site. The problem with these accounts from the perspective I adopt in the current paper, namely that of an experimentalist, is that testing for the proposed abstract properties is currently beyond our practical capabilities in the laboratory. While this class of theories are therefore perfectly good potential explanations, we have no way of testing to see whether they can be supported under controlled conditions, and without such tests for empirical validation we cannot even begin to see if they are the correct explanations or not.
⁵ There are also a large number of regressions out from the final region of the sentence (at Wembley). I am reluctant to provide even a speculative interpretation of this, since it could be either a spill-over effect from the behavior seen at the regions directly preceding it, or it is equally possible that it is a “wrap-up” effect.
⁶ Many thanks to an anonymous reviewer for this suggestion.
⁷ Note that within this paradigm conditions with the same structures (i.e. with and without the boundary), but with no filler-gap dependency at all, show that the facilitation effects downstream for the condition with the intervening boundary are no longer found. This tells us that the facilitation effects seen at the tail of the dependency are specifically related to the filler-gap dependency crossing a clause boundary, and not to some other element of the length-matched conditions. This is why the effects are taken to indicate the presence of successive-cyclic movement in particular.
The contribution of the current study has been to show that lexical fillers, relative to bare fillers, are able to reduce sensitivity to the restriction that wh-island constraints place on the processor’s search for a gap site. The current study made use of a highly non-canonical doubly embedded island structure that meant the processor would not have been able to use canonical predictions about potential future grammatical gaps downstream (or the lack thereof). The presence of such predictions may have forced the processor to carry out a gap search inside a wh-island irrespective of filler type, or else permitted it to obey the island irrespective of filler type, as a result of an awareness of the (un)availability of alternative grammatical gap sites for the filler outside the island. Adopting this sentence structure made it possible to see, for the first time to my knowledge, the relative effect of lexical fillers to bare fillers on online sensitivity to a wh-island. This is a long-overdue complement to the well-attested fact that lexical fillers improve the overall acceptability of wh-island violations (Karttunen 1977, Maling & Zaenen 1982, Pesetsky 1987, 2000, Goodluck et al. 2008, Hofmeister & Sag 2010, Boxell 2012, submitted).
The experiment demonstrated that a manipulation of the plausibility of the filler’s relationship with the subcategorizing verb inside the wh-island structure was evaluated when the filler was lexical, but not when bare. Furthermore, a series of conditions that attempted to weaken the island effects by replacing intervening wh-words with complementizers revealed that, in those cases, both lexical and bare fillers permitted an evaluation of their relationship with the verb. This shows that the findings are not just related to the lexicality, but also to the strength of the island-hood of the structure. Thus, taken together, it was concluded that the lexical filler, but not the bare one, reduces sensitivity to wh-island domains.
A range of processing and grammatical explanations for the findings were discussed. Both the Memory Facilitation Hypothesis and the Stabilizer Hypothesis can offer suitable explanations for why lexical fillers reduce real-time sensitivity to wh-island constraints. The former characterizes this in terms of lexical fillers’ stronger memory trace reducing reactivation costs and overall processing burden of the structure, ameliorating islands where they occur. The latter characterizes the phenomenon in terms of lexical fillers enabling a shallow processing route for structures that contain predictable grammatical violations such that the full computation of the violations is strategically avoided. However, it was felt that overall the profile of the data found in the current experiment weakened the Memory Facilitation Hypothesis relative to the Stabilizer Hypothesis. Whilst both hypotheses predicted a reduction in island sensitivity when preceded by a lexical filler, the Memory Facilitation Hypothesis predicted facilitation effects for lexical fillers in general. This is where there is a crucial difference with the Stabilizer Hypothesis, which only predicts facilitation effects in the form of shallow processing during putative island violations themselves. In fact, no facilitation effects were found for lexical fillers inside or outside the island domains, which is potentially problematic for both accounts. But at the tail of the dependency, inhibition effects were even found for lexical fillers, which is particularly problematic for the Memory Facilitation Hypothesis.
A range of formal accounts were also considered, such as Pesetsky’s (1987) d-linking account and Rizzi’s (1990, 2001) (featural) Relativized Minimality account. Whilst both of these are compatible with the finding that lexical fillers should reduce sensitivity to island domains in principle, both of them also failed to pass key empirical experimental tests of the predictions they make. This is taken to mean their explanatory power is seriously weakened. A range of other more abstract formal proposals were briefly discussed, but not considered in further detail since they are too abstract to be considered testable experimentally.
Taken together, the main contribution of the present work is the demonstration, for the first time, that lexical fillers reduce sensitivity to wh-island domains during real-time sentence processing. Many of the processing and grammatical accounts of the relationship between lexical fillers and islands can account for this finding in and of itself, but none of them captures the current findings in their entirety. Overall, the account that runs into the fewest problems, either within the data set of the current report, or within the context of separate empirical investigations into their claims, would be the Stabilizer Hypothesis.
[Figure 1.] Summary of the Stabilizer Hypothesis
[Figure 2.] Total reading times for the inner island verb
[Table 1.] Mean eye-fixation measurements (ms) across all regions of main conditions (SDs)
[Figure 3.] Total reading times for the tail of the dependency
[Table 2.] Mean eye fixation measurements (ms) for weaker island conditions (SDs)