Open Access Research Article

Numerical Congruency Effect in the Sentence-Picture Verification Task

Published Online: https://doi.org/10.1027/1618-3169/a000358

Abstract

In two experiments, we showed that irrelevant numerical information influenced the speed of sentence-picture verification. Participants were asked to verify whether the concept mentioned in a sentence matched the object presented in a subsequent picture. Concurrently, the number word attached to the concept in the sentence and the quantity of objects presented in the picture were manipulated (numerical congruency). The number of objects varied from one to four. In Experiment 1, participants read statements such as three dogs. In Experiment 2, they read sentences such as three dogs were wandering in the street. In both experiments, the verification speed revealed an interaction between response and numerical congruency. The verification times for concept-object match were faster when there was also numerical congruence (compared with incongruence) between the number word and quantity. On the other hand, there was no difference between numerical congruence and incongruence when the concept and object mismatched. The results are interpreted as evidence for the symbol grounding of number words in the perceptual representation of small quantities, that is, quantities falling in the subitization range.

One of the dominant theoretical frameworks in numerical cognition proposes that numerals and number words acquire their meaning by being mapped onto a nonsymbolic analog representation known as a mental “number line” (Dehaene, 2009; Dehaene & Cohen, 1995; Leibovich & Ansari, 2016). This visuospatial representation of magnitude is part of the innate system for perception of numerosity shared with other species. Behavioral findings such as numerical size and distance effects (Moyer & Landauer, 1967; 1973) and the spatial-numerical association of response codes (SNARC) effect (Dehaene, Bossini, & Giraux, 1993) are interpreted as evidence for such mapping (Fias & Fischer, 2005; Fischer & Shaki, 2014). Functional neuroimaging and neuropsychological studies in human beings and single-unit recordings in monkeys further support the existence of an interaction between numbers and space by demonstrating the involvement of the parietal cortex during the execution of various numerical tasks (Dehaene, Piazza, Pinel, & Cohen, 2003; Hubbard, Piazza, Pinel, & Dehaene, 2005; Nieder & Dehaene, 2009).

Although it seems plausible that number symbols interact with corresponding representations of quantity, several findings have called this association into question. Koechlin, Naccache, Block, and Dehaene (1999) examined repetition priming across different number notations. Participants were asked to classify whether the presented number was larger or smaller than 5 on two successive occasions. The first comparison served as a prime and the second as a target. Numbers appeared as numerals, words, or random patterns of dots. Koechlin et al. (1999) found no priming across symbolic and nonsymbolic notations. Recently, Lyons, Ansari, and Beilock (2012) showed that number comparisons were slower when numerals were compared to dot patterns than when two numerals or two dot patterns were compared with each other. This finding did not depend on the numerical size or distance. Based on this finding, Lyons et al. (2012) concluded that number symbols are estranged from the sense of the actual quantities they represent. In a recent review of the literature on symbol grounding in numerical cognition, Leibovich and Ansari (2016) also concluded that there is no strong evidence for direct mapping of number symbols onto analog representation of quantity. Interestingly, this claim is at odds with the growing body of work showing that conceptual knowledge is grounded in experiential traces in sensory-motor systems (Barsalou, 2008; Martin, 2007), and there is evidence that number concepts are also grounded in perception and action (Fischer & Brugger, 2011; Lindemann & Fischer, 2015).

Another line of research on numerical Stroop and size congruency effects has also been taken as evidence for the automatic activation of number meaning. When the task is to enumerate an array of numerals, it is usually found that participants are slower when the meaning of numerals does not correspond with the numerosity of the array. For instance, it is faster to enumerate the non-numerical string *** than the numerical string 444, analogous to the classic color-word Stroop interference (Pavese & Umiltà, 1998, 1999; Shor, 1971; Windes, 1968). In a similar vein, Henik and Tzelgov (1982) showed that number meaning interferes with the comparison of physical size. However, Pansky and Algom (1999, 2002) found that the interference effects in the comparative judgments of numbers were reduced or eliminated when attention to the irrelevant number dimension was carefully controlled. In other words, interference effects are subject to contextual modulations, which suggests that they only partially reflect automatic processing.

Ganor-Stern, Tzelgov, and Meiran (2013) argued that the activation of unintended processing on irrelevant dimensions is triggered or elicited by intended actions required by the experimental task. The amount of triggering depends on the degree of overlap between the parameters of the intended and unintended processes. In the classical interference paradigm, enumeration (or physical size comparison) is part of the task, and by merely spreading activation, it triggers the activation of irrelevant symbolic representation of number. Tzelgov and Ganor-Stern (2005) noted that in all studies that employed an interference paradigm, triggering by the task was substantial.

The activation of an unintended process is not restricted to the task demands, and it can occur at different levels of processing (Ganor-Stern et al., 2013). For instance, Santens and Verguts (2011) found evidence that the size congruity effect is a consequence of the alignment between stimulus dimensions (physical and numerical size) and response codes. In a congruent condition, both stimulus dimensions activate the same response code. Conversely, in an incongruent condition, physical size and numerical size activate different response codes, resulting in a slower response. Another factor that modulates the size of the size congruity effect is the ratio between numbers and the ratio between physical sizes used in the comparison task (Leibovich, Diesendruck, Rubinstein, & Henik, 2013).

As the preceding discussion shows, many factors can contribute to the creation and modulation of a congruity effect. Consequently, it is difficult to draw a firm conclusion about the grounding of numbers in representation of quantity based on this evidence alone. Recently, Gabay, Leibovich, Henik, and Gronau (2013) showed that conceptual size can prime numerical value in a task that does not require the activation of magnitude representation. Following their lead, we asked whether numbers can prime numerosity under conditions that do not trigger the activation of the numerical dimension.

The aim of the present work is to examine whether an interaction between numbers and their quantities exists in a task that does not require numerical processing at all, that is, in the condition of minimal triggering by the task, according to Tzelgov and Ganor-Stern (2005). This should provide a stronger argument for the grounding of numbers in the corresponding numerosity. To this end, we employed a sentence-picture verification task where the number of objects mentioned in a sentence was manipulated to produce matches or mismatches with the numerosity of objects in a subsequent visual presentation. The task for participants was to verify whether the concept mentioned in the sentence matched with the object presented in the subsequent picture. Therefore, number words and numerosity were both irrelevant to the task. Our hypothesis was that despite their irrelevance, the number words would activate their nonsymbolic representation during reading in the same way that other concepts activated their corresponding sensory-motor representations of referent objects (Barsalou, 2008; Zwaan, 2004). Indeed, we found evidence for a Stroop-like interference effect, where the match or mismatch in irrelevant numerical dimension modulated the speed of verification of the match between concept and object. We also found that the observed effect could not be attributed to the stimulus-response compatibility and that it was not modulated by the symbolic distance between number word and numerosity.

Experiment 1

Method

Participants

Forty-eight (7 males, age range: 19–24) undergraduate psychology students from the University of Rijeka, Rijeka, participated in the experiment in exchange for course credits.

Apparatus

The stimulus presentation was controlled by E-Prime 2.0 stimulus presentation software (Schneider, Eschman, & Zuccolotto, 2002) running on a PC with a Samsung 19′′ CRT monitor. The responses were collected using the Serial Response Box with millisecond accuracy.

Stimuli and Procedure

Sixty-four statements such as three dogs were created: 32 statements that matched with the object and 32 that did not match with the object in the subsequent picture presentation. Drawings of objects were taken from Rossion and Pourtois’ (2004) version of the Snodgrass and Vanderwart (1980) standardized database, with color and texture added to the objects. The whole set of objects is available online (http://wiki.cnbc.cmu.edu/Objects/Snodgrass and Vanderwart “Like” Objects). As noted by Rossion and Pourtois (2004), color and texture improve object recognition by reducing confusion among similar objects. Within each set, half of the statements also matched the number with quantity, and the other half did not, creating a 2 × 2 factorial design with response (Yes vs. No) and numerical congruency (Congruent vs. Incongruent) as repeated-measures factors (see Figure 1). The number words in the statements and the quantities appearing in the pictures varied within a range of one to four. The appearance of each number word was balanced across conditions. We employed words for small numbers (one, two, three, or four) in statements because rapid and accurate enumeration (subitization) is possible only for small sets of visually presented items (Revkin, Piazza, Izard, Cohen, & Dehaene, 2008; Trick & Pylyshyn, 1994).
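As an illustration only, the 2 × 2 structure described above could be assembled along the following lines. This is a minimal sketch under stated assumptions, not the procedure actually used: the function and field names are hypothetical, the concepts are placeholders, and the random choice of incongruent quantities and mismatching pictures is an assumption that the text does not specify.

```python
import random
from collections import Counter

# Hypothetical cell labels for the 2 x 2 design (response x numerical congruency).
CELLS = [(response, congruency)
         for response in ("Yes", "No")
         for congruency in ("Congruent", "Incongruent")]


def build_trials(concepts, seed=0):
    """Assign 64 concepts to the four design cells, 16 per cell.

    Assumptions (not from the article): incongruent quantities and
    mismatching pictures are drawn at random.
    """
    if len(concepts) != 64:
        raise ValueError("expected 64 concepts")
    rng = random.Random(seed)
    trials = []
    for i, concept in enumerate(concepts):
        response, congruency = CELLS[i // 16]
        number = i % 4 + 1  # each number word occurs four times per cell
        if congruency == "Congruent":
            quantity = number
        else:
            quantity = rng.choice([q for q in range(1, 5) if q != number])
        if response == "Yes":
            pictured = concept
        else:
            pictured = rng.choice([c for c in concepts if c != concept])
        trials.append({"concept": concept, "pictured": pictured,
                       "number": number, "quantity": quantity,
                       "response": response, "congruency": congruency})
    rng.shuffle(trials)  # randomized presentation order
    return trials


def cell_counts(trials):
    """Number of trials in each response x congruency cell."""
    return Counter((t["response"], t["congruency"]) for t in trials)
```

Calling `cell_counts(build_trials(concepts))` on any list of 64 placeholder concepts gives 16 trials per cell, with each number word appearing equally often within a cell.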

Figure 1 Factorial design of the experiment with the example of the statement-picture pair for all combinations of the response (Yes vs. No) and numerical congruency (Congruent vs. Incongruent).

Every trial began with a fixation string (“XXX”) presented at the center of the screen for 300 ms, followed by the presentation of the statement in lowercase Arial font (size 18) for 1,000 ms. The statement was replaced by a blank screen for 100 ms, followed by the presentation of the image containing replications of the same picture aligned and centered horizontally on the screen. The height of the single picture varied between 2 and 3.5 cm, and the width varied between 3 and 4.5 cm. When all four replicates of the picture were presented, the width of the whole image was 25 cm. The pictures were always aligned along the horizontal dimension. The image remained on the screen until a response was made. The viewing distance was approximately 70 cm. The background was light gray throughout the trial.
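The physical sizes and viewing distance given above can be converted into visual angle with the standard formula 2·arctan(size / (2·distance)). The sketch below is a convenience calculation, not part of the reported method; the function name is hypothetical, and the viewing distance is treated as exactly 70 cm although the text gives it only approximately.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle in degrees subtended by a stimulus of size_cm at distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# Full image with four replicates: 25 cm wide, viewed from about 70 cm.
full_width_angle = visual_angle_deg(25, 70)
```

Under these assumptions the four-object image spans roughly 20 degrees horizontally, so the outer objects lie well outside central vision.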

The task for participants was to verify whether the concept mentioned in the statement matched the object presented in the image. The instructions given to participants emphasized that the number of objects in the statements and in the images would vary across trials but that they could ignore this information since it was irrelevant to the task. The instructions also emphasized the need to respond quickly but accurately. Half of the participants responded yes with their left index finger and no with their right index finger. The other half of the participants responded with the opposite assignment of yes and no responses. Feedback was provided with the red word “INCORRECT” when an error was made in order to encourage participants to avoid making mistakes. The feedback duration was 500 ms. When the correct answer was given, a new trial started immediately after the response was made. There were eight practice trials using statement-picture pairs that were not used in the experimental block. The practice block was followed by a single block of 64 experimental trials. The order of presentation of the statement-picture pairs was randomized across participants.

Results and Discussion

RT Analysis

Raw data for Experiment 1 can be found in the Electronic Supplementary Materials, ESM 1. Error trials were removed from the analysis (4.0% of data). There were no latencies below 200 ms, and only 0.5% of correct trials were above 1,500 ms. Therefore, we decided to keep all correct trials in the analysis. The means and corresponding 95% confidence intervals are displayed in Figure 2A. The latencies of the correct responses were submitted to a 2 × 2 analysis of variance (ANOVA) with response (Yes vs. No) and numerical congruency (Congruent vs. Incongruent) as repeated-measures factors.
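The screening described above (dropping error trials, then checking how many correct latencies fall outside the 200–1,500 ms window) can be sketched as follows. The data layout and names are hypothetical, and the toy values are invented purely to exercise the function; they are not the experimental data.

```python
def summarize_exclusions(trials, low=200, high=1500):
    """Percentage of error trials, and of correct trials outside [low, high] ms.

    trials: non-empty list of dicts with 'rt' (ms) and 'correct' (bool),
    containing at least one correct trial.
    """
    correct = [t for t in trials if t["correct"]]
    n_all, n_correct = len(trials), len(correct)
    return {
        "error_pct": 100 * (n_all - n_correct) / n_all,
        "fast_pct": 100 * sum(t["rt"] < low for t in correct) / n_correct,
        "slow_pct": 100 * sum(t["rt"] > high for t in correct) / n_correct,
    }


# Invented toy data: ten trials, one error, one slow correct response.
toy_rts = [450, 520, 610, 1600, 480, 530, 700, 560, 590, 640]
toy_trials = [{"rt": rt, "correct": i != 2} for i, rt in enumerate(toy_rts)]
toy_summary = summarize_exclusions(toy_trials)
```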

Figure 2 The mean latencies of correct verifications (in milliseconds) and error rates (in percentages) observed in Experiment 1 (A) and Experiment 2 (B) are shown as a function of response (Yes and No) and numerical congruency (Congruent and Incongruent). Error bars represent 95% confidence intervals for repeated-measure design following Cousineau (2005) and Morey (2008).
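The within-subject confidence intervals cited in the caption are commonly computed by removing between-participant variability (Cousineau, 2005) and then applying a correction factor for the number of conditions (Morey, 2008). The following is a minimal sketch of that general recipe, not the authors' code: the function name and data layout are hypothetical, and the critical t value is passed in by the caller (about 2.01 for 47 degrees of freedom) rather than computed.

```python
import math
from statistics import mean, variance


def cousineau_morey_halfwidths(data, t_crit):
    """CI half-width per condition for a repeated-measures design.

    data: one row per participant, one column per within-subject condition.
    t_crit: critical t value for n - 1 degrees of freedom (supplied by caller).
    """
    n, m = len(data), len(data[0])
    grand = mean(v for row in data for v in row)
    # Cousineau step: subtract each participant's mean, add the grand mean.
    normed = [[v - mean(row) + grand for v in row] for row in data]
    # Morey step: inflate the variance by m / (m - 1).
    morey = m / (m - 1)
    return [t_crit * math.sqrt(morey * variance([row[j] for row in normed]) / n)
            for j in range(m)]
```

When participants differ only by a constant offset, the normalized scores are identical across participants and the intervals collapse to zero, which is the intended behavior: only variability in the condition effect contributes to the error bars.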

ANOVA revealed that there was no main effect of response, F(1, 47) = 2.37, p = .130, ηp2 = .05. There was a significant main effect of numerical congruency, F(1, 47) = 12.98, p < .001, ηp2 = .22, showing faster responses in the numerically Congruent condition (M = 543 ms, SE = 6.74) than in the Incongruent condition (M = 568 ms, SE = 7.47). Importantly, there was a significant two-way interaction between response and numerical congruency, F(1, 47) = 7.25, p = .010, ηp2 = .13. An analysis of simple main effects with a Holm-Bonferroni correction for multiple comparisons revealed that when the correct response was Yes, participants were 46 ms faster in the numerically Congruent condition than in the Incongruent condition, F(1, 47) = 17.39, p < .001, ηp2 = .27. On the other hand, when the correct response was No, participants were 4 ms faster in the numerically Congruent condition than in the Incongruent condition, but this difference was not statistically significant, F < 1, p > .20. The lack of effect for No response suggests that the faster response in the Yes-Congruent condition compared with the Yes-Incongruent condition cannot be attributed to the stimulus-response compatibility or polarity correspondence principle (Santens & Verguts, 2011; Proctor & Cho, 2006). Dimensional alignment between the stimulus and the response would predict that the No-Incongruent condition produces faster responses than the No-Congruent in the same way that the Yes-Congruent condition produces faster responses than the Yes-Incongruent condition. However, this was not observed in the data.
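The effect sizes reported above (printed as ηp2, that is, partial eta squared) can be cross-checked against the F ratios with the standard identity F·df_effect / (F·df_effect + df_error). The sketch below is only a consistency check on the reported values; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Values reported in the text for Experiment 1, all with df = (1, 47).
congruency_main = round(partial_eta_squared(12.98, 1, 47), 2)
interaction = round(partial_eta_squared(7.25, 1, 47), 2)
yes_simple_effect = round(partial_eta_squared(17.39, 1, 47), 2)
```

The three values round to .22, .13, and .27, matching the effect sizes given in the text.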

Accuracy was consistently high (> 90%) across all conditions, and we did not analyze it directly. Furthermore, Figure 2A shows that in the critical comparison, faster Yes responses in the Congruent condition than in the Incongruent condition were accompanied by a lower error rate. This suggests that there is no evidence for a speed-accuracy trade-off and that the theoretically relevant effect is observed in the RT data.

The SNARC and the MARC Effect

To check for the existence of the SNARC (Dehaene et al., 1993) and the MARC (linguistic markedness of response codes) effect (Nuerk, Iversen, & Willmes, 2004), we entered in the analysis the hand used to give a Yes response as a between-subject factor and the parity of number word as a within-subject factor. A 2 (Response: Yes vs. No) × 2 (Numerical Congruency: Congruent vs. Incongruent) × 2 (Hand: Left vs. Right) × 2 (Parity: Even vs. Odd) ANOVA revealed that there was no significant main effect of hand, F < 1, p > .20, or parity, F < 1, p > .20. The interaction between hand and parity was not significant, F < 1, p > .20, while the interaction between response and numerical congruency remained significant, F(1, 46) = 6.86, p = .012, ηp2 = .13. Moreover, all three-way interactions were nonsignificant, all Fs < 2, ps > .20, as was the four-way interaction, F < 1, p > .20, suggesting that the SNARC and MARC effects did not contribute to the observed interaction between response and numerical congruency.

Symbolic Distance Effect

Previous studies found evidence that interference in the numerical Stroop task is modulated by the symbolic distance between the numerals and their quantity (Pavese & Umiltà, 1998, 1999). In particular, it was found that Stroop interference is stronger when the symbolic distance is smaller, suggesting that numeral and quantity are both mapped onto a common representation of magnitude or a mental number line (Fias & Fischer, 2005; Fischer & Shaki, 2014). Here, symbolic distance was computed as number word minus numerosity. We should note that we did not precisely control for the symbolic distance in the Incongruent condition. In most trials, the symbolic distance ranged between −2 and 2. There were only a few trials with symbolic distances −3 and 3. Therefore, we decided to remove these trials from the analysis. Furthermore, one participant was removed from the analysis because of empty cells. The means and corresponding 95% confidence intervals are displayed in Figure 3A. Due to a substantial departure from sphericity, we used multivariate analysis of variance (MANOVA) with the Pillai test instead of ANOVA with a correction for violation of sphericity. Data for Yes responses were submitted to a one-way MANOVA with symbolic distance (−2, −1, 0, 1, 2) as a within-subject factor. The analysis revealed a main effect of symbolic distance, F(4, 43) = 5.72, p < .001, ηp2 = .35. Pairwise comparisons with a Holm-Bonferroni correction for multiple comparisons revealed that there was no statistically significant difference between close (1) and far (2) symbolic distances when the number word was greater than the numerosity, ΔM = 12 ms, t(46) = .59, p > .20. On the other hand, when the number word was smaller than the numerosity, participants were faster in responding to the far (−2) compared with the close (−1) symbolic distance, ΔM = 55 ms, t(46) = 2.50, p = .032. This result provides only partial support for the hypothesis that the number words and numerosity are mapped onto a common mental number line because if that were the case, we would expect to observe a symbolic distance effect for both positive and negative distances.
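Symbolic distance as defined here (number word minus numerosity) can only take a limited set of values when both range from one to four. The sketch below enumerates the possible pairs; the function name is hypothetical, and the counts describe the space of possible pairs, not the actual trial frequencies in the experiment.

```python
from collections import Counter
from itertools import product


def distance_counts(lo=1, hi=4):
    """Count the (number word, numerosity) pairs that yield each symbolic distance."""
    values = range(lo, hi + 1)
    return Counter(word - numerosity for word, numerosity in product(values, repeat=2))


counts = distance_counts()
```

Only one pair produces each extreme distance (four vs. one object for 3, one vs. four objects for −3), whereas distances of ±1 arise from three pairs each. If pairs were sampled roughly evenly (an assumption), this alone would make the ±3 trials scarce.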

Figure 3 The mean latencies of correct Yes responses (in milliseconds) observed in Experiment 1 (A) and Experiment 2 (B) are shown as a function of the symbolic distance between number word and numerosity. Distance was computed as number word – numerosity. Error bars represent 95% confidence intervals for repeated-measure design following Cousineau (2005) and Morey (2008).

It should also be noted that the removal of trials with symbolic distances −3 and 3 did not disrupt the main finding of a significant two-way interaction between response and numerical congruency, F(1, 47) = 9.38, p = .004, ηp2 = .17. An analysis of simple main effects confirmed that when the correct response was Yes, participants were 50 ms faster in the numerically Congruent condition than in the Incongruent condition, F(1, 47) = 18.15, p < .001, ηp2 = .28. On the other hand, when the correct response was No, there was no difference between the numerically Congruent and Incongruent conditions, F < 1, p > .20.

Singular Versus Plural Nouns

Berent, Pinker, Tzelgov, Bibi, and Goldfarb (2005) found evidence that participants extracted quantity information from singular or plural forms of the presented noun. In particular, they found that participants took longer to enumerate the number of words on the screen (one vs. two) if it was incongruent with the word form (singular vs. plural). To check whether singular and plural forms contributed to the current findings, we entered the noun form as a separate within-subject factor in the analysis. A 2 (Response: Yes vs. No) × 2 (Numerical congruency: Congruent vs. Incongruent) × 2 (Noun Form: Singular vs. Plural) ANOVA revealed a significant main effect of numerical congruency, F(1, 47) = 12.39, p = .001, ηp2 = .21, showing faster responses in the Congruent condition (M = 541 ms, SE = 7.13) than in the Incongruent condition (M = 566 ms, SE = 6.86). There was no main effect of response, F(1, 47) = 2.73, p = .105, ηp2 = .05, or noun form, F < 1.5, p > .20. Furthermore, all two-way interactions were nonsignificant (all Fs < 2.5, ps > .10). However, the Response × Numerical Congruency × Noun Form interaction was significant, F(1, 47) = 6.48, p = .014, ηp2 = .12. An analysis of simple main effects revealed that there was no difference between singular and plural forms across all combinations of level of response and numerical congruency: Yes-Congruent, F < 1, p > .20; Yes-Incongruent, F(1, 47) = 4.91, p = .126, ηp2 = .09; No-Congruent, F(1, 47) = 2.64, p = .333, ηp2 = .05; and No-Incongruent, F < 1.5, p > .20.

On the other hand, when we compared the simple main effects for singular and plural forms separately, we found that the interaction between response and numerical congruency was restricted to the plural form because participants were 56 ms faster in the Congruent condition than in the Incongruent condition when the correct response was Yes, F(1, 47) = 18.25, p < .001, ηp2 = .28, but there was no difference between them when the correct response was No, F < 1, p > .20. When the statement was in singular form, there was no difference between the Congruent and Incongruent conditions in the Yes response, F < 1, p > .20, or the No response, F(1, 47) = 4.94, p = .093, ηp2 = .10. This is consistent with the finding of Berent et al. (2005) that only plural nouns are involved in the computation of quantity because singulars are linguistically unmarked for number.

Previous analysis has suggested that the processing of plural form indeed creates an expectation about numerosity that is violated if one object is presented. However, our hypothesis is that the number attached to the word creates a more specific expectation of the exact quantity to be presented in the picture and not just an expectation that there will be more than one object, as signaled by the plural. To disentangle these two possibilities, we ran a separate analysis by excluding all trials with a singular form in the statement and with one object in the picture. This analysis showed that the two-way interaction between response and numerical congruency remained statistically reliable, F(1, 47) = 6.43, p = .015, ηp2 = .12. An analysis of simple main effects confirmed that when the correct response was Yes, participants were 50 ms faster in the numerically Congruent condition than in the Incongruent condition, F(1, 47) = 11.27, p = .003, ηp2 = .19. On the other hand, when the correct response was No, there was no difference between the numerically congruent and incongruent conditions, F < 1, p > .20. Therefore, the interaction between response and numerical congruency for plural forms was not restricted to trials where a single object was presented. In other words, the current findings cannot be reduced to the effect of extracting quantity from the plural form of the noun (Berent et al., 2005).

In Experiment 1, we showed that the irrelevant numerical information given in the statement influenced the speed of decision making in the sentence-picture verification task. However, the statements used in Experiment 1 contained only the combinations of number word and concrete concept. It might be argued that full sentences describing more complex situations will prevent the automatic processing of the number, or at least reduce its effect on the verification task. Furthermore, we wanted to replicate the major finding of Experiment 1 using a different group of participants and a different research design (i.e., Latin square design with four counterbalanced lists).

Experiment 2

Method

Participants

A group of 33 (4 male, age range: 19–22) undergraduate psychology students from the Catholic University of Croatia, Zagreb, participated in the experiment in exchange for course credit. One participant was removed from the analysis because her error rate was > 15%, which was more than four standard deviations above the group mean.

Procedure

The procedure was the same as that in Experiment 1, except that statements were replaced with sentences in the form three dogs were wandering in the street. We constructed four lists that were counterbalanced for items and conditions. Each concept appeared in one of four possible combinations of conditions (response: Yes/No; numerical congruency: Congruent/Incongruent) in every list. Each participant was exposed to one list. The sequence of events during a single trial was as follows: fixation string (300 ms), sentence (3,000 ms), blank screen (100 ms), and an image with one to four replicates of the same picture (until response). The instructions, task, and feedback were the same as in Experiment 1.
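One common way to build four counterbalanced lists of this kind is to rotate each concept through the four condition cells across lists. The sketch below illustrates that rotation; it is an assumption about how such lists can be built, not a description of the lists actually used, and all names are hypothetical.

```python
from collections import Counter

# The four response x numerical congruency cells.
CONDITIONS = [("Yes", "Congruent"), ("Yes", "Incongruent"),
              ("No", "Congruent"), ("No", "Incongruent")]


def latin_square_lists(concepts):
    """Return four lists (dicts) mapping each concept to one condition cell.

    Across the four lists, every concept appears once in every cell.
    """
    return [{concept: CONDITIONS[(i + k) % 4] for i, concept in enumerate(concepts)}
            for k in range(4)]


def cells_per_list(assignment):
    """Number of concepts assigned to each cell within one list."""
    return Counter(assignment.values())
```

With 64 concepts, each list contains 16 concepts per cell, and a participant who sees a single list encounters every concept exactly once.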

Results and Discussion

RT Analysis

Raw data for Experiment 2 can be found in the Electronic Supplementary Materials, ESM 2. We followed the same approach as that of Experiment 1. Error trials were removed from the analysis (1.4% of data). There were no latencies below 200 ms, and the latency was above 1,500 ms in only 0.4% of trials; therefore, we decided to keep all correct trials in the analysis. The means and 95% confidence intervals are shown in Figure 2B.

In all analyses, the list was treated as a between-subject factor in order to increase the statistical power, but we do not report the effect of list or its interactions as this is not theoretically relevant (Pollatsek & Well, 1995). A mixed-design ANOVA with response (Yes vs. No) and numerical congruency (Congruent vs. Incongruent) as within-subject factors and list (1, 2, 3, 4) as a between-subject factor revealed that there was no significant main effect of response, F(1, 28) < 1, p > .20, ηp2 < .01. However, there was a significant main effect of numerical congruency, F(1, 28) = 6.98, p = .013, ηp2 = .20, showing faster responses in the numerically Congruent condition (M = 563 ms, SE = 6.76) than in the Incongruent condition (M = 576 ms, SE = 7.64). Replicating the finding from Experiment 1, there was a significant Response × Numerical Congruency interaction, F(1, 28) = 7.56, p = .010, ηp2 = .21. An analysis of simple main effects with a Holm-Bonferroni correction for multiple comparisons revealed that when the correct response was Yes, participants were 35 ms faster in the numerically Congruent than in the Incongruent condition, F(1, 28) = 10.62, p = .006, ηp2 = .28. On the other hand, when the correct response was No, there was no difference in the latencies between the Congruent and Incongruent conditions (ΔM = −7 ms), F(1, 28) = 0.83, p > .20, ηp2 = .03. Accuracy was consistently high (> 95%) across all conditions, and we did not analyze it explicitly. Figure 2B shows that there is no evidence of a speed-accuracy trade-off because faster Yes responses in the Congruent than in the Incongruent condition are accompanied by lower error rates. Furthermore, it should be noted that in Experiment 2, it was not possible to analyze the SNARC and the MARC effect because all participants had the same stimulus-response assignment.
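The Holm-Bonferroni correction applied to the simple main effects is a step-down procedure: the smallest of m p values is multiplied by m, the next by m − 1, and so on, with the adjusted values forced to be non-decreasing and capped at 1. A minimal sketch follows; the function name is hypothetical, and the p values in the usage line are invented for illustration rather than taken from the data.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The rank-th smallest p value is multiplied by (m - rank), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so a larger raw p never gets a smaller adjusted p.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


illustrative = holm_adjust([0.01, 0.04, 0.03])
```

For the invented inputs, the adjusted values are .03, .06, and .06: the largest raw p value inherits the adjusted value of the one ranked before it because of the monotonicity step.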

Symbolic Distance Effect

We analyzed whether the observed effect was modulated by the symbolic distance between the number word mentioned in the sentence and the quantity presented in the picture. Again, we removed trials in the Incongruent condition with symbolic distances of −3 and 3 and restricted our analysis to the Yes response. Descriptive data are shown in Figure 3B. A two-way MANOVA with symbolic distance (−2, −1, 0, 1, 2) as a within-subject factor and list (1, 2, 3, 4) as a between-subject factor revealed a statistically significant main effect of symbolic distance, F(4, 25) = 4.91, p = .005, ηp2 = .44. However, pairwise comparisons with a Holm-Bonferroni correction for multiple comparisons revealed that there was no statistically significant difference between close and far symbolic distances (1 vs. 2) when the number word was greater than the numerosity, ΔM = 36 ms, t(31) = 1.15, p > .20, or when the number word was smaller than the numerosity (−1 vs. −2), ΔM = −27 ms, t(31) = −1.19, p > .20. This analysis suggests that slow Yes responses in the Incongruent condition were not modulated by the symbolic distance between the number word and the numerosity.

It should also be noted that the removal of trials with symbolic distances −3 and 3 did not disrupt the main finding of a significant two-way interaction between response and numerical congruency, F(1, 28) = 7.13, p = .012, ηp2 = .20. An analysis of simple main effects confirmed that when the correct response was Yes, participants were 36 ms faster in the numerically congruent condition than in the incongruent condition, F(1, 28) = 10.62, p = .006, ηp2 = .28. On the other hand, when the correct response was No, there was no difference between numerically congruent and incongruent conditions, F < 1, p > .20.

Singular Versus Plural Nouns

As in Experiment 1, we checked whether the observed interaction between response and numerical congruency arose from the differential processing of singular versus plural noun forms. A 2 (Response: Yes vs. No) × 2 (Numerical Congruency: Congruent vs. Incongruent) × 2 (Noun Form: Singular vs. Plural) × 4 (List: 1, 2, 3, 4) ANOVA revealed a significant main effect of numerical congruency, F(1, 28) = 26.23, p < .001, ηp2 = .48, showing faster responses in the Congruent condition (M = 555 ms, SE = 7.13) than in the Incongruent condition (M = 582 ms, SE = 8.15). Furthermore, there was no main effect of response, F < 1, p > .20, or noun form, F < 1, p > .20. There was a significant two-way interaction between response and numerical congruency, F(1, 28) = 5.29, p = .029, ηp2 = .16. When the correct response was Yes, participants were 48 ms faster in the numerically Congruent than in the Incongruent condition, F(1, 28) = 17.94, p < .001, ηp2 = .39. On the other hand, when the correct response was No, there was no difference in the latencies between the Congruent and Incongruent conditions (ΔM = −7 ms), F < 1, p > .20. Moreover, there was a significant Response × Noun Form interaction, F(1, 28) = 8.89, p = .006, ηp2 = .24, and a significant Numerical Congruency × Noun Form interaction, F(1, 28) = 18.87, p < .001, ηp2 = .40. Importantly, the three-way Response × Numerical Congruency × Noun Form interaction was not significant, F < 1, p > .20. This analysis suggests that the present findings are not due to the quantity information provided by the singular or plural noun form.

Experiment 2 confirmed the basic finding of Experiment 1 that number-quantity incongruence interfered with concept-object verification even though both number word and quantity were irrelevant to the task. Moreover, we found that the observed effect was not modulated by symbolic distance, unlike the classical numerical Stroop effect (Pavese & Umiltà, 1998, 1999), and that it could not be reduced to the computation of number from the singular or plural noun forms (Berent et al., 2005).

General Discussion

We adapted the sentence-picture verification task to independently manipulate the conceptual and numerical match between the sentence and the picture. In this way, we ensured that both components of the number dimension were irrelevant to the task of matching concept with the object. Therefore, this task did not trigger numerical processing in the sense of Tzelgov and Ganor-Stern (2005). However, the slower responses in the numerically incongruent condition relative to the congruent condition imply that participants spontaneously matched number words with numerosity in parallel with matching the concept with the object in the picture. This is the first demonstration of the interaction between numbers and their quantities with the same task that was previously used to show interactions between symbolic representation and perceptual attributes, such as orientation (Stanfield & Zwaan, 2001) or shape (Zwaan, Stanfield, & Yaxley, 2002). According to Zwaan (2004), reading words activates sensory-motor experiences with their referents. For instance, reading the word red reactivates the perceptual experience of seeing red color. Behavioral data support this claim by showing that words influence the execution of perceptual tasks (Richter & Zwaan, 2009; Soto & Humphreys, 2007). Furthermore, a recent functional neuroimaging study showed that conceptual knowledge activates cortical areas dedicated to perception (Vandenbroucke, Fahrenfort, Meuwese, Scholte, & Lamme, 2016). Our data suggest that similar processes take place while number words are read, that is, they activate the representation of corresponding numerosity.

An alternative explanation of the observed interaction is that it arises at the decision or response selection stage. Proctor and Cho (2006) argued that the interaction between stimulus and response dimensions in the sentence-picture verification tasks arises from the polarity correspondence principle. This principle states that each pole of the bipolar dimension is coded as a “+” or “−” alternative, and if the polarities of the stimulus and response dimensions match in sign, a faster response should be observed. In the present context, the match between concept and object is the response dimension, where a Yes response is mapped onto the “+” pole, while a No response is mapped onto the “−” pole. Furthermore, we can assume that the numerical congruence is mapped onto the “+” pole, while the numerical incongruence is mapped onto the “−” pole of the stimulus dimension. Consequently, faster Yes responses in a condition of numerical congruence relative to numerical incongruence can arise from the correspondence between stimulus and response polarities. However, this explanation cannot be extended to the “−” polarity because there was no difference between numerical congruence and incongruence for No responses. Polarity correspondence would predict faster No responses for numerical incongruence compared to congruence, but this was not observed in Experiments 1 or 2.
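The polarity argument above can be made explicit with a short sketch. It encodes only the "+"/"−" coding described in the text and the qualitative pattern reported for Experiments 1 and 2; the names `POLARITY`, `predicted_faster`, and `OBSERVED_FASTER` are illustrative assumptions, and no response time values are assumed.

```python
# Polarity coding described in the text: "+" for Yes / congruent,
# "-" for No / incongruent. Names here are illustrative, not from the article.
POLARITY = {"Yes": +1, "No": -1, "Congruent": +1, "Incongruent": -1}


def polarity_match(response: str, congruency: str) -> bool:
    """True when the stimulus and response polarities share the same sign."""
    return POLARITY[response] == POLARITY[congruency]


def predicted_faster(response: str) -> str:
    """Congruency level that polarity correspondence predicts to be faster."""
    return next(
        c for c in ("Congruent", "Incongruent") if polarity_match(response, c)
    )


# Qualitative pattern reported in the text; None means no reliable difference.
OBSERVED_FASTER = {"Yes": "Congruent", "No": None}

for response in ("Yes", "No"):
    predicted = predicted_faster(response)
    observed = OBSERVED_FASTER[response]
    print(
        f"{response}: predicted faster = {predicted}, "
        f"observed faster = {observed}, holds = {predicted == observed}"
    )
```

Under this coding, the prediction holds for Yes responses but fails for No responses, which is the asymmetry that the polarity correspondence account leaves unexplained.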

Another possibility is that the observed interaction arises from the numerosity information implied by the singular and plural noun forms (i.e., one vs. more than one). Berent et al. (2005) found that plural form interferes with the enumeration of a single word, and in some cases, singular form interferes with the enumeration of two identical words. In the current set of experiments, the number word one is attached to a singular noun, and the number words two, three, and four are attached to plural nouns, thus creating a potential confound. To address this issue, we introduced noun form (Singular vs. Plural) into the analysis as a separate factor. In Experiment 1, we found evidence for a three-way interaction among numerical congruency, response, and noun form. However, when we removed trials with a singular noun in the sentence and with one object in the picture, we still observed an interaction between numerical congruency and response. This finding suggests that participants created specific expectations about the numerosity of objects in the picture and not just an expectation that there would be more than one object in the picture, as indicated by the plural form in the statement. Furthermore, there was no evidence for a three-way interaction among response, numerical congruency, and noun form in Experiment 2, where better control over the stimulus material was employed. Taken together, the analyses of the effect of noun form in Experiments 1 and 2 suggest that the current findings cannot be reduced to the computation of number from singulars and plurals, as observed by Berent et al. (2005).

To provide a mechanistic account of symbol grounding, Domijan and Šetić (2016) suggested that an appropriate theoretical framework for understanding the interaction between symbols and perception is adaptive resonance theory (Carpenter & Grossberg, 1993, 2003; Grossberg, 2013). It was developed as a model for stable category learning in dynamic environments. It prevents catastrophic forgetting with a novelty detection system that compares incoming sensory input with learned top-down expectations. In adaptive resonance theory, symbolic (category) representation is bidirectionally linked with sensory representations (visual, spatial, auditory, etc.). Bottom-up links from sensory representation to category nodes enable categorization, that is, attaching an abstract category label to raw sensory experience. However, category nodes are also linked to sensory representation, which is essential for category stability. Moreover, these links enable expectations to be read out at sensory nodes (Carpenter & Grossberg, 2003). When category expectations match the sensory data, adaptive resonance occurs, which supports learning and the refinement of category codes. When there is a mismatch between top-down expectations and sensory data, a reset signal is released, which shuts down currently active category nodes and enables a search for a new category. The reset signal creates a temporal delay in the whole system, which slows down the processing (Domijan & Šetić, 2016).
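The match-or-reset step described above can be illustrated with a minimal sketch in the style of the binary ART1 matching rule, in which the overlap between the input and the top-down expectation is compared against a vigilance threshold. This is a generic illustration and not the model of Domijan and Šetić (2016); the function names, the binary feature coding, and the vigilance value of 0.8 are assumptions made for the example.

```python
def match_ratio(sensory: list[int], expectation: list[int]) -> float:
    """Fraction of active sensory features confirmed by the top-down expectation."""
    overlap = sum(s and e for s, e in zip(sensory, expectation))
    active = sum(sensory)
    # Assumption for this sketch: an empty input counts as a trivial match.
    return overlap / active if active else 1.0


def resonance_or_reset(
    sensory: list[int], expectation: list[int], vigilance: float = 0.8
) -> str:
    """Return 'resonance' when the match reaches vigilance, otherwise 'reset'."""
    if match_ratio(sensory, expectation) >= vigilance:
        return "resonance"
    return "reset"


# Hypothetical binary feature vectors for a learned expectation and two inputs.
expectation = [1, 1, 0, 0]
print(resonance_or_reset([1, 1, 0, 0], expectation))  # input confirms expectation
print(resonance_or_reset([0, 0, 1, 1], expectation))  # input violates expectation
```

In this sketch, a reset corresponds to the event that, according to the account above, shuts down the active category node and delays processing.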

In the present experiments, the statements and sentences created top-down expectations. One expectation is related to the objects, and another is related to the numerosity of the objects. Object-related expectations are read out in object recognition nodes in the ventral visual stream. On the other hand, numerosity-related expectations are read out in the perceptual representation of numerosity, which is located in the parietal cortex (Dehaene et al., 2003). If one of the expectations is not confirmed by the subsequent visual input, the reset signal will slow down the verification decision because the reset signal is a global (nonselective) inhibitory response that shuts down all currently active nodes in the symbolic (category) representation. Therefore, the symbolic representation of the concept will be suppressed even though it matches the object when there is a concurrent mismatch between the number word and quantity. The same is true when a concept-object mismatch is paired with numerical congruency. This analysis implies that numbers and other concepts share a common symbolic representation that can be accessed from different sensory pathways, consistent with the convergence model of semantic memory (Patterson, Nestor, & Rogers, 2007; Rogers, 2008). Furthermore, the mismatch-related delay in verification will occur when one or both expectations are violated because reset signals from the ventral and/or dorsal stream can occur at approximately the same time. Resonance will prevent the reset from occurring, and there will be no delay in the response, only when both expectations are confirmed by the visual input. Therefore, the expected pattern of verification times is RT_Yes-Congruent < RT_Yes-Incongruent = RT_No-Congruent = RT_No-Incongruent, and this was observed in the data. The same computational analysis has been used to explain other examples of the interaction between symbolic and perceptual representations (Domijan & Šetić, 2016).
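The predicted ordering of verification times follows directly from the assumption of a global reset, and a toy calculation makes the logic concrete. The sketch below assumes a fixed baseline time and a fixed reset delay in arbitrary units; both constants and all names are hypothetical, chosen only to show the ordering, and are not estimates from the data.

```python
BASE_RT = 1.0      # hypothetical baseline verification time (arbitrary units)
RESET_DELAY = 0.2  # hypothetical delay added by a global reset signal


def predicted_rt(concept_match: bool, numerical_congruent: bool) -> float:
    """Verification time under a global (nonselective) reset.

    A reset is triggered when either expectation is violated, and it adds
    the same delay whether one or both expectations are violated.
    """
    reset = (not concept_match) or (not numerical_congruent)
    return BASE_RT + (RESET_DELAY if reset else 0.0)


# Condition label -> (concept-object match, numerical congruency)
CONDITIONS = {
    "Yes-Congruent": (True, True),
    "Yes-Incongruent": (True, False),
    "No-Congruent": (False, True),
    "No-Incongruent": (False, False),
}

for label, (concept_match, congruent) in CONDITIONS.items():
    print(f"{label}: {predicted_rt(concept_match, congruent):.1f}")
```

Because the delay does not depend on which expectation fails, only the condition in which both expectations are confirmed escapes the reset, which reproduces the ordering stated in the text.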

Finally, it should be noted that the present experiments do not offer insight into the exact nature of the perceptual representation of numerosity that supports the symbol grounding of number words. One possibility is that number words are mapped onto a spatial representation such as a mental number line (Fischer & Shaki, 2014) or a common representation of magnitude (Bonn & Cantlon, 2012; Cohen Kadosh, Lammertyn, & Izard, 2008). Previous studies found that Stroop interference was larger when digit identity was symbolically close to the enumeration response than when it was symbolically far, which suggests the involvement of an analog representation of magnitude (Pavese & Umiltà, 1998, 1999). However, in the current study, we did not find evidence for the modulation of the speed of the Yes response by symbolic distance. Therefore, it seems unlikely that the observed interaction arose from the grounding of numbers in the analog representation of magnitude. Consistent with our conclusion is the suggestion that analog representation is not precise enough to support mapping between exact numbers and their quantities. For instance, Izard and Dehaene (2008) found that numerical estimates of the numerosity of random dot patterns are highly inaccurate and tend toward underestimation. Furthermore, electrophysiological studies in the prefrontal and parietal cortices of monkeys revealed the existence of number-selective neurons with broad tuning curves, suggesting that these neurons respond to a range of quantities (Nieder & Dehaene, 2009).

On the other hand, rapid and accurate enumeration (subitization) is possible for small quantities in the range between one and four (Trick & Pylyshyn, 1994). The current understanding of subitization suggests that it reflects the operation of a separate processing system dedicated to the attentional indexing or tagging of small numbers of objects (Hyde, 2011). Support for this hypothesis comes from research on children (Feigenson, Dehaene, & Spelke, 2004) and adults (Revkin et al., 2008), showing abrupt changes in the accuracy of enumeration that cannot be explained by a single process. The system for subitization is able to support numerical symbol grounding because the attentional index is an exact representation; that is, the visual object is either indexed or not. Moreover, Carey (2004; Le Corre & Carey, 2007) argued that subitization is the developmental basis for children’s understanding of number concepts. Irrespective of the exact form of perceptual representation of numerosity that is involved here, an important point is that the current results support the conclusion that number words are not estranged from the corresponding quantities, at least for a small numerosity that falls in the subitization range (Leibovich & Ansari, 2016; Lyons et al., 2012).

Electronic Supplementary Materials

The electronic supplementary material is available with the online version of the article at http://dx.doi.org/10.1027/1618-3169/a000358

This research was supported by the Croatian Science Foundation Research under the Grant HRZZ-IP-11-2013-4139 and the University of Rijeka under the Grant 13.04.1.3.11.

1 It might be argued that the explicit mention of the variations in the object’s numerosity in the instructions drew participants’ attention to the numerical dimension. However, this emphasis was necessary because several participants in the pilot study complained and asked about numerosity manipulation during the experimental session. Nevertheless, we think that this intervention at the beginning of the experimental session produced less triggering than the enumeration task in the standard numerical Stroop experiment, which draws attention to the numerical dimension on each trial.

References

Mia Šetić, Psychology Research Laboratory, Department of Psychology, Catholic University of Croatia, Ilica 242, 10000 Zagreb, Croatia