Normally, we recognize and identify others based on a variety of aspects of their appearance: their clothing, hair style, gait, and posture all serve as potential cues revealing their identity. But, maybe most critically of all, we recognize and identify individuals based on facial information. And yet, while some people’s faces may be readily recognizable to us, others may be just as forgettable.
A face’s memorability refers to the combination of its intrinsic visual features that tends to facilitate its later recognition. A growing body of literature supports the idea that not all faces (or other objects) are equally memorable: the memorability of specific images, and of identities, is consistent across individuals (Isola et al., 2011; Bainbridge et al., 2013; Bainbridge, 2017, 2020). That is, the faces we recognize tend to converge with those recognized by others, and vice-versa (Bainbridge, 2017; Vokey & Read, 1992; Rust & Mehrpour, 2020). As a whole, the extant literature suggests that a face’s memorability is an intrinsic property derivable from some combination of two broad types of information: image-dependent information (i.e., image-specific, low-level visual features), and viewpoint-invariant information (i.e., cues persisting across images of the same identity) (Bainbridge, 2017; Chang et al., 2017). While many dimensions have been proposed to account for memorability (Bainbridge et al., 2013; Khosla et al., 2015), faces are such complex visual objects that even the most complete accounts are neither exhaustive, nor close to generating scientific consensus (Rust & Mehrpour, 2020).
The common implication across models of face memorability, though, is that the combination of a face’s attributes ought to be quantifiable along some stimulus dimension (or higher dimensional space). So, while face images can be reconstructed in a “face space” from similarity ratings between face images (reflecting image-dependent similarities), or between face images and memorial representations (reflecting viewpoint-invariant similarities) (Chang et al., 2017), image statistics alone are insufficient for quantifying memorability either in human observers, or using machine learning algorithms limited to pixel-wise matching (Isola et al., 2011; Chang et al., 2017).
This raises an interesting question: What are the relative contributions of image- and identity-based memorability to face recognition more broadly? Image recognition demands highly accurate perception, whereas recognizing facial identities poses the additional requirement of constructing invariant representations that translate across changing viewpoints and environmental conditions. Thus, while recognizing specific facial images could be accomplished by extracting either image- or identity-related information, doing so consistently across changes in viewpoint necessarily requires both to be successful. Consistency, here, refers to the extent to which an individual’s recognition of faces is non-random, and instead tends to be a function of memorability, such that they tend to recognize more memorable faces.
To address this question, we capitalized on the extraordinary face processing abilities of Super-Recognizers (SRs; Russell et al., 2009; Ramon et al., 2019a, 2019b; Ramon, 2021) in characterizing their reliance on image-dependent and viewpoint-invariant information as a function of memorability, relative to neurotypical observers. Though there is some evidence for domain generality of face identification among SRs (Faghel-Soubeyrand, 2021), this specific advantage could equally be attributable to superior perceptual abilities. Moreover, previous accounts of SRs’ abilities suggest that these cannot be attributed solely to superior memory abilities; even World Memory Champions in face-name matching do not present with SR levels of performance on standard lab-based tests of face processing (Ramon et al., 2016). So, to the extent that their abilities extend beyond face processing, it seems unlikely that their advantages can be explained purely in terms of memory, even if generally superior memory cannot be altogether ruled out.
Young and Burton (2017, 2018) have proposed that expertise in face recognition across images only truly occurs for familiar faces. However, the extent to which the advantage familiarity confers stems from visual versus semantic representations remains unresolved. Simultaneous unfamiliar face matching (of identities across images) cannot rely on semantic/personal knowledge in the same way that it can for familiar faces (Rossion, 2018). Most recently, though, facial identity representations in memory have been proposed to make up isolated ‘islands of expertise’ (Hancock, 2021). According to this hypothesis, our memories are populated by constellations of familiar face representations, whose common attributes make up the dimensions along which we process unfamiliar faces (Valentine, 1991; Valentine et al., 2004), irrespective of whether those attributes are semantic or visual. So, (re)cognition of faces nearer to those ‘isolated islands’ is facilitated by proxy. It has been suggested, as a corollary of this hypothesis, that SRs might have more highly organized constellations of this sort, and that this may account for their face recognition abilities (Hancock, 2021).
But how might information about face representations differ between SRs and normal individuals in such a high dimensional space? Representational similarity analysis of SRs’ EEG responses has found that they show distinct activation patterns, evocative of both more detailed facial identity representations than normal observers, and richer representations of non-face categories (Faghel-Soubeyrand et al., 2020; 2021). While this suggests some domain generality of their abilities, it does not immediately explain whether these representations differ from neurotypical individuals’ in either degree (i.e., quantitatively) or kind (i.e., qualitatively). Thus, it remains an open question whether face recognition among SRs relies on more consistent and efficient use of common information, or on sensitivity to further representational dimensions, untapped amongst the normal population.
Recent work (Nador et al., 2021a & b) has shown that SRs’ ability to perceptually match unfamiliar faces is also determined by more consistent exploitation of common global spatial frequency information used by controls. Similarly, previous psychophysical work has found that SRs’ recognition of celebrity faces derives from essentially the same information extracted from local spatial frequency information employed by neurotypical control observers (Tardif et al., 2018). Based on these results, SRs’ famous face recognition and unfamiliar face perception abilities seem to stem from exploitation of the same local and global information, respectively, as the general population. More broadly, these results imply that SRs’ capabilities represent quantitative extensions of typical face processing abilities.
Yet, no research to date has assessed SRs’ unfamiliar face recognition in terms of global processing, thus motivating the present study. In the interim, it remains somewhat unclear whether these individuals uniformly excel at extracting image-dependent local features, or viewpoint-invariant global features during unfamiliar face recognition. Taken together, though, the existing evidence suggests that they may possess either of these abilities, depending on the specific face cognition process probed. For instance, recent work by Linka and colleagues (2021) found that SRs were more likely to fixate on faces (both in terms of their first saccade and overall dwell time) relative to other semantic categories, irrespective of their particular viewpoint or arrangement within scenes. This might imply that, at least perceptually, SRs are more sensitive to viewpoint-invariant facial features than neurotypical controls. But, in this case, SRs were not specifically tasked with recognizing faces, whereas previous work suggests that individual differences generally show strong task dependency (Fysh et al., 2020; Faghel-Soubeyrand, 2021; Ramon, 2021; DeHaas, 2021). This leaves open the possibility that, within a given task, individual differences could depend on the information content (image-dependent or identity-invariant) they require.
To probe this question further, we explicitly tested whether SRs would show greater sensitivity to either type of information by assessing their recognition of specific unfamiliar face images, and facial identities captured from different viewpoints, respectively. Furthermore, we implemented a novel “With or without you” (WoWY) analytical approach that departs fundamentally from previous work both on SRs and visual memorability. Past studies of SRs’ recognition abilities have tended to focus on aggregate scores that gloss over latent variability in memorability (both within experiments, and across studies). We reason instead that memorability constitutes an ideal construct for probing the intraindividual consistency of image and identity recognition, which in turn may directly correspond to individual differences in observers’ idiosyncratic capacities.
In Experiment 1, we surreptitiously tested recognition for unfamiliar face images that had previously been ranked for memorability (cf. Bainbridge et al., 2013): after initial implicit learning (during a gender categorization task), the exact same images were, without warning, presented as targets among novel distractors in an old/new recognition task. At base, we hypothesized that recognition performance would be greater for high versus low memorability face images. Moreover, if SRs were to show greater sensitivity to memorability than controls, this would suggest that they have somewhat more structured or well-organized mental representations of images, as suggested previously (Hancock, 2021). However, this would not exclude the possibility that individuals in both groups were forming face representations robust to changes across images of the same identity.
Consequently, in Experiment 2, to-be-remembered identities were explicitly learned from frontal view face images. And, with the intention of probing memorability and recognition for viewpoint-invariant, rather than image-dependent stimulus properties, the subsequent old-new recognition task involved stimuli shown from a three-quarter viewpoint. Under these circumstances, recognition would only be possible if the properties extracted during encoding were robust to changes in viewpoint. As before, we anticipated better recognition across viewpoints for high vs. low memorability identities. Yet here, SRs outperforming controls would indicate that their superior skill does not derive from purely image-dependent information, but instead reflects more efficient extraction of viewpoint-invariant representations of facial identity.
Most importantly, we investigated individual and grouped performance profiles across experiments. We hypothesized that consistency of memorability-dependent recognition performance observed between experiments indicates utilization of viewpoint-invariant information during recognition. Therefore, greater consistency of recognition performance among SRs vs. typical observers would imply that SRs extract viewpoint-invariant, identity-diagnostic information from face images in a fundamentally more principled and consistent manner. And, to the extent that individuals’ performances covary with image- or identity-based memorability, this would provide additional evidence for the continuous and dimensional nature of memorability. Unfortunately, neither ANOVA nor logistic regression is particularly well-suited to assessing this hypothesis, so we developed and implemented a novel WoWY resampling method to measure observers’ and groups’ between-experiment consistency.
To preview our results, we find that, for controls, recognition performance across experiments varies inter-individually, and somewhat inconsistently between image- and identity-based memorability. SRs, on the other hand, show significant consistency between their recognition performance and stimulus memorability (of both types) across experiments. We therefore surmise that SRs have a more detailed dimensional representation of face memorability, untethered to image-specific information available at encoding and retrieval, but better entwined with the aspects of facial identities that render them memorable despite changes in viewpoint between images. This is of particular importance given its implications for diagnosis of pathology (Ramon, 2018; Bainbridge et al., 2019), as well as for identification of potential SRs in future research and applied settings (Ramon, 2021).
Thirty-four observers (26 females) aged 18–47 years (M = 27.5; SD = 8.33), who were either students at the University of Fribourg or non-student individuals contacted by the experimenters, provided informed consent to participate in both experiments (as approved by the local research ethics committee, approval number 473, and adherent to the Declaration of Helsinki). Control observers received course credit or financial compensation for their participation. The SR sample consisted of 11 individuals (all right-handed) who achieved superior performance in at least two of three tests of face perception and recognition: the Yearbook Test (YBT; Bruck et al., 1991), the Facial Identity Card Sorting Test (FICST; Jenkins et al., 2011), and the Cambridge Face Memory Test long version (CFMT+; Russell et al., 2009) (see Table 1). They were drawn from a recently reported cohort of 70 SR cases (Ramon, 2021). Some of the cases reported here also participated in additional behavioral, oculomotor and neuroimaging studies (Nador et al., 2021a & b; Linka et al., 2021; Faghel-Soubeyrand, 2020, 2021; for a collective overview of SR cases and study participation, see Ramon, 2021). One control and one SR observer’s data were excluded from both experiments due to chance level performance caused by button-press errors.
Table 1. SR cases: identifier, demographic information, and diagnostic test scores.
In Experiment 1, 720 face stimuli (half female, half male) from the 10k US Adult Faces Database (Bainbridge et al., 2012), measuring 5.3° of visual angle (VA) at a 57 cm viewing distance, labelled with hit rate-derived memorability scores (Bainbridge et al., 2017), and with an average Michelson contrast of .96 (min = .63, max = .99), were displayed on a MacBook Pro (15”, Mid 2010; 2.4 GHz Intel Core i5). Target stimuli comprised two sets of 180 high and low memorability images with average hit rates of .73 ± .07, and .32 ± .06, respectively, both counterbalanced for gender; 360 medium memorability images served as novel distractors for the old/new recognition task. In Experiment 2, 64 face images (counterbalanced for gender) with hit rate-derived memorability scores (Bainbridge, 2016) originating from the Karolinska Directed Emotion Faces (KDEF) database (Lundqvist et al., 1998) and the Stirling Economic & Social Research Council (ESRC) 3-Dimensional Face Database (Hancock & Tiddeman, 2011) were used (with mean memorability scores of 0.61 and 0.70, for low and high memorability images, respectively). Stimuli subtended 5.3° VA and had an average contrast of .95 (min = .72, max = .99) (see Figure 1). In both experiments, all face stimuli were entirely unfamiliar to all observers prior to testing.
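For readers wishing to reproduce the stimulus geometry, the two quantities reported above follow from standard formulas. The sketch below is illustrative only: the function names are our own, and the luminance values passed to the contrast function are hypothetical rather than measured from the stimuli.

```python
import math

def size_from_visual_angle(angle_deg, distance_cm):
    """On-screen extent (cm) of a stimulus subtending angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

def michelson_contrast(l_max, l_min):
    """Michelson contrast from the maximum and minimum luminance of an image."""
    return (l_max - l_min) / (l_max + l_min)

# 5.3 degrees of visual angle at the 57 cm viewing distance used here
stimulus_cm = size_from_visual_angle(5.3, 57)

# Hypothetical luminance extremes, chosen only to illustrate the formula
example_contrast = michelson_contrast(0.98, 0.02)
```

At a 57 cm viewing distance, one degree of visual angle corresponds to roughly one centimetre on screen, so the stimuli measured slightly under 5.3 cm.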
Each experiment consisted of two phases: first a learning phase, and then a recognition phase. During the learning phase, observers learned (implicitly in Experiment 1; explicitly in Experiment 2) a set of face stimuli and committed them to memory. During the recognition phase, they were tasked with recognizing face stimuli as either matching (exact image matches in Experiment 1; identity matches in Experiment 2) one of the studied identities, or as a novel foil. An equal number of targets and foils were presented to observers in each experiment. Crucially, in conjunction with the stimulus manipulations described above, the manipulation of learning between experiments was intended to bias observers away from explicitly attending to viewpoint-invariant stimulus information during Experiment 1, while forcing them to process viewpoint-invariant information during Experiment 2.
Experiment 1—Recognizing Implicitly Learned Face Images. In the implicit learning phase of Experiment 1, observers bimanually categorized high and low memorability stimuli as male (‘press A’) or female (‘press L’) as quickly and accurately as possible. Stimulus presentation durations are detailed in Figure 1; responses were recorded during stimulus presentation or the following blank interval. In total, six blocks of 60 images (with interleaved 10s breaks; order randomized) were shown. Then, after a short (3–5min) break, the experimenter indicated that the previous gender categorization task was in fact an implicit learning task and that a recognition phase would now follow (all observers were unaware of the subsequent recognition task). In the recognition phase, observers were instructed to indicate via button press—as quickly and accurately as possible—whether they had seen each presented image during the learning phase (‘A’ for ‘old’; ‘L’ for ‘new’). This task comprised 4 blocks of 180 stimuli (in randomized order, with self-paced breaks between blocks). Stimuli were presented for 1s, followed by a blank screen that persisted until a response was provided.
Experiment 2: Recognizing Explicitly Learned Faces across Viewpoint Changes. After completion of Experiment 1, observers were given another short break (3–5 min), before beginning the explicit learning phase of Experiment 2. During this phase, they were explicitly instructed to learn 32 new faces for subsequent recognition. After a short (30 s) break, the recognition phase began. Observers were instructed to indicate as quickly and accurately as possible whether the 3/4 viewpoint face stimuli represented facial identities that had been presented in the prior learning phase (i.e., press ‘A’ for ‘old’) or not (i.e., press ‘L’ for ‘new’). The 64 stimuli were presented randomly in a single block; each was displayed until a response was provided and trials were separated by a .8 s blank screen ITI.
Our analyses included two components. First, a three-way mixed Bayesian ANOVA (JASP Team, 2020, JASP computer software, Version 0.14.1) was conducted (along with post-hoc power analysis (G*Power computer software, Faul et al., 2007)) to investigate the effects of Group, Experiment, and Memorability. Note that Memorability for this analysis was derived from hit rates obtained in different studies (Bainbridge 2017, in Experiment 1, and Bainbridge, 2016 in Experiment 2), with different observers, and an altogether different set of experimental procedures. Second, we implemented a novel With or Without You (WoWY) analysis described in detail later in this section.
Post-Hoc Power Analysis. Post-hoc power analysis (G*Power; Faul, 2019) of the current study’s mixed design (repeated measures with between-subjects factors), given a total sample size of 32, with 2 groups and 7 effects measured (3 main effects, 3 two-way interactions and a 3-way interaction), shows that the ANOVA has statistical power (1 − β) corresponding to an 11% chance of detecting a small, true effect (f = 0.1); a 44% chance of detecting a medium-sized, true repeated-measures effect (f = 0.25); and an 82% chance of detecting a large, true effect (f = 0.4). Thus, we would expect to have adequate power for detecting only relatively large effects with the current design, assuming a type I error rate of 5%. Of note, a recent study by Bainbridge (2020), using essentially the same experimental manipulations as our own, found a large (f = 0.68) within-subjects depth of encoding effect (i.e., varying whether learning was implicit or explicit between experiments). So, while our sample size is relatively small, we expect to have adequate power to detect an even smaller effect than reported in the most comparable study of face memorability known to us.
With or Without You (WoWY) Analysis. To assess whether any individuals’ recognition performances were dependent on image- or identity-based memorability, we conducted a split-half, with or without you (WoWY) analysis on each observer’s hits and misses. This analysis, illustrated in Figure 2, compares each observer’s performance relative to other randomly sampled observers against the null hypothesis that their recognition performance across images is independent of memorability. Critically, with this procedure, the operational measure of memorability relies on concordance between observers in our sample, and not with memorability scores obtained from previous studies.
The procedure first involves randomly sampling half of the observers and half of the images (each one ranked by their average hit rates across all observers) at a time. In this case, we drew 1000 samples, each without replacement. Crucially, over repeated samples, each observer is expected to be included in half of them. Thus, the drawn samples are inherently divisible into one half that include (“with”) versus another that exclude (“without”) any particular observer (Figure 2, Panel 1). Likewise, each image is equally likely to be retained for calculating WoWY scores within a given sample and, after repeated sampling, will be equally represented in samples including or excluding any particular observer.
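The sampling step can be sketched as follows. This is a minimal illustration rather than our analysis code: the function name, the boolean-mask representation, and the random seed are assumptions made for exposition, while the counts (32 observers, 360 target images, 1000 samples) follow the text.

```python
import numpy as np

def draw_split_half_samples(n_observers, n_images, n_samples=1000, seed=0):
    """Draw random halves of observers and of images for each sample.

    Returns two boolean arrays, (n_samples, n_observers) and
    (n_samples, n_images), marking who and what each sample includes.
    Within a sample, drawing is without replacement.
    """
    rng = np.random.default_rng(seed)
    obs_in = np.zeros((n_samples, n_observers), dtype=bool)
    img_in = np.zeros((n_samples, n_images), dtype=bool)
    for s in range(n_samples):
        obs_in[s, rng.choice(n_observers, n_observers // 2, replace=False)] = True
        img_in[s, rng.choice(n_images, n_images // 2, replace=False)] = True
    return obs_in, img_in

obs_in, img_in = draw_split_half_samples(n_observers=32, n_images=360)

# Proportion of samples that include each observer; expected to be near one half
inclusion_rate = obs_in.mean(axis=0)
```

For any given observer, the rows of `obs_in` where that observer's column is true form the "with" set, and the remaining rows form the "without" set.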
Following the repeated sampling, we proceed observer by observer, noting the samples that include versus exclude them. For each of these sample sets (with and without the observer), we then average together the hit rates of the most memorable image from each single sample. That is, the hit rates for the most memorable images in each sample are averaged together, then the second most memorable, and so on, until we have an ordered set of image composite memorability scores. This process is then repeated for each observer, yielding one such set including—and one excluding—every observer:
where i denotes observers and X denotes samples of half of the hit-rate-ranked images across all i.
The other half of the images in each sample X (those not retained for WoWY score calculation) serves to estimate memorability: these images are similarly rank-ordered (according to hit rate across all other participants) to produce composite image memorability scores.
Finally, each observer’s WoWY scores are calculated by taking the difference between the two sets (subtracting “without” from “with”), separately for each level of composite image memorability (from most to least memorable). Each observer’s WoWY scores are thus rank-ordered according to the memorability of the images from which they were derived. Although the WoWY and memorability scores are generated from split halves of the stimulus set for any one drawn sample, it is important to recall that all images are used as often across repeated samples to calculate WoWY scores as they are to calculate composite memorability scores. Essentially, both the WoWY and memorability scores are in the end derived from the full stimulus set. Equally, images with lower memorability scores were most likely to be represented near their true rank in each sample, such that over repeated samples, high composite memorability scores tended to include high memorability images, and vice versa.
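The full computation for a single observer can be sketched as below. This reflects one reading of the procedure, not a definitive implementation: the function name and the binary hits-matrix layout are assumptions, retained images are ordered by their hit rate across all observers, and the synthetic data exist only to make the example runnable.

```python
import numpy as np

def wowy_scores(hits, observer, n_samples=1000, seed=0):
    """WoWY and composite memorability scores for one observer.

    hits: (n_observers, n_images) array with 1 = hit and 0 = miss.
    Returns two rank-ordered vectors (most to least memorable), each
    with one entry per rank position in a half-set of images.
    """
    rng = np.random.default_rng(seed)
    n_obs, n_img = hits.shape
    half_obs, half_img = n_obs // 2, n_img // 2
    overall_rate = hits.mean(axis=0)  # ranking criterion (assumed)
    others_rate = np.delete(hits, observer, axis=0).mean(axis=0)

    with_sum = np.zeros(half_img)
    without_sum = np.zeros(half_img)
    mem_sum = np.zeros(n_img - half_img)
    n_with = n_without = 0
    for _ in range(n_samples):
        obs = rng.choice(n_obs, half_obs, replace=False)
        shuffled = rng.permutation(n_img)
        kept, held_out = shuffled[:half_img], shuffled[half_img:]
        # Order retained images from most to least memorable
        kept = kept[np.argsort(-overall_rate[kept], kind="stable")]
        ranked = hits[np.ix_(obs, kept)].mean(axis=0)
        if observer in obs:
            with_sum += ranked
            n_with += 1
        else:
            without_sum += ranked
            n_without += 1
        # Held-out images give memorability from all other observers
        mem_sum += np.sort(others_rate[held_out])[::-1]
    wowy = with_sum / n_with - without_sum / n_without
    memorability = mem_sum / n_samples
    return wowy, memorability

# Synthetic example: 20 observers, 40 images of varying difficulty
_rng = np.random.default_rng(1)
_p_hit = np.linspace(0.2, 0.9, 40)
example_hits = (_rng.random((20, 40)) < _p_hit).astype(int)
wowy, memorability = wowy_scores(example_hits, observer=0, n_samples=200)
```

Positive entries in `wowy` mark rank positions where samples including the observer achieved higher hit rates than samples excluding them.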
Having followed this procedure, we could then determine whether each observer’s performance relative to other randomly sampled observers was a function of memorability (as assessed by all other observers’ recall of the remaining images) by Spearman correlating these two split halves of the data. A positive correlation between WoWY scores and memorability for a given observer would imply that their performance improved relative to other observers as a function of memorability, whereas a negative correlation would indicate that their performance degraded. This obtained correlation is of course derived from the particular circumstance where WoWY scores were correlated against memorability scores ordered from least to most memorable. Therefore, the alternative hypothesis that this correlation was due to this specific ordered structure can be tested by permutation (Figure 2, panel 3). The null hypothesis is thus that WoWY scores are no more correlated with the originally ordered memorability scores than would be expected by chance. We tested this against the alternative (that the two are correlated) by comparing the obtained Spearman correlation coefficient against the distribution of correlation coefficients obtained by permuting memorability scores (see inset histograms in the Supplementary Information). We could then Z-transform the obtained Spearman correlation coefficient by dividing its distance from the mean by the standard deviation of this distribution.
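The permutation test and Z-transformation can be sketched as follows. The sketch computes the Spearman coefficient as a Pearson correlation on average ranks; the function names, the number of permutations, and the two-sided p-value formula are assumptions for illustration, and the input vectors are synthetic.

```python
import numpy as np

def average_ranks(x):
    """Ranks from 1 to n, with tied values sharing their mean rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]

def permutation_z(wowy, memorability, n_perm=1000, seed=0):
    """Z-score of the observed correlation against a permutation null."""
    rng = np.random.default_rng(seed)
    observed = spearman(wowy, memorability)
    null = np.array([spearman(wowy, rng.permutation(memorability))
                     for _ in range(n_perm)])
    z = (observed - null.mean()) / null.std(ddof=1)
    # Two-sided permutation p-value (an assumed convention)
    p = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, z, p

# Synthetic example in which WoWY scores rise with memorability
_rng = np.random.default_rng(1)
example_mem = np.linspace(0.2, 0.9, 60)
example_wowy = 0.1 * example_mem + _rng.normal(0, 0.01, 60)
rho, z, p = permutation_z(example_wowy, example_mem)
```

Permuting the memorability scores breaks their ordered pairing with the WoWY scores, so the spread of the resulting coefficients indicates how large a correlation would arise by chance alone.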
Finally, to assess group-level consistency of performance as a function of memorability, we resampled 10 observers’ Z-transformed correlation coefficients at a time, 1000 times, to create their sampling distribution. We resampled 10 observers at a time because the actual sample of SRs from which we calculated the group-level correlation had N = 10. So, resampling 10 observers at a time ensured that the bootstrapped sampling distribution would have a standard deviation commensurate with the standard error of the mean for the original sample of size 10. We then assessed where the obtained correlation coefficient lay in relation to the center of the sampling distribution.
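The group-level step can be sketched in the same way. Several details here are assumptions rather than reported facts: resampling is done with replacement from the pooled observers, the correlation is Pearson's, the p-value is one-sided, and the Z-scores are synthetic.

```python
import numpy as np

def group_consistency(z_exp1, z_exp2, group_idx, n_boot=1000, seed=0,
                      replace=True):
    """Compare a group's between-experiment correlation to resampled ones.

    z_exp1, z_exp2: per-observer WoWY Z-scores from the two experiments.
    group_idx: indices of the group of interest (e.g., the 10 SRs).
    Each resample draws as many observers as the group contains,
    irrespective of group membership.
    """
    rng = np.random.default_rng(seed)
    z1 = np.asarray(z_exp1, dtype=float)
    z2 = np.asarray(z_exp2, dtype=float)
    observed = np.corrcoef(z1[group_idx], z2[group_idx])[0, 1]
    k = len(group_idx)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.choice(len(z1), size=k, replace=replace)
        boot[b] = np.corrcoef(z1[idx], z2[idx])[0, 1]
    # Share of resampled coefficients at least as large as the observed one
    p_boot = np.mean(boot >= observed)
    return observed, boot, p_boot

# Synthetic example: 32 observers, of whom the first 10 are consistent
_rng = np.random.default_rng(2)
example_z1 = _rng.normal(size=32)
example_z2 = _rng.normal(size=32)
example_z2[:10] = example_z1[:10] + _rng.normal(0, 0.1, 10)
r_group, boot_dist, p_boot = group_consistency(example_z1, example_z2,
                                               np.arange(10))
```

Matching the resample size to the group size keeps the spread of `boot_dist` comparable to the sampling variability of a correlation estimated from 10 observers.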
We conducted a three-way mixed Bayesian ANOVA (2 Group × 2 Experiment × 2 Memorability), with Group as the only between-subjects factor, on observers’ Hit Rates (Figure 3). Model comparisons revealed that the highest log likelihood (Log BFM = 3.613) was achieved when including main effects of Memorability, Experiment and Group, as well as Memorability by Experiment, and Group by Experiment interactions. Analyses of the effects within this model, by comparison with matched models (including versus excluding each effect) provided decisive evidence favoring inclusion of the main effect of Memorability (Log BFincl = 7.61), but only marginal support for Experiment (Log BFincl = .19). There was strong evidence against a main effect of Group (Log BFincl = –.803), though there was substantially more decisive evidence favoring inclusion of both Group by Experiment (Log BFincl = 3.53), and Memorability by Experiment (Log BFincl = 4.97) interactions. Finally, we note that there was strong evidence against any Group by Memorability interaction (Log BFincl = –1.22), as well as the three-way interaction (Log BFincl = –1.15). And overall, we obtained a model averaged posterior R² of .59 (95% CI = [.503, .663]).
To assess whether individual observers showed significant sensitivity to memorability in each experiment, we first computed individual WoWY scores following the procedures outlined in Figure 2 and described in detail in the Materials and Methods section. Accordingly, we compared the half of repeated samples including versus excluding each individual observer, retaining a random half of all images for each such comparison. Memorability scores were calculated by averaging the hit rate for each image separately, across all but the current observer. Finally, we correlated each individual observer’s WoWY scores with the memorability scores for the half of images that were not retained. Effectively, this entailed testing the alternative hypothesis—that WoWY scores covary with memorability—against the null hypothesis that WoWY scores are independent of memorability. We then tested the significance of each correlation coefficient by comparing it to the bootstrapped distribution of correlation coefficients obtained by randomly permuting memorability rankings between images (see Figure 2).
In Experiment 1, 11 controls showed significant positive correlations between their WoWY scores and memorability (r(358) = .33, .46, .51, .49, .27, .42, .51, .44, .33, .21, .28; ps < .05), while 5 showed negative correlations (r(358) = –.33, –.64, –.16, –.66, –.51; ps < .05), and 6 showed none (see Supplementary Information for all individual observers’ WoWY data). Five SRs showed significant positive correlations (r(178) = .49, .16, .25, .25, .37; ps < .05) and 5 showed negative ones (r(178) = –.29, –.28, –.15, –.66, –.29; ps < .05). In Experiment 2, only 4 controls showed significant positive correlations (r(14) = .65, .72, .71, .7, .65; ps < .05) and just 3 showed negative ones (r(14) = –.64, –.69, –.58). Meanwhile, only 2 of the 10 SRs showed significant positive correlations (r(14) = .53, .93; ps < .05) and 3 showed negative ones (r(14) = –.65, –.72, –.30; ps < .05) (see Figure 4 and Supplementary Information for individual observers’ scatter plots).
Beyond these correlations’ significance, our primary goal was understanding consistency of individuals’ performance as a function of memorability between Experiments 1 and 2, as this speaks to whether they had similar sensitivity to image-dependent and viewpoint-invariant memorability. We therefore Z-transformed each observer’s above-mentioned correlation coefficients using the bootstrapped distributions derived for significance testing (Figure 3, panel 3). Thus, for each observer, we obtained one WoWY Z-score per experiment, reflecting their performance relative to other observers, across memorability. As demonstrated in Figure 4a (bootstrapped Group consistency), among controls there was no significant correlation between WoWY Z-scores in the two experiments (r(30) = .21, p = .35). Among SRs, however, there was a significant positive correlation (r(8) = .67, pboot = .01) when compared to the bootstrapped distribution of correlation coefficients obtained by resampling observers (irrespective of Group) 10 at a time.
Across experiments, we find that recognition performance varies as a function of Memorability, even when memorability scores are derived from independent observers’ recognition performance measured in the context of different paradigms (cf. Bainbridge et al., 2013; Bainbridge et al., 2017; Bainbridge, 2017). However, SRs did not outperform controls at the group level, overall, suggesting that any average recognition memory differences between groups were modest at best. Given that both groups show comparable memorability-dependent performance, our results lend credence to the notion that image-based memorability is consistent across samples and experimental paradigms. Of course, we could not directly compare our image- and identity-based memorability effects to those found in other studies (Bainbridge et al., 2013; Bainbridge, 2017; Bainbridge et al., 2017), owing to differences in procedures and paradigms. Therefore, we could not ascertain whether a given image was specifically encoded based on image-dependent or viewpoint-invariant features. Nevertheless, the decisive evidence in favor of an Experiment by Memorability interaction strongly suggests that image- and identity-based (i.e., viewpoint-invariant) memorability effects are separable, depending on the conditions during encoding. Moreover, the studied identities themselves differed between experiments as a methodological consequence of maintaining single exposures to each one during study. Effectively, this prevents testing the memorability of the exact same image during both implicit versus explicit learning, as well as when probing recognition with image- or identity-matched stimuli.
The absence of a Group effect, or of a Group by Memorability interaction, suggests that SRs were no better than controls at recognizing face stimuli, irrespective of their memorability. However, this could be explained by the large individual differences within our two groups. While some individuals’ performances deteriorated relative to peers as memorability increased in Experiment 1, the exact opposite could be said of their own performance in Experiment 2 (see Supplementary Information Figure S1 scatterplots). Since memorability effectively stems from concordance across individuals, these inter-observer differences constitute an important source of variability, clearly not captured by the simple dichotomy between high and low.
Moreover, the memorability levels we tested could have been too heterogeneous to observe any interaction with Group. However, this seems unlikely, given the observed Memorability by Experiment interaction, and the concordance between our sample’s performance and the memorability scores obtained for these images from previous experiments (see Stimuli). In fact, recent work by Bainbridge (2020) employing similar procedures and stimuli to the current experiments’ finds comparable levels of performance. Consequently, our high and low memorability stimuli seem neither too heterogeneous, nor too extreme, despite relatively low overall hit rates. In sum, this suggests that the Experiment by Memorability interaction was most likely due to usage of differential information content (image-dependent versus viewpoint-invariant), and not simply a product of the specific ranges of memorability at which we happened to test.
Finally, the Group by Experiment interaction indicates that SRs outperformed control observers when encoding was explicitly solicited during the learning phase, and the images seen during recognition were shown from different viewpoints than when they were studied (Experiment 2). This suggests an advantage for SRs specific to these conditions, under which only viewpoint-invariant information could reliably transfer between encoding and recognition. Thus, the control observers were capable of recognizing specific face images (in Experiment 1), but not necessarily identities across viewpoint changes (Experiment 2). This raises the question of whether SRs or controls employed different information across experiments. They could have used image-based information in Experiment 1, and viewpoint-invariant information in Experiment 2, or used viewpoint-invariant information in both experiments, even though image-based information alone would have been sufficient to facilitate recognition in Experiment 1. Since ANOVA cannot address this issue, we developed and employed WoWY analyses to disentangle these two possibilities.
While SRs’ average performance may not have been substantially superior to control observers’ across experiments, the Group by Experiment interaction suggests that SRs’ performance improved from Experiment 1 to 2, whereas controls’ performance remained stable. Overall, then, SRs seem to have been better at processing viewpoint-invariant information than controls. However, the memorabilities from which this effect was derived were obtained under quite different circumstances (see Stimuli), and we sought to test whether observers’ performance in each experiment varied in kind with the remainder of the current sample. As a first step, we confirmed that observers’ performance indeed varied as a function of memorability within each experiment, using a leave-one-out logistic regression (see Supplementary Information, Figure S2).
However, since any given observer could have implicitly relied on identity-based memorability (by exploiting identity-invariant face information) in both experiments, we sought to test whether this was the case, using novel WoWY analyses. Logistic regression (Supplementary Information, Figure S2) is limited to assessment of within-experiment consistency, at least at the individual observer level. To the extent that individuals’ hit rates are well-fitted by the logistic regression model, this suggests that their behavioral responses vary as a function of memorability. The fit, then, reflects their sensitivity to fluctuation in memorability, albeit separately for each experiment. This latter caveat is nonetheless quite important, since consistency between experiments here suggests similar extraction of information in both. Since image-dependent information was only sufficient to support recognition during Experiment 1, consistency between experiments necessarily entails extraction of viewpoint-invariant information in both.
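To make the leave-one-out logic concrete, the following is a minimal sketch in Python and not the analysis code used for this study (which is available on the OSF). It assumes that an image’s memorability for a given observer is the mean hit rate of all other observers, and that the observer’s own hits are then regressed on those scores. The function names, the gradient-ascent fitting routine, and the toy data are all hypothetical.

```python
import math

def leave_one_out_memorability(hits, observer):
    """Mean hit rate per image across all observers except `observer`.

    hits: list of per-observer lists of 0/1 responses, one entry per image.
    """
    others = [row for i, row in enumerate(hits) if i != observer]
    n_images = len(hits[0])
    return [sum(row[j] for row in others) / len(others) for j in range(n_images)]

def fit_logistic(x, y, lr=0.5, n_iter=5000):
    """Fit P(hit) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p            # gradient with respect to the intercept
            g1 += (yi - p) * xi     # gradient with respect to the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical toy data: 4 observers (rows) by 6 target images (columns), 1 = hit.
toy_hits = [
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
]
memorability = leave_one_out_memorability(toy_hits, observer=0)
intercept, slope = fit_logistic(memorability, toy_hits[0])
print(f"slope on leave-one-out memorability: {slope:.2f}")
```

Under these assumptions, a positive slope indicates that the held-out observer’s hits become more likely as the other observers’ hit rates increase. Such a fit is computed within a single experiment, so it says nothing about consistency between experiments.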
WoWY analyses allowed us to assess the consistency of recognition performance between experiments for individual observers, and indeed, only SRs exhibited enhanced consistency between experiments. Since only viewpoint-invariant information could have been used effectively in both experiments, this suggests that SRs were more sensitive to it, even implicitly, when encoding of identity was not solicited during the learning phase. This agrees with the observed Group by Experiment interaction, which implies that the kind, and not the magnitude, of memorability is what distinguishes SRs from controls. From these results, we might expect any given control observer to have a more idiosyncratic memorability profile (i.e., recognition for specific identities), less consistent with that of other observers between experiments. In order to determine whether this was the case at the individual level, we assessed the diagnosticity of image-dependent versus viewpoint-invariant information following the WoWY methodology outlined previously.
In Experiment 1, split-half WoWY analyses showed that observers’ hit rates were more consistent with memorability scores (derived from all other observers’ performance) than would be predicted by chance alone (see Figure 4a, Bootstrapped group consistency). For 26 of the 32 observers, we found significant deviations from chance in Experiment 1, indicating that the ordering of memorability scores (ranked based on all other observers’ performance) was significantly correlated with their own performance (see Supplementary Information, Figure S1 for individual observers’ WoWY profiles). In line with previous research, this result strongly suggests that stimulus memorability is consistent across observers (Bainbridge et al., 2013; Bainbridge et al., 2017; Chang et al., 2017). However, in Experiment 1 observers learned targets (face images) implicitly in the context of an orthogonal (gender categorization) task, and their recognition performance for these targets was tested using these exact same images. Thus, from these results alone, it is unclear whether their internal representations of memorability derive from image-dependent or viewpoint-invariant, identity-diagnostic information, since either could have been successfully used to aid recognition of an exact image match during the recognition phase.
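The idea of testing one observer’s hits against a chance distribution can be illustrated with a generic permutation test. This sketch is not the WoWY procedure itself, whose details are given in the Methods and in the OSF code. It assumes a Spearman rank correlation between an observer’s hits and memorability scores derived from other observers, and a null distribution obtained by shuffling that observer’s hits across images. All names and values below are hypothetical.

```python
import random
import statistics

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def permutation_z(own_hits, memorability, n_perm=2000, seed=0):
    """Observed rank correlation, its Z-score against a shuffled null, and a one-sided p."""
    rng = random.Random(seed)
    observed = spearman(own_hits, memorability)
    shuffled = list(own_hits)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between hits and images
        null.append(spearman(shuffled, memorability))
    sd = statistics.pstdev(null)
    z = (observed - statistics.mean(null)) / sd if sd else 0.0
    p = (1 + sum(r >= observed for r in null)) / (n_perm + 1)
    return observed, z, p

# Hypothetical observer: hits on the five images that others remembered best.
own_hits = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
others_memorability = [0.9, 0.85, 0.8, 0.75, 0.7, 0.3, 0.25, 0.2, 0.15, 0.1]
rho, z, p = permutation_z(own_hits, others_memorability)
print(f"rho = {rho:.2f}, z = {z:.2f}, p = {p:.4f}")
```

Expressing the observed statistic as a Z-score against its own permutation null puts observers, and experiments with different numbers of stimuli, on a common scale, which is the general motivation for Z-transforming consistency scores.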
In Experiment 2, memorability of the to-be-learned target identities varied continuously (cf. Bainbridge, 2016), whereas the images used in Experiment 1 were taken from the high and low memorability tails of a performance distribution (cf. Bainbridge et al., 2017). Therefore, by comparison with Experiment 1, the correlation between observers’ WoWY performance and stimulus memorability was weaker within observers, on average. But crucially, since observers had to recognize studied identities across viewpoints, above-chance performance required viewpoint-invariant identity (as opposed to image) recognition. While observers explicitly encoded faces for later recognition, the instructions in Experiment 2 provided no indication as to whether their recognition would be tested for identical images or not. Under these circumstances, local features would be exceedingly unlikely to transfer from head-on to three-quarter viewpoints unless an observer utilized some viewpoint-invariant information. Split-half permutation analysis confirms this, but also shows significantly greater consistency with the results of Experiment 1 among SRs than controls (Figure 4). Critically, between-experiment consistency in performance as a function of memorability could only be achieved if viewpoint-invariant information were used in both, since image-dependent information could not support recognition across the viewpoint change in Experiment 2. Our results show that SRs tend to exhibit consistent patterns of recognition performance (across experiments) as a function of memorability, whereas controls do not. Overall, this supports the hypothesis that SRs show higher sensitivity to viewpoint-invariant information as a continuous dimension. Furthermore, it implies that compared to controls, SRs more consistently built robust representations of encoded facial identities across contexts, even without explicit instructions to do so (i.e., Implicit Learning Phase, Experiment 1).
Observers also studied substantially fewer images in Experiment 2 than in Experiment 1. So, variation in performance between the two could potentially be attributed to this difference, rather than our intended experimental manipulations of encoding type and stimulus information. However, the fact that we Z-transformed WoWY scores (Figure 4a) accounts for this possibility, at least in theory, by reducing them in proportion to the number of stimuli from which they were derived. In addition, at least upon visual inspection, logistic regression fits to the data (see Supplementary Information, Figure S2) appear to be somewhat more similar between experiments for SRs than controls, which, while not statistically interpreted here, nonetheless agrees with the results of our novel WoWY analyses.
Recent work characterizing perceptual processing (vs. recognition) of facial identity in the same group of SRs reported here provides converging evidence that increased intra-observer consistency distinguishes SRs from control observers (Nador & Vomland, 2021; Nador et al., 2021a & b). In two experiments, their psychometric assessments revealed that SRs utilized the same range of spatial frequency (SF) content across orientations as controls, albeit more consistently. While their study did not explicitly assess memorability, its findings showed for the first time that within the sub-process of face perception, controls and SRs differed only quantitatively, as a result of differential consistency between groups. In a similar line of research, Tardif and colleagues (2019) applied systematically varied local SF filters to facial features of celebrity faces, finding that SRs outperformed control observers at recognition. Moreover, the same SF information content to which their control observers had access was sufficient to predict SRs’ performance, without loss of generality of the model. This implies that SRs exploited the same perceptually available SF information as controls, but with greater consistency. Taken together, these recent results (Nador & Vomland, 2021; Nador et al., 2021a & b; Tardif et al., 2019), along with those of the current study, provide converging evidence that consistency is a distinguishing characteristic of SRs’ performance both within and across subprocesses of face cognition.
Even though image-based information would suffice for encoding in Experiment 1, SRs’ recognition performance was still commensurate with the formation of viewpoint-invariant, identity-based representations, whereas controls were equally likely to form image-specific face representations instead. This can be seen by the distribution of controls’ WoWY Z-scores across all four quadrants of Figure 4a, while SRs are overrepresented in quadrants 1 and 3. In particular, a large proportion of controls are situated in quadrant 4, where performance relative to other observers was increasing with memorability in Experiment 1, but decreasing in Experiment 2.
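The quadrant reading of such a scatterplot can be sketched as a simple tally. The snippet below assumes that Experiment 1 Z-scores lie on the x-axis and Experiment 2 Z-scores on the y-axis, with quadrants numbered counterclockwise from the upper right. That numbering matches the description of quadrant 4 above but is otherwise an assumption, and the scores are invented for illustration.

```python
from collections import Counter

def quadrant(z_exp1, z_exp2):
    """Quadrant of a point with Experiment 1 on the x-axis and Experiment 2 on the y-axis."""
    if z_exp1 == 0 or z_exp2 == 0:
        return 0  # on an axis: not assigned to any quadrant
    if z_exp1 > 0:
        return 1 if z_exp2 > 0 else 4
    return 2 if z_exp2 > 0 else 3

def quadrant_counts(scores):
    """Count observers per quadrant from (z_exp1, z_exp2) pairs."""
    return Counter(quadrant(a, b) for a, b in scores)

# Hypothetical Z-score pairs, not data from this study.
example_scores = [(1.8, 0.9), (2.1, -0.4), (-0.7, -1.2), (0.5, 1.1), (1.3, -0.8)]
print(quadrant_counts(example_scores))
```

Observers in quadrants 1 and 3 have Z-scores with the same sign in both experiments, whereas those in quadrants 2 and 4 reverse sign between experiments.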
This is indicative of an important—and seemingly qualitative—distinction between SRs’ and controls’ internal concepts of memorability: SRs seem to implicitly build viewpoint-invariant representations of facial identity, rather than relying on image-dependent information. Only among SRs do we note a significant positive correlation between information usage in Experiments 1 and 2, implying that this information was most likely viewpoint-invariant. Taken together, our results suggest that face memorability can be conceptualized as a viewpoint-invariant and dimensional attribute, to which at least SRs seem to be sensitive. In the context of previous results, this could mean that internal agreement between image- and identity-based memorability may be lower or higher depending on the pictorial similarity of stimuli used during encoding/learning and recognition.
As an important caveat, though, face recognition was probed under relatively extreme circumstances in the present experiments. All targets were shown only once during learning phases and were entirely unfamiliar; encoding was solicited only fully implicitly (in Experiment 1) or explicitly (in Experiment 2); we only used frontal and ¾ viewpoints of face images. So, we cannot exclude the possibility that controls might indeed be able to build representations resembling those of SRs if given more frequent (or longer) exposures to face stimuli during learning, or less extreme viewpoint changes during recognition. While here we do find strikingly different patterns of recognition performance between controls and SRs related to memorability, they should not be taken as unequivocal evidence for either qualitative or quantitative differences between groups. Though the relative absence of between-experiment consistency in controls relative to SRs might suggest a qualitative distinction, we would temper this view without testing under an expanded range of learning phases, image memorabilities and viewpoints.
In sum, our results support the hypothesis that face memorability should not only be conceptualized in terms of memory for a specific image (e.g., Bainbridge et al., 2013; Khosla et al., 2015; Broers et al., 2017), but also with respect to viewpoint-invariant information diagnostic of its identity (e.g., Bartlett et al., 1984; Bruce et al., 1994; Valentine et al., 2004; Bainbridge, 2017; Chang et al., 2017). Our results thus invite a reinterpretation of face memorability to include identity-diagnostic information conveyed across variations in viewpoint, as well as a more detailed evaluation of the content of such representations. Currently, it remains relatively unclear what kind(s) of information (i.e., image statistics) contained across various viewpoints of a given facial identity are in fact crucial to the formation of its memorial representation. Although previous research in our lab has identified consistency as a key feature of SRs’ extraordinary abilities (Nador & Vomland, 2021; Nador et al., 2021a & b), more work is needed to elucidate the qualitative and quantitative differences between them and neurotypical individuals. Future research should consider SRs as a special population particularly adept at face recognition, and compare them both with controls, who may well also be capable under less demanding stimulus conditions, and with automatic solutions developed for face recognition.
All raw data and code for the WoWY analyses are publicly available on the Open Science Framework (https://osf.io/5p7yk/).
The additional file for this article can be found as follows:
1 The umbrella term memorability refers to inter-observer consistency of recognition performance measured in vastly different ways, including over different time scales. For instance, face images can be recognized in sequences during an n-back task (cf. Bainbridge, 2017), during old/new recognition tasks immediately following a dedicated study phase (Bainbridge, 2020), or over longer-term delays (Rugo et al., 2017). While any of these contexts could suffice to measure image memorability, we focused on recognition of previously encountered versus novel stimuli, in the context of old/new recognition tasks. In this context, memorability denotes the extent to which observers tend to recognize common images following previous exposure.
We thank all observers for their participation, Matteo Zoia for assistance during data acquisition, and Wilma Bainbridge for providing information and stimulus material. MR is supported by a Swiss National Science Foundation PRIMA (Promoting Women in Academia) grant (PR00P1_179872).
The last author serves as an editorial board member of Swiss Psychology Open. Members of the editorial team/board are permitted to submit their own papers to the journal. In cases where an author is associated with the journal, they will be removed from all editorial tasks for that paper and another member of the team will be assigned responsibility for overseeing peer review. A competing interest must also be declared within the submission and any resulting publication.
JDN & TAA shared first authorship; MR & TAA designed the experiments; TAAP acquired the data; JDN & AG analyzed the data; MR, JDN & AG wrote the manuscript.
Bainbridge, W. A. (2017). The memorability of people: Intrinsic memorability across transformations of a person’s face. Journal of Experimental Psychology: Learning, Memory, and Cognition, 43(5), 706–716. DOI: https://doi.org/10.1037/xlm0000339
Bainbridge, W. A. (2020). The resiliency of image memorability: A predictor of memory separate from attention and priming. Neuropsychologia, 141, 107408. DOI: https://doi.org/10.1016/j.neuropsychologia.2020.107408
Bainbridge, W. A., Berron, D., Schütze, H., Cardenas-Blanco, A., Metzger, C., Dobisch, L., … Düzel, E. (2019). Memorability of photographs in subjective cognitive decline and mild cognitive impairment: Implications for cognitive assessment. Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring, 11(1), 610–618. DOI: https://doi.org/10.1016/j.dadm.2019.07.005
Bainbridge, W. A., Dilks, D. D., & Oliva, A. (2017). Memorability: A stimulus-driven perceptual neural signature distinctive from memory. NeuroImage, 149, 141–152. DOI: https://doi.org/10.1016/j.neuroimage.2017.01.063
Bainbridge, W., Isola, P., Blank, I., & Oliva, A. (2012). Establishing a database for studying human face photograph memory. In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 34, No. 34). https://escholarship.org/uc/item/49p3934p
Bainbridge, W. A., Isola, P., & Oliva, A. (2013). The intrinsic memorability of face photographs. Journal of Experimental Psychology: General, 142(4), 1323–1334. DOI: https://doi.org/10.1037/a0033872
Bartlett, J. C., Hurry, S., & Thorley, W. (1984). Typicality and familiarity of faces. Memory & Cognition, 12(3), 219–228. DOI: https://doi.org/10.3758/BF03197669
Broers, N., Potter, M. C., & Nieuwenstein, M. R. (2017). Enhanced recognition of memorable pictures in ultra-fast RSVP. Psychonomic Bulletin & Review, 25(3), 1080–1086. DOI: https://doi.org/10.3758/s13423-017-1295-7
Bruce, V., Burton, M. A., & Dench, N. (1994). What’s Distinctive about a Distinctive Face? The Quarterly Journal of Experimental Psychology Section A, 47(1), 119–141. DOI: https://doi.org/10.1080/14640749408401146
Bruck, M., Cavanagh, P., & Ceci, S. J. (1991). Fortysomething: Recognizing faces at one’s 25th reunion. Memory & Cognition, 19(3), 221–228. DOI: https://doi.org/10.3758/BF03211146
Chang, C., Nemrodov, D., Lee, A. C., & Nestor, A. (2017). Memory and Perception-based Facial Image Reconstruction. Scientific Reports, 7(1). DOI: https://doi.org/10.1038/s41598-017-06585-2
Faghel-Soubeyrand, S., Ramon, M., Bamps, E., Zoia, M., Woodhams, J., Alink, A., … Charest, I. (2020). Multivariate pattern analysis reveals domain-general enhancement of visual representations in individuals with “super-recognition” of faces. Journal of Vision, 20(11), 502. DOI: https://doi.org/10.1167/jov.20.11.502
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175–191. DOI: https://doi.org/10.3758/BF03193146
Fysh, M. C., Stacchi, L., & Ramon, M. (2020). Differences between and within individuals, and subprocesses of face cognition: Implications for theory, research and personnel selection. Royal Society Open Science, 7(9), 200233. DOI: https://doi.org/10.1098/rsos.200233
Hancock, P. J. (2021). Familiar faces as islands of expertise. Cognition, 214, 104765. DOI: https://doi.org/10.1016/j.cognition.2021.104765
Hancock, P. J. B., & Tiddeman, B. (2011). Stirling/ESRC 3D Face Database. http://pics.stir.ac.uk/ESRC/3d_images.htm
Isola, P., Parikh, D., Torralba, A., & Oliva, A. (2011). Understanding the Intrinsic Memorability of Images. DOI: https://doi.org/10.21236/ADA554133
Isola, P., Xiao, J., Torralba, A., & Oliva, A. (2011). What makes an image memorable? CVPR 2011. DOI: https://doi.org/10.1109/CVPR.2011.5995721
Jenkins, R., White, D., Montfort, X. V., & Burton, A. M. (2011). Variability in photos of the same face. Cognition, 121(3), 313–323. DOI: https://doi.org/10.1016/j.cognition.2011.08.001
Khosla, A., Raju, A. S., Torralba, A., & Oliva, A. (2015). Understanding and Predicting Image Memorability at a Large Scale. 2015 IEEE International Conference on Computer Vision (ICCV). DOI: https://doi.org/10.1109/ICCV.2015.275
Linka, M., Alsheimer, T., Broda, M. D., de Haas, B., & Ramon, M. (Mar. 14–17, 2021). Atypical Visual Salience in Super-Recognizers. Tagung experimentell arbeitender Psychologen (Conference of Experimental Psychologists): Ulm, Germany.
Lundqvist, D., Flykt, A., & Öhman, A. (1998). The Karolinska directed emotional faces (KDEF). CD ROM from Department of Clinical Neuroscience, Psychology section, Karolinska Institutet, 91(630), 2–2. https://www.kdef.se/home/aboutKDEF.html. DOI: https://doi.org/10.1037/t27732-000
Nador, J. D., & Vomland, M. (2021). Value of lab-based assessment of facial identity processing for police officers’ work sample performance. In BPS Cognitive Section Symposium: World Meets Lab. https://www.delegate-reg.co.uk/cognitive2021/programme
Nador, J. D., Zoia, M., Pachai, M. V., & Ramon, M. (2021a). Psychophysical profiles in super-recognizers. Scientific Reports, 11(1), 1–11. DOI: https://doi.org/10.1038/s41598-021-92549-6
Nador, J. D., Zoia, M., Pachai, M. V., & Ramon, M. (Mar. 14–17, 2021b). Super-Recognizers: Psychophysical Examination of Individual Differences. Tagung Experimentell Arbeitender Psychologen (Conference of Experimental Psychologists): Ulm, Germany.
Ramon, M. (2018). The power of how—lessons learned from neuropsychology and face processing. Cognitive Neuropsychology, 35(1–2), 83–86. DOI: https://doi.org/10.1080/02643294.2017.1414777
Ramon, M. (2021). Super-Recognizers – a novel diagnostic framework, 70 cases, and guidelines for future work. Neuropsychologia, 107809. DOI: https://doi.org/10.1016/j.neuropsychologia.2021.107809
Ramon, M., Bobak, A. K., & White, D. (2019a). Super-recognizers: From the lab to the world and back again. British Journal of Psychology, 110(3), 461–479. DOI: https://doi.org/10.1111/bjop.12368
Ramon, M., Bobak, A. K., & White, D. (2019b). Towards a ‘manifesto’ for super-recognizer research. British Journal of Psychology, 110(3), 495–498. DOI: https://doi.org/10.1111/bjop.12411
Rossion, B. (2018). Humans are visual experts at unfamiliar face recognition. Trends in Cognitive Sciences, 22(6), 471–472. DOI: https://doi.org/10.1016/j.tics.2018.03.002
Rugo, K. F., Tamler, K. N., Woodman, G. F., & Maxcey, A. M. (2017). Recognition-induced forgetting of faces in visual long-term memory. Attention, Perception, & Psychophysics, 79(7), 1878–1885. DOI: https://doi.org/10.3758/s13414-017-1419-1
Russell, R., Duchaine, B., & Nakayama, K. (2009). Super-recognizers: People with extraordinary face recognition ability. Psychonomic Bulletin & Review, 16(2), 252–257. DOI: https://doi.org/10.3758/PBR.16.2.252
Rust, N. C., & Mehrpour, V. (2020). Understanding Image Memorability. Trends in Cognitive Sciences, 24(7), 557–568. DOI: https://doi.org/10.1016/j.tics.2020.04.001
Valentine, T. (1991). A unified account of the effects of distinctiveness, inversion, and race in face recognition. The Quarterly Journal of Experimental Psychology Section A, 43(2), 161–204. DOI: https://doi.org/10.1080/14640749108400966
Valentine, T., Darling, S., & Donnelly, M. (2004). Why are average faces attractive? The effect of view and averageness on the attractiveness of female faces. Psychonomic Bulletin & Review, 11(3), 482–487. DOI: https://doi.org/10.3758/BF03196599
Vokey, J. R., & Read, J. D. (1992). Familiarity, memorability, and the effect of typicality on the recognition of faces. Memory & Cognition, 20(3), 291–302. DOI: https://doi.org/10.3758/BF03199666
Young, A. W., & Burton, A. M. (2018). What we see in unfamiliar faces: A response to Rossion. Trends in Cognitive Sciences, 22(6), 472–473. DOI: https://doi.org/10.1016/j.tics.2018.03.008