EASP Seedcorn Grant Report
07.02.2025, by Media Account
By Matteo Masi, Fabio Fasoli, and Marco Brambilla

Stereotypes, Emotions and Social Categorization Based on Voices and Faces
Matteo Masi (University of Milano-Bicocca), Fabio Fasoli (University of Surrey), & Marco Brambilla (University of Milano-Bicocca)
We categorize others depending on how they sound or look. Do we also rely on vocal and visual cues to interpret others’ emotions? And does this interpretation vary depending on the social identities of the individuals being perceived? These questions guide our research into the interplay between facial features, vocal accents, and the emotions they may evoke.
Social categorization simplifies the complex task of understanding others by relying on group stereotypes (Johnson et al., 2015; Macrae & Bodenhausen, 2000) and minimal cues. Facial characteristics (Rule & Sutherland, 2017) and speech features such as accents (Dragojevic et al., 2021) are two key modalities through which categorization occurs. Research has shown that, in some contexts, accents can be even more diagnostic of ethnic group membership than facial features, particularly when the accent is heard before the face is seen (Hansen et al., 2017, 2018; Paladino & Mazzurega, 2020). Hence, simply seeing or listening to a person is enough to place them into social categories and activate associated stereotypes.
Minimal cues are not only used for social categorization and stereotyping. Emotion recognition similarly relies on facial expressions and vocal cues (Young et al., 2020). Yet, stereotypes about specific groups can influence how emotions are recognized and perceived. Recent frameworks suggest that ethnic facial features can bias emotion recognition, with stereotypes acting as a bridge between emotional and social perceptions (Bijlstra et al., 2010, 2014; Freeman et al., 2020; Hugenberg & Sacco, 2008). For instance, in Western societies, groups like African Americans and North African men are stereotypically associated with anger (Cottrell & Neuberg, 2005; Fourgassie et al., 2023), and such stereotypes can facilitate emotion recognition. Indeed, in several studies by Bijlstra et al. (2010, 2014), Dutch participants were quicker to associate anger with Moroccan Dutch than White Dutch faces, reflecting this stereotype. Research on voice and emotion recognition is scarce when social identities are concerned. However, Laukka and Elfenbein (2021) have shown that listeners are better at recognizing emotions from speakers of the same accent. Still, the role that voice-based categorization and group stereotypes play in recognizing emotions from voices remains to be understood.
Research Questions
This research aims first to establish whether social identities signaled by accent, and the stereotypes associated with them, influence emotion recognition. Next, it aims to examine the interplay between voice and face (cross-modal effects).
This led us to investigate two preliminary questions: Do vocal cues associated with social categories (i.e., national accents) influence emotional categorization? Do group stereotypes account for this process? Then, given that both face and voice are used for social categorization (i.e., accent vs. ethnic facial features), a third question is: how do they interact in emotion categorization? Experiments answering this last question are ongoing, so we cannot report results at this time.
Experiment 1: Accents and Emotional Stereotyping
Our first pre-registered experiment examined whether accents could evoke emotional stereotypes. This was a necessary step to understand whether speakers with standard and non-standard accents were differently associated with emotions. We sampled American participants who speak English as their native language (N = 270) and used recordings of male and female speakers with Standard American English (standard accent) as well as Arabic, Russian, and Hispanic accents while speaking in English (non-standard accents). We selected these non-standard accents because they represent minority groups in the United States, where the study was conducted. Each speaker recorded brief neutral sentences in English using a neutral tone.
Participants listened to the audio recordings of standard and non-standard accented speakers (varying between-subjects, about 125 participants per condition) in a randomized order and rated how likely each speaker was to experience emotions such as anger, sadness, or happiness on a daily basis, a measure we termed ‘emotion susceptibility’. They also identified the speakers’ accents and completed scales assessing group stereotypes about emotions (i.e., the extent to which different groups experience emotions).
Findings: We first checked how participants perceived the speakers’ group membership. Participants could identify the accents more accurately than chance, confirming the diagnosticity of our stimuli. Second, we examined whether emotional stereotypes emerged. Self-reported stereotypes were found, although the evidence was not compelling: Hispanic-accented speakers were rated as more emotional overall, Arabic-accented speakers as happier and angrier, and Russian-accented speakers as predominantly angrier. Third, we tested whether differences emerged in emotion susceptibility. Contrary to our expectations, non-standard accents did not consistently lead to higher ratings of susceptibility to any emotion compared to standard American accents.
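As an illustration of what an above-chance check on accent identification can look like, the following minimal sketch runs a one-sided exact binomial test. It is not the analysis used in the study: the function name binomial_tail, the counts, and the chance level of 0.25 (which assumes four equally likely accent options) are all hypothetical.

```python
from math import comb


def binomial_tail(successes: int, trials: int, chance: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical numbers for illustration only; these are NOT data from the study.
n_trials = 40        # accent identification trials
n_correct = 22       # correct identifications
chance_level = 0.25  # assumption: four equally likely accent options

# One-sided p-value: probability of at least n_correct hits under pure guessing.
p_value = binomial_tail(n_correct, n_trials, chance_level)
print(f"accuracy = {n_correct / n_trials:.2f}, one-sided p = {p_value:.6f}")
```

A small p-value here would indicate that identification accuracy exceeds what guessing alone would produce. With real data, a model that accounts for repeated measures per participant would likely be more appropriate than a single pooled test.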
Experiment 2: Emotion Categorization with Emotional and Accented Speakers
In our second pre-registered experiment, we used a more indirect procedure to examine how accents interact with emotional vocal cues. This time, we used recordings of the same standard and non-standard accented speakers (except Hispanic speakers) expressing anger and sadness via vocal cues. We asked American participants who speak English as their native language (N = 297, about 150 per condition) to listen to the standard and non-standard accented speakers and assessed their reaction times and accuracy in categorizing the emotions conveyed by the speakers. We hypothesized that hearing the non-standard accent of a group stereotyped as angry (vs. sad) would facilitate categorization (i.e., faster categorization times) of the speaker's tone as angry (vs. sad), compared to hearing a standard-accented speaker with the same emotional tone (Bijlstra et al., 2014).
Findings: Participants were generally faster and more accurate in recognizing emotions when listening to standard-accented than to non-standard-accented speakers. For Arabic-accented speakers only, an interesting result emerged: the more participants believed Arabic people experience anger, the slower they were in detecting anger expressed by Arabic-accented speakers compared to speakers with the standard accent and the same emotional tone. This finding suggests that stereotypes may affect emotion perception differently for vocal accents than for facial features, possibly interfering with rather than facilitating emotion categorization.
Limitations and Future Directions
The Seedcorn Grant allowed us to conduct relevant research in the field of person perception and stereotyping. Our initial findings shed light on the nuanced ways in which vocal cues, along with stereotypes, shape social and emotional judgments. They showed that individuals held stereotypes about minority groups concerning the emotions they experience. However, we did not find that an accent signaling membership in a group stereotyped as angry or sad facilitated recognition of that emotion. Instead, we found that American listeners were better at recognizing emotions expressed by speakers of their same language variation. This result is in line with the dialect theory of emotion (Elfenbein et al., 2007; Laukka & Elfenbein, 2021), which posits that people are better at recognizing emotions expressed by ingroup members.
As our results differ from some of the previous findings with ethnic facial features (Bijlstra et al., 2014), the remaining funding will be used to investigate this further. We are preparing a third study with the aim of comparing categorization from faces and voices to highlight differences between the two types of cues. Moreover, since we employed self-reported measures of stereotypes, but previous research used implicit ones (i.e., emotional IAT), we will incorporate the latter to examine whether our results are due in part to the type of measure we employed (see Bijlstra et al., 2014). A fourth and final study is in preparation as well, with the aim of highlighting the face/voice interplay, consistent with the idea that multi-modal cues are integrated for person perception (Freeman et al., 2020).
References
Bijlstra, G., Holland, R. W., & Wigboldus, D. H. (2010). The social face of emotion recognition: Evaluations versus stereotypes. Journal of Experimental Social Psychology, 46(4), 657-663.
Bijlstra, G., Holland, R. W., Dotsch, R., Hugenberg, K., & Wigboldus, D. H. (2014). Stereotype associations and emotion recognition. Personality and Social Psychology Bulletin, 40(5), 567-577.
Bodenhausen, G. V., Kang, S. K., & Peery, D. (2012). Social categorization and the perception of social groups. In C. N. Macrae & S. T. Fiske (Eds.), The Sage handbook of social cognition (pp. 311-329). Sage Publications.
Cottrell, C. A., & Neuberg, S. L. (2005). Different emotional reactions to different groups: A sociofunctional threat-based approach to "prejudice". Journal of Personality and Social Psychology, 88(5), 770-789.
Dragojevic, M., Fasoli, F., Cramer, J., & Rakić, T. (2021). Toward a century of language attitudes research: Looking back and moving forward. Journal of Language and Social Psychology, 40(1), 60-79.
Elfenbein, H. A. (2017). Emotional Dialects in the Language of Emotion. In J. A. Russell & J. M. Fernandez Dols (Eds.), The Science of Facial Expression (pp. 479-496). Oxford University Press.
Formanowicz, M., & Suitner, C. (2020). Sounding strange(r): Origins, consequences, and boundary conditions of sociophonetic discrimination. Journal of Language and Social Psychology, 39(1), 4-21.
Fourgassie, L., Subra, B., & Sanitioso, R. B. (2023). Stereotype Content of North African Men and Women in France and Its Relation to Aggression. Social Psychology Quarterly, 87(1), 64-83.
Freeman, J. B., Stolier, R. M., & Brooks, J. A. (2020). Dynamic interactive theory as a domain-general account of social perception. Advances in Experimental Social Psychology, 61, 237-287.
Hansen, K., Rakić, T., & Steffens, M. C. (2017). Competent and warm? Experimental Psychology, 64(1), 27-36.
Hansen, K., Rakić, T., & Steffens, M. C. (2018). Foreign-looking native-accented people: More competent when first seen rather than heard. Social Psychological and Personality Science, 9(8), 1001-1009.
Hugenberg, K., & Bodenhausen, G. V. (2004). Ambiguity in social categorization: The role of prejudice and facial affect in race categorization. Psychological Science, 15(5), 342-345.
Hugenberg, K., & Sacco, D. F. (2008). Social categorization and stereotyping: How social categorization biases person perception and face memory. Social and Personality Psychology Compass, 2(2), 1052-1072.
Johnson, K. L., Lick, D. J., & Carpinella, C. M. (2015). Emergent research in social vision: An integrated approach to the determinants and consequences of social categorization. Social and Personality Psychology Compass, 9(1), 15-30.
Laukka, P., & Elfenbein, H. A. (2021). Cross-cultural emotion recognition and in-group advantage in vocal expression: A meta-analysis. Emotion Review, 13(1), 3-11.
Paladino, M. P., & Mazzurega, M. (2020). One of us: On the role of accent and race in real-time in-group categorization. Journal of Language and Social Psychology, 39(1), 22-39.
Rule, N. O., & Sutherland, S. L. (2017). Social categorization from faces: Evidence from obvious and ambiguous groups. Current Directions in Psychological Science, 26(3), 231-236.
Young, A. W., Frühholz, S., & Schweinberger, S. R. (2020). Face and voice perception: Understanding commonalities and differences. Trends in Cognitive Sciences, 24(5), 398-410.