Emotion Communication Through Voice Modulation: Insights on Biological and Evolutionary Underpinnings of Language

Piera Filippi

DOI: http://dx.doi.org/10.12775/ths.2019.005


Abstract. The aim of this review is to enhance our understanding of the role of emotional communication in the emergence of language. I provide data on the following research topics: 1) A cross-species comparative approach to the anatomical principles governing emotional vocal production. 2) Analysis of the acoustic parameters that convey emotional arousal and valence through voice modulation across human cultures and a wide variety of vocalizing nonhuman animals. In this regard, I describe the evolutionary advantage of being able to identify emotional content in both heterospecific and conspecific vocalizations. 3) The relative salience of emotional voice modulation and verbal content in emotional meaning processing, as an indicator of the biological role of voice modulation in the emergence of language. Finally, I propose that co-evolutionary dynamics between genetic transmission of the cognitive mechanisms underpinning language and socio-cultural transmission of vocal behaviors are responsible for the emergence of the abilities involved in language.


Keywords: language evolution; co-evolutional; emotion; prosody; word meaning; interactions







ISSN 2392-1196 (online)
