Decoding elephant communication with AI
Background
Elephants communicate extensively and display a remarkable variability in their calls. With their highly developed vocal production systems, they imitate, modify, and combine sounds in sophisticated ways to convey information to their conspecifics. Several acoustic parameters relevant to elephant communication are already known. Previous research based on the analysis and classification of manually extracted sounds enabled, for instance, the identification of individuals by their calls, yielded insight into the animals’ vocal repertoire, and found links between their vocalizations and arousal levels. However, decoding and interpreting elephant communication patterns remains a significant challenge.
Project Content and Aims
Building upon the latest developments in artificial intelligence (AI), the world’s largest annotated collection of elephant field recordings, and recent bioacoustic insights, we model elephant sound production and comprehension through a novel computational approach. More precisely, we leverage AI to mine our large-scale recording data for acoustic patterns that potentially carry meaningful information. Using our acoustic models of sound production and perception together with the identified acoustic patterns, we aim to synthesize naturalistic elephant calls and use them to directly 'talk' to and listen to elephants in the wild.
The specific aims of the project include:
- Developing methods for the automatic recognition of elephant calls and methods to synthesize realistic elephant calls artificially.
- Applying those methods to elephant calls for
  - the identification of individual information about an elephant (e.g., arousal state, age, sex, and reproductive state) and the collection of social information (e.g., dialects within a family).
  - the synthesis of realistic calls for controlled playback experiments that test their effect on elephants in the field.
- Analysing context-related information to better understand how context changes the meaning of a call.
- Developing a "dictionary" of elephant vocal communication that includes all identified and verified acoustic patterns along with their potential meanings.
Methods
We develop an innovative, multidisciplinary approach that combines extensive expert knowledge from behavioural biology, a large-scale curated dataset, and the latest AI methodologies. Our dataset includes approximately 10,000 fully annotated and categorized calls from wild and captive elephants of all ages and sexes. We use advanced models of human sound production and perception to condition acoustic pattern mining algorithms, so that they identify the relevant cues in elephant communication. Based on the resulting models, we employ novel sound synthesis and analysis algorithms to simulate elephant sound production and comprehension. We validate our approach through field experiments, including playback experiments conducted in South Africa.
Want to know more? Feel free to ask.
Media Computing Research Group
Institute of Creative\Media/Technologies
Department of Media and Digital Technologies
Daniel Haider
Jure Zeleznik
- Österreichische Akademie der Wissenschaften