Birdsong analysis and visualisation

I am Dr. Ayesha Hakim, a computer scientist with an aim to use my skills to do something 'Good for Nature'. I joined Cacophony to contribute towards preserving New Zealand's birds in the wild. My research interests include behaviour analysis using machine learning techniques and producing pretty graphics to visualise the complex data. To be specific, I am interested in recognition and behaviour analysis of humans and birds using audio and visual signals.

Acoustic recordings of birds have been used by conservationists and ecologists to determine the population density of specific bird species in a region. However, it is very hard to analyse and visualise the presence or absence of a specific bird species by manually listening to these recordings, even for an expert birdsong specialist. I am developing computational tools to automatically classify and visualise bird sounds in order to recognise different bird species in the wild. The work combines machine learning, ecology, and multimedia visualisation. These tools can be used by conservationists, ornithologists, ecologists, and evolutionary scientists to visually identify a bird's species from its sound alone.

In this blog post, I want to share some of the initial results obtained by automatically clustering bird species based on their sound features using machine learning techniques. It is an attempt to find similarities between sounds of the same bird species and those of other bird species, as well as to differentiate between bird and human sounds. Many thanks to Nirosha Priyadarshani, Stephen Marsland, Isabel Castro, and Amal Punchihewa for making their datasets available for further research. These datasets are available at

This is an interactive document-driven data visualisation (a chord diagram built with D3.js) that represents sound similarities and dissimilarities between North Island brown kiwi (female1, female2, female3, female4), kakapo (booming: boom1, boom2, boom3, boom4 and chinging: kc1, kc2, kc3, kc4), and ruru (trill: tril1, tril2, tril3, tril4, more: more1, more2, more3, more4, and pork: pork1, pork2, pork3, pork4). The coloured stripes represent each bird and its relationship with the others based on their sound similarity measure.

Hovering the mouse over the visualisation will highlight specific relationships.
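To make the idea of a sound similarity measure more concrete, here is a minimal sketch of how a square similarity matrix, the kind of input a chord diagram is drawn from, could be computed from per-recording feature vectors. This is an illustration only: the post does not say which similarity measure was used, so cosine similarity is an assumption, the function names are hypothetical, and the feature values are made-up toy numbers rather than real measurements.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(features):
    """Build a square matrix of pairwise similarities.

    `features` maps a recording label to its feature vector. The diagonal
    is set to 0 so that a recording is not linked to itself.
    """
    labels = list(features)
    matrix = []
    for row_label in labels:
        row = []
        for col_label in labels:
            if row_label == col_label:
                row.append(0.0)
            else:
                row.append(
                    cosine_similarity(features[row_label], features[col_label])
                )
        matrix.append(row)
    return labels, matrix


# Toy feature vectors (assumed values, for illustration only).
toy_features = {
    "boom1": [0.9, 0.1, 0.2],
    "boom2": [0.8, 0.2, 0.1],
    "kc1": [0.1, 0.9, 0.7],
}

labels, matrix = similarity_matrix(toy_features)
for label, row in zip(labels, matrix):
    print(label, [round(value, 2) for value in row])
```

In this toy example the two booming recordings score much closer to each other than either does to the chinging recording, which is the kind of pattern a thick stripe between two arcs would show.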

A visual representation of 20 human voices (on the right half) and bird sounds (on the left half) and their relationship with each other. We can see a boundary line separating human and bird sounds.

A visualisation representing automatic clustering of male kiwi (left half) and female kiwi (right half) sounds and their relationship with each other.

This document-driven data visualisation represents the classification of kakapo chinging (left half) and booming (right half) sounds and their relationship with each other.


These results, obtained by analysing mid-term audio features of 'clean' audio data, are quite promising. In practice, however, bird sounds are recorded in noisy natural environments that include human voices (which raise privacy issues), vehicles, wind, rain, and rustling leaves. The first step is therefore to 'clean' the data in order to obtain reliable classification and visualisation using machine learning techniques. Future work could include automatically:

  • classifying bird species
  • detecting the presence/absence of specific bird species
  • differentiating between songs and calls
  • identifying behaviour based on their songs/calls
  • taking appropriate action if the call is for help
  • examining population density of endangered bird species
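For readers unfamiliar with the mid-term audio features mentioned above, the sketch below shows one common way to compute them: short-term features are extracted from small overlapping frames and then summarised (mean and standard deviation) over longer windows. This is not necessarily the pipeline used for the results in this post. The choice of features (energy and zero-crossing rate), the frame and window sizes, the function names, and the synthetic test tone are all assumptions made for illustration.

```python
import math
import statistics


def short_term_features(signal, frame_len, hop):
    """Return [energy, zero_crossing_rate] for each short-term frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        # Count sign changes between consecutive samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append([energy, crossings / (frame_len - 1)])
    return feats


def mid_term_features(short_feats, window, step):
    """Summarise runs of short-term frames by per-feature mean and std."""
    out = []
    for start in range(0, len(short_feats) - window + 1, step):
        block = short_feats[start:start + window]
        columns = list(zip(*block))  # one tuple per short-term feature
        means = [statistics.fmean(col) for col in columns]
        stds = [statistics.pstdev(col) for col in columns]
        out.append(means + stds)
    return out


# Synthetic 1-second, 440 Hz tone standing in for a real recording (assumed).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

short = short_term_features(tone, frame_len=400, hop=200)  # 50 ms frames
mid = mid_term_features(short, window=10, step=5)
print(len(short), "short-term frames ->", len(mid), "mid-term vectors")
```

Each mid-term vector here has four values (mean energy, mean zero-crossing rate, and their standard deviations); a real system would typically use a richer set of spectral features.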

For further updates on this ongoing project, keep an eye on my latest research articles and blog posts, or feel free to email me at