How accurate is Artificial Intelligence for predator recognition?

One of the most common questions we get asked is how accurate our Artificial Intelligence (AI) solution for predator identification is. But we suggest the more useful questions are:

How accurate is predator identification at different distances?
How does accuracy compare to humans?
How accurate does it need to be in order to be useful?

How accurate is the identification at different distances?

It is probably pretty obvious that accuracy will depend on how far away the predator is. It helps to ask yourself how accurate humans are at recognising faces. It depends how far away they are, right? Imagine someone walking down the street towards you - when they’re a hundred metres away, it’s hard to be certain who they are but as the person approaches, you become increasingly certain who it actually is. A similar thing happens to AI models.

Below is a graph that shows the accuracy of our AI model relative to the average size of the animal in the image (tracking the width of the animal in pixels, which is a useful proxy for distance). As the predator gets closer, the AI's accuracy approaches 100%. Note that it may never reach 100% accuracy, as the animal could be partly obscured or at an odd angle.

Example showing certainty of identification against the width of the animal (# of pixels filled).

(Note: the number shown in brackets is the number of samples - a sample being 25 frames of a recording)

To help you visualise what we mean by width, the image below is an example of a frame from a recording where the hedgehog is filling 22 pixels.

The graph shown above is for hedgehogs but the pattern of the graph is the same no matter which species we choose.

AI confidence across species as distance decreases

Two exceptions to this pattern can be seen in the graph above:

Where we have fewer images (e.g. vehicle)
Where the subject is a human, for whom certainty drops off at close range. This is because, as a human gets close to the camera, they fill the frame and it can be hard to see any shape at all

A key thing to recognise is that the AI does not need to achieve 100% to be useful because each “visit” of a predator is typically 2-10 short videos as it comes in and out of frame. Typically the first view may be in the distance and difficult to identify but, as the animal moves closer or simply spends more time in the frame, over the whole series of videos in a visit, the AI accuracy increases dramatically. So the average accuracy may be 70% but you can be close to 100% sure that the animal in a visit was (for instance) a hedgehog because 4 out of the 6 videos were 90% sure it was a hedgehog - no need to worry about the other 2 where the AI was less sure.

The way the Cacophony software works is that it groups all the videos into a “visit” (a small set of recordings) and makes a summary estimation. This is usually much more accurate than any one of the individual videos (for the technically inclined among you, that's effectively layering a simple algorithm on top of the AI model results). This is obviously much less work than traditional trail cameras where the user has to collect images from SD cards and manually go through all the images (including false positives) trying to spot predators. Also worth mentioning again that traditional trail cameras are designed for larger mammals and may miss many of the predators New Zealand conservationists are interested in.

In summary, the AI does not have to achieve 100% accuracy for each recording to get close to 100% identification accuracy for each predator visit. It just needs to have enough recordings where the animal gets close enough to the camera.
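For the technically inclined, here is a minimal sketch of how per-video results could be rolled up into a visit summary. It is not the actual Cacophony algorithm: the function name `summarise_visit`, the confidence-weighted vote, and the sample numbers are all assumptions made for illustration, loosely following the hedgehog example above.

```python
from collections import defaultdict

def summarise_visit(predictions):
    """Combine per-video (label, confidence) pairs into one visit-level guess.

    Hypothetical helper: uses a simple confidence-weighted vote, which is an
    assumption and not necessarily what the Cacophony software does.
    """
    if not predictions:
        raise ValueError("a visit needs at least one recording")

    # Add up the confidence each label received across all videos.
    totals = defaultdict(float)
    for label, confidence in predictions:
        totals[label] += confidence

    # The label with the largest total wins; its share of the total
    # gives a rough visit-level score between 0 and 1.
    best = max(totals, key=totals.get)
    share = totals[best] / sum(totals.values())
    return best, share

# Illustrative visit: 4 of 6 videos are 90% sure it was a hedgehog,
# the other 2 are uncertain guesses at something else.
visit = [
    ("hedgehog", 0.9), ("hedgehog", 0.9), ("hedgehog", 0.9), ("hedgehog", 0.9),
    ("rat", 0.4), ("possum", 0.35),
]
label, share = summarise_visit(visit)
print(label, round(share, 2))
```

The point of the sketch is that the two uncertain videos barely move the result: the confident close-up recordings dominate the vote.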

The clip below shows a progressively increasing confidence (as seen in the running label "commentary") of a classification as the animals get closer and clearer.

The final result in this recording is very high confidence even though some of the intermediate classifications along the way are somewhat uncertain.

How does the accuracy compare to humans?

This is a harder question to answer but we can see the potential for this AI to end up better than humans over time. We have seen examples where, in the early recordings of a visit (where the animal might be far away or partially obscured), the AI appeared to make the wrong identification.

A small, blurry shape appears on the first recording. "Possum" the AI states confidently. The human, watching, smiles.

"Good effort AI but that's a rat" says the arrogant, condescending human. The animal moves closer, steps out from behind the tree that was partially obscuring it.

"Oh!" says the surprised human.

"Possum" repeats the AI. The human scowls.

"Darn it, you're right, AI". The chastened human retires to lick their wounds. This story has had more than one sequel, the cast changing from time to time.

How accurate does AI need to be in order to be useful?

You may recall from previous blog entries that there are a number of use cases where our cameras have proved useful:

Predator density estimation
Re-invasion detection and notification
Digital trigger for a kill trap
Digital trigger for adaptive lure devices

It's worth thinking about each in turn - the answer to our question varies for each.

Predator density estimation

In this case you want to know how many predators of different types are in a given area (a baseline). This will typically help decide what treatment to apply and then a follow up survey can be done to see how well the treatment worked (comparing the data back to the baseline). Compared to chew-cards, tracking tunnels, or traditional trail cameras, the thermal camera with AI will not only see a lot more predators but the AI automatically counts them for you, drastically reducing the human effort required.

We are working on a report in collaboration with DOC that shows that, in this use case, thermal cameras supported by automated identification and reporting software have a lower total cost of ownership over time (watch this space for a copy of that report).

As the technology develops, we expect the cost will come down even further and the AI models will become even more accurate meaning that this part of the problem can be solved.

Re-invasion detection and notification

In the case of re-invasion, immediate and accurate detection and notification are imperative. Our cameras will now automatically identify the animal spotted and send out a notification (by email) with a thumbnail of the image. This allows a human to verify the species. This can save valuable time and resources in managing a potential re-invasion. If you’re involved in protecting a predator-free area, you’ll be aware how important it is that the identification can be verified - important so you can decide the next course of action. The most important thing is to have the AI rapidly identify anything that could possibly be an invader, inform you of its current best guess, and then allow you to manually check that identification, giving you a chance to respond. It matters less if it gets it wrong. More important that it gives you a chance to check.

Digital trigger for a kill trap

One of the most useful possibilities for this technology is to replace a “dumb” mechanical enclosure and trip plate with a “digital trigger” where the AI decides if or when to trigger the trap. In this case you need the AI to be much more accurate than in a predator counting use case. The data above shows that when the predator is close enough the AI accuracy is very high. We have prototypes of this working and the results look promising. Once we have enough testing under our belts, we’re confident this will be an important tool in keeping our native species safe from accidental damage in traps - with a bit of effort, this could be any trap capable of being put into a “safe” mode, not just the Cacophony trap. A digital trigger will allow a much lower by-catch rate than traditional mechanical triggers.

Bear in mind that a trigger such as this only needs to correctly identify an animal as a predator. It doesn’t really matter if it mistakes a hedgehog for a rodent, as long as it recognises it as a predator as opposed to, for instance, a bird. We call this our pest/not pest AI model.
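To make the pest/not pest idea concrete, here is a rough sketch of one way a trigger decision could be made from per-species scores. Everything in it is an assumption for illustration: the species list in `PEST_SPECIES`, the function names, and the 0.95 threshold are invented, and our actual pest/not pest model may work quite differently (for example as a separately trained model rather than a sum over species scores).

```python
# Hypothetical set of labels treated as pests; the real list is an assumption.
PEST_SPECIES = {"hedgehog", "rodent", "possum", "stoat", "cat"}

def pest_probability(species_scores):
    """Sum the scores of every label that counts as a pest."""
    return sum(
        score for species, score in species_scores.items()
        if species in PEST_SPECIES
    )

def should_trigger(species_scores, threshold=0.95):
    """Arm the trap only when the combined pest score clears the threshold.

    The threshold value is illustrative, not a tested setting.
    """
    return pest_probability(species_scores) >= threshold

# The model cannot decide between hedgehog and rodent, but both are pests,
# so the combined score is still high.
unsure_which_pest = {"hedgehog": 0.55, "rodent": 0.42, "bird": 0.03}
probably_a_bird = {"bird": 0.60, "rodent": 0.40}

print(should_trigger(unsure_which_pest))
print(should_trigger(probably_a_bird))
```

Confusion between two pest species costs nothing here, while any meaningful chance of a bird keeps the trap in its safe mode.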

Digital trigger for adaptive lures

An adaptive lure is, for instance, a species-specific lure. Imagine if you knew a stoat had just run into a clearing. Probably not much use having the smell of a fresh apple about but the smell of a rat inside your trap might be really useful to lure the stoat in. We now have a system that allows anyone to develop these types of species-specific lures - all they need is one of our cameras.

The adaptive lures that can be developed are triggered specifically by what the camera “sees”. The camera sends out a Bluetooth notification with the predator and an associated confidence level. An adaptive lure device can then respond appropriately (for instance, our scent lure could decide to spray a rat scent). We have three types of devices in testing currently - an audio device (speaker), a scent spraying device, and a visual lure device.

Each of these devices can be programmed to respond in an adaptive way, triggered by the notification from the AI camera. In terms of the AI accuracy question, in this case, any improvement on a standard lure is a bonus so any level of accuracy is better than nothing. But the adaptive lure device can also decide what confidence level is acceptable before it chooses to respond - maybe wait until the AI broadcasts a confidence above 90% before you dispense that tasty egg mayo?
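As a sketch of what that decision might look like on the lure device, the snippet below takes a notification (species plus confidence) and picks a response. The `Notification` class, the `LURE_FOR_SPECIES` table, and `choose_lure` are hypothetical names for illustration only; the real Bluetooth message format is not described here, and the only pairing taken from the text above is a rat scent for a stoat.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Notification:
    """Assumed shape of what the camera broadcasts: a species and a confidence."""
    species: str
    confidence: float  # 0.0 to 1.0

# Hypothetical lookup of which lure suits which predator.
LURE_FOR_SPECIES = {
    "stoat": "rat scent",
}

def choose_lure(note: Notification, min_confidence: float = 0.9) -> Optional[str]:
    """Return the lure to deploy, or None to keep waiting."""
    # Hold off until the camera is more than 90% confident (by default).
    if note.confidence <= min_confidence:
        return None
    # No species-specific lure configured means no response.
    return LURE_FOR_SPECIES.get(note.species)

print(choose_lure(Notification("stoat", 0.95)))
print(choose_lure(Notification("stoat", 0.60)))
```

Because the threshold is a parameter on the device side, each lure can pick its own trade-off: a cheap audio lure might respond at low confidence, while one that dispenses a limited supply of bait might wait for a near-certain identification.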

Summary

Although AI accuracy may only be 60-70% on average for each individual video, the AI accuracy per visit for common predators is approaching 100% and makes predator surveys accurate enough to be useful in the field already.

The usefulness for other applications such as re-invasion notification or digital triggers for traps also looks close enough to be useful now and is improving with each test result we see.

Over time it is inevitable that the cost of these AI devices will continue to reduce dramatically, lowering the cost of overall predator control in New Zealand.

As always, we welcome your feedback so don't hesitate to get in touch - leave a comment below or email us at blog@cacophony.org.nz.
