How AI works in Animl
For the foreseeable future, we do not expect a single ML model to emerge that can meet everyone's camera trap image classification needs. Instead, the ML landscape for camera traps will likely look more like a mosaic of smaller, niche models designed for specific conservation use cases, environments, and target species.
To support this, we designed Animl to be as flexible and model-agnostic as possible. In other words, we see Animl as a platform that allows users to rapidly deploy their own domain-specific classifiers and integrate them into their image processing pipelines. In addition to Bring-Your-Own-Model (BYOM) support, Animl offers some high-level, general-purpose models, like Megadetector v5a and v5b, out of the box for anyone to use.
Every new Project has access to Megadetector to perform object detection. Megadetector does a fantastic job of identifying whether or not there's an object of interest in an image, effectively allowing users to filter out empties automatically. If there is an object of interest in the image, Megadetector also draws a bounding box around it and attempts to predict whether the object is a person, vehicle, or animal.
Animl also allows users to configure settings for all of the models they use, including Megadetector. For example, if you are unlikely to see any vehicles in your camera trap data, or you simply don't need to label them and don't want to deal with invalidating any false positives that might arise, you can turn off that label in your Megadetector Automation Rule settings. You can also adjust the confidence threshold up or down for each label that Megadetector (or any other model) returns, allowing you to optimize performance for your unique data sets and labeling needs.
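To make the effect of these settings concrete, here is a minimal sketch of per-label filtering. It is not Animl's implementation or configuration schema: the `LABEL_SETTINGS` structure, the `filter_detections` function, and the threshold values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical per-label settings. The field names and values are
# illustrative only and do not reflect Animl's actual Automation Rule schema.
LABEL_SETTINGS: Dict[str, dict] = {
    "animal": {"enabled": True, "min_confidence": 0.80},
    "person": {"enabled": True, "min_confidence": 0.50},
    "vehicle": {"enabled": False, "min_confidence": 0.80},  # label turned off
}


def filter_detections(detections: List[dict], settings: Dict[str, dict]) -> List[dict]:
    """Keep detections whose label is enabled and meets its confidence threshold."""
    kept = []
    for det in detections:
        rule = settings.get(det["label"])
        # Drop labels that are unknown or have been turned off.
        if rule is None or not rule["enabled"]:
            continue
        if det["confidence"] >= rule["min_confidence"]:
            kept.append(det)
    return kept
```

With settings like these, a high-confidence vehicle detection would be discarded because the label is turned off, while a lower-confidence person detection could still be kept because its threshold is lower.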
Currently, the deployment of new models is a semi-manual process that TNC will have to assist you with. If you have a classifier and you're interested in deploying it for use in your Animl project, contact Nathaniel Rindlaub at <first name>.<last name>@tnc.org.
It's often desirable to run images through a sequence (or pipeline) of ML models that perform different functions. For example, if you have a classifier that can distinguish between deer, bears, and mountain lions, it would be a lot more efficient to request Megadetector predictions first, and then, if it thinks that there's an animal present, send that image along to your classifier to try to determine what kind of animal it is.
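The chaining idea can be sketched in a few lines. This is a simplified illustration rather than Animl's pipeline code: `run_pipeline`, `fake_detector`, and `fake_classifier` are hypothetical names, and the two stand-in models return hard-coded values so the example runs without any ML dependencies.

```python
from typing import Callable, List


def run_pipeline(image, detect: Callable, classify: Callable) -> List[dict]:
    """Run the detector first, then classify only detections labeled 'animal'."""
    results = []
    for det in detect(image):
        if det["label"] == "animal":
            # Only animal detections are sent on to the (more specific) classifier.
            results.append({**det, "species": classify(image, det)})
        else:
            results.append(det)
    return results


# Hypothetical stand-ins for real models, used only to exercise the pipeline.
def fake_detector(image) -> List[dict]:
    return [
        {"label": "animal", "confidence": 0.93},
        {"label": "person", "confidence": 0.88},
    ]


def fake_classifier(image, detection) -> str:
    return "deer"


predictions = run_pipeline("IMG_0001.jpg", fake_detector, fake_classifier)
```

Here the classifier is called once, for the animal detection only; the person detection passes through unchanged, and an empty image would never reach the classifier at all.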
Another advantage of chaining object detectors and classifiers is that Animl will automatically pass along the bounding box that was generated in the object-detection stage and use it to crop the background out of the image before sending it to the classifier for inference. Because backgrounds are often what make classification of camera trap images hard, stripping out as much of the background as possible can make classification predictions considerably more accurate.
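As a rough illustration of the cropping step, the sketch below converts a bounding box into pixel coordinates suitable for cropping. It assumes the box is given as normalized `[x_min, y_min, width, height]` values between 0 and 1; that layout is an assumption made for this example, so check the format of your own data before reusing it. The function name `bbox_to_pixel_crop` is hypothetical.

```python
from typing import Sequence, Tuple


def bbox_to_pixel_crop(
    bbox: Sequence[float], image_width: int, image_height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized [x_min, y_min, width, height] box (an assumed
    layout) into pixel coordinates (left, top, right, bottom), clamped to
    the image bounds."""
    x_min, y_min, width, height = bbox
    left = max(0, round(x_min * image_width))
    top = max(0, round(y_min * image_height))
    right = min(image_width, round((x_min + width) * image_width))
    bottom = min(image_height, round((y_min + height) * image_height))
    return left, top, right, bottom


crop_box = bbox_to_pixel_crop([0.25, 0.5, 0.5, 0.25], 1000, 800)
```

The returned tuple is ordered (left, top, right, bottom), which is the box order that Pillow's `Image.crop` accepts, so it could be passed along to crop the image before inference.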
Animl's flexible Automation Rules support all of this by allowing users to completely control and customize their pipelines, deciding which models their images are submitted to, under what conditions, and in what order.
It's important to note that Animl does not learn or improve on its own, nor does it allow users to automatically train their own classifiers (training classifiers is out of the scope of this documentation, but a brief summary can be found here).
What Animl does provide is a user interface and user management system for developing a labeled training dataset, which can then be exported and used to train new classifiers/models in an environment suited for deep learning. In fact, if you are using Animl to structure and label your data for any other reason, you are also creating, as a byproduct, a training dataset that might be useful for developing your own bespoke classifier down the road.
There are many more general-purpose tools out there that help you develop labeled training data for image classifiers, but Animl offers a handful of features geared towards addressing some of the unique challenges involved in training ML models for camera traps. For example:
If you're using Megadetector, bounding boxes are generated for you automatically, which dramatically reduces the time it takes to label your images because you don't have to draw each one manually.
The bounding box information is exported along with your labels, allowing you to crop the backgrounds out of your training images before you initiate training.
Location information (i.e., which deployment the image came from) is also exported, which can be critical when training camera trap classifiers. One of the challenges of ML with camera trap data is that models tend to learn too much about the backgrounds of the images they've been trained on, so it is often a good strategy to mitigate this by splitting your data into training, validation, and test sets by location.
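Splitting by location means that every image from a given deployment ends up in exactly one of the training, validation, or test sets. The sketch below shows one simple way to do this. It assumes each exported record is a dict with a `location` key; that field name, the `split_by_location` function, and the 70/15/15 fractions are all hypothetical choices for illustration, not Animl's export format.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def split_by_location(
    records: List[dict],
    fractions: Sequence[float] = (0.7, 0.15, 0.15),
    seed: int = 0,
) -> Dict[str, List[dict]]:
    """Assign whole locations (not individual images) to train/val/test."""
    by_location = defaultdict(list)
    for rec in records:
        by_location[rec["location"]].append(rec)

    # Shuffle the locations reproducibly, then slice them into three groups.
    locations = sorted(by_location)
    random.Random(seed).shuffle(locations)
    n_train = int(len(locations) * fractions[0])
    n_val = int(len(locations) * fractions[1])
    groups = {
        "train": locations[:n_train],
        "val": locations[n_train:n_train + n_val],
        "test": locations[n_train + n_val:],  # any remainder goes to test
    }
    return {
        name: [rec for loc in locs for rec in by_location[loc]]
        for name, locs in groups.items()
    }
```

Note that the fractions apply to the number of locations, not the number of images, so the resulting sets may be unevenly sized if some deployments produced far more images than others.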