
DeepWILD: Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning

Fanny Simões, Charles Bouveyron, Frédéric Precioso
Abstract : Ecologists increasingly use videos and images from camera traps to estimate species populations in a given territory. This is usually laborious work, because experts analyse all of the data manually. Filtering the videos also takes a lot of time when many of them are empty or show human presence. Deep learning algorithms for object detection can help ecologists identify multiple relevant species in their data and estimate their populations. In this study, we go further by using an object detection model to detect, classify and count species in camera trap videos. We developed a three-stage process to analyse camera trap videos. First, after splitting the videos into images, we annotate the images by associating a bounding box with each label using the MegaDetector algorithm. Second, we extend MegaDetector, which is based on the Faster R-CNN architecture with an Inception-ResNet-v2 backbone, so that it not only detects the 13 species considered but also classifies them. Finally, we define a counting method based on the maximum number of bounding boxes detected: a first version uses only the detection results, and an extended version uses both the detection and the classification results. Evaluated on the test dataset, our model obtains: (i) 73.92% mAP for classification, (ii) 96.88% mAP for detection at an Intersection-over-Union (IoU) threshold of 0.5 (IoU being the overlap ratio between the ground-truth bounding box and the detected one), and (iii) 89.24% mAP for detection at IoU=0.75. Large, well-represented classes such as human have the highest mAP values, around 81%, whereas classes less represented in the training dataset, such as dog, have the lowest, around 66%. As for our counting method, the predicted count was either exact or within ± 1 unit for 87% of our video sample when using detection results, and for 48% when using detection and classification results. Our model is also able to detect empty videos. To the best of our knowledge, this is the first study in France to use an object detection model in a French national park to locate, identify and estimate the population of species from camera trap videos.
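The IoU criterion and the counting rule mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how boxes are filtered or over which unit the maximum is taken, so the sketch assumes the maximum is taken per extracted frame, that boxes below a confidence score of 0.5 are discarded, and that the classification-aware variant takes the per-label maximum. The names `iou`, `count_by_detection` and `count_by_class`, the `min_score` parameter and the data layout are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[str, float, Box]       # (label, confidence score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_by_detection(frames: List[List[Detection]], min_score: float = 0.5) -> int:
    """Detection-only count: largest number of kept boxes in any single frame.

    Assumption: boxes scoring below `min_score` are ignored.
    A result of 0 can be read as an empty video.
    """
    per_frame = (sum(1 for _, score, _ in frame if score >= min_score) for frame in frames)
    return max(per_frame, default=0)


def count_by_class(frames: List[List[Detection]], min_score: float = 0.5) -> Dict[str, int]:
    """Classification-aware count: per label, largest number of kept boxes in any frame."""
    best: Dict[str, int] = {}
    for frame in frames:
        labels = Counter(label for label, score, _ in frame if score >= min_score)
        for label, n in labels.items():
            best[label] = max(best.get(label, 0), n)
    return best


if __name__ == "__main__":
    # Hypothetical detections for a three-frame video.
    video = [
        [("deer", 0.9, (0, 0, 10, 10))],
        [("deer", 0.8, (0, 0, 10, 10)), ("deer", 0.7, (20, 20, 30, 30)), ("dog", 0.3, (5, 5, 8, 8))],
        [],
    ]
    print(count_by_detection(video))
    print(count_by_class(video))
    print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

Taking the maximum over frames rather than the sum avoids counting the same animal once per frame; it can still undercount when individuals never appear together in one frame.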
Document type :
Preprints, Working Papers, ...
Contributor : Charles Bouveyron
Submitted on : Tuesday, October 4, 2022 - 6:58:43 PM
Last modification on : Tuesday, October 25, 2022 - 4:20:32 PM


  • HAL Id : hal-03797530, version 1


Fanny Simões, Charles Bouveyron, Frédéric Precioso. DeepWILD: Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning. 2022. ⟨hal-03797530⟩


