DeepWILD: Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning - 3IA Côte d’Azur – Interdisciplinary Institute for Artificial Intelligence
Preprint, Working Paper. Year: 2022

DeepWILD: Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning

Abstract

Ecologists increasingly use videos and images from camera traps to estimate species populations in a given territory. This is usually laborious work, because experts analyse all of the data manually. Filtering the videos also takes considerable time when many of them are empty or show only humans. Deep learning algorithms for object detection can help ecologists identify multiple relevant species in their data and estimate their populations. In this study, we go further by using an object detection model to detect, classify and count species in camera trap videos. We developed a three-stage process to analyse camera trap videos. First, after splitting the videos into images, we annotate the images by associating a bounding box with each label using the MegaDetector algorithm. Second, we extend MegaDetector, which is based on the Faster R-CNN architecture with an Inception-ResNet-v2 backbone, so that it not only detects the 13 species considered but also classifies them. Finally, we define a counting method based on the maximum number of bounding boxes detected: a first version uses detection results only, and an evolved version uses both detection and classification results. On the test dataset, our model achieves (i) 73.92% mAP for classification, (ii) 96.88% mAP for detection at an Intersection-over-Union (IoU) threshold of 0.5 (IoU being the overlap ratio between the ground-truth bounding box and the detected one), and (iii) 89.24% mAP for detection at IoU = 0.75. Large, well-represented classes such as human have the highest mAP values, around 81%, whereas classes that are less represented in the training dataset, such as dog, have the lowest, around 66%. As for counting, the predicted count was exact or within ± 1 unit for 87% of our video sample when using detection results, and for 48% when using both detection and classification results. Our model is also able to detect empty videos.
To the best of our knowledge, this is the first study in France to use an object detection model in a French national park to locate, identify and estimate the population of species from camera trap videos.
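The two quantities the abstract relies on, IoU and a count based on the maximum number of bounding boxes, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The box format `(x_min, y_min, x_max, y_max)`, the `min_score` threshold of 0.5, the function names, and the reading of the "evolved" method as a per-class maximum over frames are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed formats (not taken from the paper):
Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)
Detection = Tuple[str, float, Box]        # (class label, confidence score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned bounding boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_by_detection(frames: Sequence[List[Detection]],
                       min_score: float = 0.5) -> int:
    """Detection-only count: the largest number of boxes seen in any one frame."""
    return max(
        (sum(1 for _, score, _ in frame if score >= min_score) for frame in frames),
        default=0,
    )


def count_by_class(frames: Sequence[List[Detection]],
                   min_score: float = 0.5) -> Dict[str, int]:
    """Assumed 'evolved' count: per-class maximum number of boxes in any one frame."""
    counts: Dict[str, int] = {}
    for frame in frames:
        per_frame: Dict[str, int] = {}
        for label, score, _ in frame:
            if score >= min_score:
                per_frame[label] = per_frame.get(label, 0) + 1
        for label, n in per_frame.items():
            counts[label] = max(counts.get(label, 0), n)
    return counts


# Hypothetical detections for a three-frame video.
example_box: Box = (0.0, 0.0, 1.0, 1.0)
video = [
    [("deer", 0.9, example_box), ("deer", 0.8, example_box)],
    [("deer", 0.9, example_box), ("fox", 0.7, example_box), ("deer", 0.3, example_box)],
    [],
]
total = count_by_detection(video)
per_class = count_by_class(video)
```

Under these assumptions, a video whose frames contain no detection above the threshold gets a count of 0, which is one simple way to flag it as empty.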
Main file
Detection__classification_and_counting_of_species_on_camera_trap_videos_using_deep_learning-3.pdf (31.67 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03797530, version 1 (04-10-2022)
hal-03797530, version 2 (04-04-2023)

Identifiers

  • HAL Id: hal-03797530, version 1

Cite

Fanny Simões, Charles Bouveyron, Frédéric Precioso. DeepWILD: Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning. 2022. ⟨hal-03797530v1⟩
353 views
167 downloads