Journal articles

VirHunter: A Deep Learning-Based Method for Detection of Novel RNA Viruses in Plant Sequencing Data

Abstract : High-throughput sequencing (HTS) enables broad detection of both known and unknown viruses in a variety of hosts and habitats. It has been successfully applied to novel virus discovery in many agricultural crops, leading to the current drive to apply this technology routinely for plant health diagnostics. This requires efficient and precise methods for sequencing-based virus detection and discovery. However, neither existing alignment-based methods, which rely on reference databases, nor more recent machine learning approaches are efficient enough at detecting unknown viruses in RNAseq datasets of plant viromes. We present VirHunter, a deep learning method based on a convolutional neural network, to detect novel and known viruses in assemblies of sequencing datasets. While our method is generally applicable to a variety of viruses, here we trained and evaluated it specifically for RNA viruses by reinforcing the coding-sequence content of the training dataset. Trained on the NCBI plant viruses data for three different host species (peach, grapevine, and sugar beet), VirHunter outperformed the state-of-the-art method, DeepVirFinder, for the detection of novel viruses, both in the synthetic leave-out setting and on 12 newly acquired RNAseq datasets. Compared with the traditional tBLASTx approach, VirHunter consistently exhibited better results in the majority of leave-out experiments. In conclusion, we have shown that VirHunter can be used to streamline the analysis of HTS-acquired plant viromes and is particularly well suited for the detection of novel viral contigs in RNAseq datasets.
Document type :
Journal articles

https://hal.archives-ouvertes.fr/hal-03671482
Contributor : Macha Nikolski
Submitted on : Thursday, June 2, 2022 - 9:48:31 AM
Last modification on : Tuesday, October 4, 2022 - 4:46:30 PM
Long-term archiving on : Saturday, September 3, 2022 - 6:38:43 PM

File

2022-Sukhorukov-Frontiers_in-B...
Publication funded by an institution

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Grigorii Sukhorukov, Maryam Khalili, Olivier Gascuel, Thierry T. Candresse, Armelle Marais, et al. VirHunter: A Deep Learning-Based Method for Detection of Novel RNA Viruses in Plant Sequencing Data. Frontiers in Bioinformatics, Frontiers Media, 2022, 2, pp.867111. ⟨10.3389/fbinf.2022.867111⟩. ⟨hal-03671482⟩
