Comparing sequences without using alignments: application to HIV/SIV subtyping - Institut de Mathématiques de Luminy
Journal article, BMC Bioinformatics, Year: 2007

Comparing sequences without using alignments: application to HIV/SIV subtyping

Abstract

Background: In general, the construction of trees is based on sequence alignments. This procedure, however, leads to loss of information when parts of sequence alignments (for instance ambiguous regions) are deleted before tree building. To overcome this difficulty, one of us previously introduced a new and rapid algorithm that calculates dissimilarity matrices between sequences without preliminary alignment.

Results: In this paper, HIV (Human Immunodeficiency Virus) and SIV (Simian Immunodeficiency Virus) sequence data are used to evaluate this method. The program produces tree topologies that are identical to those obtained by a combination of standard methods detailed in the HIV Sequence Compendium. Manual alignment editing is not necessary at any stage. Furthermore, only one user-specified parameter is needed for constructing trees.

Conclusion: The extensive tests on HIV/SIV subtyping showed that the virus classifications produced by our method are in good agreement with our best taxonomic knowledge, even in non-coding LTR (Long Terminal Repeat) regions that are not tractable by regular alignment methods due to frequent duplications, insertions, and deletions. Our method, however, is not limited to HIV/SIV subtyping. It provides an alternative way to construct trees without a time-consuming alignment procedure.
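As a rough illustration of what computing a dissimilarity matrix "without preliminary alignment" can look like, the sketch below compares sequences through their k-mer frequency profiles. This is a generic alignment-free technique substituted here for illustration; it is not the algorithm evaluated in the article, whose details are not given in this abstract. The function names, the Euclidean distance, the toy sequences, and the parameter `k` are all assumptions made for the example, and `k` is not claimed to be the single parameter the authors refer to.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kmer_profile(seq, k):
    """Relative frequency of every overlapping k-mer in seq (hypothetical helper)."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles."""
    words = set(p) | set(q)
    return sqrt(sum((p.get(w, 0.0) - q.get(w, 0.0)) ** 2 for w in words))


def dissimilarity_matrix(seqs, k=3):
    """Symmetric matrix of pairwise profile distances; no alignment is computed."""
    profiles = [kmer_profile(s, k) for s in seqs]
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = profile_distance(profiles[i], profiles[j])
    return matrix


# Toy sequences for illustration only (not HIV/SIV data).
toy = ["ACGTACGTACGT", "ACGTACGTACGA", "TTTTGGGGCCCC"]
D = dissimilarity_matrix(toy, k=3)
```

A matrix such as `D` could then be passed to any distance-based tree-building method (for example neighbor joining); which method the authors actually use is not stated in this abstract.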
Main file
2007_Didier_BMC-Bioinformatics_1.pdf (341.99 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

inria-00289069, version 1 (30-05-2020)

Cite

Gilles Didier, Laurent Debomy, Maude Pupin, Ming Zhang, Alexander Grossmann, et al. Comparing sequences without using alignments: application to HIV/SIV subtyping. BMC Bioinformatics, 2007, 8 (1), ⟨10.1186/1471-2105-8-1⟩. ⟨inria-00289069⟩
