
Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A

Abstract : Motivation: Making sense of the ever-increasing number of metagenomic sequences accumulating in our databases demands approaches that rapidly 'explore' the content of multiple and/or large metagenomic datasets with respect to specific domain targets, avoiding full domain annotation and full assembly. Results: S3A is a fast and accurate domain-targeted assembler designed for rapid functional profiling. It is based on a novel construction and a fast traversal of the Overlap-Layout-Consensus graph, designed to reconstruct coding regions from domain-annotated metagenomic sequence reads. S3A relies on high-quality domain annotation to efficiently assemble metagenomic sequences and on a new confidence measure for fast evaluation of overlapping reads. Its implementation is highly generic and can be applied to any type of annotation. On simulated data, S3A achieves a level of accuracy similar to that of classical metagenomic assembly tools while enabling faster and sensitive profiling of domains of interest. When studying a few dozen functional domains (a typical scenario), S3A is up to an order of magnitude faster than general-purpose metagenomic assemblers, thus enabling the analysis of a larger number of datasets in the same amount of time. S3A opens new avenues for the fast exploration of the rapidly growing number of increasingly large metagenomic datasets. Availability and implementation: S3A is available at
Document type :
Journal articles
Contributor : Hal Sorbonne Université Gestionnaire
Submitted on : Thursday, September 10, 2020 - 2:01:59 PM
Last modification on : Thursday, October 15, 2020 - 2:44:05 PM
Long-term archiving on : Thursday, December 3, 2020 - 2:04:54 AM


Publication funded by an institution


Distributed under a Creative Commons Attribution - NonCommercial 4.0 International License



Laurent David, Riccardo Vicedomini, Hugues Richard, Alessandra Carbone. Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A. Bioinformatics, Oxford University Press (OUP), 2020, 36 (13), pp.3975-3981. ⟨10.1093/bioinformatics/btaa272⟩. ⟨hal-02935519⟩


