Conference paper, Year: 2020

Efficient Batch-Incremental Classification Using UMAP for Evolving Data Streams

Abstract

Learning from potentially infinite, high-dimensional data streams poses significant challenges for classification. For instance, k-Nearest Neighbors (kNN), one of the most widely used algorithms in data stream mining, proves very resource-intensive in high-dimensional spaces. Uniform Manifold Approximation and Projection (UMAP) is a recent manifold learning technique and one of the most promising dimensionality reduction and visualization methods in the non-streaming setting because of its high performance compared with competitors. However, no version of UMAP copes with the challenging context of streams. To overcome this limitation, we propose a batch-incremental approach that pre-processes data streams with UMAP, producing successive embeddings on a stream of disjoint batches in order to support incremental kNN classification. Experiments on publicly available synthetic and real-world datasets demonstrate the substantial gains that our proposal can achieve compared with state-of-the-art techniques.
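The batch-incremental idea described in the abstract (reduce each disjoint batch, then classify with kNN over recently seen embedded instances) can be illustrated with a minimal sketch. This is not the authors' implementation. To stay dependency-free, the reducer here is a fixed Gaussian random projection standing in for UMAP, and all names and parameters (`make_random_projection`, `batch_incremental_knn`, `batch_size`, `window_size`, the synthetic stream) are assumptions made for illustration only.

```python
import math
import random
from collections import Counter, deque


def make_random_projection(in_dim, out_dim, seed=0):
    """Stand-in reducer (NOT UMAP): a fixed Gaussian random projection."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
              for _ in range(out_dim)]

    def reduce_batch(batch):
        # Map every point of the batch into the low-dimensional space.
        return [[sum(w * x for w, x in zip(row, point)) for row in matrix]
                for point in batch]

    return reduce_batch


def knn_predict(window, query, k):
    """Majority vote among the k nearest (embedding, label) pairs."""
    nearest = sorted(window, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def batch_incremental_knn(stream, reduce_batch, batch_size=50, k=5,
                          window_size=500):
    """Test-then-train over disjoint batches; returns the accuracy."""
    window = deque(maxlen=window_size)  # sliding window of embedded instances
    batch, correct, tested = [], 0, 0
    for x, y in stream:
        batch.append((x, y))
        if len(batch) < batch_size:
            continue
        embedded = reduce_batch([features for features, _ in batch])
        labels = [label for _, label in batch]
        if window:  # the first batch only trains, there is nothing to test on
            for z, y_true in zip(embedded, labels):
                correct += knn_predict(window, z, k) == y_true
                tested += 1
        window.extend(zip(embedded, labels))  # train on the batch just tested
        batch = []  # a trailing partial batch is dropped at the end
    return correct / tested if tested else 0.0


def synthetic_stream(n, dim=20, seed=1):
    """Hypothetical two-class Gaussian stream, used only for this demo."""
    rng = random.Random(seed)
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 3.0 if y else -3.0
        yield [rng.gauss(centre, 1.0) for _ in range(dim)], y


accuracy = batch_incremental_knn(synthetic_stream(1000),
                                 make_random_projection(20, 5))
print(f"prequential accuracy: {accuracy:.3f}")
```

The reducer is passed in as a callable so that a UMAP-based one (for example, one built on the `UMAP` class of the umap-learn package and its `fit_transform` method) could be substituted. Embeddings fitted independently on separate batches are not guaranteed to share a coordinate system, which this sketch does not address; the fixed projection sidesteps the issue because every batch is mapped by the same matrix.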
Main file
bahri2020efficient.pdf (1.54 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03190032, version 1 (05-04-2021)

Cite

Maroua Bahri, Bernhard Pfahringer, Albert Bifet, Silviu Maniu. Efficient Batch-Incremental Classification Using UMAP for Evolving Data Streams. IDA 2020 - 18th International Symposium on Intelligent Data Analysis, Apr 2020, Konstanz / Virtual, Germany. pp.40-53, ⟨10.1007/978-3-030-44584-3_4⟩. ⟨hal-03190032⟩