Unsupervised Fine-grained Hate Speech Target Community Detection and Characterisation on Social Media - 3IA Côte d’Azur – Interdisciplinary Institute for Artificial Intelligence
Journal article. Social Network Analysis and Mining. Year: 2023

Unsupervised Fine-grained Hate Speech Target Community Detection and Characterisation on Social Media

Abstract

Recent studies have highlighted the importance of a fine-grained characterisation of online hate speech to better understand how hate is conveyed, especially on social media. A key element in this scenario is the identification and characterisation of the hate speech target community, e.g., national, ethnic, or religious minorities. In this paper, we propose a full pipeline relying on unsupervised methods to distinguish specific hate speech manifestations, i.e., the targeted (group of) victim(s) and the protected characteristics (target-types) discriminated against. Our contribution is threefold: (1) we leverage multiple data views to contrast different abusive behaviours; (2) we explore the use of clustering techniques to perform fine-grained hate speech target community detection; and (3) we carry out an in-depth content analysis of the generated hate speech target communities. Relying on multiple data views derived from multilingual pre-trained language models (i.e., multilingual BERT and multilingual Universal Sentence Encoder) and the Multi-view Spectral Clustering (MvSC) algorithm, the 69 experiments performed on the Multilingual Hate Speech dataset (MLMA) of tweets show that most configurations of the proposed pipeline significantly outperform state-of-the-art clustering algorithms on French and English. Our experiments confirm the ability of the proposed approach to capture complex hate speech phenomena (i.e., intersections between victim-groups, target-types, or both).
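
The multi-view idea in the abstract can be illustrated with a small sketch. This is not the MvSC algorithm evaluated in the paper: it is a simpler variant that averages one affinity matrix per view and then applies a standard spectral bipartition. The two toy arrays are hand-made stand-ins for sentence embeddings from two encoders (such as multilingual BERT and multilingual Universal Sentence Encoder), and the function names `cosine_affinity` and `fused_spectral_bipartition` are hypothetical.

```python
import numpy as np

def cosine_affinity(X):
    """Non-negative cosine similarity between rows of one view."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    W = np.clip(Xn @ Xn.T, 0.0, None)
    np.fill_diagonal(W, 0.0)  # no self-loops in the similarity graph
    return W

def fused_spectral_bipartition(views):
    """Average per-view affinities, then split items into two clusters
    using the sign of the Fiedler vector of the normalized Laplacian.
    Assumption: simple affinity averaging, not the paper's MvSC."""
    W = sum(cosine_affinity(X) for X in views) / len(views)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]         # eigenvector of 2nd smallest eigenvalue
    return (fiedler > 0).astype(int)

# Toy data: six "tweets", two views; items 0-2 and 3-5 form two groups.
view_a = np.array([[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1],
                   [0.1, 1.0, 0.0], [0.2, 0.9, 0.0], [0.0, 1.0, 0.1]])
view_b = np.array([[0.0, 0.1, 1.0], [0.1, 0.0, 1.0], [0.0, 0.2, 0.9],
                   [1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [1.0, 0.1, 0.1]])

labels = fused_spectral_bipartition([view_a, view_b])
print(labels)
```

The cluster ids themselves are arbitrary (the sign of an eigenvector is not fixed), so only the grouping is meaningful. For more than two clusters, one would typically keep several eigenvectors and run k-means on them.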
Main file
snam_2023.pdf (1.51 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04014977, version 1 (05-03-2023)

Cite

Anaïs Ollagnier, Elena Cabrio, Serena Villata. Unsupervised Fine-grained Hate Speech Target Community Detection and Characterisation on Social Media. Social Network Analysis and Mining, 2023, ⟨10.1007/s13278-023-01061-4⟩. ⟨hal-04014977⟩