Unsupervised Word Segmentation from Speech with Attention - Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur
Conference paper. Year: 2018

Unsupervised Word Segmentation from Speech with Attention

Abstract

We present a first attempt to perform attentional word segmentation directly from the speech signal, with the ultimate goal of automatically identifying lexical units in a low-resource, unwritten language (UL). Our methodology assumes that recordings in the UL are paired with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones, which is then segmented using soft alignments produced by a neural machine translation model. The evaluation uses an actual Bantu UL, Mboshi; comparisons to monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.
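The abstract does not spell out how soft alignments become word boundaries. As a minimal sketch, assuming one plausible rule (hard-align each pseudo-phone to the translation word with the largest attention weight, and place a boundary wherever the aligned word changes), the segmentation step might look like the following. The function name, the pseudo-phone labels, and the attention values are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def segment_from_attention(
    units: Sequence[str],
    attention: Sequence[Sequence[float]],
) -> List[List[str]]:
    """Group pseudo-phones into segments using a soft-alignment matrix.

    attention[i][j] is the (hypothetical) weight linking pseudo-phone i
    to translation word j. This rule is an assumption, not the paper's
    confirmed procedure.
    """
    if len(units) != len(attention):
        raise ValueError("need one attention row per pseudo-phone")

    segments: List[List[str]] = []
    previous_word = None
    for unit, row in zip(units, attention):
        # Hard-align the unit to the translation word with the largest weight.
        best_word = max(range(len(row)), key=row.__getitem__)
        # A change of aligned word marks a segment boundary.
        if best_word != previous_word:
            segments.append([])
        segments[-1].append(unit)
        previous_word = best_word
    return segments


# Toy input: five made-up pseudo-phone labels and two translation words.
units = ["a1", "a7", "a3", "a3", "a9"]
attention = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.2, 0.8],
    [0.4, 0.6],
]
print(segment_from_attention(units, attention))
```

In this toy case the first two units align to the first translation word and the last three to the second, so the sequence is split into two segments. A real attention matrix would come from a trained translation model, and the pseudo-phone sequence from the AUD step.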
Main file
template.pdf (266.28 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01818092, version 1 (18-06-2018)

Identifiers

  • HAL Id: hal-01818092, version 1

Cite

Pierre Godard, Marcely Zanon Boito, Lucas Ondel, Alexandre Berard, François Yvon, et al. Unsupervised Word Segmentation from Speech with Attention. Interspeech 2018, Sep 2018, Hyderabad, India. ⟨hal-01818092⟩
