Comparing Data Augmentation and Annotation Standardization to Improve End-to-end Spoken Language Understanding Models

Published 10 Feb 2022


Leah Nicolich-Henkin, Taichi Nakatani, Zach Trozenski, Joel Whiteman, Nathan Susanj

All-neural end-to-end (E2E) Spoken Language Understanding (SLU) models can outperform traditional compositional SLU models, but they require high-quality training data with both audio and annotations. In particular, they struggle on “golden utterances”, which are essential for defining and supporting features but may lack sufficient training data. In this paper, we compare two data-centric AI methods for improving performance on golden utterances: improving the annotation quality of existing training utterances, and augmenting the training data with varying amounts of synthetic data. Our experimental results show improvements with both methods. In particular, augmenting with synthetic data is effective in addressing errors caused both by inconsistent training data annotations and by a lack of training data. This method improves intent recognition error rate (IRER) on our golden utterance test set by 93% relative to the baseline, with no negative impact on other test metrics.
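To make the "relative to the baseline" wording concrete, here is a minimal sketch of how an intent error rate and its relative reduction could be computed. It assumes IRER is the fraction of utterances whose predicted intent differs from the reference intent; the abstract does not spell out the definition, so that is an assumption, and the function names and the toy labels are hypothetical and do not come from the paper.

```python
def intent_recognition_error_rate(references, predictions):
    """Fraction of utterances whose predicted intent differs from the reference.

    Assumed definition of IRER for illustration only.
    """
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if not references:
        raise ValueError("at least one utterance is required")
    errors = sum(ref != pred for ref, pred in zip(references, predictions))
    return errors / len(references)


def relative_reduction(baseline_rate, new_rate):
    """Relative improvement of new_rate over baseline_rate (0.93 means 93%)."""
    if baseline_rate <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate


# Toy, made-up labels (not data from the paper).
reference_intents = ["PlayMusic"] * 10
baseline_predictions = ["PlayMusic"] * 6 + ["GetWeather"] * 4
augmented_predictions = ["PlayMusic"] * 9 + ["GetWeather"] * 1

baseline_irer = intent_recognition_error_rate(reference_intents, baseline_predictions)
augmented_irer = intent_recognition_error_rate(reference_intents, augmented_predictions)
improvement = relative_reduction(baseline_irer, augmented_irer)
```

A relative figure of this kind describes how much of the baseline's error was removed, not the absolute change in error rate, so it should be read alongside the baseline rate itself.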

This video is from the NeurIPS 2021 Data-centric AI workshop proceedings.
