OK Aura Wake-up Word Dataset

DOI: 10.5281/zenodo.5734340, Zenodo: 5734340, MaRDI QID: Q6700106

Dataset published in the Zenodo repository.

Mireia Farrús, David Bonet, Carlos Segura, Fernando López, Pablo Gómez, Jordi Luque, Guillermo Cámbara

Publication date: 29 November 2021

Speech dataset for wake-up word (WuW) detection in Telefónica's home assistant, Aura. It contains 1247 utterances (1.4 hours) from ~80 speakers. Speakers pronounce the wake-up word itself, "OK Aura", plus other sentences that may or may not resemble it. The dataset includes rich metadata annotations, making it possible to study diverse factors and biases that might affect wake-up word detection performance: accent, gender, prosody/emotion, room size, distance to the microphone, etc. It also contains recordings of sentences that are phonetically similar to "OK Aura", such as "Porque Laura..." or "... como Aura...", intended for experimenting with difficult sentences.
