ZRC Series

The Zero Resource Speech Challenge is organized by CoML. CoML is based in the Département d'Études Cognitives of the École Normale Supérieure and is a joint EHESS, ENS, CNRS and Inria research team working at the intersection of cognitive science and computer science.

Past and Present

Six editions of the Zero Resource Speech Challenge have been organized over the years as events at different venues (Interspeech, ASRU, NeurIPS); they are summarized in the table below. Each edition has explored a different combination of tasks and introduced different datasets. Overall, the six editions have received a total of 115 submissions from 29 teams. In addition, several papers have been published using some of the Zero Resource benchmark metrics, and we include these in our review as well.

| Challenge | Tasks | Training data | Test benchmark |
|-----------|-------|---------------|----------------|
| 2015 | T1, T2 | English (Buckeye, 5h), Xitsonga (2h30) | ABX-15, TDE-15 |
| 2017 | T1, T2 | English (45h), French (24h), Mandarin (2h30), German (25h), Wolof (10h) | ABX-17, TDE-17 |
| 2019 | T3 | English (15h + 4h40), Indonesian (15h + 1h30) | TTS0-19 |
| 2020 | T1, T2, T3 | reboot of ZR17 and ZR19 | ABX-17, TDE-17, TTS0-19 |
| 2021a | T1, T4 | English (LibriSpeech, 960h or 100h) | sLM-21 (ABX-LS, sWUGGY, sBLIMP, sSIMI) |
| 2021b | T1, T4 | idem, plus Speech COCO | sLM-21 |
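
For readers who want to script against this summary, the sketch below simply restates the table as a small Python structure and shows one lookup (which editions include task T1). The field names (`tasks`, `train_data`, `benchmarks`) and the helper function are our own illustrative choices and are not part of any official ZRC tooling.

```python
# Machine-readable restatement of the edition summary table above.
# Field names are illustrative only; they do not mirror any official ZRC API.
ZRC_EDITIONS = {
    "2015":  {"tasks": ["T1", "T2"],
              "train_data": "English (Buckeye, 5h), Xitsonga (2h30)",
              "benchmarks": ["ABX-15", "TDE-15"]},
    "2017":  {"tasks": ["T1", "T2"],
              "train_data": "English (45h), French (24h), Mandarin (2h30), "
                            "German (25h), Wolof (10h)",
              "benchmarks": ["ABX-17", "TDE-17"]},
    "2019":  {"tasks": ["T3"],
              "train_data": "English (15h + 4h40), Indonesian (15h + 1h30)",
              "benchmarks": ["TTS0-19"]},
    "2020":  {"tasks": ["T1", "T2", "T3"],
              "train_data": "reboot of ZR17 and ZR19",
              "benchmarks": ["ABX-17", "TDE-17", "TTS0-19"]},
    "2021a": {"tasks": ["T1", "T4"],
              "train_data": "English (LibriSpeech, 960h or 100h)",
              "benchmarks": ["sLM-21 (ABX-LS, sWUGGY, sBLIMP, sSIMI)"]},
    "2021b": {"tasks": ["T1", "T4"],
              "train_data": "same as 2021a, plus Speech COCO",
              "benchmarks": ["sLM-21 (ABX-LS, sWUGGY, sBLIMP, sSIMI)"]},
}

def editions_with_task(task: str) -> list[str]:
    """Return the editions whose task list includes `task` (e.g. 'T1')."""
    return [name for name, info in ZRC_EDITIONS.items() if task in info["tasks"]]

if __name__ == "__main__":
    print(editions_with_task("T1"))  # ['2015', '2017', '2020', '2021a', '2021b']
```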

References

  • 2015: Versteegh et al. (2016). The Zero Resource Speech Challenge 2015: Proposed approaches and results. Procedia Computer Science (Proceedings of SLTU 2016), 81, 67–72.
  • 2017: Dunbar et al. (2017). The Zero Resource Speech Challenge 2017. IEEE ASRU. https://arxiv.org/abs/1712.04313
  • 2019: Dunbar et al. (2019). The Zero Resource Speech Challenge 2019: TTS without T. https://arxiv.org/abs/1904.11469
  • 2020: Dunbar et al. (2020). The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units.
  • 2021: Dunbar et al. (2021). The Zero Resource Speech Challenge 2021: Spoken language modelling.

Archived pages for the older challenges can be found in the Challenges Archive subsection of this website.

Funding

  • For E. Dunbar, this research was supported by the Connaught Fund and the Arts and Science Tri-Council Bridging Fund, University of Toronto, and by the French Agence Nationale de la Recherche (ANR), grant ANR-17-CE28-0009 (GEOM-PHON).

  • For E. Dupoux in his EHESS role, this work was supported by ANR grants ANR-10-IDEX-0001-02 PSL* and ANR-19-P3IA-0001 PRAIRIE 3IA Institute, and by a Meta AI Research Gift.