How to participate

Choosing a training dataset

For training, you can use the zrc2017-train-dataset or the zrc2015-dataset, depending on your use case. The datasets can be downloaded with our toolkit or directly from the URLs provided in our repository. For more details on the training datasets, see the datasets section.

Using our toolkit

We recommend installing and using our toolkit to manage, evaluate and upload your submissions. The toolkit is a Python package containing evaluation scripts, scripts to download datasets and other relevant files, and scripts that facilitate uploading results to the leaderboards. You can find instructions on how to download and use our toolkit here.

Submission Preparation

Each benchmark requires a specific set of files to be prepared.

To facilitate this, you can use the zrc submission:init <name> <location> command from the toolkit to create an empty submission template folder, where <name> is the name of the benchmark (tde15, tde17) and <location> is the path where the directory will be created.


meta.yaml

This file contains meta information about the author and how the submission was created.

Example:

  model_id: null
  gpu_budget: 60
  system_description: "CPC-big (trained on librispeech 960), kmeans (trained on librispeech 100), LSTM. See for more details."
  train_set: "librispeech 960, librispeech 100"
  author_label: "Nguyen et al."
  authors: "Nguyen, T., Seyssel, M., Rozé, P., Rivière, M., Kharitonov, E., Baevski, A., Dunbar, E. & Dupoux, E."
  paper_title: "The zero resource speech benchmark 2021: Metrics and baselines for unsupervised spoken language modeling."
  paper_url: ""
  publication_year: 2021
  institution: "EHESS, ENS, PSL Research University, CNRS and Inria"
  team: "CoML Team"
  code_url: ""
  open_source: true
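If you generate many submissions from scripts, the flat key/value layout shown above can be produced programmatically. The following is a minimal sketch, assuming only the scalar fields from the example; the helper name render_meta is hypothetical and is not part of the toolkit. It relies on the fact that JSON scalars (quoted strings, numbers, true, null) are also valid YAML scalars.

```python
import json


def render_meta(fields):
    """Render a flat mapping of scalar values as YAML key/value lines.

    Hypothetical helper, not part of the zrc toolkit. Each value is
    serialised with json.dumps, which yields valid YAML scalars.
    """
    lines = []
    for key, value in fields.items():
        if isinstance(value, (dict, list, tuple, set)):
            raise TypeError(f"{key}: only scalar values are supported")
        # ensure_ascii=False keeps accented author names readable.
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    return "\n".join(lines) + "\n"
```

The returned text can then be written over the meta.yaml created by the submission template.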

To Note

While most of the information in meta.yaml is optional, we would appreciate it if you took the time to fill it in, as it allows us to verify submissions and keep track of all the systems that use our benchmarks.

We would also appreciate it if you made your code open source and provided a link to it, although we understand that this is not always possible.


The submission also includes a parameters file whose values can override the defaults of each benchmark.

njobs: <int> specifies the number of processes to use to speed up the evaluation.

Model outputs

The spoken word discovery system should output an ASCII file listing the set of fragments that were found with the following format:

Class <classnb>
<filename> <fragment_onset> <fragment_offset>
<filename> <fragment_onset> <fragment_offset>
Class <classnb>
<filename> <fragment_onset> <fragment_offset>

For example:

Class 1
dsgea01   1.238  1.763
dsgea19   3.380  3.821
reuiz28  18.036 18.537

Class 2
zeoqx71   8.389  9.132

The onset and offset are in seconds. If your system only does matching and not clustering, each class will contain exactly two elements. If your system does clustering and parsing in addition to matching, the fragments found will cover the entirety of the files, and there may be classes with only one element in them (the remainder of lexical-based segmentation).
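The format above can be written and read back with a few lines of Python. This is a minimal sketch, not the toolkit's own reader or writer: the function names write_classes and read_classes are hypothetical, three decimal places are used because the example shows them, and a blank line is emitted after each class to follow the example.

```python
def write_classes(classes, path):
    """Write {class_number: [(filename, onset, offset), ...]} to an ASCII file."""
    with open(path, "w", encoding="ascii") as out:
        for classnb, fragments in classes.items():
            out.write(f"Class {classnb}\n")
            for filename, onset, offset in fragments:
                # Onsets and offsets are in seconds; reject empty or reversed spans.
                if not 0 <= onset < offset:
                    raise ValueError(f"bad fragment in {filename}: {onset} {offset}")
                out.write(f"{filename} {onset:.3f} {offset:.3f}\n")
            out.write("\n")  # blank separator line, as in the example


def read_classes(path):
    """Parse a class file back into {class_number: [(filename, onset, offset), ...]}."""
    classes = {}
    current = None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            parts = line.split()
            if not parts:
                continue  # skip blank separator lines
            if parts[0] == "Class":
                current = int(parts[1])
                classes[current] = []
            elif current is None:
                raise ValueError("fragment line found before any Class header")
            else:
                filename, onset, offset = parts
                classes[current].append((filename, float(onset), float(offset)))
    return classes
```

Reading your own output back before running the evaluation is a cheap way to catch malformed lines early.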

Structure of files for each benchmark:
  • tde15
  • tde17

Running the evaluation

Once the submission has been successfully created, you can run the evaluation. Depending on your benchmark, use one of the following commands:

  • zrc benchmarks:run tde17 </path/to/submission> -o scores_dir
  • zrc benchmarks:run tde15 </path/to/submission> -o scores_dir

Your results are created in the scores_dir directory.

You can run a partial evaluation by using the -t, --tasks option to select specific sub-tasks. For example:

zrc benchmarks:run tde17 </path/to/submission> -o scores_dir -t english french mandarin

This runs the evaluation only on those languages, skipping German and Wolof. This is useful during development, but please evaluate on all languages when uploading results to our leaderboards, as this makes system comparison more meaningful.
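When the evaluation is launched from a training or experiment script, the command line can be assembled in Python. The sketch below only uses the command form and the -o and -t options shown above; the function names build_run_command and run_evaluation are hypothetical, and it assumes the zrc executable is on your PATH.

```python
import subprocess

BENCHMARKS = ("tde15", "tde17")


def build_run_command(benchmark, submission_dir, scores_dir, tasks=None):
    """Return the argument list for `zrc benchmarks:run` as shown above."""
    if benchmark not in BENCHMARKS:
        raise ValueError(f"unknown benchmark: {benchmark}")
    cmd = ["zrc", "benchmarks:run", benchmark, str(submission_dir), "-o", str(scores_dir)]
    if tasks:
        # Optional partial evaluation, e.g. ["english", "french", "mandarin"].
        cmd += ["-t", *tasks]
    return cmd


def run_evaluation(benchmark, submission_dir, scores_dir, tasks=None):
    """Run the evaluation and raise CalledProcessError if the command fails."""
    cmd = build_run_command(benchmark, submission_dir, scores_dir, tasks)
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.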

Uploading Results BETA

We would appreciate it if you uploaded your results so that we can compile them into our leaderboards. This helps us in several ways:

  • It allows us to follow new systems that are evaluated on our benchmarks and compare them.
  • It also helps us with creating a central place where all systems trying to solve unsupervised speech processing can be indexed.
  • It shows that interest in our benchmarks is still active and motivates us to create more.

To submit your results, you need an account on our website. If you do not already have one, you can follow this link to create your account.

Using the toolkit, create a local session with zrc user:login and provide your username and password.

Once this is done, you can upload your submission with the following command: zrc submit <submission_dir>

Multiple Submissions

If your system can be used for multiple tasks (for example, Task 1 and Task 3, or Task 1 and Task 4), you are strongly encouraged to submit to all the tasks you can. To link the submissions of a single system, use the same model_id in each meta.yaml; this identifier is auto-generated in your meta.yaml after the first submission.
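Before uploading several linked submissions, you can check that their meta.yaml files carry the same model_id. This is a minimal sketch using a simple line-based lookup rather than a full YAML parser; the function names are hypothetical, and it assumes model_id appears as a single key: value line as in the example meta.yaml.

```python
def read_model_id(meta_text):
    """Return the model_id found in meta.yaml text, or None if it is unset."""
    for line in meta_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "model_id":
            value = value.strip().strip("\"'")
            # An empty value, null or ~ all mean "not assigned yet".
            return None if value in ("", "null", "~") else value
    return None


def share_model_id(meta_texts):
    """True only if every meta.yaml text has the same, non-empty model_id."""
    ids = {read_model_id(text) for text in meta_texts}
    return len(ids) == 1 and None not in ids
```

Call share_model_id with the contents of each submission's meta.yaml; a False result means at least one file still has a missing or different identifier.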