How to participate
Choosing a train dataset
You can train on any of the standard ZeroSpeech Task 4 train sets listed on the Benchmarks and Datasets page, separately or combined. You can also train on external datasets, as long as they are publicly available. During the submission process, you will be asked to specify which dataset was used to train your system, providing a link (or publication reference) if it is an external dataset.
Using our toolkit
We recommend installing our toolkit and using it to manage, evaluate and upload your submissions. The toolkit is a Python package containing evaluation scripts, scripts to download datasets and other relevant files, and scripts to facilitate uploading results to the leaderboards. You can find instructions on how to download and use our toolkit here
Each benchmark requires a specific set of files to be prepared.
To facilitate this, you can use the zrc submission:init sLM21 <location> command from the toolkit to create an empty submission template folder.
<location> is the path where the directory will be created.
The meta.yaml file contains meta information about the author and how this submission was created.
model_info:
  model_id: null
  gpu_budget: 60
  system_description: "CPC-big (trained on librispeech 960), kmeans (trained on librispeech 100), LSTM. See https://zerospeech.com/2021 for more details."
  train_set: "librispeech 960, librispeech 100"
publication:
  author_label: "Nguyen et al."
  authors: "Nguyen, T., Seyssel, M., Rozé, P., Rivière, M., Kharitonov, E., Baevski, A., Dunbar, E. & Dupoux, E."
  paper_title: "The zero resource speech benchmark 2021: Metrics and baselines for unsupervised spoken language modeling."
  paper_url: "https://arxiv.org/abs/2011.11589"
  publication_year: 2021
  institution: "EHESS, ENS, PSL Research University, CNRS and Inria"
  team: "CoML Team"
code_url: "https://github.com/zerospeech/zerospeech2021_baseline"
open_source: true
While most of the information in meta.yaml is optional, we would appreciate it if you took the time to fill it in, as it allows us to verify submissions and keep track of all the systems that use our benchmarks.
We would also appreciate it if you made your code open source and provided a link to it, although we understand that this is not always possible.
The model_id parameter is generated when you submit a system to our backend. If you wish to submit the same system to multiple benchmarks, keep the model_id the same so that our system can link the submissions.
The params.yaml file contains various parameters that can override the defaults of each benchmark.
semantic:
  metric: <str> The metric to use for semantic evaluation. May be any metric supported by scipy.spatial.distance.cdist.
  n_jobs: <int> Accelerate semantic evaluation by using multiple processes.
  pooling: <str> The pooling method to use for semantic evaluation. Must be 'min', 'max', 'mean', 'sum', 'last' or 'lastlast'.
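To give an intuition for the pooling parameter, here is a minimal sketch in plain Python. It assumes that pooling collapses the frames of one utterance into a single feature vector, which this page does not state explicitly, and it is not the toolkit's implementation. The function name pool_frames is hypothetical, and 'lastlast' is left out because its behaviour is not described here.

```python
from statistics import fmean


def pool_frames(frames, method):
    """Collapse a list of equal-length feature frames into one vector.

    Assumption: pooling reduces over the frame (time) axis.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    columns = list(zip(*frames))  # one tuple per feature dimension
    if method == "min":
        return [min(column) for column in columns]
    if method == "max":
        return [max(column) for column in columns]
    if method == "mean":
        return [fmean(column) for column in columns]
    if method == "sum":
        return [sum(column) for column in columns]
    if method == "last":
        return list(frames[-1])
    # 'lastlast' is deliberately not covered by this sketch.
    raise ValueError(f"pooling method not covered by this sketch: {method!r}")
```

For example, pool_frames([[1.0, 4.0], [3.0, 2.0]], "mean") averages the two frames dimension by dimension.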
A model output is required for each task.
The /lexical and /syntactic folders of the submission must contain the two files dev.txt and test.txt. Each *.wav file in the dataset must have a corresponding line in either dev.txt or test.txt giving its pseudo-probability (order does not matter).
For example if the dev dataset contains:
/path/to/dataset/lexical/dev
├── aAAfmkmQpVz.wav
├── AaaggUZsvkR.wav
├── aAakhKfuvQI.wav
├── aAaOswLeeBL.wav
├── AaasVuoMJnS.wav
The submitted file dev.txt must contain entries like:
aAAfmkmQpVz -313.37445068359375
AaaggUZsvkR -447.8950500488281
aAakhKfuvQI -383.8902587890625
aAaOswLeeBL -430.2048645019531
AaasVuoMJnS -356.9426574707031
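A file in this format can be produced with a few lines of Python. The sketch below assumes your model's scores are already available as a mapping from file stem to pseudo-probability; the function name write_scores is hypothetical and not part of the toolkit.

```python
from pathlib import Path


def write_scores(scores, output_file):
    """Write one '<file stem> <pseudo-probability>' line per utterance."""
    output_file = Path(output_file)
    # Create the enclosing folder (for example lexical/) if it is missing.
    output_file.parent.mkdir(parents=True, exist_ok=True)
    with output_file.open("w", encoding="utf-8") as handle:
        for name, score in scores.items():
            # repr() keeps the full precision of the float.
            handle.write(f"{name} {float(score)!r}\n")
```

Since the order of the lines does not matter, the mapping can be written in whatever order the scores were computed.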
The semantic folder of the submission must follow the same subdirectory layout as the dataset: each .wav file in the dataset must have its corresponding .npy file in the submission under the same directory structure. For example, the dataset file /path/to/dataset/semantic/dev/synthetic/aAbcsWWKCz.wav must have a submitted .npy counterpart under the same relative directory in the submission.
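The mirrored layout can be expressed as a small path computation. This is only a sketch: the function name submission_path and the submission root used in the usage note are assumptions, not names defined by the toolkit.

```python
from pathlib import Path


def submission_path(wav_file, dataset_root, submission_root):
    """Return where the .npy file for a dataset .wav file is expected."""
    # Keep the part of the path below the dataset root unchanged.
    relative = Path(wav_file).relative_to(dataset_root)
    # Same directory structure, with the extension swapped to .npy.
    return Path(submission_root) / relative.with_suffix(".npy")
```

Called with the dataset file from the example above, /path/to/dataset as the dataset root and a hypothetical /path/to/submission as the submission root, it yields a path ending in semantic/dev/synthetic/aAbcsWWKCz.npy.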
Each .npy file encodes a single 2D numpy array of floats, with each row encoding one feature frame.
The number of columns (the feature dimension) must be constant across files. The number of rows depends on the duration of the speech sample.
The metric and pooling method used for evaluation must be specified in params.yaml.
We recommend saving your arrays as .npy files, since the binary format takes less space. The .txt format is also supported for backwards compatibility. Both export methods are described in the NumPy documentation.
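As an illustration, both export methods can be wrapped in one helper that also checks the 2D constraint. This is a sketch that assumes NumPy is installed; the function name save_features and the as_text switch are hypothetical, while numpy.save and numpy.savetxt are the standard NumPy export functions.

```python
import numpy as np


def save_features(features, output_file, as_text=False):
    """Save one utterance's features, one row per frame."""
    array = np.asarray(features, dtype=float)
    if array.ndim != 2:
        raise ValueError(f"expected a 2D array, got {array.ndim} dimension(s)")
    if as_text:
        np.savetxt(output_file, array)  # plain-text export (.txt)
    else:
        np.save(output_file, array)  # binary export (.npy)
    return array.shape
```

The returned shape is (number of frames, feature dimension), which makes it easy to check that the second value stays constant across files.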
Running the evaluation
Once the submission has been successfully created, you can run the evaluation.
zrc benchmarks:run sLM21 </path/to/submission> -o scores_dir
Your results are created in the scores_dir directory.
- Validation runs before each evaluation; the toolkit provides an option to skip it.
- If the dataset has subsets, you can run the evaluation on a selected subset only.
- If the benchmark has multiple subtasks, you can run it on selected subtasks using --task lexical semantic
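If you drive the evaluation from a script, the command can be assembled in Python. The sketch below uses only the arguments shown on this page (-o and --task). Combining them in a single call is an assumption, and the function names are hypothetical.

```python
import subprocess


def build_eval_command(submission_dir, scores_dir, tasks=()):
    """Assemble the sLM21 evaluation command as an argument list."""
    command = ["zrc", "benchmarks:run", "sLM21", str(submission_dir),
               "-o", str(scores_dir)]
    if tasks:
        # Restrict the run to the selected subtasks.
        command += ["--task", *tasks]
    return command


def run_evaluation(submission_dir, scores_dir, tasks=()):
    """Run the evaluation; requires the toolkit's zrc command on the PATH."""
    return subprocess.run(
        build_eval_command(submission_dir, scores_dir, tasks), check=True
    )
```

Passing the arguments as a list avoids shell quoting problems when the submission path contains spaces.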
We would appreciate it if you uploaded your results so that we can compile them into our leaderboards. This helps us in several ways:
- It allows us to follow new systems that are evaluated on our benchmarks and compare them.
- It also helps us with creating a central place where all systems trying to solve unsupervised speech processing can be indexed.
- It shows that interest in our benchmarks is still active and motivates us to create more.
To submit your results, you need an account on our website; if you do not already have one, you can follow this link to create your account.
Using the toolkit, create a local session with zrc user:login and provide your username and password.
Once this is done you can upload using the following command
zrc submit <submission_dir>
To submit your scores, you need to include all the required files in the same directory.
- source files: (embeddings/probabilities) these are files extracted from your model.
- score files: these are the result of the evaluation process.
- params.yaml: these are the parameters of the evaluation process.
- meta.yaml: generic information on submission
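Before uploading, a quick local check can catch a missing file early. The sketch below only looks for the two files named above and assumes they sit at the root of the submission directory; source and score files are not checked because their exact names depend on the benchmark. The function name missing_files is hypothetical.

```python
from pathlib import Path

# Only the two files named explicitly in the list above.
REQUIRED_FILES = ("meta.yaml", "params.yaml")


def missing_files(submission_dir, required=REQUIRED_FILES):
    """Return the required file names absent from the submission root."""
    root = Path(submission_dir)
    return [name for name in required if not (root / name).is_file()]
```

An empty list means both files are present; anything else lists what still needs to be added before running zrc submit.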
To run the ProsAudit evaluation task you also need to create a separate submission (as the two benchmarks have been separated).
- Create a submission directory (same as for the sLM21 task)
zrc submission:init prosAudit <location>
- Add your pseudo-probability files
/path/to/submission
├── english_dev.txt
├── english_test.txt
Each txt file contains a list of pseudo-probabilities, in the same format as for the lexical and syntactic tasks but computed on the prosAudit dataset, as shown in the following example (order does not matter):
10_7_2723 -2.2400479316711426
2_7_2723 -2.21551513671875
10_7_2807 -1.9842218160629272
2_7_2807 -1.886082410812378
10_7_1624 -2.1218247413635254
2_7_1624 -2.1739344596862793
10_7_1540 -1.8953361511230469
2_7_1540 -1.8365209102630615
10_7_3596 -2.123969078063965
2_7_3596 -2.1809585094451904
...
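To sanity-check a pseudo-probability file before running the evaluation, its contents can be parsed back into a mapping. This is a minimal sketch, not the toolkit's own validation; the function name read_scores is hypothetical.

```python
def read_scores(text):
    """Parse '<name> <pseudo-probability>' lines into a dictionary."""
    scores = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {line_number}: expected 2 fields, got {len(fields)}")
        name, value = fields
        if name in scores:
            raise ValueError(f"line {line_number}: duplicate entry {name!r}")
        scores[name] = float(value)  # raises ValueError if not a number
    return scores
```

Comparing the keys of the result with the file stems in the dataset shows whether any utterance is missing a score.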
meta.yaml is in the same format as shown here
Evaluation can be run using the command
zrc benchmarks:run /path/to/submission
Scores are added in the scores subdirectory.
If your system can be used for multiple tasks (for example, Task 1 and Task 3, or Task 1 and Task 4), you are strongly encouraged to submit to every task you can.
To link submissions of a single system, use the same model_id in your meta.yaml; it is auto-generated after the first submission.
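A quick way to confirm that two submissions will be linked is to compare the model_id values in their meta.yaml files. The sketch below uses a simple line-based lookup rather than a full YAML parser, so it assumes model_id appears on a single line. The function names and the identifier in the usage note are hypothetical.

```python
import re

# Matches a single 'model_id: <value>' line; not a general YAML parser.
MODEL_ID_LINE = re.compile(r"^[ \t]*model_id:[ \t]*(.*?)[ \t]*$", re.MULTILINE)


def extract_model_id(meta_text):
    """Return the model_id found in meta.yaml text, or None if unset."""
    match = MODEL_ID_LINE.search(meta_text)
    if match is None:
        return None
    value = match.group(1).strip("\"'")
    return None if value in ("", "null", "~") else value


def same_system(meta_text_a, meta_text_b):
    """True when both files carry the same, non-empty model_id."""
    id_a = extract_model_id(meta_text_a)
    id_b = extract_model_id(meta_text_b)
    return id_a is not None and id_a == id_b
```

For instance, two files that both contain the line model_id: abc123 (a made-up identifier) are reported as the same system, while a file that still has model_id: null is not.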